From hongbin034 at gmail.com Wed May 1 01:33:44 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Tue, 30 Apr 2019 21:33:44 -0400 Subject: [Zun] openstack appcontainer create Error In-Reply-To: References: Message-ID: Hi Alejandro, It looks your etcd cluster failed to elect a leader. You might want to check your etcd log for details, and bring your etcd cluster back to healthy state. Unfortunately, I don't have operational experience with etcd. You would need to look at their admin guide for help: https://coreos.com/etcd/docs/latest/v2/admin_guide.html . In the worst case, remove and re-install etcd could work. Best regards, Hongbin On Tue, Apr 30, 2019 at 5:32 PM Alejandro Ruiz Bermejo < arbermejo0417 at gmail.com> wrote: > Hi, i'm installing Zun in Openstack Queens with Ubuntu 18.04.1 LTS, i > already have configured docker and kuyr-libnetwork. I'm following the guide > at https://docs.openstack.org/zun/queens/install/index.html. I followed > all the steps of the installation at controller node and everything > resulted without problems. > > After finished the installation direction at compute node the *systemctl > status zun-compute* have the following errors > > root at compute /h/team# systemctl status zun-compute > ● zun-compute.service - OpenStack Container Service Compute Agent > Loaded: loaded (/etc/systemd/system/zun-compute.service; enabled; > vendor preset: enabled) > Active: active (running) since Tue 2019-04-30 16:46:56 UTC; 4h 26min ago > Main PID: 2072 (zun-compute) > Tasks: 1 (limit: 4915) > CGroup: /system.slice/zun-compute.service > └─2072 /usr/bin/python /usr/local/bin/zun-compute > > Apr 30 16:46:56 compute systemd[1]: Started OpenStack Container Service > Compute Agent. > Apr 30 16:46:57 compute zun-compute[2072]: 2019-04-30 16:46:57.929 2072 > INFO zun.cmd.compute [-] Starting server in PID 2072 > Apr 30 16:46:57 compute zun-compute[2072]: 2019-04-30 16:46:57.941 2072 > INFO zun.container.driver [-] Loading container driver > 'docker.driver.DockerDriver' > Apr 30 16:46:58 compute zun-compute[2072]: 2019-04-30 16:46:58.028 2072 > INFO zun.container.driver [-] Loading container driver > 'docker.driver.DockerDriver' > Apr 30 16:48:33 compute zun-compute[2072]: 2019-04-30 16:48:33.645 2072 > INFO zun.image.driver [req-7e0b8325-1e09-4410-80f4-af807cbc0420 > a16c6ef0319b4643a4ec8e56a1d025cb 59065d8f970b467aa94ef7b35f1edab5 default - > -] Loading container image driver 'glance' > Apr 30 16:48:33 compute zun-compute[2072]: 2019-04-30 16:48:33.911 2072 > INFO zun.image.driver [req-7e0b8325-1e09-4410-80f4-af807cbc0420 > a16c6ef0319b4643a4ec8e56a1d025cb 59065d8f970b467aa94ef7b35f1edab5 default - > -] Loading container image driver 'glance' > Apr 30 16:48:35 compute zun-compute[2072]: 2019-04-30 16:48:35.455 2072 > INFO zun.image.driver [req-7e0b8325-1e09-4410-80f4-af807cbc0420 > 16c6ef0319b4643a4ec8e56a1d025cb 59065d8f970b467aa94ef7b35f1edab5 default - > -] Loading container image driver 'glance' > Apr 30 16:48:35 compute zun-compute[2072]: 2019-04-30 16:48:35.939 2072 > ERROR zun.image.glance.driver [req-7e0b8325-1e09-4410-80f4-af807cbc0420 > a16c6ef0319b4643a4ec8e56a1d025cb 59065d8f970b467aa94ef7b35f1edab5 default - > -] Imae cirros was not found in glance: ImageNotFound: Image cirros could > not be found. 
> Apr 30 16:48:35 compute zun-compute[2072]: 2019-04-30 16:48:35.940 2072 > INFO zun.image.driver [req-7e0b8325-1e09-4410-80f4-af807cbc0420 > a16c6ef0319b4643a4ec8e56a1d025cb 59065d8f970b467aa94ef7b35f1edab5 default - > -] Loading container image driver 'docker' > Apr 30 16:48:55 compute zun-compute[2072]: 2019-04-30 16:48:55.011 2072 > ERROR zun.compute.manager [req-7bfa764a-45b8-4e2f-ac70-84d8bb71b135 - - - - > -] Error occurred while calling Docker create API: Docker internal error: > 500 Server Error: Internal Server Error ("failed to update store for object > typpe *libnetwork.endpointCnt: client: etcd member http://controller:2379 > has no leader").: DockerError: Docker internal error: 500 Server Error: > Internal Server Error ("failed to update store for object type > *libnetwork.endpointtCnt: client: etcd member http://controller:2379 has > no leader"). > > Also *systemctl status docker* show the next output > > root at compute /h/team# systemctl status docker > ● docker.service - Docker Application Container Engine > Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor > preset: enabled) > Drop-In: /etc/systemd/system/docker.service.d > └─docker.conf, http-proxy.conf, https-proxy.conf > Active: active (running) since Tue 2019-04-30 16:46:25 UTC; 4h 18min ago > Docs: https://docs.docker.com > Main PID: 1777 (dockerd) > Tasks: 21 > CGroup: /system.slice/docker.service > └─1777 /usr/bin/dockerd --group zun -H tcp://compute:2375 -H > unix:///var/run/docker.sock --cluster-store etcd://controller:2379 > > Apr 30 16:46:20 compute dockerd[1777]: > time="2019-04-30T16:46:20.815305836Z" level=warning msg="Your kernel does > not support cgroup rt runtime" > Apr 30 16:46:20 compute dockerd[1777]: > time="2019-04-30T16:46:20.815933695Z" level=info msg="Loading containers: > start." > Apr 30 16:46:24 compute dockerd[1777]: > time="2019-04-30T16:46:24.378526837Z" level=info msg="Default bridge > (docker0) is assigned with an IP address 17 > Apr 30 16:46:24 compute dockerd[1777]: > time="2019-04-30T16:46:24.572558877Z" level=info msg="Loading containers: > done." > Apr 30 16:46:25 compute dockerd[1777]: > time="2019-04-30T16:46:25.198101219Z" level=info msg="Docker daemon" > commit=e8ff056 graphdriver(s)=overlay2 vers > Apr 30 16:46:25 compute dockerd[1777]: > time="2019-04-30T16:46:25.198211373Z" level=info msg="Daemon has completed > initialization" > Apr 30 16:46:25 compute dockerd[1777]: > time="2019-04-30T16:46:25.232286069Z" level=info msg="API listen on > /var/run/docker.sock" > Apr 30 16:46:25 compute dockerd[1777]: > time="2019-04-30T16:46:25.232318790Z" level=info msg="API listen on > 10.8.9.58:2375" > Apr 30 16:46:25 compute systemd[1]: Started Docker Application Container > Engine. 
> Apr 30 16:48:55 compute dockerd[1777]: > time="2019-04-30T16:48:55.009820439Z" level=error msg="Handler for POST > /v1.26/networks/create returned error: failed to update store for object > type *libnetwork.endpointCnt: client: etcd member http://controller:2379 > has no leader" > > > When i try to launch an app container as the guide says it shows an Error > state and when i run opentack appcontainer show this is the reason of the > error > status_reason | Docker internal error: 500 Server Error: Internal > Server Error ("failed to update store for object type > *libnetwork.endpointCnt: client: etcd member http://controller:2379 has > no leader") > > I had some troubles installing Kuryr-libnetwork besides that i didn't had > any othet problem during the installation of Zun > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Wed May 1 03:33:56 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Tue, 30 Apr 2019 21:33:56 -0600 (MDT) Subject: [placement][nova][ironic][blazar][ptg] Placement PTG Agenda Message-ID: Near the top of the placement ptg etherpad [1] I've sketched out a schedule for the end of this week for those who happen to be in Denver. Since so many of the placement team will be required elsewhere, it is pretty thin. I think this is okay because a) we got quite a bit accomplished during the pre-PTG emails, b) the main things we need to discuss [2] will be strongly informed by other discussion in the week and need to be revisited several times. The summary of the schedule is: Thursday: 14:30-Beer: In the nova room doing cross project stuff Friday: Morning: wherever you need to be, often nova room Afternoon: In the placement room (for those who can) to capture and clarify results of the Thursday session and Friday morning and topics as people present allows. Saturday: Morning: Ironic/Blazar/Placement/Anyone else interested in using placement. Afternoon: Capture and clarify, retrospective, refactoring goals, hacking. The topics in [2] (and all the related emails) will be mixed into Thursday, Friday and Saturday afternoons. Thank you for whatever time you're able to make available. If you have conflicts, don't worry, everything will get summarized later and if it is properly important will come up again. [1] https://etherpad.openstack.org/p/placement-ptg-train [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005715.html -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From Arvind.Kumar at ril.com Wed May 1 06:20:27 2019 From: Arvind.Kumar at ril.com (Arvind Kumar) Date: Wed, 1 May 2019 06:20:27 +0000 Subject: [External]Re: [Ceilometer]: cpu_util meter not being calculated as expected leading to delay in scaling In-Reply-To: References: Message-ID: Hi Trinh, I am using OpenStack Queens release on Ubuntu setup. Regards, Arvind. From: Trinh Nguyen Sent: 26 April 2019 07:31 To: Arvind Kumar Cc: openstack-discuss at lists.openstack.org Subject: [External]Re: [Ceilometer]: cpu_util meter not being calculated as expected leading to delay in scaling The e-mail below is from an external source. Please do not open attachments or click links from an unknown or suspicious origin. Hi Arvind, Could you please tell us which release of Ceilometer that you are referring to? Bests, On Wed, Apr 24, 2019 at 4:55 PM Arvind Kumar > wrote: Hi, A design issue is observed in ceilometer service of Openstack. Setup include multiple compute nodes and 3 controller nodes. 
Meters from each compute node are sent to all the 3 ceilometer instances via RabbitMQ in round robin fashion at an interval of 10 min. After transformation of cumulative cpu meter data, cpu_util is generated by ceilometer instance at controller node and is published to the http address configured in ceilometer pipeline configuration. cpu_util is used by the application to take the decision if scaling of VM needs to be triggered or not. Ceilometer instance calculates cpu_util for a VM from the difference between cumulative cpu usage of VM at two timestamp divided by the timestamp difference. Let’s say 1 compute node send the cumulative cpu usage of a VM (C1, C2, C3, C4) at timestamp T1, T2, T3, T4 (difference between any two timestamp is 10 min). Now (C1,T1) & (C4,T4) tuple is received by ceilometer instance 1, (C2,T2) by instance 2, (C3,T3) by instance 3. Here even if CPU usage of VM is increased between T1 & T2, cpu_util is calculated for 30 min duration (T1 & T4) rather than as expected for 10 min. This leads to scaling getting triggered after T4 that too when CPU usage is consistently above the threshold between T1 and T4. Please suggest how could this issue could be resolved. Do we have any solution to bind VM or compute node meter data to specific ceilometer instance for processing? Regards, Arvind. "Confidentiality Warning: This message and any attachments are intended only for the use of the intended recipient(s), are confidential and may be privileged. If you are not the intended recipient, you are hereby notified that any review, re-transmission, conversion to hard copy, copying, circulation or other use of this message and any attachments is strictly prohibited. If you are not the intended recipient, please notify the sender immediately by return email and delete this message and any attachments from your system. Virus Warning: Although the company has taken reasonable precautions to ensure no viruses are present in this email. The company cannot accept responsibility for any loss or damage arising from the use of this email or attachment." -- Trinh Nguyen www.edlab.xyz "Confidentiality Warning: This message and any attachments are intended only for the use of the intended recipient(s). are confidential and may be privileged. If you are not the intended recipient. you are hereby notified that any review. re-transmission. conversion to hard copy. copying. circulation or other use of this message and any attachments is strictly prohibited. If you are not the intended recipient. please notify the sender immediately by return email. and delete this message and any attachments from your system. Virus Warning: Although the company has taken reasonable precautions to ensure no viruses are present in this email. The company cannot accept responsibility for any loss or damage arising from the use of this email or attachment." -------------- next part -------------- An HTML attachment was scrubbed... URL: From ianyrchoi at gmail.com Wed May 1 10:46:06 2019 From: ianyrchoi at gmail.com (Ian Y. Choi) Date: Wed, 1 May 2019 19:46:06 +0900 Subject: [PTG][I18n] Additional Virtual PTG scheduling Message-ID: Hello, Although I shared my priorities as I18n PTL during Train cycle [1], I couldn't attend to the PTG at this time. There will be Docs+I18n PTG tomorrow in Denver with lots of discussions for more on cross-project collaboration with Docs team, and other teams, and I wanna join as remote on 16:00-17:00 according to [2] (thanks a lot, Frank!). 
I want to design an additional virtual PTG event, which some of other teams also design something similar as, but would like to plan somewhat differently to reflect I18n team members' geographical & language diversity as much as possible. Any translators, language coordinators, and contributors are welcome with the following cadence: - Please allocate your 30 minutes on May 2 (according to Denver timezone). - Please visit https://ethercalc.openstack.org/i18n-virtual-ptg-train and grasp how I18n Virtual PTG operates. - Choose your best time and write your name, country, preferred comm method, and notes by filling out cells on H19-K66.   I will be online on IRC or Zoom (or please share your best communication method - I will follow as much as possible). This might be something different from general cadence on PTG, but I really hope that I18n team will have better communication through such activities. Please join in the discussion - I will reflect all of opinions as much as possible for better I18n world during this cycle. Note that I purposely marked some of my unavailable time slots but can be adjusted well - believe me, since someone asks me when I sleep (although it is getting harder.. :p ) With many thanks, /Ian [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003757.html [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005668.html From zigo at debian.org Wed May 1 11:44:34 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 1 May 2019 13:44:34 +0200 Subject: properly sizing openstack controlplane infrastructure In-Reply-To: <20190430153021.jhdgri7g2nvpn5vj@alle-irre.de> References: <20190430153021.jhdgri7g2nvpn5vj@alle-irre.de> Message-ID: <6448907c-6aaf-2f91-fe77-48e697c7b80f@debian.org> On 4/30/19 5:30 PM, Hartwig Hauschild wrote: > Also: We're currently running Neutron in OVS-DVR-VXLAN-Configuration. > Does that properly scale up and above 50+ nodes It does, that's not the bottleneck. >From my experience, 3 heavy control nodes are really enough to handle 200+ compute nodes. Though what you're suggesting (separating db & rabbitmq-server in separate nodes) is a very good idea. Cheers, Thomas Goirand (zigo) From manuel.sb at garvan.org.au Wed May 1 12:31:17 2019 From: manuel.sb at garvan.org.au (Manuel Sopena Ballesteros) Date: Wed, 1 May 2019 12:31:17 +0000 Subject: how to get best io performance from my block devices Message-ID: <9D8A2486E35F0941A60430473E29F15B017EA658B2@mxdb2.ad.garvan.unsw.edu.au> Dear Openstack community, I would like to have a high performance distributed database running in Openstack vms. I tried attaching dedicated nvme pci devices to the vm but the performance is not as good as I can get from bare metal. 
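(Before the benchmark comparison below, a minimal sketch of the tuning that is usually tried first for this kind of guest-vs-bare-metal gap: CPU pinning, hugepage-backed guest memory, and a single NUMA node, expressed as standard Nova flavor extra specs. The flavor name here is only an example, not something taken from this setup:)

  $ openstack flavor set db.large \
      --property hw:cpu_policy=dedicated \
      --property hw:mem_page_size=large \
      --property hw:numa_nodes=1
  # Boot the database instance from this flavor so the pinned vCPUs, guest
  # memory and the passed-through NVMe device can sit on the same NUMA node.
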
Bare metal: [root at zeus-54 data]# fio --ioengine=libaio --name=test --filename=test --bs=4k --size=40G --readwrite=randrw --runtime=120 --time_based test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.1 Starting 1 process Jobs: 1 (f=1): [f(1)][100.0%][r=39.5MiB/s,w=39.6MiB/s][r=10.1k,w=10.1k IOPS][eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=50892: Wed May 1 22:22:45 2019 read: IOPS=9805, BW=38.3MiB/s (40.2MB/s)(4596MiB/120001msec) slat (usec): min=39, max=6678, avg=94.72, stdev=55.78 clat (nsec): min=450, max=18224, avg=525.83, stdev=120.10 lat (usec): min=39, max=6679, avg=95.36, stdev=55.79 clat percentiles (nsec): | 1.00th=[ 462], 5.00th=[ 478], 10.00th=[ 482], 20.00th=[ 486], | 30.00th=[ 490], 40.00th=[ 494], 50.00th=[ 502], 60.00th=[ 510], | 70.00th=[ 516], 80.00th=[ 532], 90.00th=[ 596], 95.00th=[ 676], | 99.00th=[ 860], 99.50th=[ 1048], 99.90th=[ 1384], 99.95th=[ 2480], | 99.99th=[ 3728] bw ( KiB/s): min= 720, max=40736, per=100.00%, avg=39389.00, stdev=5317.58, samples=239 iops : min= 180, max=10184, avg=9847.23, stdev=1329.39, samples=239 write: IOPS=9799, BW=38.3MiB/s (40.1MB/s)(4594MiB/120001msec) slat (nsec): min=2982, max=106207, avg=4220.09, stdev=980.04 clat (nsec): min=407, max=18130, avg=451.48, stdev=103.71 lat (usec): min=3, max=111, avg= 4.74, stdev= 1.03 clat percentiles (nsec): | 1.00th=[ 414], 5.00th=[ 418], 10.00th=[ 422], 20.00th=[ 430], | 30.00th=[ 434], 40.00th=[ 434], 50.00th=[ 438], 60.00th=[ 438], | 70.00th=[ 442], 80.00th=[ 446], 90.00th=[ 462], 95.00th=[ 588], | 99.00th=[ 700], 99.50th=[ 916], 99.90th=[ 1208], 99.95th=[ 1288], | 99.99th=[ 3536] bw ( KiB/s): min= 752, max=42608, per=100.00%, avg=39366.63, stdev=5355.73, samples=239 iops : min= 188, max=10652, avg=9841.64, stdev=1338.93, samples=239 lat (nsec) : 500=69.98%, 750=28.64%, 1000=0.90% lat (usec) : 2=0.42%, 4=0.04%, 10=0.01%, 20=0.01% cpu : usr=2.20%, sys=10.85%, ctx=1176675, majf=0, minf=1372 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued rwt: total=1176625,1175958,0, short=0,0,0, dropped=0,0,0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): READ: bw=38.3MiB/s (40.2MB/s), 38.3MiB/s-38.3MiB/s (40.2MB/s-40.2MB/s), io=4596MiB (4819MB), run=120001-120001msec WRITE: bw=38.3MiB/s (40.1MB/s), 38.3MiB/s-38.3MiB/s (40.1MB/s-40.1MB/s), io=4594MiB (4817MB), run=120001-120001msec Disk stats (read/write): nvme9n1: ios=1174695/883620, merge=0/0, ticks=105502/72225, in_queue=192101, util=99.28% >From vm: [centos at kudu-1 nvme0]$ sudo fio --ioengine=libaio --name=test --filename=test --bs=4k --size=40G --readwrite=randrw --runtime=120 --time_based test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.1 Starting 1 process Jobs: 1 (f=1): [m(1)][100.0%][r=29.2MiB/s,w=29.7MiB/s][r=7487,w=7595 IOPS][eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=44383: Wed May 1 12:22:24 2019 read: IOPS=6994, BW=27.3MiB/s (28.6MB/s)(3278MiB/120000msec) slat (usec): min=54, max=20476, avg=115.27, stdev=71.45 clat (nsec): min=1757, max=31476, avg=2163.02, stdev=688.66 lat (usec): min=56, max=20481, avg=118.51, stdev=71.66 clat percentiles (nsec): | 1.00th=[ 1800], 5.00th=[ 1832], 10.00th=[ 1864], 20.00th=[ 1992], | 30.00th=[ 2040], 40.00th=[ 2064], 50.00th=[ 2064], 60.00th=[ 2096], | 
70.00th=[ 2096], 80.00th=[ 2128], 90.00th=[ 2480], 95.00th=[ 2544], | 99.00th=[ 4448], 99.50th=[ 5536], 99.90th=[11072], 99.95th=[12736], | 99.99th=[18560] bw ( KiB/s): min= 952, max=31224, per=100.00%, avg=28153.51, stdev=4126.89, samples=237 iops : min= 238, max= 7806, avg=7038.23, stdev=1031.70, samples=237 write: IOPS=6985, BW=27.3MiB/s (28.6MB/s)(3274MiB/120000msec) slat (usec): min=7, max=963, avg=12.60, stdev= 6.24 clat (nsec): min=1662, max=199250, avg=2030.26, stdev=712.33 lat (usec): min=10, max=970, avg=15.68, stdev= 6.48 clat percentiles (nsec): | 1.00th=[ 1688], 5.00th=[ 1720], 10.00th=[ 1736], 20.00th=[ 1864], | 30.00th=[ 1928], 40.00th=[ 1944], 50.00th=[ 1944], 60.00th=[ 1960], | 70.00th=[ 1960], 80.00th=[ 1992], 90.00th=[ 2352], 95.00th=[ 2384], | 99.00th=[ 4048], 99.50th=[ 4768], 99.90th=[11456], 99.95th=[13120], | 99.99th=[19072] bw ( KiB/s): min= 912, max=31880, per=100.00%, avg=28119.64, stdev=4176.38, samples=237 iops : min= 228, max= 7970, avg=7029.75, stdev=1044.07, samples=237 lat (usec) : 2=51.56%, 4=47.17%, 10=1.03%, 20=0.22%, 50=0.01% lat (usec) : 250=0.01% cpu : usr=4.96%, sys=28.37%, ctx=839307, majf=0, minf=26 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued rwt: total=839283,838268,0, short=0,0,0, dropped=0,0,0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): READ: bw=27.3MiB/s (28.6MB/s), 27.3MiB/s-27.3MiB/s (28.6MB/s-28.6MB/s), io=3278MiB (3438MB), run=120000-120000msec WRITE: bw=27.3MiB/s (28.6MB/s), 27.3MiB/s-27.3MiB/s (28.6MB/s-28.6MB/s), io=3274MiB (3434MB), run=120000-120000msec Disk stats (read/write): nvme0n1: ios=838322/651596, merge=0/0, ticks=83804/22119, in_queue=104773, util=70.18% Is there a way I can get near bare metal performance from my nvme block devices? NOTICE Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Wed May 1 13:21:26 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 1 May 2019 07:21:26 -0600 Subject: [glance] [ops] Issue sharing an image with another project (something related to get_image_location) In-Reply-To: References: Message-ID: (Apologies for top-posting.) Hi Massimo, Two things: (1) Please file a glance bug for this. I didn't think the sharing code would touch image locations, but apparently it does. In the bug report, please include your policy settings for *_location and *_member, and also the output of an image-show call for the image you're trying to share, and the log extract. (2) With the policy settings you have for *_location, I don't think that any regular (non-admin) user will be able to download an image or boot an instance from an image, so you should verify those operations. Given what I just said, how do you protect against OSSN-0065? 
The following is from the Rocky release notes [0] (which you may not have seen; this item was merged after 17.0.0, and we haven't done a point release, so they're only available online): "The show_multiple_locations configuration option remains deprecated in this release, but it has not been removed. (It had been scheduled for removal in the Pike release.) Please keep a watch on the Glance release notes and the glance-specs repository to stay informed about developments on this issue. "The plan is to eliminate the option and use only policies to control image locations access. This, however, requires some major refactoring. See the draft Policy Refactor spec [1] for more information. "There is no projected timeline for this change, as no one has been able to commit time to it. The Glance team would be happy to discuss this more with anyone interested in working on it. "The workaround is to continue to use the show_multiple_locations option in a dedicated “internal” Glance node that is not accessible to end users. We continue to recommend that image locations not be exposed to end users. See OSSN-0065 for more information." Sorry for the long quote, but I wanted to take this opportunity to remind people that "The Glance team would be happy to discuss this more with anyone interested in working on it". It's particularly relevant to anyone who will be at the PTG this week -- please look for the Glance team and get a discussion started, because I don't think this item is currently a priority for Train [2]. [0] https://docs.openstack.org/releasenotes/glance/rocky.html#known-issues [1] https://review.opendev.org/#/c/528021/ [2] https://wiki.openstack.org/wiki/PTG/Train/Etherpads On 4/29/19 8:43 AM, Massimo Sgaravatto wrote: > I have a small Rocky installation where Glance is configured with 2 > backends (old images use the 'file' backend while new ones use the rbd > backend, which is the default) > > > show_multiple_locations  is true but I have these settings in policy.json: > > # grep _image_location /etc/glance/policy.json >     "delete_image_location": "role:admin", >     "get_image_location": "role:admin", >     "set_image_location": "role:admin", > > This was done because of: > https://wiki.openstack.org/wiki/OSSN/OSSN-0065 > > > If an unpriv user tries to share a private image: > > $ openstack image add project 3194a04b-ffc8-4aaf-b6c8-adc24e3d3fe6 > e81df4c0b493439abb8b85bfd4cbe071 > 403 Forbidden: Not allowed to create members for image > 3194a04b-ffc8-4aaf-b6c8-adc24e3d3fe6. (HTTP 403) > > In the log file it looks like that the problem is related to the > get_image_location operation: > > /var/log/glance/api.log:2019-04-29 16:06:54.523 8220 WARNING > glance.api.v2.image_members [req-dd93cdc9-767d-4c51-8e5a-edf746c02264 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a - > default default] Not allowed to create members for image > 3194a04b-ffc8-4aaf-b6c8-adc24e3d3fe6.: Forbidden: You are not authorized > to complete get_image_location action. 
> > > But actually the sharing operation succeeded: > > $ glance member-list --image-id 3194a04b-ffc8-4aaf-b6c8-adc24e3d3fe6 > +--------------------------------------+----------------------------------+---------+ > | Image ID                             | Member ID                      >   | Status  | > +--------------------------------------+----------------------------------+---------+ > | 3194a04b-ffc8-4aaf-b6c8-adc24e3d3fe6 | > e81df4c0b493439abb8b85bfd4cbe071 | pending | > +--------------------------------------+----------------------------------+---------+ > > > Cheers, Massimo From james.page at canonical.com Wed May 1 14:54:02 2019 From: james.page at canonical.com (James Page) Date: Wed, 1 May 2019 08:54:02 -0600 Subject: [ptg][sig][upgrades] Train PTG Upgrades SIG session In-Reply-To: References: Message-ID: Hi All Reminder that the Upgrades SIG session is tomorrow morning (Thursday) in room 201 at the PTG. https://etherpad.openstack.org/p/upgrade-sig-ptg-train I've added slots for our regular agenda topics of Operator Feedback and Deployment Project Updates - so if you are an operator or a developer on one of the numerous deployment projects please add your name to the etherpad along with your proposed topic! Cheers James On Sat, Apr 27, 2019 at 2:41 AM James Page wrote: > Hi All > > I've finally found time to create an etherpad for the Upgrades SIG session > at the upcoming PTG in Denver (on the train on my way to LHR to catch my > flight). > > https://etherpad.openstack.org/p/upgrade-sig-ptg-train > > I've added a few proposed topics but if you're at the PTG (or summit) and > have anything upgrade related to discuss please add your topic to the > etherpad over the next few days - I'll then put together a rough schedule > for our half day of upgrades discussion on Thursday morning in room 201. > > IRC meetings never really got restarted since the last PTG but I know that > the promise of getting together to discuss upgrade successes and challenges > generally appeals to us all based on prior sessions! > > Thanks in advance and see you all in Denver! > > Cheers > > James > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylightcoder at gmail.com Wed May 1 15:19:13 2019 From: skylightcoder at gmail.com (=?UTF-8?B?R8O2a2hhbiBJxZ5JSw==?=) Date: Wed, 1 May 2019 18:19:13 +0300 Subject: [Nova][Neutron] When Trying to Use Xen Hypervisor on OpenStack, Virtual machines can not get ip Message-ID: Hi Team, I am trying to test hypervisors which OpenStack supported. So I installed 1 controller node , 1 xen compute node and 1 kvm compute node. I installed OpenStack Pike version. My kvm compute node works properly but I have problem about xen compute node. ı installed xen server 7.0 version to my server and on domU I created centos 7 virtual machine and ı installed nova-compute on it. For installing Openstack on xen I followed https://openstack-xenserver.readthedocs.io/en/latest/ guide. When I tried creating virtual machine , virtual machine is created but it has no ip. It didn't get any ip. I have no experince on xen server and ı don't know how ı can solve this problem. I looked at logs but I didn't see any errors. I need your help. I doubt of my neutron config. ı have 2 nics and one for management network and 1 for public network. These are my nics on dom0[ https://imgur.com/a/IrdLoCn]. these are my nics on domU [https://imgur.com/a/5RliHa7]. 
I am sending my ifconfig output on dom0[ http://paste.openstack.org/show/750146/] and domU[ http://paste.openstack.org/show/750147/]. I am also sending my openvswitch_agent.ini file[ http://paste.openstack.org/show/750148/ ]. -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylightcoder at gmail.com Wed May 1 15:22:06 2019 From: skylightcoder at gmail.com (=?UTF-8?B?R8O2a2hhbiBJxZ5JSw==?=) Date: Wed, 1 May 2019 18:22:06 +0300 Subject: [Nova][Neutron] Using Xen hypervisor On Openstack Message-ID: Hi Team, I am trying to test hypervisors which OpenStack supported. So I installed 1 controller node , 1 xen compute node and 1 kvm compute node. I installed OpenStack Pike version. My kvm compute node works properly but I have problem about xen compute node. ı installed xen server 7.0 version to my server and on domU I created centos 7 virtual machine and ı installed nova-compute on it. For installing Openstack on xen I followed https://openstack-xenserver.readthedocs.io/en/latest/ guide. When I tried creating virtual machine , virtual machine is created but it has no ip. It didn't get any ip. I have no experince on xen server and ı don't know how ı can solve this problem. I looked at logs but I didn't see any errors. I need your help. I doubt of my neutron config. ı have 2 nics and one for management network and 1 for public network. These are my nics on dom0[ https://imgur.com/a/IrdLoCn]. these are my nics on domU [https://imgur.com/a/5RliHa7]. I am sending my ifconfig output on dom0[ http://paste.openstack.org/show/750146/] and domU[ http://paste.openstack.org/show/750147/]. I am also sending my openvswitch_agent.ini file[ http://paste.openstack.org/show/750148/ ]. ı am waiting for your help. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From colleen at gazlene.net Wed May 1 15:21:51 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Wed, 01 May 2019 11:21:51 -0400 Subject: [devstack] Identity URL problem In-Reply-To: References: Message-ID: <3114008b-bf06-4f74-b74a-2c03676cb860@www.fastmail.com> On Tue, Apr 30, 2019, at 10:41, Neil Jerram wrote: > Does anyone know what causes this problem at [1]: > > 2019-04-30 16:34:03.137 | +++ functions-common:oscwrap:2346 : command > openstack role add admin --user neutron --project service --user-domain > Default --project-domain Default > 2019-04-30 16:34:03.139 | +++ functions-common:oscwrap:2346 : openstack > role add admin --user neutron --project service --user-domain Default > --project-domain Default > 2019-04-30 16:34:04.331 | Failed to discover available identity > versions when contacting http://104.239.175.234/identity. Attempting to > parse version from URL. > 2019-04-30 16:34:04.331 | Could not find versioned identity endpoints > when attempting to authenticate. Please check that your auth_url is > correct. Not Found (HTTP 404) > > [1] > http://logs.openstack.org/79/638479/3/check/networking-calico-tempest-dsvm/5431e4b/logs/devstacklog.txt.gz > > I think there are loads of uses of that URL, before where the > networking-calico plugin uses it, so I can't see why the plugin's use > hits that error. > > Thanks, > Neil > That error usually means that keystone couldn't be reached at all. 
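(A quick way to confirm that from the affected host, assuming devstack's default layout where keystone is proxied under /identity; the IP below is the one from the log above:)

  $ curl -i http://104.239.175.234/identity/v3
  # 200 with a JSON version document: keystone is reachable.
  # 404 Not Found: keystone is not deployed behind the proxy, which matches
  # the "Could not find versioned identity endpoints" error above.
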
Looking through the devstack log, it looks like keystone is not even enabled: http://logs.openstack.org/79/638479/3/gate/networking-calico-tempest-dsvm/5888def/logs/devstacklog.txt.gz#_2019-04-11_11_05_00_946 Colleen From neil at tigera.io Wed May 1 15:28:51 2019 From: neil at tigera.io (Neil Jerram) Date: Wed, 1 May 2019 16:28:51 +0100 Subject: [devstack] Identity URL problem In-Reply-To: <3114008b-bf06-4f74-b74a-2c03676cb860@www.fastmail.com> References: <3114008b-bf06-4f74-b74a-2c03676cb860@www.fastmail.com> Message-ID: On Wed, May 1, 2019 at 4:21 PM Colleen Murphy wrote: > On Tue, Apr 30, 2019, at 10:41, Neil Jerram wrote: > > Does anyone know what causes this problem at [1]: > > > > 2019-04-30 16:34:03.137 | +++ functions-common:oscwrap:2346 : command > > openstack role add admin --user neutron --project service --user-domain > > Default --project-domain Default > > 2019-04-30 16:34:03.139 | +++ functions-common:oscwrap:2346 : openstack > > role add admin --user neutron --project service --user-domain Default > > --project-domain Default > > 2019-04-30 16:34:04.331 | Failed to discover available identity > > versions when contacting http://104.239.175.234/identity. Attempting to > > parse version from URL. > > 2019-04-30 16:34:04.331 | Could not find versioned identity endpoints > > when attempting to authenticate. Please check that your auth_url is > > correct. Not Found (HTTP 404) > > > > [1] > > > http://logs.openstack.org/79/638479/3/check/networking-calico-tempest-dsvm/5431e4b/logs/devstacklog.txt.gz > > > > I think there are loads of uses of that URL, before where the > > networking-calico plugin uses it, so I can't see why the plugin's use > > hits that error. > > > > Thanks, > > Neil > > > > That error usually means that keystone couldn't be reached at all. Looking > through the devstack log, it looks like keystone is not even enabled: > > > http://logs.openstack.org/79/638479/3/gate/networking-calico-tempest-dsvm/5888def/logs/devstacklog.txt.gz#_2019-04-11_11_05_00_946 Many thanks Colleen, I'll explicitly enable keystone and see if that helps. Do you know if that's a recent change, that keystone used to be enabled by default, and now requires explicit enabling? Best wishes, Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From neil at tigera.io Wed May 1 16:10:39 2019 From: neil at tigera.io (Neil Jerram) Date: Wed, 1 May 2019 17:10:39 +0100 Subject: [devstack] Identity URL problem In-Reply-To: References: <3114008b-bf06-4f74-b74a-2c03676cb860@www.fastmail.com> Message-ID: On Wed, May 1, 2019 at 4:28 PM Neil Jerram wrote: > On Wed, May 1, 2019 at 4:21 PM Colleen Murphy wrote: > >> On Tue, Apr 30, 2019, at 10:41, Neil Jerram wrote: >> > Does anyone know what causes this problem at [1]: >> > >> > 2019-04-30 16:34:03.137 | +++ functions-common:oscwrap:2346 : command >> > openstack role add admin --user neutron --project service --user-domain >> > Default --project-domain Default >> > 2019-04-30 16:34:03.139 | +++ functions-common:oscwrap:2346 : openstack >> > role add admin --user neutron --project service --user-domain Default >> > --project-domain Default >> > 2019-04-30 16:34:04.331 | Failed to discover available identity >> > versions when contacting http://104.239.175.234/identity. Attempting >> to >> > parse version from URL. >> > 2019-04-30 16:34:04.331 | Could not find versioned identity endpoints >> > when attempting to authenticate. Please check that your auth_url is >> > correct. 
Not Found (HTTP 404) >> > >> > [1] >> > >> http://logs.openstack.org/79/638479/3/check/networking-calico-tempest-dsvm/5431e4b/logs/devstacklog.txt.gz >> > >> > I think there are loads of uses of that URL, before where the >> > networking-calico plugin uses it, so I can't see why the plugin's use >> > hits that error. >> > >> > Thanks, >> > Neil >> > >> >> That error usually means that keystone couldn't be reached at all. >> Looking through the devstack log, it looks like keystone is not even >> enabled: >> >> >> http://logs.openstack.org/79/638479/3/gate/networking-calico-tempest-dsvm/5888def/logs/devstacklog.txt.gz#_2019-04-11_11_05_00_946 > > > Many thanks Colleen, I'll explicitly enable keystone and see if that helps. > > Do you know if that's a recent change, that keystone used to be enabled by > default, and now requires explicit enabling? > I'm sorry, I've spotted what the real problem is now, and it doesn't implicate any change to the enablement of keystone. But many thanks again for your input, which was the hint I needed to see the problem! (networking-calico's devstack plugin supports running on multiple nodes, and has a heuristic to differentiate between when it's the first node being set up - with both control and compute functions - and when it's a subsequent node - with compute only. That heuristic had gone wrong, so CI was installing a compute-only node.) Best wishes, Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Wed May 1 16:45:27 2019 From: amy at demarco.com (Amy) Date: Wed, 1 May 2019 10:45:27 -0600 Subject: [OpenStack-Ansible][OSA] Team Dinner Denver In-Reply-To: References: Message-ID: <8B54A126-2501-4EC3-B0BF-E5FB92FD3CC1@demarco.com> We will be going to the 5280 Burger Bar at 7:00pm. We have a private room!! Hope to see everyone there! Amy (spotz) Sent from my iPhone > On Apr 28, 2019, at 2:34 PM, Amy Marrich wrote: > > We are looking at having our official team dinner Wednesday evening. Please visit this etherpad: > > https://etherpad.openstack.org/p/osa-team-dinner-plan > > To add your name and vote on restaurants so I can get a head count and make a reservation. > > Thanks, > > Amy (spotz) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeremyfreudberg at gmail.com Wed May 1 16:45:38 2019 From: jeremyfreudberg at gmail.com (Jeremy Freudberg) Date: Wed, 1 May 2019 12:45:38 -0400 Subject: [ironic][neutron][ops] Ironic multi-tenant networking, VMs Message-ID: Hi all, I'm wondering if anyone has any best practices for Ironic bare metal nodes and regular VMs living on the same network. I'm sure if involves Ironic's `neutron` multi-tenant network driver, but I'm a bit hazy on the rest of the details (still very much in the early stages of exploring Ironic). Surely it's possible, but I haven't seen mention of this anywhere (except the very old spec from 2015 about introducing ML2 support into Ironic) nor is there a gate job resembling this specific use. Ideas? Thanks, Jeremy From luka.peschke at objectif-libre.com Wed May 1 17:33:13 2019 From: luka.peschke at objectif-libre.com (Luka Peschke) Date: Wed, 01 May 2019 19:33:13 +0200 Subject: [cloudkitty] May meeting is cancelled Message-ID: <27905e8e.AMQAADooJtoAAAAAAAAAAAQR_QkAAAAAZtYAAAAAAAzbjABcydha@mailjet.com> Hello everybody, Given that most of us won't be available on friday the 3rd, the cloudkitty IRC meeting that was planned at that date is cancelled. The next meeting will be held on june 7th at 15h UTC / 17h CET. 
Cheers, -- Luka Peschke From gagehugo at gmail.com Wed May 1 21:57:52 2019 From: gagehugo at gmail.com (Gage Hugo) Date: Wed, 1 May 2019 15:57:52 -0600 Subject: [security-sig] Security SIG BoF Notes Message-ID: Thanks to everyone who attended the Security SIG BoF session! Attached are the notes taken from the discussion during the session with relevant links. If there was anything missed, please feel free to mention it here or reach out in #openstack-security. Board Picture: https://drive.google.com/open?id=1YWYdp9F5faGzlww1Cr7-i2TawDh60trg Topics: - Overall Security SIG - Links: - https://security.openstack.org/ - https://wiki.openstack.org/wiki/Security-SIG - Security SIG: https://wiki.openstack.org/wiki/Security-SIG - Weekly Agenda: https://etherpad.openstack.org/p/security-agenda - Meeting Time: Weekly on Thursday at 1500 UTC #openstack-meeting - IRC Server: irc.freenode.net - Key Lime: https://github.com/keylime/keylime - Integration with Ironic https://github.com/keylime/keylime/issues/101 - Bandit: https://github.com/PyCQA/bandit - Running bandit as part of tox gate - Keystone does this: https://github.com/openstack/keystone/blob/master/tox.ini#L40 - Run as a separate job - Example (not tox): https://github.com/openstack/openstack-helm/blob/master/zuul.d/jobs-openstack-helm.yaml#L27-L36 - Host Intrusion - Wazuh was mentioned: https://wazuh.com/ - Ansible Hardening - OpenStack Ansible: https://docs.openstack.org/openstack-ansible/latest/ - Security SIG "Help Wanted" - https://docs.openstack.org/security-analysis/latest/ - Only has Barbican, missing other projects that have been added since - Multiple other libraries in review to be added - https://review.openstack.org/#/q/project:openstack/security-analysis+is:open - https://docs.openstack.org/security-guide/ - Security guide doesn’t seem to have been updated since Pike, so it’s a good 1.5 years behind - https://security.openstack.org/#secure-development-guidelines - Improve documentation of secure coding practices - improve coverage of bandit and syntribos jobs across projects, and look into other similar tools we could be using to better secure the software we write - https://wiki.openstack.org/wiki/Security_Notes - Help with writing security notes and triaging the backlog - https://wiki.openstack.org/wiki/Security/Security_Note_Process - https://bugs.launchpad.net/ossn - Security blog: http://openstack-security.github.io/ - VMT Public Bug Assistance - Many reports of suspected vulnerabilities start out as public bugs or are made public over the course of being triaged, and assistance with those is encouraged from the entire community - https://bugs.launchpad.net/ossa - Having someone who is familiar with the affected project provide context to a security bug really helps the VMT definine concrete impact statements and speeds up the overall process - Bootstrapping AWS / Windows Guest Domains / Guest VMs - nova-join: https://github.com/openstack/novajoin - application credentials: https://docs.openstack.org/keystone/latest/user/application_credentials.html - Barbican: https://wiki.openstack.org/wiki/Barbican - Policy - Cross-project policy effort: - https://governance.openstack.org/tc/goals/queens/policy-in-code.html - https://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/policy-goals.html -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From juliaashleykreger at gmail.com Wed May 1 22:38:37 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 1 May 2019 16:38:37 -0600 Subject: [ironic][neutron][ops] Ironic multi-tenant networking, VMs In-Reply-To: References: Message-ID: Greetings Jeremy, Best Practice wise, I'm not directly aware of any. It is largely going to depend upon your Neutron ML2 drivers and network fabric. In essence, you'll need an ML2 driver which supports the vnic type of "baremetal", which is able to able to orchestrate the switch port port binding configuration in your network fabric. If your using vlan networks, in essence you'll end up with a neutron physical network which is also a trunk port to the network fabric, and the ML2 driver would then appropriately tag the port(s) for the baremetal node to the networks required. In the CI gate, we do this in the "multitenant" jobs where networking-generic-switch modifies the OVS port configurations directly. If specifically vxlan is what your looking to use between VMs and baremetal nodes, I'm unsure of how you would actually configure that, but in essence the VXLANs would still need to be terminated on the switch port via the ML2 driver. In term of Ironic's documentation, If you haven't already seen it, you might want to check out ironic's multi-tenancy documentation[1]. -Julia [1]: https://docs.openstack.org/ironic/latest/admin/multitenancy.html On Wed, May 1, 2019 at 10:53 AM Jeremy Freudberg wrote: > > Hi all, > > I'm wondering if anyone has any best practices for Ironic bare metal > nodes and regular VMs living on the same network. I'm sure if involves > Ironic's `neutron` multi-tenant network driver, but I'm a bit hazy on > the rest of the details (still very much in the early stages of > exploring Ironic). Surely it's possible, but I haven't seen mention of > this anywhere (except the very old spec from 2015 about introducing > ML2 support into Ironic) nor is there a gate job resembling this > specific use. > > Ideas? > > Thanks, > Jeremy > From ekcs.openstack at gmail.com Wed May 1 22:59:57 2019 From: ekcs.openstack at gmail.com (Eric K) Date: Wed, 1 May 2019 15:59:57 -0700 Subject: [self-healing] live-migrate instance in response to fault signals Message-ID: Hi dasp, Follow up on the discussion today at self-healing BoF. I think you said on the etherpad [1]: ==== Ability to drain (live migrate away) instances automatically in response to any failure/soft-fail/early failure indication (e.g. dropped packets, SMART disk status, issues with RBD connections, repeated build failures, etc) Then quarantine, rebuild, self-test compute host (or hold for hardware fix) Context: generally no clue what is running inside VMs (like public cloud) ==== I just want to follow up to get more info on the context; specifically, which of the following pieces are the main difficulties? - detecting the failure/soft-fail/early failure indication - codifying how to respond to each failure scenario - triggering/executing the desired workflow - something else [1] https://etherpad.openstack.org/p/DEN-self-healing-SIG From gn01737625 at gmail.com Wed May 1 07:45:25 2019 From: gn01737625 at gmail.com (Ming-Che Liu) Date: Wed, 1 May 2019 15:45:25 +0800 Subject: [Deploy problem] deploy openstack+monasca with kolla-ansible 8.0.0.0rc1. Message-ID: Hello, I deployed openstack+monasca with kolla-ansible 8.0.0.0rc1. 
I follow the steps as mentioned in https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html The setting in my computer's globals.yml as same as [Quick Start] tutorial (attached file: globals.yml is my setting). My machine environment as following: OS: Ubuntu 16.04 Kolla-ansible verions: 8.0.0.0rc1 ansible version: 2.7 When I execute [bootstrap-servers] and [prechecks], it seems ok (no fatal error or any interrupt). But when I execute [deploy], it will occur some error about rabbitmq(when I set enable_rabbitmq:yes) and nova compute service(when I set enable_rabbitmq:no). I have some detail screenshot about the errors as attached files, could you please help me to solve this problem? Thank you very much. [Attached file description]: globals.yml: my computer's setting about kolla-ansible As mentioned above, the following pictures show the errors, the rabbitmq error will occur if I set [enable_rabbitmq:yes], the nova compute service error will occur if I set [enable_rabbitmq:no]. [image: docker-version.png] [image: kolla-ansible-version.png] [image: nova-compute-service-error.png] [image: rabbitmq_error.png] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: kolla-ansible-version.png Type: image/png Size: 122071 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: rabbitmq_error.png Type: image/png Size: 245303 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: nova-compute-service-error.png Type: image/png Size: 255191 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: docker-version.png Type: image/png Size: 118420 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: globals.yml Type: application/octet-stream Size: 20214 bytes Desc: not available URL: From me at not.mn Wed May 1 23:14:25 2019 From: me at not.mn (John Dickinson) Date: Wed, 1 May 2019 17:14:25 -0600 Subject: [stable] propose Tim Burke as stable core Message-ID: <1F014297-E404-49B6-BE09-61F4DA478AF5@not.mn> Tim has been very active in proposing and maintaining patches to Swift’s stable branches. Of recent (non-automated) backports, Tim has proposed more than a third of them. --John From gmann at ghanshyammann.com Wed May 1 23:18:11 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 01 May 2019 18:18:11 -0500 Subject: [qa][ptg] QA Dinner on 2nd May @ 6.30 PM Message-ID: <16a75b0f051.f1ab3fb0153137.4432104510546283876@ghanshyammann.com> Hi All, We have planned for QA dinner on 2nd May, Thursday 6.30 PM. Anyone is welcome to join. Here are the details: Restaurant: Indian Resturant ('Little India Downtown Denver') Map: shorturl.at/byDIJ Wednesday night, 6:30 PM Meeting at the restaurant directly. -gmann From jungleboyj at gmail.com Thu May 2 00:14:04 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 1 May 2019 19:14:04 -0500 Subject: [cinder] [PTG] Room for Thursday Morning ... Message-ID: <39a39d58-ac37-a024-5014-ab5548debd8b@gmail.com> Team, There is some confusion with the schedule.  Thought we were scheduled for room 203 in the morning but we weren't. Room 112, was free so I have booked that for our use Thursday morning. See you all there.  Looking forward to a few productive days of discussion. 
Jay From sean.mcginnis at gmx.com Thu May 2 00:32:50 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 1 May 2019 19:32:50 -0500 Subject: [cinder][ops] Nested Quota Driver Use? Message-ID: <20190502003249.GA1432@sm-workstation> Hey everyone, I'm hoping to get some feedback from folks, especially operators. In the Liberty release, Cinder introduced the ability to use a Nest Quota Driver to handle cases of heirarchical projects and quota enforcement [0]. I have not heard of anyone actually using this. I also haven't seen any bugs filed, which makes me a little suspicious given how complicated it can be. I would like to know if any operators are using this for nested quotas. There is an effort underway for a new mechanism called "unified limits" that will require a lot of modifications to the Cinder code. If this quota driver is not needed, I would like to deprecated it in Train so it can be removed in the U release and hopefully prevent some unnecessary work being done. Any feedback on this would be appreciated. Thanks! Sean [0] https://specs.openstack.org/openstack/cinder-specs/specs/liberty/cinder-nested-quota-driver.html From massimo.sgaravatto at gmail.com Thu May 2 07:03:09 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Thu, 2 May 2019 09:03:09 +0200 Subject: [glance] [ops] Issue sharing an image with another project (something related to get_image_location) In-Reply-To: References: Message-ID: Hi Brian Thanks for your A couple of answers in-line: On Wed, May 1, 2019 at 3:25 PM Brian Rosmaita wrote: > (Apologies for top-posting.) > > Hi Massimo, > > Two things: > > (1) Please file a glance bug for this. I didn't think the sharing code > would touch image locations, but apparently it does. In the bug report, > please include your policy settings for *_location and *_member, and > also the output of an image-show call for the image you're trying to > share, and the log extract. > Sure: I will > > (2) With the policy settings you have for *_location, I don't think that > any regular (non-admin) user will be able to download an image or boot > an instance from an image, so you should verify those operations. 
Actually it works E.g.: $ openstack image show 7ebe160d-5498-477b-aa2e-94a6d962a075 +------------------+------------------------------------------------------------------------------+ | Field | Value | +------------------+------------------------------------------------------------------------------+ | checksum | b4548edf0bc476c50c083fb88717d92f | | container_format | bare | | created_at | 2018-01-15T16:14:35Z | | disk_format | qcow2 | | file | /v2/images/7ebe160d-5498-477b-aa2e-94a6d962a075/file | | id | 7ebe160d-5498-477b-aa2e-94a6d962a075 | | min_disk | 3 | | min_ram | 512 | | name | CentOS7 | | owner | 56c3f5c047e74a78a71438c4412e6e13 | | properties | locations='[]', os_hash_algo='None', os_hash_value='None', os_hidden='False' | | protected | False | | schema | /v2/schemas/image | | size | 877985792 | | status | active | | tags | | | updated_at | 2018-01-15T16:21:23Z | | virtual_size | None | | visibility | public | +------------------+------------------------------------------------------------------------------+ So locations are not showed, as expected, since I am a 'regular' (non-admin) user But I able to download the image: $ openstack image save --file ~/CentOS7.qcow2 7ebe160d-5498-477b-aa2e-94a6d962a075 $ ls -l ~/CentOS7.qcow2 -rw-r--r-- 1 sgaravat utenti 877985792 May 2 08:54 /home/sgaravat/CentOS7.qcow2 $ md5sum ~/CentOS7.qcow2 b4548edf0bc476c50c083fb88717d92f /home/sgaravat/CentOS7.qcow2 I am also able to launch an instance using this image Thanks, Massimo > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eyalb1 at gmail.com Thu May 2 07:12:12 2019 From: eyalb1 at gmail.com (Eyal B) Date: Thu, 2 May 2019 10:12:12 +0300 Subject: [Vitrage] add datasource kapacitor for vitrage In-Reply-To: <1324083046.973516.1556615406841.JavaMail.zimbra@viettel.com.vn> References: <14511424.947437.1556614048877.JavaMail.zimbra@viettel.com.vn> <1324083046.973516.1556615406841.JavaMail.zimbra@viettel.com.vn> Message-ID: Hi, Please make sure all test are passing Eyal On Thu, May 2, 2019, 02:18 wrote: > Hi, > In our system, we use monitor by TICK stack (include: Telegraf for > collect metric, InfluxDB for storage metric, Chronograf for visualize and > Kapacitor alarming), which is popular monitor solution. > We hope can integrate vitrage in, so we decide to write kapacitor > datasource contribute for vitrage. > The work is almost done , you can review in: > https://review.opendev.org/653416 > > So i send this mail hope for more review, ideal,... Appreciate it. > also ask: have any step i miss in pipeline of contribute datasource > vitrage? like create blueprints, vitrage-spec,vv.. Should i do it? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ceo at teo-en-ming-corp.com Thu May 2 08:03:30 2019 From: ceo at teo-en-ming-corp.com (Turritopsis Dohrnii Teo En Ming) Date: Thu, 2 May 2019 08:03:30 +0000 Subject: Which are the most popular free open source OpenStack cloud operating systems or distros? Message-ID: Subject/Topic: Which are the most popular free open source OpenStack cloud operating systems or distros? Good afternoon from Singapore, First of all, I am very new to OpenStack. May I know which are the most popular free open source OpenStack cloud operating systems or distros? How do I download, install and deploy these OpenStack distros as private cloud, public cloud or hybrid cloud? Where can I find good and detailed documentation? Thank you very much for your advice. 
-----BEGIN EMAIL SIGNATURE----- The Gospel for all Targeted Individuals (TIs): [The New York Times] Microwave Weapons Are Prime Suspect in Ills of U.S. Embassy Workers Link: https://www.nytimes.com/2018/09/01/science/sonic-attack-cuba-microwave.html ******************************************************************************************** Singaporean Mr. Turritopsis Dohrnii Teo En Ming's Academic Qualifications as at 14 Feb 2019 [1] https://tdtemcerts.wordpress.com/ [2] https://tdtemcerts.blogspot.sg/ [3] https://www.scribd.com/user/270125049/Teo-En-Ming -----END EMAIL SIGNATURE----- From berndbausch at gmail.com Thu May 2 08:20:13 2019 From: berndbausch at gmail.com (Bernd Bausch) Date: Thu, 2 May 2019 17:20:13 +0900 Subject: Which are the most popular free open source OpenStack cloud operating systems or distros? In-Reply-To: References: Message-ID: <762494EA-9BC6-45E1-A75C-5D0DAC488DE3@gmail.com> I am not aware of a popularity ranking, but the usual commercial Linux vendors and a large number of other providers offer distros. See https://www.openstack.org/marketplace/distros/ for a list. Download and documentation details are available at the vendors’ web sites. Since OpenStack is open-source, so are the distros. In addition, you find non-commercial deployment tools on the documentation web site https://docs.openstack.org/stein/deploy/. You can also hand-craft your cloud: https://docs.openstack.org/stein/install/. Bernd > On May 2, 2019, at 17:03, Turritopsis Dohrnii Teo En Ming wrote: > > Subject/Topic: Which are the most popular free open source OpenStack cloud operating systems or distros? > > Good afternoon from Singapore, > > First of all, I am very new to OpenStack. > > May I know which are the most popular free open source OpenStack cloud operating systems or distros? > > How do I download, install and deploy these OpenStack distros as private cloud, public cloud or hybrid cloud? > > Where can I find good and detailed documentation? > > Thank you very much for your advice. > > -----BEGIN EMAIL SIGNATURE----- > > The Gospel for all Targeted Individuals (TIs): > > [The New York Times] Microwave Weapons Are Prime Suspect in Ills of > U.S. Embassy Workers > > Link: https://www.nytimes.com/2018/09/01/science/sonic-attack-cuba-microwave.html > > ******************************************************************************************** > > Singaporean Mr. Turritopsis Dohrnii Teo En Ming's Academic > Qualifications as at 14 Feb 2019 > > [1] https://tdtemcerts.wordpress.com/ > > [2] https://tdtemcerts.blogspot.sg/ > > [3] https://www.scribd.com/user/270125049/Teo-En-Ming > > -----END EMAIL SIGNATURE----- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at stackhpc.com Thu May 2 08:21:39 2019 From: doug at stackhpc.com (Doug Szumski) Date: Thu, 2 May 2019 09:21:39 +0100 Subject: [Deploy problem] deploy openstack+monasca with kolla-ansible 8.0.0.0rc1. [kolla] In-Reply-To: References: Message-ID: <10f217bf-33a2-d40a-8bcf-6994c26be699@stackhpc.com> On 01/05/2019 08:45, Ming-Che Liu wrote: > Hello, > > I deployed openstack+monasca with kolla-ansible 8.0.0.0rc1. It doesn't look like Monasca is enabled in your globals.yml file. Are you trying to set up OpenStack services first and then enable Monasca afterwards? 
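(For reference, a minimal sketch of what enabling it looks like, assuming the standard kolla-ansible layout with globals.yml under /etc/kolla; re-run the deploy after changing it:)

  # /etc/kolla/globals.yml
  enable_monasca: "yes"

  $ kolla-ansible -i all-in-one deploy
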
You can also deploy Monasca standalone if that is useful: https://docs.openstack.org/kolla-ansible/latest/reference/logging-and-monitoring/monasca-guide.html > > I follow the steps as mentioned in > https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html > > The setting in my computer's globals.yml as same as [Quick Start] > tutorial (attached file: globals.yml is my setting). > > My machine environment as following: > OS: Ubuntu 16.04 > Kolla-ansible verions: 8.0.0.0rc1 > ansible version: 2.7 > > When I execute [bootstrap-servers] and [prechecks], it seems ok (no > fatal error or any interrupt). > > But when I execute [deploy], it will occur some error about > rabbitmq(when I set enable_rabbitmq:yes) and nova compute service(when > I set  enable_rabbitmq:no). > > I have some detail screenshot about the errors as attached files, > could you please help me to solve this problem? Please can you post more information on why the containers are not starting. - Inspect rabbit and nova-compute logs (in /var/lib/docker/volumes/kolla_logs/_data/) - Check relevant containers are running, and if they are restarting check the output. Eg. docker logs --follow nova_compute > > Thank you very much. > > [Attached file description]: > globals.yml: my computer's setting about kolla-ansible > > As mentioned above, the following pictures show the errors, the > rabbitmq error will occur if I set [enable_rabbitmq:yes], the nova > compute service error will occur if I set [enable_rabbitmq:no]. > docker-version.png > kolla-ansible-version.png > nova-compute-service-error.png > rabbitmq_error.png From massimo.sgaravatto at gmail.com Thu May 2 08:28:22 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Thu, 2 May 2019 10:28:22 +0200 Subject: [glance] [ops] Issue sharing an image with another project (something related to get_image_location) In-Reply-To: References: Message-ID: On Thu, May 2, 2019 at 9:03 AM Massimo Sgaravatto < massimo.sgaravatto at gmail.com> wrote: > > Hi Brian > > Thanks for your > A couple of answers in-line: > > On Wed, May 1, 2019 at 3:25 PM Brian Rosmaita > wrote: > >> (Apologies for top-posting.) >> >> Hi Massimo, >> >> Two things: >> >> (1) Please file a glance bug for this. I didn't think the sharing code >> would touch image locations, but apparently it does. In the bug report, >> please include your policy settings for *_location and *_member, and >> also the output of an image-show call for the image you're trying to >> share, and the log extract. >> > > Sure: I will > https://bugs.launchpad.net/glance/+bug/1827342 Thanks again, Massimo -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at stackhpc.com Thu May 2 09:56:47 2019 From: doug at stackhpc.com (Doug Szumski) Date: Thu, 2 May 2019 10:56:47 +0100 Subject: [Deploy problem] deploy openstack+monasca with kolla-ansible 8.0.0.0rc1. [kolla] In-Reply-To: References: <10f217bf-33a2-d40a-8bcf-6994c26be699@stackhpc.com> Message-ID: On 02/05/2019 10:13, Ming-Che Liu wrote: > Hello, > > Thank you for replying, my goal is to deploy [all-in-one] > openstack+monasca(in the same physical machine/VM). > > I will check the detail error information and provide such logs for > you, thank you. 
> > I also have a question about kolla-ansible 8.0.0.0rc1, when I check > the new feature about kolla-ansible 8.0.0.0rc1, it seems only > 8.0.0.0rc1 provide the "complete" monasca functionality, it that > right(that means you can see monasca's plugin in openstack horizon, as > the following picture)? > You are correct that Monasca is supported from the Stein release onwards. Due to a number of people asking we have created a backport to Rocky, but the patches are not merged yet. Please see this bug for a link to the patch chains: https://bugs.launchpad.net/kolla-ansible/+bug/1824982 The horizon-ui-plugin isn't currently installed in the Horizon image, but I can easily add a patch for it. Similar functionality is currently provided by the monasca-grafana fork (which provides Keystone integration), for example: Menu Overview > Thank you very much. > > Regards, > > Shawn > > monasca.png > > > Doug Szumski > 於 > 2019年5月2日 週四 下午4:21寫道: > > > On 01/05/2019 08:45, Ming-Che Liu wrote: > > Hello, > > > > I deployed openstack+monasca with kolla-ansible 8.0.0.0rc1. > > It doesn't look like Monasca is enabled in your globals.yml file. Are > you trying to set up OpenStack services first and then enable Monasca > afterwards? You can also deploy Monasca standalone if that is useful: > > https://docs.openstack.org/kolla-ansible/latest/reference/logging-and-monitoring/monasca-guide.html > > > > > I follow the steps as mentioned in > > https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html > > > > The setting in my computer's globals.yml as same as [Quick Start] > > tutorial (attached file: globals.yml is my setting). > > > > My machine environment as following: > > OS: Ubuntu 16.04 > > Kolla-ansible verions: 8.0.0.0rc1 > > ansible version: 2.7 > > > > When I execute [bootstrap-servers] and [prechecks], it seems ok (no > > fatal error or any interrupt). > > > > But when I execute [deploy], it will occur some error about > > rabbitmq(when I set enable_rabbitmq:yes) and nova compute > service(when > > I set  enable_rabbitmq:no). > > > > I have some detail screenshot about the errors as attached files, > > could you please help me to solve this problem? > > Please can you post more information on why the containers are not > starting. > > - Inspect rabbit and nova-compute logs (in > /var/lib/docker/volumes/kolla_logs/_data/) > > - Check relevant containers are running, and if they are restarting > check the output. Eg. docker logs --follow nova_compute > > > > > Thank you very much. > > > > [Attached file description]: > > globals.yml: my computer's setting about kolla-ansible > > > > As mentioned above, the following pictures show the errors, the > > rabbitmq error will occur if I set [enable_rabbitmq:yes], the nova > > compute service error will occur if I set [enable_rabbitmq:no]. > > docker-version.png > > kolla-ansible-version.png > > nova-compute-service-error.png > > rabbitmq_error.png > From doug at stackhpc.com Thu May 2 09:58:45 2019 From: doug at stackhpc.com (Doug Szumski) Date: Thu, 2 May 2019 10:58:45 +0100 Subject: [Deploy problem] deploy openstack+monasca with kolla-ansible 8.0.0.0rc1. [kolla] In-Reply-To: References: <10f217bf-33a2-d40a-8bcf-6994c26be699@stackhpc.com> Message-ID: On 02/05/2019 10:56, Doug Szumski wrote: > > On 02/05/2019 10:13, Ming-Che Liu wrote: >> Hello, >> >> Thank you for replying, my goal is to deploy [all-in-one] >> openstack+monasca(in the same physical machine/VM). 
>> >> I will check the detail error information and provide such logs for >> you, thank you. >> >> I also have a question about kolla-ansible 8.0.0.0rc1, when I check >> the new feature about kolla-ansible 8.0.0.0rc1, it seems only >> 8.0.0.0rc1 provide the "complete" monasca functionality, it that >> right(that means you can see monasca's plugin in openstack horizon, >> as the following picture)? >> > You are correct that Monasca is supported from the Stein release > onwards. Due to a number of people asking we have created a backport > to Rocky, but the patches are not merged yet. Please see this bug for > a link to the patch chains: > https://bugs.launchpad.net/kolla-ansible/+bug/1824982 > > The horizon-ui-plugin isn't currently installed in the Horizon image, > but I can easily add a patch for it. > > Similar functionality is currently provided by the monasca-grafana > fork (which provides Keystone integration), for example: > Apologies, the images were stripped, please see this link: https://github.com/monasca/monasca-grafana > > Menu > > Overview > >> Thank you very much. >> >> Regards, >> >> Shawn >> >> monasca.png >> >> >> Doug Szumski > 於 >> 2019年5月2日 週四 下午4:21寫道: >> >> >>     On 01/05/2019 08:45, Ming-Che Liu wrote: >>     > Hello, >>     > >>     > I deployed openstack+monasca with kolla-ansible 8.0.0.0rc1. >> >>     It doesn't look like Monasca is enabled in your globals.yml file. >> Are >>     you trying to set up OpenStack services first and then enable >> Monasca >>     afterwards? You can also deploy Monasca standalone if that is >> useful: >> >> https://docs.openstack.org/kolla-ansible/latest/reference/logging-and-monitoring/monasca-guide.html >> >>     > >>     > I follow the steps as mentioned in >>     > >> https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html >>     > >>     > The setting in my computer's globals.yml as same as [Quick Start] >>     > tutorial (attached file: globals.yml is my setting). >>     > >>     > My machine environment as following: >>     > OS: Ubuntu 16.04 >>     > Kolla-ansible verions: 8.0.0.0rc1 >>     > ansible version: 2.7 >>     > >>     > When I execute [bootstrap-servers] and [prechecks], it seems ok >> (no >>     > fatal error or any interrupt). >>     > >>     > But when I execute [deploy], it will occur some error about >>     > rabbitmq(when I set enable_rabbitmq:yes) and nova compute >>     service(when >>     > I set  enable_rabbitmq:no). >>     > >>     > I have some detail screenshot about the errors as attached files, >>     > could you please help me to solve this problem? >> >>     Please can you post more information on why the containers are not >>     starting. >> >>     - Inspect rabbit and nova-compute logs (in >>     /var/lib/docker/volumes/kolla_logs/_data/) >> >>     - Check relevant containers are running, and if they are restarting >>     check the output. Eg. docker logs --follow nova_compute >> >>     > >>     > Thank you very much. >>     > >>     > [Attached file description]: >>     > globals.yml: my computer's setting about kolla-ansible >>     > >>     > As mentioned above, the following pictures show the errors, the >>     > rabbitmq error will occur if I set [enable_rabbitmq:yes], the nova >>     > compute service error will occur if I set [enable_rabbitmq:no]. 
>>     > docker-version.png >>     > kolla-ansible-version.png >>     > nova-compute-service-error.png >>     > rabbitmq_error.png >> From doka.ua at gmx.com Thu May 2 10:27:36 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Thu, 2 May 2019 13:27:36 +0300 Subject: [octavia] Error while creating amphora Message-ID: Dear colleagues, I'm using Openstack Rocky and trying to launch Octavia 4.0.0. After all installation steps I've got an error during 'openstack loadbalancer create' with the following log: DEBUG octavia.controller.worker.tasks.compute_tasks [-] Compute create execute for amphora with id d037721f-2cf9-492e-99cb-0be5874da0f6 execute /opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py:63 ERROR octavia.controller.worker.tasks.compute_tasks [-] Compute create for amphora id: d037721f-2cf9-492e-99cb-0be5874da0f6 failed: TypeError: can't concat str to bytes ERROR octavia.controller.worker.tasks.compute_tasks Traceback (most recent call last): ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py", line 94, in execute ERROR octavia.controller.worker.tasks.compute_tasks config_drive_files) ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/user_data_jinja_cfg.py", line 38, in build_user_data_config ERROR octavia.controller.worker.tasks.compute_tasks return self.agent_template.render(user_data=user_data) ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/asyncsupport.py", line 76, in render ERROR octavia.controller.worker.tasks.compute_tasks return original_render(self, *args, **kwargs) ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 1008, in render ERROR octavia.controller.worker.tasks.compute_tasks return self.environment.handle_exception(exc_info, True) ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 780, in handle_exception ERROR octavia.controller.worker.tasks.compute_tasks reraise(exc_type, exc_value, tb) ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/_compat.py", line 37, in reraise ERROR octavia.controller.worker.tasks.compute_tasks raise value.with_traceback(tb) ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/templates/user_data_config_drive.template", line 29, in top-level template code ERROR octavia.controller.worker.tasks.compute_tasks {{ value|indent(8) }} ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/filters.py", line 557, in do_indent ERROR octavia.controller.worker.tasks.compute_tasks s += u'\n' # this quirk is necessary for splitlines method ERROR octavia.controller.worker.tasks.compute_tasks TypeError: can't concat str to bytes ERROR octavia.controller.worker.tasks.compute_tasks WARNING octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-create-amp-for-lb-subflow-octavia-cert-compute-create' (06134192-def9-420c-9feb-0d08a068f3b2) transitioned into state 'FAILURE' from state 'RUNNING' Any advises where is the problem? 
My environment: - Openstack Rocky - Ubuntu 18.04 - Octavia installed in virtualenv using pip install: # pip list |grep octavia octavia 4.0.0 octavia-lib 1.1.1 python-octaviaclient 1.8.0 Thank you. -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Thu May 2 11:03:29 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 2 May 2019 05:03:29 -0600 Subject: [requirements][qa][all] mock 3.0.0 released References: Message-ID: <4E536AA4-479B-4A84-A1D2-91FF8FAD122C@doughellmann.com> There's a major version bump of one of our testing dependencies, so watch for new unit test job failures. Doug > Begin forwarded message: > > From: Chris Withers > Subject: [TIP] mock 3.0.0 released > Date: May 2, 2019 at 2:07:34 AM MDT > To: "testing-in-python at lists.idyll.org" , Python List > > Hi All, > > I'm pleased to announce the release of mock 3.0.0: > https://pypi.org/project/mock/ > > This brings to rolling backport up to date with cpython master. > > It's been a few years since the last release, so I'd be surprised if there weren't some problems. > If you hit any issues, please pin to mock<3 and then: > > - If your issue relates to mock functionality, please report in the python tracker: https://bugs.python.org > > - If your issue is specific to the backport, please report here: https://github.com/testing-cabal/mock/issues > > If you're unsure, go for the second one and we'll figure it out. > > cheers, > > Chris > > _______________________________________________ > testing-in-python mailing list > testing-in-python at lists.idyll.org > http://lists.idyll.org/listinfo/testing-in-python -------------- next part -------------- An HTML attachment was scrubbed... URL: From florian.engelmann at everyware.ch Thu May 2 11:56:20 2019 From: florian.engelmann at everyware.ch (Florian Engelmann) Date: Thu, 2 May 2019 13:56:20 +0200 Subject: [nova usage] openstack usage CLI output differs from horizon output Message-ID: Hi, as far as I understood Horizon overview usage should give the same numbers as openstack usage show --project --start 2019-03-01 --end 2019-03-31 But in our deployment (rocky) the Horizon numbers are higher (~15%). Any idea why? Could be a bug? All the best, Flo -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5230 bytes Desc: not available URL: From doka.ua at gmx.com Thu May 2 12:42:10 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Thu, 2 May 2019 15:42:10 +0300 Subject: [octavia] anchor discountinued? Message-ID: <78073332-bb86-b00b-6aaf-8e309cbcd160@gmx.com> Dear colleagues, it seems Anchor, which is used by Octavia as PKI system, is discontinued. Is there replacement for Anchor which can be used with Octavia? Thank you. -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison From saphi070 at gmail.com Thu May 2 12:43:43 2019 From: saphi070 at gmail.com (Sa Pham) Date: Thu, 2 May 2019 21:43:43 +0900 Subject: [octavia] anchor discountinued? In-Reply-To: <78073332-bb86-b00b-6aaf-8e309cbcd160@gmx.com> References: <78073332-bb86-b00b-6aaf-8e309cbcd160@gmx.com> Message-ID: Hi Volodymyr, You mean SSL Certificate for Octavia, You can use Barbican. 
On Thu, May 2, 2019 at 9:42 PM Volodymyr Litovka wrote: > Dear colleagues, > > it seems Anchor, which is used by Octavia as PKI system, is > discontinued. Is there replacement for Anchor which can be used with > Octavia? > > Thank you. > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." -- Thomas Edison > > > -- Sa Pham Dang Master Student - Soongsil University Kakaotalk: sapd95 Skype: great_bn -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Thu May 2 12:45:10 2019 From: zigo at debian.org (Thomas Goirand) Date: Thu, 2 May 2019 14:45:10 +0200 Subject: Which are the most popular free open source OpenStack cloud operating systems or distros? In-Reply-To: <762494EA-9BC6-45E1-A75C-5D0DAC488DE3@gmail.com> References: <762494EA-9BC6-45E1-A75C-5D0DAC488DE3@gmail.com> Message-ID: <9328228c-e244-e695-84b6-73b4eaf86c41@debian.org> On 5/2/19 10:20 AM, Bernd Bausch wrote: > I am not aware of a popularity ranking, but the usual commercial Linux > vendors and a large number of other providers offer distros. > See https://www.openstack.org/marketplace/distros/ for a list. Download > and documentation details are available at the vendors’ web sites. Since > OpenStack is open-source, so are the distros. > > In addition, you find non-commercial deployment tools on the > documentation web site https://docs.openstack.org/stein/deploy/. It's not non-commercial list, it's openstack-community-maintained list. For example, my own tool [1] isn't listed despite [2]. Cheers, Thomas Goirand (zigo) [1] https://salsa.debian.org/openstack-team/debian/openstack-cluster-installer [2] https://review.opendev.org/618111 From doka.ua at gmx.com Thu May 2 13:03:54 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Thu, 2 May 2019 16:03:54 +0300 Subject: [octavia] anchor discountinued? In-Reply-To: References: <78073332-bb86-b00b-6aaf-8e309cbcd160@gmx.com> Message-ID: <53e59d83-03e2-a333-1277-03d02ba2120d@gmx.com> Hi Sa, as far as I understand, Octavia uses Barbican for storing certs for TLS offload. While Anchor used for signing certs/keys when doing provisioning of Amphoraes. On 5/2/19 3:43 PM, Sa Pham wrote: > Hi Volodymyr, > > You mean SSL Certificate for Octavia, You can use Barbican. > > > > On Thu, May 2, 2019 at 9:42 PM Volodymyr Litovka > wrote: > > Dear colleagues, > > it seems Anchor, which is used by Octavia as PKI system, is > discontinued. Is there replacement for Anchor which can be used with > Octavia? > > Thank you. > > -- > Volodymyr Litovka >    "Vision without Execution is Hallucination." -- Thomas Edison > > > > > -- > Sa Pham Dang > Master Student - Soongsil University > Kakaotalk: sapd95 > Skype: great_bn > > -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison -------------- next part -------------- An HTML attachment was scrubbed... URL: From lajos.katona at ericsson.com Thu May 2 13:35:39 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Thu, 2 May 2019 13:35:39 +0000 Subject: [openstack-dev] [neutron] PTG agenda In-Reply-To: References: Message-ID: <1335a531-9e74-d900-07a9-a6aa4ce285f4@ericsson.com> Hi Miguel, Just a note, the pad is not on the "official" list of pads here: https://wiki.openstack.org/wiki/Forum/Denver2019 Regards Lajos On 2019. 04. 29. 16:46, Miguel Lavalle wrote: > Hi Neutrinos,, > > I took your proposals for PTG topics and organized them in an agenda. > Please look at > https://etherpad.openstack.org/p/openstack-networking-train-ptg. 
Let's > have a very productive meeting! > > Best regards > > Miguel From lajos.katona at ericsson.com Thu May 2 13:41:11 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Thu, 2 May 2019 13:41:11 +0000 Subject: [openstack-dev] [neutron] PTG agenda In-Reply-To: <1335a531-9e74-d900-07a9-a6aa4ce285f4@ericsson.com> References: <1335a531-9e74-d900-07a9-a6aa4ce285f4@ericsson.com> Message-ID: Sorry, This is the PTG page: https://wiki.openstack.org/wiki/PTG/Train/Etherpads and of course neutron is there..... On 2019. 05. 02. 7:35, Lajos Katona wrote: > Hi Miguel, > > Just a note, the pad is not on the "official" list of pads here: > https://wiki.openstack.org/wiki/Forum/Denver2019 > > Regards > Lajos > > On 2019. 04. 29. 16:46, Miguel Lavalle wrote: >> Hi Neutrinos,, >> >> I took your proposals for PTG topics and organized them in an agenda. >> Please look at >> https://etherpad.openstack.org/p/openstack-networking-train-ptg. >> Let's have a very productive meeting! >> >> Best regards >> >> Miguel > From Tim.Bell at cern.ch Thu May 2 13:49:28 2019 From: Tim.Bell at cern.ch (Tim Bell) Date: Thu, 2 May 2019 13:49:28 +0000 Subject: Which are the most popular free open source OpenStack cloud operating systems or distros? In-Reply-To: References: Message-ID: The OpenStack community takes part in an annual survey which can be useful for this sort of information. Details of 2018 report are at https://www.openstack.org/user-survey/2018-user-survey-report/ The 2019 user survey is also now open so you can create/update your install details at https://www.openstack.org/user-survey/survey-2019/landing Tim -----Original Message----- From: Turritopsis Dohrnii Teo En Ming Date: Thursday, 2 May 2019 at 02:09 To: "openstack-discuss at lists.openstack.org" Cc: Turritopsis Dohrnii Teo En Ming Subject: Which are the most popular free open source OpenStack cloud operating systems or distros? Subject/Topic: Which are the most popular free open source OpenStack cloud operating systems or distros? Good afternoon from Singapore, First of all, I am very new to OpenStack. May I know which are the most popular free open source OpenStack cloud operating systems or distros? How do I download, install and deploy these OpenStack distros as private cloud, public cloud or hybrid cloud? Where can I find good and detailed documentation? Thank you very much for your advice. -----BEGIN EMAIL SIGNATURE----- The Gospel for all Targeted Individuals (TIs): [The New York Times] Microwave Weapons Are Prime Suspect in Ills of U.S. Embassy Workers Link: https://www.nytimes.com/2018/09/01/science/sonic-attack-cuba-microwave.html ******************************************************************************************** Singaporean Mr. Turritopsis Dohrnii Teo En Ming's Academic Qualifications as at 14 Feb 2019 [1] https://tdtemcerts.wordpress.com/ [2] https://tdtemcerts.blogspot.sg/ [3] https://www.scribd.com/user/270125049/Teo-En-Ming -----END EMAIL SIGNATURE----- From openstack at hauschild.it Thu May 2 14:11:18 2019 From: openstack at hauschild.it (Hartwig Hauschild) Date: Thu, 2 May 2019 16:11:18 +0200 Subject: properly sizing openstack controlplane infrastructure In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C30E5B5@EX10MBOX03.pnnl.gov> References: <20190430153021.jhdgri7g2nvpn5vj@alle-irre.de> <1A3C52DFCD06494D8528644858247BF01C30E5B5@EX10MBOX03.pnnl.gov> Message-ID: <20190502141117.ukowjeuwqxmwphsv@alle-irre.de> Am 30.04.2019 schrieb Fox, Kevin M: > I've run that same network config at about 70 nodes with no problems. 
I've run the same without dvr at 150 nodes. > > Your memory usage seems very high. I ran 150 nodes with a small 16g server ages ago. Might double check that. > That's what I was thinking as well, but it did not match up with what we currently have at all. I'll need to figure out what went wrong here. -- Cheers, Hardy From strigazi at gmail.com Thu May 2 14:11:22 2019 From: strigazi at gmail.com (Spyros Trigazis) Date: Thu, 2 May 2019 08:11:22 -0600 Subject: [magnm][ptg] Room for magnum this afternoon Message-ID: Hello everyone, Magnum will have two PTG sessions this afternoon [0] in Room 112. Note that magnum's track is not in the printed scheduled you have taken from the registration desk. You can join remotely in this etherpad [1]. Cheers, Spyros [0] http://ptg.openstack.org/ptg.html [1] https://etherpad.openstack.org/p/magnum-train-ptg -------------- next part -------------- An HTML attachment was scrubbed... URL: From strigazi at gmail.com Thu May 2 14:13:03 2019 From: strigazi at gmail.com (Spyros Trigazis) Date: Thu, 2 May 2019 08:13:03 -0600 Subject: [magnum][ptg] Room for magnum this afternoon In-Reply-To: References: Message-ID: I did a typo in the subject. Cheers, Spyros On Thu, May 2, 2019 at 8:11 AM Spyros Trigazis wrote: > Hello everyone, > > Magnum will have two PTG sessions this afternoon [0] in Room 112. > Note that magnum's track is not in the printed scheduled you have taken > from the registration desk. > > You can join remotely in this etherpad [1]. > > Cheers, > Spyros > > [0] http://ptg.openstack.org/ptg.html > [1] https://etherpad.openstack.org/p/magnum-train-ptg > -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at hauschild.it Thu May 2 14:15:17 2019 From: openstack at hauschild.it (Hartwig Hauschild) Date: Thu, 2 May 2019 16:15:17 +0200 Subject: properly sizing openstack controlplane infrastructure In-Reply-To: <2114088542.1037659.1556641040235.JavaMail.zimbra@speichert.pl> References: <20190430153021.jhdgri7g2nvpn5vj@alle-irre.de> <2114088542.1037659.1556641040235.JavaMail.zimbra@speichert.pl> Message-ID: <20190502141517.muvcrkh6wej3i7wo@alle-irre.de> Am 30.04.2019 schrieb Daniel Speichert: > ----- Original Message ----- > > From: "Hartwig Hauschild" > > To: openstack-discuss at lists.openstack.org > > Sent: Tuesday, April 30, 2019 9:30:22 AM > > Subject: properly sizing openstack controlplane infrastructure > > > The requirements we've got are basically "here's 50 compute-nodes, make sure > > whatever you're building scales upwards from there". > > It depends what's your end goal. 100? 500? >1000 nodes? > At some point things like Nova Cells will help (or become necessity). I really hope not that high, but splitting into cells or AZs / Regions is definitely planned if it goes up. > > The pike-stack has three servers as control-plane, each of them with 96G of > > RAM and they don't seem to have too much room left when coordinating 14 > > compute-nodes. > > 96 GB of RAM per controller is much more than enough for 14 compute nodes. > There's room for improvement in configuration. > > > We're thinking about splitting the control-nodes into infrastructure > > (db/rabbit/memcache) and API. > > > > What would I want to look for when sizing those control-nodes? I've not been > > able to find any references for this at all, just rather nebulous '8G RAM > > should do' which is around what our rabbit currently inhales. 
> > You might want to check out Performance Docs: > https://docs.openstack.org/developer/performance-docs/ > > For configuration tips, I'd suggest looking at what openstack-ansible > or similar projects provide as "battle-tested" configuration. > It's a good baseline reference before you tune yourself. > Problem is: For all I know this is a non-tuned openstack-ansible-setup. I guess I'll have to figure out why it's using way more memory than it should (and run out every now and then). Thanks, -- cheers, Hardy From arbermejo0417 at gmail.com Thu May 2 14:23:07 2019 From: arbermejo0417 at gmail.com (Alejandro Ruiz Bermejo) Date: Thu, 2 May 2019 10:23:07 -0400 Subject: [ETCD] client: etcd member http://controller:2379 has no leader Message-ID: Hi, i'm installing Zun in Openstack Queens with Ubuntu 18.04.1 LTS, i already have configured docker and kuyr-libnetwork. I'm following the guide at https://docs.openstack.org/zun/queens/install/index.html. I followed all the steps of the installation at controller node and everything resulted without problems. After finished the installation direction at compute node the *systemctl status zun-compute* have the following errors root at compute /h/team# systemctl status zun-compute ● zun-compute.service - OpenStack Container Service Compute Agent Loaded: loaded (/etc/systemd/system/zun-compute.service; enabled; vendor preset: enabled) Active: active (running) since Tue 2019-04-30 16:46:56 UTC; 4h 26min ago Main PID: 2072 (zun-compute) Tasks: 1 (limit: 4915) CGroup: /system.slice/zun-compute.service └─2072 /usr/bin/python /usr/local/bin/zun-compute Apr 30 16:46:56 compute systemd[1]: Started OpenStack Container Service Compute Agent. Apr 30 16:46:57 compute zun-compute[2072]: 2019-04-30 16:46:57.929 2072 INFO zun.cmd.compute [-] Starting server in PID 2072 Apr 30 16:46:57 compute zun-compute[2072]: 2019-04-30 16:46:57.941 2072 INFO zun.container.driver [-] Loading container driver 'docker.driver.DockerDriver' Apr 30 16:46:58 compute zun-compute[2072]: 2019-04-30 16:46:58.028 2072 INFO zun.container.driver [-] Loading container driver 'docker.driver.DockerDriver' Apr 30 16:48:33 compute zun-compute[2072]: 2019-04-30 16:48:33.645 2072 INFO zun.image.driver [req-7e0b8325-1e09-4410-80f4-af807cbc0420 a16c6ef0319b4643a4ec8e56a1d025cb 59065d8f970b467aa94ef7b35f1edab5 default - -] Loading container image driver 'glance' Apr 30 16:48:33 compute zun-compute[2072]: 2019-04-30 16:48:33.911 2072 INFO zun.image.driver [req-7e0b8325-1e09-4410-80f4-af807cbc0420 a16c6ef0319b4643a4ec8e56a1d025cb 59065d8f970b467aa94ef7b35f1edab5 default - -] Loading container image driver 'glance' Apr 30 16:48:35 compute zun-compute[2072]: 2019-04-30 16:48:35.455 2072 INFO zun.image.driver [req-7e0b8325-1e09-4410-80f4-af807cbc0420 16c6ef0319b4643a4ec8e56a1d025cb 59065d8f970b467aa94ef7b35f1edab5 default - -] Loading container image driver 'glance' Apr 30 16:48:35 compute zun-compute[2072]: 2019-04-30 16:48:35.939 2072 ERROR zun.image.glance.driver [req-7e0b8325-1e09-4410-80f4-af807cbc0420 a16c6ef0319b4643a4ec8e56a1d025cb 59065d8f970b467aa94ef7b35f1edab5 default - -] Imae cirros was not found in glance: ImageNotFound: Image cirros could not be found. 
Apr 30 16:48:35 compute zun-compute[2072]: 2019-04-30 16:48:35.940 2072 INFO zun.image.driver [req-7e0b8325-1e09-4410-80f4-af807cbc0420 a16c6ef0319b4643a4ec8e56a1d025cb 59065d8f970b467aa94ef7b35f1edab5 default - -] Loading container image driver 'docker' Apr 30 16:48:55 compute zun-compute[2072]: 2019-04-30 16:48:55.011 2072 ERROR zun.compute.manager [req-7bfa764a-45b8-4e2f-ac70-84d8bb71b135 - - - - -] Error occurred while calling Docker create API: Docker internal error: 500 Server Error: Internal Server Error ("failed to update store for object typpe *libnetwork.endpointCnt: client: etcd member http://controller:2379 has no leader").: DockerError: Docker internal error: 500 Server Error: Internal Server Error ("failed to update store for object type *libnetwork.endpointtCnt: client: etcd member http://controller:2379 has no leader"). Also *systemctl status docker* show the next output root at compute /h/team# systemctl status docker ● docker.service - Docker Application Container Engine Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled) Drop-In: /etc/systemd/system/docker.service.d └─docker.conf, http-proxy.conf, https-proxy.conf Active: active (running) since Tue 2019-04-30 16:46:25 UTC; 4h 18min ago Docs: https://docs.docker.com Main PID: 1777 (dockerd) Tasks: 21 CGroup: /system.slice/docker.service └─1777 /usr/bin/dockerd --group zun -H tcp://compute:2375 -H unix:///var/run/docker.sock --cluster-store etcd://controller:2379 Apr 30 16:46:20 compute dockerd[1777]: time="2019-04-30T16:46:20.815305836Z" level=warning msg="Your kernel does not support cgroup rt runtime" Apr 30 16:46:20 compute dockerd[1777]: time="2019-04-30T16:46:20.815933695Z" level=info msg="Loading containers: start." Apr 30 16:46:24 compute dockerd[1777]: time="2019-04-30T16:46:24.378526837Z" level=info msg="Default bridge (docker0) is assigned with an IP address 17 Apr 30 16:46:24 compute dockerd[1777]: time="2019-04-30T16:46:24.572558877Z" level=info msg="Loading containers: done." Apr 30 16:46:25 compute dockerd[1777]: time="2019-04-30T16:46:25.198101219Z" level=info msg="Docker daemon" commit=e8ff056 graphdriver(s)=overlay2 vers Apr 30 16:46:25 compute dockerd[1777]: time="2019-04-30T16:46:25.198211373Z" level=info msg="Daemon has completed initialization" Apr 30 16:46:25 compute dockerd[1777]: time="2019-04-30T16:46:25.232286069Z" level=info msg="API listen on /var/run/docker.sock" Apr 30 16:46:25 compute dockerd[1777]: time="2019-04-30T16:46:25.232318790Z" level=info msg="API listen on 10.8.9.58:2375" Apr 30 16:46:25 compute systemd[1]: Started Docker Application Container Engine. Apr 30 16:48:55 compute dockerd[1777]: time="2019-04-30T16:48:55.009820439Z" level=error msg="Handler for POST /v1.26/networks/create returned error: failed to update store for object type *libnetwork.endpointCnt: client: etcd member http://controller:2379 has no leader" When i try to launch an app container as the guide says it shows an Error state and when i run opentack appcontainer show this is the reason of the error status_reason | Docker internal error: 500 Server Error: Internal Server Error ("failed to update store for object type *libnetwork.endpointCnt: client: etcd member http://controller:2379 has no leader") -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at hauschild.it Thu May 2 14:21:32 2019 From: openstack at hauschild.it (Hartwig Hauschild) Date: Thu, 2 May 2019 16:21:32 +0200 Subject: properly sizing openstack controlplane infrastructure In-Reply-To: <6448907c-6aaf-2f91-fe77-48e697c7b80f@debian.org> References: <20190430153021.jhdgri7g2nvpn5vj@alle-irre.de> <6448907c-6aaf-2f91-fe77-48e697c7b80f@debian.org> Message-ID: <20190502142131.llh7udkpgyhncb4d@alle-irre.de> Am 01.05.2019 schrieb Thomas Goirand: > On 4/30/19 5:30 PM, Hartwig Hauschild wrote: > > Also: We're currently running Neutron in OVS-DVR-VXLAN-Configuration. > > Does that properly scale up and above 50+ nodes > > It does, that's not the bottleneck. > Oh, Ok. I've read that OVS-DVR-VXLAN will produce a lot of load on the messaging-system, at least if you enable l2-pop and don't run broadcast. > From my experience, 3 heavy control nodes are really enough to handle > 200+ compute nodes. Though what you're suggesting (separating db & > rabbitmq-server in separate nodes) is a very good idea. > Ah, cool. Then I'll head that way and see how that works out (and how many add-on-services it can take) -- cheers, Hardy From johnsomor at gmail.com Thu May 2 14:44:38 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Thu, 2 May 2019 08:44:38 -0600 Subject: [octavia] anchor discountinued? In-Reply-To: <53e59d83-03e2-a333-1277-03d02ba2120d@gmx.com> References: <78073332-bb86-b00b-6aaf-8e309cbcd160@gmx.com> <53e59d83-03e2-a333-1277-03d02ba2120d@gmx.com> Message-ID: Volodymyr, Correct, Anchor is no longer an OpenStack project and we need to remove the reference to it in our code. Currently there is not another option beyond the built in "local_cert_generator" for this function. Michael On Thu, May 2, 2019 at 7:05 AM Volodymyr Litovka wrote: > > Hi Sa, > > as far as I understand, Octavia uses Barbican for storing certs for TLS offload. While Anchor used for signing certs/keys when doing provisioning of Amphoraes. > > On 5/2/19 3:43 PM, Sa Pham wrote: > > Hi Volodymyr, > > You mean SSL Certificate for Octavia, You can use Barbican. > > > > On Thu, May 2, 2019 at 9:42 PM Volodymyr Litovka wrote: >> >> Dear colleagues, >> >> it seems Anchor, which is used by Octavia as PKI system, is >> discontinued. Is there replacement for Anchor which can be used with >> Octavia? >> >> Thank you. >> >> -- >> Volodymyr Litovka >> "Vision without Execution is Hallucination." -- Thomas Edison >> >> > > > -- > Sa Pham Dang > Master Student - Soongsil University > Kakaotalk: sapd95 > Skype: great_bn > > > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." -- Thomas Edison From johnsomor at gmail.com Thu May 2 14:58:34 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Thu, 2 May 2019 08:58:34 -0600 Subject: [octavia] Error while creating amphora In-Reply-To: References: Message-ID: Volodymyr, It looks like you have enabled "user_data_config_drive" in the octavia.conf file. Is there a reason you need this? If not, please set it to False and it will resolve your issue. It appears we have a python3 bug in the "user_data_config_drive" capability. It is not generally used and appears to be missing test coverage. I have opened a story (bug) on your behalf here: https://storyboard.openstack.org/#!/story/2005553 Michael On Thu, May 2, 2019 at 4:29 AM Volodymyr Litovka wrote: > > Dear colleagues, > > I'm using Openstack Rocky and trying to launch Octavia 4.0.0. 
After all installation steps I've got an error during 'openstack loadbalancer create' with the following log: > > DEBUG octavia.controller.worker.tasks.compute_tasks [-] Compute create execute for amphora with id d037721f-2cf9-492e-99cb-0be5874da0f6 execute /opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py:63 > ERROR octavia.controller.worker.tasks.compute_tasks [-] Compute create for amphora id: d037721f-2cf9-492e-99cb-0be5874da0f6 failed: TypeError: can't concat str to bytes > ERROR octavia.controller.worker.tasks.compute_tasks Traceback (most recent call last): > ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py", line 94, in execute > ERROR octavia.controller.worker.tasks.compute_tasks config_drive_files) > ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/user_data_jinja_cfg.py", line 38, in build_user_data_config > ERROR octavia.controller.worker.tasks.compute_tasks return self.agent_template.render(user_data=user_data) > ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/asyncsupport.py", line 76, in render > ERROR octavia.controller.worker.tasks.compute_tasks return original_render(self, *args, **kwargs) > ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 1008, in render > ERROR octavia.controller.worker.tasks.compute_tasks return self.environment.handle_exception(exc_info, True) > ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 780, in handle_exception > ERROR octavia.controller.worker.tasks.compute_tasks reraise(exc_type, exc_value, tb) > ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/_compat.py", line 37, in reraise > ERROR octavia.controller.worker.tasks.compute_tasks raise value.with_traceback(tb) > ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/templates/user_data_config_drive.template", line 29, in top-level template code > ERROR octavia.controller.worker.tasks.compute_tasks {{ value|indent(8) }} > ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/filters.py", line 557, in do_indent > ERROR octavia.controller.worker.tasks.compute_tasks s += u'\n' # this quirk is necessary for splitlines method > ERROR octavia.controller.worker.tasks.compute_tasks TypeError: can't concat str to bytes > ERROR octavia.controller.worker.tasks.compute_tasks > WARNING octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-create-amp-for-lb-subflow-octavia-cert-compute-create' (06134192-def9-420c-9feb-0d08a068f3b2) transitioned into state 'FAILURE' from state 'RUNNING' > > Any advises where is the problem? > > My environment: > - Openstack Rocky > - Ubuntu 18.04 > - Octavia installed in virtualenv using pip install: > # pip list |grep octavia > octavia 4.0.0 > octavia-lib 1.1.1 > python-octaviaclient 1.8.0 > > Thank you. > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." -- Thomas Edison > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." 
-- Thomas Edison From jacob.anders.au at gmail.com Thu May 2 15:18:53 2019 From: jacob.anders.au at gmail.com (Jacob Anders) Date: Fri, 3 May 2019 01:18:53 +1000 Subject: [baremetal-sig][ironic][ptg] Bare-metal whitepaper meeting at PTG Message-ID: Hi All, As discussed in the forum session earlier in the week, I would like to put together a session at the PTG for the Bare-metal SIG members to discuss the Bare Metal Whitepaper work and plan out next steps. Ironic schedule for the PTG is pretty tight but how about 4pm on the Friday? We could do this as a breakout in the main Ironic session. Who would be interested/available? Thanks, cheers, Jacob -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Thu May 2 15:25:29 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Thu, 2 May 2019 09:25:29 -0600 Subject: [openstack-dev] [neutron] PTG agenda In-Reply-To: References: <1335a531-9e74-d900-07a9-a6aa4ce285f4@ericsson.com> Message-ID: Awesome, thanks for making sure everything is in working order :-) On Thu, May 2, 2019 at 7:41 AM Lajos Katona wrote: > Sorry, > > This is the PTG page: > https://wiki.openstack.org/wiki/PTG/Train/Etherpads > and of course neutron is there..... > > On 2019. 05. 02. 7:35, Lajos Katona wrote: > > Hi Miguel, > > > > Just a note, the pad is not on the "official" list of pads here: > > https://wiki.openstack.org/wiki/Forum/Denver2019 > > > > Regards > > Lajos > > > > On 2019. 04. 29. 16:46, Miguel Lavalle wrote: > >> Hi Neutrinos,, > >> > >> I took your proposals for PTG topics and organized them in an agenda. > >> Please look at > >> https://etherpad.openstack.org/p/openstack-networking-train-ptg. > >> Let's have a very productive meeting! > >> > >> Best regards > >> > >> Miguel > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pawel.konczalski at everyware.ch Thu May 2 15:48:22 2019 From: pawel.konczalski at everyware.ch (Pawel Konczalski) Date: Thu, 2 May 2019 17:48:22 +0200 Subject: kube_cluster_deploy fails In-Reply-To: <497c1efd-a2af-6958-7e11-ae8e38eb4df9@everyware.ch> References: <0f00a092-1f7d-e85b-9ce4-da38cfd2c9da@everyware.ch> <5080c19a-3c98-8c13-eec1-49706d3e591c@everyware.ch> <497c1efd-a2af-6958-7e11-ae8e38eb4df9@everyware.ch> Message-ID: <42689a26-358f-09b9-6dfe-5a5b57b916ba@everyware.ch> Also you have to ensure that swap is disabled in the Kubernetes master and minion flavor(s). 
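A quick way to sanity-check the swap point: the flavor used for masters and minions should have no swap disk, and swap should be off inside the node image. A hedged sketch (the flavor name matches the create command just below; swapon/swapoff are plain Linux, not Magnum-specific):

    # the "swap" field of the flavor should be empty or 0
    openstack flavor show m1.kubernetes -c swap

    # inside a node, no output from swapon means swap is off
    swapon --show
    # if anything shows up, disable it and remove the corresponding /etc/fstab entry
    sudo swapoff -a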
The following commands should result in a working Kubernetes deploy process:

# Create image for Kubernetes VMs
wget https://download.fedoraproject.org/pub/alt/atomic/stable/Fedora-29-updates-20190429.0/AtomicHost/x86_64/images/Fedora-AtomicHost-29-20190429.0.x86_64.raw.xz
xz -d Fedora-AtomicHost-29-20190429.0.x86_64.raw.xz

openstack image create "Fedora AtomicHost 29" \
  --file Fedora-AtomicHost-29-20190429.0.x86_64.raw \
  --disk-format raw \
  --container-format=bare \
  --min-disk 10 \
  --min-ram 4096 \
  --public \
  --protected \
  --property os_distro=fedora-atomic \
  --property os_admin_user=fedora \
  --property os_version="20190429.0"

# Create flavor for Kubernetes cluster
openstack flavor create m1.kubernetes \
  --disk 40 \
  --vcpu 2 \
  --ram 4096 \
  --public

# Create Kubernetes template
openstack coe cluster template create kubernetes-cluster-template \
  --image "Fedora AtomicHost 29" \
  --external-network public \
  --dns-nameserver 8.8.8.8 \
  --master-flavor m1.kubernetes \
  --flavor m1.kubernetes \
  --coe kubernetes \
  --volume-driver cinder \
  --network-driver flannel \
  --docker-volume-size 40

# Create Kubernetes cluster
openstack coe cluster create kubernetes-cluster \
  --cluster-template kubernetes-cluster-template \
  --master-count 1 \
  --node-count 2 \
  --keypair mykey

BR Pawel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 5227 bytes
Desc: not available
URL:

From daniel at speichert.pl  Thu May  2 16:02:46 2019
From: daniel at speichert.pl (Daniel Speichert)
Date: Thu, 2 May 2019 18:02:46 +0200 (CEST)
Subject: [self-healing] live-migrate instance in response to fault signals
In-Reply-To:
References:
Message-ID: <1640608910.1064843.1556812966934.JavaMail.zimbra@speichert.pl>

----- Original Message -----
> From: "Eric K"
> To: "openstack-discuss"
> Sent: Wednesday, May 1, 2019 4:59:57 PM
> Subject: [self-healing] live-migrate instance in response to fault signals
...
>
> I just want to follow up to get more info on the context;
> specifically, which of the following pieces are the main difficulties?
> - detecting the failure/soft-fail/early failure indication
> - codifying how to respond to each failure scenario
> - triggering/executing the desired workflow
> - something else
>
> [1] https://etherpad.openstack.org/p/DEN-self-healing-SIG

We currently attempt to do all of the above using less-than-optimal custom
scripts (using openstacksdk) and pipelines (running Ansible).

I think there is tremendous value in developing at least one tested way to
do all of the above by connecting e.g. Monasca, Mistral and Nova together.
Maybe it's currently somewhat possible - then it's more of a documentation
issue that would benefit operators.

One of the derivative issues is the quality of live-migration in Nova.
(I don't have production-level experience with Rocky/Stein yet.) When we do
lots of live migrations, there is obviously a limit on the number of live
migrations happening at the same time (doing more would be counter
productive). These limits could be smarter/more dynamic in some cases.
There is no immediate action item here right now though.

I would like to begin with putting together all the pieces that currently
work together and go from there - see what's missing.
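On the live-migration throttling point: the main knob on the Nova side is max_concurrent_live_migrations, set per compute node. A minimal nova.conf sketch, assuming the option keeps its current name and section in your release:

    [DEFAULT]
    # how many live migrations one compute host will run in parallel
    # (the default is 1; 0 means unlimited, which is rarely what you want)
    max_concurrent_live_migrations = 2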
-Daniel From mriedemos at gmail.com Thu May 2 16:11:02 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 2 May 2019 10:11:02 -0600 Subject: [forum][sdk][nova] Closing compute API feature gaps in the openstack CLI - recap Message-ID: <799f4669-5c92-5cd5-f8ee-4e9a8baae35a@gmail.com> I wanted to give a quick recap of this Forum session for those that couldn't attend and also find owners. Please reply to this thread if you'd like to sign up for any specific item. The etherpad [1] has the details. To restate the goal: "Identify the major issues and functional gaps (up through Mitaka 2.25) and prioritize which to work on and what the commands should do." We spent the majority of the time talking about existing issues with compute API functionality in openstack CLI, primarily boot-from-volume, live migration and lack of evacuate support (evacuate as in rebuild on a new target host because the source host is dead, not drain a host with live migrations [2]). We then talked through some of the microversion gaps and picked a few to focus on. Based on that, the agreements and priorities are: **High Priority** 1. Make the boot-from-volume experience better by: a) Support type=image for the --block-device-mapping option. b) Add a --boot-from-volume option which will translate to a root --block-device-mapping using the provided --image value (nova will create the root volume under the covers). Owner: TBD (on either) 2. Fix the "openstack server migrate" command We're going to deprecate the --live option and add a new --live-migration option and a --host option. The --host option can be used for requesting a target host for cold migration (omit the --live/--live-migration option for that). Then in a major release we'll drop the --live option and intentionally not add a --force option (since we don't want to support forcing a target host and bypassing the scheduler). Owner: TBD (I would split the 2.56 cold migration --host support from the new --live-migration option review-wise) **Medium Priority** Start modeling migration resources in the openstack CLI, specifically for microversions 2.22-2.24, but note that the GET /os-migrations API is available since 2.1 (so that's probably easiest to add first). The idea is to have a new command resource like: openstack compute migration (list|delete|set) [--server ] Owner: TBD (again this is a series of changes) **Low Priority** Closing other feature gaps can be done on an as-needed basis as we've been doing today. Sean Mooney is working on adding evacuate support, and there are patches in flight (see [3]) for other microversion-specific features. I would like to figure out how to highlight these to the OSC core team on a more regular basis, but we didn't really talk about that. I've been trying to be a type of liaison for these patches and go over them before the core team tries to review them to make sure they match the API properly and are well documented. Does the OSC core team have any suggestions on how I can better socialize what I think is ready for core team review? 
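For context on the high-priority boot-from-volume item above, the closest the current CLI gets is pre-creating the root volume (or reusing an existing one) and passing it with --volume; the flavor, network and volume names below are only placeholders:

    # pre-create an image-backed, bootable root volume, then boot from it
    openstack volume create --image cirros --size 10 --bootable root-vol
    openstack server create --flavor m1.small --network private --volume root-vol vm1

The proposed --boot-from-volume option would collapse this into a single server create call.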
[1] https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps [2] http://www.danplanet.com/blog/2016/03/03/evacuate-in-nova-one-command-to-confuse-us-all/ [3] https://etherpad.openstack.org/p/compute-api-microversion-gap-in-osc -- Thanks, Matt From zigo at debian.org Thu May 2 16:31:45 2019 From: zigo at debian.org (Thomas Goirand) Date: Thu, 2 May 2019 18:31:45 +0200 Subject: properly sizing openstack controlplane infrastructure In-Reply-To: <20190502142131.llh7udkpgyhncb4d@alle-irre.de> References: <20190430153021.jhdgri7g2nvpn5vj@alle-irre.de> <6448907c-6aaf-2f91-fe77-48e697c7b80f@debian.org> <20190502142131.llh7udkpgyhncb4d@alle-irre.de> Message-ID: On 5/2/19 4:21 PM, Hartwig Hauschild wrote: > Am 01.05.2019 schrieb Thomas Goirand: >> On 4/30/19 5:30 PM, Hartwig Hauschild wrote: >>> Also: We're currently running Neutron in OVS-DVR-VXLAN-Configuration. >>> Does that properly scale up and above 50+ nodes >> >> It does, that's not the bottleneck. >> > Oh, Ok. I've read that OVS-DVR-VXLAN will produce a lot of load on the > messaging-system, at least if you enable l2-pop and don't run broadcast. Yes, but that's really not a big problem for a 200+ nodes setup, especially if you dedicate 3 nodes for messaging. >> From my experience, 3 heavy control nodes are really enough to handle >> 200+ compute nodes. Though what you're suggesting (separating db & >> rabbitmq-server in separate nodes) is a very good idea. >> > Ah, cool. Then I'll head that way and see how that works out (and how many > add-on-services it can take) > From tony at bakeyournoodle.com Thu May 2 16:35:23 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Thu, 2 May 2019 10:35:23 -0600 Subject: [stable] propose Tim Burke as stable core In-Reply-To: <1F014297-E404-49B6-BE09-61F4DA478AF5@not.mn> References: <1F014297-E404-49B6-BE09-61F4DA478AF5@not.mn> Message-ID: <20190502163522.GB32106@thor.bakeyournoodle.com> On Wed, May 01, 2019 at 05:14:25PM -0600, John Dickinson wrote: > Tim has been very active in proposing and maintaining patches to > Swift’s stable branches. Of recent (non-automated) backports, Tim has > proposed more than a third of them. Done. Given the smaller number of backports I've judged Tim's understanding of the stable policy from those rather than reviews. Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From mthode at mthode.org Thu May 2 16:37:56 2019 From: mthode at mthode.org (Matthew Thode) Date: Thu, 2 May 2019 11:37:56 -0500 Subject: [requirements][qa][all] mock 3.0.0 released In-Reply-To: <4E536AA4-479B-4A84-A1D2-91FF8FAD122C@doughellmann.com> References: <4E536AA4-479B-4A84-A1D2-91FF8FAD122C@doughellmann.com> Message-ID: <20190502163756.srtgbyabxwj3ewvh@mthode.org> On 19-05-02 05:03:29, Doug Hellmann wrote: > There's a major version bump of one of our testing dependencies, so watch for new unit test job failures. > > Doug > > > Begin forwarded message: > > > > From: Chris Withers > > Subject: [TIP] mock 3.0.0 released > > Date: May 2, 2019 at 2:07:34 AM MDT > > To: "testing-in-python at lists.idyll.org" , Python List > > > > Hi All, > > > > I'm pleased to announce the release of mock 3.0.0: > > https://pypi.org/project/mock/ > > > > This brings to rolling backport up to date with cpython master. > > > > It's been a few years since the last release, so I'd be surprised if there weren't some problems. 
> > If you hit any issues, please pin to mock<3 and then: > > > > - If your issue relates to mock functionality, please report in the python tracker: https://bugs.python.org > > > > - If your issue is specific to the backport, please report here: https://github.com/testing-cabal/mock/issues > > > > If you're unsure, go for the second one and we'll figure it out. > > Ack, thanks for the notice -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From miguel at mlavalle.com Thu May 2 16:57:22 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Thu, 2 May 2019 10:57:22 -0600 Subject: [openstack-dev] [neutron] Team picture Message-ID: Dear Neutrinos, Please remember that we will have out team picture taken at 11:50, NEXT TO THE PTG REGISTRATION DESK. Please be there on time Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Thu May 2 17:12:44 2019 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 2 May 2019 11:12:44 -0600 Subject: [Deploy problem] deploy openstack+monasca with kolla-ansible 8.0.0.0rc1. In-Reply-To: References: Message-ID: On Wed, 1 May 2019 at 17:10, Ming-Che Liu wrote: > Hello, > > I deployed openstack+monasca with kolla-ansible 8.0.0.0rc1. > > I follow the steps as mentioned in > https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html > > The setting in my computer's globals.yml as same as [Quick Start] tutorial > (attached file: globals.yml is my setting). > > My machine environment as following: > OS: Ubuntu 16.04 > Kolla-ansible verions: 8.0.0.0rc1 > ansible version: 2.7 > > When I execute [bootstrap-servers] and [prechecks], it seems ok (no fatal > error or any interrupt). > > But when I execute [deploy], it will occur some error about rabbitmq(when > I set enable_rabbitmq:yes) and nova compute service(when I > set enable_rabbitmq:no). > > I have some detail screenshot about the errors as attached files, could > you please help me to solve this problem? > > Thank you very much. > > [Attached file description]: > globals.yml: my computer's setting about kolla-ansible > > As mentioned above, the following pictures show the errors, the rabbitmq > error will occur if I set [enable_rabbitmq:yes], the nova compute service > error will occur if I set [enable_rabbitmq:no]. > Hi Ming-Che, Since Stein, we no longer test Kolla Ansible with Ubuntu 16.04 upstream. Could you try again using Ubuntu 18.04? Regards, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Thu May 2 19:18:00 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 2 May 2019 13:18:00 -0600 Subject: =?UTF-8?Q?Re=3a_=5boslo=5d_Proposing_Herv=c3=a9_Beraud_for_Oslo_Cor?= =?UTF-8?Q?e?= In-Reply-To: <75a2b34b-e46c-8361-1ab3-c910c95a6ecb@nemebean.com> References: <75a2b34b-e46c-8361-1ab3-c910c95a6ecb@nemebean.com> Message-ID: <14c9d853-0e48-dce7-23bc-48623dcbd3f3@nemebean.com> There were no objections and it's been a week, so I've added Hervé to the oslo-core team. Welcome! On 4/24/19 9:42 AM, Ben Nemec wrote: > Hi, > > Hervé has been working on Oslo for a while now and in that time has > shown tremendous growth in his understanding of Oslo and OpenStack. I > think he would make a good addition to the general Oslo core team. 
> Existing Oslo team members (+Keystone, Castellan, and anyone else we > co-own libraries with) please respond with +1/-1. If there are no > objections I'll add him to the ACL next week and we can celebrate in > person. :-) > > Thanks. > > -Ben > From openstack at fried.cc Thu May 2 19:31:02 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 2 May 2019 13:31:02 -0600 Subject: [nova][ptg] Summary: Stein Retrospective Message-ID: <5b287537-9489-b10e-4d52-7a4cbb617d0a@fried.cc> Etherpad: https://etherpad.openstack.org/p/nova-ptg-train-retrospective Summary: - At least one newcomer was very happy with the welcome/support he received. Let's keep up the encouragement and make new long-term contributors - Placement extraction went pretty well. Forum session had no negative energy. (Is this because we planned really well, or because it wasn't that big a deal to begin with?) - Great work and collaboration on the bandwidth series. Framework will set us up nicely for other uses as well (e.g. cyborg). - Runways work. Let's keep using them. - Release themes: some people benefit from their existence; for others they are irrelevant but harmless. So let's keep doing them, since they benefit some. - Good coordination & communication with the TripleO team. - Long commit chains are hard. Things that have helped some people, which should be encouraged for the future: hangouts, videos, and/or emails, supplementing specs, acting as "review guidance". - The Stein release seemed a little less focused than usual. No cause or action was identified. - Pre-PTG emails around Placement were very effective for some, suggested to be used in the future for Nova, though noting the limitations and restricting to the parts of the design process not appropriate for other forums (like specs). Actions: - do themes (to be discussed Saturday 1200) - keep doing runways - (mriedem) hangout/video/review-guide-email for cross-cell resize work - Consider pre-PTG email threads for U efried . From arbermejo0417 at gmail.com Thu May 2 19:55:23 2019 From: arbermejo0417 at gmail.com (Alejandro Ruiz Bermejo) Date: Thu, 2 May 2019 15:55:23 -0400 Subject: [Zun] openstack appcontainer run error Message-ID: I'm having troubles with the verify step of the Zun intallation at Openstack Queens on Ubuntu 18.04 LTS. I previously Posted a trouble with it and already fixed the error you guys pointed at. Now i still can't launch the app container. 
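Since the failure shown below again points at the etcd cluster behind kuryr-libnetwork, a quick health check is worth running first. A hedged sketch, assuming the etcd v2 client that the install guide configures:

    # both commands should agree on a single healthy leader
    etcdctl --endpoints http://controller:2379 cluster-health
    etcdctl --endpoints http://controller:2379 member list
    # in the member list output, exactly one member should report isLeader=true

If the cluster reports no leader or the endpoint times out, fix etcd before retrying the Zun commands.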
It freeze at container_creating task, the shows an error state root at controller /h/team# openstack appcontainer show 4a657ac5-058c-43eb-8cbf-7239ad3c4d76 +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Field | Value | +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | addresses | {} | | links | [{u'href': u' http://controller:9517/v1/containers/4a657ac5-058c-43eb-8cbf-7239ad3c4d76', u'rel': u'self'}, {u'href': u' http://controller:9517/containers/4a657ac5-058c-43eb-8cbf-7239ad3c4d76', u'rel': u'bookmark'}] | | image | cirros | | labels | {} | | disk | 0 | | security_groups | [] | | image_pull_policy | None | | user_id | a16c6ef0319b4643a4ec8e56a1d025cb | | uuid | 4a657ac5-058c-43eb-8cbf-7239ad3c4d76 | | hostname | None | | environment | {} | | memory | None | | project_id | 59065d8f970b467aa94ef7b35f1edab5 | | status | Error | | workdir | None | | auto_remove | False | | status_detail | None | | host | None | | image_driver | docker | | task_state | None | | status_reason | *Docker internal error: 500 Server Error: Internal Server Error ("failed to update store for object type *libnetwork.endpointCnt: client: endpoint http://10.8.9.54:2379 exceeded header timeout"). * | | name | test1 | | restart_policy | {} | | ports | [] | | command | "ping" "8.8.8.8" | | runtime | None | | cpu | None | | interactive | False | +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ I tried to launch another container without defining a network and without executing any command, and it also had the same error. I can launch container from the computer node cli with docker commands, the errors are when i try to launch them from the controller CLI. I run a docker run hello-world at the compute node and everything went fine Wen u runned openstack appcontainer create hello-world i had exactly the same error -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.page at canonical.com Thu May 2 21:57:49 2019 From: james.page at canonical.com (James Page) Date: Thu, 2 May 2019 15:57:49 -0600 Subject: [sig][upgrades] job done Message-ID: Hi All After this mornings PTG session, I'm pleased to propose that we formally end the Upgrades SIG. That’s a “pleased” because we feel that our job as a SIG is done! Upgrades in OpenStack are no longer a "special interest"; they are now an integral part of the philosophy of projects within the OpenStack ecosystem and although there are probably still some rough edges, we don’t think we need a SIG to drive this area forward any longer. So thanks for all of the war stories, best practice discussion and general upgrade related conversation over the Forums, PTG’s and Summits over the last few years - it's been fun! Regards James -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alifshit at redhat.com Thu May 2 22:02:50 2019 From: alifshit at redhat.com (Artom Lifshitz) Date: Thu, 2 May 2019 16:02:50 -0600 Subject: [forum][sdk][nova] Closing compute API feature gaps in the openstack CLI - recap In-Reply-To: <799f4669-5c92-5cd5-f8ee-4e9a8baae35a@gmail.com> References: <799f4669-5c92-5cd5-f8ee-4e9a8baae35a@gmail.com> Message-ID: On Thu, May 2, 2019 at 10:15 AM Matt Riedemann wrote: > > I wanted to give a quick recap of this Forum session for those that > couldn't attend and also find owners. Please reply to this thread if > you'd like to sign up for any specific item. The etherpad [1] has the > details. > > To restate the goal: "Identify the major issues and functional gaps (up > through Mitaka 2.25) and prioritize which to work on and what the > commands should do." > > We spent the majority of the time talking about existing issues with > compute API functionality in openstack CLI, primarily boot-from-volume, > live migration and lack of evacuate support (evacuate as in rebuild on a > new target host because the source host is dead, not drain a host with > live migrations [2]). > > We then talked through some of the microversion gaps and picked a few to > focus on. > > Based on that, the agreements and priorities are: > > **High Priority** > > 1. Make the boot-from-volume experience better by: > > a) Support type=image for the --block-device-mapping option. > > b) Add a --boot-from-volume option which will translate to a root > --block-device-mapping using the provided --image value (nova will > create the root volume under the covers). > > Owner: TBD (on either) I can take this. I'll also work on all the device tagging stuff - both tagged attach in 2.49 ([3] L122) and the original tagged boot devices in 2.32 (which I've added to [3] as a quick note). > 2. Fix the "openstack server migrate" command > > We're going to deprecate the --live option and add a new > --live-migration option and a --host option. The --host option can be > used for requesting a target host for cold migration (omit the > --live/--live-migration option for that). Then in a major release we'll > drop the --live option and intentionally not add a --force option (since > we don't want to support forcing a target host and bypassing the scheduler). > > Owner: TBD (I would split the 2.56 cold migration --host support from > the new --live-migration option review-wise) > > **Medium Priority** > > Start modeling migration resources in the openstack CLI, specifically > for microversions 2.22-2.24, but note that the GET /os-migrations API is > available since 2.1 (so that's probably easiest to add first). The idea > is to have a new command resource like: > > openstack compute migration (list|delete|set) [--server ] > > Owner: TBD (again this is a series of changes) > > **Low Priority** > > Closing other feature gaps can be done on an as-needed basis as we've > been doing today. Sean Mooney is working on adding evacuate support, and > there are patches in flight (see [3]) for other microversion-specific > features. > > I would like to figure out how to highlight these to the OSC core team > on a more regular basis, but we didn't really talk about that. I've been > trying to be a type of liaison for these patches and go over them before > the core team tries to review them to make sure they match the API > properly and are well documented. Does the OSC core team have any > suggestions on how I can better socialize what I think is ready for core > team review? 
> > [1] https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps > [2] > http://www.danplanet.com/blog/2016/03/03/evacuate-in-nova-one-command-to-confuse-us-all/ > [3] https://etherpad.openstack.org/p/compute-api-microversion-gap-in-osc > > -- > > Thanks, > > Matt > > > -- Artom Lifshitz Software Engineer, OpenStack Compute DFG From thierry at openstack.org Thu May 2 22:03:36 2019 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 3 May 2019 00:03:36 +0200 Subject: [sig][upgrades] job done In-Reply-To: References: Message-ID: James Page wrote: > After this mornings PTG session, I'm pleased to propose that we formally > end the Upgrades SIG. > > That’s a “pleased” because we feel that our job as a SIG is done! > > Upgrades in OpenStack are no longer a "special interest"; they are now > an integral part of the philosophy of projects within the OpenStack > ecosystem and although there are probably still some rough edges, we > don’t think we need a SIG to drive this area forward any longer. > > So thanks for all of the war stories, best practice discussion and > general upgrade related conversation over the Forums, PTG’s and Summits > over the last few years - it's been fun! That makes a lot of sense. Upgrades are (1) in a much better shape than a couple of years ago, and (2) are now a general concern with work happening on every team (as the upgrade-checks goal in Stein showed), so a SIG is a bit redundant. Thanks James for your help driving this ! -- Thierry Carrez (ttx) From james.page at canonical.com Thu May 2 22:08:42 2019 From: james.page at canonical.com (James Page) Date: Thu, 2 May 2019 16:08:42 -0600 Subject: [sig][upgrades] job done In-Reply-To: References: Message-ID: On Thu, May 2, 2019 at 3:57 PM James Page wrote: > Hi All > > After this mornings PTG session, I'm pleased to propose that we formally > end the Upgrades SIG. > > That’s a “pleased” because we feel that our job as a SIG is done! > > Upgrades in OpenStack are no longer a "special interest"; they are now an > integral part of the philosophy of projects within the OpenStack ecosystem > and although there are probably still some rough edges, we don’t think we > need a SIG to drive this area forward any longer. > Making this more formal as a proposal - https://review.opendev.org/656878 Cheers James -------------- next part -------------- An HTML attachment was scrubbed... URL: From pshchelokovskyy at mirantis.com Thu May 2 22:19:49 2019 From: pshchelokovskyy at mirantis.com (Pavlo Shchelokovskyy) Date: Thu, 2 May 2019 16:19:49 -0600 Subject: [keystone][heat] security_compliance options and auto-created users In-Reply-To: <7c744974-a22d-517b-765d-d5ea9912d953@redhat.com> References: <7c744974-a22d-517b-765d-d5ea9912d953@redhat.com> Message-ID: Hi all, to follow up on this, I created the following issues: Heat story https://storyboard.openstack.org/#!/story/2005210 , first patch is up https://review.opendev.org/#/c/656884/ Keystone bugs https://bugs.launchpad.net/keystone/+bug/1827431 https://bugs.launchpad.net/keystone/+bug/1827435 I'll work on patches to Keystone next, please review / comment on bugs/stories/patches :-) Cheers, On Wed, Apr 17, 2019 at 9:42 AM Zane Bitter wrote: > On 16/04/19 6:38 AM, Pavlo Shchelokovskyy wrote: > > Hi all, > > > > I am currently looking at options defined in [security_compliance] > > section of keystone.conf [0] and trying to understand how enabling those > > security features may affect other services. 
> > > > The first thing I see is that any project that auto-creates some > > temporary users may be affected. > > Of the top of my head I can recall only Heat and Tempest doing this. > > For Tempest situation is easier as a) tempest can use static credentials > > instead of dynamic ones so it is possible to craft appropriate users > > beforehand and b) those users are relatively short-lived (required for > > limited time). > > In case of Heat though those users are used for deferred auth (like in > > autoscaling) which for long lived stacks can happen at arbitrary time in > > future - which is a problem. > > > > Below is breakdown of options/features possible to set and what problems > > that may pose for Heat and ideas on how to work those around: > > > > - disable_user_account_days_inactive - this may pose a problem for > > deferred auth, and it seems is not possible to override it via user > > "options". IMO there's a need to add a new user option to Keystone to > > ignore this setting for given user, and then use it in Heat to create > > temporary users. > > +1 > > > - lockout failure options (lockout_failure_attempts, lockout_duration) - > > can be overridden by user option, but Heat has to set it first. Also the > > question remains how realistically such problem may arise for an > > auto-created internal user and whether Heat should set this option at all > > Sounds like a DoS waiting to happen if we don't override. > > > - password expiry options > > > (password_expires_days, unique_last_password_count, minimum_password_age) - > > poses a problem for deferred auth, but can be overridden by user option, > > so Heat should definitely set it IMO for users it creates > > +1 > > > - change_password_upon_first_use - poses problem for auto-generated > > users, can be overridden by a user option, but Heat must set it for its > > generated users > > +1 > > > - password strength enforcement > > (password_regex, password_regex_description) - now this is an > > interesting one. Currently Heat creates passwords for its temporary > > users with this function [1] falling back to empty password if a > > resource is not generating one for itself. Depending on regex_password > > setting in keystone, it may or may not be enough to pass the password > > strength check. > > This is technically true, although I implemented it so it should pass > all but the most brain-dead of policies. So I feel like doing nothing is > a valid option ;) > > > I've looked around and (as expected) generating a random string which > > satisfies a pre-defined arbitrary regex is quite a non-trivial task, > > couple of existing Python libraries that can do this note that they > > support only a limited subset of full regex spec. > > Yeah. If we're going to do it I think a more achievable way is by making > the current generator's rules (which essentially consist of a length > plus minimum counts of characters from particular classes) configurable > instead of hard-coded. I always assumed that we might eventually do > this, but didn't build it in at the start because the patch needed to be > backported. > > This is still pretty terrible because it's a configuration option the > operator has to set to match keystone's, and in a different format to > boot. Although, TBH a regex isn't a great choice for how to configure it > in keystone either - it's trivial if you want to force the user to > always use the password "password", but if you want to force the user to > e.g. 
have both uppercase and lowercase characters then you have to do > all kinds of weird lookahead assertions that require a PhD in Python's > specific flavour of regexps. > > As long as we don't try to do something like > https://review.openstack.org/436324 > > Note that Heat has it's own requirements too - one that I discovered is > that the passwords can't contain '$' because of reasons. > > > So it seems that a most simple solution would be to add yet another user > > option to Keystone to ignore password strength enforcement for this > > given user, and amend Heat to set this option as well for internal users > > it creates. > > That also works. > > > We in Heat may also think as to whether it would have any benefit to > > also set the 'lock_password' user option for the auto-created users > > which will prohibit such users to change their passwords via API > themselves. > > I can't think of any real benefit - or for that matter any real harm. > Presumably Heat itself would still be able to change the account's > password later, so it wouldn't stop us from implementing some sort of > rotation thing in the future. > > > I'd very like to hear opinion from Keystone community as most solutions > > I named are 'add new user option to Keystone' :-) > > > > [0] > > > https://opendev.org/openstack/keystone/src/branch/master/keystone/conf/security_compliance.py > > [1] > > > https://opendev.org/openstack/heat/src/branch/master/heat/common/password_gen.py#L112 > > > > Cheers, > > - Pavlo > > -- > > Dr. Pavlo Shchelokovskyy > > Principal Software Engineer > > Mirantis Inc > > www.mirantis.com > > > -- Dr. Pavlo Shchelokovskyy Principal Software Engineer Mirantis Inc www.mirantis.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From stig.openstack at telfer.org Thu May 2 22:31:42 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Thu, 2 May 2019 16:31:42 -0600 Subject: [baremetal-sig] Planet for syndicating bare metal activity? Message-ID: Hi all - Good to see all the activity around bare metal this week. To keep information flowing, would it make sense to implement something like a planet feed for syndicating baremetal blog content from program members, linked to from the landing page https://www.openstack.org/bare-metal/ ? Cheers, Stig -------------- next part -------------- An HTML attachment was scrubbed... URL: From morgan.fainberg at gmail.com Thu May 2 22:30:58 2019 From: morgan.fainberg at gmail.com (Morgan Fainberg) Date: Thu, 2 May 2019 15:30:58 -0700 Subject: [keystone][heat] security_compliance options and auto-created users In-Reply-To: References: <7c744974-a22d-517b-765d-d5ea9912d953@redhat.com> Message-ID: There has been some work to allow for "defaults" for these overrides at, for example, the domain level (all users within a domain). Allowing such defaults based upon ownership would solve the concerns. 
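To make the per-user overrides discussed in this thread concrete, here is a minimal, purely illustrative sketch of setting them through the Keystone v3 API. The endpoint, token and user id below are placeholders, and the option names should be checked against the Keystone release in use:

import requests

KEYSTONE_URL = "http://controller:5000/v3"   # placeholder endpoint
TOKEN = "gAAAA..."                           # an admin-scoped token (placeholder)
USER_ID = "<auto-created-user-id>"           # e.g. a Heat-created user

# Per-user overrides for the security_compliance features discussed above.
body = {
    "user": {
        "options": {
            "ignore_password_expiry": True,
            "ignore_change_password_upon_first_use": True,
            "ignore_lockout_failure_attempts": True,
            "lock_password": True,  # optional, see the discussion above
        }
    }
}

resp = requests.patch(
    "{}/users/{}".format(KEYSTONE_URL, USER_ID),
    headers={"X-Auth-Token": TOKEN},
    json=body,
)
print(resp.status_code, resp.json())

A service such as Heat would presumably pass the same "options" dict at user-create time rather than patching the user afterwards.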
On Thu, May 2, 2019 at 3:22 PM Pavlo Shchelokovskyy < pshchelokovskyy at mirantis.com> wrote: > Hi all, > > to follow up on this, I created the following issues: > Heat story https://storyboard.openstack.org/#!/story/2005210 , first > patch is up https://review.opendev.org/#/c/656884/ > Keystone bugs https://bugs.launchpad.net/keystone/+bug/1827431 > https://bugs.launchpad.net/keystone/+bug/1827435 > > I'll work on patches to Keystone next, please review / comment on > bugs/stories/patches :-) > > Cheers, > > On Wed, Apr 17, 2019 at 9:42 AM Zane Bitter wrote: > >> On 16/04/19 6:38 AM, Pavlo Shchelokovskyy wrote: >> > Hi all, >> > >> > I am currently looking at options defined in [security_compliance] >> > section of keystone.conf [0] and trying to understand how enabling >> those >> > security features may affect other services. >> > >> > The first thing I see is that any project that auto-creates some >> > temporary users may be affected. >> > Of the top of my head I can recall only Heat and Tempest doing this. >> > For Tempest situation is easier as a) tempest can use static >> credentials >> > instead of dynamic ones so it is possible to craft appropriate users >> > beforehand and b) those users are relatively short-lived (required for >> > limited time). >> > In case of Heat though those users are used for deferred auth (like in >> > autoscaling) which for long lived stacks can happen at arbitrary time >> in >> > future - which is a problem. >> > >> > Below is breakdown of options/features possible to set and what >> problems >> > that may pose for Heat and ideas on how to work those around: >> > >> > - disable_user_account_days_inactive - this may pose a problem for >> > deferred auth, and it seems is not possible to override it via user >> > "options". IMO there's a need to add a new user option to Keystone to >> > ignore this setting for given user, and then use it in Heat to create >> > temporary users. >> >> +1 >> >> > - lockout failure options (lockout_failure_attempts, lockout_duration) >> - >> > can be overridden by user option, but Heat has to set it first. Also >> the >> > question remains how realistically such problem may arise for an >> > auto-created internal user and whether Heat should set this option at >> all >> >> Sounds like a DoS waiting to happen if we don't override. >> >> > - password expiry options >> > >> (password_expires_days, unique_last_password_count, minimum_password_age) - >> > poses a problem for deferred auth, but can be overridden by user >> option, >> > so Heat should definitely set it IMO for users it creates >> >> +1 >> >> > - change_password_upon_first_use - poses problem for auto-generated >> > users, can be overridden by a user option, but Heat must set it for its >> > generated users >> >> +1 >> >> > - password strength enforcement >> > (password_regex, password_regex_description) - now this is an >> > interesting one. Currently Heat creates passwords for its temporary >> > users with this function [1] falling back to empty password if a >> > resource is not generating one for itself. Depending on regex_password >> > setting in keystone, it may or may not be enough to pass the password >> > strength check. >> >> This is technically true, although I implemented it so it should pass >> all but the most brain-dead of policies. 
So I feel like doing nothing is >> a valid option ;) >> >> > I've looked around and (as expected) generating a random string which >> > satisfies a pre-defined arbitrary regex is quite a non-trivial task, >> > couple of existing Python libraries that can do this note that they >> > support only a limited subset of full regex spec. >> >> Yeah. If we're going to do it I think a more achievable way is by making >> the current generator's rules (which essentially consist of a length >> plus minimum counts of characters from particular classes) configurable >> instead of hard-coded. I always assumed that we might eventually do >> this, but didn't build it in at the start because the patch needed to be >> backported. >> >> This is still pretty terrible because it's a configuration option the >> operator has to set to match keystone's, and in a different format to >> boot. Although, TBH a regex isn't a great choice for how to configure it >> in keystone either - it's trivial if you want to force the user to >> always use the password "password", but if you want to force the user to >> e.g. have both uppercase and lowercase characters then you have to do >> all kinds of weird lookahead assertions that require a PhD in Python's >> specific flavour of regexps. >> >> As long as we don't try to do something like >> https://review.openstack.org/436324 >> >> Note that Heat has it's own requirements too - one that I discovered is >> that the passwords can't contain '$' because of reasons. >> >> > So it seems that a most simple solution would be to add yet another >> user >> > option to Keystone to ignore password strength enforcement for this >> > given user, and amend Heat to set this option as well for internal >> users >> > it creates. >> >> That also works. >> >> > We in Heat may also think as to whether it would have any benefit to >> > also set the 'lock_password' user option for the auto-created users >> > which will prohibit such users to change their passwords via API >> themselves. >> >> I can't think of any real benefit - or for that matter any real harm. >> Presumably Heat itself would still be able to change the account's >> password later, so it wouldn't stop us from implementing some sort of >> rotation thing in the future. >> >> > I'd very like to hear opinion from Keystone community as most solutions >> > I named are 'add new user option to Keystone' :-) >> > >> > [0] >> > >> https://opendev.org/openstack/keystone/src/branch/master/keystone/conf/security_compliance.py >> > [1] >> > >> https://opendev.org/openstack/heat/src/branch/master/heat/common/password_gen.py#L112 >> > >> > Cheers, >> > - Pavlo >> > -- >> > Dr. Pavlo Shchelokovskyy >> > Principal Software Engineer >> > Mirantis Inc >> > www.mirantis.com >> >> >> > > -- > Dr. Pavlo Shchelokovskyy > Principal Software Engineer > Mirantis Inc > www.mirantis.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From MZavala at StateStreet.com Thu May 2 22:59:07 2019 From: MZavala at StateStreet.com (Zavala, Miguel) Date: Thu, 2 May 2019 22:59:07 +0000 Subject: [Desginate][Infoblox] Using infoblox as a backend to designate Message-ID: Hi all, Ive been trying to get Infoblox integrated with designate and I am running into some issues. Currently I can go to horizon, and create a zone there that then shows in infoblox, but when checking the logs I get :: Could not find 1556226600 for openstack.example. 
on enough nameservers.:: I saw the documentation listed here ,https://docs.openstack.org/designate/queens/admin/backends/infoblox.html, says that I have to set the designate mini-dns server as my external primary. Do I have to have a mini-dns running in order for designate to operate correctly? Im asking because designate has a database so it does not require synchronization like bind 9 does. I currently have a mini-dns setup on my controller node if I do need it. Thank you for reading! Regards, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongbin034 at gmail.com Fri May 3 01:12:06 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Thu, 2 May 2019 21:12:06 -0400 Subject: [Zun] openstack appcontainer run error In-Reply-To: References: Message-ID: Hi Alejandro, The error message "http://10.8.9.54:2379 exceeded header timeout" indicates that Docker Daemon was not able to access the URL "http://10.8.9.54:2379", which is supposed to be the ETCD endpoint. If you run "curl http://10.8.9.54:2379" in compute host. Are you able to reach that endpoint? Best regards, Hongbin On Thu, May 2, 2019 at 3:58 PM Alejandro Ruiz Bermejo < arbermejo0417 at gmail.com> wrote: > I'm having troubles with the verify step of the Zun intallation at > Openstack Queens on Ubuntu 18.04 LTS. I previously Posted a trouble with it > and already fixed the error you guys pointed at. Now i still can't launch > the app container. It freeze at container_creating task, the shows an > error state > > root at controller /h/team# openstack appcontainer show > 4a657ac5-058c-43eb-8cbf-7239ad3c4d76 > > +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | Field | Value > > > | > > +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > | addresses | {} > > > | > | links | [{u'href': u' > http://controller:9517/v1/containers/4a657ac5-058c-43eb-8cbf-7239ad3c4d76', > u'rel': u'self'}, {u'href': u' > http://controller:9517/containers/4a657ac5-058c-43eb-8cbf-7239ad3c4d76', > u'rel': u'bookmark'}] | > | image | cirros > > > | > | labels | {} > > > | > | disk | 0 > > > | > | security_groups | [] > > > | > | image_pull_policy | None > > > | > | user_id | a16c6ef0319b4643a4ec8e56a1d025cb > > > | > | uuid | 4a657ac5-058c-43eb-8cbf-7239ad3c4d76 > > > | > | hostname | None > > > | > | environment | {} > > > | > | memory | None > > > | > | project_id | 59065d8f970b467aa94ef7b35f1edab5 > > > | > | status | Error > > > | > | workdir | None > > > | > | auto_remove | False > > > | > | status_detail | None > > > | > | host | None > > > | > | image_driver | docker > > > | > | task_state | None > > > | > | status_reason | *Docker internal error: 500 Server Error: Internal > Server Error ("failed to update store for object type > *libnetwork.endpointCnt: client: endpoint http://10.8.9.54:2379 > exceeded header timeout"). 
* | > | name | test1 > > > | > | restart_policy | {} > > > | > | ports | [] > > > | > | command | "ping" "8.8.8.8" > > > | > | runtime | None > > > | > | cpu | None > > > | > | interactive | False > > > | > > +-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > > > I tried to launch another container without defining a network and without > executing any command, and it also had the same error. > I can launch container from the computer node cli with docker commands, > the errors are when i try to launch them from the controller CLI. > I run a docker run hello-world at the compute node and everything went > fine > Wen u runned openstack appcontainer create hello-world i had exactly the > same error > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Fri May 3 03:31:55 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 2 May 2019 21:31:55 -0600 Subject: [nova][ptg] Summary: CPU modeling in placement Message-ID: Spec: https://review.openstack.org/#/c/555081/ Summary: Rework the way logical processors are represented/requested in conf/placement/flavors. Stephen has at this point simplified the spec dramatically, reducing it to what may be the smallest possible cohesive unit of work. Even so, we all agreed that this is ugly and messy and will never be perfect. It was therefore agreed to... Action: approve the spec pretty much as is, start slinging code, make progress, refine as we go. efried . From openstack at fried.cc Fri May 3 03:47:58 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 2 May 2019 21:47:58 -0600 Subject: [nova][ptg] Summary: Persistent Memory Message-ID: <374d31c8-39ad-5cba-3827-794dc8a45757@fried.cc> Specs: - Base: https://review.openstack.org/601596 - Libvirt: https://review.openstack.org/622893 Patches: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/virtual-persistent-memory Summary/agreements: - Support persistent memory in units of "namespaces" with custom resource class names. - Namespaces to be pre-carved out by admin/deployer (not by Nova). - Custom RC names mapped to byte sizes via "conf" [1] so virt driver can know how to map them back to the real resources. - "Ignore NUMA for now" (sean-k-mooney will have to tell you what that means exactly). - Spec needs to list support-or-not for all instance lifecycle operations. - Keep one spec for base enablement and one for libvirt, but make sure the right bits are in the right spec. efried [1] There has been a recurring theme of needing "some kind of config" - not necessarily nova.conf or any oslo.config - that can describe: - Resource provider name/uuid/parentage, be it an existing provider (the compute root RP in this case) or a new nested provider; - Inventory (e.g. pmem namespace resource in this case); - Physical resource(s) to which the inventory corresponds; - Traits, aggregates, other? As of this writing, no specifics have been decided, even to the point of positing that it could be the same file for some/all of the specs for which the issue arose. 
From openstack at fried.cc Fri May 3 03:59:15 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 2 May 2019 21:59:15 -0600 Subject: [nova][ptg] Summary: Using Forbidden Aggregates Message-ID: <18a3542d-68d2-fd29-253f-880e54f12369@fried.cc> Spec: https://review.opendev.org/#/c/609960/ Summary: - TL;DR: Allows you to say "You can't land on a host that does X unless you specifically require X". Example: Keep my Windows-licensed hosts for Windows instances. - Exploit placement enablement for forbidden aggregates [1] in Nova - Set (nova) aggregate metadata with a syntax similar/identical to that of extra_specs for required traits (e.g. 'trait:CUSTOM_WINDOWS_ONLY': 'required') - During scheduling, nova will discover all aggregates with metadata of this form. For each: - Construct a list of the traits in the aggregate metadata - Subtract traits required by the server request's flavor+image. - If any traits from the aggregate remain, add this aggregate's UUID (which corresponds to a placement aggregate) to the list of "forbidden aggregates" for the GET /allocation_candidates request. Agreements: - The "discover all aggregates" bit has the potential to be slow, but is better than the alternative, which was having the admin supply the same information in a confusing conf syntax. And if performance becomes a problem, we can deal with it later; this does not paint us into a corner. - Spec has overall support, but a few open questions. Answer those, and we're good to approve and move forward. efried [1] https://docs.openstack.org/placement/latest/specs/train/approved/2005297-negative-aggregate-membership.html From openstack at fried.cc Fri May 3 04:03:55 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 2 May 2019 22:03:55 -0600 Subject: [nova][ptg] Summary: Corner case issues with root volume detach/attach Message-ID: Etherpad: https://etherpad.openstack.org/p/detach-attach-root-volume-corner-cases Summary (copied verbatim from the bottom of the etherpad - ask mriedem if further explanation is needed): - Allow attaching a new root volume with a tag as described [in the etherpad] and/or a multiattach volume, don't restrict on whether or not the existing root volume had a tag or multiattach capability. - During unshelve, before scheduling, modify the RequestSpec (and don't persist it) if the BDMs have a tag or are multiattach (this is honestly an existing bug for unshelve). This is where the compute driver capability traits will be used for pre-filtering. (if unshelving fails with NoValidHost the instance remains in shelve_offloaded state [tested in devstack], so user can detach the volume and retry) - Refactor and re-use the image validation code from rebuild when a new root volume is attached. - Assert the new root volume is bootable. efried . From openstack at fried.cc Fri May 3 04:11:09 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 2 May 2019 22:11:09 -0600 Subject: [nova][ptg] Summary: Extra specs validation Message-ID: <07673fec-c193-1031-b9f0-5d32c65cc124@fried.cc> Spec: https://review.openstack.org/#/c/638734/ Summary: Schema for syntactic validation of flavor extra specs, mainly to address otherwise-silently-ignored fat-fingering of keys and/or values. Agreements: - Do it in the flavor API when extra specs are set (as opposed to e.g. during server create) - One spec, but two stages: 1) For known keys, validate values; do this without a microversion. 
2) Validate keys, which entails - Standard set of keys (by pattern) known to nova - Mechanism for admin to extend the set for snowflake extra specs specific to their deployment / OOT driver / etc. - "Validation" will at least comprise messaging/logging. - Optional "strict mode" making the operation fail is also a possibility. efried . From openstack at fried.cc Fri May 3 04:14:34 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 2 May 2019 22:14:34 -0600 Subject: [nova][ptg] Summary: docs Message-ID: Summary: Nova docs could use some love. Agreement: Consider doc scrub as a mini-theme (cycle themes to be discussed Saturday) to encourage folks to dedicate some amount of time to reading & validating docs, and opening and/or fixing bugs for discovered issues. efried . From openstack at fried.cc Fri May 3 04:22:21 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 2 May 2019 22:22:21 -0600 Subject: [nova][ptg] Summary: Next Steps for QoS Bandwidth Message-ID: <03bedee9-f1c2-dac1-af2e-83408cbe66d9@fried.cc> Blueprints/Specs: - Live migration support: https://review.opendev.org/#/c/652608 - Grab bag for other stuff: https://blueprints.launchpad.net/nova/+spec/enhance-support-for-ports-having-resource-request - Request group to resource provider mapping (needs to be moved to a placement spec): https://review.opendev.org/#/c/597601/ Agreements: - No microversion for adding migration operation support (or at least propose it that way in the spec and let discussion ensue there) - Can the live migration support depend on the existence of multiple portbinding or we have to support the old codepath as well when the port binding is created by the nova-compute on the destination host? => Yes, this extension cannot be turned off - Pull the trigger on rg-to-rp mappings in placement. This is also needed by other efforts (cyborg and VGPU at least). - Tag PFs in the PciDeviceSpec, and tag the corresponding RP indicating that it can do that. Require the trait - refuse to land on a host that can't do this, because the assignment will fail late. - Default group_policy=none and do the post-filtering on the nova side More discussion related to this topic may occur in the nova/neutron cross-project session, scheduled for Friday at 1400 in the Nova room: https://etherpad.openstack.org/p/ptg-train-xproj-nova-neutron efried . From jeremyfreudberg at gmail.com Fri May 3 04:27:13 2019 From: jeremyfreudberg at gmail.com (Jeremy Freudberg) Date: Fri, 3 May 2019 00:27:13 -0400 Subject: [ironic][neutron][ops] Ironic multi-tenant networking, VMs In-Reply-To: References: Message-ID: Thanks Julia; this is helpful. Thanks also for reading my mind a bit, as I am thinking of the VXLAN case... I can't help but notice that in the Ironic CI jobs, multi tenant networking being used seems to entail VLANs as the tenant network type (instead of VXLAN). Is it just coincidence / how the gate just is, or is it hinting something about how VXLAN and bare metal get along? On Wed, May 1, 2019 at 6:38 PM Julia Kreger wrote: > > Greetings Jeremy, > > Best Practice wise, I'm not directly aware of any. It is largely going > to depend upon your Neutron ML2 drivers and network fabric. > > In essence, you'll need an ML2 driver which supports the vnic type of > "baremetal", which is able to able to orchestrate the switch port port > binding configuration in your network fabric. 
If your using vlan > networks, in essence you'll end up with a neutron physical network > which is also a trunk port to the network fabric, and the ML2 driver > would then appropriately tag the port(s) for the baremetal node to the > networks required. In the CI gate, we do this in the "multitenant" > jobs where networking-generic-switch modifies the OVS port > configurations directly. > > If specifically vxlan is what your looking to use between VMs and > baremetal nodes, I'm unsure of how you would actually configure that, > but in essence the VXLANs would still need to be terminated on the > switch port via the ML2 driver. > > In term of Ironic's documentation, If you haven't already seen it, you > might want to check out ironic's multi-tenancy documentation[1]. > > -Julia > > [1]: https://docs.openstack.org/ironic/latest/admin/multitenancy.html > > On Wed, May 1, 2019 at 10:53 AM Jeremy Freudberg > wrote: > > > > Hi all, > > > > I'm wondering if anyone has any best practices for Ironic bare metal > > nodes and regular VMs living on the same network. I'm sure if involves > > Ironic's `neutron` multi-tenant network driver, but I'm a bit hazy on > > the rest of the details (still very much in the early stages of > > exploring Ironic). Surely it's possible, but I haven't seen mention of > > this anywhere (except the very old spec from 2015 about introducing > > ML2 support into Ironic) nor is there a gate job resembling this > > specific use. > > > > Ideas? > > > > Thanks, > > Jeremy > > From openstack at fried.cc Fri May 3 04:36:19 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 2 May 2019 22:36:19 -0600 Subject: [nova][ptg] Summary: Resource Management Daemon Message-ID: Specs: - Base enablement: https://review.openstack.org/#/c/651130/ - Power management using CPU core P state control: https://review.openstack.org/#/c/651024/ - Last-level cache: https://review.openstack.org/#/c/651233/ Summary: - Represent new resources (e.g. last-level cache) which can be used for scheduling. - Resource Management Daemon (RMD) manages the (potentially dynamic) assignment of these resources to VMs. Direction: - There shall be no direct communication between nova-compute (including virt driver) and RMD. - Admin/orchestration to supply "conf" [1] describing the resources. - Nova processes this conf while updating provider trees to make the resources appear appropriately in placement. - Flavors can be designed to request the resources so they are considered and allocated during scheduling. - RMD must do its thing "out of band", e.g. triggered by listening for events (recommended: libvirt events, which are local to the host, rather than nova events) and requesting/introspecting information from flavor/image/placement. - Things not related to resource (like p-state control) can use traits to ensure scheduling to capable hosts. (Also potential to use forbidden aggregates [2] to isolate those hosts to only p-state-needing VMs.) - Delivery mechanism for RMD 'policy' artifacts via an extra spec with an opaque string which may represent e.g. a glance UUID, swift object, etc. efried [1] There has been a recurring theme of needing "some kind of config" - not necessarily nova.conf or any oslo.config - that can describe: - Resource provider name/uuid/parentage, be it an existing provider or a new nested provider; - Inventory (e.g. last-level cache in this case); - Physical resource(s) to which the inventory corresponds (e.g. "cache ways" in this case); - Traits, aggregates, other? 
As of this writing, no specifics have been decided, even to the point of positing that it could be the same file for some/all of the specs for which the issue arose. [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005803.html From openstack at fried.cc Fri May 3 04:41:33 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 2 May 2019 22:41:33 -0600 Subject: [nova][ptg] Summary: Replace python-*client with OpenStack SDK Message-ID: <9b7f4f6a-0355-ad21-d6c7-91f8415d9be7@fried.cc> Blueprint: https://blueprints.launchpad.net/nova/+spec/openstacksdk-in-nova Summary: - Enable use of OpenStack SDK from nova. - Phase out use of python-*client (for * in ironic, glance, cinder, neutron...) eventually removing those deps completely from nova. - SDK capable of using ksa oslo.config options, so no changes necessary in deployments; but deployments can start using clouds.yaml as they choose. Agreement: Do it. Action: Reviewers to look at the blueprint and decide whether a spec is needed. efried . From florian.engelmann at everyware.ch Fri May 3 06:59:47 2019 From: florian.engelmann at everyware.ch (Florian Engelmann) Date: Fri, 3 May 2019 08:59:47 +0200 Subject: [all projects] events aka notifications Message-ID: Hi, most or all openstack services do send notifications/events to the message bus. How to know which notifications are sent? Is there some list of the event names? All the best, Flo -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5230 bytes Desc: not available URL: From doka.ua at gmx.com Fri May 3 07:07:26 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Fri, 3 May 2019 10:07:26 +0300 Subject: [octavia] Error while creating amphora In-Reply-To: References: Message-ID: <867dde2f-83ca-63ce-5ee7-bfa962ff46aa@gmx.com> Hi Michael, the reason is my personal perception that file injection is quite legacy way and I even didn't know whether it enabed or no in my installation :-) When configdrive is available, I'd prefer to use it in every case. I set "user_data_config_drive" to False and passed this step. Thanks for pointing on this. Now working with next issues launching amphorae, will back soon :-) Thank you. On 5/2/19 5:58 PM, Michael Johnson wrote: > Volodymyr, > > It looks like you have enabled "user_data_config_drive" in the > octavia.conf file. Is there a reason you need this? If not, please > set it to False and it will resolve your issue. > > It appears we have a python3 bug in the "user_data_config_drive" > capability. It is not generally used and appears to be missing test > coverage. > > I have opened a story (bug) on your behalf here: > https://storyboard.openstack.org/#!/story/2005553 > > Michael > > On Thu, May 2, 2019 at 4:29 AM Volodymyr Litovka wrote: >> Dear colleagues, >> >> I'm using Openstack Rocky and trying to launch Octavia 4.0.0. 
After all installation steps I've got an error during 'openstack loadbalancer create' with the following log: >> >> DEBUG octavia.controller.worker.tasks.compute_tasks [-] Compute create execute for amphora with id d037721f-2cf9-492e-99cb-0be5874da0f6 execute /opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py:63 >> ERROR octavia.controller.worker.tasks.compute_tasks [-] Compute create for amphora id: d037721f-2cf9-492e-99cb-0be5874da0f6 failed: TypeError: can't concat str to bytes >> ERROR octavia.controller.worker.tasks.compute_tasks Traceback (most recent call last): >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py", line 94, in execute >> ERROR octavia.controller.worker.tasks.compute_tasks config_drive_files) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/user_data_jinja_cfg.py", line 38, in build_user_data_config >> ERROR octavia.controller.worker.tasks.compute_tasks return self.agent_template.render(user_data=user_data) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/asyncsupport.py", line 76, in render >> ERROR octavia.controller.worker.tasks.compute_tasks return original_render(self, *args, **kwargs) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 1008, in render >> ERROR octavia.controller.worker.tasks.compute_tasks return self.environment.handle_exception(exc_info, True) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 780, in handle_exception >> ERROR octavia.controller.worker.tasks.compute_tasks reraise(exc_type, exc_value, tb) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/_compat.py", line 37, in reraise >> ERROR octavia.controller.worker.tasks.compute_tasks raise value.with_traceback(tb) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/templates/user_data_config_drive.template", line 29, in top-level template code >> ERROR octavia.controller.worker.tasks.compute_tasks {{ value|indent(8) }} >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/filters.py", line 557, in do_indent >> ERROR octavia.controller.worker.tasks.compute_tasks s += u'\n' # this quirk is necessary for splitlines method >> ERROR octavia.controller.worker.tasks.compute_tasks TypeError: can't concat str to bytes >> ERROR octavia.controller.worker.tasks.compute_tasks >> WARNING octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-create-amp-for-lb-subflow-octavia-cert-compute-create' (06134192-def9-420c-9feb-0d08a068f3b2) transitioned into state 'FAILURE' from state 'RUNNING' >> >> Any advises where is the problem? >> >> My environment: >> - Openstack Rocky >> - Ubuntu 18.04 >> - Octavia installed in virtualenv using pip install: >> # pip list |grep octavia >> octavia 4.0.0 >> octavia-lib 1.1.1 >> python-octaviaclient 1.8.0 >> >> Thank you. >> >> -- >> Volodymyr Litovka >> "Vision without Execution is Hallucination." -- Thomas Edison >> >> -- >> Volodymyr Litovka >> "Vision without Execution is Hallucination." 
-- Thomas Edison -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison From florian.engelmann at everyware.ch Fri May 3 09:07:52 2019 From: florian.engelmann at everyware.ch (Florian Engelmann) Date: Fri, 3 May 2019 11:07:52 +0200 Subject: [ceilometer] events are deprecated - true? Message-ID: <3daf12c2-a82a-16b9-515a-206628bc1cff@everyware.ch> Hi, I was wondering if events are still deprecated? https://github.com/openstack/ceilometer/blob/master/doc/source/admin/telemetry-events.rst "Warning Events support is deprecated." But how to handle all those service events if ceilometer will drop the support to validate and store those messages in gnocchi? Is there any longterm plan how to handle billing then? Why should this feature be deprecated? All the best, Flo -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5230 bytes Desc: not available URL: From balazs.gibizer at ericsson.com Fri May 3 13:40:28 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Fri, 3 May 2019 13:40:28 +0000 Subject: [all projects] events aka notifications In-Reply-To: References: Message-ID: <1556890817.16566.0@smtp.office365.com> On Fri, May 3, 2019 at 12:59 AM, Florian Engelmann wrote: > Hi, > > most or all openstack services do send notifications/events to the > message bus. How to know which notifications are sent? Is there some > list of the event names? Nova versioned notifications are documented in [1]. Cheers, gibi [1] https://docs.openstack.org/nova/latest/reference/notifications.html#existing-versioned-notifications > > All the best, > Flo From tetsuro.nakamura.bc at hco.ntt.co.jp Fri May 3 14:22:13 2019 From: tetsuro.nakamura.bc at hco.ntt.co.jp (Tetsuro Nakamura) Date: Fri, 03 May 2019 23:22:13 +0900 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: <1556631941.24201.1@smtp.office365.com> References: <776bc9b18cf33713708c22d893bd2a46d7a899ed.camel@redhat.com> <1CC272501B5BC543A05DB90AA509DED527552AD6@fmsmsx122.amr.corp.intel.com> <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> Message-ID: <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> Sorry for the late response, Here is my thoughts on "resource provider affinity". “The rps are in a same subtree” is equivalent to “there exits an rp which is an ancestor of all the other rps” Therefore, * group_resources=1:2 means “rp2 is a descendent of rp1 (or rp1 is a descendent of rp2.)” We can extend it to cases we have more than two groups: * group_resources=1:2:3 means "both rp2 and rp3 are descendents of rp1 (or both rp1 and rp3 are of rp2 or both rp1 and rp2 are of rp3) Eric's question from PTG yesterday was whether to keep the symmetry between rps, that is, whether to take the conditions enclosed in the parentheses above. I would say yes keep the symmetry because 1. the expression 1:2:3 is more of symmetry. If we want to make it asymmetric, it should express the subtree root more explicitly like 1-2:3 or 1-2:3:4. 2. callers may not be aware of which resource (VCPU or VF) is provided by the upper/lower rp.     IOW, the caller - resource retriever (scheduler) -  doesn't want to know how the reporter - virt driver - has reported the resouces. Note that even in the symmetric world the negative expression jay suggested looks good to me. 
It enables something like: * group_resources=1:2:!3:!4 which means 1 and 2 should be in the same group, but 3 shouldn't be a descendant of 1 or 2, and neither should 4. However, speaking at the design level, the adjacency list model (the so-called naive tree model), which we currently use for nested rps, is not good at retrieving subtrees (compared to e.g. the nested set model [1]). [1] https://en.wikipedia.org/wiki/Nested_set_model I have looked into the recursive SQL CTE (common table expression) feature, which helps us treat subtrees easily in the adjacency list model, in an experimental patch [2], but unfortunately the feature still looks experimental in MySQL, and we don't want to run a query like this for every candidate, do we? :( [2] https://review.opendev.org/#/c/636092/ Therefore, for this specific use case of NUMA affinity, I'd like to alternatively propose bringing a concept of resource group distance into the rp graph: * numa affinity case - group_distance(1:2)=1 * anti numa affinity - group_distance(1:2)>1 which can be realized by looking at the cached adjacency rp (i.e. the parent id). (Supporting group_distance=N (N>1) would be future research, or we implement it anyway and overlook the performance.) One drawback of this is that we can't use it if you create multiple nested layers more than one level deep under the NUMA rps, but is that the case for OvS bandwidth?
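To make the parent-id lookup above concrete, here is a rough Python sketch of a "same subtree" check and a simple distance over the adjacency list (rp id -> parent rp id). The names are hypothetical and not taken from the placement code base:

def ancestors(rp_id, parent_of):
    """Yield rp_id itself, then each of its ancestors up to the root."""
    while rp_id is not None:
        yield rp_id
        rp_id = parent_of.get(rp_id)

def same_subtree(rp_ids, parent_of):
    """True if some rp in rp_ids is an ancestor of (or equal to) all the others."""
    ancestor_sets = {rp: set(ancestors(rp, parent_of)) for rp in rp_ids}
    return any(all(r in ancestor_sets[x] for x in rp_ids) for r in rp_ids)

def distance(a, b, parent_of):
    """Number of hops when one of a/b contains the other, else None."""
    for node, other in ((a, b), (b, a)):
        chain = list(ancestors(other, parent_of))
        if node in chain:
            return chain.index(node)
    return None

# e.g. a compute node with two NUMA rps and a VF under numa0:
parent_of = {"cn": None, "numa0": "cn", "numa1": "cn", "vf0": "numa0"}
print(same_subtree(["numa0", "vf0"], parent_of))  # True
print(same_subtree(["numa1", "vf0"], parent_of))  # False
print(distance("numa0", "vf0", parent_of))        # 1

Checks like these could also be run as post-filtering over already-retrieved allocation candidates rather than inside SQL.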
> However, speaking in the design level, the adjacency list model (so > called naive tree model), which we currently use for nested rps, > is not good at retrieving subtrees Based on my limited understanding, we may want to consider at least initially *not* trying to do this in sql. We can gather the candidates as we currently do and then filter them afterward in python (somewhere in the _merge_candidates flow). > One drawback of this is that we can't use this if you create multiple > nested layers with more than 1 depth under NUMA rps, > but is that the case for OvS bandwidth? If the restriction is because "the SQL is difficult", I would prefer not to introduce a "distance" concept. We've come up with use cases where the nesting isn't simple. > Another alternative is having a "closure table" from where we can > retrieve all the descendent rp ids of an rp without joining tables. > but... online migration cost? Can we consider these optimizations later, if the python-side solution proves non-performant? efried . From sbauza at redhat.com Fri May 3 15:57:38 2019 From: sbauza at redhat.com (Sylvain Bauza) Date: Fri, 3 May 2019 09:57:38 -0600 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> References: <776bc9b18cf33713708c22d893bd2a46d7a899ed.camel@redhat.com> <1CC272501B5BC543A05DB90AA509DED527552AD6@fmsmsx122.amr.corp.intel.com> <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> Message-ID: On Fri, May 3, 2019 at 9:24 AM Eric Fried wrote: > > “The rps are in a same subtree” is equivalent to “there exits an rp > > which is an ancestor of all the other rps” > > ++ > > > I would say yes keep the symmetry because > > > > 1. the expression 1:2:3 is more of symmetry. If we want to make it > > asymmetric, it should express the subtree root more explicitly like > > 1-2:3 or 1-2:3:4. > > 2. callers may not be aware of which resource (VCPU or VF) is provided > > by the upper/lower rp. > > IOW, the caller - resource retriever (scheduler) - doesn't want to > > know how the reporter - virt driver - has reported the resouces. > > This. > > (If we were going to do asymmetric, I agree we would need a clearer > syntax. Another option I thought of was same_subtree1=2,3,!4. But still > prefer symmetric.) > > > It enables something like: > > * group_resources=1:2:!3:!4 > > which means 1 and 2 should be in the same group but 3 shoudn't be the > > descendents of 1 or 2, so as 4. > > In a symmetric world, this one is a little ambiguous to me. Does it mean > 4 shouldn't be in the same subtree as 3 as well? > > First, thanks Tetsuro for investigating ways to support such queries. Very much appreciated. I hope I can dedicate a few time this cycle to see whether I could help with implementing NUMA affinity as I see myself as the first consumer of such thing :-) > > However, speaking in the design level, the adjacency list model (so > > called naive tree model), which we currently use for nested rps, > > is not good at retrieving subtrees > > > Based on my limited understanding, we may want to consider at least > initially *not* trying to do this in sql. We can gather the candidates > as we currently do and then filter them afterward in python (somewhere > in the _merge_candidates flow). 
> > > One drawback of this is that we can't use this if you create multiple > > nested layers with more than 1 depth under NUMA rps, > > but is that the case for OvS bandwidth? > > If the restriction is because "the SQL is difficult", I would prefer not > to introduce a "distance" concept. We've come up with use cases where > the nesting isn't simple. > > > Another alternative is having a "closure table" from where we can > > retrieve all the descendent rp ids of an rp without joining tables. > > but... online migration cost? > > Can we consider these optimizations later, if the python-side solution > proves non-performant? > > Huh, IMHO the whole benefits of having SQL with Placement was that we were getting a fast distributed lock proven safe. Here, this is a read so I don't really bother on any potential contention, but I just wanted to say that if we go this way, we absolutely need to make enough safeguards so that we don't loose the key interest of Placement. This is not trivial either way then. -Sylvain efried > . > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gr at ham.ie Fri May 3 16:51:09 2019 From: gr at ham.ie (Graham Hayes) Date: Fri, 3 May 2019 10:51:09 -0600 Subject: [Desginate][Infoblox] Using infoblox as a backend to designate In-Reply-To: References: Message-ID: <50ed14f2-fe5d-2af2-f165-1360b9832681@ham.ie> Hi, Yes - Designate needs miniDNS to be running for this to work. What we do is create a secondary zone on the InfoBlox server, and it will do a zone transfer from Designate when you update the zone. Thanks, Graham On 02/05/2019 16:59, Zavala, Miguel wrote: > Hi all, > > Ive been trying to get Infoblox integrated with designate and I am > running into some issues. Currently I can go to horizon, and create a > zone there that then shows in infoblox, but when checking the logs I get > :: Could not find 1556226600 for openstack.example. on enough > nameservers.:: I saw the documentation listed here > ,https://docs.openstack.org/designate/queens/admin/backends/infoblox.html, > says that I have to set the designate mini-dns server as my external > primary. Do I have to have a mini-dns running in order for > designate to operate correctly? Im asking because designate has a > database so it does not require synchronization like bind 9 does. I > currently have a mini-dns setup on my controller node if I do need it. > Thank you for reading! > > Regards, > > Miguel > From ed at leafe.com Fri May 3 17:02:00 2019 From: ed at leafe.com (Ed Leafe) Date: Fri, 3 May 2019 11:02:00 -0600 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> References: <776bc9b18cf33713708c22d893bd2a46d7a899ed.camel@redhat.com> <1CC272501B5BC543A05DB90AA509DED527552AD6@fmsmsx122.amr.corp.intel.com> <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> Message-ID: On May 3, 2019, at 8:22 AM, Tetsuro Nakamura wrote: > > I have looked into recursive SQL CTE (common table expression) feature which help us treat subtree easily in adjacency list model in a experimental patch [2], > but unfortunately it looks like the feature is still experimental in MySQL, and we don't want to query like this per every candidates, do we? 
:( At the risk of repeating myself, SQL doesn’t model the relationships among entities involved with either nested providers or shared providers. These relationships are modeled simply in a graph database, avoiding the gymnastics needed to fit them into a relational DB. I have a working model of Placement that has already solved nested providers (any depth), shared providers, project usages, and more. If you have time while at PTG, grab me and I’d be happy to demonstrate. -- Ed Leafe From cdent+os at anticdent.org Fri May 3 17:36:39 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 3 May 2019 11:36:39 -0600 (MDT) Subject: [placement][ptg] Updated Agenda Message-ID: See also the agenda in: https://etherpad.openstack.org/p/placement-ptg-train Yesterday's cross project session with nova [1] was efficient enough that the expected overflow into today has not been necessary. It also filled up Placement's work queue enough that we don't really need to choose more work, just refine the plans. To that end the agenda for today is very open: Friday: 2:30-2:40: Team Photo Rest of the time: Either working with other projects in their rooms (as required), or working on refining plans, writing specs, related in the placement room. With Saturday more concrete when people may have more free time. Saturday: 09:00-??:??: Discuss possibilities with Ironic and Blazar 10:00-??:??: Cinder joins those discussions 13:30-14:30: Implementing nested magic. See http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005815.html 14:30-15:00: Consumer types: https://review.opendev.org/#/c/654799/ 15:00-15:30: Catchup / Documenting Future Actions 15:30-Beer: Retrospective Refactoring and Cleanliness Goals Hacking [1] I'm currently producing some messages summarizing that, but wanted to get these agenda adjustments out first. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From Tushar.Patil at nttdata.com Fri May 3 17:58:44 2019 From: Tushar.Patil at nttdata.com (Patil, Tushar) Date: Fri, 3 May 2019 17:58:44 +0000 Subject: [nova][ptg] Summary: Using Forbidden Aggregates In-Reply-To: <18a3542d-68d2-fd29-253f-880e54f12369@fried.cc> References: <18a3542d-68d2-fd29-253f-880e54f12369@fried.cc> Message-ID: >> - Spec has overall support, but a few open questions. Answer those, and >> we're good to approve and move forward. I have replied to all open questions and fix the nits. I have accepted Tetsuro suggestion to add traits to the compute node resource provider in the nova placement sync_aggregates command if aggregates are configured with metadata with kye/value pair "trait:traits_name=required". Request everyone to kindly review the updated specs. https://review.opendev.org/#/c/609960/ Regards, Tushar Patil ________________________________________ From: Eric Fried Sent: Friday, May 3, 2019 12:59:15 PM To: OpenStack Discuss Subject: [nova][ptg] Summary: Using Forbidden Aggregates Spec: https://review.opendev.org/#/c/609960/ Summary: - TL;DR: Allows you to say "You can't land on a host that does X unless you specifically require X". Example: Keep my Windows-licensed hosts for Windows instances. - Exploit placement enablement for forbidden aggregates [1] in Nova - Set (nova) aggregate metadata with a syntax similar/identical to that of extra_specs for required traits (e.g. 'trait:CUSTOM_WINDOWS_ONLY': 'required') - During scheduling, nova will discover all aggregates with metadata of this form. 
For each: - Construct a list of the traits in the aggregate metadata - Subtract traits required by the server request's flavor+image. - If any traits from the aggregate remain, add this aggregate's UUID (which corresponds to a placement aggregate) to the list of "forbidden aggregates" for the GET /allocation_candidates request. Agreements: - The "discover all aggregates" bit has the potential to be slow, but is better than the alternative, which was having the admin supply the same information in a confusing conf syntax. And if performance becomes a problem, we can deal with it later; this does not paint us into a corner. - Spec has overall support, but a few open questions. Answer those, and we're good to approve and move forward. efried [1] https://docs.openstack.org/placement/latest/specs/train/approved/2005297-negative-aggregate-membership.html Disclaimer: This email and any attachments are sent in strictest confidence for the sole use of the addressee and may contain legally privileged, confidential, and proprietary data. If you are not the intended recipient, please advise the sender by replying promptly to this email and then delete and destroy this email and any attachments without any further use, copying or forwarding. From michele at acksyn.org Fri May 3 17:59:05 2019 From: michele at acksyn.org (Michele Baldessari) Date: Fri, 3 May 2019 19:59:05 +0200 Subject: [oslo][oslo-messaging][nova] Stein nova-api AMQP issue running under uWSGI In-Reply-To: References: <229a2a53-870f-44c3-5e0c-6cfa9d45d0c5@oracle.com> <3275304e-d717-8b89-557e-b650fc4f661a@oracle.com> <20190420063850.GA18527@holtby.speedport.ip> <8b9cb0e4-b3a4-986a-be59-5bba6ae00f4e@nemebean.com> Message-ID: <20190503175904.GA26117@holtby> On Mon, Apr 22, 2019 at 01:21:03PM -0500, Ben Nemec wrote: > > > On 4/22/19 12:53 PM, Alex Schultz wrote: > > On Mon, Apr 22, 2019 at 11:28 AM Ben Nemec wrote: > > > > > > > > > > > > On 4/20/19 1:38 AM, Michele Baldessari wrote: > > > > On Fri, Apr 19, 2019 at 03:20:44PM -0700, iain.macdonnell at oracle.com wrote: > > > > > > > > > > Today I discovered that this problem appears to be caused by eventlet > > > > > monkey-patching. I've created a bug for it: > > > > > > > > > > https://bugs.launchpad.net/nova/+bug/1825584 > > > > > > > > Hi, > > > > > > > > just for completeness we see this very same issue also with > > > > mistral (actually it was the first service where we noticed the missed > > > > heartbeats). iirc Alex Schultz mentioned seeing it in ironic as well, > > > > although I have not personally observed it there yet. > > > > > > Is Mistral also mixing eventlet monkeypatching and WSGI? > > > > > > > Looks like there is monkey patching, however we noticed it with the > > engine/executor. So it's likely not just wsgi. I think I also saw it > > in the ironic-conductor, though I'd have to try it out again. I'll > > spin up an undercloud today and see if I can get a more complete list > > of affected services. It was pretty easy to reproduce. > > Okay, I asked because if there's no WSGI/Eventlet combination then this may > be different from the Nova issue that prompted this thread. It sounds like > that was being caused by a bad interaction between WSGI and some Eventlet > timers. If there's no WSGI involved then I wouldn't expect that to happen. > > I guess we'll see what further investigation turns up, but based on the > preliminary information there may be two bugs here. 
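As a small debugging aside for anyone chasing similar reports: eventlet
can tell you what it has actually patched in a given process, which
helps separate the WSGI-plus-eventlet case from a plain monkey-patching
bug. This is standard eventlet API, nothing project specific:

    from eventlet import patcher

    for name in ('os', 'select', 'socket', 'thread', 'time'):
        print('%-7s patched: %s' % (name, patcher.is_monkey_patched(name)))

    # The unpatched stdlib module is still reachable when a code path
    # (subprocess here) must not run under the green version:
    real_subprocess = patcher.original('subprocess')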
So just to get some closure on this error that we have seen around mistral executor and tripleo with python3: this was due to the ansible action that called subprocess which has a different implementation in python3 and so the monkeypatching needs to be adapted. Review which fixes it for us is here: https://review.opendev.org/#/c/656901/ Damien and I think the nova_api/eventlet/mod_wsgi has a separate root-cause (although we have not spent all too much time on that one yet) cheers. Michele -- Michele Baldessari C2A5 9DA3 9961 4FFB E01B D0BC DDD4 DCCB 7515 5C6D From cdent+os at anticdent.org Fri May 3 18:22:45 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 3 May 2019 12:22:45 -0600 (MDT) Subject: [placement][nova][ptg] Summary: Nested Magic With Placement Message-ID: This message is an attempt to summarize some of the discussions held yesterday during the nova-placement cross project session [1] that were in some way related to the handling of nested providers. There were several individual topics: * NUMA Topology with placement * Subtree affinity with placement * Request group mapping * Resourceless trait filters But they are all closely inter-related, so summarizing the discussion here as a lump. There was some discussion raised about whether representing NUMA topology in placement was worth pursuing as it is not strictly necessary and dang it is sure taking us a long time to get there and will replace an existing set of worms with a new set of worms. The conversation to resolved to: It's worth trying. To make it work there are some adjustments required to how placement operates: * We need to implement a form of the can_split feature (as previously described in [2]) to allow some classes of resource to be satisfied by multiple providers. * The `group_policy=same_tree[...]` concept is needed (as initially proposed in [3]) for affinity (and anti). Some discussion on implementation has started at [4] and there will be in-person discussion in the placement PTG room tomorrow (Saturday) afternoon. * trait and aggregate membership should "flow down" when making any kind of request (numbered or unnumbered). This is closely tied to the implementation of the previous point. * Being able to express 'required' without a 'resources' is required when making an allocation candidates query. * There are several in-flight nova patches where some hacks to flavors are being performed to work around the current lack of this feature. These are okay and safe to carry on with because they are ephemeral. * The number required and resources query parameters need to accept arbitrary strings so it is possible to say things like 'resources_compute' and 'resources_network' to allow conventions to emerge when multiple parties may be involved in manipulating a RequestGroup. * A 'mappings' key will be added to the 'allocations' object in the allocation_candidates response that will support request group mapping. * There will be further discussion on these features Saturday at the PTG starting at 13:30. Actions: * This (Friday) afternoon at the PTG I'll be creating rfe stories associated with these changes. If you'd like to help with that, find me in the placement room (109). We'll work out whether those stories needs specs in the normally processing of the stories. We'll also need to find owners for many of them. * Gibi will be updating the request group mapping spec. 
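To make the proposals above a bit more concrete, a request and response
fragment using the proposed syntax might look roughly like the
following. None of this exists in placement today; the suffix
spellings, the same_subtree parameter and the 'mappings' member are all
illustrations of the items listed above and may change in the specs:

    import urllib.parse

    params = {
        'resources_compute': 'VCPU:4,MEMORY_MB:4096',
        'resources_network': 'NET_BW_EGR_KILOBIT_PER_SEC:1000',
        'required_network': 'CUSTOM_PHYSNET_PHYSNET0',
        'required': 'HW_CPU_X86_AVX2',        # 'required' with no 'resources'
        'same_subtree': '_compute,_network',  # strawman spelling
    }
    print('GET /allocation_candidates?' + urllib.parse.urlencode(params))

    # Proposed addition to each candidate: a 'mappings' member saying
    # which provider(s) satisfied which request group.
    candidate_fragment = {
        'allocations': {},   # per-provider resource amounts elided
        'mappings': {
            '_compute': ['<compute node rp uuid>'],
            '_network': ['<network agent rp uuid>'],
        },
    }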
[1] https://etherpad.openstack.org/p/ptg-train-xproj-nova-placement [2] https://review.opendev.org/#/c/560974/ [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005673.html [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005815.html [5] https://review.opendev.org/#/c/597601/ -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From openstack at fried.cc Fri May 3 18:32:24 2019 From: openstack at fried.cc (Eric Fried) Date: Fri, 3 May 2019 12:32:24 -0600 Subject: [nova][ptg] Summary: API inconsistency cleanups Message-ID: Spec: https://review.openstack.org/#/c/603969/ Summary: The spec describes a bunch of mostly-unrelated-to-each-other API cleanups, listed below. The spec proposes to do them all in a single microversion. Consensus: - 400 for unknown/invalid params in querystring / request body => Do it. - Remove OS-* prefix from request and response field. - Proposed alternative: accept either, return both => Don't do it (either way) - Making server representation always consistent among all APIs returning the complete server representation. => Do it (in the same microversion) - Return ``servers`` field as an empty list in response of GET /os-hypervisors when there are no servers (currently it is omitted) => Do it (in the same microversion) - Consistent error codes on quota exceeded => Don't do it - Lump https://review.opendev.org/#/c/648919/ (change flavors.swap default from '' [string] to 0 [int] in the response) into the same effort? => Do it (in the same microversion) efried . From paye600 at gmail.com Fri May 3 18:48:10 2019 From: paye600 at gmail.com (Roman Gorshunov) Date: Fri, 3 May 2019 20:48:10 +0200 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects Message-ID: Hello Jim, team, I'm from Airship project. I agree with archival of Github mirrors of repositories. One small suggestion: could we have project descriptions adjusted to point to the new location of the source code repository, please? E.g. "The repo now lives at opendev.org/x/y". Thanks to AJaeger & clarkb. Thank you. Best regards, -- Roman Gorshunov From Tim.Bell at cern.ch Fri May 3 18:58:41 2019 From: Tim.Bell at cern.ch (Tim Bell) Date: Fri, 3 May 2019 18:58:41 +0000 Subject: [cinder][ops] Nested Quota Driver Use? In-Reply-To: <20190502003249.GA1432@sm-workstation> References: <20190502003249.GA1432@sm-workstation> Message-ID: We're interested in the overall functionality but I think unified limits is the place to invest and thus would not have any problem deprecating this driver. We'd really welcome this being implemented across all the projects in a consistent way. The sort of functionality proposed in https://techblog.web.cern.ch/techblog/post/nested-quota-models/ would need Nova/Cinder/Manila at miniumum for CERN to switch. So, no objections to deprecation but strong support to converge on unified limits. Tim -----Original Message----- From: Sean McGinnis Date: Thursday, 2 May 2019 at 02:39 To: "openstack-discuss at lists.openstack.org" Subject: [cinder][ops] Nested Quota Driver Use? Hey everyone, I'm hoping to get some feedback from folks, especially operators. In the Liberty release, Cinder introduced the ability to use a Nest Quota Driver to handle cases of heirarchical projects and quota enforcement [0]. I have not heard of anyone actually using this. I also haven't seen any bugs filed, which makes me a little suspicious given how complicated it can be. I would like to know if any operators are using this for nested quotas. 
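For anyone who has not looked at the driver, the nested model boils down
to hierarchical accounting: a parent's limit has to cover its own usage
plus whatever it has handed out to its children. A much-simplified
sketch of that check (not the actual driver code):

    def parent_can_grant(parent, child, resource, new_child_limit,
                         limits, usages, children_of):
        # Raising a child's limit must fit in the parent's unused quota.
        others = (c for c in children_of[parent] if c != child)
        allocated_to_others = sum(limits[c][resource] for c in others)
        committed = (usages[parent][resource] + allocated_to_others
                     + new_child_limit)
        return committed <= limits[parent][resource]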
There is an effort underway for a new mechanism called "unified limits" that will require a lot of modifications to the Cinder code. If this quota driver is not needed, I would like to deprecated it in Train so it can be removed in the U release and hopefully prevent some unnecessary work being done. Any feedback on this would be appreciated. Thanks! Sean [0] https://specs.openstack.org/openstack/cinder-specs/specs/liberty/cinder-nested-quota-driver.html From arbermejo0417 at gmail.com Fri May 3 19:05:42 2019 From: arbermejo0417 at gmail.com (Alejandro Ruiz Bermejo) Date: Fri, 3 May 2019 15:05:42 -0400 Subject: [ZUN] Proxy on Docker + Zun Message-ID: I'm still working on my previous error of the openstack appcontainer run error state: I have Docker working behind a Proxy. As you can see in the Docker info i attach to this mail. I tried to do the curl http://10.8.9.54:2379/health with the proxy environment variable and i got timeout error (without it the curl return the normal healthy state for the etcd cluster). So my question is if i'm having a problem with the proxy configuration and docker commands when i'm executing the openstack appcontainer run. And if you know any use case of someone working with Docker behind a proxy and Zun in the Openstack environment. This is the outputh of # systemctl show --property Environment docker Environment=HTTP_PROXY=http://10.8.7.60:3128/ NO_PROXY=localhost, 127.0.0.0/8,10.8.0.0/16 HTTPS_PROXY=http://10.8.7.60:3128/ And this is the one of root at compute /h/team# docker info Containers: 9 Running: 0 Paused: 0 Stopped: 9 Images: 7 Server Version: 18.09.5 Storage Driver: overlay2 Backing Filesystem: extfs Supports d_type: true Native Overlay Diff: true Logging Driver: json-file Cgroup Driver: cgroupfs Plugins: Volume: local Network: bridge host macvlan null overlay Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog Swarm: inactive Runtimes: runc Default Runtime: runc Init Binary: docker-init containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84 runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30 init version: fec3683 Security Options: apparmor seccomp Profile: default Kernel Version: 4.15.0-48-generic Operating System: Ubuntu 18.04.2 LTS OSType: linux Architecture: x86_64 CPUs: 8 Total Memory: 15.66GiB Name: compute ID: W35H:WCPP:AM3K:NENH:FEOR:S23C:N3FZ:QELB:LLUR:USMJ:IM7W:YMFX Docker Root Dir: /var/lib/docker Debug Mode (client): false Debug Mode (server): false HTTP Proxy: http://10.8.7.60:3128/ HTTPS Proxy: http://10.8.7.60:3128/ No Proxy: localhost,127.0.0.0/8,10.8.0.0/16 Registry: https://index.docker.io/v1/ Labels: Experimental: false Cluster Store: etcd://10.8.9.54:2379 Insecure Registries: 127.0.0.0/8 Live Restore Enabled: false Product License: Community Engine WARNING: API is accessible on http://compute:2375 without encryption. Access to the remote API is equivalent to root access on the host. Refer to the 'Docker daemon attack surface' section in the documentation for more information: https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface WARNING: No swap limit support -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pabelanger at redhat.com Fri May 3 19:05:38 2019 From: pabelanger at redhat.com (Paul Belanger) Date: Fri, 3 May 2019 15:05:38 -0400 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: References: Message-ID: <20190503190538.GB3377@localhost.localdomain> On Fri, May 03, 2019 at 08:48:10PM +0200, Roman Gorshunov wrote: > Hello Jim, team, > > I'm from Airship project. I agree with archival of Github mirrors of > repositories. One small suggestion: could we have project descriptions > adjusted to point to the new location of the source code repository, > please? E.g. "The repo now lives at opendev.org/x/y". > This is something important to keep in mind from infra side, once the repo is read-only, we lose the ability to use the API to change it. >From manage-projects.py POV, we can update the description before flipping the archive bit without issues, just need to make sure we have the ordering correct. Also, there is no API to unarchive a repo from github sadly, for that a human needs to log into github UI and click the button. I have no idea why. - Paul > Thanks to AJaeger & clarkb. > > Thank you. > > Best regards, > -- Roman Gorshunov > From cdent+os at anticdent.org Fri May 3 20:08:00 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 3 May 2019 14:08:00 -0600 (MDT) Subject: [placement][nova][ptg] Summary: Shared resource providers for shared disk on compute hosts Message-ID: See also: https://etherpad.openstack.org/p/ptg-train-xproj-nova-placement There's a spec in progress about turning on support for shared disk providers [1]. We discussed some of the details that need to be resolved and actions that need to be taken. The next action is for Tushar to update the spec to reflect the decisions and alternatives: * For some virt drivers, we need example one or two tools for: * creating a shared disk provider, setting inventory, creating aggregate, adding compute nodes to the aggregate * updating inventory when the (absolute) size of the storage changes These were initially discussed as example tools that live in the placement repo but it might actually be better in nova. There's an abandoned example [2] from long ago. * Other virt drivers (and potentially Ceph w/libvirt if a reliable source of identifier is available) will be able to manage this sort of thing themselves in update_provider_tree. * Other options (for managing the initial management of the shared disk provider) include: * write the provider info into a well-known file on the shared disk * variations on the inventory.yaml file * We would like to have shared disk testing in the gate. Matt has started https://review.opendev.org/#/c/586363/ but it does not test multinode, yet. Note that apart from the sample tools described above, which might be in the placement repo, the required actions here are on the nova side. At least until we find bugs on the placement side resulting from this work. [1] https://review.opendev.org/#/c/650188/ [2] https://review.opendev.org/382613 -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From cdent+os at anticdent.org Fri May 3 20:16:57 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 3 May 2019 14:16:57 -0600 (MDT) Subject: [placement][nova][ptg] Summary: Testing PlacementFixture effectively Message-ID: See also: https://etherpad.openstack.org/p/ptg-train-xproj-nova-placement Nova uses the PlacementFixture (provided by placement) to be able to do functional tests with a real placement API and database. 
This works pretty well but we discovered during the run to the end of Stein that seemingly unrelated changes in placement could break the fixture and bring nova's gate to a halt. Bad. Decision: Placement will run nova's functional tests in its own gate on each change. If it proves to save some time the api_sample tests will be blacklisted. We do not want to whitelist as that will lead to trouble in the future. There was discussion of doing this for osc-placement as well, but since we just saved a bunch of elapsed time with functional tests in osc-placement with https://review.opendev.org/#/c/651939/ and there's no integrated gate criticality with osc-placement, we decided not to. Action: cdent will make a story and do this -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From persia at shipstone.jp Fri May 3 20:49:42 2019 From: persia at shipstone.jp (Emmet Hikory) Date: Sat, 4 May 2019 05:49:42 +0900 Subject: [tc] Proposal: restrict TC activities Message-ID: <20190503204942.GB28010@shipstone.jp> All, I’ve spent the last few years watching the activities of the technical committee , and in recent cycles, I’m seeing a significant increase in both members of our community asking the TC to take action on things, and the TC volunteering to take action on things in the course of internal discussions (meetings, #openstack-tc, etc.). In combination, these trends appear to have significantly increased the amount of time that members of the technical committee spend on “TC work”, and decreased the time that they spend on other activities in OpenStack. As such, I suggest that the Technical Committee be restricted from actually doing anything beyond approval of merges to the governance repository. Firstly, we select members of the technical committee from amongst those of us who have some of the deepest understanding of the entire project and frequently those actively involved in multiple projects and engaged in cross-project coordination on a regular basis. Anything less than this fails to produce enough name recognition for election. As such, when asking the TC to be responsible for activities, we should equally ask whether we wish the very people responsible for the efficiency of our collaboration to cease doing so in favor of whatever we may have assigned to the TC. Secondly, in order to ensure continuity, we need to provide a means for rotation of the TC: this is both to allow folk on the TC to pursue other activities, and to allow folk not on the TC to join the TC and help with governance and coordination. If we wish to increase the number of folk who might be eligible for the TC, we do this best by encouraging them to take on activities that involve many projects or affect activities over all of OpenStack. These activities must necessarily be taken by those not current TC members in order to provide a platform for visibility to allow those doing them to later become TC members. Solutions to both of these issues have been suggested involving changing the size of the TC. If we decrease the size of the TC, it becomes less important to provide mechanisms for new people to develop reputation over the entire project, but this ends up concentrating the work of the TC to a smaller number of hands, and likely reduces the volume of work overall accomplished. 
If we increase the size of the TC, it becomes less burdensome for the TC to take on these activities, but this ends up foundering against the question of who in our community has sufficient experience with all aspects of OpenStack to fill the remaining seats (and how to maintain a suitable set of folk to provide TC continuity). If we instead simply assert that the TC is explicitly not responsible for any activities beyond governance approvals, we both reduce the impact that being elected to the TC has on the ability of our most prolific contributors to continue their activities and provide a means for folk who have expressed interest and initiative to broadly contribute and demonstrate their suitability for nomination in a future TC election Feedback encouraged -- Emmet HIKORY From sfinucan at redhat.com Fri May 3 20:54:12 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 03 May 2019 14:54:12 -0600 Subject: Retiring bilean Message-ID: The Bilean appears to be dead and has had no activity in over two years. I would like to retire the repository. Please let me know if there are any objections. I'm proposing patches now with topic retire-bilean. Stephen From sfinucan at redhat.com Fri May 3 20:55:15 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 03 May 2019 14:55:15 -0600 Subject: Retiring hurricane Message-ID: <6f80aca07d72dd16e190c4396a15cdca39724b72.camel@redhat.com> The x/hurricane repo was created but has not been populated in the two years since. I would like to retire the repository. Please let me know if there are any objections. I'm proposing patches now with topic retire-hurricane. Stephen From sfinucan at redhat.com Fri May 3 20:56:24 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 03 May 2019 14:56:24 -0600 Subject: Retiring ailuropoda Message-ID: <07e22626baf782deb7cbedddafadf9b655612594.camel@redhat.com> The ailuropoda project appears to be dead and has had no activity in nearly three years. I would like to retire the repository. Please let me know if there are any objections. I'm proposing patches now with topic retire-ailuropoda. Stephen From cdent+os at anticdent.org Fri May 3 21:13:38 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 3 May 2019 15:13:38 -0600 (MDT) Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> References: <776bc9b18cf33713708c22d893bd2a46d7a899ed.camel@redhat.com> <1CC272501B5BC543A05DB90AA509DED527552AD6@fmsmsx122.amr.corp.intel.com> <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> Message-ID: On Fri, 3 May 2019, Eric Fried wrote: >> Another alternative is having a "closure table" from where we can >> retrieve all the descendent rp ids of an rp without joining tables. >> but... online migration cost? > > Can we consider these optimizations later, if the python-side solution > proves non-performant? My preference would be that we start with the simplest option (make multiple selects, merge them appropriately in Python) and, as Eric says, if that's not good enough, pursue the optimizations. In fact, I think we should likely pursue the optimizations [1] in any case, but they should come _after_ we have some measurements. Jay provided a proposed algorithm in [2]. 
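Not Jay's algorithm from [2], but to illustrate the shape of "multiple
selects, merged in Python": one simple query per request group, then a
cross-product filtered by whatever predicates (same-subtree included) we
need. Names below are made up; the same-subtree predicate is passed in:

    import itertools

    def merge_candidates(per_group_rps, same_subtree_groups, in_same_subtree):
        # per_group_rps: {'_compute': {rp ids}, '_network': {rp ids}, ...}
        # in_same_subtree: predicate taking an iterable of rp ids.
        groups = sorted(per_group_rps)
        for combo in itertools.product(*(per_group_rps[g] for g in groups)):
            chosen = dict(zip(groups, combo))
            if in_same_subtree(chosen[g] for g in same_subtree_groups):
                yield chosen

Measuring something like that against realistically sized trees is the
part that should come before any SQL cleverness.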
We have a time slot tomorrow (Saturday May 3) at 13:30 to discuss some of the finer points of implementing nested magic [3]. [1] Making placement faster is constantly a goal, but it is a secondary goal. [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005432.html [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005823.html -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From sfinucan at redhat.com Fri May 3 21:17:43 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 03 May 2019 15:17:43 -0600 Subject: Retiring aeromancer Message-ID: <7c289216c6d1079d2c7d9c4c03b3740ebf5a5339.camel@redhat.com> The aeromancer project appears to be dead and has had no activity in over four years. I would like to retire the repository. Please let me know if there are any objections. I'm proposing patches now with topic retire-aeromancer. Stephen From openstack at fried.cc Fri May 3 21:20:07 2019 From: openstack at fried.cc (Eric Fried) Date: Fri, 3 May 2019 15:20:07 -0600 Subject: [nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band Message-ID: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> Summary: When a port is deleted out of band (while still attached to an instance), any associated QoS bandwidth resources are orphaned in placement. Consensus: - Neutron to block deleting a port whose "owner" field is set. - If you really want to do this, null the "owner" field first. - Nova still needs a way to delete the port during destroy. To be discussed. Possibilities: - Nova can null the "owner" field first. - The operation can be permitted with a certain policy role, which Nova would have to be granted. - Other? efried . From sbauza at redhat.com Fri May 3 21:34:55 2019 From: sbauza at redhat.com (Sylvain Bauza) Date: Fri, 3 May 2019 15:34:55 -0600 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: References: <776bc9b18cf33713708c22d893bd2a46d7a899ed.camel@redhat.com> <1CC272501B5BC543A05DB90AA509DED527552AD6@fmsmsx122.amr.corp.intel.com> <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> Message-ID: On Fri, May 3, 2019 at 3:19 PM Chris Dent wrote: > On Fri, 3 May 2019, Eric Fried wrote: > > >> Another alternative is having a "closure table" from where we can > >> retrieve all the descendent rp ids of an rp without joining tables. > >> but... online migration cost? > > > > Can we consider these optimizations later, if the python-side solution > > proves non-performant? > > My preference would be that we start with the simplest option (make > multiple selects, merge them appropriately in Python) and, as Eric > says, if that's not good enough, pursue the optimizations. > > In fact, I think we should likely pursue the optimizations [1] in > any case, but they should come _after_ we have some measurements. > > Jay provided a proposed algorithm in [2]. > > That plan looks good to me, with the slight detail that I want to reinforce the fact that python usage will have a cost anyway, which is to drift us from the perfect world of having a distributed transactional model for free. 
This is to say, we should refrain *at the maximum* any attempt to get rid of SQL and use Python (or other tools) until we get a solid consensus on those tools being as efficient and as accurately possible than the current situation. We have a time slot tomorrow (Saturday May 3) at 13:30 to discuss > some of the finer points of implementing nested magic [3]. > > I'll try to be present. -Sylvain > [1] Making placement faster is constantly a goal, but it is a > secondary goal. > > [2] > http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005432.html > > [3] > http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005823.html > > -- > Chris Dent ٩◔̯◔۶ https://anticdent.org/ > freenode: cdent tw: @anticdent -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at ericsson.com Fri May 3 21:35:23 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Fri, 3 May 2019 21:35:23 +0000 Subject: [nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band In-Reply-To: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> References: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> Message-ID: <1556919312.16566.2@smtp.office365.com> On Fri, May 3, 2019 at 3:20 PM, Eric Fried wrote: > Summary: When a port is deleted out of band (while still attached to > an > instance), any associated QoS bandwidth resources are orphaned in > placement. > > Consensus: > - Neutron to block deleting a port whose "owner" field is set. > - If you really want to do this, null the "owner" field first. > - Nova still needs a way to delete the port during destroy. To be > discussed. Possibilities: > - Nova can null the "owner" field first. > - The operation can be permitted with a certain policy role, which > Nova would have to be granted. > - Other? Two additions: 1) Nova will log an ERROR when the leak happens. (Nova knows the port_id and the RP UUID but doesn't know the size of the allocation to remove it). This logging can be added today. 2) Matt had a point after the session that if Neutron enforces that only unbound port can be deleted then not only Nova needs to be changed to unbound a port before delete it, but possibly other Neutron consumers (Octavia?). Cheers, gibi > efried > . > From sfinucan at redhat.com Fri May 3 21:37:23 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 03 May 2019 15:37:23 -0600 Subject: Retiring mors Message-ID: <8d05d9032e98dca63ff7c00b0b3b43e86f4a367f.camel@redhat.com> The aeromancer project appears to be dead and has had no activity in nearly two years. I would like to retire the repository. Please let me know if there are any objections. I'm proposing patches now with topic retire-mors. Stephen From sfinucan at redhat.com Fri May 3 21:42:49 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 03 May 2019 15:42:49 -0600 Subject: Retiring alexandria Message-ID: The alexandria project appears to be dead and has had no activity in over three years. I would like to retire the repository. Please let me know if there are any objections. I'm proposing patches now with topic retire-alexandria. 
Stephen


From doug at doughellmann.com Fri May 3 22:12:17 2019
From: doug at doughellmann.com (Doug Hellmann)
Date: Fri, 03 May 2019 16:12:17 -0600
Subject: Retiring aeromancer
In-Reply-To: <7c289216c6d1079d2c7d9c4c03b3740ebf5a5339.camel@redhat.com>
References: <7c289216c6d1079d2c7d9c4c03b3740ebf5a5339.camel@redhat.com>
Message-ID:

Stephen Finucane writes:

> The aeromancer project appears to be dead and has had no activity in
> over four years. I would like to retire the repository. Please let
> me know if there are any objections.
>
> I'm proposing patches now with topic retire-aeromancer.
>
> Stephen

That one was mine. Go right ahead.

I recommend that folks look at beagle [1] if they want a command line
tool for submitting searches to codesearch.openstack.org.

[1] https://pypi.org/project/beagle/

--
Doug

From mriedemos at gmail.com Fri May 3 22:10:46 2019
From: mriedemos at gmail.com (Matt Riedemann)
Date: Fri, 3 May 2019 16:10:46 -0600
Subject: [nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band
In-Reply-To: <1556919312.16566.2@smtp.office365.com>
References: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> <1556919312.16566.2@smtp.office365.com>
Message-ID: <5f87ea30-0bdf-31a4-a3f5-0e9d201b3665@gmail.com>

On 5/3/2019 3:35 PM, Balázs Gibizer wrote:
> 2) Matt had a point after the session that if Neutron enforces that
> only unbound port can be deleted then not only Nova needs to be changed
> to unbound a port before delete it, but possibly other Neutron
> consumers (Octavia?).

And potentially Zun, there might be others, Magnum, Heat, idk?

Anyway, this is a thing that has been around forever which admins
shouldn't do, so do we need to prioritize making this change in both
neutron and nova to make two requests to delete a bound port? Or is
just logging the ERROR that you've leaked allocations, tsk tsk, enough?
I tend to think the latter is fine until someone comes along saying
this is really hurting them and they have a valid use case for deleting
bound ports out of band from nova.

--

Thanks,

Matt

From smooney at redhat.com Fri May 3 22:19:48 2019
From: smooney at redhat.com (Sean Mooney)
Date: Fri, 3 May 2019 23:19:48 +0100
Subject: Neutron - Nova cross project topics
Message-ID:

https://etherpad.openstack.org/p/ptg-train-xproj-nova-neutron

PTG summary: below is a summary of the section I led for the cross
project sessions. Hopefully the others can extend this with their
sections too.

Topic: Optional NUMA affinity for neutron ports (sean-k-mooney)
Summary: we will model NUMA affinity of neutron ports via a new QoS
rule type that will be applied to the port. Neutron will communicate
the policy to nova, allowing different policies per interface. The
NUMA policies will be defined in the spec and will likely just be the
ones we already support today in the PCI alias.
AR: sean-k-mooney to write sibling specs for nova and neutron

Topic: track neutron ports in placement.
Summary: nova will create RPs for each SR-IOV PF and apply the NIC
feature flags and physnet as traits. The RP name will contain the PF
netdev name. Neutron L2 agents will add inventories of ports under
existing agent RPs. This will allow us to track the capacity of each
network backend as well as schedule based on NIC feature flags, vnic
type and physnets. Details will be worked out in the specs and it will
target the U cycle.
AR: sean-k-mooney to write sibling specs for nova and neutron for U

Topic: port binding records https://review.openstack.org/#/c/645173/
Summary: os-vif will be extended to contain new fields to record the
connectivity type and ML2 driver that bound the VIF. Each neutron ML2
driver will be modified to add a serialised os-vif object to the
binding response. The nova.network.model.VIF object will be extended
to store the os-vif object. The virt drivers will conditionally skip
calling the nova VIF to os-vif VIF object conversion function and fall
back to the legacy workflow if it is not present in the nova VIF
object. Initially none of the legacy code will be removed until all
ML2 drivers are updated.
AR: Sean and Rodolfo to update the spec, along with a nova spec for the
nova-specific changes.

Topic: boot VMs with unaddressed ports.
https://blueprints.launchpad.net/nova/+spec/boot-vm-with-unaddressed-port
Summary: Agreed we should do this and we should depend on the port
binding records change.
AR: Rodolfo to start coding this up and update the spec.

From balazs.gibizer at ericsson.com Fri May 3 22:37:48 2019
From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=)
Date: Fri, 3 May 2019 22:37:48 +0000
Subject: [nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band
In-Reply-To: <1556919312.16566.2@smtp.office365.com>
References: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> <1556919312.16566.2@smtp.office365.com>
Message-ID: <1556923057.16566.3@smtp.office365.com>

> 1) Nova will log an ERROR when the leak happens. (Nova knows the
> port_id and the RP UUID but doesn't know the size of the allocation to
> remove it). This logging can be added today.

Patch is up with an ERROR log: https://review.opendev.org/#/c/657079/

gibi

From aspiers at suse.com Fri May 3 23:05:25 2019
From: aspiers at suse.com (Adam Spiers)
Date: Fri, 3 May 2019 17:05:25 -0600
Subject: [tc][all][airship] Github mirroring (or lack thereof) for unofficial projects
In-Reply-To: <20190503190538.GB3377@localhost.localdomain>
References: <20190503190538.GB3377@localhost.localdomain>
Message-ID: <20190503230525.a3vxsnliklitnei4@arabian.linksys.moosehall>

Paul Belanger wrote:
>On Fri, May 03, 2019 at 08:48:10PM +0200, Roman Gorshunov wrote:
>>Hello Jim, team,
>>
>>I'm from Airship project. I agree with archival of Github mirrors of
>>repositories.

Which mirror repositories are you referring to here - a subset of the
Airship repos which are no longer needed, or all Airship repo mirrors?

I would prefer the majority of the mirrors not to be archived, for two
reasons which Alan or maybe Matt noted in the Airship discussions this
morning:

1. Some people instinctively go to GitHub search when they want to find
   a software project. Having useful search results for "airship" on
   GitHub increases the discoverability of the project.

2. Some people will judge the liveness of a project by its activity
   metrics as shown on GitHub (e.g. number of recent commits). An
   active mirror helps show that the project is alive and well. In
   contrast, an archived mirror makes it look like the project is dead.

However if you are only talking about a small subset which are no
longer needed, then archiving sounds reasonable.

>>One small suggestion: could we have project descriptions
>>adjusted to point to the new location of the source code repository,
>>please? E.g. "The repo now lives at opendev.org/x/y".
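For illustration, doing exactly that -- updating the description while
the repo is still writable and only then flipping the archive bit (the
ordering point raised below) -- is two calls against the GitHub REST
API. Org/repo names and token handling here are placeholders:

    import requests

    def retire_mirror(owner, repo, new_home, token):
        url = 'https://api.github.com/repos/%s/%s' % (owner, repo)
        headers = {'Authorization': 'token %s' % token,
                   'Accept': 'application/vnd.github.v3+json'}
        # 1. Set the pointer while the repo can still be modified.
        requests.patch(url, headers=headers, json={
            'description': 'The repo now lives at %s' % new_home,
        }).raise_for_status()
        # 2. Archive it; after this the API can no longer change the repo.
        requests.patch(url, headers=headers, json={
            'archived': True,
        }).raise_for_status()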
I agree it's helpful if the top-level README.rst has a sentence like "the authoritative location for this repo is https://...". >This is something important to keep in mind from infra side, once the >repo is read-only, we lose the ability to use the API to change it. > >From manage-projects.py POV, we can update the description before >flipping the archive bit without issues, just need to make sure we have >the ordering correct. > >Also, there is no API to unarchive a repo from github sadly, for that a >human needs to log into github UI and click the button. I have no idea >why. Good points, but unless we're talking about a small subset of Airship repos, I'm a bit puzzled why this is being discussed, because I thought we reached consensus this morning on a) ensuring that all Airship projects are continually mirrored to GitHub, and b) trying to transfer those mirrors from the "openstack" organization to the "airship" one, assuming we can first persuade GitHub to kick out the org-squatters. This transferral would mean that GitHub would automatically redirect requests from https://github.com/openstack/airship-* to https://github.com/airship/... Consensus is documented in lines 107-112 of: https://etherpad.openstack.org/p/airship-ptg-train From johnsomor at gmail.com Fri May 3 23:05:49 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Fri, 3 May 2019 17:05:49 -0600 Subject: [octavia] Error while creating amphora In-Reply-To: <867dde2f-83ca-63ce-5ee7-bfa962ff46aa@gmx.com> References: <867dde2f-83ca-63ce-5ee7-bfa962ff46aa@gmx.com> Message-ID: Yes, with this setting to False, you will use config driver, but it will not use the "user_data" section of data source. Michael On Fri, May 3, 2019 at 1:07 AM Volodymyr Litovka wrote: > > Hi Michael, > > the reason is my personal perception that file injection is quite legacy > way and I even didn't know whether it enabed or no in my installation > :-) When configdrive is available, I'd prefer to use it in every case. > > I set "user_data_config_drive" to False and passed this step. Thanks for > pointing on this. > > Now working with next issues launching amphorae, will back soon :-) > > Thank you. > > On 5/2/19 5:58 PM, Michael Johnson wrote: > > Volodymyr, > > > > It looks like you have enabled "user_data_config_drive" in the > > octavia.conf file. Is there a reason you need this? If not, please > > set it to False and it will resolve your issue. > > > > It appears we have a python3 bug in the "user_data_config_drive" > > capability. It is not generally used and appears to be missing test > > coverage. > > > > I have opened a story (bug) on your behalf here: > > https://storyboard.openstack.org/#!/story/2005553 > > > > Michael > > > > On Thu, May 2, 2019 at 4:29 AM Volodymyr Litovka wrote: > >> Dear colleagues, > >> > >> I'm using Openstack Rocky and trying to launch Octavia 4.0.0. 
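For context on the failure quoted below: it is the classic Python 3
str/bytes mix -- Jinja's indent filter appends a text newline to a value
that, in this code path, is bytes. A tiny illustration of the failure
and of why decoding first avoids it (not the Octavia fix itself, which
is tracked in the story above):

    value = b'#!/bin/sh\necho hello'       # user_data arriving as bytes
    try:
        value + u'\n'                      # what jinja2's indent filter does
    except TypeError as exc:
        print(exc)                         # "can't concat str to bytes"

    safe = value.decode('utf-8') + u'\n'   # decoding first works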
After all installation steps I've got an error during 'openstack loadbalancer create' with the following log: > >> > >> DEBUG octavia.controller.worker.tasks.compute_tasks [-] Compute create execute for amphora with id d037721f-2cf9-492e-99cb-0be5874da0f6 execute /opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py:63 > >> ERROR octavia.controller.worker.tasks.compute_tasks [-] Compute create for amphora id: d037721f-2cf9-492e-99cb-0be5874da0f6 failed: TypeError: can't concat str to bytes > >> ERROR octavia.controller.worker.tasks.compute_tasks Traceback (most recent call last): > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py", line 94, in execute > >> ERROR octavia.controller.worker.tasks.compute_tasks config_drive_files) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/user_data_jinja_cfg.py", line 38, in build_user_data_config > >> ERROR octavia.controller.worker.tasks.compute_tasks return self.agent_template.render(user_data=user_data) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/asyncsupport.py", line 76, in render > >> ERROR octavia.controller.worker.tasks.compute_tasks return original_render(self, *args, **kwargs) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 1008, in render > >> ERROR octavia.controller.worker.tasks.compute_tasks return self.environment.handle_exception(exc_info, True) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 780, in handle_exception > >> ERROR octavia.controller.worker.tasks.compute_tasks reraise(exc_type, exc_value, tb) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/_compat.py", line 37, in reraise > >> ERROR octavia.controller.worker.tasks.compute_tasks raise value.with_traceback(tb) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/templates/user_data_config_drive.template", line 29, in top-level template code > >> ERROR octavia.controller.worker.tasks.compute_tasks {{ value|indent(8) }} > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/filters.py", line 557, in do_indent > >> ERROR octavia.controller.worker.tasks.compute_tasks s += u'\n' # this quirk is necessary for splitlines method > >> ERROR octavia.controller.worker.tasks.compute_tasks TypeError: can't concat str to bytes > >> ERROR octavia.controller.worker.tasks.compute_tasks > >> WARNING octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-create-amp-for-lb-subflow-octavia-cert-compute-create' (06134192-def9-420c-9feb-0d08a068f3b2) transitioned into state 'FAILURE' from state 'RUNNING' > >> > >> Any advises where is the problem? > >> > >> My environment: > >> - Openstack Rocky > >> - Ubuntu 18.04 > >> - Octavia installed in virtualenv using pip install: > >> # pip list |grep octavia > >> octavia 4.0.0 > >> octavia-lib 1.1.1 > >> python-octaviaclient 1.8.0 > >> > >> Thank you. > >> > >> -- > >> Volodymyr Litovka > >> "Vision without Execution is Hallucination." 
-- Thomas Edison > >> > >> -- > >> Volodymyr Litovka > >> "Vision without Execution is Hallucination." -- Thomas Edison > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." -- Thomas Edison > From amotoki at gmail.com Fri May 3 23:22:46 2019 From: amotoki at gmail.com (Akihiro Motoki) Date: Fri, 3 May 2019 17:22:46 -0600 Subject: [nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band In-Reply-To: <5f87ea30-0bdf-31a4-a3f5-0e9d201b3665@gmail.com> References: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> <1556919312.16566.2@smtp.office365.com> <5f87ea30-0bdf-31a4-a3f5-0e9d201b3665@gmail.com> Message-ID: On Fri, May 3, 2019 at 4:11 PM Matt Riedemann wrote: > On 5/3/2019 3:35 PM, Balázs Gibizer wrote: > > 2) Matt had a point after the session that if Neutron enforces that > > only unbound port can be deleted then not only Nova needs to be changed > > to unbound a port before delete it, but possibly other Neutron > > consumers (Octavia?). > > And potentially Zun, there might be others, Magnum, Heat, idk? > > Anyway, this is a thing that has been around forever which admins > shouldn't do, do we need to prioritize making this change in both > neutron and nova to make two requests to delete a bound port? Or is just > logging the ERROR that you've leaked allocations, tsk tsk, enough? I > tend to think the latter is fine until someone comes along saying this > is really hurting them and they have a valid use case for deleting bound > ports out of band from nova. > neutron deines a special role called "advsvc" for advanced network services [1]. I think we can change neutron to block deletion of bound ports for regular users and allow users with "advsvc" role to delete bound ports. I haven't checked which projects currently use "advsvc". [1] https://opendev.org/openstack/neutron/src/branch/master/neutron/conf/policies/port.py#L53-L59 > > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhipengh512 at gmail.com Sat May 4 00:58:38 2019 From: zhipengh512 at gmail.com (Zhipeng Huang) Date: Sat, 4 May 2019 08:58:38 +0800 Subject: [tc] Proposal: restrict TC activities In-Reply-To: <20190503204942.GB28010@shipstone.jp> References: <20190503204942.GB28010@shipstone.jp> Message-ID: Then it might fit the purpose to rename the technical committee to governance committee or other terms. If we have a technical committee not investing time to lead in technical evolvement of OpenStack, it just seems odd to me. TC should be a place good developers aspired to, not retired to. BTW this is not a OpenStack-only issue but I see across multiple open source communities. On Sat, May 4, 2019 at 4:51 AM Emmet Hikory wrote: > All, > I’ve spent the last few years watching the activities of the > technical committee , and in recent cycles, I’m seeing a significant > increase in both members of our community asking the TC to take action > on things, and the TC volunteering to take action on things in the > course of internal discussions (meetings, #openstack-tc, etc.). In > combination, these trends appear to have significantly increased the > amount of time that members of the technical committee spend on “TC > work”, and decreased the time that they spend on other activities in > OpenStack. As such, I suggest that the Technical Committee be > restricted from actually doing anything beyond approval of merges to the > governance repository. 
> > Firstly, we select members of the technical committee from amongst > those of us who have some of the deepest understanding of the entire > project and frequently those actively involved in multiple projects and > engaged in cross-project coordination on a regular basis. Anything less > than this fails to produce enough name recognition for election. As > such, when asking the TC to be responsible for activities, we should > equally ask whether we wish the very people responsible for the > efficiency of our collaboration to cease doing so in favor of whatever > we may have assigned to the TC. > > Secondly, in order to ensure continuity, we need to provide a means > for rotation of the TC: this is both to allow folk on the TC to pursue > other activities, and to allow folk not on the TC to join the TC and > help with governance and coordination. If we wish to increase the > number of folk who might be eligible for the TC, we do this best by > encouraging them to take on activities that involve many projects or > affect activities over all of OpenStack. These activities must > necessarily be taken by those not current TC members in order to provide > a platform for visibility to allow those doing them to later become TC > members. > > Solutions to both of these issues have been suggested involving > changing the size of the TC. If we decrease the size of the TC, it > becomes less important to provide mechanisms for new people to develop > reputation over the entire project, but this ends up concentrating the > work of the TC to a smaller number of hands, and likely reduces the > volume of work overall accomplished. If we increase the size of the TC, > it becomes less burdensome for the TC to take on these activities, but > this ends up foundering against the question of who in our community has > sufficient experience with all aspects of OpenStack to fill the > remaining seats (and how to maintain a suitable set of folk to provide > TC continuity). > > If we instead simply assert that the TC is explicitly not > responsible for any activities beyond governance approvals, we both > reduce the impact that being elected to the TC has on the ability of our > most prolific contributors to continue their activities and provide a > means for folk who have expressed interest and initiative to broadly > contribute and demonstrate their suitability for nomination in a future > TC election > > Feedback encouraged > > -- > Emmet HIKORY > > > -- Zhipeng (Howard) Huang Principle Engineer OpenStack, Kubernetes, CNCF, LF Edge, ONNX, Kubeflow, OpenSDS, Open Service Broker API, OCP, Hyperledger, ETSI, SNIA, DMTF, W3C -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.settle at outlook.com Sat May 4 03:35:38 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Sat, 4 May 2019 03:35:38 +0000 Subject: [nova][ptg] Summary: docs In-Reply-To: References: Message-ID: I know you have Stephen on the team, but let me know if the team also wants to look further into formalising the information architecture and help reviewing patches. Cheers, Alex Get Outlook for Android ________________________________ From: Eric Fried Sent: Thursday, May 2, 2019 10:14:34 PM To: OpenStack Discuss Subject: [nova][ptg] Summary: docs Summary: Nova docs could use some love. Agreement: Consider doc scrub as a mini-theme (cycle themes to be discussed Saturday) to encourage folks to dedicate some amount of time to reading & validating docs, and opening and/or fixing bugs for discovered issues. 
efried . -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongbin034 at gmail.com Sat May 4 05:12:38 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Sat, 4 May 2019 01:12:38 -0400 Subject: [ZUN] Proxy on Docker + Zun In-Reply-To: References: Message-ID: Alejandro, Yes, it might be an proxy issue. According to https://docs.docker.com/config/daemon/systemd/#httphttps-proxy , the NO_PROXY is a list of comma-separated hosts (not a cidr like 10.8.0.0/16 ). So you might want to try: NO_PROXY=localhost,127.0.0.1,10.8.9. 54,... On Fri, May 3, 2019 at 3:09 PM Alejandro Ruiz Bermejo < arbermejo0417 at gmail.com> wrote: > I'm still working on my previous error of the openstack appcontainer run > error state: > > I have Docker working behind a Proxy. As you can see in the Docker info i > attach to this mail. I tried to do the curl http://10.8.9.54:2379/health > with the proxy environment variable and i got timeout error (without it the > curl return the normal healthy state for the etcd cluster). So my question > is if i'm having a problem with the proxy configuration and docker commands > when i'm executing the openstack appcontainer run. And if you know any use > case of someone working with Docker behind a proxy and Zun in the Openstack > environment. > > This is the outputh of > > # systemctl show --property Environment docker > Environment=HTTP_PROXY=http://10.8.7.60:3128/ NO_PROXY=localhost, > 127.0.0.0/8,10.8.0.0/16 HTTPS_PROXY=http://10.8.7.60:3128/ > > And this is the one of > > root at compute /h/team# docker info > Containers: 9 > Running: 0 > Paused: 0 > Stopped: 9 > Images: 7 > Server Version: 18.09.5 > Storage Driver: overlay2 > Backing Filesystem: extfs > Supports d_type: true > Native Overlay Diff: true > Logging Driver: json-file > Cgroup Driver: cgroupfs > Plugins: > Volume: local > Network: bridge host macvlan null overlay > Log: awslogs fluentd gcplogs gelf journald json-file local logentries > splunk syslog > Swarm: inactive > Runtimes: runc > Default Runtime: runc > Init Binary: docker-init > containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84 > runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30 > init version: fec3683 > Security Options: > apparmor > seccomp > Profile: default > Kernel Version: 4.15.0-48-generic > Operating System: Ubuntu 18.04.2 LTS > OSType: linux > Architecture: x86_64 > CPUs: 8 > Total Memory: 15.66GiB > Name: compute > ID: W35H:WCPP:AM3K:NENH:FEOR:S23C:N3FZ:QELB:LLUR:USMJ:IM7W:YMFX > Docker Root Dir: /var/lib/docker > Debug Mode (client): false > Debug Mode (server): false > HTTP Proxy: http://10.8.7.60:3128/ > HTTPS Proxy: http://10.8.7.60:3128/ > No Proxy: localhost,127.0.0.0/8,10.8.0.0/16 > Registry: https://index.docker.io/v1/ > Labels: > Experimental: false > Cluster Store: etcd://10.8.9.54:2379 > Insecure Registries: > 127.0.0.0/8 > Live Restore Enabled: false > Product License: Community Engine > > WARNING: API is accessible on http://compute:2375 without encryption. > Access to the remote API is equivalent to root access on the > host. Refer > to the 'Docker daemon attack surface' section in the > documentation for > more information: > https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface > WARNING: No swap limit support > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From persia at shipstone.jp Sat May 4 13:25:50 2019 From: persia at shipstone.jp (Emmet Hikory) Date: Sat, 4 May 2019 22:25:50 +0900 Subject: [tc] Proposal: restrict TC activities In-Reply-To: References: <20190503204942.GB28010@shipstone.jp> Message-ID: <20190504132550.GA28713@shipstone.jp> Zhipeng Huang wrote: > Then it might fit the purpose to rename the technical committee to > governance committee or other terms. If we have a technical committee not > investing time to lead in technical evolvement of OpenStack, it just seems > odd to me. OpenStack has a rich governance structure, including at least the Foundation Board, the User Committee, and the Technical Committee. Within the context of governance, the Technical Committee is responsible for both technical governance of OpenStack and governance of the technical community. It is within that context that "Technical Committee" is the name. I also agree that it is important that members of the Technical Committee are able to invest time to lead in the technical evolution of OpenStack, and this is a significant reason that I propose that the activities of the TC be restricted, precisely so that being elected does not mean that one no longer is able to invest time for this. > TC should be a place good developers aspired to, not retired to. BTW this > is not a OpenStack-only issue but I see across multiple open source > communities. While I agree that it is valuable to have a target for the aspirations of good developers, I am not convinced that OpenStack can be healthy if we restrict our aspirations to nine seats. From my perspective, this causes enough competition that many excellent folk may never be elected, and that some who wish to see their aspirations fufilled may focus activity in other communities where it may be easier to achieve an arbitrary title. Instead, I suggest that developers should aspire to be leaders in the OpenStack comunuity, and be actively involved in determining the future technical direction of OpenStack. I just don't think there needs to be any correlation between this and the mechanics of reviewing changes to the governance repository. -- Emmet HIKORY From jfrancoa at redhat.com Sat May 4 16:02:03 2019 From: jfrancoa at redhat.com (Jose Luis Franco Arza) Date: Sat, 4 May 2019 18:02:03 +0200 Subject: =?UTF-8?Q?Re=3A_=5Btripleo=5D_Nominate_C=C3=A9dric_Jeanneret_=28Tengu=29_for?= =?UTF-8?Q?_tripleo=2Dvalidations_core?= In-Reply-To: <20190418102939.heykaeyphydgocq4@olivia.strider.local> References: <20190418102939.heykaeyphydgocq4@olivia.strider.local> Message-ID: +1 On Thu, Apr 18, 2019 at 12:32 PM Gaël Chamoulaud wrote: > Hi TripleO devs, > > The new Validation Framework is a big step further for the > tripleo-validations > project. In many ways, it improves the way of detecting & reporting > potential > issues during a TripleO deployment. As the mastermind of this new > framework, > Cédric brought a new lease of life to the tripleo-validations project. > That's > why we would highly benefit from his addition to the core reviewer team. > > Assuming that there are no objections, we will add Cédric to the core team > next > week. > > Thanks, Cédric, for your excellent work! > > =Gaël > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From johnsomor at gmail.com Sat May 4 16:25:08 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Sat, 4 May 2019 10:25:08 -0600 Subject: [nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band In-Reply-To: References: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> <1556919312.16566.2@smtp.office365.com> <5f87ea30-0bdf-31a4-a3f5-0e9d201b3665@gmail.com> Message-ID: I think this will have implications for Octavia, but we can work through those. There are cases during cleanup from an error where we delete ports owned by "Octavia" that have not yet be attached to a nova instance. My understanding of the above discussion is that this would not be an issue under this change. However.... We also, currently, manipulate the ports we have hot-plugged (attached) to nova instances where the port "device_owner" has become "compute:nova", mostly for failover scenarios and cases where nova detach fails and we have to revert the action. Now, if the "proper" new procedure is to first detach before deleting the port, we can look at attempting that. But, in the common failure scenarios we see nova failing to complete this, if for example the compute host has been powered off. In this scenario we still need to delete the neutron port for both resource cleanup and quota reasons. This so we can create a new port and attach it to a new instance to recover. I think this change will impact our current port manage flows, so we should proceed cautiously, test heavily, and potentially address some of the nova failure scenarios at the same time. Michael On Fri, May 3, 2019 at 5:23 PM Akihiro Motoki wrote: > > > > On Fri, May 3, 2019 at 4:11 PM Matt Riedemann wrote: >> >> On 5/3/2019 3:35 PM, Balázs Gibizer wrote: >> > 2) Matt had a point after the session that if Neutron enforces that >> > only unbound port can be deleted then not only Nova needs to be changed >> > to unbound a port before delete it, but possibly other Neutron >> > consumers (Octavia?). >> >> And potentially Zun, there might be others, Magnum, Heat, idk? >> >> Anyway, this is a thing that has been around forever which admins >> shouldn't do, do we need to prioritize making this change in both >> neutron and nova to make two requests to delete a bound port? Or is just >> logging the ERROR that you've leaked allocations, tsk tsk, enough? I >> tend to think the latter is fine until someone comes along saying this >> is really hurting them and they have a valid use case for deleting bound >> ports out of band from nova. > > > neutron deines a special role called "advsvc" for advanced network services [1]. > I think we can change neutron to block deletion of bound ports for regular users and > allow users with "advsvc" role to delete bound ports. > I haven't checked which projects currently use "advsvc". > > [1] https://opendev.org/openstack/neutron/src/branch/master/neutron/conf/policies/port.py#L53-L59 > >> >> >> -- >> >> Thanks, >> >> Matt >> From emilien at redhat.com Sat May 4 16:27:34 2019 From: emilien at redhat.com (Emilien Macchi) Date: Sat, 4 May 2019 10:27:34 -0600 Subject: =?UTF-8?Q?Re=3A_=5Btripleo=5D_Nominate_C=C3=A9dric_Jeanneret_=28Tengu=29_for?= =?UTF-8?Q?_tripleo=2Dvalidations_core?= In-Reply-To: References: <20190418102939.heykaeyphydgocq4@olivia.strider.local> Message-ID: I went ahead and added Cédric to the list of TripleO core (there is no tripleo-validation group in Gerrit). 
On Sat, May 4, 2019 at 10:13 AM Jose Luis Franco Arza wrote: > +1 > > On Thu, Apr 18, 2019 at 12:32 PM Gaël Chamoulaud > wrote: > >> Hi TripleO devs, >> >> The new Validation Framework is a big step further for the >> tripleo-validations >> project. In many ways, it improves the way of detecting & reporting >> potential >> issues during a TripleO deployment. As the mastermind of this new >> framework, >> Cédric brought a new lease of life to the tripleo-validations project. >> That's >> why we would highly benefit from his addition to the core reviewer team. >> >> Assuming that there are no objections, we will add Cédric to the core >> team next >> week. >> >> Thanks, Cédric, for your excellent work! >> >> =Gaël >> > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Sat May 4 16:55:35 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Sat, 4 May 2019 10:55:35 -0600 (MDT) Subject: [placement][nova][ptg] Summary: Testing PlacementFixture effectively In-Reply-To: References: Message-ID: On Fri, 3 May 2019, Chris Dent wrote: > Action: > > cdent will make a story and do this https://review.opendev.org/#/q/topic:story/2005562 -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From balazs.gibizer at ericsson.com Sat May 4 16:57:37 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Sat, 4 May 2019 16:57:37 +0000 Subject: [nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band In-Reply-To: References: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> <1556919312.16566.2@smtp.office365.com> <5f87ea30-0bdf-31a4-a3f5-0e9d201b3665@gmail.com> Message-ID: <1556989044.27606.0@smtp.office365.com> On Sat, May 4, 2019 at 10:25 AM, Michael Johnson wrote: > I think this will have implications for Octavia, but we can work > through those. > > There are cases during cleanup from an error where we delete ports > owned by "Octavia" that have not yet be attached to a nova instance. > My understanding of the above discussion is that this would not be an > issue under this change. If the port is owned by Octavia then the resource leak does not happen. However the propose neutron code / policy change affects this case as well. > > However.... > > We also, currently, manipulate the ports we have hot-plugged > (attached) to nova instances where the port "device_owner" has become > "compute:nova", mostly for failover scenarios and cases where nova > detach fails and we have to revert the action. > > Now, if the "proper" new procedure is to first detach before deleting > the port, we can look at attempting that. But, in the common failure > scenarios we see nova failing to complete this, if for example the > compute host has been powered off. In this scenario we still need to > delete the neutron port for both resource cleanup and quota reasons. > This so we can create a new port and attach it to a new instance to > recover. If Octavai also deletes the VM then force deleting the port is OK from placement resource prespective as the VM delete will make sure we are deleting the leaked port resources. > > I think this change will impact our current port manage flows, so we > should proceed cautiously, test heavily, and potentially address some > of the nova failure scenarios at the same time. After talking to rm_work on #openstack-nova [1] it feels that the policy based solution would work for Octavia. 
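As a rough illustration only of what such a policy tweak could look like, expressed as oslo.policy
defaults in the style Neutron already uses for its port policies (the rule name and check string
below are assumptions for the sake of example, not an actual Neutron change):

from oslo_policy import policy

# Hypothetical rule distinguishing bound ports; today Neutron does not make
# this distinction for delete_port. The advsvc check reuses the
# "context_is_advsvc" rule Neutron defines for advanced network services.
rules = [
    policy.DocumentedRuleDefault(
        name='delete_port:bound',
        check_str='rule:context_is_admin or rule:context_is_advsvc',
        description='Delete a port that is bound to an instance',
        operations=[{'method': 'DELETE', 'path': '/ports/{id}'}],
    ),
]

def list_rules():
    # Neutron's policy-in-code modules expose their defaults this way.
    return rules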
So Octavia with the extra policy can still delete the bound port in Neutron safely as Ocatavia also deletes the VM that the port was bound to. That VM delete will reclaim the leaked port resource. The failure to detach a port via nova while the nova-compute is down could be a bug on nova side. cheers, gibi [1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-05-04.log.html#t2019-05-04T16:15:52 > > Michael > > On Fri, May 3, 2019 at 5:23 PM Akihiro Motoki > wrote: >> >> >> >> On Fri, May 3, 2019 at 4:11 PM Matt Riedemann >> wrote: >>> >>> On 5/3/2019 3:35 PM, Balázs Gibizer wrote: >>> > 2) Matt had a point after the session that if Neutron enforces >>> that >>> > only unbound port can be deleted then not only Nova needs to be >>> changed >>> > to unbound a port before delete it, but possibly other Neutron >>> > consumers (Octavia?). >>> >>> And potentially Zun, there might be others, Magnum, Heat, idk? >>> >>> Anyway, this is a thing that has been around forever which admins >>> shouldn't do, do we need to prioritize making this change in both >>> neutron and nova to make two requests to delete a bound port? Or >>> is just >>> logging the ERROR that you've leaked allocations, tsk tsk, enough? >>> I >>> tend to think the latter is fine until someone comes along saying >>> this >>> is really hurting them and they have a valid use case for deleting >>> bound >>> ports out of band from nova. >> >> >> neutron deines a special role called "advsvc" for advanced network >> services [1]. >> I think we can change neutron to block deletion of bound ports for >> regular users and >> allow users with "advsvc" role to delete bound ports. >> I haven't checked which projects currently use "advsvc". >> >> [1] >> https://protect2.fireeye.com/url?k=e82c8753-b4a78c60-e82cc7c8-865bb277df6a-a57d1b5660e0038e&u=https://opendev.org/openstack/neutron/src/branch/master/neutron/conf/policies/port.py#L53-L59 >> >>> >>> >>> -- >>> >>> Thanks, >>> >>> Matt >>> > From cdent+os at anticdent.org Sat May 4 18:09:49 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Sat, 4 May 2019 12:09:49 -0600 (MDT) Subject: [placement][nova][ptg] Summary: Nested Magic With Placement In-Reply-To: References: Message-ID: On Fri, 3 May 2019, Chris Dent wrote: > * This (Friday) afternoon at the PTG I'll be creating rfe stories > associated with these changes. If you'd like to help with that, find > me in the placement room (109). We'll work out whether those > stories needs specs in the normally processing of the stories. > We'll also need to find owners for many of them. I decided to capture all of this in one story: https://storyboard.openstack.org/#!/story/2005575 which will likely need to be broken into several stories, or at least several detailed tasks. We will also need to determine what of it needs a spec (there's already one in progress for the request group mapping), if one spec will be sufficient, or we can get away without one. And people. Always with the people. 
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From tetsuro.nakamura.bc at hco.ntt.co.jp Sat May 4 18:40:04 2019 From: tetsuro.nakamura.bc at hco.ntt.co.jp (Tetsuro Nakamura) Date: Sun, 05 May 2019 03:40:04 +0900 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: References: <776bc9b18cf33713708c22d893bd2a46d7a899ed.camel@redhat.com> <1CC272501B5BC543A05DB90AA509DED527552AD6@fmsmsx122.amr.corp.intel.com> <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> Message-ID: <03922b54-994e-dcae-8543-7c9c2f75b87d@hco.ntt.co.jp> Okay, I was missing that at the point to merge each candidate from each request groups, all the rps info in the trees are already in ProviderSummaries, and we can use them without an additional query. It looks like that this can be done without impacting the performance of existing requests that have no queryparam for affinity, so I'm good with this and can volunteer it in Placement since this is more of general "subtree" thing, but I'd like to say that looking into tracking PCPU feature in Nova and see the related problems should precede any Nova related items to model NUMA in Placement. On 2019/05/04 0:03, Eric Fried wrote: >> It enables something like: >> * group_resources=1:2:!3:!4 >> which means 1 and 2 should be in the same group but 3 shoudn't be the >> descendents of 1 or 2, so as 4. > In a symmetric world, this one is a little ambiguous to me. Does it mean > 4 shouldn't be in the same subtree as 3 as well? I thought the negative folks were just refusing to be with in the positive folks. Looks like there are use cases where we need multiple group_resources? - I want 1, 2 in the same subtree, and 3, 4 in the same subtree but the two subtrees should be separated: * group_resources=1:2:!3:!4&group_resources=3:4 -- Tetsuro Nakamura NTT Network Service Systems Laboratories TEL:0422 59 6914(National)/+81 422 59 6914(International) 3-9-11, Midori-Cho Musashino-Shi, Tokyo 180-8585 Japan From kchamart at redhat.com Sat May 4 18:45:17 2019 From: kchamart at redhat.com (Kashyap Chamarthy) Date: Sat, 4 May 2019 20:45:17 +0200 Subject: [nova][ptg] Summary: Secure Boot support for QEMU- and KVM-based Nova instances Message-ID: <20190504184517.GF28897@paraplu> Spec: https://review.opendev.org/#/c/506720/ -- Add "Secure Boot support for KVM & QEMU guests" spec Summary: - Major work in all the lower-level dependencies: OVMF, QEMU and libvirt is ready. Nova can now start integrating this feature. (Refer to the spec for the details.) - [IN-PROGRESS] Ensure that the Linux distributions Nova cares about ship the OVMF firmware descriptor files. (Requires QEMU 4.1, coming out in August. Refer this QEMU patch series; merged in Git master: https://lists.nongnu.org/archive/html/qemu-devel/2019-04/msg03799.html bundle edk2 platform firmware with QEMU.) - NOTE: This is not a blocker for Nova. We can parallely hammer away at the work items outlined in the spec. - [IN-PROGRESS] Kashyap is working with Debian folks to ship a tool ('ovmf-vars-generator') to enroll default UEFI keys for Secure Boot. 
- Filed a Debian "RFP" for it https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=927414 - Fedora already ships it; Ubuntu is working on it (https://launchpad.net/ubuntu/+source/edk2/0~20190309.89910a39-1ubuntu1) - NOTE: This is not a blocker, but a nice-to-have, because distributions already ship an OVMF "VARS" (variable store file) with default UEFI keys enrolled. - ACTION: John Garbutt and Chris Friesen to review the Nova spec. (Thanks!) -- /kashyap From josephine.seifert at secustack.com Sat May 4 18:57:38 2019 From: josephine.seifert at secustack.com (Josephine Seifert) Date: Sat, 4 May 2019 20:57:38 +0200 Subject: [nova][cinder][glance][Barbican]Finding Timeslot for weekly Image Encryption IRC meeting Message-ID: Hello, as a result from the Summit and the PTG, I would like to hold a weekly IRC-meeting for the Image Encryption (soon to be a pop-up team).  As I work in Europe I have made a doodle poll, with timeslots I can attend and hopefully many of you. If you would like to join in a weekly meeting, please fill out the poll and state your name and the project you are working in: https://doodle.com/poll/wtg9ha3e5dvym6yt Thank you Josephine (Luzi) From sukhdevkapur at gmail.com Sat May 4 19:43:29 2019 From: sukhdevkapur at gmail.com (Sukhdev Kapur) Date: Sat, 4 May 2019 12:43:29 -0700 Subject: [ironic][neutron][ops] Ironic multi-tenant networking, VMs In-Reply-To: References: Message-ID: Jeremy, If you want to use VxLAN networks for the bremetal hosts, you would use ML2 VLAN networks, as Julia described, between the host and switch port. That VLAN will then terminate into a VTAP on the switch port which will carry appropriate tags in the VxLAN overlay. Hope this helps -Sukhdev On Thu, May 2, 2019 at 9:28 PM Jeremy Freudberg wrote: > Thanks Julia; this is helpful. > > Thanks also for reading my mind a bit, as I am thinking of the VXLAN > case... I can't help but notice that in the Ironic CI jobs, multi > tenant networking being used seems to entail VLANs as the tenant > network type (instead of VXLAN). Is it just coincidence / how the gate > just is, or is it hinting something about how VXLAN and bare metal get > along? > > On Wed, May 1, 2019 at 6:38 PM Julia Kreger > wrote: > > > > Greetings Jeremy, > > > > Best Practice wise, I'm not directly aware of any. It is largely going > > to depend upon your Neutron ML2 drivers and network fabric. > > > > In essence, you'll need an ML2 driver which supports the vnic type of > > "baremetal", which is able to able to orchestrate the switch port port > > binding configuration in your network fabric. If your using vlan > > networks, in essence you'll end up with a neutron physical network > > which is also a trunk port to the network fabric, and the ML2 driver > > would then appropriately tag the port(s) for the baremetal node to the > > networks required. In the CI gate, we do this in the "multitenant" > > jobs where networking-generic-switch modifies the OVS port > > configurations directly. > > > > If specifically vxlan is what your looking to use between VMs and > > baremetal nodes, I'm unsure of how you would actually configure that, > > but in essence the VXLANs would still need to be terminated on the > > switch port via the ML2 driver. > > > > In term of Ironic's documentation, If you haven't already seen it, you > > might want to check out ironic's multi-tenancy documentation[1]. 
> > > > -Julia > > > > [1]: https://docs.openstack.org/ironic/latest/admin/multitenancy.html > > > > On Wed, May 1, 2019 at 10:53 AM Jeremy Freudberg > > wrote: > > > > > > Hi all, > > > > > > I'm wondering if anyone has any best practices for Ironic bare metal > > > nodes and regular VMs living on the same network. I'm sure if involves > > > Ironic's `neutron` multi-tenant network driver, but I'm a bit hazy on > > > the rest of the details (still very much in the early stages of > > > exploring Ironic). Surely it's possible, but I haven't seen mention of > > > this anywhere (except the very old spec from 2015 about introducing > > > ML2 support into Ironic) nor is there a gate job resembling this > > > specific use. > > > > > > Ideas? > > > > > > Thanks, > > > Jeremy > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Sat May 4 20:02:12 2019 From: openstack at fried.cc (Eric Fried) Date: Sat, 4 May 2019 14:02:12 -0600 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: <03922b54-994e-dcae-8543-7c9c2f75b87d@hco.ntt.co.jp> References: <1CC272501B5BC543A05DB90AA509DED527552AD6@fmsmsx122.amr.corp.intel.com> <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> <03922b54-994e-dcae-8543-7c9c2f75b87d@hco.ntt.co.jp> Message-ID: > It looks like that this can be done without impacting the performance of > existing requests that have no queryparam for affinity, Well, the concern is that doing this at _merge_candidates time (i.e. in python) may be slow. But yeah, let's not solve that until/unless we see it's truly a problem. > but I'd like to say that looking into tracking PCPU feature in Nova and > see the related problems should precede any Nova related items to model > NUMA in Placement. To be clear, placement doesn't need any changes for this. I definitely don't think we should wait for it to land before starting on the placement side of the affinity work. > I thought the negative folks were just refusing to be with in the > positive folks. > Looks like there are use cases where we need multiple group_resources? Yes, certainly eventually we'll need this, even just for positive affinity. Example: I want two VCPUs, two chunks of memory, and two accelerators. Each VCPU/memory/accelerator combo must be affined to the same NUMA node so I can maximize the performance of the accelerator. But I don't care whether both combos come from the same or different NUMA nodes: ?resources_compute1=VCPU:1,MEMORY_MB:1024 &resources_accel1=FPGA:1 &same_subtree:compute1,accel1 &resources_compute2=VCPU:1,MEMORY_MB:1024 &resources_accel2=FPGA:1 &same_subtree:compute2,accel2 and what I want to get in return is: candidates: (1) NUMA1 has VCPU:1,MEMORY_MB:1024,FPGA:1; NUMA2 likewise (2) NUMA1 has everything (3) NUMA2 has everything Slight aside, could we do this with can_split and just one same_subtree? 
I'm not sure you could expect the intended result from: ?resources_compute=VCPU:2,MEMORY_MB:2048 &resources_accel=FPGA:2 &same_subtree:compute,accel &can_split:compute,accel Intuitively, I think the above *either* means you don't get (1), *or* it means you can get (1)-(3) *plus* things like: (4) NUMA1 has VCPU:2,MEMORY_MB:2048; NUMA2 has FPGA:2 > - I want 1, 2 in the same subtree, and 3, 4 in the same subtree but the > two subtrees should be separated: > > * group_resources=1:2:!3:!4&group_resources=3:4 Right, and this too. As a first pass, I would be fine with supporting only positive affinity. And if it makes things significantly easier, supporting only a single group_resources per call. efried . From dciabrin at redhat.com Sat May 4 21:14:50 2019 From: dciabrin at redhat.com (Damien Ciabrini) Date: Sat, 4 May 2019 23:14:50 +0200 Subject: [oslo][oslo-messaging][nova] Stein nova-api AMQP issue running under uWSGI In-Reply-To: <20190503175904.GA26117@holtby> References: <229a2a53-870f-44c3-5e0c-6cfa9d45d0c5@oracle.com> <3275304e-d717-8b89-557e-b650fc4f661a@oracle.com> <20190420063850.GA18527@holtby.speedport.ip> <8b9cb0e4-b3a4-986a-be59-5bba6ae00f4e@nemebean.com> <20190503175904.GA26117@holtby> Message-ID: On Fri, May 3, 2019 at 7:59 PM Michele Baldessari wrote: > On Mon, Apr 22, 2019 at 01:21:03PM -0500, Ben Nemec wrote: > > > > > > On 4/22/19 12:53 PM, Alex Schultz wrote: > > > On Mon, Apr 22, 2019 at 11:28 AM Ben Nemec > wrote: > > > > > > > > > > > > > > > > On 4/20/19 1:38 AM, Michele Baldessari wrote: > > > > > On Fri, Apr 19, 2019 at 03:20:44PM -0700, > iain.macdonnell at oracle.com wrote: > > > > > > > > > > > > Today I discovered that this problem appears to be caused by > eventlet > > > > > > monkey-patching. I've created a bug for it: > > > > > > > > > > > > https://bugs.launchpad.net/nova/+bug/1825584 > > > > > > > > > > Hi, > > > > > > > > > > just for completeness we see this very same issue also with > > > > > mistral (actually it was the first service where we noticed the > missed > > > > > heartbeats). iirc Alex Schultz mentioned seeing it in ironic as > well, > > > > > although I have not personally observed it there yet. > > > > > > > > Is Mistral also mixing eventlet monkeypatching and WSGI? > > > > > > > > > > Looks like there is monkey patching, however we noticed it with the > > > engine/executor. So it's likely not just wsgi. I think I also saw it > > > in the ironic-conductor, though I'd have to try it out again. I'll > > > spin up an undercloud today and see if I can get a more complete list > > > of affected services. It was pretty easy to reproduce. > > > > Okay, I asked because if there's no WSGI/Eventlet combination then this > may > > be different from the Nova issue that prompted this thread. It sounds > like > > that was being caused by a bad interaction between WSGI and some Eventlet > > timers. If there's no WSGI involved then I wouldn't expect that to > happen. > > > > I guess we'll see what further investigation turns up, but based on the > > preliminary information there may be two bugs here. > > So just to get some closure on this error that we have seen around > mistral executor and tripleo with python3: this was due to the ansible > action that called subprocess which has a different implementation in > python3 and so the monkeypatching needs to be adapted. 
> > Review which fixes it for us is here: > https://review.opendev.org/#/c/656901/ > > Damien and I think the nova_api/eventlet/mod_wsgi has a separate root-cause > (although we have not spent all too much time on that one yet) > > Right, after further investigation, it appears that the problem we saw under mod_wsgi was due to monkey patching, as Iain originally reported. It has nothing to do with our work on healthchecks. It turns out that running the AMQP heartbeat thread under mod_wsgi doesn't work when the threading library is monkey_patched, because the thread waits on a data structure [1] that has been monkey patched [2], which makes it yield its execution instead of sleeping for 15s. Because mod_wsgi stops the execution of its embedded interpreter, the AMQP heartbeat thread can't be resumed until there's a message to be processed in the mod_wsgi queue, which would resume the python interpreter and make eventlet resume the thread. Disabling monkey-patching in nova_api makes the scheduling issue go away. Note: other services like heat-api do not use monkey patching and aren't affected, so this seem to confirm that monkey-patching shouldn't happen in nova_api running under mod_wsgi in the first place. [1] https://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_rabbit.py#L904 [2] https://github.com/openstack/oslo.utils/blob/master/oslo_utils/eventletutils.py#L182 -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Sat May 4 22:43:26 2019 From: openstack at fried.cc (Eric Fried) Date: Sat, 4 May 2019 16:43:26 -0600 Subject: [nova][all][ptg] Summary: Same-Company Approvals Message-ID: (NB: I tagged [all] because it would be interesting to know where other teams stand on this issue.) Etherpad: https://etherpad.openstack.org/p/nova-ptg-train-governance Summary: - There is a (currently unwritten? at least for Nova) rule that a patch should not be approved exclusively by cores from the same company. This is rife with nuance, including but not limited to: - Usually (but not always) relevant when the patch was proposed by member of same company - N/A for trivial things like typo fixes - The issue is: - Should the rule be abolished? and/or - Should the rule be written down? Consensus (not unanimous): - The rule should not be abolished. There are cases where both the impetus and the subject matter expertise for a patch all reside within one company. In such cases, at least one core from another company should still be engaged and provide a "procedural +2" - much like cores proxy SME +1s when there's no core with deep expertise. - If there is reasonable justification for bending the rules (e.g. typo fixes as noted above, some piece of work clearly not related to the company's interest, unwedging the gate, etc.) said justification should be clearly documented in review commentary. - The rule should not be documented (this email notwithstanding). This would either encourage loopholing or turn into a huge detailed legal tome that nobody will read. It would also *require* enforcement, which is difficult and awkward. Overall, we should be able to trust cores to act in good faith and in the appropriate spirit. efried . 
From openstack at fried.cc Sat May 4 22:56:34 2019 From: openstack at fried.cc (Eric Fried) Date: Sat, 4 May 2019 16:56:34 -0600 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: References: <1CC272501B5BC543A05DB90AA509DED527552AD6@fmsmsx122.amr.corp.intel.com> <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> <03922b54-994e-dcae-8543-7c9c2f75b87d@hco.ntt.co.jp> Message-ID: <5fd214e8-4822-53a5-a7d6-622c5133a26f@fried.cc> For those of you following along at home, we had a design session a couple of hours ago and hammered out the broad strokes of this work, including rough prioritization of the various pieces. Chris has updated the story [1] with a couple of notes; expect details and specs to emerge therefrom. efried [1] https://storyboard.openstack.org/#!/story/2005575 From openstack at fried.cc Sat May 4 23:32:02 2019 From: openstack at fried.cc (Eric Fried) Date: Sat, 4 May 2019 17:32:02 -0600 Subject: [nova][ptg] Summary/Outcome: Train Cycle Themes Message-ID: Etherpad: https://etherpad.openstack.org/p/nova-train-themes Summary: In Stein, we started doing cycle themes instead of priorities. The distinction being that themes should represent tangible user (as in OpenStack consumer) facing value, whereas priorities represent what work items we want to do. Outcome: We decided on themes around: (1) The use of placement (2) Cyborg integration (3) Docs I have curated the etherpad and discussions and proposed these themes to the nova-specs repository at https://review.opendev.org/657171 efried . From morgan.fainberg at gmail.com Sun May 5 01:19:48 2019 From: morgan.fainberg at gmail.com (Morgan Fainberg) Date: Sat, 4 May 2019 19:19:48 -0600 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: References: Message-ID: On Sat, May 4, 2019, 16:48 Eric Fried wrote: > (NB: I tagged [all] because it would be interesting to know where other > teams stand on this issue.) > > Etherpad: https://etherpad.openstack.org/p/nova-ptg-train-governance > > Summary: > - There is a (currently unwritten? at least for Nova) rule that a patch > should not be approved exclusively by cores from the same company. This > is rife with nuance, including but not limited to: > - Usually (but not always) relevant when the patch was proposed by > member of same company > - N/A for trivial things like typo fixes > - The issue is: > - Should the rule be abolished? and/or > - Should the rule be written down? > > Consensus (not unanimous): > - The rule should not be abolished. There are cases where both the > impetus and the subject matter expertise for a patch all reside within > one company. In such cases, at least one core from another company > should still be engaged and provide a "procedural +2" - much like cores > proxy SME +1s when there's no core with deep expertise. > - If there is reasonable justification for bending the rules (e.g. typo > fixes as noted above, some piece of work clearly not related to the > company's interest, unwedging the gate, etc.) said justification should > be clearly documented in review commentary. > - The rule should not be documented (this email notwithstanding). This > would either encourage loopholing or turn into a huge detailed legal > tome that nobody will read. 
It would also *require* enforcement, which
> is difficult and awkward. Overall, we should be able to trust cores to
> act in good faith and in the appropriate spirit.
>
> efried
> .
>

Keystone used to have the same policy outlined in this email (with much of the
same nuance and exceptions). Without going into crazy details (as the
contributor and core numbers went down), we opted to really lean on "Overall,
we should be able to trust cores to act in good faith". We abolished the rule,
and the cores always ask for outside input when the familiarity lies outside
of the team. We often also pull in cores more familiar with the code,
sometimes ending up with 3x+2s before we workflow the patch.

Personally I don't like the "this is an unwritten rule and it shouldn't be
documented"; if documenting and enforcing the rule elicits worry of gaming the
system or of being a dense tome no one will read, in my mind (and experience)
the rule may not be worth having.

I voice my opinion with the caveat that every team is different. If the rule
works, and helps the team (Nova in this case) feel more confident in the
management of code, the rule has a place to live on. What works for one team
doesn't always work for another.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tomi.juvonen at nokia.com Sun May 5 01:28:59 2019
From: tomi.juvonen at nokia.com (Juvonen, Tomi (Nokia - FI/Espoo))
Date: Sun, 5 May 2019 01:28:59 +0000
Subject: [fenix][ptg] summary
Message-ID:

Fenix Train PTG, What to do next, prioritizing
https://etherpad.openstack.org/p/DEN2019-fenix-PTG

Two non-Telco users would like to use Fenix to maintain their cloud. For this,
Fenix needs to prioritize work so we can provide a production-ready framework
without the Telco features first. Work is now prioritized in the Etherpad, and
missing things should also be added to StoryBoard next week.

Fenix and ETSI NFV synch
https://etherpad.openstack.org/p/DEN2019-fenix-ETSI-NFV-PTG

There was also a discussion about supporting ETSI NFV defined constraints.
Some instance and anti-affinity group constraints could be in Nova. Anyhow,
for Fenix to be generic for any cloud, it would make sense to keep more of
this information within Fenix. This needs further investigation.

Then there was a proposal of having a direct subscription to Fenix from the
VNFM side instead of subscribing to AODH to get an event alarm from the Fenix
notification to the VNFM. One downside here was that the VIM shouldn't make a
direct API call to an external system. The current notification / AODH
approach was also nice for any user to build a simple manager just to receive
the hint about what new capability is coming to the application (non-Telco),
or to still have some simple way to interact at the time of maintenance.

Thanks,
Tomi (tojuvone)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gmann at ghanshyammann.com Sun May 5 07:10:35 2019
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Sun, 05 May 2019 02:10:35 -0500
Subject: [qa][ptg][patrole] RBAC testing improvement ideas for Patrole
Message-ID: <16a86d4834e.e46610fc23956.8020827235456111857@ghanshyammann.com>

Patrole is emerging as a good tool for RBAC testing. AT&T is already running
it on their production cloud, and we have got a good amount of
interest/feedback from other operators. We had a few discussions regarding
Patrole testing improvements during the PTG among the QA, Nova, and Keystone
teams. I am writing the summary of those discussions below and would like to
get the opinion from Felipe & Sergey also.
1. How to improve the Patrole testing time:

Currently, Patrole tests perform the complete API operation, which takes time
and makes Patrole testing very long. Patrole is responsible for testing the
policies only, so it does not need to wait for the API operation to be
completed. John has a good idea to handle that via a flag. If that flag is
enabled (per service and disabled by default) then oslo.policy can return some
different error code on success (other than 403). The API can return the
response with that error code, which can be treated as a pass case in Patrole.
Morgan raises a good point on making it per API call rather than global. We
can do that as a next step; let's start with the global flag per service for
now.
- https://etherpad.openstack.org/p/ptg-train-xproj-nova-keystone

Another thing we should improve in the current Patrole jobs is to separate the
jobs per service. Currently, all 5 services are installed and run in a single
job. Running everything on the Patrole gate is good, but the project-side gate
does not need to run any other service's tests. For example, a
patrole-keystone job can install only keystone and run only keystone tests.
This way a project can reuse just its Patrole job and does not need to prepare
a separate one.

2. How to run Patrole tests with all negative and positive combinations for
all scope + default role combinations:

- The current patrole-admin/member/reader jobs are able to test the negative
pattern. For example: the patrole-member job tests the admin APIs in a
negative way and makes sure a test passes only if the member role gets a 403.
- As we have scope_type support also, we need to extend the jobs to run for
all 9 combinations of 3 scopes (system, project, domain) and 3 roles (admin,
member, reader).
- option1: run 9 different jobs, one per combination, as we currently do for
the admin, member, and reader roles. The issue with this approach is that the
gate will take a lot of time to run these 9 jobs separately.
- option2: run all 9 combinations in a single job, running the tests in a loop
with a different combination of scope_roles each time. This might require
converting the current [role] config option to a list type, per service, so
that the user can configure which default roles are available for the
corresponding service. This option can save a lot of devstack installation
time compared to the 9-different-jobs option.

-gmann

From gmann at ghanshyammann.com Sun May 5 07:18:08 2019
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Sun, 05 May 2019 02:18:08 -0500
Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast
Message-ID: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com>

The current integrated-gate jobs (tempest-full) are not so stable due to
various bugs, especially timeouts. We tried to improve this by filtering the
slow tests into the separate tempest-slow job, but the situation has not
improved much.

We talked about ideas to make it more stable and fast for projects, especially
when a failure is not related to the project itself. We are planning to split
the integrated-gate template (only the tempest-full job as a first step) per
related services.

Idea:
- Run only dependent service tests on each project gate.
- The Tempest gate will keep running all the services' tests as the integrated
gate at a centralized place, without any change in the current job.
- Each project can run the below-mentioned template.
- All the below templates will be defined and maintained by the QA team.
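As a purely illustrative sketch of how each of the templates described below
could derive its test selection by excluding the unrelated API test trees (the
dependency map and helper here are assumptions based on this proposal, not an
existing Tempest or Zuul interface):

# Service dependencies per template, mirroring the proposal below;
# integrated-gate-identity is omitted since it would keep everything.
TEMPLATE_DEPENDENCIES = {
    'integrated-gate-networking': {'network', 'compute', 'identity'},
    'integrated-gate-storage': {'volume', 'image', 'object_storage', 'compute'},
    'integrated-gate-object-storage': {'volume', 'image', 'object_storage'},
    'integrated-gate-compute': {'compute', 'volume', 'image', 'network'},
    'integrated-gate-placement': {'compute', 'network'},
}

ALL_API_TREES = {'compute', 'volume', 'image', 'network',
                 'identity', 'object_storage'}

def black_regex(template):
    """Build a regex excluding the API tests of services a template skips."""
    excluded = ALL_API_TREES - TEMPLATE_DEPENDENCIES[template]
    return '|'.join(sorted('tempest.api.%s' % tree for tree in excluded))

# black_regex('integrated-gate-networking')
# -> 'tempest.api.image|tempest.api.object_storage|tempest.api.volume'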
I would like to get input from each of the 6 services which run the
integrated-gate jobs:

1. "Integrated-gate-networking" (job to run on the neutron gate)
Tests to run in this template: neutron APIs, nova APIs, keystone APIs (?), and
all scenario tests currently running in tempest-full in the same way (meaning
non-slow and in serial).
Improvement for the neutron gate: exclude the cinder API tests, glance API
tests, and swift API tests.

2. "Integrated-gate-storage" (job to run on the cinder gate and glance gate)
Tests to run in this template: Cinder APIs, Glance APIs, Swift APIs, Nova
APIs, and all scenario tests currently running in tempest-full in the same way
(meaning non-slow and in serial).
Improvement for the cinder and glance gates: exclude the neutron API tests and
Keystone API tests.

3. "Integrated-gate-object-storage" (job to run on the swift gate)
Tests to run in this template: Cinder APIs, Glance APIs, Swift APIs, and all
scenario tests currently running in tempest-full in the same way (meaning
non-slow and in serial).
Improvement for the swift gate: exclude the neutron API tests, Keystone API
tests, and Nova API tests.
Note: swift does not run the integrated-gate as of now.

4. "Integrated-gate-compute" (job to run on the Nova gate)
Tests to run in this template: Nova APIs, Cinder APIs, Glance APIs (?),
neutron APIs, and all scenario tests currently running in tempest-full in the
same way (meaning non-slow and in serial).
Improvement for the Nova gate: exclude the swift API tests (not running in the
current job, but in the future it might) and Keystone API tests.

5. "Integrated-gate-identity" (job to run on the keystone gate)
Tests to run: all, as every project uses keystone, so we might need to run all
tests as they run in the integrated-gate today. But is keystone being used
differently by each service? If not, is it enough to run only a single
service's tests, say Nova or neutron?

6. "Integrated-gate-placement" (job to run on the placement gate)
Tests to run in this template: Nova API tests, Neutron API tests + scenario
tests + any new service that depends on placement APIs.
Improvement for the placement gate: exclude the glance API tests, cinder API
tests, swift API tests, and keystone API tests.

Thoughts on this approach?

The important point is that we must not lose the coverage of integrated
testing per project. So I would like to get each project's view on whether we
are missing any dependency (among the proposed test removals) in the above
proposed templates.

- https://etherpad.openstack.org/p/qa-train-ptg

-gmann

From gmann at ghanshyammann.com Sun May 5 07:21:32 2019
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Sun, 05 May 2019 02:21:32 -0500
Subject: [qa][form][ptg] QA Summary for Forum & PTG
Message-ID: <16a86de8992.f13a438724004.7062826677271782113@ghanshyammann.com>

Hello Everyone,

We had a good discussion at the QA Forum sessions and the PTG. I am
summarizing those here and will start separate threads for the few topics
which need more feedback.

Summit: QA Forum sessions:

1. OpenStack QA - Project Update: Tuesday, April 30, 2:35pm-2:55pm
We gave updates on what we finished in Stein and a draft plan for the Train
cycle. The good thing to note is we still have a lot of activity going on in
QA. Across the QA projects overall, we did >3000 reviews and 750 commits. The
video is not up yet, so I am copying the slide link below.
Slides: https://docs.google.com/presentation/d/10zupeFZuOlxroAMl29qVJl78nD4_YWHkQxANNVlIjE0/edit?ts=5cc73ae8#slide=id.p1

2. OpenStack QA - Project Onboarding: Wednesday, May 1, 9:00am-9:40am
We did host the QA onboarding session, but there were only 3 attendees and no
new contributors.
I think it is hard to see any new contributors at summits now, so I am
wondering whether we should keep hosting the onboarding sessions next time.
Etherpad: https://etherpad.openstack.org/p/DEN-qa-onboarding

3. Users / Operators adoption of QA tools / plugins: Wednesday, May 1,
10:50am-11:30am
As usual, we had more attendees in this session and useful feedback. A few
tools were shared by attendees:
1. Python hardware module for bare metal detailed hardware inspection &
anomaly detection https://github.com/redhat-cip/hardware
2. Workload testing: https://opendev.org/x/tobiko/
Another good idea from Doug was a plugin feature in the openstack-health
dashboard. That is something we discussed at the PTG. For more details on
this, refer to the PTG "OpenStack-health improvement" section.
Etherpad: https://etherpad.openstack.org/p/Den-forum-qa-ops-user-feedback

QA PTG: 2nd - 3rd May:

We always had 3-4 attendees in the room, and others attended per topic. Good
discussions and a few good improvement ideas about gate stability, the
dashboard, etc.

1. Topic: Stein Retrospective
We collected the things that went well and the things that need improvement in
this session. In terms of good things, we completed the OpenStack gate
migration from Xenial to Bionic, plus a lot of reviews and code. Doug from
AT&T mentioned adding Tempest and Patrole to gates and checks in their
production deployment process: "Thank you for all of the hard work from the QA
team!!!"
Slow reviews are a concern as we have a good number of incoming requests. This
is something we should improve in Train.
Action items:
gmann: start the plan for backlogs, especially for review and doc cleanup.
masayukig: plan to have a resource leakage check in the gate.
ds6901: will work with his team to clean up leaks and submit bugs

2. Topic: Keystone system-scope testing
The QA and Keystone teams gathered together in this cross-project session
about next steps on system scope testing. We talked about multiple points: how
to cover all the new roles for system scope, and how to keep the
backward-compatibility testing for stable branches still testing without
system scope. We decided to move forward with system_admin as of now and fall
back from system_admin to project scope when the system_scope testing flag is
not true on the Tempest side (this will keep the stable branch testing
unaffected).
We agreed:
- To move forward with system admin - https://review.opendev.org/#/c/604909/
- Add a tempest job to test system scope - https://review.opendev.org/#/c/614484/
- Then add to tempest full - gmann
- Then add testing for system reader
- Investigate more advanced RBAC testing with Patrole - gmann
Etherpad: https://etherpad.openstack.org/p/keystone-train-ptg-testing-system-scope-in-tempest

3. Topic: new whitebox plugin for tempest:
This is a new idea from artom about testing things outside of Tempest's scope
(currently mostly used to check instance XML for NFV use case tests).
Currently, this tool does ssh into the VM and fetches the XML for further
verification, etc. We agreed on the point of avoiding any duplicate test
verification with Tempest or the nova functional tests. This is good for extra
verification by going inside the VM, like checking data after migration, CPU
pinning, etc. As a next step, artom will propose a QA spec with the details
and a proposal for this plugin under the QA program.

4. Topic: Document the QA process or TODO things for releases, stable branch
cut:
The idea is to start a centralized doc page for QA activities, processes, etc.
We want to use the qa-specs repo to publish the content to
doc.openstack.org/qa/.
This may not be so easy and will need a few tweaks to the doc jobs. I will get
into the details and then discuss them with the infra team. This is a low
priority for now.

5. Topic: Plugin sanity check
The current tempest-plugins-sanity job is not stable, so it is n-v
(non-voting). We want to make it voting by only installing the active plugins.
Many plugins are failing because they are either dead or not very active.
We agreed on:
- Make faulty plugins a blacklist, with a bug/patch link, and notify the same
on the ML every time we detect any failure
- Publish the blacklist on the plugins-registry doc.
- After that, make this job voting, with a process for fixing or removing a
faulty plugin which unblocks the tempest gate with author self-approve.
- Make the sanity job run on plugins which are dependent on each other. For
example, congress-tempest-plugin uses neutron-tempest-plugin,
mistral-tempest-plugin, etc., so all these plugins should have a sanity job
which can install and list only these plugins' tests, not all the plugins.

6. Topic: Planning for Patrole Stable release:
We had a good amount of discussion on the Patrole improvement areas needed to
release it as stable. Refer to the below ML thread for details and further
discussions on this topic:
- http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005870.html

7. Topic: How to make tempest-full stable (don't fail the integrated job when
an unrelated test fails)
The current integrated-gate jobs (tempest-full) are not so stable due to
various bugs, especially timeouts. We discussed a few ideas to improve that.
Refer to the below ML thread for details and further discussions on this
topic:
http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005871.html

8. Topic: OpenStack-Health Improvement
Doug from AT&T has a few improvement ideas for the health dashboard which were
discussed at the PTG:
- Test Grouping
- Define groups
- Assign tests to groups
- Filter by groups
- Compare 2 runs
- Look into pushing the AQuA report to subunit2SQL as a tool
Action Items:
- Doug is going to write the spec for the plugin approach. All other ideas can
be done after we have the plugin approach ready.
- filter
- presentation

9. Topic: Stein Backlogs & Train priorities & Planning
We collected the Train items in the below-mentioned etherpad with assignees.
If anyone would like to help on any of the items, ping me on IRC or reply
here.
Etherpad: https://etherpad.openstack.org/p/qa-train-priority

10. Topic: grenade zuulv3 jobs review/discussions
We did not get the chance to review these. Let's continue after the PTG.

Full detailed discussion: https://etherpad.openstack.org/p/qa-train-ptg

-gmann

From liuyulong.xa at gmail.com Sun May 5 09:37:56 2019
From: liuyulong.xa at gmail.com (LIU Yulong)
Date: Sun, 5 May 2019 17:37:56 +0800
Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast
In-Reply-To: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com>
References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com>
Message-ID:

+1

On Sun, May 5, 2019 at 3:18 PM Ghanshyam Mann wrote:

> Current integrated-gate jobs (tempest-full) is not so stable for various
> bugs specially timeout. We tried
> to improve it via filtering the slow tests in the separate tempest-slow
> job but the situation has not been improved much.
>
> We talked about the Ideas to make it more stable and fast for projects
> especially when failure is not
> related to each project.
We are planning to split the integrated-gate > template (only tempest-full job as > first step) per related services. > > Idea: > - Run only dependent service tests on project gate. > - Tempest gate will keep running all the services tests as the integrated > gate at a centeralized place without any change in the current job. > - Each project can run the below mentioned template. > - All below template will be defined and maintained by QA team. > > I would like to know each 6 services which run integrated-gate jobs > > 1."Integrated-gate-networking" (job to run on neutron gate) > Tests to run in this template: neutron APIs , nova APIs, keystone APIs ? > All scenario currently running in tempest-full in the same way ( means > non-slow and in serial) > Improvement for neutron gate: exlcude the cinder API tests, glance API > tests, swift API tests, > > 2."Integrated-gate-storage" (job to run on cinder gate, glance gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, Nova > APIs and All scenario currently running in tempest-full in the same way ( > means non-slow and in serial) > Improvement for cinder, glance gate: excluded the neutron APIs tests, > Keystone APIs tests > > 3. "Integrated-gate-object-storage" (job to run on swift gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and > All scenario currently running in tempest-full in the same way ( means > non-slow and in serial) > Improvement for swift gate: excluded the neutron APIs tests, - Keystone > APIs tests, - Nova APIs tests. > Note: swift does not run integrated-gate as of now. > > 4. "Integrated-gate-compute" (job to run on Nova gate) > tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs and > All scenario currently running in tempest-full in same way ( means non-slow > and in serial) > Improvement for Nova gate: excluded the swift APIs tests(not running in > current job but in future, it might), Keystone API tests. > > 5. "Integrated-gate-identity" (job to run on keystone gate) > Tests to run is : all as all project use keystone, we might need to run > all tests as it is running in integrated-gate. > But does keystone is being unsed differently by all services? if no then, > is it enough to run only single service tests say Nova or neutron ? > > 6. "Integrated-gate-placement" (job to run on placement gate) > Tests to run in this template: Nova APIs tests, Neutron APIs tests + > scenario tests + any new service depends on placement APIs > Improvement for placement gate: excluded the glance APIs tests, cinder > APIs tests, swift APIs tests, keystone APIs tests > > Thoughts on this approach? > > The important point is we must not lose the coverage of integrated testing > per project. So I would like to > get each project view if we are missing any dependency (proposed tests > removal) in above proposed templates. > > - https://etherpad.openstack.org/p/qa-train-ptg > > -gmann > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From colleen at gazlene.net Sun May 5 15:58:22 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Sun, 05 May 2019 11:58:22 -0400 Subject: [dev][keystone][ptg] Keystone team action items Message-ID: Hi everyone, I will write an in-depth summary of the Forum and PTG some time in the coming week, but I wanted to quickly capture all the action items that came out of the last six days so that we don't lose too much focus: Colleen * move "Expand endpoint filters to Service Providers" spec[1] to attic * review "Policy Goals"[2] and "Policy Security Roadmap"[3] specs with Lance, refresh and possibly combine them * move "Unified model for assignments, OAuth, and trusts" spec[4] from ongoing to backlog, and circle up with Adam about refreshing it * update app creds spec[5] to defer access_rules_config * review app cred documentation with regard to proactive rotation * follow up with nova/other service teams on need for microversion support in access rules * circle up with Guang on fixing autoprovisioning for tokenless auth * keep up to date with IEEE/NIST efforts on standardizing federation * investigate undoing the foreign key constraint that breaks the pluggable resource driver * propose governance change to add caching as a base service * clean out deprecated cruft from keystonemiddleware * write up Outreachy/other internship application tasks [1] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/backlog/service-providers-filters.html [2] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/policy-goals.html [3] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/policy-security-roadmap.html [4] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/unified-delegation.html [5] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/train/capabilities-app-creds.html Lance * write up plan for tempest testing of system scope * break up unified limits testing plan into separate items, one for CRUD in keystone and one for quota and limit validation in oslo.limit[6] * write up spec for assigning roles on root domain * (with Morgan) check for and add interface in oslo.policy to see if policy has been overridden [6] https://trello.com/c/kbKvhYBz/20-test-unified-limits-in-tempest Kristi * finish mutable config patch * propose "model-timestamps" spec for Train[7] * move "Add Multi-Version Support to Federation Mappings" spec[8] to attic * review and possibly complete "Devstack Plugin for Keystone" spec[9] * look into "RFE: Improved OpenID Connect Support" spec[10] * update refreshable app creds spec[11] to make federated users expire rather then app creds * deprecate federated_domain_name [7] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/backlog/model-timestamps.html [8] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/backlog/versioned-mappings.html [9] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/devstack-plugin.html [10] https://bugs.launchpad.net/keystone/+bug/1815971 [11] https://review.opendev.org/604201 Vishakha * investigate effort needed for Alembic migrations spec[12] (with help from Morgan) * merge "RFE: Retrofit keystone-manage db_* commands to work with Alembic"[13] into "Use Alembic for database migrations" spec * remove deprecated [signing] config * remove deprecated [DEFAULT]/admin_endpoint config * remove deprecated [token]/infer_roles config [12] 
http://specs.openstack.org/openstack/keystone-specs/specs/keystone/backlog/alembic.html [13] https://bugs.launchpad.net/keystone/+bug/1816158 Morgan * review "Materialize Project Hierarchy" spec[14] and make sure it reflects the current state of the world, keep it in the backlog * move "Functional Testing" spec[15] to attic * move "Object Dependency Lifecycle" spec[16] to complete * move "Add Endpoint Filter Enforcement to Keystonemiddleware" spec[17] to attic * move "Request Helpers" spec[18] to attic * create PoC of external IdP proxy component * (with Lance) check for and add interface in oslo.policy to see if policy has been overridden * investigate removing [eventlet_server] config section * remove remaining PasteDeploy things * remove PKI(Z) cruft from keystonemiddleware * refactor keystonemiddleware to have functional components instead of needing keystone to instantiate keystonemiddleware objects for auth [14] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/backlog/materialize-project-hierarchy.html [15] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/functional-testing.html [16] http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/object-dependency-lifecycle.html [17] http://specs.openstack.org/openstack/keystone-specs/specs/keystonemiddleware/backlog/endpoint-enforcement-middleware.html [18] http://specs.openstack.org/openstack/keystone-specs/specs/keystonemiddleware/backlog/request-helpers.html Gage * investigate with operators about specific use case behind "RFE: Whitelisting (opt-in) users/projects/domains for PCI compliance"[19] request * follow up on "RFE: Token returns Project's tag properties"[20] * remove use of keystoneclient from keystonemiddleware [19] https://bugs.launchpad.net/keystone/+bug/1637146 [20] https://bugs.launchpad.net/keystone/+bug/1807697 Rodrigo * Propose finishing "RFE: Project Tree Deletion/Disabling"[21] as an Outreachy project [21] https://bugs.launchpad.net/keystone/+bug/1816105 Adam * write up super-spec on explicit project IDs plus predictable IDs Thanks everyone for a productive week and for all your hard work! Colleen From jeremyfreudberg at gmail.com Sun May 5 20:24:14 2019 From: jeremyfreudberg at gmail.com (Jeremy Freudberg) Date: Sun, 5 May 2019 16:24:14 -0400 Subject: [ironic][neutron][ops] Ironic multi-tenant networking, VMs In-Reply-To: References: Message-ID: Sukhdev- yes it helps a ton. Thank you! If anyone reading the list has a citable example of this, public on the web, feel free to chime in. On Sat, May 4, 2019 at 3:43 PM Sukhdev Kapur wrote: > > Jeremy, > > If you want to use VxLAN networks for the bremetal hosts, you would use ML2 VLAN networks, as Julia described, between the host and switch port. That VLAN will then terminate into a VTAP on the switch port which will carry appropriate tags in the VxLAN overlay. > > Hope this helps > -Sukhdev > > > On Thu, May 2, 2019 at 9:28 PM Jeremy Freudberg wrote: >> >> Thanks Julia; this is helpful. >> >> Thanks also for reading my mind a bit, as I am thinking of the VXLAN >> case... I can't help but notice that in the Ironic CI jobs, multi >> tenant networking being used seems to entail VLANs as the tenant >> network type (instead of VXLAN). Is it just coincidence / how the gate >> just is, or is it hinting something about how VXLAN and bare metal get >> along? >> >> On Wed, May 1, 2019 at 6:38 PM Julia Kreger wrote: >> > >> > Greetings Jeremy, >> > >> > Best Practice wise, I'm not directly aware of any. 
It is largely going >> > to depend upon your Neutron ML2 drivers and network fabric. >> > >> > In essence, you'll need an ML2 driver which supports the vnic type of >> > "baremetal", which is able to able to orchestrate the switch port port >> > binding configuration in your network fabric. If your using vlan >> > networks, in essence you'll end up with a neutron physical network >> > which is also a trunk port to the network fabric, and the ML2 driver >> > would then appropriately tag the port(s) for the baremetal node to the >> > networks required. In the CI gate, we do this in the "multitenant" >> > jobs where networking-generic-switch modifies the OVS port >> > configurations directly. >> > >> > If specifically vxlan is what your looking to use between VMs and >> > baremetal nodes, I'm unsure of how you would actually configure that, >> > but in essence the VXLANs would still need to be terminated on the >> > switch port via the ML2 driver. >> > >> > In term of Ironic's documentation, If you haven't already seen it, you >> > might want to check out ironic's multi-tenancy documentation[1]. >> > >> > -Julia >> > >> > [1]: https://docs.openstack.org/ironic/latest/admin/multitenancy.html >> > >> > On Wed, May 1, 2019 at 10:53 AM Jeremy Freudberg >> > wrote: >> > > >> > > Hi all, >> > > >> > > I'm wondering if anyone has any best practices for Ironic bare metal >> > > nodes and regular VMs living on the same network. I'm sure if involves >> > > Ironic's `neutron` multi-tenant network driver, but I'm a bit hazy on >> > > the rest of the details (still very much in the early stages of >> > > exploring Ironic). Surely it's possible, but I haven't seen mention of >> > > this anywhere (except the very old spec from 2015 about introducing >> > > ML2 support into Ironic) nor is there a gate job resembling this >> > > specific use. >> > > >> > > Ideas? >> > > >> > > Thanks, >> > > Jeremy >> > > >> From jeremyfreudberg at gmail.com Sun May 5 20:36:49 2019 From: jeremyfreudberg at gmail.com (Jeremy Freudberg) Date: Sun, 5 May 2019 16:36:49 -0400 Subject: [sahara][all] Sahara virtual PTG reminder (approaching quickly!) Message-ID: The Sahara virtual PTG will take place Monday, May 6, at 15:00 UTC. All are welcome. Etherpad link: https://etherpad.openstack.org/p/sahara-train-ptg Bluejeans link: https://bluejeans.com/6304900378 From doka.ua at gmx.com Sun May 5 21:34:01 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Mon, 6 May 2019 00:34:01 +0300 Subject: [octavia] Amphora agent returned unexpected result code 500 Message-ID: <5798b929-737e-fd29-a2a5-7c1246a632bb@gmx.com> Dear colleagues, trying to launch Amphorae, getting the following error in logs: Amphora agent returned unexpected result code 500 with response {'message': 'Error plugging VIP', 'details': 'SIOCADDRT: Network is unreachable\nFailed to bring up eth1.\n'} While details below, questions are here: - whether it's enough to assign roles as explained below to special project for Octavia? - whether it can be issue with image, created by diskimage_create.sh? - any recommendation on where to search for the problem. Thank you. My environment is: - Openstack Rocky - Octavia 4.0 - amphora instance runs in special project "octavia", where users octavia, nova and neutron have admin role - amphora image prepared using original git repo process and elements without modification: * git clone * cd octavia * diskimage-create/diskimage-create.sh * openstack image create [ ... 
] --tag amphora After created, amphora instance successfully connects to management network and can be accessed by controller: 2019-05-05 20:46:06.851 18234 DEBUG octavia.amphorae.drivers.haproxy.rest_api_driver [-] Connected to amphora. Response: request /opt/openstack/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py:486 2019-05-05 20:46:06.852 18234 DEBUG octavia.controller.worker.tasks.amphora_driver_tasks [-] Successfuly connected to amphora 5bec4c09-a209-4e73-a66e-e4fc0fb8ded5: {'ipvsadm_version': '1:1.28-3', 'api_version': '0.5', 'haproxy_version': '1.6.3-1ubuntu0.2', 'hostname': 'amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5', 'keepalived_version': '1:1.2.24-1ubuntu0.16.04.1'} execute /opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/amphora_driver_tasks.py:372 [ ... ] 2019-05-05 20:46:06.990 18234 DEBUG octavia.controller.worker.tasks.network_tasks [-] Plumbing VIP for amphora id: 5bec4c09-a209-4e73-a66e-e4fc0fb8ded5 execute /opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/network_tasks.py:382 2019-05-05 20:46:07.003 18234 DEBUG octavia.network.drivers.neutron.base [-] Neutron extension security-group found enabled _check_extension_enabled /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 2019-05-05 20:46:07.013 18234 DEBUG octavia.network.drivers.neutron.base [-] Neutron extension dns-integration found enabled _check_extension_enabled /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 2019-05-05 20:46:07.025 18234 DEBUG octavia.network.drivers.neutron.base [-] Neutron extension qos found enabled _check_extension_enabled /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 2019-05-05 20:46:07.044 18234 DEBUG octavia.network.drivers.neutron.base [-] Neutron extension allowed-address-pairs found enabled _check_extension_enabled /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 2019-05-05 20:46:08.406 18234 DEBUG octavia.network.drivers.neutron.allowed_address_pairs [-] Created vip port: b0398cc8-6d52-4f12-9f1f-1141b0f10751 for amphora: 5bec4c09-a209-4e73-a66e-e4fc0fb8ded5 _plug_amphora_vip /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/allowed_address_pairs.py:97 [ ... ] 2019-05-05 20:46:15.405 18234 DEBUG octavia.network.drivers.neutron.allowed_address_pairs [-] Retrieving network details for amphora 5bec4c09-a209-4e73-a66e-e4fc0fb8ded5 _get_amp_net_configs /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/allowed_address_pairs.py:596 [ ... ] 2019-05-05 20:46:15.837 18234 DEBUG octavia.amphorae.drivers.haproxy.rest_api_driver [-] Post-VIP-Plugging with vrrp_ip 10.0.2.13 vrrp_port b0398cc8-6d52-4f12-9f1f-1141b0f10751 post_vip_plug /opt/openstack/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py:233 2019-05-05 20:46:15.838 18234 DEBUG octavia.amphorae.drivers.haproxy.rest_api_driver [-] request url plug/vip/10.0.2.24 request /opt/openstack/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py:462 2019-05-05 20:46:15.838 18234 DEBUG octavia.amphorae.drivers.haproxy.rest_api_driver [-] request url https://172.16.252.35:9443/0.5/plug/vip/10.0.2.24 request /opt/openstack/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py:465 2019-05-05 20:46:16.089 18234 DEBUG octavia.amphorae.drivers.haproxy.rest_api_driver [-] Connected to amphora. 
Response: request /opt/openstack/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py:486 2019-05-05 20:46:16.090 18234 ERROR octavia.amphorae.drivers.haproxy.exceptions [-] Amphora agent returned unexpected result code 500 with response {'message': 'Error plugging VIP', 'details': 'SIOCADDRT: Network is unreachable\nFailed to bring up eth1.\n'} During the process, NEUTRON logs contains the following records that indicate the following (note "status=DOWN" in neutron-dhcp-agent; later immediately before to be deleted, it will shed 'ACTIVE'): May  5 20:46:13 ardbeg neutron-dhcp-agent: 2019-05-05 20:46:13.857 1804 INFO neutron.agent.dhcp.agent [req-07833602-9579-403b-a264-76fd3ee408ee a18f38c780074c6280dde5edad159666 41a02a69918849509f4102b04f8a7de9 - - -] Trigger reload_allocations for port admin_state_up=True, allowed_address_pairs=[{u'ip_address': u'10.0.2.24', u'mac_address': u'72:d0:1c:4c:94:91'}], binding:host_id=ardbeg, binding:profile=, binding:vif_details=datapath_type=system, ovs_hybrid_plug=False, port_filter=True, binding:vif_type=ovs, binding:vnic_type=normal, created_at=2019-05-05T20:46:07Z, description=, device_id=f1bce6e9-be5b-464b-8f64-686f36e9de1f, device_owner=compute:nova, dns_assignment=[{u'hostname': u'amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5', u'ip_address': u'10.0.2.13', u'fqdn': u'amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5.loqal.'}], dns_domain=, dns_name=amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5, extra_dhcp_opts=[], fixed_ips=[{u'subnet_id': u'24b10886-3d53-4aee-bdc6-f165b242ae4f', u'ip_address': u'10.0.2.13'}], id=b0398cc8-6d52-4f12-9f1f-1141b0f10751, mac_address=72:d0:1c:4c:94:91, name=octavia-lb-vrrp-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5, network_id=b24d2830-eec6-4abd-82f2-ac71c8ecbf40, port_security_enabled=True, project_id=41a02a69918849509f4102b04f8a7de9, qos_policy_id=None, revision_number=5, security_groups=[u'6df53a15-6afc-4c99-b464-03de4f546b4f'], status=DOWN, tags=[], tenant_id=41a02a69918849509f4102b04f8a7de9, updated_at=2019-05-05T20:46:13Z May  5 20:46:14 ardbeg neutron-openvswitch-agent: 2019-05-05 20:46:14.185 31542 INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-a4425cdb-afc1-4f6a-9ef9-c8706e3285d6 - - - - -] Port b0398cc8-6d52-4f12-9f1f-1141b0f10751 updated. 
Details: {'profile': {}, 'network_qos_policy_id': None, 'qos_policy_id': None, 'allowed_address_pairs': [{'ip_address': AuthenticIPNetwork('10.0.2.24'), 'mac_address': EUI('72:d0:1c:4c:94:91')}], 'admin_state_up': True, 'network_id': 'b24d2830-eec6-4abd-82f2-ac71c8ecbf40', 'segmentation_id': 437, 'fixed_ips': [{'subnet_id': '24b10886-3d53-4aee-bdc6-f165b242ae4f', 'ip_address': '10.0.2.13'}], 'device_owner': u'compute:nova', 'physical_network': None, 'mac_address': '72:d0:1c:4c:94:91', 'device': u'b0398cc8-6d52-4f12-9f1f-1141b0f10751', 'port_security_enabled': True, 'port_id': 'b0398cc8-6d52-4f12-9f1f-1141b0f10751', 'network_type': u'vxlan', 'security_groups': [u'6df53a15-6afc-4c99-b464-03de4f546b4f']} May  5 20:46:14 ardbeg neutron-openvswitch-agent: 2019-05-05 20:46:14.197 31542 INFO neutron.agent.securitygroups_rpc [req-a4425cdb-afc1-4f6a-9ef9-c8706e3285d6 - - - - -] Preparing filters for devices set([u'b0398cc8-6d52-4f12-9f1f-1141b0f10751']) Note Nova returns response 200/completed: May  5 20:46:14 controller-l neutron-server: 2019-05-05 20:46:14.326 20981 INFO neutron.notifiers.nova [-] Nova event response: {u'status': u'completed', u'tag': u'b0398cc8-6d52-4f12-9f1f-1141b0f10751', u'name': u'network-changed', u'server_uuid': u'f1bce6e9-be5b-464b-8f64-686f36e9de1f', u'code': 200} and "openstack server show" shows both NICs are attached to the amphorae: $ openstack server show f1bce6e9-be5b-464b-8f64-686f36e9de1f +-------------------------------------+------------------------------------------------------------+ | Field                               | Value                                                      | +-------------------------------------+------------------------------------------------------------+ [ ... ] | addresses                           | octavia-net=172.16.252.35; u1000-p1000-xbone=10.0.2.13     | +-------------------------------------+------------------------------------------------------------+ Later Octavia worker reports the following: 2019-05-05 20:46:16.124 18234 DEBUG octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-plug-net-subflow-octavia-amp-post-vip-plug' (f105ced1-72c6-4116-b582-599a21cdee36) transitioned into state 'REVERTING' from state 'FAILURE' _task_receiver /opt/openstack/lib/python3.6/site-packages/taskflow/listeners/logging.py:194 2019-05-05 20:46:16.127 18234 WARNING octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-plug-net-subflow-octavia-amp-post-vip-plug' (f105ced1-72c6-4116-b582-599a21cdee36) transitioned into state 'REVERTED' from state 'REVERTING' with result 'None' 2019-05-05 20:46:16.141 18234 DEBUG octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-plug-net-subflow-reload-amp-after-plug-vip' (c4d6222e-2508-4a9c-9514-e7f9bcf84e31) transitioned into state 'REVERTING' from state 'SUCCESS' _task_receiver /opt/openstack/lib/python3.6/site-packages/taskflow/listeners/logging.py:194 2019-05-05 20:46:16.142 18234 WARNING octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-plug-net-subflow-reload-amp-after-plug-vip' (c4d6222e-2508-4a9c-9514-e7f9bcf84e31) transitioned into state 'REVERTED' from state 'REVERTING' with result 'None' 2019-05-05 20:46:16.146 18234 DEBUG octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-plug-net-subflow-ocatvia-amp-update-vip-data' (2e1d1a04-282d-43b7-8c4f-fe31e75804ea) transitioned into state 'REVERTING' from state 'SUCCESS' _task_receiver 
/opt/openstack/lib/python3.6/site-packages/taskflow/listeners/logging.py:194 2019-05-05 20:46:16.148 18234 WARNING octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-plug-net-subflow-ocatvia-amp-update-vip-data' (2e1d1a04-282d-43b7-8c4f-fe31e75804ea) transitioned into state 'REVERTED' from state 'REVERTING' with result 'None' 2019-05-05 20:46:16.173 18234 DEBUG octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-plug-net-subflow-octavia-amp-plug-vip' (c63a5bed-f531-4ed3-83d2-bce72e835932) transitioned into state 'REVERTING' from state 'SUCCESS' _task_receiver /opt/openstack/lib/python3.6/site-packages/taskflow/listeners/logging.py:194 2019-05-05 20:46:16.174 18234 WARNING octavia.controller.worker.tasks.network_tasks [-] Unable to plug VIP for amphora id 5bec4c09-a209-4e73-a66e-e4fc0fb8ded5 load balancer id e01c6ff5-179a-4ed5-ae5d-1d00d6c584b8 and Neutron then deletes port but NOTE that immediately before deletion port reported by neutron-dhcp-agent as ACTIVE: May  5 20:46:17 ardbeg neutron-dhcp-agent: 2019-05-05 20:46:17.080 1804 INFO neutron.agent.dhcp.agent [req-835e5b91-28e5-44b9-a463-d04a0323294f a18f38c780074c6280dde5edad159666 41a02a69918849509f4102b04f8a7de9 - - -] Trigger reload_allocations for port admin_state_up=True, allowed_address_pairs=[], binding:host_id=ardbeg, binding:profile=, binding:vif_details=datapath_type=system, ovs_hybrid_plug=False, port_filter=True, binding:vif_type=ovs, binding:vnic_type=normal, created_at=2019-05-05T20:46:07Z, description=, device_id=f1bce6e9-be5b-464b-8f64-686f36e9de1f, device_owner=compute:nova, dns_assignment=[{u'hostname': u'amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5', u'ip_address': u'10.0.2.13', u'fqdn': u'amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5.loqal.'}], dns_domain=, dns_name=amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5, extra_dhcp_opts=[], fixed_ips=[{u'subnet_id': u'24b10886-3d53-4aee-bdc6-f165b242ae4f', u'ip_address': u'10.0.2.13'}], id=b0398cc8-6d52-4f12-9f1f-1141b0f10751, mac_address=72:d0:1c:4c:94:91, name=octavia-lb-vrrp-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5, network_id=b24d2830-eec6-4abd-82f2-ac71c8ecbf40, port_security_enabled=True, project_id=41a02a69918849509f4102b04f8a7de9, qos_policy_id=None, revision_number=8, security_groups=[u'ba20352e-95b9-4c97-a688-59d44e3aa8cf'], status=ACTIVE, tags=[], tenant_id=41a02a69918849509f4102b04f8a7de9, updated_at=2019-05-05T20:46:16Z May  5 20:46:17 controller-l neutron-server: 2019-05-05 20:46:17.086 20981 INFO neutron.wsgi [req-835e5b91-28e5-44b9-a463-d04a0323294f a18f38c780074c6280dde5edad159666 41a02a69918849509f4102b04f8a7de9 - default default] 10.0.10.31 "PUT /v2.0/ports/b0398cc8-6d52-4f12-9f1f-1141b0f10751 HTTP/1.1" status: 200  len: 1395 time: 0.6318841 May  5 20:46:17 controller-l neutron-server: 2019-05-05 20:46:17.153 20981 INFO neutron.wsgi [req-37ee0da3-8dcc-4fb8-9cd3-91c5a8dcedef a18f38c780074c6280dde5edad159666 41a02a69918849509f4102b04f8a7de9 - default default] 10.0.10.31 "GET /v2.0/ports/b0398cc8-6d52-4f12-9f1f-1141b0f10751 HTTP/1.1" status: 200  len: 1395 time: 0.0616651 May  5 20:46:18 controller-l neutron-server: 2019-05-05 20:46:18.179 20981 INFO neutron.wsgi [req-8896542e-5dcb-4e6d-9379-04cd88c4035b a18f38c780074c6280dde5edad159666 41a02a69918849509f4102b04f8a7de9 - default default] 10.0.10.31 "DELETE /v2.0/ports/b0398cc8-6d52-4f12-9f1f-1141b0f10751 HTTP/1.1" status: 204  len: 149 time: 1.0199890 Thank you. -- Volodymyr Litovka "Vision without Execution is Hallucination." 
-- Thomas Edison -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Sun May 5 23:54:18 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Sun, 5 May 2019 17:54:18 -0600 (MDT) Subject: [placement][nova][ptg] Summary: Consumer Types Message-ID: We had a brief conversation in the placement room yesterday (Saturday May 5th) to confirm we were all on the same page with regard to consumer types. These provide a way to say that a set of allocations "is an instance" or "is a migration" and will help with quota accounting. We decided that since no one has stepped forward with a more complicated scenario, at this time, we will go with the simplest implementation, for now: * add a consumer types table that has a key and string (length to be determined, values controlled by clients) that represents a "type". For example (1, 'instance') * add a column on consumer table that takes one of those keys * create a new row in the types table only when a new type is created, don't worry about expiring them * provide an online migration to default existing consumers to 'instance' and treat unset types as 'instance' [1]. This probably needs some confirmation from mel and others that it is suitable. If not, please provide an alternative suggestion. * In a new microversion: allow queries to /usages to use a consumer type parameter to limit results to particular types and add 'consumer_type' key will be added to the body of an 'allocations' in both PUT and POST. * We did not discuss in the room, but the email thread [2] did: We may need to consider grouping /usages results by type but we could probably get by without changing that (and do multiple requests, sometimes). Surya, thank her very much, has volunteered to work on this and has started a spec at [3]. We have decided, again due to lack of expressed demand, to do any work (at this time) related to resource provider partitioning [4]. There's a pretty good idea on how to do this, but enough other stuff going on there's not time. Because we decided in that thread that any one resource provider can only be in one partition, there is also a very easy workaround: Run another placement server. It takes only a few minutes to set one up [5] This means that all of the client services of a single placement service need to coordinate on what consumer types they are using. (This was already true, but stated here for emphasis.) [1] I'm tempted to test how long a million or so rows of consumers would take to update. If it is short enough we may wish to break with the nova tradition of not doing data migrations in schema migrations (placement-manage db sync). But we didn't get a chance to discuss that in the room. [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/thread.html#4720 [3] https://review.opendev.org/#/c/654799/ [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/004721.html [5] https://docs.openstack.org/placement/latest/install/from-pypi.html -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From cdent+os at anticdent.org Mon May 6 00:21:09 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Sun, 5 May 2019 18:21:09 -0600 (MDT) Subject: [placement][nova][ptg] Summary: Nested Magic With Placement In-Reply-To: References: Message-ID: On Sat, 4 May 2019, Chris Dent wrote: > On Fri, 3 May 2019, Chris Dent wrote: > >> * This (Friday) afternoon at the PTG I'll be creating rfe stories >> associated with these changes. 
If you'd like to help with that, find >> me in the placement room (109). We'll work out whether those >> stories needs specs in the normally processing of the stories. >> We'll also need to find owners for many of them. > > I decided to capture all of this in one story: > > https://storyboard.openstack.org/#!/story/2005575 > > which will likely need to be broken into several stories, or at > least several detailed tasks. I have added some images to the story. They are from flipchart drawings made during yesterday's discussions and reflect some syntax and semantics decisions we made. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From cdent+os at anticdent.org Mon May 6 00:57:51 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Sun, 5 May 2019 18:57:51 -0600 (MDT) Subject: [placement][ironic][blazar] Summary: Placment + Ironic and Blazar Message-ID: There were a few different discussions about Ironic using placement in various ways. On line 117 (for now) of the placement PTG etherpad [1] there are some notes about things that Ironic and Blazar could do for reservations. These are not expected to require any changes in placement. Dmitry and Tetsuro may have more to say about this. There was also a separate discussion about the options for using Placement do granular/detailed expression of available resources but full/chunky consumption of resources in a context where Ironic is running without Nova. That is: * record inventory for baremental nodes that say the inventory of node1 is CPU:24,DISK_GB:1048576,MEMORY_MB=1073741824 (and whatever else). * query something smaller (eg CPU:8,DISK_GB:524288,MEMORY_MB:536870912) in a GET /allocation_candidates * include node1 in the results along with others, let the client side sort using provider summaries * send back some mode of allocation that consumes the entire inventory of node1 There were a few different ideas on how to do that last step. One idea would have required different resource providers have an attribute that caused a change in behavior when allocated to. This was dismissed as "too much business logic in the guts". Another option was a flag on PUT /allocations that says "consume everything, despite what I've said". However, the option that was most favored was a new endpoint (name to be determined later if we ever do this) that is for the purpose of "fullly consuming the named resource provider". Clearly this is something that fairly contrary to how Nova imagines baremetal instances, but makes pretty good sense outside of the context where people want to be able to use placement to simultaneously get a flexible view of their baremetal resources and also track them accurately. There are no immediate plans to do this, but there are plans for Dmitry to continue investigating the options and seeing what can or cannot work. Having the feature described above would make things cleaner. I have made a placeholder (and low priority) story [2] that will link back to this email. [1] https://etherpad.openstack.org/p/placement-ptg-train [2] https://storyboard.openstack.org/#!/story/2005575 -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From eumel at arcor.de Mon May 6 00:58:30 2019 From: eumel at arcor.de (Frank Kloeker) Date: Mon, 06 May 2019 02:58:30 +0200 Subject: [I18n] Translation plan Train Message-ID: <64b2c98efa6931fbf9d3a5e288a3cf79@arcor.de> Hello Stackers, hopefully you enjoyed the time during the Open Infra Summit and the PTG in Denver - onsite or remotely. 
Maybe you inspired also the spirit from something new which will start from now on. As usually we at I18n after release we merge translations back from stable branch to master and starting with a new translation plan [1]. Without a simple copy of the last one I investigated project status under the help of commits and the OpenStack Health Tracker. That's very helpful to see which projects are active and which one have a break, so we can adjust translation priority. At the end the translation plan will be a little bit shorter and we have enough space to onboard new stuff. Additional project docs are not decided yet. But we have the new projects outside OpenStack like Airship, StarlingX and Zuul and if you think on the next Summit in Shanghai and the target downstream users, a translated version of user interfaces or documentation might be useful. If you have questions or remarks, let me know, or Ian, or the list :) Frank [1] https://translate.openstack.org/version-group/view/Train-dashboard-translation/projects From cdent+os at anticdent.org Mon May 6 01:23:31 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Sun, 5 May 2019 19:23:31 -0600 (MDT) Subject: [placement][ptg] Open Questions Message-ID: A few questions we failed to resolve during the PTG that we should work out over the next couple of weeks. * There are two specs in progress related to more flexible ways to filter traits: * any trait in allocation candidates https://review.opendev.org/#/c/649992/ * support mixing required traits with any traits https://review.opendev.org/#/c/649368/ Do we have pending non-placement features which depend on the above being completed? I got the impression during the nova-placement xproj session that maybe they were, but it's not clear. Anyone willing to state one way or another? * We had several RFE stories already in progress, and have added a few more during the PTG. We have not done much in the way of prioritizing these. We certainly can't do them all. Here's a link to the current RFE stories in the placement group (this includes placement, osc-placement and os-*). https://storyboard.openstack.org/#!/worklist/594 I've made a simple list of those on an etherpad, please register you +1 or -1 (or nothing) on each of those. Keep in mind that there are several features in "Update nested provider support to address train requirements" and that we've already committed to them. Please let me know what I've forgotten. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From sergey at vilgelm.info Mon May 6 01:36:00 2019 From: sergey at vilgelm.info (Sergey Vilgelm) Date: Sun, 5 May 2019 20:36:00 -0500 Subject: [qa][ptg][patrole] RBAC testing improvement ideas for Patrole In-Reply-To: <16a86d4834e.e46610fc23956.8020827235456111857@ghanshyammann.com> References: <16a86d4834e.e46610fc23956.8020827235456111857@ghanshyammann.com> Message-ID: <7a5dcff9-ca99-496e-a022-f06830fd03a5@Spark> Hi, Gmann, thank you so much. 1. I’m not sure that I understood the #1. Do you mean that oslo.policy will raise a special exceptions for successful and unsuccessful verification if the flag is set? So a service will see the exception and just return it. And Patorle can recognize those exceptions? I’m totally agree with using one job for one services, It can give us a possibility to temporary disable some services and allow patches for other services to be tested and merged. 2. +1 for the option 2. 
We can decrease the number of jobs and have just one job for one services, but we need to think about how to separate the logs. IMO we need to extend the `action` decorator to run a test 9 times (depends on the configuration) and memorize all results for all combinations and use something like `if not all(results): raise PatroleException()` -- Sergey Vilgelm https://www.vilgelm.info On May 5, 2019, 2:15 AM -0500, Ghanshyam Mann , wrote: > Patrole is emerging as a good tool for RBAC testing. AT&T already running it on their production cloud and > we have got a good amount of interest/feedback from other operators. > > We had few discussions regarding the Patrole testing improvement during PTG among QA, Nova, Keystone team. > I am writing the summary of those discussions below and would like to get the opinion from Felipe & Sergey also. > > 1. How to improve the Patrole testing time: > Currently Patrole test perform the complete API operaion which takes time and make Patrole testing > very long. Patrole is responsible to test the policies only so does not need to wait for API complete operation > to be completed. > John has a good idea to handle that via flag. If that flag is enabled (per service and disabled by default) then > oslo.policy can return some different error code on success (other than 403). The API can return the response > with that error code which can be treated as pass case in Patrole. > Morgan raises a good point on making it per API call than global. We can do that as next step and let's > start with the global flag per service as of now? > - https://etherpad.openstack.org/p/ptg-train-xproj-nova-keystone > > Another thing we should improve in current Patrole jobs is to separate the jobs per service. Currently, all 5 services > are installed and run in a single job. Running all on Patrole gate is good but the project side gate does not need to run > any other service tests. For example, patrole-keystone which can install the only keystone and run only > keystone tests. This way project can reuse the patrole jobs only and does not need to prepare a separate job. > > 2. How to run patrole tests with all negative, positive combination for all scope + defaults roles combinations: > - Current jobs patrole-admin/member/reader are able to test the negative pattern. For example: > patrole-member job tests the admin APIs in a negative way and make sure test is passed only if member > role gets 403. > - As we have scope_type support also we need to extend the jobs to run for all 9 combinations of 3 scopes > (system, project, domain) and 3 roles(admin, member, reader). > - option1: running 9 different jobs with each combination as we do have currently > for admin, member, reader role. The issue with this approach is gate will take a lot of time to > run these 9 jobs separately. > - option2: Run all the 9 combinations in a single job with running the tests in the loop with different > combination of scope_roles. This might require the current config option [role] to convert to list type > and per service so that the user can configure what all default roles are available for corresponding service. > This option can save a lot of time to avoid devstack installation time as compared to 9 different jobs option. > > -gmann > > -------------- next part -------------- An HTML attachment was scrubbed... 
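(For what it's worth, a minimal sketch of the looping-decorator idea mentioned at the top of this reply, written here only for illustration -- none of these names are the real Patrole `action` decorator or its helpers, and real per-scope/role credential switching is more involved:

import functools

SCOPES = ('system', 'domain', 'project')
ROLES = ('admin', 'member', 'reader')

class RbacValidationError(Exception):
    pass

def all_persona_combinations(check):
    """Run one policy check per scope/role pair and fail if any pair fails.

    `check` is any callable taking (scope, role) plus the test's own
    arguments and returning True when the observed API behaviour matches
    the expected policy result for that persona.
    """
    @functools.wraps(check)
    def wrapper(*args, **kwargs):
        results = [check(scope, role, *args, **kwargs)
                   for scope in SCOPES for role in ROLES]
        if not all(results):
            raise RbacValidationError(
                'policy check failed for at least one scope/role pair')
        return results
    return wrapper

A real version would also need to re-authenticate the test clients for each pair and report which combination failed, but the control flow is exactly the `if not all(results)` idea described above.)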
URL: From aaronzhu1121 at gmail.com Mon May 6 02:30:05 2019 From: aaronzhu1121 at gmail.com (Rong Zhu) Date: Mon, 6 May 2019 10:30:05 +0800 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: References: Message-ID: Hi Sergey, Do we have any process about my colleague's data loss problem? Sergey Nikitin 于2019年4月29日 周一19:57写道: > Thank you for information! I will take a look > > On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu wrote: > >> Hi there, >> >> Recently we found we lost a person's data from our company at the >> stackalytics website. >> You can check the merged patch from [0], but there no date from >> the stackalytics website. >> >> stackalytics info as below: >> Company: ZTE Corporation >> Launchpad: 578043796-b >> Gerrit: gengchc2 >> >> Look forward to hearing from you! >> > Best Regards, Rong Zhu > >> -- Thanks, Rong Zhu -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Mon May 6 07:26:51 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Mon, 6 May 2019 02:26:51 -0500 Subject: [neutron] Unable to configure multiple external networks Message-ID: Hello All, I am trying to install Openstack Stein on a single node, with multiple external networks (both networks are also shared). However, i keep getting the following error in the logs, and the router interfaces show as down. 2019-05-06 02:19:45.046 52175 ERROR neutron.agent.l3.agent 2019-05-06 02:19:45.048 52175 INFO neutron.agent.l3.agent [-] Starting router update for a2ec6c99-944e-408a-945a-dffbe09f65ce, action 3, priority 2 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: a2ec6c99-944e-408a-945a-dffbe09f65ce: Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network. 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 701, in _process_routers_if_compatible 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router) 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 548, in _process_router_if_compatible 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent     target_ex_net_id = self._fetch_external_net_id() 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 376, in _fetch_external_net_id 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent     raise Exception(msg) 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network. 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent 2019-05-06 02:19:46.252 52175 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for a2ec6c99-944e-408a-945a-dffbe09f65ce, action 3 2019-05-06 02:19:46.253 52175 WARNING neutron.agent.l3.agent [-] Info for router a2ec6c99-944e-408a-945a-dffbe09f65ce was not found. Performing router cleanup I have set these parameters to empty, as mentioned in the docs. 
/etc/neutron/l3_agent.ini gateway_external_network_id = external_network_bridge = interface_driver = openvswitch I tried linuxbridge-agent too,but i could not get rid of the above error.  openstack port list --router router1 +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+ | ID                                   | Name | MAC Address       | Fixed IP Addresses                                                         | Status | +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+ | 1bcaad17-17ed-4383-9206-34417f8fd2df |      | fa:16:3e:c1:b1:1f | ip_address='192.168.1.1', subnet_id='b00cb3bf-ca89-4e00-8bd7-83a75dbb6080' | DOWN   | | f49d976f-b733-4360-9d1f-cdd35ecf54e6 |      | fa:16:3e:54:82:4b | ip_address='10.0.10.11', subnet_id='7cc01a33-f078-494d-9b0b-e988f5b4915d'  | DOWN   | +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+————+ However it does work when i have just one external network  openstack port list --router router1 +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ | ID                                   | Name | MAC Address       | Fixed IP Addresses                                                             | Status | +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ | cdb06cf7-7492-4275-bd93-88a46b9769a8 |      | fa:16:3e:7c:ea:55 | ip_address='192.168.1.1', subnet_id='b00cb3bf-ca89-4e00-8bd7-83a75dbb6080'     | ACTIVE | | fc9b06d7-d377-451b-9af5-07e1fab072dc |      | fa:16:3e:d0:6d:7c | ip_address='140.163.188.149', subnet_id='4a2bf30a-e7f8-44c1-8b08-4de01b2b1296' | ACTIVE | +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ May i please know, how to get the above working. I have seen multiple articles online that mention that this should be working, however i am unable to get this to work. It is really important for us to have to have 2 external networks in the environment, and be able to route to both of them if possible. Thank you, Lohit -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Mon May 6 07:58:15 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Mon, 6 May 2019 02:58:15 -0500 Subject: [neutron] Unable to configure multiple external networks In-Reply-To: References: Message-ID: It started to work , after i modified this code: def _fetch_external_net_id(self, force=False):         """Find UUID of single external network for this agent."""         self.conf.gateway_external_network_id = ''         #if self.conf.gateway_external_network_id:         #    return self.conf.gateway_external_network_id         return self.conf.gateway_external_network_id from https://github.com/openstack/neutron/blob/master/neutron/agent/l3/agent.py Looks like, that respective option is not being read correctly from the respective configuration file. 
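As a side note, one way to double-check what oslo.config actually parses from that file, independent of the neutron agent code, is a tiny standalone script along these lines (option names and path are taken from the messages above; this only verifies parsing and is not a fix for the underlying issue):

from oslo_config import cfg

# Register just the two options in question and point oslo.config at the
# same file the l3 agent uses, then print what it actually sees.
opts = [
    cfg.StrOpt('gateway_external_network_id', default=''),
    cfg.StrOpt('external_network_bridge', default=''),
]

conf = cfg.ConfigOpts()
conf.register_opts(opts)
conf(args=['--config-file', '/etc/neutron/l3_agent.ini'], project='neutron')

print(repr(conf.gateway_external_network_id))
print(repr(conf.external_network_bridge))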
Regards, Lohit On May 6, 2019, 2:28 AM -0500, valleru at cbio.mskcc.org, wrote: > Hello All, > > I am trying to install Openstack Stein on a single node, with multiple external networks (both networks are also shared). > However, i keep getting the following error in the logs, and the router interfaces show as down. > > 2019-05-06 02:19:45.046 52175 ERROR neutron.agent.l3.agent > 2019-05-06 02:19:45.048 52175 INFO neutron.agent.l3.agent [-] Starting router update for a2ec6c99-944e-408a-945a-dffbe09f65ce, action 3, priority 2 > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: a2ec6c99-944e-408a-945a-dffbe09f65ce: Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network. > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 701, in _process_routers_if_compatible > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router) > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 548, in _process_router_if_compatible > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent     target_ex_net_id = self._fetch_external_net_id() > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 376, in _fetch_external_net_id > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent     raise Exception(msg) > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network. > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent > 2019-05-06 02:19:46.252 52175 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for a2ec6c99-944e-408a-945a-dffbe09f65ce, action 3 > 2019-05-06 02:19:46.253 52175 WARNING neutron.agent.l3.agent [-] Info for router a2ec6c99-944e-408a-945a-dffbe09f65ce was not found. Performing router cleanup > > > I have set these parameters to empty, as mentioned in the docs. > > /etc/neutron/l3_agent.ini > > gateway_external_network_id = > external_network_bridge = > interface_driver = openvswitch > > I tried linuxbridge-agent too,but i could not get rid of the above error. 
> >  openstack port list --router router1 > > +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+ > | ID                                   | Name | MAC Address       | Fixed IP Addresses                                                         | Status | > +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+ > | 1bcaad17-17ed-4383-9206-34417f8fd2df |      | fa:16:3e:c1:b1:1f | ip_address='192.168.1.1', subnet_id='b00cb3bf-ca89-4e00-8bd7-83a75dbb6080' | DOWN   | > | f49d976f-b733-4360-9d1f-cdd35ecf54e6 |      | fa:16:3e:54:82:4b | ip_address='10.0.10.11', subnet_id='7cc01a33-f078-494d-9b0b-e988f5b4915d'  | DOWN   | > +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+————+ > > However it does work when i have just one external network > >  openstack port list --router router1 > +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ > | ID                                   | Name | MAC Address       | Fixed IP Addresses                                                             | Status | > +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ > | cdb06cf7-7492-4275-bd93-88a46b9769a8 |      | fa:16:3e:7c:ea:55 | ip_address='192.168.1.1', subnet_id='b00cb3bf-ca89-4e00-8bd7-83a75dbb6080'     | ACTIVE | > | fc9b06d7-d377-451b-9af5-07e1fab072dc |      | fa:16:3e:d0:6d:7c | ip_address='140.163.188.149', subnet_id='4a2bf30a-e7f8-44c1-8b08-4de01b2b1296' | ACTIVE | > +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ > > May i please know, how to get the above working. > I have seen multiple articles online that mention that this should be working, however i am unable to get this to work. > It is really important for us to have to have 2 external networks in the environment, and be able to route to both of them if possible. > > > Thank you, > Lohit > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ifatafekn at gmail.com Mon May 6 08:50:45 2019 From: ifatafekn at gmail.com (Ifat Afek) Date: Mon, 6 May 2019 11:50:45 +0300 Subject: [vitrage] No IRC meeting this week Message-ID: Hi, The IRC meeting this week is canceled, since most of Vitrage contributors will be on vacation. We will meet again on Wednesday, May 15. Thanks, Ifat -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at ericsson.com Mon May 6 09:14:00 2019 From: balazs.gibizer at ericsson.com (=?utf-8?B?QmFsw6F6cyBHaWJpemVy?=) Date: Mon, 6 May 2019 09:14:00 +0000 Subject: [placement][ptg] Open Questions In-Reply-To: References: Message-ID: <1557134030.12068.0@smtp.office365.com> On Mon, May 6, 2019 at 3:23 AM, Chris Dent wrote: > > A few questions we failed to resolve during the PTG that we should > work out over the next couple of weeks. 
> > * There are two specs in progress related to more flexible ways to > filter traits: > > * any trait in allocation candidates > > https://protect2.fireeye.com/url?k=d24a660a-8ec044e0-d24a2691-0cc47ad93e32-6536d6dba76bfa81&u=https://review.opendev.org/#/c/649992/ > > * support mixing required traits with any traits > > https://protect2.fireeye.com/url?k=dbbffb10-8735d9fa-dbbfbb8b-0cc47ad93e32-545ba4b564785811&u=https://review.opendev.org/#/c/649368/ > > Do we have pending non-placement features which depend on the > above being completed? I got the impression during the > nova-placement xproj session that maybe they were, but it's not > clear. Anyone willing to state one way or another? From the first spec: "This is required for the case when a Neutron network maps to more than one physnets but the port's bandwidth request can be fulfilled from any physnet the port's network maps to." So yes there is a use case that can only be supported if placement supports any traits in a_c query. It is to support multisegment neutron networks with QoS minimum bandwidth rule and with more than one segment mapped to physnet. A reason we did not discussed it in detail is that the use case was downprioritized on my side. (see https://etherpad.openstack.org/p/ptg-train-xproj-nova-neutron L40) > > * We had several RFE stories already in progress, and have added a > few more during the PTG. We have not done much in the way of > prioritizing these. We certainly can't do them all. Here's a link > to the current RFE stories in the placement group (this includes > placement, osc-placement and os-*). > > https://storyboard.openstack.org/#!/worklist/594 > > I've made a simple list of those on an etherpad, please register > you +1 or -1 (or nothing) on each of those. Keep in mind that > there are several features in "Update nested provider support to > address train requirements" and that we've already committed to > them. Did you forget to paste the etherpad link? > > Please let me know what I've forgotten. > > -- > Chris Dent ٩◔̯◔۶ > https://protect2.fireeye.com/url?k=2e065f7d-728c7d97-2e061fe6-0cc47ad93e32-0c1780ffb89507f5&u=https://anticdent.org/ > freenode: cdent tw: @anticdent From emilien at redhat.com Mon May 6 09:27:05 2019 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 6 May 2019 11:27:05 +0200 Subject: [tripleo] deprecating keepalived support Message-ID: We introduced Keepalived a long time ago when we wanted to manage virtual IPs (VIPs) on the Undercloud when SSL is enabled and also for an HA alternative to Pacemaker on the overcloud, The multi-node undercloud with more than once instance of Keepalived never got attraction (so VRRP hasn't been useful for us), and Pacemaker is the de-facto tool to control HA VIPs on the Overcloud. Therefore, let's continue to trim-down our services and deprecate Keepalived. https://blueprints.launchpad.net/tripleo/+spec/replace-keepalived-undercloud The creation of control plane IP & public host IP can be done with os-net-config, and the upgrade path is simple. I've been working on 2 patches: # Introduce tripleo-container-rpm role https://review.opendev.org/#/c/657279/ Deprecate tripleo-docker-rm and add a generic role which supports both Docker & Podman. In the case of Podman, we cleanup the systemd services and container. 
# Deprecate Keepalived https://review.opendev.org/#/c/657067/ Remove Keepalived from all the roles, deprecate the service, tear-down Keepalived from the HAproxy service (if it was running), and use os-net-config to configure the interfaces previously managed by Keepalived service. I've tested the upgrade and it seems to work fine: https://asciinema.org/a/MpKBYU1PFvXcYai7aHUwy79LK Please let us know any concern and we'll address it. Thanks, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at ericsson.com Mon May 6 09:33:36 2019 From: balazs.gibizer at ericsson.com (=?utf-8?B?QmFsw6F6cyBHaWJpemVy?=) Date: Mon, 6 May 2019 09:33:36 +0000 Subject: [placement][nova][ptg] Summary: Consumer Types In-Reply-To: References: Message-ID: <1557135206.12068.1@smtp.office365.com> On Mon, May 6, 2019 at 1:54 AM, Chris Dent wrote: > > We had a brief conversation in the placement room yesterday > (Saturday May 5th) to confirm we were all on the same page with > regard to consumer types. These provide a way to say that a set of > allocations "is an instance" or "is a migration" and will help with > quota accounting. > > We decided that since no one has stepped forward with a more > complicated scenario, at this time, we will go with the simplest > implementation, for now: > > * add a consumer types table that has a key and string (length to be > determined, values controlled by clients) that represents a "type". > For example (1, 'instance') > > * add a column on consumer table that takes one of those keys > > * create a new row in the types table only when a new type is > created, don't worry about expiring them > > * provide an online migration to default existing consumers to > 'instance' and treat unset types as 'instance' [1]. This probably > needs some confirmation from mel and others that it is suitable. > If not, please provide an alternative suggestion. If there are ongoing migration then defaulting the consumer type to instance might be incorrect. However nova already has a mechanism to distingush between migration and instance consumer so nova won't break by this. Still nova might want to fix this placement data inconsistency. I guess the new placement microversion will allow to update the consumer type of an allocation. Cheers, gibi > * In a new microversion: allow queries to /usages to use a consumer > type parameter to limit results to particular types and add > 'consumer_type' key will be added to the body of an 'allocations' > in both PUT and POST. > > * We did not discuss in the room, but the email thread [2] did: We > may need to consider grouping /usages results by type but we could > probably get by without changing that (and do multiple requests, > sometimes). > > Surya, thank her very much, has volunteered to work on this and has > started a spec at [3]. > > We have decided, again due to lack of expressed demand, to do any > work (at this time) related to resource provider partitioning [4]. > > There's a pretty good idea on how to do this, but enough other stuff > going on there's not time. Because we decided in that thread that > any one resource provider can only be in one partition, there is > also a very easy workaround: Run another placement server. It takes > only a few minutes to set one up [5] > > This means that all of the client services of a single placement > service need to coordinate on what consumer types they are using. > (This was already true, but stated here for emphasis.) 
> > [1] I'm tempted to test how long a million or so rows of consumers > would take to update. If it is short enough we may wish to break > with the nova tradition of not doing data migrations in schema > migrations (placement-manage db sync). But we didn't get a chance to > discuss that in the room. > > [2] > http://lists.openstack.org/pipermail/openstack-discuss/2019-April/thread.html#4720 > > [3] > https://protect2.fireeye.com/url?k=e2926c01-be19673d-e2922c9a-86ef624f95b6-55a34a8e4a7579ba&u=https://review.opendev.org/#/c/654799/ > > [4] > http://lists.openstack.org/pipermail/openstack-discuss/2019-April/004721.html > > [5] https://docs.openstack.org/placement/latest/install/from-pypi.html > > -- > Chris Dent ٩◔̯◔۶ > https://protect2.fireeye.com/url?k=b585a35f-e90ea863-b585e3c4-86ef624f95b6-8f691958e6e41ae2&u=https://anticdent.org/ > freenode: cdent tw: @anticdent From skaplons at redhat.com Mon May 6 09:49:37 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Mon, 6 May 2019 11:49:37 +0200 Subject: [neutron] Unable to configure multiple external networks In-Reply-To: References: Message-ID: Hi, It is known and already reported issue. Please see https://bugs.launchpad.net/neutron/+bug/1824571 > On 6 May 2019, at 09:58, valleru at cbio.mskcc.org wrote: > > It started to work , after i modified this code: > > def _fetch_external_net_id(self, force=False): > """Find UUID of single external network for this agent.""" > self.conf.gateway_external_network_id = '' > #if self.conf.gateway_external_network_id: > # return self.conf.gateway_external_network_id > return self.conf.gateway_external_network_id > > from https://github.com/openstack/neutron/blob/master/neutron/agent/l3/agent.py > > Looks like, that respective option is not being read correctly from the respective configuration file. > > Regards, > Lohit > > On May 6, 2019, 2:28 AM -0500, valleru at cbio.mskcc.org, wrote: >> Hello All, >> >> I am trying to install Openstack Stein on a single node, with multiple external networks (both networks are also shared). >> However, i keep getting the following error in the logs, and the router interfaces show as down. >> >> 2019-05-06 02:19:45.046 52175 ERROR neutron.agent.l3.agent >> 2019-05-06 02:19:45.048 52175 INFO neutron.agent.l3.agent [-] Starting router update for a2ec6c99-944e-408a-945a-dffbe09f65ce, action 3, priority 2 >> 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: a2ec6c99-944e-408a-945a-dffbe09f65ce: Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network. 
>> 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 701, in _process_routers_if_compatible >> 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) >> 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 548, in _process_router_if_compatible >> 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent target_ex_net_id = self._fetch_external_net_id() >> 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 376, in _fetch_external_net_id >> 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent raise Exception(msg) >> 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network. >> 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent >> 2019-05-06 02:19:46.252 52175 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for a2ec6c99-944e-408a-945a-dffbe09f65ce, action 3 >> 2019-05-06 02:19:46.253 52175 WARNING neutron.agent.l3.agent [-] Info for router a2ec6c99-944e-408a-945a-dffbe09f65ce was not found. Performing router cleanup >> >> >> I have set these parameters to empty, as mentioned in the docs. >> >> /etc/neutron/l3_agent.ini >> >> gateway_external_network_id = >> external_network_bridge = >> interface_driver = openvswitch >> >> I tried linuxbridge-agent too,but i could not get rid of the above error. >> >> openstack port list --router router1 >> >> +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+ >> | ID | Name | MAC Address | Fixed IP Addresses | Status | >> +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+ >> | 1bcaad17-17ed-4383-9206-34417f8fd2df | | fa:16:3e:c1:b1:1f | ip_address='192.168.1.1', subnet_id='b00cb3bf-ca89-4e00-8bd7-83a75dbb6080' | DOWN | >> | f49d976f-b733-4360-9d1f-cdd35ecf54e6 | | fa:16:3e:54:82:4b | ip_address='10.0.10.11', subnet_id='7cc01a33-f078-494d-9b0b-e988f5b4915d' | DOWN | >> +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+————+ >> >> However it does work when i have just one external network >> >> openstack port list --router router1 >> +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ >> | ID | Name | MAC Address | Fixed IP Addresses | Status | >> +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ >> | cdb06cf7-7492-4275-bd93-88a46b9769a8 | | fa:16:3e:7c:ea:55 | ip_address='192.168.1.1', subnet_id='b00cb3bf-ca89-4e00-8bd7-83a75dbb6080' | ACTIVE | >> | fc9b06d7-d377-451b-9af5-07e1fab072dc | | fa:16:3e:d0:6d:7c | ip_address='140.163.188.149', subnet_id='4a2bf30a-e7f8-44c1-8b08-4de01b2b1296' | ACTIVE | >> 
+--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ >> >> May i please know, how to get the above working. >> I have seen multiple articles online that mention that this should be working, however i am unable to get this to work. >> It is really important for us to have to have 2 external networks in the environment, and be able to route to both of them if possible. >> >> >> Thank you, >> Lohit >> — Slawek Kaplonski Senior software engineer Red Hat From doka.ua at gmx.com Mon May 6 10:50:26 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Mon, 6 May 2019 13:50:26 +0300 Subject: [octavia] Amphora agent returned unexpected result code 500 In-Reply-To: <5798b929-737e-fd29-a2a5-7c1246a632bb@gmx.com> References: <5798b929-737e-fd29-a2a5-7c1246a632bb@gmx.com> Message-ID: Hi, I did some additional tests (out of Octavia, reproducing Octavia's model) to check whether granted roles are enough. Seems, enough: 1) I have "customer" project with plenty of VM's connected to project's local (not shared) network (b24d2...) and subnet (24b10...), which is supposed to be vip-subnet: # openstack subnet show 24b10... project_id: ec62f... 2) I have "octavia" project, where users octavia, nova and neutron have "admin" role 3) under user "octavia" I create port in project "octavia", connected to "customer"s subnet and bind it to VM: octavia at octavia$ openstack port create --network b24d2... --fixed-ip subnet=24b10... --disable-port-security tport port id: 1c883... project_id: 41a02... octavia at octavia$ openstack server create --image cirros-0.4 --flavor B1 --nic port-id=1c883... tserv project_id: 41a02... 4) finally, I able to ping test server from customer project's VMs, despite the fact they're in different projects So it seems that roles to reproduce Octavia's model are enough and Openstack configured in the right way. On 5/6/19 12:34 AM, Volodymyr Litovka wrote: > Dear colleagues, > > trying to launch Amphorae, getting the following error in logs: > > Amphora agent returned unexpected result code 500 with response > {'message': 'Error plugging VIP', 'details': 'SIOCADDRT: Network is > unreachable\nFailed to bring up eth1.\n'} > > While details below, questions are here: > - whether it's enough to assign roles as explained below to special > project for Octavia? > - whether it can be issue with image, created by diskimage_create.sh? > - any recommendation on where to search for the problem. > > Thank you. > > My environment is: > - Openstack Rocky > - Octavia 4.0 > - amphora instance runs in special project "octavia", where users > octavia, nova and neutron have admin role > - amphora image prepared using original git repo process and elements > without modification: > * git clone > * cd octavia > * diskimage-create/diskimage-create.sh > * openstack image create [ ... ] --tag amphora > > After created, amphora instance successfully connects to management > network and can be accessed by controller: > > 2019-05-05 20:46:06.851 18234 DEBUG > octavia.amphorae.drivers.haproxy.rest_api_driver [-] Connected to > amphora. 
Response: request > /opt/openstack/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py:486 > 2019-05-05 20:46:06.852 18234 DEBUG > octavia.controller.worker.tasks.amphora_driver_tasks [-] Successfuly > connected to amphora 5bec4c09-a209-4e73-a66e-e4fc0fb8ded5: > {'ipvsadm_version': '1:1.28-3', 'api_version': '0.5', > 'haproxy_version': '1.6.3-1ubuntu0.2', 'hostname': > 'amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5', 'keepalived_version': > '1:1.2.24-1ubuntu0.16.04.1'} execute > /opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/amphora_driver_tasks.py:372 > [ ... ] > 2019-05-05 20:46:06.990 18234 DEBUG > octavia.controller.worker.tasks.network_tasks [-] Plumbing VIP for > amphora id: 5bec4c09-a209-4e73-a66e-e4fc0fb8ded5 execute > /opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/network_tasks.py:382 > 2019-05-05 20:46:07.003 18234 DEBUG > octavia.network.drivers.neutron.base [-] Neutron extension > security-group found enabled _check_extension_enabled > /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > 2019-05-05 20:46:07.013 18234 DEBUG > octavia.network.drivers.neutron.base [-] Neutron extension > dns-integration found enabled _check_extension_enabled > /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > 2019-05-05 20:46:07.025 18234 DEBUG > octavia.network.drivers.neutron.base [-] Neutron extension qos found > enabled _check_extension_enabled > /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > 2019-05-05 20:46:07.044 18234 DEBUG > octavia.network.drivers.neutron.base [-] Neutron extension > allowed-address-pairs found enabled _check_extension_enabled > /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/base.py:66 > 2019-05-05 20:46:08.406 18234 DEBUG > octavia.network.drivers.neutron.allowed_address_pairs [-] Created vip > port: b0398cc8-6d52-4f12-9f1f-1141b0f10751 for amphora: > 5bec4c09-a209-4e73-a66e-e4fc0fb8ded5 _plug_amphora_vip > /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/allowed_address_pairs.py:97 > [ ... ] > 2019-05-05 20:46:15.405 18234 DEBUG > octavia.network.drivers.neutron.allowed_address_pairs [-] Retrieving > network details for amphora 5bec4c09-a209-4e73-a66e-e4fc0fb8ded5 > _get_amp_net_configs > /opt/openstack/lib/python3.6/site-packages/octavia/network/drivers/neutron/allowed_address_pairs.py:596 > [ ... ] > 2019-05-05 20:46:15.837 18234 DEBUG > octavia.amphorae.drivers.haproxy.rest_api_driver [-] Post-VIP-Plugging > with vrrp_ip 10.0.2.13 vrrp_port b0398cc8-6d52-4f12-9f1f-1141b0f10751 > post_vip_plug > /opt/openstack/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py:233 > 2019-05-05 20:46:15.838 18234 DEBUG > octavia.amphorae.drivers.haproxy.rest_api_driver [-] request url > plug/vip/10.0.2.24 request > /opt/openstack/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py:462 > 2019-05-05 20:46:15.838 18234 DEBUG > octavia.amphorae.drivers.haproxy.rest_api_driver [-] request url > https://172.16.252.35:9443/0.5/plug/vip/10.0.2.24 request > /opt/openstack/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py:465 > 2019-05-05 20:46:16.089 18234 DEBUG > octavia.amphorae.drivers.haproxy.rest_api_driver [-] Connected to > amphora. 
Response: request > /opt/openstack/lib/python3.6/site-packages/octavia/amphorae/drivers/haproxy/rest_api_driver.py:486 > 2019-05-05 20:46:16.090 18234 ERROR > octavia.amphorae.drivers.haproxy.exceptions [-] Amphora agent returned > unexpected result code 500 with response {'message': 'Error plugging > VIP', 'details': 'SIOCADDRT: Network is unreachable\nFailed to bring > up eth1.\n'} > > During the process, NEUTRON logs contains the following records that > indicate the following (note "status=DOWN" in neutron-dhcp-agent; > later immediately before to be deleted, it will shed 'ACTIVE'): > > May  5 20:46:13 ardbeg neutron-dhcp-agent: 2019-05-05 20:46:13.857 > 1804 INFO neutron.agent.dhcp.agent > [req-07833602-9579-403b-a264-76fd3ee408ee > a18f38c780074c6280dde5edad159666 41a02a69918849509f4102b04f8a7de9 - - > -] Trigger reload_allocations for port admin_state_up=True, > allowed_address_pairs=[{u'ip_address': u'10.0.2.24', u'mac_address': > u'72:d0:1c:4c:94:91'}], binding:host_id=ardbeg, binding:profile=, > binding:vif_details=datapath_type=system, ovs_hybrid_plug=False, > port_filter=True, binding:vif_type=ovs, binding:vnic_type=normal, > created_at=2019-05-05T20:46:07Z, description=, > device_id=f1bce6e9-be5b-464b-8f64-686f36e9de1f, > device_owner=compute:nova, dns_assignment=[{u'hostname': > u'amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5', u'ip_address': > u'10.0.2.13', u'fqdn': > u'amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5.loqal.'}], dns_domain=, > dns_name=amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5, > extra_dhcp_opts=[], fixed_ips=[{u'subnet_id': > u'24b10886-3d53-4aee-bdc6-f165b242ae4f', u'ip_address': > u'10.0.2.13'}], id=b0398cc8-6d52-4f12-9f1f-1141b0f10751, > mac_address=72:d0:1c:4c:94:91, > name=octavia-lb-vrrp-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5, > network_id=b24d2830-eec6-4abd-82f2-ac71c8ecbf40, > port_security_enabled=True, > project_id=41a02a69918849509f4102b04f8a7de9, qos_policy_id=None, > revision_number=5, > security_groups=[u'6df53a15-6afc-4c99-b464-03de4f546b4f'], > status=DOWN, tags=[], tenant_id=41a02a69918849509f4102b04f8a7de9, > updated_at=2019-05-05T20:46:13Z > May  5 20:46:14 ardbeg neutron-openvswitch-agent: 2019-05-05 > 20:46:14.185 31542 INFO > neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent > [req-a4425cdb-afc1-4f6a-9ef9-c8706e3285d6 - - - - -] Port > b0398cc8-6d52-4f12-9f1f-1141b0f10751 updated. 
Details: {'profile': {}, > 'network_qos_policy_id': None, 'qos_policy_id': None, > 'allowed_address_pairs': [{'ip_address': > AuthenticIPNetwork('10.0.2.24'), 'mac_address': > EUI('72:d0:1c:4c:94:91')}], 'admin_state_up': True, 'network_id': > 'b24d2830-eec6-4abd-82f2-ac71c8ecbf40', 'segmentation_id': 437, > 'fixed_ips': [{'subnet_id': '24b10886-3d53-4aee-bdc6-f165b242ae4f', > 'ip_address': '10.0.2.13'}], 'device_owner': u'compute:nova', > 'physical_network': None, 'mac_address': '72:d0:1c:4c:94:91', > 'device': u'b0398cc8-6d52-4f12-9f1f-1141b0f10751', > 'port_security_enabled': True, 'port_id': > 'b0398cc8-6d52-4f12-9f1f-1141b0f10751', 'network_type': u'vxlan', > 'security_groups': [u'6df53a15-6afc-4c99-b464-03de4f546b4f']} > May  5 20:46:14 ardbeg neutron-openvswitch-agent: 2019-05-05 > 20:46:14.197 31542 INFO neutron.agent.securitygroups_rpc > [req-a4425cdb-afc1-4f6a-9ef9-c8706e3285d6 - - - - -] Preparing filters > for devices set([u'b0398cc8-6d52-4f12-9f1f-1141b0f10751']) > > Note Nova returns response 200/completed: > > May  5 20:46:14 controller-l neutron-server: 2019-05-05 20:46:14.326 > 20981 INFO neutron.notifiers.nova [-] Nova event response: {u'status': > u'completed', u'tag': u'b0398cc8-6d52-4f12-9f1f-1141b0f10751', > u'name': u'network-changed', u'server_uuid': > u'f1bce6e9-be5b-464b-8f64-686f36e9de1f', u'code': 200} > > and "openstack server show" shows both NICs are attached to the amphorae: > > $ openstack server show f1bce6e9-be5b-464b-8f64-686f36e9de1f > +-------------------------------------+------------------------------------------------------------+ > | Field                               | Value                                                      | > +-------------------------------------+------------------------------------------------------------+ > [ ... 
] > | addresses                           | octavia-net=172.16.252.35; u1000-p1000-xbone=10.0.2.13     | > +-------------------------------------+------------------------------------------------------------+ > > Later Octavia worker reports the following: > > 2019-05-05 20:46:16.124 18234 DEBUG > octavia.controller.worker.controller_worker [-] Task > 'STANDALONE-octavia-plug-net-subflow-octavia-amp-post-vip-plug' > (f105ced1-72c6-4116-b582-599a21cdee36) transitioned into state > 'REVERTING' from state 'FAILURE' _task_receiver > /opt/openstack/lib/python3.6/site-packages/taskflow/listeners/logging.py:194 > 2019-05-05 20:46:16.127 18234 WARNING > octavia.controller.worker.controller_worker [-] Task > 'STANDALONE-octavia-plug-net-subflow-octavia-amp-post-vip-plug' > (f105ced1-72c6-4116-b582-599a21cdee36) transitioned into state > 'REVERTED' from state 'REVERTING' with result 'None' > 2019-05-05 20:46:16.141 18234 DEBUG > octavia.controller.worker.controller_worker [-] Task > 'STANDALONE-octavia-plug-net-subflow-reload-amp-after-plug-vip' > (c4d6222e-2508-4a9c-9514-e7f9bcf84e31) transitioned into state > 'REVERTING' from state 'SUCCESS' _task_receiver > /opt/openstack/lib/python3.6/site-packages/taskflow/listeners/logging.py:194 > 2019-05-05 20:46:16.142 18234 WARNING > octavia.controller.worker.controller_worker [-] Task > 'STANDALONE-octavia-plug-net-subflow-reload-amp-after-plug-vip' > (c4d6222e-2508-4a9c-9514-e7f9bcf84e31) transitioned into state > 'REVERTED' from state 'REVERTING' with result 'None' > 2019-05-05 20:46:16.146 18234 DEBUG > octavia.controller.worker.controller_worker [-] Task > 'STANDALONE-octavia-plug-net-subflow-ocatvia-amp-update-vip-data' > (2e1d1a04-282d-43b7-8c4f-fe31e75804ea) transitioned into state > 'REVERTING' from state 'SUCCESS' _task_receiver > /opt/openstack/lib/python3.6/site-packages/taskflow/listeners/logging.py:194 > 2019-05-05 20:46:16.148 18234 WARNING > octavia.controller.worker.controller_worker [-] Task > 'STANDALONE-octavia-plug-net-subflow-ocatvia-amp-update-vip-data' > (2e1d1a04-282d-43b7-8c4f-fe31e75804ea) transitioned into state > 'REVERTED' from state 'REVERTING' with result 'None' > 2019-05-05 20:46:16.173 18234 DEBUG > octavia.controller.worker.controller_worker [-] Task > 'STANDALONE-octavia-plug-net-subflow-octavia-amp-plug-vip' > (c63a5bed-f531-4ed3-83d2-bce72e835932) transitioned into state > 'REVERTING' from state 'SUCCESS' _task_receiver > /opt/openstack/lib/python3.6/site-packages/taskflow/listeners/logging.py:194 > 2019-05-05 20:46:16.174 18234 WARNING > octavia.controller.worker.tasks.network_tasks [-] Unable to plug VIP > for amphora id 5bec4c09-a209-4e73-a66e-e4fc0fb8ded5 load balancer id > e01c6ff5-179a-4ed5-ae5d-1d00d6c584b8 > > and Neutron then deletes port but NOTE that immediately before > deletion port reported by neutron-dhcp-agent as ACTIVE: > > May  5 20:46:17 ardbeg neutron-dhcp-agent: 2019-05-05 20:46:17.080 > 1804 INFO neutron.agent.dhcp.agent > [req-835e5b91-28e5-44b9-a463-d04a0323294f > a18f38c780074c6280dde5edad159666 41a02a69918849509f4102b04f8a7de9 - - > -] Trigger reload_allocations for port admin_state_up=True, > allowed_address_pairs=[], binding:host_id=ardbeg, binding:profile=, > binding:vif_details=datapath_type=system, ovs_hybrid_plug=False, > port_filter=True, binding:vif_type=ovs, binding:vnic_type=normal, > created_at=2019-05-05T20:46:07Z, description=, > device_id=f1bce6e9-be5b-464b-8f64-686f36e9de1f, > device_owner=compute:nova, dns_assignment=[{u'hostname': > 
u'amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5', u'ip_address': > u'10.0.2.13', u'fqdn': > u'amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5.loqal.'}], dns_domain=, > dns_name=amphora-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5, > extra_dhcp_opts=[], fixed_ips=[{u'subnet_id': > u'24b10886-3d53-4aee-bdc6-f165b242ae4f', u'ip_address': > u'10.0.2.13'}], id=b0398cc8-6d52-4f12-9f1f-1141b0f10751, > mac_address=72:d0:1c:4c:94:91, > name=octavia-lb-vrrp-5bec4c09-a209-4e73-a66e-e4fc0fb8ded5, > network_id=b24d2830-eec6-4abd-82f2-ac71c8ecbf40, > port_security_enabled=True, > project_id=41a02a69918849509f4102b04f8a7de9, qos_policy_id=None, > revision_number=8, > security_groups=[u'ba20352e-95b9-4c97-a688-59d44e3aa8cf'], > status=ACTIVE, tags=[], tenant_id=41a02a69918849509f4102b04f8a7de9, > updated_at=2019-05-05T20:46:16Z > May  5 20:46:17 controller-l neutron-server: 2019-05-05 20:46:17.086 > 20981 INFO neutron.wsgi [req-835e5b91-28e5-44b9-a463-d04a0323294f > a18f38c780074c6280dde5edad159666 41a02a69918849509f4102b04f8a7de9 - > default default] 10.0.10.31 "PUT > /v2.0/ports/b0398cc8-6d52-4f12-9f1f-1141b0f10751 HTTP/1.1" status: > 200  len: 1395 time: 0.6318841 > May  5 20:46:17 controller-l neutron-server: 2019-05-05 20:46:17.153 > 20981 INFO neutron.wsgi [req-37ee0da3-8dcc-4fb8-9cd3-91c5a8dcedef > a18f38c780074c6280dde5edad159666 41a02a69918849509f4102b04f8a7de9 - > default default] 10.0.10.31 "GET > /v2.0/ports/b0398cc8-6d52-4f12-9f1f-1141b0f10751 HTTP/1.1" status: > 200  len: 1395 time: 0.0616651 > May  5 20:46:18 controller-l neutron-server: 2019-05-05 20:46:18.179 > 20981 INFO neutron.wsgi [req-8896542e-5dcb-4e6d-9379-04cd88c4035b > a18f38c780074c6280dde5edad159666 41a02a69918849509f4102b04f8a7de9 - > default default] 10.0.10.31 "DELETE > /v2.0/ports/b0398cc8-6d52-4f12-9f1f-1141b0f10751 HTTP/1.1" status: > 204  len: 149 time: 1.0199890 > > Thank you. > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." -- Thomas Edison -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison -------------- next part -------------- An HTML attachment was scrubbed... URL: From nate.johnston at redhat.com Mon May 6 11:47:34 2019 From: nate.johnston at redhat.com (Nate Johnston) Date: Mon, 6 May 2019 07:47:34 -0400 Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast In-Reply-To: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> Message-ID: <20190506114734.mehzyjf7dhj6mqkr@bishop> I think this is a really great approach. +1 Nate On Sun, May 05, 2019 at 02:18:08AM -0500, Ghanshyam Mann wrote: > Current integrated-gate jobs (tempest-full) is not so stable for various bugs specially timeout. We tried > to improve it via filtering the slow tests in the separate tempest-slow job but the situation has not been improved much. > > We talked about the Ideas to make it more stable and fast for projects especially when failure is not > related to each project. We are planning to split the integrated-gate template (only tempest-full job as > first step) per related services. > > Idea: > - Run only dependent service tests on project gate. > - Tempest gate will keep running all the services tests as the integrated gate at a centeralized place without any change in the current job. > - Each project can run the below mentioned template. > - All below template will be defined and maintained by QA team. 
> > I would like to know each 6 services which run integrated-gate jobs > > 1."Integrated-gate-networking" (job to run on neutron gate) > Tests to run in this template: neutron APIs , nova APIs, keystone APIs ? All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for neutron gate: exlcude the cinder API tests, glance API tests, swift API tests, > > 2."Integrated-gate-storage" (job to run on cinder gate, glance gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, Nova APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for cinder, glance gate: excluded the neutron APIs tests, Keystone APIs tests > > 3. "Integrated-gate-object-storage" (job to run on swift gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for swift gate: excluded the neutron APIs tests, - Keystone APIs tests, - Nova APIs tests. > Note: swift does not run integrated-gate as of now. > > 4. "Integrated-gate-compute" (job to run on Nova gate) > tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs and All scenario currently running in tempest-full in same way ( means non-slow and in serial) > Improvement for Nova gate: excluded the swift APIs tests(not running in current job but in future, it might), Keystone API tests. > > 5. "Integrated-gate-identity" (job to run on keystone gate) > Tests to run is : all as all project use keystone, we might need to run all tests as it is running in integrated-gate. > But does keystone is being unsed differently by all services? if no then, is it enough to run only single service tests say Nova or neutron ? > > 6. "Integrated-gate-placement" (job to run on placement gate) > Tests to run in this template: Nova APIs tests, Neutron APIs tests + scenario tests + any new service depends on placement APIs > Improvement for placement gate: excluded the glance APIs tests, cinder APIs tests, swift APIs tests, keystone APIs tests > > Thoughts on this approach? > > The important point is we must not lose the coverage of integrated testing per project. So I would like to > get each project view if we are missing any dependency (proposed tests removal) in above proposed templates. > > - https://etherpad.openstack.org/p/qa-train-ptg > > -gmann > > From pawel.konczalski at everyware.ch Mon May 6 12:01:31 2019 From: pawel.konczalski at everyware.ch (Pawel Konczalski) Date: Mon, 6 May 2019 14:01:31 +0200 Subject: OpenStack Kubernetes uninitialized taint on minion nodes Message-ID: <76abf981-543b-1742-2ab3-5423ba93b0d0@everyware.ch> Hi, i try to deploy a Kubernetes cluster with OpenStack Magnum. So far the deployment works fine except for the uninitialized taints attribute on the worker / minion nodes. This has to be removed manually, only after that is it possible to deploy containers in the cluster. Any idea how to fix / automate this that Magnum automaticaly deploy functional cluster? 
openstack coe cluster template create kubernetes-cluster-template \   --image Fedora-AtomicHost-29-20190429.0.x86_64 \   --external-network public \   --dns-nameserver 8.8.8.8 \   --master-flavor m1.kubernetes \   --flavor m1.kubernetes \   --coe kubernetes \   --volume-driver cinder \   --network-driver flannel \   --docker-volume-size 25 openstack coe cluster create kubernetes-cluster \   --cluster-template kubernetes-cluster-template \   --master-count 1 \   --node-count 2 \   --keypair mykey kubectl describe nodes | grep Taints [fedora at kubernetes-cluster9-efikj2wr5lsi-master-0 ~]$ kubectl describe nodes | grep Taints Taints:             CriticalAddonsOnly=True:NoSchedule Taints: node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule Taints: node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule kubectl taint nodes --all node.cloudprovider.kubernetes.io/uninitialized- [root at kubernetes-cluster31-vrmbz6yjvuvd-master-0 /]# kubectl describe nodes | grep Taints Taints:             dedicated=master:NoSchedule Taints:             Taints:             BR Pawel -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5227 bytes Desc: not available URL: From lyarwood at redhat.com Mon May 6 13:18:34 2019 From: lyarwood at redhat.com (Lee Yarwood) Date: Mon, 6 May 2019 14:18:34 +0100 Subject: [nova][cinder][ptg] Summary: Swap volume woes Message-ID: <20190506131834.nyc7k7qltdsmamuq@lyarwood.usersys.redhat.com> Hello, tl;dr - No objections to reworking the swap volume API in Train https://etherpad.openstack.org/p/ptg-train-xproj-nova-cinder - L3-18 Summary: - Deprecate the existing swap volume API in Train, remove in U. - Deprecate or straight up remove existing CLI support for the API. - Write up a spec introducing a new API specifically for use by Cinder when retyping or migrating volumes. Potentially using the external events API or policy to lock down access to the API. - Optionally rework the Libvirt virt driver implementation of the API to improve performance and better handle failure cases as suggested by mdbooth. This might include introducing and using a quiesce volume API. I'm personally out for the next two weeks but will start on the above items once back. Cheers, -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 From valleru at cbio.mskcc.org Mon May 6 13:39:56 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Mon, 6 May 2019 08:39:56 -0500 Subject: [neutron] Unable to configure multiple external networks In-Reply-To: References: Message-ID: <2b4e0900-dd5b-47a7-a383-dbe0884653a9@Spark> Thank you Slawek, Yes - I see that it is a reported bug. Will keep a track. Regards, Lohit On May 6, 2019, 4:51 AM -0500, Slawomir Kaplonski , wrote: > Hi, > > It is known and already reported issue. 
Please see https://bugs.launchpad.net/neutron/+bug/1824571 > > > On 6 May 2019, at 09:58, valleru at cbio.mskcc.org wrote: > > > > It started to work , after i modified this code: > > > > def _fetch_external_net_id(self, force=False): > > """Find UUID of single external network for this agent.""" > > self.conf.gateway_external_network_id = '' > > #if self.conf.gateway_external_network_id: > > # return self.conf.gateway_external_network_id > > return self.conf.gateway_external_network_id > > > > from https://github.com/openstack/neutron/blob/master/neutron/agent/l3/agent.py > > > > Looks like, that respective option is not being read correctly from the respective configuration file. > > > > Regards, > > Lohit > > > > On May 6, 2019, 2:28 AM -0500, valleru at cbio.mskcc.org, wrote: > > > Hello All, > > > > > > I am trying to install Openstack Stein on a single node, with multiple external networks (both networks are also shared). > > > However, i keep getting the following error in the logs, and the router interfaces show as down. > > > > > > 2019-05-06 02:19:45.046 52175 ERROR neutron.agent.l3.agent > > > 2019-05-06 02:19:45.048 52175 INFO neutron.agent.l3.agent [-] Starting router update for a2ec6c99-944e-408a-945a-dffbe09f65ce, action 3, priority 2 > > > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: a2ec6c99-944e-408a-945a-dffbe09f65ce: Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network. > > > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent Traceback (most recent call last): > > > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 701, in _process_routers_if_compatible > > > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > > > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 548, in _process_router_if_compatible > > > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent target_ex_net_id = self._fetch_external_net_id() > > > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 376, in _fetch_external_net_id > > > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent raise Exception(msg) > > > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network. > > > 2019-05-06 02:19:46.249 52175 ERROR neutron.agent.l3.agent > > > 2019-05-06 02:19:46.252 52175 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for a2ec6c99-944e-408a-945a-dffbe09f65ce, action 3 > > > 2019-05-06 02:19:46.253 52175 WARNING neutron.agent.l3.agent [-] Info for router a2ec6c99-944e-408a-945a-dffbe09f65ce was not found. Performing router cleanup > > > > > > > > > I have set these parameters to empty, as mentioned in the docs. > > > > > > /etc/neutron/l3_agent.ini > > > > > > gateway_external_network_id = > > > external_network_bridge = > > > interface_driver = openvswitch > > > > > > I tried linuxbridge-agent too,but i could not get rid of the above error. 
> > > > > > openstack port list --router router1 > > > > > > +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+ > > > | ID | Name | MAC Address | Fixed IP Addresses | Status | > > > +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+--------+ > > > | 1bcaad17-17ed-4383-9206-34417f8fd2df | | fa:16:3e:c1:b1:1f | ip_address='192.168.1.1', subnet_id='b00cb3bf-ca89-4e00-8bd7-83a75dbb6080' | DOWN | > > > | f49d976f-b733-4360-9d1f-cdd35ecf54e6 | | fa:16:3e:54:82:4b | ip_address='10.0.10.11', subnet_id='7cc01a33-f078-494d-9b0b-e988f5b4915d' | DOWN | > > > +--------------------------------------+------+-------------------+----------------------------------------------------------------------------+————+ > > > > > > However it does work when i have just one external network > > > > > > openstack port list --router router1 > > > +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ > > > | ID | Name | MAC Address | Fixed IP Addresses | Status | > > > +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ > > > | cdb06cf7-7492-4275-bd93-88a46b9769a8 | | fa:16:3e:7c:ea:55 | ip_address='192.168.1.1', subnet_id='b00cb3bf-ca89-4e00-8bd7-83a75dbb6080' | ACTIVE | > > > | fc9b06d7-d377-451b-9af5-07e1fab072dc | | fa:16:3e:d0:6d:7c | ip_address='140.163.188.149', subnet_id='4a2bf30a-e7f8-44c1-8b08-4de01b2b1296' | ACTIVE | > > > +--------------------------------------+------+-------------------+--------------------------------------------------------------------------------+--------+ > > > > > > May i please know, how to get the above working. > > > I have seen multiple articles online that mention that this should be working, however i am unable to get this to work. > > > It is really important for us to have to have 2 external networks in the environment, and be able to route to both of them if possible. > > > > > > > > > Thank you, > > > Lohit > > > > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dms at danplanet.com Mon May 6 15:04:35 2019 From: dms at danplanet.com (Dan Smith) Date: Mon, 06 May 2019 08:04:35 -0700 Subject: [placement][nova][ptg] Summary: Consumer Types In-Reply-To: <1557135206.12068.1@smtp.office365.com> (=?utf-8?Q?=22Bal?= =?utf-8?Q?=C3=A1zs?= Gibizer"'s message of "Mon, 6 May 2019 09:33:36 +0000") References: <1557135206.12068.1@smtp.office365.com> Message-ID: > If there are ongoing migration then defaulting the consumer type to > instance might be incorrect. Right, and you have to assume that there are some in progress. Only Nova has the ability to tell you which consumers are instances or migrations. If we did this before we split, we could have looked at the api db instance mappings to make the determination, but I think now we need to be told via the API which is which. > However nova already has a mechanism to distingush between migration > and instance consumer so nova won't break by this. This would mean placement just lies about what each consumer is, and an operator trying to make sense of an upgrade by dumping stuff with osc-placement won't be able to tell the difference. 
They might be inclined to delete what, to them, would look like a bunch of stale instance allocations. > Still nova might want to fix this placement data inconsistency. I > guess the new placement microversion will allow to update the consumer > type of an allocation. Yeah, I think this has to be updated from Nova. I (and I imagine others) would like to avoid making the type field optional in the API. So maybe default the value to something like "incomplete" or "unknown" and then let nova correct this naturally for instances on host startup and migrations on complete/revert. Ideally nova will be one one of the users that wants to depend on the type string, so we want to use our knowledge of which is which to get existing allocations updated so we can depend on the type value later. --Dan From pierre at stackhpc.com Mon May 6 15:32:37 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Mon, 6 May 2019 16:32:37 +0100 Subject: [blazar] Scheduling a new Blazar IRC meeting for the Americas In-Reply-To: References: Message-ID: Hello, The new IRC meeting for the Blazar project has been approved: https://review.opendev.org/#/c/656392/ We will meet this Thursday (May 9th) at 1600 UTC, then every two weeks. Everyone is welcome to join. On Tue, 9 Apr 2019 at 16:50, Pierre Riteau wrote: > > Hello, > > Contributors to the Blazar project are currently mostly from Europe or > Asia. Our weekly IRC meeting at 0900 UTC is a good match for this > group. > > To foster more contributions, I would like to schedule another IRC > meeting in the morning for American timezones, probably every two > weeks to start with. > I am thinking of proposing 1600 UTC on either Monday, Tuesday, or > Thursday, which doesn't appear to conflict with closely related > projects (Nova, Placement, Ironic). > > If there is anyone who would like to join but cannot make this time, > or has a preference on which day, please let me know. > I will wait for a few days before requesting a meeting slot. > > Thanks, > Pierre From cdent+os at anticdent.org Mon May 6 15:46:17 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Mon, 6 May 2019 08:46:17 -0700 (PDT) Subject: [placement][ptg] Open Questions In-Reply-To: <1557134030.12068.0@smtp.office365.com> References: <1557134030.12068.0@smtp.office365.com> Message-ID: On Mon, 6 May 2019, Balázs Gibizer wrote: >> * We had several RFE stories already in progress, and have added a >> few more during the PTG. We have not done much in the way of >> prioritizing these. We certainly can't do them all. Here's a link >> to the current RFE stories in the placement group (this includes >> placement, osc-placement and os-*). >> >> https://storyboard.openstack.org/#!/worklist/594 >> >> I've made a simple list of those on an etherpad, please register >> you +1 or -1 (or nothing) on each of those. Keep in mind that >> there are several features in "Update nested provider support to >> address train requirements" and that we've already committed to >> them. > > Did you forget to paste the etherpad link? Whoops, sorry about that. Clearly there have been some long days: https://etherpad.openstack.org/p/placement-ptg-train-rfe-voter Thanks for noticing. 
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From cdent+os at anticdent.org Mon May 6 15:49:24 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Mon, 6 May 2019 08:49:24 -0700 (PDT) Subject: [placement][nova][ptg] Summary: Consumer Types In-Reply-To: References: <1557135206.12068.1@smtp.office365.com> Message-ID: On Mon, 6 May 2019, Dan Smith wrote: >> Still nova might want to fix this placement data inconsistency. I >> guess the new placement microversion will allow to update the consumer >> type of an allocation. > > Yeah, I think this has to be updated from Nova. I (and I imagine others) > would like to avoid making the type field optional in the API. So maybe > default the value to something like "incomplete" or "unknown" and then > let nova correct this naturally for instances on host startup and > migrations on complete/revert. Ideally nova will be one one of the users > that wants to depend on the type string, so we want to use our knowledge > of which is which to get existing allocations updated so we can depend > on the type value later. Ah, okay, good. If something like "unknown" is workable I think that's much much better than defaulting to instance. Thanks. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From doka.ua at gmx.com Mon May 6 15:54:51 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Mon, 6 May 2019 18:54:51 +0300 Subject: [octavia] Error while creating amphora In-Reply-To: References: Message-ID: <0994c2fb-a2c1-89f8-10ca-c3d0d9bf79e2@gmx.com> Hi Michael, regarding file injection vs config_drive - https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/deprecate-file-injection.html - don't know when this will happen, but you see - people are thinking in this way. On 5/2/19 5:58 PM, Michael Johnson wrote: > Volodymyr, > > It looks like you have enabled "user_data_config_drive" in the > octavia.conf file. Is there a reason you need this? If not, please > set it to False and it will resolve your issue. > > It appears we have a python3 bug in the "user_data_config_drive" > capability. It is not generally used and appears to be missing test > coverage. > > I have opened a story (bug) on your behalf here: > https://storyboard.openstack.org/#!/story/2005553 > > Michael > > On Thu, May 2, 2019 at 4:29 AM Volodymyr Litovka wrote: >> Dear colleagues, >> >> I'm using Openstack Rocky and trying to launch Octavia 4.0.0. 
After all installation steps I've got an error during 'openstack loadbalancer create' with the following log: >> >> DEBUG octavia.controller.worker.tasks.compute_tasks [-] Compute create execute for amphora with id d037721f-2cf9-492e-99cb-0be5874da0f6 execute /opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py:63 >> ERROR octavia.controller.worker.tasks.compute_tasks [-] Compute create for amphora id: d037721f-2cf9-492e-99cb-0be5874da0f6 failed: TypeError: can't concat str to bytes >> ERROR octavia.controller.worker.tasks.compute_tasks Traceback (most recent call last): >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py", line 94, in execute >> ERROR octavia.controller.worker.tasks.compute_tasks config_drive_files) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/user_data_jinja_cfg.py", line 38, in build_user_data_config >> ERROR octavia.controller.worker.tasks.compute_tasks return self.agent_template.render(user_data=user_data) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/asyncsupport.py", line 76, in render >> ERROR octavia.controller.worker.tasks.compute_tasks return original_render(self, *args, **kwargs) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 1008, in render >> ERROR octavia.controller.worker.tasks.compute_tasks return self.environment.handle_exception(exc_info, True) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 780, in handle_exception >> ERROR octavia.controller.worker.tasks.compute_tasks reraise(exc_type, exc_value, tb) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/_compat.py", line 37, in reraise >> ERROR octavia.controller.worker.tasks.compute_tasks raise value.with_traceback(tb) >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/templates/user_data_config_drive.template", line 29, in top-level template code >> ERROR octavia.controller.worker.tasks.compute_tasks {{ value|indent(8) }} >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/filters.py", line 557, in do_indent >> ERROR octavia.controller.worker.tasks.compute_tasks s += u'\n' # this quirk is necessary for splitlines method >> ERROR octavia.controller.worker.tasks.compute_tasks TypeError: can't concat str to bytes >> ERROR octavia.controller.worker.tasks.compute_tasks >> WARNING octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-create-amp-for-lb-subflow-octavia-cert-compute-create' (06134192-def9-420c-9feb-0d08a068f3b2) transitioned into state 'FAILURE' from state 'RUNNING' >> >> Any advises where is the problem? >> >> My environment: >> - Openstack Rocky >> - Ubuntu 18.04 >> - Octavia installed in virtualenv using pip install: >> # pip list |grep octavia >> octavia 4.0.0 >> octavia-lib 1.1.1 >> python-octaviaclient 1.8.0 >> >> Thank you. >> >> -- >> Volodymyr Litovka >> "Vision without Execution is Hallucination." -- Thomas Edison >> >> -- >> Volodymyr Litovka >> "Vision without Execution is Hallucination." 
-- Thomas Edison -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison From miguel at mlavalle.com Mon May 6 15:59:15 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Mon, 6 May 2019 10:59:15 -0500 Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast In-Reply-To: <20190506114734.mehzyjf7dhj6mqkr@bishop> References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> <20190506114734.mehzyjf7dhj6mqkr@bishop> Message-ID: Yes, I also like this approach On Mon, May 6, 2019 at 6:48 AM Nate Johnston wrote: > I think this is a really great approach. +1 > > Nate > > On Sun, May 05, 2019 at 02:18:08AM -0500, Ghanshyam Mann wrote: > > Current integrated-gate jobs (tempest-full) is not so stable for various > bugs specially timeout. We tried > > to improve it via filtering the slow tests in the separate tempest-slow > job but the situation has not been improved much. > > > > We talked about the Ideas to make it more stable and fast for projects > especially when failure is not > > related to each project. We are planning to split the integrated-gate > template (only tempest-full job as > > first step) per related services. > > > > Idea: > > - Run only dependent service tests on project gate. > > - Tempest gate will keep running all the services tests as the > integrated gate at a centeralized place without any change in the current > job. > > - Each project can run the below mentioned template. > > - All below template will be defined and maintained by QA team. > > > > I would like to know each 6 services which run integrated-gate jobs > > > > 1."Integrated-gate-networking" (job to run on neutron gate) > > Tests to run in this template: neutron APIs , nova APIs, keystone APIs > ? All scenario currently running in tempest-full in the same way ( means > non-slow and in serial) > > Improvement for neutron gate: exlcude the cinder API tests, glance API > tests, swift API tests, > > > > 2."Integrated-gate-storage" (job to run on cinder gate, glance gate) > > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, > Nova APIs and All scenario currently running in tempest-full in the same > way ( means non-slow and in serial) > > Improvement for cinder, glance gate: excluded the neutron APIs tests, > Keystone APIs tests > > > > 3. "Integrated-gate-object-storage" (job to run on swift gate) > > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and > All scenario currently running in tempest-full in the same way ( means > non-slow and in serial) > > Improvement for swift gate: excluded the neutron APIs tests, - Keystone > APIs tests, - Nova APIs tests. > > Note: swift does not run integrated-gate as of now. > > > > 4. "Integrated-gate-compute" (job to run on Nova gate) > > tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs > and All scenario currently running in tempest-full in same way ( means > non-slow and in serial) > > Improvement for Nova gate: excluded the swift APIs tests(not running in > current job but in future, it might), Keystone API tests. > > > > 5. "Integrated-gate-identity" (job to run on keystone gate) > > Tests to run is : all as all project use keystone, we might need to run > all tests as it is running in integrated-gate. > > But does keystone is being unsed differently by all services? if no > then, is it enough to run only single service tests say Nova or neutron ? > > > > 6. 
"Integrated-gate-placement" (job to run on placement gate) > > Tests to run in this template: Nova APIs tests, Neutron APIs tests + > scenario tests + any new service depends on placement APIs > > Improvement for placement gate: excluded the glance APIs tests, cinder > APIs tests, swift APIs tests, keystone APIs tests > > > > Thoughts on this approach? > > > > The important point is we must not lose the coverage of integrated > testing per project. So I would like to > > get each project view if we are missing any dependency (proposed tests > removal) in above proposed templates. > > > > - https://etherpad.openstack.org/p/qa-train-ptg > > > > -gmann > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From paye600 at gmail.com Mon May 6 16:50:58 2019 From: paye600 at gmail.com (Roman Gorshunov) Date: Mon, 6 May 2019 18:50:58 +0200 Subject: [tc][all][airship] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: <20190503230525.a3vxsnliklitnei4@arabian.linksys.moosehall> References: <20190503190538.GB3377@localhost.localdomain> <20190503230525.a3vxsnliklitnei4@arabian.linksys.moosehall> Message-ID: Thanks, Adam. I haven't been on PTG, sorry. It's good that there has been a discussion and agreement is reached. Best regards, -- Roman Gorshunov On Sat, May 4, 2019 at 1:05 AM Adam Spiers wrote: > > Paul Belanger wrote: > >On Fri, May 03, 2019 at 08:48:10PM +0200, Roman Gorshunov wrote: > >>Hello Jim, team, > >> > >>I'm from Airship project. I agree with archival of Github mirrors of > >>repositories. > > Which mirror repositories are you referring to here - a subset of the > Airship repos which are no longer needed, or all Airship repo mirrors? > > I would prefer the majority of the mirrors not to be archived, for two > reasons which Alan or maybe Matt noted in the Airship discussions this > morning: > > 1. Some people instinctively go to GitHub search when they > want to find a software project. Having useful search results > for "airship" on GitHub increases the discoverability of the > project. > > 2. Some people will judge the liveness of a project by its > activity metrics as shown on GitHub (e.g. number of recent > commits). An active mirror helps show that the project is > alive and well. In contrast, an archived mirror makes it look > like the project is dead. > > However if you are only talking about a small subset which are no > longer needed, then archiving sounds reasonable. > > >>One small suggestion: could we have project descriptions > >>adjusted to point to the new location of the source code repository, > >>please? E.g. "The repo now lives at opendev.org/x/y". > > I agree it's helpful if the top-level README.rst has a sentence like > "the authoritative location for this repo is https://...". > > >This is something important to keep in mind from infra side, once the > >repo is read-only, we lose the ability to use the API to change it. > > > >From manage-projects.py POV, we can update the description before > >flipping the archive bit without issues, just need to make sure we have > >the ordering correct. > > > >Also, there is no API to unarchive a repo from github sadly, for that a > >human needs to log into github UI and click the button. I have no idea > >why. 
> > Good points, but unless we're talking about a small subset of Airship > repos, I'm a bit puzzled why this is being discussed, because I > thought we reached consensus this morning on a) ensuring that all > Airship projects are continually mirrored to GitHub, and b) trying to > transfer those mirrors from the "openstack" organization to the > "airship" one, assuming we can first persuade GitHub to kick out the > org-squatters. This transferral would mean that GitHub would > automatically redirect requests from > > https://github.com/openstack/airship-* > > to > > https://github.com/airship/... > > Consensus is documented in lines 107-112 of: > > https://etherpad.openstack.org/p/airship-ptg-train 

From snikitin at mirantis.com Mon May 6 16:59:33 2019 From: snikitin at mirantis.com (Sergey Nikitin) Date: Mon, 6 May 2019 20:59:33 +0400 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: References: Message-ID: 

Hello Rong, Sorry for the slow response; I was on a trip for the last 5 days. Here is what I have found: let's take a look at this patch [1]. It should be counted as a contribution by gengchc2, but for some reason it was matched to Yuval Brik [2]. I'm still trying to find the root cause of this, but in any case this week we are planning to rebuild our database to increase RAM. I checked the statistics of gengchc2 on a clean database and they are completely correct, so your problem will be solved in several days. It will take this long because a full rebuild of the DB takes 48 hours, and we need to test our migration process first to keep zero downtime. I'll share the results with you here when the process is finished. Thank you for your patience. Sergey 

[1] https://review.opendev.org/#/c/627762/ [2] https://www.stackalytics.com/?user_id=jhamhader&project_type=all&release=all&metric=commits&company=&module=freezer-api 

On Mon, May 6, 2019 at 6:30 AM Rong Zhu wrote: > Hi Sergey, > > Do we have any process about my colleague's data loss problem? > > Sergey Nikitin wrote on Monday, Apr 29, 2019 at 19:57: > >> Thank you for information! I will take a look >> >> On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu wrote: >> >>> Hi there, >>> >>> Recently we found we lost a person's data from our company at the >>> stackalytics website. >>> You can check the merged patch from [0], but there no date from >>> the stackalytics website. >>> >>> stackalytics info as below: >>> Company: ZTE Corporation >>> Launchpad: 578043796-b >>> Gerrit: gengchc2 >>> >>> Look forward to hearing from you! >>> >> > Best Regards, > Rong Zhu > >> >>> -- > Thanks, > Rong Zhu > -- Best Regards, Sergey Nikitin -------------- next part -------------- An HTML attachment was scrubbed... URL: 

From miguel at mlavalle.com Mon May 6 17:12:34 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Mon, 6 May 2019 12:12:34 -0500 Subject: [openstack-dev] [neutron] Cancelling Neutron weekly meeting on May 7th Message-ID: 

Dear Neutron Team, Since we just met during the PTG, we will skip the weekly team meeting on May 7th. We will resume our meetings on the 13th. Best regards -------------- next part -------------- An HTML attachment was scrubbed...
URL: From openstack at fried.cc Mon May 6 18:03:27 2019 From: openstack at fried.cc (Eric Fried) Date: Mon, 6 May 2019 13:03:27 -0500 Subject: [nova][ptg] Summary: Implicit trait-based filters Message-ID: Summary: In keeping with the first proposed cycle theme [1] (though we didn't land on that until later in the PTG), we would like to be able to add required traits to the GET /allocation_candidates query to reduce the number of results returned - i.e. do more filtering in placement rather than in the scheduler (or worse, the compute). You can already do this by explicitly adding required traits to flavor/image; we want to be able to do it implicitly based on things like: - If the instance requires multiattach, make sure it lands on a compute that supports multiattach [2]. - If the image is in X format, make sure it lands on a compute that can read X format [3]. Currently the proposals in [2],[3] work by modifying the RequestSpec.flavor right before select_destinations calls GET /allocation_candidates. This just happens to be okay because we don't persist that copy of the flavor back to the instance (which we wouldn't want to do, since we don't want these implicit additions to e.g. show up when we GET server details, or to affect other lifecycle operations). But this isn't a robust design. What we would like to do instead is exploit the RequestSpec.requested_resources field [4] as it was originally intended, accumulating all the resource/trait/aggregate/etc. criteria from the flavor, image, *and* request_filter-y things like the above. However, gibi started on this [5] and it turns out to be difficult to express the unnumbered request group in that field for... reasons. Action: Since gibi is going to be pretty occupied and unlikely to have time to resolve [5], aspiers has graciously (been) volunteered to take it over; and then follow [2] and [3] to use that mechanism once it's available. efried [1] https://review.opendev.org/#/c/657171/1/priorities/train-priorities.rst at 13 [2] https://review.opendev.org/#/c/645316/ [3] https://review.opendev.org/#/q/topic:bp/request-filter-image-types+(status:open+OR+status:merged) [4] https://opendev.org/openstack/nova/src/commit/5934c5dc6932fbf19ca7f3011c4ccc07b0038ac4/nova/objects/request_spec.py#L93-L100 [5] https://review.opendev.org/#/c/647396/ From ashlee at openstack.org Mon May 6 18:20:00 2019 From: ashlee at openstack.org (Ashlee Ferguson) Date: Mon, 6 May 2019 13:20:00 -0500 Subject: Shanghai Summit Programming Committee Nominations Open Message-ID: <2631F356-5352-41BF-AD86-DF2AB17F349C@openstack.org> Thank you to everyone who attended the Open Infrastructure Summit in Denver. The event was a huge success! If you weren’t able to make it, check out the videos page [1]. Keynotes are up now, and the rest of the sessions will be uploaded in the next week. We’ll also be sharing a Summit recap in the Open Infrastructure Community Newsletter, which you can subscribe to here [2]. The next Summit + PTG will be in Shanghai, November 4 - 6, and the PTG will be November 6 - 8, 2019. Registration and Programming Committee nominations for the Shanghai Open Infrastructure Summit + PTG [3] are open! The Programming Committee helps select the content from the Call for Presentations (CFP) for the Summit schedule. Sessions will be presented in both English and Mandarin, so we will be accepting CFP submissions in both languages. The CFP will open early next week. 
• Nominate yourself or someone else for the Programming Committee [4] before May 20, 2019 • Shanghai Summit + PTG registration is available in the following currencies: • Register in USD [5] • Register in RMB (includes fapiao) [6] Thanks, Ashlee [1] https://www.openstack.org/videos [2] https://www.openstack.org/community/email-signup [3] https://www.openstack.org/summit/shanghai-2019 [4] http://bit.ly/ShanghaiProgrammingCommittee [5] https://app.eventxtra.link/registrations/6640a923-98d7-44c7-a623-1e2c9132b402?locale=en [6] https://app.eventxtra.link/registrations/f564960c-74f6-452d-b0b2-484386d33eb6?locale=en From openstack at fried.cc Mon May 6 18:44:15 2019 From: openstack at fried.cc (Eric Fried) Date: Mon, 6 May 2019 13:44:15 -0500 Subject: [nova][ptg] Summary: Implicit trait-based filters In-Reply-To: Message-ID: Addendum: There's another implicit trait-based filter that bears mentioning: Excluding disabled compute hosts. We have code that disables a compute service when "something goes wrong" in various ways. This code should decorate the compute node's resource provider with a COMPUTE_SERVICE_DISABLED trait, and every GET /allocation_candidates request should include ?required=!COMPUTE_SERVICE_DISABLED, so that we don't retrieve allocation candidates for disabled hosts. mriedem has started to prototype the code for this [1]. Action: Spec to be written. Code to be polished up. Possibly aspiers to be involved in this bit as well. efried [1] https://review.opendev.org/#/c/654596/ From jungleboyj at gmail.com Mon May 6 19:19:36 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Mon, 6 May 2019 14:19:36 -0500 Subject: [cinder] No weekly meeting this week ... Message-ID: Team, It was discussed at the PTG last week that we would take this week off from our usual team meeting. So, enjoy getting an hour back on Wednesday and we will go back to our regularly scheduled meetings on May 15th . Thanks! Jay From openstack at fried.cc Mon May 6 19:32:10 2019 From: openstack at fried.cc (Eric Fried) Date: Mon, 6 May 2019 14:32:10 -0500 Subject: [nova][ptg] Summary: Server group [anti-]affinity Message-ID: <27bb593b-62c3-9167-59de-d7e6effab9e9@fried.cc> The Issue: Doing server group affinity ("land all these instances on the same host") and anti-affinity ("...on different hosts") on the nova side is problematic in large deployments (like CERN). We'd like to do it on the placement side - i.e. have GET /allocation_candidates return [just the one host to which servers in the group are being deployed] (affinity); or [only hosts on which servers in the group have not yet landed] (anti-affinity). Summary: - Affinity is fairly easy: ?in_tree=. - For anti-affinity, we need something like ?in_tree=!. - The size of in the latter case could quickly get out of hand, exceeding HTTP/wsgi (querystring length) and/or database query (`AND resource_provider.uuid NOT IN `) limits. - Race conditions in both cases are a little tricky. - tssurya to come up with spec(s) for ?in_tree=! and nova usage thereof, wherein discussions of the above issues can occur. - Unclear if this will make Train. efried . From sean.mcginnis at gmx.com Mon May 6 20:08:13 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 6 May 2019 15:08:13 -0500 Subject: [cinder] Third party CI failures with namespace changes Message-ID: <20190506200813.GA29759@sm-workstation> Just a heads up for third party CI maintainers. 
You should have already noticed this, but we have quite a few that are failing tests right now because the CI systems have not been updated for the new git namespaces. There are several I noticed that are failure trying to clone https://git.openstack.org/openstack-dev/devstack. With all of the changes a couple weeks ago, this should now be from https://opendev.org/openstack/devstack. Please update your CI's to pull from the correct location, or disable them for now until you are able to make the updates. The current barrage of CI failure comments soon after submitting patches are not particularly helpful. Thanks for your prompt attention to this. As a reminder, we have a third party CI policy that impacts in-tree drivers if third party CI's are not maintained and cannot give us useful feedback as to whether a driver is functional or not [0]. Thanks, Sean [0] https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers#Non-Compliance_Policy From openstack at fried.cc Mon May 6 20:12:25 2019 From: openstack at fried.cc (Eric Fried) Date: Mon, 6 May 2019 15:12:25 -0500 Subject: [nova][ptg] Summary: Tech Debt Message-ID: Tech debt items we discussed, and actions to be taken thereon: - Remove cellsv1: (continue to) do it [1] - Remove nova-network: do it - Remove the nova-console, nova-consoleauth, nova-xvpxvncproxy services: do it - Migrating rootwrap to privsep: (continue to) do it [2] - Bump the minimum microversion: don't do it - Remove mox: (continue to) do it [3] It's possible I missed some; if so, please reply. efried [1] https://review.opendev.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/remove-cells-v1 [2] https://review.opendev.org/#/q/project:openstack/nova+branch:master+topic:my-own-personal-alternative-universe [3] https://review.opendev.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/mox-removal-train From alifshit at redhat.com Mon May 6 20:31:18 2019 From: alifshit at redhat.com (Artom Lifshitz) Date: Mon, 6 May 2019 16:31:18 -0400 Subject: [nova][ptg] Summary: Server group [anti-]affinity In-Reply-To: <27bb593b-62c3-9167-59de-d7e6effab9e9@fried.cc> References: <27bb593b-62c3-9167-59de-d7e6effab9e9@fried.cc> Message-ID: On Mon, May 6, 2019 at 3:35 PM Eric Fried wrote: > - Race conditions in both cases are a little tricky. So we currently have the late group policy check [1] done on the host itself during instance build. It'd be great if we can get rid of the need for it with this work, or at the very least make it very very unlikely to fail. I know it won't be easy though. [1] https://github.com/openstack/nova/blob/eae1f2257a9ee6e851182bf949568b1cfe2af763/nova/compute/manager.py#L1350 > - tssurya to come up with spec(s) for ?in_tree=! and nova usage > thereof, wherein discussions of the above issues can occur. > - Unclear if this will make Train. > > efried > . > > -- Artom Lifshitz Software Engineer, OpenStack Compute DFG From sundar.nadathur at intel.com Mon May 6 21:17:41 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Mon, 6 May 2019 21:17:41 +0000 Subject: [cyborg] [ptg] PTG summary Message-ID: <1CC272501B5BC543A05DB90AA509DED5275553E7@fmsmsx122.amr.corp.intel.com> Thanks to all Cyborg, Nova and Ironic developers for the productive PTG. * Cyborg-Nova integration: Demo slides are here [1]. The xproj etherpad with summary of outcomes is [2], and includes the demo slides link at the bottom. (The Cyborg PTG etherpad [3] also contains the link to the slides in Line 26.) 
Good to see that Nova has made Cyborg integration as a cycle theme [4]. * Major goals for Train release [5] were unanimously agreed upon. * Mapping names to IDs: All agreed on the need. The discussion on how to do that needs to be completed. By using alternative mechanisms for function IDs and region IDs, we could potentially avoid the need for a new API. * Good discussions on a variety of other topics, as can be seen in [3], but they need follow-up. * Owners identified for most of the ToDo tasks [3]. * In offline conversations after the PTG, ZTE developers have agreed to help with getting tempest CI started, to be followed up by others later. * Cyborg-Ironic cross-project [6]: Good discussion. The need for the integration was understood: between the types of bare metal servers, and varying number/types of accelerators, there is a combinatorial explosion of the number of variations; Cyborg can help address that. Need to write a spec for the approach. Next steps: * Cyborg/Nova integration: o Drive Nova spec to closure, write and merge some Cyborg specs (device profiles, REST API), merge Cyborg pilot code into master, incorporate some feedback in Nova patches. o Set up tempest CI, with a real or fake device. o Only after both steps above are done will Nova patches get merged. * Need to write a bunch of specs: Cyborg (REST API spec, driver API spec?), DDP-related specs, Ironic+Cyborg spec. * Complete the discussions on remaining items. [1] https://docs.google.com/presentation/d/1uHP2kVLiZVuRcdeCI8QaNDJILjix9VCO7wagTBuWGyQ/edit?usp=sharing [2] https://etherpad.openstack.org/p/ptg-train-xproj-nova-cyborg [3] https://etherpad.openstack.org/p/cyborg-ptg-train [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005867.html [5] https://etherpad.openstack.org/p/cyborg-train-goals [6] https://etherpad.openstack.org/p/ptg-train-xproj-ironic-cyborg Regards, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From jp.methot at planethoster.info Mon May 6 21:56:50 2019 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Mon, 6 May 2019 17:56:50 -0400 Subject: [ops][nova]Logging in nova and other openstack projects Message-ID: Hi, We’ve been modifying our login habits for Nova on our Openstack setup to try to send only warning level and up logs to our log servers. To do so, I’ve created a logging.conf and configured logging according to the logging module documentation. While what I’ve done works, it seems to be a very convoluted process for something as simple as changing the logging level to warning. We worry that if we upgrade and the syntax for this configuration file changes, we may have to push more changes through ansible than we would like to. Is there an easier way to set the nova logs to warning level and up than making an additional config file for the python logging module? Best regards, Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From morgan.fainberg at gmail.com Mon May 6 22:06:23 2019 From: morgan.fainberg at gmail.com (Morgan Fainberg) Date: Mon, 6 May 2019 15:06:23 -0700 Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast In-Reply-To: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> Message-ID: On Sun, May 5, 2019 at 12:19 AM Ghanshyam Mann wrote: > Current integrated-gate jobs (tempest-full) is not so stable for various > bugs specially timeout. We tried > to improve it via filtering the slow tests in the separate tempest-slow > job but the situation has not been improved much. > > We talked about the Ideas to make it more stable and fast for projects > especially when failure is not > related to each project. We are planning to split the integrated-gate > template (only tempest-full job as > first step) per related services. > > Idea: > - Run only dependent service tests on project gate. > - Tempest gate will keep running all the services tests as the integrated > gate at a centeralized place without any change in the current job. > - Each project can run the below mentioned template. > - All below template will be defined and maintained by QA team. > > I would like to know each 6 services which run integrated-gate jobs > > 1."Integrated-gate-networking" (job to run on neutron gate) > Tests to run in this template: neutron APIs , nova APIs, keystone APIs ? > All scenario currently running in tempest-full in the same way ( means > non-slow and in serial) > Improvement for neutron gate: exlcude the cinder API tests, glance API > tests, swift API tests, > > 2."Integrated-gate-storage" (job to run on cinder gate, glance gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, Nova > APIs and All scenario currently running in tempest-full in the same way ( > means non-slow and in serial) > Improvement for cinder, glance gate: excluded the neutron APIs tests, > Keystone APIs tests > > 3. "Integrated-gate-object-storage" (job to run on swift gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and > All scenario currently running in tempest-full in the same way ( means > non-slow and in serial) > Improvement for swift gate: excluded the neutron APIs tests, - Keystone > APIs tests, - Nova APIs tests. > Note: swift does not run integrated-gate as of now. > > 4. "Integrated-gate-compute" (job to run on Nova gate) > tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs and > All scenario currently running in tempest-full in same way ( means non-slow > and in serial) > Improvement for Nova gate: excluded the swift APIs tests(not running in > current job but in future, it might), Keystone API tests. > > 5. "Integrated-gate-identity" (job to run on keystone gate) > Tests to run is : all as all project use keystone, we might need to run > all tests as it is running in integrated-gate. > But does keystone is being unsed differently by all services? if no then, > is it enough to run only single service tests say Nova or neutron ? > > 6. 
"Integrated-gate-placement" (job to run on placement gate) > Tests to run in this template: Nova APIs tests, Neutron APIs tests + > scenario tests + any new service depends on placement APIs > Improvement for placement gate: excluded the glance APIs tests, cinder > APIs tests, swift APIs tests, keystone APIs tests > > Thoughts on this approach? > > The important point is we must not lose the coverage of integrated testing > per project. So I would like to > get each project view if we are missing any dependency (proposed tests > removal) in above proposed templates. > > - https://etherpad.openstack.org/p/qa-train-ptg > > -gmann > > > For the "Integrated-gate-identity", I have a slight worry that we might lose some coverage with this change. I am unsure of how varied the use of Keystone is outside of KeystoneMiddleware (i.e. token validation) consumption that all services perform, Heat (not part of the integrated gate) and it's usage of Trusts, and some newer emerging uses such as "look up limit data" (potentially in Train, would be covered by Nova). Worst case, we could run all the integrated tests for Keystone changes (at least initially) until we have higher confidence and minimize the tests once we have a clearer audit of how the services use Keystone. The changes would speed up/minimize the usage for the other services directly and Keystone can follow down the line. I want to be as close to 100% sure we're not going to suddenly break everyone because of some change we land. Keystone fortunately and unfortunately sits below most other services in an OpenStack deployment and is heavily relied throughout almost every single request. --Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim at swiftstack.com Mon May 6 23:25:11 2019 From: tim at swiftstack.com (Tim Burke) Date: Mon, 6 May 2019 16:25:11 -0700 Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast In-Reply-To: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> Message-ID: On 5/5/19 12:18 AM, Ghanshyam Mann wrote: > Current integrated-gate jobs (tempest-full) is not so stable for various bugs specially timeout. We tried > to improve it via filtering the slow tests in the separate tempest-slow job but the situation has not been improved much. > > We talked about the Ideas to make it more stable and fast for projects especially when failure is not > related to each project. We are planning to split the integrated-gate template (only tempest-full job as > first step) per related services. > > Idea: > - Run only dependent service tests on project gate. I love this plan already. > - Tempest gate will keep running all the services tests as the integrated gate at a centeralized place without any change in the current job. > - Each project can run the below mentioned template. > - All below template will be defined and maintained by QA team. My biggest regret is that I couldn't figure out how to do this myself. Much thanks to the QA team! > > I would like to know each 6 services which run integrated-gate jobs > > 1."Integrated-gate-networking" (job to run on neutron gate) > Tests to run in this template: neutron APIs , nova APIs, keystone APIs ? 
All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for neutron gate: exlcude the cinder API tests, glance API tests, swift API tests, > > 2."Integrated-gate-storage" (job to run on cinder gate, glance gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, Nova APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for cinder, glance gate: excluded the neutron APIs tests, Keystone APIs tests > > 3. "Integrated-gate-object-storage" (job to run on swift gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for swift gate: excluded the neutron APIs tests, - Keystone APIs tests, - Nova APIs tests. This sounds great. My only question is why Cinder tests are still included, but I trust that it's there for a reason and I'm just revealing my own ignorance of Swift's consumers, however removed. > Note: swift does not run integrated-gate as of now. Correct, and for all the reasons that you're seeking to address. Some eight months ago I'd gotten tired of seeing spurious failures that had nothing to do with Swift, and I was hard pressed to find an instance where the tempest tests caught a regression or behavior change that wasn't already caught by Swift's own functional tests. In short, the signal-to-noise ratio for those particular tests was low enough that a failure only told me "you should leave a recheck comment," so I proposed https://review.opendev.org/#/c/601813/ . There was also a side benefit of having our longest-running job change from legacy-tempest-dsvm-neutron-full (at 90-100 minutes) to swift-probetests-centos-7 (at ~30 minutes), tightening developer feedback loops. It sounds like this proposal addresses both concerns: by reducing the scope of tests to what might actually exercise the Swift API (if indirectly), the signal-to-noise ratio should be much better and the wall-clock time will be reduced. > > 4. "Integrated-gate-compute" (job to run on Nova gate) > tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs and All scenario currently running in tempest-full in same way ( means non-slow and in serial) > Improvement for Nova gate: excluded the swift APIs tests(not running in current job but in future, it might), Keystone API tests. > > 5. "Integrated-gate-identity" (job to run on keystone gate) > Tests to run is : all as all project use keystone, we might need to run all tests as it is running in integrated-gate. > But does keystone is being unsed differently by all services? if no then, is it enough to run only single service tests say Nova or neutron ? > > 6. "Integrated-gate-placement" (job to run on placement gate) > Tests to run in this template: Nova APIs tests, Neutron APIs tests + scenario tests + any new service depends on placement APIs > Improvement for placement gate: excluded the glance APIs tests, cinder APIs tests, swift APIs tests, keystone APIs tests > > Thoughts on this approach? > > The important point is we must not lose the coverage of integrated testing per project. So I would like to > get each project view if we are missing any dependency (proposed tests removal) in above proposed templates. As far as Swift is aware, these dependencies seem accurate; at any rate, *we* don't use anything other than Keystone, even by way of another API. 
Further, Swift does not use particularly esoteric Keysonte APIs; I would be OK with integrated-gate-identity not exercising Swift's API with the assumption that some other (or indeed, almost *any* other) service would likely exercise the parts that we care about. > > - https:/etherpad.openstack.org/p/qa-train-ptg > > -gmann > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nate.johnston at redhat.com Tue May 7 03:11:59 2019 From: nate.johnston at redhat.com (Nate Johnston) Date: Mon, 6 May 2019 23:11:59 -0400 Subject: [neutron] bug deputy notes 2019-04-29 - 2019-05-06 Message-ID: <20190507031159.iwvgpme7gdonwjh3@bishop> Neutrinos, It was a quiet week with the summit and PTG. All reported bugs have a fix in progress except for 1827363. High: - "snat gateway port may stay 4095 after router fully initialized in l3 agent" * https://bugs.launchpad.net/bugs/1827754 * Fix in progress - "Network won't be synced when create a new network node" * https://bugs.launchpad.net/bugs/1827771 * Fix in progress - "Additional port list / get_ports() failures when filtering and limiting at the same time" * https://bugs.launchpad.net/neutron/+bug/1827363 Low: - "Remove deprecated SR-IOV devstack file" * https://bugs.launchpad.net/neutron/+bug/1827089 * Fix merged - "Routed provider networks in neutron - placement CLI example" * https://bugs.launchpad.net/bugs/1827418 * Fix in progress - "Wrong IPV6 address provided by openstack server create" * https://bugs.launchpad.net/neutron/+bug/1827489 * Fix merged Thanks, Nate From marcin.juszkiewicz at linaro.org Tue May 7 06:42:09 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Tue, 7 May 2019 08:42:09 +0200 Subject: [kolla][neutron] Python3 issue: "TypeError: Unicode-objects must be encoded before hashing" Message-ID: <1d56ad05-9fa4-16b7-5cbe-af5c339f58b1@linaro.org> I am working on making Kolla images Python 3 only. So far images are py3 but then there are issues during deployment phase which I do not know how to solve. https://review.opendev.org/#/c/642375/ is a patch. 'kolla-ansible-ubuntu-source' CI job deploys using Ubuntu 18.04 based images. And fails. Log [1] shows something which looks like 'works in py2, not tested with py3' code: 1. 
http://logs.openstack.org/75/642375/19/check/kolla-ansible-ubuntu-source/40878ed/primary/logs/ansible/deploy "+++ neutron-db-manage --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini upgrade head", "INFO [alembic.runtime.migration] Context impl MySQLImpl.", "INFO [alembic.runtime.migration] Will assume non-transactional DDL.", "INFO [alembic.runtime.migration] Context impl MySQLImpl.", "INFO [alembic.runtime.migration] Will assume non-transactional DDL.", "INFO [alembic.runtime.migration] Running upgrade -> kilo", "INFO [alembic.runtime.migration] Running upgrade kilo -> 354db87e3225", "INFO [alembic.runtime.migration] Running upgrade 354db87e3225 -> 599c6a226151", "INFO [alembic.runtime.migration] Running upgrade 599c6a226151 -> 52c5312f6baf", "INFO [alembic.runtime.migration] Running upgrade 52c5312f6baf -> 313373c0ffee", "INFO [alembic.runtime.migration] Running upgrade 313373c0ffee -> 8675309a5c4f", "INFO [alembic.runtime.migration] Running upgrade 8675309a5c4f -> 45f955889773", "INFO [alembic.runtime.migration] Running upgrade 45f955889773 -> 26c371498592", "INFO [alembic.runtime.migration] Running upgrade 26c371498592 -> 1c844d1677f7", "INFO [alembic.runtime.migration] Running upgrade 1c844d1677f7 -> 1b4c6e320f79", "INFO [alembic.runtime.migration] Running upgrade 1b4c6e320f79 -> 48153cb5f051", "INFO [alembic.runtime.migration] Running upgrade 48153cb5f051 -> 9859ac9c136", "INFO [alembic.runtime.migration] Running upgrade 9859ac9c136 -> 34af2b5c5a59", "INFO [alembic.runtime.migration] Running upgrade 34af2b5c5a59 -> 59cb5b6cf4d", "INFO [alembic.runtime.migration] Running upgrade 59cb5b6cf4d -> 13cfb89f881a", "INFO [alembic.runtime.migration] Running upgrade 13cfb89f881a -> 32e5974ada25", "INFO [alembic.runtime.migration] Running upgrade 32e5974ada25 -> ec7fcfbf72ee", "INFO [alembic.runtime.migration] Running upgrade ec7fcfbf72ee -> dce3ec7a25c9", "INFO [alembic.runtime.migration] Running upgrade dce3ec7a25c9 -> c3a73f615e4", "INFO [alembic.runtime.migration] Running upgrade c3a73f615e4 -> 659bf3d90664", "INFO [alembic.runtime.migration] Running upgrade 659bf3d90664 -> 1df244e556f5", "INFO [alembic.runtime.migration] Running upgrade 1df244e556f5 -> 19f26505c74f", "INFO [alembic.runtime.migration] Running upgrade 19f26505c74f -> 15be73214821", "INFO [alembic.runtime.migration] Running upgrade 15be73214821 -> b4caf27aae4", "INFO [alembic.runtime.migration] Running upgrade b4caf27aae4 -> 15e43b934f81", "INFO [alembic.runtime.migration] Running upgrade 15e43b934f81 -> 31ed664953e6", "INFO [alembic.runtime.migration] Running upgrade 31ed664953e6 -> 2f9e956e7532", "INFO [alembic.runtime.migration] Running upgrade 2f9e956e7532 -> 3894bccad37f", "INFO [alembic.runtime.migration] Running upgrade 3894bccad37f -> 0e66c5227a8a", "INFO [alembic.runtime.migration] Running upgrade 0e66c5227a8a -> 45f8dd33480b", "INFO [alembic.runtime.migration] Running upgrade 45f8dd33480b -> 5abc0278ca73", "INFO [alembic.runtime.migration] Running upgrade 5abc0278ca73 -> d3435b514502", "INFO [alembic.runtime.migration] Running upgrade d3435b514502 -> 30107ab6a3ee", "INFO [alembic.runtime.migration] Running upgrade 30107ab6a3ee -> c415aab1c048", "INFO [alembic.runtime.migration] Running upgrade c415aab1c048 -> a963b38d82f4", "INFO [alembic.runtime.migration] Running upgrade kilo -> 30018084ec99", "INFO [alembic.runtime.migration] Running upgrade 30018084ec99 -> 4ffceebfada", "INFO [alembic.runtime.migration] Running upgrade 4ffceebfada -> 5498d17be016", "INFO 
[alembic.runtime.migration] Running upgrade 5498d17be016 -> 2a16083502f3", "INFO [alembic.runtime.migration] Running upgrade 2a16083502f3 -> 2e5352a0ad4d", "INFO [alembic.runtime.migration] Running upgrade 2e5352a0ad4d -> 11926bcfe72d", "INFO [alembic.runtime.migration] Running upgrade 11926bcfe72d -> 4af11ca47297", "INFO [alembic.runtime.migration] Running upgrade 4af11ca47297 -> 1b294093239c", "INFO [alembic.runtime.migration] Running upgrade 1b294093239c -> 8a6d8bdae39", "INFO [alembic.runtime.migration] Running upgrade 8a6d8bdae39 -> 2b4c2465d44b", "INFO [alembic.runtime.migration] Running upgrade 2b4c2465d44b -> e3278ee65050", "INFO [alembic.runtime.migration] Running upgrade e3278ee65050 -> c6c112992c9", "INFO [alembic.runtime.migration] Running upgrade c6c112992c9 -> 5ffceebfada", "INFO [alembic.runtime.migration] Running upgrade 5ffceebfada -> 4ffceebfcdc", "INFO [alembic.runtime.migration] Running upgrade 4ffceebfcdc -> 7bbb25278f53", "INFO [alembic.runtime.migration] Running upgrade 7bbb25278f53 -> 89ab9a816d70", "INFO [alembic.runtime.migration] Running upgrade 89ab9a816d70 -> c879c5e1ee90", "INFO [alembic.runtime.migration] Running upgrade c879c5e1ee90 -> 8fd3918ef6f4", "INFO [alembic.runtime.migration] Running upgrade 8fd3918ef6f4 -> 4bcd4df1f426", "INFO [alembic.runtime.migration] Running upgrade 4bcd4df1f426 -> b67e765a3524", "INFO [alembic.runtime.migration] Running upgrade a963b38d82f4 -> 3d0e74aa7d37", "INFO [alembic.runtime.migration] Running upgrade 3d0e74aa7d37 -> 030a959ceafa", "INFO [alembic.runtime.migration] Running upgrade 030a959ceafa -> a5648cfeeadf", "INFO [alembic.runtime.migration] Running upgrade a5648cfeeadf -> 0f5bef0f87d4", "INFO [alembic.runtime.migration] Running upgrade 0f5bef0f87d4 -> 67daae611b6e", "INFO [alembic.runtime.migration] Running upgrade 67daae611b6e -> 6b461a21bcfc", "INFO [alembic.runtime.migration] Running upgrade 6b461a21bcfc -> 5cd92597d11d", "INFO [alembic.runtime.migration] Running upgrade 5cd92597d11d -> 929c968efe70", "INFO [alembic.runtime.migration] Running upgrade 929c968efe70 -> a9c43481023c", "INFO [alembic.runtime.migration] Running upgrade a9c43481023c -> 804a3c76314c", "INFO [alembic.runtime.migration] Running upgrade 804a3c76314c -> 2b42d90729da", "INFO [alembic.runtime.migration] Running upgrade 2b42d90729da -> 62c781cb6192", "INFO [alembic.runtime.migration] Running upgrade 62c781cb6192 -> c8c222d42aa9", "INFO [alembic.runtime.migration] Running upgrade c8c222d42aa9 -> 349b6fd605a6", "INFO [alembic.runtime.migration] Running upgrade 349b6fd605a6 -> 7d32f979895f", "INFO [alembic.runtime.migration] Running upgrade 7d32f979895f -> 594422d373ee", "INFO [alembic.runtime.migration] Running upgrade 594422d373ee -> 61663558142c", "INFO [alembic.runtime.migration] Running upgrade 61663558142c -> 867d39095bf4, port forwarding", "INFO [alembic.runtime.migration] Running upgrade 867d39095bf4 -> d72db3e25539, modify uniq port forwarding", "INFO [alembic.runtime.migration] Running upgrade d72db3e25539 -> cada2437bf41", "INFO [alembic.runtime.migration] Running upgrade cada2437bf41 -> 195176fb410d, router gateway IP QoS", "INFO [alembic.runtime.migration] Running upgrade 195176fb410d -> fb0167bd9639", "INFO [alembic.runtime.migration] Running upgrade fb0167bd9639 -> 0ff9e3881597", "INFO [alembic.runtime.migration] Running upgrade 0ff9e3881597 -> 9bfad3f1e780", "INFO [alembic.runtime.migration] Running upgrade b67e765a3524 -> a84ccf28f06a", "INFO [alembic.runtime.migration] Running upgrade a84ccf28f06a -> 7d9d8eeec6ad", "INFO 
[alembic.runtime.migration] Running upgrade 7d9d8eeec6ad -> a8b517cff8ab", "INFO [alembic.runtime.migration] Running upgrade a8b517cff8ab -> 3b935b28e7a0", "INFO [alembic.runtime.migration] Running upgrade 3b935b28e7a0 -> b12a3ef66e62", "INFO [alembic.runtime.migration] Running upgrade b12a3ef66e62 -> 97c25b0d2353", "INFO [alembic.runtime.migration] Running upgrade 97c25b0d2353 -> 2e0d7a8a1586", "INFO [alembic.runtime.migration] Running upgrade 2e0d7a8a1586 -> 5c85685d616d", "INFO [alembic.runtime.migration] Context impl MySQLImpl.", "INFO [alembic.runtime.migration] Will assume non-transactional DDL.", "Traceback (most recent call last):", " File \"/var/lib/kolla/venv/bin/neutron-db-manage\", line 10, in ", " sys.exit(main())", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/db/migration/cli.py\", line 657, in main", " return_val |= bool(CONF.command.func(config, CONF.command.name))", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/db/migration/cli.py\", line 179, in do_upgrade", " run_sanity_checks(config, revision)", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/db/migration/cli.py\", line 641, in run_sanity_checks", " script_dir.run_env()", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/script/base.py\", line 475, in run_env", " util.load_python_file(self.dir, \"env.py\")", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/util/pyfiles.py\", line 90, in load_python_file", " module = load_module_py(module_id, path)", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/util/compat.py\", line 156, in load_module_py", " spec.loader.exec_module(module)", " File \"\", line 678, in exec_module", " File \"\", line 219, in _call_with_frames_removed", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/networking_infoblox/neutron/db/migration/alembic_migrations/env.py\", line 88, in ", " run_migrations_online()", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/networking_infoblox/neutron/db/migration/alembic_migrations/env.py\", line 79, in run_migrations_online", " context.run_migrations()", " File \"\", line 8, in run_migrations", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/runtime/environment.py\", line 839, in run_migrations", " self.get_context().run_migrations(**kw)", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/runtime/migration.py\", line 350, in run_migrations", " for step in self._migrations_fn(heads, self):", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/neutron/db/migration/cli.py\", line 632, in check_sanity", " revision, rev, implicit_base=True):", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/script/revision.py\", line 767, in _iterate_revisions", " uppers = util.dedupe_tuple(self.get_revisions(upper))", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/script/revision.py\", line 321, in get_revisions", " resolved_id, branch_label = self._resolve_revision_number(id_)", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/script/revision.py\", line 491, in _resolve_revision_number", " self._revision_map", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/util/langhelpers.py\", line 230, in __get__", " obj.__dict__[self.__name__] = result = self.fget(obj)", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/script/revision.py\", line 123, in _revision_map", " for revision in self._generator():", " File 
\"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/script/base.py\", line 109, in _load_revisions", " script = Script._from_filename(self, vers, file_)", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/script/base.py\", line 887, in _from_filename", " module = util.load_python_file(dir_, filename)", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/util/pyfiles.py\", line 90, in load_python_file", " module = load_module_py(module_id, path)", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/alembic/util/compat.py\", line 156, in load_module_py", " spec.loader.exec_module(module)", " File \"\", line 678, in exec_module", " File \"\", line 219, in _call_with_frames_removed", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/networking_infoblox/neutron/db/migration/alembic_migrations/versions/4d0bb1d080f8_member_sync_improvement.py\", line 43, in ", " default=utils.get_hash()))", " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/networking_infoblox/neutron/common/utils.py\", line 374, in get_hash", " return hashlib.md5(str(time.time())).hexdigest()", "TypeError: Unicode-objects must be encoded before hashing" ], Any ideas which project goes wrong? And how/where to fix it? From balazs.gibizer at ericsson.com Tue May 7 07:19:55 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 7 May 2019 07:19:55 +0000 Subject: [nova][ptg] Summary: Implicit trait-based filters In-Reply-To: References: Message-ID: <1557213589.2232.0@smtp.office365.com> On Mon, May 6, 2019 at 8:03 PM, Eric Fried wrote: > Summary: > In keeping with the first proposed cycle theme [1] (though we didn't > land on that until later in the PTG), we would like to be able to add > required traits to the GET /allocation_candidates query to reduce the > number of results returned - i.e. do more filtering in placement > rather > than in the scheduler (or worse, the compute). You can already do this > by explicitly adding required traits to flavor/image; we want to be > able > to do it implicitly based on things like: > - If the instance requires multiattach, make sure it lands on a > compute > that supports multiattach [2]. > - If the image is in X format, make sure it lands on a compute that > can > read X format [3]. > > Currently the proposals in [2],[3] work by modifying the > RequestSpec.flavor right before select_destinations calls GET > /allocation_candidates. This just happens to be okay because we don't > persist that copy of the flavor back to the instance (which we > wouldn't > want to do, since we don't want these implicit additions to e.g. show > up > when we GET server details, or to affect other lifecycle operations). > > But this isn't a robust design. > > What we would like to do instead is exploit the > RequestSpec.requested_resources field [4] as it was originally > intended, > accumulating all the resource/trait/aggregate/etc. criteria from the > flavor, image, *and* request_filter-y things like the above. However, > gibi started on this [5] and it turns out to be difficult to express > the > unnumbered request group in that field for... reasons. Sorry that I was not able to describe the problems with the approach on the PTG. I will try now in a mail. So this patch [5] tries to create the unnumbered group in RequestSpec.requested_resources based on the other fields (flavor, image ..) 
in the RequestSpec early enough that the above mentioned pre-filters can add traits to this group instead of adding them to the flavor extra_spec.

The current sequence is the following:
* RequestSpec is created in three different ways:
  1) RequestSpec.from_components(): used during server create (and cold migrate if a legacy compute is present)
  2) RequestSpec.from_primitives(): deprecated but still used during re-schedule
  3) RequestSpec.__init__(): oslo OVO deepcopy calls __init__ then copies over every field one by one.
* Before the nova scheduler sends the Placement a_c query it calls nova.scheduler.utils.resources_from_request_spec(RequestSpec); that code uses the RequestSpec fields and collects all the request groups and all the other parameters (e.g. limit, group_policy).

What we would need at the end:
* When the RequestSpec is created in any way, we need to populate the RequestSpec.requested_resources field based on the other RequestSpec fields. Note that __init__ cannot be used for this, as all three instantiations of the object create an empty object first with __init__ and then populate the fields one by one.
* When any of the interesting fields (flavor, image, is_bfv, force_*, ...) is updated on the RequestSpec, the request groups in RequestSpec.requested_resources need to be updated to reflect the change. However, we have to be careful not to blindly re-generate such data, as the unnumbered group might already contain traits that do not come from any of these direct sources but from the above mentioned implicit required traits code paths.
* When the Placement a_c query is generated, it needs to be generated from RequestSpec.requested_resources.

There are a couple of problems:
1) Detecting a change of a RequestSpec field cannot be done by wrapping the field in a property, due to OVO limitations [6]. Even if it were possible, the way we create the RequestSpec object (init an empty object, then set fields one by one) means the field setters might be called on an incomplete object.
2) Regeneration of RequestSpec.requested_resources would need to distinguish between data that can be regenerated from the other fields of the RequestSpec and the traits added from outside (implicit required traits).
3) The request pre-filters [7] run before the placement a_c query is generated. But today these change fields of the RequestSpec (e.g. requested_destination), which would mean the regeneration of RequestSpec.requested_resources would be needed. This is probably solvable by changing the pre-filters to work directly on RequestSpec.requested_resources after we have solved all the other issues.
4) The numbered request groups can come from multiple places. When a group comes from the Flavor, the number is stable, as it is provided by the person who created the Flavor. But when it comes from a Neutron port, the number is generated (the next unoccupied int). So a re-generation of such groups would potentially re-number the groups. This makes debugging hard, as well as mapping a numbered group back to the entity that requested the resource (the port) after allocation. This is probably solvable by using the proposed placement extension that allows a string in the numbered group name instead of just a single int [8]. This way the port UUID can be used as the identity for the numbered group, making that identity stable.
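To illustrate problem 4 with a sketch (the resource class, trait and values below are only examples, not what any real port would request): today a port-backed group ends up in the query under a generated integer suffix, while with the proposed string-suffix extension [8] the port UUID itself could name the group:

    # today: integer suffixes are assigned in request order
    GET /allocation_candidates?resources1=NET_BW_EGR_KILOBIT_PER_SEC:1000&required1=CUSTOM_PHYSNET_PHYSNET0

    # with string suffixes, the port UUID could identify the group stably
    GET /allocation_candidates?resources_port_815fbfb6=NET_BW_EGR_KILOBIT_PER_SEC:1000&required_port_815fbfb6=CUSTOM_PHYSNET_PHYSNET0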
Cheers, gibi [6] https://bugs.launchpad.net/oslo.versionedobjects/+bug/1821619 [7] https://github.com/openstack/nova/blob/master/nova/scheduler/request_filter.py [8] https://storyboard.openstack.org/#!/story/2005575 > > Action: > Since gibi is going to be pretty occupied and unlikely to have time to > resolve [5], aspiers has graciously (been) volunteered to take it > over; > and then follow [2] and [3] to use that mechanism once it's available. Aspier, ping me if you want to talk about these in IRC. Cheers, gibi > > efried > > [1] > https://protect2.fireeye.com/url?k=07226944-5ba84bad-072229df-0cc47ad93e2e-db879b26751dd159&u=https://review.opendev.org/#/c/657171/1/priorities/train-priorities.rst at 13 > [2] > https://protect2.fireeye.com/url?k=6793d282-3b19f06b-67939219-0cc47ad93e2e-b61d4c15f019d018&u=https://review.opendev.org/#/c/645316/ > [3] > https://protect2.fireeye.com/url?k=975e0f6d-cbd42d84-975e4ff6-0cc47ad93e2e-9cf6144999db0dfb&u=https://review.opendev.org/#/q/topic:bp/request-filter-image-types+(status:open+OR+status:merged) > [4] > https://protect2.fireeye.com/url?k=495a140e-15d036e7-495a5495-0cc47ad93e2e-745cad547e47b7cc&u=https://opendev.org/openstack/nova/src/commit/5934c5dc6932fbf19ca7f3011c4ccc07b0038ac4/nova/objects/request_spec.py#L93-L100 > [5] > https://protect2.fireeye.com/url?k=733c10d0-2fb63239-733c504b-0cc47ad93e2e-25f07d70c4385f31&u=https://review.opendev.org/#/c/647396/ > From marcin.juszkiewicz at linaro.org Tue May 7 07:34:26 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Tue, 7 May 2019 09:34:26 +0200 Subject: [kolla][neutron][networking-infoblox] Python3 issue: "TypeError: Unicode-objects must be encoded before hashing" In-Reply-To: <1d56ad05-9fa4-16b7-5cbe-af5c339f58b1@linaro.org> References: <1d56ad05-9fa4-16b7-5cbe-af5c339f58b1@linaro.org> Message-ID: <42626a00-df14-3d9b-e52c-1dfc3eeb639f@linaro.org> W dniu 07.05.2019 o 08:42, Marcin Juszkiewicz pisze: > I am working on making Kolla images Python 3 only. So far images are py3 > but then there are issues during deployment phase which I do not know > how to solve. > > https://review.opendev.org/#/c/642375/ is a patch. > > 'kolla-ansible-ubuntu-source' CI job deploys using Ubuntu 18.04 based > images. And fails. > > Log [1] shows something which looks like 'works in py2, not tested with py3' > code: > > 1. http://logs.openstack.org/75/642375/19/check/kolla-ansible-ubuntu-source/40878ed/primary/logs/ansible/deploy > > > " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/networking_infoblox/neutron/common/utils.py\", line 374, in get_hash", > " return hashlib.md5(str(time.time())).hexdigest()", > "TypeError: Unicode-objects must be encoded before hashing" > ], > > Any ideas which project goes wrong? And how/where to fix it? > Found something interesting. And no idea who to blame... We use http://tarballs.openstack.org/networking-infoblox/networking-infoblox-master.tar.gz during development. But master == 2.0.3dev97 So I checked on tarballs and on Pypi: newton = 9.0.1 ocata = 10.0.1 pike = 11.0.1 queens = 12.0.1 rocky = 13.0.0 (tarballs only) stein is not present Each of those releases were done from same code but changelog always says 2.0.2 -> current.release.0 -> current.release.update Can not it be versioned in sane way? 2.0.2 -> 9.0.0 -> 10.0.0 -> 11.0.0 -> 12.0.0 -> 13.0.0 -> 13.x.ydevz? 
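Coming back to the TypeError quoted above: the failing line in networking_infoblox/neutron/common/utils.py (get_hash, line 374) is the usual Python-2-only hashlib pattern. A minimal sketch of the kind of fix needed (signature simplified, untested here):

    import hashlib
    import time

    def get_hash():
        # hashlib.md5() requires bytes on Python 3, so encode the string first
        return hashlib.md5(str(time.time()).encode('utf-8')).hexdigest()

Of course, whether a fix like that can actually reach us depends on the release/versioning situation described above.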
From cjeanner at redhat.com Tue May 7 07:56:07 2019 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Tue, 7 May 2019 09:56:07 +0200 Subject: [TripleO][PTG] Validation summary Message-ID: Hello all, Last Saturday, we had a session about two topics: - Validation Framework - In-flight validations Here's a summary about the different discussions around those topics. ## Current state: - all the existing validations within "tripleo-validations" have been moved to the new format (proper ansible roles with dedicated playbook). Big thumb up to the involved people for the hard work! - Mistral runs them from the hard drive instead of using swift - Possibility to run validations through the CLI using the new "openstack tripleo validator" subcommand - Possibility to run the validations directly with ansible-playbook - Blog posts with demos and some explanations: ° https://cjeanner.github.io/openstack/tripleo/validations/2019/04/24/validation-framework.html ° https://cjeanner.github.io/openstack/tripleo/validations/2019/04/25/in-flight-validations.html ° https://cjeanner.github.io/openstack/tripleo/validations/2019/04/26/in-flight-validations-II.html ## TODO - Refactor tripleoclient code regarding the way ansible is called, in order to allow bypassing mistral (useful if mistral is broken or not available, like on a standalone deploy) - Get more validations from the Services teams (Compute, Neutron, and so on) - CI integration: get a job allowing to validate the framework (running the no-op validation and group) as well as validations themselves - Doc update (WIP: https://review.opendev.org/654943) - Check how the tripleo-validations content might be backported down to Pike or even Newton. We don't consider the CLI changes, since the cherry-picks will be more than painful, and might break things in a really bad way. You can find the whole, raw content on the following pad: https://etherpad.openstack.org/p/tripleo-ptg-train-validations In case you have questions or remarks, or want to dig further in the topics present on that pad, feel free to contact me or just run a thread on the ML :). Cheers, C. -- Cédric Jeanneret Software Engineer DFG:DF -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From mark at stackhpc.com Tue May 7 09:01:47 2019 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 7 May 2019 10:01:47 +0100 Subject: [kolla] Denver summit summary Message-ID: Hi, Here are links to slides from the kolla project sessions at the summit. * Project update [1] * Project onboarding [2] There should be a video of the update available in due course. We also had a user feedback session, the Etherpad notes are at [3] Picking out some themes from the user feedback: * TLS everywhere * Nova Cells v2 * SELinux * podman & buildah support for CentOS/RHEL 8 I think we're in a good position to support the first two in the Train cycle since they have some work in progress. The latter two will require some investigation. [1] https://docs.google.com/presentation/d/1npG6NGGsJxdXFzmPLfrDsWMhxeDVY9-nBfmDBvrAAlQ/edit?usp=sharing [2] https://docs.google.com/presentation/d/11gGW93Xu7DQo_G1LiRDm6thfB5gNLm39SHuKcgSW8FQ/edit?usp=sharing [3] https://etherpad.openstack.org/p/DEN-train-kolla-feedback Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mark at stackhpc.com Tue May 7 09:02:57 2019 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 7 May 2019 10:02:57 +0100 Subject: [kayobe] Denver summit summary Message-ID: Hi, The Kayobe feedback & roadmap session Etherpad notes are at [1]. A major theme was documentation, including reference configurations and more around day 2 ops. On Tuesday evening we ran a packed workshop [2] on deploying OpenStack via Kayobe. It went pretty smoothly overall, and we had some positive feedback. Thanks to Packet for providing the infrastructure - the bare metal servers let us cover a lot of ground in a short time. Anyone wanting to try out the workshop can do so using a VM or bare metal server running CentOS 7 with at least 32GB RAM and 40GB disk. Follow the 'Full Deploy' section [3] in the README. I spoke with many people during the week who feel that Kayobe could be a great fit for them, which is really encouraging. Please look out for new users and contributors reaching out via IRC and the mailing list and help them get up to speed. [1] https://etherpad.openstack.org/p/DEN-19-kayobe-feedback-roadmap [2] https://www.openstack.org/summit/denver-2019/summit-schedule/events/23426/a-universe-from-nothing-containerised-openstack-deployment-using-kolla-ansible-and-kayobe [3] https://github.com/stackhpc/a-universe-from-nothing#full-deploy Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Tue May 7 09:07:28 2019 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 7 May 2019 10:07:28 +0100 Subject: [ptg][kolla][openstack-ansible][tripleo] PTG cross-project summary Message-ID: Hi, This is a summary of the ad-hoc cross project session between the OpenStack Ansible and Kolla teams. It occurred to me that our two teams likely face similar challenges, and there are areas we could collaborate on. I've tagged TripleO also since the same applies there. [Collaboration on approach to features] This was my main reason for proposing the session - there are features and changes that all deployment tools need to make. Examples coming up include support for upgrade checkers and IPv6. Rather than work in isolation and solve the same problem in different ways, perhaps we could share our experiences. The implementations will differ, but providing a reasonably consistent feel between deployment tools can't be a bad thing. As a test case, we briefly discussed our experience with the upgrade checker support added in Stein, and found that our expectation of how it would work was fairly aligned in the room, but not aligned with how I understand it to actually work (it's more of a post-upgrade check than a pre-upgrade check). I was also able to point the OSA team at the placement migration code added to Kolla in the Stein release, which should save them some time, and provide more eyes on our code. I'd like to pursue this more collaborative approach during the Train release where it fits. Upgrade checkers seems a good place to start, but am open to other ideas such as IPv6 or Python 3. [OSA in Kayobe] This was my wildcard - add support for deploying OpenStack via OSA in Kayobe as an alternative to Kolla Ansible. It could be a good fit for those users who want to use OSA but don't have a provisioning system. This wasn't true of anyone in the room, and lack of resources deters from 'build it and they will come'. Still, the seed is planted, it may yet grow. 
[Sharing Ansible roles] mnaser had an interesting idea: add support for deploying kolla containers to the OSA Ansible service roles. We could then use those roles within Kolla Ansible to avoid duplication of code. There is definitely some appeal to this in theory. In practice however I feel that the two deployment models are sufficiently different that it would add significantly complexity to both projects. Part of the (relative) simplicity and regularity of Kolla Ansible is enabled by handing off installation and other tasks to Kolla. One option that might work however is sharing some of the lower level building blocks. mnaser offered to make a PoC for using https://github.com/openstack/ansible-config_template to generate configuration in Kolla Ansible in place of merge_config and merge_yaml. It requires some changes to that role to support merging a list of source template files. We'd also need to add an external dependency to our 'monorepo', or 'vendor' the module - trade offs to make in complexity vs. maintaining our own module. I'd like to thank the OSA team for hosting the discussion - it was great to meet the team and share experience. Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Tue May 7 09:25:06 2019 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 7 May 2019 10:25:06 +0100 Subject: [kolla] Denver summit summary In-Reply-To: References: Message-ID: On Tue, 7 May 2019 at 10:01, Mark Goddard wrote: > Hi, > > Here are links to slides from the kolla project sessions at the summit. > > * Project update [1] > * Project onboarding [2] > > There should be a video of the update available in due course. > > We also had a user feedback session, the Etherpad notes are at [3] > > Picking out some themes from the user feedback: > > * TLS everywhere > * Nova Cells v2 > * SELinux > * podman & buildah support for CentOS/RHEL 8 > > I think we're in a good position to support the first two in the Train > cycle since they have some work in progress. The latter two will require > some investigation. > > [1] > https://docs.google.com/presentation/d/1npG6NGGsJxdXFzmPLfrDsWMhxeDVY9-nBfmDBvrAAlQ/edit?usp=sharing > [2] > https://docs.google.com/presentation/d/11gGW93Xu7DQo_G1LiRDm6thfB5gNLm39SHuKcgSW8FQ/edit?usp=sharing > It was brought to my attention that Google slides might not be accessible from some places. I've uploaded to slideshare also, but it appears this is blocked in China. Is there another location where they can be hosted that is accessible from China? > [3] https://etherpad.openstack.org/p/DEN-train-kolla-feedback > > Cheers, > Mark > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john at johngarbutt.com Tue May 7 09:27:21 2019 From: john at johngarbutt.com (John Garbutt) Date: Tue, 7 May 2019 10:27:21 +0100 Subject: [nova][ptg][keystone] Summary: Unified Limits and Policy Refresh in Nova Message-ID: Hi, A summary of the nova/keystone cross project PTG session. 
Full etherpad is here: https://etherpad.openstack.org/p/ptg-train-xproj-nova-keystone 1) Policy Refresh Spec: https://review.openstack.org/#/c/547850/ Notes: * Better defaults to make policy changes easier * Move from current to: System Admin vs Project Member * Also add System Reader and Project Reader ** Above requires more granular policy for some APIs ** Also change DB check: system or admin, eventually drop it * Lots of testing to avoid regressions * Patrole may be useful, but initial focus on in-tree tests Actions: * johnthetubaguy to update spec * melwitt, gmann and johnthetubaguy happy to work on these * upload POC for testing plan 2) Unified Limits Spec: https://review.opendev.org/#/c/602201/ Notes: * only move instances and resource class based quotas to keystone * work on tooling to help operators migrate to keystone based limits * adopt oslo.limit to enforce unified limits * eventually we get hierarchical limits and the "per flavor" use case Actions: * johnthetubaguy to update the spec * johnthetubaguy, melwitt, alex_xu happy to work on these things * work on POC to show approach Thanks, johnthetubaguy -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Tue May 7 11:52:18 2019 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 7 May 2019 12:52:18 +0100 Subject: [kolla] Virtual PTG scheduling poll In-Reply-To: References: Message-ID: The results are in! There were a few ties, so I picked the two sessions that most cores could attend and was most friendly to timezones of the attendees. Tues May 28th, 12:00 - 16:00 UTC Weds May 29th, 12:00 - 16:00 UTC Lets try to cover as much as possible in the first session, then decide if we need another. Unless anyone has any other suggestions, I propose we use Google hangouts for voice and/or video. Hangout: https://meet.google.com/pbo-boob-csh?hs=122 Calendar: https://calendar.google.com/event?action=TEMPLATE&tmeid=MGE1MHRuN2s2cTdkMm12YWtpMnY5YWZlNHRfMjAxOTA1MjhUMTIwMDAwWiBtYXJrQHN0YWNraHBjLmNvbQ&tmsrc=mark%40stackhpc.com&scp=ALL Cheers, Mark On Tue, 30 Apr 2019 at 19:01, Mark Goddard wrote: > Hi, > > We struggled to find a suitable date, so I've added another two weeks. > Please update your responses. > > https://doodle.com/poll/adk2smds76d8db4u > > Thanks, > Mark > > On Mon, 15 Apr 2019 at 07:34, Mark Goddard wrote: > >> Hi, >> >> Since most of the Kolla team aren't attending the PTG, we've agreed to >> hold a virtual PTG. >> >> We agreed to start with two 4 hour sessions. We can finish early or >> schedule another session, depending on how we progress. We'll use some >> video conferencing software TBD. >> >> I've created a Doodle poll here [2], please fill it in if you hope to >> attend. Times are in UTC. >> >> Please continue to fill out the planning Etherpad [1]. >> >> Thanks, >> Mark >> >> [1] https://etherpad.openstack.org/p/kolla-train-ptg >> [2] https://doodle.com/poll/adk2smds76d8db4u >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at ericsson.com Tue May 7 12:05:40 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 7 May 2019 12:05:40 +0000 Subject: [placement][nova][ptg] Summary: Nested Magic With Placement In-Reply-To: References: Message-ID: <1557230737.31620.1@smtp.office365.com> On Fri, May 3, 2019 at 8:22 PM, Chris Dent wrote: > > * A 'mappings' key will be added to the 'allocations' object in the > allocation_candidates response that will support request group > mapping. 
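Purely as an illustration of that agreement (the exact shape is for the spec to settle), an allocation candidate could then carry something like:

    {
        "allocations": {"<rp_uuid>": {"resources": {"VCPU": 1}}},
        "mappings": {"": ["<rp_uuid>"], "1": ["<other_rp_uuid>"]}
    }

where the keys of 'mappings' are the request group suffixes and the values are the resource providers that satisfied each group.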
I refreshed the spec in the following way: 1) updated the spec in the nova-spec repo to capture the agreement [1] 2) copied the spec from the nova-spec repo to the placement repo [2] 3) uploaded the both spec updates [1][2] 4) abandoned the nova-spec [1] by pointing to the placement spec 5) marked the nova bp [3] in launchpad as superseded pointing to the placement story [4]. [1] https://review.opendev.org/#/c/597601/ [2] https://review.opendev.org/#/c/657582/ [3] https://blueprints.launchpad.net/nova/+spec/placement-resource-provider-request-group-mapping-in-allocation-candidates [4] https://storyboard.openstack.org/#!/story/2005575 Please note that I removed myself as 'Primary assignee' in the spec as this work has low prio in Train from my side so it is free for anybody to take over. I will try to help at least with the review. Cheers, gibi From hongbin034 at gmail.com Tue May 7 12:14:59 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Tue, 7 May 2019 08:14:59 -0400 Subject: [devstack-plugin-container][zun][kuryr] Extend core team for devstack-plugin-container Message-ID: Hi all, I propose to add Zun and Kuryr core team into devstack-plugin-container. Right now, both Zun and Kuryr are using that plugin and extending the core team would help accelerating the code review process. Please let me know if there is any concern of the proposal. Best regards, Hongbin -------------- next part -------------- An HTML attachment was scrubbed... URL: From aspiers at suse.com Tue May 7 12:58:18 2019 From: aspiers at suse.com (Adam Spiers) Date: Tue, 7 May 2019 13:58:18 +0100 Subject: [tc][all][airship] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: References: <20190503190538.GB3377@localhost.localdomain> <20190503230525.a3vxsnliklitnei4@arabian.linksys.moosehall> Message-ID: <20190507125818.ykue2rycwcrqjhms@pacific.linksys.moosehall> Roman Gorshunov wrote: >Thanks, Adam. > >I haven't been on PTG, sorry. It's good that there has been a >discussion and agreement is reached. Oh sorry, I assumed you must have been in the room when we discussed it, since your mail arrived just after then ;-) But it was just a coincidence! :-) From jaypipes at gmail.com Tue May 7 13:02:53 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Tue, 7 May 2019 09:02:53 -0400 Subject: [ops][nova]Logging in nova and other openstack projects In-Reply-To: References: Message-ID: On 05/06/2019 05:56 PM, Jean-Philippe Méthot wrote: > Hi, > > We’ve been modifying our login habits for Nova on our Openstack setup to > try to send only warning level and up logs to our log servers. To do so, > I’ve created a logging.conf and configured logging according to the > logging module documentation. While what I’ve done works, it seems to be > a very convoluted process for something as simple as changing the > logging level to warning. We worry that if we upgrade and the syntax for > this configuration file changes, we may have to push more changes > through ansible than we would like to. It's unlikely that the syntax for the logging configuration file will change since it's upstream Python, not OpenStack or Nova that is the source of this syntax. That said, if all you want to do is change some or all package default logging levels, you can change the value of the CONF.default_log_levels option. The default_log_levels CONF option is actually derived from the oslo_log package that is used by all OpenStack service projects. 
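As a sketch (the module list is illustrative, adjust to taste), dropping something like this into nova.conf raises the listed loggers to WARN without needing a separate Python logging file:

    [DEFAULT]
    default_log_levels = amqp=WARN,amqplib=WARN,sqlalchemy=WARN,oslo.messaging=WARN,iso8601=WARN,requests=WARN,urllib3=WARN,keystonemiddleware=WARN,stevedore=WARN,nova=WARN

I believe listing the service's own namespace (the nova=WARN entry at the end) works as well, though verify that before relying on it. The stock list that this overrides ships with oslo.log itself.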
It's default value is here: https://github.com/openstack/oslo.log/blob/29671ef2bfacb416d397abc57170bb090b116f68/oslo_log/_options.py#L19-L31 So, if you don't want to mess with the standard Python logging conf, you can just change that CONF.default_log_levels option. Note that if you do specify a logging config file using a non-None CONF.log_config_append value, then all other logging configuration options (like default_log_levels) are ignored). Best, -jay From mark at stackhpc.com Tue May 7 13:23:12 2019 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 7 May 2019 14:23:12 +0100 Subject: [Deploy problem] deploy openstack+monasca with kolla-ansible 8.0.0.0rc1. In-Reply-To: References: Message-ID: On Fri, 3 May 2019 at 09:13, Ming-Che Liu wrote: > Apologies,this mail will attach rabbitmq log file(ues command "docker logs > --follow rabbitmq") for debug. > > Logs in /var/lib/docker/volumes/kolla_logs/_data/rabbitmq are empty. > > Hmm, there's not much to go on there. Are you now running Ubuntu 18.04? One thing that can help is running the container manually via docker run. It can take a little while to work out the right arguments to pass, but it's possible. Mark > thanks. > > Regards, > > Ming-Che > > Ming-Che Liu 於 2019年5月3日 週五 下午3:26寫道: > >> Hi Mark, >> >> I tried to deploy openstack+monasca with kolla-ansible 8.0.0.0rc1(in the >> same machine), but still encounter some fatal error. >> >> The attached file:golbals.yml is my setting, machine_package_setting is >> machine environment setting. >> >> The error is: >> RUNNING HANDLER [rabbitmq : Waiting for rabbitmq to start on first node] >> ************************************************************ >> fatal: [localhost]: FAILED! => {"changed": true, "cmd": "docker exec >> rabbitmq rabbitmqctl wait /var/lib/rabbitmq/mnesia/rabbitmq.pid", "delta": >> "0:00:00.861054", "end": "2019-05-03 15:17:42.387873", "msg": "non-zero >> return code", "rc": 137, "start": "2019-05-03 15:17:41.526819", "stderr": >> "", "stderr_lines": [], "stdout": "", "stdout_lines": []} >> >> When I use command "docker inspect rabbitmq_id |grep RestartCount", I >> find rabbitmq will restart many times >> >> such as: >> >> kaga at agre-an21:~$ sudo docker inspect 5567f37cc78a |grep RestartCount >> "RestartCount": 15, >> >> Could please help to solve this problem? Thanks. >> >> Regards, >> >> Ming-Che >> >> >> >> >> >> >> >> Ming-Che Liu 於 2019年5月3日 週五 上午9:22寫道: >> >>> Hi Mark, >>> >>> Sure, I will do that, thanks. >>> >>> Regards, >>> >>> Ming-Che >>> >>> Mark Goddard 於 2019年5月3日 週五 上午1:12寫道: >>> >>>> >>>> >>>> On Wed, 1 May 2019 at 17:10, Ming-Che Liu wrote: >>>> >>>>> Hello, >>>>> >>>>> I deployed openstack+monasca with kolla-ansible 8.0.0.0rc1. >>>>> >>>>> I follow the steps as mentioned in >>>>> https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html >>>>> >>>>> The setting in my computer's globals.yml as same as [Quick Start] >>>>> tutorial (attached file: globals.yml is my setting). >>>>> >>>>> My machine environment as following: >>>>> OS: Ubuntu 16.04 >>>>> Kolla-ansible verions: 8.0.0.0rc1 >>>>> ansible version: 2.7 >>>>> >>>>> When I execute [bootstrap-servers] and [prechecks], it seems ok (no >>>>> fatal error or any interrupt). >>>>> >>>>> But when I execute [deploy], it will occur some error about >>>>> rabbitmq(when I set enable_rabbitmq:yes) and nova compute service(when I >>>>> set enable_rabbitmq:no). >>>>> >>>>> I have some detail screenshot about the errors as attached files, >>>>> could you please help me to solve this problem? 
>>>>> >>>>> Thank you very much. >>>>> >>>>> [Attached file description]: >>>>> globals.yml: my computer's setting about kolla-ansible >>>>> >>>>> As mentioned above, the following pictures show the errors, the >>>>> rabbitmq error will occur if I set [enable_rabbitmq:yes], the nova compute >>>>> service error will occur if I set [enable_rabbitmq:no]. >>>>> >>>> >>>> Hi Ming-Che, >>>> >>>> Since Stein, we no longer test Kolla Ansible with Ubuntu 16.04 >>>> upstream. Could you try again using Ubuntu 18.04? >>>> >>>> Regards, >>>> Mark >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimrollenhagen.com Tue May 7 13:25:17 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Tue, 7 May 2019 09:25:17 -0400 Subject: [kolla] Denver summit summary In-Reply-To: References: Message-ID: On Tue, May 7, 2019 at 5:26 AM Mark Goddard wrote: > > > On Tue, 7 May 2019 at 10:01, Mark Goddard wrote: > >> Hi, >> >> Here are links to slides from the kolla project sessions at the summit. >> >> * Project update [1] >> * Project onboarding [2] >> >> There should be a video of the update available in due course. >> >> We also had a user feedback session, the Etherpad notes are at [3] >> >> Picking out some themes from the user feedback: >> >> * TLS everywhere >> * Nova Cells v2 >> * SELinux >> * podman & buildah support for CentOS/RHEL 8 >> >> I think we're in a good position to support the first two in the Train >> cycle since they have some work in progress. The latter two will require >> some investigation. >> >> [1] >> https://docs.google.com/presentation/d/1npG6NGGsJxdXFzmPLfrDsWMhxeDVY9-nBfmDBvrAAlQ/edit?usp=sharing >> [2] >> https://docs.google.com/presentation/d/11gGW93Xu7DQo_G1LiRDm6thfB5gNLm39SHuKcgSW8FQ/edit?usp=sharing >> > > It was brought to my attention that Google slides might not be accessible > from some places. I've uploaded to slideshare also, but it appears this is > blocked in China. Is there another location where they can be hosted that > is accessible from China? > Maybe upload as PDFs in a patch to kolla. No need to merge, but folks can checkout the patch to get the files. // jim -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue May 7 13:59:33 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 7 May 2019 13:59:33 +0000 Subject: [kolla][neutron][networking-infoblox] Python3 issue: "TypeError: Unicode-objects must be encoded before hashing" In-Reply-To: <42626a00-df14-3d9b-e52c-1dfc3eeb639f@linaro.org> References: <1d56ad05-9fa4-16b7-5cbe-af5c339f58b1@linaro.org> <42626a00-df14-3d9b-e52c-1dfc3eeb639f@linaro.org> Message-ID: <20190507135932.y4j24clfc43nj6cs@yuggoth.org> On 2019-05-07 09:34:26 +0200 (+0200), Marcin Juszkiewicz wrote: [...] > Found something interesting. And no idea who to blame... > > We use > http://tarballs.openstack.org/networking-infoblox/networking-infoblox-master.tar.gz > during development. > > But master == 2.0.3dev97 > > So I checked on tarballs and on Pypi: > > newton = 9.0.1 > ocata = 10.0.1 > pike = 11.0.1 > queens = 12.0.1 > rocky = 13.0.0 (tarballs only) > stein is not present > > Each of those releases were done from same code but changelog always > says 2.0.2 -> current.release.0 -> current.release.update > > > Can not it be versioned in sane way? > > 2.0.2 -> 9.0.0 -> 10.0.0 -> 11.0.0 -> 12.0.0 -> 13.0.0 -> 13.x.ydevz? 
The reason for this is that our present practice for service projects in OpenStack (which the x/networking-infoblox repository seems to partly follow) is to tag major releases after creating stable branches rather than before, and those tags therefore end up missing in the master branch history from which the master branch tarballs you're consuming are created. We used to have a process of merging the release tags back into the master branch history to solve this, but ceased a few years ago because it complicated calculating release notes across various branches. Instead official projects following this release model now receive an auto-proposed change to master as part of the cycle release process which sets a Sem-Ver commit message footer to increment the minor version past the rc1 tag (which is the stable branch point for them). Popular alternatives to this are either to tag an early prerelease on master soon after branching, or follow a different release process where branches are created when/after tagging rather than before (this is more typical of shared libraries in particular). One way in which x/networking-infoblox is not fully following the same release model as official services is that they don't seem to be tagging release candidates on master (or at all for that matter), which would partly mitigate this as you would instead see versions like 13.0.0.0rc2.dev3. Another way it's not fully following that model is, as you have observed, there's no stable/stein branch for it yet. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Tue May 7 14:02:57 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 7 May 2019 14:02:57 +0000 Subject: [kolla] Denver summit summary In-Reply-To: References: Message-ID: <20190507140257.7rmlio6he3gov6gn@yuggoth.org> On 2019-05-07 09:25:17 -0400 (-0400), Jim Rollenhagen wrote: > On Tue, May 7, 2019 at 5:26 AM Mark Goddard wrote: [...] > > It was brought to my attention that Google slides might not be > > accessible from some places. I've uploaded to slideshare also, > > but it appears this is blocked in China. Is there another > > location where they can be hosted that is accessible from China? > > Maybe upload as PDFs in a patch to kolla. No need to merge, but > folks can checkout the patch to get the files. If these are slides for a summit session, the event coordinators generally send a message out to all speakers shortly following the conference with instructions on how/where to upload their slide decks so they can be served alongside the session abstracts. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From sean.mcginnis at gmx.com Tue May 7 14:20:47 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 7 May 2019 09:20:47 -0500 Subject: [cinder][ops] Nested Quota Driver Use? In-Reply-To: References: <20190502003249.GA1432@sm-workstation> Message-ID: <20190507142046.GA3999@sm-workstation> On Fri, May 03, 2019 at 06:58:41PM +0000, Tim Bell wrote: > We're interested in the overall functionality but I think unified limits is the place to invest and thus would not have any problem deprecating this driver. > > We'd really welcome this being implemented across all the projects in a consistent way. 
The sort of functionality proposed in
https://techblog.web.cern.ch/techblog/post/nested-quota-models/ would need
Nova/Cinder/Manila at minimum for CERN to switch.
> 
> So, no objections to deprecation but strong support to converge on unified limits.
> 
> Tim
> 
Thanks Tim, that helps.

Since there wasn't any other feedback, and no one jumping up to say they are
using it today, I have submitted https://review.opendev.org/657511 to
deprecate the current quota driver so we don't have to try to refactor that
functionality into whatever we need to do for the unified limits support.

If anyone has any concerns about this plan, please feel free to raise them here
or on that review.

Thanks!
Sean

From cjeanner at redhat.com  Tue May  7 14:33:57 2019
From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=)
Date: Tue, 7 May 2019 16:33:57 +0200
Subject: [TripleO][Validations] Tag convention
Message-ID: <3c383d8d-54fa-b054-f0ad-b97ed67ba03f@redhat.com>

Dear all,

We're currently working hard in order to provide a nice way to run
validations within a deploy (aka in-flight validations).

We can already call validations provided by the tripleo-validations
package[1], it's working just fine.

Now comes the question: "how can we disable the validations?". In order
to do that, we propose to use a standard tag in the ansible
roles/playbooks, and to add a "--skip-tags " when we disable the
validations via the CLI or configuration.

After a quick check in the tripleoclient code, there apparently is a tag
named "validation" that can already be skipped from within the client.

So, our questions:
- would the reuse of "validation" be OK?
- if not, what tag would be best in order to avoid confusion?

We also have the idea of allowing validations to be disabled per service. For
this, we propose to introduce the following tag:
- validation-, like "validation-nova", "validation-neutron" and
so on

What do you think about those two additions?

Thank you all for your feedback and ideas!

Cheers,

C.

[1] as shown here:
https://cjeanner.github.io/openstack/tripleo/validations/2019/04/26/in-flight-validations-II.html

-- 
Cédric Jeanneret
Software Engineer - OpenStack Platform
Red Hat EMEA
https://www.redhat.com/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: 

From mark at stackhpc.com  Tue May  7 14:36:22 2019
From: mark at stackhpc.com (Mark Goddard)
Date: Tue, 7 May 2019 15:36:22 +0100
Subject: [kolla] Denver summit summary
In-Reply-To: <20190507140257.7rmlio6he3gov6gn@yuggoth.org>
References: <20190507140257.7rmlio6he3gov6gn@yuggoth.org>
Message-ID: 

On Tue, 7 May 2019 at 15:03, Jeremy Stanley wrote:
> On 2019-05-07 09:25:17 -0400 (-0400), Jim Rollenhagen wrote:
> > On Tue, May 7, 2019 at 5:26 AM Mark Goddard wrote:
> [...]
> > > It was brought to my attention that Google slides might not be
> > > accessible from some places. I've uploaded to slideshare also,
> > > but it appears this is blocked in China. Is there another
> > > location where they can be hosted that is accessible from China?
> >
> > Maybe upload as PDFs in a patch to kolla. No need to merge, but
> > folks can checkout the patch to get the files.
> 
> If these are slides for a summit session, the event coordinators
> generally send a message out to all speakers shortly following the
> conference with instructions on how/where to upload their slide
> decks so they can be served alongside the session abstracts.
> 
Thanks, I'll wait for that.
> -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From johnsomor at gmail.com Tue May 7 14:37:21 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 7 May 2019 07:37:21 -0700 Subject: [octavia] Error while creating amphora In-Reply-To: <0994c2fb-a2c1-89f8-10ca-c3d0d9bf79e2@gmx.com> References: <0994c2fb-a2c1-89f8-10ca-c3d0d9bf79e2@gmx.com> Message-ID: Yes, we have had discussions with the nova team about this. Their response was that the current config drive method we are using is a stable interface and will not go away. We also asked that the "user_data" method storage size be increased to a reasonable size that could be used for our current needs. Even growing that to an old floppy disk size would address our needs, but this was not committed to. Michael On Mon, May 6, 2019 at 8:54 AM Volodymyr Litovka wrote: > > Hi Michael, > > regarding file injection vs config_drive - > https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/deprecate-file-injection.html > - don't know when this will happen, but you see - people are thinking in > this way. > > On 5/2/19 5:58 PM, Michael Johnson wrote: > > Volodymyr, > > > > It looks like you have enabled "user_data_config_drive" in the > > octavia.conf file. Is there a reason you need this? If not, please > > set it to False and it will resolve your issue. > > > > It appears we have a python3 bug in the "user_data_config_drive" > > capability. It is not generally used and appears to be missing test > > coverage. > > > > I have opened a story (bug) on your behalf here: > > https://storyboard.openstack.org/#!/story/2005553 > > > > Michael > > > > On Thu, May 2, 2019 at 4:29 AM Volodymyr Litovka wrote: > >> Dear colleagues, > >> > >> I'm using Openstack Rocky and trying to launch Octavia 4.0.0. 
After all installation steps I've got an error during 'openstack loadbalancer create' with the following log: > >> > >> DEBUG octavia.controller.worker.tasks.compute_tasks [-] Compute create execute for amphora with id d037721f-2cf9-492e-99cb-0be5874da0f6 execute /opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py:63 > >> ERROR octavia.controller.worker.tasks.compute_tasks [-] Compute create for amphora id: d037721f-2cf9-492e-99cb-0be5874da0f6 failed: TypeError: can't concat str to bytes > >> ERROR octavia.controller.worker.tasks.compute_tasks Traceback (most recent call last): > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/controller/worker/tasks/compute_tasks.py", line 94, in execute > >> ERROR octavia.controller.worker.tasks.compute_tasks config_drive_files) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/user_data_jinja_cfg.py", line 38, in build_user_data_config > >> ERROR octavia.controller.worker.tasks.compute_tasks return self.agent_template.render(user_data=user_data) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/asyncsupport.py", line 76, in render > >> ERROR octavia.controller.worker.tasks.compute_tasks return original_render(self, *args, **kwargs) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 1008, in render > >> ERROR octavia.controller.worker.tasks.compute_tasks return self.environment.handle_exception(exc_info, True) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/environment.py", line 780, in handle_exception > >> ERROR octavia.controller.worker.tasks.compute_tasks reraise(exc_type, exc_value, tb) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/_compat.py", line 37, in reraise > >> ERROR octavia.controller.worker.tasks.compute_tasks raise value.with_traceback(tb) > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/octavia/common/jinja/templates/user_data_config_drive.template", line 29, in top-level template code > >> ERROR octavia.controller.worker.tasks.compute_tasks {{ value|indent(8) }} > >> ERROR octavia.controller.worker.tasks.compute_tasks File "/opt/openstack/lib/python3.6/site-packages/jinja2/filters.py", line 557, in do_indent > >> ERROR octavia.controller.worker.tasks.compute_tasks s += u'\n' # this quirk is necessary for splitlines method > >> ERROR octavia.controller.worker.tasks.compute_tasks TypeError: can't concat str to bytes > >> ERROR octavia.controller.worker.tasks.compute_tasks > >> WARNING octavia.controller.worker.controller_worker [-] Task 'STANDALONE-octavia-create-amp-for-lb-subflow-octavia-cert-compute-create' (06134192-def9-420c-9feb-0d08a068f3b2) transitioned into state 'FAILURE' from state 'RUNNING' > >> > >> Any advises where is the problem? > >> > >> My environment: > >> - Openstack Rocky > >> - Ubuntu 18.04 > >> - Octavia installed in virtualenv using pip install: > >> # pip list |grep octavia > >> octavia 4.0.0 > >> octavia-lib 1.1.1 > >> python-octaviaclient 1.8.0 > >> > >> Thank you. > >> > >> -- > >> Volodymyr Litovka > >> "Vision without Execution is Hallucination." 
-- Thomas Edison > >> > >> -- > >> Volodymyr Litovka > >> "Vision without Execution is Hallucination." -- Thomas Edison > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." -- Thomas Edison > From jp.methot at planethoster.info Tue May 7 15:15:47 2019 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Tue, 7 May 2019 11:15:47 -0400 Subject: [ops][nova]Logging in nova and other openstack projects In-Reply-To: References: Message-ID: Hi, I’ve just tried setting everything to warn through the nova.conf option default_log_levels, as suggested. However, I’m still getting info level logs from the resource tracker like this : INFO nova.compute.resource_tracker Could the compute resource tracker logs be managed by another parameter than what’s in the default list for that configuration option? Best regards, Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. > Le 7 mai 2019 à 09:02, Jay Pipes a écrit : > > On 05/06/2019 05:56 PM, Jean-Philippe Méthot wrote: >> Hi, >> We’ve been modifying our login habits for Nova on our Openstack setup to try to send only warning level and up logs to our log servers. To do so, I’ve created a logging.conf and configured logging according to the logging module documentation. While what I’ve done works, it seems to be a very convoluted process for something as simple as changing the logging level to warning. We worry that if we upgrade and the syntax for this configuration file changes, we may have to push more changes through ansible than we would like to. > > It's unlikely that the syntax for the logging configuration file will change since it's upstream Python, not OpenStack or Nova that is the source of this syntax. > > That said, if all you want to do is change some or all package default logging levels, you can change the value of the CONF.default_log_levels option. > > The default_log_levels CONF option is actually derived from the oslo_log package that is used by all OpenStack service projects. It's default value is here: > > https://github.com/openstack/oslo.log/blob/29671ef2bfacb416d397abc57170bb090b116f68/oslo_log/_options.py#L19-L31 > > So, if you don't want to mess with the standard Python logging conf, you can just change that CONF.default_log_levels option. Note that if you do specify a logging config file using a non-None CONF.log_config_append value, then all other logging configuration options (like default_log_levels) are ignored). > > Best, > -jay > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mihalis68 at gmail.com Tue May 7 15:18:14 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 7 May 2019 11:18:14 -0400 Subject: [ops] ops meetups team meeting 2019-5-7 Message-ID: Minute from todays meeting are linked below. A vote was taken to officially confirm acceptance of Bloomberg's offer to host the second ops meetup of 2019 and passed. There is also some news of possible further meetups in 2020 and discussion of how to structure ops events at future Open Infra Summits. reminder: key ops events are notified via : https://twitter.com/osopsmeetup Now up to 63 followers! Meeting ended Tue May 7 15:00:30 2019 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . 
(v 0.1.4) 11:00 AM O<•openstack> Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-05-07-14.04.html 11:00 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-05-07-14.04.txt 11:00 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-05-07-14.04.log.html Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmendiza at redhat.com Tue May 7 15:37:47 2019 From: dmendiza at redhat.com (=?UTF-8?Q?Douglas_Mendiz=c3=a1bal?=) Date: Tue, 7 May 2019 10:37:47 -0500 Subject: [nova][cinder][glance][Barbican]Finding Timeslot for weekly Image Encryption IRC meeting In-Reply-To: References: Message-ID: <6cdb30ba-888c-cd89-5bff-f432edb90467@redhat.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Hi Josephine, I think it's a great idea to have a recurring meeting to keep track of the Image Encryption effort. I tried to answer your doodle, but it seems that it does not have actual times, just dates? Maybe we need a new doodle? I live in the CDT (UTC-5) Time Zone if that helps. Thanks, - - Douglas Mendizábal (redrobot) On 5/4/19 1:57 PM, Josephine Seifert wrote: > Hello, > > as a result from the Summit and the PTG, I would like to hold a > weekly IRC-meeting for the Image Encryption (soon to be a pop-up > team). > > As I work in Europe I have made a doodle poll, with timeslots I > can attend and hopefully many of you. If you would like to join in > a weekly meeting, please fill out the poll and state your name and > the project you are working in: > https://doodle.com/poll/wtg9ha3e5dvym6yt > > Thank you Josephine (Luzi) > > > -----BEGIN PGP SIGNATURE----- iQEzBAEBCAAdFiEEan2ddQosxMRNS/FejiZC4mXuYkoFAlzRpksACgkQjiZC4mXu Ykqfawf7BngccaTpWzDNIipc697bjA2eg8guEYvEJ4KKlgl0vC7duY5Jn/7B/cKp wCFLtTA9V00pdBsdF0ZPOIeRAMlLkcx2BX2H6KqY/NzX0jB2xCtVem4PkAQcig/y 7ika3q/1SdRLKkbxA/07TtY5Obh7T0WUeK0WoylEgKW4YWLnWmMsD6lgcLzgfG1Z 2oDcjyVYShX9A+MVk4saLU3Zt9EG81WY81Y6iOElcj1MQGDY8Ukgc7m4/ykho3Du fZmj3IvxnE134ZGUECTKklmXeOgUWCcnUucIkyTKoAa/uXzxdxfdLT8MHHPxaGFa 6KGECV916VjY0ck32KmzbnpamUbdgw== =MOwN -----END PGP SIGNATURE----- From morgan.fainberg at gmail.com Tue May 7 15:38:41 2019 From: morgan.fainberg at gmail.com (Morgan Fainberg) Date: Tue, 7 May 2019 08:38:41 -0700 Subject: [keystone] reminder no irc meeting today, may 7 Message-ID: This is a reminder that there will be no weekly irc Keystone meeting this week so that everyone can recover post Summit and PTG [1]. Meetings will resume normally next week on May 14th. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005531.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaypipes at gmail.com Tue May 7 15:39:07 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Tue, 7 May 2019 11:39:07 -0400 Subject: [ops][nova]Logging in nova and other openstack projects In-Reply-To: References: Message-ID: As mentioned in my original response, if you have CONF.log_config_append set to anything, then the other conf options related to logging will be ignored. Best, -jay On Tue, May 7, 2019, 11:15 AM Jean-Philippe Méthot < jp.methot at planethoster.info> wrote: > Hi, > > I’ve just tried setting everything to warn through the nova.conf option > default_log_levels, as suggested. 
However, I’m still getting info level > logs from the resource tracker like this : > > INFO nova.compute.resource_tracker > > Could the compute resource tracker logs be managed by another parameter > than what’s in the default list for that configuration option? > > Best regards, > > Jean-Philippe Méthot > Openstack system administrator > Administrateur système Openstack > PlanetHoster inc. > > > > > Le 7 mai 2019 à 09:02, Jay Pipes a écrit : > > On 05/06/2019 05:56 PM, Jean-Philippe Méthot wrote: > > Hi, > We’ve been modifying our login habits for Nova on our Openstack setup to > try to send only warning level and up logs to our log servers. To do so, > I’ve created a logging.conf and configured logging according to the logging > module documentation. While what I’ve done works, it seems to be a very > convoluted process for something as simple as changing the logging level to > warning. We worry that if we upgrade and the syntax for this configuration > file changes, we may have to push more changes through ansible than we > would like to. > > > It's unlikely that the syntax for the logging configuration file will > change since it's upstream Python, not OpenStack or Nova that is the source > of this syntax. > > That said, if all you want to do is change some or all package default > logging levels, you can change the value of the CONF.default_log_levels > option. > > The default_log_levels CONF option is actually derived from the oslo_log > package that is used by all OpenStack service projects. It's default value > is here: > > > https://github.com/openstack/oslo.log/blob/29671ef2bfacb416d397abc57170bb090b116f68/oslo_log/_options.py#L19-L31 > > So, if you don't want to mess with the standard Python logging conf, you > can just change that CONF.default_log_levels option. Note that if you do > specify a logging config file using a non-None CONF.log_config_append > value, then all other logging configuration options (like > default_log_levels) are ignored). > > Best, > -jay > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Tue May 7 16:08:56 2019 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 7 May 2019 18:08:56 +0200 Subject: [TripleO][Validations] Tag convention In-Reply-To: <3c383d8d-54fa-b054-f0ad-b97ed67ba03f@redhat.com> References: <3c383d8d-54fa-b054-f0ad-b97ed67ba03f@redhat.com> Message-ID: On Tue, May 7, 2019 at 4:44 PM Cédric Jeanneret wrote: > Dear all, > > We're currently working hard in order to provide a nice way to run > validations within a deploy (aka in-flight validations). > > We can already call validations provided by the tripleo-validations > package[1], it's working just fine. > > Now comes the question: "how can we disable the validations?". In order > to do that, we propose to use a standard tag in the ansible > roles/playbooks, and to add a "--skip-tags " when we disable the > validations via the CLI or configuration. > > After a quick check in the tripleoclient code, there apparently is a tag > named "validation", that can already be skipped from within the client. > > So, our questions: > - would the reuse of "validation" be OK? > - if not, what tag would be best in order to avoid confusion? > > We also have the idea to allow to disable validations per service. For > this, we propose to introduce the following tag: > - validation-, like "validation-nova", "validation-neutron" and > so on > > What do you think about those two additions? 
> Such as variables, I think we should prefix all our variables and tags with tripleo_ or something, to differentiate them from any other playbooks our operators could run. I would rather use "tripleo_validations" and "tripleo_validation_nova" maybe. Wdyt? -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Tue May 7 16:24:42 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 7 May 2019 12:24:42 -0400 Subject: [TripleO][Validations] Tag convention In-Reply-To: References: <3c383d8d-54fa-b054-f0ad-b97ed67ba03f@redhat.com> Message-ID: On Tue, May 7, 2019 at 12:12 PM Emilien Macchi wrote: > > > > On Tue, May 7, 2019 at 4:44 PM Cédric Jeanneret wrote: >> >> Dear all, >> >> We're currently working hard in order to provide a nice way to run >> validations within a deploy (aka in-flight validations). >> >> We can already call validations provided by the tripleo-validations >> package[1], it's working just fine. >> >> Now comes the question: "how can we disable the validations?". In order >> to do that, we propose to use a standard tag in the ansible >> roles/playbooks, and to add a "--skip-tags " when we disable the >> validations via the CLI or configuration. >> >> After a quick check in the tripleoclient code, there apparently is a tag >> named "validation", that can already be skipped from within the client. >> >> So, our questions: >> - would the reuse of "validation" be OK? >> - if not, what tag would be best in order to avoid confusion? >> >> We also have the idea to allow to disable validations per service. For >> this, we propose to introduce the following tag: >> - validation-, like "validation-nova", "validation-neutron" and >> so on >> >> What do you think about those two additions? > > > Such as variables, I think we should prefix all our variables and tags with tripleo_ or something, to differentiate them from any other playbooks our operators could run. > I would rather use "tripleo_validations" and "tripleo_validation_nova" maybe. Just chiming in here.. the pattern we like in OSA is using dashes for tags, I think having something like 'tripleo-validations' and 'tripleo-validations-nova' etc > Wdyt? > -- > Emilien Macchi -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From sundar.nadathur at intel.com Tue May 7 16:50:00 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Tue, 7 May 2019 16:50:00 +0000 Subject: [cyborg] No meetings this week Message-ID: <1CC272501B5BC543A05DB90AA509DED527557514@fmsmsx122.amr.corp.intel.com> Many of our developers are either jetlagged or have other conflicts, and prefer to reconvene later. Regards, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeremyfreudberg at gmail.com Tue May 7 17:22:44 2019 From: jeremyfreudberg at gmail.com (Jeremy Freudberg) Date: Tue, 7 May 2019 13:22:44 -0400 Subject: [sahara] Cancelling Sahara meeting May 9 Message-ID: Hi all, There will be no Sahara meeting this upcoming Thursday, May 9. Holler if you need anything. Thanks, Jeremy From alifshit at redhat.com Tue May 7 17:47:01 2019 From: alifshit at redhat.com (Artom Lifshitz) Date: Tue, 7 May 2019 13:47:01 -0400 Subject: [nova][CI] GPUs in the gate Message-ID: Hey all, Following up on the CI session during the PTG [1], I wanted to get the ball rolling on getting GPU hardware into the gate somehow. 
Initially the plan was to do it through OpenLab and by convincing NVIDIA to donate the cards, but after a conversation with Sean McGinnis it appears Infra have access to machines with GPUs. >From Nova's POV, the requirements are: * The machines with GPUs should probably be Ironic baremetal nodes and not VMs [*]. * The GPUs need to support virtualization. It's hard to get a comprehensive list of GPUs that do, but Nova's own docs [2] mention two: Intel cards with GVT [3] and NVIDIA GRID [4]. So I think at this point the question is whether Infra can support those reqs. If yes, we can start concrete steps towards getting those machines used by a CI job. If not, we'll fall back to OpenLab and try to get them hardware. [*] Could we do double-passthrough? Could the card be passed through to the L1 guest via the PCI passthrough mechanism, and then into the L2 guest via the mdev mechanism? [1] https://etherpad.openstack.org/p/nova-ptg-train-ci [2] https://docs.openstack.org/nova/rocky/admin/virtual-gpu.html [3] https://01.org/igvt-g [4] https://docs.nvidia.com/grid/5.0/pdf/grid-vgpu-user-guide.pdf From dtantsur at redhat.com Tue May 7 17:47:57 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 7 May 2019 19:47:57 +0200 Subject: [ironic] My PTG & Forum notes Message-ID: <7313c6aa-1693-2cb0-4ed9-a73646764070@redhat.com> Hi folks, I've published my personal notes from the PTG & Forum in Denver: https://dtantsur.github.io/posts/ironic-denver-2019/ They're probably opinionated and definitely not complete, but I still think they could be useful. Also pasting the whole raw RST text below for ease of commenting. Cheers, Dmitry Keynotes ======== The `Metal3`_ project got some spotlight during the keynotes. A (successful!) `live demo`_ was done that demonstrated using Ironic through Kubernetes API to drive provisioning of bare metal nodes. The official `bare metal program`_ was announced to promote managing bare metal infrastructure via OpenStack. Forum: standalone Ironic ======================== On Monday we had two sessions dedicated to the future development of standalone Ironic (without Nova or without any other OpenStack services). During the `standalone roadmap session`_ the audience identified two potential domains where we could provide simple alternatives to depending on OpenStack services: * Alternative authentication. It was mentioned, however, that Keystone is a relatively easy service to install and operate, so adding this to Ironic may not be worth the effort. * Multi-tenant networking without Neutron. We could use networking-ansible_ directly, since they are planning on providing a Python API independent of their ML2 implementation. Next, firmware update support was a recurring topic (also in hallway conversations and also in non-standalone context). Related to that, a driver feature matrix documentation was requested, so that such driver-specific features are easier to discover. Then we had a separate `API multi-tenancy session`_. Three topic were covered: * Wiring in the existing ``owner`` field for access control. The idea is to allow operations for non-administrator users only to nodes with ``owner`` equal to their project (aka tenant) ID. In the non-keystone context this field would stay free-form. We did not agree whether we need an option to enable this feature. An interesting use case was mentioned: assign a non-admin user to Nova to allocate it only a part of the bare metal pool instead of all nodes. 
We did not reach a consensus on using a schema with the ``owner`` field, e.g. where ``keystone://{project ID}`` represents a Keystone project ID. * Adding a new field (e.g. ``deployed_by``) to track a user that requested deploy for auditing purposes. We agreed that the ``owner`` field should not be used for this purpose, and overall it should never be changed automatically by Ironic. * Adding some notion of *node leased to*, probably via a new field. This proposal was not well defined during the session, but we probably would allow some subset of API to lessees using the policy mechanism. It became apparent that implementing a separate *deployment API endpoint* is required to make such policy possible. Creating the deployment API was identified as a potential immediate action item. Wiring the ``owner`` field can also be done in the Train cycle, if we find volunteers to push it forward. PTG: scientific SIG =================== The PTG started for me with the `Scientific SIG discussions`_ of desired features and fixes in Ironic. The hottest topic was reducing the deployment time by reducing the number of reboots that are done during the provisioning process. `Ramdisk deploy`_ was identified as a very promising feature to solve this, as well as enable booting from remote volumes not supported directly by Ironic and/or Cinder. A few SIG members committed to testing it as soon as possible. Two related ideas were proposed for later brainstorming: * Keeping some proportion of nodes always on and with IPA booted. This is basing directly on the `fast-track deploy`_ work completed in the Stein cycle. A third party orchestrator would be needed for keeping the percentage, but Ironic will have to provide an API to boot an ``available`` node into the ramdisk. * Allow using *kexec* to instantly switch into a freshly deployed operating system. Combined together, these features can allow zero-reboot deployments. PTG: Ironic =========== Community sustainability ------------------------ We seem to have a disbalance in reviews, with very few people handling the majority of reviews, and some of them are close to burning out. * The first thing we discussed is simplifying the specs process. We considered a single +2 approval for specs and/or documentation. Approving documentation cannot break anyone, and follow-ups are easy, so it seems a good idea. We did not reach a firm agreement on a single +2 approval for specs; I personally feel that it would only move the bottleneck from specs to the code. * Facilitating deprecated feature removals can help clean up the code, and it can often be done by new contributors. We would like to maintain a list of what can be removed when, so that we don't forget it. * We would also like to switch to single +2 for stable backports. This needs changing the stable policy, and Tony volunteered to propose it. We felt that we're adding cores at a good pace, Julia had been mentoring people that wanted it. We would like people to volunteer, then we can mentor them into core status. However, we were not so sure we wanted to increase the stable core team. This team is supposed to be a small number of people that know quite a few small details of the stable policy (e.g. requirements changes). We thought we should better switch to single +2 approval for the existing team. Then we discussed moving away from WSME, which is barely maintained by a team of not really interested individuals. The proposal was to follow the example of Keystone and just move to Flask. 
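For a sense of the shape this might take, here is a purely illustrative
sketch (not something agreed in the room, and every name in it is
hypothetical) of a Flask view whose request body is validated by a JSON
schema instead of a WSME object:

.. code-block:: python

    # Purely illustrative: a Flask view whose request body is checked by
    # jsonschema, standing in for what a WSME object used to validate.
    from flask import Flask, jsonify, request
    import jsonschema

    app = Flask(__name__)

    NODE_SCHEMA = {
        'type': 'object',
        'properties': {
            'name': {'type': 'string'},
            'driver': {'type': 'string'},
        },
        'required': ['driver'],
        'additionalProperties': False,
    }

    @app.route('/v1/nodes', methods=['POST'])
    def create_node():
        body = request.get_json(force=True)
        try:
            jsonschema.validate(body, NODE_SCHEMA)
        except jsonschema.ValidationError as exc:
            return jsonify(error=str(exc)), 400
        # hand the validated body off to the rest of the service here
        return jsonify(body), 201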
We can use ironic-inspector as an example, and probably migrate part by part. JSON schema could replace WSME objects, similarly to how Nova does it. I volunteered to come up with a plan to switch, and some folks from Intel expressed interest in participating. Standalone roadmap ------------------ We started with a recap of items from `Forum: standalone Ironic`_. While discussing creating a driver matrix, we realized that we could keep driver capabilities in the source code (similar to existing iSCSI boot) and generate the documentation from it. Then we could go as far as exposing this information in the API. During the multi-tenancy discussion, the idea of owner and lessee fields was well received. Julia volunteered to write a specification for that. We clarified the following access control policies implemented by default: * A user can list or show nodes if they are an administrator, an owner of a node or a leaser of this node. * A user can deploy or undeploy a node (through the future deployment API) if they are an administrator, an owner of this node or a lessee of this node. * A user can update a node or any of its resources if they are an administrator or an owner of this node. A lessee of a node can **not** update it. The discussion of recording the user that did a deployment turned into discussing introducing a searchable log of changes to node power and provision states. We did not reach a final consensus on it, and we probably need a volunteer to push this effort forward. Deploy steps continued ---------------------- This session was dedicated to making the deploy templates framework more usable in practice. * We need to implement support for in-band deploy steps (other than the built-in ``deploy.deploy`` step). We probably need to start IPA before proceeding with the steps, similarly to how it is done with cleaning. * We agreed to proceed with splitting the built-in core step, making it a regular deploy step, as well as removing the compatibility shim for drivers that do not support deploy steps. We will probably separate writing an image to disk, writing a configdrive and creating a bootloader. The latter could be overridden to provide custom kernel parameters. * To handle potential differences between deploy steps in different hardware types, we discussed the possibility of optionally including a hardware type or interface name in a clean step. Such steps will only be run for nodes with matching hardware type or interface. Mark and Ruby volunteered to write a new spec on these topics. Day 2 operational workflow -------------------------- For deployments with external health monitoring, we need a way to represent the state when a deployed node looks healthy from our side but is detected as failed by the monitoring. It seems that we could introduce a new state transition from ``active`` to something like ``failed`` or ``quarantined``, where a node is still deployed, but explicitly marked as at fault by an operator. On unprovisioning, this node would not become ``available`` automatically. We also considered the possibility of using a flag instead of a new state, although the operators in the room were more in favor of using a state. We largely agreed that the already overloaded ``maintenance`` flag should not be used for this. On the Nova side we would probably use the ``error`` state to reflect nodes in the new state. A very similar request had been done for node retirement support. We decided to look for a unified solution. 
DHCP-less deploy ---------------- We discussed options to avoid relying on DHCP for deploying. * An existing specification proposes attaching IP information to virtual media. The initial contributors had become inactive, so we decided to help this work to go through. Volunteers are welcome. * As an alternative to that, we discussed using IPv6 SLAAC with multicast DNS (routed across WAN for Edge cases). A couple of folks on the room volunteered to help with testing. We need to fix python-zeroconf_ to support IPv6, which is something I'm planning on. Nova room --------- In a cross-project discussion with the Nova team we went through a few topics: * Whether Nova should use new Ironic API to build config drives. Since Ironic is not the only driver building config drives, we agreed that it probably doesn't make much sense to change that. * We did not come to a conclusion on deprecating capabilities. We agreed that Ironic has to provide alternatives for ``boot_option`` and ``boot_mode`` capabilities first. These will probably become deploy steps or built-in traits. * We agreed that we should switch Nova to using *openstacksdk* instead of *ironicclient* to access Ironic. This work had already been in progress. Faster deploy ------------- We followed up to `PTG: scientific SIG`_ with potential action items on speeding up the deployment process by reducing the number of reboots. We discussed an ability to keep all or some nodes powered on and heartbeating in the ``available`` state: * Add an option to keep the ramdisk running after cleaning. * For this to work with multi-tenant networking we'll need an IPA command to reset networking. * Add a provisioning verb going from ``available`` to ``available`` booting the node into IPA. * Make sure that pre-booted nodes are prioritized for scheduling. We will probably dynamically add a special trait. Then we'll have to update both Nova/Placement and the allocation API to support preferred (optional) traits. We also agreed that we could provide an option to *kexec* instead of rebooting as an advanced deploy step for operators that really know their hardware. Multi-tenant networking can be tricky in this case, since there is no safe point to switch from deployment to tenant network. We will probably take a best effort approach: command IPA to shutdown all its functionality and schedule a *kexec* after some time. After that, switch to tenant networks. This is not entirely secure, but will probably fit the operators (HPC) who requests it. Asynchronous clean steps ------------------------ We discussed enhancements for asynchronous clean and deploy steps. Currently running a step asynchronously requires either polling in a loop (occupying a green thread) or creating a new periodic task in a hardware type. We came up with two proposed updates for clean steps: * Allow a clean step to request re-running itself after certain amount of time. E.g. a clean step would do something like .. code-block:: python @clean_step(...) def wait_for_raid(self): if not raid_is_ready(): return RerunAfter(60) and the conductor would schedule re-running the same step in 60 seconds. * Allow a clean step to spawn more clean steps. E.g. a clean step would do something like .. code-block:: python @clean_step(...) def create_raid_configuration(self): start_create_raid() return RunNext([{'step': 'wait_for_raid'}]) and the conductor would insert the provided step to ``node.clean_steps`` after the current one and start running it. This would allow for several follow-up steps as well. 
A use case is a clean step for resetting iDRAC to a clean state that in turn consists of several other clean steps. The idea of sub-steps was deemed too complicated. PTG: TripleO ============ We discussed our plans for removing Nova from the TripleO undercloud and moving bare metal provisioning from under control of Heat. The plan from the `nova-less-deploy specification`_, as well as the current state of the implementation, were presented. The current concerns are: * upgrades from a Nova based deployment (probably just wipe the Nova database), * losing user experience of ``nova list`` (largely compensated by ``metalsmith list``), * tracking IP addresses for networks other than *ctlplane* (solved the same way as for deployed servers). The next action item is to create a CI job based on the already merged code and verify a few assumptions made above. PTG: Ironic, Placement, Blazar ============================== We reiterated over our plans to allow Ironic to optionally report nodes to Placement. This will be turned off when Nova is present to avoid conflicts with the Nova reporting. We will optionally use Placement as a backend for Ironic allocation API (which is something that had been planned before). Then we discussed potentially exposing detailed bare metal inventory to Placement. To avoid partial allocations, Placement could introduce new API to consume the whole resource provider. Ironic would use it when creating an allocation. No specific commitments were made with regards to this idea. Finally we came with the following workflow for bare metal reservations in Blazar: #. A user requests a bare metal reservation from Blazar. #. Blazar fetches allocation candidates from Placement. #. Blazar fetches a list of bare metal nodes from Ironic and filters out allocation candidates, whose resource provider UUID does not match one of the node UUIDs. #. Blazar remembers the node UUID and returns the reservation UUID to the user. When the reservation time comes: #. Blazar creates an allocation in Ironic (not Placement) with the candidate node matching previously picked node and allocation UUID matching the reservation UUID. #. When the enhancements in `Standalone roadmap`_ are implemented, Blazar will also set the node's lessee field to the user ID of the reservation, so that Ironic allows access to this node. #. A user fetches an Ironic allocation corresponding to the Blazar reservation UUID and learns the node UUID from it. #. A user proceeds with deploying the node. Side and hallway discussions ============================ * We discussed having Heat resources for Ironic. We recommended the team to start with Allocation and Deployment resources (the latter being virtual until we implement the planned deployment API). * We prototyped how Heat resources for Ironic could look, including Node, Port, Allocation and Deployment as a first step. .. _Metal3: http://metal3.io .. _live demo: https://www.openstack.org/videos/summits/denver-2019/openstack-ironic-and-bare-metal-infrastructure-all-abstractions-start-somewhere .. _bare metal program: https://www.openstack.org/bare-metal/ .. _standalone roadmap session: https://etherpad.openstack.org/p/DEN-train-next-steps-for-standalone-ironic .. _networking-ansible: https://opendev.org/x/networking-ansible .. _API multi-tenancy session: https://etherpad.openstack.org/p/DEN-train-ironic-multi-tenancy .. _Scientific SIG discussions: https://etherpad.openstack.org/p/scientific-sig-ptg-train .. 
_Ramdisk deploy: https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html#ramdisk-deploy .. _fast-track deploy: https://storyboard.openstack.org/#!/story/2004965 .. _python-zeroconf: https://github.com/jstasiak/python-zeroconf .. _nova-less-deploy specification: http://specs.openstack.org/openstack/tripleo-specs/specs/stein/nova-less-deploy.html From aspiers at suse.com Tue May 7 18:16:14 2019 From: aspiers at suse.com (Adam Spiers) Date: Tue, 7 May 2019 19:16:14 +0100 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: References: Message-ID: <20190507181614.2s3qb3gopzvryt7o@pacific.linksys.moosehall> Morgan Fainberg wrote: >On Sat, May 4, 2019, 16:48 Eric Fried wrote: >> (NB: I tagged [all] because it would be interesting to know where other >> teams stand on this issue.) >> >> Etherpad: https://etherpad.openstack.org/p/nova-ptg-train-governance I didn't pipe up during the PTG discussion because a) I missed the first 5-10 minutes and hence probably some important context, and b) I've not been a nova contributor long enough to be well-informed on this topic. Apologies if that was the wrong decision. But I do have a few thoughts on this, which I'll share below. Given b), take them with a pinch of salt ;-) Firstly, I was impressed with the way this topic was raised and discussed, and I think that is a very encouraging indicator for the current health of nova contributor culture. We're in a good place :-) >> Summary: >> - There is a (currently unwritten? at least for Nova) rule that a patch >> should not be approved exclusively by cores from the same company. This >> is rife with nuance, including but not limited to: >> - Usually (but not always) relevant when the patch was proposed by >> member of same company >> - N/A for trivial things like typo fixes >> - The issue is: >> - Should the rule be abolished? and/or >> - Should the rule be written down? >> >> Consensus (not unanimous): [snipped] >Keystone used to have the same policy outlined in this email (with much of >the same nuance and exceptions). Without going into crazy details (as the >contributor and core numbers went down), we opted to really lean on "Overall, >we should be able to trust cores to act in good faith". We abolished the >rule and the cores always ask for outside input when the familiarity lies >outside of the team. We often also pull in cores more familiar with the >code sometimes ending up with 3x+2s before we workflow the patch. > >Personally I don't like the "this is an >unwritten rule and it shouldn't be documented"; if documenting and >enforcement of the rule elicits worry of gaming the system or being a dense >some not read, in my mind (and experience) the rule may not be worth >having. I voice my opinion with the caveat that every team is different. If >the rule works, and helps the team (Nova in this case) feel more confident >in the management of code, the rule has a place to live on. What works for >one team doesn't always work for another. +1 - I'm not wildly enthusiastic about the "keep it undocumented" approach either. Here's my stab at handling some of the objections to a written policy. >> - The rule should not be documented (this email notwithstanding). This >> would either encourage loopholing I don't see why the presence of a written rule would encourage people to deliberately subvert upstream trust any more than they might otherwise do. And a rule with loopholes is still a better deterrent than no rule at all. 
This is somewhat true for deliberate subversions of trust (which I 
expect are non-existent or at least extremely rare), but especially 
true for accidental subversions of trust which could otherwise happen 
quite easily due to not fully understanding how upstream works. 

>> It would also *require* enforcement, which 
>> is difficult and awkward. Overall, we should be able to trust cores to 
>> act in good faith and in the appropriate spirit. 

I agree that enforcement would be difficult and awkward, and that we 
should be able to trust cores.  But in the unlikely and unfortunate 
situation that a problem arose in this space, the lack of a written 
policy wouldn't magically solve that problem.  In fact, it would make 
it even *harder* to deal with, because there'd be nothing to point to 
in order to help explain to the offender what they were doing wrong. 
That would automatically make any judgement appear more subjective 
than objective, and therefore more prone to being taken personally. 

From pawel.konczalski at everyware.ch  Tue May  7 19:10:54 2019
From: pawel.konczalski at everyware.ch (Pawel Konczalski)
Date: Tue, 7 May 2019 21:10:54 +0200
Subject: Magnum Kubernetes openstack-cloud-controller-manager unable to resolve master node by DNS
Message-ID: 

Hi,

I am trying to deploy a Kubernetes cluster with OpenStack Magnum, but the
openstack-cloud-controller-manager pod fails to resolve the master node
hostname.

Does Magnum require further parameters to configure the DNS names of the
master and minions?

DNS resolution in the VMs works fine. Currently there is no Designate
installed in the OpenStack setup.
openstack coe cluster template create kubernetes-cluster-template1 \   --image Fedora-AtomicHost-29-20190429.0.x86_64 \   --external-network public \   --dns-nameserver 8.8.8.8 \   --master-flavor m1.kubernetes \   --flavor m1.kubernetes \   --coe kubernetes \   --volume-driver cinder \   --network-driver flannel \   --docker-volume-size 25 openstack coe cluster create kubernetes-cluster1 \   --cluster-template kubernetes-cluster-template1 \   --master-count 1 \   --node-count 2 \   --keypair mykey # kubectl get pods --all-namespaces -o wide NAMESPACE     NAME                                       READY STATUS             RESTARTS   AGE       IP NODE                                         NOMINATED NODE kube-system   coredns-78df4bf8ff-mjp2c                   0/1 Pending            0          36m                                              kube-system   heapster-74f98f6489-tgtzl                  0/1 Pending            0          36m                                              kube-system   kube-dns-autoscaler-986c49747-wrvz4        0/1 Pending            0          36m                                              kube-system   kubernetes-dashboard-54cb7b5997-sk5pj      0/1 Pending            0          36m                                              kube-system   openstack-cloud-controller-manager-dgk64   0/1 CrashLoopBackOff   11         36m       kubernetes-cluster1-vulg5fz6hg2n-master-0   # kubectl -n kube-system logs openstack-cloud-controller-manager-dgk64 Error from server: Get https://kubernetes-cluster1-vulg5fz6hg2n-master-0:10250/containerLogs/kube-system/openstack-cloud-controller-manager-dgk64/openstack-cloud-controller-manager: dial tcp: lookup kubernetes-cluster1-vulg5fz6hg2n-master-0 on 8.8.8.8:53: no such host BR Pawel -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5227 bytes Desc: not available URL: From jungleboyj at gmail.com Tue May 7 20:06:10 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 7 May 2019 15:06:10 -0500 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: References: Message-ID: All, Cinder has been working with the same unwritten rules for quite some time as well with minimal issues. I think the concerns about not having it documented are warranted.  We have had question about it in the past with no documentation to point to.  It is more or less lore that has been passed down over the releases.  :-) At a minimum, having this e-mail thread is helpful.  If, however, we decide to document it I think we should have it consistent across the teams that use the rule.  I would be happy to help draft/review any such documentation. Jay On 5/4/2019 8:19 PM, Morgan Fainberg wrote: > > > On Sat, May 4, 2019, 16:48 Eric Fried wrote: > > (NB: I tagged [all] because it would be interesting to know where > other > teams stand on this issue.) > > Etherpad: https://etherpad.openstack.org/p/nova-ptg-train-governance > > Summary: > - There is a (currently unwritten? at least for Nova) rule that a > patch > should not be approved exclusively by cores from the same company. > This > is rife with nuance, including but not limited to: >   - Usually (but not always) relevant when the patch was proposed by > member of same company >   - N/A for trivial things like typo fixes > - The issue is: >   - Should the rule be abolished? and/or >   - Should the rule be written down? > > Consensus (not unanimous): > - The rule should not be abolished. 
There are cases where both the > impetus and the subject matter expertise for a patch all reside within > one company. In such cases, at least one core from another company > should still be engaged and provide a "procedural +2" - much like > cores > proxy SME +1s when there's no core with deep expertise. > - If there is reasonable justification for bending the rules (e.g. > typo > fixes as noted above, some piece of work clearly not related to the > company's interest, unwedging the gate, etc.) said justification > should > be clearly documented in review commentary. > - The rule should not be documented (this email notwithstanding). This > would either encourage loopholing or turn into a huge detailed legal > tome that nobody will read. It would also *require* enforcement, which > is difficult and awkward. Overall, we should be able to trust cores to > act in good faith and in the appropriate spirit. > > efried > . > > > Keystone used to have the same policy outlined in this email (with > much of the same nuance and exceptions). Without going into crazy > details (as the contributor and core numbers went down), we opted to > really lean on "Overall, we should be able to trust cores to act in > good faith". We abolished the rule and the cores always ask for > outside input when the familiarity lies outside of the team. We often > also pull in cores more familiar with the code sometimes ending up > with 3x+2s before we workflow the patch. > > Personally I don't like the "this is an > unwritten rule and it shouldn't be documented"; if documenting and > enforcement of the rule elicits worry of gaming the system or being a > dense some not read, in my mind (and experience) the rule may not be > worth having. I voice my opinion with the caveat that every team is > different. If the rule works, and helps the team (Nova in this case) > feel more confident in the management of code, the rule has a place to > live on. What works for one team doesn't always work for another. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Tue May 7 20:22:25 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 7 May 2019 15:22:25 -0500 Subject: [cinder][ops] Nested Quota Driver Use? In-Reply-To: <20190507142046.GA3999@sm-workstation> References: <20190502003249.GA1432@sm-workstation> <20190507142046.GA3999@sm-workstation> Message-ID: On 5/7/2019 9:20 AM, Sean McGinnis wrote: > On Fri, May 03, 2019 at 06:58:41PM +0000, Tim Bell wrote: >> We're interested in the overall functionality but I think unified limits is the place to invest and thus would not have any problem deprecating this driver. >> >> We'd really welcome this being implemented across all the projects in a consistent way. The sort of functionality proposed in https://techblog.web.cern.ch/techblog/post/nested-quota-models/ would need Nova/Cinder/Manila at miniumum for CERN to switch. >> >> So, no objections to deprecation but strong support to converge on unified limits. >> >> Tim >> > Thanks Tim, that helps. > > Since there wasn't any other feedback, and no one jumping up to say they are > using it today, I have submitted https://review.opendev.org/657511 to > deprecated the current quota driver so we don't have to try to refactor that > functionality into whatever we need to do for the unified limits support. > > If anyone has any concerns about this plan, please feel free to raise them here > or on that review. > > Thanks! 
> Sean Sean, If I remember correctly, IBM had put some time into trying to fix the nested quota driver back around the Kilo or Liberty release. I haven't seen much activity since then. I am in support deprecating the driver and going to unified limits given that that appears to be the general direction of OpenStack. Jay From mthode at mthode.org Tue May 7 20:30:22 2019 From: mthode at mthode.org (Matthew Thode) Date: Tue, 7 May 2019 15:30:22 -0500 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 Message-ID: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> Hi all, This is a warning and call to test the requests updates linked below. The best way to test is to make a dummy review in your project that depends on the linked review (either Pike or Queens). Upstream has no intrest or (easy) ability to backport the patch. Please let us know either in the the #openstack-requirements channel or in this email thread if you have issues. Pike - 2.18.2 -> 2.20.1 - https://review.opendev.org/640727 Queens - 2.18.4 -> 2.20.1 - https://review.opendev.org/640710 -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From rodrigodsousa at gmail.com Tue May 7 20:30:51 2019 From: rodrigodsousa at gmail.com (Rodrigo Duarte) Date: Tue, 7 May 2019 13:30:51 -0700 Subject: [dev][keystone][ptg] Keystone team action items In-Reply-To: References: Message-ID: Thanks for the summary, Colleen. On Sun, May 5, 2019 at 8:59 AM Colleen Murphy wrote: > Hi everyone, > > I will write an in-depth summary of the Forum and PTG some time in the > coming week, but I wanted to quickly capture all the action items that came > out of the last six days so that we don't lose too much focus: > > Colleen > * move "Expand endpoint filters to Service Providers" spec[1] to attic > * review "Policy Goals"[2] and "Policy Security Roadmap"[3] specs with > Lance, refresh and possibly combine them > * move "Unified model for assignments, OAuth, and trusts" spec[4] from > ongoing to backlog, and circle up with Adam about refreshing it > * update app creds spec[5] to defer access_rules_config > * review app cred documentation with regard to proactive rotation > * follow up with nova/other service teams on need for microversion support > in access rules > * circle up with Guang on fixing autoprovisioning for tokenless auth > * keep up to date with IEEE/NIST efforts on standardizing federation > * investigate undoing the foreign key constraint that breaks the pluggable > resource driver > * propose governance change to add caching as a base service > * clean out deprecated cruft from keystonemiddleware > * write up Outreachy/other internship application tasks > > [1] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/backlog/service-providers-filters.html > [2] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/policy-goals.html > [3] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/policy-security-roadmap.html > [4] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/unified-delegation.html > [5] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/train/capabilities-app-creds.html > > Lance > * write up plan for tempest testing of system scope > * break up unified limits testing plan into separate items, one for CRUD > in keystone and one for quota 
and limit validation in oslo.limit[6] > * write up spec for assigning roles on root domain > * (with Morgan) check for and add interface in oslo.policy to see if > policy has been overridden > > [6] https://trello.com/c/kbKvhYBz/20-test-unified-limits-in-tempest > > Kristi > * finish mutable config patch > * propose "model-timestamps" spec for Train[7] > * move "Add Multi-Version Support to Federation Mappings" spec[8] to attic > * review and possibly complete "Devstack Plugin for Keystone" spec[9] > * look into "RFE: Improved OpenID Connect Support" spec[10] > * update refreshable app creds spec[11] to make federated users expire > rather then app creds > * deprecate federated_domain_name > > [7] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/backlog/model-timestamps.html > [8] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/backlog/versioned-mappings.html > [9] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/devstack-plugin.html > [10] https://bugs.launchpad.net/keystone/+bug/1815971 > [11] https://review.opendev.org/604201 > > Vishakha > * investigate effort needed for Alembic migrations spec[12] (with help > from Morgan) > * merge "RFE: Retrofit keystone-manage db_* commands to work with > Alembic"[13] into "Use Alembic for database migrations" spec > * remove deprecated [signing] config > * remove deprecated [DEFAULT]/admin_endpoint config > * remove deprecated [token]/infer_roles config > > [12] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/backlog/alembic.html > [13] https://bugs.launchpad.net/keystone/+bug/1816158 > > Morgan > * review "Materialize Project Hierarchy" spec[14] and make sure it > reflects the current state of the world, keep it in the backlog > * move "Functional Testing" spec[15] to attic > * move "Object Dependency Lifecycle" spec[16] to complete > * move "Add Endpoint Filter Enforcement to Keystonemiddleware" spec[17] to > attic > * move "Request Helpers" spec[18] to attic > * create PoC of external IdP proxy component > * (with Lance) check for and add interface in oslo.policy to see if policy > has been overridden > * investigate removing [eventlet_server] config section > * remove remaining PasteDeploy things > * remove PKI(Z) cruft from keystonemiddleware > * refactor keystonemiddleware to have functional components instead of > needing keystone to instantiate keystonemiddleware objects for auth > > [14] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/backlog/materialize-project-hierarchy.html > [15] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/functional-testing.html > [16] > http://specs.openstack.org/openstack/keystone-specs/specs/keystone/ongoing/object-dependency-lifecycle.html > [17] > http://specs.openstack.org/openstack/keystone-specs/specs/keystonemiddleware/backlog/endpoint-enforcement-middleware.html > [18] > http://specs.openstack.org/openstack/keystone-specs/specs/keystonemiddleware/backlog/request-helpers.html > > Gage > * investigate with operators about specific use case behind "RFE: > Whitelisting (opt-in) users/projects/domains for PCI compliance"[19] request > * follow up on "RFE: Token returns Project's tag properties"[20] > * remove use of keystoneclient from keystonemiddleware > > [19] https://bugs.launchpad.net/keystone/+bug/1637146 > [20] https://bugs.launchpad.net/keystone/+bug/1807697 > > Rodrigo > * Propose finishing "RFE: Project Tree Deletion/Disabling"[21] as an > Outreachy project > > 
[21] https://bugs.launchpad.net/keystone/+bug/1816105 > > Adam > * write up super-spec on explicit project IDs plus predictable IDs > > > Thanks everyone for a productive week and for all your hard work! > > Colleen > > -- Rodrigo http://rodrigods.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From mthode at mthode.org Tue May 7 20:35:36 2019 From: mthode at mthode.org (Matthew Thode) Date: Tue, 7 May 2019 15:35:36 -0500 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> Message-ID: <20190507203536.w7uf2kh6qpvkhcgy@mthode.org> On 19-05-07 15:30:22, Matthew Thode wrote: > Hi all, > > This is a warning and call to test the requests updates linked below. > The best way to test is to make a dummy review in your project that > depends on the linked review (either Pike or Queens). Upstream has no > intrest or (easy) ability to backport the patch. > > Please let us know either in the the #openstack-requirements channel or > in this email thread if you have issues. > > Pike - 2.18.2 -> 2.20.1 - https://review.opendev.org/640727 > Queens - 2.18.4 -> 2.20.1 - https://review.opendev.org/640710 > Forgot to set the timeline for merging those reviews, the current plan is to merge them Tuesday Morning (May 14th) either EU or US time. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From johnsomor at gmail.com Tue May 7 20:39:20 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 7 May 2019 13:39:20 -0700 Subject: OpenStack User Survey 2019 In-Reply-To: <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> Message-ID: Jimmy & Allison, As you probably remember from previous year's surveys, the Octavia team has been trying to get a question included in the survey for a while. I have included the response we got the last time we inquired about the survey below. We never received a follow up invitation. I think it would be in the best interest for the community if we follow our "Four Opens" ethos in the user survey process, specifically the "Open Community" statement, by soliciting survey questions from the project teams in an open forum such as the openstack-discuss mailing list. Michael ----- Last response e-mail ------ Jimmy McArthur Fri, Sep 7, 2018, 5:51 PM to Allison, me Hey Michael, The project-specific questions were added in 2017, so likely didn't include some new projects. While we asked all projects to participate initially, less than a dozen did. We will be sending an invitation for new/underrepresented projects in the coming weeks. Please stand by and know that we value your feedback and that of the community. Cheers! On Sat, Apr 27, 2019 at 5:11 PM Allison Price wrote: > > Hi Michael, > > We reached out to all of the PTLs who had questions in the 2018 version of the survey to review and update their questions. If there is a project that was missed, we can add it and share anonymized results with the PTLs directly as well as the openstack-discsuss mailing list. > > If there is a question from the Octavia team, please let us know and we can add it for the 2019 survey. 
> > Cheers, > Allison > > > > On Apr 27, 2019, at 4:01 PM, Michael Johnson wrote: > > Jimmy, > > I am curious, how did you reach out the PTLs for project specific > questions? The Octavia team didn't receive any e-mail from you or > Allison on the topic. > > Michael > > From allison at openstack.org Tue May 7 20:50:10 2019 From: allison at openstack.org (Allison Price) Date: Tue, 7 May 2019 15:50:10 -0500 Subject: OpenStack User Survey 2019 In-Reply-To: References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> Message-ID: Hi Michael, I apologize that the Octavia project team has been unable to submit a question to date. Jimmy posted the User Survey update to the public mailing list to ensure we updated the entire community and that we caught any projects that had not submitted their questions. The User Survey is open all year, and the primary goal is passing operator feedback to the upstream community. If the Octavia team - or any OpenStack project team - has a question they would like added (limit of 2 per project), please let Jimmy or myself know. Thanks for reaching out, Michael. Cheers, Allison > On May 7, 2019, at 3:39 PM, Michael Johnson wrote: > > Jimmy & Allison, > > As you probably remember from previous year's surveys, the Octavia > team has been trying to get a question included in the survey for a > while. > I have included the response we got the last time we inquired about > the survey below. We never received a follow up invitation. > > I think it would be in the best interest for the community if we > follow our "Four Opens" ethos in the user survey process, specifically > the "Open Community" statement, by soliciting survey questions from > the project teams in an open forum such as the openstack-discuss > mailing list. > > Michael > > ----- Last response e-mail ------ > Jimmy McArthur > > Fri, Sep 7, 2018, 5:51 PM > to Allison, me > Hey Michael, > > The project-specific questions were added in 2017, so likely didn't > include some new projects. While we asked all projects to participate > initially, less than a dozen did. We will be sending an invitation for > new/underrepresented projects in the coming weeks. Please stand by and > know that we value your feedback and that of the community. > > Cheers! > > > >> On Sat, Apr 27, 2019 at 5:11 PM Allison Price wrote: >> >> Hi Michael, >> >> We reached out to all of the PTLs who had questions in the 2018 version of the survey to review and update their questions. If there is a project that was missed, we can add it and share anonymized results with the PTLs directly as well as the openstack-discsuss mailing list. >> >> If there is a question from the Octavia team, please let us know and we can add it for the 2019 survey. >> >> Cheers, >> Allison >> >> >> >> On Apr 27, 2019, at 4:01 PM, Michael Johnson wrote: >> >> Jimmy, >> >> I am curious, how did you reach out the PTLs for project specific >> questions? The Octavia team didn't receive any e-mail from you or >> Allison on the topic. >> >> Michael >> >> From dirk at dmllr.de Tue May 7 20:50:21 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Tue, 7 May 2019 22:50:21 +0200 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> Message-ID: Am Di., 7. 
Mai 2019 um 22:30 Uhr schrieb Matthew Thode : > Pike - 2.18.2 -> 2.20.1 - https://review.opendev.org/640727 > Queens - 2.18.4 -> 2.20.1 - https://review.opendev.org/640710 Specifically it looks like we're already at the next issue, as tracked here: https://github.com/kennethreitz/requests/issues/5065 Any concerns from anyone on these newer urllib3 updates? I guess we'll do them a bit later though. From johnsomor at gmail.com Tue May 7 20:51:56 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 7 May 2019 13:51:56 -0700 Subject: [octavia][taskflow] Adaption of persistence/jobboard (taskflow expert review needed) In-Reply-To: References: Message-ID: Hi Octavia team, Thank you for the great discussion we had at the PTG and via video conference. I am super excited that we can start work on flow resumption. I have updated the Stroyboard story for the Jobboard work based on the discussion: https://storyboard.openstack.org/#!/story/2005072 I tried to break it down into parts that multiple people could work on in parallel. Please feel free to sign up for work you are interested in or to add additional tasks you might think of. Michael On Wed, Apr 24, 2019 at 6:15 AM Anna Taraday wrote: > > Thanks for your feedback! > > The good thing about implementation of taskflow approach is that backend type is set in configs and does not affect code. We can create config settings in a flexible way, so that operators could choose which backend is preferable for their cloud. Just have one option as default for devstack, testing, etc. > Having etcd seems to be a good option, I did some experiments with it several years ago. But my concern here, if we do not have taskflow experts it may take a lot of time to implement it properly in taskflow. > > It is good to hear that some of refactor could be align with other activities and won't just disrupt the main course of work. > Implementing all of this as an alternative controller driver is a great idea! In this case we can have it as experimental feature to gather some user feedback. > > Unfortunately, I'm not attending PTG, but I hope we will find a find to discuss this in IRC. > > On Wed, Apr 24, 2019 at 5:24 AM Michael Johnson wrote: >> >> Thank you Ann for working on this. It has been on our roadmap[1] for some time. >> >> Using Taskflow JobBoard would bring huge value to Octavia by allowing >> sub-flow resumption of tasks. >> >> I inquired about this in the oslo team meeting a few weeks ago and >> sadly it seems that most if not all of the taskflow experts are no >> longer working on OpenStack. This may mean "we" are the current >> Taskflow experts.... >> >> I also inquired about adding etcd as an option for the jobs engine >> implementation. Currently only Zookeeper and Redis are implemented. >> Etcd is attractive as it provides similar functionality (to my limited >> knowledge of what Taskflow needs) and is already an OpenStack base >> service[2]. This may be an additional chunk of work to make this a >> viable option. >> >> The refactor of the flow data storage from oslo.db/sqlalchemy data >> models aligns with some of the work we need to do to make the amphora >> driver a proper Octavia driver. Currently it doesn't fully use the >> provider driver interface data passing. This work could resolve two >> issues at the same time. >> >> It also looks like you have found a reasonable solution to the >> importable flows issue. >> >> I did include this on the topic list for the PTG[3] expecting we would >> need to discuss it there. 
I think we have a number of questions to >> answer on this topic. >> >> 1. Do we have resources to work on this? >> 2. Is Taskflow JobBoard the right solution? Is there alternative we >> could implement without the overhead of JobBoard? Maybe a hybrid >> approach is the right answer. >> 3. Are we ok with requiring either Zookeeper or Redis for this >> functionality? Do we need to implement a TaskFlow driver for etcd? >> 4. Should this be implemented as an alternate controller driver to the >> current implementation? (yes, even the controller is a driver in >> Octavia.) >> >> Are you planning to attend the PTG? If so we can work through these >> questions there, it is already on the agenda. >> If not, we should figure out either how to include you in that >> discussion, or continue the discussion on the mailing list. >> >> Michael >> >> [1] https://wiki.openstack.org/wiki/Octavia/Roadmap >> [2] https://governance.openstack.org/tc/reference/base-services.html >> [3] https://etherpad.openstack.org/p/octavia-train-ptg >> >> On Fri, Apr 19, 2019 at 6:16 AM Anna Taraday wrote: >> > >> > Hello everyone! >> > >> > I was looking at the topic of usage taskflow persistence and jobboard in Octavia [1]. >> > I created a simple PoC to check what should be done to enable this functionality [2] . >> > >> > From what I see, taskflow expects that data, which will be stored in persitence backend/jobboard backend, is a dict or an object easily converted to dicts [3] (error [3.1]) >> > Also functions that creates flow should be importable [4] (error [4.1]). >> > >> > These two points lead to refactor required for Octavia to enable taskflow persistence and jobboard: >> > 1) Convert data which is passed between flows in dicts, at this moment it is db objects with links to other db objects. >> > 2) Create importable flow functions. >> > >> > As far as I see the only OpenStack project which adapted taskflow persistence is poppy [5] >> > >> > I'm looking for taskflow expect to take a look at all this and give some comments - whether I am correct or missing something. >> > >> > Thank you for your time in advance! >> > >> > [1] - https://storyboard.openstack.org/#!/story/2005072 >> > [2] - https://review.openstack.org/#/c/647406 >> > [3] - https://github.com/openstack/taskflow/blob/master/taskflow/persistence/backends/impl_sqlalchemy.py#L458 >> > [3.1] - http://paste.openstack.org/show/749530/ >> > [4] - https://docs.openstack.org/taskflow/latest/_modules/taskflow/engines/helpers.html#save_factory_details >> > [4.1] - http://paste.openstack.org/show/749527/ >> > [5] - https://github.com/openstack/poppy >> > >> > >> > -- >> > Regards, >> > Ann Taraday >> > Mirantis, Inc > > > > -- > Regards, > Ann Taraday > Mirantis, Inc From mriedemos at gmail.com Tue May 7 20:53:30 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 7 May 2019 15:53:30 -0500 Subject: [watcher][qa] Thoughts on performance testing for Watcher Message-ID: <6409b4e4-29af-da6d-1af6-a0d6e753049c@gmail.com> Hi, I'm new to Watcher and would like to do some performance and scale testing in a simulated environment and wondering if anyone can give some pointers on what I could be testing or looking for. If possible, I'd like to be able to just setup a single-node devstack with the nova fake virt driver which allows me to create dozens of fake compute nodes. I could also create multiple cells with devstack, but there gets to be a limit with how much you can cram into a single node 8GB RAM 8VCPU VM (I could maybe split 20 nodes across 2 cells). 
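For concreteness, the kind of devstack local.conf I have in mind is roughly the following - the variable names are from memory, so treat them as assumptions to double-check rather than a recipe:

    [[local|localrc]]
    # fake virt driver, so no real guests are spawned
    VIRT_DRIVER=fake
    # how many fake nova-compute services devstack starts
    NUMBER_FAKE_NOVA_COMPUTE=20
    # optionally spread those fake computes across two cells
    NOVA_NUM_CELLS=2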
I could then create dozens of VMs to fill into those compute nodes. I'm mostly trying to figure out what could be an interesting set of tests. The biggest problem I'm trying to solve with Watcher is optimizing resource utilization, i.e. once the computes hit the Tetris problem and there is some room on some nodes but none of the nodes are fully packed. I was thinking I could simulate this by configuring nova so it spreads rather than packs VMs onto hosts (or just use the chance scheduler which randomly picks a host), using VMs of varying sizes, and then run some audit / action plan (I'm still learning the terminology here) to live migrate the VMs such that they get packed onto as few hosts as possible and see how long that takes. Naturally with devstack using fake nodes and no networking on the VMs, that live migration is basically a noop, but I'm more interested in profiling how long it takes Watcher itself to execute the actions. Once I get to know a bit more about how Watcher works, I could help with optimizing some of the nova-specific stuff using placement [1]. Any advice or guidance here would be appreciated. [1] https://review.opendev.org/#/c/656448/ -- Thanks, Matt From dirk at dmllr.de Tue May 7 21:02:57 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Tue, 7 May 2019 23:02:57 +0200 Subject: [all|requirements|stable] update django 1.x to 1.11.20 Message-ID: Hi, a number of security issues have been fixed for django 1.11.x which is still used by horizon for python 2.x and also optionally for python 3.x. The horizon gate jobs are already using that version: http://logs.openstack.org/46/651546/1/check/horizon-openstack-tox-python3-django111/7f0a6e0/job-output.txt.gz#_2019-04-10_14_22_10_604693 as they install django without using constraints.txt . Any objections to updating the global requirements constraints to match that? Reviewing the django fixes on the 1.11.x closely only shows security and data corruption bugfixes, so it should be pretty good on the risk/benefit trade-off. Thanks, Dirk From jp.methot at planethoster.info Tue May 7 21:31:19 2019 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Tue, 7 May 2019 17:31:19 -0400 Subject: [ops][nova]Logging in nova and other openstack projects In-Reply-To: References: Message-ID: <62034C21-91FC-4A9A-BC4B-47E372EAB925@planethoster.info> Indeed, this is what was written in your original response as well as in the documentation. As a result, it was fairly difficult to miss and I did comment it out before restarting the service. Additionally, as per the configuration I had set up, had the log-config-append option be set, I wouldn’t have any INFO level log in my logs. Hence why I believe it is strange that I have info level logs, when I’ve set default_log_levels like this: default_log_levels = amqp=WARN,amqplib=WARN,boto=WARN,qpid=WARN,sqlalchemy=WARN,suds=WARN,oslo.messaging=WARN,iso8601=WARN,requests.packages.urllib3.connectionpool=WARN,urllib3.connectionpool=WARN,websocket=WARN,requests.packages.urllib3.util.retry=WARN,urllib3.util.retry=WARN,keystonemiddleware=WARN,routes.middleware=WARN,stevedore=WARN,taskflow=WARN,keystoneauth=WARN,oslo.cache=WARN Please understand that I am not doubting that your previous answer normally works. I have seen your presentations at past Openstack summit and know that you are a brilliant individual. However, I can only answer here that, from my observations, this is not working as intended. 
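For reference, my working assumption (and maybe this is where I'm going wrong) is that, with log_config_append left unset, each name=level pair in default_log_levels is simply applied as logging.getLogger(name).setLevel(level), so a logger like nova.compute.resource_tracker would still sit at the service default of INFO unless an explicit entry for nova itself is added, e.g. something along these lines in nova.conf:

    [DEFAULT]
    # log_config_append deliberately left unset
    # same default_log_levels list as above, with an explicit entry for nova appended:
    default_log_levels = amqp=WARN,...,oslo.cache=WARN,nova=WARN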
I’ll also add that this is on Pike, but we are slated to upgrade to Queens in the coming weeks. Best regards, Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. > Le 7 mai 2019 à 11:39, Jay Pipes a écrit : > > As mentioned in my original response, if you have CONF.log_config_append set to anything, then the other conf options related to logging will be ignored. > > Best, > -jay > > On Tue, May 7, 2019, 11:15 AM Jean-Philippe Méthot > wrote: > Hi, > > I’ve just tried setting everything to warn through the nova.conf option default_log_levels, as suggested. However, I’m still getting info level logs from the resource tracker like this : > > INFO nova.compute.resource_tracker > > Could the compute resource tracker logs be managed by another parameter than what’s in the default list for that configuration option? > > Best regards, > > Jean-Philippe Méthot > Openstack system administrator > Administrateur système Openstack > PlanetHoster inc. > > > > >> Le 7 mai 2019 à 09:02, Jay Pipes > a écrit : >> >> On 05/06/2019 05:56 PM, Jean-Philippe Méthot wrote: >>> Hi, >>> We’ve been modifying our login habits for Nova on our Openstack setup to try to send only warning level and up logs to our log servers. To do so, I’ve created a logging.conf and configured logging according to the logging module documentation. While what I’ve done works, it seems to be a very convoluted process for something as simple as changing the logging level to warning. We worry that if we upgrade and the syntax for this configuration file changes, we may have to push more changes through ansible than we would like to. >> >> It's unlikely that the syntax for the logging configuration file will change since it's upstream Python, not OpenStack or Nova that is the source of this syntax. >> >> That said, if all you want to do is change some or all package default logging levels, you can change the value of the CONF.default_log_levels option. >> >> The default_log_levels CONF option is actually derived from the oslo_log package that is used by all OpenStack service projects. It's default value is here: >> >> https://github.com/openstack/oslo.log/blob/29671ef2bfacb416d397abc57170bb090b116f68/oslo_log/_options.py#L19-L31 >> >> So, if you don't want to mess with the standard Python logging conf, you can just change that CONF.default_log_levels option. Note that if you do specify a logging config file using a non-None CONF.log_config_append value, then all other logging configuration options (like default_log_levels) are ignored). >> >> Best, >> -jay >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at nemebean.com Tue May 7 21:45:38 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 7 May 2019 16:45:38 -0500 Subject: [oslo][oslo-messaging][nova] Stein nova-api AMQP issue running under uWSGI In-Reply-To: References: <229a2a53-870f-44c3-5e0c-6cfa9d45d0c5@oracle.com> <3275304e-d717-8b89-557e-b650fc4f661a@oracle.com> <20190420063850.GA18527@holtby.speedport.ip> <8b9cb0e4-b3a4-986a-be59-5bba6ae00f4e@nemebean.com> <20190503175904.GA26117@holtby> Message-ID: <8411da3c-9318-2189-5149-2beb9cab4bd0@nemebean.com> On 5/4/19 4:14 PM, Damien Ciabrini wrote: > > > On Fri, May 3, 2019 at 7:59 PM Michele Baldessari > wrote: > > On Mon, Apr 22, 2019 at 01:21:03PM -0500, Ben Nemec wrote: > > > > > > On 4/22/19 12:53 PM, Alex Schultz wrote: > > > On Mon, Apr 22, 2019 at 11:28 AM Ben Nemec > > wrote: > > > > > > > > > > > > > > > > On 4/20/19 1:38 AM, Michele Baldessari wrote: > > > > > On Fri, Apr 19, 2019 at 03:20:44PM -0700, > iain.macdonnell at oracle.com wrote: > > > > > > > > > > > > Today I discovered that this problem appears to be caused > by eventlet > > > > > > monkey-patching. I've created a bug for it: > > > > > > > > > > > > https://bugs.launchpad.net/nova/+bug/1825584 > > > > > > > > > > Hi, > > > > > > > > > > just for completeness we see this very same issue also with > > > > > mistral (actually it was the first service where we noticed > the missed > > > > > heartbeats). iirc Alex Schultz mentioned seeing it in > ironic as well, > > > > > although I have not personally observed it there yet. > > > > > > > > Is Mistral also mixing eventlet monkeypatching and WSGI? > > > > > > > > > > Looks like there is monkey patching, however we noticed it with the > > > engine/executor. So it's likely not just wsgi.  I think I also > saw it > > > in the ironic-conductor, though I'd have to try it out again.  I'll > > > spin up an undercloud today and see if I can get a more > complete list > > > of affected services. It was pretty easy to reproduce. > > > > Okay, I asked because if there's no WSGI/Eventlet combination > then this may > > be different from the Nova issue that prompted this thread. It > sounds like > > that was being caused by a bad interaction between WSGI and some > Eventlet > > timers. If there's no WSGI involved then I wouldn't expect that > to happen. > > > > I guess we'll see what further investigation turns up, but based > on the > > preliminary information there may be two bugs here. > > So just to get some closure on this error that we have seen around > mistral executor and tripleo with python3: this was due to the ansible > action that called subprocess which has a different implementation in > python3 and so the monkeypatching needs to be adapted. > > Review which fixes it for us is here: > https://review.opendev.org/#/c/656901/ > > Damien and I think the nova_api/eventlet/mod_wsgi has a separate > root-cause > (although we have not spent all too much time on that one yet) > > > Right, after further investigation, it appears that the problem we saw > under mod_wsgi was due to monkey patching, as Iain originally > reported. It has nothing to do with our work on healthchecks. > > It turns out that running the AMQP heartbeat thread under mod_wsgi > doesn't work when the threading library is monkey_patched, because the > thread waits on a data structure [1] that has been monkey patched [2], > which makes it yield its execution instead of sleeping for 15s. 
> > Because mod_wsgi stops the execution of its embedded interpreter, the > AMQP heartbeat thread can't be resumed until there's a message to be > processed in the mod_wsgi queue, which would resume the python > interpreter and make eventlet resume the thread. > > Disabling monkey-patching in nova_api makes the scheduling issue go > away. This sounds like the right long-term solution, but it seems unlikely to be backportable to the existing releases. As I understand it some nova-api functionality has an actual dependency on monkey-patching. Is there a workaround? Maybe periodically poking the API to wake up the wsgi interpreter? > > Note: other services like heat-api do not use monkey patching and > aren't affected, so this seem to confirm that monkey-patching > shouldn't happen in nova_api running under mod_wsgi in the first > place. > > [1] > https://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_rabbit.py#L904 > [2] > https://github.com/openstack/oslo.utils/blob/master/oslo_utils/eventletutils.py#L182 From iain.macdonnell at oracle.com Tue May 7 22:22:36 2019 From: iain.macdonnell at oracle.com (iain.macdonnell at oracle.com) Date: Tue, 7 May 2019 15:22:36 -0700 Subject: [oslo][oslo-messaging][nova] Stein nova-api AMQP issue running under uWSGI In-Reply-To: <8411da3c-9318-2189-5149-2beb9cab4bd0@nemebean.com> References: <229a2a53-870f-44c3-5e0c-6cfa9d45d0c5@oracle.com> <3275304e-d717-8b89-557e-b650fc4f661a@oracle.com> <20190420063850.GA18527@holtby.speedport.ip> <8b9cb0e4-b3a4-986a-be59-5bba6ae00f4e@nemebean.com> <20190503175904.GA26117@holtby> <8411da3c-9318-2189-5149-2beb9cab4bd0@nemebean.com> Message-ID: <1537695d-fe31-3e48-36d7-566a92307a93@oracle.com> On 5/7/19 2:45 PM, Ben Nemec wrote: > > > On 5/4/19 4:14 PM, Damien Ciabrini wrote: >> >> >> On Fri, May 3, 2019 at 7:59 PM Michele Baldessari > > wrote: >> >>     On Mon, Apr 22, 2019 at 01:21:03PM -0500, Ben Nemec wrote: >>      > >>      > >>      > On 4/22/19 12:53 PM, Alex Schultz wrote: >>      > > On Mon, Apr 22, 2019 at 11:28 AM Ben Nemec >>     > wrote: >>      > > > >>      > > > >>      > > > >>      > > > On 4/20/19 1:38 AM, Michele Baldessari wrote: >>      > > > > On Fri, Apr 19, 2019 at 03:20:44PM -0700, >>     iain.macdonnell at oracle.com wrote: >>      > > > > > >>      > > > > > Today I discovered that this problem appears to be caused >>     by eventlet >>      > > > > > monkey-patching. I've created a bug for it: >>      > > > > > >>      > > > > > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_nova_-2Bbug_1825584&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=vdmZv2wQnoFF1TIFnkN4XXdIjy0p4TKcsQ598Qbjti4&s=zgCsi2WthDNaeptBSW02iplSjxg9P_zrnfocp8P06oA&e= >> >>      > > > > >>      > > > > Hi, >>      > > > > >>      > > > > just for completeness we see this very same issue also with >>      > > > > mistral (actually it was the first service where we noticed >>     the missed >>      > > > > heartbeats). iirc Alex Schultz mentioned seeing it in >>     ironic as well, >>      > > > > although I have not personally observed it there yet. >>      > > > >>      > > > Is Mistral also mixing eventlet monkeypatching and WSGI? >>      > > > >>      > > >>      > > Looks like there is monkey patching, however we noticed it >> with the >>      > > engine/executor. So it's likely not just wsgi.  I think I also >>     saw it >>      > > in the ironic-conductor, though I'd have to try it out >> again.  
I'll >>      > > spin up an undercloud today and see if I can get a more >>     complete list >>      > > of affected services. It was pretty easy to reproduce. >>      > >>      > Okay, I asked because if there's no WSGI/Eventlet combination >>     then this may >>      > be different from the Nova issue that prompted this thread. It >>     sounds like >>      > that was being caused by a bad interaction between WSGI and some >>     Eventlet >>      > timers. If there's no WSGI involved then I wouldn't expect that >>     to happen. >>      > >>      > I guess we'll see what further investigation turns up, but based >>     on the >>      > preliminary information there may be two bugs here. >> >>     So just to get some closure on this error that we have seen around >>     mistral executor and tripleo with python3: this was due to the >> ansible >>     action that called subprocess which has a different implementation in >>     python3 and so the monkeypatching needs to be adapted. >> >>     Review which fixes it for us is here: >> >> https://urldefense.proofpoint.com/v2/url?u=https-3A__review.opendev.org_-23_c_656901_&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=vdmZv2wQnoFF1TIFnkN4XXdIjy0p4TKcsQ598Qbjti4&s=1o81kC60gB8_5zIgi8WugZaOma_3m7grG4RQ-aVsbSE&e= >> >> >>     Damien and I think the nova_api/eventlet/mod_wsgi has a separate >>     root-cause >>     (although we have not spent all too much time on that one yet) >> >> >> Right, after further investigation, it appears that the problem we saw >> under mod_wsgi was due to monkey patching, as Iain originally >> reported. It has nothing to do with our work on healthchecks. >> >> It turns out that running the AMQP heartbeat thread under mod_wsgi >> doesn't work when the threading library is monkey_patched, because the >> thread waits on a data structure [1] that has been monkey patched [2], >> which makes it yield its execution instead of sleeping for 15s. >> >> Because mod_wsgi stops the execution of its embedded interpreter, the >> AMQP heartbeat thread can't be resumed until there's a message to be >> processed in the mod_wsgi queue, which would resume the python >> interpreter and make eventlet resume the thread. >> >> Disabling monkey-patching in nova_api makes the scheduling issue go >> away. > > This sounds like the right long-term solution, but it seems unlikely to > be backportable to the existing releases. As I understand it some > nova-api functionality has an actual dependency on monkey-patching. Is > there a workaround? Maybe periodically poking the API to wake up the > wsgi interpreter? I've been pondering things like that ... but if I have multiple WSGI processes, can I be sure that an API-poke will hit the one(s) that need it? This is a road-block for me upgrading to Stein. I really don't want to have to go back to running nova-api standalone, but that's increasingly looking like the only "safe" option :/ ~iain >> Note: other services like heat-api do not use monkey patching and >> aren't affected, so this seem to confirm that monkey-patching >> shouldn't happen in nova_api running under mod_wsgi in the first >> place. 
>> >> [1] >> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openstack_oslo.messaging_blob_master_oslo-5Fmessaging_-5Fdrivers_impl-5Frabbit.py-23L904&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=vdmZv2wQnoFF1TIFnkN4XXdIjy0p4TKcsQ598Qbjti4&s=O5nQh1r8Zmded00yYMXrfxL44xcd9KqFK-VOa0cg6gs&e= >> >> [2] >> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openstack_oslo.utils_blob_master_oslo-5Futils_eventletutils.py-23L182&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=vdmZv2wQnoFF1TIFnkN4XXdIjy0p4TKcsQ598Qbjti4&s=QRkXCiqv6zcnO2b2p8Uv6cgRuu1R414B9SvILuugN6w&e= >> > From cboylan at sapwetik.org Tue May 7 23:56:43 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Tue, 07 May 2019 19:56:43 -0400 Subject: [nova][CI] GPUs in the gate In-Reply-To: References: Message-ID: <3587e05d-deab-42ad-9a02-4312ca11760f@www.fastmail.com> On Tue, May 7, 2019, at 10:48 AM, Artom Lifshitz wrote: > Hey all, > > Following up on the CI session during the PTG [1], I wanted to get the > ball rolling on getting GPU hardware into the gate somehow. Initially > the plan was to do it through OpenLab and by convincing NVIDIA to > donate the cards, but after a conversation with Sean McGinnis it > appears Infra have access to machines with GPUs. > > From Nova's POV, the requirements are: > * The machines with GPUs should probably be Ironic baremetal nodes and > not VMs [*]. > * The GPUs need to support virtualization. It's hard to get a > comprehensive list of GPUs that do, but Nova's own docs [2] mention > two: Intel cards with GVT [3] and NVIDIA GRID [4]. > > So I think at this point the question is whether Infra can support > those reqs. If yes, we can start concrete steps towards getting those > machines used by a CI job. If not, we'll fall back to OpenLab and try > to get them hardware. What we currently have access to is a small amount of Vexxhost's GPU instances (so mnaser can further clarify my comments here). I believe these are VMs with dedicated nvidia gpus that are passed through. I don't think they support the vgpu feature. It might help to describe the use case you are trying to meet rather than jumping ahead to requirements/solutions. That way maybe we can work with Vexxhost to better support what you need (or come up with some other solutions). For those of us that don't know all of the particulars it really does help if you can go from use case to requirements. > > [*] Could we do double-passthrough? Could the card be passed through > to the L1 guest via the PCI passthrough mechanism, and then into the > L2 guest via the mdev mechanism? > > [1] https://etherpad.openstack.org/p/nova-ptg-train-ci > [2] https://docs.openstack.org/nova/rocky/admin/virtual-gpu.html > [3] https://01.org/igvt-g > [4] https://docs.nvidia.com/grid/5.0/pdf/grid-vgpu-user-guide.pdf From miguel at mlavalle.com Wed May 8 01:37:44 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Tue, 7 May 2019 20:37:44 -0500 Subject: [openstack-dev] [neutron] Cancelling L3 sub-team meeting on May 8th Message-ID: Hi Neutrinos, Since we just had a long conversation on L3 topics during the PTG, we will cancel this week's meeting. We will resume normally on the 15th Regards -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From dangtrinhnt at gmail.com Wed May 8 02:52:14 2019
From: dangtrinhnt at gmail.com (Trinh Nguyen)
Date: Wed, 8 May 2019 11:52:14 +0900
Subject: [searchlight] Train-1 milestone goals
Message-ID: 

Hi team,

So the summit is over, the holiday is over, and the Train-1 milestone [1] is coming... I would like to take this chance to discuss a little bit about our targets for Train-1. My expectation simply is continuing what we left in Stein, which is:

- Deprecate Elasticsearch 2.x
- Support multiple OpenStack clouds

Please let me know what you think via email or input in the etherpad [2].

[1] https://releases.openstack.org/train/schedule.html
[2] https://etherpad.openstack.org/p/searchlight-train

You rock!!!

-- 
*Trinh Nguyen*
*www.edlab.xyz *

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dangtrinhnt at gmail.com Wed May 8 03:17:22 2019
From: dangtrinhnt at gmail.com (Trinh Nguyen)
Date: Wed, 8 May 2019 12:17:22 +0900
Subject: [telemetry] Team meeting agenda for tomorrow
Message-ID: 

Hi team,

As planned, we will have a team meeting at 02:00 UTC, May 9th on #openstack-telemetry to discuss what we are going to do for the next milestone (Train-1) and continue what we left off from the last meeting. I put the agenda here [1], thinking that it should be fine for an hour meeting. If you have anything to talk about, please put it there too.

[1] https://etherpad.openstack.org/p/telemetry-meeting-agenda

Bests,

-- 
*Trinh Nguyen*
*www.edlab.xyz *

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From li.canwei2 at zte.com.cn Wed May 8 03:26:50 2019
From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn)
Date: Wed, 8 May 2019 11:26:50 +0800 (CST)
Subject: Re: [watcher][qa] Thoughts on performance testing for Watcher
In-Reply-To: <6409b4e4-29af-da6d-1af6-a0d6e753049c@gmail.com>
References: 6409b4e4-29af-da6d-1af6-a0d6e753049c@gmail.com
Message-ID: <201905081126508513380@zte.com.cn>

Hi Matt,

I'm glad that you are interested in Watcher. Though we have never done such a simulated test, I hope you can get what you want. Some notes:

1, Watcher updates its data model based on nova versioned notifications, so you should enable nova notifications in your simulated environment.
2, Watcher gets the node name from CONF.host or socket.gethostname; if you have two or more controller nodes, make sure they don't have the same host name.
3, Watcher doesn't consider nova cells; for now Watcher filters nodes through host aggregates and zones. You can get more info by CLI cmd: watcher help audittemplate create
4, Watcher needs a metric data source such as Ceilometer, so your fake nodes and VMs should have metric data.
5, For optimizing resource utilization, I think you could use strategy [1]
6, There are two audit types, ONESHOT and CONTINUOUS, in Watcher; you can get more help by CLI cmd: watcher help audit create

If any questions come up, let us know.

Thanks,
licanwei

[1] https://docs.openstack.org/watcher/latest/strategies/vm_workload_consolidation.html

From: MattRiedemann 
To: openstack-discuss at lists.openstack.org ;
Date: 2019-05-08 04:57
Subject: [watcher][qa] Thoughts on performance testing for Watcher

Hi,

I'm new to Watcher and would like to do some performance and scale testing in a simulated environment and wondering if anyone can give some pointers on what I could be testing or looking for.

If possible, I'd like to be able to just setup a single-node devstack with the nova fake virt driver which allows me to create dozens of fake compute nodes.
I could also create multiple cells with devstack, but there gets to be a limit with how much you can cram into a single node 8GB RAM 8VCPU VM (I could maybe split 20 nodes across 2 cells). I could then create dozens of VMs to fill into those compute nodes.

I'm mostly trying to figure out what could be an interesting set of tests. The biggest problem I'm trying to solve with Watcher is optimizing resource utilization, i.e. once the computes hit the Tetris problem and there is some room on some nodes but none of the nodes are fully packed. I was thinking I could simulate this by configuring nova so it spreads rather than packs VMs onto hosts (or just use the chance scheduler which randomly picks a host), using VMs of varying sizes, and then run some audit / action plan (I'm still learning the terminology here) to live migrate the VMs such that they get packed onto as few hosts as possible and see how long that takes. Naturally with devstack using fake nodes and no networking on the VMs, that live migration is basically a noop, but I'm more interested in profiling how long it takes Watcher itself to execute the actions.

Once I get to know a bit more about how Watcher works, I could help with optimizing some of the nova-specific stuff using placement [1].

Any advice or guidance here would be appreciated.

[1] https://review.opendev.org/#/c/656448/

-- 
Thanks,

Matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rico.lin.guanyu at gmail.com Wed May 8 05:34:07 2019
From: rico.lin.guanyu at gmail.com (Rico Lin)
Date: Wed, 8 May 2019 13:34:07 +0800
Subject: [heat][ptg] Summary for Heat from Denver Summit
Message-ID: 

Hi all

Here's the etherpad for Heat in Denver Summit and PTG:
https://etherpad.openstack.org/p/DEN-Train-Heat

I will add the Heat project onboarding video to the slide page later; in the meantime, enjoy the slides for onboarding: https://goo.gl/eZ3bbH and project update: https://goo.gl/Fr6rBH

*Some target items for Train cycle:*

- Move to service token auth for re-auth
- Hide OS::Glance::Image and mark as placeholder resource
- make placeholder designate V1 resources too since they are already deleted in Designate in Rocky
- Heat zombie services entries recycle
- Atomic ExtraRoute resource improvement
- Better document and scenario test support for Auto-scaling SIG and Self-healing SIG
- Ironic resources (Don't get this wrong, Heat already supports Ironic by using Nova server resources. This is about directly supported Ironic resources)
- Adding support for Heat in Terraform

Some ongoing tasks like Vitrage Template are still on our review target list too, and we will try to make sure the work for those features/deprecation tasks lands as soon as we can.

*Help most needed for Heat:*

We very much need core reviewers to help push patches in. So, please join us to help review and develop. I hope we can still keep Heat development active.

*No meeting for this week*

Since people have just come back from the Summit, and I'm pretty sure some of us are still at other events now, let's skip the meeting this week.

-- 
May The Force of OpenStack Be With You,
*Rico Lin*
irc: ricolin

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From li.canwei2 at zte.com.cn Wed May 8 05:38:51 2019
From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn)
Date: Wed, 8 May 2019 13:38:51 +0800 (CST)
Subject: [Watcher] team meeting and agenda
Message-ID: <201905081338518155467@zte.com.cn>

Hi,

Watcher will have a meeting again at 08:00 UTC in the #openstack-meeting-alt channel.

The agenda is available on https://wiki.openstack.org/wiki/Watcher_Meeting_Agenda; feel free to add any additional items.

Thanks!
Canwei Li

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josephine.seifert at secustack.com Wed May 8 06:02:39 2019
From: josephine.seifert at secustack.com (Josephine Seifert)
Date: Wed, 8 May 2019 08:02:39 +0200
Subject: [nova][cinder][glance][Barbican] Finding Timeslot for weekly Image Encryption IRC meeting
In-Reply-To: <6cdb30ba-888c-cd89-5bff-f432edb90467@redhat.com>
References: <6cdb30ba-888c-cd89-5bff-f432edb90467@redhat.com>
Message-ID: 

Hi Douglas,

it seems that doodle put the "UTC" in the second line. It should say e.g. "Mon 12 UTC", meaning Mondays at 12:00 UTC, and so on.

Greetings,
Josephine (Luzi)

Am 07.05.19 um 17:37 schrieb Douglas Mendizábal:
> Hi Josephine,
>
> I think it's a great idea to have a recurring meeting to keep track of
> the Image Encryption effort.   I tried to answer your doodle, but it
> seems that it does not have actual times, just dates?  Maybe we need a
> new doodle?  I live in the CDT (UTC-5) Time Zone if that helps.
>
> Thanks,
> - Douglas Mendizábal (redrobot)
>
> On 5/4/19 1:57 PM, Josephine Seifert wrote:
> > Hello,
> >
> > as a result from the Summit and the PTG, I would like to hold a
> > weekly IRC-meeting for the Image Encryption (soon to be a pop-up
> > team).
> >
> > As I work in Europe I have made a doodle poll, with timeslots I
> > can attend and hopefully many of you. If you would like to join in
> > a weekly meeting, please fill out the poll and state your name and
> > the project you are working in:
> > https://doodle.com/poll/wtg9ha3e5dvym6yt
> >
> > Thank you
> > Josephine (Luzi)
>
>

From li.canwei2 at zte.com.cn Wed May 8 06:19:17 2019
From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn)
Date: Wed, 8 May 2019 14:19:17 +0800 (CST)
Subject: Re: [watcher][qa] Thoughts on performance testing for Watcher
In-Reply-To: <6409b4e4-29af-da6d-1af6-a0d6e753049c@gmail.com>
References: 6409b4e4-29af-da6d-1af6-a0d6e753049c@gmail.com
Message-ID: <201905081419177826734@zte.com.cn>

Another note: Watcher provides WORKLOAD optimization (balancing or consolidation). If you want to maximize node resource (such as vCPU, RAM, ...) usage through VM migration, Watcher doesn't have such a strategy now.

Thanks!
licanwei

Original mail
From: MattRiedemann 
To: openstack-discuss at lists.openstack.org ;
Date: 2019-05-08 04:57
Subject: [watcher][qa] Thoughts on performance testing for Watcher

Hi,

I'm new to Watcher and would like to do some performance and scale testing in a simulated environment and wondering if anyone can give some pointers on what I could be testing or looking for.

If possible, I'd like to be able to just setup a single-node devstack with the nova fake virt driver which allows me to create dozens of fake compute nodes. I could also create multiple cells with devstack, but there gets to be a limit with how much you can cram into a single node 8GB RAM 8VCPU VM (I could maybe split 20 nodes across 2 cells).
I'm mostly trying to figure out what could be an interesting set of tests. The biggest problem I'm trying to solve with Watcher is optimizing resource utilization, i.e. once the computes hit the Tetris problem and there is some room on some nodes but none of the nodes are fully packed. I was thinking I could simulate this by configuring nova so it spreads rather than packs VMs onto hosts (or just use the chance scheduler which randomly picks a host), using VMs of varying sizes, and then run some audit / action plan (I'm still learning the terminology here) to live migrate the VMs such that they get packed onto as few hosts as possible and see how long that takes. Naturally with devstack using fake nodes and no networking on the VMs, that live migration is basically a noop, but I'm more interested in profiling how long it takes Watcher itself to execute the actions. Once I get to know a bit more about how Watcher works, I could help with optimizing some of the nova-specific stuff using placement [1]. Any advice or guidance here would be appreciated. [1] https://review.opendev.org/#/c/656448/ -- Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From bharat at stackhpc.com Wed May 8 06:40:04 2019 From: bharat at stackhpc.com (Bharat Kunwar) Date: Wed, 8 May 2019 07:40:04 +0100 Subject: Magnum Kubernetes openstack-cloud-controller-manager unable not resolve master node by DNS In-Reply-To: References: Message-ID: <4FFA2395-960B-4DA7-8481-F2AD93EAB500@stackhpc.com> Try using the latest version, think there is an OCCM_TAG. Sent from my iPhone > On 7 May 2019, at 20:10, Pawel Konczalski wrote: > > Hi, > > i try to deploy a Kubernetes cluster with OpenStack Magnum but the openstack-cloud-controller-manager pod fails to resolve the master node hostname. > > Does magnum require further parameter to configure the DNS names of the master and minions? DNS resolution in the VMs works fine. Currently there is no Designate installed in the OpenStack setup. 
>
> openstack coe cluster template create kubernetes-cluster-template1 \
> --image Fedora-AtomicHost-29-20190429.0.x86_64 \
> --external-network public \
> --dns-nameserver 8.8.8.8 \
> --master-flavor m1.kubernetes \
> --flavor m1.kubernetes \
> --coe kubernetes \
> --volume-driver cinder \
> --network-driver flannel \
> --docker-volume-size 25
>
> openstack coe cluster create kubernetes-cluster1 \
> --cluster-template kubernetes-cluster-template1 \
> --master-count 1 \
> --node-count 2 \
> --keypair mykey
>
>
> # kubectl get pods --all-namespaces -o wide
> NAMESPACE     NAME                                       READY   STATUS             RESTARTS   AGE   IP   NODE                                        NOMINATED NODE
> kube-system   coredns-78df4bf8ff-mjp2c                   0/1     Pending            0          36m
> kube-system   heapster-74f98f6489-tgtzl                  0/1     Pending            0          36m
> kube-system   kube-dns-autoscaler-986c49747-wrvz4        0/1     Pending            0          36m
> kube-system   kubernetes-dashboard-54cb7b5997-sk5pj      0/1     Pending            0          36m
> kube-system   openstack-cloud-controller-manager-dgk64   0/1     CrashLoopBackOff   11         36m   kubernetes-cluster1-vulg5fz6hg2n-master-0
>
>
> # kubectl -n kube-system logs openstack-cloud-controller-manager-dgk64
> Error from server: Get https://kubernetes-cluster1-vulg5fz6hg2n-master-0:10250/containerLogs/kube-system/openstack-cloud-controller-manager-dgk64/openstack-cloud-controller-manager: dial tcp: lookup kubernetes-cluster1-vulg5fz6hg2n-master-0 on 8.8.8.8:53: no such host
>
>
> BR
>
> Pawel

From cjeanner at redhat.com Wed May 8 07:07:02 2019
From: cjeanner at redhat.com (Cédric Jeanneret)
Date: Wed, 8 May 2019 09:07:02 +0200
Subject: [TripleO][Validations] Tag convention
In-Reply-To: 
References: <3c383d8d-54fa-b054-f0ad-b97ed67ba03f@redhat.com>
Message-ID: <5228e551-477c-129e-d621-9b1bde9a6535@redhat.com>
Or even "opendev-validation(-nova)" Since there are also a possibility to ask for a new package name for something more generic without the "tripleo" taint.. Cheers, C. > > Just chiming in here.. the pattern we like in OSA is using dashes for > tags, I think having something like 'tripleo-validations' and > 'tripleo-validations-nova' etc > >> Wdyt? >> -- >> Emilien Macchi > > > -- Cédric Jeanneret Software Engineer - OpenStack Platform Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From sorrison at gmail.com Wed May 8 07:51:18 2019 From: sorrison at gmail.com (Sam Morrison) Date: Wed, 8 May 2019 17:51:18 +1000 Subject: [cinder] Help with a review please Message-ID: <55F040AF-16C8-4029-B306-7E81B4BE191A@gmail.com> Hi, I’ve had a review going on for over 8 months now [1] and would love to get this in, it’s had +2s over the period and keeps getting nit picked, finally being knocked back due to no spec which there now is [2] This is now stalled itself after having a +2 and it is very depressing. I have had generally positive experiences contributing to openstack but this has been a real pain, is there something I can do to make this go smoother? Thanks, Sam [1] https://review.opendev.org/#/c/599866/ [2] https://review.opendev.org/#/c/645056/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stig.openstack at telfer.org Wed May 8 08:11:31 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Wed, 8 May 2019 09:11:31 +0100 Subject: [scientific-sig] IRC Meeting today 1100 UTC: activity areas for Train cycle Message-ID: <1432B73C-C9C8-417D-9853-A268AA3D0325@telfer.org> Hello All - We have a Scientific SIG IRC meeting today at 1100 UTC in channel #openstack-meeting. Everyone is welcome. Today’s agenda is here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_May_8th_2019 After another busy session at the Open Infra Summit and a productive time at the PTG, we have a set of priority areas of focus identified. Cheers, Stig -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From melwittt at gmail.com Wed May 8 08:15:34 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 8 May 2019 04:15:34 -0400 Subject: [oslo][oslo-messaging][nova] Stein nova-api AMQP issue running under uWSGI In-Reply-To: <1537695d-fe31-3e48-36d7-566a92307a93@oracle.com> References: <229a2a53-870f-44c3-5e0c-6cfa9d45d0c5@oracle.com> <3275304e-d717-8b89-557e-b650fc4f661a@oracle.com> <20190420063850.GA18527@holtby.speedport.ip> <8b9cb0e4-b3a4-986a-be59-5bba6ae00f4e@nemebean.com> <20190503175904.GA26117@holtby> <8411da3c-9318-2189-5149-2beb9cab4bd0@nemebean.com> <1537695d-fe31-3e48-36d7-566a92307a93@oracle.com> Message-ID: On Tue, 7 May 2019 15:22:36 -0700, Iain Macdonnell wrote: > > > On 5/7/19 2:45 PM, Ben Nemec wrote: >> >> >> On 5/4/19 4:14 PM, Damien Ciabrini wrote: >>> >>> >>> On Fri, May 3, 2019 at 7:59 PM Michele Baldessari >> > wrote: >>> >>>     On Mon, Apr 22, 2019 at 01:21:03PM -0500, Ben Nemec wrote: >>>      > >>>      > >>>      > On 4/22/19 12:53 PM, Alex Schultz wrote: >>>      > > On Mon, Apr 22, 2019 at 11:28 AM Ben Nemec >>>     > wrote: >>>      > > > >>>      > > > >>>      > > > >>>      > > > On 4/20/19 1:38 AM, Michele Baldessari wrote: >>>      > > > > On Fri, Apr 19, 2019 at 03:20:44PM -0700, >>>     iain.macdonnell at oracle.com wrote: >>>      > > > > > >>>      > > > > > Today I discovered that this problem appears to be caused >>>     by eventlet >>>      > > > > > monkey-patching. I've created a bug for it: >>>      > > > > > >>>      > > > > > >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_nova_-2Bbug_1825584&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=vdmZv2wQnoFF1TIFnkN4XXdIjy0p4TKcsQ598Qbjti4&s=zgCsi2WthDNaeptBSW02iplSjxg9P_zrnfocp8P06oA&e= >>> >>>      > > > > >>>      > > > > Hi, >>>      > > > > >>>      > > > > just for completeness we see this very same issue also with >>>      > > > > mistral (actually it was the first service where we noticed >>>     the missed >>>      > > > > heartbeats). iirc Alex Schultz mentioned seeing it in >>>     ironic as well, >>>      > > > > although I have not personally observed it there yet. >>>      > > > >>>      > > > Is Mistral also mixing eventlet monkeypatching and WSGI? >>>      > > > >>>      > > >>>      > > Looks like there is monkey patching, however we noticed it >>> with the >>>      > > engine/executor. So it's likely not just wsgi.  I think I also >>>     saw it >>>      > > in the ironic-conductor, though I'd have to try it out >>> again.  I'll >>>      > > spin up an undercloud today and see if I can get a more >>>     complete list >>>      > > of affected services. It was pretty easy to reproduce. >>>      > >>>      > Okay, I asked because if there's no WSGI/Eventlet combination >>>     then this may >>>      > be different from the Nova issue that prompted this thread. It >>>     sounds like >>>      > that was being caused by a bad interaction between WSGI and some >>>     Eventlet >>>      > timers. If there's no WSGI involved then I wouldn't expect that >>>     to happen. >>>      > >>>      > I guess we'll see what further investigation turns up, but based >>>     on the >>>      > preliminary information there may be two bugs here. 
>>> >>>     So just to get some closure on this error that we have seen around >>>     mistral executor and tripleo with python3: this was due to the >>> ansible >>>     action that called subprocess which has a different implementation in >>>     python3 and so the monkeypatching needs to be adapted. >>> >>>     Review which fixes it for us is here: >>> >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__review.opendev.org_-23_c_656901_&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=vdmZv2wQnoFF1TIFnkN4XXdIjy0p4TKcsQ598Qbjti4&s=1o81kC60gB8_5zIgi8WugZaOma_3m7grG4RQ-aVsbSE&e= >>> >>> >>>     Damien and I think the nova_api/eventlet/mod_wsgi has a separate >>>     root-cause >>>     (although we have not spent all too much time on that one yet) >>> >>> >>> Right, after further investigation, it appears that the problem we saw >>> under mod_wsgi was due to monkey patching, as Iain originally >>> reported. It has nothing to do with our work on healthchecks. >>> >>> It turns out that running the AMQP heartbeat thread under mod_wsgi >>> doesn't work when the threading library is monkey_patched, because the >>> thread waits on a data structure [1] that has been monkey patched [2], >>> which makes it yield its execution instead of sleeping for 15s. >>> >>> Because mod_wsgi stops the execution of its embedded interpreter, the >>> AMQP heartbeat thread can't be resumed until there's a message to be >>> processed in the mod_wsgi queue, which would resume the python >>> interpreter and make eventlet resume the thread. >>> >>> Disabling monkey-patching in nova_api makes the scheduling issue go >>> away. >> >> This sounds like the right long-term solution, but it seems unlikely to >> be backportable to the existing releases. As I understand it some >> nova-api functionality has an actual dependency on monkey-patching. Is >> there a workaround? Maybe periodically poking the API to wake up the >> wsgi interpreter? > > I've been pondering things like that ... but if I have multiple WSGI > processes, can I be sure that an API-poke will hit the one(s) that need it? > > This is a road-block for me upgrading to Stein. I really don't want to > have to go back to running nova-api standalone, but that's increasingly > looking like the only "safe" option :/ FWIW, I have a patch series that aims to re-eliminate the eventlet dependency in nova-api: https://review.opendev.org/657750 (top patch) if you might be able to give it a try. If it helps, then maybe we could backport to Stein if folks are in support. -melanie > > >>> Note: other services like heat-api do not use monkey patching and >>> aren't affected, so this seem to confirm that monkey-patching >>> shouldn't happen in nova_api running under mod_wsgi in the first >>> place. 
>>> >>> [1] >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openstack_oslo.messaging_blob_master_oslo-5Fmessaging_-5Fdrivers_impl-5Frabbit.py-23L904&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=vdmZv2wQnoFF1TIFnkN4XXdIjy0p4TKcsQ598Qbjti4&s=O5nQh1r8Zmded00yYMXrfxL44xcd9KqFK-VOa0cg6gs&e= >>> >>> [2] >>> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openstack_oslo.utils_blob_master_oslo-5Futils_eventletutils.py-23L182&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=RxYkIjeLZPK2frXV_wEUCq8d3wvUIvDPimUcunMwbMs&m=vdmZv2wQnoFF1TIFnkN4XXdIjy0p4TKcsQ598Qbjti4&s=QRkXCiqv6zcnO2b2p8Uv6cgRuu1R414B9SvILuugN6w&e= >>> >> > From bdobreli at redhat.com Wed May 8 08:47:34 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Wed, 8 May 2019 10:47:34 +0200 Subject: [ops][nova]Logging in nova and other openstack projects In-Reply-To: <62034C21-91FC-4A9A-BC4B-47E372EAB925@planethoster.info> References: <62034C21-91FC-4A9A-BC4B-47E372EAB925@planethoster.info> Message-ID: On 07.05.2019 23:31, Jean-Philippe Méthot wrote: > Indeed, this is what was written in your original response as well as in > the documentation. As a result, it was fairly difficult to miss and I > did comment it out before restarting the service. Additionally, as per There is also a deprecated (but still working) log_config [0]. So please double-check you don't have that configuration leftover. Another caveat might be that SIGHUP does not propagate to all of the child processes/threads/whatever to update its logging configs with the new default_log_levels and removed log_config(_append) ones... But you said you are restarting not reloading, so prolly can't be a problem here. [0] https://opendev.org/openstack/oslo.log/src/branch/master/oslo_log/_options.py#L47 > the configuration I had set up, had the log-config-append option be set, > I wouldn’t have any INFO level log in my logs. Hence why I believe it is > strange that I have info level logs, when I’ve set default_log_levels > like this: > > default_log_levels > = amqp=WARN,amqplib=WARN,boto=WARN,qpid=WARN,sqlalchemy=WARN,suds=WARN,oslo.messaging=WARN,iso8601=WARN,requests.packages.urllib3.connectionpool=WARN,urllib3.connectionpool=WARN,websocket=WARN,requests.packages.urllib3.util.retry=WARN,urllib3.util.retry=WARN,keystonemiddleware=WARN,routes.middleware=WARN,stevedore=WARN,taskflow=WARN,keystoneauth=WARN,oslo.cache=WARN > > Please understand that I am not doubting that your previous answer > normally works. I have seen your presentations at past Openstack summit > and know that you are a brilliant individual. However, I can only answer > here that, from my observations, this is not working as intended. > > I’ll also add that this is on Pike, but we are slated to upgrade to > Queens in the coming weeks. > > Best regards, > > Jean-Philippe Méthot > Openstack system administrator > Administrateur système Openstack > PlanetHoster inc. > > > > >> Le 7 mai 2019 à 11:39, Jay Pipes > > a écrit : >> >> As mentioned in my original response, if you have >> CONF.log_config_append set to anything, then the other conf options >> related to logging will be ignored. >> >> Best, >> -jay >> >> On Tue, May 7, 2019, 11:15 AM Jean-Philippe Méthot >> > wrote: >> >> Hi, >> >> I’ve just tried setting everything to warn through the nova.conf >> option default_log_levels, as suggested. 
However, I’m still >> getting info level logs from the resource tracker like this : >> >> INFO nova.compute.resource_tracker >> >> Could the compute resource tracker logs be managed by another >> parameter than what’s in the default list for that configuration >> option? >> >> Best regards, >> >> Jean-Philippe Méthot >> Openstack system administrator >> Administrateur système Openstack >> PlanetHoster inc. >> >> >> >> >>> Le 7 mai 2019 à 09:02, Jay Pipes >> > a écrit : >>> >>> On 05/06/2019 05:56 PM, Jean-Philippe Méthot wrote: >>>> Hi, >>>> We’ve been modifying our login habits for Nova on our Openstack >>>> setup to try to send only warning level and up logs to our log >>>> servers. To do so, I’ve created a logging.conf and configured >>>> logging according to the logging module documentation. While >>>> what I’ve done works, it seems to be a very convoluted process >>>> for something as simple as changing the logging level to >>>> warning. We worry that if we upgrade and the syntax for this >>>> configuration file changes, we may have to push more changes >>>> through ansible than we would like to. >>> >>> It's unlikely that the syntax for the logging configuration file >>> will change since it's upstream Python, not OpenStack or Nova >>> that is the source of this syntax. >>> >>> That said, if all you want to do is change some or all package >>> default logging levels, you can change the value of the >>> CONF.default_log_levels option. >>> >>> The default_log_levels CONF option is actually derived from the >>> oslo_log package that is used by all OpenStack service projects. >>> It's default value is here: >>> >>> https://github.com/openstack/oslo.log/blob/29671ef2bfacb416d397abc57170bb090b116f68/oslo_log/_options.py#L19-L31 >>> >>> So, if you don't want to mess with the standard Python logging >>> conf, you can just change that CONF.default_log_levels option. >>> Note that if you do specify a logging config file using a >>> non-None CONF.log_config_append value, then all other logging >>> configuration options (like default_log_levels) are ignored). >>> >>> Best, >>> -jay >>> >> > -- Best regards, Bogdan Dobrelya, Irc #bogdando From bdobreli at redhat.com Wed May 8 09:18:22 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Wed, 8 May 2019 11:18:22 +0200 Subject: [ironic][tripleo] My PTG & Forum notes In-Reply-To: <7313c6aa-1693-2cb0-4ed9-a73646764070@redhat.com> References: <7313c6aa-1693-2cb0-4ed9-a73646764070@redhat.com> Message-ID: <896f2331-139d-acfe-5115-248411eb6b35@redhat.com> On 07.05.2019 19:47, Dmitry Tantsur wrote: > Hi folks, > > I've published my personal notes from the PTG & Forum in Denver: > https://dtantsur.github.io/posts/ironic-denver-2019/ > They're probably opinionated and definitely not complete, but I still > think they could be useful. > > Also pasting the whole raw RST text below for ease of commenting. > > Cheers, > Dmitry > > > Keynotes > ======== > > The `Metal3`_ project got some spotlight during the keynotes. A > (successful!) > `live demo`_ was done that demonstrated using Ironic through Kubernetes > API to > drive provisioning of bare metal nodes. this is very interesting to consider for TripleO integration alongside (or alternatively?) standalone Ironic, see my note below > > The official `bare metal program`_ was announced to promote managing > bare metal > infrastructure via OpenStack. 
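For readers who have not looked at Metal3 yet, the "Ironic through Kubernetes API" flow from the keynote demo boils down to describing each machine as a Kubernetes custom resource and letting the baremetal-operator translate that into Ironic calls. A rough sketch of such a manifest, based on the metal3.io/v1alpha1 BareMetalHost resource as it exists today (field names may change between versions, and the MAC, BMC address, credentials secret and image URLs below are made-up placeholders):

    apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    metadata:
      name: worker-0
    spec:
      online: true
      bootMACAddress: "52:54:00:11:22:33"
      bmc:
        # management controller of the host; credentialsName references
        # a Secret holding the IPMI username and password
        address: ipmi://192.168.111.1:6230
        credentialsName: worker-0-bmc-secret
      image:
        url: http://example.com/images/centos7.qcow2
        checksum: http://example.com/images/centos7.qcow2.md5sum

Creating a resource like this is enough for the operator to register the host in Ironic and deploy the image, which is essentially what the live demo walked through.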
> > Forum: standalone Ironic > ======================== > > On Monday we had two sessions dedicated to the future development of > standalone > Ironic (without Nova or without any other OpenStack services). > > During the `standalone roadmap session`_ the audience identified two > potential > domains where we could provide simple alternatives to depending on > OpenStack > services: > > * Alternative authentication. It was mentioned, however, that Keystone is a >   relatively easy service to install and operate, so adding this to Ironic >   may not be worth the effort. > > * Multi-tenant networking without Neutron. We could use networking-ansible_ >   directly, since they are planning on providing a Python API > independent of >   their ML2 implementation. > > Next, firmware update support was a recurring topic (also in hallway > conversations and also in non-standalone context). Related to that, a > driver > feature matrix documentation was requested, so that such driver-specific > features are easier to discover. > > Then we had a separate `API multi-tenancy session`_. Three topic were > covered: > > * Wiring in the existing ``owner`` field for access control. > >   The idea is to allow operations for non-administrator users only to > nodes >   with ``owner`` equal to their project (aka tenant) ID. In the > non-keystone >   context this field would stay free-form. We did not agree whether we > need an >   option to enable this feature. > >   An interesting use case was mentioned: assign a non-admin user to > Nova to >   allocate it only a part of the bare metal pool instead of all nodes. > >   We did not reach a consensus on using a schema with the ``owner`` field, >   e.g. where ``keystone://{project ID}`` represents a Keystone project ID. > > * Adding a new field (e.g. ``deployed_by``) to track a user that requested >   deploy for auditing purposes. > >   We agreed that the ``owner`` field should not be used for this > purpose, and >   overall it should never be changed automatically by Ironic. > > * Adding some notion of *node leased to*, probably via a new field. > >   This proposal was not well defined during the session, but we > probably would >   allow some subset of API to lessees using the policy mechanism. It > became >   apparent that implementing a separate *deployment API endpoint* is > required >   to make such policy possible. > > Creating the deployment API was identified as a potential immediate action > item. Wiring the ``owner`` field can also be done in the Train cycle, if we > find volunteers to push it forward. > > PTG: scientific SIG > =================== > > The PTG started for me with the `Scientific SIG discussions`_ of desired > features and fixes in Ironic. > > The hottest topic was reducing the deployment time by reducing the > number of > reboots that are done during the provisioning process. `Ramdisk deploy`_ > was identified as a very promising feature to solve this, as well as enable > booting from remote volumes not supported directly by Ironic and/or Cinder. > A few SIG members committed to testing it as soon as possible. > > Two related ideas were proposed for later brainstorming: > > * Keeping some proportion of nodes always on and with IPA booted. This is >   basing directly on the `fast-track deploy`_ work completed in the Stein >   cycle. A third party orchestrator would be needed for keeping the > percentage, >   but Ironic will have to provide an API to boot an ``available`` node > into the >   ramdisk. 
> > * Allow using *kexec* to instantly switch into a freshly deployed operating >   system. > > Combined together, these features can allow zero-reboot deployments. > > PTG: Ironic > =========== > > Community sustainability > ------------------------ > > We seem to have a disbalance in reviews, with very few people handling the > majority of reviews, and some of them are close to burning out. > > * The first thing we discussed is simplifying the specs process. We > considered a >   single +2 approval for specs and/or documentation. Approving > documentation >   cannot break anyone, and follow-ups are easy, so it seems a good > idea. We did >   not reach a firm agreement on a single +2 approval for specs; I > personally >   feel that it would only move the bottleneck from specs to the code. > > * Facilitating deprecated feature removals can help clean up the code, > and it >   can often be done by new contributors. We would like to maintain a > list of >   what can be removed when, so that we don't forget it. > > * We would also like to switch to single +2 for stable backports. This > needs >   changing the stable policy, and Tony volunteered to propose it. > > We felt that we're adding cores at a good pace, Julia had been mentoring > people > that wanted it. We would like people to volunteer, then we can mentor > them into > core status. > > However, we were not so sure we wanted to increase the stable core team. > This > team is supposed to be a small number of people that know quite a few small > details of the stable policy (e.g. requirements changes). We thought we > should > better switch to single +2 approval for the existing team. > > Then we discussed moving away from WSME, which is barely maintained by a > team > of not really interested individuals. The proposal was to follow the > example of > Keystone and just move to Flask. We can use ironic-inspector as an > example, and > probably migrate part by part. JSON schema could replace WSME objects, > similarly to how Nova does it. I volunteered to come up with a plan to > switch, > and some folks from Intel expressed interest in participating. > > Standalone roadmap > ------------------ > > We started with a recap of items from `Forum: standalone Ironic`_. > > While discussing creating a driver matrix, we realized that we could keep > driver capabilities in the source code (similar to existing iSCSI boot) and > generate the documentation from it. Then we could go as far as exposing > this > information in the API. > > During the multi-tenancy discussion, the idea of owner and lessee fields > was > well received. Julia volunteered to write a specification for that. We > clarified the following access control policies implemented by default: > > * A user can list or show nodes if they are an administrator, an owner of a >   node or a leaser of this node. > * A user can deploy or undeploy a node (through the future deployment > API) if >   they are an administrator, an owner of this node or a lessee of this > node. > * A user can update a node or any of its resources if they are an > administrator >   or an owner of this node. A lessee of a node can **not** update it. > > The discussion of recording the user that did a deployment turned into > discussing introducing a searchable log of changes to node power and > provision > states. We did not reach a final consensus on it, and we probably need a > volunteer to push this effort forward. 
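To make the access control defaults listed above a little more concrete, they would presumably end up as oslo.policy rules roughly like the following. This is purely an illustration: the final rule names, and which target fields Ironic exposes to the check strings, are exactly what the spec Julia volunteered to write has to pin down. The first two rule names exist in Ironic today; the third is hypothetical and assumes the future deployment API endpoint.

    # policy.yaml sketch for the proposed owner/lessee split
    "baremetal:node:get": "role:admin or project_id:%(node.owner)s or project_id:%(node.lessee)s"
    "baremetal:node:update": "role:admin or project_id:%(node.owner)s"
    # hypothetical rule: only expressible once a dedicated deployment
    # endpoint exists to attach it to
    "baremetal:deployment:create": "role:admin or project_id:%(node.owner)s or project_id:%(node.lessee)s"

Again, only a sketch; the session did not settle on concrete rule names or check strings.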
> > Deploy steps continued > ---------------------- > > This session was dedicated to making the deploy templates framework more > usable > in practice. > > * We need to implement support for in-band deploy steps (other than the >   built-in ``deploy.deploy`` step). We probably need to start IPA before >   proceeding with the steps, similarly to how it is done with cleaning. > > * We agreed to proceed with splitting the built-in core step, making it a >   regular deploy step, as well as removing the compatibility shim for > drivers >   that do not support deploy steps. We will probably separate writing > an image >   to disk, writing a configdrive and creating a bootloader. > >   The latter could be overridden to provide custom kernel parameters. > > * To handle potential differences between deploy steps in different > hardware >   types, we discussed the possibility of optionally including a > hardware type >   or interface name in a clean step. Such steps will only be run for > nodes with >   matching hardware type or interface. > > Mark and Ruby volunteered to write a new spec on these topics. > > Day 2 operational workflow > -------------------------- > > For deployments with external health monitoring, we need a way to represent > the state when a deployed node looks healthy from our side but is detected > as failed by the monitoring. > > It seems that we could introduce a new state transition from ``active`` to > something like ``failed`` or ``quarantined``, where a node is still > deployed, > but explicitly marked as at fault by an operator. On unprovisioning, > this node > would not become ``available`` automatically. We also considered the > possibility of using a flag instead of a new state, although the > operators in > the room were more in favor of using a state. We largely agreed that the > already overloaded ``maintenance`` flag should not be used for this. > > On the Nova side we would probably use the ``error`` state to reflect > nodes in > the new state. > > A very similar request had been done for node retirement support. We > decided to > look for a unified solution. > > DHCP-less deploy > ---------------- > > We discussed options to avoid relying on DHCP for deploying. > > * An existing specification proposes attaching IP information to virtual > media. >   The initial contributors had become inactive, so we decided to help > this work >   to go through. Volunteers are welcome. > > * As an alternative to that, we discussed using IPv6 SLAAC with > multicast DNS >   (routed across WAN for Edge cases). A couple of folks on the room > volunteered >   to help with testing. We need to fix python-zeroconf_ to support > IPv6, which >   is something I'm planning on. > > Nova room > --------- > > In a cross-project discussion with the Nova team we went through a few > topics: > > * Whether Nova should use new Ironic API to build config drives. Since > Ironic >   is not the only driver building config drives, we agreed that it > probably >   doesn't make much sense to change that. > > * We did not come to a conclusion on deprecating capabilities. We agreed > that >   Ironic has to provide alternatives for ``boot_option`` and ``boot_mode`` >   capabilities first. These will probably become deploy steps or built-in >   traits. > > * We agreed that we should switch Nova to using *openstacksdk* instead of >   *ironicclient* to access Ironic. This work had already been in progress. 
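Since the last point above tends to raise the question of what the openstacksdk side looks like compared to ironicclient, here is a minimal sketch. It is not taken from the Nova patches, just an illustration of the SDK's baremetal proxy as I understand it; it assumes a clouds.yaml entry called "undercloud" and a node named "worker-0" whose instance_info has already been populated.

    import openstack

    conn = openstack.connect(cloud='undercloud')

    # Listing nodes with their states, roughly the ironicclient
    # "node list" equivalent.
    for node in conn.baremetal.nodes(details=True):
        print(node.name, node.provision_state, node.power_state)

    # Kicking off a deployment of a single node and waiting for it.
    node = conn.baremetal.get_node('worker-0')
    conn.baremetal.set_node_provision_state(node, 'active')
    conn.baremetal.wait_for_nodes_provision_state([node], 'active')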
> > Faster deploy > ------------- > > We followed up to `PTG: scientific SIG`_ with potential action items on > speeding up the deployment process by reducing the number of reboots. We > discussed an ability to keep all or some nodes powered on and > heartbeating in > the ``available`` state: > > * Add an option to keep the ramdisk running after cleaning. > >   * For this to work with multi-tenant networking we'll need an IPA > command to >     reset networking. > > * Add a provisioning verb going from ``available`` to ``available`` > booting the >   node into IPA. > > * Make sure that pre-booted nodes are prioritized for scheduling. We will >   probably dynamically add a special trait. Then we'll have to update both >   Nova/Placement and the allocation API to support preferred (optional) > traits. > > We also agreed that we could provide an option to *kexec* instead of > rebooting > as an advanced deploy step for operators that really know their hardware. > Multi-tenant networking can be tricky in this case, since there is no safe > point to switch from deployment to tenant network. We will probably take > a best > effort approach: command IPA to shutdown all its functionality and > schedule a > *kexec* after some time. After that, switch to tenant networks. This is not > entirely secure, but will probably fit the operators (HPC) who requests it. > > Asynchronous clean steps > ------------------------ > > We discussed enhancements for asynchronous clean and deploy steps. > Currently > running a step asynchronously requires either polling in a loop (occupying > a green thread) or creating a new periodic task in a hardware type. We > came up > with two proposed updates for clean steps: > > * Allow a clean step to request re-running itself after certain amount of >   time. E.g. a clean step would do something like > >   .. code-block:: python > >     @clean_step(...) >     def wait_for_raid(self): >         if not raid_is_ready(): >             return RerunAfter(60) > >   and the conductor would schedule re-running the same step in 60 seconds. > > * Allow a clean step to spawn more clean steps. E.g. a clean step would >   do something like > >   .. code-block:: python > >     @clean_step(...) >     def create_raid_configuration(self): >         start_create_raid() >         return RunNext([{'step': 'wait_for_raid'}]) > >   and the conductor would insert the provided step to ``node.clean_steps`` >   after the current one and start running it. > >   This would allow for several follow-up steps as well. A use case is a > clean >   step for resetting iDRAC to a clean state that in turn consists of > several >   other clean steps. The idea of sub-steps was deemed too complicated. > > PTG: TripleO > ============ > > We discussed our plans for removing Nova from the TripleO undercloud and > moving bare metal provisioning from under control of Heat. The plan from > the I wish we could have Metal3 provisioning via K8s API adapted for Undercloud in TripleO. Probably via a) standalone kubelet or b) k3s [0]. The former provides only kubelet running static pods, no API server et al. The latter is a lightweight k8s distro (a 10MB memory footprint or so) and may be as well used to spawn some very limited kubelet and API server setup for Metal3 to drive the provisioning of overclouds outside of Heat and Neutron. 
[0] https://www.cnrancher.com/blog/2019/2019-02-26-introducing-k3s-the-lightweight-kubernetes-distribution-built-for-the-edge/ > `nova-less-deploy specification`_, as well as the current state > of the implementation, were presented. > > The current concerns are: > > * upgrades from a Nova based deployment (probably just wipe the Nova >   database), > * losing user experience of ``nova list`` (largely compensated by >   ``metalsmith list``), > * tracking IP addresses for networks other than *ctlplane* (solved the same >   way as for deployed servers). > > The next action item is to create a CI job based on the already merged > code and > verify a few assumptions made above. > > PTG: Ironic, Placement, Blazar > ============================== > > We reiterated over our plans to allow Ironic to optionally report nodes to > Placement. This will be turned off when Nova is present to avoid > conflicts with > the Nova reporting. We will optionally use Placement as a backend for > Ironic > allocation API (which is something that had been planned before). > > Then we discussed potentially exposing detailed bare metal inventory to > Placement. To avoid partial allocations, Placement could introduce new > API to > consume the whole resource provider. Ironic would use it when creating an > allocation. No specific commitments were made with regards to this idea. > > Finally we came with the following workflow for bare metal reservations in > Blazar: > > #. A user requests a bare metal reservation from Blazar. > #. Blazar fetches allocation candidates from Placement. > #. Blazar fetches a list of bare metal nodes from Ironic and filters out >    allocation candidates, whose resource provider UUID does not match > one of >    the node UUIDs. > #. Blazar remembers the node UUID and returns the reservation UUID to > the user. > > When the reservation time comes: > > #. Blazar creates an allocation in Ironic (not Placement) with the > candidate >    node matching previously picked node and allocation UUID matching the >    reservation UUID. > #. When the enhancements in `Standalone roadmap`_ are implemented, > Blazar will >    also set the node's lessee field to the user ID of the reservation, > so that >    Ironic allows access to this node. > #. A user fetches an Ironic allocation corresponding to the Blazar > reservation >    UUID and learns the node UUID from it. > #. A user proceeds with deploying the node. > > Side and hallway discussions > ============================ > > * We discussed having Heat resources for Ironic. We recommended the team to >   start with Allocation and Deployment resources (the latter being virtual >   until we implement the planned deployment API). > > * We prototyped how Heat resources for Ironic could look, including > Node, Port, >   Allocation and Deployment as a first step. > > .. _Metal3: http://metal3.io > .. _live demo: > https://www.openstack.org/videos/summits/denver-2019/openstack-ironic-and-bare-metal-infrastructure-all-abstractions-start-somewhere > > .. _bare metal program: https://www.openstack.org/bare-metal/ > .. _standalone roadmap session: > https://etherpad.openstack.org/p/DEN-train-next-steps-for-standalone-ironic > .. _networking-ansible: https://opendev.org/x/networking-ansible > .. _API multi-tenancy session: > https://etherpad.openstack.org/p/DEN-train-ironic-multi-tenancy > .. _Scientific SIG discussions: > https://etherpad.openstack.org/p/scientific-sig-ptg-train > .. 
_Ramdisk deploy: > https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html#ramdisk-deploy > > .. _fast-track deploy: https://storyboard.openstack.org/#!/story/2004965 > .. _python-zeroconf: https://github.com/jstasiak/python-zeroconf > .. _nova-less-deploy specification: > http://specs.openstack.org/openstack/tripleo-specs/specs/stein/nova-less-deploy.html > > -- Best regards, Bogdan Dobrelya, Irc #bogdando From geguileo at redhat.com Wed May 8 10:01:56 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Wed, 8 May 2019 12:01:56 +0200 Subject: [cinder] Help with a review please In-Reply-To: <55F040AF-16C8-4029-B306-7E81B4BE191A@gmail.com> References: <55F040AF-16C8-4029-B306-7E81B4BE191A@gmail.com> Message-ID: <20190508100156.ypwpt4ouxzw7r2ld@localhost> On 08/05, Sam Morrison wrote: > Hi, > > I’ve had a review going on for over 8 months now [1] and would love to get this in, it’s had +2s over the period and keeps getting nit picked, finally being knocked back due to no spec which there now is [2] > This is now stalled itself after having a +2 and it is very depressing. > > I have had generally positive experiences contributing to openstack but this has been a real pain, is there something I can do to make this go smoother? > > Thanks, > Sam > Hi Sam, I agree, it can be very frustrating when your patch gets somehow stuck in review, and while the spec and the patch looks good to me, I cannot say that I see much point in the feature itself. If the primary reason to add this new key-value pair in the API response is for aggregation, then the caller could do that same thing with an additional call to get the service list, where it could get the AZs of the different backends and then do the aggregation. To me that would be reasonable, since the AZ is not really a usage stat. Are there any other use cases? Cheers, Gorka. 
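For reference, the client-side aggregation suggested above is only a few lines with python-cinderclient. A rough sketch, assuming an authenticated keystoneauth1 session in `sess`; the pool handling is deliberately defensive because, depending on the client version, the scheduler stats are either flattened onto the pool object or kept under a `capabilities` key, and backends may report 'unknown' or 'infinite' capacities:

    from collections import defaultdict
    from cinderclient import client as cinder_client

    def _gb(value):
        # Backends may report 'unknown' or 'infinite'; count those as 0 here.
        try:
            return float(value)
        except (TypeError, ValueError):
            return 0.0

    cinder = cinder_client.Client('3', session=sess)

    # Map "host@backend" to its AZ using the service list.
    az_by_host = {svc.host: svc.zone
                  for svc in cinder.services.list(binary='cinder-volume')}

    free_by_az = defaultdict(float)
    for pool in cinder.pools.list(detailed=True):
        backend = pool.name.split('#')[0]   # "host@backend#pool" -> "host@backend"
        caps = getattr(pool, 'capabilities', None) or {}
        free = caps.get('free_capacity_gb',
                        getattr(pool, 'free_capacity_gb', 0))
        free_by_az[az_by_host.get(backend, 'unknown')] += _gb(free)

    for az, free in sorted(free_by_az.items()):
        print("%s: %.1f GiB free" % (az, free))

If the goal is just per-AZ capacity reporting, something along these lines avoids needing the API change at all.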
> > [1] https://review.opendev.org/#/c/599866/ > [2] https://review.opendev.org/#/c/645056/ From massimo.sgaravatto at gmail.com Wed May 8 10:35:30 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Wed, 8 May 2019 12:35:30 +0200 Subject: [nova][ops] 'Duplicate entry for primary key' problem running nova-manage db archive_deleted_rows Message-ID: Hi Fron time to time I use to move entries related to deleted instances to shadow tables, using the command: nova-manage db archive_deleted_rows This is now failing [*] for the instance_metadata table because of a 'duplicate entry for the primary key' problem: DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u"Duplicate entry '6' for key 'PRIMARY'") [SQL: u'INSERT INTO shadow_instance_metadata (created_at, updated_at, deleted_at, deleted, id, `key`, value, instance_uuid) SELECT instance_metadata.created_at, instance_metadata.updated_at, instance_metadata.deleted_at, instance_metadata.deleted, instance_metadata.id, instance_metadata.`key`, instance_metadata.value, instance_metadata.instance_uuid \nFROM instance_metadata \nWHERE instance_metadata.deleted != %(deleted_1)s ORDER BY instance_metadata.id \n LIMIT %(param_1)s'] [parameters: {u'param_1': 1, u'deleted_1': 0}] Indeed: mysql> SELECT instance_metadata.created_at, instance_metadata.updated_at, instance_metadata.deleted_at, instance_metadata.deleted, instance_metadata.id, instance_metadata.`key`, instance_metadata.value, instance_metadata.instance_uuid FROM instance_metadata WHERE instance_metadata.deleted != 0 ORDER BY instance_metadata.id limit 1; +---------------------+------------+---------------------+---------+----+-------+-------+--------------------------------------+ | created_at | updated_at | deleted_at | deleted | id | key | value | instance_uuid | +---------------------+------------+---------------------+---------+----+-------+-------+--------------------------------------+ | 2018-09-20 07:40:56 | NULL | 2018-09-20 07:54:26 | 6 | 6 | group | node | a9000ff7-2298-454c-bf71-9e3c62ec0f3c | +---------------------+------------+---------------------+---------+----+-------+-------+--------------------------------------+ 1 row in set (0.00 sec) But there is a 5-years old entry (if I am not wrong we were running Havana at that time) in the shadow table with that id: mysql> select * from shadow_instance_metadata where id='6'; +---------------------+------------+---------------------+----+----------+---------------------+--------------------------------------+---------+ | created_at | updated_at | deleted_at | id | key | value | instance_uuid | deleted | +---------------------+------------+---------------------+----+----------+---------------------+--------------------------------------+---------+ | 2014-11-04 12:57:10 | NULL | 2014-11-04 13:06:45 | 6 | director | microbosh-openstack | 5db5b17b-69f2-4f0a-bdd2-efe710268021 | 6 | +---------------------+------------+---------------------+----+----------+---------------------+--------------------------------------+---------+ 1 row in set (0.00 sec) mysql> I wonder how could that happen. Can I simply remove that entry from the shadow table (I am not really interested to keep it) or are there better (cleaner) way to fix the problem ? 
This Cloud is now running Ocata Thanks, Massimo [*] [root at cld-ctrl-01 ~]# nova-manage db archive_deleted_rows --max_rows 1000 --verbose An error has occurred: Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 1617, in main ret = fn(*fn_args, **fn_kwargs) File "/usr/lib/python2.7/site-packages/nova/cmd/manage.py", line 691, in archive_deleted_rows run = db.archive_deleted_rows(max_rows) File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 2040, in archive_deleted_rows return IMPL.archive_deleted_rows(max_rows=max_rows) File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 6564, in archive_deleted_rows tablename, max_rows=max_rows - total_rows_archived) File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 6513, in _archive_deleted_rows_for_table conn.execute(insert) File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 914, in execute return meth(self, multiparams, params) File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 323, in _execute_on_connection return connection._execute_clauseelement(self, multiparams, params) File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1010, in _execute_clauseelement compiled_sql, distilled_params File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context context) File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1337, in _handle_dbapi_exception util.raise_from_cause(newraise, exc_info) File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 200, in raise_from_cause reraise(type(exception), exception, tb=exc_tb) File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context context) File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 450, in do_execute cursor.execute(statement, parameters) File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 165, in execute result = self._query(query) File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 321, in _query conn.query(q) File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 860, in query self._affected_rows = self._read_query_result(unbuffered=unbuffered) File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1061, in _read_query_result result.read() File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1349, in read first_packet = self.connection._read_packet() File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1018, in _read_packet packet.check_error() File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 384, in check_error err.raise_mysql_exception(self._data) File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception raise errorclass(errno, errval) DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u"Duplicate entry '6' for key 'PRIMARY'") [SQL: u'INSERT INTO shadow_instance_metadata (created_at, updated_at, deleted_at, deleted, id, `key`, value, instance_uuid) SELECT instance_metadata.created_at, instance_metadata.updated_at, instance_metadata.deleted_at, instance_metadata.deleted, instance_metadata.id, instance_metadata.`key`, instance_metadata.value, instance_metadata.instance_uuid \nFROM instance_metadata \nWHERE instance_metadata.deleted != %(deleted_1)s ORDER BY instance_metadata.id \n LIMIT %(param_1)s'] [parameters: {u'param_1': 1, u'deleted_1': 0}] [root at 
cld-ctrl-01 ~]# -------------- next part -------------- An HTML attachment was scrubbed... URL: From dharmendra.kushwaha at india.nec.com Wed May 8 11:02:57 2019 From: dharmendra.kushwaha at india.nec.com (Dharmendra Kushwaha) Date: Wed, 8 May 2019 11:02:57 +0000 Subject: [Tacker] Train vPTG meetup schedule Message-ID: Dear all, We have planned to have our one-day virtual PTG meetup for Train cycle on below schedule. Please find the meeting details: Schedule: 14th May 2019, 08:00 UTC to 12:00 UTC Meeting Channel: https://bluejeans.com/614072564 Etherpad link: https://etherpad.openstack.org/p/Tacker-PTG-Train We will try to cover our topics within this schedule. If needed, we can extend it. Thanks & Regards Dharmendra Kushwaha From surya.seetharaman9 at gmail.com Wed May 8 11:50:47 2019 From: surya.seetharaman9 at gmail.com (Surya Seetharaman) Date: Wed, 8 May 2019 13:50:47 +0200 Subject: [nova][ops] 'Duplicate entry for primary key' problem running nova-manage db archive_deleted_rows In-Reply-To: References: Message-ID: Hi, On Wed, May 8, 2019 at 12:41 PM Massimo Sgaravatto < massimo.sgaravatto at gmail.com> wrote: > Hi > > Fron time to time I use to move entries related to deleted instances to > shadow tables, using the command: > > nova-manage db archive_deleted_rows > > This is now failing [*] for the instance_metadata table because of a > 'duplicate entry for the primary key' problem: > > DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u"Duplicate entry > '6' for key 'PRIMARY'") [SQL: u'INSERT INTO shadow_instance_metadata > (created_at, updated_at, deleted_at, deleted, id, `key`, value, > instance_uuid) SELECT instance_metadata.created_at, > instance_metadata.updated_at, instance_metadata.deleted_at, > instance_metadata.deleted, instance_metadata.id, instance_metadata.`key`, > instance_metadata.value, instance_metadata.instance_uuid \nFROM > instance_metadata \nWHERE instance_metadata.deleted != %(deleted_1)s ORDER > BY instance_metadata.id \n LIMIT %(param_1)s'] [parameters: {u'param_1': > 1, u'deleted_1': 0}] > > > Indeed: > > mysql> SELECT instance_metadata.created_at, instance_metadata.updated_at, > instance_metadata.deleted_at, instance_metadata.deleted, > instance_metadata.id, instance_metadata.`key`, instance_metadata.value, > instance_metadata.instance_uuid FROM instance_metadata WHERE > instance_metadata.deleted != 0 ORDER BY instance_metadata.id limit 1; > > +---------------------+------------+---------------------+---------+----+-------+-------+--------------------------------------+ > | created_at | updated_at | deleted_at | deleted | id | > key | value | instance_uuid | > > +---------------------+------------+---------------------+---------+----+-------+-------+--------------------------------------+ > | 2018-09-20 07:40:56 | NULL | 2018-09-20 07:54:26 | 6 | 6 | > group | node | a9000ff7-2298-454c-bf71-9e3c62ec0f3c | > > +---------------------+------------+---------------------+---------+----+-------+-------+--------------------------------------+ > 1 row in set (0.00 sec) > > > But there is a 5-years old entry (if I am not wrong we were running Havana > at that time) in the shadow table with that id: > > mysql> select * from shadow_instance_metadata where id='6'; > > +---------------------+------------+---------------------+----+----------+---------------------+--------------------------------------+---------+ > | created_at | updated_at | deleted_at | id | key | > value | instance_uuid | deleted | > > 
+---------------------+------------+---------------------+----+----------+---------------------+--------------------------------------+---------+ > | 2014-11-04 12:57:10 | NULL | 2014-11-04 13:06:45 | 6 | director | > microbosh-openstack | 5db5b17b-69f2-4f0a-bdd2-efe710268021 | 6 | > > +---------------------+------------+---------------------+----+----------+---------------------+--------------------------------------+---------+ > 1 row in set (0.00 sec) > > mysql> > > > I wonder how could that happen. > > Can I simply remove that entry from the shadow table (I am not really > interested to keep it) or are there better (cleaner) way to fix the problem > ? > > > This Cloud is now running Ocata > > Thanks, Massimo > > >From what I can understand, it looks like a record with id 6 was archived long back (havana-ish) and then there was a new record with id 6 again ready to be archived ? (not sure how there could have been two records with same id since ids are incremental even over releases, I am not sure of the history though since I wasn't involved with OS then). I think the only way out is to manually delete that entry from the shadow table if you don't want it. There should be no harm in removing it. We have a "nova-manage db purge [--all] [--before ] [--verbose] [--all-cells]" command that removes records from shadow_tables ( https://docs.openstack.org/nova/rocky/cli/nova-manage.html) but it was introduced in rocky. So it won't be available in Ocata unfortunately. Cheers, Surya. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alifshit at redhat.com Wed May 8 12:46:56 2019 From: alifshit at redhat.com (Artom Lifshitz) Date: Wed, 8 May 2019 08:46:56 -0400 Subject: [nova][CI] GPUs in the gate In-Reply-To: <3587e05d-deab-42ad-9a02-4312ca11760f@www.fastmail.com> References: <3587e05d-deab-42ad-9a02-4312ca11760f@www.fastmail.com> Message-ID: On Tue, May 7, 2019 at 8:00 PM Clark Boylan wrote: > > On Tue, May 7, 2019, at 10:48 AM, Artom Lifshitz wrote: > > Hey all, > > > > Following up on the CI session during the PTG [1], I wanted to get the > > ball rolling on getting GPU hardware into the gate somehow. Initially > > the plan was to do it through OpenLab and by convincing NVIDIA to > > donate the cards, but after a conversation with Sean McGinnis it > > appears Infra have access to machines with GPUs. > > > > From Nova's POV, the requirements are: > > * The machines with GPUs should probably be Ironic baremetal nodes and > > not VMs [*]. > > * The GPUs need to support virtualization. It's hard to get a > > comprehensive list of GPUs that do, but Nova's own docs [2] mention > > two: Intel cards with GVT [3] and NVIDIA GRID [4]. > > > > So I think at this point the question is whether Infra can support > > those reqs. If yes, we can start concrete steps towards getting those > > machines used by a CI job. If not, we'll fall back to OpenLab and try > > to get them hardware. > > What we currently have access to is a small amount of Vexxhost's GPU instances (so mnaser can further clarify my comments here). I believe these are VMs with dedicated nvidia gpus that are passed through. I don't think they support the vgpu feature. > > It might help to describe the use case you are trying to meet rather than jumping ahead to requirements/solutions. That way maybe we can work with Vexxhost to better support what you need (or come up with some other solutions). 
For those of us that don't know all of the particulars it really does help if you can go from use case to requirements. Right, apologies, I got ahead of myself. The use case is CI coverage for Nova's VGPU feature. This feature can be summarized (and oversimplified) as "SRIOV for GPUs": a single physical GPU can be split into multiple virtual GPUs (via libvirt's mdev support [5]), each one being assigned to a different guest. We have functional tests in-tree, but no tests with real hardware. So we're looking for a way to get real hardware in the gate. I hope that clarifies things. Let me know if there are further questions. [5] https://libvirt.org/drvnodedev.html#MDEVCap > > > > > [*] Could we do double-passthrough? Could the card be passed through > > to the L1 guest via the PCI passthrough mechanism, and then into the > > L2 guest via the mdev mechanism? > > > > [1] https://etherpad.openstack.org/p/nova-ptg-train-ci > > [2] https://docs.openstack.org/nova/rocky/admin/virtual-gpu.html > > [3] https://01.org/igvt-g > > [4] https://docs.nvidia.com/grid/5.0/pdf/grid-vgpu-user-guide.pdf From rosmaita.fossdev at gmail.com Wed May 8 13:18:17 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 8 May 2019 09:18:17 -0400 Subject: [glance] 9 May meeting cancelled Message-ID: The Glance team had a very productive PTG and needs some recovery time, so there will be no weekly meeting tomorrow (Thursday 9 May). The weekly meetings will resume at their usual time (14:00 UTC) on Thursday 16 May. For any issues that can't wait, people will be available as usual in #openstack-glance -- and there's always the ML. cheers, brian From fungi at yuggoth.org Wed May 8 13:27:10 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 8 May 2019 13:27:10 +0000 Subject: [nova][CI] GPUs in the gate In-Reply-To: References: <3587e05d-deab-42ad-9a02-4312ca11760f@www.fastmail.com> Message-ID: <20190508132709.xgq6nz3mqkfw3q5d@yuggoth.org> On 2019-05-08 08:46:56 -0400 (-0400), Artom Lifshitz wrote: [...] > The use case is CI coverage for Nova's VGPU feature. This feature can > be summarized (and oversimplified) as "SRIOV for GPUs": a single > physical GPU can be split into multiple virtual GPUs (via libvirt's > mdev support [5]), each one being assigned to a different guest. We > have functional tests in-tree, but no tests with real hardware. So > we're looking for a way to get real hardware in the gate. [...] Long shot, but since you just need the feature provided and not the performance it usually implies, are there maybe any open source emulators which provide the same instruction set for conformance testing purposes? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jaypipes at gmail.com Wed May 8 13:35:36 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Wed, 8 May 2019 09:35:36 -0400 Subject: [ops][nova]Logging in nova and other openstack projects In-Reply-To: <62034C21-91FC-4A9A-BC4B-47E372EAB925@planethoster.info> References: <62034C21-91FC-4A9A-BC4B-47E372EAB925@planethoster.info> Message-ID: Sorry for delayed response... comments inline. On 05/07/2019 05:31 PM, Jean-Philippe Méthot wrote: > Indeed, this is what was written in your original response as well as in > the documentation. As a result, it was fairly difficult to miss and I > did comment it out before restarting the service. 
Additionally, as per > the configuration I had set up, had the log-config-append option be set, > I wouldn’t have any INFO level log in my logs. Hence why I believe it is > strange that I have info level logs, when I’ve set default_log_levels > like this: > > default_log_levels > = amqp=WARN,amqplib=WARN,boto=WARN,qpid=WARN,sqlalchemy=WARN,suds=WARN,oslo.messaging=WARN,iso8601=WARN,requests.packages.urllib3.connectionpool=WARN,urllib3.connectionpool=WARN,websocket=WARN,requests.packages.urllib3.util.retry=WARN,urllib3.util.retry=WARN,keystonemiddleware=WARN,routes.middleware=WARN,stevedore=WARN,taskflow=WARN,keystoneauth=WARN,oslo.cache=WARN Do you see any of the above modules logging with INFO level, though? Or are you just seeing other modules (e.g. nova.*) logging at INFO level? If you are only seeing nova modules logging at INFO level, try adding: ,nova=WARN to the default_log_levels CONF option. Let us know if this works :) Best, -jay From rafaelweingartner at gmail.com Wed May 8 13:49:59 2019 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Wed, 8 May 2019 10:49:59 -0300 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: References: Message-ID: Hello Trinh, Where does the meeting happen? Will it be via IRC Telemetry channel? Or, in the Etherpad (https://etherpad.openstack.org/p/telemetry-meeting-agenda)? I would like to discuss and understand a bit better the context behind the Telemetry events deprecation. On Wed, May 8, 2019 at 12:19 AM Trinh Nguyen wrote: > Hi team, > > As planned, we will have a team meeting at 02:00 UTC, May 9th on > #openstack-telemetry to discuss what we gonna do for the next milestone > (Train-1) and continue what we left off from the last meeting. > > I put here [1] the agenda thinking that it should be fine for an hour > meeting. If you have anything to talk about, please put it there too. > > [1] https://etherpad.openstack.org/p/telemetry-meeting-agenda > > > Bests, > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed May 8 14:12:13 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 8 May 2019 09:12:13 -0500 Subject: [placement][nova][ptg] Summary: Consumer Types In-Reply-To: References: <1557135206.12068.1@smtp.office365.com> Message-ID: <93df3b21-149c-d32b-54d0-614597d4d754@gmail.com> On 5/6/2019 10:49 AM, Chris Dent wrote: >>> Still nova might want to fix this placement data inconsistency. I >>> guess the new placement microversion will allow to update the consumer >>> type of an allocation. >> >> Yeah, I think this has to be updated from Nova. I (and I imagine others) >> would like to avoid making the type field optional in the API. So maybe >> default the value to something like "incomplete" or "unknown" and then >> let nova correct this naturally for instances on host startup and >> migrations on complete/revert. Ideally nova will be one one of the users >> that wants to depend on the type string, so we want to use our knowledge >> of which is which to get existing allocations updated so we can depend >> on the type value later. > > Ah, okay, good. If something like "unknown" is workable I think > that's much much better than defaulting to instance. Thanks. Yup I agree with everything said from a nova perspective. Our public cloud operators were just asking about leaked allocations and if there was tooling to report and clean that kind of stuff up. 
I explained we have the heal_allocations CLI but that's only going to create allocations for *instances* and only if those instances aren't deleted, but we don't have anything in nova that deals with detection and cleanup of leaked allocations, sort of like what this tooling does [1] but I think is different. So I was thinking about how we could write something in nova that reads the allocations from placement and checks to see if there is anything in there that doesn't match what we have for instances or migrations, i.e. the server was deleted but for whatever reason an allocation was leaked. To be able to determine what allocations are nova-specific today we'd have to guess based on the resource classes being used, namely VCPU and/or MEMORY_MB, but it of course gets more complicated once we start adding supported for nested allocations and such. So consumer type will help here, but we need it more than from the GET /usages API I think. If I were writing that kind of report/cleanup tool today, I'd probably want a GET /allocations API, but that might be too heavy (it would definitely require paging support I think). I could probably get by with using GET /resource_providers/{uuid}/allocations for each compute node we have in nova, but again that starts to get complicated with nested providers (what if the allocations are for VGPU?). Anyway, from a "it's better to have something than nothing at all" perspective it's probably easiest to just start with the easy thing and ask placement for allocations on all compute node providers and cross-check those consumers against what's in nova and if we find allocations that don't have a matching migration or instance we could optional delete them. [1] https://github.com/larsks/os-placement-tools -- Thanks, Matt From fungi at yuggoth.org Wed May 8 14:27:58 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 8 May 2019 14:27:58 +0000 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> Message-ID: <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> On 2019-05-07 22:50:21 +0200 (+0200), Dirk Müller wrote: > Am Di., 7. Mai 2019 um 22:30 Uhr schrieb Matthew Thode : > > > Pike - 2.18.2 -> 2.20.1 - https://review.opendev.org/640727 > > Queens - 2.18.4 -> 2.20.1 - https://review.opendev.org/640710 > > Specifically it looks like we're already at the next issue, as tracked here: > > https://github.com/kennethreitz/requests/issues/5065 > > Any concerns from anyone on these newer urllib3 updates? I guess we'll > do them a bit later though. It's still unclear to me why we're doing this at all. Our stable constraints lists are supposed to be a snapshot in time from when we released, modulo stable point release updates of the libraries we're maintaining. Agreeing to bump random dependencies on stable branches because of security vulnerabilities in them is a slippery slope toward our users expecting the project to be on top of vulnerability announcements for every one of the ~600 packages in our constraints list. Deployment projects already should not depend on our requirements team tracking security vulnerabilities, so need to have a mechanism to override constraints entries anyway if they're making such guarantees to their users (and I would also caution against doing that too). 
Distributions are far better equipped than our project to handle such tracking, as they generally get advance notice of vulnerabilities and selectively backport fixes for them. Trying to accomplish the same with a mix of old and new dependency versions in our increasingly aging stable and extended maintenance branches seems like a disaster waiting to happen. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Wed May 8 14:39:23 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 8 May 2019 14:39:23 +0000 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: References: Message-ID: <20190508143923.bhmla62qi2p7yc7s@yuggoth.org> On 2019-05-07 15:06:10 -0500 (-0500), Jay Bryant wrote: > Cinder has been working with the same unwritten rules for quite some time as > well with minimal issues. > > I think the concerns about not having it documented are warranted.  We have > had question about it in the past with no documentation to point to.  It is > more or less lore that has been passed down over the releases.  :-) > > At a minimum, having this e-mail thread is helpful.  If, however, we decide > to document it I think we should have it consistent across the teams that > use the rule.  I would be happy to help draft/review any such documentation. [...] I have a feeling that a big part of why it's gone undocumented for so long is that putting it in writing risks explicitly sending the message that we don't trust our contributors to act in the best interests of the project even if those are not aligned with the interests of their employer/sponsor. I think many of us attempt to avoid having all activity on a given patch come from people with the same funding affiliation so as to avoid giving the impression that any one organization is able to ram changes through with no oversight, but more because of the outward appearance than because we don't trust ourselves or our colleagues. Documenting our culture is a good thing, but embodying that documentation with this sort of nuance can be challenging. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dangtrinhnt at gmail.com Wed May 8 14:41:07 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Wed, 8 May 2019 23:41:07 +0900 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: References: Message-ID: Hi Rafael, The meeting will be held on the IRC channel #openstack-telemetry as mentioned in the previous email. Thanks, On Wed, May 8, 2019 at 10:50 PM Rafael Weingärtner < rafaelweingartner at gmail.com> wrote: > Hello Trinh, > Where does the meeting happen? Will it be via IRC Telemetry channel? Or, > in the Etherpad (https://etherpad.openstack.org/p/telemetry-meeting-agenda)? > I would like to discuss and understand a bit better the context behind the Telemetry > events deprecation. > > On Wed, May 8, 2019 at 12:19 AM Trinh Nguyen > wrote: > >> Hi team, >> >> As planned, we will have a team meeting at 02:00 UTC, May 9th on >> #openstack-telemetry to discuss what we gonna do for the next milestone >> (Train-1) and continue what we left off from the last meeting. >> >> I put here [1] the agenda thinking that it should be fine for an hour >> meeting. If you have anything to talk about, please put it there too. 
>> >> [1] https://etherpad.openstack.org/p/telemetry-meeting-agenda >> >> >> Bests, >> >> -- >> *Trinh Nguyen* >> *www.edlab.xyz * >> >> > > -- > Rafael Weingärtner > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From jp.methot at planethoster.info Wed May 8 14:43:41 2019 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Wed, 8 May 2019 10:43:41 -0400 Subject: [ops][nova]Logging in nova and other openstack projects In-Reply-To: References: <62034C21-91FC-4A9A-BC4B-47E372EAB925@planethoster.info> Message-ID: <53BF2204-988C-4ED6-A687-F6188B90C547@planethoster.info> Hi, Indeed, the remaining info messages were coming from the nova-compute resource tracker. Adding nova=WARN in the list did remove these messages. Thank you very much for your help. Best regards, Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. > Le 8 mai 2019 à 09:35, Jay Pipes a écrit : > > Sorry for delayed response... comments inline. > > On 05/07/2019 05:31 PM, Jean-Philippe Méthot wrote: >> Indeed, this is what was written in your original response as well as in the documentation. As a result, it was fairly difficult to miss and I did comment it out before restarting the service. Additionally, as per the configuration I had set up, had the log-config-append option be set, I wouldn’t have any INFO level log in my logs. Hence why I believe it is strange that I have info level logs, when I’ve set default_log_levels like this: >> default_log_levels = amqp=WARN,amqplib=WARN,boto=WARN,qpid=WARN,sqlalchemy=WARN,suds=WARN,oslo.messaging=WARN,iso8601=WARN,requests.packages.urllib3.connectionpool=WARN,urllib3.connectionpool=WARN,websocket=WARN,requests.packages.urllib3.util.retry=WARN,urllib3.util.retry=WARN,keystonemiddleware=WARN,routes.middleware=WARN,stevedore=WARN,taskflow=WARN,keystoneauth=WARN,oslo.cache=WARN > > Do you see any of the above modules logging with INFO level, though? Or are you just seeing other modules (e.g. nova.*) logging at INFO level? > > If you are only seeing nova modules logging at INFO level, try adding: > > ,nova=WARN > > to the default_log_levels CONF option. > > Let us know if this works :) > > Best, > -jay > -------------- next part -------------- An HTML attachment was scrubbed... URL: From surya.seetharaman9 at gmail.com Wed May 8 14:47:14 2019 From: surya.seetharaman9 at gmail.com (Surya Seetharaman) Date: Wed, 8 May 2019 16:47:14 +0200 Subject: [placement][nova][ptg] Summary: Consumer Types In-Reply-To: References: <1557135206.12068.1@smtp.office365.com> Message-ID: On Mon, May 6, 2019 at 5:51 PM Chris Dent wrote: > On Mon, 6 May 2019, Dan Smith wrote: > > >> Still nova might want to fix this placement data inconsistency. I > >> guess the new placement microversion will allow to update the consumer > >> type of an allocation. > > > > Yeah, I think this has to be updated from Nova. I (and I imagine others) > > would like to avoid making the type field optional in the API. So maybe > > default the value to something like "incomplete" or "unknown" and then > > let nova correct this naturally for instances on host startup and > > migrations on complete/revert. Ideally nova will be one one of the users > > that wants to depend on the type string, so we want to use our knowledge > > of which is which to get existing allocations updated so we can depend > > on the type value later. > > Ah, okay, good. 
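Back on the logging thread just above: the fix Jean-Philippe confirmed amounts to one extra entry at the end of the oslo.log option he had already set. A trimmed sketch of the nova.conf line, keeping only a few of the module entries for brevity:

[DEFAULT]
# Loggers not listed here stay at the service default level (INFO).
default_log_levels = amqp=WARN,sqlalchemy=WARN,oslo.messaging=WARN,keystonemiddleware=WARN,stevedore=WARN,nova=WARN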
If something like "unknown" is workable I think > that's much much better than defaulting to instance. Thanks. > okay, the spec will take this approach then. Regards, Surya. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Wed May 8 14:56:06 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 8 May 2019 16:56:06 +0200 Subject: [ironic][tripleo] My PTG & Forum notes In-Reply-To: <896f2331-139d-acfe-5115-248411eb6b35@redhat.com> References: <7313c6aa-1693-2cb0-4ed9-a73646764070@redhat.com> <896f2331-139d-acfe-5115-248411eb6b35@redhat.com> Message-ID: <80510197-88fe-e4c0-6cd8-d68e2b38e28c@redhat.com> On 5/8/19 11:18 AM, Bogdan Dobrelya wrote: > On 07.05.2019 19:47, Dmitry Tantsur wrote: >> Hi folks, >> >> I've published my personal notes from the PTG & Forum in Denver: >> https://dtantsur.github.io/posts/ironic-denver-2019/ >> They're probably opinionated and definitely not complete, but I still think >> they could be useful. >> >> Also pasting the whole raw RST text below for ease of commenting. >> >> Cheers, >> Dmitry >> >> >> Keynotes >> ======== >> >> The `Metal3`_ project got some spotlight during the keynotes. A (successful!) >> `live demo`_ was done that demonstrated using Ironic through Kubernetes API to >> drive provisioning of bare metal nodes. > > this is very interesting to consider for TripleO integration alongside (or > alternatively?) standalone Ironic, see my note below > >> >> The official `bare metal program`_ was announced to promote managing bare metal >> infrastructure via OpenStack. >> >> >> PTG: TripleO >> ============ >> >> We discussed our plans for removing Nova from the TripleO undercloud and >> moving bare metal provisioning from under control of Heat. The plan from the > > I wish we could have Metal3 provisioning via K8s API adapted for Undercloud in > TripleO. Probably via a) standalone kubelet or b) k3s [0]. > The former provides only kubelet running static pods, no API server et al. The > latter is a lightweight k8s distro (a 10MB memory footprint or so) and may be as > well used to spawn some very limited kubelet and API server setup for Metal3 to > drive the provisioning of overclouds outside of Heat and Neutron. We could use Metal3, but it will definitely change user experience beyond the point of recognition and rule out upgrades. With the current effort we're trying to keep the user interactions similar and upgrades still possible. Dmitry > > [0] > https://www.cnrancher.com/blog/2019/2019-02-26-introducing-k3s-the-lightweight-kubernetes-distribution-built-for-the-edge/ > > >> `nova-less-deploy specification`_, as well as the current state >> of the implementation, were presented. >> >> The current concerns are: >> >> * upgrades from a Nova based deployment (probably just wipe the Nova >>    database), >> * losing user experience of ``nova list`` (largely compensated by >>    ``metalsmith list``), >> * tracking IP addresses for networks other than *ctlplane* (solved the same >>    way as for deployed servers). >> >> The next action item is to create a CI job based on the already merged code and >> verify a few assumptions made above. 
>> From massimo.sgaravatto at gmail.com Wed May 8 15:04:10 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Wed, 8 May 2019 17:04:10 +0200 Subject: [nova][ops] 'Duplicate entry for primary key' problem running nova-manage db archive_deleted_rows In-Reply-To: References: Message-ID: The problem is not for that single entry Looks like the auto_increment for that table was reset (I don't know when-how) Cheers, Massimo On Wed, May 8, 2019 at 1:50 PM Surya Seetharaman < surya.seetharaman9 at gmail.com> wrote: > Hi, > > On Wed, May 8, 2019 at 12:41 PM Massimo Sgaravatto < > massimo.sgaravatto at gmail.com> wrote: > >> Hi >> >> Fron time to time I use to move entries related to deleted instances to >> shadow tables, using the command: >> >> nova-manage db archive_deleted_rows >> >> This is now failing [*] for the instance_metadata table because of a >> 'duplicate entry for the primary key' problem: >> >> DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u"Duplicate entry >> '6' for key 'PRIMARY'") [SQL: u'INSERT INTO shadow_instance_metadata >> (created_at, updated_at, deleted_at, deleted, id, `key`, value, >> instance_uuid) SELECT instance_metadata.created_at, >> instance_metadata.updated_at, instance_metadata.deleted_at, >> instance_metadata.deleted, instance_metadata.id, >> instance_metadata.`key`, instance_metadata.value, >> instance_metadata.instance_uuid \nFROM instance_metadata \nWHERE >> instance_metadata.deleted != %(deleted_1)s ORDER BY instance_metadata.id >> \n LIMIT %(param_1)s'] [parameters: {u'param_1': 1, u'deleted_1': 0}] >> >> >> Indeed: >> >> mysql> SELECT instance_metadata.created_at, instance_metadata.updated_at, >> instance_metadata.deleted_at, instance_metadata.deleted, >> instance_metadata.id, instance_metadata.`key`, instance_metadata.value, >> instance_metadata.instance_uuid FROM instance_metadata WHERE >> instance_metadata.deleted != 0 ORDER BY instance_metadata.id limit 1; >> >> +---------------------+------------+---------------------+---------+----+-------+-------+--------------------------------------+ >> | created_at | updated_at | deleted_at | deleted | id | >> key | value | instance_uuid | >> >> +---------------------+------------+---------------------+---------+----+-------+-------+--------------------------------------+ >> | 2018-09-20 07:40:56 | NULL | 2018-09-20 07:54:26 | 6 | 6 | >> group | node | a9000ff7-2298-454c-bf71-9e3c62ec0f3c | >> >> +---------------------+------------+---------------------+---------+----+-------+-------+--------------------------------------+ >> 1 row in set (0.00 sec) >> >> >> But there is a 5-years old entry (if I am not wrong we were running >> Havana at that time) in the shadow table with that id: >> >> mysql> select * from shadow_instance_metadata where id='6'; >> >> +---------------------+------------+---------------------+----+----------+---------------------+--------------------------------------+---------+ >> | created_at | updated_at | deleted_at | id | key >> | value | instance_uuid | deleted | >> >> +---------------------+------------+---------------------+----+----------+---------------------+--------------------------------------+---------+ >> | 2014-11-04 12:57:10 | NULL | 2014-11-04 13:06:45 | 6 | director >> | microbosh-openstack | 5db5b17b-69f2-4f0a-bdd2-efe710268021 | 6 | >> >> +---------------------+------------+---------------------+----+----------+---------------------+--------------------------------------+---------+ >> 1 row in set (0.00 sec) >> >> mysql> >> >> >> I wonder how could 
that happen. >> >> Can I simply remove that entry from the shadow table (I am not really >> interested to keep it) or are there better (cleaner) way to fix the problem >> ? >> >> >> This Cloud is now running Ocata >> >> Thanks, Massimo >> >> > From what I can understand, it looks like a record with id 6 was archived > long back (havana-ish) and then there was a new record with id 6 again > ready to be archived ? (not sure how there could have been two records with > same id since ids are incremental even over releases, I am not sure of the > history though since I wasn't involved with OS then). I think the only way > out is to manually delete that entry from the shadow table if you don't > want it. There should be no harm in removing it. > > We have a "nova-manage db purge [--all] [--before ] [--verbose] > [--all-cells]" command that removes records from shadow_tables ( > https://docs.openstack.org/nova/rocky/cli/nova-manage.html) but it was > introduced in rocky. So it won't be available in Ocata unfortunately. > > Cheers, > Surya. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From waboring at hemna.com Wed May 8 15:04:59 2019 From: waboring at hemna.com (Walter Boring) Date: Wed, 8 May 2019 11:04:59 -0400 Subject: [cinder] Python3 requirements for Train Message-ID: Hello Cinder folks, The train release is going to be the last release of OpenStack with python 2 support. Train also is going to require supporting python 3.6 and 3.7. This means that we should be enabling and or switching over all of our 3rd party CI runs to python 3 to ensure that our drivers and all of their required libraries run properly in a python 3.6/3.7 environment. This will help driver maintainers discover any python3 incompatibilities with their driver as well as any required libraries. At the PTG in Denver, the cinder team agreed that we wanted driver CI systems to start using python3 by milestone 2 for Train. This would be the July 22-26th time frame [1]. We are also working on adding driver library requirements to the OpenStack global requirements project [2] [3]. This effort will provide native install primitives for driver libraries in cinder. This process also requires the driver libraries to run in python3.6/3.7. The Cinder team wants to maintain it's high quality of driver support in the train release. By enabling python 3.6 and python 3.7 in CI tests, this will help everyone ship Cinder with the required support in Train and the following releases. Walt [1] https://releases.openstack.org/train/schedule.html [2] https://review.opendev.org/#/c/656724/ [3] https://review.opendev.org/#/c/657395/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rleander at redhat.com Thu May 2 04:35:10 2019 From: rleander at redhat.com (Rain Leander) Date: Wed, 1 May 2019 22:35:10 -0600 Subject: [ptg] Interviews at PTG Denver Message-ID: Hello all! I'm attending PTG this week to conduct project interviews [0]. These interviews have several purposes. Please consider all of the following when thinking about what you might want to say in your interview: * Tell the users/customers/press what you've been working on in Rocky * Give them some idea of what's (what might be?) 
coming in Stein * Put a human face on the OpenStack project and encourage new participants to join us * You're welcome to promote your company's involvement in OpenStack but we ask that you avoid any kind of product pitches or job recruitment In the interview I'll ask some leading questions and it'll go easier if you've given some thought to them ahead of time: * Who are you? (Your name, your employer, and the project(s) on which you are active.) * What did you accomplish in Rocky? (Focus on the 2-3 things that will be most interesting to cloud operators) * What do you expect to be the focus in Stein? (At the time of your interview, it's likely that the meetings will not yet have decided anything firm. That's ok.) * Anything further about the project(s) you work on or the OpenStack community in general. Finally, note that there are only 40 interview slots available, so please consider coordinating with your project to designate the people that you want to represent the project, so that we don't end up with 12 interview about Neutron, or whatever. I mean, love me some Neutron, but twelve interviews is a bit too many, eh? It's fine to have multiple people in one interview - Maximum 3, probably. Interview slots are 30 minutes, in which time we hope to capture somewhere between 10 and 20 minutes of content. It's fine to run shorter but 15 minutes is probably an ideal length. See you SOON! [0] https://docs.google.com/spreadsheets/d/1xZosqEL_iRI1Q-A5j-guRVh6Gc8-rRQ6SqKvHZHHxBg/edit?usp=sharing -- K Rain Leander OpenStack Community Liaison Open Source and Standards Team https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From sneha.rai at hpe.com Thu May 2 08:07:18 2019 From: sneha.rai at hpe.com (RAI, SNEHA) Date: Thu, 2 May 2019 08:07:18 +0000 Subject: Help needed to Support Multi-attach feature Message-ID: Hi Team, I am currently working on multiattach feature for HPE 3PAR cinder driver. For this, while setting up devstack(on stable/queens) I made below change in the local.conf [[local|localrc]] ENABLE_VOLUME_MULTIATTACH=True ENABLE_UBUNTU_CLOUD_ARCHIVE=False /etc/cinder/cinder.conf: [3pariscsi_1] hpe3par_api_url = https://192.168.1.7:8080/api/v1 hpe3par_username = user hpe3par_password = password san_ip = 192.168.1.7 san_login = user san_password = password volume_backend_name = 3pariscsi_1 hpe3par_cpg = my_cpg hpe3par_iscsi_ips = 192.168.11.2,192.168.11.3 volume_driver = cinder.volume.drivers.hpe.hpe_3par_iscsi.HPE3PARISCSIDriver hpe3par_iscsi_chap_enabled = True hpe3par_debug = True image_volume_cache_enabled = True /etc/cinder/policy.json: 'volume:multiattach': 'rule:admin_or_owner' Added https://review.opendev.org/#/c/560067/2/cinder/volume/drivers/hpe/hpe_3par_common.py change in the code. But I am getting below error in the nova log: Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [None req-2cda6e90-fd45-4bfe-960a-7fca9ba4abab demo admin] [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Instance failed block device setup: MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. 
Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Traceback (most recent call last): Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/compute/manager.py", line 1615, in _prep_block_device Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] wait_func=self._await_block_device_map_created) Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 840, in attach_block_devices Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] _log_and_attach(device) Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 837, in _log_and_attach Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] bdm.attach(*attach_args, **attach_kwargs) Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 46, in wrapped Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] ret_val = method(obj, context, *args, **kwargs) Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 620, in attach Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] virt_driver, do_driver_attach) Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] return f(*args, **kwargs) Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 617, in _do_locked_attach Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] self._do_attach(*args, **_kwargs) Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 602, in _do_attach Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] do_driver_attach) Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 509, in _volume_attach Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] volume_id=volume_id) Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: 
fcaa5a47-fc48-489d-9827-6533bfd1a9fa] MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Apr 29 05:41:20 CSSOSBE04-B09 nova-compute[20455]: DEBUG nova.virt.libvirt.driver [-] Volume multiattach is not supported based on current versions of QEMU and libvirt. QEMU must be less than 2.10 or libvirt must be greater than or equal to 3.10. {{(pid=20455) _set_multiattach_support /opt/stack/nova/nova/virt/libvirt/driver.py:619}} stack at CSSOSBE04-B09:/tmp$ virsh --version 3.6.0 stack at CSSOSBE04-B09:/tmp$ kvm --version QEMU emulator version 2.10.1(Debian 1:2.10+dfsg-0ubuntu3.8~cloud1) Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers openstack volume show -c multiattach -c status sneha1 +-------------+-----------+ | Field | Value | +-------------+-----------+ | multiattach | True | | status | available | +-------------+-----------+ cinder extra-specs-list +--------------------------------------+-------------+--------------------------------------------------------------------+ | ID | Name | extra_specs | +--------------------------------------+-------------+--------------------------------------------------------------------+ | bd077fde-51c3-4581-80d5-5855e8ab2f6b | 3pariscsi_1 | {'volume_backend_name': '3pariscsi_1', 'multiattach': ' True'}| +--------------------------------------+-------------+--------------------------------------------------------------------+ echo $OS_COMPUTE_API_VERSION 2.60 pip list | grep python-novaclient DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. python-novaclient 13.0.0 How do I fix this version issue on my setup to proceed? Please help. Thanks & Regards, Sneha Rai -------------- next part -------------- An HTML attachment was scrubbed... URL: From gn01737625 at gmail.com Thu May 2 09:13:35 2019 From: gn01737625 at gmail.com (Ming-Che Liu) Date: Thu, 2 May 2019 17:13:35 +0800 Subject: [Deploy problem] deploy openstack+monasca with kolla-ansible 8.0.0.0rc1. [kolla] In-Reply-To: <10f217bf-33a2-d40a-8bcf-6994c26be699@stackhpc.com> References: <10f217bf-33a2-d40a-8bcf-6994c26be699@stackhpc.com> Message-ID: Hello, Thank you for replying, my goal is to deploy [all-in-one] openstack+monasca(in the same physical machine/VM). I will check the detail error information and provide such logs for you, thank you. I also have a question about kolla-ansible 8.0.0.0rc1, when I check the new feature about kolla-ansible 8.0.0.0rc1, it seems only 8.0.0.0rc1 provide the "complete" monasca functionality, it that right(that means you can see monasca's plugin in openstack horizon, as the following picture)? Thank you very much. Regards, Shawn [image: monasca.png] Doug Szumski 於 2019年5月2日 週四 下午4:21寫道: > > On 01/05/2019 08:45, Ming-Che Liu wrote: > > Hello, > > > > I deployed openstack+monasca with kolla-ansible 8.0.0.0rc1. > > It doesn't look like Monasca is enabled in your globals.yml file. Are > you trying to set up OpenStack services first and then enable Monasca > afterwards? 
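On the globals.yml point in the quoted reply: turning on the Monasca roles in kolla-ansible is a single switch, roughly as sketched below; the other Monasca options are covered in the guide linked in the next line.

# globals.yml (sketch)
enable_monasca: "yes"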
You can also deploy Monasca standalone if that is useful: > > > https://docs.openstack.org/kolla-ansible/latest/reference/logging-and-monitoring/monasca-guide.html > > > > > I follow the steps as mentioned in > > https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html > > > > The setting in my computer's globals.yml as same as [Quick Start] > > tutorial (attached file: globals.yml is my setting). > > > > My machine environment as following: > > OS: Ubuntu 16.04 > > Kolla-ansible verions: 8.0.0.0rc1 > > ansible version: 2.7 > > > > When I execute [bootstrap-servers] and [prechecks], it seems ok (no > > fatal error or any interrupt). > > > > But when I execute [deploy], it will occur some error about > > rabbitmq(when I set enable_rabbitmq:yes) and nova compute service(when > > I set enable_rabbitmq:no). > > > > I have some detail screenshot about the errors as attached files, > > could you please help me to solve this problem? > > Please can you post more information on why the containers are not > starting. > > - Inspect rabbit and nova-compute logs (in > /var/lib/docker/volumes/kolla_logs/_data/) > > - Check relevant containers are running, and if they are restarting > check the output. Eg. docker logs --follow nova_compute > > > > > Thank you very much. > > > > [Attached file description]: > > globals.yml: my computer's setting about kolla-ansible > > > > As mentioned above, the following pictures show the errors, the > > rabbitmq error will occur if I set [enable_rabbitmq:yes], the nova > > compute service error will occur if I set [enable_rabbitmq:no]. > > docker-version.png > > kolla-ansible-version.png > > nova-compute-service-error.png > > rabbitmq_error.png > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: monasca.png Type: image/png Size: 18872 bytes Desc: not available URL: From shyambiradarsggsit at gmail.com Thu May 2 15:59:56 2019 From: shyambiradarsggsit at gmail.com (Shyam Biradar) Date: Thu, 2 May 2019 21:29:56 +0530 Subject: Kolla-ansible pike nova compute keeps restarting Message-ID: Hi, I am setting up all-in-one ubuntu based kolla-ansible pike openstack. Deployment is failing at following ansible task: TASK [nova : include_tasks] ****************************** ************************************************************ **************************** included: /root/virtnev/share/kolla-ansible/ansible/roles/nova/tasks/discover_computes.yml for localhost TASK [nova : Waiting for nova-compute service up] ************************************************************ ************************************ FAILED - RETRYING: Waiting for nova-compute service up (20 retries left). FAILED - RETRYING: Waiting for nova-compute service up (19 retries left). FAILED - RETRYING: Waiting for nova-compute service up (18 retries left). FAILED - RETRYING: Waiting for nova-compute service up (17 retries left). FAILED - RETRYING: Waiting for nova-compute service up (16 retries left). FAILED - RETRYING: Waiting for nova-compute service up (15 retries left). FAILED - RETRYING: Waiting for nova-compute service up (14 retries left). FAILED - RETRYING: Waiting for nova-compute service up (13 retries left). FAILED - RETRYING: Waiting for nova-compute service up (12 retries left). FAILED - RETRYING: Waiting for nova-compute service up (11 retries left). FAILED - RETRYING: Waiting for nova-compute service up (10 retries left). 
FAILED - RETRYING: Waiting for nova-compute service up (9 retries left). FAILED - RETRYING: Waiting for nova-compute service up (8 retries left). FAILED - RETRYING: Waiting for nova-compute service up (7 retries left). FAILED - RETRYING: Waiting for nova-compute service up (6 retries left). FAILED - RETRYING: Waiting for nova-compute service up (5 retries left). FAILED - RETRYING: Waiting for nova-compute service up (4 retries left). FAILED - RETRYING: Waiting for nova-compute service up (3 retries left). FAILED - RETRYING: Waiting for nova-compute service up (2 retries left). FAILED - RETRYING: Waiting for nova-compute service up (1 retries left). fatal: [localhost -> localhost]: FAILED! => {"attempts": 20, "changed": false, "cmd": ["docker", "exec", "kolla_toolbox", "openstack", "--os-interface", "internal", "--os-auth-url", "http://192.168.122.151:35357 ", "--os-identity-api-version", "3", "--os-project-domain-name", "default", "--os-tenant-name", "admin", "--os-username", "admin", "--os-password", " ivpu1km8qxnVQESvAF4cyTFstOvrbxGUHjFF15gZ", "--os-user-domain-name", "default", "compute", "service", "list", "-f", "json", "--service", "nova-compute"], "delta": "0:00:02.555356", "end": "2019-05-02 09:24:45.485786", "rc": 0, "start": "2019-05-02 09:24:42.930430", "stderr": "", "stderr_lines": [], "stdout": "[]", "stdout_lines": ["[]"]} -------------------------------------------------------------------- I can see following stack trace in nova-compute container log 4. 2019-05-02 08:21:30.522 7 INFO nova.service [-] Starting compute node (version 16.1.7) 2019-05-02 08:21:30.524 7 ERROR oslo_service.service [-] Error starting thread.: PlacementNotConfigured: This compute is not configured to talk to the placement service. Configure the [placement] section of nova.conf and restart the service. 2019-05-02 08:21:30.524 7 ERROR oslo_service.service Traceback (most recent call last): 2019-05-02 08:21:30.524 7 ERROR oslo_service.service File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_service/service.py", line 721, in run_service 2019-05-02 08:21:30.524 7 ERROR oslo_service.service service.start() 2019-05-02 08:21:30.524 7 ERROR oslo_service.service File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/service.py", line 156, in start 2019-05-02 08:21:30.524 7 ERROR oslo_service.service self.manager.init_host() 2019-05-02 08:21:30.524 7 ERROR oslo_service.service File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 1155, in init_host 2019-05-02 08:21:30.524 7 ERROR oslo_service.service raise exception. PlacementNotConfigured() 2019-05-02 08:21:30.524 7 ERROR oslo_service.service PlacementNotConfigured: This compute is not configured to talk to the placement service. Configure the [placement] section of nova.conf and restart the service. 2019-05-02 08:21:30.524 7 ERROR oslo_service.service 2019-05-02 08:21:59.229 7 INFO os_vif [-] Loaded VIF plugins: ovs, linux_bridge --------------------------------------------------------------------- I saw nova-compute nova.conf has [placement] section configured well and it's same as nova_api's placement section. Other nova containers are started well. Any thoughts? Best Regards, Shyam Biradar Email Id: shyambiradarsggsit at gmail.com Contact: 8600266938 -------------- next part -------------- An HTML attachment was scrubbed... 
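For anyone hitting the same PlacementNotConfigured trace as above: the [placement] block the error message refers to looks roughly like the sketch below on Pike. The values are placeholders modelled on the auth URL from the task output, not a verified kolla-generated configuration.

[placement]
# Substitute the real region, keystone URL and service credentials.
auth_type = password
auth_url = http://192.168.122.151:35357/v3
os_region_name = RegionOne
project_domain_name = Default
user_domain_name = Default
project_name = service
username = placement
password = placement-service-password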
URL: From gn01737625 at gmail.com Fri May 3 01:22:00 2019 From: gn01737625 at gmail.com (Ming-Che Liu) Date: Fri, 3 May 2019 09:22:00 +0800 Subject: [Deploy problem] deploy openstack+monasca with kolla-ansible 8.0.0.0rc1. In-Reply-To: References: Message-ID: Hi Mark, Sure, I will do that, thanks. Regards, Ming-Che Mark Goddard 於 2019年5月3日 週五 上午1:12寫道: > > > On Wed, 1 May 2019 at 17:10, Ming-Che Liu wrote: > >> Hello, >> >> I deployed openstack+monasca with kolla-ansible 8.0.0.0rc1. >> >> I follow the steps as mentioned in >> https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html >> >> The setting in my computer's globals.yml as same as [Quick Start] >> tutorial (attached file: globals.yml is my setting). >> >> My machine environment as following: >> OS: Ubuntu 16.04 >> Kolla-ansible verions: 8.0.0.0rc1 >> ansible version: 2.7 >> >> When I execute [bootstrap-servers] and [prechecks], it seems ok (no fatal >> error or any interrupt). >> >> But when I execute [deploy], it will occur some error about rabbitmq(when >> I set enable_rabbitmq:yes) and nova compute service(when I >> set enable_rabbitmq:no). >> >> I have some detail screenshot about the errors as attached files, could >> you please help me to solve this problem? >> >> Thank you very much. >> >> [Attached file description]: >> globals.yml: my computer's setting about kolla-ansible >> >> As mentioned above, the following pictures show the errors, the rabbitmq >> error will occur if I set [enable_rabbitmq:yes], the nova compute service >> error will occur if I set [enable_rabbitmq:no]. >> > > Hi Ming-Che, > > Since Stein, we no longer test Kolla Ansible with Ubuntu 16.04 upstream. > Could you try again using Ubuntu 18.04? > > Regards, > Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gn01737625 at gmail.com Fri May 3 07:26:28 2019 From: gn01737625 at gmail.com (Ming-Che Liu) Date: Fri, 3 May 2019 15:26:28 +0800 Subject: [Deploy problem] deploy openstack+monasca with kolla-ansible 8.0.0.0rc1. In-Reply-To: References: Message-ID: Hi Mark, I tried to deploy openstack+monasca with kolla-ansible 8.0.0.0rc1(in the same machine), but still encounter some fatal error. The attached file:golbals.yml is my setting, machine_package_setting is machine environment setting. The error is: RUNNING HANDLER [rabbitmq : Waiting for rabbitmq to start on first node] ************************************************************ fatal: [localhost]: FAILED! => {"changed": true, "cmd": "docker exec rabbitmq rabbitmqctl wait /var/lib/rabbitmq/mnesia/rabbitmq.pid", "delta": "0:00:00.861054", "end": "2019-05-03 15:17:42.387873", "msg": "non-zero return code", "rc": 137, "start": "2019-05-03 15:17:41.526819", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []} When I use command "docker inspect rabbitmq_id |grep RestartCount", I find rabbitmq will restart many times such as: kaga at agre-an21:~$ sudo docker inspect 5567f37cc78a |grep RestartCount "RestartCount": 15, Could please help to solve this problem? Thanks. Regards, Ming-Che Ming-Che Liu 於 2019年5月3日 週五 上午9:22寫道: > Hi Mark, > > Sure, I will do that, thanks. > > Regards, > > Ming-Che > > Mark Goddard 於 2019年5月3日 週五 上午1:12寫道: > >> >> >> On Wed, 1 May 2019 at 17:10, Ming-Che Liu wrote: >> >>> Hello, >>> >>> I deployed openstack+monasca with kolla-ansible 8.0.0.0rc1. 
>>> >>> I follow the steps as mentioned in >>> https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html >>> >>> The setting in my computer's globals.yml as same as [Quick Start] >>> tutorial (attached file: globals.yml is my setting). >>> >>> My machine environment as following: >>> OS: Ubuntu 16.04 >>> Kolla-ansible verions: 8.0.0.0rc1 >>> ansible version: 2.7 >>> >>> When I execute [bootstrap-servers] and [prechecks], it seems ok (no >>> fatal error or any interrupt). >>> >>> But when I execute [deploy], it will occur some error about >>> rabbitmq(when I set enable_rabbitmq:yes) and nova compute service(when I >>> set enable_rabbitmq:no). >>> >>> I have some detail screenshot about the errors as attached files, could >>> you please help me to solve this problem? >>> >>> Thank you very much. >>> >>> [Attached file description]: >>> globals.yml: my computer's setting about kolla-ansible >>> >>> As mentioned above, the following pictures show the errors, the rabbitmq >>> error will occur if I set [enable_rabbitmq:yes], the nova compute service >>> error will occur if I set [enable_rabbitmq:no]. >>> >> >> Hi Ming-Che, >> >> Since Stein, we no longer test Kolla Ansible with Ubuntu 16.04 upstream. >> Could you try again using Ubuntu 18.04? >> >> Regards, >> Mark >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: machine_package_setting Type: application/octet-stream Size: 1885 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: globals.yml Type: application/x-yaml Size: 20184 bytes Desc: not available URL: From gn01737625 at gmail.com Fri May 3 08:13:26 2019 From: gn01737625 at gmail.com (Ming-Che Liu) Date: Fri, 3 May 2019 16:13:26 +0800 Subject: [Deploy problem] deploy openstack+monasca with kolla-ansible 8.0.0.0rc1. In-Reply-To: References: Message-ID: Apologies,this mail will attach rabbitmq log file(ues command "docker logs --follow rabbitmq") for debug. Logs in /var/lib/docker/volumes/kolla_logs/_data/rabbitmq are empty. thanks. Regards, Ming-Che Ming-Che Liu 於 2019年5月3日 週五 下午3:26寫道: > Hi Mark, > > I tried to deploy openstack+monasca with kolla-ansible 8.0.0.0rc1(in the > same machine), but still encounter some fatal error. > > The attached file:golbals.yml is my setting, machine_package_setting is > machine environment setting. > > The error is: > RUNNING HANDLER [rabbitmq : Waiting for rabbitmq to start on first node] > ************************************************************ > fatal: [localhost]: FAILED! => {"changed": true, "cmd": "docker exec > rabbitmq rabbitmqctl wait /var/lib/rabbitmq/mnesia/rabbitmq.pid", "delta": > "0:00:00.861054", "end": "2019-05-03 15:17:42.387873", "msg": "non-zero > return code", "rc": 137, "start": "2019-05-03 15:17:41.526819", "stderr": > "", "stderr_lines": [], "stdout": "", "stdout_lines": []} > > When I use command "docker inspect rabbitmq_id |grep RestartCount", I > find rabbitmq will restart many times > > such as: > > kaga at agre-an21:~$ sudo docker inspect 5567f37cc78a |grep RestartCount > "RestartCount": 15, > > Could please help to solve this problem? Thanks. > > Regards, > > Ming-Che > > > > > > > > Ming-Che Liu 於 2019年5月3日 週五 上午9:22寫道: > >> Hi Mark, >> >> Sure, I will do that, thanks. 
>> >> Regards, >> >> Ming-Che >> >> Mark Goddard 於 2019年5月3日 週五 上午1:12寫道: >> >>> >>> >>> On Wed, 1 May 2019 at 17:10, Ming-Che Liu wrote: >>> >>>> Hello, >>>> >>>> I deployed openstack+monasca with kolla-ansible 8.0.0.0rc1. >>>> >>>> I follow the steps as mentioned in >>>> https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html >>>> >>>> The setting in my computer's globals.yml as same as [Quick Start] >>>> tutorial (attached file: globals.yml is my setting). >>>> >>>> My machine environment as following: >>>> OS: Ubuntu 16.04 >>>> Kolla-ansible verions: 8.0.0.0rc1 >>>> ansible version: 2.7 >>>> >>>> When I execute [bootstrap-servers] and [prechecks], it seems ok (no >>>> fatal error or any interrupt). >>>> >>>> But when I execute [deploy], it will occur some error about >>>> rabbitmq(when I set enable_rabbitmq:yes) and nova compute service(when I >>>> set enable_rabbitmq:no). >>>> >>>> I have some detail screenshot about the errors as attached files, could >>>> you please help me to solve this problem? >>>> >>>> Thank you very much. >>>> >>>> [Attached file description]: >>>> globals.yml: my computer's setting about kolla-ansible >>>> >>>> As mentioned above, the following pictures show the errors, the >>>> rabbitmq error will occur if I set [enable_rabbitmq:yes], the nova compute >>>> service error will occur if I set [enable_rabbitmq:no]. >>>> >>> >>> Hi Ming-Che, >>> >>> Since Stein, we no longer test Kolla Ansible with Ubuntu 16.04 upstream. >>> Could you try again using Ubuntu 18.04? >>> >>> Regards, >>> Mark >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: rabbitmq_docker_log Type: application/octet-stream Size: 340347 bytes Desc: not available URL: From yadav.akshay58 at gmail.com Fri May 3 11:01:36 2019 From: yadav.akshay58 at gmail.com (Akki yadav) Date: Fri, 3 May 2019 16:31:36 +0530 Subject: [Neutron] Can I create VM on flat network which doesnt have any subnet attached to it. Message-ID: Hello Team, Hope you all are doing good. I wanted to know that can I launch a VM on a flat network directly which doesn't have any subnet attached to it. Steps to be followed: Create a flat Network without a subnet Attach the network to create a VM. Aim :- Spawn 2 VM's on network without any subnet, without any IP assigned to them. Then statically allocate same subnet IP to them and ping each other. Issue:- VM creation is getting failed stating that there is no subnet found. How can we resolve this? Thanks & Regards Akshay -------------- next part -------------- An HTML attachment was scrubbed... URL: From manuel.sb at garvan.org.au Thu May 2 03:16:13 2019 From: manuel.sb at garvan.org.au (Manuel Sopena Ballesteros) Date: Thu, 2 May 2019 03:16:13 +0000 Subject: how to setup nvme pci-pasthrough to get close to native performance? Message-ID: <9D8A2486E35F0941A60430473E29F15B017EA662EB@mxdb2.ad.garvan.unsw.edu.au> Dear Openstack community, I am configuring a high performance storage vms, I decided to go to the easy path (pci-passthrough), I can spin up vms and see the pci devices, however performance is below native/bare metal. 
Native/Bare metal performance: [root at zeus-54 data]# fio --ioengine=libaio --name=test --filename=test --bs=4k --size=40G --readwrite=randrw --runtime=120 --time_based test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.1 Starting 1 process Jobs: 1 (f=1): [f(1)][100.0%][r=39.5MiB/s,w=39.6MiB/s][r=10.1k,w=10.1k IOPS][eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=50892: Wed May 1 22:22:45 2019 read: IOPS=9805, BW=38.3MiB/s (40.2MB/s)(4596MiB/120001msec) slat (usec): min=39, max=6678, avg=94.72, stdev=55.78 clat (nsec): min=450, max=18224, avg=525.83, stdev=120.10 lat (usec): min=39, max=6679, avg=95.36, stdev=55.79 clat percentiles (nsec): | 1.00th=[ 462], 5.00th=[ 478], 10.00th=[ 482], 20.00th=[ 486], | 30.00th=[ 490], 40.00th=[ 494], 50.00th=[ 502], 60.00th=[ 510], | 70.00th=[ 516], 80.00th=[ 532], 90.00th=[ 596], 95.00th=[ 676], | 99.00th=[ 860], 99.50th=[ 1048], 99.90th=[ 1384], 99.95th=[ 2480], | 99.99th=[ 3728] bw ( KiB/s): min= 720, max=40736, per=100.00%, avg=39389.00, stdev=5317.58, samples=239 iops : min= 180, max=10184, avg=9847.23, stdev=1329.39, samples=239 write: IOPS=9799, BW=38.3MiB/s (40.1MB/s)(4594MiB/120001msec) slat (nsec): min=2982, max=106207, avg=4220.09, stdev=980.04 clat (nsec): min=407, max=18130, avg=451.48, stdev=103.71 lat (usec): min=3, max=111, avg= 4.74, stdev= 1.03 clat percentiles (nsec): | 1.00th=[ 414], 5.00th=[ 418], 10.00th=[ 422], 20.00th=[ 430], | 30.00th=[ 434], 40.00th=[ 434], 50.00th=[ 438], 60.00th=[ 438], | 70.00th=[ 442], 80.00th=[ 446], 90.00th=[ 462], 95.00th=[ 588], | 99.00th=[ 700], 99.50th=[ 916], 99.90th=[ 1208], 99.95th=[ 1288], | 99.99th=[ 3536] bw ( KiB/s): min= 752, max=42608, per=100.00%, avg=39366.63, stdev=5355.73, samples=239 iops : min= 188, max=10652, avg=9841.64, stdev=1338.93, samples=239 lat (nsec) : 500=69.98%, 750=28.64%, 1000=0.90% lat (usec) : 2=0.42%, 4=0.04%, 10=0.01%, 20=0.01% cpu : usr=2.20%, sys=10.85%, ctx=1176675, majf=0, minf=1372 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued rwt: total=1176625,1175958,0, short=0,0,0, dropped=0,0,0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): READ: bw=38.3MiB/s (40.2MB/s), 38.3MiB/s-38.3MiB/s (40.2MB/s-40.2MB/s), io=4596MiB (4819MB), run=120001-120001msec WRITE: bw=38.3MiB/s (40.1MB/s), 38.3MiB/s-38.3MiB/s (40.1MB/s-40.1MB/s), io=4594MiB (4817MB), run=120001-120001msec Disk stats (read/write): nvme9n1: ios=1174695/883620, merge=0/0, ticks=105502/72225, in_queue=192101, util=99.28% VM performance: [centos at kudu-1 nvme0]$ sudo fio --ioengine=libaio --name=test --filename=test --bs=4k --size=40G --readwrite=randrw --runtime=120 --time_based test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.1 Starting 1 process Jobs: 1 (f=1): [m(1)][100.0%][r=29.2MiB/s,w=29.7MiB/s][r=7487,w=7595 IOPS][eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=44383: Wed May 1 12:22:24 2019 read: IOPS=6994, BW=27.3MiB/s (28.6MB/s)(3278MiB/120000msec) slat (usec): min=54, max=20476, avg=115.27, stdev=71.45 clat (nsec): min=1757, max=31476, avg=2163.02, stdev=688.66 lat (usec): min=56, max=20481, avg=118.51, stdev=71.66 clat percentiles (nsec): | 1.00th=[ 1800], 5.00th=[ 1832], 10.00th=[ 1864], 20.00th=[ 1992], | 30.00th=[ 2040], 40.00th=[ 2064], 50.00th=[ 2064], 
60.00th=[ 2096], | 70.00th=[ 2096], 80.00th=[ 2128], 90.00th=[ 2480], 95.00th=[ 2544], | 99.00th=[ 4448], 99.50th=[ 5536], 99.90th=[11072], 99.95th=[12736], | 99.99th=[18560] bw ( KiB/s): min= 952, max=31224, per=100.00%, avg=28153.51, stdev=4126.89, samples=237 iops : min= 238, max= 7806, avg=7038.23, stdev=1031.70, samples=237 write: IOPS=6985, BW=27.3MiB/s (28.6MB/s)(3274MiB/120000msec) slat (usec): min=7, max=963, avg=12.60, stdev= 6.24 clat (nsec): min=1662, max=199250, avg=2030.26, stdev=712.33 lat (usec): min=10, max=970, avg=15.68, stdev= 6.48 clat percentiles (nsec): | 1.00th=[ 1688], 5.00th=[ 1720], 10.00th=[ 1736], 20.00th=[ 1864], | 30.00th=[ 1928], 40.00th=[ 1944], 50.00th=[ 1944], 60.00th=[ 1960], | 70.00th=[ 1960], 80.00th=[ 1992], 90.00th=[ 2352], 95.00th=[ 2384], | 99.00th=[ 4048], 99.50th=[ 4768], 99.90th=[11456], 99.95th=[13120], | 99.99th=[19072] bw ( KiB/s): min= 912, max=31880, per=100.00%, avg=28119.64, stdev=4176.38, samples=237 iops : min= 228, max= 7970, avg=7029.75, stdev=1044.07, samples=237 lat (usec) : 2=51.56%, 4=47.17%, 10=1.03%, 20=0.22%, 50=0.01% lat (usec) : 250=0.01% cpu : usr=4.96%, sys=28.37%, ctx=839307, majf=0, minf=26 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued rwt: total=839283,838268,0, short=0,0,0, dropped=0,0,0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): READ: bw=27.3MiB/s (28.6MB/s), 27.3MiB/s-27.3MiB/s (28.6MB/s-28.6MB/s), io=3278MiB (3438MB), run=120000-120000msec WRITE: bw=27.3MiB/s (28.6MB/s), 27.3MiB/s-27.3MiB/s (28.6MB/s-28.6MB/s), io=3274MiB (3434MB), run=120000-120000msec Disk stats (read/write): nvme0n1: ios=838322/651596, merge=0/0, ticks=83804/22119, in_queue=104773, util=70.18% This is my Openstack rocky configuration: nova.conf on controller node [pci] alias = { "vendor_id":"10de", "product_id":"1db1", "device_type":"type-PCI", "name":"nv_v100" } alias = { "vendor_id":"8086", "product_id":"0953", "device_type":"type-PCI", "name":"nvme"} nova.conf on compute node: [pci] passthrough_whitelist = [ {"address":"0000:84:00.0"}, {"address":"0000:85:00.0"}, {"address":"0000:86:00.0"}, {"address":"0000:87:00.0"} ] alias = { "vendor_id":"8086", "product_id":"0953", "device_type":"type-PCI", "name":"nvme"} This is how the nvmes are exposed to the vm
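One piece not shown in the configuration above is how a guest actually requests the aliased devices: with alias-based PCI passthrough the flavor carries a pci_passthrough:alias extra spec naming the alias and a per-instance device count. A minimal sketch, with an illustrative flavor name:

# Request one of the whitelisted NVMe devices per instance.
openstack flavor set nvme.passthrough --property "pci_passthrough:alias"="nvme:1"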
Guest OS is centos 7.6 so I am guessing nvme drivers are included. Any help about what needs to my configuration to get close to native io performance? Thank you very much Manuel From: Manuel Sopena Ballesteros [mailto:manuel.sb at garvan.org.au] Sent: Wednesday, May 1, 2019 10:31 PM To: openstack-discuss at lists.openstack.org Subject: how to get best io performance from my block devices Dear Openstack community, I would like to have a high performance distributed database running in Openstack vms. I tried attaching dedicated nvme pci devices to the vm but the performance is not as good as I can get from bare metal. Bare metal: [root at zeus-54 data]# fio --ioengine=libaio --name=test --filename=test --bs=4k --size=40G --readwrite=randrw --runtime=120 --time_based test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.1 Starting 1 process Jobs: 1 (f=1): [f(1)][100.0%][r=39.5MiB/s,w=39.6MiB/s][r=10.1k,w=10.1k IOPS][eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=50892: Wed May 1 22:22:45 2019 read: IOPS=9805, BW=38.3MiB/s (40.2MB/s)(4596MiB/120001msec) slat (usec): min=39, max=6678, avg=94.72, stdev=55.78 clat (nsec): min=450, max=18224, avg=525.83, stdev=120.10 lat (usec): min=39, max=6679, avg=95.36, stdev=55.79 clat percentiles (nsec): | 1.00th=[ 462], 5.00th=[ 478], 10.00th=[ 482], 20.00th=[ 486], | 30.00th=[ 490], 40.00th=[ 494], 50.00th=[ 502], 60.00th=[ 510], | 70.00th=[ 516], 80.00th=[ 532], 90.00th=[ 596], 95.00th=[ 676], | 99.00th=[ 860], 99.50th=[ 1048], 99.90th=[ 1384], 99.95th=[ 2480], | 99.99th=[ 3728] bw ( KiB/s): min= 720, max=40736, per=100.00%, avg=39389.00, stdev=5317.58, samples=239 iops : min= 180, max=10184, avg=9847.23, stdev=1329.39, samples=239 write: IOPS=9799, BW=38.3MiB/s (40.1MB/s)(4594MiB/120001msec) slat (nsec): min=2982, max=106207, avg=4220.09, stdev=980.04 clat (nsec): min=407, max=18130, avg=451.48, stdev=103.71 lat (usec): min=3, max=111, avg= 4.74, stdev= 1.03 clat percentiles (nsec): | 1.00th=[ 414], 5.00th=[ 418], 10.00th=[ 422], 20.00th=[ 430], | 30.00th=[ 434], 40.00th=[ 434], 50.00th=[ 438], 60.00th=[ 438], | 70.00th=[ 442], 80.00th=[ 446], 90.00th=[ 462], 95.00th=[ 588], | 99.00th=[ 700], 99.50th=[ 916], 99.90th=[ 1208], 99.95th=[ 1288], | 99.99th=[ 3536] bw ( KiB/s): min= 752, max=42608, per=100.00%, avg=39366.63, stdev=5355.73, samples=239 iops : min= 188, max=10652, avg=9841.64, stdev=1338.93, samples=239 lat (nsec) : 500=69.98%, 750=28.64%, 1000=0.90% lat (usec) : 2=0.42%, 4=0.04%, 10=0.01%, 20=0.01% cpu : usr=2.20%, sys=10.85%, ctx=1176675, majf=0, minf=1372 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued rwt: total=1176625,1175958,0, short=0,0,0, dropped=0,0,0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): READ: bw=38.3MiB/s (40.2MB/s), 38.3MiB/s-38.3MiB/s (40.2MB/s-40.2MB/s), io=4596MiB (4819MB), run=120001-120001msec WRITE: bw=38.3MiB/s (40.1MB/s), 38.3MiB/s-38.3MiB/s (40.1MB/s-40.1MB/s), io=4594MiB (4817MB), run=120001-120001msec Disk stats (read/write): nvme9n1: ios=1174695/883620, merge=0/0, ticks=105502/72225, in_queue=192101, util=99.28% >From vm: [centos at kudu-1 nvme0]$ sudo fio --ioengine=libaio --name=test --filename=test --bs=4k --size=40G --readwrite=randrw --runtime=120 --time_based test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 
4096B-4096B, ioengine=libaio, iodepth=1 fio-3.1 Starting 1 process Jobs: 1 (f=1): [m(1)][100.0%][r=29.2MiB/s,w=29.7MiB/s][r=7487,w=7595 IOPS][eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=44383: Wed May 1 12:22:24 2019 read: IOPS=6994, BW=27.3MiB/s (28.6MB/s)(3278MiB/120000msec) slat (usec): min=54, max=20476, avg=115.27, stdev=71.45 clat (nsec): min=1757, max=31476, avg=2163.02, stdev=688.66 lat (usec): min=56, max=20481, avg=118.51, stdev=71.66 clat percentiles (nsec): | 1.00th=[ 1800], 5.00th=[ 1832], 10.00th=[ 1864], 20.00th=[ 1992], | 30.00th=[ 2040], 40.00th=[ 2064], 50.00th=[ 2064], 60.00th=[ 2096], | 70.00th=[ 2096], 80.00th=[ 2128], 90.00th=[ 2480], 95.00th=[ 2544], | 99.00th=[ 4448], 99.50th=[ 5536], 99.90th=[11072], 99.95th=[12736], | 99.99th=[18560] bw ( KiB/s): min= 952, max=31224, per=100.00%, avg=28153.51, stdev=4126.89, samples=237 iops : min= 238, max= 7806, avg=7038.23, stdev=1031.70, samples=237 write: IOPS=6985, BW=27.3MiB/s (28.6MB/s)(3274MiB/120000msec) slat (usec): min=7, max=963, avg=12.60, stdev= 6.24 clat (nsec): min=1662, max=199250, avg=2030.26, stdev=712.33 lat (usec): min=10, max=970, avg=15.68, stdev= 6.48 clat percentiles (nsec): | 1.00th=[ 1688], 5.00th=[ 1720], 10.00th=[ 1736], 20.00th=[ 1864], | 30.00th=[ 1928], 40.00th=[ 1944], 50.00th=[ 1944], 60.00th=[ 1960], | 70.00th=[ 1960], 80.00th=[ 1992], 90.00th=[ 2352], 95.00th=[ 2384], | 99.00th=[ 4048], 99.50th=[ 4768], 99.90th=[11456], 99.95th=[13120], | 99.99th=[19072] bw ( KiB/s): min= 912, max=31880, per=100.00%, avg=28119.64, stdev=4176.38, samples=237 iops : min= 228, max= 7970, avg=7029.75, stdev=1044.07, samples=237 lat (usec) : 2=51.56%, 4=47.17%, 10=1.03%, 20=0.22%, 50=0.01% lat (usec) : 250=0.01% cpu : usr=4.96%, sys=28.37%, ctx=839307, majf=0, minf=26 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued rwt: total=839283,838268,0, short=0,0,0, dropped=0,0,0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): READ: bw=27.3MiB/s (28.6MB/s), 27.3MiB/s-27.3MiB/s (28.6MB/s-28.6MB/s), io=3278MiB (3438MB), run=120000-120000msec WRITE: bw=27.3MiB/s (28.6MB/s), 27.3MiB/s-27.3MiB/s (28.6MB/s-28.6MB/s), io=3274MiB (3434MB), run=120000-120000msec Disk stats (read/write): nvme0n1: ios=838322/651596, merge=0/0, ticks=83804/22119, in_queue=104773, util=70.18% Is there a way I can get near bare metal performance from my nvme block devices? NOTICE Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed. NOTICE Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. 
If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Tim.Bell at cern.ch Thu May 2 13:54:34 2019 From: Tim.Bell at cern.ch (Tim Bell) Date: Thu, 2 May 2019 13:54:34 +0000 Subject: [ops] how to get best io performance from my block devices Message-ID: <3D6C1968-76B0-449F-B389-1B59384D16F9@cern.ch> There are some hints in https://wiki.openstack.org/wiki/Documentation/HypervisorTuningGuide There are some tips in https://www.linux-kvm.org/page/Tuning_KVM too but you’d need to find the corresponding OpenStack flags on the guest/images/hosts/flavors. Overall, there are several options so it’s recommended to establish a baseline performance on a representative work load and try the various options. Tim From: Manuel Sopena Ballesteros Date: Wednesday, 1 May 2019 at 06:35 To: "openstack-discuss at lists.openstack.org" Subject: how to get best io performance from my block devices Dear Openstack community, I would like to have a high performance distributed database running in Openstack vms. I tried attaching dedicated nvme pci devices to the vm but the performance is not as good as I can get from bare metal. Bare metal: [root at zeus-54 data]# fio --ioengine=libaio --name=test --filename=test --bs=4k --size=40G --readwrite=randrw --runtime=120 --time_based test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.1 Starting 1 process Jobs: 1 (f=1): [f(1)][100.0%][r=39.5MiB/s,w=39.6MiB/s][r=10.1k,w=10.1k IOPS][eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=50892: Wed May 1 22:22:45 2019 read: IOPS=9805, BW=38.3MiB/s (40.2MB/s)(4596MiB/120001msec) slat (usec): min=39, max=6678, avg=94.72, stdev=55.78 clat (nsec): min=450, max=18224, avg=525.83, stdev=120.10 lat (usec): min=39, max=6679, avg=95.36, stdev=55.79 clat percentiles (nsec): | 1.00th=[ 462], 5.00th=[ 478], 10.00th=[ 482], 20.00th=[ 486], | 30.00th=[ 490], 40.00th=[ 494], 50.00th=[ 502], 60.00th=[ 510], | 70.00th=[ 516], 80.00th=[ 532], 90.00th=[ 596], 95.00th=[ 676], | 99.00th=[ 860], 99.50th=[ 1048], 99.90th=[ 1384], 99.95th=[ 2480], | 99.99th=[ 3728] bw ( KiB/s): min= 720, max=40736, per=100.00%, avg=39389.00, stdev=5317.58, samples=239 iops : min= 180, max=10184, avg=9847.23, stdev=1329.39, samples=239 write: IOPS=9799, BW=38.3MiB/s (40.1MB/s)(4594MiB/120001msec) slat (nsec): min=2982, max=106207, avg=4220.09, stdev=980.04 clat (nsec): min=407, max=18130, avg=451.48, stdev=103.71 lat (usec): min=3, max=111, avg= 4.74, stdev= 1.03 clat percentiles (nsec): | 1.00th=[ 414], 5.00th=[ 418], 10.00th=[ 422], 20.00th=[ 430], | 30.00th=[ 434], 40.00th=[ 434], 50.00th=[ 438], 60.00th=[ 438], | 70.00th=[ 442], 80.00th=[ 446], 90.00th=[ 462], 95.00th=[ 588], | 99.00th=[ 700], 99.50th=[ 916], 99.90th=[ 1208], 99.95th=[ 1288], | 99.99th=[ 3536] bw ( KiB/s): min= 752, max=42608, per=100.00%, avg=39366.63, stdev=5355.73, samples=239 iops : min= 188, max=10652, avg=9841.64, stdev=1338.93, samples=239 lat (nsec) : 500=69.98%, 750=28.64%, 1000=0.90% lat (usec) : 2=0.42%, 4=0.04%, 10=0.01%, 20=0.01% cpu : usr=2.20%, sys=10.85%, ctx=1176675, majf=0, minf=1372 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 
8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued rwt: total=1176625,1175958,0, short=0,0,0, dropped=0,0,0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): READ: bw=38.3MiB/s (40.2MB/s), 38.3MiB/s-38.3MiB/s (40.2MB/s-40.2MB/s), io=4596MiB (4819MB), run=120001-120001msec WRITE: bw=38.3MiB/s (40.1MB/s), 38.3MiB/s-38.3MiB/s (40.1MB/s-40.1MB/s), io=4594MiB (4817MB), run=120001-120001msec Disk stats (read/write): nvme9n1: ios=1174695/883620, merge=0/0, ticks=105502/72225, in_queue=192101, util=99.28% From vm: [centos at kudu-1 nvme0]$ sudo fio --ioengine=libaio --name=test --filename=test --bs=4k --size=40G --readwrite=randrw --runtime=120 --time_based test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.1 Starting 1 process Jobs: 1 (f=1): [m(1)][100.0%][r=29.2MiB/s,w=29.7MiB/s][r=7487,w=7595 IOPS][eta 00m:00s] test: (groupid=0, jobs=1): err= 0: pid=44383: Wed May 1 12:22:24 2019 read: IOPS=6994, BW=27.3MiB/s (28.6MB/s)(3278MiB/120000msec) slat (usec): min=54, max=20476, avg=115.27, stdev=71.45 clat (nsec): min=1757, max=31476, avg=2163.02, stdev=688.66 lat (usec): min=56, max=20481, avg=118.51, stdev=71.66 clat percentiles (nsec): | 1.00th=[ 1800], 5.00th=[ 1832], 10.00th=[ 1864], 20.00th=[ 1992], | 30.00th=[ 2040], 40.00th=[ 2064], 50.00th=[ 2064], 60.00th=[ 2096], | 70.00th=[ 2096], 80.00th=[ 2128], 90.00th=[ 2480], 95.00th=[ 2544], | 99.00th=[ 4448], 99.50th=[ 5536], 99.90th=[11072], 99.95th=[12736], | 99.99th=[18560] bw ( KiB/s): min= 952, max=31224, per=100.00%, avg=28153.51, stdev=4126.89, samples=237 iops : min= 238, max= 7806, avg=7038.23, stdev=1031.70, samples=237 write: IOPS=6985, BW=27.3MiB/s (28.6MB/s)(3274MiB/120000msec) slat (usec): min=7, max=963, avg=12.60, stdev= 6.24 clat (nsec): min=1662, max=199250, avg=2030.26, stdev=712.33 lat (usec): min=10, max=970, avg=15.68, stdev= 6.48 clat percentiles (nsec): | 1.00th=[ 1688], 5.00th=[ 1720], 10.00th=[ 1736], 20.00th=[ 1864], | 30.00th=[ 1928], 40.00th=[ 1944], 50.00th=[ 1944], 60.00th=[ 1960], | 70.00th=[ 1960], 80.00th=[ 1992], 90.00th=[ 2352], 95.00th=[ 2384], | 99.00th=[ 4048], 99.50th=[ 4768], 99.90th=[11456], 99.95th=[13120], | 99.99th=[19072] bw ( KiB/s): min= 912, max=31880, per=100.00%, avg=28119.64, stdev=4176.38, samples=237 iops : min= 228, max= 7970, avg=7029.75, stdev=1044.07, samples=237 lat (usec) : 2=51.56%, 4=47.17%, 10=1.03%, 20=0.22%, 50=0.01% lat (usec) : 250=0.01% cpu : usr=4.96%, sys=28.37%, ctx=839307, majf=0, minf=26 IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued rwt: total=839283,838268,0, short=0,0,0, dropped=0,0,0 latency : target=0, window=0, percentile=100.00%, depth=1 Run status group 0 (all jobs): READ: bw=27.3MiB/s (28.6MB/s), 27.3MiB/s-27.3MiB/s (28.6MB/s-28.6MB/s), io=3278MiB (3438MB), run=120000-120000msec WRITE: bw=27.3MiB/s (28.6MB/s), 27.3MiB/s-27.3MiB/s (28.6MB/s-28.6MB/s), io=3274MiB (3434MB), run=120000-120000msec Disk stats (read/write): nvme0n1: ios=838322/651596, merge=0/0, ticks=83804/22119, in_queue=104773, util=70.18% Is there a way I can get near bare metal performance from my nvme block devices? NOTICE Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. 
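Following up on Tim's pointer to the tuning guides, one concrete set of knobs to try for this kind of workload is CPU pinning, a single guest NUMA node and hugepages on the flavor used for the database instances. The flavor name and sizes below are only illustrative and the benefit depends on the host's NUMA layout, so treat this as a sketch rather than a recipe:

  openstack flavor create --vcpus 8 --ram 32768 --disk 40 db.nvme
  openstack flavor set db.nvme \
      --property hw:cpu_policy=dedicated \
      --property hw:numa_nodes=1 \
      --property hw:mem_page_size=large

It is also worth repeating the fio run with a higher iodepth; at iodepth=1 every I/O pays the full guest-exit latency, which tends to exaggerate the gap between the VM and bare metal compared to a queued workload.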
If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From shyam.biradar at trilio.io Thu May 2 14:05:37 2019 From: shyam.biradar at trilio.io (Shyam Biradar) Date: Thu, 2 May 2019 19:35:37 +0530 Subject: kolla-ansible pike - nova_compute containers not starting Message-ID: Hi, I am setting up all-in-one ubuntu based kolla-ansible pike openstack. Deployment is failing at following ansible task: TASK [nova : include_tasks] ********************************************************************************************************************** included: /root/virtnev/share/kolla-ansible/ansible/roles/nova/tasks/discover_computes.yml for localhost TASK [nova : Waiting for nova-compute service up] ************************************************************************************************ FAILED - RETRYING: Waiting for nova-compute service up (20 retries left). FAILED - RETRYING: Waiting for nova-compute service up (19 retries left). FAILED - RETRYING: Waiting for nova-compute service up (18 retries left). FAILED - RETRYING: Waiting for nova-compute service up (17 retries left). FAILED - RETRYING: Waiting for nova-compute service up (16 retries left). FAILED - RETRYING: Waiting for nova-compute service up (15 retries left). FAILED - RETRYING: Waiting for nova-compute service up (14 retries left). FAILED - RETRYING: Waiting for nova-compute service up (13 retries left). FAILED - RETRYING: Waiting for nova-compute service up (12 retries left). FAILED - RETRYING: Waiting for nova-compute service up (11 retries left). FAILED - RETRYING: Waiting for nova-compute service up (10 retries left). FAILED - RETRYING: Waiting for nova-compute service up (9 retries left). FAILED - RETRYING: Waiting for nova-compute service up (8 retries left). FAILED - RETRYING: Waiting for nova-compute service up (7 retries left). FAILED - RETRYING: Waiting for nova-compute service up (6 retries left). FAILED - RETRYING: Waiting for nova-compute service up (5 retries left). FAILED - RETRYING: Waiting for nova-compute service up (4 retries left). FAILED - RETRYING: Waiting for nova-compute service up (3 retries left). FAILED - RETRYING: Waiting for nova-compute service up (2 retries left). FAILED - RETRYING: Waiting for nova-compute service up (1 retries left). fatal: [localhost -> localhost]: FAILED! => {"attempts": 20, "changed": false, "cmd": ["docker", "exec", "kolla_toolbox", "openstack", "--os-interface", "internal", "--os-auth-url", "http://192.168.122.151:35357", "--os-identity-api-version", "3", "--os-project-domain-name", "default", "--os-tenant-name", "admin", "--os-username", "admin", "--os-password", "ivpu1km8qxnVQESvAF4cyTFstOvrbxGUHjFF15gZ", "--os-user-domain-name", "default", "compute", "service", "list", "-f", "json", "--service", "nova-compute"], "delta": "0:00:02.555356", "end": "2019-05-02 09:24:45.485786", "rc": 0, "start": "2019-05-02 09:24:42.930430", "stderr": "", "stderr_lines": [], "stdout": "[]", "stdout_lines": ["[]"]} -------------------------------------------------------------------- I can see following stack trace in nova-compute container log 4. 
2019-05-02 08:21:30.522 7 INFO nova.service [-] Starting compute node (version 16.1.7) 2019-05-02 08:21:30.524 7 ERROR oslo_service.service [-] Error starting thread.: PlacementNotConfigured: This compute is not configured to talk to the placement service. Configure the [placement] section of nova.conf and restart the service. 2019-05-02 08:21:30.524 7 ERROR oslo_service.service Traceback (most recent call last): 2019-05-02 08:21:30.524 7 ERROR oslo_service.service File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_service/service.py", line 721, in run_service 2019-05-02 08:21:30.524 7 ERROR oslo_service.service service.start() 2019-05-02 08:21:30.524 7 ERROR oslo_service.service File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/service.py", line 156, in start 2019-05-02 08:21:30.524 7 ERROR oslo_service.service self.manager.init_host() 2019-05-02 08:21:30.524 7 ERROR oslo_service.service File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 1155, in init_host 2019-05-02 08:21:30.524 7 ERROR oslo_service.service raise exception.PlacementNotConfigured() 2019-05-02 08:21:30.524 7 ERROR oslo_service.service PlacementNotConfigured: This compute is not configured to talk to the placement service. Configure the [placement] section of nova.conf and restart the service. 2019-05-02 08:21:30.524 7 ERROR oslo_service.service 2019-05-02 08:21:59.229 7 INFO os_vif [-] Loaded VIF plugins: ovs, linux_bridge --------------------------------------------------------------------- I saw nova-compute nova.conf has [placement] section configured well and it's same as nova_api's placement section. Other nova containers are started well. Any thoughts? [image: logo] *Shyam Biradar* * Software Engineer | DevOps* M +91 8600266938 | shyam.biradar at trilio.io | trilio.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From hoanq13 at viettel.com.vn Sat May 4 06:50:53 2019 From: hoanq13 at viettel.com.vn (hoanq13 at viettel.com.vn) Date: Sat, 4 May 2019 13:50:53 +0700 (ICT) Subject: [Vitrage] add datasource kapacitor for vitrage In-Reply-To: References: <14511424.947437.1556614048877.JavaMail.zimbra@viettel.com.vn> <1324083046.973516.1556615406841.JavaMail.zimbra@viettel.com.vn> Message-ID: <1913467486.1561697.1556952653279.JavaMail.zimbra@viettel.com.vn> Hi, All the test are pass, hope you review soon. Best regards Hoa ----- Original Message ----- From: eyalb1 at gmail.com To: hoanq13 at viettel.com.vn Cc: openstack-discuss at lists.openstack.org Sent: Thursday, May 2, 2019 2:12:12 PM Subject: Re: [Vitrage] add datasource kapacitor for vitrage Hi, Please make sure all test are passing Eyal On Thu, May 2, 2019, 02:18 < hoanq13 at viettel.com.vn > wrote: Hi, In our system, we use monitor by TICK stack (include: Telegraf for collect metric, InfluxDB for storage metric, Chronograf for visualize and Kapacitor alarming), which is popular monitor solution. We hope can integrate vitrage in, so we decide to write kapacitor datasource contribute for vitrage. The work is almost done , you can review in: https://review.opendev.org/653416 So i send this mail hope for more review, ideal,... Appreciate it. also ask: have any step i miss in pipeline of contribute datasource vitrage? like create blueprints, vitrage-spec,vv.. Should i do it? -------------- next part -------------- An HTML attachment was scrubbed... 
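On the kolla-ansible PlacementNotConfigured traceback above: nova-compute raises that error when the [placement] section in the nova.conf it actually loads is missing or incomplete, so the file to verify is the one bind-mounted into the nova_compute container (normally /etc/kolla/nova-compute/nova.conf on the host), not only the nova_api copy. For reference, a minimal Pike-era section looks roughly like the following; the values are placeholders and kolla-ansible normally templates them from globals.yml and passwords.yml:

  [placement]
  auth_type = password
  auth_url = http://192.168.122.151:35357
  project_domain_name = default
  user_domain_name = default
  project_name = service
  username = placement
  password = <placement service password>
  os_region_name = RegionOne
  os_interface = internal

If the section is present in the merged file, a custom config override under /etc/kolla/config/ that replaces rather than extends nova.conf is another common way for it to go missing.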
URL: From stefano at canepa.ge.it Sun May 5 21:51:00 2019 From: stefano at canepa.ge.it (Stefano Canepa) Date: Sun, 5 May 2019 15:51:00 -0600 Subject: [openstack-ansible][monasca][zaqar][watcher][searchlight] Retirement of unused OpenStack Ansible roles In-Reply-To: References: <236ef912-21c5-4345-98ce-067499921af1@www.fastmail.com> Message-ID: Hi all, I would like to maintain monasca related roles but I have to double check how much time I can allocate to this task. Please hold before retiring them. All the best Stefano Stefano Canepa sc at linux.it or stefano at canepa.ge.it On Wed, 24 Apr 2019, 14:51 Mohammed Naser, wrote: > Hi, > > These roles have been broken for over a year now, some are not even > integrated with the OpenStack Ansible integrated repository. > > I think it's safe to say that for the most part, they have no users or > consumers unless someone has integrated it downstream somewhere and > didn't push that back out. It is a lot of overhead to maintain roles, > we're a small team that has to manage a huge amount of roles and their > integration, while on paper, I'd love for someone to step in and help, > but no one has for over a year. > > If someone wants to step in and get those roles to catch up on all the > technical debt they've accumulated (because when we'd do fixes across > all roles, we would always leave them.. because they always failed > tests..) then we're one revert away from it. I have some thoughts on > how we can resolve this for the future, but they're much more long > term, but for now, the additional workload on our very short resourced > team is a lot. > > Thanks, > Mohammed > > On Wed, Apr 24, 2019 at 8:56 AM Guilherme Steinmüller > wrote: > > > > Hello Witek and Jean-Philippe. > > > > I will hold off the retirement process until the end of PTG. > > > > Just for your information, that's what we have until now > https://review.opendev.org/#/q/topic:retire-osa-unused-roles+(status:open+OR+status:merged) > . > > > > I just -w the monsca roles as they were the only roles someone > manifested interest. > > > > Regards > > > > On Wed, Apr 24, 2019 at 8:14 AM Jean-Philippe Evrard < > jean-philippe at evrard.me> wrote: > >> > >> I am not sure this follows our documented retirement process, and it > seems very early to do so for some roles. > >> I think we should discuss role retirement at the next PTG (if we want > to change that process). > >> > >> In the meantime, I encourage people from the > monasca/zaqar/watcher/searchlight community interested deploying with > openstack-ansible to step up and take over their respective role's > maintainance. > >> > >> Regards, > >> Jean-Philippe Evrard (evrardjp). > >> > > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. http://vexxhost.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chenjiengu at gmail.com Wed May 8 11:48:32 2019 From: chenjiengu at gmail.com (=?UTF-8?B?6ZmI5p2w?=) Date: Wed, 8 May 2019 19:48:32 +0800 Subject: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) Message-ID: Nowdays , the opestack rocky release ironic , is support ironic boot from cinder volume(the cinder volume backend is ceph storage)? My goal is to achieve this. Who can tell me about this principle? looking forward to a reply thank you all. -------------- next part -------------- An HTML attachment was scrubbed... 
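On the boot-from-volume question above: the short version of the principle is that Ironic's 'cinder' storage interface asks Cinder to export the volume as an iSCSI (or FC) target and the node then boots from that target via iPXE, so the Cinder backend has to be able to present such a target; as the replies later in this thread explain, a plain Ceph RBD backend cannot do that by itself yet. With a backend that can, the flow is roughly the sketch below (commands are from memory, so check the boot-from-volume admin guide for your release):

  # ironic.conf on the conductor, assuming defaults otherwise
  # [DEFAULT]
  # enabled_storage_interfaces = cinder,noop

  openstack baremetal node set $NODE_UUID \
      --storage-interface cinder \
      --property capabilities=iscsi_boot:True

  # register the node's initiator IQN
  openstack baremetal volume connector create --node $NODE_UUID \
      --type iqn --connector-id iqn.2019-05.org.example:$NODE_UUID

  # then boot the bare metal flavor from a volume as usual
  openstack server create --flavor my-baremetal-flavor \
      --volume $VOLUME_UUID --nic net-id=$NET_UUID --key-name mykey bfv-test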
URL: From zackchen517 at gmail.com Wed May 8 12:24:36 2019 From: zackchen517 at gmail.com (zack chen) Date: Wed, 8 May 2019 20:24:36 +0800 Subject: Baremetal attach volume in Multi-tenancy Message-ID: Hi, I am looking for a mechanism that can be used for baremetal attach volume in a multi-tenant scenario. In addition we use ceph as the backend storage for cinder. Can anybody give me some advice? -------------- next part -------------- An HTML attachment was scrubbed... URL: From rafaelweingartner at gmail.com Wed May 8 15:08:55 2019 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Wed, 8 May 2019 12:08:55 -0300 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: References: Message-ID: Thanks, I'll be there. Em qua, 8 de mai de 2019 11:41, Trinh Nguyen escreveu: > Hi Rafael, > > The meeting will be held on the IRC channel #openstack-telemetry as > mentioned in the previous email. > > Thanks, > > On Wed, May 8, 2019 at 10:50 PM Rafael Weingärtner < > rafaelweingartner at gmail.com> wrote: > >> Hello Trinh, >> Where does the meeting happen? Will it be via IRC Telemetry channel? Or, >> in the Etherpad ( >> https://etherpad.openstack.org/p/telemetry-meeting-agenda)? I would like >> to discuss and understand a bit better the context behind the Telemetry >> events deprecation. >> >> On Wed, May 8, 2019 at 12:19 AM Trinh Nguyen >> wrote: >> >>> Hi team, >>> >>> As planned, we will have a team meeting at 02:00 UTC, May 9th on >>> #openstack-telemetry to discuss what we gonna do for the next milestone >>> (Train-1) and continue what we left off from the last meeting. >>> >>> I put here [1] the agenda thinking that it should be fine for an hour >>> meeting. If you have anything to talk about, please put it there too. >>> >>> [1] https://etherpad.openstack.org/p/telemetry-meeting-agenda >>> >>> >>> Bests, >>> >>> -- >>> *Trinh Nguyen* >>> *www.edlab.xyz * >>> >>> >> >> -- >> Rafael Weingärtner >> > > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Wed May 8 15:14:54 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 8 May 2019 10:14:54 -0500 Subject: [cinder] Python3 requirements for Train In-Reply-To: References: Message-ID: <2ab79d3a-469e-ca97-5ffc-3bd9a8015251@gmail.com> All, One additional note.  Drivers that fail to have Python 3 testing running in their CI environment by Milestone 2 will have a patch pushed up that will mark the driver as unsupported. Jay On 5/8/2019 10:04 AM, Walter Boring wrote: > Hello Cinder folks, >    The train release is going to be the last release of OpenStack with > python 2 support.  Train also is going to require supporting python > 3.6 and 3.7.  This means that we should be enabling and or switching > over all of our 3rd party CI runs to python 3 to ensure that our > drivers and all of their required libraries run properly in a python > 3.6/3.7 environment.  This will help driver maintainers discover any > python3 incompatibilities with their driver as well as any required > libraries.  At the PTG in Denver, the cinder team agreed that we > wanted driver CI systems to start using python3 by milestone 2 for > Train.  This would be the July 22-26th time frame [1]. > > >   We are also working on adding driver library requirements to the > OpenStack global requirements project [2] [3]. This effort will > provide native install primitives for driver libraries in cinder. 
This > process also requires the driver libraries to run in python3.6/3.7. > > > The Cinder team wants to maintain it's high quality of driver support > in the train release.  By enabling python 3.6 and python 3.7 in CI > tests, this will help everyone ship Cinder with the required support > in Train and the following releases. > > Walt > > [1] https://releases.openstack.org/train/schedule.html > [2] https://review.opendev.org/#/c/656724/ > [3] https://review.opendev.org/#/c/657395/ From waboring at hemna.com Wed May 8 15:28:21 2019 From: waboring at hemna.com (Walter Boring) Date: Wed, 8 May 2019 11:28:21 -0400 Subject: Baremetal attach volume in Multi-tenancy In-Reply-To: References: Message-ID: To attach to baremetal instance, you will need to install the cinderclient along with the python-brick-cinderclient-extension inside the instance itself. On Wed, May 8, 2019 at 11:15 AM zack chen wrote: > Hi, > I am looking for a mechanism that can be used for baremetal attach volume > in a multi-tenant scenario. In addition we use ceph as the backend storage > for cinder. > > Can anybody give me some advice? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Wed May 8 15:30:57 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Wed, 8 May 2019 17:30:57 +0200 Subject: [Neutron] Can I create VM on flat network which doesnt have any subnet attached to it. In-Reply-To: References: Message-ID: <884636FB-5942-4A7B-B815-41FE708626BD@redhat.com> Hi, I don’t think this is currently supported. But there is ongoing work to add support for such feature. See [1] for details. [1] https://review.opendev.org/#/c/641670/ > On 3 May 2019, at 13:01, Akki yadav wrote: > > Hello Team, > > Hope you all are doing good. I wanted to know that can I launch a VM on a flat network directly which doesn't have any subnet attached to it. > > Steps to be followed: > Create a flat Network without a subnet > Attach the network to create a VM. > > Aim :- Spawn 2 VM's on network without any subnet, without any IP assigned to them. Then statically allocate same subnet IP to them and ping each other. > > Issue:- VM creation is getting failed stating that there is no subnet found. > > How can we resolve this? > > Thanks & Regards > Akshay — Slawek Kaplonski Senior software engineer Red Hat From mriedemos at gmail.com Wed May 8 15:34:26 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 8 May 2019 10:34:26 -0500 Subject: [nova][ops] 'Duplicate entry for primary key' problem running nova-manage db archive_deleted_rows In-Reply-To: References: Message-ID: <9a74e1c1-86bf-4dde-5885-8faa626a79ff@gmail.com> On 5/8/2019 10:04 AM, Massimo Sgaravatto wrote: > The problem is not for that single entry > Looks like the auto_increment for that table was reset (I  don't know > when-how) Just purge your shadow tables. As Surya noted, there is a purge CLI in nova-manage on newer releases now which would do the same thing. You can either backport that, or simply run it in a container or virtualenv, or just do it manually. If you're paranoid, purge the entries that were created before ocata. 
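Concretely, the two options above look roughly like this; the purge command needs a new enough nova-manage (Rocky onward, I believe, which is why running it from a venv or container pointed at the same database works), and the hand-run SQL is only a sketch, so double-check the shadow_* table list against your schema before deleting anything:

  # newer nova-manage, run against the existing database
  nova-manage db purge --all --verbose

  # or, roughly what purge does, by hand in the nova database
  # mysql nova -e "DELETE FROM shadow_instances;"
  # mysql nova -e "DELETE FROM shadow_instance_extra;"
  # ...and so on for the remaining shadow_* tables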
-- Thanks, Matt From jungleboyj at gmail.com Wed May 8 15:35:06 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 8 May 2019 10:35:06 -0500 Subject: [cinder] Help with a review please In-Reply-To: <55F040AF-16C8-4029-B306-7E81B4BE191A@gmail.com> References: <55F040AF-16C8-4029-B306-7E81B4BE191A@gmail.com> Message-ID: Sam, Thank you for reaching out to the mailing list on this issue.  I am sorry that the review has been stuck in something of a limbo for quite some time.  This is not the developer experience we strive for as a team. Since it appears that we are having trouble reaching agreement as to whether this is a good change I would recommend bringing this topic up at our next weekly meeting so that we can all work out the details together. If you would like to discuss this issue please add it to the agenda for the next meeting [1]. Thanks! Jay [1] https://etherpad.openstack.org/p/cinder-train-meetings On 5/8/2019 2:51 AM, Sam Morrison wrote: > Hi, > > I’ve had a review going on for over 8 months now [1] and would love to > get this in, it’s had +2s over the period and keeps getting nit > picked, finally being knocked back due to no spec which there now is [2] > This is now stalled itself after having a +2 and it is very depressing. > > I have had generally positive experiences contributing to openstack > but this has been a real pain, is there something I can do to make > this go smoother? > > Thanks, > Sam > > > [1] https://review.opendev.org/#/c/599866/ > [2] https://review.opendev.org/#/c/645056/ From jungleboyj at gmail.com Wed May 8 15:41:13 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 8 May 2019 10:41:13 -0500 Subject: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) In-Reply-To: References: Message-ID: This is going to require being able to export Ceph volumes via iSCSI.  The Ironic team communicated the importance of this feature to the Cinder team a few months ago. We are working on getting this support in place soon but it probably will not be until the U release. Thanks! Jay On 5/8/2019 6:48 AM, 陈杰 wrote: > Nowdays , the opestack rocky release ironic , is support ironic boot > from cinder volume(the cinder volume backend is ceph storage)? My goal > is to achieve this. > Who can tell me about this principle? > looking forward to a reply > thank you all. From aspiers at suse.com Wed May 8 15:45:11 2019 From: aspiers at suse.com (Adam Spiers) Date: Wed, 8 May 2019 16:45:11 +0100 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: <20190508143923.bhmla62qi2p7yc7s@yuggoth.org> References: <20190508143923.bhmla62qi2p7yc7s@yuggoth.org> Message-ID: <20190508154511.njvidentht4d4zim@pacific.linksys.moosehall> Jeremy Stanley wrote: >On 2019-05-07 15:06:10 -0500 (-0500), Jay Bryant wrote: >>Cinder has been working with the same unwritten rules for quite some time as >>well with minimal issues. >> >>I think the concerns about not having it documented are warranted.  We have >>had question about it in the past with no documentation to point to.  It is >>more or less lore that has been passed down over the releases.  :-) >> >>At a minimum, having this e-mail thread is helpful.  If, however, we decide >>to document it I think we should have it consistent across the teams that >>use the rule.  I would be happy to help draft/review any such documentation. >[...] 
> >I have a feeling that a big part of why it's gone undocumented for >so long is that putting it in writing risks explicitly sending the >message that we don't trust our contributors to act in the best >interests of the project even if those are not aligned with the >interests of their employer/sponsor. I think many of us attempt to >avoid having all activity on a given patch come from people with the >same funding affiliation so as to avoid giving the impression that >any one organization is able to ram changes through with no >oversight, but more because of the outward appearance than because >we don't trust ourselves or our colleagues. > >Documenting our culture is a good thing, but embodying that >documentation with this sort of nuance can be challenging. That's a good point. Maybe that risk could be countered by explicitly stating something like "this is not currently an issue within the community, and it has rarely, if ever, been one in the past; therefore this policy is a preemptive safeguard rather than a reactive one" ? From jaypipes at gmail.com Wed May 8 15:50:52 2019 From: jaypipes at gmail.com (Jay Pipes) Date: Wed, 8 May 2019 11:50:52 -0400 Subject: [ops][nova]Logging in nova and other openstack projects In-Reply-To: <53BF2204-988C-4ED6-A687-F6188B90C547@planethoster.info> References: <62034C21-91FC-4A9A-BC4B-47E372EAB925@planethoster.info> <53BF2204-988C-4ED6-A687-F6188B90C547@planethoster.info> Message-ID: <189efcf0-07f4-d5eb-b17b-658684ad0bbb@gmail.com> Sweet! :) Glad it worked! Best, -jay On 05/08/2019 10:43 AM, Jean-Philippe Méthot wrote: > Hi, > > Indeed, the remaining info messages were coming from the nova-compute > resource tracker. Adding nova=WARN in the list did remove these > messages. Thank you very much for your help. > > Best regards, > > Jean-Philippe Méthot > Openstack system administrator > Administrateur système Openstack > PlanetHoster inc. > > > > >> Le 8 mai 2019 à 09:35, Jay Pipes > > a écrit : >> >> Sorry for delayed response... comments inline. >> >> On 05/07/2019 05:31 PM, Jean-Philippe Méthot wrote: >>> Indeed, this is what was written in your original response as well as >>> in the documentation. As a result, it was fairly difficult to miss >>> and I did comment it out before restarting the service. Additionally, >>> as per the configuration I had set up, had the log-config-append >>> option be set, I wouldn’t have any INFO level log in my logs. Hence >>> why I believe it is strange that I have info level logs, when I’ve >>> set default_log_levels like this: >>> default_log_levels >>> = amqp=WARN,amqplib=WARN,boto=WARN,qpid=WARN,sqlalchemy=WARN,suds=WARN,oslo.messaging=WARN,iso8601=WARN,requests.packages.urllib3.connectionpool=WARN,urllib3.connectionpool=WARN,websocket=WARN,requests.packages.urllib3.util.retry=WARN,urllib3.util.retry=WARN,keystonemiddleware=WARN,routes.middleware=WARN,stevedore=WARN,taskflow=WARN,keystoneauth=WARN,oslo.cache=WARN >> >> Do you see any of the above modules logging with INFO level, though? >> Or are you just seeing other modules (e.g. nova.*) logging at INFO level? >> >> If you are only seeing nova modules logging at INFO level, try adding: >> >> ,nova=WARN >> >> to the default_log_levels CONF option. 
>> >> Let us know if this works :) >> >> Best, >> -jay >> > From Tim.Bell at cern.ch Wed May 8 15:55:18 2019 From: Tim.Bell at cern.ch (Tim Bell) Date: Wed, 8 May 2019 15:55:18 +0000 Subject: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) In-Reply-To: References: Message-ID: Just brainstorming.... Would it be possible to set up a couple of VMs as iscsi LIO gateways by hand while this feature is being developed and using that end point to boot an Ironic node? You may also be on a late enough version of Ceph to do it using http://docs.ceph.com/docs/mimic/rbd/iscsi-overview/. Not self-service but could work for a few cases.. Tim -----Original Message----- From: Jay Bryant Reply-To: "jsbryant at electronicjungle.net" Date: Wednesday, 8 May 2019 at 17:46 To: "openstack-discuss at lists.openstack.org" Subject: Re: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) This is going to require being able to export Ceph volumes via iSCSI. The Ironic team communicated the importance of this feature to the Cinder team a few months ago. We are working on getting this support in place soon but it probably will not be until the U release. Thanks! Jay On 5/8/2019 6:48 AM, 陈杰 wrote: > Nowdays , the opestack rocky release ironic , is support ironic boot > from cinder volume(the cinder volume backend is ceph storage)? My goal > is to achieve this. > Who can tell me about this principle? > looking forward to a reply > thank you all. From mriedemos at gmail.com Wed May 8 15:58:30 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 8 May 2019 10:58:30 -0500 Subject: [nova][ptg] Summary: Implicit trait-based filters In-Reply-To: <1557213589.2232.0@smtp.office365.com> References: <1557213589.2232.0@smtp.office365.com> Message-ID: On 5/7/2019 2:19 AM, Balázs Gibizer wrote: > 3) The request pre-filters [7] run before the placement a_c query is > generated. But these today changes the fields of the RequestSpec (e.g. > requested_destination) that would mean the regeneration of > RequestSpec.requested_resources would be needed. This probably solvable > by changing the pre-filters to work directly on > RequestSpec.requested_resources after we solved all the other issues. Yeah this is something I ran into while hacking on the routed networks aggregate stuff [1]. I added information to the RequestSpec so I could use it in a pre-filter (required aggregates) but I can't add that to the requested_resources in the RequestSpec without resources (and in the non-bw port case there is no RequestSpec.requested_resources yet), so what I did was hack the unnumbered RequestGroup after the pre-filters and after the RequestSpec was processed by resources_from_request_spec, but before the code that makes the GET /a_c call. It's definitely ugly and I'm not even sure it works yet (would need functional testing). What I've wondered is if there is a way we could merge request groups in resources_from_request_spec so if a pre-filter added an unnumbered RequestGroup to the RequestSpec (via the requestd_resources attribute) that resources_from_request_spec would then merge in the flavor information. That's what I initially tried with the multiattach required traits patch [2] but the groups weren't merged for whatever reason and GET /a_c failed because I had a group with a required trait but no resources. 
[1] https://review.opendev.org/#/c/656885/3/nova/scheduler/manager.py [2] https://review.opendev.org/#/c/645316/ -- Thanks, Matt From mriedemos at gmail.com Wed May 8 16:03:17 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 8 May 2019 11:03:17 -0500 Subject: [nova][cinder][ptg] Summary: Swap volume woes In-Reply-To: <20190506131834.nyc7k7qltdsmamuq@lyarwood.usersys.redhat.com> References: <20190506131834.nyc7k7qltdsmamuq@lyarwood.usersys.redhat.com> Message-ID: On 5/6/2019 8:18 AM, Lee Yarwood wrote: > - Deprecate the existing swap volume API in Train, remove in U. I don't remember this coming up. Deprecation is one thing if we have an alternative, but removal isn't really an option. Yes we have 410'ed some REST APIs for removed services (nova-network, nova-cells) but for the most part we're married to our REST APIs so we can deprecate things to signal "don't use these anymore" but that doesn't mean we can just delete them. This is why we require a spec for all API changes, because of said marriage. -- Thanks, Matt From ed at leafe.com Wed May 8 16:04:18 2019 From: ed at leafe.com (Ed Leafe) Date: Wed, 8 May 2019 11:04:18 -0500 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: <20190508154511.njvidentht4d4zim@pacific.linksys.moosehall> References: <20190508143923.bhmla62qi2p7yc7s@yuggoth.org> <20190508154511.njvidentht4d4zim@pacific.linksys.moosehall> Message-ID: <78C13304-E630-43FD-BDA5-0C43FBDA8B29@leafe.com> On May 8, 2019, at 10:45 AM, Adam Spiers wrote: > >> I have a feeling that a big part of why it's gone undocumented for so long is that putting it in writing risks explicitly sending the message that we don't trust our contributors to act in the best interests of the project even if those are not aligned with the interests of their employer/sponsor. I think many of us attempt to avoid having all activity on a given patch come from people with the same funding affiliation so as to avoid giving the impression that any one organization is able to ram changes through with no oversight, but more because of the outward appearance than because we don't trust ourselves or our colleagues. >> Documenting our culture is a good thing, but embodying that documentation with this sort of nuance can be challenging. > > That's a good point. Maybe that risk could be countered by explicitly stating something like "this is not currently an issue within the community, and it has rarely, if ever, been one in the past; therefore this policy is a preemptive safeguard rather than a reactive one" ? I think that’s a good approach. This way if such a situation comes up and people wonder why others are questioning it, it will be all above-board. The downside of *not* documenting this concern is that in the future if it is ever needed to be mentioned, the people involved might feel that the community is suddenly ganging up against their company, instead of simply following documented policy. -- Ed Leafe From jungleboyj at gmail.com Wed May 8 16:04:21 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 8 May 2019 11:04:21 -0500 Subject: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) In-Reply-To: References: Message-ID: <54148c1b-ce06-ae7b-1c08-0b5a6ceba4f3@gmail.com> Tim, Good thought.  That would be an interim solution until we are able to get the process automated. Jay On 5/8/2019 10:55 AM, Tim Bell wrote: > Just brainstorming.... 
> > Would it be possible to set up a couple of VMs as iscsi LIO gateways by hand while this feature is being developed and using that end point to boot an Ironic node? You may also be on a late enough version of Ceph to do it using http://docs.ceph.com/docs/mimic/rbd/iscsi-overview/. > > Not self-service but could work for a few cases.. > > Tim > > -----Original Message----- > From: Jay Bryant > Reply-To: "jsbryant at electronicjungle.net" > Date: Wednesday, 8 May 2019 at 17:46 > To: "openstack-discuss at lists.openstack.org" > Subject: Re: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) > > This is going to require being able to export Ceph volumes via iSCSI. > The Ironic team communicated the importance of this feature to the > Cinder team a few months ago. > > We are working on getting this support in place soon but it probably > will not be until the U release. > > Thanks! > > Jay > > > On 5/8/2019 6:48 AM, 陈杰 wrote: > > Nowdays , the opestack rocky release ironic , is support ironic boot > > from cinder volume(the cinder volume backend is ceph storage)? My goal > > is to achieve this. > > Who can tell me about this principle? > > looking forward to a reply > > thank you all. > > > From mriedemos at gmail.com Wed May 8 16:07:44 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 8 May 2019 11:07:44 -0500 Subject: [nova][ptg] Summary: Tech Debt In-Reply-To: References: Message-ID: On 5/6/2019 3:12 PM, Eric Fried wrote: > - Remove the nova-console I'm still not clear on this one. Someone from Rackspace (Matt DePorter) said at the summit that they are still using xen and still rely on the nova-console service. Citrix supports Rackspace and Bob (Citrix) said we could drop the nova-console service, so I'm not sure what to make of the support matrix here - can we drop it or not for xen users? Is there an alternative? On the other hand, there was an undercurrent of support for deprecating the xenapi driver since it's not really maintained anymore and CI hasn't worked on it for several months. So if we go that route, what would the plan be? Deprecate the driver in Train and if no one steps up to maintain it and get CI working, drop it in U along with the nova-console service and xvp console? -- Thanks, Matt From jasonanderson at uchicago.edu Wed May 8 16:14:09 2019 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Wed, 8 May 2019 16:14:09 +0000 Subject: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) In-Reply-To: <54148c1b-ce06-ae7b-1c08-0b5a6ceba4f3@gmail.com> References: , <54148c1b-ce06-ae7b-1c08-0b5a6ceba4f3@gmail.com> Message-ID: Tim, Jay -- I looked in to this recently as it was a use-case some of our HPC users wanted support for. I noticed that Ceph has the iSCSI gateway, but my impression was that this wouldn't work without adding some sort of new driver in Cinder. Is that not true? I thought that Cinder only Ceph via RBD. I'd be happy to be proven wrong on this. Cheers, /Jason ________________________________ From: Jay Bryant Sent: Wednesday, May 8, 2019 11:04 To: openstack-discuss at lists.openstack.org Subject: Re: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) Tim, Good thought. That would be an interim solution until we are able to get the process automated. Jay On 5/8/2019 10:55 AM, Tim Bell wrote: > Just brainstorming.... 
> > Would it be possible to set up a couple of VMs as iscsi LIO gateways by hand while this feature is being developed and using that end point to boot an Ironic node? You may also be on a late enough version of Ceph to do it using http://docs.ceph.com/docs/mimic/rbd/iscsi-overview/. > > Not self-service but could work for a few cases.. > > Tim > > -----Original Message----- > From: Jay Bryant > Reply-To: "jsbryant at electronicjungle.net" > Date: Wednesday, 8 May 2019 at 17:46 > To: "openstack-discuss at lists.openstack.org" > Subject: Re: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) > > This is going to require being able to export Ceph volumes via iSCSI. > The Ironic team communicated the importance of this feature to the > Cinder team a few months ago. > > We are working on getting this support in place soon but it probably > will not be until the U release. > > Thanks! > > Jay > > > On 5/8/2019 6:48 AM, 陈杰 wrote: > > Nowdays , the opestack rocky release ironic , is support ironic boot > > from cinder volume(the cinder volume backend is ceph storage)? My goal > > is to achieve this. > > Who can tell me about this principle? > > looking forward to a reply > > thank you all. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed May 8 16:18:42 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 8 May 2019 11:18:42 -0500 Subject: [nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band In-Reply-To: <1556989044.27606.0@smtp.office365.com> References: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> <1556919312.16566.2@smtp.office365.com> <5f87ea30-0bdf-31a4-a3f5-0e9d201b3665@gmail.com> <1556989044.27606.0@smtp.office365.com> Message-ID: On 5/4/2019 11:57 AM, Balázs Gibizer wrote: > The failure to detach a port via nova while the nova-compute is down > could be a bug on nova side. Depends on what you mean by detach. If the compute is down while deleting the server, the API will still call the (internal to nova) network API code [1] to either (a) unbind ports that nova didn't create or (2) delete ports that nova did create. For the policy change where the port has to be unbound to delete it, we'd already have support for that, it's just an extra step. At the PTG I was groaning a bit about needing to add another step to delete a port from the nova side, but thinking about it more we have to do the exact same thing with cinder volumes (we have to detach them before deleting them), so I guess it's not the worst thing ever. [1] https://github.com/openstack/nova/blob/56fef7c0e74d7512f062c4046def10401df16565/nova/compute/api.py#L2291 -- Thanks, Matt From jungleboyj at gmail.com Wed May 8 16:21:33 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 8 May 2019 11:21:33 -0500 Subject: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) In-Reply-To: References: <54148c1b-ce06-ae7b-1c08-0b5a6ceba4f3@gmail.com> Message-ID: <6fb96df4-9b91-a0f9-933a-a806205409cf@gmail.com> Jason, You are correct.  The plan is to add a driver that will use the iSCSI gateway to make volumes available instead of using RBD commands.  So, the driver will be heavily based on the existing RBD driver but do the export via iSCSI gateway. Unfortunately, the iSCSI Gateway CLI is not well suited to remote execution so we have Walt Boring looking into better ways of interacting with the gateway or possibly updating the client to support our needs. 
If you want to see additional notes on the topic see our discussion from the PTG last week at around line 119.  [1] Thanks! Jay [1] https://etherpad.openstack.org/p/cinder-train-ptg-planning On 5/8/2019 11:14 AM, Jason Anderson wrote: > Tim, Jay -- > > I looked in to this recently as it was a use-case some of our HPC > users wanted support for. I noticed that Ceph has the iSCSI gateway, > but my impression was that this wouldn't work without adding some sort > of new driver in Cinder. Is that not true? I thought that Cinder only > Ceph via RBD. I'd be happy to be proven wrong on this. > > Cheers, > /Jason > ------------------------------------------------------------------------ > *From:* Jay Bryant > *Sent:* Wednesday, May 8, 2019 11:04 > *To:* openstack-discuss at lists.openstack.org > *Subject:* Re: topic: ironic boot from cinder volume(the cinder volume > backend is ceph storage) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jasonanderson at uchicago.edu Wed May 8 16:24:45 2019 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Wed, 8 May 2019 16:24:45 +0000 Subject: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) In-Reply-To: <6fb96df4-9b91-a0f9-933a-a806205409cf@gmail.com> References: <54148c1b-ce06-ae7b-1c08-0b5a6ceba4f3@gmail.com> , <6fb96df4-9b91-a0f9-933a-a806205409cf@gmail.com> Message-ID: Thanks Jay! So I guess if one wants to use the iSCSI gateway with Ironic now, one would have to use the 'external' storage interface available since Rocky and do the poking of Ceph out of band. That won't really work for our use case, but perhaps others could take advantage. I'm very grateful that the Cinder team is spending time on this! Cheers, /Jason ________________________________ From: Jay Bryant Sent: Wednesday, May 8, 2019 11:21 To: openstack-discuss at lists.openstack.org Subject: Re: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) Jason, You are correct. The plan is to add a driver that will use the iSCSI gateway to make volumes available instead of using RBD commands. So, the driver will be heavily based on the existing RBD driver but do the export via iSCSI gateway. Unfortunately, the iSCSI Gateway CLI is not well suited to remote execution so we have Walt Boring looking into better ways of interacting with the gateway or possibly updating the client to support our needs. If you want to see additional notes on the topic see our discussion from the PTG last week at around line 119. [1] Thanks! Jay [1] https://etherpad.openstack.org/p/cinder-train-ptg-planning On 5/8/2019 11:14 AM, Jason Anderson wrote: Tim, Jay -- I looked in to this recently as it was a use-case some of our HPC users wanted support for. I noticed that Ceph has the iSCSI gateway, but my impression was that this wouldn't work without adding some sort of new driver in Cinder. Is that not true? I thought that Cinder only Ceph via RBD. I'd be happy to be proven wrong on this. Cheers, /Jason ________________________________ From: Jay Bryant Sent: Wednesday, May 8, 2019 11:04 To: openstack-discuss at lists.openstack.org Subject: Re: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) -------------- next part -------------- An HTML attachment was scrubbed... 
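For reference, the out-of-band route relies on Ironic's volume target records rather than on Cinder, so the target exported by the Ceph iSCSI gateway gets described to Ironic by hand. Very roughly, and with the caveat that the options below come from the boot-from-volume docs rather than from something I have run (the IQN, portal and LUN are placeholders, and the volume ID is just an identifier when Cinder is not involved):

  openstack baremetal node set $NODE_UUID --storage-interface external

  openstack baremetal volume target create --node $NODE_UUID \
      --type iscsi --boot-index 0 --volume-id $SOME_UUID \
      --property target_iqn=iqn.2019-05.org.example:boot-vol \
      --property target_portal=192.0.2.10:3260 \
      --property target_lun=0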
URL: From Tim.Bell at cern.ch Wed May 8 16:25:33 2019 From: Tim.Bell at cern.ch (Tim Bell) Date: Wed, 8 May 2019 16:25:33 +0000 Subject: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) In-Reply-To: References: <54148c1b-ce06-ae7b-1c08-0b5a6ceba4f3@gmail.com> Message-ID: <3B227E63-CA0E-4F88-8ED9-331008FF008D@cern.ch> I’m not sure you actually need full cinder support, see “Boot Without Cinder” in https://docs.openstack.org/ironic/latest/admin/boot-from-volume.html (Never tried it though ….) Tim From: Jason Anderson Date: Wednesday, 8 May 2019 at 18:17 To: "openstack-discuss at lists.openstack.org" , "jsbryant at electronicjungle.net" Subject: Re: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) Tim, Jay -- I looked in to this recently as it was a use-case some of our HPC users wanted support for. I noticed that Ceph has the iSCSI gateway, but my impression was that this wouldn't work without adding some sort of new driver in Cinder. Is that not true? I thought that Cinder only Ceph via RBD. I'd be happy to be proven wrong on this. Cheers, /Jason ________________________________ From: Jay Bryant Sent: Wednesday, May 8, 2019 11:04 To: openstack-discuss at lists.openstack.org Subject: Re: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) Tim, Good thought. That would be an interim solution until we are able to get the process automated. Jay On 5/8/2019 10:55 AM, Tim Bell wrote: > Just brainstorming.... > > Would it be possible to set up a couple of VMs as iscsi LIO gateways by hand while this feature is being developed and using that end point to boot an Ironic node? You may also be on a late enough version of Ceph to do it using http://docs.ceph.com/docs/mimic/rbd/iscsi-overview/. > > Not self-service but could work for a few cases.. > > Tim > > -----Original Message----- > From: Jay Bryant > Reply-To: "jsbryant at electronicjungle.net" > Date: Wednesday, 8 May 2019 at 17:46 > To: "openstack-discuss at lists.openstack.org" > Subject: Re: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) > > This is going to require being able to export Ceph volumes via iSCSI. > The Ironic team communicated the importance of this feature to the > Cinder team a few months ago. > > We are working on getting this support in place soon but it probably > will not be until the U release. > > Thanks! > > Jay > > > On 5/8/2019 6:48 AM, 陈杰 wrote: > > Nowdays , the opestack rocky release ironic , is support ironic boot > > from cinder volume(the cinder volume backend is ceph storage)? My goal > > is to achieve this. > > Who can tell me about this principle? > > looking forward to a reply > > thank you all. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Wed May 8 16:28:24 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 8 May 2019 11:28:24 -0500 Subject: [nova][ptg] Summary: Resource Management Daemon In-Reply-To: Message-ID: <80626b5e-735e-dbfd-c211-6305048ceeda@fried.cc> > [1] There has been a recurring theme of needing "some kind of config" - > not necessarily nova.conf or any oslo.config - that can describe: > - Resource provider name/uuid/parentage, be it an existing provider or a > new nested provider; > - Inventory (e.g. last-level cache in this case); > - Physical resource(s) to which the inventory corresponds (e.g. "cache > ways" in this case); > - Traits, aggregates, other? 
> As of this writing, no specifics have been decided, even to the point of > positing that it could be the same file for some/all of the specs for > which the issue arose. A proposal extremely close to this has been in the works in various forms for about a year now, the latest iteration of which can be found at [2]. Up to this point, there has been a general lack of enthusiasm for it, probably because we just didn't have any really strong use cases yet. I think we do now, given that RMD and others (including [3]) have expressed a need for it in Train. As such, Dakshina and team have agreed to take over that spec and move forward with it. To be clear, this will drive toward a general-purpose resource provider customization/description mechanism, not be RMD-specific. efried [2] https://review.opendev.org/#/c/612497/ [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005802.html From morgan.fainberg at gmail.com Wed May 8 16:28:21 2019 From: morgan.fainberg at gmail.com (Morgan Fainberg) Date: Wed, 8 May 2019 09:28:21 -0700 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: <78C13304-E630-43FD-BDA5-0C43FBDA8B29@leafe.com> References: <20190508143923.bhmla62qi2p7yc7s@yuggoth.org> <20190508154511.njvidentht4d4zim@pacific.linksys.moosehall> <78C13304-E630-43FD-BDA5-0C43FBDA8B29@leafe.com> Message-ID: On Wed, May 8, 2019, 09:09 Ed Leafe wrote: > On May 8, 2019, at 10:45 AM, Adam Spiers wrote: > > > >> I have a feeling that a big part of why it's gone undocumented for so > long is that putting it in writing risks explicitly sending the message > that we don't trust our contributors to act in the best interests of the > project even if those are not aligned with the interests of their > employer/sponsor. I think many of us attempt to avoid having all activity > on a given patch come from people with the same funding affiliation so as > to avoid giving the impression that any one organization is able to ram > changes through with no oversight, but more because of the outward > appearance than because we don't trust ourselves or our colleagues. > >> Documenting our culture is a good thing, but embodying that > documentation with this sort of nuance can be challenging. > > > > That's a good point. Maybe that risk could be countered by explicitly > stating something like "this is not currently an issue within the > community, and it has rarely, if ever, been one in the past; therefore this > policy is a preemptive safeguard rather than a reactive one" ? > > I think that’s a good approach. This way if such a situation comes up and > people wonder why others are questioning it, it will be all above-board. > The downside of *not* documenting this concern is that in the future if it > is ever needed to be mentioned, the people involved might feel that the > community is suddenly ganging up against their company, instead of simply > following documented policy. > > > -- Ed Leafe > In general I would rather see trust be pushed forward. The cores are explicitly trusted individuals within a team. I would encourage setting this policy aside and revisit if it ever becomes an issue. I think this policy, preemptive or not, highlights a lack of trust. It is another reason why Keystone team abolished the rule. AI.kuch prefer trusting the cores with landing code with or without external/additional input as they feel is appropriate. There are remedies if something lands inappropriately (revert, removal of core status, etc). 
As stated earlier in the quoted email, this has never or almost never been an issue. With that said, I don't have a strongly vested interest outside of seeing our community succeeding and being as open/inclusive as possible (since most contributions I am working on are not subject to this policy). As long as the policy isn't strictly tribal knowledge, we are headed in the right direction. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Wed May 8 16:31:17 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 8 May 2019 11:31:17 -0500 Subject: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) In-Reply-To: References: <54148c1b-ce06-ae7b-1c08-0b5a6ceba4f3@gmail.com> <6fb96df4-9b91-a0f9-933a-a806205409cf@gmail.com> Message-ID: Jason, Thanks for the input on this.  Helps us know the priority of this effort. If you have additional input or are able to help with the effort we welcome your contributions/input. Once we have the spec for this up I will mail the mailing list to keep everyone in the loop. Jay On 5/8/2019 11:24 AM, Jason Anderson wrote: > Thanks Jay! > > So I guess if one wants to use the iSCSI gateway with Ironic now, one > would have to use the 'external' storage interface available since > Rocky and do the poking of Ceph out of band. That won't really work > for our use case, but perhaps others could take advantage. > > I'm very grateful that the Cinder team is spending time on this! > > Cheers, > /Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed May 8 16:36:04 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 8 May 2019 11:36:04 -0500 Subject: [nova] Stein regressions In-Reply-To: <27a23eb0-b31b-0f25-db74-bdef81908939@gmail.com> References: <27a23eb0-b31b-0f25-db74-bdef81908939@gmail.com> Message-ID: <61c255ff-942a-60d9-d55c-9df9b7338434@gmail.com> Another update on these now that we're past the summit and PTG. On 4/16/2019 9:30 PM, Matt Riedemann wrote: > > 1. https://bugs.launchpad.net/nova/+bug/1822801 > Done - backport is merged to stable/stein (not yet released). > > 2. https://bugs.launchpad.net/nova/+bug/1824435 > Still no fix proposed for this yet but it is re-createable in devstack. > > 3. https://bugs.launchpad.net/nova/+bug/1825034 > The fix is merged on master, backports proposed [1]. > > 4. https://bugs.launchpad.net/nova/+bug/1825020 Done - backport is merged to stable/stein (not yet released). [1] https://review.opendev.org/#/q/topic:bug/1825034+status:open -- Thanks, Matt From jungleboyj at gmail.com Wed May 8 16:40:20 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 8 May 2019 11:40:20 -0500 Subject: topic: ironic boot from cinder volume(the cinder volume backend is ceph storage) In-Reply-To: <3B227E63-CA0E-4F88-8ED9-331008FF008D@cern.ch> References: <54148c1b-ce06-ae7b-1c08-0b5a6ceba4f3@gmail.com> <3B227E63-CA0E-4F88-8ED9-331008FF008D@cern.ch> Message-ID: <6bceda83-eab9-3dd5-8d6b-446395735b36@gmail.com> Tim, This was kind of what I was picturing as the temporary work around for Cinder not supporting this yet. Ideally the user would be able to use Cinder to do all the volume management (create/delete/etc.), but right now the Ceph iSCSI CLI only shows the volumes it created.  This is one of the challenges we have to resolve in adding this support. 
None-the-less users could use a portion of their Ceph storage for boot-from-volume purposes via the Ceph iSCSI CLI until we add the support.  It would just require them to create the volume and set up the iSCSI target on the Ceph iSCSI gateway.  Then the directions you shared for use without Cinder could be used to use the iSCSI Gateway as the target. In the future it should be possible to add those volumes under Cinder Management once we have all the support in place and then the Ceph iSCSI CLI would not need to be used in the future. Jay On 5/8/2019 11:25 AM, Tim Bell wrote: > > I’m not sure you actually need full cinder support, see “Boot Without > Cinder” in > https://docs.openstack.org/ironic/latest/admin/boot-from-volume.html > > (Never tried it though ….) > > Tim > > *From: *Jason Anderson > *Date: *Wednesday, 8 May 2019 at 18:17 > *To: *"openstack-discuss at lists.openstack.org" > , > "jsbryant at electronicjungle.net" > *Subject: *Re: topic: ironic boot from cinder volume(the cinder volume > backend is ceph storage) > > Tim, Jay -- > > I looked in to this recently as it was a use-case some of our HPC > users wanted support for. I noticed that Ceph has the iSCSI gateway, > but my impression was that this wouldn't work without adding some sort > of new driver in Cinder. Is that not true? I thought that Cinder only > Ceph via RBD. I'd be happy to be proven wrong on this. > > Cheers, > > /Jason > > ------------------------------------------------------------------------ > > *From:*Jay Bryant > *Sent:* Wednesday, May 8, 2019 11:04 > *To:* openstack-discuss at lists.openstack.org > *Subject:* Re: topic: ironic boot from cinder volume(the cinder volume > backend is ceph storage) > > Tim, > > Good thought.  That would be an interim solution until we are able to > get the process automated. > > Jay > > On 5/8/2019 10:55 AM, Tim Bell wrote: > > Just brainstorming.... > > > > Would it be possible to set up a couple of VMs as iscsi LIO gateways > by hand while this feature is being developed and using that end point > to boot an Ironic node? You may also be on a late enough version of > Ceph to do it using http://docs.ceph.com/docs/mimic/rbd/iscsi-overview/. > > > > Not self-service but could work for a few cases.. > > > > Tim > > > > -----Original Message----- > > From: Jay Bryant > > Reply-To: "jsbryant at electronicjungle.net" > > > Date: Wednesday, 8 May 2019 at 17:46 > > To: "openstack-discuss at lists.openstack.org" > > > Subject: Re: topic: ironic boot from cinder volume(the cinder volume > backend is ceph storage) > > > >      This is going to require being able to export Ceph volumes via > iSCSI. > >      The Ironic team communicated the importance of this feature to the > >      Cinder team a few months ago. > > > >      We are working on getting this support in place soon but it > probably > >      will not be until the U release. > > > >      Thanks! > > > >      Jay > > > > > >      On 5/8/2019 6:48 AM, 陈杰 wrote: > >      > Nowdays , the opestack rocky release ironic , is support > ironic boot > >      > from cinder volume(the cinder volume backend is ceph > storage)? My goal > >      > is to achieve this. > >      > Who can tell me about this principle? > >      > looking forward to a reply > >      > thank you all. > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
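To make that interim path concrete, here is a minimal hand-rolled sketch of Tim's LIO idea, assuming a gateway host that has the Ceph client keyring, the rbd kernel module and targetcli; the pool, image and IQN names are made up:

  # create and map the image with layering only, so the kernel client can map it
  rbd create volumes/ironic-boot-01 --size 50G --image-feature layering
  rbd map volumes/ironic-boot-01            # returns e.g. /dev/rbd0

  # export the mapped device over iSCSI and allow the node's initiator IQN
  targetcli /backstores/block create name=ironic-boot-01 dev=/dev/rbd0
  targetcli /iscsi create iqn.2019-05.org.example:ironic-boot-01
  targetcli /iscsi/iqn.2019-05.org.example:ironic-boot-01/tpg1/luns \
      create /backstores/block/ironic-boot-01
  targetcli /iscsi/iqn.2019-05.org.example:ironic-boot-01/tpg1/acls \
      create iqn.2019-05.org.example:baremetal-node-01
  targetcli saveconfig

None of this is tracked or cleaned up by Cinder, which is exactly the gap the planned driver is meant to close, but it is enough to hand Ironic an iSCSI target to boot from in the meantime.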
URL: From dangtrinhnt at gmail.com Wed May 8 17:00:37 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 9 May 2019 02:00:37 +0900 Subject: [openstack-ansible][monasca][zaqar][watcher][searchlight] Retirement of unused OpenStack Ansible roles In-Reply-To: References: <236ef912-21c5-4345-98ce-067499921af1@www.fastmail.com> Message-ID: Hi all, I would love to take care of the searchlight roles. Are there any specific requirements I need to keep in mind? Bests, On Thu, Apr 25, 2019 at 5:50 AM Mohammed Naser wrote: > Hi, > > These roles have been broken for over a year now, some are not even > integrated with the OpenStack Ansible integrated repository. > > I think it's safe to say that for the most part, they have no users or > consumers unless someone has integrated it downstream somewhere and > didn't push that back out. It is a lot of overhead to maintain roles, > we're a small team that has to manage a huge amount of roles and their > integration, while on paper, I'd love for someone to step in and help, > but no one has for over a year. > > If someone wants to step in and get those roles to catch up on all the > technical debt they've accumulated (because when we'd do fixes across > all roles, we would always leave them.. because they always failed > tests..) then we're one revert away from it. I have some thoughts on > how we can resolve this for the future, but they're much more long > term, but for now, the additional workload on our very short resourced > team is a lot. > > Thanks, > Mohammed > > On Wed, Apr 24, 2019 at 8:56 AM Guilherme Steinmüller > wrote: > > > > Hello Witek and Jean-Philippe. > > > > I will hold off the retirement process until the end of PTG. > > > > Just for your information, that's what we have until now > https://review.opendev.org/#/q/topic:retire-osa-unused-roles+(status:open+OR+status:merged) > . > > > > I just -w the monsca roles as they were the only roles someone > manifested interest. > > > > Regards > > > > On Wed, Apr 24, 2019 at 8:14 AM Jean-Philippe Evrard < > jean-philippe at evrard.me> wrote: > >> > >> I am not sure this follows our documented retirement process, and it > seems very early to do so for some roles. > >> I think we should discuss role retirement at the next PTG (if we want > to change that process). > >> > >> In the meantime, I encourage people from the > monasca/zaqar/watcher/searchlight community interested deploying with > openstack-ansible to step up and take over their respective role's > maintainance. > >> > >> Regards, > >> Jean-Philippe Evrard (evrardjp). > >> > > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. http://vexxhost.com > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From cgoncalves at redhat.com Wed May 8 17:04:19 2019 From: cgoncalves at redhat.com (Carlos Goncalves) Date: Wed, 8 May 2019 19:04:19 +0200 Subject: OpenStack User Survey 2019 In-Reply-To: References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> Message-ID: Hi Allison and Jimmy, In today's Octavia IRC meeting [1], the team agreed on the following two questions we would like to see included in the survey: 1. Which OpenStack load balancing (Octavia) provider drivers would you like to see supported? 2. Which new features would you like to see supported in OpenStack load balancing (Octavia)? 
Please let us know if you have any questions. Thanks, Carlos [1] http://eavesdrop.openstack.org/meetings/octavia/2019/octavia.2019-05-08-16.00.html On Tue, May 7, 2019 at 10:51 PM Allison Price wrote: > > Hi Michael, > > I apologize that the Octavia project team has been unable to submit a question to date. Jimmy posted the User Survey update to the public mailing list to ensure we updated the entire community and that we caught any projects that had not submitted their questions. The User Survey is open all year, and the primary goal is passing operator feedback to the upstream community. > > If the Octavia team - or any OpenStack project team - has a question they would like added (limit of 2 per project), please let Jimmy or myself know. > > Thanks for reaching out, Michael. > > Cheers, > Allison > > > On May 7, 2019, at 3:39 PM, Michael Johnson wrote: > > > > Jimmy & Allison, > > > > As you probably remember from previous year's surveys, the Octavia > > team has been trying to get a question included in the survey for a > > while. > > I have included the response we got the last time we inquired about > > the survey below. We never received a follow up invitation. > > > > I think it would be in the best interest for the community if we > > follow our "Four Opens" ethos in the user survey process, specifically > > the "Open Community" statement, by soliciting survey questions from > > the project teams in an open forum such as the openstack-discuss > > mailing list. > > > > Michael > > > > ----- Last response e-mail ------ > > Jimmy McArthur > > > > Fri, Sep 7, 2018, 5:51 PM > > to Allison, me > > Hey Michael, > > > > The project-specific questions were added in 2017, so likely didn't > > include some new projects. While we asked all projects to participate > > initially, less than a dozen did. We will be sending an invitation for > > new/underrepresented projects in the coming weeks. Please stand by and > > know that we value your feedback and that of the community. > > > > Cheers! > > > > > > > >> On Sat, Apr 27, 2019 at 5:11 PM Allison Price wrote: > >> > >> Hi Michael, > >> > >> We reached out to all of the PTLs who had questions in the 2018 version of the survey to review and update their questions. If there is a project that was missed, we can add it and share anonymized results with the PTLs directly as well as the openstack-discsuss mailing list. > >> > >> If there is a question from the Octavia team, please let us know and we can add it for the 2019 survey. > >> > >> Cheers, > >> Allison > >> > >> > >> > >> On Apr 27, 2019, at 4:01 PM, Michael Johnson wrote: > >> > >> Jimmy, > >> > >> I am curious, how did you reach out the PTLs for project specific > >> questions? The Octavia team didn't receive any e-mail from you or > >> Allison on the topic. > >> > >> Michael > >> > >> > > From lbragstad at gmail.com Wed May 8 17:13:47 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Wed, 8 May 2019 12:13:47 -0500 Subject: [cinder][ops] Nested Quota Driver Use? In-Reply-To: References: <20190502003249.GA1432@sm-workstation> <20190507142046.GA3999@sm-workstation> Message-ID: On 5/7/19 3:22 PM, Jay Bryant wrote: > > On 5/7/2019 9:20 AM, Sean McGinnis wrote: >> On Fri, May 03, 2019 at 06:58:41PM +0000, Tim Bell wrote: >>> We're interested in the overall functionality but I think unified >>> limits is the place to invest and thus would not have any problem >>> deprecating this driver. >>> >>> We'd really welcome this being implemented across all the projects >>> in a consistent way. 
The sort of functionality proposed in >>> https://techblog.web.cern.ch/techblog/post/nested-quota-models/  >>> would need Nova/Cinder/Manila at miniumum for CERN to switch. >>> >>> So, no objections to deprecation  but strong support to converge on >>> unified limits. >>> >>> Tim >>> >> Thanks Tim, that helps. >> >> Since there wasn't any other feedback, and no one jumping up to say >> they are >> using it today, I have submitted https://review.opendev.org/657511 to >> deprecated the current quota driver so we don't have to try to >> refactor that >> functionality into whatever we need to do for the unified limits >> support. >> >> If anyone has any concerns about this plan, please feel free to raise >> them here >> or on that review. >> >> Thanks! >> Sean > > Sean, > > If I remember correctly, IBM had put some time into trying to fix the > nested quota driver back around the Kilo or Liberty release. I haven't > seen much activity since then. > > I am in support deprecating the driver and going to unified limits > given that that appears to be the general direction of OpenStack. If you happen to notice anyone else contributing to the cinder-specific implementation, feel free to have them reach out to us. If people are interested in developing and adopting unified limits, we're happy to get them up-to-speed on the current approach. > > Jay > > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From cdent+os at anticdent.org Wed May 8 17:15:47 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Wed, 8 May 2019 10:15:47 -0700 (PDT) Subject: [placement][nova][ptg] Summary: Consumer Types In-Reply-To: <93df3b21-149c-d32b-54d0-614597d4d754@gmail.com> References: <1557135206.12068.1@smtp.office365.com> <93df3b21-149c-d32b-54d0-614597d4d754@gmail.com> Message-ID: On Wed, 8 May 2019, Matt Riedemann wrote: > Yup I agree with everything said from a nova perspective. Our public cloud > operators were just asking about leaked allocations and if there was tooling > to report and clean that kind of stuff up. I explained we have the > heal_allocations CLI but that's only going to create allocations for > *instances* and only if those instances aren't deleted, but we don't have > anything in nova that deals with detection and cleanup of leaked allocations, > sort of like what this tooling does [1] but I think is different. I continue to wish that we had (or could chose to make) functionality on the compute node, perhaps in response to a signal of some kind that said: performed a reset of inventory and allocations. So that in case of doubt we can use reality as the authoritative source of truth, not either of the nova or placement dbs. I'm not sure if that's feasible at this stage. I agree that healing allocations for instances that are known to exist is easy, but cleaning up allocations that got left behind is harder. It's simplified somewhat (from nova's perspective) in that there should only ever be one group of allocations (that is, a thing identified by a consumer uuid) for an instance. Right now, you can generate a list of known consumers of compute nodes by doing what you describe: traversing the allocations of each compute node provider. 
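A minimal sketch of that traversal against the placement API (illustrative only -- the endpoint, token and microversion are placeholders, and in practice you would restrict the provider list to compute node providers):

# Rough sketch: list the consumers holding allocations against each
# resource provider so they can be compared with the instances nova
# knows about. Endpoint/token handling is deliberately simplified.
import requests

PLACEMENT = 'http://placement.example.com/placement'  # assumed endpoint
HEADERS = {
    'X-Auth-Token': '<admin token>',                   # placeholder
    'OpenStack-API-Version': 'placement 1.30',
}

def consumers_by_provider():
    rps = requests.get(PLACEMENT + '/resource_providers',
                       headers=HEADERS).json()['resource_providers']
    result = {}
    for rp in rps:
        allocs = requests.get(
            PLACEMENT + '/resource_providers/%s/allocations' % rp['uuid'],
            headers=HEADERS).json().get('allocations', {})
        # The keys of 'allocations' are consumer uuids (instances,
        # migrations, ...); anything not known to nova is a candidate leak.
        result[rp['name']] = sorted(allocs)
    return result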
If we ever move to a state where the compute node doesn't provide resources (and thus will have no allocations) we won't be able to do that, and that's one of the reasons why I get resistant when we talk about moving VCPU to NUMA nodes in all cases. Which supports your assertion that maybe some day it would be nice to list allocations by type. Some day. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From alifshit at redhat.com Wed May 8 18:21:44 2019 From: alifshit at redhat.com (Artom Lifshitz) Date: Wed, 8 May 2019 14:21:44 -0400 Subject: [nova][CI] GPUs in the gate In-Reply-To: <20190508132709.xgq6nz3mqkfw3q5d@yuggoth.org> References: <3587e05d-deab-42ad-9a02-4312ca11760f@www.fastmail.com> <20190508132709.xgq6nz3mqkfw3q5d@yuggoth.org> Message-ID: On Wed, May 8, 2019 at 9:30 AM Jeremy Stanley wrote: > Long shot, but since you just need the feature provided and not the > performance it usually implies, are there maybe any open source > emulators which provide the same instruction set for conformance > testing purposes? Something like that exists for network cards. It's called netdevsim [1], and it's been mentioned in the SRIOV live migration spec [2]. However to my knowledge nothing like that exists for GPUs. [1] https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.16-Networking [2] https://specs.openstack.org/openstack/nova-specs/specs/train/approved/libvirt-neutron-sriov-livemigration.html#testing From aspiers at suse.com Wed May 8 18:27:19 2019 From: aspiers at suse.com (Adam Spiers) Date: Wed, 8 May 2019 19:27:19 +0100 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: References: <20190508143923.bhmla62qi2p7yc7s@yuggoth.org> <20190508154511.njvidentht4d4zim@pacific.linksys.moosehall> <78C13304-E630-43FD-BDA5-0C43FBDA8B29@leafe.com> Message-ID: <20190508182719.6exbju2l3ohskwjt@pacific.linksys.moosehall> Morgan Fainberg wrote: >In general I would rather see trust be pushed forward. The cores are >explicitly trusted individuals within a team. I would encourage setting >this policy aside and revisit if it ever becomes an issue. I think this >policy, preemptive or not, highlights a lack of trust. IMHO it wouldn't highlight a lack of trust if it explicitly said that there is no current problem in the community. But it's not just about trust. There's also the issue of simple honest lack of awareness, even by diligent newbie cores with the very finest of intentions. Honestly, if I hadn't stumbled across this conversation at the PTG, and later became core on a project, it might have never crossed my mind that it might be better in some scenarios to avoid giving W+1 on a review where +2 was only given by colleagues at my company. Indeed, the fact that we currently (and hopefully indefinitely) enjoy the ability to trust the best interests of others cores would probably make me *more* susceptible to accidentally introducing company-oriented bias without realising it. In contrast, if there was an on-boarding document for new cores which raised awareness of this, I would read that when becoming a core, and then vet myself for employer-oriented bias before every +2 and W+1 I subsequently gave. >It is another reason >why Keystone team abolished the rule. AI.kuch prefer trusting the cores >with landing code with or without external/additional input as they feel is >appropriate. > >There are remedies if something lands inappropriately (revert, removal of >core status, etc). 
As stated earlier in the quoted email, this has never or >almost never been an issue. > >With that said, I don't have a strongly vested interest outside of seeing >our community succeeding and being as open/inclusive as possible (since >most contributions I am working on are not subject to this policy). As long >as the policy isn't strictly tribal knowledge, we are headed in the right >direction. Agreed. Any suggestions on how to prevent it being tribal? The only way I can think of is documenting it, but maybe I'm missing a trick. From zbitter at redhat.com Wed May 8 18:27:54 2019 From: zbitter at redhat.com (Zane Bitter) Date: Wed, 8 May 2019 14:27:54 -0400 Subject: [tc] Proposal: restrict TC activities In-Reply-To: <20190504132550.GA28713@shipstone.jp> References: <20190503204942.GB28010@shipstone.jp> <20190504132550.GA28713@shipstone.jp> Message-ID: <630df54a-3645-6319-da88-58f47ae36ca5@redhat.com> On 4/05/19 9:25 AM, Emmet Hikory wrote: > Zhipeng Huang wrote: >> Then it might fit the purpose to rename the technical committee to >> governance committee or other terms. If we have a technical committee not >> investing time to lead in technical evolvement of OpenStack, it just seems >> odd to me. > > OpenStack has a rich governance structure, including at least the > Foundation Board, the User Committee, and the Technical Committee. Within > the context of governance, the Technical Committee is responsible for both > technical governance of OpenStack and governance of the technical community. > It is within that context that "Technical Committee" is the name. > > I also agree that it is important that members of the Technical Committee > are able to invest time to lead in the technical evolution of OpenStack, and > this is a significant reason that I propose that the activities of the TC be > restricted, precisely so that being elected does not mean that one no longer > is able to invest time for this. Could you be more clear about which activities you think should be restricted? Presumably you're arguing that there should be fewer... let's call it "ex officio" responsibilities to being a TC member. The suggestion reads as kind of circular, because you appear to be saying that aspiring TC members should be doing certain kinds of socially useful tasks that are likely to get them elected to the TC, where they will be restricted from doing those tasks in order to make sure they have free time to do the kinds of socially useful things they were doing prior to getting elected to the TC, except that those are now restricted for them. Presumably we're actually talking about different sets of tasks there, but I don't think we can break the loop without being explicit about what they are. >> TC should be a place good developers aspired to, not retired to. BTW this >> is not a OpenStack-only issue but I see across multiple open source >> communities. > > While I agree that it is valuable to have a target for the aspirations > of good developers, I am not convinced that OpenStack can be healthy if we > restrict our aspirations to nine seats. Good news, we have 13 seats ;) > From my perspective, this causes > enough competition that many excellent folk may never be elected, and that > some who wish to see their aspirations fufilled may focus activity in other > communities where it may be easier to achieve an arbitrary title. 
> > Instead, I suggest that developers should aspire to be leaders in the > OpenStack comunuity, and be actively involved in determining the future > technical direction of OpenStack. I just don't think there needs to be > any correlation between this and the mechanics of reviewing changes to the > governance repository. I couldn't agree more that we want as many people as possible to be leaders in the community and not wait to be elected to something. That said, in my personal experience, people just... listen more (for better and worse) to you when you're a TC member, because the election provides social proof that other people are listening to you too. This phenomenon seems unavoidable unless you create separate bodies for technical direction and governance (which I suspect has its own problems, like a tendency for the governance body to become dominated by professional managers). cheers, Zane. From morgan.fainberg at gmail.com Wed May 8 18:50:32 2019 From: morgan.fainberg at gmail.com (Morgan Fainberg) Date: Wed, 8 May 2019 11:50:32 -0700 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: <20190508182719.6exbju2l3ohskwjt@pacific.linksys.moosehall> References: <20190508143923.bhmla62qi2p7yc7s@yuggoth.org> <20190508154511.njvidentht4d4zim@pacific.linksys.moosehall> <78C13304-E630-43FD-BDA5-0C43FBDA8B29@leafe.com> <20190508182719.6exbju2l3ohskwjt@pacific.linksys.moosehall> Message-ID: On Wed, May 8, 2019 at 11:27 AM Adam Spiers wrote: > Morgan Fainberg wrote: > >In general I would rather see trust be pushed forward. The cores are > >explicitly trusted individuals within a team. I would encourage setting > >this policy aside and revisit if it ever becomes an issue. I think this > >policy, preemptive or not, highlights a lack of trust. > > IMHO it wouldn't highlight a lack of trust if it explicitly said that > there is no current problem in the community. > > But it's not just about trust. There's also the issue of simple > honest lack of awareness, even by diligent newbie cores with the very > finest of intentions. > > Honestly, if I hadn't stumbled across this conversation at the PTG, > and later became core on a project, it might have never crossed my > mind that it might be better in some scenarios to avoid giving W+1 on > a review where +2 was only given by colleagues at my company. Indeed, > the fact that we currently (and hopefully indefinitely) enjoy the > ability to trust the best interests of others cores would probably > make me *more* susceptible to accidentally introducing > company-oriented bias without realising it. > > In contrast, if there was an on-boarding document for new cores which > raised awareness of this, I would read that when becoming a core, and > then vet myself for employer-oriented bias before every +2 and W+1 I > subsequently gave. > > >It is another reason > >why Keystone team abolished the rule. AI.kuch prefer trusting the cores > >with landing code with or without external/additional input as they feel > is > >appropriate. > > > >There are remedies if something lands inappropriately (revert, removal of > >core status, etc). As stated earlier in the quoted email, this has never > or > >almost never been an issue. > > > >With that said, I don't have a strongly vested interest outside of seeing > >our community succeeding and being as open/inclusive as possible (since > >most contributions I am working on are not subject to this policy). As > long > >as the policy isn't strictly tribal knowledge, we are headed in the right > >direction. 
> > Agreed. Any suggestions on how to prevent it being tribal? The only > way I can think of is documenting it, but maybe I'm missing a trick. > Unfortunately, in this case it's "tribal" or "documented". No "one weird trick" here as far as I know ;). --Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Wed May 8 19:19:01 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 8 May 2019 19:19:01 +0000 Subject: [tc] Proposal: restrict TC activities In-Reply-To: <630df54a-3645-6319-da88-58f47ae36ca5@redhat.com> References: <20190503204942.GB28010@shipstone.jp> <20190504132550.GA28713@shipstone.jp> <630df54a-3645-6319-da88-58f47ae36ca5@redhat.com> Message-ID: <20190508191900.aronojaifbnh26yi@yuggoth.org> On 2019-05-08 14:27:54 -0400 (-0400), Zane Bitter wrote: > On 4/05/19 9:25 AM, Emmet Hikory wrote: > > Zhipeng Huang wrote: > > > Then it might fit the purpose to rename the technical > > > committee to governance committee or other terms. If we have a > > > technical committee not investing time to lead in technical > > > evolvement of OpenStack, it just seems odd to me. > > > > OpenStack has a rich governance structure, including at least > > the Foundation Board, the User Committee, and the Technical > > Committee. Within the context of governance, the Technical > > Committee is responsible for both technical governance of > > OpenStack and governance of the technical community. It is > > within that context that "Technical Committee" is the name. > > > > I also agree that it is important that members of the Technical > > Committee are able to invest time to lead in the technical > > evolution of OpenStack, and this is a significant reason that I > > propose that the activities of the TC be restricted, precisely > > so that being elected does not mean that one no longer is able > > to invest time for this. > > Could you be more clear about which activities you think should be > restricted? Presumably you're arguing that there should be > fewer... let's call it "ex officio" responsibilities to being a TC > member. > > The suggestion reads as kind of circular, because you appear to be > saying that aspiring TC members should be doing certain kinds of > socially useful tasks that are likely to get them elected to the > TC, where they will be restricted from doing those tasks in order > to make sure they have free time to do the kinds of socially > useful things they were doing prior to getting elected to the TC, > except that those are now restricted for them. Presumably we're > actually talking about different sets of tasks there, but I don't > think we can break the loop without being explicit about what they > are. [...] My read was that the community should, each time we assert there's something we want done and we think the TC should also take care of for us, step back and consider that those TC members are already deeply embedded in various parts of our community as well as adjacent communities getting other things done (likely the same things which got them elected to seats on the TC to begin with), and that each new thing we want them to tackle is going to take the place of yet more of those other things they'll cease having time for as a result. 
Taken from another perspective, it's the idea that the TC as a governing body should limit its focus to governance tasks and stop feeling pressured to find yet more initiatives and responsibilities for itself, leaving more time for the folks serving on the TC to also continue doing all manner of other important tasks they feel compelled to do in their capacity as members of the community rather than with their "TC hats" on. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From pawel.konczalski at everyware.ch Wed May 8 19:27:32 2019 From: pawel.konczalski at everyware.ch (Pawel Konczalski) Date: Wed, 8 May 2019 21:27:32 +0200 Subject: Magnum Kubernetes openstack-cloud-controller-manager unable not resolve master node by DNS In-Reply-To: <4FFA2395-960B-4DA7-8481-F2AD93EAB500@stackhpc.com> References: <4FFA2395-960B-4DA7-8481-F2AD93EAB500@stackhpc.com> Message-ID: <8c93a364-030c-d0d2-447b-3e737641d24a@everyware.ch> Hi Bharat, i was able to deploy the Kubernetes cluster with Magnum after update / specify Kubernetes version with the "--labels kube_tag=v1.13.4" parameter. See: # kube_tag https://docs.openstack.org/magnum/latest/user/#kube-tag https://hub.docker.com/r/openstackmagnum/kubernetes-apiserver/tags/ # cloud_provider_tag https://docs.openstack.org/magnum/latest/user/#cloud-provider-tag https://hub.docker.com/r/k8scloudprovider/openstack-cloud-controller-manager/tags/ This may by related with this issue: https://github.com/kubernetes/cloud-provider-openstack/issues/280 # openstack coe cluster template create kubernetes-cluster-template \   --image "Fedora AtomicHost 29" \   --external-network public \   --dns-nameserver 8.8.8.8 \   --master-flavor m1.kubernetes \   --flavor m1.kubernetes \   --coe kubernetes \   --volume-driver cinder \   --network-driver flannel \   --docker-volume-size 25 \   --public \   --labels kube_tag=v1.13.4,cloud_provider_tag=1.13.1 # openstack coe cluster create kubernetes-cluster \   --cluster-template kubernetes-cluster-template \   --master-count 1 \   --node-count 2 \   --keypair mykey # kubectl get pods --all-namespaces -o wide NAMESPACE     NAME                                       READY STATUS    RESTARTS   AGE     IP NODE                                        NOMINATED NODE READINESS GATES kube-system   coredns-dcc6d487d-hxpgq                    1/1 Running   0          7h55m   10.100.9.2 kubernetes-cluster7-sysemevhbq4i-minion-1   kube-system   coredns-dcc6d487d-nkb9p                    1/1 Running   0          7h57m   10.100.78.4 kubernetes-cluster7-sysemevhbq4i-minion-0   kube-system   heapster-796547984d-6wgwp                  1/1 Running   0          7h57m   10.100.78.2 kubernetes-cluster7-sysemevhbq4i-minion-0   kube-system   kube-dns-autoscaler-7865df57cd-ln4cc       1/1 Running   0          7h57m   10.100.78.3 kubernetes-cluster7-sysemevhbq4i-minion-0   kube-system   kubernetes-dashboard-f5496d66d-tdbvv       1/1 Running   0          7h57m   10.100.78.5 kubernetes-cluster7-sysemevhbq4i-minion-0   kube-system   openstack-cloud-controller-manager-9s5wh   1/1 Running   3          7h57m   10.0.0.10 kubernetes-cluster7-sysemevhbq4i-master-0   Thank you Pawel Am 08.05.19 um 8:40 vorm. schrieb Bharat Kunwar: > Try using the latest version, think there is an OCCM_TAG. 
> > Sent from my iPhone > >> On 7 May 2019, at 20:10, Pawel Konczalski wrote: >> >> Hi, >> >> i try to deploy a Kubernetes cluster with OpenStack Magnum but the openstack-cloud-controller-manager pod fails to resolve the master node hostname. >> >> Does magnum require further parameter to configure the DNS names of the master and minions? DNS resolution in the VMs works fine. Currently there is no Designate installed in the OpenStack setup. >> >> >> openstack coe cluster template create kubernetes-cluster-template1 \ >> --image Fedora-AtomicHost-29-20190429.0.x86_64 \ >> --external-network public \ >> --dns-nameserver 8.8.8.8 \ >> --master-flavor m1.kubernetes \ >> --flavor m1.kubernetes \ >> --coe kubernetes \ >> --volume-driver cinder \ >> --network-driver flannel \ >> --docker-volume-size 25 >> >> openstack coe cluster create kubernetes-cluster1 \ >> --cluster-template kubernetes-cluster-template1 \ >> --master-count 1 \ >> --node-count 2 \ >> --keypair mykey >> >> >> # kubectl get pods --all-namespaces -o wide >> NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE >> kube-system coredns-78df4bf8ff-mjp2c 0/1 Pending 0 36m >> kube-system heapster-74f98f6489-tgtzl 0/1 Pending 0 36m >> kube-system kube-dns-autoscaler-986c49747-wrvz4 0/1 Pending 0 36m >> kube-system kubernetes-dashboard-54cb7b5997-sk5pj 0/1 Pending 0 36m >> kube-system openstack-cloud-controller-manager-dgk64 0/1 CrashLoopBackOff 11 36m kubernetes-cluster1-vulg5fz6hg2n-master-0 >> >> >> # kubectl -n kube-system logs openstack-cloud-controller-manager-dgk64 >> Error from server: Get https://kubernetes-cluster1-vulg5fz6hg2n-master-0:10250/containerLogs/kube-system/openstack-cloud-controller-manager-dgk64/openstack-cloud-controller-manager: dial tcp: lookup kubernetes-cluster1-vulg5fz6hg2n-master-0 on 8.8.8.8:53: no such host >> >> >> BR >> >> Pawel -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5227 bytes Desc: not available URL: From openstack at nemebean.com Wed May 8 19:42:00 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 8 May 2019 14:42:00 -0500 Subject: [oslo] PTG Summary Message-ID: Hi, You can find the raw notes on the etherpad (https://etherpad.openstack.org/p/oslo-train-topics), but hopefully this will be an easier to read/understand summary. Pluggable Policy ---------------- Spec: https://review.opendev.org/#/c/578719/ Since this sort of ran out of steam last cycle, we discussed the option of not actually making it pluggable and just explicitly adding support for other policy backends. The specific one that seems to be of interest is Open Policy Agent. To do this we would add an option to enable OPA mode, where all policy checks would be passed through to OPA by default. An OPACheck class would also be added to facilitate migration (as a rule is added to OPA, switch the policy to OPACheck. Once all rules are present, remove the policy file and just turn on the OPA mode). However, after some further investigation by Patrick East, it was not clear if users were asking for this or if the original spec was more of a "this might be useful" thing. He's following up with some OPA users to see if they would use such a feature, but at this point it's not clear whether there is enough demand to justify spending time on it. Image Encryption/Decryption Library ----------------------------------- I mention this mostly because the current plan is _not_ to create a new Oslo library to enable the feature. 
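(Circling back to the Pluggable Policy item above for a second: the OPA half of that idea is just a REST call against OPA's data API, so a delegating check could look roughly like the sketch below. This is purely illustrative -- the endpoint and policy path are made up, and the oslo.policy plumbing that would wrap it as an "OPACheck" is not shown.)

# Illustrative only: delegate an enforcement decision to OPA's data API.
# The URL and policy path are made up; wiring this into oslo.policy as a
# proper check class is hand-waved here.
import requests

OPA_URL = 'http://127.0.0.1:8181/v1/data/openstack/policy'  # assumed

def opa_allows(rule, target, creds):
    """Ask OPA whether `rule` is allowed for this target/credentials."""
    payload = {'input': {'rule': rule,
                         'target': target,
                         'credentials': creds}}
    resp = requests.post('%s/%s' % (OPA_URL, rule), json=payload, timeout=5)
    resp.raise_for_status()
    # OPA wraps the decision in a "result" key; missing means undefined,
    # which we treat as a deny.
    return resp.json().get('result', False)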
The common code between services is expected to live in os-brick, and there does not appear to be a need to create a new encryption library to support this (yay!). oslo.service SIGHUP bug ----------------------- This is a problem a number of people have run into recently and there's been some ongoing, but spotty, discussion of how to deal with it. In Denver we were able to have some face-to-face discussions and hammer out a plan to get this fixed. I think we have a fix identified, and now we just need to get it proposed and tested so we don't regress this in the future. Most of the prior discussion and a previously proposed fix are at https://review.opendev.org/#/c/641907/ so if you want to follow this that's the place to do it. In case anyone is interested, it looks like this is a bug that was introduced with mutable config. Mutable config requires a different type of service restart, and that was never implemented. Now that most services are using mutable config, this is much bigger problem. Unified Limits and Policy ------------------------- I won't try to cover everything in detail here, but good progress was made on both of these topics. There isn't much to do from the Oslo side for the policy changes, but we identified a plan for an initial implementation of oslo.limit. There was general agreement that we don't necessarily have to get it 100% right on the first attempt, we just need to get something in the repo that people can start prototyping with. Until we release a 1.0 we aren't committed to any API, so we have flexibility to iterate. For more details, see: https://etherpad.openstack.org/p/ptg-train-xproj-nova-keystone oslo.service profiling and pypy ------------------------------- Oslo has dropped support for pypy in general due to lack of maintainers, so although the profiling work has apparently broken oslo.service under pypy this isn't something we're likely to address. Based on our conversation at the PTG game night, it sounds like this isn't a priority anymore anyway because pypy didn't have the desired performance improvement. oslo.privsep eventlet timeout ----------------------------- AFAICT, oslo.privsep only uses eventlet at all if monkey-patching is enabled (and then only to make sure it returns the right type of pipe for the environment). It's doubtful any eventlet exceptions are being raised from the privsep code, and even if they are they would go away once monkey-patching in the calling service is disabled. Privsep is explicitly not depending on eventlet for any of its functionality so services should be able to freely move away from eventlet if they wish. Retrospective ------------- In general, we got some major features implemented that unblocked things either users or services were asking for. We did add two cores during the cycle, but we also lost a long-time Oslo core and some of the other cores are being pulled away on other projects. So far this has probably resulted in a net loss in review capacity. As a result, our primary actions out of this were to continue watching for new candidates to join the Oslo team. We have at least one person we are working closely with and a number of other people approached me at the event with interest in contributing to one or more Oslo projects. So while this cycle was a bit of a mixed bag, I have a cautiously optimistic view of the future. Service Healthchecks and Metrics -------------------------------- Had some initial hallway track discussions about this. 
The self-healing SIG is looking into ways to improve the healthcheck and metric situation in OpenStack, and some of them may require additions or changes in Oslo. There is quite a bit of discussion (not all of which I have read yet) related to this on https://review.opendev.org/#/c/653707/ On the metrics side, there are some notes on the SIG etherpad (currently around line 209): https://etherpad.openstack.org/p/DEN-self-healing-SIG It's still a bit early days for both of these things so plans may change, but it seems likely that Oslo will be involved to some extent. Stay tuned. Endgame ------- No spoilers, I promise. If you made it all the way here then thanks and congrats. :-) I hope this was helpful, and if you have any thoughts about anything above please let me know. Thanks. -Ben From sundar.nadathur at intel.com Wed May 8 19:53:27 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Wed, 8 May 2019 19:53:27 +0000 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: <5fd214e8-4822-53a5-a7d6-622c5133a26f@fried.cc> References: <1CC272501B5BC543A05DB90AA509DED527552AD6@fmsmsx122.amr.corp.intel.com> <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> <03922b54-994e-dcae-8543-7c9c2f75b87d@hco.ntt.co.jp> <5fd214e8-4822-53a5-a7d6-622c5133a26f@fried.cc> Message-ID: <1CC272501B5BC543A05DB90AA509DED527557F03@fmsmsx122.amr.corp.intel.com> Thanks, Eric and Chris. Can this scheme address this use case? I have a set of compute hosts, each with several NICs of type T. Each NIC has a set of PFs: PF1, PF2, .... Each PF is a resource provider, and each has a separate custom RC: CUSTOM_RC_PF1, CUSTOM_RC_PF2, ... . The VFs are inventories of the associated PF's RC. Provider networks etc. are traits on that PF. The use case is to schedule a VM with several Neutron ports coming from the same NIC card and tied to specific networks. Let us say we (somehow) translate this to a set of request groups like this: resources_T1:CUSTOM_RC_PF1 = 2 # Note: T is the NIC name, and we are asking for VFs as resources. traits_T1:CUSTOM_TRAIT_MYNET1 = required resources_T2:CUSTOM_RC_PF2 = 1 traits_T2:CUSTOM_TRAIT_MYNET2 = required "same_subtree=%s" % ','.join(suffix for suffix in all_suffixes if suffix.startswith('T')) Will this ensure that all allocations come from the same NIC card? Do I have to create a 'resourceless RP' for the NIC card that contains the individual PF RPs as children nodes? P.S.: Ignore the comments I added to https://storyboard.openstack.org/#!/story/2005575#comment-122255. Regards, Sundar > -----Original Message----- > From: Eric Fried > Sent: Saturday, May 4, 2019 3:57 PM > To: openstack-discuss at lists.openstack.org > Subject: Re: [placement][nova][ptg] resource provider affinity > > For those of you following along at home, we had a design session a couple of > hours ago and hammered out the broad strokes of this work, including rough > prioritization of the various pieces. Chris has updated the story [1] with a > couple of notes; expect details and specs to emerge therefrom. 
> > efried > > [1] https://storyboard.openstack.org/#!/story/2005575 From openstack at nemebean.com Wed May 8 20:04:21 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 8 May 2019 15:04:21 -0500 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: References: <20190508143923.bhmla62qi2p7yc7s@yuggoth.org> <20190508154511.njvidentht4d4zim@pacific.linksys.moosehall> <78C13304-E630-43FD-BDA5-0C43FBDA8B29@leafe.com> <20190508182719.6exbju2l3ohskwjt@pacific.linksys.moosehall> Message-ID: <4bb90a56-9e01-8935-9d4a-51fb5a61145d@nemebean.com> On 5/8/19 1:50 PM, Morgan Fainberg wrote: > > > On Wed, May 8, 2019 at 11:27 AM Adam Spiers > wrote: > > Morgan Fainberg > wrote: > >In general I would rather see trust be pushed forward. The cores are > >explicitly trusted individuals within a team. I would encourage > setting > >this policy aside and revisit if it ever becomes an issue. I think > this > >policy, preemptive or not, highlights a lack of trust. > > IMHO it wouldn't highlight a lack of trust if it explicitly said that > there is no current problem in the community. > > But it's not just about trust.  There's also the issue of simple > honest lack of awareness, even by diligent newbie cores with the very > finest of intentions. > > Honestly, if I hadn't stumbled across this conversation at the PTG, > and later became core on a project, it might have never crossed my > mind that it might be better in some scenarios to avoid giving W+1 on > a review where +2 was only given by colleagues at my company.  Indeed, > the fact that we currently (and hopefully indefinitely) enjoy the > ability to trust the best interests of others cores would probably > make me *more* susceptible to accidentally introducing > company-oriented bias without realising it. > > In contrast, if there was an on-boarding document for new cores which > raised awareness of this, I would read that when becoming a core, and > then vet myself for employer-oriented bias before every +2 and W+1 I > subsequently gave. > > >It is another reason > >why Keystone team abolished the rule.  AI.kuch prefer trusting the > cores > >with landing code with or without external/additional input as > they feel is > >appropriate. > > > >There are remedies if something lands inappropriately (revert, > removal of > >core status, etc). As stated earlier in the quoted email, this has > never or > >almost never been an issue. > > > >With that said, I don't have a strongly vested interest outside of > seeing > >our community succeeding and being as open/inclusive as possible > (since > >most contributions I am working on are not subject to this > policy). As long > >as the policy isn't strictly tribal knowledge, we are headed in > the right > >direction. > > Agreed.  Any suggestions on how to prevent it being tribal?  The only > way I can think of is documenting it, but maybe I'm missing a trick. > > > Unfortunately, in this case it's "tribal" or "documented". No "one weird > trick" here as far as I know ;). Two cores, one company. You won't believe what happens next! 
/me goes back to daydreaming about working on a project with enough contributors for this to be a problem :-) From marcin.juszkiewicz at linaro.org Wed May 8 20:17:19 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Wed, 8 May 2019 22:17:19 +0200 Subject: [cinder] Python3 requirements for Train In-Reply-To: References: Message-ID: <1fa6b516-ad4e-a4dc-cac1-6b72e9b1846b@linaro.org> W dniu 08.05.2019 o 17:04, Walter Boring pisze: > The train release is going to be the last release of OpenStack with > python 2 support. Train also is going to require supporting python > 3.6 and 3.7. This means that we should be enabling and or switching > over all of our 3rd party CI runs to python 3 to ensure that our > drivers and all of their required libraries run properly in a python > 3.6/3.7 environment. This will help driver maintainers discover any > python3 incompatibilities with their driver as well as any required > libraries. At the PTG in Denver, the cinder team agreed that we > wanted driver CI systems to start using python3 by milestone 2 for > Train. This would be the July 22-26th time frame [1]. Added cinder to a list of 'things may break' projects then. I am working on switching Kolla to use only Python 3 in Debian/Ubuntu based images. Stopped counting projects I had to patch ;( From mriedemos at gmail.com Wed May 8 20:41:25 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 8 May 2019 15:41:25 -0500 Subject: [nova][ptg] Summary: Implicit trait-based filters In-Reply-To: References: Message-ID: <3dadf184-e889-0975-0d55-cc6066a122a8@gmail.com> On 5/6/2019 1:44 PM, Eric Fried wrote: > Addendum: > There's another implicit trait-based filter that bears mentioning: > Excluding disabled compute hosts. > > We have code that disables a compute service when "something goes wrong" > in various ways. This code should decorate the compute node's resource > provider with a COMPUTE_SERVICE_DISABLED trait, and every GET > /allocation_candidates request should include > ?required=!COMPUTE_SERVICE_DISABLED, so that we don't retrieve > allocation candidates for disabled hosts. > > mriedem has started to prototype the code for this [1]. > > Action: Spec to be written. Code to be polished up. Possibly aspiers to > be involved in this bit as well. > > efried > > [1]https://review.opendev.org/#/c/654596/ Here is the spec [1]. There are noted TODOs and quite a few alternatives listed, mostly alternatives to the proposed design and what's in my PoC. One thing my PoC didn't cover was the service group API and it automatically reporting a service as up or down, I think that will have to be incorp0rated into this, but how best to do that without having this 'disabled' trait management everywhere might be tricky. My PoC tries to make the compute the single place we manage the trait, but that's also problematic if we lose a race with the API to disable a compute before the compute dies, or if MQ drops the call, etc. We might need/want to hook into the update_available_resource periodic to heal / sync the trait if we have an issue like that, or on startup during upgrade, and we likely also need a CLI to sync the trait status manually - at least to aid with the upgrade. Who knew that managing a status reporting daemon could be complicated (oh right everyone). 
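For reference, flipping such a trait on a compute node provider is only a couple of placement API calls. A rough sketch (endpoint and token are placeholders, the 409 generation-conflict retry you would want in real code is omitted, and it assumes the trait has actually been added to os-traits -- otherwise a CUSTOM_ prefixed trait would be needed):

# Rough sketch: add or remove the proposed COMPUTE_SERVICE_DISABLED trait
# on a compute node resource provider. No conflict handling/retries.
import requests

PLACEMENT = 'http://placement.example.com/placement'  # assumed endpoint
HEADERS = {
    'X-Auth-Token': '<admin token>',                   # placeholder
    'OpenStack-API-Version': 'placement 1.6',
}
TRAIT = 'COMPUTE_SERVICE_DISABLED'

def set_disabled(rp_uuid, disabled=True):
    url = '%s/resource_providers/%s/traits' % (PLACEMENT, rp_uuid)
    current = requests.get(url, headers=HEADERS).json()
    traits = set(current['traits'])
    if disabled:
        traits.add(TRAIT)
    else:
        traits.discard(TRAIT)
    body = {
        'resource_provider_generation':
            current['resource_provider_generation'],
        'traits': sorted(traits),
    }
    requests.put(url, json=body, headers=HEADERS).raise_for_status()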
[1] https://review.opendev.org/#/c/657884/ -- Thanks, Matt From joseph.davis at suse.com Wed May 8 21:23:22 2019 From: joseph.davis at suse.com (Joseph Davis) Date: Wed, 8 May 2019 14:23:22 -0700 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: References: Message-ID: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> On 5/8/19 7:12 AM, openstack-discuss-request at lists.openstack.org wrote: > Hello Trinh, > Where does the meeting happen? Will it be via IRC Telemetry channel? Or, in > the Etherpad (https://etherpad.openstack.org/p/telemetry-meeting-agenda)? I > would like to discuss and understand a bit better the context behind > the Telemetry > events deprecation. Unfortunately, I have a conflict at that time and will not be able to attend. I do have a little bit of context on the Events deprecation to share. First, you will note the commit message from the commit [0] when Events were deprecated: " Deprecate event subsystem This subsystem has never been finished and is not maintained. Deprecate it for future removal. " I got the impression from jd at the time that there were a number of features in Telemetry, including Panko, that were not really "finished" and that the engineers who had worked on them had moved on to other things, so the features had become unsupported.  In late 2018 there was an effort to clean up things that were not well maintained or didn't fit the direction of Telemetry. See also: https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/ Events is one feature that often gets requested, but the use cases and demand for it are not expressed strongly or well understood by most people.  If the Telemetry project has demand to de-deprecate Event handling (including Panko), I'd suggest a review of the requirements for event handling and possibly choosing a champion for maintaining the Panko service. Also note: over in Monasca we have a spec [1] for handling Events ingestion which I hope we will be completing in Train.  Contributions and comments welcome. :) joseph [0] https://github.com/openstack/ceilometer/commit/8a0245a5b3e1357d35ad6653be37ca01176577e4 [1] https://github.com/openstack/monasca-specs/blob/master/specs/stein/approved/monasca-events-listener.rst > > On Wed, May 8, 2019 at 12:19 AM Trinh Nguyen wrote: > >> Hi team, >> >> As planned, we will have a team meeting at 02:00 UTC, May 9th on >> #openstack-telemetry to discuss what we gonna do for the next milestone >> (Train-1) and continue what we left off from the last meeting. >> >> I put here [1] the agenda thinking that it should be fine for an hour >> meeting. If you have anything to talk about, please put it there too. >> >> [1]https://etherpad.openstack.org/p/telemetry-meeting-agenda >> >> >> Bests, >> >> -- >> *Trinh Nguyen* >> *www.edlab.xyz* >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From allison at openstack.org Wed May 8 21:30:31 2019 From: allison at openstack.org (Allison Price) Date: Wed, 8 May 2019 16:30:31 -0500 Subject: OpenStack User Survey 2019 In-Reply-To: References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> Message-ID: Hi Carlos, Thank you for providing these two questions. We can get them both added, but I did have a question. Are both of these questions intended to be open ended with a text box for respondents to fill in their answers? Or do you want to provide answer choices? 
(thinking for the first question in particular) With any multiple choice question, an Other option can be included that will trigger a text box to be completed. Thanks! Allison > On May 8, 2019, at 12:04 PM, Carlos Goncalves wrote: > > Hi Allison and Jimmy, > > In today's Octavia IRC meeting [1], the team agreed on the following > two questions we would like to see included in the survey: > > 1. Which OpenStack load balancing (Octavia) provider drivers would you > like to see supported? > 2. Which new features would you like to see supported in OpenStack > load balancing (Octavia)? > > Please let us know if you have any questions. > > Thanks, > Carlos > > [1] http://eavesdrop.openstack.org/meetings/octavia/2019/octavia.2019-05-08-16.00.html > > > On Tue, May 7, 2019 at 10:51 PM Allison Price wrote: >> >> Hi Michael, >> >> I apologize that the Octavia project team has been unable to submit a question to date. Jimmy posted the User Survey update to the public mailing list to ensure we updated the entire community and that we caught any projects that had not submitted their questions. The User Survey is open all year, and the primary goal is passing operator feedback to the upstream community. >> >> If the Octavia team - or any OpenStack project team - has a question they would like added (limit of 2 per project), please let Jimmy or myself know. >> >> Thanks for reaching out, Michael. >> >> Cheers, >> Allison >> >>> On May 7, 2019, at 3:39 PM, Michael Johnson wrote: >>> >>> Jimmy & Allison, >>> >>> As you probably remember from previous year's surveys, the Octavia >>> team has been trying to get a question included in the survey for a >>> while. >>> I have included the response we got the last time we inquired about >>> the survey below. We never received a follow up invitation. >>> >>> I think it would be in the best interest for the community if we >>> follow our "Four Opens" ethos in the user survey process, specifically >>> the "Open Community" statement, by soliciting survey questions from >>> the project teams in an open forum such as the openstack-discuss >>> mailing list. >>> >>> Michael >>> >>> ----- Last response e-mail ------ >>> Jimmy McArthur >>> >>> Fri, Sep 7, 2018, 5:51 PM >>> to Allison, me >>> Hey Michael, >>> >>> The project-specific questions were added in 2017, so likely didn't >>> include some new projects. While we asked all projects to participate >>> initially, less than a dozen did. We will be sending an invitation for >>> new/underrepresented projects in the coming weeks. Please stand by and >>> know that we value your feedback and that of the community. >>> >>> Cheers! >>> >>> >>> >>>> On Sat, Apr 27, 2019 at 5:11 PM Allison Price wrote: >>>> >>>> Hi Michael, >>>> >>>> We reached out to all of the PTLs who had questions in the 2018 version of the survey to review and update their questions. If there is a project that was missed, we can add it and share anonymized results with the PTLs directly as well as the openstack-discsuss mailing list. >>>> >>>> If there is a question from the Octavia team, please let us know and we can add it for the 2019 survey. >>>> >>>> Cheers, >>>> Allison >>>> >>>> >>>> >>>> On Apr 27, 2019, at 4:01 PM, Michael Johnson wrote: >>>> >>>> Jimmy, >>>> >>>> I am curious, how did you reach out the PTLs for project specific >>>> questions? The Octavia team didn't receive any e-mail from you or >>>> Allison on the topic. >>>> >>>> Michael >>>> >>>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at fried.cc Wed May 8 21:31:20 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 8 May 2019 16:31:20 -0500 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: <1CC272501B5BC543A05DB90AA509DED527557F03@fmsmsx122.amr.corp.intel.com> References: <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> <03922b54-994e-dcae-8543-7c9c2f75b87d@hco.ntt.co.jp> <5fd214e8-4822-53a5-a7d6-622c5133a26f@fried.cc> <1CC272501B5BC543A05DB90AA509DED527557F03@fmsmsx122.amr.corp.intel.com> Message-ID: <1934f31d-da89-071f-d667-c36d965851ae@fried.cc> Sundar- > I have a set of compute hosts, each with several NICs of type T. Each NIC has a set of PFs: PF1, PF2, .... Each PF is a resource provider, and each has a separate custom RC: CUSTOM_RC_PF1, CUSTOM_RC_PF2, ... . The VFs are inventories of the associated PF's RC. Provider networks etc. are traits on that PF. It would be weird for the inventories to be called PF* if they're inventories of VF. But mainly: why the custom resource classes? The way "resourceless RP" + "same_subtree" is designed to work is best explained if I model your use case with standard resource classes instead: CN | +---NIC1 (trait: I_AM_A_NIC) | | | +-----PF1_1 (trait: CUSTOM_PHYSNET1, inventory: VF=4) | | | +-----PF1_2 (trait: CUSTOM_PHYSNET2, inventory: VF=4) | +---NIC2 (trait: I_AM_A_NIC) | +-----PF2_1 (trait: CUSTOM_PHYSNET1, inventory: VF=4) | +-----PF2_2 (trait: CUSTOM_PHYSNET2, inventory: VF=4) Now if I say: ?resources_T1=VF:1 &required_T1=CUSTOM_PHYSNET1 &resources_T2=VF:1 &required_T2=CUSTOM_PHYSNET2 &required_T3=I_AM_A_NIC &same_subtree=','.join([suffix for suffix in suffixes if suffix.startswith('_T')]) (i.e. '_T1,_T2,_T3') ...then I'll get two candidates: - {PF1_1: VF=1, PF1_2: VF=1} <== i.e. both from NIC1 - {PF2_1: VF=1, PF2_2: VF=1} <== i.e. both from NIC2 ...and no candidates where one VF is from each NIC. IIUC this is how you wanted it. ============== With the custom resource classes, I'm having a hard time understanding the model. How unique are the _PF$N bits? Do they repeat (a) from one NIC to the next? (b) From one host to the next? (c) Never? The only thing that begins to make sense is (a), because (b) and (c) would lead to skittles. So assuming (a), the model would look something like: CN | +---NIC1 (trait: I_AM_A_NIC) | | | +-----PF1_1 (trait: CUSTOM_PHYSNET1, inventory: CUSTOM_PF1_VF=4) | | | +-----PF1_2 (trait: CUSTOM_PHYSNET2, inventory: CUSTOM_PF2_VF=4) | +---NIC2 (trait: I_AM_A_NIC) | +-----PF2_1 (trait: CUSTOM_PHYSNET1, inventory: CUSTOM_PF1_VF=4) | +-----PF2_2 (trait: CUSTOM_PHYSNET2, inventory: CUSTOM_PF2_VF=4) Now you could get the same result with (essentially) the same request as above: ?resources_T1=CUSTOM_PF1_VF:1 &required_T1=CUSTOM_PHYSNET1 &resources_T2=CUSTOM_PF2_VF:1 &required_T2=CUSTOM_PHYSNET2 &required_T3=I_AM_A_NIC &same_subtree=','.join([suffix for suffix in suffixes if suffix.startswith('_T')]) (i.e. 
'_T1,_T2,_T3') ==> - {PF1_1: CUSTOM_PF1_VF=1, PF1_2: CUSTOM_PF2_VF=1} - {PF2_1: CUSTOM_PF1_VF=1, PF2_2: CUSTOM_PF2_VF=1} ...except that in this model, PF$N corresponds to PHYSNET$N, so you wouldn't actually need the required_T$N=CUSTOM_PHYSNET$N to get the same result: ?resources_T1=CUSTOM_PF1_VF:1 &resources_T2=CUSTOM_PF2_VF:1 &required_T3=I_AM_A_NIC &same_subtree=','.join([suffix for suffix in suffixes if suffix.startswith('_T')]) (i.e. '_T1,_T2,_T3') ...because you're effectively encoding the physnet into the RC. Which is not good IMO. But either way... > Do I have to create a 'resourceless RP' for the NIC card that contains the individual PF RPs as children nodes? ...if you want to be able to request this kind of affinity, then yes, you do (unless there's some consumable resource on the NIC, in which case it's not resourceless, but the spirit is the same). This is exactly what these features are being designed for. Thanks, efried . From gouthampravi at gmail.com Wed May 8 23:21:41 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Wed, 8 May 2019 16:21:41 -0700 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: References: Message-ID: On Tue, May 7, 2019 at 1:08 PM Jay Bryant wrote: > > All, > > Cinder has been working with the same unwritten rules for quite some time as well with minimal issues. > > I think the concerns about not having it documented are warranted. We have had question about it in the past with no documentation to point to. It is more or less lore that has been passed down over the releases. :-) > > At a minimum, having this e-mail thread is helpful. If, however, we decide to document it I think we should have it consistent across the teams that use the rule. I would be happy to help draft/review any such documentation. Chiming in to say the manila community adopted a review policy during Stein release - most of the review policy was what we followed prior, without explicitly writing them down: https://docs.openstack.org/manila/latest/contributor/manila-review-policy.html. Here's a snip/snap from that policy that is relevant to this discussion: Previously, the manila core team informally enforced a code review convention that each code change be reviewed and merged by reviewers of different affiliations. This was followed because the OpenStack Technical Committee used the diversity of affiliation of the core reviewer team as a metric for maturity of the project. However, since the Rocky release cycle, the TC has changed its view on the subject 3 4. We believe this is a step in the right direction. While there is no strict requirement that two core reviewers accepting a code change have different affiliations. Other things being equal, we will continue to informally encourage organizational diversity by having core reviewers from different organizations. Core reviewers have the professional responsibility of avoiding conflicts of interest. > > Jay > > On 5/4/2019 8:19 PM, Morgan Fainberg wrote: > > > > On Sat, May 4, 2019, 16:48 Eric Fried wrote: >> >> (NB: I tagged [all] because it would be interesting to know where other >> teams stand on this issue.) >> >> Etherpad: https://etherpad.openstack.org/p/nova-ptg-train-governance >> >> Summary: >> - There is a (currently unwritten? at least for Nova) rule that a patch >> should not be approved exclusively by cores from the same company. 
This
>> is rife with nuance, including but not limited to:
>> - Usually (but not always) relevant when the patch was proposed by a
>> member of the same company
>> - N/A for trivial things like typo fixes
>> - The issue is:
>>   - Should the rule be abolished? and/or
>>   - Should the rule be written down?
>>
>> Consensus (not unanimous):
>> - The rule should not be abolished. There are cases where both the
>> impetus and the subject matter expertise for a patch reside within
>> one company. In such cases, at least one core from another company
>> should still be engaged and provide a "procedural +2" - much like cores
>> proxy SME +1s when there's no core with deep expertise.
>> - If there is reasonable justification for bending the rules (e.g. typo
>> fixes as noted above, some piece of work clearly not related to the
>> company's interest, unwedging the gate, etc.), said justification should
>> be clearly documented in review commentary.
>> - The rule should not be documented (this email notwithstanding). This
>> would either encourage loopholing or turn into a huge detailed legal
>> tome that nobody will read. It would also *require* enforcement, which
>> is difficult and awkward. Overall, we should be able to trust cores to
>> act in good faith and in the appropriate spirit.
>>
>> efried
>> .
>
> Keystone used to have the same policy outlined in this email (with much of
> the same nuance and exceptions). Without going into crazy details: as the
> contributor and core numbers went down, we opted to really lean on
> "Overall, we should be able to trust cores to act in good faith". We
> abolished the rule, and the cores always ask for outside input when the
> familiarity lies outside of the team. We often also pull in cores more
> familiar with the code, sometimes ending up with 3x +2s before we workflow
> the patch.
>
> Personally I don't like the "this is an unwritten rule and it shouldn't be
> documented" approach; if documenting and enforcing the rule raises worries
> about gaming the system or about producing a dense tome nobody will read,
> then in my mind (and experience) the rule may not be worth having. I voice
> my opinion with the caveat that every team is different. If the rule works,
> and helps the team (Nova in this case) feel more confident in the
> management of code, the rule has a place and can live on. What works for
> one team doesn't always work for another.

From rafaelweingartner at gmail.com Thu May 9 00:45:38 2019
From: rafaelweingartner at gmail.com (Rafael Weingärtner)
Date: Wed, 8 May 2019 21:45:38 -0300
Subject: [telemetry] Team meeting agenda for tomorrow
In-Reply-To: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com>
References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com>
Message-ID: 

>
> Unfortunately, I have a conflict at that time and will not be able to
> attend.
>
> I do have a little bit of context on the Events deprecation to share.
>
> First, you will note the commit message from the commit [0] when Events
> were deprecated:
>
> "
> Deprecate event subsystem
>
> This subsystem has never been finished and is not maintained.
> Deprecate it for future removal.
> "
>
> I got the impression from jd at the time that there were a number of
> features in Telemetry, including Panko, that were not really "finished"
> and that the engineers who had worked on them had moved on to other
> things, so the features had become unsupported. In late 2018 there was
> an effort to clean up things that were not well maintained or didn't fit
> the direction of Telemetry.
>
> See also:
> https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/
>

Thanks for the reply, Joseph.

I have seen the commit message, and I also read the blog post you referenced
(and other pages on the same topic), which made us a bit worried. I will try
to explain our perspective and the impression we got when reading those
pages. It is also worth noting that we have just started engaging with the
OpenStack community (so pardon my ignorance about some parts of OpenStack and
about how this open source community works). We are already making some
contributions to Kolla-ansible, and we want to start contributing back to
Telemetry as well.

Before getting to the topic of Telemetry, or more precisely Ceilometer, let
me state that I have taken part in other open source projects and communities
before, but those communities are managed by a different organization.

So, Ceilometer: when we were designing and building our OpenStack cloud,
billing was a crucial part of it, and Ceilometer was chosen because it fits
our requirements, working "out of the box" to provide valuable data for
billing in a highly available fashion. It certainly lacks some features, but
that is OK when one works with open source; the pollers and event managers we
are missing are things we would like to create and contribute back to the
community.

Having said that, what puzzled and worried us is the fact that working
features are being removed from a project just because some
contributors/committers left the community. There wasn't (or at least I did
not see) a good technical reason to remove this feature (e.g. it does not
deliver what it promises, an alternative solution has been created somewhere
and effort is being concentrated there, nobody uses it, and so on). If the
feature were broken and there were no people to fix it, I would understand,
but that is not the case: the feature works, and it delivers what is
promised. Moreover, the blog post you referenced does not leave a good
feeling about how the community managed the event in question (the project
losing part of its contributors). Open source projects have cycles, and it is
understandable and normal that sometimes not many people are working on
something. As you can see, we are now starting to engage with the Telemetry
project/community. What is hard for us to understand is that the departing
contributors are also "killing" the project by removing features that are
very interesting and valuable for us.

Why is that important for us?
When we work with open source we know that we might need to put effort into
customizing/adapting things to our business workflow, and we expect the
community to be there to receive and discuss these changes. That gives us
predictability that the software we base our business on will still be there,
and that we can contribute back to improve it. An open source project could
and should live on even if it has few contributors for a while; if people
regroup and start to work on it again, the community is able to flourish.

> Events is one feature that often gets requested, but the use cases and
> demand for it are not expressed strongly or well understood by most
> people. If the Telemetry project has demand to de-deprecate Event
> handling (including Panko), I'd suggest a review of the requirements
> for event handling and possibly choosing a champion for maintaining
> the Panko service.
> > > Also note: over in Monasca we have a spec [1] for handling Events > ingestion which I hope we will be > > completing in Train. Contributions and comments welcome. :) > > > joseph > > [0] > https://github.com/openstack/ceilometer/commit/8a0245a5b3e1357d35ad6653be37ca01176577e4 > > [1] > https://github.com/openstack/monasca-specs/blob/master/specs/stein/approved/monasca-events-listener.rst > It is awesome that you might have a similar spec (not developed yet) for Monasca, but the question would remain for us. One, two, or three years from now, what will happen if you, your team, or the people behind this spec/feature decide to leave the community? Will this feature be removed from Monasca too? On Wed, May 8, 2019 at 6:23 PM Joseph Davis wrote: > On 5/8/19 7:12 AM, openstack-discuss-request at lists.openstack.org wrote: > > Hello Trinh, > Where does the meeting happen? Will it be via IRC Telemetry channel? Or, in > the Etherpad (https://etherpad.openstack.org/p/telemetry-meeting-agenda)? I > would like to discuss and understand a bit better the context behind > the Telemetry > events deprecation. > > Unfortunately, I have a conflict at that time and will not be able to > attend. > > I do have a little bit of context on the Events deprecation to share. > > First, you will note the commit message from the commit [0] when Events > were deprecated: > > " > > Deprecate event subsystem > > This subsystem has never been finished and is not maintained. > Deprecate it for future removal. > > " > > I got the impression from jd at the time that there were a number of > features in Telemetry, > > including Panko, that were not really "finished" and that the engineers > who had worked on them > > had moved on to other things, so the features had become unsupported. In > late 2018 there was > > an effort to clean up things that were not well maintained or didn't fit > the direction of Telemetry. > > See also: > https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/ > > > Events is one feature that often gets requested, but the use cases and > demand for it are not expressed > > strongly or well understood by most people. If the Telemetry project has > demand to de-deprecate > > Event handling (including Panko), I'd suggest a review of the requirements > for event handling and > > possibly choosing a champion for maintaining the Panko service. > > > Also note: over in Monasca we have a spec [1] for handling Events > ingestion which I hope we will be > > completing in Train. Contributions and comments welcome. :) > > > joseph > > [0] > https://github.com/openstack/ceilometer/commit/8a0245a5b3e1357d35ad6653be37ca01176577e4 > > [1] > https://github.com/openstack/monasca-specs/blob/master/specs/stein/approved/monasca-events-listener.rst > > > On Wed, May 8, 2019 at 12:19 AM Trinh Nguyen wrote: > > > Hi team, > > As planned, we will have a team meeting at 02:00 UTC, May 9th on > #openstack-telemetry to discuss what we gonna do for the next milestone > (Train-1) and continue what we left off from the last meeting. > > I put here [1] the agenda thinking that it should be fine for an hour > meeting. If you have anything to talk about, please put it there too. > > [1] https://etherpad.openstack.org/p/telemetry-meeting-agenda > > > Bests, > > --**Trinh Nguyen** > *www.edlab.xyz * > > > > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From joseph.davis at suse.com Thu May 9 01:33:11 2019 From: joseph.davis at suse.com (Joseph Davis) Date: Wed, 8 May 2019 18:33:11 -0700 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> Message-ID: <51d1e4cd-3e88-8326-a28e-56e267637d83@suse.com> On 5/8/19 5:45 PM, Rafael Weingärtner wrote: > Thanks for the reply Joseph, > > I have seen the commit message, and I also read the blog you > referenced (and other pages related to the same topic) which made us a > bit worried. I will try to explain our perspective and impressions > when we read those blog pages. It is also worth noting that we have > just started engaging with the OpenStack community (so, pardon my > ignorance with some parts of OpenStack, and how this OpenSource > community works). We are already making some contributions to > Kolla-ansible, and we want to start to contribute back to Telemetry as > well. > > Before getting to the topic of Telemetry, and to be more precise, > Ceilometer, let me state that I have taken part in other OpenSource > projects and communities before, but these communities are managed by > a different organization. > > So, Ceilometer; when we were designing and building our OpenStack > Cloud, where billing is a crucial part of it. Ceilometer was chosen > because it fits our requirements, working "out of the box" to provide > valuable data for billing in a high availability fashion. It for sure > lacks some features, but that is ok when one works with OpenSource. > The pollers and event managers we are missing, we would like to create > and contribute back to the community. > > Having said that, what puzzled me, and worried us, is the fact that > features that work are being removed from a project just because some > contributors/committers left the community. There wasn't (at least I > did not see) a good technical reason to remove this feature (e.g. it > does not deliver what is promised, or an alternative solution has been > created somewhere and effort is being concentrated there, nobody uses > it, and so on). If the features were broken, and there were no people > to fix it, I would understand, but that is not the case. The feature > works, and it delivers what is promised. Moreover, reading the blog > you referenced does not provide a good feeling about how the community > has managed the event (the project losing part of its contributors) in > question. OpenSource has cycles, and it is understandable that > sometimes we do not have many people working on something. OpenSource > projects have cycles, and that is normal. As you can see, now there > would be us starting/trying to engage with the Telemetry > project/community. What is hard for us to understand is that the > contributors while leaving are also "killing" the project by removing > part of its features (that are very interesting and valuable for us). > Yeah, the history of Telemetry is a bit unusual in how it developed, and I could give editorials and opinions about decisions that were made and how well they worked in the community, but I'll save that for another time.  I will say that communication with the community could have been better.  And while I think that simplifying Ceilometer was a good choice at the time when the number of contributors was dwindling, I agree that cutting out a feature that is being used by users is not how OpenStack ought to operate. And now I'm starting to give opinions so I'll stop. 
I will say that it may be beneficial to the Telemetry project if you can write out your use case for the Telemetry stack and describe why you want Events to be captured and how you will use them.  Describe how they important to your billing solution (*), and if you are matching the event notifications up with other metering data.  You can discuss with the team in the meeting if that use case and set of requirements goes in Storyboard or elsewhere. (*) I am curious if you are using CloudKitty or another solution. > Why is that important for us? > When we work with OpenSource we now that we might need to put effort > to customize/adapt things to our business workflow, and we expect that > the community will be there to receive and discuss these changes. > Therefore, we have predictability that the software/system we base our > business will be there, and we can contribute back to improve it. An > open source community could and should live even if the project has no > community for a while, then if people regroup and start to work on it > again, the community is able to flourish. I'm really glad you recognize the benefits of contributing back to the community.  It gives me hope. :) > > It is awesome that you might have a similar spec (not developed yet) > for Monasca, but the question would remain for us. One, two, or three > years from now, what will happen if you, your team, or the people > behind this spec/feature decide to leave the community? Will this > feature be removed from Monasca too? Developers leaving the community is a normal part of the lifecycle, so I think you would agree that part of having a healthy project is ensuring that when that happens the project can go on.  Monasca has already seen a number of developers come and go, and will continue on for the foreseeable future.  That is part of why we wrote a spec for the events-listener, so that if needed the work could change hands and continue with context.  We try to plan and get cross-company agreement in the community.  Of course, there are priorities and trade-offs and limits on developers, but Monasca and OpenStack seem to do a good job of being 'open' about it. > > -- > Rafael Weingärtner joseph -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Thu May 9 01:45:12 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 9 May 2019 10:45:12 +0900 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: <51d1e4cd-3e88-8326-a28e-56e267637d83@suse.com> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <51d1e4cd-3e88-8326-a28e-56e267637d83@suse.com> Message-ID: Thanks, Joseph, Rafael for the great comments. Understanding the user's use-cases is a very important step to make a feature alive. On Thu, May 9, 2019 at 10:33 AM Joseph Davis wrote: > On 5/8/19 5:45 PM, Rafael Weingärtner wrote: > > > Thanks for the reply Joseph, > > I have seen the commit message, and I also read the blog you referenced > (and other pages related to the same topic) which made us a bit worried. I > will try to explain our perspective and impressions when we read those blog > pages. It is also worth noting that we have just started engaging with the > OpenStack community (so, pardon my ignorance with some parts of OpenStack, > and how this OpenSource community works). We are already making some > contributions to Kolla-ansible, and we want to start to contribute back to > Telemetry as well. 
> > Before getting to the topic of Telemetry, and to be more precise, > Ceilometer, let me state that I have taken part in other OpenSource > projects and communities before, but these communities are managed by a > different organization. > > So, Ceilometer; when we were designing and building our OpenStack Cloud, > where billing is a crucial part of it. Ceilometer was chosen because it > fits our requirements, working "out of the box" to provide valuable data > for billing in a high availability fashion. It for sure lacks some > features, but that is ok when one works with OpenSource. The pollers and > event managers we are missing, we would like to create and contribute back > to the community. > > Having said that, what puzzled me, and worried us, is the fact that > features that work are being removed from a project just because some > contributors/committers left the community. There wasn't (at least I did > not see) a good technical reason to remove this feature (e.g. it does not > deliver what is promised, or an alternative solution has been created > somewhere and effort is being concentrated there, nobody uses it, and so > on). If the features were broken, and there were no people to fix it, I > would understand, but that is not the case. The feature works, and it > delivers what is promised. Moreover, reading the blog you referenced does > not provide a good feeling about how the community has managed the event > (the project losing part of its contributors) in question. OpenSource has > cycles, and it is understandable that sometimes we do not have many people > working on something. OpenSource projects have cycles, and that is normal. > As you can see, now there would be us starting/trying to engage with the > Telemetry project/community. What is hard for us to understand is that the > contributors while leaving are also "killing" the project by removing part > of its features (that are very interesting and valuable for us). > > Yeah, the history of Telemetry is a bit unusual in how it developed, and I > could give editorials and opinions about decisions that were made and how > well they worked in the community, but I'll save that for another time. I > will say that communication with the community could have been better. And > while I think that simplifying Ceilometer was a good choice at the time > when the number of contributors was dwindling, I agree that cutting out a > feature that is being used by users is not how OpenStack ought to operate. > And now I'm starting to give opinions so I'll stop. > > I will say that it may be beneficial to the Telemetry project if you can > write out your use case for the Telemetry stack and describe why you want > Events to be captured and how you will use them. Describe how they > important to your billing solution (*), and if you are matching the event > notifications up with other metering data. You can discuss with the team > in the meeting if that use case and set of requirements goes in Storyboard > or elsewhere. > > (*) I am curious if you are using CloudKitty or another solution. > > > Why is that important for us? > When we work with OpenSource we now that we might need to put effort to > customize/adapt things to our business workflow, and we expect that the > community will be there to receive and discuss these changes. Therefore, we > have predictability that the software/system we base our business will be > there, and we can contribute back to improve it. 
An open source community > could and should live even if the project has no community for a while, > then if people regroup and start to work on it again, the community is able > to flourish. > > I'm really glad you recognize the benefits of contributing back to the > community. It gives me hope. :) > > > > It is awesome that you might have a similar spec (not developed yet) for > Monasca, but the question would remain for us. One, two, or three years > from now, what will happen if you, your team, or the people behind this > spec/feature decide to leave the community? Will this feature be removed > from Monasca too? > > Developers leaving the community is a normal part of the lifecycle, so I > think you would agree that part of having a healthy project is ensuring > that when that happens the project can go on. Monasca has already seen a > number of developers come and go, and will continue on for the foreseeable > future. That is part of why we wrote a spec for the events-listener, so > that if needed the work could change hands and continue with context. We > try to plan and get cross-company agreement in the community. Of course, > there are priorities and trade-offs and limits on developers, but Monasca > and OpenStack seem to do a good job of being 'open' about it. > > > > > -- > Rafael Weingärtner > > > joseph > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Thu May 9 01:48:29 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 9 May 2019 10:48:29 +0900 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: References: Message-ID: Hi team, It's 12m before the meeting. Bests, On Thu, May 9, 2019 at 12:09 AM Rafael Weingärtner < rafaelweingartner at gmail.com> wrote: > Thanks, I'll be there. > > Em qua, 8 de mai de 2019 11:41, Trinh Nguyen > escreveu: > >> Hi Rafael, >> >> The meeting will be held on the IRC channel #openstack-telemetry as >> mentioned in the previous email. >> >> Thanks, >> >> On Wed, May 8, 2019 at 10:50 PM Rafael Weingärtner < >> rafaelweingartner at gmail.com> wrote: >> >>> Hello Trinh, >>> Where does the meeting happen? Will it be via IRC Telemetry channel? Or, >>> in the Etherpad ( >>> https://etherpad.openstack.org/p/telemetry-meeting-agenda)? I would >>> like to discuss and understand a bit better the context behind the Telemetry >>> events deprecation. >>> >>> On Wed, May 8, 2019 at 12:19 AM Trinh Nguyen >>> wrote: >>> >>>> Hi team, >>>> >>>> As planned, we will have a team meeting at 02:00 UTC, May 9th on >>>> #openstack-telemetry to discuss what we gonna do for the next milestone >>>> (Train-1) and continue what we left off from the last meeting. >>>> >>>> I put here [1] the agenda thinking that it should be fine for an hour >>>> meeting. If you have anything to talk about, please put it there too. >>>> >>>> [1] https://etherpad.openstack.org/p/telemetry-meeting-agenda >>>> >>>> >>>> Bests, >>>> >>>> -- >>>> *Trinh Nguyen* >>>> *www.edlab.xyz * >>>> >>>> >>> >>> -- >>> Rafael Weingärtner >>> >> >> >> -- >> *Trinh Nguyen* >> *www.edlab.xyz * >> >> -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Thu May 9 02:00:45 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Wed, 8 May 2019 20:00:45 -0600 Subject: [tripleo] CI RED fyi.. 
something is causing both overcloud network configuration issues atm Message-ID: Seem the jobs go red at midnight May 9th UTC. What I'm mainly seeing is the overcloud hang after the ssh keys are created, it seems the overcloud nodes do not have network connectivity. http://logs.openstack.org/47/657547/5/check/tripleo-ci-centos-7-containers-multinode/af82c9f/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz http://logs.openstack.org/47/657547/5/check/tripleo-ci-centos-7-containers-multinode/af82c9f/logs/subnode-2/var/log/extra/failed_services.txt.gz This looks normal eth0, eth1 come up http://logs.openstack.org/47/657547/5/check/tripleo-ci-centos-7-containers-multinode/af82c9f/logs/subnode-2/var/log/journal.txt.gz#_May_08_21_42_19 I'm not 100% if this related to some of the latest patches, or if this impacts all jobs atm. Looking into it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Thu May 9 04:22:25 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Wed, 8 May 2019 22:22:25 -0600 Subject: [tripleo] CI RED fyi.. something is causing both overcloud network configuration issues atm References: Message-ID: On Wed, May 8, 2019 at 8:00 PM Wesley Hayutin wrote: > Seem the jobs go red at midnight May 9th UTC. > > What I'm mainly seeing is the overcloud hang after the ssh keys are > created, it seems the overcloud nodes do not have network connectivity. > > http://logs.openstack.org/47/657547/5/check/tripleo-ci-centos-7-containers-multinode/af82c9f/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz > > > http://logs.openstack.org/47/657547/5/check/tripleo-ci-centos-7-containers-multinode/af82c9f/logs/subnode-2/var/log/extra/failed_services.txt.gz > > This looks normal eth0, eth1 come up > > http://logs.openstack.org/47/657547/5/check/tripleo-ci-centos-7-containers-multinode/af82c9f/logs/subnode-2/var/log/journal.txt.gz#_May_08_21_42_19 > > I'm not 100% if this related to some of the latest patches, or if this > impacts all jobs atm. > Looking into it. > AFAICT, the issue is either with a few of the patches submitted or a blip in infra. A clean test patch is working w/ multinode-containers and ovb fs001 CI should be green'ish quoting #tripleo Sorry for the spam -------------- next part -------------- An HTML attachment was scrubbed... URL: From sundar.nadathur at intel.com Thu May 9 04:37:08 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Wed, 8 May 2019 21:37:08 -0700 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: <1934f31d-da89-071f-d667-c36d965851ae@fried.cc> References: <97bd8e53-0285-1c92-845f-21098b0b0e38@gmail.com> <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> <03922b54-994e-dcae-8543-7c9c2f75b87d@hco.ntt.co.jp> <5fd214e8-4822-53a5-a7d6-622c5133a26f@fried.cc> <1CC272501B5BC543A05DB90AA509DED527557F03@fmsmsx122.amr.corp.intel.com> <1934f31d-da89-071f-d667-c36d965851ae@fried.cc> Message-ID: <489d8cae-9151-5f43-b495-ad51c959a0ea@intel.com> On 5/8/2019 2:31 PM, Eric Fried wrote: > Sundar- > >> I have a set of compute hosts, each with several NICs of type T. Each NIC has a set of PFs: PF1, PF2, .... Each PF is a resource provider, and each has a separate custom RC: CUSTOM_RC_PF1, CUSTOM_RC_PF2, ... . The VFs are inventories of the associated PF's RC. Provider networks etc. 
are traits on that PF. > It would be weird for the inventories to be called PF* if they're > inventories of VF.  I am focusing mainly on the concepts for now, not on the names. > But mainly: why the custom resource classes? This is as elaborate an example as I could cook up. IRL, we may need some custom RC, but maybe not one for each PF type. > The way "resourceless RP" + "same_subtree" is designed to work is best > explained if I model your use case with standard resource classes instead: > > CN > | > +---NIC1 (trait: I_AM_A_NIC) > | | > | +-----PF1_1 (trait: CUSTOM_PHYSNET1, inventory: VF=4) > | | > | +-----PF1_2 (trait: CUSTOM_PHYSNET2, inventory: VF=4) > | > +---NIC2 (trait: I_AM_A_NIC) > | > +-----PF2_1 (trait: CUSTOM_PHYSNET1, inventory: VF=4) > | > +-----PF2_2 (trait: CUSTOM_PHYSNET2, inventory: VF=4) > > Now if I say: > > ?resources_T1=VF:1 > &required_T1=CUSTOM_PHYSNET1 > &resources_T2=VF:1 > &required_T2=CUSTOM_PHYSNET2 > &required_T3=I_AM_A_NIC > &same_subtree=','.join([suffix for suffix in suffixes if > suffix.startswith('_T')]) (i.e. '_T1,_T2,_T3') > > ...then I'll get two candidates: > > - {PF1_1: VF=1, PF1_2: VF=1} <== i.e. both from NIC1 > - {PF2_1: VF=1, PF2_2: VF=1} <== i.e. both from NIC2 > > ...and no candidates where one VF is from each NIC. > > IIUC this is how you wanted it. Yes. The examples in the storyboard [1] for NUMA affinity use group numbers. If that were recast to use named groups, and we wanted NUMA affinity apart from device colocation, would that not require a different name than T? In short, if you want to express 2 different affinities/groupings, perhaps we need to use a name with 2 parts, and use 2 different same_subtree clauses. Just pointing out the implications. BTW, I noticed there is a standard RC for NIC VFs [2]. [1] https://storyboard.openstack.org/#!/story/2005575 [2] https://github.com/openstack/os-resource-classes/blob/master/os_resource_classes/__init__.py#L49 > ============== > > With the custom resource classes, I'm having a hard time understanding > the model. How unique are the _PF$N bits? Do they repeat (a) from one > NIC to the next? (b) From one host to the next? (c) Never? > > The only thing that begins to make sense is (a), because (b) and (c) > would lead to skittles. So assuming (a), the model would look something > like: Yes, (a) is what I had in mind. > CN > | > +---NIC1 (trait: I_AM_A_NIC) > | | > | +-----PF1_1 (trait: CUSTOM_PHYSNET1, inventory: CUSTOM_PF1_VF=4) > | | > | +-----PF1_2 (trait: CUSTOM_PHYSNET2, inventory: CUSTOM_PF2_VF=4) > | > +---NIC2 (trait: I_AM_A_NIC) > | > +-----PF2_1 (trait: CUSTOM_PHYSNET1, inventory: CUSTOM_PF1_VF=4) > | > +-----PF2_2 (trait: CUSTOM_PHYSNET2, inventory: CUSTOM_PF2_VF=4) > > Now you could get the same result with (essentially) the same request as > above: > > ?resources_T1=CUSTOM_PF1_VF:1 > &required_T1=CUSTOM_PHYSNET1 > &resources_T2=CUSTOM_PF2_VF:1 > &required_T2=CUSTOM_PHYSNET2 > &required_T3=I_AM_A_NIC > &same_subtree=','.join([suffix for suffix in suffixes if > suffix.startswith('_T')]) (i.e. '_T1,_T2,_T3') > > ==> > > - {PF1_1: CUSTOM_PF1_VF=1, PF1_2: CUSTOM_PF2_VF=1} > - {PF2_1: CUSTOM_PF1_VF=1, PF2_2: CUSTOM_PF2_VF=1} > > ...except that in this model, PF$N corresponds to PHYSNET$N, so you > wouldn't actually need the required_T$N=CUSTOM_PHYSNET$N to get the same > result: > > ?resources_T1=CUSTOM_PF1_VF:1 > &resources_T2=CUSTOM_PF2_VF:1 > &required_T3=I_AM_A_NIC > &same_subtree=','.join([suffix for suffix in suffixes if > suffix.startswith('_T')]) (i.e. 
'_T1,_T2,_T3') > > ...because you're effectively encoding the physnet into the RC. Which is > not good IMO. > > But either way... > >> Do I have to create a 'resourceless RP' for the NIC card that contains > the individual PF RPs as children nodes? > > ...if you want to be able to request this kind of affinity, then yes, > you do (unless there's some consumable resource on the NIC, in which > case it's not resourceless, but the spirit is the same). This is exactly > what these features are being designed for. Great. Thank you very much for the detailed reply. Regards, Sundar > Thanks, > efried > . > From dangtrinhnt at gmail.com Thu May 9 06:37:00 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 9 May 2019 15:37:00 +0900 Subject: [tc][searchlight] What does Maintenance Mode mean for a project? Message-ID: Hi, Currently, in the project details section of Searchlight page [1], it says we're in the Maintenance Mode. What does that mean? and how we can update it? Thanks, [1] https://www.openstack.org/software/releases/rocky/components/searchlight -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From Tim.Bell at cern.ch Thu May 9 07:24:43 2019 From: Tim.Bell at cern.ch (Tim Bell) Date: Thu, 9 May 2019 07:24:43 +0000 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> Message-ID: <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> Is it time to rethink the approach to telemetry a bit? Having each project provide its telemetry data (such as Swift with statsd - https://docs.openstack.org/swift/latest/admin/objectstorage-monitoring.html or using a framework like Prometheus)? In the end, the projects are the ones who have the best knowledge of how to get the metrics. Tim From: Rafael Weingärtner Date: Thursday, 9 May 2019 at 02:51 To: Joseph Davis Cc: openstack-discuss , Trinh Nguyen Subject: Re: [telemetry] Team meeting agenda for tomorrow Unfortunately, I have a conflict at that time and will not be able to attend. I do have a little bit of context on the Events deprecation to share. First, you will note the commit message from the commit [0] when Events were deprecated: " Deprecate event subsystem This subsystem has never been finished and is not maintained. Deprecate it for future removal. " I got the impression from jd at the time that there were a number of features in Telemetry, including Panko, that were not really "finished" and that the engineers who had worked on them had moved on to other things, so the features had become unsupported. In late 2018 there was an effort to clean up things that were not well maintained or didn't fit the direction of Telemetry. See also: https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/ Thanks for the reply Joseph, I have seen the commit message, and I also read the blog you referenced (and other pages related to the same topic) which made us a bit worried. I will try to explain our perspective and impressions when we read those blog pages. It is also worth noting that we have just started engaging with the OpenStack community (so, pardon my ignorance with some parts of OpenStack, and how this OpenSource community works). We are already making some contributions to Kolla-ansible, and we want to start to contribute back to Telemetry as well. 
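Roughly, the sort of throwaway environment I have in mind looks like the
sketch below. This is only a sketch: VIRT_DRIVER=fake is standard devstack,
but the NUMBER_FAKE_NOVA_COMPUTE knob, the watcher plugin location, and the
flavor/image/network names are assumptions that would need to be checked
against the devstack branch in use.

  [[local|localrc]]
  VIRT_DRIVER=fake
  # Each fake compute is just another nova-compute service reporting
  # made-up inventory, so a couple dozen should fit in one 8GB/8VCPU VM.
  NUMBER_FAKE_NOVA_COMPUTE=20
  enable_plugin watcher https://opendev.org/openstack/watcher

  # Then create enough small instances to leave every node partially
  # packed, which is the fragmentation the strategies should clean up.
  for i in $(seq 1 50); do
      openstack server create --flavor m1.nano \
          --image cirros-0.4.0-x86_64-disk --network private fake-vm-$i
  done

From there, the interesting numbers would be how long an audit with something
like the server_consolidation goal takes to produce an action plan as the
node and instance counts grow, and how many migrations it proposes for a
given level of fragmentation.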
Before getting to the topic of Telemetry, and to be more precise, Ceilometer, let me state that I have taken part in other OpenSource projects and communities before, but these communities are managed by a different organization. So, Ceilometer; when we were designing and building our OpenStack Cloud, where billing is a crucial part of it. Ceilometer was chosen because it fits our requirements, working "out of the box" to provide valuable data for billing in a high availability fashion. It for sure lacks some features, but that is ok when one works with OpenSource. The pollers and event managers we are missing, we would like to create and contribute back to the community. Having said that, what puzzled me, and worried us, is the fact that features that work are being removed from a project just because some contributors/committers left the community. There wasn't (at least I did not see) a good technical reason to remove this feature (e.g. it does not deliver what is promised, or an alternative solution has been created somewhere and effort is being concentrated there, nobody uses it, and so on). If the features were broken, and there were no people to fix it, I would understand, but that is not the case. The feature works, and it delivers what is promised. Moreover, reading the blog you referenced does not provide a good feeling about how the community has managed the event (the project losing part of its contributors) in question. OpenSource has cycles, and it is understandable that sometimes we do not have many people working on something. OpenSource projects have cycles, and that is normal. As you can see, now there would be us starting/trying to engage with the Telemetry project/community. What is hard for us to understand is that the contributors while leaving are also "killing" the project by removing part of its features (that are very interesting and valuable for us). Why is that important for us? When we work with OpenSource we now that we might need to put effort to customize/adapt things to our business workflow, and we expect that the community will be there to receive and discuss these changes. Therefore, we have predictability that the software/system we base our business will be there, and we can contribute back to improve it. An open source community could and should live even if the project has no community for a while, then if people regroup and start to work on it again, the community is able to flourish. Events is one feature that often gets requested, but the use cases and demand for it are not expressed strongly or well understood by most people. If the Telemetry project has demand to de-deprecate Event handling (including Panko), I'd suggest a review of the requirements for event handling and possibly choosing a champion for maintaining the Panko service. Also note: over in Monasca we have a spec [1] for handling Events ingestion which I hope we will be completing in Train. Contributions and comments welcome. :) joseph [0] https://github.com/openstack/ceilometer/commit/8a0245a5b3e1357d35ad6653be37ca01176577e4 [1] https://github.com/openstack/monasca-specs/blob/master/specs/stein/approved/monasca-events-listener.rst It is awesome that you might have a similar spec (not developed yet) for Monasca, but the question would remain for us. One, two, or three years from now, what will happen if you, your team, or the people behind this spec/feature decide to leave the community? Will this feature be removed from Monasca too? 
On Wed, May 8, 2019 at 6:23 PM Joseph Davis > wrote: On 5/8/19 7:12 AM, openstack-discuss-request at lists.openstack.org wrote: Hello Trinh, Where does the meeting happen? Will it be via IRC Telemetry channel? Or, in the Etherpad (https://etherpad.openstack.org/p/telemetry-meeting-agenda)? I would like to discuss and understand a bit better the context behind the Telemetry events deprecation. Unfortunately, I have a conflict at that time and will not be able to attend. I do have a little bit of context on the Events deprecation to share. First, you will note the commit message from the commit [0] when Events were deprecated: " Deprecate event subsystem This subsystem has never been finished and is not maintained. Deprecate it for future removal. " I got the impression from jd at the time that there were a number of features in Telemetry, including Panko, that were not really "finished" and that the engineers who had worked on them had moved on to other things, so the features had become unsupported. In late 2018 there was an effort to clean up things that were not well maintained or didn't fit the direction of Telemetry. See also: https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/ Events is one feature that often gets requested, but the use cases and demand for it are not expressed strongly or well understood by most people. If the Telemetry project has demand to de-deprecate Event handling (including Panko), I'd suggest a review of the requirements for event handling and possibly choosing a champion for maintaining the Panko service. Also note: over in Monasca we have a spec [1] for handling Events ingestion which I hope we will be completing in Train. Contributions and comments welcome. :) joseph [0] https://github.com/openstack/ceilometer/commit/8a0245a5b3e1357d35ad6653be37ca01176577e4 [1] https://github.com/openstack/monasca-specs/blob/master/specs/stein/approved/monasca-events-listener.rst On Wed, May 8, 2019 at 12:19 AM Trinh Nguyen wrote: Hi team, As planned, we will have a team meeting at 02:00 UTC, May 9th on #openstack-telemetry to discuss what we gonna do for the next milestone (Train-1) and continue what we left off from the last meeting. I put here [1] the agenda thinking that it should be fine for an hour meeting. If you have anything to talk about, please put it there too. [1] https://etherpad.openstack.org/p/telemetry-meeting-agenda Bests, -- *Trinh Nguyen* *www.edlab.xyz * -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Thu May 9 07:37:35 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 9 May 2019 16:37:35 +0900 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> Message-ID: Hi Tim, It's exactly a great time for your idea as we are trying to develop the new roadmap/vision for Telemetry. I put your comment to the brainstorming etherpad [1] [1] https://etherpad.openstack.org/p/telemetry-train-roadmap Bests, On Thu, May 9, 2019 at 4:24 PM Tim Bell wrote: > Is it time to rethink the approach to telemetry a bit? > > > > Having each project provide its telemetry data (such as Swift with statsd > - > https://docs.openstack.org/swift/latest/admin/objectstorage-monitoring.html > > or using a framework like Prometheus)? 
> > > > In the end, the projects are the ones who have the best knowledge of how > to get the metrics. > > > > Tim > > > > *From: *Rafael Weingärtner > *Date: *Thursday, 9 May 2019 at 02:51 > *To: *Joseph Davis > *Cc: *openstack-discuss , Trinh > Nguyen > *Subject: *Re: [telemetry] Team meeting agenda for tomorrow > > > > Unfortunately, I have a conflict at that time and will not be able to > attend. > > I do have a little bit of context on the Events deprecation to share. > > First, you will note the commit message from the commit [0] when Events > were deprecated: > > " > > Deprecate event subsystem > > This subsystem has never been finished and is not maintained. > > Deprecate it for future removal. > > " > > I got the impression from jd at the time that there were a number of > features in Telemetry, > > including Panko, that were not really "finished" and that the engineers > who had worked on them > > had moved on to other things, so the features had become unsupported. In > late 2018 there was > > an effort to clean up things that were not well maintained or didn't fit > the direction of Telemetry. > > See also: > https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/ > > > > Thanks for the reply Joseph, > > I have seen the commit message, and I also read the blog you referenced > (and other pages related to the same topic) which made us a bit worried. I > will try to explain our perspective and impressions when we read those blog > pages. It is also worth noting that we have just started engaging with the > OpenStack community (so, pardon my ignorance with some parts of OpenStack, > and how this OpenSource community works). We are already making some > contributions to Kolla-ansible, and we want to start to contribute back to > Telemetry as well. > > Before getting to the topic of Telemetry, and to be more precise, > Ceilometer, let me state that I have taken part in other OpenSource > projects and communities before, but these communities are managed by a > different organization. > > So, Ceilometer; when we were designing and building our OpenStack Cloud, > where billing is a crucial part of it. Ceilometer was chosen because it > fits our requirements, working "out of the box" to provide valuable data > for billing in a high availability fashion. It for sure lacks some > features, but that is ok when one works with OpenSource. The pollers and > event managers we are missing, we would like to create and contribute back > to the community. > > Having said that, what puzzled me, and worried us, is the fact that > features that work are being removed from a project just because some > contributors/committers left the community. There wasn't (at least I did > not see) a good technical reason to remove this feature (e.g. it does not > deliver what is promised, or an alternative solution has been created > somewhere and effort is being concentrated there, nobody uses it, and so > on). If the features were broken, and there were no people to fix it, I > would understand, but that is not the case. The feature works, and it > delivers what is promised. Moreover, reading the blog you referenced does > not provide a good feeling about how the community has managed the event > (the project losing part of its contributors) in question. OpenSource has > cycles, and it is understandable that sometimes we do not have many people > working on something. OpenSource projects have cycles, and that is normal. 
> As you can see, now there would be us starting/trying to engage with the > Telemetry project/community. What is hard for us to understand is that the > contributors while leaving are also "killing" the project by removing part > of its features (that are very interesting and valuable for us). > > Why is that important for us? > When we work with OpenSource we now that we might need to put effort to > customize/adapt things to our business workflow, and we expect that the > community will be there to receive and discuss these changes. Therefore, we > have predictability that the software/system we base our business will be > there, and we can contribute back to improve it. An open source community > could and should live even if the project has no community for a while, > then if people regroup and start to work on it again, the community is able > to flourish. > > > > Events is one feature that often gets requested, but the use cases and > demand for it are not expressed > > strongly or well understood by most people. If the Telemetry project has > demand to de-deprecate > > Event handling (including Panko), I'd suggest a review of the requirements > for event handling and > > possibly choosing a champion for maintaining the Panko service. > > > > Also note: over in Monasca we have a spec [1] for handling Events > ingestion which I hope we will be > > completing in Train. Contributions and comments welcome. :) > > > > joseph > > [0] > https://github.com/openstack/ceilometer/commit/8a0245a5b3e1357d35ad6653be37ca01176577e4 > > [1] > https://github.com/openstack/monasca-specs/blob/master/specs/stein/approved/monasca-events-listener.rst > > > > It is awesome that you might have a similar spec (not developed yet) for > Monasca, but the question would remain for us. One, two, or three years > from now, what will happen if you, your team, or the people behind this > spec/feature decide to leave the community? Will this feature be removed > from Monasca too? > > > > On Wed, May 8, 2019 at 6:23 PM Joseph Davis wrote: > > On 5/8/19 7:12 AM, openstack-discuss-request at lists.openstack.org wrote: > > Hello Trinh, > > Where does the meeting happen? Will it be via IRC Telemetry channel? Or, in > > the Etherpad (https://etherpad.openstack.org/p/telemetry-meeting-agenda)? I > > would like to discuss and understand a bit better the context behind > > the Telemetry > > events deprecation. > > Unfortunately, I have a conflict at that time and will not be able to > attend. > > I do have a little bit of context on the Events deprecation to share. > > First, you will note the commit message from the commit [0] when Events > were deprecated: > > " > > Deprecate event subsystem > > This subsystem has never been finished and is not maintained. > > Deprecate it for future removal. > > " > > I got the impression from jd at the time that there were a number of > features in Telemetry, > > including Panko, that were not really "finished" and that the engineers > who had worked on them > > had moved on to other things, so the features had become unsupported. In > late 2018 there was > > an effort to clean up things that were not well maintained or didn't fit > the direction of Telemetry. > > See also: > https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/ > > > > Events is one feature that often gets requested, but the use cases and > demand for it are not expressed > > strongly or well understood by most people. 
If the Telemetry project has > demand to de-deprecate > > Event handling (including Panko), I'd suggest a review of the requirements > for event handling and > > possibly choosing a champion for maintaining the Panko service. > > > > Also note: over in Monasca we have a spec [1] for handling Events > ingestion which I hope we will be > > completing in Train. Contributions and comments welcome. :) > > > > joseph > > [0] > https://github.com/openstack/ceilometer/commit/8a0245a5b3e1357d35ad6653be37ca01176577e4 > > [1] > https://github.com/openstack/monasca-specs/blob/master/specs/stein/approved/monasca-events-listener.rst > > > > On Wed, May 8, 2019 at 12:19 AM Trinh Nguyen wrote: > > > > Hi team, > > > > As planned, we will have a team meeting at 02:00 UTC, May 9th on > > #openstack-telemetry to discuss what we gonna do for the next milestone > > (Train-1) and continue what we left off from the last meeting. > > > > I put here [1] the agenda thinking that it should be fine for an hour > > meeting. If you have anything to talk about, please put it there too. > > > > [1] https://etherpad.openstack.org/p/telemetry-meeting-agenda > > > > > > Bests, > > > > -- > > ****Trinh Nguyen** > > *www.edlab.xyz * > > > > > > > > -- > > Rafael Weingärtner > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekuvaja at redhat.com Thu May 9 07:53:51 2019 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Thu, 9 May 2019 08:53:51 +0100 Subject: [Glance] No team meeting today Message-ID: Hi all, There is no agenda items proposed for todays meeting and I'm still traveling after the Summit/PTG so we will not have weekly meeting today. Lets resume to the normal from next week onwards. Thanks all! Best, Erno "jokke_" Kuvaja From mrunge at matthias-runge.de Thu May 9 08:23:58 2019 From: mrunge at matthias-runge.de (Matthias Runge) Date: Thu, 9 May 2019 10:23:58 +0200 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> Message-ID: <20190509082357.GA3547@hilbert.berg.ol> On Wed, May 08, 2019 at 09:45:38PM -0300, Rafael Weingärtner wrote: > Having said that, what puzzled me, and worried us, is the fact that > features that work are being removed from a project just because some > contributors/committers left the community. There wasn't (at least I did > not see) a good technical reason to remove this feature (e.g. it does not If I remember correctly, it was the other way around. The idea was to make things cleaner: ceilometer to just gather data and to send it along, gnocchi for storage, panko for events, etc. > deliver what is promised, or an alternative solution has been created > somewhere and effort is being concentrated there, nobody uses it, and so > on). If the features were broken, and there were no people to fix it, I > would understand, but that is not the case. The feature works, and it > delivers what is promised. Moreover, reading the blog you referenced does > not provide a good feeling about how the community has managed the event > (the project losing part of its contributors) in question. OpenSource has > cycles, and it is understandable that sometimes we do not have many people > working on something. OpenSource projects have cycles, and that is normal. > As you can see, now there would be us starting/trying to engage with the > Telemetry project/community. 
What is hard for us to understand is that the > contributors while leaving are also "killing" the project by removing part > of its features (that are very interesting and valuable for us). So, let's take your understanding what/how OpenSource works aside, please. I am sure, nobody is trying to kill their baby when leaving a project. > > Why is that important for us? > When we work with OpenSource we now that we might need to put effort to > customize/adapt things to our business workflow, and we expect that the > community will be there to receive and discuss these changes. Therefore, we > have predictability that the software/system we base our business will be > there, and we can contribute back to improve it. An open source community > could and should live even if the project has no community for a while, > then if people regroup and start to work on it again, the community is able > to flourish. Right. We're at the point "after no community", and it is up to the community to start something new, taking over the corresponding code (if they choose to do so). Matthias -- Matthias Runge From mrunge at matthias-runge.de Thu May 9 08:35:58 2019 From: mrunge at matthias-runge.de (Matthias Runge) Date: Thu, 9 May 2019 10:35:58 +0200 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> Message-ID: <20190509083558.GB3547@hilbert.berg.ol> On Thu, May 09, 2019 at 07:24:43AM +0000, Tim Bell wrote: > Is it time to rethink the approach to telemetry a bit? > > Having each project provide its telemetry data (such as Swift with statsd - https://docs.openstack.org/swift/latest/admin/objectstorage-monitoring.html > or using a framework like Prometheus)? > > In the end, the projects are the ones who have the best knowledge of how to get the metrics. > > Tim Yes please! I'd have some ideas, here. Prometheus has been mentioned so many times now as a requirement/request. There are also other projects to mention here, such as collectd, or OPNFV Barometer. Unfortunately, having a meetig at 4 am in the morning does not really work for me. May I kindly request to move the meeting to a more friendly hour? -- Matthias Runge From witold.bedyk at suse.com Thu May 9 08:42:39 2019 From: witold.bedyk at suse.com (Witek Bedyk) Date: Thu, 9 May 2019 10:42:39 +0200 Subject: [telemetry][monasca][self-healing] Team meeting agenda for tomorrow In-Reply-To: <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> Message-ID: <07e6e24d-4b80-8a97-077e-e6e9b39ba15e@suse.com> Agree. Instrumenting the code is the most efficient and recommended way to monitor the applications. We have discussed it during the Self-healing SIG PTG session last week. The problem is that telemetry topic is not and never will be high priority for individual projects so the coordination effort from community is required here. I thinks this is one of the areas where Telemetry and Monasca teams could work together on. Cheers Witek On 5/9/19 9:24 AM, Tim Bell wrote: > Is it time to rethink the approach to telemetry a bit? > > Having each project provide its telemetry data (such as Swift with > statsd - > https://docs.openstack.org/swift/latest/admin/objectstorage-monitoring.html > > or using a framework like Prometheus)? 
> > In the end, the projects are the ones who have the best knowledge of how > to get the metrics. > > Tim > > *From: *Rafael Weingärtner > *Date: *Thursday, 9 May 2019 at 02:51 > *To: *Joseph Davis > *Cc: *openstack-discuss , Trinh > Nguyen > *Subject: *Re: [telemetry] Team meeting agenda for tomorrow > > Unfortunately, I have a conflict at that time and will not be able > to attend. > > I do have a little bit of context on the Events deprecation to share. > > First, you will note the commit message from the commit [0] when > Events were deprecated: > > " > > Deprecate event subsystem > > This subsystem has never been finished and is not maintained. > > Deprecate it for future removal. > > " > > I got the impression from jd at the time that there were a number of > features in Telemetry, > > including Panko, that were not really "finished" and that the > engineers who had worked on them > > had moved on to other things, so the features had become > unsupported.  In late 2018 there was > > an effort to clean up things that were not well maintained or didn't > fit the direction of Telemetry. > > See also: > https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/ > > Thanks for the reply Joseph, > > I have seen the commit message, and I also read the blog you referenced > (and other pages related to the same topic) which made us a bit worried. > I will try to explain our perspective and impressions when we read those > blog pages. It is also worth noting that we have just started engaging > with the OpenStack community (so, pardon my ignorance with some parts of > OpenStack, and how this OpenSource community works). We are already > making some contributions to Kolla-ansible, and we want to start to > contribute back to Telemetry as well. > > Before getting to the topic of Telemetry, and to be more precise, > Ceilometer, let me state that I have taken part in other OpenSource > projects and communities before, but these communities are managed by a > different organization. > > So, Ceilometer; when we were designing and building our OpenStack Cloud, > where billing is a crucial part of it. Ceilometer was chosen because it > fits our requirements, working "out of the box" to provide valuable data > for billing in a high availability fashion. It for sure lacks some > features, but that is ok when one works with OpenSource. The pollers and > event managers we are missing, we would like to create and contribute > back to the community. > > Having said that, what puzzled me, and worried us, is the fact that > features that work are being removed from a project just because some > contributors/committers left the community. There wasn't (at least I did > not see) a good technical reason to remove this feature (e.g. it does > not deliver what is promised, or an alternative solution has been > created somewhere and effort is being concentrated there, nobody uses > it, and so on). If the features were broken, and there were no people to > fix it, I would understand, but that is not the case. The feature works, > and it delivers what is promised. Moreover, reading the blog you > referenced does not provide a good feeling about how the community has > managed the event (the project losing part of its contributors) in > question. OpenSource has cycles, and it is understandable that sometimes > we do not have many people working on something. OpenSource projects > have cycles, and that is normal. 
As you can see, now there would be us > starting/trying to engage with the Telemetry project/community. What is > hard for us to understand is that the contributors while leaving are > also "killing" the project by removing part of its features (that are > very interesting and valuable for us). > > Why is that important for us? > When we work with OpenSource we now that we might need to put effort to > customize/adapt things to our business workflow, and we expect that the > community will be there to receive and discuss these changes. Therefore, > we have predictability that the software/system we base our business > will be there, and we can contribute back to improve it. An open source > community could and should live even if the project has no community for > a while, then if people regroup and start to work on it again, the > community is able to flourish. > > Events is one feature that often gets requested, but the use cases > and demand for it are not expressed > > strongly or well understood by most people.  If the Telemetry > project has demand to de-deprecate > > Event handling (including Panko), I'd suggest a review of the > requirements for event handling and > > possibly choosing a champion for maintaining the Panko service. > > Also note: over in Monasca we have a spec [1] for handling Events > ingestion which I hope we will be > > completing in Train.  Contributions and comments welcome. :) > > joseph > > [0] > https://github.com/openstack/ceilometer/commit/8a0245a5b3e1357d35ad6653be37ca01176577e4 > > [1] > https://github.com/openstack/monasca-specs/blob/master/specs/stein/approved/monasca-events-listener.rst > > It is awesome that you might have a similar spec (not developed yet) for > Monasca, but the question would remain for us. One, two, or three years > from now, what will happen if you, your team, or the people behind this > spec/feature decide to leave the community? Will this feature be removed > from Monasca too? > > On Wed, May 8, 2019 at 6:23 PM Joseph Davis > wrote: > > On 5/8/19 7:12 AM, openstack-discuss-request at lists.openstack.org > wrote: > > Hello Trinh, > > Where does the meeting happen? Will it be via IRC Telemetry channel? Or, in > > the Etherpad (https://etherpad.openstack.org/p/telemetry-meeting-agenda)? I > > would like to discuss and understand a bit better the context behind > > the Telemetry > > events deprecation. > > Unfortunately, I have a conflict at that time and will not be able > to attend. > > I do have a little bit of context on the Events deprecation to share. > > First, you will note the commit message from the commit [0] when > Events were deprecated: > > " > > Deprecate event subsystem > > This subsystem has never been finished and is not maintained. > > Deprecate it for future removal. > > " > > I got the impression from jd at the time that there were a number of > features in Telemetry, > > including Panko, that were not really "finished" and that the > engineers who had worked on them > > had moved on to other things, so the features had become > unsupported.  In late 2018 there was > > an effort to clean up things that were not well maintained or didn't > fit the direction of Telemetry. > > See also: > https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/ > > Events is one feature that often gets requested, but the use cases > and demand for it are not expressed > > strongly or well understood by most people.  
If the Telemetry > project has demand to de-deprecate > > Event handling (including Panko), I'd suggest a review of the > requirements for event handling and > > possibly choosing a champion for maintaining the Panko service. > > Also note: over in Monasca we have a spec [1] for handling Events > ingestion which I hope we will be > > completing in Train.  Contributions and comments welcome. :) > > joseph > > [0] > https://github.com/openstack/ceilometer/commit/8a0245a5b3e1357d35ad6653be37ca01176577e4 > > [1] > https://github.com/openstack/monasca-specs/blob/master/specs/stein/approved/monasca-events-listener.rst > > On Wed, May 8, 2019 at 12:19 AM Trinh Nguyen wrote: > > Hi team, > > As planned, we will have a team meeting at 02:00 UTC, May 9th on > > #openstack-telemetry to discuss what we gonna do for the > next milestone > > (Train-1) and continue what we left off from the last meeting. > > I put here [1] the agenda thinking that it should be fine > for an hour > > meeting. If you have anything to talk about, please put it > there too. > > [1] https://etherpad.openstack.org/p/telemetry-meeting-agenda > > Bests, > > -- > > ****Trinh Nguyen** > > *www.edlab.xyz > * > > > > -- > > Rafael Weingärtner > From jose.castro.leon at cern.ch Thu May 9 08:45:52 2019 From: jose.castro.leon at cern.ch (Jose Castro Leon) Date: Thu, 9 May 2019 08:45:52 +0000 Subject: [watcher][qa] Thoughts on performance testing for Watcher In-Reply-To: <201905081419177826734@zte.com.cn> References: 6409b4e4-29af-da6d-1af6-a0d6e753049c@gmail.com <201905081419177826734@zte.com.cn> Message-ID: Hi, Actually, we are working on providing such feature in combination with aardvark. The idea is to create a strategy that fills up with preemptible resources, that later on could be reclaimed by aardvark if a normal instance is deployed. https://www.openstack.org/summit/berlin-2018/summit-schedule/events/22248/towards-fully-automated-cern-private-cloud https://www.openstack.org/summit/denver-2019/summit-schedule/events/23187/improving-resource-availability-in-cern-private-cloud Cheers Jose Castro Leon CERN Cloud Team On Wed, 2019-05-08 at 14:19 +0800, li.canwei2 at zte.com.cn wrote: another note, Watcher provides a WORKLOAD optimization(balancing or consolidation). If you want to maximize the node resource (such as vCPU, Ram...) usage through VM migration, Watcher doesn't have such a strategy now. Thanks! licanwei 原始邮件 发件人:MattRiedemann 收件人:openstack-discuss at lists.openstack.org ; 日 期 :2019年05月08日 04:57 主 题 :[watcher][qa] Thoughts on performance testing for Watcher Hi, I'm new to Watcher and would like to do some performance and scale testing in a simulated environment and wondering if anyone can give some pointers on what I could be testing or looking for. If possible, I'd like to be able to just setup a single-node devstack with the nova fake virt driver which allows me to create dozens of fake compute nodes. I could also create multiple cells with devstack, but there gets to be a limit with how much you can cram into a single node 8GB RAM 8VCPU VM (I could maybe split 20 nodes across 2 cells). I could then create dozens of VMs to fill into those compute nodes. I'm mostly trying to figure out what could be an interesting set of tests. The biggest problem I'm trying to solve with Watcher is optimizing resource utilization, i.e. once the computes hit the Tetris problem and there is some room on some nodes but none of the nodes are fully packed. 
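A rough way to quantify that packing state before and after an action plan runs (an illustrative sketch only, assuming python-novaclient and an existing keystoneauth1 session named sess, neither of which appear in the original mail) is to compare used and free capacity per hypervisor:

    # Illustrative sketch: summarise how tightly instances are packed,
    # assuming python-novaclient and an authenticated session `sess`.
    from novaclient import client as nova_client

    def packing_report(sess):
        nova = nova_client.Client('2.1', session=sess)
        hosts = nova.hypervisors.list()
        used_vcpus = sum(h.vcpus_used for h in hosts)
        total_vcpus = sum(h.vcpus for h in hosts)
        empty_hosts = sum(1 for h in hosts if h.running_vms == 0)
        print('vCPU usage: %d/%d, hosts with no VMs: %d/%d'
              % (used_vcpus, total_vcpus, empty_hosts, len(hosts)))

Comparing the number of completely free hosts before and after the consolidation strategy gives a simple success metric, independent of how long the (noop) live migrations take.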
I was thinking I could simulate this by configuring nova so it spreads rather than packs VMs onto hosts (or just use the chance scheduler which randomly picks a host), using VMs of varying sizes, and then run some audit / action plan (I'm still learning the terminology here) to live migrate the VMs such that they get packed onto as few hosts as possible and see how long that takes. Naturally with devstack using fake nodes and no networking on the VMs, that live migration is basically a noop, but I'm more interested in profiling how long it takes Watcher itself to execute the actions. Once I get to know a bit more about how Watcher works, I could help with optimizing some of the nova-specific stuff using placement [1]. Any advice or guidance here would be appreciated. [1] https://review.opendev.org/#/c/656448/ -- Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Thu May 9 09:02:15 2019 From: zigo at debian.org (Thomas Goirand) Date: Thu, 9 May 2019 11:02:15 +0200 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> Message-ID: <1894ef89-ea11-0d31-4820-dc1c39ed07b7@debian.org> On 5/9/19 9:24 AM, Tim Bell wrote: > Is it time to rethink the approach to telemetry a bit? > > Having each project provide its telemetry data (such as Swift with > statsd - > https://docs.openstack.org/swift/latest/admin/objectstorage-monitoring.html > > or using a framework like Prometheus)? > > In the end, the projects are the ones who have the best knowledge of how > to get the metrics. > > Tim Tim, statsd for swift is for monitoring, it is *not* a usage metric. Likewise with Prometheus, who wont care if some data are missing. I very much would love to have each project handle metrics collection by themselves. Especially, I always though that the polling system implemented in Ceilometer is just wrong, and that every service must be able to report itself rather than being polled. I understand however that doing polling is easier than implementing such change in every service, so I get why it has been done this way. But then we need some kind of timeseries framework within OpenStack as a whole (through an Oslo library?), and also we must decide on a backend. Right now, the only serious thing we have is Gnocchi, since influxdb is gone through the open core model. Or do you have something else to suggest? Cheers, Thomas Goirand (zigo) From balazs.gibizer at ericsson.com Thu May 9 09:19:02 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Thu, 9 May 2019 09:19:02 +0000 Subject: [nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band In-Reply-To: References: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> <1556919312.16566.2@smtp.office365.com> <5f87ea30-0bdf-31a4-a3f5-0e9d201b3665@gmail.com> <1556989044.27606.0@smtp.office365.com> Message-ID: <1557393539.17816.4@smtp.office365.com> On Wed, May 8, 2019 at 6:18 PM, Matt Riedemann wrote: > On 5/4/2019 11:57 AM, Balázs Gibizer wrote: >> The failure to detach a port via nova while the nova-compute is down >> could be a bug on nova side. > > Depends on what you mean by detach. 
If the compute is down while > deleting the server, the API will still call the (internal to nova) > network API code [1] to either (a) unbind ports that nova didn't > create or (2) delete ports that nova did create. This sentence based on the reported bug [2]. The reason while Octavia is unbinding the port in Neutron instead of via Nova is that Nova fails to detach the interface and unbind the port if the nova-compute is down. In that bug we discussing if it would be meaningful to do a local interface detach (unvind port in neutron + deallocate port resource in placement) in the nova-api if the compute is done similar to the local server delete. [2] https://bugs.launchpad.net/nova/+bug/1827746 > > For the policy change where the port has to be unbound to delete it, > we'd already have support for that, it's just an extra step. > > At the PTG I was groaning a bit about needing to add another step to > delete a port from the nova side, but thinking about it more we have > to do the exact same thing with cinder volumes (we have to detach > them before deleting them), so I guess it's not the worst thing ever. As soon as somebody from Neutron states that the neutron policy patch is on the way I can start working on the Nova side of this. Cheers, gibi > > [1] > https://protect2.fireeye.com/url?k=56f34fb5-0a7a9599-56f30f2e-0cc47ad93da2-193a4612d9e0575f&u=https://github.com/openstack/nova/blob/56fef7c0e74d7512f062c4046def10401df16565/nova/compute/api.py#L2291 > > -- > > Thanks, > > Matt > From geguileo at redhat.com Thu May 9 09:28:28 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Thu, 9 May 2019 11:28:28 +0200 Subject: Baremetal attach volume in Multi-tenancy In-Reply-To: References: Message-ID: <20190509092828.g6qvdg5jbvqqvpba@localhost> On 08/05, zack chen wrote: > Hi, > I am looking for a mechanism that can be used for baremetal attach volume > in a multi-tenant scenario. In addition we use ceph as the backend storage > for cinder. > > Can anybody give me some advice? Hi, Is this a stand alone Cinder deployment or a normal Cinder in OpenStack deployment? What storage backend will you be using? What storage protocol? iSCSI, FC, RBD...? Depending on these you can go with Walter's suggestion of using cinderclient and its extension (which in general is the best way to go), or you may prefer writing a small python script that uses OS-Brick and makes the REST API calls directly. Cheers, Gorka. From sylvain.bauza at gmail.com Thu May 9 09:28:42 2019 From: sylvain.bauza at gmail.com (Sylvain Bauza) Date: Thu, 9 May 2019 11:28:42 +0200 Subject: [nova][CI] GPUs in the gate In-Reply-To: References: <3587e05d-deab-42ad-9a02-4312ca11760f@www.fastmail.com> <20190508132709.xgq6nz3mqkfw3q5d@yuggoth.org> Message-ID: Le mer. 8 mai 2019 à 20:27, Artom Lifshitz a écrit : > On Wed, May 8, 2019 at 9:30 AM Jeremy Stanley wrote: > > Long shot, but since you just need the feature provided and not the > > performance it usually implies, are there maybe any open source > > emulators which provide the same instruction set for conformance > > testing purposes? > > Something like that exists for network cards. It's called netdevsim > [1], and it's been mentioned in the SRIOV live migration spec [2]. > However to my knowledge nothing like that exists for GPUs. > > libvirt provides us a way to fake mediated devices attached to instances but we still need to lookup sysfs for either knowing all the physical GPUs or creating a new mdev so that's where it's not possibleto have an emulator AFAICU. 
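The sysfs lookups referred to there follow the kernel's mediated-device layout; as a rough sketch of the interface (an illustration, not code taken from nova), discovering mdev-capable devices looks something like this:

    # Rough sketch of the vfio-mdev sysfs interface mentioned above;
    # paths follow the kernel mediated-device documentation.
    import glob
    import os

    def mdev_capable_devices():
        devices = {}
        pattern = '/sys/class/mdev_bus/*/mdev_supported_types/*'
        for type_dir in glob.glob(pattern):
            parent = type_dir.split('/')[4]          # e.g. 0000:84:00.0
            with open(os.path.join(type_dir, 'available_instances')) as f:
                available = int(f.read().strip())
            devices.setdefault(parent, []).append(
                (os.path.basename(type_dir), available))
        return devices

    # Creating an mdev means writing a generated UUID into
    # <type_dir>/create, which is the step noted above as hard to
    # emulate without a real parent device and driver.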
-Sylvain [1] > https://www.phoronix.com/scan.php?page=news_item&px=Linux-4.16-Networking > [2] > https://specs.openstack.org/openstack/nova-specs/specs/train/approved/libvirt-neutron-sriov-livemigration.html#testing > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kchamart at redhat.com Thu May 9 11:55:46 2019 From: kchamart at redhat.com (Kashyap Chamarthy) Date: Thu, 9 May 2019 13:55:46 +0200 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: References: Message-ID: <20190509115546.GG28897@paraplu> On Sat, May 04, 2019 at 07:19:48PM -0600, Morgan Fainberg wrote: > On Sat, May 4, 2019, 16:48 Eric Fried wrote: > > > (NB: I tagged [all] because it would be interesting to know where other > > teams stand on this issue.) > > > > Etherpad: https://etherpad.openstack.org/p/nova-ptg-train-governance [Thanks for the summary, Eric; I couldn't be at that session due to a conflict.] > > Summary: > > - There is a (currently unwritten? at least for Nova) rule that a patch > > should not be approved exclusively by cores from the same company. This > > is rife with nuance, including but not limited to: > > - Usually (but not always) relevant when the patch was proposed by > > member of same company > > - N/A for trivial things like typo fixes > > - The issue is: > > - Should the rule be abolished? and/or > > - Should the rule be written down? > > [...] > we opted to really lean on "Overall, we should be able to trust cores > to act in good faith". Indeed. IME, this is what other mature open source projects do (e.g. Linux and QEMU, which are in the "critical path" to Nova and OpenStack). FWIW, over the past six years, I've seen plenty of cases on 'qemu-devel' (the upstream development list fo the QEMU project) and on KVM list, where a (non-trivial) patch contribution from company-X is merged by maintainers from company-X. Of course, there is the implicit trust in that the contributor is acting in upstream's best interests first. (If not, course-correct and educate.) - - - This Nova "rule" (which, as Eric succintly put it, is "rife with nuance") doesn't affect me much, if at all. But allow me share my stance: I'm of course all for diverse set of opinions and reviews from different companies as much as posisble, which I consider super healthy. So long as there are no overly iron-clad "rules" that are "unbendable". What _should_ raise a red flag is a _pattern_. E.g. Developer-A from the company Pied Piper uploads a complex change, within a couple of days (or worse, even shorter), two more upstream "core" reivewers from Pied Piper, who are in the know about the change, pile on it and approve without giving sufficient time for other community reviewers to catch-up. (Because: "hey, we need to get Pied Piper's 'priority feature' into the current release, to get that one-up against the competitor!") *That* kind of behaviour should be called out and rebuked. _However_. 
If: - a particular (non-trivial) change is publicly advertized well-enough (2 weeks or more), for the community developers to catch up; - all necessary details, context and use cases are described clearly in the open, without any assumptions; - if you've checked with other non-Pied Piper "cores" if they have any strong opinions, and gave them the time to chime in; - if the patch receives negative comments, address it without hand-waving, explaining in _every_ detail that isn't clear to non-Pied Piper reviewers, so that in the end they can come to a clear conclusion whether it's right or not; - you are working in the best interests of upstream, _even if_ it goes against your downstream's interests; i.e. you're sincere and sensible when operating with your "upstream hat". Then, "core" reviewers from Pied Piper _should_ be able to merge a contribution from Pied Piper (or someone else), without a nauseating feeling of "you're just one 'wrong sneeze' away from being implicated of 'trust erosion'!" or any artificial "procedural blockers". Of course, this requires a "heightend sense of awareness", and doing that delicate tango of balancing "upstream" vs. "downstream" hats. And I'd like to imagine contributors and maintainers are constantly striving towards it. [...] -- /kashyap From tobias.rydberg at citynetwork.eu Thu May 9 12:01:16 2019 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Thu, 9 May 2019 14:01:16 +0200 Subject: [sigs][publiccloud][publiccloud-wg] Reminder meeting this afternoon for Public Cloud WG/SIG Message-ID: <1887c685-4404-39b9-7428-15792b87a80f@citynetwork.eu> Hi all, This is a reminder for todays meeting for the Public Cloud WG/SIG - 1400 UTC in #openstack-publiccloud. Agenda at: https://etherpad.openstack.org/p/publiccloud-wg See you all later! Cheers, Tobias -- Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED From thierry at openstack.org Thu May 9 12:10:03 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 9 May 2019 14:10:03 +0200 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> Message-ID: Jeremy Stanley wrote: > [...] > It's still unclear to me why we're doing this at all. Our stable > constraints lists are supposed to be a snapshot in time from when we > released, modulo stable point release updates of the libraries we're > maintaining. Agreeing to bump random dependencies on stable branches > because of security vulnerabilities in them is a slippery slope > toward our users expecting the project to be on top of vulnerability > announcements for every one of the ~600 packages in our constraints > list. Deployment projects already should not depend on our > requirements team tracking security vulnerabilities, so need to have > a mechanism to override constraints entries anyway if they're making > such guarantees to their users (and I would also caution against > doing that too). > > Distributions are far better equipped than our project to handle > such tracking, as they generally get advance notice of > vulnerabilities and selectively backport fixes for them. 
Trying to > accomplish the same with a mix of old and new dependency versions in > our increasingly aging stable and extended maintenance branches > seems like a disaster waiting to happen. I agree it is a bit of a slippery slope... We historically did not do that (stable branches are a convenience, not a product), because it is a lot of work to track and test vulnerable dependencies across multiple stable branches in a comprehensive manner. Why update requests 2.18.4 for CVE-2018-18074, and not Jinja2 2.10.0 for CVE-2019-8341 ? I'm not sure doing it on a case-by-case basis is a good idea either, as it might set unreasonable expectations. -- Thierry Carrez (ttx) From thierry at openstack.org Thu May 9 12:24:31 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 9 May 2019 14:24:31 +0200 Subject: [tc] Proposal: restrict TC activities In-Reply-To: <20190503204942.GB28010@shipstone.jp> References: <20190503204942.GB28010@shipstone.jp> Message-ID: <0b579079-265e-c1ee-bd87-261566b1a6af@openstack.org> Emmet Hikory wrote: > [...] As such, I suggest that the Technical Committee be > restricted from actually doing anything beyond approval of merges to the > governance repository. If you look at the documented role of the TC[1], you'll see that it is mostly focused on deciding on proposed governance (or governance repository) changes. The only section that does not directly translate into governance change approval is "Ensuring a healthy, open collaboration", which is about tracking that the project still lives by its documented values, principles and rules -- activities that I think should also remain with the TC. [1] https://governance.openstack.org/tc/reference/role-of-the-tc.html Beyond that, it is true that some Technical Committee members are involved in driving other initiatives (including *proposing* governance changes), but I'd say that they do it like any other community member could. While I think we should (continue to) encourage participation in governance from other people, and ensure a healthy turnover level in TC membership, I don't think that we should *restrict* TC members from voluntarily doing things beyond approving changes. -- Thierry Carrez (ttx) From mark at stackhpc.com Thu May 9 12:26:40 2019 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 9 May 2019 13:26:40 +0100 Subject: kolla-ansible pike - nova_compute containers not starting In-Reply-To: References: Message-ID: On Wed, 8 May 2019 at 16:07, Shyam Biradar wrote: > Hi, > > I am setting up all-in-one ubuntu based kolla-ansible pike openstack. > > Deployment is failing at following ansible task: > TASK [nova : include_tasks] > ********************************************************************************************************************** > included: > /root/virtnev/share/kolla-ansible/ansible/roles/nova/tasks/discover_computes.yml > for localhost > > TASK [nova : Waiting for nova-compute service up] > ************************************************************************************************ > FAILED - RETRYING: Waiting for nova-compute service up (20 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (19 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (18 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (17 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (16 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (15 retries left). 
> FAILED - RETRYING: Waiting for nova-compute service up (14 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (13 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (12 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (11 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (10 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (9 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (8 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (7 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (6 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (5 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (4 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (3 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (2 retries left). > FAILED - RETRYING: Waiting for nova-compute service up (1 retries left). > fatal: [localhost -> localhost]: FAILED! => {"attempts": 20, "changed": > false, "cmd": ["docker", "exec", "kolla_toolbox", "openstack", > "--os-interface", "internal", "--os-auth-url", " > http://192.168.122.151:35357", "--os-identity-api-version", "3", > "--os-project-domain-name", "default", "--os-tenant-name", "admin", > "--os-username", "admin", "--os-password", > "ivpu1km8qxnVQESvAF4cyTFstOvrbxGUHjFF15gZ", "--os-user-domain-name", > "default", "compute", "service", "list", "-f", "json", "--service", > "nova-compute"], "delta": "0:00:02.555356", "end": "2019-05-02 > 09:24:45.485786", "rc": 0, "start": "2019-05-02 09:24:42.930430", "stderr": > "", "stderr_lines": [], "stdout": "[]", "stdout_lines": ["[]"]} > > -------------------------------------------------------------------- > > I can see following stack trace in nova-compute container log > > 4. 2019-05-02 08:21:30.522 7 INFO nova.service [-] Starting compute node > (version 16.1.7) > 2019-05-02 08:21:30.524 7 ERROR oslo_service.service [-] Error starting > thread.: PlacementNotConfigured: This compute is not configured to talk to > the placement service. Configure the [placement] section of nova.conf and > restart the service. > 2019-05-02 08:21:30.524 7 ERROR oslo_service.service Traceback (most > recent call last): > 2019-05-02 08:21:30.524 7 ERROR oslo_service.service File > "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_service/service.py", > line 721, in run_service > 2019-05-02 08:21:30.524 7 ERROR oslo_service.service service.start() > 2019-05-02 08:21:30.524 7 ERROR oslo_service.service File > "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/service.py", > line 156, in start > 2019-05-02 08:21:30.524 7 ERROR oslo_service.service > self.manager.init_host() > 2019-05-02 08:21:30.524 7 ERROR oslo_service.service File > "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py", > line 1155, in init_host > 2019-05-02 08:21:30.524 7 ERROR oslo_service.service raise > exception.PlacementNotConfigured() > 2019-05-02 08:21:30.524 7 ERROR oslo_service.service > PlacementNotConfigured: This compute is not configured to talk to the > placement service. Configure the [placement] section of nova.conf and > restart the service. 
> 2019-05-02 08:21:30.524 7 ERROR oslo_service.service > 2019-05-02 08:21:59.229 7 INFO os_vif [-] Loaded VIF plugins: ovs, > linux_bridge > --------------------------------------------------------------------- > > I saw nova-compute nova.conf has [placement] section configured well and > it's same as nova_api's placement section. > Other nova containers are started well. > Hi Shyam, The nova code has this: # NOTE(sbauza): We want the compute node to hard fail if it can't be # able to provide its resources to the placement API, or it would not # be able to be eligible as a destination. if CONF.placement.os_region_name is None: raise exception.PlacementNotConfigured() Do you have the os_region_name option set in [placement] in nova.conf? > Any thoughts? > [image: logo] > *Shyam Biradar* * Software Engineer | DevOps* > M +91 8600266938 | shyam.biradar at trilio.io | trilio.io > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chkumar246 at gmail.com Thu May 9 12:30:55 2019 From: chkumar246 at gmail.com (Chandan kumar) Date: Thu, 9 May 2019 18:00:55 +0530 Subject: [tripleo][openstack-ansible] collaboration on os_tempest role update 21 - May 09, 2019 Message-ID: Hello, Here is the 21th update (Apr 24 to May 09, 2019) on collaboration on os_tempest[1] role between TripleO and OpenStack-Ansible projects. Due to Denver Train PTG, we have skipped the last week report. Highlights of Update 21: * We removed install_test_requirements flag in os_tempest as all the tempest plugins have their requirements specified in requirements.txt, so let's use that instead of using test_requirements.txt also. * All the upstream TripleO CI standalone base, scenario1-4 and puppet standalone jobs are running tempest using os_tempest. Thanks to Arx for porting jobs to os_tempest and odyssey4me for bring back the gate alive. 
>From Denver Train PTG: Wes made a nice os_tempest tripleo asci video: https://asciinema.org/a/rm7LDAs6RAR1xh7oQrp07LaeR OSA project update from summit: https://www.youtube.com/watch?v=JZet1uNAr_o&t=868s Things got merged: os_tempest: * Remove install_test_requirements flag - https://review.opendev.org/657778 * Temporarily set bionic job to non-voting - https://review.opendev.org/657833 os_cinder: * Set glance_api_version=2 in cinder.conf - https://review.opendev.org/653308 Tripleo: * Enable os_tempest in baremetal-full-overcloud-validate playbook - https://review.opendev.org/652983 * Set gather_facts to false while calling tempest playbook - https://review.opendev.org/653702 * Port tripleo-ci-centos-7-scenario001-standalone to os_tempest - https://review.opendev.org/655870 * Port scenario002-standalone-master to os_tempest - https://review.opendev.org/656259 * Port puppet-keystone-tripleo-standalone to os_tempest - https://review.opendev.org/656474 * Port puppet-swift-tripleo-standalone to os_tempest - https://review.opendev.org/656481 * Port puppet-nova-tripleo-standalone to os_tempest - https://review.opendev.org/656480 * Port puppet-neutron-tripleo-standalone job to os_tempest - https://review.opendev.org/656479 * Switch scenario003-standalone job to use os_tempest - https://review.opendev.org/656290 * Port scenario004-standalone-master to os_tempest - https://review.opendev.org/656291 * Port puppet-horizon-tripleo-standalone to os_tempest - https://review.opendev.org/656758 * Port puppet-cinder-tripleo-standalone to os_tempest - https://review.opendev.org/656752 * Port puppet-glance-tripleo-standalone to os_tempest - https://review.opendev.org/656753 * Port standalone job to os_tempest - https://review.opendev.org/656748 Things in progress: os_tempest: * Replace tempestconf job with aio_distro_metal-tempestconf job - https://review.opendev.org/657359 * Update openstack.org -> opendev.org - https://review.opendev.org/654942 * Make smoke tests as a default whitelist tests - https://review.opendev.org/652060 on TripleO/OSA side, we will be working on enabling heat tempest plugin support. Here is the 20th update [2]. Have queries, Feel free to ping us on #tripleo or #openstack-ansible channel. Links: [1.] http://opendev.org/openstack/openstack-ansible-os_tempest [2.] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/005563.html Thanks, Chandan Kumar From witold.bedyk at suse.com Thu May 9 12:35:01 2019 From: witold.bedyk at suse.com (Witek Bedyk) Date: Thu, 9 May 2019 14:35:01 +0200 Subject: [telemetry][monasca][self-healing] Team meeting agenda for tomorrow In-Reply-To: <1894ef89-ea11-0d31-4820-dc1c39ed07b7@debian.org> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> <1894ef89-ea11-0d31-4820-dc1c39ed07b7@debian.org> Message-ID: <22b57ad4-c737-cef6-b18b-775c0cb9e7a6@suse.com> > But then we need some kind of timeseries framework within OpenStack as a > whole (through an Oslo library?), What would be the requirements and the scope of this framework from your point of view? > and also we must decide on a backend. > Right now, the only serious thing we have is Gnocchi, since influxdb is > gone through the open core model. Or do you have something else to suggest? Monasca can be used as the backend. As TSDB it uses Apache Cassandra with native clustering support or InfluxDB. Monasca uses Apache Kafka as the message queue. It can replicate and partition the measurements into independent InfluxDB instances. 
Additionally Monasca API could act as the load balancer monitoring the healthiness of InfluxDB instances and routing the queries to the assigned shards. We want to work in Train cycle to add upstream all configuration options to allow such setup [1]. Your feedback, comments and contribution are very welcome. Cheers Witek [1] https://storyboard.openstack.org/#!/story/2005620 From jesse at odyssey4.me Thu May 9 12:38:29 2019 From: jesse at odyssey4.me (Jesse Pretorius) Date: Thu, 9 May 2019 12:38:29 +0000 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> Message-ID: > On 9 May 2019, at 13:10, Thierry Carrez wrote: > > I agree it is a bit of a slippery slope... We historically did not do that (stable branches are a convenience, not a product), because it is a lot of work to track and test vulnerable dependencies across multiple stable branches in a comprehensive manner. > > Why update requests 2.18.4 for CVE-2018-18074, and not Jinja2 2.10.0 for CVE-2019-8341 ? > > I'm not sure doing it on a case-by-case basis is a good idea either, as it might set unreasonable expectations. A lot of operators make use of u-c for source-based builds to ensure consistency in the builds and to ensure that they’re using the same packages as those which were tested upstream. It makes sense to collaborate on something this important as far upstream as possible. If we think of this as a community effort similar to the extended maintenance policy - the development community doesn’t *have* to implement the infrastructure to actively monitor for the vulnerabilities and respond to them. It can be maintained on a best effort basis by those interested in doing so. To limit the effort involved we could agree to limit the scope to only allow changes to the current ‘maintained’ releases. For all other branches we can encourage an upgrade to a ‘maintained’ release by adding a release note. To manage the 'unreasonable expectations’, we should document a policy to this effect. From fungi at yuggoth.org Thu May 9 12:43:01 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 9 May 2019 12:43:01 +0000 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: <20190509083558.GB3547@hilbert.berg.ol> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> <20190509083558.GB3547@hilbert.berg.ol> Message-ID: <20190509124300.4f7d7qxprq6osasb@yuggoth.org> On 2019-05-09 10:35:58 +0200 (+0200), Matthias Runge wrote: [...] > Unfortunately, having a meetig at 4 am in the morning does not really > work for me. May I kindly request to move the meeting to a more friendly > hour? The World is round, and your "friendly" times are always someone else's "unfriendly" times. Asking the folks interested in participating in the meeting to agree on a consensus timeslot between them is fair, but please don't characterize someone else's locale as "unfriendly" just because it's on the opposite side of the planet from you. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From thierry at openstack.org Thu May 9 12:49:17 2019 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 9 May 2019 14:49:17 +0200 Subject: [tc][searchlight] What does Maintenance Mode mean for a project? 
In-Reply-To: References: Message-ID: <155f4110-df20-3b23-8c68-700e9c3d66f0@openstack.org> Trinh Nguyen wrote: > Currently, in the project details section of Searchlight page [1], it > says we're in the Maintenance Mode. What does that mean? and how we can > update it? Maintenance mode is a project-team tag that teams can choose to apply to themselves. It is documented at: https://governance.openstack.org/tc/reference/tags/status_maintenance-mode.html If you feel like Searchlight is back to a feature development phase, you can ask for it to be changed by proposing a change to https://opendev.org/openstack/governance/src/branch/master/reference/projects.yaml#L3407 -- Thierry Carrez (ttx) From mriedemos at gmail.com Thu May 9 13:02:47 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 9 May 2019 08:02:47 -0500 Subject: Any ideas on fixing bug 1827083 so we can merge code? Message-ID: I'm not sure what is causing the bug [1] but it's failing at a really high rate for about week now. Do we have ideas on the issue? Do we have thoughts on a workaround? Or should we disable the vexxhost-sjc1 provider until it's solved? [1] http://status.openstack.org/elastic-recheck/#1827083 -- Thanks, Matt From mriedemos at gmail.com Thu May 9 13:20:10 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 9 May 2019 08:20:10 -0500 Subject: [nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band In-Reply-To: <1557393539.17816.4@smtp.office365.com> References: <62ef48e0-9425-9191-a648-c1009c1032b7@fried.cc> <1556919312.16566.2@smtp.office365.com> <5f87ea30-0bdf-31a4-a3f5-0e9d201b3665@gmail.com> <1556989044.27606.0@smtp.office365.com> <1557393539.17816.4@smtp.office365.com> Message-ID: <0e10037f-f193-3752-c96e-7ffb536ea187@gmail.com> On 5/9/2019 4:19 AM, Balázs Gibizer wrote: > This sentence based on the reported bug [2]. The reason while Octavia > is unbinding the port in Neutron instead of via Nova is that Nova fails > to detach the interface and unbind the port if the nova-compute is > down. In that bug we discussing if it would be meaningful to do a local > interface detach (unvind port in neutron + deallocate port resource in > placement) in the nova-api if the compute is done similar to the local > server delete. > > [2]https://bugs.launchpad.net/nova/+bug/1827746 Oh OK I was confusing this with deleting the VM while the compute host was down, not detaching the port from the server while the compute was down. Yeah I'm not sure what we'd want to do there. We could obviously do the same thing we do for VM delete in the API while the compute host is down, but could we be leaking things on the compute host in that case if the VIF was never properly unplugged? I'd think that is already an issue for local delete of the VM in the API if the compute comes back up later (maybe there is something in the compute service on startup that will do cleanup, I'm not sure off the top of my head). 
-- Thanks, Matt From openstack at fried.cc Thu May 9 13:39:07 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 9 May 2019 08:39:07 -0500 Subject: [placement][nova][ptg] resource provider affinity In-Reply-To: <489d8cae-9151-5f43-b495-ad51c959a0ea@intel.com> References: <21aa22e7-be7d-8ecf-b5bd-9c6afcd789f5@fried.cc> <27624C30-2BB6-43DF-9613-783674389C0B@fried.cc> <1556631941.24201.1@smtp.office365.com> <264f10b8-05dc-5280-28af-1f29cae91821@hco.ntt.co.jp> <4aa76244-fce0-86f3-a6f5-cd7f4d8cb2f0@fried.cc> <03922b54-994e-dcae-8543-7c9c2f75b87d@hco.ntt.co.jp> <5fd214e8-4822-53a5-a7d6-622c5133a26f@fried.cc> <1CC272501B5BC543A05DB90AA509DED527557F03@fmsmsx122.amr.corp.intel.com> <1934f31d-da89-071f-d667-c36d965851ae@fried.cc> <489d8cae-9151-5f43-b495-ad51c959a0ea@intel.com> Message-ID: <145f897e-2744-25b6-596f-43c51982044e@fried.cc> Sundar- > Yes. The examples in the storyboard [1] for NUMA affinity use group > numbers. If that were recast to use named groups, and we wanted NUMA > affinity apart from device colocation, would that not require a > different name than T? In short, if you want to express 2 different > affinities/groupings, perhaps we need to use a name with 2 parts, and > use 2 different same_subtree clauses. Just pointing out the implications. That's correct. If we wanted two groupings... [repeating diagram for context] CN | +---NIC1 (trait: I_AM_A_NIC) | | | +-----PF1_1 (trait: CUSTOM_PHYSNET1, inventory: VF=4) | | | +-----PF1_2 (trait: CUSTOM_PHYSNET2, inventory: VF=4) | +---NIC2 (trait: I_AM_A_NIC) | +-----PF2_1 (trait: CUSTOM_PHYSNET1, inventory: VF=4) | +-----PF2_2 (trait: CUSTOM_PHYSNET2, inventory: VF=4) ?resources_TA1=VF:1&required_TA1=CUSTOM_PHYSNET1 &resources_TA2=VF:1&required_TA2=CUSTOM_PHYSNET2 &required_TA3=I_AM_A_NIC &same_subtree=','.join([ suffix for suffix in suffixes if suffix.startswith('_TA')]) # (i.e. '_TA1,_TA2,_TA3') &resources_TB1=VF:1&required_TB1=CUSTOM_PHYSNET1 &resources_TB2=VF:1&required_TB2=CUSTOM_PHYSNET2 &required_TB3=I_AM_A_NIC &same_subtree=','.join([ suffix for suffix in suffixes if suffix.startswith('_TB')]) # (i.e. '_TB1,_TB2,_TB3') This would give us four candidates: - One where TA* is under NIC1 and TB* is under NIC2 - One where TB* is under NIC1 and TA* is under NIC2 - One where everything is under NIC1 - One where everything is under NIC2 This of course leads to some nontrivial questions, like: - How do we express these groupings from the operator-/user-facing sources (flavor, port, device_profile, etc.)? Especially when different pieces come from different sources but still need to be affined to each other. This is helped by allowing named as opposed to autonumbered suffixes, which is why we're doing that, but it's still going to be tricky to do in practice. - What if we want to express anti-affinity, i.e. limit the response to just the first two candidates? We discussed being able to say something like same_subtree=_TA3,!_TB3, but decided to defer that design/implementation for now. If you want this kind of thing in Train, you'll have to filter post-Placement. Thanks, efried . 
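For readers following along, those querystrings are plain GET parameters to placement's /allocation_candidates endpoint; here is a small illustrative helper (not from the original mail; the suffix and trait names are taken from the example above) that assembles one group of them:

    # Illustrative helper for building a granular allocation_candidates
    # query like the ones quoted above; not part of the original message.
    from urllib.parse import urlencode

    def affinity_query(groups, same_subtree_suffixes):
        params = []
        for suffix, (resources, traits) in groups.items():
            if resources:
                params.append(('resources' + suffix, resources))
            if traits:
                params.append(('required' + suffix, traits))
        params.append(('same_subtree', ','.join(same_subtree_suffixes)))
        return '/allocation_candidates?' + urlencode(params)

    query = affinity_query(
        {'_TA1': ('VF:1', 'CUSTOM_PHYSNET1'),
         '_TA2': ('VF:1', 'CUSTOM_PHYSNET2'),
         '_TA3': (None, 'I_AM_A_NIC')},
        ['_TA1', '_TA2', '_TA3'])

(urlencode percent-encodes the ':' and ',' characters; the decoded form is identical to the querystring shown in the mail.)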
From fungi at yuggoth.org Thu May 9 13:48:09 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 9 May 2019 13:48:09 +0000 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> Message-ID: <20190509134808.4eqwwjcdxjpt37wh@yuggoth.org> On 2019-05-09 12:38:29 +0000 (+0000), Jesse Pretorius wrote: [...] > A lot of operators make use of u-c for source-based builds to > ensure consistency in the builds and to ensure that they’re using > the same packages as those which were tested upstream. It makes > sense to collaborate on something this important as far upstream > as possible. [...] See, this is what frightens me. We should *strongly* discourage them from doing this, period. If your deployment relies on distribution packages of dependencies then your distro's package maintainers have almost certainly received advance notice of many of these vulnerabilities and have fixes ready for you to download the moment they're made public. They're in most cases selectively backporting the fixes to the versions they carry so as to make them otherwise backward compatible and avoid knock-on effects involving a need to upgrade other transitive dependencies which are not involved in the vulnerability. > If we think of this as a community effort similar to the extended > maintenance policy - the development community doesn’t *have* to > implement the infrastructure to actively monitor for the > vulnerabilities and respond to them. It can be maintained on a > best effort basis by those interested in doing so. By the time we find out and work through the transitive dependency bumps implied by this sort of change (because many of these ~600 dependencies of ours don't backport fixes or maintain multiple stable series of their own and so our only option is to upgrade to the latest version, and this brings with it removal of old features or reliance on newer versions of other transitive dependencies), we're long past public disclosure and the vulnerability has likely been getting exploited in the wild for some time. If a deployer/operator can't rely on our constraints list for having a timely and complete picture of a secure dependency tree then they already need local workarounds which are probably superior regardless. There are also plenty of non-Python dependencies for our software which can have vulnerabilities of their own, and those aren't reflected at all in our constraints lists. How are said users updating those? > To limit the effort involved we could agree to limit the scope to > only allow changes to the current ‘maintained’ releases. For all > other branches we can encourage an upgrade to a ‘maintained’ > release by adding a release note. I still think even that is an abuse of the stable upper constraints lists and in direct conflict with their purpose as a *frozen* snapshot of external dependencies contemporary with the release which allow us to maintain the stability of our test environments for our stable branches. It can't be both that *and* updated with the latest versions of some dependencies because of random bug fixes, security-related or otherwise. > To manage the 'unreasonable expectations’, we should document a > policy to this effect. 
What we should document is that it's unreasonable to attempt to repurpose our stable constraints lists as a security update mechanism for external dependencies, and encourage users to look elsewhere when attempting to find solutions for securing the dependency trees of their deployments. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at fried.cc Thu May 9 13:49:35 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 9 May 2019 08:49:35 -0500 Subject: Any ideas on fixing bug 1827083 so we can merge code? In-Reply-To: References: Message-ID: Have we tried changing the URI to https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt to avoid the redirecting? On 5/9/19 8:02 AM, Matt Riedemann wrote: > I'm not sure what is causing the bug [1] but it's failing at a really > high rate for about week now. Do we have ideas on the issue? Do we have > thoughts on a workaround? Or should we disable the vexxhost-sjc1 > provider until it's solved? > > [1] http://status.openstack.org/elastic-recheck/#1827083 > From fungi at yuggoth.org Thu May 9 13:55:17 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 9 May 2019 13:55:17 +0000 Subject: Any ideas on fixing bug 1827083 so we can merge code? In-Reply-To: References: Message-ID: <20190509135517.7j7ccyyxzp2yneun@yuggoth.org> On 2019-05-09 08:49:35 -0500 (-0500), Eric Fried wrote: > Have we tried changing the URI to > https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt > to avoid the redirecting? > > On 5/9/19 8:02 AM, Matt Riedemann wrote: > > I'm not sure what is causing the bug [1] but it's failing at a really > > high rate for about week now. Do we have ideas on the issue? Do we have > > thoughts on a workaround? Or should we disable the vexxhost-sjc1 > > provider until it's solved? > > > > [1] http://status.openstack.org/elastic-recheck/#1827083 I have to assume the bug report itself is misleading. Jobs should be using the on-disk copy of the requirements repository provided by Zuul for this and not retrieving that file over the network. However the problem is presumably DNS resolution not working at all on those nodes, so something is going to break at some point in the job in those cases regardless. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From witold.bedyk at suse.com Thu May 9 14:07:53 2019 From: witold.bedyk at suse.com (Witek Bedyk) Date: Thu, 9 May 2019 16:07:53 +0200 Subject: [monasca] Monasca PTG sessions summary Message-ID: <9f4aa6a9-c69b-c675-b15c-b17a80b64dde@suse.com> Hello Team, I've put together some of the items we've discussed during the PTG last week [1]. Please add or update if anything important is missing or wrong. Thanks again for all your contributions during the Summit and the PTG. 
Cheers Witek [1] https://wiki.openstack.org/wiki/MonascaTrainPTG From pierre-samuel.le-stang at corp.ovh.com Thu May 9 15:14:28 2019 From: pierre-samuel.le-stang at corp.ovh.com (Pierre-Samuel LE STANG) Date: Thu, 9 May 2019 17:14:28 +0200 Subject: [ops] database archiving tool Message-ID: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> Hi all, At OVH we needed to write our own tool that archive data from OpenStack databases to prevent some side effect related to huge tables (slower response time, changing MariaDB query plan) and to answer to some legal aspects. So we started to write a python tool which is called OSArchiver that I briefly presented at Denver few days ago in the "Optimizing OpenStack at large scale" talk. We think that this tool could be helpful to other and are ready to open source it, first we would like to get the opinion of the ops community about that tool. To sum-up OSArchiver is written to work regardless of Openstack project. The tool relies on the fact that soft deleted data are recognizable because of their 'deleted' column which is set to 1 or uuid and 'deleted_at' column which is set to the date of deletion. The points to have in mind about OSArchiver: * There is no knowledge of business objects * One table might be archived if it contains 'deleted' column * Children rows are archived before parents rows * A row can not be deleted if it fails to be archived Here are features already implemented: * Archive data in an other database and/or file (actually SQL and CSV formats are supported) to be easily imported * Delete data from Openstack databases * Customizable (retention, exclude DBs, exclude tables, bulk insert/delete) * Multiple archiving configuration * Dry-run mode * Easily extensible, you can add your own destination module (other file format, remote storage etc...) * Archive and/or delete only mode It also means that by design you can run osarchiver not only on OpenStack databases but also on archived OpenStack databases. Thanks in advance for your feedbacks. -- Pierre-Samuel Le Stang From mriedemos at gmail.com Thu May 9 15:28:03 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 9 May 2019 10:28:03 -0500 Subject: [nova][ptg] Summary: Extra specs validation In-Reply-To: <07673fec-c193-1031-b9f0-5d32c65cc124@fried.cc> References: <07673fec-c193-1031-b9f0-5d32c65cc124@fried.cc> Message-ID: <17e7e0f8-4604-a845-8749-738f588374c1@gmail.com> On 5/2/2019 11:11 PM, Eric Fried wrote: > - Do it in the flavor API when extra specs are set (as opposed to e.g. > during server create) > - One spec, but two stages: > 1) For known keys, validate values; do this without a microversion. > 2) Validate keys, which entails > - Standard set of keys (by pattern) known to nova > - Mechanism for admin to extend the set for snowflake extra specs > specific to their deployment / OOT driver / etc. > - "Validation" will at least comprise messaging/logging. > - Optional "strict mode" making the operation fail is also a possibility. I don't remember agreeing to one spec with two stages for this. If you want to get something approved in workable in Train, validating the values for known keys is low-hanging-fruit. Figuring out how to validate known keys in a way that allows out of tree extra specs to work is going to be a lot more complicated and rat-holey, so I would personally make those separate efforts and separate specs. 
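To make the "validate values for known keys" piece concrete, here is a rough sketch of such a check (an illustration only, not the proposed nova implementation; the keys are real flavor extra specs but the validation table and strict-mode behaviour are hypothetical):

    # Sketch of value validation for a few well-known flavor extra specs.
    # The table and the strict/non-strict behaviour are hypothetical.
    import re

    KNOWN_SPECS = {
        'hw:cpu_policy': lambda v: v in ('shared', 'dedicated'),
        'hw:numa_nodes': lambda v: v.isdigit() and int(v) > 0,
        'hw:mem_page_size': lambda v: bool(
            re.match(r'^(small|large|any|\d+([KMG]i?B)?)$', v)),
    }

    def validate_extra_specs(extra_specs, strict=False):
        errors = []
        for key, value in extra_specs.items():
            check = KNOWN_SPECS.get(key)
            if check and not check(str(value)):
                errors.append('%s has invalid value %r' % (key, value))
        if errors and strict:
            raise ValueError('; '.join(errors))
        return errors

Unknown keys pass through untouched here, which is roughly why the second stage (validating the keys themselves, with an operator-extensible allow-list) is the harder problem discussed above.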
-- Thanks, Matt From moreira.belmiro.email.lists at gmail.com Thu May 9 15:43:49 2019 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Thu, 9 May 2019 17:43:49 +0200 Subject: [ops] database archiving tool In-Reply-To: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> References: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> Message-ID: Hi Pierre-Samuel, at this point most of the OpenStack projects have their own way to archive/delete soft deleted records. But one thing usually missing is the retention period of soft deleted records and then the archived data. I'm interested to learn more about what you are doing. Is there any link to access the code? Belmiro CERN On Thu, May 9, 2019 at 5:25 PM Pierre-Samuel LE STANG < pierre-samuel.le-stang at corp.ovh.com> wrote: > Hi all, > > At OVH we needed to write our own tool that archive data from OpenStack > databases to prevent some side effect related to huge tables (slower > response > time, changing MariaDB query plan) and to answer to some legal aspects. > > So we started to write a python tool which is called OSArchiver that I > briefly > presented at Denver few days ago in the "Optimizing OpenStack at large > scale" > talk. We think that this tool could be helpful to other and are ready to > open > source it, first we would like to get the opinion of the ops community > about > that tool. > > To sum-up OSArchiver is written to work regardless of Openstack project. > The > tool relies on the fact that soft deleted data are recognizable because of > their 'deleted' column which is set to 1 or uuid and 'deleted_at' column > which > is set to the date of deletion. > > The points to have in mind about OSArchiver: > * There is no knowledge of business objects > * One table might be archived if it contains 'deleted' column > * Children rows are archived before parents rows > * A row can not be deleted if it fails to be archived > > Here are features already implemented: > * Archive data in an other database and/or file (actually SQL and CSV > formats are supported) to be easily imported > * Delete data from Openstack databases > * Customizable (retention, exclude DBs, exclude tables, bulk insert/delete) > * Multiple archiving configuration > * Dry-run mode > * Easily extensible, you can add your own destination module (other file > format, remote storage etc...) > * Archive and/or delete only mode > > It also means that by design you can run osarchiver not only on OpenStack > databases but also on archived OpenStack databases. > > Thanks in advance for your feedbacks. > > -- > Pierre-Samuel Le Stang > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mthode at mthode.org Thu May 9 15:54:55 2019 From: mthode at mthode.org (Matthew Thode) Date: Thu, 9 May 2019 10:54:55 -0500 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190509134808.4eqwwjcdxjpt37wh@yuggoth.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190509134808.4eqwwjcdxjpt37wh@yuggoth.org> Message-ID: <20190509155455.7wkszge3e7bykgsj@mthode.org> On 19-05-09 13:48:09, Jeremy Stanley wrote: > On 2019-05-09 12:38:29 +0000 (+0000), Jesse Pretorius wrote: > [...] > > A lot of operators make use of u-c for source-based builds to > > ensure consistency in the builds and to ensure that they’re using > > the same packages as those which were tested upstream. 
It makes > > sense to collaborate on something this important as far upstream > > as possible. > [...] > > See, this is what frightens me. We should *strongly* discourage them > from doing this, period. If your deployment relies on distribution > packages of dependencies then your distro's package maintainers have > almost certainly received advance notice of many of these > vulnerabilities and have fixes ready for you to download the moment > they're made public. They're in most cases selectively backporting > the fixes to the versions they carry so as to make them otherwise > backward compatible and avoid knock-on effects involving a need to > upgrade other transitive dependencies which are not involved in the > vulnerability. > To extend on this, I thought that OSA had the ability to override certian constraints (meaning they could run the check and maintain the overrides on their end). > > If we think of this as a community effort similar to the extended > > maintenance policy - the development community doesn’t *have* to > > implement the infrastructure to actively monitor for the > > vulnerabilities and respond to them. It can be maintained on a > > best effort basis by those interested in doing so. > > By the time we find out and work through the transitive dependency > bumps implied by this sort of change (because many of these ~600 > dependencies of ours don't backport fixes or maintain multiple > stable series of their own and so our only option is to upgrade to > the latest version, and this brings with it removal of old features > or reliance on newer versions of other transitive dependencies), > we're long past public disclosure and the vulnerability has likely > been getting exploited in the wild for some time. If a > deployer/operator can't rely on our constraints list for having a > timely and complete picture of a secure dependency tree then they > already need local workarounds which are probably superior > regardless. There are also plenty of non-Python dependencies for our > software which can have vulnerabilities of their own, and those > aren't reflected at all in our constraints lists. How are said users > updating those? > There's also the problem for knock on dependencies. Update foo, which pulls in a new version of bar as required. Either of which can break the world (and on down the dep tree) -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From joseph.davis at suse.com Thu May 9 15:57:03 2019 From: joseph.davis at suse.com (Joseph Davis) Date: Thu, 9 May 2019 08:57:03 -0700 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> Message-ID: Hi Tim, I added your question as Proposal C to the roadmap etherpad [1]. Feel free to change it if I got something wrong. :) [1]https://etherpad.openstack.org/p/telemetry-train-roadmap joseph On 5/9/19 12:24 AM, Tim Bell wrote: > > Is it time to rethink the approach to telemetry a bit? > > Having each project provide its telemetry data (such as Swift with > statsd - > https://docs.openstack.org/swift/latest/admin/objectstorage-monitoring.html > > or using a framework like Prometheus)? > > In the end, the projects are the ones who have the best knowledge of > how to get the metrics. 
> > Tim > ** -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu May 9 19:50:45 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 9 May 2019 14:50:45 -0500 Subject: [ops] database archiving tool In-Reply-To: References: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> Message-ID: <70050fb8-9d5f-b39c-a46a-af40e8a83ee5@gmail.com> On 5/9/2019 10:43 AM, Belmiro Moreira wrote: > But one thing usually missing is the retention period of soft deleted > records and then the archived data. Something like this? https://review.opendev.org/#/c/556751/ -- Thanks, Matt From cgoncalves at redhat.com Thu May 9 20:59:38 2019 From: cgoncalves at redhat.com (Carlos Goncalves) Date: Thu, 9 May 2019 22:59:38 +0200 Subject: [User-committee] OpenStack User Survey 2019 In-Reply-To: <5CD34F85.9010604@openstack.org> References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> <5CD34F85.9010604@openstack.org> Message-ID: Thank you for the prompt replies and action, Allison and Jimmy! After discussing on #openstack-lbaas with the team and drafting on https://etherpad.openstack.org/p/cItdtzi32r, we would like to suggest presenting some multiple choices along with an "Other" free text area. 1. Which OpenStack load balancing (Octavia) provider drivers would you like to see supported? (sorted alphabetically) A10 Networks AVI Networks Amphora Brocade F5 HAProxy Technologies Kemp Netscaler OVN Radware VMware Other (free text area) 2. Which new features would you like to see supported in OpenStack load balancing (Octavia)? (sorted alphabetically) Active-active Container-based amphora driver Event notifications gRPC protocol HTTP/2 protocol Log offloading MySQL protocol Simultaneous IPv4 and IPv6 VIP Statistics (more metrics) VIP ACL API Other (free text area) Thanks, Carlos On Wed, May 8, 2019 at 11:52 PM Jimmy McArthur wrote: > > Carlos, > > Right now these questions are up as free text area. Feel free to send along adjustments if you'd like. > > > > Cheers > Jimmy > > Allison Price May 8, 2019 at 4:30 PM > Hi Carlos, > > Thank you for providing these two questions. We can get them both added, but I did have a question. Are both of these questions intended to be open ended with a text box for respondents to fill in their answers? Or do you want to provide answer choices? (thinking for the first question in particular) With any multiple choice question, an Other option can be included that will trigger a text box to be completed. > > Thanks! > Allison > > > > _______________________________________________ > User-committee mailing list > User-committee at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee > Carlos Goncalves May 8, 2019 at 12:04 PM > Hi Allison and Jimmy, > > In today's Octavia IRC meeting [1], the team agreed on the following > two questions we would like to see included in the survey: > > 1. Which OpenStack load balancing (Octavia) provider drivers would you > like to see supported? > 2. Which new features would you like to see supported in OpenStack > load balancing (Octavia)? > > Please let us know if you have any questions. > > Thanks, > Carlos > > [1] http://eavesdrop.openstack.org/meetings/octavia/2019/octavia.2019-05-08-16.00.html > > Allison Price May 7, 2019 at 3:50 PM > Hi Michael, > > I apologize that the Octavia project team has been unable to submit a question to date. 
Jimmy posted the User Survey update to the public mailing list to ensure we updated the entire community and that we caught any projects that had not submitted their questions. The User Survey is open all year, and the primary goal is passing operator feedback to the upstream community. > > If the Octavia team - or any OpenStack project team - has a question they would like added (limit of 2 per project), please let Jimmy or myself know. > > Thanks for reaching out, Michael. > > Cheers, > Allison > > > Michael Johnson May 7, 2019 at 3:39 PM > Jimmy & Allison, > > As you probably remember from previous year's surveys, the Octavia > team has been trying to get a question included in the survey for a > while. > I have included the response we got the last time we inquired about > the survey below. We never received a follow up invitation. > > I think it would be in the best interest for the community if we > follow our "Four Opens" ethos in the user survey process, specifically > the "Open Community" statement, by soliciting survey questions from > the project teams in an open forum such as the openstack-discuss > mailing list. > > Michael > > ----- Last response e-mail ------ > Jimmy McArthur > > Fri, Sep 7, 2018, 5:51 PM > to Allison, me > Hey Michael, > > The project-specific questions were added in 2017, so likely didn't > include some new projects. While we asked all projects to participate > initially, less than a dozen did. We will be sending an invitation for > new/underrepresented projects in the coming weeks. Please stand by and > know that we value your feedback and that of the community. > > Cheers! > > > Allison Price April 27, 2019 at 7:11 PM > Hi Michael, > > We reached out to all of the PTLs who had questions in the 2018 version of the survey to review and update their questions. If there is a project that was missed, we can add it and share anonymized results with the PTLs directly as well as the openstack-discsuss mailing list. > > If there is a question from the Octavia team, please let us know and we can add it for the 2019 survey. > > Cheers, > Allison > > > > > From miguel at mlavalle.com Thu May 9 22:35:34 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Thu, 9 May 2019 17:35:34 -0500 Subject: [openstack-dev] [neutron] Cancelling Neutron Drivers meeting Message-ID: Dear Neutrinos, During the recent PTG in Denver we had session were we discussed RFEs. Other RFEs in the pipeline are still in the preliminary discussion stage. As a consequence, let's skip the meeting on May 10th. We will resume on the 17th Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Fri May 10 01:42:14 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Fri, 10 May 2019 10:42:14 +0900 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> Message-ID: Hi Joseph, Thanks for the update. I would suggest creating another mail thread since this is barely about the meeting yesterday. Bests, On Fri, May 10, 2019 at 12:57 AM Joseph Davis wrote: > Hi Tim, > > > I added your question as Proposal C to the roadmap etherpad [1]. Feel > free to change it if I got something wrong. :) > > > [1] https://etherpad.openstack.org/p/telemetry-train-roadmap > > > joseph > > > On 5/9/19 12:24 AM, Tim Bell wrote: > > Is it time to rethink the approach to telemetry a bit? 
> > > > Having each project provide its telemetry data (such as Swift with statsd > - > https://docs.openstack.org/swift/latest/admin/objectstorage-monitoring.html > > or using a framework like Prometheus)? > > > > In the end, the projects are the ones who have the best knowledge of how > to get the metrics. > > > > Tim > > > > ** > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Fri May 10 03:05:43 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Fri, 10 May 2019 12:05:43 +0900 Subject: [telemetry] Voting for a new meeting time Message-ID: Hi team, As discussed, we should have a new meeting time so more contributors can join. So please cast your vote in the link below *by the end of May 15th (UTC).* https://doodle.com/poll/cd9d3ksvpms4frud One thing to keep in mind that I still want to keep the old meeting time as an option, not because I'm biasing the APAC developers but because it is the time that most of the active contributors (who actually pushing patches and review) can join. When we have the results if we end up missing some contributors (I think all of you are great!), no worries. We could try to create different meetings for a different set of contributors, something like: - Developers: for bug triage, implementation, etc. - Operators: input from operators are important too since we need real use cases - Cross-project: Telemetry may need to work with other teams - Core team: for the core team to discuss the vision and goals, planning Okie, I know we cannot live without monitoring/logging so let's rock the world guys!!! Bests -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Fri May 10 04:26:04 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Fri, 10 May 2019 12:26:04 +0800 Subject: [User-committee] OpenStack User Survey 2019 In-Reply-To: References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> <5CD34F85.9010604@openstack.org> Message-ID: Thanks, Jimmy and Allison for the effort to put this together, As Heat PTL I would like to ask for question update for Heat project. Here is our new question: https://etherpad.openstack.org/p/heat-user-survey-brainstrom Please let me know if any confusion On Fri, May 10, 2019 at 5:02 AM Carlos Goncalves wrote: > Thank you for the prompt replies and action, Allison and Jimmy! > > After discussing on #openstack-lbaas with the team and drafting on > https://etherpad.openstack.org/p/cItdtzi32r, we would like to suggest > presenting some multiple choices along with an "Other" free text area. > > 1. Which OpenStack load balancing (Octavia) provider drivers would you > like to see supported? > > (sorted alphabetically) > A10 Networks > AVI Networks > Amphora > Brocade > F5 > HAProxy Technologies > Kemp > Netscaler > OVN > Radware > VMware > Other (free text area) > > 2. Which new features would you like to see supported in OpenStack > load balancing (Octavia)? > > (sorted alphabetically) > Active-active > Container-based amphora driver > Event notifications > gRPC protocol > HTTP/2 protocol > Log offloading > MySQL protocol > Simultaneous IPv4 and IPv6 VIP > Statistics (more metrics) > VIP ACL API > Other (free text area) > > Thanks, > Carlos > > > On Wed, May 8, 2019 at 11:52 PM Jimmy McArthur > wrote: > > > > Carlos, > > > > Right now these questions are up as free text area. 
Feel free to send > along adjustments if you'd like. > > > > > > > > Cheers > > Jimmy > > > > Allison Price May 8, 2019 at 4:30 PM > > Hi Carlos, > > > > Thank you for providing these two questions. We can get them both added, > but I did have a question. Are both of these questions intended to be open > ended with a text box for respondents to fill in their answers? Or do you > want to provide answer choices? (thinking for the first question in > particular) With any multiple choice question, an Other option can be > included that will trigger a text box to be completed. > > > > Thanks! > > Allison > > > > > > > > _______________________________________________ > > User-committee mailing list > > User-committee at lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee > > Carlos Goncalves May 8, 2019 at 12:04 PM > > Hi Allison and Jimmy, > > > > In today's Octavia IRC meeting [1], the team agreed on the following > > two questions we would like to see included in the survey: > > > > 1. Which OpenStack load balancing (Octavia) provider drivers would you > > like to see supported? > > 2. Which new features would you like to see supported in OpenStack > > load balancing (Octavia)? > > > > Please let us know if you have any questions. > > > > Thanks, > > Carlos > > > > [1] > http://eavesdrop.openstack.org/meetings/octavia/2019/octavia.2019-05-08-16.00.html > > > > Allison Price May 7, 2019 at 3:50 PM > > Hi Michael, > > > > I apologize that the Octavia project team has been unable to submit a > question to date. Jimmy posted the User Survey update to the public mailing > list to ensure we updated the entire community and that we caught any > projects that had not submitted their questions. The User Survey is open > all year, and the primary goal is passing operator feedback to the upstream > community. > > > > If the Octavia team - or any OpenStack project team - has a question > they would like added (limit of 2 per project), please let Jimmy or myself > know. > > > > Thanks for reaching out, Michael. > > > > Cheers, > > Allison > > > > > > Michael Johnson May 7, 2019 at 3:39 PM > > Jimmy & Allison, > > > > As you probably remember from previous year's surveys, the Octavia > > team has been trying to get a question included in the survey for a > > while. > > I have included the response we got the last time we inquired about > > the survey below. We never received a follow up invitation. > > > > I think it would be in the best interest for the community if we > > follow our "Four Opens" ethos in the user survey process, specifically > > the "Open Community" statement, by soliciting survey questions from > > the project teams in an open forum such as the openstack-discuss > > mailing list. > > > > Michael > > > > ----- Last response e-mail ------ > > Jimmy McArthur > > > > Fri, Sep 7, 2018, 5:51 PM > > to Allison, me > > Hey Michael, > > > > The project-specific questions were added in 2017, so likely didn't > > include some new projects. While we asked all projects to participate > > initially, less than a dozen did. We will be sending an invitation for > > new/underrepresented projects in the coming weeks. Please stand by and > > know that we value your feedback and that of the community. > > > > Cheers! > > > > > > Allison Price April 27, 2019 at 7:11 PM > > Hi Michael, > > > > We reached out to all of the PTLs who had questions in the 2018 version > of the survey to review and update their questions. 
If there is a project > that was missed, we can add it and share anonymized results with the PTLs > directly as well as the openstack-discsuss mailing list. > > > > If there is a question from the Octavia team, please let us know and we > can add it for the 2019 survey. > > > > Cheers, > > Allison > > > > > > > > > > > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Fri May 10 06:09:37 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Fri, 10 May 2019 15:09:37 +0900 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> Message-ID: Hi guys, Please cast your votes for the new meeting time: https://doodle.com/poll/cd9d3ksvpms4frud Bests, On Fri, May 10, 2019 at 10:42 AM Trinh Nguyen wrote: > Hi Joseph, > > Thanks for the update. I would suggest creating another mail thread since > this is barely about the meeting yesterday. > > Bests, > > On Fri, May 10, 2019 at 12:57 AM Joseph Davis > wrote: > >> Hi Tim, >> >> >> I added your question as Proposal C to the roadmap etherpad [1]. Feel >> free to change it if I got something wrong. :) >> >> >> [1] https://etherpad.openstack.org/p/telemetry-train-roadmap >> >> >> joseph >> >> >> On 5/9/19 12:24 AM, Tim Bell wrote: >> >> Is it time to rethink the approach to telemetry a bit? >> >> >> >> Having each project provide its telemetry data (such as Swift with statsd >> - >> https://docs.openstack.org/swift/latest/admin/objectstorage-monitoring.html >> >> or using a framework like Prometheus)? >> >> >> >> In the end, the projects are the ones who have the best knowledge of how >> to get the metrics. >> >> >> >> Tim >> >> >> >> ** >> > > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From info at dantalion.nl Fri May 10 06:49:13 2019 From: info at dantalion.nl (info at dantalion.nl) Date: Fri, 10 May 2019 08:49:13 +0200 Subject: [telemetry][ceilometer][gnocchi] How to configure aggregate for cpu_util or calculate from metrics In-Reply-To: References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> Message-ID: <48533933-1443-6ad3-9cf1-940ac4d52d6f@dantalion.nl> Hello, I am working on Watcher and we are currently changing how metrics are retrieved from different datasources such as Monasca or Gnocchi. Because of this major overhaul I would like to validate that everything is working correctly. Almost all of the optimization strategies in Watcher require the cpu utilization of an instance as metric but with newer versions of Ceilometer this has become unavailable. On IRC I received the information that Gnocchi could be used to configure an aggregate and this aggregate would then report cpu utilization, however, I have been unable to find documentation on how to achieve this. I was also notified that cpu_util is something that could be computed from other metrics. When reading https://docs.openstack.org/ceilometer/rocky/admin/telemetry-measurements.html#openstack-compute the documentation seems to agree on this as it states that cpu_util is measured by using a 'rate of change' transformer. But I have not been able to find how this can be computed. 
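If I understand it correctly, the arithmetic behind the old 'rate of
change' transformer amounts to something like the sketch below. This is
only an illustration (the function and variable names are made up, it is
not Watcher or Ceilometer code) and it assumes the cumulative "cpu"
metric, i.e. CPU time consumed in nanoseconds, is still being collected:

    # Minimal sketch: derive a cpu_util-style percentage from two samples
    # of the cumulative "cpu" metric (CPU time in nanoseconds).
    def cpu_util_percent(cpu_ns_prev, cpu_ns_curr, ts_prev, ts_curr, vcpus):
        """Return average CPU utilisation (%) between two samples."""
        wall_ns = (ts_curr - ts_prev) * 1e9   # elapsed wall-clock time in ns
        used_ns = cpu_ns_curr - cpu_ns_prev   # CPU time consumed in that window
        return 100.0 * used_ns / (wall_ns * vcpus)

Presumably Gnocchi can produce the same thing server-side when the
archive policy includes a rate aggregate such as "rate:mean", leaving
only the scaling by granularity, 1e9 and the number of vCPUs to the
client, but I could not find this spelled out anywhere.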
I was hoping someone could spare the time to provide documentation or
information on how this is currently best achieved.

Kind Regards,
Corne Lukken (Dantali0n)

From akekane at redhat.com Fri May 10 07:01:40 2019
From: akekane at redhat.com (Abhishek Kekane)
Date: Fri, 10 May 2019 12:31:40 +0530
Subject: [glance] Train PTG summary
Message-ID: 

Hi All,

I attended the OpenStack Train PTG at Denver last week. It was an
interesting event with lots of discussion happening around different
OpenStack projects. I was mostly involved with Glance and cross-project
work related to Glance. There were other topics around Edge, UI and QA.

From the edge computing perspective, Glance has already added a feature
for enabling multiple backend support, and the people from the edge team
will work on building more concrete use cases around Glance.

This cycle Glance is keen on making the multiple stores feature solid and
on enhancing the cache management tool. Apart from this, Glance will
mostly focus on making the appropriate changes for releasing glance-store
version 1.0.0.

The main task related to this is stabilization of the multiple stores
feature. The Glance team has identified the following tasks related to
the release of glance-store version 1.0.0:

1. Add multiple stores support to the Location API
2. Stabilize multiple stores support
3. Clean out deprecated configuration options
4. Modify deprecation warnings for single store configuration options
   (should be removed in the V cycle)
5. Resizing issue related to the ceph backend

Regarding Glance itself, the focus is mostly on stabilizing the multiple
stores feature. The Glance team has identified the following tasks
towards the same:

1. Rethinking of file system access
2. Location API to support multiple stores
3. Store IDs will be lazily added to images upon first access (for
   existing images created before multiple stores support was enabled)
4. Correction of vocabulary (i.e. where 'backend' is exposed to the user,
   change it to 'store')
5. Necessary client changes

Another major effort will be carried out for "Use v2 API for cache
management"; the Glance team will try its best to deliver the basic
cache-management tool to pre-cache images. Below are some tasks
identified for this work:

1. /v2/cache - endpoint for listing, deleting and pre-caching images
2. rabbitmq implementation for HA deployment - optional
3. RESTful policy-driven JSON output
4. Deprecate the previous glance-cache-manage and glance-cache-prefetcher
   tools

*Cross-Project work:*

At this PTG we had discussions with Nova and Cinder regarding the
adoption of the multiple stores feature of Glance. As per those
discussions we have finalized the design, and the Glance team will work
together with Nova and Cinder towards adding multiple store support in
the Train cycle.

Support for Glance multiple stores in Cinder:
As per the discussion, the volume-type will be used to indicate which
store the image will be uploaded to during the upload-to-image operation.

Nova snapshots to dedicated store:
The agreement is: if the instance is booted from an image, we need to
find the store of the base image and upload the snapshot or backup to
the same store; if the instance is booted from a volume, then Nova
creates a 0-size image in Glance which will be uploaded to the default
store (so no change is needed in the boot-from-volume case).
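For anyone who has not tried multiple stores yet, a minimal
glance-api.conf sketch looks roughly like the one below. The store names
"fast" and "ceph1" and the paths are only illustrative examples, not
something decided at the PTG:

    [DEFAULT]
    enabled_backends = fast:file, ceph1:rbd

    [glance_store]
    default_backend = fast

    [fast]
    filesystem_store_datadir = /opt/stack/data/glance/images/

    [ceph1]
    rbd_store_ceph_conf = /etc/ceph/ceph.conf
    rbd_store_user = glance
    rbd_store_pool = images

An upload or import can then be directed to a particular store, e.g. by
passing the X-Image-Meta-Store header on the image data upload or import
call.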
Glance image properties and Cinder encrypted volume key management: new draft spec: https://review.opendev.org/#/c/656895/ Aggrement is, Cinder will add a metadata property 'delete_encryption_key_on_image_deletion' with True to image, and while image is deleted and if this property is present the Glance will make a call to Barbican to delete the related secret. Cinder/Glance creating image from volume with Ceph: References: https://review.openstack.org/#/c/608400/ Solution: Lets not make the size user settable and instead just resize bigger chunks at the time and shrink back after EOF. Below is the Train cycle planning and deadlines for Glance. *Train milestone planning:* *Train T1 - June 03-07:* glance_store v 1.0.0 'Store' vs. 'Backend': Getting the vocabulary correct Clear deprecated config options Modify deprecated warnings for single store config options Stabilize multiple store functionality Rethinking filesystem access Add support for location to identify the store based on location URI Add wrapper function to update store information to existing images Nova backup and snapshots to dedicated stores Cinder to utilize glance multiple store Remove 'owner_is_tenant' config option *Train T2 - July 22-26* Use v2 API for cache management Glance image properties and Cinder encrypted volume key management Cinder/Glance creating image from volume with Ceph *Train T3 - September 09-13* cluster awareness openstackclient vs python-glanceclient Clear deprecated options from glance Get rid of deprecation warning messages Release of non-client and client libraries glance-store (if required) python-glanceclient Glance PTG planning etherpad: https://etherpad.openstack.org/p/Glance-Train-PTG-planning Let me know if you guys need more details on this. Thanks & Best Regards, Abhishek Kekane -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrunge at matthias-runge.de Fri May 10 07:05:05 2019 From: mrunge at matthias-runge.de (Matthias Runge) Date: Fri, 10 May 2019 09:05:05 +0200 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: <20190509124300.4f7d7qxprq6osasb@yuggoth.org> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> <20190509083558.GB3547@hilbert.berg.ol> <20190509124300.4f7d7qxprq6osasb@yuggoth.org> Message-ID: <20190510070505.GC18559@hilbert.berg.ol> On Thu, May 09, 2019 at 12:43:01PM +0000, Jeremy Stanley wrote: > On 2019-05-09 10:35:58 +0200 (+0200), Matthias Runge wrote: > [...] > > Unfortunately, having a meetig at 4 am in the morning does not really > > work for me. May I kindly request to move the meeting to a more friendly > > hour? > > The World is round, and your "friendly" times are always someone > else's "unfriendly" times. Asking the folks interested in > participating in the meeting to agree on a consensus timeslot > between them is fair, but please don't characterize someone else's > locale as "unfriendly" just because it's on the opposite side of the > planet from you. Right. It was not my intention to sound develuating. However, and for myself, I could not imagine a worse time. When asking for people to join an effort, it usually helps to have meetings in hours being accessible for them; the alternative would be to switch to asynchronous methods. Matthias -- Matthias Runge -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From yongli.he at intel.com Fri May 10 07:45:46 2019 From: yongli.he at intel.com (yonglihe) Date: Fri, 10 May 2019 15:45:46 +0800 Subject: [nova] PTG aligning of nova spec: show-server-numa-topology Message-ID: Hi,  Everyone I synced up with Alex about comments we got at PTG.  It's a long discussion, I might lost something. What i got lists below,  fix me: *  Remove sockets *  Remove thread_policy Not sure about following comments: *  Remove the cpu topology from the proposal? *  Using the cpu pinning info instead of cpu set? By apply the suggestion, the API ``GET /servers/{server_id}/topology``  response gonna to be like this, and let us align what it should be:  {          # overall policy: TOPOLOGY % 'index          "nodes":[                     {                       # Host Numa Node                       # control by policy TOPOLOGY % 'index:host_info'                       "host_numa_node": 3,                       # 0:5 means vcpu 0 pinning to pcpu 5                       # control by policy TOPOLOGY % 'index:host_info'                       "cpu_pinning": {0:5, 1:6},                       "vcpu_set": [0,1,2,3],                       "siblings": [[0,1],[2,3]],                       "memory_mb": 1024,                       "pagesize_kb": 4096,                       "cores": 2,                       # one core has at least one thread                       "threads": 2                     }                     ...                    ], # nodes     } links: ptg: https://etherpad.openstack.org/p/nova-ptg-train L334 spec review: https://review.opendev.org/#/c/612256/25/specs/stein/approved/show-server-numa-topology.rst code review: https://review.openstack.org/#/c/621476/ bp: https://blueprints.launchpad.net/nova/+spec/show-server-numa-topology Regards Yongli He From pierre-samuel.le-stang at corp.ovh.com Fri May 10 07:55:44 2019 From: pierre-samuel.le-stang at corp.ovh.com (Pierre-Samuel LE STANG) Date: Fri, 10 May 2019 09:55:44 +0200 Subject: [ops] database archiving tool In-Reply-To: References: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> Message-ID: <20190510075544.dk6oxokbzfm4n5pd@corp.ovh.com> Hello Belmiro, I will put the code on OVH's github repository as soon as possible. I'll keep you informed. -- Pierre-Samuel Le Stang Belmiro Moreira wrote on jeu. [2019-mai-09 17:43:49 +0200]: > Hi Pierre-Samuel, > at this point most of the OpenStack projects have their own way to archive/ > delete soft deleted records. > But one thing usually missing is the retention period of soft deleted records > and then the archived data. > > I'm interested to learn more about what you are doing. > Is there any link to access the code? > > Belmiro > CERN > > On Thu, May 9, 2019 at 5:25 PM Pierre-Samuel LE STANG < > pierre-samuel.le-stang at corp.ovh.com> wrote: > > Hi all, > > At OVH we needed to write our own tool that archive data from OpenStack > databases to prevent some side effect related to huge tables (slower > response > time, changing MariaDB query plan) and to answer to some legal aspects. > > So we started to write a python tool which is called OSArchiver that I > briefly > presented at Denver few days ago in the "Optimizing OpenStack at large > scale" > talk. We think that this tool could be helpful to other and are ready to > open > source it, first we would like to get the opinion of the ops community > about > that tool. 
> > To sum-up OSArchiver is written to work regardless of Openstack project. > The > tool relies on the fact that soft deleted data are recognizable because of > their 'deleted' column which is set to 1 or uuid and 'deleted_at' column > which > is set to the date of deletion. > > The points to have in mind about OSArchiver: > * There is no knowledge of business objects > * One table might be archived if it contains 'deleted' column > * Children rows are archived before parents rows > * A row can not be deleted if it fails to be archived > > Here are features already implemented: > * Archive data in an other database and/or file (actually SQL and CSV > formats are supported) to be easily imported > * Delete data from Openstack databases > * Customizable (retention, exclude DBs, exclude tables, bulk insert/delete) > * Multiple archiving configuration > * Dry-run mode > * Easily extensible, you can add your own destination module (other file > format, remote storage etc...) > * Archive and/or delete only mode > > It also means that by design you can run osarchiver not only on OpenStack > databases but also on archived OpenStack databases. > > Thanks in advance for your feedbacks. > > -- > Pierre-Samuel Le Stang > > -- Pierre-Samuel Le Stang From cjeanner at redhat.com Fri May 10 09:12:12 2019 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Fri, 10 May 2019 11:12:12 +0200 Subject: [TripleO][Validations] Tag convention In-Reply-To: <5228e551-477c-129e-d621-9b1bde9a6535@redhat.com> References: <3c383d8d-54fa-b054-f0ad-b97ed67ba03f@redhat.com> <5228e551-477c-129e-d621-9b1bde9a6535@redhat.com> Message-ID: <1c816ba1-b557-ef59-ba59-6c4fc31f4111@redhat.com> On 5/8/19 9:07 AM, Cédric Jeanneret wrote: > > > On 5/7/19 6:24 PM, Mohammed Naser wrote: >> On Tue, May 7, 2019 at 12:12 PM Emilien Macchi wrote: >>> >>> >>> >>> On Tue, May 7, 2019 at 4:44 PM Cédric Jeanneret wrote: >>>> >>>> Dear all, >>>> >>>> We're currently working hard in order to provide a nice way to run >>>> validations within a deploy (aka in-flight validations). >>>> >>>> We can already call validations provided by the tripleo-validations >>>> package[1], it's working just fine. >>>> >>>> Now comes the question: "how can we disable the validations?". In order >>>> to do that, we propose to use a standard tag in the ansible >>>> roles/playbooks, and to add a "--skip-tags " when we disable the >>>> validations via the CLI or configuration. >>>> >>>> After a quick check in the tripleoclient code, there apparently is a tag >>>> named "validation", that can already be skipped from within the client. >>>> >>>> So, our questions: >>>> - would the reuse of "validation" be OK? >>>> - if not, what tag would be best in order to avoid confusion? >>>> >>>> We also have the idea to allow to disable validations per service. For >>>> this, we propose to introduce the following tag: >>>> - validation-, like "validation-nova", "validation-neutron" and >>>> so on >>>> >>>> What do you think about those two additions? >>> >>> >>> Such as variables, I think we should prefix all our variables and tags with tripleo_ or something, to differentiate them from any other playbooks our operators could run. >>> I would rather use "tripleo_validations" and "tripleo_validation_nova" maybe. > > hmm. what-if we open this framework to a wider audience? For instance, > openshift folks might be interested in some validations (I have Ceph in > mind), and might find weird or even bad to have "tripleo-something" > (with underscore or dashes). 
> Maybe something more generic? > "vf(-nova)" ? > "validation-framework(-nova)" ? > Or even "opendev-validation(-nova)" > Since there are also a possibility to ask for a new package name for > something more generic without the "tripleo" taint.. Can we agree on something? I really like the "opendev-validation(-service)", even if it's a bit long. For automated thins, it's still good IMHO. Would love to get some feedback on that so that we can go forward with the validations :). Cheers, C. > > Cheers, > > C. > >> >> Just chiming in here.. the pattern we like in OSA is using dashes for >> tags, I think having something like 'tripleo-validations' and >> 'tripleo-validations-nova' etc >> >>> Wdyt? >>> -- >>> Emilien Macchi >> >> >> > -- Cédric Jeanneret Software Engineer - OpenStack Platform Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From geguileo at redhat.com Fri May 10 09:26:00 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 10 May 2019 11:26:00 +0200 Subject: Help needed to Support Multi-attach feature In-Reply-To: References: Message-ID: <20190510092600.r27zetl5e3k5ow5v@localhost> On 02/05, RAI, SNEHA wrote: > Hi Team, > > I am currently working on multiattach feature for HPE 3PAR cinder driver. > > For this, while setting up devstack(on stable/queens) I made below change in the local.conf > [[local|localrc]] > ENABLE_VOLUME_MULTIATTACH=True > ENABLE_UBUNTU_CLOUD_ARCHIVE=False > > /etc/cinder/cinder.conf: > [3pariscsi_1] > hpe3par_api_url = https://192.168.1.7:8080/api/v1 > hpe3par_username = user > hpe3par_password = password > san_ip = 192.168.1.7 > san_login = user > san_password = password > volume_backend_name = 3pariscsi_1 > hpe3par_cpg = my_cpg > hpe3par_iscsi_ips = 192.168.11.2,192.168.11.3 > volume_driver = cinder.volume.drivers.hpe.hpe_3par_iscsi.HPE3PARISCSIDriver > hpe3par_iscsi_chap_enabled = True > hpe3par_debug = True > image_volume_cache_enabled = True > > /etc/cinder/policy.json: > 'volume:multiattach': 'rule:admin_or_owner' > > Added https://review.opendev.org/#/c/560067/2/cinder/volume/drivers/hpe/hpe_3par_common.py change in the code. > > But I am getting below error in the nova log: > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [None req-2cda6e90-fd45-4bfe-960a-7fca9ba4abab demo admin] [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Instance failed block device setup: MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. 
> Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Traceback (most recent call last): > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/compute/manager.py", line 1615, in _prep_block_device > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] wait_func=self._await_block_device_map_created) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 840, in attach_block_devices > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] _log_and_attach(device) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 837, in _log_and_attach > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] bdm.attach(*attach_args, **attach_kwargs) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 46, in wrapped > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] ret_val = method(obj, context, *args, **kwargs) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 620, in attach > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] virt_driver, do_driver_attach) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] return f(*args, **kwargs) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 617, in _do_locked_attach > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] self._do_attach(*args, **_kwargs) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 602, in _do_attach > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] do_driver_attach) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 509, in _volume_attach > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] volume_id=volume_id) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR 
nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] > > > Apr 29 05:41:20 CSSOSBE04-B09 nova-compute[20455]: DEBUG nova.virt.libvirt.driver [-] Volume multiattach is not supported based on current versions of QEMU and libvirt. QEMU must be less than 2.10 or libvirt must be greater than or equal to 3.10. {{(pid=20455) _set_multiattach_support /opt/stack/nova/nova/virt/libvirt/driver.py:619}} > > > stack at CSSOSBE04-B09:/tmp$ virsh --version > 3.6.0 > stack at CSSOSBE04-B09:/tmp$ kvm --version > QEMU emulator version 2.10.1(Debian 1:2.10+dfsg-0ubuntu3.8~cloud1) > Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers > Hi Sneha, I don't know much about this side of Nova, but reading the log error I would say that you either need to update your libvirt version from 3.6.0 to 3.10, or you need to downgrade your QEMU version to something prior to 2.10. The later is probably easier. I don't use Ubuntu, but according to the Internet you can list available versions with "apt-cache policy qemu" and then install or downgrade to the specific version with "sudo apt-get install qemu=2.5\*" if you wanted to install version 2.5 I hope this helps. Cheers, Gorka. > > openstack volume show -c multiattach -c status sneha1 > +-------------+-----------+ > | Field | Value | > +-------------+-----------+ > | multiattach | True | > | status | available | > +-------------+-----------+ > > cinder extra-specs-list > +--------------------------------------+-------------+--------------------------------------------------------------------+ > | ID | Name | extra_specs | > +--------------------------------------+-------------+--------------------------------------------------------------------+ > | bd077fde-51c3-4581-80d5-5855e8ab2f6b | 3pariscsi_1 | {'volume_backend_name': '3pariscsi_1', 'multiattach': ' True'}| > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > echo $OS_COMPUTE_API_VERSION > 2.60 > > pip list | grep python-novaclient > DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. > python-novaclient 13.0.0 > > How do I fix this version issue on my setup to proceed? Please help. > > Thanks & Regards, > Sneha Rai From sfinucan at redhat.com Fri May 10 09:42:02 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 10 May 2019 10:42:02 +0100 Subject: Change in behavior in bandit 1.6.0 Message-ID: We've noticed a spate of recent test failures within the 'pep8' jobs in oslo recently. It seems these are because of the release of bandit 1.6.0. The root cause is that the '-x' (exclude) option seems to have changed behavior. Previously, to match a path like 'oslo_log/tests/*', you could state '-x test'. This option now expects a glob patterns, such as '-x oslo_log/tests/*'. If you use bandit, you probably have to update your 'tox.ini' accordingly. See [1] for an example. 
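As a concrete illustration (the tox environment name below is made up,
and the paths just follow the oslo_log example above), a stanza that used
to read:

    [testenv:bandit]
    commands = bandit -r oslo_log -x test

now needs the exclusion written out as a glob:

    [testenv:bandit]
    commands = bandit -r oslo_log -x oslo_log/tests/*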
It's worth noting that bandit is one of the few packages we don't manage the version for [2], so if you're not already limiting yourself to a version, perhaps it would be a good idea to do so to avoid stable branches breaking periodically. Also, this is something that really shouldn't have happened in a minor version (backwards incompatible behavior change, yo) but it has so we'll live with it. I would ask though that the bandit maintainers, whoever ye be, be more careful about this kind of stuff in the future. Thanks :) Stephen [1] https://review.opendev.org/#/c/658249/ [2] https://github.com/openstack/requirements/blob/master/blacklist.txt From zigo at debian.org Fri May 10 09:54:27 2019 From: zigo at debian.org (Thomas Goirand) Date: Fri, 10 May 2019 11:54:27 +0200 Subject: [telemetry][monasca][self-healing] Team meeting agenda for tomorrow In-Reply-To: <22b57ad4-c737-cef6-b18b-775c0cb9e7a6@suse.com> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> <20AC2324-24B6-40D1-A0A4-0382BCE430A7@cern.ch> <1894ef89-ea11-0d31-4820-dc1c39ed07b7@debian.org> <22b57ad4-c737-cef6-b18b-775c0cb9e7a6@suse.com> Message-ID: <294643ce-df80-c06e-585a-d5c0ac4d8f15@debian.org> On 5/9/19 2:35 PM, Witek Bedyk wrote: > >> But then we need some kind of timeseries framework within OpenStack as a >> whole (through an Oslo library?), > > What would be the requirements and the scope of this framework from your > point of view? Currently, Ceilometer pushes values to a timeseries. We could see services doing this directly, without having Ceilometer in the middle doing the polling. I'm thinking about bandwidth and IO usage, which can potentially be quite resource intensive. For example, we could have neutron-metering-agent sending metrics to a timeseries directly, without going through the loop of rabbitmq. That's just an idea I'm throwing... Cheers, Thomas Goirand (zigo) From geguileo at redhat.com Fri May 10 10:39:29 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 10 May 2019 12:39:29 +0200 Subject: Baremetal attach volume in Multi-tenancy In-Reply-To: References: <20190509092828.g6qvdg5jbvqqvpba@localhost> Message-ID: <20190510103929.w7iqvakxzskk2pmb@localhost> On 10/05, zack chen wrote: > This is a normal Cinder in Openstack deployment > > I'm using ceph as cinder backend, RBD drvier. > Hi, If you are using a Ceph/RBD cluster then there are some things to take into consideration: - You need to have the ceph-common package installed in the system. - The images are mounted using the kernel module, so you have to be careful with the features that are enabled in the images. - If I'm not mistaken the RBD attach using the cinderclient extension will fail if you don't have the configuration and credentials file already in the system. > My ideas the instance should communicate with Openstack platform storage > network via the vrouter provided by neutron. The vrouter gateway should > communicate with Openstack platform. is or right? > I can't help you on the network side, since I don't know anything about Neutron. Cheers, Gorka. > Gorka Eguileor 于2019年5月9日周四 下午5:28写道: > > > On 08/05, zack chen wrote: > > > Hi, > > > I am looking for a mechanism that can be used for baremetal attach volume > > > in a multi-tenant scenario. In addition we use ceph as the backend > > storage > > > for cinder. > > > > > > Can anybody give me some advice? > > > > Hi, > > > > Is this a stand alone Cinder deployment or a normal Cinder in OpenStack > > deployment? > > > > What storage backend will you be using? 
> > > > What storage protocol? iSCSI, FC, RBD...? > > > > Depending on these you can go with Walter's suggestion of using > > cinderclient and its extension (which in general is the best way to go), > > or you may prefer writing a small python script that uses OS-Brick and > > makes the REST API calls directly. > > > > Cheers, > > Gorka. > > From rico.lin.guanyu at gmail.com Fri May 10 10:41:47 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Fri, 10 May 2019 18:41:47 +0800 Subject: [User-committee] OpenStack User Survey 2019 In-Reply-To: References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> <5CD34F85.9010604@openstack.org> Message-ID: (resend to user committee ML) Thanks, Jimmy and Allison for the effort to put this together, As Heat PTL I would like to ask for question update for Heat project. Here is our new question: https://etherpad.openstack.org/p/heat-user-survey-brainstrom Please let me know if any confusion Rico Lin 於 2019年5月10日 週五,下午12:26寫道: > Thanks, Jimmy and Allison for the effort to put this together, > > As Heat PTL > I would like to ask for question update for Heat project. > Here is our new question: > https://etherpad.openstack.org/p/heat-user-survey-brainstrom > Please let me know if any confusion > > On Fri, May 10, 2019 at 5:02 AM Carlos Goncalves > wrote: > >> Thank you for the prompt replies and action, Allison and Jimmy! >> >> After discussing on #openstack-lbaas with the team and drafting on >> https://etherpad.openstack.org/p/cItdtzi32r, we would like to suggest >> presenting some multiple choices along with an "Other" free text area. >> >> 1. Which OpenStack load balancing (Octavia) provider drivers would you >> like to see supported? >> >> (sorted alphabetically) >> A10 Networks >> AVI Networks >> Amphora >> Brocade >> F5 >> HAProxy Technologies >> Kemp >> Netscaler >> OVN >> Radware >> VMware >> Other (free text area) >> >> 2. Which new features would you like to see supported in OpenStack >> load balancing (Octavia)? >> >> (sorted alphabetically) >> Active-active >> Container-based amphora driver >> Event notifications >> gRPC protocol >> HTTP/2 protocol >> Log offloading >> MySQL protocol >> Simultaneous IPv4 and IPv6 VIP >> Statistics (more metrics) >> VIP ACL API >> Other (free text area) >> >> Thanks, >> Carlos >> >> >> On Wed, May 8, 2019 at 11:52 PM Jimmy McArthur >> wrote: >> > >> > Carlos, >> > >> > Right now these questions are up as free text area. Feel free to send >> along adjustments if you'd like. >> > >> > >> > >> > Cheers >> > Jimmy >> > >> > Allison Price May 8, 2019 at 4:30 PM >> > Hi Carlos, >> > >> > Thank you for providing these two questions. We can get them both >> added, but I did have a question. Are both of these questions intended to >> be open ended with a text box for respondents to fill in their answers? Or >> do you want to provide answer choices? (thinking for the first question in >> particular) With any multiple choice question, an Other option can be >> included that will trigger a text box to be completed. >> > >> > Thanks! >> > Allison >> > >> > >> > >> > _______________________________________________ >> > User-committee mailing list >> > User-committee at lists.openstack.org >> > http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee >> > Carlos Goncalves May 8, 2019 at 12:04 PM >> > Hi Allison and Jimmy, >> > >> > In today's Octavia IRC meeting [1], the team agreed on the following >> > two questions we would like to see included in the survey: >> > >> > 1. 
Which OpenStack load balancing (Octavia) provider drivers would you >> > like to see supported? >> > 2. Which new features would you like to see supported in OpenStack >> > load balancing (Octavia)? >> > >> > Please let us know if you have any questions. >> > >> > Thanks, >> > Carlos >> > >> > [1] >> http://eavesdrop.openstack.org/meetings/octavia/2019/octavia.2019-05-08-16.00.html >> > >> > Allison Price May 7, 2019 at 3:50 PM >> > Hi Michael, >> > >> > I apologize that the Octavia project team has been unable to submit a >> question to date. Jimmy posted the User Survey update to the public mailing >> list to ensure we updated the entire community and that we caught any >> projects that had not submitted their questions. The User Survey is open >> all year, and the primary goal is passing operator feedback to the upstream >> community. >> > >> > If the Octavia team - or any OpenStack project team - has a question >> they would like added (limit of 2 per project), please let Jimmy or myself >> know. >> > >> > Thanks for reaching out, Michael. >> > >> > Cheers, >> > Allison >> > >> > >> > Michael Johnson May 7, 2019 at 3:39 PM >> > Jimmy & Allison, >> > >> > As you probably remember from previous year's surveys, the Octavia >> > team has been trying to get a question included in the survey for a >> > while. >> > I have included the response we got the last time we inquired about >> > the survey below. We never received a follow up invitation. >> > >> > I think it would be in the best interest for the community if we >> > follow our "Four Opens" ethos in the user survey process, specifically >> > the "Open Community" statement, by soliciting survey questions from >> > the project teams in an open forum such as the openstack-discuss >> > mailing list. >> > >> > Michael >> > >> > ----- Last response e-mail ------ >> > Jimmy McArthur >> > >> > Fri, Sep 7, 2018, 5:51 PM >> > to Allison, me >> > Hey Michael, >> > >> > The project-specific questions were added in 2017, so likely didn't >> > include some new projects. While we asked all projects to participate >> > initially, less than a dozen did. We will be sending an invitation for >> > new/underrepresented projects in the coming weeks. Please stand by and >> > know that we value your feedback and that of the community. >> > >> > Cheers! >> > >> > >> > Allison Price April 27, 2019 at 7:11 PM >> > Hi Michael, >> > >> > We reached out to all of the PTLs who had questions in the 2018 version >> of the survey to review and update their questions. If there is a project >> that was missed, we can add it and share anonymized results with the PTLs >> directly as well as the openstack-discsuss mailing list. >> > >> > If there is a question from the Octavia team, please let us know and we >> can add it for the 2019 survey. >> > >> > Cheers, >> > Allison >> > >> > >> > >> > >> > >> >> > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at binero.se Fri May 10 12:16:01 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Fri, 10 May 2019 14:16:01 +0200 Subject: [telemetry] Team meeting agenda for tomorrow In-Reply-To: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> References: <14ff728c-f19e-e869-90b1-4ff37f7170af@suse.com> Message-ID: <8db2ece3-3704-ff66-e5e8-49d25a23d640@binero.se> Interesting thread! 
I was very intrigued about reading this thread, and reading through Julien's blog posts. We are consuming the various Telemetry projects and I would like to get more involved in the Telemetry-effort. The mentioned Train roadmap ethernetpad [1] is a great start on defining the focus for the Telemetry projects. I think this is a great start to getting to the roots of how, based on the words of Julien's, how Ceilometer should have been build and not how it was built to work around all the limitations of the old OpenStack era. Most new deployments are probably already using Gnocchi, I'm including myself, were even third parties has implemented billing connections to the API. In my opinion the bigger questions here is how the various Telemetry parts should evolve based on the theory that storage is already provided. There should be a lot of thought put into the difference between a metrics- and billingbased storage solution in the back. I'll check out the Doodle and see if I can make the next meeting. Best regards Tobias [1] https://etherpad.openstack.org/p/telemetry-train-roadmap On 05/08/2019 11:27 PM, Joseph Davis wrote: > On 5/8/19 7:12 AM, openstack-discuss-request at lists.openstack.org wrote: >> Hello Trinh, >> Where does the meeting happen? Will it be via IRC Telemetry channel? Or, in >> the Etherpad (https://etherpad.openstack.org/p/telemetry-meeting-agenda)? I >> would like to discuss and understand a bit better the context behind >> the Telemetry >> events deprecation. > > Unfortunately, I have a conflict at that time and will not be able to > attend. > > I do have a little bit of context on the Events deprecation to share. > > First, you will note the commit message from the commit [0] when > Events were deprecated: > > " > > Deprecate event subsystem > > This subsystem has never been finished and is not maintained. > Deprecate it for future removal. > > " > > I got the impression from jd at the time that there were a number of > features in Telemetry, > > including Panko, that were not really "finished" and that the > engineers who had worked on them > > had moved on to other things, so the features had become unsupported.  > In late 2018 there was > > an effort to clean up things that were not well maintained or didn't > fit the direction of Telemetry. > > See also: > https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/ > > > Events is one feature that often gets requested, but the use cases and > demand for it are not expressed > > strongly or well understood by most people.  If the Telemetry project > has demand to de-deprecate > > Event handling (including Panko), I'd suggest a review of the > requirements for event handling and > > possibly choosing a champion for maintaining the Panko service. > > > Also note: over in Monasca we have a spec [1] for handling Events > ingestion which I hope we will be > > completing in Train.  Contributions and comments welcome. :) > > > joseph > > [0] > https://github.com/openstack/ceilometer/commit/8a0245a5b3e1357d35ad6653be37ca01176577e4 > > [1] > https://github.com/openstack/monasca-specs/blob/master/specs/stein/approved/monasca-events-listener.rst > > >> On Wed, May 8, 2019 at 12:19 AM Trinh Nguyen wrote: >> >>> Hi team, >>> >>> As planned, we will have a team meeting at 02:00 UTC, May 9th on >>> #openstack-telemetry to discuss what we gonna do for the next milestone >>> (Train-1) and continue what we left off from the last meeting. >>> >>> I put here [1] the agenda thinking that it should be fine for an hour >>> meeting. 
If you have anything to talk about, please put it there too. >>> >>> [1]https://etherpad.openstack.org/p/telemetry-meeting-agenda >>> >>> >>> Bests, >>> >>> -- >>> *Trinh Nguyen* >>> *www.edlab.xyz* >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jim at jimrollenhagen.com Fri May 10 13:07:30 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Fri, 10 May 2019 09:07:30 -0400 Subject: Change in behavior in bandit 1.6.0 In-Reply-To: References: Message-ID: On Fri, May 10, 2019 at 5:48 AM Stephen Finucane wrote: > We've noticed a spate of recent test failures within the 'pep8' jobs in > oslo recently. It seems these are because of the release of bandit > 1.6.0. The root cause is that the '-x' (exclude) option seems to have > changed behavior. Previously, to match a path like 'oslo_log/tests/*', > you could state '-x test'. This option now expects a glob patterns, > such as '-x oslo_log/tests/*'. If you use bandit, you probably have to > update your 'tox.ini' accordingly. See [1] for an example. > As a note, it looks like you have to catch every .py file in this glob, so this won't work for more complex directory layouts. e.g. in Keystone, `-x keystone/tests/*` still fails, as does `-x keystone/tests/**/*`, etc. I've just blacklisted this version there.[3] > > It's worth noting that bandit is one of the few packages we don't > manage the version for [2], so if you're not already limiting yourself > to a version, perhaps it would be a good idea to do so to avoid stable > branches breaking periodically. Also, this is something that really > shouldn't have happened in a minor version (backwards incompatible > behavior change, yo) but it has so we'll live with it. I would ask > though that the bandit maintainers, whoever ye be, be more careful > about this kind of stuff in the future. Thanks :) > FWIW, looks like this was an unintentional regression[4] that they're working on fixing[5] in 1.6.1. > Stephen > > [1] https://review.opendev.org/#/c/658249/ > [2] https://github.com/openstack/requirements/blob/master/blacklist.txt > > > // jim [3] https://review.opendev.org/#/c/658107/ [4] https://github.com/PyCQA/bandit/issues/488 [5] https://github.com/PyCQA/bandit/pull/489 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Wed May 8 21:52:05 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Wed, 08 May 2019 16:52:05 -0500 Subject: [User-committee] OpenStack User Survey 2019 In-Reply-To: References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> Message-ID: <5CD34F85.9010604@openstack.org> Carlos, Right now these questions are up as free text area. Feel free to send along adjustments if you'd like. Cheers Jimmy > Allison Price > May 8, 2019 at 4:30 PM > Hi Carlos, > > Thank you for providing these two questions. We can get them both > added, but I did have a question. Are both of these questions intended > to be open ended with a text box for respondents to fill in their > answers? Or do you want to provide answer choices? (thinking for the > first question in particular) With any multiple choice question, an > Other option can be included that will trigger a text box to be > completed. > > Thanks! 
> Allison > > > > _______________________________________________ > User-committee mailing list > User-committee at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee > Carlos Goncalves > May 8, 2019 at 12:04 PM > Hi Allison and Jimmy, > > In today's Octavia IRC meeting [1], the team agreed on the following > two questions we would like to see included in the survey: > > 1. Which OpenStack load balancing (Octavia) provider drivers would you > like to see supported? > 2. Which new features would you like to see supported in OpenStack > load balancing (Octavia)? > > Please let us know if you have any questions. > > Thanks, > Carlos > > [1] > http://eavesdrop.openstack.org/meetings/octavia/2019/octavia.2019-05-08-16.00.html > > Allison Price > May 7, 2019 at 3:50 PM > Hi Michael, > > I apologize that the Octavia project team has been unable to submit a > question to date. Jimmy posted the User Survey update to the public > mailing list to ensure we updated the entire community and that we > caught any projects that had not submitted their questions. The User > Survey is open all year, and the primary goal is passing operator > feedback to the upstream community. > > If the Octavia team - or any OpenStack project team - has a question > they would like added (limit of 2 per project), please let Jimmy or > myself know. > > Thanks for reaching out, Michael. > > Cheers, > Allison > > > Michael Johnson > May 7, 2019 at 3:39 PM > Jimmy & Allison, > > As you probably remember from previous year's surveys, the Octavia > team has been trying to get a question included in the survey for a > while. > I have included the response we got the last time we inquired about > the survey below. We never received a follow up invitation. > > I think it would be in the best interest for the community if we > follow our "Four Opens" ethos in the user survey process, specifically > the "Open Community" statement, by soliciting survey questions from > the project teams in an open forum such as the openstack-discuss > mailing list. > > Michael > > ----- Last response e-mail ------ > Jimmy McArthur > > Fri, Sep 7, 2018, 5:51 PM > to Allison, me > Hey Michael, > > The project-specific questions were added in 2017, so likely didn't > include some new projects. While we asked all projects to participate > initially, less than a dozen did. We will be sending an invitation for > new/underrepresented projects in the coming weeks. Please stand by and > know that we value your feedback and that of the community. > > Cheers! > > > Allison Price > April 27, 2019 at 7:11 PM > Hi Michael, > > We reached out to all of the PTLs who had questions in the 2018 > version of the survey to review and update their questions. If there > is a project that was missed, we can add it and share anonymized > results with the PTLs directly as well as the openstack-discsuss > mailing list. > > If there is a question from the Octavia team, please let us know and > we can add it for the 2019 survey. > > Cheers, > Allison > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 77831 bytes Desc: not available URL: From zackchen517 at gmail.com Thu May 9 01:46:33 2019 From: zackchen517 at gmail.com (zack chen) Date: Thu, 9 May 2019 09:46:33 +0800 Subject: Baremetal attach volume in Multi-tenancy In-Reply-To: References: Message-ID: Thanks! Yes, I have seen this approach. 
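For reference, a minimal sketch of that approach as I understand it; the package and subcommand names below are from memory and may differ per release, so treat them as assumptions rather than a verified recipe:

  # run inside the baremetal instance; it needs credentials for the tenant that
  # owns the volume, plus network reachability to the Cinder API and the storage network
  pip install python-cinderclient python-brick-cinderclient-ext
  source openrc
  cinder local-attach <volume-id>    # os-brick discovers and connects the volume locally
  # ... use the attached device ...
  cinder local-detach <volume-id>
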
However, the baremetal instance must be able to communicate with the OpenStack API network and the storage network if I use the iSCSI or RBD driver as the Cinder volume driver. This may have some security risks in a multi-tenant scenario. How do I ensure that the storage network between different tenants is isolated and still able to communicate with the platform's storage network?

Walter Boring wrote on Wed, May 8, 2019 at 11:28 PM: > To attach to baremetal instance, you will need to install the cinderclient > along with the python-brick-cinderclient-extension inside the instance > itself. > > > On Wed, May 8, 2019 at 11:15 AM zack chen wrote: > >> Hi, >> I am looking for a mechanism that can be used for baremetal attach >> volume in a multi-tenant scenario. In addition we use ceph as the backend >> storage for cinder. >> >> Can anybody give me some advice? >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: 

From zackchen517 at gmail.com Fri May 10 04:00:11 2019 From: zackchen517 at gmail.com (zack chen) Date: Fri, 10 May 2019 12:00:11 +0800 Subject: Baremetal attach volume in Multi-tenancy In-Reply-To: <20190509092828.g6qvdg5jbvqqvpba@localhost> References: <20190509092828.g6qvdg5jbvqqvpba@localhost> Message-ID: This is a normal Cinder in OpenStack deployment. I'm using Ceph as the Cinder backend, with the RBD driver. My idea is that the instance should communicate with the OpenStack platform storage network via the vrouter provided by Neutron, and the vrouter gateway should communicate with the OpenStack platform. Is that right?

Gorka Eguileor wrote on Thu, May 9, 2019 at 5:28 PM: > On 08/05, zack chen wrote: > > Hi, > > I am looking for a mechanism that can be used for baremetal attach volume > > in a multi-tenant scenario. In addition we use ceph as the backend > storage > > for cinder. > > > > Can anybody give me some advice? > > Hi, > > Is this a stand alone Cinder deployment or a normal Cinder in OpenStack > deployment? > > What storage backend will you be using? > > What storage protocol? iSCSI, FC, RBD...? > > Depending on these you can go with Walter's suggestion of using > cinderclient and its extension (which in general is the best way to go), > or you may prefer writing a small python script that uses OS-Brick and > makes the REST API calls directly. > > Cheers, > Gorka. > -------------- next part -------------- An HTML attachment was scrubbed... URL: 

From saurabh683 at outlook.com Fri May 10 12:17:49 2019 From: saurabh683 at outlook.com (saurabh683 at outlook.com) Date: Fri, 10 May 2019 12:17:49 +0000 Subject: Documentation update Message-ID: Hi, First of all I would like to thank you for the training labs. I installed everything successfully, but I found that the training labs README is not up to date: the username and password mentioned there did not work for me, so I checked the admin-openrc and demo-openrc files and those worked. Also, the demo project name is different in Horizon. 
https://github.com/openstack/training-labs Admin Login: * Username: admin * Password: admin_pass osbash at controller:~$ cat admin-openrc.sh export OS_USERNAME=admin export OS_PASSWORD=admin_user_secret export OS_PROJECT_NAME=admin export OS_USER_DOMAIN_NAME=Default export OS_PROJECT_DOMAIN_NAME=Default export OS_AUTH_URL=http://10.0.0.11:5000/v3 export OS_IDENTITY_API_VERSION=3 export OS_IMAGE_API_VERSION=2 osbash at controller:~$ cat demo-openrc.sh export OS_USERNAME=myuser export OS_PASSWORD=myuser_user_pass export OS_PROJECT_NAME=myproject export OS_USER_DOMAIN_NAME=default export OS_PROJECT_DOMAIN_NAME=default export OS_AUTH_URL=http://10.0.0.11:5000/v3 export OS_IDENTITY_API_VERSION=3 export OS_IMAGE_API_VERSION=2 osbash at controller:~$ -Thanks, Saurabh -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Fri May 10 13:45:13 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Fri, 10 May 2019 08:45:13 -0500 Subject: [User-committee] OpenStack User Survey 2019 In-Reply-To: References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> <5CD34F85.9010604@openstack.org> Message-ID: <5CD58069.6030700@openstack.org> Carlos, These questions are now updated and on the survey, if the user indicates they are using Octavia on their deployment. Thank you, Jimmy > Carlos Goncalves > May 9, 2019 at 3:59 PM > Thank you for the prompt replies and action, Allison and Jimmy! > > After discussing on #openstack-lbaas with the team and drafting on > https://etherpad.openstack.org/p/cItdtzi32r, we would like to suggest > presenting some multiple choices along with an "Other" free text area. > > 1. Which OpenStack load balancing (Octavia) provider drivers would you > like to see supported? > > (sorted alphabetically) > A10 Networks > AVI Networks > Amphora > Brocade > F5 > HAProxy Technologies > Kemp > Netscaler > OVN > Radware > VMware > Other (free text area) > > 2. Which new features would you like to see supported in OpenStack > load balancing (Octavia)? > > (sorted alphabetically) > Active-active > Container-based amphora driver > Event notifications > gRPC protocol > HTTP/2 protocol > Log offloading > MySQL protocol > Simultaneous IPv4 and IPv6 VIP > Statistics (more metrics) > VIP ACL API > Other (free text area) > > Thanks, > Carlos > > Jimmy McArthur > May 8, 2019 at 4:52 PM > Carlos, > > Right now these questions are up as free text area. Feel free to send > along adjustments if you'd like. > > > > Cheers > Jimmy > > Allison Price > May 8, 2019 at 4:30 PM > Hi Carlos, > > Thank you for providing these two questions. We can get them both > added, but I did have a question. Are both of these questions intended > to be open ended with a text box for respondents to fill in their > answers? Or do you want to provide answer choices? (thinking for the > first question in particular) With any multiple choice question, an > Other option can be included that will trigger a text box to be > completed. > > Thanks! > Allison > > > > Carlos Goncalves > May 8, 2019 at 12:04 PM > Hi Allison and Jimmy, > > In today's Octavia IRC meeting [1], the team agreed on the following > two questions we would like to see included in the survey: > > 1. Which OpenStack load balancing (Octavia) provider drivers would you > like to see supported? > 2. Which new features would you like to see supported in OpenStack > load balancing (Octavia)? > > Please let us know if you have any questions. 
> > Thanks, > Carlos > > [1] > http://eavesdrop.openstack.org/meetings/octavia/2019/octavia.2019-05-08-16.00.html > > Allison Price > May 7, 2019 at 3:50 PM > Hi Michael, > > I apologize that the Octavia project team has been unable to submit a > question to date. Jimmy posted the User Survey update to the public > mailing list to ensure we updated the entire community and that we > caught any projects that had not submitted their questions. The User > Survey is open all year, and the primary goal is passing operator > feedback to the upstream community. > > If the Octavia team - or any OpenStack project team - has a question > they would like added (limit of 2 per project), please let Jimmy or > myself know. > > Thanks for reaching out, Michael. > > Cheers, > Allison > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 42443 bytes Desc: not available URL: From jimmy at openstack.org Fri May 10 13:51:11 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Fri, 10 May 2019 08:51:11 -0500 Subject: [User-committee] OpenStack User Survey 2019 In-Reply-To: References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> <5CD34F85.9010604@openstack.org> Message-ID: <5CD581CF.6010306@openstack.org> Rico, This has been added to the survey. Users that select Heat as part of their deployment will see this question. The old question has been hidden, but retained since some users might have already answered. Cheers, Jimmy > Rico Lin > May 10, 2019 at 5:41 AM > (resend to user committee ML) > > Thanks, Jimmy and Allison for the effort to put this together, > > As Heat PTL > I would like to ask for question update for Heat project. > Here is our new question: > https://etherpad.openstack.org/p/heat-user-survey-brainstrom > Please let me know if any confusion > > -- > May The Force of OpenStack Be With You, > */Rico Lin > /*irc: ricolin > > > > Rico Lin > May 9, 2019 at 11:26 PM > Thanks, Jimmy and Allison for the effort to put this together, > > As Heat PTL > I would like to ask for question update for Heat project. > Here is our new question: > https://etherpad.openstack.org/p/heat-user-survey-brainstrom > Please let me know if any confusion > > > > -- > May The Force of OpenStack Be With You, > */Rico Lin > /*irc: ricolin > > > > Carlos Goncalves > May 9, 2019 at 3:59 PM > Thank you for the prompt replies and action, Allison and Jimmy! > > After discussing on #openstack-lbaas with the team and drafting on > https://etherpad.openstack.org/p/cItdtzi32r, we would like to suggest > presenting some multiple choices along with an "Other" free text area. > > 1. Which OpenStack load balancing (Octavia) provider drivers would you > like to see supported? > > (sorted alphabetically) > A10 Networks > AVI Networks > Amphora > Brocade > F5 > HAProxy Technologies > Kemp > Netscaler > OVN > Radware > VMware > Other (free text area) > > 2. Which new features would you like to see supported in OpenStack > load balancing (Octavia)? > > (sorted alphabetically) > Active-active > Container-based amphora driver > Event notifications > gRPC protocol > HTTP/2 protocol > Log offloading > MySQL protocol > Simultaneous IPv4 and IPv6 VIP > Statistics (more metrics) > VIP ACL API > Other (free text area) > > Thanks, > Carlos > > Jimmy McArthur > May 8, 2019 at 4:52 PM > Carlos, > > Right now these questions are up as free text area. 
Feel free to send > along adjustments if you'd like. > > > > Cheers > Jimmy > > Allison Price > May 8, 2019 at 4:30 PM > Hi Carlos, > > Thank you for providing these two questions. We can get them both > added, but I did have a question. Are both of these questions intended > to be open ended with a text box for respondents to fill in their > answers? Or do you want to provide answer choices? (thinking for the > first question in particular) With any multiple choice question, an > Other option can be included that will trigger a text box to be > completed. > > Thanks! > Allison > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 42443 bytes Desc: not available URL: From sfinucan at redhat.com Fri May 10 13:58:34 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 10 May 2019 14:58:34 +0100 Subject: Change in behavior in bandit 1.6.0 In-Reply-To: References: Message-ID: <8ecc8cf2f080124bfa2486ec4e106349af4c16c4.camel@redhat.com> On Fri, 2019-05-10 at 09:07 -0400, Jim Rollenhagen wrote: > On Fri, May 10, 2019 at 5:48 AM Stephen Finucane wrote: > > We've noticed a spate of recent test failures within the 'pep8' jobs in > > oslo recently. It seems these are because of the release of bandit > > 1.6.0. The root cause is that the '-x' (exclude) option seems to have > > changed behavior. Previously, to match a path like 'oslo_log/tests/*', > > you could state '-x test'. This option now expects a glob patterns, > > such as '-x oslo_log/tests/*'. If you use bandit, you probably have to > > update your 'tox.ini' accordingly. See [1] for an example. > > As a note, it looks like you have to catch every .py file in this glob, > so this won't work for more complex directory layouts. e.g. in Keystone, > `-x keystone/tests/*` still fails, as does `-x keystone/tests/**/*`, etc. > I've just blacklisted this version there.[3] > > > It's worth noting that bandit is one of the few packages we don't > > manage the version for [2], so if you're not already limiting yourself > > to a version, perhaps it would be a good idea to do so to avoid stable > > branches breaking periodically. Also, this is something that really > > shouldn't have happened in a minor version (backwards incompatible > > behavior change, yo) but it has so we'll live with it. I would ask > > though that the bandit maintainers, whoever ye be, be more careful > > about this kind of stuff in the future. Thanks :) > > FWIW, looks like this was an unintentional regression[4] that they're > working on fixing[5] in 1.6.1. Yay! Go bandit devs :) Stephen > > Stephen > > > > [1] https://review.opendev.org/#/c/658249/ > > [2] > > https://github.com/openstack/requirements/blob/master/blacklist.txt > > > > > > // jim > > [3] https://review.opendev.org/#/c/658107/ > [4] https://github.com/PyCQA/bandit/issues/488 > [5] https://github.com/PyCQA/bandit/pull/489 From mriedemos at gmail.com Fri May 10 14:21:39 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 10 May 2019 09:21:39 -0500 Subject: [watcher][qa] Thoughts on performance testing for Watcher In-Reply-To: <201905081126508513380@zte.com.cn> References: <201905081126508513380@zte.com.cn> Message-ID: On 5/7/2019 10:26 PM, li.canwei2 at zte.com.cn wrote: > Some notes: > > 1, Watcher updates its data model based nova versioned notifications, so you > >    should enable nova notification in your simulated environment. 
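For context on "enable nova notification": in practice this just means turning on nova's versioned notifications in nova.conf. The snippet below is an assumed, typical setting rather than something taken from this thread:

  [notifications]
  notification_format = versioned

  [oslo_messaging_notifications]
  driver = messagingv2
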
Sure, this shouldn't be a problem in a devstack environment. > > 2, Watcher needs node name getting from CONF.host or socket.gethostname, > >    If you have two or more controller nodes they don't have same host name. When you say "node name" do you mean the name of a nova-compute hypervisor node, i.e. what is returned from "openstack hypervisor list"? Or do you mean something specific to Watcher itself? The former is already supported in devstack using fake computes [1]. > > 3, Watcher doesn't consider nova cell, now watcher filter nodes through host > >   aggregate and zone. You can get more info by CLI cmd: watcher help > audittemplate create Yup, that's fine and is part of what I want to test. I wouldn't expect Watcher to know anything about cells anyway since there is no REST API or notification about those in nova. > > 4, Watcher needs metric data source such as Ceilometer, so your fake > nodes and VMs > >    should have metric data. Hmm, is this where the ceilometer agent pulls metric data from the hypervisor (libvirt) directly? If so, that could be a problem for me if I'm using the fake virt driver in devstack to do some scale testing since that's not a real hypervisor. There would be notifications from nova for servers getting created and deleted and such, but there wouldn't be a real hypervisor to pull data from for the ceilometer agent. Honestly I'd like to avoid using ceilometer at all if possible but I'm also unfamiliar with gnocchi and monasca, but I'll probably try to get one of those working first. Looking at the Watcher configuration I shouldn't depend on ceilometer anyway since it looks like the datasource is deprecated? [2] I am a little confused about that because the configuration docs talk about using ceilometer (maybe those are just out of date?) [3]. > > 5,  For optimizing resource utilization, I think you could use strategy [1] Thanks I'll try it. > > 6, There are two audit type:ONESHOT and CONTINUOUS in Watcher, you can get > >   more help by CLI cmd: watcher help audit create > Thanks for all of the help here. [1] https://docs.openstack.org/devstack/latest/guides/nova.html#scaling [2] https://docs.openstack.org/watcher/latest/configuration/watcher.html#ceilometer-client [3] https://docs.openstack.org/watcher/latest/configuration/configuring.html#configure-measurements -- Thanks, Matt From ianyrchoi at gmail.com Fri May 10 14:44:37 2019 From: ianyrchoi at gmail.com (Ian Y. Choi) Date: Fri, 10 May 2019 23:44:37 +0900 Subject: [docs][training-labs] Documentation update In-Reply-To: References: Message-ID: <55edf419-3614-c39f-d136-9cda8284da2f@gmail.com> Hello, Although I don't have a good answer for your post, I would like to comment a few small things on your post to openstack-discuss: - Please use [training-labs] in future, as mentioned README file https://opendev.org/openstack/training-labs/#mailing-lists-irc (+ [docs] is a good idea). - For this bug you think, a better place to post is Launchpad: https://bugs.launchpad.net/labs . My general understanding on training-labs is that training-labs contributors update scripts after there is a new release. Since Stein were released recently, I think contributors are working hard, like https://review.opendev.org/#/q/project:openstack/training-labs . Hope that my explanation helps your better understanding on OpenStack world and good points how to solve your problem. With many thanks, /Ian saurabh683 at outlook.com wrote on 5/10/2019 9:17 PM: > Hi, > > First of all I would like to thank you for training labs. 
> > I successfully installed but I found the training labs README is not > updated. > I found the username password mentioned did not work for me. so I > check the > admin-openrc and demo-openrc and it worked. > > Also the demo project name is different in Horizon. > > https://github.com/openstack/training-labs > > Admin Login: > > * Username:|admin| > * Password:|admin_pass| > > > osbash at controller:~$ cat admin-openrc.sh > > export OS_USERNAME=admin > > export OS_PASSWORD=admin_user_secret > > export OS_PROJECT_NAME=admin > > export OS_USER_DOMAIN_NAME=Default > > export OS_PROJECT_DOMAIN_NAME=Default > > export OS_AUTH_URL=http://10.0.0.11:5000/v3 > > export OS_IDENTITY_API_VERSION=3 > > export OS_IMAGE_API_VERSION=2 > > osbash at controller:~$ cat demo-openrc.sh > > export OS_USERNAME=myuser > > export OS_PASSWORD=myuser_user_pass > > export OS_PROJECT_NAME=myproject > > export OS_USER_DOMAIN_NAME=default > > export OS_PROJECT_DOMAIN_NAME=default > > export OS_AUTH_URL=http://10.0.0.11:5000/v3 > > export OS_IDENTITY_API_VERSION=3 > > export OS_IMAGE_API_VERSION=2 > > osbash at controller:~$ > > > -Thanks, > Saurabh From john at johngarbutt.com Fri May 10 15:28:37 2019 From: john at johngarbutt.com (John Garbutt) Date: Fri, 10 May 2019 16:28:37 +0100 Subject: [nova] PTG aligning of nova spec: show-server-numa-topology In-Reply-To: References: Message-ID: Hi, My main worry was to not expose host related information to end users, but noting administrators probably do what the information. Looking again at the Stein spec we merged, the proposed policy rules already take care of all that. I think the next step is to re-propose the spec for the Train release. I couldn't find it, but maybe you have done that already? Thanks, johnthetubaguy On Fri, 10 May 2019 at 08:51, yonglihe wrote: > Hi, Everyone > > I synced up with Alex about comments we got at PTG. It's a long > discussion, I might lost something. > > What i got lists below, fix me: > > * Remove sockets > * Remove thread_policy > > Not sure about following comments: > > > * Remove the cpu topology from the proposal? > * Using the cpu pinning info instead of cpu set? > > > By apply the suggestion, the API ``GET /servers/{server_id}/topology`` > response gonna to be like this, > > and let us align what it should be: > > { > # overall policy: TOPOLOGY % 'index > "nodes":[ > { > # Host Numa Node > # control by policy TOPOLOGY % 'index:host_info' > "host_numa_node": 3, > # 0:5 means vcpu 0 pinning to pcpu 5 > # control by policy TOPOLOGY % 'index:host_info' > "cpu_pinning": {0:5, 1:6}, > "vcpu_set": [0,1,2,3], > "siblings": [[0,1],[2,3]], > "memory_mb": 1024, > "pagesize_kb": 4096, > "cores": 2, > # one core has at least one thread > "threads": 2 > } > ... > ], # nodes > } > > > links: > > ptg: > https://etherpad.openstack.org/p/nova-ptg-train L334 > > spec review: > > https://review.opendev.org/#/c/612256/25/specs/stein/approved/show-server-numa-topology.rst > > code review: > https://review.openstack.org/#/c/621476/ > > bp: > https://blueprints.launchpad.net/nova/+spec/show-server-numa-topology > > > Regards > Yongli He > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From mriedemos at gmail.com Fri May 10 15:48:35 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 10 May 2019 10:48:35 -0500 Subject: [watcher] Getting infra to update launchpad bugs automatically Message-ID: I've noticed that when pushing patches to watcher, the corresponding bug in launchpad isn't updated (set to in-progress with the owner assigned). It looks like this is because the "hudson-openstack" user isn't in the watcher bug team (which I guess is the watcher-drivers team?). Would it be possible to add that? Then bugs in launchpad will get automatically updated when patches are posted to gerrit via the infra tooling. -- Thanks, Matt 
From dtroyer at gmail.com Fri May 10 16:48:21 2019 From: dtroyer at gmail.com (Dean Troyer) Date: Fri, 10 May 2019 11:48:21 -0500 Subject: [OSC][PTG] Summary: many things to do! Message-ID: OpenStackClient held a session at the Denver PTG and despite not having much planned had plenty to talk about. Some of the highlights from the etherpad[0] are: 
* Artem is working on changing the Image commands to use OpenStackSDK. This is the work described in the cycle goal proposal[1] that he is planning to do anyway. I support going ahead with this even without an SDK 1.0 release as it lets us remove glanceclient and some of its unique dependencies. 
* There was some discussion about image encryption and where the client-side bits of that may land. One option was to put it into os-brick; if that is where it winds up OSC will make that an optional dependency due to the number of other dependencies that it would introduce. (i.e., OSC currently uses very little of oslo, some of which brings in a number of things not otherwise needed client-side). 
* Doug brought up the problems with load times due to scanning the entire import path on every invocation. I found in my notes almost exactly 2 years ago where we discussed this same topic. As we did then, the idea of skipping entry points entirely for commands in the OSC repo is the best solution we have found. This would help some common cases but still leave all plugins with slow load times. 
* Nate Johnston asked about supporting bulk create APIs, such as Neutron's bulk port create. After kicking around a couple of options the rough consensus is around using a YAML file (or JSON or both?) to define the resources to be created and giving it to a new top-level 'create' command (yes, verb only, the resource names will be in the YAML file). APIs that allow bulk creates will get a single call with the entire list, for other APIs we can loop and feed them one at a time. This would be very similar to using interactive mode and feeding in a list of commands, stopping at the first failure. Note that delete commands already take multiple resource identifiers, adding the ability to source that list from YAML would be an easy addition. 
* OSC4 has been waiting in a feature branch for over 2 years (where has that PTL been???). I recently tried to merge master in to see how far off it was; it was enough that I think we should just cherry-pick the commits in that branch to master and move forward. So the current plan is to: 
* do one more release in the 3.x series to clean up outstanding things 
* switch to OSC4 development, cherry pick in amotoki's existing commits[2] (mostly changes to output formatting) 
* refresh and merge other reviews in the osc4 topic 
* remove all of the backward-compatibility code in the OSC authentication process so OSC will now work like all other pure keystoneauth- and sdk-using code. 
Also relevant to OSC but covered in a Nova Forum session[3,4], highlights: 
* boot-from-volume: Support type=image for the --block-device-mapping, 
Also relevant to OSC but covered in a Nova Forum session[3,4], highlights: * boot-from-volume: Support type=image for the --block-device-mapping, and Add a --boot-from-volume option which will translate to a root --block-device-mapping using the provided --image value * server migrate --live: deprecate the --live option and add a new --live-migration option and a --host option * compute migration: begin exposing this resource in the CLI dt [0] https://etherpad.openstack.org/p/train-ptg-osc [1] https://review.opendev.org/#/c/639376/ [2] this series starts at https://review.opendev.org/#/c/657907/ [3] https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005783.html -- Dean Troyer dtroyer at gmail.com From cdent+os at anticdent.org Fri May 10 16:53:30 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 10 May 2019 09:53:30 -0700 (PDT) Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: References: Message-ID: On Fri, 10 May 2019, Dean Troyer wrote: > * Nate Johnston asked about supporting bulk create APIs, such as > Neutron's bulk port create. After kicking around a couple of options > the rough consensus is around using a YAML file (or JSON or both?) to > define the resources to be created and giving it to a new top-level > 'create' command (yes, verb only, the resource names will be in the > YAML file). +∞ There are presumably some error handling issues with that, but overall something like this would be very useful. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From dtroyer at gmail.com Fri May 10 17:17:35 2019 From: dtroyer at gmail.com (Dean Troyer) Date: Fri, 10 May 2019 12:17:35 -0500 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: References: Message-ID: On Fri, May 10, 2019 at 11:55 AM Chris Dent wrote: > > * Nate Johnston asked about supporting bulk create APIs, such as > > Neutron's bulk port create. After kicking around a couple of options > > the rough consensus is around using a YAML file (or JSON or both?) to > > define the resources to be created and giving it to a new top-level > > 'create' command (yes, verb only, the resource names will be in the > > YAML file). > > +∞ > > There are presumably some error handling issues with that, but > overall something like this would be very useful. I should admit that during this conversation I was well down the path of re-creating Ansible before I realized it... that is part of the reason we want to start with a simplistic equivalent to shell's 'set -e' and let the user sort it out. Note that the delete command pattern is different in that it attempts them all once and returns a list of failures if any. I am certainly open to discussion on the preferred way to address errors without inventing complicated retry mechanisms; my feeling is if you need those use Ansible and the SDK directly. dt -- Dean Troyer dtroyer at gmail.com From colleen at gazlene.net Fri May 10 17:48:26 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 10 May 2019 13:48:26 -0400 Subject: [PTL][SIG][WG] PTG Team Photos In-Reply-To: References: Message-ID: <981673d8-b095-4c30-a651-577d1c5286d3@www.fastmail.com> On Thu, Mar 28, 2019, at 17:03, Kendall Nelson wrote: > Hello! > > If your team is attending the PTG and is interested in having a team > photo taken, here is the signup[1]! There are slots Thursday and Friday > from 10:00 AM to 4:30 PM. > > The location is TBD but will likely be close to where registration will > be. 
I'll send an email out the day before with a reminder of your time > slot and an exact location. > > -Kendall (diablo_rojo) > > [1]https://docs.google.com/spreadsheets/d/1DgsRHVWW2YLv7ewfX0M21zWJRf4wUfPG4ff2V5XtaMg/edit?usp=sharing > Are the photos available somewhere now? I'm wondering if I missed an email. Colleen From opensrloo at gmail.com Fri May 10 17:52:36 2019 From: opensrloo at gmail.com (Ruby Loo) Date: Fri, 10 May 2019 13:52:36 -0400 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: References: Message-ID: On Sat, May 4, 2019 at 6:48 PM Eric Fried wrote: > (NB: I tagged [all] because it would be interesting to know where other > teams stand on this issue.) > > Etherpad: https://etherpad.openstack.org/p/nova-ptg-train-governance > > Summary: > - There is a (currently unwritten? at least for Nova) rule that a patch > should not be approved exclusively by cores from the same company. This > is rife with nuance, including but not limited to: > - Usually (but not always) relevant when the patch was proposed by > member of same company > - N/A for trivial things like typo fixes > - The issue is: > - Should the rule be abolished? and/or > - Should the rule be written down? > > Consensus (not unanimous): > - The rule should not be abolished. There are cases where both the > impetus and the subject matter expertise for a patch all reside within > one company. In such cases, at least one core from another company > should still be engaged and provide a "procedural +2" - much like cores > proxy SME +1s when there's no core with deep expertise. > - If there is reasonable justification for bending the rules (e.g. typo > fixes as noted above, some piece of work clearly not related to the > company's interest, unwedging the gate, etc.) said justification should > be clearly documented in review commentary. > - The rule should not be documented (this email notwithstanding). This > would either encourage loopholing or turn into a huge detailed legal > tome that nobody will read. It would also *require* enforcement, which > is difficult and awkward. Overall, we should be able to trust cores to > act in good faith and in the appropriate spirit. > > efried > . > In ironic-land, we documented this [1] many moons ago. Whether that is considered a rule or a guideline, I don't know, but we haven't been sued yet and I don't recall any heated arguments/incidents about it. :) --ruby [1] https://wiki.openstack.org/wiki/Ironic/CoreTeam#Other_notes -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Fri May 10 17:57:52 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Fri, 10 May 2019 12:57:52 -0500 Subject: [PTL][SIG][WG] PTG Team Photos In-Reply-To: <981673d8-b095-4c30-a651-577d1c5286d3@www.fastmail.com> References: <981673d8-b095-4c30-a651-577d1c5286d3@www.fastmail.com> Message-ID: <30c8cbb5-b11b-be98-339d-ef6c5e35305b@gmail.com> Colleen, I haven't seen them made available anywhere yet so I don't think you missed an e-mail. Jay On 5/10/2019 12:48 PM, Colleen Murphy wrote: > On Thu, Mar 28, 2019, at 17:03, Kendall Nelson wrote: >> Hello! >> >> If your team is attending the PTG and is interested in having a team >> photo taken, here is the signup[1]! There are slots Thursday and Friday >> from 10:00 AM to 4:30 PM. >> >> The location is TBD but will likely be close to where registration will >> be. I'll send an email out the day before with a reminder of your time >> slot and an exact location. 
>> >> -Kendall (diablo_rojo) >> >> [1]https://docs.google.com/spreadsheets/d/1DgsRHVWW2YLv7ewfX0M21zWJRf4wUfPG4ff2V5XtaMg/edit?usp=sharing >> > Are the photos available somewhere now? I'm wondering if I missed an email. > > Colleen > From mordred at inaugust.com Fri May 10 20:42:41 2019 From: mordred at inaugust.com (Monty Taylor) Date: Fri, 10 May 2019 20:42:41 +0000 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: References: Message-ID: <2de134d3-629c-5396-4b1d-0c1dd0c42065@inaugust.com> On 5/10/19 4:48 PM, Dean Troyer wrote: > OpenStackClient held a session at the Denver PTG and despite not > having much planned had plenty to talk about. Some of the highlights > from the etherpad[0] are: Well, poo. Sorry I missed it. > * Aretm is working on changing the Image commands to use OpenStackSDK. > This is the work described in the cycle goal proposal[1] that he is > planning to do anyway. I support going ahead with this even without > an SDK 1.0 release as it lets us remove glanceclient and some of its > unique dependencies. > > * There was some discussion about image encryption and where the > client-side bits of that may land. One option was to put it into > os-brick; if that is where it winds up OSC will make that an optional > dependency de to the number of other dependencies that will introduce. > (ie, OSC currently uses very little of oslo, some of which brings in a > number of things not otherwise needed client-side). We landed support for image signing in SDK: https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/image/image_signer.py so I don't think it would be an issue to also land support for encryption. I'd like to have support for signing so that people using SDK could do signing. However, like OSC, SDK doesn't use oslo deps, largely for the reason you mention - they're written for server side and vastly complicate the client-side story. SDK will not grow a dependency on os-brick, even an optional one. How about if we put the encryption routines in SDK, then expose them to the server-side components via SDK? > * Doug brought up the problems with load times due to scanning the > entire import path on every invocation. I found in my notes almost > exactly 2 years ago where we discussed this same topic. AS we did > then, the idea of skipping entry points entirely for commands in the > OSC repo is the best solution we have found. This would help some > common cases but still leave all plugins with slow load times. > > * Nate Johnston asked about supporting bulk create APIs, such as > Neutron's bulk port create. After kicking around a couple of options > the rough consensus is around using a YAML file (or JSON or both?) to > define the resources to be created and giving it to a new top-level > 'create' command (yes, verb only, the resource names will be in the > YAML file). APIs that allow bulk creates will get a single call with > the entire list, for other APIs we can loop and feed them one at a > time. This would be very similar to using interactive mode and > feeding in a list of commands, stopping at the first failure. Note > that delete commands already take multiple resource identifiers, > adding the ability to source that list from YAML would be an easy > addition. > > * OSC4 has been waiting in a feature branch for over 2 years (where > has that PTL been???). I recently tried to merge master in to see how > far off it was, it was enough that I think we should just cherry-pick > the commites in that branch to master and move forward. 
So the > current plan is to: > * do one more release in the 3.x series to clean up outstanding things > * switch to OSC4 development, cherry pick in amotoki's existing > commits[2] (mostly changes to output formatting) > * refresh and merge other reviews in the osc4 topic > * remove all of the backward-compatibility code in the OSC > authentication process so OSC will now work like all other pure > keystoneauth- and sdk-using code. > > > Also relevant to OSC but covered in a Nova Forum session[3,4], highlights: > > * boot-from-volume: Support type=image for the --block-device-mapping, > and Add a --boot-from-volume option which will translate to a root > --block-device-mapping using the provided --image value ++ FWIW - https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/cloud/_compute.py#L736 is where we define the similar parameters for server create in SDK. As we're making the new OSC params, if we can align naming and usage, that would likely lead to future joy - unless we can't, in which case it's fine. > * server migrate --live: deprecate the --live option and add a new > --live-migration option and a --host option > > * compute migration: begin exposing this resource in the CLI > > dt > > [0] https://etherpad.openstack.org/p/train-ptg-osc > [1] https://review.opendev.org/#/c/639376/ > [2] this series starts at https://review.opendev.org/#/c/657907/ > [3] https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps > [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005783.html > > From arbermejo0417 at gmail.com Fri May 10 20:59:18 2019 From: arbermejo0417 at gmail.com (Alejandro Ruiz Bermejo) Date: Fri, 10 May 2019 16:59:18 -0400 Subject: [Horizon] Openstack Dasboard Frozen at login screen Message-ID: Hi I'm having problems with Horizon (OpenStack Queens running in Ubuntu 18.04.1 LTS), I've installed it following the recomendations at the oficial guides and i can't login into the dasboard, i'm stuck at the login screen. I checked the content of */var/log/apache2/error.log* and this is the output: [Fri May 10 20:47:33.347995 2019] [wsgi:error] [pid 24256:tid 140032273983232] [remote 10.8.2.116:36002] INFO openstack_auth.plugin.base Attempted scope to domain Default failed, will attemptto scope to another domain.[Fri May 10 20:47:36.008849 2019] [wsgi:error] [pid 24256:tid 140032273983232] [remote 10.8.2.116:36002] INFO openstack_auth.forms Login successful for user "admin" using domain "Default", remote address 10.8.2.116.[Fri May 10 20:47:54.033509 2019] [wsgi:error] [pid 24255:tid 140032232019712] [remote 10.8.2.116:36008] INFO openstack_auth.plugin.base Attempted scope to domain Default failed, will attemptto scope to another domain.[Fri May 10 20:47:54.383471 2019] [wsgi:error] [pid 24255:tid 140032232019712] [remote 10.8.2.116:36008] INFO openstack_auth.forms Login successful for user "admin" using domain "Default", remote address 10.8.2.116. According to this i should be already logged into the admin dashboard but i'm still in the login screen Does anyone ever experienced this error, or knows what solution can be done??? -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Fri May 10 21:17:09 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 10 May 2019 16:17:09 -0500 Subject: [OSC][PTG] Summary: many things to do! 
In-Reply-To: References: Message-ID: <20190510211708.GA27938@sm-workstation> On Fri, May 10, 2019 at 11:48:21AM -0500, Dean Troyer wrote: > OpenStackClient held a session at the Denver PTG and despite not > having much planned had plenty to talk about. Some of the highlights > from the etherpad[0] are: > > > > > * OSC4 has been waiting in a feature branch for over 2 years (where > has that PTL been???). I recently tried to merge master in to see how > far off it was, it was enough that I think we should just cherry-pick > the commites in that branch to master and move forward. So the > current plan is to: > * do one more release in the 3.x series to clean up outstanding things > * switch to OSC4 development, cherry pick in amotoki's existing > commits[2] (mostly changes to output formatting) I also have a few holding out there for the major release: https://review.opendev.org/#/q/status:open+project:openstack/python-openstackclient+branch:master+topic:osc4 I'll try to get to that one in merge conflict and I'll try to watch for other conflicts as we merge these different major changes in. (sorry, didn't realize there was a feature branch for this work. Though that looks like it was OK given the current state. ;) ) From flux.adam at gmail.com Fri May 10 21:40:55 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Fri, 10 May 2019 14:40:55 -0700 Subject: [octavia][lbaas][neutron-lbaas] Octavia v1 API and neutron-lbaas retirement Message-ID: Hi Octavia folks, This email is to announce that we are currently in the process of retiring both neutron-lbaas and the Octavia v1 API. We said this would happen in September 2019 or the OpenStack "U" release, whichever came first. Because September of 2019 occurs during the Train release cycle, there will be no Train release of neutron-lbaas. We will therefore be retiring the neutron-lbaas repository and the Octavia v1 API effective immediately, as there is no reason to continue to maintain them if no release is forthcoming. Reminder: The Octavia v1 API was used by the neutron-lbaas Octavia provider driver and was not an end-user API. The Octavia v1 API is a proprietary API and is not compatible with the LBaaS v2 API specification or any other LBaaS API specification. If you have questions about this deprecation cycle please see the FAQ wiki: https://wiki.openstack.org/wiki/Neutron/LBaaS/Deprecation If you have questions or concerns not covered by the FAQ, please reach out to the Octavia team either by using the "[octavia]" subject prefix on this mailing list or in our IRC channel #openstack-lbaas. We look forward to working with you moving forward with Octavia LBaaS! --Adam -------------- next part -------------- An HTML attachment was scrubbed... URL: From colleen at gazlene.net Fri May 10 22:13:28 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 10 May 2019 18:13:28 -0400 Subject: [dev][keystone] Keystone Team Update - Week of 6 May 2019 Message-ID: <43255a69-497c-42f0-a2f1-599f91525297@www.fastmail.com> # Keystone Team Update - Week of 6 May 2019 ## News This was our first week back after the Forum and PTG in Denver and was a slow week for keystone since most of us were recovering. I wrote a recap of the marathon week[1]. [1] http://www.gazlene.net/denver-forum-ptg-2019.html ## Open Specs Train specs: https://bit.ly/2uZ2tRl Ongoing specs: https://bit.ly/2OyDLTh ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 5 changes this week. 
## Changes that need Attention Search query: https://bit.ly/2tymTje There are 40 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ## Bugs This week we opened 3 new bugs and closed 6. Many of these we were able to close together during the spec backlog and RFE review on the last day of the PTG last week. Bugs opened (3) Bug #1828126 (keystone:Undecided) opened by Dmitrii Shcherbakov https://bugs.launchpad.net/keystone/+bug/1828126 Bug #1828565 (keystone:Undecided) opened by Jose Castro Leon https://bugs.launchpad.net/keystone/+bug/1828565 Bug #1827761 (oslo.policy:Undecided) opened by jacky06 https://bugs.launchpad.net/oslo.policy/+bug/1827761 Bugs closed (5) Bug #1815972 (keystone:Wishlist) https://bugs.launchpad.net/keystone/+bug/1815972 Bug #1816163 (keystone:Wishlist) https://bugs.launchpad.net/keystone/+bug/1816163 Bug #1816164 (keystone:Wishlist) https://bugs.launchpad.net/keystone/+bug/1816164 Bug #1816167 (keystone:Wishlist) https://bugs.launchpad.net/keystone/+bug/1816167 Bug #1824239 (keystone:Undecided) https://bugs.launchpad.net/keystone/+bug/1824239 Bugs fixed (1) Bug #1816112 (keystone:Wishlist) fixed by no one https://bugs.launchpad.net/keystone/+bug/1816112 ## Milestone Outlook https://releases.openstack.org/train/schedule.html Spec proposal freeze is in 4 weeks. As discussed at the PTG, we will be very firm about this deadline for Train work. ## Shout-outs Many thanks to everyone - cores, new contributors, casual contributors, members of other project teams, users and operators - who participated in Forum and PTG sessions and provided valuable input. ## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter From sundar.nadathur at intel.com Fri May 10 23:55:35 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Fri, 10 May 2019 23:55:35 +0000 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: <2de134d3-629c-5396-4b1d-0c1dd0c42065@inaugust.com> References: <2de134d3-629c-5396-4b1d-0c1dd0c42065@inaugust.com> Message-ID: <1CC272501B5BC543A05DB90AA509DED52755A484@fmsmsx122.amr.corp.intel.com> > -----Original Message----- > From: Monty Taylor > Sent: Friday, May 10, 2019 1:43 PM > To: openstack-discuss at lists.openstack.org > Subject: Re: [OSC][PTG] Summary: many things to do! > > On 5/10/19 4:48 PM, Dean Troyer wrote: > > OpenStackClient held a session at the Denver PTG and despite not > > having much planned had plenty to talk about. Some of the highlights > > from the etherpad[0] are: > > Well, poo. Sorry I missed it. > > > * Aretm is working on changing the Image commands to use OpenStackSDK. > > This is the work described in the cycle goal proposal[1] that he is > > planning to do anyway. I support going ahead with this even without > > an SDK 1.0 release as it lets us remove glanceclient and some of its > > unique dependencies. > > > > * There was some discussion about image encryption and where the > > client-side bits of that may land. One option was to put it into > > os-brick; if that is where it winds up OSC will make that an optional > > dependency de to the number of other dependencies that will introduce. > > (ie, OSC currently uses very little of oslo, some of which brings in a > > number of things not otherwise needed client-side). 
> > We landed support for image signing in SDK: > > https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/i > mage/image_signer.py The old python-glanceclient has an option to download the image to a local file. I missed that when using the clouds.yaml-based approach that directly accesses the Glance API. Hope we can add that option to the openstacksdk-based client. Just my 2 cents. Regards, Sundar > > dt > > > > [0] https://etherpad.openstack.org/p/train-ptg-osc > > [1] https://review.opendev.org/#/c/639376/ > > [2] this series starts at https://review.opendev.org/#/c/657907/ > > [3] https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps > > [4] > > http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005783 > > .html > > > > From dangtrinhnt at gmail.com Sat May 11 01:42:19 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Sat, 11 May 2019 10:42:19 +0900 Subject: [tc][searchlight] What does Maintenance Mode mean for a project? In-Reply-To: <155f4110-df20-3b23-8c68-700e9c3d66f0@openstack.org> References: <155f4110-df20-3b23-8c68-700e9c3d66f0@openstack.org> Message-ID: Thank Thierry for the information. On Thu, May 9, 2019 at 9:52 PM Thierry Carrez wrote: > Trinh Nguyen wrote: > > Currently, in the project details section of Searchlight page [1], it > > says we're in the Maintenance Mode. What does that mean? and how we can > > update it? > > Maintenance mode is a project-team tag that teams can choose to apply to > themselves. It is documented at: > > > https://governance.openstack.org/tc/reference/tags/status_maintenance-mode.html > > If you feel like Searchlight is back to a feature development phase, you > can ask for it to be changed by proposing a change to > > > https://opendev.org/openstack/governance/src/branch/master/reference/projects.yaml#L3407 > > -- > Thierry Carrez (ttx) > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Sat May 11 07:05:05 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Sat, 11 May 2019 09:05:05 +0200 Subject: [neutron][qa] Grenade jobs in check queue Message-ID: Hi, In Neutron team we are thinking about limiting a bit number of CI jobs which we are running. Currently we have e.g. 4 different grenade jobs: neutron-grenade grenade-py3 neutron-grenade-multinode neutron-grenade-dvr-multinode And jobs grenade-py3 and neutron-grenade-multinode are almost the same (same python version, same L2 agent, same legacy L3 agent, same fw driver). Only difference between those 2 jobs is that one of them is single and one is multinode (2) job. So I thought that maybe we can use only one of them and I wanted to remove grenade-py3 and left only multinode job in check queue. But grenade-py3 comes from "integrated-gate-py3” template so we probably shouldn’t remove this one. Can we run only grenade-py3 and drop neutron-grenade-multinode? Or maybe we can change grenade-py3 to be multinode job and then drop neutron-grenade-multinode? — Slawek Kaplonski Senior software engineer Red Hat From aj at suse.com Sat May 11 10:26:44 2019 From: aj at suse.com (Andreas Jaeger) Date: Sat, 11 May 2019 10:26:44 +0000 Subject: [octavia][lbaas][neutron-lbaas] Octavia v1 API and neutron-lbaas retirement In-Reply-To: References: Message-ID: <6ea9495f-f843-93bb-fc49-1e5448dc79d4@suse.com> Are you going to retire the complete repo with all branches including the maintained ones? 
That means no bug fixes can merge for these anymore, Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From flux.adam at gmail.com Sat May 11 11:56:12 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Sat, 11 May 2019 04:56:12 -0700 Subject: [octavia][lbaas][neutron-lbaas] Octavia v1 API and neutron-lbaas retirement In-Reply-To: <6ea9495f-f843-93bb-fc49-1e5448dc79d4@suse.com> References: <6ea9495f-f843-93bb-fc49-1e5448dc79d4@suse.com> Message-ID: The intent is for stable branches to remain. If there's something special I need to do in the retirement prices to guarantee this works, I'd appreciate a pointer in the right direction. So far the myriad of interdependent patches required across all the different repos involved has been a bit confusing, so all help is welcome. --Adam On Sat, May 11, 2019, 03:51 Andreas Jaeger wrote: > Are you going to retire the complete repo with all branches including > the maintained ones? That means no bug fixes can merge for these anymore, > > Andreas > -- > Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany > > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah > HRB 21284 (AG Nürnberg) > GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongbin034 at gmail.com Sat May 11 15:43:38 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Sat, 11 May 2019 11:43:38 -0400 Subject: [devstack-plugin-container][zun][kuryr] Extend core team for devstack-plugin-container In-Reply-To: References: Message-ID: I didn't receive negative feedback since I published this proposal. I go ahead and include Zun and Kuryr core team into the plugin. Best regards, Hongbin On Tue, May 7, 2019 at 8:14 AM Hongbin Lu wrote: > Hi all, > > I propose to add Zun and Kuryr core team into devstack-plugin-container. > Right now, both Zun and Kuryr are using that plugin and extending the core > team would help accelerating the code review process. > > Please let me know if there is any concern of the proposal. > > Best regards, > Hongbin > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Sat May 11 23:57:07 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Sat, 11 May 2019 19:57:07 -0400 Subject: Any ideas on fixing bug 1827083 so we can merge code? In-Reply-To: <20190509135517.7j7ccyyxzp2yneun@yuggoth.org> References: <20190509135517.7j7ccyyxzp2yneun@yuggoth.org> Message-ID: is it possible that this is because `mirror.sjc1.vexxhost.openstack.org` does not actually have AAAA records that causes this? On Thu, May 9, 2019 at 9:58 AM Jeremy Stanley wrote: > > On 2019-05-09 08:49:35 -0500 (-0500), Eric Fried wrote: > > Have we tried changing the URI to > > https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt > > to avoid the redirecting? > > > > On 5/9/19 8:02 AM, Matt Riedemann wrote: > > > I'm not sure what is causing the bug [1] but it's failing at a really > > > high rate for about week now. Do we have ideas on the issue? Do we have > > > thoughts on a workaround? Or should we disable the vexxhost-sjc1 > > > provider until it's solved? 
> > > > > > [1] http://status.openstack.org/elastic-recheck/#1827083 > > I have to assume the bug report itself is misleading. Jobs should be > using the on-disk copy of the requirements repository provided by > Zuul for this and not retrieving that file over the network. However > the problem is presumably DNS resolution not working at all on those > nodes, so something is going to break at some point in the job in > those cases regardless. > -- > Jeremy Stanley -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From aj at suse.com Sun May 12 08:19:02 2019 From: aj at suse.com (Andreas Jaeger) Date: Sun, 12 May 2019 08:19:02 +0000 Subject: [octavia][lbaas][neutron-lbaas] Octavia v1 API and neutron-lbaas retirement In-Reply-To: References: <6ea9495f-f843-93bb-fc49-1e5448dc79d4@suse.com> Message-ID: On 11/05/2019 13.56, Adam Harwell wrote: > The intent is for stable branches to remain. If there's something > special I need to do in the retirement prices to guarantee this works, > I'd appreciate a pointer in the right direction. So far the myriad of > interdependent patches required across all the different repos involved > has been a bit confusing, so all help is welcome. Ah, that is more complicated ;) I'll review accordingly and give advise on the changes. Basically, jobs running master can remove lbaas now (and some must since an enable_plugin neutron-lbaas will fail (I assume)) but jobs running on stable should not change, Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From sneha.rai at hpe.com Fri May 10 16:51:07 2019 From: sneha.rai at hpe.com (RAI, SNEHA) Date: Fri, 10 May 2019 16:51:07 +0000 Subject: Help needed to Support Multi-attach feature In-Reply-To: <20190510092600.r27zetl5e3k5ow5v@localhost> References: <20190510092600.r27zetl5e3k5ow5v@localhost> Message-ID: Thanks Gorka for your response. I have changed the version of libvirt and qemu on my host and I am able to move past the previous error mentioned in my last email. Current versions of libvirt and qemu: root at CSSOSBE04-B09:/etc# libvirtd --version libvirtd (libvirt) 1.3.1 root at CSSOSBE04-B09:/etc# kvm --version QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.36), Copyright (c) 2003-2008 Fabrice Bellard Also, I made a change in /etc/nova/nova.conf and set virt_type=qemu. Earlier it was set to kvm. I restarted all nova services post the changes but I can see one nova service was disabled and state was down. 
root at CSSOSBE04-B09:/etc# nova service-list +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ | Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | Forced down | +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ | 1ebcd1f6-b7dc-40ce-8d7b-95d60503c0ff | nova-scheduler | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:59.000000 | - | False | | ed82277c-d2e0-4a1a-adf6-9bcdcc50ba29 | nova-consoleauth | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:49.000000 | - | False | | bc2b6703-7a1e-4f07-96b9-35cbb14398d5 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:59.000000 | - | False | | 72ecbc1d-1b47-4f55-a18d-de2fbf1771e9 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:54.000000 | - | False | | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | nova-compute | CSSOSBE04-B09 | nova | disabled | down | 2019-05-07T22:11:06.000000 | AUTO: Connection to libvirt lost: 1 | False | +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ So, I manually enabled the service, but the state was still down. root at CSSOSBE04-B09:/etc# nova service-enable 9c700ee1-1694-479b-afc0-1fd37c1a5561 +--------------------------------------+---------------+--------------+---------+ | ID | Host | Binary | Status | +--------------------------------------+---------------+--------------+---------+ | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | CSSOSBE04-B09 | nova-compute | enabled | +--------------------------------------+---------------+--------------+---------+ root at CSSOSBE04-B09:/etc# nova service-list +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ | Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | Forced down | +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ | 1ebcd1f6-b7dc-40ce-8d7b-95d60503c0ff | nova-scheduler | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | | ed82277c-d2e0-4a1a-adf6-9bcdcc50ba29 | nova-consoleauth | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | | bc2b6703-7a1e-4f07-96b9-35cbb14398d5 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | | 72ecbc1d-1b47-4f55-a18d-de2fbf1771e9 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:14.000000 | - | False | | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | nova-compute | CSSOSBE04-B09 | nova | enabled | down | 2019-05-10T05:49:14.000000 | - | False | +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ So, now when I try to attach a volume to nova instance, I get the below error. As one of the service is down it fails in filter validation for nova-compute and gives us "No host" error. 
May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFilter RetryFilter returned 1 host(s)#033[00m #033[00;33m{{(pid=21775) get_filtered_objects /opt/stack/nova/nova/filters.py:104}}#033[00m May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFilter AvailabilityZoneFilter returned 1 host(s)#033[00m #033[00;33m{{(pid=21775) get_filtered_objects /opt/stack/nova/nova/filters.py:104}}#033[00m May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.scheduler.filters.compute_filter [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32m(CSSOSBE04-B09, CSSOSBE04-B09) ram: 30810MB disk: 1737728MB io_ops: 0 instances: 1 is disabled, reason: AUTO: Connection to libvirt lost: 1#033[00m #033[00;33m{{(pid=21775) host_passes /opt/stack/nova/nova/scheduler/filters/compute_filter.py:42}}#033[00m May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;36mINFO nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;36m] #033[01;35m#033[00;36mFilter ComputeFilter returned 0 hosts#033[00m May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFiltering removed all hosts for the request with instance ID '1735ece5-d187-454a-aab1-12650646a2ec'. Filter results: [('RetryFilter', [(u'CSSOSBE04-B09', u'CSSOSBE04-B09')]), ('AvailabilityZoneFilter', [(u'CSSOSBE04-B09', u'CSSOSBE04-B09')]), ('ComputeFilter', None)]#033[00m #033[00;33m{{(pid=21775) get_filtered_objects /opt/stack/nova/nova/filters.py:129}}#033[00m May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;36mINFO nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;36m] #033[01;35m#033[00;36mFiltering removed all hosts for the request with instance ID '1735ece5-d187-454a-aab1-12650646a2ec'. Filter results: ['RetryFilter: (start: 1, end: 1)', 'AvailabilityZoneFilter: (start: 1, end: 1)', 'ComputeFilter: (start: 1, end: 0)']#033[00m May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.scheduler.filter_scheduler [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFiltered []#033[00m #033[00;33m{{(pid=21775) _get_sorted_hosts /opt/stack/nova/nova/scheduler/filter_scheduler.py:404}}#033[00m May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.scheduler.filter_scheduler [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mThere are 0 hosts available but 1 instances requested to build.#033[00m #033[00;33m{{(pid=21775) _ensure_sufficient_hosts /opt/stack/nova/nova/scheduler/filter_scheduler.py:279}}#033[00m May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: #033[01;31mERROR nova.conductor.manager [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[01;31m] #033[01;35m#033[01;31mFailed to schedule instances#033[00m: NoValidHost_Remote: No valid host was found. There are not enough hosts available. 
May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: Traceback (most recent call last): May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 226, in inner May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: return func(*args, **kwargs) May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/manager.py", line 154, in select_destinations May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: allocation_request_version, return_alternates) May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 91, in select_destinations May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: allocation_request_version, return_alternates) May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 244, in _schedule May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: claimed_instance_uuids) May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 281, in _ensure_sufficient_hosts May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: raise exception.NoValidHost(reason=reason) May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: NoValidHost: No valid host was found. There are not enough hosts available. Need help in understanding on how to fix this error. For detailed logs, please refer the attached syslog. Thanks & Regards, Sneha Rai -----Original Message----- From: Gorka Eguileor [mailto:geguileo at redhat.com] Sent: Friday, May 10, 2019 2:56 PM To: RAI, SNEHA Cc: openstack-dev at lists.openstack.org Subject: Re: Help needed to Support Multi-attach feature On 02/05, RAI, SNEHA wrote: > Hi Team, > > I am currently working on multiattach feature for HPE 3PAR cinder driver. > > For this, while setting up devstack(on stable/queens) I made below > change in the local.conf [[local|localrc]] > ENABLE_VOLUME_MULTIATTACH=True ENABLE_UBUNTU_CLOUD_ARCHIVE=False > > /etc/cinder/cinder.conf: > [3pariscsi_1] > hpe3par_api_url = > https://urldefense.proofpoint.com/v2/url?u=https-3A__192.168.1.7-3A808 > 0_api_v1&d=DwIBAg&c=C5b8zRQO1miGmBeVZ2LFWg&r=8drU3i56Z5sQ_Ltpya89LTNn3 > xDSwtigjYbGrSY1lM8&m=zTRvI4nj8MoP0_z5MmxTYwKiNNW6addwP4L5VFG4wkg&s=a2D > HbzzRtbbBPz0_kfodZv5X1HxbN_hFxte5rEZabAg&e= > hpe3par_username = user > hpe3par_password = password > san_ip = 192.168.1.7 > san_login = user > san_password = password > volume_backend_name = 3pariscsi_1 > hpe3par_cpg = my_cpg > hpe3par_iscsi_ips = 192.168.11.2,192.168.11.3 volume_driver = > cinder.volume.drivers.hpe.hpe_3par_iscsi.HPE3PARISCSIDriver > hpe3par_iscsi_chap_enabled = True > hpe3par_debug = True > image_volume_cache_enabled = True > > /etc/cinder/policy.json: > 'volume:multiattach': 'rule:admin_or_owner' > > Added https://urldefense.proofpoint.com/v2/url?u=https-3A__review.opendev.org_-23_c_560067_2_cinder_volume_drivers_hpe_hpe-5F3par-5Fcommon.py&d=DwIBAg&c=C5b8zRQO1miGmBeVZ2LFWg&r=8drU3i56Z5sQ_Ltpya89LTNn3xDSwtigjYbGrSY1lM8&m=zTRvI4nj8MoP0_z5MmxTYwKiNNW6addwP4L5VFG4wkg&s=U8n1fpI-4OVYOSjST8IL0x0BRUhTLyumOpRZMJ_sVOI&e= change in the code. 
> > But I am getting below error in the nova log: > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [None req-2cda6e90-fd45-4bfe-960a-7fca9ba4abab demo admin] [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Instance failed block device setup: MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Traceback (most recent call last): > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/compute/manager.py", line 1615, in _prep_block_device > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] wait_func=self._await_block_device_map_created) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 840, in attach_block_devices > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] _log_and_attach(device) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 837, in _log_and_attach > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] bdm.attach(*attach_args, **attach_kwargs) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 46, in wrapped > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] ret_val = method(obj, context, *args, **kwargs) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 620, in attach > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] virt_driver, do_driver_attach) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] return f(*args, **kwargs) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 617, in _do_locked_attach > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] self._do_attach(*args, **_kwargs) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 602, in _do_attach > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] do_driver_attach) 
> Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 509, in _volume_attach > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] volume_id=volume_id) > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR > nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] > > > Apr 29 05:41:20 CSSOSBE04-B09 nova-compute[20455]: DEBUG > nova.virt.libvirt.driver [-] Volume multiattach is not supported based > on current versions of QEMU and libvirt. QEMU must be less than 2.10 > or libvirt must be greater than or equal to 3.10. {{(pid=20455) > _set_multiattach_support > /opt/stack/nova/nova/virt/libvirt/driver.py:619}} > > > stack at CSSOSBE04-B09:/tmp$ virsh --version > 3.6.0 > stack at CSSOSBE04-B09:/tmp$ kvm --version QEMU emulator version > 2.10.1(Debian 1:2.10+dfsg-0ubuntu3.8~cloud1) Copyright (c) 2003-2017 > Fabrice Bellard and the QEMU Project developers > Hi Sneha, I don't know much about this side of Nova, but reading the log error I would say that you either need to update your libvirt version from 3.6.0 to 3.10, or you need to downgrade your QEMU version to something prior to 2.10. The later is probably easier. I don't use Ubuntu, but according to the Internet you can list available versions with "apt-cache policy qemu" and then install or downgrade to the specific version with "sudo apt-get install qemu=2.5\*" if you wanted to install version 2.5 I hope this helps. Cheers, Gorka. > > openstack volume show -c multiattach -c status sneha1 > +-------------+-----------+ > | Field | Value | > +-------------+-----------+ > | multiattach | True | > | status | available | > +-------------+-----------+ > > cinder extra-specs-list > +--------------------------------------+-------------+--------------------------------------------------------------------+ > | ID | Name | extra_specs | > +--------------------------------------+-------------+--------------------------------------------------------------------+ > | bd077fde-51c3-4581-80d5-5855e8ab2f6b | 3pariscsi_1 | > | {'volume_backend_name': '3pariscsi_1', 'multiattach': ' True'}| > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > echo $OS_COMPUTE_API_VERSION > 2.60 > > pip list | grep python-novaclient > DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. > python-novaclient 13.0.0 > > How do I fix this version issue on my setup to proceed? Please help. > > Thanks & Regards, > Sneha Rai -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: syslog.zip Type: application/x-zip-compressed Size: 3374773 bytes Desc: syslog.zip URL: From zackchen517 at gmail.com Mon May 13 01:58:18 2019 From: zackchen517 at gmail.com (zack chen) Date: Mon, 13 May 2019 09:58:18 +0800 Subject: Baremetal attach volume in Multi-tenancy In-Reply-To: <20190510103929.w7iqvakxzskk2pmb@localhost> References: <20190509092828.g6qvdg5jbvqqvpba@localhost> <20190510103929.w7iqvakxzskk2pmb@localhost> Message-ID: Hi, Thanks for your reply. I saw that ceph already has the Iscsi Gateway. Does the cinder project have such a driver? Gorka Eguileor 于2019年5月10日周五 下午6:39写道: > On 10/05, zack chen wrote: > > This is a normal Cinder in Openstack deployment > > > > I'm using ceph as cinder backend, RBD drvier. > > > > Hi, > > If you are using a Ceph/RBD cluster then there are some things to take > into consideration: > > - You need to have the ceph-common package installed in the system. > > - The images are mounted using the kernel module, so you have to be > careful with the features that are enabled in the images. > > - If I'm not mistaken the RBD attach using the cinderclient extension > will fail if you don't have the configuration and credentials file > already in the system. > > > > My ideas the instance should communicate with Openstack platform storage > > network via the vrouter provided by neutron. The vrouter gateway should > > communicate with Openstack platform. is or right? > > > > I can't help you on the network side, since I don't know anything about > Neutron. > > Cheers, > Gorka. > > > Gorka Eguileor 于2019年5月9日周四 下午5:28写道: > > > > > On 08/05, zack chen wrote: > > > > Hi, > > > > I am looking for a mechanism that can be used for baremetal attach > volume > > > > in a multi-tenant scenario. In addition we use ceph as the backend > > > storage > > > > for cinder. > > > > > > > > Can anybody give me some advice? > > > > > > Hi, > > > > > > Is this a stand alone Cinder deployment or a normal Cinder in OpenStack > > > deployment? > > > > > > What storage backend will you be using? > > > > > > What storage protocol? iSCSI, FC, RBD...? > > > > > > Depending on these you can go with Walter's suggestion of using > > > cinderclient and its extension (which in general is the best way to > go), > > > or you may prefer writing a small python script that uses OS-Brick and > > > makes the REST API calls directly. > > > > > > Cheers, > > > Gorka. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From isanjayk5 at gmail.com Mon May 13 07:00:25 2019 From: isanjayk5 at gmail.com (Sanjay K) Date: Mon, 13 May 2019 12:30:25 +0530 Subject: [devstack][stein][masakari]Using Masakari in multihost devstack stein setup In-Reply-To: References: Message-ID: Hi Tushar, I searched for masakarimonitors.conf on my both controller and compute node in my devstack stein setup, but I did not find this file present anywhere. 
After I run "sudo python setup.py install" from my controller's "/opt/stack/masakari-monitors" path, it installed the necessary monitoring binaries in /usr/local/bin/ path- stack at stein-cntlr-masakari:~/masakari-monitors$ ls -ltr /usr/local/bin/masakari* -rwxr-xr-x 1 root root 976 May 9 00:21 /usr/local/bin/masakari-processmonitor.sh -rwxr-xr-x 1 root root 918 May 9 00:21 /usr/local/bin/masakari-hostmonitor.sh -rwxr-xr-x 1 root root 222 May 9 00:45 /usr/local/bin/masakari -rwxr-xr-x 1 root root 1854 May 10 00:06 /usr/local/bin/masakari-wsgi -rwxr-xr-x 1 root root 158 May 10 00:06 /usr/local/bin/masakari-status -rwxr-xr-x 1 root root 158 May 10 00:06 /usr/local/bin/masakari-manage -rwxr-xr-x 1 root root 158 May 10 00:06 /usr/local/bin/masakari-engine -rwxr-xr-x 1 root root 155 May 10 00:06 /usr/local/bin/masakari-api -rwxr-xr-x 1 root root 188 May 12 23:23 /usr/local/bin/masakari-introspectiveinstancemonitor -rwxr-xr-x 1 root root 174 May 12 23:23 /usr/local/bin/masakari-processmonitor -rwxr-xr-x 1 root root 175 May 12 23:23 /usr/local/bin/masakari-instancemonitor -rwxr-xr-x 1 root root 171 May 12 23:23 /usr/local/bin/masakari-hostmonitor Under /opt/stack/masakari-monitors, I can find below config files inside etc/ - $ ls -lt etc/masakarimonitors/ total 24 -rwxr-xr-x 1 stack stack 1743 May 9 00:21 hostmonitor.conf.sample -rw-r--r-- 1 stack stack 173 May 9 00:21 masakarimonitors-config-generator.conf -rw-r--r-- 1 stack stack 2188 May 9 00:21 process_list.yaml.sample -rwxr-xr-x 1 stack stack 290 May 9 00:21 processmonitor.conf.sample -rwxr-xr-x 1 stack stack 239 May 9 00:21 proc.list.sample -rw-r--r-- 1 stack stack 144 May 9 00:21 README-masakarimonitors.conf.txt Following steps from README-masakarimonitors.conf.txt to generate masakarimonitors.conf file by running "tox -egenconfig", I get errors below- ===Errors=== I/opt/stack/masakari-monitors/.tox/genconfig/include/python3.5m -c ext/_yaml.c -o build/temp.linux-x86_64-3.5/ext/_yaml.o ext/_yaml.c:4:20: fatal error: Python.h: No such file or directory compilation terminated. 
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1 ---------------------------------------- ERROR: Command "/opt/stack/masakari-monitors/.tox/genconfig/bin/python3 -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-_yy7e6rg/PyYAML/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-fbsfgz00/install-record.txt --single-version-externally-managed --compile --install-headers /opt/stack/masakari-monitors/.tox/genconfig/include/site/python3.5/PyYAML" failed with error code 1 in /tmp/pip-install-_yy7e6rg/PyYAML/ ==================================================================================================== log end ===================================================================================================== ERROR: could not install deps [-r/opt/stack/masakari-monitors/test-requirements.txt]; v = InvocationError(u'/opt/stack/masakari-monitors/.tox/genconfig/bin/pip install -chttps://releases.openstack.org/constraints/upper/stein -r/opt/stack/masakari-monitors/test-requirements.txt', 1) ____________________________________________________________________________________________________ summary _____________________________________________________________________________________________________ ERROR: genconfig: could not install deps [-r/opt/stack/masakari-monitors/test-requirements.txt]; v = InvocationError(u'/opt/stack/masakari-monitors/.tox/genconfig/bin/pip install -chttps://releases.openstack.org/constraints/upper/stein -r/opt/stack/masakari-monitors/test-requirements.txt', 1) ===Errors END=== I also get this same errors when I try to generate masakarimonitors.conf file on my compute host after installing masakari monitors binaries on it. As mentioned by you earlier in this thread, Masakari dev guys are going to add these monitoring processes in devstack Train release. So my queries are - 1. whether devstack stein currently does not support these monitoring processes as of now? If not, whether Masakari service can be utilized fully using the normal Masakari Stein release branch integrating it with the other services, not using this devstack version of Masakari what I am trying in my test multi host setup? Can you please provide me the documentation link for how to integrate Masakari and its additional sub binaries to existing openstack services under stein release. 2. whether Train Masakari main release and Devstack release will completely support Masakari monitoring services which in turn gives Masakari its full features to utilize in openstack? If yes, then I will wait for the next Train release to have this feature work completely. thank you for support and help on this as I am trying to put these pieces together at my end. Sorry for my long thread. best regards, Sanjay On Tue, Apr 30, 2019 at 8:47 PM Patil, Tushar wrote: > Hi Sanjay, > > You will need to add following config options in masakarimonitors.conf > under the [api] section. > > auth_url > password > project_name > username > user_domain_id > project_domain_id > region > > We have added support to install and run following masakari-monitors in > the current master planned to available in Train cycle. > > 1. process monitor > 2. instance monitor > 3. introspective instance monitor > > We are still working on adding support to install and run host monitors > using devstack. 
> > Regards, > Tushar Patil > > > > > ________________________________________ > From: Sanjay K > Sent: Tuesday, April 30, 2019 10:04:21 PM > To: Patil, Tushar > Cc: openstack-discuss at lists.openstack.org > Subject: Re: [devstack][stein][masakari]Using Masakari in multihost > devstack stein setup > > Hi Tushar, > Thanks you for your quick response on this. As I have already included the > masakari-monitors plugin in my compute host's local.conf file in devstack, > I have the repo present in my VM. Following the steps to set up the conf > files and start the necessary masakari monitoring processes from this link > https://github.com/openstack/masakari-monitors/blob/master/README.rst I > get the same error when I start the 3 monitoring processes > (masakari-processmonitor, masakari-hostmonitor, masakari-instancemonitor) - > > stack at devstack-:/etc/masakarimonitors$ masakari-processmonitor > Traceback (most recent call last): > File "/usr/local/bin/masakari-processmonitor", line 10, in > sys.exit(main()) > File > "/usr/local/lib/python2.7/dist-packages/masakarimonitors/cmd/processmonitor.py", > line 31, in main > config.parse_args(sys.argv) > File > "/usr/local/lib/python2.7/dist-packages/masakarimonitors/config.py", line > 32, in parse_args > default_config_files=default_config_files) > File "/usr/local/lib/python2.7/dist-packages/oslo_config/cfg.py", line > 2127, in __call__ > self._check_required_opts() > File "/usr/local/lib/python2.7/dist-packages/oslo_config/cfg.py", line > 2865, in _check_required_opts > raise RequiredOptError(opt.name, group) > oslo_config.cfg.RequiredOptError: value required for option auth-url in > group [api] > > After masakari setup I can file all the required binaries in my > /usr/local/bin directory - > > $ ls -ltr /usr/local/bin/masakari* > -rwxr-xr-x 1 root root 222 Apr 17 01:55 /usr/local/bin/masakari > -rwxr-xr-x 1 root root 976 Apr 21 23:41 > /usr/local/bin/masakari-processmonitor.sh > -rwxr-xr-x 1 root root 918 Apr 21 23:41 > /usr/local/bin/masakari-hostmonitor.sh > -rwxr-xr-x 1 root root 1854 Apr 29 01:01 /usr/local/bin/masakari-wsgi > -rwxr-xr-x 1 root root 158 Apr 29 01:01 /usr/local/bin/masakari-status > -rwxr-xr-x 1 root root 158 Apr 29 01:01 /usr/local/bin/masakari-manage > -rwxr-xr-x 1 root root 158 Apr 29 01:01 /usr/local/bin/masakari-engine > -rwxr-xr-x 1 root root 155 Apr 29 01:01 /usr/local/bin/masakari-api > -rwxr-xr-x 1 root root 174 Apr 30 04:34 > /usr/local/bin/masakari-processmonitor > -rwxr-xr-x 1 root root 188 Apr 30 04:34 > /usr/local/bin/masakari-introspectiveinstancemonitor > -rwxr-xr-x 1 root root 175 Apr 30 04:34 > /usr/local/bin/masakari-instancemonitor > -rwxr-xr-x 1 root root 171 Apr 30 04:34 > /usr/local/bin/masakari-hostmonitor > > Please let me know is there any issue with my masakari setup with devstack > and where can I find latest documentation on Masakari for using in devstack. > > thanks for your pointer, > best regards, > Sanjay > > On Tue, Apr 30, 2019 at 3:24 AM Patil, Tushar > wrote: > Hi Sanjay, > > In case of masakari-processmonitor, it only monitors processes as > mentioned in the process_list.yaml which by default monitors libvirt-bin, > nova-compute, instancemonitor, hostmonitor and sshd processes. > > To test process failure, you should terminate any of the above processes. > > In case of instancemonitor, you can shutdown the VM to test whether a > notification is sent or not. > > >> I have asked the same question in openstack forum, but not got a single > response. 
> Sorry, I didn't notice your question on forum. I have replied above > comment on forum as well. > > Regards, > Tushar Patil > > ________________________________________ > From: Sanjay K > > Sent: Monday, April 29, 2019 6:18:20 PM > To: openstack-discuss at lists.openstack.org openstack-discuss at lists.openstack.org> > Subject: [devstack][stein][masakari]Using Masakari in multihost devstack > stein setup > > Hi all, > I have been trying to setup Masakari in a 3 node setup - 1 controller + 2 > computes with minimal openstack services installed using devstack > stable/stein version. All these 3 nodes are Ubuntu 16.04 VMs. I have > included Masakari plugin in controller's local.conf file and included > masakari-monitor in both compute node's local.conf file. I want to test out > VM/process failure in my test environment. To do so, when I kill one of the > qemu process created for one instance (cirros 256 flavor VMs) on one of the > compute with root user login, I did not see any notification under my > horizon/instance-ha section for this (I have already created Segments, > Hosts under instance-ha in horizon). Also the killed process is not > restarted by Masakari. > > I have asked the same question in openstack forum, but not got a single > response. > > masakari-notification-on-process-failure< > https://ask.openstack.org/en/question/121490/masakari-notification-on-process-failure/ > > > > Whether any additional services related to Masakari need to be configured > on compute hosts to detect failures since I did not find exactly the > documentation related inside devstack for this. > > Please let me know if I am missing anything my set up. Any helps and > pointers are most appreciated. > > thank you for your reply. > > best regards > Disclaimer: This email and any attachments are sent in strictest > confidence for the sole use of the addressee and may contain legally > privileged, confidential, and proprietary data. If you are not the intended > recipient, please advise the sender by replying promptly to this email and > then delete and destroy this email and any attachments without any further > use, copying or forwarding. > Disclaimer: This email and any attachments are sent in strictest > confidence for the sole use of the addressee and may contain legally > privileged, confidential, and proprietary data. If you are not the intended > recipient, please advise the sender by replying promptly to this email and > then delete and destroy this email and any attachments without any further > use, copying or forwarding. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Mon May 13 07:20:48 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Mon, 13 May 2019 09:20:48 +0200 Subject: =?UTF-8?Q?Re:_[openstack-ansible][monasca][zaqar][watcher][searchlight]_?= =?UTF-8?Q?Retirement_of_unused_OpenStack_Ansible_roles?= In-Reply-To: References: <236ef912-21c5-4345-98ce-067499921af1@www.fastmail.com> Message-ID: <604fd001-f9aa-4f32-8c19-fdd19a0a458c@www.fastmail.com> On Wed, May 8, 2019, at 19:05, Trinh Nguyen wrote: > Hi all, > > I would love to take care of the searchlight roles. Are there any specific requirements I need to keep in mind? > > Bests, > > > > -- > *Trinh Nguyen* > _www.edlab.xyz_ > Hello, Great news! Searchlight role has been unmaintained for a while. The code is still using old elastic search versions, and is following relatively old standards. 
We are looking for someone ready to first step into the code to fix the deployment and add functional test coverage (for example, add tempest testing). This might require refreshing the role to our latest openstack-ansible standards too (we can discuss this in a different email or on our channel). When this would be done, we would be hoping you'd accept to be core on this role, so you can monitor the role, and ensure it's always working fine, and behave the way you expect it to be. Regards, Jean-Philippe Evrard (evrardjp) From jean-philippe at evrard.me Mon May 13 07:25:35 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Mon, 13 May 2019 09:25:35 +0200 Subject: =?UTF-8?Q?Re:_[openstack-ansible][monasca][zaqar][watcher][searchlight]_?= =?UTF-8?Q?Retirement_of_unused_OpenStack_Ansible_roles?= In-Reply-To: References: <236ef912-21c5-4345-98ce-067499921af1@www.fastmail.com> Message-ID: <2e12229a-df32-4a58-90dd-565c5ba0ce10@www.fastmail.com> On Wed, May 8, 2019, at 17:10, Stefano Canepa wrote: > Hi all, > I would like to maintain monasca related roles but I have to double check how much time I can allocate to this task. Please hold before retiring them. > Hello Stefano, Good news on that side too! I guess it all depends on what we are trying to achieve, but monasca roles have not been adapted for a while. They are not too old, so I suppose it would not be that hard to update the roles to our latest standards. However, IIRC, the functional testing is now broken, and would require some love. That is just a "point in time" effort. When this would be done, we would like to ensure the role is maintained in the long run, which requires low maintenance, but over time effort. Regards, Jean-Philippe Evrard (evrardjp) From dangtrinhnt at gmail.com Mon May 13 07:29:37 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 13 May 2019 16:29:37 +0900 Subject: [openstack-ansible][monasca][zaqar][watcher][searchlight] Retirement of unused OpenStack Ansible roles In-Reply-To: <604fd001-f9aa-4f32-8c19-fdd19a0a458c@www.fastmail.com> References: <236ef912-21c5-4345-98ce-067499921af1@www.fastmail.com> <604fd001-f9aa-4f32-8c19-fdd19a0a458c@www.fastmail.com> Message-ID: Hi Jean-Philippe, Thanks for the information. Sure, let's me look at the role for sometimes and get back to you if I need help. Bests, On Mon, May 13, 2019 at 4:25 PM Jean-Philippe Evrard < jean-philippe at evrard.me> wrote: > > > On Wed, May 8, 2019, at 19:05, Trinh Nguyen wrote: > > Hi all, > > > > I would love to take care of the searchlight roles. Are there any > specific requirements I need to keep in mind? > > > > Bests, > > > > > > > > -- > > *Trinh Nguyen* > > _www.edlab.xyz_ > > > > Hello, > > Great news! > Searchlight role has been unmaintained for a while. The code is still > using old elastic search versions, and is following relatively old > standards. We are looking for someone ready to first step into the code to > fix the deployment and add functional test coverage (for example, add > tempest testing). This might require refreshing the role to our latest > openstack-ansible standards too (we can discuss this in a different email > or on our channel). > > When this would be done, we would be hoping you'd accept to be core on > this role, so you can monitor the role, and ensure it's always working > fine, and behave the way you expect it to be. > > Regards, > Jean-Philippe Evrard (evrardjp) > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From flux.adam at gmail.com Mon May 13 08:34:37 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Mon, 13 May 2019 01:34:37 -0700 Subject: [senlin][zaqar][networking-midonet][networking-odl][vmware-nsx] Neutron-LBaaS retirement warning! Message-ID: As you are hopefully already aware, the Neutron-LBaaS project is being retired this cycle (and a lot of the patches to accomplish this will land in the next few days). >From a quick code search, it seems the following projects still include neutron-lbaas in their zuul job configs: networking-odl networking-midonet senlin vmware-nsx zaqar For projects on this list, the retirement of neutron-lbaas *will* cause your zuul jobs to fail. *Please take action to remove this requirement!* It is possible that it is simply an extra unused requirement, but if your project is actually using neutron-lbaas to create loadbalancers, it will be necessary to convert to Octavia. If you need assistance with this change or have any questions, don't hesitate to stop by #openstack-lbaas on IRC and we can help! --Adam Harwell -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Mon May 13 08:38:57 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 13 May 2019 10:38:57 +0200 Subject: Baremetal attach volume in Multi-tenancy In-Reply-To: References: <20190509092828.g6qvdg5jbvqqvpba@localhost> <20190510103929.w7iqvakxzskk2pmb@localhost> Message-ID: <20190513083857.xhsuaa6lk6g5nm6o@localhost> On 13/05, zack chen wrote: > Hi, > > Thanks for your reply. > I saw that ceph already has the Iscsi Gateway. Does the cinder project have > such a driver? > Hi, There is an ongoing effort to write a new RBD driver specific for iSCSI, but it is not available yet. Cheers, Gorka. > Gorka Eguileor 于2019年5月10日周五 下午6:39写道: > > > On 10/05, zack chen wrote: > > > This is a normal Cinder in Openstack deployment > > > > > > I'm using ceph as cinder backend, RBD drvier. > > > > > > > Hi, > > > > If you are using a Ceph/RBD cluster then there are some things to take > > into consideration: > > > > - You need to have the ceph-common package installed in the system. > > > > - The images are mounted using the kernel module, so you have to be > > careful with the features that are enabled in the images. > > > > - If I'm not mistaken the RBD attach using the cinderclient extension > > will fail if you don't have the configuration and credentials file > > already in the system. > > > > > > > My ideas the instance should communicate with Openstack platform storage > > > network via the vrouter provided by neutron. The vrouter gateway should > > > communicate with Openstack platform. is or right? > > > > > > > I can't help you on the network side, since I don't know anything about > > Neutron. > > > > Cheers, > > Gorka. > > > > > Gorka Eguileor 于2019年5月9日周四 下午5:28写道: > > > > > > > On 08/05, zack chen wrote: > > > > > Hi, > > > > > I am looking for a mechanism that can be used for baremetal attach > > volume > > > > > in a multi-tenant scenario. In addition we use ceph as the backend > > > > storage > > > > > for cinder. > > > > > > > > > > Can anybody give me some advice? > > > > > > > > Hi, > > > > > > > > Is this a stand alone Cinder deployment or a normal Cinder in OpenStack > > > > deployment? > > > > > > > > What storage backend will you be using? > > > > > > > > What storage protocol? iSCSI, FC, RBD...? 
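As a rough sketch of that last point, the attach flow on the baremetal node would look something like the following, assuming the python-brick-cinderclient-ext extension is what provides the local-attach/local-detach subcommands; the package names and the volume ID below are illustrative, so adjust them to your environment:

sudo apt-get install ceph-common                 # kernel RBD client and the rbd CLI
# copy /etc/ceph/ceph.conf and the cinder keyring from the Ceph cluster before attaching
pip install python-brick-cinderclient-ext        # extends the cinder CLI with local-attach/local-detach
cinder local-attach <volume-uuid>                # maps the RBD image and prints the local device path
cinder local-detach <volume-uuid>                # unmaps it when you are done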
> > > > > > > > Depending on these you can go with Walter's suggestion of using > > > > cinderclient and its extension (which in general is the best way to > > go), > > > > or you may prefer writing a small python script that uses OS-Brick and > > > > makes the REST API calls directly. > > > > > > > > Cheers, > > > > Gorka. > > > > > > From geguileo at redhat.com Mon May 13 08:51:18 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 13 May 2019 10:51:18 +0200 Subject: Help needed to Support Multi-attach feature In-Reply-To: References: <20190510092600.r27zetl5e3k5ow5v@localhost> Message-ID: <20190513085118.3hfsekvtabq6ipm2@localhost> On 10/05, RAI, SNEHA wrote: > Thanks Gorka for your response. > > I have changed the version of libvirt and qemu on my host and I am able to move past the previous error mentioned in my last email. > > Current versions of libvirt and qemu: > root at CSSOSBE04-B09:/etc# libvirtd --version > libvirtd (libvirt) 1.3.1 > root at CSSOSBE04-B09:/etc# kvm --version > QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.36), Copyright (c) 2003-2008 Fabrice Bellard > > Also, I made a change in /etc/nova/nova.conf and set virt_type=qemu. Earlier it was set to kvm. > I restarted all nova services post the changes but I can see one nova service was disabled and state was down. > > root at CSSOSBE04-B09:/etc# nova service-list > +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ > | Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | Forced down | > +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ > | 1ebcd1f6-b7dc-40ce-8d7b-95d60503c0ff | nova-scheduler | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:59.000000 | - | False | > | ed82277c-d2e0-4a1a-adf6-9bcdcc50ba29 | nova-consoleauth | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:49.000000 | - | False | > | bc2b6703-7a1e-4f07-96b9-35cbb14398d5 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:59.000000 | - | False | > | 72ecbc1d-1b47-4f55-a18d-de2fbf1771e9 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:54.000000 | - | False | > | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | nova-compute | CSSOSBE04-B09 | nova | disabled | down | 2019-05-07T22:11:06.000000 | AUTO: Connection to libvirt lost: 1 | False | > +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ > > So, I manually enabled the service, but the state was still down. 
> root at CSSOSBE04-B09:/etc# nova service-enable 9c700ee1-1694-479b-afc0-1fd37c1a5561 > +--------------------------------------+---------------+--------------+---------+ > | ID | Host | Binary | Status | > +--------------------------------------+---------------+--------------+---------+ > | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | CSSOSBE04-B09 | nova-compute | enabled | > +--------------------------------------+---------------+--------------+---------+ > > root at CSSOSBE04-B09:/etc# nova service-list > +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ > | Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | Forced down | > +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ > | 1ebcd1f6-b7dc-40ce-8d7b-95d60503c0ff | nova-scheduler | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | > | ed82277c-d2e0-4a1a-adf6-9bcdcc50ba29 | nova-consoleauth | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | > | bc2b6703-7a1e-4f07-96b9-35cbb14398d5 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | > | 72ecbc1d-1b47-4f55-a18d-de2fbf1771e9 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:14.000000 | - | False | > | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | nova-compute | CSSOSBE04-B09 | nova | enabled | down | 2019-05-10T05:49:14.000000 | - | False | > +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ > Hi, If it appears as down it's probably because there is an issue during the service's start procedure. You can look in the logs to see what messages appeared during the start or tail the logs and restart the service to see what error appears there. Cheers, Gorka. > So, now when I try to attach a volume to nova instance, I get the below error. As one of the service is down it fails in filter validation for nova-compute and gives us "No host" error. 
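As a rough sketch of that log check, assuming a systemd-based devstack where nova-compute runs as the devstack@n-cpu unit (the unit name is an assumption; adjust it to whatever your deployment uses):

sudo systemctl status devstack@n-cpu
sudo journalctl -u devstack@n-cpu --since "30 min ago"   # look for the error raised during startup
sudo systemctl restart devstack@n-cpu
sudo journalctl -u devstack@n-cpu -f                     # follow the startup messages live

Given the "AUTO: Connection to libvirt lost" disable reason shown above, checking "systemctl status libvirtd" (or libvirt-bin, depending on the Ubuntu release) at the same time is probably worthwhile.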
> > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFilter RetryFilter returned 1 host(s)#033[00m #033[00;33m{{(pid=21775) get_filtered_objects /opt/stack/nova/nova/filters.py:104}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFilter AvailabilityZoneFilter returned 1 host(s)#033[00m #033[00;33m{{(pid=21775) get_filtered_objects /opt/stack/nova/nova/filters.py:104}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.scheduler.filters.compute_filter [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32m(CSSOSBE04-B09, CSSOSBE04-B09) ram: 30810MB disk: 1737728MB io_ops: 0 instances: 1 is disabled, reason: AUTO: Connection to libvirt lost: 1#033[00m #033[00;33m{{(pid=21775) host_passes /opt/stack/nova/nova/scheduler/filters/compute_filter.py:42}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;36mINFO nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;36m] #033[01;35m#033[00;36mFilter ComputeFilter returned 0 hosts#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFiltering removed all hosts for the request with instance ID '1735ece5-d187-454a-aab1-12650646a2ec'. Filter results: [('RetryFilter', [(u'CSSOSBE04-B09', u'CSSOSBE04-B09')]), ('AvailabilityZoneFilter', [(u'CSSOSBE04-B09', u'CSSOSBE04-B09')]), ('ComputeFilter', None)]#033[00m #033[00;33m{{(pid=21775) get_filtered_objects /opt/stack/nova/nova/filters.py:129}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;36mINFO nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;36m] #033[01;35m#033[00;36mFiltering removed all hosts for the request with instance ID '1735ece5-d187-454a-aab1-12650646a2ec'. Filter results: ['RetryFilter: (start: 1, end: 1)', 'AvailabilityZoneFilter: (start: 1, end: 1)', 'ComputeFilter: (start: 1, end: 0)']#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.scheduler.filter_scheduler [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFiltered []#033[00m #033[00;33m{{(pid=21775) _get_sorted_hosts /opt/stack/nova/nova/scheduler/filter_scheduler.py:404}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG nova.scheduler.filter_scheduler [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mThere are 0 hosts available but 1 instances requested to build.#033[00m #033[00;33m{{(pid=21775) _ensure_sufficient_hosts /opt/stack/nova/nova/scheduler/filter_scheduler.py:279}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: #033[01;31mERROR nova.conductor.manager [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[01;31m] #033[01;35m#033[01;31mFailed to schedule instances#033[00m: NoValidHost_Remote: No valid host was found. There are not enough hosts available. 
> May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: Traceback (most recent call last): > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 226, in inner > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: return func(*args, **kwargs) > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/manager.py", line 154, in select_destinations > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: allocation_request_version, return_alternates) > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 91, in select_destinations > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: allocation_request_version, return_alternates) > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 244, in _schedule > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: claimed_instance_uuids) > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 281, in _ensure_sufficient_hosts > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: raise exception.NoValidHost(reason=reason) > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: NoValidHost: No valid host was found. There are not enough hosts available. > > Need help in understanding on how to fix this error. For detailed logs, please refer the attached syslog. > > > Thanks & Regards, > Sneha Rai > > > > > > -----Original Message----- > From: Gorka Eguileor [mailto:geguileo at redhat.com] > Sent: Friday, May 10, 2019 2:56 PM > To: RAI, SNEHA > Cc: openstack-dev at lists.openstack.org > Subject: Re: Help needed to Support Multi-attach feature > > > > On 02/05, RAI, SNEHA wrote: > > > Hi Team, > > > > > > I am currently working on multiattach feature for HPE 3PAR cinder driver. > > > > > > For this, while setting up devstack(on stable/queens) I made below > > > change in the local.conf [[local|localrc]] > > > ENABLE_VOLUME_MULTIATTACH=True ENABLE_UBUNTU_CLOUD_ARCHIVE=False > > > > > > /etc/cinder/cinder.conf: > > > [3pariscsi_1] > > > hpe3par_api_url = > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__192.168.1.7-3A808 > > > 0_api_v1&d=DwIBAg&c=C5b8zRQO1miGmBeVZ2LFWg&r=8drU3i56Z5sQ_Ltpya89LTNn3 > > > xDSwtigjYbGrSY1lM8&m=zTRvI4nj8MoP0_z5MmxTYwKiNNW6addwP4L5VFG4wkg&s=a2D > > > HbzzRtbbBPz0_kfodZv5X1HxbN_hFxte5rEZabAg&e= > > > hpe3par_username = user > > > hpe3par_password = password > > > san_ip = 192.168.1.7 > > > san_login = user > > > san_password = password > > > volume_backend_name = 3pariscsi_1 > > > hpe3par_cpg = my_cpg > > > hpe3par_iscsi_ips = 192.168.11.2,192.168.11.3 volume_driver = > > > cinder.volume.drivers.hpe.hpe_3par_iscsi.HPE3PARISCSIDriver > > > hpe3par_iscsi_chap_enabled = True > > > hpe3par_debug = True > > > image_volume_cache_enabled = True > > > > > > /etc/cinder/policy.json: > > > 'volume:multiattach': 'rule:admin_or_owner' > > > > > > Added https://urldefense.proofpoint.com/v2/url?u=https-3A__review.opendev.org_-23_c_560067_2_cinder_volume_drivers_hpe_hpe-5F3par-5Fcommon.py&d=DwIBAg&c=C5b8zRQO1miGmBeVZ2LFWg&r=8drU3i56Z5sQ_Ltpya89LTNn3xDSwtigjYbGrSY1lM8&m=zTRvI4nj8MoP0_z5MmxTYwKiNNW6addwP4L5VFG4wkg&s=U8n1fpI-4OVYOSjST8IL0x0BRUhTLyumOpRZMJ_sVOI&e= change in the code. 
> > > > > > But I am getting below error in the nova log: > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [None req-2cda6e90-fd45-4bfe-960a-7fca9ba4abab demo admin] [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Instance failed block device setup: MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Traceback (most recent call last): > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/compute/manager.py", line 1615, in _prep_block_device > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] wait_func=self._await_block_device_map_created) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 840, in attach_block_devices > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] _log_and_attach(device) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 837, in _log_and_attach > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] bdm.attach(*attach_args, **attach_kwargs) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 46, in wrapped > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] ret_val = method(obj, context, *args, **kwargs) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 620, in attach > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] virt_driver, do_driver_attach) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] return f(*args, **kwargs) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 617, in _do_locked_attach > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] self._do_attach(*args, **_kwargs) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 602, in _do_attach > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR 
nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] do_driver_attach) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 509, in _volume_attach > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] volume_id=volume_id) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR > > > nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] > > > > > > > > > Apr 29 05:41:20 CSSOSBE04-B09 nova-compute[20455]: DEBUG > > > nova.virt.libvirt.driver [-] Volume multiattach is not supported based > > > on current versions of QEMU and libvirt. QEMU must be less than 2.10 > > > or libvirt must be greater than or equal to 3.10. {{(pid=20455) > > > _set_multiattach_support > > > /opt/stack/nova/nova/virt/libvirt/driver.py:619}} > > > > > > > > > stack at CSSOSBE04-B09:/tmp$ virsh --version > > > 3.6.0 > > > stack at CSSOSBE04-B09:/tmp$ kvm --version QEMU emulator version > > > 2.10.1(Debian 1:2.10+dfsg-0ubuntu3.8~cloud1) Copyright (c) 2003-2017 > > > Fabrice Bellard and the QEMU Project developers > > > > > > > Hi Sneha, > > > > I don't know much about this side of Nova, but reading the log error I would say that you either need to update your libvirt version from 3.6.0 to 3.10, or you need to downgrade your QEMU version to something prior to 2.10. > > > > The later is probably easier. > > > > I don't use Ubuntu, but according to the Internet you can list available versions with "apt-cache policy qemu" and then install or downgrade to the specific version with "sudo apt-get install qemu=2.5\*" if you wanted to install version 2.5 > > > > I hope this helps. > > > > Cheers, > > Gorka. > > > > > > > > openstack volume show -c multiattach -c status sneha1 > > > +-------------+-----------+ > > > | Field | Value | > > > +-------------+-----------+ > > > | multiattach | True | > > > | status | available | > > > +-------------+-----------+ > > > > > > cinder extra-specs-list > > > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > | ID | Name | extra_specs | > > > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > | bd077fde-51c3-4581-80d5-5855e8ab2f6b | 3pariscsi_1 | > > > | {'volume_backend_name': '3pariscsi_1', 'multiattach': ' True'}| > > > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > > > > > > > echo $OS_COMPUTE_API_VERSION > > > 2.60 > > > > > > pip list | grep python-novaclient > > > DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. > > > python-novaclient 13.0.0 > > > > > > How do I fix this version issue on my setup to proceed? Please help. 
> > > > > > Thanks & Regards, > > > Sneha Rai From lajos.katona at ericsson.com Mon May 13 09:05:56 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Mon, 13 May 2019 09:05:56 +0000 Subject: [senlin][zaqar][networking-midonet][networking-odl][vmware-nsx] Neutron-LBaaS retirement warning! In-Reply-To: References: Message-ID: <4b0516ee-b248-f13a-8381-c09785006ad3@ericsson.com> Hi, This means that stable/stein is the last where these projects can include neutron-lbaas? Thanks for the heads up. Regards Lajos On 2019. 05. 13. 10:34, Adam Harwell wrote: As you are hopefully already aware, the Neutron-LBaaS project is being retired this cycle (and a lot of the patches to accomplish this will land in the next few days). From a quick code search, it seems the following projects still include neutron-lbaas in their zuul job configs: networking-odl networking-midonet senlin vmware-nsx zaqar For projects on this list, the retirement of neutron-lbaas *will* cause your zuul jobs to fail. Please take action to remove this requirement! It is possible that it is simply an extra unused requirement, but if your project is actually using neutron-lbaas to create loadbalancers, it will be necessary to convert to Octavia. If you need assistance with this change or have any questions, don't hesitate to stop by #openstack-lbaas on IRC and we can help! --Adam Harwell -------------- next part -------------- An HTML attachment was scrubbed... URL: From yamamoto at midokura.com Mon May 13 09:13:34 2019 From: yamamoto at midokura.com (Takashi Yamamoto) Date: Mon, 13 May 2019 18:13:34 +0900 Subject: [senlin][zaqar][networking-midonet][networking-odl][vmware-nsx] Neutron-LBaaS retirement warning! In-Reply-To: References: Message-ID: On Mon, May 13, 2019 at 5:42 PM Adam Harwell wrote: > > As you are hopefully already aware, the Neutron-LBaaS project is being retired this cycle (and a lot of the patches to accomplish this will land in the next few days). > From a quick code search, it seems the following projects still include neutron-lbaas in their zuul job configs: > > networking-odl > networking-midonet > senlin > vmware-nsx > zaqar > > For projects on this list, the retirement of neutron-lbaas *will* cause your zuul jobs to fail. Please take action to remove this requirement! It is possible that it is simply an extra unused requirement, but if your project is actually using neutron-lbaas to create loadbalancers, it will be necessary to convert to Octavia. is there a guide for the conversion? > > If you need assistance with this change or have any questions, don't hesitate to stop by #openstack-lbaas on IRC and we can help! > > --Adam Harwell From saphi070 at gmail.com Mon May 13 11:06:04 2019 From: saphi070 at gmail.com (Sa Pham) Date: Mon, 13 May 2019 20:06:04 +0900 Subject: Baremetal attach volume in Multi-tenancy In-Reply-To: <20190513083857.xhsuaa6lk6g5nm6o@localhost> References: <20190509092828.g6qvdg5jbvqqvpba@localhost> <20190510103929.w7iqvakxzskk2pmb@localhost> <20190513083857.xhsuaa6lk6g5nm6o@localhost> Message-ID: Dear Gorka, Could you give me patch link on this work? Thank you On Mon, May 13, 2019 at 5:39 PM Gorka Eguileor wrote: > On 13/05, zack chen wrote: > > Hi, > > > > Thanks for your reply. > > I saw that ceph already has the Iscsi Gateway. Does the cinder project > have > > such a driver? > > > > Hi, > > There is an ongoing effort to write a new RBD driver specific for iSCSI, > but it is not available yet. > > Cheers, > Gorka. 
> > > Gorka Eguileor 于2019年5月10日周五 下午6:39写道: > > > > > On 10/05, zack chen wrote: > > > > This is a normal Cinder in Openstack deployment > > > > > > > > I'm using ceph as cinder backend, RBD drvier. > > > > > > > > > > Hi, > > > > > > If you are using a Ceph/RBD cluster then there are some things to take > > > into consideration: > > > > > > - You need to have the ceph-common package installed in the system. > > > > > > - The images are mounted using the kernel module, so you have to be > > > careful with the features that are enabled in the images. > > > > > > - If I'm not mistaken the RBD attach using the cinderclient extension > > > will fail if you don't have the configuration and credentials file > > > already in the system. > > > > > > > > > > My ideas the instance should communicate with Openstack platform > storage > > > > network via the vrouter provided by neutron. The vrouter gateway > should > > > > communicate with Openstack platform. is or right? > > > > > > > > > > I can't help you on the network side, since I don't know anything about > > > Neutron. > > > > > > Cheers, > > > Gorka. > > > > > > > Gorka Eguileor 于2019年5月9日周四 下午5:28写道: > > > > > > > > > On 08/05, zack chen wrote: > > > > > > Hi, > > > > > > I am looking for a mechanism that can be used for baremetal > attach > > > volume > > > > > > in a multi-tenant scenario. In addition we use ceph as the > backend > > > > > storage > > > > > > for cinder. > > > > > > > > > > > > Can anybody give me some advice? > > > > > > > > > > Hi, > > > > > > > > > > Is this a stand alone Cinder deployment or a normal Cinder in > OpenStack > > > > > deployment? > > > > > > > > > > What storage backend will you be using? > > > > > > > > > > What storage protocol? iSCSI, FC, RBD...? > > > > > > > > > > Depending on these you can go with Walter's suggestion of using > > > > > cinderclient and its extension (which in general is the best way to > > > go), > > > > > or you may prefer writing a small python script that uses OS-Brick > and > > > > > makes the REST API calls directly. > > > > > > > > > > Cheers, > > > > > Gorka. > > > > > > > > > > -- Sa Pham Dang Master Student - Soongsil University Kakaotalk: sapd95 Skype: great_bn -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Mon May 13 11:14:26 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 13 May 2019 13:14:26 +0200 Subject: Baremetal attach volume in Multi-tenancy In-Reply-To: References: <20190509092828.g6qvdg5jbvqqvpba@localhost> <20190510103929.w7iqvakxzskk2pmb@localhost> <20190513083857.xhsuaa6lk6g5nm6o@localhost> Message-ID: <20190513111426.i5rkvkn4utehko2r@localhost> On 13/05, Sa Pham wrote: > Dear Gorka, > > Could you give me patch link on this work? > > Thank you Hi, You can see an update on the subject on the PTG's etherpad [1] starting on line 119 until line 139. There's a video [2] of a previous discussion topic and this one. Cheers, Gorka. [1]: https://etherpad.openstack.org/p/cinder-train-ptg-planning [2]: https://www.youtube.com/watch?v=N6D6ib7T9Io&feature=em-lbcastemail > > On Mon, May 13, 2019 at 5:39 PM Gorka Eguileor wrote: > > > On 13/05, zack chen wrote: > > > Hi, > > > > > > Thanks for your reply. > > > I saw that ceph already has the Iscsi Gateway. Does the cinder project > > have > > > such a driver? > > > > > > > Hi, > > > > There is an ongoing effort to write a new RBD driver specific for iSCSI, > > but it is not available yet. > > > > Cheers, > > Gorka. 
> > > > > Gorka Eguileor 于2019年5月10日周五 下午6:39写道: > > > > > > > On 10/05, zack chen wrote: > > > > > This is a normal Cinder in Openstack deployment > > > > > > > > > > I'm using ceph as cinder backend, RBD drvier. > > > > > > > > > > > > > Hi, > > > > > > > > If you are using a Ceph/RBD cluster then there are some things to take > > > > into consideration: > > > > > > > > - You need to have the ceph-common package installed in the system. > > > > > > > > - The images are mounted using the kernel module, so you have to be > > > > careful with the features that are enabled in the images. > > > > > > > > - If I'm not mistaken the RBD attach using the cinderclient extension > > > > will fail if you don't have the configuration and credentials file > > > > already in the system. > > > > > > > > > > > > > My ideas the instance should communicate with Openstack platform > > storage > > > > > network via the vrouter provided by neutron. The vrouter gateway > > should > > > > > communicate with Openstack platform. is or right? > > > > > > > > > > > > > I can't help you on the network side, since I don't know anything about > > > > Neutron. > > > > > > > > Cheers, > > > > Gorka. > > > > > > > > > Gorka Eguileor 于2019年5月9日周四 下午5:28写道: > > > > > > > > > > > On 08/05, zack chen wrote: > > > > > > > Hi, > > > > > > > I am looking for a mechanism that can be used for baremetal > > attach > > > > volume > > > > > > > in a multi-tenant scenario. In addition we use ceph as the > > backend > > > > > > storage > > > > > > > for cinder. > > > > > > > > > > > > > > Can anybody give me some advice? > > > > > > > > > > > > Hi, > > > > > > > > > > > > Is this a stand alone Cinder deployment or a normal Cinder in > > OpenStack > > > > > > deployment? > > > > > > > > > > > > What storage backend will you be using? > > > > > > > > > > > > What storage protocol? iSCSI, FC, RBD...? > > > > > > > > > > > > Depending on these you can go with Walter's suggestion of using > > > > > > cinderclient and its extension (which in general is the best way to > > > > go), > > > > > > or you may prefer writing a small python script that uses OS-Brick > > and > > > > > > makes the REST API calls directly. > > > > > > > > > > > > Cheers, > > > > > > Gorka. > > > > > > > > > > > > > > > > -- > Sa Pham Dang > Master Student - Soongsil University > Kakaotalk: sapd95 > Skype: great_bn From saphi070 at gmail.com Mon May 13 11:16:40 2019 From: saphi070 at gmail.com (Sa Pham) Date: Mon, 13 May 2019 20:16:40 +0900 Subject: Baremetal attach volume in Multi-tenancy In-Reply-To: <20190513111426.i5rkvkn4utehko2r@localhost> References: <20190509092828.g6qvdg5jbvqqvpba@localhost> <20190510103929.w7iqvakxzskk2pmb@localhost> <20190513083857.xhsuaa6lk6g5nm6o@localhost> <20190513111426.i5rkvkn4utehko2r@localhost> Message-ID: Thanks, I'll check it out. On Mon, May 13, 2019 at 8:14 PM Gorka Eguileor wrote: > On 13/05, Sa Pham wrote: > > Dear Gorka, > > > > Could you give me patch link on this work? > > > > Thank you > > Hi, > > You can see an update on the subject on the PTG's etherpad [1] starting > on line 119 until line 139. There's a video [2] of a previous > discussion topic and this one. > > Cheers, > Gorka. > > > [1]: https://etherpad.openstack.org/p/cinder-train-ptg-planning > [2]: https://www.youtube.com/watch?v=N6D6ib7T9Io&feature=em-lbcastemail > > > > > > On Mon, May 13, 2019 at 5:39 PM Gorka Eguileor > wrote: > > > > > On 13/05, zack chen wrote: > > > > Hi, > > > > > > > > Thanks for your reply. 
> > > > I saw that ceph already has the Iscsi Gateway. Does the cinder > project > > > have > > > > such a driver? > > > > > > > > > > Hi, > > > > > > There is an ongoing effort to write a new RBD driver specific for > iSCSI, > > > but it is not available yet. > > > > > > Cheers, > > > Gorka. > > > > > > > Gorka Eguileor 于2019年5月10日周五 下午6:39写道: > > > > > > > > > On 10/05, zack chen wrote: > > > > > > This is a normal Cinder in Openstack deployment > > > > > > > > > > > > I'm using ceph as cinder backend, RBD drvier. > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > If you are using a Ceph/RBD cluster then there are some things to > take > > > > > into consideration: > > > > > > > > > > - You need to have the ceph-common package installed in the system. > > > > > > > > > > - The images are mounted using the kernel module, so you have to be > > > > > careful with the features that are enabled in the images. > > > > > > > > > > - If I'm not mistaken the RBD attach using the cinderclient > extension > > > > > will fail if you don't have the configuration and credentials > file > > > > > already in the system. > > > > > > > > > > > > > > > > My ideas the instance should communicate with Openstack platform > > > storage > > > > > > network via the vrouter provided by neutron. The vrouter gateway > > > should > > > > > > communicate with Openstack platform. is or right? > > > > > > > > > > > > > > > > I can't help you on the network side, since I don't know anything > about > > > > > Neutron. > > > > > > > > > > Cheers, > > > > > Gorka. > > > > > > > > > > > Gorka Eguileor 于2019年5月9日周四 下午5:28写道: > > > > > > > > > > > > > On 08/05, zack chen wrote: > > > > > > > > Hi, > > > > > > > > I am looking for a mechanism that can be used for baremetal > > > attach > > > > > volume > > > > > > > > in a multi-tenant scenario. In addition we use ceph as the > > > backend > > > > > > > storage > > > > > > > > for cinder. > > > > > > > > > > > > > > > > Can anybody give me some advice? > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > Is this a stand alone Cinder deployment or a normal Cinder in > > > OpenStack > > > > > > > deployment? > > > > > > > > > > > > > > What storage backend will you be using? > > > > > > > > > > > > > > What storage protocol? iSCSI, FC, RBD...? > > > > > > > > > > > > > > Depending on these you can go with Walter's suggestion of using > > > > > > > cinderclient and its extension (which in general is the best > way to > > > > > go), > > > > > > > or you may prefer writing a small python script that uses > OS-Brick > > > and > > > > > > > makes the REST API calls directly. > > > > > > > > > > > > > > Cheers, > > > > > > > Gorka. > > > > > > > > > > > > > > > > > > > > > > -- > > Sa Pham Dang > > Master Student - Soongsil University > > Kakaotalk: sapd95 > > Skype: great_bn > -- Sa Pham Dang Master Student - Soongsil University Kakaotalk: sapd95 Skype: great_bn -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Mon May 13 11:47:53 2019 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 13 May 2019 13:47:53 +0200 Subject: [requirements] bandit bump to 1.6.0 Message-ID: Hello, FYI bandit 1.6.0 was released and changes the behavior of the '-x' option so that it now supports glob patterns. Many openstack projects will facing bandit issues due to these changes. 
Two possibilities exists: - pin your bandit version to < 1.6.0 - accept 1.6.0 and modify your bandit call by passing a patterns like this https://review.opendev.org/#/c/658319/1 We also need to update openstack/requirements ( https://review.opendev.org/#/c/658767/) I think the better approach is to use 1.6.0 now and to fix the bandit command to avoid issues in the future, and avoid undesired reviews on this topic. Regards -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Mon May 13 11:50:11 2019 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 13 May 2019 13:50:11 +0200 Subject: [requirements] bandit bump to 1.6.0 In-Reply-To: References: Message-ID: Alreaady discussed here => http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006116.html Sorry Le lun. 13 mai 2019 à 13:47, Herve Beraud a écrit : > Hello, > > FYI bandit 1.6.0 was released and changes the behavior of the '-x' option > so that it now supports glob patterns. > > Many openstack projects will facing bandit issues due to these changes. > > Two possibilities exists: > - pin your bandit version to < 1.6.0 > - accept 1.6.0 and modify your bandit call by passing a patterns like this > https://review.opendev.org/#/c/658319/1 > > We also need to update openstack/requirements ( > https://review.opendev.org/#/c/658767/) > > I think the better approach is to use 1.6.0 now and to fix the bandit > command to avoid issues in the future, and avoid undesired reviews on this > topic. 
> > Regards > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From berndbausch at gmail.com Mon May 13 12:14:29 2019 From: berndbausch at gmail.com (Bernd Bausch) Date: Mon, 13 May 2019 21:14:29 +0900 Subject: [tc] [ceilometer] Rocky release notes for Ceilometer Message-ID: <907ABCCF-C566-40B4-B479-5F5D3A5923EB@gmail.com> I don’t know if it’s correct to tag this message with [tc]; sorry if not. The Ceilometer Rocky release notes are not published on the releases web site [1]. They were distributed via email only [2], it would seem. Whoever manages the releases site or the Ceilometer release notes may want to fill that gap to make it easier to find information. [1] https://releases.openstack.org/rocky/index.html [2] http://lists.openstack.org/pipermail/release-announce/2018-July.txt Bernd -------------- next part -------------- An HTML attachment was scrubbed... URL: From josephine.seifert at secustack.com Mon May 13 12:19:43 2019 From: josephine.seifert at secustack.com (Josephine Seifert) Date: Mon, 13 May 2019 14:19:43 +0200 Subject: [nova][cinder][glance][Barbican]Finding Timeslot for weekly Image Encryption IRC meeting In-Reply-To: References: Message-ID: <798dc164-1ed3-10f3-6de2-e902ae269869@secustack.com> Just re-raising this :) Please vote, if you would like to participate: https://doodle.com/poll/wtg9ha3e5dvym6yt Am 04.05.19 um 20:57 schrieb Josephine Seifert: > Hello, > > as a result from the Summit and the PTG, I would like to hold a weekly > IRC-meeting for the Image Encryption (soon to be a pop-up team).  > > As I work in Europe I have made a doodle poll, with timeslots I can > attend and hopefully many of you. 
If you would like to join in a weekly > meeting, please fill out the poll and state your name and the project > you are working in: > https://doodle.com/poll/wtg9ha3e5dvym6yt > > Thank you > Josephine (Luzi) > > > From dangtrinhnt at gmail.com Mon May 13 12:22:16 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 13 May 2019 21:22:16 +0900 Subject: [tc] [ceilometer] Rocky release notes for Ceilometer In-Reply-To: <907ABCCF-C566-40B4-B479-5F5D3A5923EB@gmail.com> References: <907ABCCF-C566-40B4-B479-5F5D3A5923EB@gmail.com> Message-ID: Hi Bernd, Let me have a look and fix it. Thank for reporting. On Mon, May 13, 2019 at 9:18 PM Bernd Bausch wrote: > I don’t know if it’s correct to tag this message with [tc]; sorry if not. > > The Ceilometer Rocky release notes are not published on the releases web > site [1]. They were distributed via email only [2], it would seem. Whoever > manages the releases site or the Ceilometer release notes may want to fill > that gap to make it easier to find information. > > [1] https://releases.openstack.org/rocky/index.html > [2] http://lists.openstack.org/pipermail/release-announce/2018-July.txt > > Bernd > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From josorior at redhat.com Mon May 13 04:57:12 2019 From: josorior at redhat.com (Juan Osorio Robles) Date: Mon, 13 May 2019 07:57:12 +0300 Subject: [oslo] PTG Summary In-Reply-To: References: Message-ID: On Wed, May 8, 2019 at 10:49 PM Ben Nemec wrote: > Hi, > > You can find the raw notes on the etherpad > (https://etherpad.openstack.org/p/oslo-train-topics), but hopefully this > will be an easier to read/understand summary. > > Pluggable Policy > ---------------- > Spec: https://review.opendev.org/#/c/578719/ > > Since this sort of ran out of steam last cycle, we discussed the option > of not actually making it pluggable and just explicitly adding support > for other policy backends. The specific one that seems to be of interest > is Open Policy Agent. To do this we would add an option to enable OPA > mode, where all policy checks would be passed through to OPA by default. > An OPACheck class would also be added to facilitate migration (as a rule > is added to OPA, switch the policy to OPACheck. Once all rules are > present, remove the policy file and just turn on the OPA mode). > > However, after some further investigation by Patrick East, it was not > clear if users were asking for this or if the original spec was more of > a "this might be useful" thing. He's following up with some OPA users to > see if they would use such a feature, but at this point it's not clear > whether there is enough demand to justify spending time on it. > The OPA plugin (and the pluggable plugin spec that came from it) was originally made by me to address the dynamic & centralized policy problem. Unfortunately I won't be able to continue this work; so if nobody steps up, I'd have to agree that this can be discarded. > > Image Encryption/Decryption Library > ----------------------------------- > I mention this mostly because the current plan is _not_ to create a new > Oslo library to enable the feature. The common code between services is > expected to live in os-brick, and there does not appear to be a need to > create a new encryption library to support this (yay!). 
> > oslo.service SIGHUP bug > ----------------------- > This is a problem a number of people have run into recently and there's > been some ongoing, but spotty, discussion of how to deal with it. In > Denver we were able to have some face-to-face discussions and hammer out > a plan to get this fixed. I think we have a fix identified, and now we > just need to get it proposed and tested so we don't regress this in the > future. Most of the prior discussion and a previously proposed fix are > at https://review.opendev.org/#/c/641907/ so if you want to follow this > that's the place to do it. > > In case anyone is interested, it looks like this is a bug that was > introduced with mutable config. Mutable config requires a different type > of service restart, and that was never implemented. Now that most > services are using mutable config, this is much bigger problem. > > Unified Limits and Policy > ------------------------- > I won't try to cover everything in detail here, but good progress was > made on both of these topics. There isn't much to do from the Oslo side > for the policy changes, but we identified a plan for an initial > implementation of oslo.limit. There was general agreement that we don't > necessarily have to get it 100% right on the first attempt, we just need > to get something in the repo that people can start prototyping with. > Until we release a 1.0 we aren't committed to any API, so we have > flexibility to iterate. > > For more details, see: > https://etherpad.openstack.org/p/ptg-train-xproj-nova-keystone > > oslo.service profiling and pypy > ------------------------------- > Oslo has dropped support for pypy in general due to lack of maintainers, > so although the profiling work has apparently broken oslo.service under > pypy this isn't something we're likely to address. Based on our > conversation at the PTG game night, it sounds like this isn't a priority > anymore anyway because pypy didn't have the desired performance > improvement. > > oslo.privsep eventlet timeout > ----------------------------- > AFAICT, oslo.privsep only uses eventlet at all if monkey-patching is > enabled (and then only to make sure it returns the right type of pipe > for the environment). It's doubtful any eventlet exceptions are being > raised from the privsep code, and even if they are they would go away > once monkey-patching in the calling service is disabled. Privsep is > explicitly not depending on eventlet for any of its functionality so > services should be able to freely move away from eventlet if they wish. > > Retrospective > ------------- > In general, we got some major features implemented that unblocked things > either users or services were asking for. We did add two cores during > the cycle, but we also lost a long-time Oslo core and some of the other > cores are being pulled away on other projects. So far this has probably > resulted in a net loss in review capacity. > > As a result, our primary actions out of this were to continue watching > for new candidates to join the Oslo team. We have at least one person we > are working closely with and a number of other people approached me at > the event with interest in contributing to one or more Oslo projects. So > while this cycle was a bit of a mixed bag, I have a cautiously > optimistic view of the future. > > Service Healthchecks and Metrics > -------------------------------- > Had some initial hallway track discussions about this. 
The self-healing > SIG is looking into ways to improve the healthcheck and metric situation > in OpenStack, and some of them may require additions or changes in Oslo. > There is quite a bit of discussion (not all of which I have read yet) > related to this on https://review.opendev.org/#/c/653707/ > > On the metrics side, there are some notes on the SIG etherpad (currently > around line 209): https://etherpad.openstack.org/p/DEN-self-healing-SIG > > It's still a bit early days for both of these things so plans may > change, but it seems likely that Oslo will be involved to some extent. > Stay tuned. > > Endgame > ------- > No spoilers, I promise. If you made it all the way here then thanks and > congrats. :-) > > I hope this was helpful, and if you have any thoughts about anything > above please let me know. > > Thanks. > > -Ben > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mordred at inaugust.com Mon May 13 12:39:01 2019 From: mordred at inaugust.com (Monty Taylor) Date: Mon, 13 May 2019 12:39:01 +0000 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: <1CC272501B5BC543A05DB90AA509DED52755A484@fmsmsx122.amr.corp.intel.com> References: <2de134d3-629c-5396-4b1d-0c1dd0c42065@inaugust.com> <1CC272501B5BC543A05DB90AA509DED52755A484@fmsmsx122.amr.corp.intel.com> Message-ID: <2d695b60-4fab-0a10-f366-32e456f767ee@inaugust.com> On 5/10/19 11:55 PM, Nadathur, Sundar wrote: >> -----Original Message----- >> From: Monty Taylor >> Sent: Friday, May 10, 2019 1:43 PM >> To: openstack-discuss at lists.openstack.org >> Subject: Re: [OSC][PTG] Summary: many things to do! >> >> On 5/10/19 4:48 PM, Dean Troyer wrote: >>> OpenStackClient held a session at the Denver PTG and despite not >>> having much planned had plenty to talk about. Some of the highlights >>> from the etherpad[0] are: >> >> Well, poo. Sorry I missed it. >> >>> * Aretm is working on changing the Image commands to use OpenStackSDK. >>> This is the work described in the cycle goal proposal[1] that he is >>> planning to do anyway. I support going ahead with this even without >>> an SDK 1.0 release as it lets us remove glanceclient and some of its >>> unique dependencies. >>> >>> * There was some discussion about image encryption and where the >>> client-side bits of that may land. One option was to put it into >>> os-brick; if that is where it winds up OSC will make that an optional >>> dependency de to the number of other dependencies that will introduce. >>> (ie, OSC currently uses very little of oslo, some of which brings in a >>> number of things not otherwise needed client-side). >> >> We landed support for image signing in SDK: >> >> https://opendev.org/openstack/openstacksdk/src/branch/master/openstack/i >> mage/image_signer.py > > The old python-glanceclient has an option to download the image to a local file. I missed that when using the clouds.yaml-based approach that directly accesses the Glance API. Hope we can add that option to the openstacksdk-based client. Yup - we have this. The shade side of sdk has had this for a while. We just landed a refactor which pushes all of the shade image logic into the sdk layer (thanks to Artem) - so when 0.28.0 is cut later today, conn.image.download() should work properly in the sdk layer too. 
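As a rough, untested sketch of that sdk-layer usage (the cloud name 'mycloud' and image name 'cirros' are just placeholders, not anything from this thread):

import openstack

conn = openstack.connect(cloud='mycloud')

# Find the image and stream its data to a local file via the image proxy.
image = conn.image.find_image('cirros')
with open('/tmp/cirros.img', 'wb') as local_file:
    response = conn.image.download_image(image, stream=True)
    for chunk in response.iter_content(chunk_size=1024 * 1024):
        local_file.write(chunk)
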
But in the meantime, you can just use conn.download_image and it'll work fine Monty From sfinucan at redhat.com Mon May 13 13:02:14 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Mon, 13 May 2019 14:02:14 +0100 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: References: Message-ID: <768df039575b8cad3ee3262230534d7309f9c09f.camel@redhat.com> On Fri, 2019-05-10 at 11:48 -0500, Dean Troyer wrote: > OpenStackClient held a session at the Denver PTG and despite not > having much planned had plenty to talk about. Some of the highlights > from the etherpad[0] are: > > * Aretm is working on changing the Image commands to use OpenStackSDK. > This is the work described in the cycle goal proposal[1] that he is > planning to do anyway. I support going ahead with this even without > an SDK 1.0 release as it lets us remove glanceclient and some of its > unique dependencies. > > * There was some discussion about image encryption and where the > client-side bits of that may land. One option was to put it into > os-brick; if that is where it winds up OSC will make that an optional > dependency de to the number of other dependencies that will introduce. > (ie, OSC currently uses very little of oslo, some of which brings in a > number of things not otherwise needed client-side). > > * Doug brought up the problems with load times due to scanning the > entire import path on every invocation. I found in my notes almost > exactly 2 years ago where we discussed this same topic. AS we did > then, the idea of skipping entry points entirely for commands in the > OSC repo is the best solution we have found. This would help some > common cases but still leave all plugins with slow load times. > > * Nate Johnston asked about supporting bulk create APIs, such as > Neutron's bulk port create. After kicking around a couple of options > the rough consensus is around using a YAML file (or JSON or both?) to > define the resources to be created and giving it to a new top-level > 'create' command (yes, verb only, the resource names will be in the > YAML file). APIs that allow bulk creates will get a single call with > the entire list, for other APIs we can loop and feed them one at a > time. This would be very similar to using interactive mode and > feeding in a list of commands, stopping at the first failure. Note > that delete commands already take multiple resource identifiers, > adding the ability to source that list from YAML would be an easy > addition. > > * OSC4 has been waiting in a feature branch for over 2 years (where > has that PTL been???). I recently tried to merge master in to see how > far off it was, it was enough that I think we should just cherry-pick > the commites in that branch to master and move forward. So the > current plan is to: > * do one more release in the 3.x series to clean up outstanding things > * switch to OSC4 development, cherry pick in amotoki's existing > commits[2] (mostly changes to output formatting) > * refresh and merge other reviews in the osc4 topic > * remove all of the backward-compatibility code in the OSC > authentication process so OSC will now work like all other pure > keystoneauth- and sdk-using code. 
> > > Also relevant to OSC but covered in a Nova Forum session[3,4], highlights: > > * boot-from-volume: Support type=image for the --block-device-mapping, > and Add a --boot-from-volume option which will translate to a root > --block-device-mapping using the provided --image value > > * server migrate --live: deprecate the --live option and add a new > --live-migration option and a --host option Could I suggest we don't bother deprecating the '--live' option in 3.x and simply rework it for 4.0? This is of course assuming the 4.0 release date is months away and not years, of course. > * compute migration: begin exposing this resource in the CLI > > dt > > [0] https://etherpad.openstack.org/p/train-ptg-osc > [1] https://review.opendev.org/#/c/639376/ > [2] this series starts at https://review.opendev.org/#/c/657907/ > [3] https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps > [4] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005783.html > > From doug at doughellmann.com Mon May 13 13:09:51 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Mon, 13 May 2019 09:09:51 -0400 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: References: Message-ID: Dean Troyer writes: > * Doug brought up the problems with load times due to scanning the > entire import path on every invocation. I found in my notes almost > exactly 2 years ago where we discussed this same topic. AS we did > then, the idea of skipping entry points entirely for commands in the > OSC repo is the best solution we have found. This would help some > common cases but still leave all plugins with slow load times. Because of the layering to handle versioned plugins this is going to be a little more complex than I expected and I'm unlikely to have time to work on it this cycle. I would be happy to talk about it if someone else wants to pick it up -- it's definitely possible, just not in the time I have available. -- Doug From jean-philippe at evrard.me Mon May 13 13:10:50 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Mon, 13 May 2019 09:10:50 -0400 Subject: =?UTF-8?Q?Re:_[ptg][kolla][openstack-ansible][tripleo]_PTG_cross-project?= =?UTF-8?Q?_summary?= In-Reply-To: References: Message-ID: Let's see what this all gives indeed :) > I'd like to thank the OSA team for hosting the discussion - it was great to meet the team and share experience. Thanks for being there, and for being open to discussion. It was a pleasure to see you in those conversations! Regards, Jean-Philippe Evrard From lajos.katona at ericsson.com Mon May 13 13:17:32 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Mon, 13 May 2019 13:17:32 +0000 Subject: [neutron] Bug deputy report 2019-05-06 - 2019-05-13 Message-ID: <41d574c2-e181-0d38-2f00-51fc841fc406@ericsson.com> Hi Neutrinos, This week's bug deputy report for networking projects: Highlights: * There are 2 ryu related bugs with Rocky. 
* Unassigned bugs: 1828205, 1828547, 1828605, 1828375, 1828406 * 3 RFEs HIGH MEDIUM #1828053 GRE tunnels between VMs don't work when openvswitch firewall is used https://bugs.launchpad.net/neutron/+bug/1828053 Fix Released #1828205 "network-segment-ranges" doesn't return the project_id https://bugs.launchpad.net/neutron/+bug/1828205 Unassigned #1828543 Routed provider networks: placement API handling errors https://bugs.launchpad.net/neutron/+bug/1828543 Assigned #1828605 [l3][scale issue] unrestricted hosting routers in network node increase service operating pressure https://bugs.launchpad.net/neutron/+bug/1828605 Unassigned, related to #1828494 #1828721 [VPNaaS]: Check restart_check_config enabled https://bugs.launchpad.net/neutron/+bug/1828721 In Progress LOW #1828363 FdbInterfaceTestCase interface names should be randomly generated https://bugs.launchpad.net/neutron/+bug/1828363 In Progress #1828375 Bulk creation of subports fails with StaleDataError https://bugs.launchpad.net/neutron/+bug/1828375 Reported from Ocata, need more effort to reproduce on master/other branch #1828406 neutron-dynamic-routing bgp ryu hold timer expired but never tried to recover https://bugs.launchpad.net/neutron/+bug/1828406 Rocky with ryu, Unassigned #1828423 Very often in DHCP agent we have ""Duplicate IP addresses found, DHCP cache is out of sync" https://bugs.launchpad.net/neutron/+bug/1828423 Assigned #1828437 Remove unneeded compatibility conversions https://bugs.launchpad.net/neutron/+bug/1828437 In Progress #1828473 Dnsmasq spawned by neutron-dhcp-agent should use bind-dynamic option instead of bind-interfaces https://bugs.launchpad.net/neutron/+bug/1828473 In Progress #1828547 neutron-dynamic-routing TypeError: argument of type 'NoneType' is not iterable https://bugs.launchpad.net/neutron/+bug/1828547 Rocky with ryu, Unassigned RFE #1828367 [RFE] Ironic notifier - Notify Ironic on port status changes https://bugs.launchpad.net/neutron/+bug/1828367 In Progress #1828494 [RFE][L3] l3-agent should have its capacity https://bugs.launchpad.net/neutron/+bug/1828494 In Progress #1828607 [RFE] DVR Enhancements https://bugs.launchpad.net/neutron/+bug/1828607 In Progress Regards Lajos -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Mon May 13 13:42:30 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Mon, 13 May 2019 09:42:30 -0400 Subject: =?UTF-8?Q?Re:_[all][requirements][stable]_requests_version_bump_on_stabl?= =?UTF-8?Q?e_brances_{pike|queens}_for_CVE-2018-18074?= In-Reply-To: <20190509155455.7wkszge3e7bykgsj@mthode.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190509134808.4eqwwjcdxjpt37wh@yuggoth.org> <20190509155455.7wkszge3e7bykgsj@mthode.org> Message-ID: <4a118cfa-7c10-455e-b084-b4cc911c8053@www.fastmail.com> > To extend on this, I thought that OSA had the ability to override > certian constraints (meaning they could run the check and maintain the > overrides on their end). > OSA does indeed. But this problem is not limited to OSA, AFAIK. If I read Jesse's comment correctly, the point was to get a clear state of what we do as a community. I agree with Jesse, we should do as much upstream as we can, so that the whole community benefits from it. If things are updated on a best effort basis in u-c, more than a single project benefits from this. 
If things are not updated on a best effort basis, then source based deployment projects should discuss together on making this a reality. In all cases, this deserves documentation if it's not documented already (I totally missed that part of the documentation myself). Regards, JP From fungi at yuggoth.org Mon May 13 13:51:03 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 13 May 2019 13:51:03 +0000 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <4a118cfa-7c10-455e-b084-b4cc911c8053@www.fastmail.com> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190509134808.4eqwwjcdxjpt37wh@yuggoth.org> <20190509155455.7wkszge3e7bykgsj@mthode.org> <4a118cfa-7c10-455e-b084-b4cc911c8053@www.fastmail.com> Message-ID: <20190513135102.oqcgxhkfgepytih7@yuggoth.org> On 2019-05-13 09:42:30 -0400 (-0400), Jean-Philippe Evrard wrote: [...] > I agree with Jesse, we should do as much upstream as we can, so > that the whole community benefits from it. If things are updated > on a best effort basis in u-c, more than a single project benefits > from this. If things are not updated on a best effort basis, then > source based deployment projects should discuss together on making > this a reality. In all cases, this deserves documentation if it's > not documented already (I totally missed that part of the > documentation myself). I don't see anything wrong with a best-effort attempt by folks who build or rely on source-based deployments from stable branches, my primary concerns remain: 1. This goal is tangential to (and even conflicting with) the purpose of the requirements repository's upper-constraints.txt file so should probably be managed independently of that. 2. As a project we should be clear that this is a not-at-all-timely post-hoc attempt at reflecting somewhat secure deployment sets and can't guarantee we will always be able to find a solution for (or perhaps even notice) many future vulnerabilities in the transitive dependency tree where stable branches of our software are concerned. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jim at jimrollenhagen.com Mon May 13 13:54:52 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Mon, 13 May 2019 09:54:52 -0400 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: <20190503190538.GB3377@localhost.localdomain> References: <20190503190538.GB3377@localhost.localdomain> Message-ID: On Fri, May 3, 2019 at 3:05 PM Paul Belanger wrote: > On Fri, May 03, 2019 at 08:48:10PM +0200, Roman Gorshunov wrote: > > Hello Jim, team, > > > > I'm from Airship project. I agree with archival of Github mirrors of > > repositories. One small suggestion: could we have project descriptions > > adjusted to point to the new location of the source code repository, > > please? E.g. "The repo now lives at opendev.org/x/y". > > > This is something important to keep in mind from infra side, once the > repo is read-only, we lose the ability to use the API to change it. > > From manage-projects.py POV, we can update the description before > flipping the archive bit without issues, just need to make sure we have > the ordering correct. > Agree this is a good idea. There's been no objections to this plan for some time now. 
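Roughly, the ordering Paul describes boils down to two PATCH calls against the GitHub v3 API - an untested sketch, with the repo name and token handling made up purely for illustration:

import os
import requests

API = 'https://api.github.com/repos/openstack/some-retired-repo'  # hypothetical repo
HEADERS = {
    'Authorization': 'token ' + os.environ['GITHUB_TOKEN'],
    'Accept': 'application/vnd.github.v3+json',
}

# 1. Update the description while the repo is still writable.
requests.patch(API, headers=HEADERS, json={
    'description': 'The repo now lives at opendev.org/x/y',
}).raise_for_status()

# 2. Flip the archived bit; as noted below, there is no API call to undo this.
requests.patch(API, headers=HEADERS, json={'archived': True}).raise_for_status()
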
Is there someone from the infra team available to do this archival (or work with me to do it)? // jim > > Also, there is no API to unarchive a repo from github sadly, for that a > human needs to log into github UI and click the button. I have no idea > why. > > - Paul > > > Thanks to AJaeger & clarkb. > > > > Thank you. > > > > Best regards, > > -- Roman Gorshunov > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sneha.rai at hpe.com Mon May 13 13:50:08 2019 From: sneha.rai at hpe.com (RAI, SNEHA) Date: Mon, 13 May 2019 13:50:08 +0000 Subject: Help needed to Support Multi-attach feature In-Reply-To: <20190513085118.3hfsekvtabq6ipm2@localhost> References: <20190510092600.r27zetl5e3k5ow5v@localhost> <20190513085118.3hfsekvtabq6ipm2@localhost> Message-ID: Thanks Gorka for your response. The main reason is "AUTO: Connection to libvirt lost: 1". Not sure, why the connection is being lost. I tried restarting all the nova services too, but no luck. Regards, Sneha Rai -----Original Message----- From: Gorka Eguileor [mailto:geguileo at redhat.com] Sent: Monday, May 13, 2019 2:21 PM To: RAI, SNEHA Cc: openstack-dev at lists.openstack.org Subject: Re: Help needed to Support Multi-attach feature On 10/05, RAI, SNEHA wrote: > Thanks Gorka for your response. > > I have changed the version of libvirt and qemu on my host and I am able to move past the previous error mentioned in my last email. > > Current versions of libvirt and qemu: > root at CSSOSBE04-B09:/etc# libvirtd --version libvirtd (libvirt) 1.3.1 > root at CSSOSBE04-B09:/etc# kvm --version QEMU emulator version 2.5.0 > (Debian 1:2.5+dfsg-5ubuntu10.36), Copyright (c) 2003-2008 Fabrice > Bellard > > Also, I made a change in /etc/nova/nova.conf and set virt_type=qemu. Earlier it was set to kvm. > I restarted all nova services post the changes but I can see one nova service was disabled and state was down. > > root at CSSOSBE04-B09:/etc# nova service-list > +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ > | Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | Forced down | > +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ > | 1ebcd1f6-b7dc-40ce-8d7b-95d60503c0ff | nova-scheduler | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:59.000000 | - | False | > | ed82277c-d2e0-4a1a-adf6-9bcdcc50ba29 | nova-consoleauth | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:49.000000 | - | False | > | bc2b6703-7a1e-4f07-96b9-35cbb14398d5 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:59.000000 | - | False | > | 72ecbc1d-1b47-4f55-a18d-de2fbf1771e9 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:54.000000 | - | False | > | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | nova-compute | CSSOSBE04-B09 | nova | disabled | down | 2019-05-07T22:11:06.000000 | AUTO: Connection to libvirt lost: 1 | False | > +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ > > So, I manually enabled the service, but the state was still down. 
> root at CSSOSBE04-B09:/etc# nova service-enable > 9c700ee1-1694-479b-afc0-1fd37c1a5561 > +--------------------------------------+---------------+--------------+---------+ > | ID | Host | Binary | Status | > +--------------------------------------+---------------+--------------+---------+ > | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | CSSOSBE04-B09 | nova-compute > | | enabled | > +--------------------------------------+---------------+--------------+---------+ > > root at CSSOSBE04-B09:/etc# nova service-list > +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ > | Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | Forced down | > +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ > | 1ebcd1f6-b7dc-40ce-8d7b-95d60503c0ff | nova-scheduler | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | > | ed82277c-d2e0-4a1a-adf6-9bcdcc50ba29 | nova-consoleauth | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | > | bc2b6703-7a1e-4f07-96b9-35cbb14398d5 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | > | 72ecbc1d-1b47-4f55-a18d-de2fbf1771e9 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:14.000000 | - | False | > | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | nova-compute | CSSOSBE04-B09 | nova | enabled | down | 2019-05-10T05:49:14.000000 | - | False | > +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ > Hi, If it appears as down it's probably because there is an issue during the service's start procedure. You can look in the logs to see what messages appeared during the start or tail the logs and restart the service to see what error appears there. Cheers, Gorka. > So, now when I try to attach a volume to nova instance, I get the below error. As one of the service is down it fails in filter validation for nova-compute and gives us "No host" error. 
> > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG > nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 > #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFilter > RetryFilter returned 1 host(s)#033[00m #033[00;33m{{(pid=21775) > get_filtered_objects /opt/stack/nova/nova/filters.py:104}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG > nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 > #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFilter > AvailabilityZoneFilter returned 1 host(s)#033[00m > #033[00;33m{{(pid=21775) get_filtered_objects > /opt/stack/nova/nova/filters.py:104}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG > nova.scheduler.filters.compute_filter [#033[01;36mNone > req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo > admin#033[00;32m] #033[01;35m#033[00;32m(CSSOSBE04-B09, CSSOSBE04-B09) > ram: 30810MB disk: 1737728MB io_ops: 0 instances: 1 is disabled, > reason: AUTO: Connection to libvirt lost: 1#033[00m > #033[00;33m{{(pid=21775) host_passes > /opt/stack/nova/nova/scheduler/filters/compute_filter.py:42}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;36mINFO > nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 > #033[00;36mdemo admin#033[00;36m] #033[01;35m#033[00;36mFilter > ComputeFilter returned 0 hosts#033[00m May 10 10:43:00 CSSOSBE04-B09 > nova-scheduler[21775]: #033[00;32mDEBUG nova.filters [#033[01;36mNone > req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo > admin#033[00;32m] #033[01;35m#033[00;32mFiltering removed all hosts > for the request with instance ID > '1735ece5-d187-454a-aab1-12650646a2ec'. Filter results: > [('RetryFilter', [(u'CSSOSBE04-B09', u'CSSOSBE04-B09')]), > ('AvailabilityZoneFilter', [(u'CSSOSBE04-B09', u'CSSOSBE04-B09')]), > ('ComputeFilter', None)]#033[00m #033[00;33m{{(pid=21775) > get_filtered_objects /opt/stack/nova/nova/filters.py:129}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;36mINFO > nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 > #033[00;36mdemo admin#033[00;36m] #033[01;35m#033[00;36mFiltering > removed all hosts for the request with instance ID > '1735ece5-d187-454a-aab1-12650646a2ec'. Filter results: ['RetryFilter: > (start: 1, end: 1)', 'AvailabilityZoneFilter: (start: 1, end: 1)', > 'ComputeFilter: (start: 1, end: 0)']#033[00m May 10 10:43:00 > CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG > nova.scheduler.filter_scheduler [#033[01;36mNone > req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo > admin#033[00;32m] #033[01;35m#033[00;32mFiltered []#033[00m > #033[00;33m{{(pid=21775) _get_sorted_hosts > /opt/stack/nova/nova/scheduler/filter_scheduler.py:404}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG > nova.scheduler.filter_scheduler [#033[01;36mNone > req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo > admin#033[00;32m] #033[01;35m#033[00;32mThere are 0 hosts available > but 1 instances requested to build.#033[00m #033[00;33m{{(pid=21775) > _ensure_sufficient_hosts > /opt/stack/nova/nova/scheduler/filter_scheduler.py:279}}#033[00m > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: #033[01;31mERROR nova.conductor.manager [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[01;31m] #033[01;35m#033[01;31mFailed to schedule instances#033[00m: NoValidHost_Remote: No valid host was found. 
There are not enough hosts available. > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: Traceback (most recent call last): > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 226, in inner > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: return func(*args, **kwargs) > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/manager.py", line 154, in select_destinations > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: allocation_request_version, return_alternates) > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 91, in select_destinations > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: allocation_request_version, return_alternates) > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 244, in _schedule > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: claimed_instance_uuids) > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 281, in _ensure_sufficient_hosts > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: raise exception.NoValidHost(reason=reason) > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: NoValidHost: No valid host was found. There are not enough hosts available. > > Need help in understanding on how to fix this error. For detailed logs, please refer the attached syslog. > > > Thanks & Regards, > Sneha Rai > > > > > > -----Original Message----- > From: Gorka Eguileor [mailto:geguileo at redhat.com] > Sent: Friday, May 10, 2019 2:56 PM > To: RAI, SNEHA > > Cc: openstack-dev at lists.openstack.org > Subject: Re: Help needed to Support Multi-attach feature > > > > On 02/05, RAI, SNEHA wrote: > > > Hi Team, > > > > > > I am currently working on multiattach feature for HPE 3PAR cinder driver. > > > > > > For this, while setting up devstack(on stable/queens) I made below > > > change in the local.conf [[local|localrc]] > > > ENABLE_VOLUME_MULTIATTACH=True ENABLE_UBUNTU_CLOUD_ARCHIVE=False > > > > > > /etc/cinder/cinder.conf: > > > [3pariscsi_1] > > > hpe3par_api_url = > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__192.168.1.7-3A8 > > 08 > > > 0_api_v1&d=DwIBAg&c=C5b8zRQO1miGmBeVZ2LFWg&r=8drU3i56Z5sQ_Ltpya89LTN > > n3 > > > xDSwtigjYbGrSY1lM8&m=zTRvI4nj8MoP0_z5MmxTYwKiNNW6addwP4L5VFG4wkg&s=a > > 2D > > > HbzzRtbbBPz0_kfodZv5X1HxbN_hFxte5rEZabAg&e= > > > hpe3par_username = user > > > hpe3par_password = password > > > san_ip = 192.168.1.7 > > > san_login = user > > > san_password = password > > > volume_backend_name = 3pariscsi_1 > > > hpe3par_cpg = my_cpg > > > hpe3par_iscsi_ips = 192.168.11.2,192.168.11.3 volume_driver = > > > cinder.volume.drivers.hpe.hpe_3par_iscsi.HPE3PARISCSIDriver > > > hpe3par_iscsi_chap_enabled = True > > > hpe3par_debug = True > > > image_volume_cache_enabled = True > > > > > > /etc/cinder/policy.json: > > > 'volume:multiattach': 'rule:admin_or_owner' > > > > > > Added https://urldefense.proofpoint.com/v2/url?u=https-3A__review.opendev.org_-23_c_560067_2_cinder_volume_drivers_hpe_hpe-5F3par-5Fcommon.py&d=DwIBAg&c=C5b8zRQO1miGmBeVZ2LFWg&r=8drU3i56Z5sQ_Ltpya89LTNn3xDSwtigjYbGrSY1lM8&m=zTRvI4nj8MoP0_z5MmxTYwKiNNW6addwP4L5VFG4wkg&s=U8n1fpI-4OVYOSjST8IL0x0BRUhTLyumOpRZMJ_sVOI&e= change in the code. 
> > > > > > But I am getting below error in the nova log: > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [None req-2cda6e90-fd45-4bfe-960a-7fca9ba4abab demo admin] [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Instance failed block device setup: MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Traceback (most recent call last): > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/compute/manager.py", line 1615, in _prep_block_device > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] wait_func=self._await_block_device_map_created) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 840, in attach_block_devices > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] _log_and_attach(device) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 837, in _log_and_attach > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] bdm.attach(*attach_args, **attach_kwargs) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 46, in wrapped > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] ret_val = method(obj, context, *args, **kwargs) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 620, in attach > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] virt_driver, do_driver_attach) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] return f(*args, **kwargs) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 617, in _do_locked_attach > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] self._do_attach(*args, **_kwargs) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 602, in _do_attach > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR 
nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] do_driver_attach) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 509, in _volume_attach > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] volume_id=volume_id) > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR > > > nova.compute.manager [instance: > > fcaa5a47-fc48-489d-9827-6533bfd1a9fa] > > > > > > > > > Apr 29 05:41:20 CSSOSBE04-B09 nova-compute[20455]: DEBUG > > > nova.virt.libvirt.driver [-] Volume multiattach is not supported > > based > > > on current versions of QEMU and libvirt. QEMU must be less than 2.10 > > > or libvirt must be greater than or equal to 3.10. {{(pid=20455) > > > _set_multiattach_support > > > /opt/stack/nova/nova/virt/libvirt/driver.py:619}} > > > > > > > > > stack at CSSOSBE04-B09:/tmp$ virsh --version > > > 3.6.0 > > > stack at CSSOSBE04-B09:/tmp$ kvm --version QEMU emulator version > > > 2.10.1(Debian 1:2.10+dfsg-0ubuntu3.8~cloud1) Copyright (c) 2003-2017 > > > Fabrice Bellard and the QEMU Project developers > > > > > > > Hi Sneha, > > > > I don't know much about this side of Nova, but reading the log error I would say that you either need to update your libvirt version from 3.6.0 to 3.10, or you need to downgrade your QEMU version to something prior to 2.10. > > > > The later is probably easier. > > > > I don't use Ubuntu, but according to the Internet you can list > available versions with "apt-cache policy qemu" and then install or > downgrade to the specific version with "sudo apt-get install > qemu=2.5\*" if you wanted to install version 2.5 > > > > I hope this helps. > > > > Cheers, > > Gorka. > > > > > > > > openstack volume show -c multiattach -c status sneha1 > > > +-------------+-----------+ > > > | Field | Value | > > > +-------------+-----------+ > > > | multiattach | True | > > > | status | available | > > > +-------------+-----------+ > > > > > > cinder extra-specs-list > > > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > | ID | Name | extra_specs | > > > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > | bd077fde-51c3-4581-80d5-5855e8ab2f6b | 3pariscsi_1 | > > > | {'volume_backend_name': '3pariscsi_1', 'multiattach': ' > > | True'}| > > > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > > > > > > > echo $OS_COMPUTE_API_VERSION > > > 2.60 > > > > > > pip list | grep python-novaclient > > > DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. > > > python-novaclient 13.0.0 > > > > > > How do I fix this version issue on my setup to proceed? Please help. > > > > > > Thanks & Regards, > > > Sneha Rai -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aj at suse.com Mon May 13 13:40:33 2019 From: aj at suse.com (Andreas Jaeger) Date: Mon, 13 May 2019 13:40:33 +0000 Subject: [ceilometer] Rocky release notes for Ceilometer In-Reply-To: <907ABCCF-C566-40B4-B479-5F5D3A5923EB@gmail.com> References: <907ABCCF-C566-40B4-B479-5F5D3A5923EB@gmail.com> Message-ID: <792fbf82-94a2-5747-44fd-e4b6cb7327eb@suse.com> On 13/05/2019 14.14, Bernd Bausch wrote: > I don’t know if it’s correct to tag this message with [tc]; sorry if not. > > The Ceilometer Rocky release notes are not published on the releases web > site [1]. They were distributed via email only [2], it would seem. > Whoever manages the releases site or the Ceilometer release notes may > want to fill that gap to make it easier to find information. > > [1] https://releases.openstack.org/rocky/index.html > [2] http://lists.openstack.org/pipermail/release-announce/2018-July.txt Looking at: https://docs.openstack.org/releasenotes/ceilometer/ There are no rocky releasenotes listed - so, nothing the main page could link to. So, I would say: This works as designed, Btw. removing [tc], the tc is not involved in releases - [release] would be better but not needed since this is solved ;) Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From dangtrinhnt at gmail.com Mon May 13 14:21:25 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 13 May 2019 23:21:25 +0900 Subject: [ceilometer] Rocky release notes for Ceilometer In-Reply-To: <792fbf82-94a2-5747-44fd-e4b6cb7327eb@suse.com> References: <907ABCCF-C566-40B4-B479-5F5D3A5923EB@gmail.com> <792fbf82-94a2-5747-44fd-e4b6cb7327eb@suse.com> Message-ID: Thanks Andreas for pointing that out. On Mon, May 13, 2019 at 11:08 PM Andreas Jaeger wrote: > On 13/05/2019 14.14, Bernd Bausch wrote: > > I don’t know if it’s correct to tag this message with [tc]; sorry if not. > > > > The Ceilometer Rocky release notes are not published on the releases web > > site [1]. They were distributed via email only [2], it would seem. > > Whoever manages the releases site or the Ceilometer release notes may > > want to fill that gap to make it easier to find information. > > > > [1] https://releases.openstack.org/rocky/index.html > > [2] http://lists.openstack.org/pipermail/release-announce/2018-July.txt > > > Looking at: > https://docs.openstack.org/releasenotes/ceilometer/ > > There are no rocky releasenotes listed - so, nothing the main page could > link to. > > So, I would say: This works as designed, > > > Btw. removing [tc], the tc is not involved in releases - [release] would > be better but not needed since this is solved ;) > > Andreas > -- > Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah > HRB 21284 (AG Nürnberg) > GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balazs.gibizer at ericsson.com Mon May 13 14:38:04 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Mon, 13 May 2019 14:38:04 +0000 Subject: [nova][ptg] Summary: Implicit trait-based filters In-Reply-To: References: <1557213589.2232.0@smtp.office365.com> Message-ID: <1557758281.17816.17@smtp.office365.com> On Wed, May 8, 2019 at 5:58 PM, Matt Riedemann wrote: > On 5/7/2019 2:19 AM, Balázs Gibizer wrote: >> 3) The request pre-filters [7] run before the placement a_c query is >> generated. But these today changes the fields of the RequestSpec >> (e.g. >> requested_destination) that would mean the regeneration of >> RequestSpec.requested_resources would be needed. This probably >> solvable >> by changing the pre-filters to work directly on >> RequestSpec.requested_resources after we solved all the other issues. > > Yeah this is something I ran into while hacking on the routed > networks aggregate stuff [1]. I added information to the RequestSpec > so I could use it in a pre-filter (required aggregates) but I can't > add that to the requested_resources in the RequestSpec without > resources (and in the non-bw port case there is no > RequestSpec.requested_resources yet), so what I did was hack the > unnumbered RequestGroup after the pre-filters and after the > RequestSpec was processed by resources_from_request_spec, but before > the code that makes the GET /a_c call. It's definitely ugly and I'm > not even sure it works yet (would need functional testing). > > What I've wondered is if there is a way we could merge request groups > in resources_from_request_spec so if a pre-filter added an unnumbered > RequestGroup to the RequestSpec (via the requestd_resources > attribute) that resources_from_request_spec would then merge in the > flavor information. That's what I initially tried with the > multiattach required traits patch [2] but the groups weren't merged > for whatever reason and GET /a_c failed because I had a group with a > required trait but no resources. > If we only need to merge once then it feels doable. We just add new things to the pre-existing unnumbered group from the flavor and image. But if we ever need to update what we already merged into the unnumbered group then we would need access to the old flavor / image to first subtract them from the unnumbered group and then add the requests from the new flavor / image to the unnumbered group. The other way would be to store the extra traits separately as well in the RequestSpec and only generate the unnumbered group from all the input when needed. Cheers, gibi > [1] https://review.opendev.org/#/c/656885/3/nova/scheduler/manager.py > [2] https://review.opendev.org/#/c/645316/ > > -- > > Thanks, > > Matt > From mthode at mthode.org Mon May 13 14:45:55 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 13 May 2019 09:45:55 -0500 Subject: [requirements] bandit bump to 1.6.0 In-Reply-To: References: Message-ID: <20190513144555.ccb7w256gimde3vn@mthode.org> On 19-05-13 13:50:11, Herve Beraud wrote: > Alreaady discussed here => > http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006116.html > > Sorry > > Le lun. 13 mai 2019 à 13:47, Herve Beraud a écrit : > > > Hello, > > > > FYI bandit 1.6.0 was released and changes the behavior of the '-x' option > > so that it now supports glob patterns. > > > > Many openstack projects will facing bandit issues due to these changes. 
> > > > Two possibilities exists: > > - pin your bandit version to < 1.6.0 > > - accept 1.6.0 and modify your bandit call by passing a patterns like this > > https://review.opendev.org/#/c/658319/1 > > > > We also need to update openstack/requirements ( > > https://review.opendev.org/#/c/658767/) > > > > I think the better approach is to use 1.6.0 now and to fix the bandit > > command to avoid issues in the future, and avoid undesired reviews on this > > topic. > > I'm pasting the projects I found using the option, hopefully it helps. I do agree that moving now would be better, caps are always a bad thing. | ara | tox.ini | 31 | bandit -r ara -x ara/tests --skip B303 | | armada | tox.ini | 77 | bandit -r armada -x armada/tests -n 5 | | armada | tox.ini | 82 | bandit -r armada -x armada/tests -n 5 | | barbican | tox.ini | 53 | bandit -r barbican -x tests -n5 | | barbican | tox.ini | 175 | commands = bandit -r barbican -x tests -n5 | | castellan | tox.ini | 25 | bandit -r castellan -x tests -s B105,B106,B107,B607 | | castellan | tox.ini | 38 | bandit -r castellan -x tests -s B105,B106,B107,B607 | | cinder | tox.ini | 160 | commands = bandit -r cinder -n5 -x tests -ll | | cliff | tox.ini | 31 | bandit -c bandit.yaml -r cliff -x tests -n5 | | cloudkitty | tox.ini | 33 | commands = bandit -r cloudkitty -n5 -x tests -ll | | deckhand | tox.ini | 90 | commands = bandit -r deckhand -x deckhand/tests -n 5 | | deckhand | tox.ini | 111 | bandit -r deckhand -x deckhand/tests -n 5 | | designate | tox.ini | 91 | commands = bandit -r designate -n5 -x tests -t \ | | heat | tox.ini | 47 | bandit -r heat -x tests --skip B101,B104,B107,B110,B310,B311,B404,B410,B504,B506,B603,B607 | | heat | tox.ini | 112 | commands = bandit -r heat -x tests --skip B101,B104,B107,B110,B310,B311,B404,B410,B504,B506,B603,B607 | | horizon | tox.ini | 168 | commands = bandit -r horizon openstack_auth openstack_dashboard -n5 -x tests -ll | | keystone | tox.ini | 40 | bandit -r keystone -x tests | | keystone | tox.ini | 49 | commands = bandit -r keystone -x tests | | keystoneauth | tox.ini | 26 | bandit -r keystoneauth1 -x tests -s B110,B410 | | keystoneauth | tox.ini | 32 | commands = bandit -r keystoneauth1 -x tests -s B110,B410 | | keystonemiddleware | tox.ini | 21 | bandit -r keystonemiddleware -x tests -n5 | | keystonemiddleware | tox.ini | 27 | commands = bandit -r keystonemiddleware -x tests -n5 | | magnum | tox.ini | 114 | bandit -r magnum -x tests -n5 -ll | | magnum | tox.ini | 130 | commands = bandit -r magnum -x tests -n5 -ll | | monasca-agent | tox.ini | 61 | bandit -r monasca_agent -n5 -s B101,B602,B603,B301,B303,B311,B403,B404,B405,B310,B320,B410,B411,B501,B504,B605,B607,B608 -x {toxinidir}/tests | | monasca-api | tox.ini | 53 | bandit -r monasca_api -n5 -s B101,B303 -x monasca_api/tests | | monasca-common | tox.ini | 72 | commands = bandit -r monasca_common -n5 -s B101 -x monasca_common/tests -x monasca_common/kafka_lib | | monasca-events-api | tox.ini | 67 | commands = bandit -r monasca_events_api -n5 -x monasca_events_api/tests | | monasca-log-api | tox.ini | 55 | bandit -r monasca_log_api -n5 -s B101 -x monasca_log_api/tests | | monasca-notification | tox.ini | 59 | bandit -r monasca_notification -n5 -x monasca_notification/tests | | monasca-persister | tox.ini | 89 | bandit -r monasca_persister -n5 -s B303 -x monasca_persister/tests | | monasca-statsd | tox.ini | 47 | commands = bandit -r monascastatsd -s B311 -n5 -x monascastatsd/tests | | murano | tox.ini | 36 | commands = bandit -c bandit.yaml -r murano -x 
tests -n 5 -ll | | networking-cisco | tox.ini | 105 | #commands = bandit -r networking_cisco -x apps/saf,tests,plugins/cisco/cpnr -n5 -f txt | | networking-midonet | tox.ini | 54 | commands = bandit -r midonet -x midonet/neutron/tests -n5 | | networking-odl | tox.ini | 124 | commands = bandit -r networking_odl -x tests -n5 -s B101 | | networking-omnipath | tox.ini | 143 | commands = bandit -r omnipath -x tests -n5 | | networking-ovn | tox.ini | 154 | commands = bandit -r networking_ovn -x networking_ovn/tests/* -n5 -s B104 | | neutron | tox.ini | 190 | commands = bandit -r neutron -x tests -n5 -s B104,B303,B311,B604 | | neutron-lib | tox.ini | 105 | commands = bandit -r neutron_lib -x tests -n5 -s B104,B303,B311 | | nova | tox.ini | 221 | commands = bandit -r nova -x tests -n 5 -ll | | novajoin | tox.ini | 45 | commands = bandit -r novajoin -n5 -x tests -ll -s B104 | | octavia | tox.ini | 72 | bandit -r octavia -ll -ii -x 'octavia/tests/*' | | octavia | tox.ini | 130 | commands = bandit -r octavia -ll -ii -x octavia/tests {posargs} | | octavia-lib | tox.ini | 28 | bandit -r octavia_lib -ll -ii -x octavia_lib/tests | | ooi | tox.ini | 37 | bandit -r ooi -x tests -s B110,B410 | | ooi | tox.ini | 42 | commands = bandit -r ooi -x tests -s B110,B410 | | oslo.cache | tox.ini | 32 | bandit -r oslo_cache -x tests -n5 | | oslo.concurrency | tox.ini | 26 | bandit -r oslo_concurrency -x tests -n5 --skip B311,B404,B603,B606 | | oslo.config | tox.ini | 38 | bandit -r oslo_config -x tests -n5 | | oslo.config | tox.ini | 64 | commands = bandit -r oslo_config -x tests -n5 | | oslo.context | tox.ini | 20 | bandit -r oslo_context -x tests -n5 | | oslo.db | tox.ini | 38 | bandit -r oslo_db -x tests -n5 --skip B105,B311 | | oslo.i18n | tox.ini | 23 | bandit -r oslo_i18n -x tests -n5 | | oslo.log | tox.ini | 25 | bandit -r oslo_log -x tests -n5 | | oslo.log | tox.ini | 53 | commands = bandit -r oslo_log -x tests -n5 | | oslo.messaging | tox.ini | 23 | bandit -r oslo_messaging -x tests -n5 | | oslo.messaging | tox.ini | 97 | commands = bandit -r oslo_messaging -x tests -n5 | | oslo.middleware | tox.ini | 22 | bandit -r oslo_middleware -x tests -n5 | | oslo.privsep | tox.ini | 25 | bandit -r oslo_privsep -x tests -n5 --skip B404,B603 | | oslo.service | tox.ini | 24 | bandit -r oslo_service -n5 -x tests | | oslo.service | tox.ini | 60 | commands = bandit -r oslo_service -n5 -x tests {posargs} | | oslo.utils | tox.ini | 21 | bandit -r oslo_utils -x tests -n5 | | oslo.utils | tox.ini | 41 | commands = bandit -r oslo_utils -x tests -n5 | | patrole | tox.ini | 29 | bandit -r patrole_tempest_plugin -x patrole_tempest_plugin/tests -n 5 | | placement | tox.ini | 141 | commands = bandit -r placement -x tests -n 5 -ll | | python-keystoneclient | tox.ini | 25 | bandit -r keystoneclient -x tests -n5 | | python-keystoneclient | tox.ini | 31 | commands = bandit -r keystoneclient -x tests -n5 | | python-magnumclient | tox.ini | 26 | commands = bandit -r magnumclient -x tests -n5 -ll | | python-magnumclient | tox.ini | 49 | bandit -r magnumclient -x tests -n5 -ll | | python-monascaclient | tox.ini | 61 | commands = bandit -r monascaclient -n5 -x {env:OS_TEST_PATH} | | python-neutronclient | tox.ini | 82 | commands = bandit -r neutronclient -x tests -n5 -s B303 | | python-novaclient | tox.ini | 29 | commands = bandit -r novaclient -n5 -x tests | | python-openstackclient | tox.ini | 30 | bandit -r openstackclient -x tests -s B105,B106,B107,B401,B404,B603,B606,B607,B110,B605,B101 | | python-openstackclient | tox.ini | 57 | bandit 
-r openstackclient -x tests -s B105,B106,B107,B401,B404,B603,B606,B607,B110,B605,B101 | | python-senlinclient | tox.ini | 23 | commands = bandit -r senlinclient -x tests -n5 -ll | | python-zunclient | tox.ini | 27 | commands = bandit -r zunclient -x tests -n5 -ll | | python-zunclient | tox.ini | 61 | bandit -r zunclient -x tests -n5 -ll | | renderspec | tox.ini | 26 | bandit -r -s B701 renderspec -x tests | | sahara | tox.ini | 46 | bandit -c bandit.yaml -r sahara -n5 -p sahara_default -x tests | | sahara | tox.ini | 118 | commands = bandit -c bandit.yaml -r sahara -n5 -p sahara_default -x tests | | senlin | tox.ini | 101 | commands = bandit -r senlin -x tests -s B101,B104,B110,B310,B311,B506 | | solum | tox.ini | 92 | commands = bandit -r solum -n5 -x tests -ll | | spyglass-plugin-xls | test-requirements.txt | 8 | bandit>=1.5.0 | | spyglass-plugin-xls | tox.ini | 37 | bandit -r spyglass-plugin-xls -n 5 | | spyglass-plugin-xls | tox.ini | 44 | commands = bandit -r spyglass-plugin-xls -n 5 | | stevedore | tox.ini | 32 | bandit -r stevedore -x tests -n5 | | tatu | tox.ini | 45 | commands = bandit -r tatu -n5 -x tests -ll -s B104 | | trove | tox.ini | 99 | commands = bandit -r trove -n5 -x tests | | valet | tox.ini | 59 | commands = bandit -r valet -x tests -n 5 -l | | watcher | tox.ini | 28 | bandit -r watcher -x watcher/tests/* -n5 -ll -s B320 | | watcher | tox.ini | 106 | commands = bandit -r watcher -x watcher/tests/* -n5 -ll -s B320 | | watcher-tempest-plugin | tox.ini | 20 | bandit -r watcher_tempest_plugin -x tests -n5 -ll -s B320 | | watcher-tempest-plugin | tox.ini | 56 | commands = bandit -r watcher_tempest_plugin -x tests -n5 -ll -s B320 | | zun | tox.ini | 35 | bandit -r zun -x tests -n5 -ll --skip B303,B604 | -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From geguileo at redhat.com Mon May 13 14:48:13 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Mon, 13 May 2019 16:48:13 +0200 Subject: Help needed to Support Multi-attach feature In-Reply-To: References: <20190510092600.r27zetl5e3k5ow5v@localhost> <20190513085118.3hfsekvtabq6ipm2@localhost> Message-ID: <20190513144813.ek7e5zmsh5wwgthy@localhost> On 13/05, RAI, SNEHA wrote: > Thanks Gorka for your response. The main reason is "AUTO: Connection to libvirt lost: 1". > > Not sure, why the connection is being lost. I tried restarting all the nova services too, but no luck. > Hi, I would confirm that libvirtd.service, virtlockd.socket, and virtlogd.socket are loaded and active. Cheers, Gorka. > > > Regards, > > Sneha Rai > > > > -----Original Message----- > From: Gorka Eguileor [mailto:geguileo at redhat.com] > Sent: Monday, May 13, 2019 2:21 PM > To: RAI, SNEHA > Cc: openstack-dev at lists.openstack.org > Subject: Re: Help needed to Support Multi-attach feature > > > > On 10/05, RAI, SNEHA wrote: > > > Thanks Gorka for your response. > > > > > > I have changed the version of libvirt and qemu on my host and I am able to move past the previous error mentioned in my last email. > > > > > > Current versions of libvirt and qemu: > > > root at CSSOSBE04-B09:/etc# libvirtd --version libvirtd (libvirt) 1.3.1 > > > root at CSSOSBE04-B09:/etc# kvm --version QEMU emulator version 2.5.0 > > > (Debian 1:2.5+dfsg-5ubuntu10.36), Copyright (c) 2003-2008 Fabrice > > > Bellard > > > > > > Also, I made a change in /etc/nova/nova.conf and set virt_type=qemu. Earlier it was set to kvm. 
> > > I restarted all nova services post the changes but I can see one nova service was disabled and state was down. > > > > > > root at CSSOSBE04-B09:/etc# nova service-list > > > +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ > > > | Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | Forced down | > > > +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ > > > | 1ebcd1f6-b7dc-40ce-8d7b-95d60503c0ff | nova-scheduler | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:59.000000 | - | False | > > > | ed82277c-d2e0-4a1a-adf6-9bcdcc50ba29 | nova-consoleauth | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:49.000000 | - | False | > > > | bc2b6703-7a1e-4f07-96b9-35cbb14398d5 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:59.000000 | - | False | > > > | 72ecbc1d-1b47-4f55-a18d-de2fbf1771e9 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:48:54.000000 | - | False | > > > | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | nova-compute | CSSOSBE04-B09 | nova | disabled | down | 2019-05-07T22:11:06.000000 | AUTO: Connection to libvirt lost: 1 | False | > > > +--------------------------------------+------------------+---------------+----------+----------+-------+----------------------------+-------------------------------------+-------------+ > > > > > > So, I manually enabled the service, but the state was still down. > > > root at CSSOSBE04-B09:/etc# nova service-enable > > > 9c700ee1-1694-479b-afc0-1fd37c1a5561 > > > +--------------------------------------+---------------+--------------+---------+ > > > | ID | Host | Binary | Status | > > > +--------------------------------------+---------------+--------------+---------+ > > > | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | CSSOSBE04-B09 | nova-compute > > > | | enabled | > > > +--------------------------------------+---------------+--------------+---------+ > > > > > > root at CSSOSBE04-B09:/etc# nova service-list > > > +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ > > > | Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | Forced down | > > > +--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ > > > | 1ebcd1f6-b7dc-40ce-8d7b-95d60503c0ff | nova-scheduler | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | > > > | ed82277c-d2e0-4a1a-adf6-9bcdcc50ba29 | nova-consoleauth | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | > > > | bc2b6703-7a1e-4f07-96b9-35cbb14398d5 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:19.000000 | - | False | > > > | 72ecbc1d-1b47-4f55-a18d-de2fbf1771e9 | nova-conductor | CSSOSBE04-B09 | internal | enabled | up | 2019-05-10T05:49:14.000000 | - | False | > > > | 9c700ee1-1694-479b-afc0-1fd37c1a5561 | nova-compute | CSSOSBE04-B09 | nova | enabled | down | 2019-05-10T05:49:14.000000 | - | False | > > > 
+--------------------------------------+------------------+---------------+----------+---------+-------+----------------------------+-----------------+-------------+ > > > > > > > Hi, > > > > If it appears as down it's probably because there is an issue during the service's start procedure. > > > > You can look in the logs to see what messages appeared during the start or tail the logs and restart the service to see what error appears there. > > > > Cheers, > > Gorka. > > > > > > > So, now when I try to attach a volume to nova instance, I get the below error. As one of the service is down it fails in filter validation for nova-compute and gives us "No host" error. > > > > > > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG > > > nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 > > > #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFilter > > > RetryFilter returned 1 host(s)#033[00m #033[00;33m{{(pid=21775) > > > get_filtered_objects /opt/stack/nova/nova/filters.py:104}}#033[00m > > > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG > > > nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 > > > #033[00;36mdemo admin#033[00;32m] #033[01;35m#033[00;32mFilter > > > AvailabilityZoneFilter returned 1 host(s)#033[00m > > > #033[00;33m{{(pid=21775) get_filtered_objects > > > /opt/stack/nova/nova/filters.py:104}}#033[00m > > > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG > > > nova.scheduler.filters.compute_filter [#033[01;36mNone > > > req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo > > > admin#033[00;32m] #033[01;35m#033[00;32m(CSSOSBE04-B09, CSSOSBE04-B09) > > > ram: 30810MB disk: 1737728MB io_ops: 0 instances: 1 is disabled, > > > reason: AUTO: Connection to libvirt lost: 1#033[00m > > > #033[00;33m{{(pid=21775) host_passes > > > /opt/stack/nova/nova/scheduler/filters/compute_filter.py:42}}#033[00m > > > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;36mINFO > > > nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 > > > #033[00;36mdemo admin#033[00;36m] #033[01;35m#033[00;36mFilter > > > ComputeFilter returned 0 hosts#033[00m May 10 10:43:00 CSSOSBE04-B09 > > > nova-scheduler[21775]: #033[00;32mDEBUG nova.filters [#033[01;36mNone > > > req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo > > > admin#033[00;32m] #033[01;35m#033[00;32mFiltering removed all hosts > > > for the request with instance ID > > > '1735ece5-d187-454a-aab1-12650646a2ec'. Filter results: > > > [('RetryFilter', [(u'CSSOSBE04-B09', u'CSSOSBE04-B09')]), > > > ('AvailabilityZoneFilter', [(u'CSSOSBE04-B09', u'CSSOSBE04-B09')]), > > > ('ComputeFilter', None)]#033[00m #033[00;33m{{(pid=21775) > > > get_filtered_objects /opt/stack/nova/nova/filters.py:129}}#033[00m > > > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;36mINFO > > > nova.filters [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 > > > #033[00;36mdemo admin#033[00;36m] #033[01;35m#033[00;36mFiltering > > > removed all hosts for the request with instance ID > > > '1735ece5-d187-454a-aab1-12650646a2ec'. 
Filter results: ['RetryFilter: > > > (start: 1, end: 1)', 'AvailabilityZoneFilter: (start: 1, end: 1)', > > > 'ComputeFilter: (start: 1, end: 0)']#033[00m May 10 10:43:00 > > > CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG > > > nova.scheduler.filter_scheduler [#033[01;36mNone > > > req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo > > > admin#033[00;32m] #033[01;35m#033[00;32mFiltered []#033[00m > > > #033[00;33m{{(pid=21775) _get_sorted_hosts > > > /opt/stack/nova/nova/scheduler/filter_scheduler.py:404}}#033[00m > > > May 10 10:43:00 CSSOSBE04-B09 nova-scheduler[21775]: #033[00;32mDEBUG > > > nova.scheduler.filter_scheduler [#033[01;36mNone > > > req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo > > > admin#033[00;32m] #033[01;35m#033[00;32mThere are 0 hosts available > > > but 1 instances requested to build.#033[00m #033[00;33m{{(pid=21775) > > > _ensure_sufficient_hosts > > > /opt/stack/nova/nova/scheduler/filter_scheduler.py:279}}#033[00m > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: #033[01;31mERROR nova.conductor.manager [#033[01;36mNone req-b0ca81b3-a2b6-492e-9036-249644b94349 #033[00;36mdemo admin#033[01;31m] #033[01;35m#033[01;31mFailed to schedule instances#033[00m: NoValidHost_Remote: No valid host was found. There are not enough hosts available. > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: Traceback (most recent call last): > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 226, in inner > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: return func(*args, **kwargs) > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/manager.py", line 154, in select_destinations > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: allocation_request_version, return_alternates) > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 91, in select_destinations > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: allocation_request_version, return_alternates) > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 244, in _schedule > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: claimed_instance_uuids) > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 281, in _ensure_sufficient_hosts > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: raise exception.NoValidHost(reason=reason) > > > May 10 10:43:00 CSSOSBE04-B09 nova-conductor[21789]: NoValidHost: No valid host was found. There are not enough hosts available. > > > > > > Need help in understanding on how to fix this error. For detailed logs, please refer the attached syslog. > > > > > > > > > Thanks & Regards, > > > Sneha Rai > > > > > > > > > > > > > > > > > > -----Original Message----- > > > From: Gorka Eguileor [mailto:geguileo at redhat.com] > > > Sent: Friday, May 10, 2019 2:56 PM > > > To: RAI, SNEHA > > > > Cc: openstack-dev at lists.openstack.org > > > Subject: Re: Help needed to Support Multi-attach feature > > > > > > > > > > > > On 02/05, RAI, SNEHA wrote: > > > > > > > Hi Team, > > > > > > > > > > > > > > I am currently working on multiattach feature for HPE 3PAR cinder driver. 
> > > > > > > > > > > > > > For this, while setting up devstack(on stable/queens) I made below > > > > > > > change in the local.conf [[local|localrc]] > > > > > > > ENABLE_VOLUME_MULTIATTACH=True ENABLE_UBUNTU_CLOUD_ARCHIVE=False > > > > > > > > > > > > > > /etc/cinder/cinder.conf: > > > > > > > [3pariscsi_1] > > > > > > > hpe3par_api_url = > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__192.168.1.7-3A8 > > > > 08 > > > > > > > 0_api_v1&d=DwIBAg&c=C5b8zRQO1miGmBeVZ2LFWg&r=8drU3i56Z5sQ_Ltpya89LTN > > > > n3 > > > > > > > xDSwtigjYbGrSY1lM8&m=zTRvI4nj8MoP0_z5MmxTYwKiNNW6addwP4L5VFG4wkg&s=a > > > > 2D > > > > > > > HbzzRtbbBPz0_kfodZv5X1HxbN_hFxte5rEZabAg&e= > > > > > > > hpe3par_username = user > > > > > > > hpe3par_password = password > > > > > > > san_ip = 192.168.1.7 > > > > > > > san_login = user > > > > > > > san_password = password > > > > > > > volume_backend_name = 3pariscsi_1 > > > > > > > hpe3par_cpg = my_cpg > > > > > > > hpe3par_iscsi_ips = 192.168.11.2,192.168.11.3 volume_driver = > > > > > > > cinder.volume.drivers.hpe.hpe_3par_iscsi.HPE3PARISCSIDriver > > > > > > > hpe3par_iscsi_chap_enabled = True > > > > > > > hpe3par_debug = True > > > > > > > image_volume_cache_enabled = True > > > > > > > > > > > > > > /etc/cinder/policy.json: > > > > > > > 'volume:multiattach': 'rule:admin_or_owner' > > > > > > > > > > > > > > Added https://urldefense.proofpoint.com/v2/url?u=https-3A__review.opendev.org_-23_c_560067_2_cinder_volume_drivers_hpe_hpe-5F3par-5Fcommon.py&d=DwIBAg&c=C5b8zRQO1miGmBeVZ2LFWg&r=8drU3i56Z5sQ_Ltpya89LTNn3xDSwtigjYbGrSY1lM8&m=zTRvI4nj8MoP0_z5MmxTYwKiNNW6addwP4L5VFG4wkg&s=U8n1fpI-4OVYOSjST8IL0x0BRUhTLyumOpRZMJ_sVOI&e= change in the code. > > > > > > > > > > > > > > But I am getting below error in the nova log: > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [None req-2cda6e90-fd45-4bfe-960a-7fca9ba4abab demo admin] [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Instance failed block device setup: MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. 
> > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] Traceback (most recent call last): > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/compute/manager.py", line 1615, in _prep_block_device > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] wait_func=self._await_block_device_map_created) > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 840, in attach_block_devices > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] _log_and_attach(device) > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 837, in _log_and_attach > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] bdm.attach(*attach_args, **attach_kwargs) > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 46, in wrapped > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] ret_val = method(obj, context, *args, **kwargs) > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 620, in attach > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] virt_driver, do_driver_attach) > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 274, in inner > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] return f(*args, **kwargs) > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 617, in _do_locked_attach > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] self._do_attach(*args, **_kwargs) > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 602, in _do_attach > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] do_driver_attach) > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] File "/opt/stack/nova/nova/virt/block_device.py", line 509, in 
_volume_attach > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] volume_id=volume_id) > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR nova.compute.manager [instance: fcaa5a47-fc48-489d-9827-6533bfd1a9fa] MultiattachNotSupportedByVirtDriver: Volume dc25f09a-6ae1-4b06-a814-73a8afaba62f has 'multiattach' set, which is not supported for this instance. > > > > > > > Apr 29 04:23:04 CSSOSBE04-B09 nova-compute[31396]: ERROR > > > > > > > nova.compute.manager [instance: > > > > fcaa5a47-fc48-489d-9827-6533bfd1a9fa] > > > > > > > > > > > > > > > > > > > > > Apr 29 05:41:20 CSSOSBE04-B09 nova-compute[20455]: DEBUG > > > > > > > nova.virt.libvirt.driver [-] Volume multiattach is not supported > > > > based > > > > > > > on current versions of QEMU and libvirt. QEMU must be less than 2.10 > > > > > > > or libvirt must be greater than or equal to 3.10. {{(pid=20455) > > > > > > > _set_multiattach_support > > > > > > > /opt/stack/nova/nova/virt/libvirt/driver.py:619}} > > > > > > > > > > > > > > > > > > > > > stack at CSSOSBE04-B09:/tmp$ virsh --version > > > > > > > 3.6.0 > > > > > > > stack at CSSOSBE04-B09:/tmp$ kvm --version QEMU emulator version > > > > > > > 2.10.1(Debian 1:2.10+dfsg-0ubuntu3.8~cloud1) Copyright (c) 2003-2017 > > > > > > > Fabrice Bellard and the QEMU Project developers > > > > > > > > > > > > > > > > > > > Hi Sneha, > > > > > > > > > > > > I don't know much about this side of Nova, but reading the log error I would say that you either need to update your libvirt version from 3.6.0 to 3.10, or you need to downgrade your QEMU version to something prior to 2.10. > > > > > > > > > > > > The later is probably easier. > > > > > > > > > > > > I don't use Ubuntu, but according to the Internet you can list > > > available versions with "apt-cache policy qemu" and then install or > > > downgrade to the specific version with "sudo apt-get install > > > qemu=2.5\*" if you wanted to install version 2.5 > > > > > > > > > > > > I hope this helps. > > > > > > > > > > > > Cheers, > > > > > > Gorka. > > > > > > > > > > > > > > > > > > > > openstack volume show -c multiattach -c status sneha1 > > > > > > > +-------------+-----------+ > > > > > > > | Field | Value | > > > > > > > +-------------+-----------+ > > > > > > > | multiattach | True | > > > > > > > | status | available | > > > > > > > +-------------+-----------+ > > > > > > > > > > > > > > cinder extra-specs-list > > > > > > > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > > > > > | ID | Name | extra_specs | > > > > > > > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > > > > > | bd077fde-51c3-4581-80d5-5855e8ab2f6b | 3pariscsi_1 | > > > > > > > | {'volume_backend_name': '3pariscsi_1', 'multiattach': ' > > > > | True'}| > > > > > > > +--------------------------------------+-------------+--------------------------------------------------------------------+ > > > > > > > > > > > > > > > > > > > > > echo $OS_COMPUTE_API_VERSION > > > > > > > 2.60 > > > > > > > > > > > > > > pip list | grep python-novaclient > > > > > > > DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. 
> > > > > > > python-novaclient 13.0.0 > > > > > > > > > > > > > > How do I fix this version issue on my setup to proceed? Please help. > > > > > > > > > > > > > > Thanks & Regards, > > > > > > > Sneha Rai > > > > From openstack at nemebean.com Mon May 13 17:23:33 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 13 May 2019 12:23:33 -0500 Subject: [oslo] Bandit Strategy Message-ID: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> Nefarious cap bandits are running amok in the OpenStack community! Won't someone take a stand against these villainous headwear thieves?! Oh, sorry, just pasted the elevator pitch for my new novel. ;-) Actually, this email is to summarize the plan we came up with in the Oslo meeting this morning. Since we have a bunch of projects affected by the Bandit breakage I wanted to make sure we had a common fix so we don't have a bunch of slightly different approaches in each project. The plan we agreed on in the meeting was to push a two patch series to each repo - one to cap bandit <1.6.0 and one to uncap it with a !=1.6.0 exclusion. The first should be merged immediately to unblock ci, and the latter can be rechecked once bandit 1.6.1 releases to verify that it fixes the problem for us. We chose this approach instead of just tweaking the exclusion in tox.ini because it's not clear that the current behavior will continue once Bandit fixes the bug. Assuming they restore the old behavior, this should require the least churn in our repos and means we're still compatible with older versions that people may already have installed. I started pushing patches under https://review.opendev.org/#/q/topic:cap-bandit (which prompted the digression to start this email ;-) to implement this plan. This is mostly intended to be informational, but if you have any concerns with the plan above please do let us know immediately. Thanks. -Ben From dtroyer at gmail.com Mon May 13 17:27:07 2019 From: dtroyer at gmail.com (Dean Troyer) Date: Mon, 13 May 2019 12:27:07 -0500 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: <768df039575b8cad3ee3262230534d7309f9c09f.camel@redhat.com> References: <768df039575b8cad3ee3262230534d7309f9c09f.camel@redhat.com> Message-ID: On Mon, May 13, 2019 at 8:02 AM Stephen Finucane wrote: > > * server migrate --live: deprecate the --live option and add a new > > --live-migration option and a --host option > > Could I suggest we don't bother deprecating the '--live' option in 3.x > and simply rework it for 4.0? This is of course assuming the 4.0 > release date is months away and not years, of course. Sure, my primary concern is that we do what makes the most sense from a user's point of view. Re OSC4, I am cherry-picking the work that was done a while back on the feature/osc4 branch back to master, that should be done in a day or two. That said, I do not want to rush a 4.0 release, it has taken 2 years already, I want to make sure we include all of the things that we have been holding both explicitly and mentally. I am open to input on how long that should be... dt -- Dean Troyer dtroyer at gmail.com From openstack at nemebean.com Mon May 13 17:40:19 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 13 May 2019 12:40:19 -0500 Subject: [oslo] Bandit Strategy In-Reply-To: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> Message-ID: On 5/13/19 12:23 PM, Ben Nemec wrote: > Nefarious cap bandits are running amok in the OpenStack community! 
Won't > someone take a stand against these villainous headwear thieves?! > > Oh, sorry, just pasted the elevator pitch for my new novel. ;-) > > Actually, this email is to summarize the plan we came up with in the > Oslo meeting this morning. Since we have a bunch of projects affected by > the Bandit breakage I wanted to make sure we had a common fix so we > don't have a bunch of slightly different approaches in each project. The > plan we agreed on in the meeting was to push a two patch series to each > repo - one to cap bandit <1.6.0 and one to uncap it with a !=1.6.0 > exclusion. The first should be merged immediately to unblock ci, and the > latter can be rechecked once bandit 1.6.1 releases to verify that it > fixes the problem for us. Oh, and since sphinx is also breaking the Oslo world, I guess we're going to have to include the sphinx requirements fix in these first patches: https://review.opendev.org/#/c/658857/ That's passing the requirements job so it should unblock us. /me is off to squash some patches > > We chose this approach instead of just tweaking the exclusion in tox.ini > because it's not clear that the current behavior will continue once > Bandit fixes the bug. Assuming they restore the old behavior, this > should require the least churn in our repos and means we're still > compatible with older versions that people may already have installed. > > I started pushing patches under > https://review.opendev.org/#/q/topic:cap-bandit (which prompted the > digression to start this email ;-) to implement this plan. This is > mostly intended to be informational, but if you have any concerns with > the plan above please do let us know immediately. > > Thanks. > > -Ben > From smooney at redhat.com Mon May 13 18:01:47 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 13 May 2019 19:01:47 +0100 Subject: [nova][CI] GPUs in the gate In-Reply-To: References: Message-ID: <17dbb752e66f914087f19d65b03fd83c747c5a35.camel@redhat.com> On Tue, 2019-05-07 at 13:47 -0400, Artom Lifshitz wrote: > Hey all, > > Following up on the CI session during the PTG [1], I wanted to get the > ball rolling on getting GPU hardware into the gate somehow. Initially > the plan was to do it through OpenLab and by convincing NVIDIA to > donate the cards, but after a conversation with Sean McGinnis it > appears Infra have access to machines with GPUs. > > From Nova's POV, the requirements are: > * The machines with GPUs should probably be Ironic baremetal nodes and > not VMs [*]. > * The GPUs need to support virtualization. It's hard to get a > comprehensive list of GPUs that do, but Nova's own docs [2] mention > two: Intel cards with GVT [3] and NVIDIA GRID [4]. "Intel cards" is a bit of a misnomer: GVT is currently only supported by the integrated GPU on Intel CPUs, which was removed from Xeon CPUs when GVT support was added. In the future, with the discrete GPUs from Intel slated for release sometime in 2020, we should hopefully have Intel cards that actually support GVT, assuming that is on their GPU product roadmap, but I can see how it would not be, given they developed the tech for their integrated GPU. It would also be interesting to test AMD GPUs using their SR-IOV approach, but I think NVIDIA Tesla GPUs would be the shortest path forward. > > So I think at this point the question is whether Infra can support > those reqs. If yes, we can start concrete steps towards getting those > machines used by a CI job. If not, we'll fall back to OpenLab and try > to get them hardware. > > [*] Could we do double-passthrough?
Could the card be passed through > to the L1 guest via the PCI passthrough mechanism, and then into the > L2 guest via the mdev mechanism? I have a theory about how this might be possible, but OpenStack is missing the features required to pull it off. I may test it locally with libvirt, but the only way I think this could work would be to do a full passthrough of the PF to an L1 guest using the q35 chipset with a vIOMMU (not supported in Nova) and hypervisor hiding enabled, and then run the GRID driver in the L1 guest to expose an mdev to the L2 guest. Ironic would be much simpler. > > [1] https://etherpad.openstack.org/p/nova-ptg-train-ci > [2] https://docs.openstack.org/nova/rocky/admin/virtual-gpu.html > [3] https://01.org/igvt-g > [4] https://docs.nvidia.com/grid/5.0/pdf/grid-vgpu-user-guide.pdf > From smooney at redhat.com Mon May 13 18:09:50 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 13 May 2019 19:09:50 +0100 Subject: [nova][CI] GPUs in the gate In-Reply-To: <3587e05d-deab-42ad-9a02-4312ca11760f@www.fastmail.com> References: <3587e05d-deab-42ad-9a02-4312ca11760f@www.fastmail.com> Message-ID: <990842e654eaf7a080da64c6de3a06f38df88838.camel@redhat.com> On Tue, 2019-05-07 at 19:56 -0400, Clark Boylan wrote: > On Tue, May 7, 2019, at 10:48 AM, Artom Lifshitz wrote: > > Hey all, > > > > Following up on the CI session during the PTG [1], I wanted to get the > > ball rolling on getting GPU hardware into the gate somehow. Initially > > the plan was to do it through OpenLab and by convincing NVIDIA to > > donate the cards, but after a conversation with Sean McGinnis it > > appears Infra have access to machines with GPUs. > > > > From Nova's POV, the requirements are: > > * The machines with GPUs should probably be Ironic baremetal nodes and > > not VMs [*]. > > * The GPUs need to support virtualization. It's hard to get a > > comprehensive list of GPUs that do, but Nova's own docs [2] mention > > two: Intel cards with GVT [3] and NVIDIA GRID [4]. > > > > So I think at this point the question is whether Infra can support > > those reqs. If yes, we can start concrete steps towards getting those > > machines used by a CI job. If not, we'll fall back to OpenLab and try > > to get them hardware. > > What we currently have access to is a small amount of Vexxhost's GPU instances (so mnaser can further clarify my > comments here). I believe these are VMs with dedicated nvidia gpus that are passed through. I don't think they support > the vgpu feature. This is correct: I asked mnaser about this in the past, which is why he made the GPU nodeset available initially, but after checking with Sylvain and confirming the GPU model available via Vexxhost we determined they could not be used to test vGPU support. > > It might help to describe the use case you are trying to meet rather than jumping ahead to requirements/solutions. > That way maybe we can work with Vexxhost to better support what you need (or come up with some other solutions). For > those of us that don't know all of the particulars it really does help if you can go from use case to requirements. Effectively we just want to test the mdev-based vGPU support in the libvirt driver. NVIDIA locks down vGPU support to their Tesla and Quadro cards and requires a license server to be running to enable the use of the GRID driver.
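As a rough illustration (a generic sketch, not output from any particular node; the PCI address and type names below are made up), the sanity check a job would do once a supported card and the grid driver are present is just to look at the mdev sysfs tree:

    # parent devices that can create mediated devices
    ls /sys/class/mdev_bus/
    # vGPU types a given parent offers, and how many instances are still free
    ls /sys/class/mdev_bus/0000:3b:00.0/mdev_supported_types/
    cat /sys/class/mdev_bus/0000:3b:00.0/mdev_supported_types/nvidia-*/available_instances

(Newer libvirt exposes the same information through its node device APIs.)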
As a result, to be able to test this feature in the upstream gate we would need a GPU that is on the supported list of the NVIDIA GRID driver and a license server (we could just use the trial licenses) so that we can use the vGPU feature. As VFIO mediated devices are an extension of the SR-IOV framework built on top of the VFIO stack, the only simple way to test this would be via a baremetal host, as we do not have a way to do a double passthrough that preserves SR-IOV functionality. (The way I described in my last email is just a theory, and OpenStack is missing vIOMMU support in any case, even if it did work.) > > > > > [*] Could we do double-passthrough? Could the card be passed through > > to the L1 guest via the PCI passthrough mechanism, and then into the > > L2 guest via the mdev mechanism? > > > > [1] https://etherpad.openstack.org/p/nova-ptg-train-ci > > [2] https://docs.openstack.org/nova/rocky/admin/virtual-gpu.html > > [3] https://01.org/igvt-g > > [4] https://docs.nvidia.com/grid/5.0/pdf/grid-vgpu-user-guide.pdf > > From smooney at redhat.com Mon May 13 18:14:24 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 13 May 2019 19:14:24 +0100 Subject: [nova][CI] GPUs in the gate In-Reply-To: <20190508132709.xgq6nz3mqkfw3q5d@yuggoth.org> References: <3587e05d-deab-42ad-9a02-4312ca11760f@www.fastmail.com> <20190508132709.xgq6nz3mqkfw3q5d@yuggoth.org> Message-ID: On Wed, 2019-05-08 at 13:27 +0000, Jeremy Stanley wrote: > On 2019-05-08 08:46:56 -0400 (-0400), Artom Lifshitz wrote: > [...] > > The use case is CI coverage for Nova's VGPU feature. This feature can > > be summarized (and oversimplified) as "SRIOV for GPUs": a single > > physical GPU can be split into multiple virtual GPUs (via libvirt's > > mdev support [5]), each one being assigned to a different guest. We > > have functional tests in-tree, but no tests with real hardware. So > > we're looking for a way to get real hardware in the gate. > > [...] > > Long shot, but since you just need the feature provided and not the > performance it usually implies, are there maybe any open source > emulators which provide the same instruction set for conformance > testing purposes? I tried going down this route looking at the netdevsim module for emulating NICs to test generic SR-IOV, but it does not actually do the PCIe emulation of the VFs. For vGPUs I am not aware of any kernel or userspace emulation we could use to test the end-to-end workflow with libvirt; if anyone else knows of one, that would be an interesting alternative to pursue. Also, if any kernel developers want to add PCIe VF emulation to the netdevsim module, it really would be awesome to be able to use that to test SR-IOV NICs in the gate without hardware. From jasonanderson at uchicago.edu Mon May 13 19:38:47 2019 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Mon, 13 May 2019 19:38:47 +0000 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job Message-ID: Hey OpenStackers, I work on a cloud that allows users to reserve and provision bare metal instances with Ironic. We recently performed a long-overdue upgrade of our core components, all the way from Ocata up through Rocky. During this, we noticed that instance build requests were taking 4-5x (!!) as long as before. We have two deployments, one with ~150 bare metal nodes, and another with ~300. These are each managed by one nova-compute process running the Ironic driver.
After investigation, the root cause appeared to be contention between the update_resources periodic task and the instance claim step. There is one semaphore "compute_resources" that is used to control every access within the resource_tracker. In our case, what was happening was the update_resources job, which runs every minute by default, was constantly queuing up accesses to this semaphore, because each hypervisor is updated independently, in series. This meant that, for us, each Ironic node was being processed and was holding the semaphore during its update (which took about 2-5 seconds in practice.) Multiply this by 150 and our update task was running constantly. Because an instance claim also needs to access this semaphore, this led to instances getting stuck in the "Build" state, after scheduling, for tens of minutes on average. There seemed to be some probabilistic effect here, which I hypothesize is related to the locking mechanism not using a "fair" lock (first-come, first-served) by default. Our fix was to drastically increase the interval this task runs at--from every 1 minute to every 12 hours. We only provision bare metal, so my rationale was that the periodic full resource sync was less important and mostly helpful for fixing weird things where somehow Placement's state got out of sync with Nova's somehow. I'm wondering, after all this, if it makes sense to rethink this one-semaphore thing, and instead create a per-hypervisor semaphore when doing the resource syncing. I can't think of a reason why the entire set of hypervisors needs to be considered as a whole when doing this task, but I could very well be missing something. TL;DR: if you have one nova-compute process managing lots of Ironic hypervisors, consider tweaking the update_resources_interval to a higher value, especially if you're seeing instances stuck in the Build state for a while. Cheers, Jason Anderson Cloud Computing Software Developer Consortium for Advanced Science and Engineering, The University of Chicago Mathematics & Computer Science Division, Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Mon May 13 19:54:52 2019 From: openstack at fried.cc (Eric Fried) Date: Mon, 13 May 2019 14:54:52 -0500 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job In-Reply-To: References: Message-ID: <6b75f78d-2dc3-a043-4329-c12ab6bbdf8f@fried.cc> Jason- You may find this article interesting [1]. It isn't clear whether your issue is the same as CERN's. But it would be interesting to know whether setting [compute]resource_provider_association_refresh [2] to a very large number (while leaving your periodic interval at its default) also mitigates the issue. Thanks, efried [1] https://techblog.web.cern.ch/techblog/post/placement-requests/ [2] https://docs.openstack.org/nova/latest/configuration/config.html#compute.resource_provider_association_refresh From skaplons at redhat.com Mon May 13 20:01:32 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Mon, 13 May 2019 22:01:32 +0200 Subject: [neutron] CI meeting on 14th May cancelled Message-ID: <79980246-C6B6-4AF6-94E0-18C8F081F26A@redhat.com> Hi, I can’t run CI meeting tomorrow (14.05.2019). As Miguel can’t run it also, lets cancel meeting this week. We will meet next week (21.05) as usual. 
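For reference, the two knobs in play live in nova.conf on the compute host and would look something like this (the values are purely illustrative, not recommendations):

    [DEFAULT]
    # interval, in seconds, for the update_available_resource periodic task
    update_resources_interval = 600

    [compute]
    # how often, in seconds, the resource tracker refreshes aggregate, trait
    # and sharing-provider info from placement
    resource_provider_association_refresh = 86400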
— Slawek Kaplonski Senior software engineer Red Hat From sean.mcginnis at gmx.com Mon May 13 20:03:13 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 13 May 2019 15:03:13 -0500 Subject: Help needed to Support Multi-attach feature In-Reply-To: References: <20190510092600.r27zetl5e3k5ow5v@localhost> Message-ID: <20190513200312.GA21325@sm-workstation> On Fri, May 10, 2019 at 04:51:07PM +0000, RAI, SNEHA wrote: > Thanks Gorka for your response. > > I have changed the version of libvirt and qemu on my host and I am able to move past the previous error mentioned in my last email. > > Current versions of libvirt and qemu: > root at CSSOSBE04-B09:/etc# libvirtd --version > libvirtd (libvirt) 1.3.1 > root at CSSOSBE04-B09:/etc# kvm --version > QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.36), Copyright (c) 2003-2008 Fabrice Bellard > > Also, I made a change in /etc/nova/nova.conf and set virt_type=qemu. Earlier it was set to kvm. > I restarted all nova services post the changes but I can see one nova service was disabled and state was down. > Not sure if it is related or not, but I don't believe you want to change virt_type t0 "qemu". That should stay "kvm". From sean.mcginnis at gmx.com Mon May 13 20:07:47 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 13 May 2019 15:07:47 -0500 Subject: [ceilometer] Rocky release notes for Ceilometer In-Reply-To: References: <907ABCCF-C566-40B4-B479-5F5D3A5923EB@gmail.com> <792fbf82-94a2-5747-44fd-e4b6cb7327eb@suse.com> Message-ID: <20190513200747.GB21325@sm-workstation> On Mon, May 13, 2019 at 11:21:25PM +0900, Trinh Nguyen wrote: > Thanks Andreas for pointing that out. > It would appear there is no rocky landing page for release notes due to this patch never landing: https://review.opendev.org/#/c/587182/ If that is fixed up from the merge conflict and merged, then there will be a landing page for rocky release notes to link to. Sean From jasonanderson at uchicago.edu Mon May 13 20:14:09 2019 From: jasonanderson at uchicago.edu (Jason Anderson) Date: Mon, 13 May 2019 20:14:09 +0000 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job In-Reply-To: <6b75f78d-2dc3-a043-4329-c12ab6bbdf8f@fried.cc> References: , <6b75f78d-2dc3-a043-4329-c12ab6bbdf8f@fried.cc> Message-ID: Hi Eric, thanks, that's very useful reading. I suspect the root issue is the same, as this isn't specific to Ironic per se, but rather is linked to a high # of hypervisors managed by one compute service. In our case, Placement was able to keep up just fine (though raising this job interval also lowered the number of requests to Placement significantly.) My suspicion was that it was less about load on Placement, and more about this lock contention. I will have to try pulling in these patches to test that. Cheers, /Jason ________________________________ From: Eric Fried Sent: Monday, May 13, 2019 14:54 To: openstack-discuss at lists.openstack.org Subject: Re: [nova][ironic] Lock-related performance issue with update_resources periodic job Jason- You may find this article interesting [1]. It isn't clear whether your issue is the same as CERN's. But it would be interesting to know whether setting [compute]resource_provider_association_refresh [2] to a very large number (while leaving your periodic interval at its default) also mitigates the issue. 
Thanks, efried [1] https://techblog.web.cern.ch/techblog/post/placement-requests/ [2] https://docs.openstack.org/nova/latest/configuration/config.html#compute.resource_provider_association_refresh -------------- next part -------------- An HTML attachment was scrubbed... URL: From surya.seetharaman9 at gmail.com Mon May 13 20:15:27 2019 From: surya.seetharaman9 at gmail.com (Surya Seetharaman) Date: Mon, 13 May 2019 22:15:27 +0200 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job In-Reply-To: References: Message-ID: Hi Jason, On Mon, May 13, 2019 at 9:40 PM Jason Anderson wrote: > After investigation, the root cause appeared to be contention between the > update_resources periodic task and the instance claim step. There is one > semaphore "compute_resources" that is used to control every access within > the resource_tracker. In our case, what was happening was the > update_resources job, which runs every minute by default, was constantly > queuing up accesses to this semaphore, because each hypervisor is updated > independently, in series. This meant that, for us, each Ironic node was > being processed and was holding the semaphore during its update (which took > about 2-5 seconds in practice.) Multiply this by 150 and our update task > was running constantly. Because an instance claim also needs to access this > semaphore, this led to instances getting stuck in the "Build" state, after > scheduling, for tens of minutes on average. There seemed to be some > probabilistic effect here, which I hypothesize is related to the locking > mechanism not using a "fair" lock (first-come, first-served) by default. > > Our fix was to drastically increase the interval this task runs at--from > every 1 minute to every 12 hours. We only provision bare metal, so my > rationale was that the periodic full resource sync was less important and > mostly helpful for fixing weird things where somehow Placement's state got > out of sync with Nova's somehow. > > I'm wondering, after all this, if it makes sense to rethink this > one-semaphore thing, and instead create a per-hypervisor semaphore when > doing the resource syncing. I can't think of a reason why the entire set of > hypervisors needs to be considered as a whole when doing this task, but I > could very well be missing something. > > *TL;DR*: if you have one nova-compute process managing lots of Ironic > hypervisors, consider tweaking the update_resources_interval to a higher > value, especially if you're seeing instances stuck in the Build state for a > while. > We faced the same problem at CERN when we upgraded to rocky (we have ~2300 nodes on a single compute) like Eric said, and we set the [compute]resource_provider_association_refresh to a large value (this definitely helps by stopping the syncing of traits/aggregates and provider tree cache info stuff in terms of chattiness with placement) and inspite of that it doesn't scale that well for us. We still find the periodic task taking too much of time which causes the locking to hold up the claim for instances in BUILD state (the exact same problem you described). While one way to tackle this like you said is to set the "update_resources_interval" to a higher value - we were not sure how much out of sync things would get with placement, so it will be interesting to see how this spans out for you - another way out would be to use multiple computes and spread the nodes around (though this is also a pain to maintain IMHO) which is what we are looking into presently. 
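To make the contention concrete, here is a minimal Python sketch of the per-node locking idea floated earlier in the thread; the names are illustrative stubs, not nova's actual resource tracker code, which today serializes all of this on a single semaphore.

```python
import threading
import time
from collections import defaultdict

# Sketch only: one lock per compute node instead of one lock shared by
# every node the compute service manages.
_node_locks = defaultdict(threading.Lock)   # real code would also guard
                                            # creation of new locks

def _sync_with_placement(nodename):
    time.sleep(2)      # stand-in for the 2-5 second per-node update

def _do_claim(nodename, instance):
    time.sleep(0.1)    # stand-in for the claim bookkeeping

def update_node_resources(nodename):
    # The periodic task walks nodes one at a time; with a per-node lock it
    # no longer blocks claims against *other* nodes.
    with _node_locks[nodename]:
        _sync_with_placement(nodename)

def claim_instance(nodename, instance):
    # A claim now only competes with work on the same node. A fair (FIFO)
    # lock would additionally keep claims from being starved by a steady
    # stream of periodic updates.
    with _node_locks[nodename]:
        _do_claim(nodename, instance)
```

Whether something like this can be retrofitted onto the resource tracker is exactly the open question here.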
-- Regards, Surya. -------------- next part -------------- An HTML attachment was scrubbed... URL: From colleen at gazlene.net Mon May 13 20:21:01 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Mon, 13 May 2019 16:21:01 -0400 Subject: [dev][keystone] Cross-project liaison review Message-ID: <2ddebe5c-a5be-45d9-b233-399df423748d@www.fastmail.com> Hi everyone, I had scheduled time to review our liaison list during the PTG but we decided to defer it until after the PTG so that people who could not attend the PTG had the option of participating. We'll go over the list at the next keystone meeting (Tuesday, 14 May at 1600 UTC in #openstack-meeting-alt), possibly spilling over into the keystone office hours. If you can't make the meeting, please review the current list[1][2] and let me know if you want to update your liaison status or take on any liaison duties. Colleen [1] https://etherpad.openstack.org/p/keystone-liaison-review-2019 [2] https://wiki.openstack.org/wiki/CrossProjectLiaisons From surya.seetharaman9 at gmail.com Mon May 13 20:34:05 2019 From: surya.seetharaman9 at gmail.com (Surya Seetharaman) Date: Mon, 13 May 2019 22:34:05 +0200 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job In-Reply-To: References: Message-ID: On Mon, May 13, 2019 at 9:40 PM Jason Anderson wrote: > > I'm wondering, after all this, if it makes sense to rethink this > one-semaphore thing, and instead create a per-hypervisor semaphore when > doing the resource syncing. I can't think of a reason why the entire set of > hypervisors needs to be considered as a whole when doing this task, but I > could very well be missing something. > > While theoretically this would be ideal, I am not sure how the COMPUTE_RESOURCE_SEMAPHORE can be tweaked into a per-hypervisor (for ironic) semaphore since its ultimately on a single compute-service's resource tracker, unless I am missing something obvious. Maybe the nova experts who know more this could shed some light. -- Regards, Surya. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dirk at dmllr.de Mon May 13 21:31:41 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Mon, 13 May 2019 23:31:41 +0200 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> Message-ID: Hi Jeremy, > It's still unclear to me why we're doing this at all. Our stable > constraints lists are supposed to be a snapshot in time from when we > released, modulo stable point release updates of the libraries we're > maintaining. Agreeing to bump random dependencies on stable branches > because of security vulnerabilities in them is a slippery slope > toward our users expecting the project to be on top of vulnerability > announcements for every one of the ~600 packages in our constraints > list. I think this is combining two different viewpoints in one: "snapshot in time" and "user expects it to be updated asap on security vulnerabilities". We are already updating upper-constraints on bugfixes for projects that openstack maintains, and we do so similarly for security fixes for packages that are part of openstack. Also, distribution vendor provided (non-pip installed bindeps) are maintained and updated by security fixes. 
> Deployment projects already should not depend on our > requirements team tracking security vulnerabilities, so need to have > a mechanism to override constraints entries anyway if they're making > such guarantees to their users (and I would also caution against > doing that too). for traditional openstack deployment projects that might be well the case. However several virtualenv / container / dockerbuild based projects are using upperconstraints to generate a stable, coherent and tested container. Without adjustments those will not be including security fixes though. > Distributions are far better equipped than our project to handle > such tracking, as they generally get advance notice of > vulnerabilities and selectively backport fixes for them. Agreed, still OpenStack chose to use pip for managing its dependencies, so I think it is preferable to find a solution within that ecosystem. Not every dependency with security issues is so offensive as "requests" is which does not maintain any stable branch but asks you to update to the latest version instead. Most others do have backports available on micro versions as pip installable project. > accomplish the same with a mix of old and new dependency versions in > our increasingly aging stable and extended maintenance branches > seems like a disaster waiting to happen. I agree it is a difficult exercise and we need to document a clear policy. For me documenting that it upper-constraints is maintained with security fix version updates as a best effort/risk bases is good enough, fwiw. Greetings, Dirk From mriedemos at gmail.com Mon May 13 21:51:54 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 13 May 2019 16:51:54 -0500 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job In-Reply-To: References: Message-ID: <06982507-d001-3a69-6670-bb9016bac851@gmail.com> On 5/13/2019 3:34 PM, Surya Seetharaman wrote: > > I'm wondering, after all this, if it makes sense to rethink this > one-semaphore thing, and instead create a per-hypervisor semaphore > when doing the resource syncing. I can't think of a reason why the > entire set of hypervisors needs to be considered as a whole when > doing this task, but I could very well be missing something. > > > > While theoretically this would be ideal, I am not sure how the > COMPUTE_RESOURCE_SEMAPHORE can be tweaked into a per-hypervisor (for > ironic) semaphore since its ultimately on a single compute-service's > resource tracker, unless I am missing something obvious. Maybe the nova > experts who know more this could shed some light. I would think it would just be a matter of locking on the nodename. That would have the same effect for a non-ironic compute service where the driver should only be reporting a single nodename. But for a compute service managing ironic nodes, it would be more like a per-instance lock since the nodes are 1:1 with the instances managed on that host. Having said all that, the devil is in the details (and trying to refactor that very old and crusty RT code). -- Thanks, Matt From doug at doughellmann.com Mon May 13 22:22:35 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Mon, 13 May 2019 18:22:35 -0400 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> Message-ID: Dirk Müller writes: > Jeremy wrote: >> It's still unclear to me why we're doing this at all. 
Our stable >> constraints lists are supposed to be a snapshot in time from when we >> released, modulo stable point release updates of the libraries we're >> maintaining. Agreeing to bump random dependencies on stable branches >> because of security vulnerabilities in them is a slippery slope >> toward our users expecting the project to be on top of vulnerability >> announcements for every one of the ~600 packages in our constraints >> list. > > I think this is combining two different viewpoints in one: "snapshot > in time" and "user expects > it to be updated asap on security vulnerabilities". We are already > updating upper-constraints > on bugfixes for projects that openstack maintains, and we do so > similarly for security fixes > for packages that are part of openstack. Also, distribution vendor > provided (non-pip installed bindeps) > are maintained and updated by security fixes. But our motivation for updating the list when *we* release a package is that we want to test that package with the rest of our code. That's consistent with the original purpose of the list, which was to control which things we run in CI so we can have more control over when releases "break" us, and it isn't related to the reason for the releases. -- Doug From colleen at gazlene.net Mon May 13 22:36:42 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Mon, 13 May 2019 18:36:42 -0400 Subject: [dev][keystone] Office hours revamp Message-ID: Hi team, One of the outcomes of the PTG was to make more productive use of office hours by designating a topic ahead of time and working through it as a team. I've proposed to officially add the office hour to eavesdrop[1] and started a topic etherpad[2]. This week we'll finish the liaison review if it's not already completed during the meeting, and we'll do some bug triage. We should be sure to continuously reflect on how this is going and whether we should change our process or reschedule the session. Colleen [1] https://review.opendev.org/658909 [2] https://etherpad.openstack.org/p/keystone-office-hours-topics From mthode at mthode.org Mon May 13 22:57:25 2019 From: mthode at mthode.org (Matthew Thode) Date: Mon, 13 May 2019 17:57:25 -0500 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> Message-ID: <20190513225725.bnjyeno5khwtmeoj@mthode.org> On 19-05-13 18:22:35, Doug Hellmann wrote: > Dirk Müller writes: > > > Jeremy wrote: > >> It's still unclear to me why we're doing this at all. Our stable > >> constraints lists are supposed to be a snapshot in time from when we > >> released, modulo stable point release updates of the libraries we're > >> maintaining. Agreeing to bump random dependencies on stable branches > >> because of security vulnerabilities in them is a slippery slope > >> toward our users expecting the project to be on top of vulnerability > >> announcements for every one of the ~600 packages in our constraints > >> list. > > > > I think this is combining two different viewpoints in one: "snapshot > > in time" and "user expects > > it to be updated asap on security vulnerabilities". We are already > > updating upper-constraints > > on bugfixes for projects that openstack maintains, and we do so > > similarly for security fixes > > for packages that are part of openstack. 
Also, distribution vendor > > provided (non-pip installed bindeps) > > are maintained and updated by security fixes. > > But our motivation for updating the list when *we* release a package is > that we want to test that package with the rest of our code. That's > consistent with the original purpose of the list, which was to control > which things we run in CI so we can have more control over when releases > "break" us, and it isn't related to the reason for the releases. > yep, we are FIRST concerned with stability, and possibly secondly concerned with security (as a project). This would be expanding our perview a ton (talking with fungi earlier, it'd add a bunch conplexity even if done in a basic way). At the moment this merge is on hold til we figure out if we want to do this, and if so, how (and would the cost be worth it). -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From kennelson11 at gmail.com Mon May 13 23:00:01 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 13 May 2019 16:00:01 -0700 Subject: [PTL][SIG][WG] PTG Team Photos In-Reply-To: <30c8cbb5-b11b-be98-339d-ef6c5e35305b@gmail.com> References: <981673d8-b095-4c30-a651-577d1c5286d3@www.fastmail.com> <30c8cbb5-b11b-be98-339d-ef6c5e35305b@gmail.com> Message-ID: Sorting through them today, should have a link for everyone tomorrow. -Kendall On Fri, May 10, 2019 at 11:01 AM Jay Bryant wrote: > Colleen, > > I haven't seen them made available anywhere yet so I don't think you > missed an e-mail. > > Jay > > On 5/10/2019 12:48 PM, Colleen Murphy wrote: > > On Thu, Mar 28, 2019, at 17:03, Kendall Nelson wrote: > >> Hello! > >> > >> If your team is attending the PTG and is interested in having a team > >> photo taken, here is the signup[1]! There are slots Thursday and Friday > >> from 10:00 AM to 4:30 PM. > >> > >> The location is TBD but will likely be close to where registration will > >> be. I'll send an email out the day before with a reminder of your time > >> slot and an exact location. > >> > >> -Kendall (diablo_rojo) > >> > >> [1] > https://docs.google.com/spreadsheets/d/1DgsRHVWW2YLv7ewfX0M21zWJRf4wUfPG4ff2V5XtaMg/edit?usp=sharing > >> > > Are the photos available somewhere now? I'm wondering if I missed an > email. > > > > Colleen > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From flux.adam at gmail.com Mon May 13 23:36:20 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Mon, 13 May 2019 16:36:20 -0700 Subject: [senlin][zaqar][networking-midonet][networking-odl][vmware-nsx] Neutron-LBaaS retirement warning! In-Reply-To: <4b0516ee-b248-f13a-8381-c09785006ad3@ericsson.com> References: <4b0516ee-b248-f13a-8381-c09785006ad3@ericsson.com> Message-ID: Yes, stable/stein is the last release of neutron-lbaas, and the latest stable branch that will remain. On Mon, May 13, 2019 at 2:07 AM Lajos Katona wrote: > Hi, > > This means that stable/stein is the last where these projects can include > neutron-lbaas? > > Thanks for the heads up. > > Regards > Lajos > > On 2019. 05. 13. 10:34, Adam Harwell wrote: > > As you are hopefully already aware, the Neutron-LBaaS project is being > retired this cycle (and a lot of the patches to accomplish this will land > in the next few days). 
> From a quick code search, it seems the following projects still include > neutron-lbaas in their zuul job configs: > > networking-odl > networking-midonet > senlin > vmware-nsx > zaqar > > For projects on this list, the retirement of neutron-lbaas *will* cause > your zuul jobs to fail. *Please take action to remove this requirement!* > It is possible that it is simply an extra unused requirement, but if your > project is actually using neutron-lbaas to create loadbalancers, it will be > necessary to convert to Octavia. > > If you need assistance with this change or have any questions, don't > hesitate to stop by #openstack-lbaas on IRC and we can help! > > --Adam Harwell > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From flux.adam at gmail.com Mon May 13 23:41:05 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Mon, 13 May 2019 16:41:05 -0700 Subject: [senlin][zaqar][networking-midonet][networking-odl][vmware-nsx] Neutron-LBaaS retirement warning! In-Reply-To: References: Message-ID: If you are using the CLI, there is a tutorial on how to use most of the functionality of Octavia here: https://docs.openstack.org/octavia/latest/user/guides/basic-cookbook.html If you are using the API directly, then the only action would be to include Octavia in your projects list (if it isn't already) and change your endpoint from neutron to octavia. The Octavia API is fully compatible with the Neutron-LBaaS v2 API spec. If you are using the python client, you will need to switch to python-octaviaclient or to the openstack SDK, and I am unfortunately not aware of a guide to do that. However, it is a remarkably similar service (object model and API are 100% compatible), so making the transition should hopefully not be very difficult, and we are happy to help in #openstack-lbaas on IRC if you need specific assistance. --Adam On Mon, May 13, 2019 at 2:13 AM Takashi Yamamoto wrote: > On Mon, May 13, 2019 at 5:42 PM Adam Harwell wrote: > > > > As you are hopefully already aware, the Neutron-LBaaS project is being > retired this cycle (and a lot of the patches to accomplish this will land > in the next few days). > > From a quick code search, it seems the following projects still include > neutron-lbaas in their zuul job configs: > > > > networking-odl > > networking-midonet > > senlin > > vmware-nsx > > zaqar > > > > For projects on this list, the retirement of neutron-lbaas *will* cause > your zuul jobs to fail. Please take action to remove this requirement! It > is possible that it is simply an extra unused requirement, but if your > project is actually using neutron-lbaas to create loadbalancers, it will be > necessary to convert to Octavia. > > is there a guide for the conversion? > > > > > If you need assistance with this change or have any questions, don't > hesitate to stop by #openstack-lbaas on IRC and we can help! > > > > --Adam Harwell > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Tue May 14 01:51:28 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Tue, 14 May 2019 10:51:28 +0900 Subject: [ceilometer] Rocky release notes for Ceilometer In-Reply-To: <20190513200747.GB21325@sm-workstation> References: <907ABCCF-C566-40B4-B479-5F5D3A5923EB@gmail.com> <792fbf82-94a2-5747-44fd-e4b6cb7327eb@suse.com> <20190513200747.GB21325@sm-workstation> Message-ID: Thank Sean. I'll fix it. 
On Tue, May 14, 2019 at 5:07 AM Sean McGinnis wrote: > On Mon, May 13, 2019 at 11:21:25PM +0900, Trinh Nguyen wrote: > > Thanks Andreas for pointing that out. > > > > It would appear there is no rocky landing page for release notes due to > this > patch never landing: > > https://review.opendev.org/#/c/587182/ > > If that is fixed up from the merge conflict and merged, then there will be > a > landing page for rocky release notes to link to. > > Sean > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From liu.xuefeng1 at zte.com.cn Tue May 14 02:33:43 2019 From: liu.xuefeng1 at zte.com.cn (liu.xuefeng1 at zte.com.cn) Date: Tue, 14 May 2019 10:33:43 +0800 (CST) Subject: Re: [senlin][zaqar][networking-midonet][networking-odl][vmware-nsx] Neutron-LBaaS retirement warning! In-Reply-To: References: CAHxXnheFqru3fU8WwNqG2nSojkzBSq_Y6P2PM=RD+kYLZwMdWA@mail.gmail.com Message-ID: <201905141033439137243@zte.com.cn> Hi, Adam Harwell Thanks for the reminder. Senlin already supports Octavia, so we only need to remove the neutron-lbaas reference in .zuul. XueFeng Original mail From: AdamHarwell To: openstack-discuss ; Date: 2019-05-13 16:40 Subject: [senlin][zaqar][networking-midonet][networking-odl][vmware-nsx] Neutron-LBaaS retirement warning! As you are hopefully already aware, the Neutron-LBaaS project is being retired this cycle (and a lot of the patches to accomplish this will land in the next few days). From a quick code search, it seems the following projects still include neutron-lbaas in their zuul job configs: networking-odl networking-midonet senlin vmware-nsx zaqar For projects on this list, the retirement of neutron-lbaas *will* cause your zuul jobs to fail. Please take action to remove this requirement! It is possible that it is simply an extra unused requirement, but if your project is actually using neutron-lbaas to create loadbalancers, it will be necessary to convert to Octavia. If you need assistance with this change or have any questions, don't hesitate to stop by #openstack-lbaas on IRC and we can help! --Adam Harwell -------------- next part -------------- An HTML attachment was scrubbed... URL: From yamamoto at midokura.com Tue May 14 03:01:24 2019 From: yamamoto at midokura.com (Takashi Yamamoto) Date: Tue, 14 May 2019 12:01:24 +0900 Subject: [senlin][zaqar][networking-midonet][networking-odl][vmware-nsx] Neutron-LBaaS retirement warning! In-Reply-To: References: Message-ID: I'm more interested in the backend side. I've heard of a plan to introduce some glue code to allow Octavia to use LBaaS v2 backend drivers with no or minimal modifications before deprecation/removal. (I guess it was during the Atlanta PTG, but I might be wrong.) Did it happen? On Tue, May 14, 2019 at 8:41 AM Adam Harwell wrote: > > If you are using the CLI, there is a tutorial on how to use most of the > functionality of Octavia here: > https://docs.openstack.org/octavia/latest/user/guides/basic-cookbook.html > > If you are using the API directly, then the only action would be to > include Octavia in your projects list (if it isn't already) and change your > endpoint from neutron to octavia. The Octavia API is fully compatible with > the Neutron-LBaaS v2 API spec. > > If you are using the python client, you will need to switch to > python-octaviaclient or to the openstack SDK, and I am unfortunately not > aware of a guide to do that.
However, it is a remarkably similar service (object model and API are 100% compatible), so making the transition should hopefully not be very difficult, and we are happy to help in #openstack-lbaas on IRC if you need specific assistance. > > --Adam > > On Mon, May 13, 2019 at 2:13 AM Takashi Yamamoto wrote: >> >> On Mon, May 13, 2019 at 5:42 PM Adam Harwell wrote: >> > >> > As you are hopefully already aware, the Neutron-LBaaS project is being retired this cycle (and a lot of the patches to accomplish this will land in the next few days). >> > From a quick code search, it seems the following projects still include neutron-lbaas in their zuul job configs: >> > >> > networking-odl >> > networking-midonet >> > senlin >> > vmware-nsx >> > zaqar >> > >> > For projects on this list, the retirement of neutron-lbaas *will* cause your zuul jobs to fail. Please take action to remove this requirement! It is possible that it is simply an extra unused requirement, but if your project is actually using neutron-lbaas to create loadbalancers, it will be necessary to convert to Octavia. >> >> is there a guide for the conversion? >> >> > >> > If you need assistance with this change or have any questions, don't hesitate to stop by #openstack-lbaas on IRC and we can help! >> > >> > --Adam Harwell From flux.adam at gmail.com Tue May 14 04:00:11 2019 From: flux.adam at gmail.com (Adam Harwell) Date: Mon, 13 May 2019 21:00:11 -0700 Subject: [senlin][zaqar][networking-midonet][networking-odl][vmware-nsx] Neutron-LBaaS retirement warning! In-Reply-To: References: Message-ID: No, unfortunately we did not end up with a model that easily allowed for a "shim" system to allow use of older drivers. We've been trying to work with the vendors to get their drivers prepared, but I don't know how many of them actually finished the work. That said, stable/stein should be usable for the time being with no real change -- we simply will not be maintaining the service any longer (except possibly extreme cases like major security patches) or providing any further releases. If you need to adjust your code to continue to use the stein release instead of master, that is a perfectly valid option. --Adam On Mon, May 13, 2019 at 8:01 PM Takashi Yamamoto wrote: > i'm more interested in the backend side. > i've heard a plan to introduce some glue code to allow octavia to use > lbaas v2 backend drivers with no or minimal modifications before > deprecation/removal. > (i guess it was during atlanta ptg but i might be wrong) > did it happen? > > On Tue, May 14, 2019 at 8:41 AM Adam Harwell wrote: > > > > If you are using the CLI, there is a tutorial on how to use most of the > functionality of Octavia here: > > > https://docs.openstack.org/octavia/latest/user/guides/basic-cookbook.html > > > > If you are using the API directly, then the only action would be to > include Octavia in your projects list (if it isn't already) and change your > endpoint from neutron to octavia. The Octavia API is fully compatible with > the Neutron-LBaaS v2 API spec. > > > > If you are using the python client, you will need to switch to > python-octaviaclient or to the openstack SDK, and I am unfortunately not > aware of a guide to do that. However, it is a remarkably similar service > (object model and API are 100% compatible), so making the transition should > hopefully not be very difficult, and we are happy to help in > #openstack-lbaas on IRC if you need specific assistance. 
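As a rough illustration of the SDK route, something like the sketch below drives Octavia through openstacksdk; the cloud name and subnet ID are placeholders, and exact parameter names may vary between SDK releases.

```python
import openstack

conn = openstack.connect(cloud="mycloud")  # placeholder clouds.yaml entry

# Same object model as neutron-lbaas v2: load balancer -> listener -> pool.
lb = conn.load_balancer.create_load_balancer(
    name="web-lb",
    vip_subnet_id="PRIVATE_SUBNET_UUID",   # placeholder UUID
)

# In practice, wait for the load balancer to go ACTIVE before adding
# children; it is immutable while still in PENDING_CREATE.
listener = conn.load_balancer.create_listener(
    name="web-listener",
    protocol="HTTP",
    protocol_port=80,
    load_balancer_id=lb.id,
)

pool = conn.load_balancer.create_pool(
    name="web-pool",
    protocol="HTTP",
    lb_algorithm="ROUND_ROBIN",
    listener_id=listener.id,
)

for item in conn.load_balancer.load_balancers():
    print(item.name, item.provisioning_status)
```

Since the object model is the one neutron-lbaas v2 already exposed, the transition is mostly a mechanical swap of client and endpoint.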
> > > > --Adam > > > > On Mon, May 13, 2019 at 2:13 AM Takashi Yamamoto > wrote: > >> > >> On Mon, May 13, 2019 at 5:42 PM Adam Harwell > wrote: > >> > > >> > As you are hopefully already aware, the Neutron-LBaaS project is > being retired this cycle (and a lot of the patches to accomplish this will > land in the next few days). > >> > From a quick code search, it seems the following projects still > include neutron-lbaas in their zuul job configs: > >> > > >> > networking-odl > >> > networking-midonet > >> > senlin > >> > vmware-nsx > >> > zaqar > >> > > >> > For projects on this list, the retirement of neutron-lbaas *will* > cause your zuul jobs to fail. Please take action to remove this > requirement! It is possible that it is simply an extra unused requirement, > but if your project is actually using neutron-lbaas to create > loadbalancers, it will be necessary to convert to Octavia. > >> > >> is there a guide for the conversion? > >> > >> > > >> > If you need assistance with this change or have any questions, don't > hesitate to stop by #openstack-lbaas on IRC and we can help! > >> > > >> > --Adam Harwell > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Tue May 14 05:33:52 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 13 May 2019 22:33:52 -0700 Subject: [First Contact] [SIG] Summit/Forum + PTG Summary Message-ID: Hello All! Here's a super short summary of the going's on WRT the First Contact SIG. There were plenty of other relevant sessions, but this email summarizes the ones that we were directly responsible for. Summit Session + PTG -------------------------------- The Meet & Greet and the PTG session went similarly. We had a few new faces come in and introduce themselves. Most of the new faces were operators from various companies which was cool that they wanted to get engaged. Members of the SIG introduced themselves and answered any questions that we could. Forum Session (Welcoming New Contributors State of the Union and Deduplication of Efforts) ----------------------------------------------------------------------------------------------------------------------------- The biggest things that came out of this session were discussion about recording of onboarding sessions and a community goal of improving contributor documentation. Basically, we have never had the onboarding sessions recorded but if we could tt would really help new contributors even if they might get a little stale before we are able to record new ones. During that chat, we learned that Octavia does somewhat regular calls in whch they do onboarding for new contributors. I have asked for an outline to help encourage other projects to do similar. As for per project contributor documentation, some projects have it and some don't. Some projects have it and its incomplete. bauzas volunteered to do an audit of which projects have it and which don't and to propose a community goal for it. As a part of that, we should probably decide on a list of bare minimum things to include. Etherpad from those discussions: https://etherpad.openstack.org/p/new-contribs-state-and-deduplication -Kendall Nelson (diablo_rojo) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kennelson11 at gmail.com Tue May 14 05:38:22 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Mon, 13 May 2019 22:38:22 -0700 Subject: [StoryBoard] Forum + PTG Summary Message-ID: Hello Everyone :) Quick summaries of what we talked about during the sessions in Denver. If you have any questions about anything below feel free to reply or drop into #storyboard and ask us there! Forum (Ibuprofen for Your StoryBoard Pain Points) ------------------------------------------------------------------- The Forum session went really well. We had a pretty full room and lots of engagement from basically everyone. There were no real surprises about anything people were having issues with or new features they needed added that we didn't already know about. It was nice that we weren't blindsided by anything and seem to have a pretty good feel on the pulse of what people (that have spoke up at least) are asking for. The most entertaining part was that it had never occurred to us to do StoryBoard onboarding (both for howto use it and another for how to get started on developing for Storyboard) but also might be a really helpful thing for us to do at the next event. We talked about trying to fit it in before the end of the week, but our schedules were just too full to make it happen. I'm still working through making sure everything that needs to becomes a story in our backlog, but that should be done by the end of the week. Etherpad from Forum Session: https://etherpad.openstack.org/p/storyboard-pain-points PTG ------ During the PTG, Adam and I basically just sat down and did a full run through of all open stories. We closed duplicates, cleaned up old vague stories, replied to stories that we had questions about. It was all very cathartic. Next step is to make sure all the new stories got created that we need to and do another pass through to make sure its all tagged accurately. After that, we should document what tags we are using in our contributor documentation. Etherpad we used during the purge: https://etherpad.openstack.org/p/sb-train-ptg -Kendall Nelson (diablo_rojo) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.rydberg at citynetwork.eu Tue May 14 07:14:44 2019 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Tue, 14 May 2019 09:14:44 +0200 Subject: [publiccloud-wg][publiccloud-sig][telemetry] OpenStack billing initiative for public and private clouds Message-ID: <6deec62b-e549-8b75-7124-ad738c0d7366@citynetwork.eu> Hi all, During the Denver 2019 Summit the Public Cloud SIG identified that the current situation around public cloud (and private) billing is somewhat fragmented. Some operators use ceilometer, some Cloudkitty, and many use custom implementations. We have started an initiative to find a solution for this issue and figure out a way moving forward that can solve the needs and gaps in current solutions that exists, a solution that will make it easier for all openstack clouds, private and public, to manage billing. We are at a very early stage here, brainstorming around requirements and ideas, and would like to collect as much information about potential requirements and solutions as possible before we get any further. An etherpad [0] is created for the initial collection of information, and we would appreciate feedback and ideas there before the next Public Cloud SIG meeting that will take place 23rd of May 2019 at 1400 UTC in the #openstack-publiccloud channel. 
Cheers, Tobias [0] https://etherpad.openstack.org/p/publiccloud-sig-billing-implementation-proposal -- Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED From geguileo at redhat.com Tue May 14 08:36:30 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Tue, 14 May 2019 10:36:30 +0200 Subject: Help needed to Support Multi-attach feature In-Reply-To: References: <20190510092600.r27zetl5e3k5ow5v@localhost> <20190513200312.GA21325@sm-workstation> Message-ID: <20190514083630.6ebvzjdravpjpse3@localhost> On 14/05, RAI, SNEHA wrote: > Thanks Sean for your response. > > Setting virt_type to kvm doesn’t help. n-cpu service is failing to come up. > > > > Journalctl logs of n-cpu service: > > May 14 02:07:05 CSSOSBE04-B09 systemd[1]: Started Devstack devstack at n-cpu.service. > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: DEBUG os_vif [-] Loaded VIF plugin class '' with name 'ovs' {{(pid=15989) initialize /usr/local/lib/python2.7/dist-packages/os_vif/__init__.py:46}} > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: DEBUG os_vif [-] Loaded VIF plugin class '' with name 'linux_bridge' {{(pid=15989) initialize /usr/local/lib/python2.7/dist- > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: INFO os_vif [-] Loaded VIF plugins: ovs, linux_bridge > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: WARNING oslo_config.cfg [None req-9dc9d20c-b002-4b34-a123-81612cdc47fc None None] Option "use_neutron" from group "DEFAULT" is deprecated for removal ( > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: nova-network is deprecated, as are any related configuration options. > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ). Its value may be silently ignored in the future. > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: DEBUG oslo_policy.policy [None req-9dc9d20c-b002-4b34-a123-81612cdc47fc None None] The policy file policy.json could not be found. 
{{(pid=15989) load_rules /usr/local/lib/python2.7/dist- > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: INFO nova.virt.driver [None req-9dc9d20c-b002-4b34-a123-81612cdc47fc None None] Loading compute driver 'libvirt.LibvirtDriver' > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver [None req-9dc9d20c-b002-4b34-a123-81612cdc47fc None None] Unable to load the virtualization driver: ImportError: /usr/lib/x86_64-linux-gnu/libvirt.so.0: version `L > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver Traceback (most recent call last): > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/opt/stack/nova/nova/virt/driver.py", line 1700, in load_compute_driver > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver virtapi) > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/usr/local/lib/python2.7/dist-packages/oslo_utils/importutils.py", line 44, in import_object > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver return import_class(import_str)(*args, **kwargs) > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 346, in __init__ > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver libvirt = importutils.import_module('libvirt') > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/usr/local/lib/python2.7/dist-packages/oslo_utils/importutils.py", line 73, in import_module > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver __import__(import_str) > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/home/stack/.local/lib/python2.7/site-packages/libvirt.py", line 28, in > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver raise lib_e > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver ImportError: /usr/lib/x86_64-linux-gnu/libvirt.so.0: version `LIBVIRT_2.2.0' not found (required by /home/stack/.local/lib/python2.7/site-packages/libvirtmod.so) > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver > > May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Main process exited, code=exited, status=1/FAILURE > > May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Unit entered failed state. > > May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Failed with result 'exit-code'. 
> > > > > > root at CSSOSBE04-B09:/etc# sudo systemctl status devstack at n-cpu.service > > ● devstack at n-cpu.service - Devstack devstack at n-cpu.service > > Loaded: loaded (/etc/systemd/system/devstack at n-cpu.service; enabled; vendor preset: enabled) > > Active: failed (Result: exit-code) since Tue 2019-05-14 02:07:08 IST; 7min ago > > Process: 15989 ExecStart=/usr/local/bin/nova-compute --config-file /etc/nova/nova-cpu.conf (code=exited, status=1/FAILURE) > > Main PID: 15989 (code=exited, status=1/FAILURE) > > > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver libvirt = importutils.import_module('libvirt') > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/usr/local/lib/python2.7/dist-packages/oslo_utils/importutils.py", line 73, in import_module > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver __import__(import_str) > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/home/stack/.local/lib/python2.7/site-packages/libvirt.py", line 28, in > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver raise lib_e > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver ImportError: /usr/lib/x86_64-linux-gnu/libvirt.so.0: version `LIBVIRT_2.2.0' not found (required by /home/stack/.local/lib/python2.7/site-packages/libvirtmod.so) > > May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver > > May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Main process exited, code=exited, status=1/FAILURE > > May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Unit entered failed state. > > May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Failed with result 'exit-code'. > > Hi, This looks like a compatibility issue between the libvirt-python package that's installed in /home/stack/.local/lib/python2.7/site-packages/ an the system's libvirt version in /usr/lib/x86_64-linux-gnu/. If ythe libvirt-python package was installed from PyPi maybe uninstalling it and reinstalling or installing a different version will fix it... Sorry for not being more helpful, but Sean and I are from the Cinder team, and all this are specific to the Nova side, so we are basically guessing here... Cheers, Gorka. > > Regards, > > Sneha Rai > > > > -----Original Message----- > From: Sean McGinnis [mailto:sean.mcginnis at gmx.com] > Sent: Tuesday, May 14, 2019 1:33 AM > To: RAI, SNEHA > Cc: Gorka Eguileor ; openstack-dev at lists.openstack.org > Subject: Re: Help needed to Support Multi-attach feature > > > > On Fri, May 10, 2019 at 04:51:07PM +0000, RAI, SNEHA wrote: > > > Thanks Gorka for your response. > > > > > > I have changed the version of libvirt and qemu on my host and I am able to move past the previous error mentioned in my last email. > > > > > > Current versions of libvirt and qemu: > > > root at CSSOSBE04-B09:/etc# libvirtd --version libvirtd (libvirt) 1.3.1 > > > root at CSSOSBE04-B09:/etc# kvm --version QEMU emulator version 2.5.0 > > > (Debian 1:2.5+dfsg-5ubuntu10.36), Copyright (c) 2003-2008 Fabrice > > > Bellard > > > > > > Also, I made a change in /etc/nova/nova.conf and set virt_type=qemu. Earlier it was set to kvm. > > > I restarted all nova services post the changes but I can see one nova service was disabled and state was down. > > > > > > > Not sure if it is related or not, but I don't believe you want to change virt_type t0 "qemu". That should stay "kvm". 
From aspiers at suse.com Tue May 14 11:22:51 2019 From: aspiers at suse.com (Adam Spiers) Date: Tue, 14 May 2019 12:22:51 +0100 Subject: [StoryBoard] Forum + PTG Summary In-Reply-To: References: Message-ID: <20190514112251.aafzpnhcqc2c3sj5@pacific.linksys.moosehall> Kendall Nelson wrote: >Forum (Ibuprofen for Your StoryBoard Pain Points) >------------------------------------------------------------------- > >The Forum session went really well. We had a pretty full room and lots of >engagement from basically everyone. There were no real surprises about >anything people were having issues with or new features they needed added >that we didn't already know about. It was nice that we weren't blindsided >by anything and seem to have a pretty good feel on the pulse of what people >(that have spoke up at least) are asking for. > >The most entertaining part was that it had never occurred to us to do >StoryBoard onboarding (both for howto use it and another for how to get >started on developing for Storyboard) but also might be a really helpful >thing for us to do at the next event. Really glad you liked this suggestion ;-) IMHO it would make sense to split these two types of onboarding up into different sessions, since there will be a very different audience and content for each. I volunteer to be a guinea pig / beta tester for any developer quickstart content you produce, since I quite fancy having a go at hacking StoryBoard a bit in my Copious Free Time. From mbooth at redhat.com Tue May 14 12:08:05 2019 From: mbooth at redhat.com (Matthew Booth) Date: Tue, 14 May 2019 13:08:05 +0100 Subject: [nova] Bug warning: function wrapped in db retry which modifies its arguments Message-ID: I'm sharing this because I have a suspicion it's a class of bug rather than just this one, but I haven't gone looking. The pattern is: @wrap_db_retry def instance_update_and_get_original(..., values, ...): ... values.pop() db_operation() ... Note that in this case when db_operation() raises an exception which causes a retry, the second invocation of the function is passed values which has already been modified by the first. Modifying argument data is a generally bad idea unless it's the explicit purpose of the function, but I suspect the combination with a retry wrapper is particularly likely to be overlooked by tests. If anybody would like to review my specific patch it's here: https://review.opendev.org/#/c/658845/ . This is from a reproducible (on a large deployment under load) customer issue, btw, so it isn't theoretical. 
Matt -- Matthew Booth Red Hat OpenStack Engineer, Compute DFG Phone: +442070094448 (UK) From cjeanner at redhat.com Tue May 14 12:30:01 2019 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Tue, 14 May 2019 14:30:01 +0200 Subject: [TripleO][Validations] Tag convention In-Reply-To: <1c816ba1-b557-ef59-ba59-6c4fc31f4111@redhat.com> References: <3c383d8d-54fa-b054-f0ad-b97ed67ba03f@redhat.com> <5228e551-477c-129e-d621-9b1bde9a6535@redhat.com> <1c816ba1-b557-ef59-ba59-6c4fc31f4111@redhat.com> Message-ID: <19c5908a-c427-fddf-8556-2986f48855b1@redhat.com> On 5/10/19 11:12 AM, Cédric Jeanneret wrote: > > > On 5/8/19 9:07 AM, Cédric Jeanneret wrote: >> >> >> On 5/7/19 6:24 PM, Mohammed Naser wrote: >>> On Tue, May 7, 2019 at 12:12 PM Emilien Macchi wrote: >>>> >>>> >>>> >>>> On Tue, May 7, 2019 at 4:44 PM Cédric Jeanneret wrote: >>>>> >>>>> Dear all, >>>>> >>>>> We're currently working hard in order to provide a nice way to run >>>>> validations within a deploy (aka in-flight validations). >>>>> >>>>> We can already call validations provided by the tripleo-validations >>>>> package[1], it's working just fine. >>>>> >>>>> Now comes the question: "how can we disable the validations?". In order >>>>> to do that, we propose to use a standard tag in the ansible >>>>> roles/playbooks, and to add a "--skip-tags " when we disable the >>>>> validations via the CLI or configuration. >>>>> >>>>> After a quick check in the tripleoclient code, there apparently is a tag >>>>> named "validation", that can already be skipped from within the client. >>>>> >>>>> So, our questions: >>>>> - would the reuse of "validation" be OK? >>>>> - if not, what tag would be best in order to avoid confusion? >>>>> >>>>> We also have the idea to allow to disable validations per service. For >>>>> this, we propose to introduce the following tag: >>>>> - validation-, like "validation-nova", "validation-neutron" and >>>>> so on >>>>> >>>>> What do you think about those two additions? >>>> >>>> >>>> Such as variables, I think we should prefix all our variables and tags with tripleo_ or something, to differentiate them from any other playbooks our operators could run. >>>> I would rather use "tripleo_validations" and "tripleo_validation_nova" maybe. >> >> hmm. what-if we open this framework to a wider audience? For instance, >> openshift folks might be interested in some validations (I have Ceph in >> mind), and might find weird or even bad to have "tripleo-something" >> (with underscore or dashes). >> Maybe something more generic? >> "vf(-nova)" ? >> "validation-framework(-nova)" ? >> Or even "opendev-validation(-nova)" >> Since there are also a possibility to ask for a new package name for >> something more generic without the "tripleo" taint.. > > > Can we agree on something? I really like the > "opendev-validation(-service)", even if it's a bit long. For automated > thins, it's still good IMHO. *opendev-validation-(service)* will do, since no one raised a voice against it :). > > Would love to get some feedback on that so that we can go forward with > the validations :). > > Cheers, > > C. > >> >> Cheers, >> >> C. >> >>> >>> Just chiming in here.. the pattern we like in OSA is using dashes for >>> tags, I think having something like 'tripleo-validations' and >>> 'tripleo-validations-nova' etc >>> >>>> Wdyt? 
>>>> -- >>>> Emilien Macchi >>> >>> >>> >> > -- Cédric Jeanneret Software Engineer - OpenStack Platform Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From fungi at yuggoth.org Tue May 14 12:31:55 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 14 May 2019 12:31:55 +0000 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> Message-ID: <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> On 2019-05-13 23:31:41 +0200 (+0200), Dirk Müller wrote: [...] > I think this is combining two different viewpoints in one: > "snapshot in time" and "user expects it to be updated asap on > security vulnerabilities". I agree, that is the point I was trying to make... or moreso that the "snapshot in time" is the purpose upper-constraints.txt was intended to serve for stable branches so we can keep them... stable. On the other hand "user expects it to be updated asap on security vulnerabilities" sounds like a misconception we need to better document isn't the reason we have that mechanism. > We are already updating upper-constraints on bugfixes for projects > that openstack maintains, and we do so similarly for security > fixes for packages that are part of openstack. Yes, in those cases we have selectively backported those fixes by themselves into the affected projects so as to minimize disruption to any other projects depending on them, and we update the constraints list so that they are tested with other projects' contemporary branches. Many (I expect most?) of our external Python dependencies do not follow a similar pattern, and those which do may not have the same opinions as to what constitutes a backward-incompatible change or may maintain different lifetimes for their various stable backport branches. > Also, distribution vendor provided (non-pip installed bindeps) are > maintained and updated by security fixes. Yes, and those vendors (at least for the versions of their distros we claim to test against) generally maintain a snapshot-in-time fork of those packages and selectively backport fixes to them, which is *why* we can depend on them being a generally stable test bed for us. > > Deployment projects already should not depend on our > > requirements team tracking security vulnerabilities, so need to have > > a mechanism to override constraints entries anyway if they're making > > such guarantees to their users (and I would also caution against > > doing that too). > > for traditional openstack deployment projects that might be well > the case. However several virtualenv / container / dockerbuild > based projects are using upperconstraints to generate a stable, > coherent and tested container. Without adjustments those will not > be including security fixes though. Right, again we seem to agree on the risk, just not the source of the problem. I continue to argue that the underlying issue is the choice to reuse the existing upper-constraints.txt (which was invented for a different, conflicting purpose) rather than creating a solid solution to their problem. 
It's not a good idea to make the current constraints list less effective at solving its intended problem just so that it can be used to solve an unrelated one, regardless of how important solving that other problem might be. > > Distributions are far better equipped than our project to handle > > such tracking, as they generally get advance notice of > > vulnerabilities and selectively backport fixes for them. > > Agreed, still OpenStack chose to use pip for managing its > dependencies, so I think it is preferable to find a solution > within that ecosystem. Not every dependency with security issues > is so offensive as "requests" is which does not maintain any > stable branch but asks you to update to the latest version > instead. Most others do have backports available on micro versions > as pip installable project. OpenStack chose to use pip to *test* its Python dependency chain so that we can evaluate newer versions of those dependencies than the distros in question are carrying. It sticks with the same model on stable branches to avoid having to maintain two fundamentally different mechanisms for installing Python dependencies for tests. This doesn't mean we necessarily think its a good model for production deployments of stable branches of our software (for exactly the security-related reasons being discussed in this thread). The stable branches are meant as a place for distro package maintainers to collaborate on selective backports of patches for their packaged versions of our software, and suddenly starting to have them depend on newer versions of external dependencies which don't follow the same branching model and cadence creates new challenges for verifying that critical and security fixes for *our* software continues to remain compatible with deps contemporary to when we initially created those branches. > > accomplish the same with a mix of old and new dependency > > versions in our increasingly aging stable and extended > > maintenance branches seems like a disaster waiting to happen. > > I agree it is a difficult exercise and we need to document a clear > policy. For me documenting that it upper-constraints is maintained > with security fix version updates as a best effort/risk bases is > good enough, fwiw. And I think we're better off picking a different solution for coordinating security updates to external dependencies of our stable branches, rather than trying to turn upper-constraints.txt into that while destabilizing its intended use. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From sfinucan at redhat.com Tue May 14 12:53:25 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 14 May 2019 13:53:25 +0100 Subject: [all][docs] season of docs In-Reply-To: <671844ee600c7a5d99359b18c9c840c782e4cc67.camel@redhat.com> References: <671844ee600c7a5d99359b18c9c840c782e4cc67.camel@redhat.com> Message-ID: <674e3d5d525a550c50f03a4be8ff01c37451d8dd.camel@redhat.com> On Wed, 2019-04-10 at 11:45 +0100, Stephen Finucane wrote: > [Top posting] > > Petr (Kovar) has kindly put me in touch with folks within Red Hat who > have worked on this in the past for other projects. 
They're able to > help with getting the submission out the door and have pointed to > Gnome's submission as a good example of what we need to do here: > > https://wiki.gnome.org/Outreach/SeasonofDocs > > Based of that, I guess the next steps are figuring out what projects > need the most help and putting together a list of ideas that we can > submit. > > I can only really speak for nova and oslo. For nova, I'd like to see > us better align with the documentation style used in Django, which is > described in the below article: > > https://jacobian.org/2009/nov/10/what-to-write/ > > The documentation structure we use doesn't allow us to map to this > directly but I do think there are some easy gains to be made: > > More clearly delineate between admin-facing (/admin) and user-facing > (/user) docsExpand the how-to docs we have to better explain common > user and admin operations, such as rebooting instances, rebuilding, > attaching interfaces, etc. > On top of that, there are some general cleanup things that need to > happen and just haven't. > > [Technical] Audit our reference guide, which explains concepts like > cells v2, to see if these make sense to someone who's not in the > trenchesGenerally examine the structure of the docs to see how easy > it is to find stuff (fwiw, I struggle to find things without Google > so this is probably a bad sign) > For oslo, I think our issue is less about documentation and more > about marketing (very few people outside of OpenStack know that reno > is a thing, for example, or that oslo.config exists and is as > powerful as it is) so there's nothing I'd really submit here. I'm > willing to debate that though, if someone disagrees. > > Does anyone else have anything they'd like to get help with? If so, > please let me know (here or on IRC) and we can feed that into the > process. > > Stephen Just to close this off, we never got to finish the application for this. It was quite involved, as promised, and Summit/PTG work took priority. Hopefully we'll be able to try again next year. Thanks to all who provided suggestions for things to work on. Stephen > On Thu, 2019-03-21 at 10:07 +0000, Alexandra Settle wrote: > > > > > > On 21/03/2019 01:58, Kendall Nelson wrote: > > > > > > > We've only been selected on time previously I think? The > > > application process was pretty involved from what I recall. I > > > will dig around and see if I can find anything from our last > > > application and send it over if I discover anything. > > > > > > > > > > > > Happy to try to help with the application too if you want an > > > extra set of eyes/hands. > > > > > > > > > > > > > Ditto. I love these applications and Outreachy was really > > successful! (well, sorta, long story) > > > > > > > > > > > > > > > > > > > -Kendall (diablo_rojo) > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 20, 2019 at 2:51 AM Stephen Finucane < > > > sfinucan at redhat.com> wrote: > > > > > > > > > > On Tue, 2019-03-12 at 14:42 -0700, Kendall Nelson wrote: > > > > > I think it would be a great idea if we can find someone to be > > > > > our coordinator. In the past when I've helped out with the > > > > > Google Summer of Code, the application has been a fair bit of > > > > > work, but maybe this one is different? I haven't looked yet. > > > > > I can try to help support whoever wants to coordinate this, > > > > > but I don't have time to be the primary point of contact. 
> > > > > > > > > > > > > > > > > > > > -Kendall (diablo_rojo) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This sounds like something the docs team (and me specifically) > > > > could take point on. I'm happy to look into what's required and > > > > reach out to people as necessary. Is there anything documented > > > > regarding the previous Summer of Code applications though? > > > > > > > > > > > > > > > > > > > > > > > > > > > > Stephen > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 11, 2019 at 9:30 AM Mohammed Naser < > > > > > mnaser at vexxhost.com> wrote: > > > > > > > > > > > > > > > > Hi there: > > > > > > > > > > > > > > > > > > > > > > > > It seems like Google has come up with a new somewhat-GSoC- > > > > > > like idea > > > > > > > > > > > > but focused on documentation. I think it could be a good > > > > > > opportunity > > > > > > > > > > > > for the documentation team (or any specific team actually, > > > > > > coordinated > > > > > > > > > > > > with docs) to be part of this. > > > > > > > > > > > > > > > > > > > > > > > > https://opensource.googleblog.com/2019/03/introducing-season-of-docs.html > > > > > > > > > > > > > > > > > > > > > > > > I'm not sure if the team has the amount of resources, but > > > > > > it seems > > > > > > > > > > > > they should be able to apply to this. Does this seem like > > > > > > something > > > > > > > > > > > > that might help the team more (or perhaps a specific > > > > > > project, > > > > > > > > > > > > coordinating with the docs team) to apply for this? > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > Mohammed -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Tue May 14 13:04:31 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 14 May 2019 14:04:31 +0100 Subject: [StoryBoard] Forum + PTG Summary In-Reply-To: References: Message-ID: <76915c8636ba1abd11eee5d82c6e05d0ab972700.camel@redhat.com> On Mon, 2019-05-13 at 22:38 -0700, Kendall Nelson wrote: > Hello Everyone :) > > Quick summaries of what we talked about during the sessions in > Denver. If you have any questions about anything below feel free to > reply or drop into #storyboard and ask us there! > > Forum (Ibuprofen for Your StoryBoard Pain Points)------------------ > ------------------------------------------------- > The Forum session went really well. We had a pretty full room and > lots of engagement from basically everyone. There were no real > surprises about anything people were having issues with or new > features they needed added that we didn't already know about. It was > nice that we weren't blindsided by anything and seem to have a pretty > good feel on the pulse of what people (that have spoke up at least) > are asking for. > The most entertaining part was that it had never occurred to us to do > StoryBoard onboarding (both for howto use it and another for how to > get started on developing for Storyboard) but also might be a really > helpful thing for us to do at the next event. We talked about trying > to fit it in before the end of the week, but our schedules were just > too full to make it happen. There's an alternative. Personally, I've found images or small videos (GIFs?) embedded in user manuals for web apps to be amazing for this kind of stuff. I realize these take some time to produce but pictures truly are worth their weight in gold. (/me sobs about not having Visio since switching to Fedora). 
I'm pretty much a Storyboard newb so I'd be happy to review anything you produced in this space, if it would help? Stephen > I'm still working through making sure everything that needs to > becomes a story in our backlog, but that should be done by the end of > the week. > > Etherpad from Forum Session: > https://etherpad.openstack.org/p/storyboard-pain-points > > PTG > ------ > > > During the PTG, Adam and I basically just sat down and did a full run > through of all open stories. We closed duplicates, cleaned up old > vague stories, replied to stories that we had questions about. It was > all very cathartic. > > > Next step is to make sure all the new stories got created that we > need to and do another pass through to make sure its all tagged > accurately. After that, we should document what tags we are using in > our contributor documentation. > > > Etherpad we used during the purge: > https://etherpad.openstack.org/p/sb-train-ptg > > > -Kendall Nelson (diablo_rojo) -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Tue May 14 13:47:49 2019 From: aschultz at redhat.com (Alex Schultz) Date: Tue, 14 May 2019 07:47:49 -0600 Subject: [tripleo] Specs & Blueprints for the Train cycle Message-ID: Hey folks, Last IRC meeting I mentioned that we would like to try and get the specs that we'll be committing to reviewed and merged by Train M1. This means we need to have the specs up and blueprints[0] filed as soon as possible. Please let me know if you have any questions or need help. Also please take a moment or two to review the tripleo specs that we have up[1]. Thanks, -Alex [0] https://blueprints.launchpad.net/tripleo/train [1] https://review.opendev.org/#/q/project:openstack/tripleo-specs+status:open -------------- next part -------------- An HTML attachment was scrubbed... URL: From aspiers at suse.com Tue May 14 14:32:58 2019 From: aspiers at suse.com (Adam Spiers) Date: Tue, 14 May 2019 15:32:58 +0100 Subject: [StoryBoard] Forum + PTG Summary In-Reply-To: <76915c8636ba1abd11eee5d82c6e05d0ab972700.camel@redhat.com> References: <76915c8636ba1abd11eee5d82c6e05d0ab972700.camel@redhat.com> Message-ID: <20190514143258.z7oy6tfzqq7roqtc@pacific.linksys.moosehall> Stephen Finucane wrote: >On Mon, 2019-05-13 at 22:38 -0700, Kendall Nelson wrote: >> Hello Everyone :) >> >> Quick summaries of what we talked about during the sessions in >> Denver. If you have any questions about anything below feel free to >> reply or drop into #storyboard and ask us there! >> >> Forum (Ibuprofen for Your StoryBoard Pain Points)------------------ >> ------------------------------------------------- >> The Forum session went really well. We had a pretty full room and >> lots of engagement from basically everyone. There were no real >> surprises about anything people were having issues with or new >> features they needed added that we didn't already know about. It was >> nice that we weren't blindsided by anything and seem to have a pretty >> good feel on the pulse of what people (that have spoke up at least) >> are asking for. >> The most entertaining part was that it had never occurred to us to do >> StoryBoard onboarding (both for howto use it and another for how to >> get started on developing for Storyboard) but also might be a really >> helpful thing for us to do at the next event. We talked about trying >> to fit it in before the end of the week, but our schedules were just >> too full to make it happen. > >There's an alternative. 
Personally, I've found images or small videos >(GIFs?) embedded in user manuals for web apps to be amazing for this >kind of stuff. I realize these take some time to produce but pictures >truly are worth their weight in gold. I said almost exactly the same thing, during this Forum session IIRC: https://etherpad.openstack.org/p/new-contribs-state-and-deduplication and the unanimous response was basically "no one has time". Then I suggested as a poor man's replacement for custom crafted videos that the on-boarding sessions could be recorded, but IIRC that was not possible due to budget constraints. I also pointed out that online webinars can reach a wider audience, especially considering they can easily be recorded and uploaded somewhere for viewing after the event. The etherpad notes are scarce, but I think then someone pointed out that while video can be great, documentation has the distinct advantage of being maintainable by the community. In an ideal world, of course we'd have everything: comprehensive documentation, quick-start tutorials guides, videos, face-to-face training ... But if we only have bandwidth to produce one form of onboarding material, then maybe documentation is the one to focus on. Having said that, presentation slides can be a form of documentation, and can also be maintainable if authored with open tools such as reveal.js + git. (Much as I love the features and convenience of Google Slides, I don't think it's a good choice for material which needs to be maintained collaboratively.) >(/me sobs about not having Visio since switching to Fedora). What type of diagrams do you want to create? Depending on the answer, there are probably perfectly good open source replacements. In fact as you probably already know, nova already uses at least one of them, e.g. https://docs.openstack.org/nova/latest/reference/live-migration.html From thierry at openstack.org Tue May 14 14:34:02 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 14 May 2019 16:34:02 +0200 Subject: [tc] Summary: "Driving common goals" forum session and Popup teams discussion at PTG Message-ID: <0b5b811f-3900-7c78-0f68-afd278843fe2@openstack.org> In Denver we had a Forum session on how to facilitate driving common goals, here is a quick summary. We started by defining the different types of common goals in OpenStack. There are OpenStack-wide consistency goals (having common support for a feature across all OpenStack components, raising the minimal QA/doc/operations bar, improving overall "openstack" experience). And there are cross-project changes that only affect some projects: project refactoring (like extracting placement), multi-project features (like volume multi-attach) or architectural changes (like having common node-centric agents). The discussion then shifted to discussing why cross-project is hard. While I thought the projectteam-centric structure made the cross-project work less rewarding, discussion revealed that most of the difficulty comes from lack of initial directions / contacts, and lack of synchronous discussions to get buy-in from the various stakeholders. We then moved to discuss implementation models. The release goals (small, limited to one development cycle and affecting all teams) are great to drive consistency goals. SIGs are great to drive long-term cross-project changes, care for special interests, or global concerns that affect both software production and software consumption. That leaves a gap for short or mid-term cross-project changes. 
Popup teams are designed to fill that gap. Those would be used to drive a clear, limited cross-project objective. They are temporary like release goals, but can extend beyond a release cycle. They just need a clear disband criteria. During the rest of the session we discussed how to best implement them to facilitate that work. We ran out of time before solving the key question of whether the TC should approve the idea, the technical implementation, or the team membership itself. The implementation discussion continued during the TC meeting at the PTG on Saturday. The general consensus in the room was that the TC should look at the scope/goal of the proposed popup team (before any spec), and offer to "support" it if it looks like a viable goal and a desirable objective for OpenStack. Supported popup teams would get listed on the governance website, and get a experienced community member sponsor to serve as a liaison with the TC and mentor the popup team through initial steps (like getting the right connections in the various affected teams). Further discussion on implementation specs may reveal that the goal is impossible or not desirable, in which case the popup team can be disbanded. Action items to move this idea forward include documenting the popup team concept / approval criteria and create a home for them on the governance website (ttx), documenting best practices on how to get cross-project work done (ildiko), and documenting the difference between project teams, SIGs and popup teams (ricolin, ttx, persia). -- Thierry Carrez (ttx) From mthode at mthode.org Tue May 14 14:39:35 2019 From: mthode at mthode.org (Matthew Thode) Date: Tue, 14 May 2019 09:39:35 -0500 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> Message-ID: <20190514143935.thuj6t7z6v4xoyay@mthode.org> On 19-05-14 12:31:55, Jeremy Stanley wrote: > On 2019-05-13 23:31:41 +0200 (+0200), Dirk Müller wrote: > [...] > > I think this is combining two different viewpoints in one: > > "snapshot in time" and "user expects it to be updated asap on > > security vulnerabilities". > > I agree, that is the point I was trying to make... or moreso that > the "snapshot in time" is the purpose upper-constraints.txt was > intended to serve for stable branches so we can keep them... stable. > On the other hand "user expects it to be updated asap on security > vulnerabilities" sounds like a misconception we need to better > document isn't the reason we have that mechanism. > > > We are already updating upper-constraints on bugfixes for projects > > that openstack maintains, and we do so similarly for security > > fixes for packages that are part of openstack. > > Yes, in those cases we have selectively backported those fixes by > themselves into the affected projects so as to minimize disruption > to any other projects depending on them, and we update the > constraints list so that they are tested with other projects' > contemporary branches. Many (I expect most?) of our external Python > dependencies do not follow a similar pattern, and those which do may > not have the same opinions as to what constitutes a > backward-incompatible change or may maintain different lifetimes for > their various stable backport branches. 
> > > Also, distribution vendor provided (non-pip installed bindeps) are > > maintained and updated by security fixes. > > Yes, and those vendors (at least for the versions of their distros > we claim to test against) generally maintain a snapshot-in-time fork > of those packages and selectively backport fixes to them, which is > *why* we can depend on them being a generally stable test bed for > us. > > > > Deployment projects already should not depend on our > > > requirements team tracking security vulnerabilities, so need to have > > > a mechanism to override constraints entries anyway if they're making > > > such guarantees to their users (and I would also caution against > > > doing that too). > > > > for traditional openstack deployment projects that might be well > > the case. However several virtualenv / container / dockerbuild > > based projects are using upperconstraints to generate a stable, > > coherent and tested container. Without adjustments those will not > > be including security fixes though. > > Right, again we seem to agree on the risk, just not the source of > the problem. I continue to argue that the underlying issue is the > choice to reuse the existing upper-constraints.txt (which was > invented for a different, conflicting purpose) rather than creating > a solid solution to their problem. It's not a good idea to make the > current constraints list less effective at solving its intended > problem just so that it can be used to solve an unrelated one, > regardless of how important solving that other problem might be. > > > > Distributions are far better equipped than our project to handle > > > such tracking, as they generally get advance notice of > > > vulnerabilities and selectively backport fixes for them. > > > > Agreed, still OpenStack chose to use pip for managing its > > dependencies, so I think it is preferable to find a solution > > within that ecosystem. Not every dependency with security issues > > is so offensive as "requests" is which does not maintain any > > stable branch but asks you to update to the latest version > > instead. Most others do have backports available on micro versions > > as pip installable project. > > OpenStack chose to use pip to *test* its Python dependency chain so > that we can evaluate newer versions of those dependencies than the > distros in question are carrying. It sticks with the same model on > stable branches to avoid having to maintain two fundamentally > different mechanisms for installing Python dependencies for tests. > This doesn't mean we necessarily think its a good model for > production deployments of stable branches of our software (for > exactly the security-related reasons being discussed in this > thread). The stable branches are meant as a place for distro package > maintainers to collaborate on selective backports of patches for > their packaged versions of our software, and suddenly starting to > have them depend on newer versions of external dependencies which > don't follow the same branching model and cadence creates new > challenges for verifying that critical and security fixes for *our* > software continues to remain compatible with deps contemporary to > when we initially created those branches. > > > > accomplish the same with a mix of old and new dependency > > > versions in our increasingly aging stable and extended > > > maintenance branches seems like a disaster waiting to happen. > > > > I agree it is a difficult exercise and we need to document a clear > > policy. 
For me documenting that it upper-constraints is maintained > > with security fix version updates as a best effort/risk bases is > > good enough, fwiw. > > And I think we're better off picking a different solution for > coordinating security updates to external dependencies of our stable > branches, rather than trying to turn upper-constraints.txt into that > while destabilizing its intended use. I don't like the idea of conflating the stability promise of upper-constraints.txt with the not quite fully tested-ness of adding security updates after the fact (while we do some cross testing, we do not and should not have 100% coverage, boiling the ocean). The only way I can see this working is to have a separate file for security updates. The idea I had (and don't like too much) is to do the following. 1. Keep upper-constraints.txt as is a. rename to tox-constraints possibly 2. add a new file, let's call it 'security-updates.txt' a. in this file goes security updates and all the knock on updates that it causes (foo pulls in a new bersion of bar and baz). b. the file needs to maintain co-installability of openstack. It is laid over the upper-constraints file and tested the same way upper-constraints is. This testing is NOT perfect. The generated file could be called something like 'somewhat-tested-secureconstraints.txt' 3. global-requirements.txt remains the same (minimum not updated for security issues) This would increase test sprawl quite a bit (tests need to be run on any constraints change on this larger set). This also sets up incrased work and scope for the requirements team. Perhaps this could be a sub team type of item or something? Anything we do should be within our documentation before we do it, policy wise. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From beagles at redhat.com Tue May 14 14:52:47 2019 From: beagles at redhat.com (Brent Eagles) Date: Tue, 14 May 2019 12:22:47 -0230 Subject: [TripleO][Validations] Tag convention In-Reply-To: <19c5908a-c427-fddf-8556-2986f48855b1@redhat.com> References: <3c383d8d-54fa-b054-f0ad-b97ed67ba03f@redhat.com> <5228e551-477c-129e-d621-9b1bde9a6535@redhat.com> <1c816ba1-b557-ef59-ba59-6c4fc31f4111@redhat.com> <19c5908a-c427-fddf-8556-2986f48855b1@redhat.com> Message-ID: On Tue, May 14, 2019 at 10:01 AM Cédric Jeanneret wrote: > > > On 5/10/19 11:12 AM, Cédric Jeanneret wrote: > > > > > > On 5/8/19 9:07 AM, Cédric Jeanneret wrote: > >> > >> > >> On 5/7/19 6:24 PM, Mohammed Naser wrote: > >>> On Tue, May 7, 2019 at 12:12 PM Emilien Macchi > wrote: > >>>> > >>>> > >>>> > >>>> On Tue, May 7, 2019 at 4:44 PM Cédric Jeanneret > wrote: > >>>>> > >>>>> Dear all, > >>>>> > >>>>> We're currently working hard in order to provide a nice way to run > >>>>> validations within a deploy (aka in-flight validations). > >>>>> > >>>>> We can already call validations provided by the tripleo-validations > >>>>> package[1], it's working just fine. > >>>>> > >>>>> Now comes the question: "how can we disable the validations?". In > order > >>>>> to do that, we propose to use a standard tag in the ansible > >>>>> roles/playbooks, and to add a "--skip-tags " when we disable the > >>>>> validations via the CLI or configuration. > >>>>> > >>>>> After a quick check in the tripleoclient code, there apparently is a > tag > >>>>> named "validation", that can already be skipped from within the > client. 
> >>>>> > >>>>> So, our questions: > >>>>> - would the reuse of "validation" be OK? > >>>>> - if not, what tag would be best in order to avoid confusion? > >>>>> > >>>>> We also have the idea to allow to disable validations per service. > For > >>>>> this, we propose to introduce the following tag: > >>>>> - validation-, like "validation-nova", "validation-neutron" > and > >>>>> so on > >>>>> > >>>>> What do you think about those two additions? > >>>> > >>>> > >>>> Such as variables, I think we should prefix all our variables and > tags with tripleo_ or something, to differentiate them from any other > playbooks our operators could run. > >>>> I would rather use "tripleo_validations" and > "tripleo_validation_nova" maybe. > >> > >> hmm. what-if we open this framework to a wider audience? For instance, > >> openshift folks might be interested in some validations (I have Ceph in > >> mind), and might find weird or even bad to have "tripleo-something" > >> (with underscore or dashes). > >> Maybe something more generic? > >> "vf(-nova)" ? > >> "validation-framework(-nova)" ? > >> Or even "opendev-validation(-nova)" > >> Since there are also a possibility to ask for a new package name for > >> something more generic without the "tripleo" taint.. > > > > > > Can we agree on something? I really like the > > "opendev-validation(-service)", even if it's a bit long. For automated > > thins, it's still good IMHO. > > *opendev-validation-(service)* will do, since no one raised a voice > against it :). > Cool, works for me! Cheers, Brent -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Tue May 14 14:55:35 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 14 May 2019 09:55:35 -0500 Subject: [oslo] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> Message-ID: <033e59ef-daed-b1ad-7ce6-8fc9a1e3ed4c@nemebean.com> I started an ethercalc to track this since we have patches from multiple people now: https://ethercalc.openstack.org/ml1qj9xrnyfg If you're interested in pumping your commit stats feel free to take one of the projects that doesn't have a review listed and get that submitted. It would be great to have some non-cores do that so the cores can approve them. We've been following a single-approver model for this since all of the patches are basically the same. If you do submit a patch to fix this, please add it to the ethercalc too. I've been marking merged patches in green to keep track of our progress. Thanks. On 5/13/19 12:40 PM, Ben Nemec wrote: > > > On 5/13/19 12:23 PM, Ben Nemec wrote: >> Nefarious cap bandits are running amok in the OpenStack community! >> Won't someone take a stand against these villainous headwear thieves?! >> >> Oh, sorry, just pasted the elevator pitch for my new novel. ;-) >> >> Actually, this email is to summarize the plan we came up with in the >> Oslo meeting this morning. Since we have a bunch of projects affected >> by the Bandit breakage I wanted to make sure we had a common fix so we >> don't have a bunch of slightly different approaches in each project. >> The plan we agreed on in the meeting was to push a two patch series to >> each repo - one to cap bandit <1.6.0 and one to uncap it with a >> !=1.6.0 exclusion. The first should be merged immediately to unblock >> ci, and the latter can be rechecked once bandit 1.6.1 releases to >> verify that it fixes the problem for us. 
> > Oh, and since sphinx is also breaking the Oslo world, I guess we're > going to have to include the sphinx requirements fix in these first > patches: https://review.opendev.org/#/c/658857/ > > That's passing the requirements job so it should unblock us. > > /me is off to squash some patches > >> >> We chose this approach instead of just tweaking the exclusion in >> tox.ini because it's not clear that the current behavior will continue >> once Bandit fixes the bug. Assuming they restore the old behavior, >> this should require the least churn in our repos and means we're still >> compatible with older versions that people may already have installed. >> >> I started pushing patches under >> https://review.opendev.org/#/q/topic:cap-bandit (which prompted the >> digression to start this email ;-) to implement this plan. This is >> mostly intended to be informational, but if you have any concerns with >> the plan above please do let us know immediately. >> >> Thanks. >> >> -Ben >> > From aspiers at suse.com Tue May 14 15:06:57 2019 From: aspiers at suse.com (Adam Spiers) Date: Tue, 14 May 2019 16:06:57 +0100 Subject: [First Contact] [SIG] Summit/Forum + PTG Summary In-Reply-To: References: Message-ID: <20190514150657.hshqfcjsa35t57yb@pacific.linksys.moosehall> Kendall Nelson wrote: >Forum Session (Welcoming New Contributors State of the Union and >Deduplication of Efforts) >----------------------------------------------------------------------------------------------------------------------------- > >The biggest things that came out of this session were discussion about >recording of onboarding sessions and a community goal of improving >contributor documentation. > >Basically, we have never had the onboarding sessions recorded but if we >could tt would really help new contributors even if they might get a little >stale before we are able to record new ones. +1: slightly out of date info is still usually better than none. This other mail thread in the last hour jogged my memory on some of the other details we discussed in this session: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006224.html >During that chat, we learned >that Octavia does somewhat regular calls in whch they do onboarding for new >contributors. I have asked for an outline to help encourage other projects >to do similar. > >As for per project contributor documentation, some projects have it and >some don't. Some projects have it and its incomplete. bauzas volunteered >to do an audit of which projects have it and which don't and to propose a >community goal for it. As a part of that, we should probably decide on a >list of bare minimum things to include. Few things off the top of my head: - Architectural overview - Quickstart for getting the code running in the simplest form (even if this is just "use devstack with these parameters") - Overview of all the project's git repos, and the layout of the files in each - How to run the various types of tests - How to find some easy dev tasks to get started with From zbitter at redhat.com Tue May 14 15:09:26 2019 From: zbitter at redhat.com (Zane Bitter) Date: Tue, 14 May 2019 11:09:26 -0400 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> Message-ID: <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> On 13/05/19 1:40 PM, Ben Nemec wrote: > > > On 5/13/19 12:23 PM, Ben Nemec wrote: >> Nefarious cap bandits are running amok in the OpenStack community! 
>> Won't someone take a stand against these villainous headwear thieves?! >> >> Oh, sorry, just pasted the elevator pitch for my new novel. ;-) >> >> Actually, this email is to summarize the plan we came up with in the >> Oslo meeting this morning. Since we have a bunch of projects affected >> by the Bandit breakage I wanted to make sure we had a common fix so we >> don't have a bunch of slightly different approaches in each project. >> The plan we agreed on in the meeting was to push a two patch series to >> each repo - one to cap bandit <1.6.0 and one to uncap it with a >> !=1.6.0 exclusion. The first should be merged immediately to unblock >> ci, and the latter can be rechecked once bandit 1.6.1 releases to >> verify that it fixes the problem for us. I take it that just blocking 1.6.0 in global-requirements isn't an option? (Would it not work, or just break every project's requirements job? I could live with the latter since they're broken anyway because of the sphinx issue below...) > Oh, and since sphinx is also breaking the Oslo world, I guess we're > going to have to include the sphinx requirements fix in these first > patches: https://review.opendev.org/#/c/658857/ It's breaking the whole world and I'm actually not sure there's a good reason for it. Who cares if sphinx 2.0 doesn't run on Python 2.7 when we set and achieved a goal in Stein to only run docs jobs under Python 3? It's unavoidable for stable/rocky and earlier but it seems like the pain on master is not necessary. > That's passing the requirements job so it should unblock us. > > /me is off to squash some patches > >> >> We chose this approach instead of just tweaking the exclusion in >> tox.ini because it's not clear that the current behavior will continue >> once Bandit fixes the bug. Assuming they restore the old behavior, >> this should require the least churn in our repos and means we're still >> compatible with older versions that people may already have installed. >> >> I started pushing patches under >> https://review.opendev.org/#/q/topic:cap-bandit (which prompted the >> digression to start this email ;-) to implement this plan. This is >> mostly intended to be informational, but if you have any concerns with >> the plan above please do let us know immediately. >> >> Thanks. >> >> -Ben >> > From doug at doughellmann.com Tue May 14 15:27:50 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Tue, 14 May 2019 11:27:50 -0400 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> Message-ID: Zane Bitter writes: > On 13/05/19 1:40 PM, Ben Nemec wrote: >> >> >> On 5/13/19 12:23 PM, Ben Nemec wrote: >>> Nefarious cap bandits are running amok in the OpenStack community! >>> Won't someone take a stand against these villainous headwear thieves?! >>> >>> Oh, sorry, just pasted the elevator pitch for my new novel. ;-) >>> >>> Actually, this email is to summarize the plan we came up with in the >>> Oslo meeting this morning. Since we have a bunch of projects affected >>> by the Bandit breakage I wanted to make sure we had a common fix so we >>> don't have a bunch of slightly different approaches in each project. >>> The plan we agreed on in the meeting was to push a two patch series to >>> each repo - one to cap bandit <1.6.0 and one to uncap it with a >>> !=1.6.0 exclusion. 
The first should be merged immediately to unblock >>> ci, and the latter can be rechecked once bandit 1.6.1 releases to >>> verify that it fixes the problem for us. > > I take it that just blocking 1.6.0 in global-requirements isn't an > option? (Would it not work, or just break every project's requirements > job? I could live with the latter since they're broken anyway because of > the sphinx issue below...) Because bandit is a "linter" it is in the blacklist in the requirements repo, which means it is not constrained there. Projects are expected to manage the versions of linters they use, and roll forward when they are ready to deal with any new rules introduced by the linters (either by following or disabling them). So, no, unfortunately we can't do this globally through the requirements repo right now. -- Doug From a.settle at outlook.com Tue May 14 15:59:22 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Tue, 14 May 2019 15:59:22 +0000 Subject: [all[tc][ptls] Success bot lives on! Message-ID: Hi all, Hope you're all settled back in after an absolutely crazily long week in Denver. The TC met on Saturday the 4th of May for the PTG and we had a large group discussing how we can evolve our systems towards more simplicity, fun, exciting, enjoyable, and rewarding (take your wordy pick). But in the mean time, we don't appear to celebrate the little things anymore. One of the proposals to this was to revive success bot for consistent use. For those who don't know or remember what success bot is, it is a success IRC bot (*dramatic gasp*) that makes it simple to record "little moments of joy and progress" and share them. Review this article for more info [1] or the original email from Thierry [2]. It is clear that we still have some using it, with about ~25 posts last year and 2 so far this year, but I think we can do better than that. There are a lot of new people in the community, so I hope this generates more interest! So whenever you feel like you (or someone) made progress, or had a little success in your OpenStack adventures, or have some joyful moment to share, just throw the following message on your local IRC channel: #success [Your message here] The openstackstatus bot will take that and record it on the wiki page [3]. Cheers, Alex IRC: asettle Twitter: dewsday [1] https://superuser.openstack.org/articles/success-bot-helps-share-your-happy-moments-in-the-openstack-community/ [2] http://lists.openstack.org/pipermail/openstack-dev/2015-October/076552.html [3] https://wiki.openstack.org/wiki/Successes p.s - The bot still only works in channels where openstackstatus is present (the official OpenStack IRC channels), and we may remove entries that are off-topic or spam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Tue May 14 16:19:06 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 14 May 2019 11:19:06 -0500 Subject: [all[tc][ptls] Success bot lives on! In-Reply-To: References: Message-ID: On 5/14/19 10:59 AM, Alexandra Settle wrote: > Hi all, > > Hope you're all settled back in after an absolutely crazily long week in > Denver. > > The TC met on Saturday the 4th of May for the PTG and we had a large > group discussing how we can evolve our systems towards more simplicity, > fun, exciting, enjoyable, and rewarding (take your wordy pick). But in > the mean time, we don't appear to celebrate the little things anymore. > > One of the proposals to this was to revive success bot for consistent > use. 
For those who don't know or remember what success bot is, it is a > success IRC bot (*dramatic gasp*) that makes it simple to record "little > moments of joy and progress" and share them. Review this article for > more info [1] or the original email from Thierry [2]. > > It is clear that we still have some using it, with about ~25 posts last > year and 2 so far this year, but I think we can do better than that. > There are a lot of new people in the community, so I hope this generates > more interest! > > So whenever you feel like you (or someone) made progress, or had a > little success in your OpenStack adventures, or have some joyful moment > to share, just throw the following message on your local IRC channel: > > #success [Your message here] > > The openstackstatus bot will take that and record it on the wiki page [3]. I think we had talked about sending a weekly summary to the list or something. I know I don't tend to check the wiki page on a regular basis. Is there any plan around that? > > Cheers, > > Alex > > IRC: asettle > Twitter: dewsday > > [1] > https://superuser.openstack.org/articles/success-bot-helps-share-your-happy-moments-in-the-openstack-community/ > > [2] > http://lists.openstack.org/pipermail/openstack-dev/2015-October/076552.html > > [3]https://wiki.openstack.org/wiki/Successes > > > p.s  - The bot still only works in channels where openstackstatus is > present (the official OpenStack IRC channels), and we may remove entries > that are off-topic or spam. > From a.settle at outlook.com Tue May 14 16:36:37 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Tue, 14 May 2019 16:36:37 +0000 Subject: [all[tc][ptls] Success bot lives on! In-Reply-To: References: Message-ID: On 14/05/2019 17:19, Ben Nemec wrote: > > > On 5/14/19 10:59 AM, Alexandra Settle wrote: >> Hi all, >> >> Hope you're all settled back in after an absolutely crazily long week >> in Denver. >> >> The TC met on Saturday the 4th of May for the PTG and we had a large >> group discussing how we can evolve our systems towards more >> simplicity, fun, exciting, enjoyable, and rewarding (take your wordy >> pick). But in the mean time, we don't appear to celebrate the little >> things anymore. >> >> One of the proposals to this was to revive success bot for consistent >> use. For those who don't know or remember what success bot is, it is >> a success IRC bot (*dramatic gasp*) that makes it simple to record >> "little moments of joy and progress" and share them. Review this >> article for more info [1] or the original email from Thierry [2]. >> >> It is clear that we still have some using it, with about ~25 posts >> last year and 2 so far this year, but I think we can do better than >> that. There are a lot of new people in the community, so I hope this >> generates more interest! >> >> So whenever you feel like you (or someone) made progress, or had a >> little success in your OpenStack adventures, or have some joyful >> moment to share, just throw the following message on your local IRC >> channel: >> >> #success [Your message here] >> >> The openstackstatus bot will take that and record it on the wiki page >> [3]. > > I think we had talked about sending a weekly summary to the list or > something. I know I don't tend to check the wiki page on a regular > basis. Is there any plan around that? There sure is. So, based on those numbers above ^^ there's not really a case to jump in and go "ahoy, let's send the results to the list" as there really is no (not many) results at the moment. 
So, to start - I'm hoping this email generates interest in using the success bot, and then the plan is to change the output location from the wiki, to auto send an email (that, or scrap the wiki diff to generate an email) :) > >> >> Cheers, >> >> Alex >> >> IRC: asettle >> Twitter: dewsday >> >> [1] >> https://superuser.openstack.org/articles/success-bot-helps-share-your-happy-moments-in-the-openstack-community/ >> >> [2] >> http://lists.openstack.org/pipermail/openstack-dev/2015-October/076552.html >> >> [3]https://wiki.openstack.org/wiki/Successes >> >> >> p.s  - The bot still only works in channels where openstackstatus is >> present (the official OpenStack IRC channels), and we may remove >> entries that are off-topic or spam. >> From openstack at nemebean.com Tue May 14 16:58:03 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 14 May 2019 11:58:03 -0500 Subject: [oslo][all] Ending courtesy pings Message-ID: Hi, We discussed this some in the Oslo meeting yesterday[0], and I wanted to send a followup because there wasn't universal support for it. One of the outcomes of the PTL tips and tricks session in Denver was that courtesy pings like we use in the Oslo meeting are considered bad IRC etiquette. The recommendation was for interested parties to set up custom highlights on the "#startmeeting oslo" (or whichever meeting) command. Also, there is an ics file available on eavesdrop[1] that can be used to import the meeting to your calendaring app of choice. I should note that I don't seem to be able to configure notifications on the imported calendar entry in Google calendar though, so I'm not sure how useful this is as a reminder. A couple of concerns were raised yesterday. One was that people didn't know how to configure their IRC client to do this. Once you do configure it, there's a testing problem in that you don't get notified of your own messages, so you basically have to wait for the next meeting and hope you got it right. Or pull someone into a private channel and have them send a startmeeting command, which is a hassle. It isn't terribly complicated, but if it isn't tested then it's assumed broken. :-) The other concern was that this process would have to be done any time someone changes IRC clients, whereas the ping list was a central thing that always applies no matter where you're connecting from. Anyway, I said I would send an email out for further public discussion, and this is it. I'm interested to hear people's thoughts. Thanks. -Ben 0: http://eavesdrop.openstack.org/meetings/oslo/2019/oslo.2019-05-13-15.00.log.html#l-44 1: http://eavesdrop.openstack.org/#Oslo_Team_Meeting From fungi at yuggoth.org Tue May 14 17:14:12 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 14 May 2019 17:14:12 +0000 Subject: [all[tc][ptls] Success bot lives on! In-Reply-To: References: Message-ID: <20190514171411.dx5bv7a6epqlnqsz@yuggoth.org> On 2019-05-14 15:59:22 +0000 (+0000), Alexandra Settle wrote: [...] > For those who don't know or remember what success bot is, it is a > success IRC bot (*dramatic gasp*) that makes it simple to record > "little moments of joy and progress" and share them. [...] A subsequent addition also created a thanksbot mechanism, as described in this SU article from last year: https://superuser.openstack.org/articles/thank_bot/ It's similarly under-utilized, but serves as a reminder that I should be thanking people for their contributions with greater frequency than I do. 
-- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jim at jimrollenhagen.com Tue May 14 17:32:49 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Tue, 14 May 2019 13:32:49 -0400 Subject: [oslo][all] Ending courtesy pings In-Reply-To: References: Message-ID: On Tue, May 14, 2019 at 1:04 PM Ben Nemec wrote: > Hi, > > We discussed this some in the Oslo meeting yesterday[0], and I wanted to > send a followup because there wasn't universal support for it. > > One of the outcomes of the PTL tips and tricks session in Denver was > that courtesy pings like we use in the Oslo meeting are considered bad > IRC etiquette. The recommendation was for interested parties to set up > custom highlights on the "#startmeeting oslo" (or whichever meeting) > command. Also, there is an ics file available on eavesdrop[1] that can > be used to import the meeting to your calendaring app of choice. I > should note that I don't seem to be able to configure notifications on > the imported calendar entry in Google calendar though, so I'm not sure > how useful this is as a reminder. > > A couple of concerns were raised yesterday. One was that people didn't > know how to configure their IRC client to do this. Once you do configure > it, there's a testing problem in that you don't get notified of your own > messages, so you basically have to wait for the next meeting and hope > you got it right. Or pull someone into a private channel and have them > send a startmeeting command, which is a hassle. It isn't terribly > complicated, but if it isn't tested then it's assumed broken. :-) > > The other concern was that this process would have to be done any time > someone changes IRC clients, whereas the ping list was a central thing > that always applies no matter where you're connecting from. > > Anyway, I said I would send an email out for further public discussion, > and this is it. I'm interested to hear people's thoughts. > I'd argue that as long as people opt in to a courtesy ping, and have a clear way to opt out, then the courtesy ping is not bad etiquette. It would be great if folks managed their own meeting reminders, but if they appreciate a ping at the start of the meeting, I see no reason not to do that. // jim > > Thanks. > > -Ben > > 0: > > http://eavesdrop.openstack.org/meetings/oslo/2019/oslo.2019-05-13-15.00.log.html#l-44 > 1: http://eavesdrop.openstack.org/#Oslo_Team_Meeting > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kennelson11 at gmail.com Tue May 14 17:36:44 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 14 May 2019 10:36:44 -0700 Subject: [PTL][SIG][WG] PTG Team Photos In-Reply-To: References: <981673d8-b095-4c30-a651-577d1c5286d3@www.fastmail.com> <30c8cbb5-b11b-be98-339d-ef6c5e35305b@gmail.com> Message-ID: Hello! I have just about all the photos sorted into separate project team folders[1]. There were a few groups that were not signed up and came for photos anyway which is great, but being that I wasn't taking photos I don't know exactly what teams they were. These teams that I was unable to place just exist outside the team directories. If anyone wants to speak up and help solve the mystery I would very much appreciate it! Enjoy! 
-Kendall (diablo_rojo) [1] https://www.dropbox.com/sh/fydqjehy9h5y728/AAAEP6h_uK_6r1a9oh3aAF6Qa?dl=0 On Mon, May 13, 2019 at 4:00 PM Kendall Nelson wrote: > Sorting through them today, should have a link for everyone tomorrow. > > -Kendall > > On Fri, May 10, 2019 at 11:01 AM Jay Bryant wrote: > >> Colleen, >> >> I haven't seen them made available anywhere yet so I don't think you >> missed an e-mail. >> >> Jay >> >> On 5/10/2019 12:48 PM, Colleen Murphy wrote: >> > On Thu, Mar 28, 2019, at 17:03, Kendall Nelson wrote: >> >> Hello! >> >> >> >> If your team is attending the PTG and is interested in having a team >> >> photo taken, here is the signup[1]! There are slots Thursday and Friday >> >> from 10:00 AM to 4:30 PM. >> >> >> >> The location is TBD but will likely be close to where registration will >> >> be. I'll send an email out the day before with a reminder of your time >> >> slot and an exact location. >> >> >> >> -Kendall (diablo_rojo) >> >> >> >> [1] >> https://docs.google.com/spreadsheets/d/1DgsRHVWW2YLv7ewfX0M21zWJRf4wUfPG4ff2V5XtaMg/edit?usp=sharing >> >> >> > Are the photos available somewhere now? I'm wondering if I missed an >> email. >> > >> > Colleen >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From morgan.fainberg at gmail.com Tue May 14 17:42:22 2019 From: morgan.fainberg at gmail.com (Morgan Fainberg) Date: Tue, 14 May 2019 10:42:22 -0700 Subject: [oslo][all] Ending courtesy pings In-Reply-To: References: Message-ID: On Tue, May 14, 2019 at 10:38 AM Jim Rollenhagen wrote: > On Tue, May 14, 2019 at 1:04 PM Ben Nemec wrote: > >> Hi, >> >> We discussed this some in the Oslo meeting yesterday[0], and I wanted to >> send a followup because there wasn't universal support for it. >> >> One of the outcomes of the PTL tips and tricks session in Denver was >> that courtesy pings like we use in the Oslo meeting are considered bad >> IRC etiquette. The recommendation was for interested parties to set up >> custom highlights on the "#startmeeting oslo" (or whichever meeting) >> command. Also, there is an ics file available on eavesdrop[1] that can >> be used to import the meeting to your calendaring app of choice. I >> should note that I don't seem to be able to configure notifications on >> the imported calendar entry in Google calendar though, so I'm not sure >> how useful this is as a reminder. >> >> A couple of concerns were raised yesterday. One was that people didn't >> know how to configure their IRC client to do this. Once you do configure >> it, there's a testing problem in that you don't get notified of your own >> messages, so you basically have to wait for the next meeting and hope >> you got it right. Or pull someone into a private channel and have them >> send a startmeeting command, which is a hassle. It isn't terribly >> complicated, but if it isn't tested then it's assumed broken. :-) >> >> The other concern was that this process would have to be done any time >> someone changes IRC clients, whereas the ping list was a central thing >> that always applies no matter where you're connecting from. >> >> Anyway, I said I would send an email out for further public discussion, >> and this is it. I'm interested to hear people's thoughts. >> > > I'd argue that as long as people opt in to a courtesy ping, and have a > clear way to opt out, then the courtesy ping is not bad etiquette. 
> > It would be great if folks managed their own meeting reminders, but > if they appreciate a ping at the start of the meeting, I see no reason > not to do that. > > // jim > > >> >> Thanks. >> >> -Ben >> >> 0: >> >> http://eavesdrop.openstack.org/meetings/oslo/2019/oslo.2019-05-13-15.00.log.html#l-44 >> 1: http://eavesdrop.openstack.org/#Oslo_Team_Meeting >> >> Keystone used to have a self-managed list (curated at the start of each cycle administratively so inactive folks weren't constantly pinged) for the courtesy pings. The list was located at the top of our weekly meeting agenda. With that said, we've also moved away from courtesy pings. With the ability to export the .ics of the calendars (for my personal calendar) this has become less of an issue. I support the general removal of courtesy pings simply for the reason of limiting the clutter in the channels. --Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Tue May 14 17:48:07 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 14 May 2019 12:48:07 -0500 Subject: [PTL][SIG][WG] PTG Team Photos In-Reply-To: References: <981673d8-b095-4c30-a651-577d1c5286d3@www.fastmail.com> <30c8cbb5-b11b-be98-339d-ef6c5e35305b@gmail.com> Message-ID: <79bdd7d5-474e-3033-07ce-e9e40a527424@gmail.com> 2019-05-03 12.23.27 to 12.24.08 should be the Designate team. :-) Mugsie (Graham Hayes) is the PTL for that group and it is him and one other guy.  So I am pretty sure that is what it is.  :-) Thanks for getting the pictures posted.  They look good! Jay On 5/14/2019 12:36 PM, Kendall Nelson wrote: > Hello! > > I have just about all the photos sorted into separate project team > folders[1]. There were a few groups that were not signed up and came > for photos anyway which is great, but being that I wasn't taking > photos I don't know exactly what teams they were. These teams that I > was unable to place just exist outside the team directories. If anyone > wants to speak up and help solve the mystery I would very much > appreciate it! > > Enjoy! > > -Kendall (diablo_rojo) > > [1] > https://www.dropbox.com/sh/fydqjehy9h5y728/AAAEP6h_uK_6r1a9oh3aAF6Qa?dl=0 > > On Mon, May 13, 2019 at 4:00 PM Kendall Nelson > wrote: > > Sorting through them today, should have a link for everyone tomorrow. > > -Kendall > > On Fri, May 10, 2019 at 11:01 AM Jay Bryant > wrote: > > Colleen, > > I haven't seen them made available anywhere yet so I don't > think you > missed an e-mail. > > Jay > > On 5/10/2019 12:48 PM, Colleen Murphy wrote: > > On Thu, Mar 28, 2019, at 17:03, Kendall Nelson wrote: > >> Hello! > >> > >> If your team is attending the PTG and is interested in > having a team > >> photo taken, here is the signup[1]! There are slots > Thursday and Friday > >> from 10:00 AM to 4:30 PM. > >> > >> The location is TBD but will likely be close to where > registration will > >> be. I'll send an email out the day before with a reminder > of your time > >> slot and an exact location. > >> > >> -Kendall (diablo_rojo) > >> > >> > [1]https://docs.google.com/spreadsheets/d/1DgsRHVWW2YLv7ewfX0M21zWJRf4wUfPG4ff2V5XtaMg/edit?usp=sharing > >> > > Are the photos available somewhere now? I'm wondering if I > missed an email. > > > > Colleen > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fungi at yuggoth.org Tue May 14 17:51:14 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 14 May 2019 17:51:14 +0000 Subject: [oslo][all] Ending courtesy pings In-Reply-To: References: Message-ID: <20190514175114.gvejbtdbzhwjbbqi@yuggoth.org> On 2019-05-14 13:32:49 -0400 (-0400), Jim Rollenhagen wrote: [...] > I'd argue that as long as people opt in to a courtesy ping, and > have a clear way to opt out, then the courtesy ping is not bad > etiquette. > > It would be great if folks managed their own meeting reminders, > but if they appreciate a ping at the start of the meeting, I see > no reason not to do that. Yep, the main challenge is that spammers also like to randomly mention lists of nicks to trigger highlights in a particular channel before proceeding to paste in whatever nonsense with which they wish to regale us, so Freenode's policing mechanisms may mistake a lengthy "ping list" for such activity and insta-ban you. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From eandersson at blizzard.com Tue May 14 17:56:49 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Tue, 14 May 2019 17:56:49 +0000 Subject: [PTL][SIG][WG] PTG Team Photos In-Reply-To: <79bdd7d5-474e-3033-07ce-e9e40a527424@gmail.com> References: <981673d8-b095-4c30-a651-577d1c5286d3@www.fastmail.com> <30c8cbb5-b11b-be98-339d-ef6c5e35305b@gmail.com> <79bdd7d5-474e-3033-07ce-e9e40a527424@gmail.com> Message-ID: I heard that other guy is pretty great. :p The two man pictures are indeed of me and mugsie for the Designate project. The two last pictures with 3 people (me, Colin Gibbons and Spyros Trigazis) is for the Magnum project. Best Regards, Erik Olof Gunnar Andersson From: Jay Bryant Sent: Tuesday, May 14, 2019 10:48 AM To: openstack-discuss at lists.openstack.org Subject: Re: [PTL][SIG][WG] PTG Team Photos 2019-05-03 12.23.27 to 12.24.08 should be the Designate team. :-) Mugsie (Graham Hayes) is the PTL for that group and it is him and one other guy. So I am pretty sure that is what it is. :-) Thanks for getting the pictures posted. They look good! Jay On 5/14/2019 12:36 PM, Kendall Nelson wrote: Hello! I have just about all the photos sorted into separate project team folders[1]. There were a few groups that were not signed up and came for photos anyway which is great, but being that I wasn't taking photos I don't know exactly what teams they were. These teams that I was unable to place just exist outside the team directories. If anyone wants to speak up and help solve the mystery I would very much appreciate it! Enjoy! -Kendall (diablo_rojo) [1] https://www.dropbox.com/sh/fydqjehy9h5y728/AAAEP6h_uK_6r1a9oh3aAF6Qa?dl=0 On Mon, May 13, 2019 at 4:00 PM Kendall Nelson > wrote: Sorting through them today, should have a link for everyone tomorrow. -Kendall On Fri, May 10, 2019 at 11:01 AM Jay Bryant > wrote: Colleen, I haven't seen them made available anywhere yet so I don't think you missed an e-mail. Jay On 5/10/2019 12:48 PM, Colleen Murphy wrote: > On Thu, Mar 28, 2019, at 17:03, Kendall Nelson wrote: >> Hello! >> >> If your team is attending the PTG and is interested in having a team >> photo taken, here is the signup[1]! There are slots Thursday and Friday >> from 10:00 AM to 4:30 PM. >> >> The location is TBD but will likely be close to where registration will >> be. 
I'll send an email out the day before with a reminder of your time >> slot and an exact location. >> >> -Kendall (diablo_rojo) >> >> [1]https://docs.google.com/spreadsheets/d/1DgsRHVWW2YLv7ewfX0M21zWJRf4wUfPG4ff2V5XtaMg/edit?usp=sharing >> > Are the photos available somewhere now? I'm wondering if I missed an email. > > Colleen > -------------- next part -------------- An HTML attachment was scrubbed... URL: 
From smooney at redhat.com Mon May 13 22:19:50 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 13 May 2019 23:19:50 +0100 Subject: [nova][ptg] main etherpad backup Message-ID: 
So it looks like the https://etherpad.openstack.org/p/nova-ptg-train-5 etherpad has died like the 4 before it. Attached is an offline copy I took near the end of the PTG, which should have the majority of the content for those that are looking for it. The downside is that this is just a copy-paste I did into a text file, so I don't have any of the strikethroughs or author info in it, but all the #agree: and other notes we took should still be there. regards sean -------------- next part -------------- 
Nova Train PTG - Denver 2019 For forum session brainstorming use https://etherpad.openstack.org/p/DEN-train-nova-brainstorming 
Attendance: efried sean-k-mooney aspiers stephenfin takashin helenafm gmann Sundar mriedem gibi melwitt alex_xu mdbooth lyarwood tssurya kashyap (first two days will be sparsely available; be present fully on the last day) artom egallen dakshina-ilangov (joining post 11:30AM on Thur, Fri) jaypipes adrianc IvensZambrano johnthetubaguy (afternoon thursday, onwards) amodi gryf cfriesen (bouncing around rooms a bit) med_ mnestratov shuquan bauzas dklyle jichenjc sorrison jgasparakis tetsuro 
Team photo Friday 11:50-1200 https://ethercalc.openstack.org/3qd1fj5f3tt3 
Topics - Please include your IRC nick next to your topic so we know who to talk to about that topic. 
NUMA Topology with placement Spec: https://review.openstack.org/#/c/552924/ XPROJ see https://etherpad.openstack.org/p/ptg-train-xproj-nova-placement 
Subtree affinity with placement [efried] XPROJ: see https://etherpad.openstack.org/p/ptg-train-xproj-nova-placement 
completing NUMA affinity policies for neutron SR-IOV interfaces. ==> neutron XPROJ (efried 20190422) https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/share-pci-between-numa-nodes.html ^ does not work for neutron ports, as the flavor extra specs and image properties were removed during the implementation and the spec was retroactively updated to document what was implemented. We should fix that by supporting NUMA policies. TODO sean-k-mooney to write a blueprint/spec: either repurpose the original spec or explore using the new port requests/traits mechanism. 
cpu modeling in placement Jay's spec https://review.openstack.org/#/c/555081/ #agree approve the spec more or less as is and get moving how to make it work with numa affinity and cache 
RMD - Resource Management Daemon (dakshina-ilangov/IvensZambrano) Base enablement - https://review.openstack.org/#/c/651130/ The following blueprints reference the base enablement blueprint above Power management using CPU core P state control - https://review.openstack.org/#/c/651024/ Last-level cache - https://review.openstack.org/#/c/651233/ #agree Generic file (inventory.yaml?)
allowing $daemon (RMD) to dictate inventory to report, which can be scheduled via extra_specs #agree RMD to monitor (by subscribing to nova and/or libvirt notifications) and effect assignments/changes out-of-band - no communication from virt to RMD resource provider yaml (or how we whitelist/model host resources via config in general) https://review.openstack.org/#/c/612497/ Code: https://review.openstack.org/#/c/622622/ [efried 20190418] scrubbing from agenda for general lack of interest AMD SEV support efried 20190424 - removing from agenda because approved Any matters arising (if we're lucky, there won't be any) Train spec: https://review.opendev.org/#/c/641994/ Note this is significantly different from the... Stein spec: https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/amd-sev-libvirt-support.html ...in that we're now using a resource class and making SEV contexts a quantifiable resource. "Guest CPU selection with hypervisor consideration" (kashyap) efried 20190425 -- removing from agenda because spec/bp approved Blueprint: https://blueprints.launchpad.net/nova/+spec/cpu-selection-with-hypervisor-consideration Spec: https://review.openstack.org/#/c/645814/ tl;dr: Re-work (for the better) the way Nova's libvirt driver chooses CPU models. Problem: Currently the CPU configuration APIs that Nova's libvirt driver uses — baselineCPU() and compareCPU() — ignore the host hypervisor's (QEMU + KVM) capabilities when determining guest CPU model To solve that, libvirt has introduced two new APIs that are "hypervisor-literate" — baselineHypervisorCPU() and compareHypervisorCPU(). These newer APIs (requires: libvirt-4.0.0; and QEMU-2.9 — both for x86_64) take into account the hypervisor's capabilities, and are therefore much more useful. This addresses several problems (along multiple TODOs item the libvirt driver code (refer to _get_guest_cpu_model_config() and _get_cpu_traits() methods in libvirt/driver.py) Reference: Slide-28 here: https://kashyapc.fedorapeople.org/Effective-Virtual-CPU-Configuration-in-Nova-Berlin2018.pdf Making extra specs less of a forgotten child (stephenfin) spec: https://review.openstack.org/#/c/638734/ Unlike config options, we have no central reference point for flavour extra specs. There are a *lot* of them and I frequently see typos, people setting them wrong etc. We don't? https://docs.openstack.org/nova/latest/user/flavors.html#extra-specs Do you intend to make this exhaustive / comprehensive / exclusive on the first pass (i.e. unknown/unregistered items trigger an error rather than being allowed)? I'd like this to be configurable, though I'm not sure if that's allowed (no to configurable APIs) so maybe warning-only first Glance supports metadata definitions but the definitions look old https://github.com/openstack/glance/blob/master/etc/metadefs/compute-cpu-pinning.json i think the way that the glance metadata refence the type of resouce they refer to (flavor,image,volume,host aggragate) is also tied into how heat references them. the glance metadef my be old but they are used to generate ui and validation logic in horizon too. they are availabe via a glance api endpoint which is how they are consumed by other services. There are also several missing metadefs and documented image properties. https://developer.openstack.org/api-ref/image/v2/metadefs-index.html Do we want to start validating these on the API side (microversion) and before an instance boots? 
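For illustration, a minimal sketch of what json-schema-style value validation for known extra spec keys could look like (the keys and schemas below are hand-written examples for this note, not the actual proposed definitions, and unknown keys are still let through - that is the separate "strict" key-validation step discussed here):

    import jsonschema

    # Illustrative registry only: the real set of keys/schemas would come
    # from the flavor extra spec definitions, not a hard-coded dict.
    EXTRA_SPEC_SCHEMAS = {
        'hw:cpu_policy': {'type': 'string', 'enum': ['dedicated', 'shared']},
        'hw:mem_page_size': {
            'type': 'string',
            'pattern': '^(small|large|any|[0-9]+([KMG]B)?)$',
        },
    }

    def validate_extra_spec(key, value):
        """Validate one extra spec value against its schema, if the key is known."""
        schema = EXTRA_SPEC_SCHEMAS.get(key)
        if schema is None:
            # Unknown key: accepted for now; a strict mode could reject it instead.
            return
        try:
            jsonschema.validate(value, schema)
        except jsonschema.ValidationError as exc:
            raise ValueError('Invalid value %r for %s: %s'
                             % (value, key, exc.message))

    # e.g. validate_extra_spec('hw:cpu_policy', 'dedicatd') raises, catching
    # the fat-finger case at the API instead of at scheduling/boot time.
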
Rough PoC for flavour extra spec definition here: http://paste.openstack.org/show/B9unIL8e2KpeSMBGaINe/ extracting the metadef into an external lib that is importable by several service may be useful Not entirely sure if this is necessary. They change very rarely and it could be just as easy to have a "nova metadef" -> "glance metadef" translation tool. Worth discussing though We have json schema for scheduler hints but still allow undefined out of tree scheduler hints, do something like that? https://github.com/openstack/nova/blob/c7f4190/nova/api/openstack/compute/schemas/servers.py#L93 Seems like trying to build a new metadefs type API for flavor extra specs in nova would take a lot longer than simply doing json schema validation of the syntax for known extra specs (like scheduler hints). (Sundar) +1 to two ideas: (a) Keep a standard list of keys which operators cannot modify and allow operators to add more to the schema (b) Use a new microversion to enforce strict extra specs key checking. #agree do it as part of flavor property set #agree first do value side validation for known keys, then validate keys, allowing admin ability to augment the schema of valid keys/values #agree validate the values without a microversion - doesn't fix the fat finger key issue, but helps with value validation (and doesn't need to all land at once) probably need something later for validating keys (strict mode or whatever) Persistent memory (alex_xu) spec https://review.openstack.org/601596, https://review.openstack.org/622893 patches: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/virtual-persistent-memory NUMA rears its ugly head again #agree can "ignore NUMA" initially #agree Document lifecycle ops that are and aren't supported New virt driver for rsd: For management of a composable infrastructure bp: https://blueprints.launchpad.net/nova/+spec/rsd-virt-for-nova-implementation rsd-virt-for-nova virt driver: https://github.com/openstack/rsd-virt-for-nova spec review: https://review.openstack.org/#/c/648665/ Questions Can this be done with nova + ironic? Why can't the driver live out of tree? What's the 3rd party CI story? Nova governance (stephenfin) In light of the placement split and massively reduced nova contributor base, it would be good to take the time to examine some of the reasons we do what we do and why...on a day that isn't the last day [efried] how about right after the retrospective on Thursday morning? Do we need more than 15 minutes for this? Right after the retro is fine. It's very subjective and I don't expect to make any decisions on the day. More of a sharing session. Ideas cores vs. spec cores This is done: https://review.openstack.org/#/admin/groups/302,members nova cores vs. neutron cores (the subsystem argument) two +2s from same company (for a change from someone in the same company?) (mriedem): I'm still a -1 on this. Multiple problems with this e.g. reviewing with the blinders and pressure to deliver for your downstream product roadmap ("we'll fix it downstream later") and I as a nova maintainer don't want to be responsible for maintaining technical debt rushed in by a single vendor. (kashyap) While I see where yuo're coming from, your comment implies mistrust and that people will intentionally "rush things in". 
As long as a particular change is publicly advertized well-enough, gave sufficient time for others to catch up, all necessary assmuptions are described clearly, and respond in _every_ detail that isn't clear to a community reviewer, then it is absolutely reasonable for reviewers from a comapny to merge a change by a contributor from the same company. This happens _all_ time in other mature open source communities (kernel, QEMU, et al). (adrianc) -1 on that, diversty is more likely to ensure community goals. (adrianc) perhaps a happy middle e.g bugfixes ? (mdbooth): The concern above is completely valid, but I don’t believe it will happen in practise, and in the meantime we’re making it harder on ourselves. I would like to trust our core reviewers, and we can address this if it actually happens. (+1)+1 +1 separate specs repo vs. in-tree specs directory more effort than it's worth (anything else that nova does differently to other OpenStack projects and other large, multi-vendor, open source projects) Compute capabilities traits placement request filter (mriedem) Solution for https://bugs.launchpad.net/nova/+bug/1817927 and other things like booting from a multi-attach volume, we need the scheduler to pick a host with a virt driver that supports those types of requests. Think we agreed in https://etherpad.openstack.org/p/ptg-train-xproj-nova-placement toward the bottom that we're OK with this. Here's some code: https://review.opendev.org/#/c/645316/ (gibi): I'm OK to modify the flavor extra_spec for now. I think we agreed yestarday to allow numbered group without resources in placement as a final solution. It is also OK to me. However we have a comment about storing the unnumbered request group in RequestSpec.requested_resources list. (https://github.com/openstack/nova/blob/master/nova/objects/request_spec.py#L93) I tried to do that to give a place where the capability traits can be stored, but failed: https://review.opendev.org/#/c/647396/5//COMMIT_MSG . Is there any reason to still try to store the unnumbered group in the RequestSpec.requested_resources? How do you like this hack? https://review.opendev.org/#/c/656885/3/nova/scheduler/manager.py (not traits related) [dtroyer][sean-k-mooney] 3rd party CI for NUMA/PCI/SRIOV (mriedem) moved to https://etherpad.openstack.org/p/nova-ptg-train-ci Corner case issues with root volume detach/attach (mriedem/Kevin_Zheng) Go over the stuff that came up late in the Stein cycle: tags and multiattach volumes: http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003376.html https://etherpad.openstack.org/p/detach-attach-root-volume-corner-cases When rebuilding with a new image we reset the stashed image_* system_metadata on the instance and some other fields on the instance based on new image metadata. When attaching a new root volume, the underlying image (and its metadata) could change, so presumably we need to do the same type of updates to the instance record when attaching a new root volume with a new image - agree? The state of nova's documentation (stephenfin) There are numerous issues with our docs Many features aren't documented or are barely documented in nova. 
I've cleaned up some but there's much more to do metadata (the metadata service, config drives, vendordata etc.), console proxy services, man pages, cross_az_attach: https://review.opendev.org/#/c/650456/ Flow diagram for resize like for live migration https://docs.openstack.org/nova/latest/reference/live-migration.html Loads of stuff is out-of-date If you're a mere user of nova, the docs are essentially useless as admin'y stuff is scattered everywhere Other stuff Testing guides down cells: https://review.opendev.org/#/c/650167/ Before there's serious time sunk into this, does anyone really care and should it be a priority? (kashyap) I enjoy improving documentation, so, FWIW, in my "copious free time" I am willing to help chip in with areas that I know a thing or two about. Broader topic: how can we get our respective downstream documentation teams to engage upstream?+∞ (kashyap) A potential first step is to agree on a "system" (and consistently stick to it). E.g. the "Django" project's (IIRC, Stephen even mentioned this in Berlin) documentation model (described here: https://www.divio.com/blog/documentation/) Tutorials — learning oriented How-To guides — problem-oriented Explanation — understanding-oriented Reference — information-oriented (mriedem): I try to push patches to fix busted / incorrect stuff or add missing things when I have the context (I'm looking through our docs for some specific reason). If I don't have the time, I'll report a bug and sometimes those can be marked as low-hanging-fruit for part time contributors to work on those. e.g. https://bugs.launchpad.net/nova/+bug/1820283 TODO(stephenfin): Consider making this a mini cycle goal Tech debt: Removing cells v1 Already in progress \o/ Removing nova-network CERN are moving off this entirely (as discussed on the mailing list). We can kill it now? (melwitt): We have had the go ahead [from CERN] since Stein to remove nova-network entirely. \o/ 🎉 Can we remove the nova-console, nova-consoleauth, nova-xvpxvncproxy service? (mriedem, stephenfin) The nova-console service is xenapi-specific and was deprecated in stein: https://review.openstack.org/#/c/610075/ There are, however, REST APIs for it: https://developer.openstack.org/api-ref/compute/#server-consoles-servers-os-consoles-os-console-auth-tokens (stephenfin): Maybe I misunderstood you, but I thought these APIs could also talk to the DB stuff Mel did? So if we drop the nova-console service, the APIs would no longer work. It seems our options are: Delete the nova-console service and obsolete the REST APIs (410 response on all microversions like what we're doing with nova-cells and nova-network APIs) Deprecate the REST APIs on a new microversion but continue to support older microversions - this means the nova-console service would live forever. Are people still using the nova-console service? Are there alternatives/replacements for xen users? BobBall seemed to suggest there was http://lists.openstack.org/pipermail/openstack-dev/2018-October/135422.html but it's not clear to me. Matt DePorter (Rackspace) said this week that they rely on it - but they are on Queens and not sure if there are alternatives (as Bob suggested in the ML). But they're also on cellsv1, so they have work to do to upgrade anyway, so they might as well move to whatever isn't these things we want to delete. What is that? So upgrade and migrate from xen to kvm? 
#agree: Do it Migrating rootwrap to privsep #agree: Continue getting an MVP since it's not any worse than what we have and mikal is doing the work Bumping the minimum microversion Did this ever progress in ironic? #agree: nope imagebackend/image cache another go? (mdbooth) (mriedem): How could we have real integration testing for this before refactoring it? Not tempest, just a minimal devstack with (for lack of a better word) exercises. Why not tempest? Too heavy? Tempest tests the API, you need low-level testing of the cache to see if it's doing what you expect. Whitebox? I refactored the unit tests to be somewhat functional-y at the time for exactly this reason. This was before functional was a thing. Clean up rest of the needless libvirt driver version constants and compat code(kashyap) Mostly an FYI (as this is a recurring item) WIP: https://review.opendev.org/#/q/topic:Bump_min_libvirt_and_QEMU_for_Stein+(status:open+OR+status:merged) (kashyap) Some more compat code to be cleaned up, it's noted at the end of this (merge) change: https://review.opendev.org/#/c/632507/ Remove mox (takashin) https://review.opendev.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/mox-removal-train Remove explicit eventlet usage (preferably entirely), specifically to allow wsgi to no longer require it. (mdbooth) Removing virt drivers that no longer has third-party CI (cough XenAPI cough) Removing fake libvirt driver (we use that! mdbooth) "we" who? It's basically a hack that's in place only because mocking wasn't done properly. mikal has a series to do the mocking properly and remove that driver. Link: https://review.opendev.org/#/q/topic:fake-cleanup+(status:open+OR+status:merged) Ah, fake libvirt *driver*. The fake libvirt module is used in functional (by the real libvirt driver). Fixing OSC's live migrate interface (mriedem) OSC CLI is not like nova and defaults to 2.1 unless the user overrides on the CLI or uses an environment variable. The "openstack server migrate --live " CLI therefore essentially makes all live migrations by default forced live migrations, bypassing the scheduler, which is very dangerous. Changing the interface is likely going to have to be a breaking change and major version bump, but it needs to happen and has been put off too long. Let's agree on the desired interface, taking into account that you can also specify a host with cold migration now too, using the same CLI (openstack server migrate). See https://review.openstack.org/#/c/627801/ and the referenced changes for attempts at fixing this. (dtroyer) IIRC at least part of this can be done without breaking changes and should move forward. But yeah, its a mess and time to fix it... More details in the Forum session etherpad: https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps See the ML summary: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005783.html Let's plan the next steps of the bandwidth support feature (gibi) [efried: XPROJ? does this need to involve neutron folks? If so, please add to https://etherpad.openstack.org/p/ptg-train-xproj-nova-neutron] (gibi): these items are mostly nova only things but I added a XPROJ item to the Neutron pad about multisegement support. 
Obvious next step is supporting server move operations with bandwidth: https://blueprints.launchpad.net/nova/+spec/support-server-move-operations-with-ports-having-resource-request spec: https://review.opendev.org/#/c/652608 Question: Do we want to add support for server move operations with port having resource request as a new API microversion or as bug fixes? (gibi) background from ML http://lists.openstack.org/pipermail/openstack-discuss/2019-January/001881.html Server delete and detach port works without require a specific microversion Server create works since microversion 2.72 Server move operations rejected since https://review.openstack.org/#/c/630725 #agree : no microversion (will be proposed in the spec, and allow people to object there) Question: Can the live migration support depend on the existence of multiple portbinding or we have to support the old codepath as well when the port binding is created by the nova-compute on the destination host? #agree: Yes, this extension cannot be turned off But there are various smaller and bigger enchancements. Tracked in https://blueprints.launchpad.net/nova/+spec/enhance-support-for-ports-having-resource-request Which one seems the most imporant to focus on in Train? Use placement to figure out which RP fulfills a port resource request (currently it is done by Nova). This requires the Placement bp https://blueprints.launchpad.net/nova/+spec/placement-resource-provider-request-group-mapping-in-allocation-candidates to be implemented first. A consensus is emerging to do the '"mappings" dict next to "allocations"' solution+1+1 Supporting SRIOV ports with resource request requires virt driver support (currently supported by libvirt driver) to include the parent interface name to the descriptor of the PCI device represents VFs. Introduce a new TRAIT based capability to the virt drivers to report if they support SRIOV port with resource request to be able to drive the scheduling of server using such ports. Today only the pci_claim stops a boot if the virt driver does not support the feature and that leads to reschedule. #agree: add a new capablity as a trait Automatically set group_policy if more than one RequestGroup is generated for an allocation_candidate query in nova. This first needs an agreement what is a good default value for such policy. #agree: 'none' seems to be a sensible policy state that default on the nova side, not placement (which wants explicit) # agree: priority order: 1) group_policy, 2) capability trait, 3) port mapping (gibi) The rest is a long shot in Train but I added them for completness: Support attaching a port to a server where the port has resource request. This needs a way to increase the allocation of the running servers. So this requires the in_tree allocation candidate support from placement that was implemented in Stein https://blueprints.launchpad.net/nova/+spec/alloc-candidates-in-tree Also this operation can only be supported if the new, increased allocation still fits to the current compute the server is running on. Support attaching a network to a server where the network has a (default) QoS minimum bandwidth rule. Support creating a server with a network that has a (default) QoS minimum bandwidth rule. This requires to move the port create from the nova-compute to the nova-conductor first. 
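For reference, a rough illustration of the kind of granular allocation_candidates query that ends up being built for a server with two bandwidth-requesting ports - the resource amounts, physnets and vnic types below are invented for this note; the real values come from the flavor and each port's resource_request, and this is not the actual nova code:

    from urllib import parse

    params = [
        # unnumbered group: compute resources from the flavor
        ('resources', 'VCPU:2,MEMORY_MB:4096,DISK_GB:20'),
        # one numbered group per port, so each port's resources are
        # satisfied by a single network resource provider under the compute
        ('resources1', 'NET_BW_EGR_KILOBIT_PER_SEC:1000,NET_BW_IGR_KILOBIT_PER_SEC:1000'),
        ('required1', 'CUSTOM_PHYSNET_PHYSNET0,CUSTOM_VNIC_TYPE_NORMAL'),
        ('resources2', 'NET_BW_EGR_KILOBIT_PER_SEC:2000'),
        ('required2', 'CUSTOM_PHYSNET_PHYSNET1,CUSTOM_VNIC_TYPE_DIRECT'),
        # placement requires group_policy once there is more than one
        # numbered group; per the agreement above nova would default to 'none'
        ('group_policy', 'none'),
    ]

    print('GET /allocation_candidates?' + parse.urlencode(params))

The "mappings" discussion above is about placement reporting which resource provider satisfied each numbered group in the response, so nova no longer has to reverse-engineer that mapping itself.
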
Changing how server create force_hosts/nodes works (mriedem) Spec: https://review.openstack.org/#/c/645458/ (merged) Blueprint https://blueprints.launchpad.net/nova/+spec/add-host-and-hypervisor-hostname-flag-to-create-server approved Code https://review.openstack.org/#/c/645520/ started See discussion in the mailing list: http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003813.html API change option: Add a new parameter (or couple of parameters) to the server create API which would deprecate the weird az:host:node format for forcing a host/node and if used, would run the requested destination through the scheduler filters. This would be like how cold migrate with a target host works today. If users wanted to continue forcing the host and bypass the scheduler, they could still use an older microversion with the az:host:node format. Other options: config option or policy rule Integrating openstacksdk and replacing use of python-*client blueprint: https://blueprints.launchpad.net/nova/+spec/openstacksdk-in-nova code: openstacksdk patch to support ksa-conf-based connection construction: https://review.openstack.org/#/c/643601/ Introduce SDK framework to nova (get_sdk_adapter): https://review.opendev.org/#/c/643664/ WIP use openstacksdk instead of ksa for placement: https://review.opendev.org/#/c/656023/ WIP/PoC start using openstacksdk instead of python-ironicclient: https://review.openstack.org/#/c/642899/ questions: Community and/or project goal? Move from one-conf-per-service to unified conf and/or clouds.yaml How does the operator tell us to do this? Config options for location of clouds.yaml This should be in [DEFAULT] since it'll apply to all services (that support it) Which cloud region (is that the right term?) from clouds.yaml to use specifying this option would take precedence, ignore the ksa opts, and trigger use of clouds.yaml or perhaps a [DEFAULT] use_sdk_for_every_service_that_supports_it_and_use_this_cloud_region Process (blueprints/specs required?) API inconsistency cleanup (gmann) There are multiple API cleanup were found which seems worth to fix. These cleanups are API change so need microversion bump. Instead of increasing microversion separatly for each cleanup, I propose to be fix them under single microversion bump. Current list of cleanup - https://etherpad.openstack.org/p/nova-api-cleanup #. 400 for unknown param for query param and for request body.http://lists.openstack.org/pipermail/openstack-discuss/2019-May/ Consensus is sure do this. #. Remove OS-* prefix from request and response field. Alternative: return both in response, accept either in request Dan and John are -1 on removing the old fields If you're using an SDK it should hide this for you anyway. Consensus in the room is to just not do this. #. Making server representation always consistent among all APIs returning the complete server representation. GET /servers/detail GET /servers/{server_id} PUT /servers/{server_id} POST /servers/{server_id} (rebuild) Consensus in the room is this is fine, it's just more fields in the PUT and rebuild responses. #. Return ``servers`` field always in response of GET /os-hypervisors this was nacked/deferred (i.e. not to be included in same microversion as above) Consensus: do it in the same microversion as the above #. 
Consistent error codes on quota exceeded this was nacked/deferred Spec - https://review.openstack.org/#/c/603969/ Do we want to also lump https://review.opendev.org/#/c/648919/ (change flavors.swap default from '' [string] to 0 [int] in the response) into gmann's spec? It's a relatively small change. +1 https://github.com/openstack/nova/blob/11de108daaab4a70e11f13c8adca3f5926aeb540/nova/api/openstack/compute/views/flavors.py#L55 Consensus is yeah sure let's do this, clients already have to handle the empty string today for older clouds. This just fixes it for newer clouds. Libvirt + block migration + config drive + iso9660 Would like another option besides enabling rsync or ssh across all compute nodes due to security concerns In our specific case we dont use any of the --files option when booting vm's. We would like to be able to just regenerate the config drive contents on the destination side. Instead of copying the existing config drive. This is for live migration, cold migration, or both? (mriedem): Who is "we"? GoDaddy? Secure Boot support for QEMU- and KVM-based Nova instances (kashyap) Blueprint: https://blueprints.launchpad.net/nova/+spec/allow-secure-boot-for-qemu-kvm-guests Spec (needs to be refreshed): https://review.openstack.org/#/c/506720/ (Add UEFI Secure Boot support for QEMU/KVM guests, using OVMF) Use case: Prevent guests from running untrusted code ("malware") at boot time. Refer to the periodic updates I posted in the Nova specification over the last year, as various pieces of work in lower layers got completed Upstream libvirt recently (13 Mar 2019) merged support for auto-selecting guest firmware: https://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=1dd24167b ("news: Document firmware autoselection for QEMU driver") NOTE: With the above libvirt work in place, Nova should have all the pieces ready (OVMF, QEMU, and libvirt) to integrate this. PS: Nova already has Secure Boot support for HyperV in-tree (http://git.openstack.org/cgit/openstack/nova/commit/?id=29dab997b4e) based on this: https://specs.openstack.org/openstack/nova-specs/specs/newton/approved/hyper-v-uefi-secureboot.html Action: Kashyap to write a summary brief to the mailing list John Garbutt will look at the spec Securing privsep (mdbooth) Privsep isn't currently providing any security, and is arguably worse than rootwrap: http://lists.openstack.org/pipermail/openstack-discuss/2019-March/004358.html Support filtering of allocation_candidates by forbidden aggregates (tpatil) Specs: https://review.opendev.org/#/c/609960/ #action tpatil to answer questions on spec Allow compute nodes to use DISK_GB from shared storage RP (tpatil) [efried 20190422 XPROJ placement] Specs: https://review.opendev.org/#/c/650188/ Disabled compute service request filter (mriedem) https://bugs.launchpad.net/nova/+bug/1805984 related bug on using affinity with limits: https://bugs.launchpad.net/nova/+bug/1827628, we could just do a more generic solution for both. Modeling server groups in placement is a longer-term thing that requires some thought. Could pre-filter using in_tree filter in the strict affinity case, but that only works if there are already members of the group on a host (I think). Move affinity to given host in tree anti-affinity, retry a few times to see if you get lucky https://www.youtube.com/watch?v=mBluR6cLxJ8 PoC using forbidden trait and a request filter: https://review.opendev.org/#/c/654596/ Should we even do this? 
It would mean we'd have two source of truth about a disabled compute and they could get out of sync (we could heal in a periodic on the compute but still). Using a trait for this sort of violates the "traits should only be capabilities" thing "capable of hosting instances" seems like a fundamental capability to me https://review.opendev.org/#/c/623558/ attempts to address the reason why CERN limits allocation candidates to a small number (20) (should be fixed regardless) If one of the goals with nova's use of placement is to move as many python scheduler filters into placement filtering-in-sql, this would seem to align with that goal. How to deal with rolling upgrades while there are older computes? Should the API attempt to set the trait if the compute is too old to have the new code (and remove that in U Alternatives: Use a placement aggregate for all disabled computes and filter using negative member_of: https://docs.openstack.org/placement/latest/placement-api-microversion-history.html#support-forbidden-aggregates We'd have to hard-code the aggregate UUID in nova somewhere. This could be hard to debug for an operator since we can't put a meaningful name into a hex UUID. I liked your suggestion: d15ab1ed-dead-dead-dead-000000000000 Update all resource class inventory on the compute node provider and set reserved=total (like the ironic driver does when a node is in maintenance): https://github.com/openstack/nova/blob/fc3890667e4971e3f0f35ac921c2a6c25f72adec/nova/virt/ironic/driver.py#L882 Might be a pain if we have to update all inventory classes. Baremetal providers are a bit easier since they have one custom resource class. What about nested provider inventory? Unable to configure this behavior like a request filter since it's messing with inventory. The compute update_provider_tree code would have to be aware of the disabled status to avoid changing the reserved value. #agree: Create a standard trait like COMPUTE_DISABLED and add required=!COMPUTE_DISABLED to every request TODO: mriedem to push a spec #agree: tssurya to push up the CERN downstream fix for bug 1827628 as the backportable fix for now (affinity problem) TODO: tssurya will push a spec for the feature Clean up orphan instances: https://review.opendev.org/#/c/627765/ (yonglihe, alex_xu) Problem: Even though splited, still long and boring. Need fully review. Last time discussion link: https://etherpad.openstack.org/p/nova-ptg-stein L931 Who could help on libvirt module? (mriedem): I've reviewed the big change before the split, and am still committed to reviewing this, just haven't thought about it lately - just ping me to remind me about reviews (Huawei also needs this). I don't think we really need this as a PTG item. (melwitt): I can also help continue to review. I don't think it's a bug (defect) but the launchpad bug has been changed to Wishlist so I guess that's OK. Will need a release note for it. The change is naturally a bit complex -- I haven't gotten around to reviewing it again lately. (johnthetubaguy) having attempted this manually recently, I want to review this too, if I can (gibi): I feel that this periodic cleanup papers over some bugs in nova results in orphans. Can we fix try to fix the root cause / original bug? TODO: https://review.opendev.org/#/c/556751/ so you can archive everything before a given time (not recent stuff). Might help with the case that you archived while a compute was down so the compute wasn't able to delete the guest on compute while it was still in the DB. 
Question: can this be integrated into the existing _cleanup_running_deleted_instances periodic task with new values for the config option, e.g. reap_with_orphans? Rather than mostly duplicating that entire periodic for orphans. StarlingX Add server sub-resource topology API https://review.opendev.org/#/c/621476/ (yonglihe, alex_xu) Problem: Inernal NUMA information of NOVA kind of too complex for end user, need to expose a clear well defined, understandable information. How we define the infomation is Open: a) Starting from current bp, elimated the fuzzy one, keep the clear one b) Come up with a new set of data, if we have a clear model for all that stuff. discussion link: https://docs.google.com/document/d/1kRRZFq_ha0T9mFDOEzv0PMvXgtnjGm5ii9mSzdqt1VM/edit?usp=sharing (alex) remove the cpu topology from the proposal or just move that out of numa topology? only using the cpupinning info instead of cpuset? hugepage is per Numa node or not? bp: https://blueprints.launchpad.net/nova/+spec/show-server-numa-topology Last time discussion link: https://etherpad.openstack.org/p/nova-ptg-stein L901 Who could help on NUMA module? StarlingX Briefly discuss idea for transferring ownership of nova resources (melwitt) Want to run the idea by the team and get a sanity check or yea/nay From the "Change ownership of resources - followup" session from Monday: https://etherpad.openstack.org/p/DEN-change-ownership-of-resources Idea: build upon the implementation in https://github.com/kk7ds/oschown/tree/master/oschown Each project (nova, cinder, neutron) has its own dir containing the code related to transferring ownership of its resources, to be available as a plugin This way, each project is responsible for providing test coverage, regression testing, upgrade testing (?) of their ownership transfer code. This is meant to address concerns around testing and maintenance of transfer code over time and across releases Then, something (micro service that has DB creds to nova/cinder/neutron or maybe an adjuntant workflow) will load plugins from all the projects and be able to carry out ownership changes based on a call to its REST API AGREE: sounds reasonable, melwitt to talk to tobberydberg and send summary to ML and figure out next steps Reduce RAM & CPU quota usage for shelved servers (mnestratov) - this is actually superseded by https://review.opendev.org/#/c/638073/ Spec https://review.opendev.org/#/c/656806/ there was a bug closed as invalid https://bugs.launchpad.net/nova/+bug/1630454 proposing to create spec StarlingX Reviews RBD: https://review.opendev.org/#/c/640271/, https://review.opendev.org/#/c/642667/ auto-converge spec: https://review.opendev.org/#/c/651681/ vCPU model:spec: https://review.openstack.org/#/c/642030/ NUMA aware live migration This needs fixing first :( #action: Prioritize these for review somehow (runway, gerrit priorities, ...) 
Thursday: 0900-0915: Settle, greet, caffeinate 0915-0945: Retrospective https://etherpad.openstack.org/p/nova-ptg-train-retrospective 0945-1000: Nova governance (stephenfin) 1000-1030: cpu modeling in placement 1030-1100: Persistent memory (alex_xu, rui zang) 1100-1130: Support filtering of allocation_candidates by forbidden aggregates (tpatil) 1115-1145: Corner case issues with root volume detach/attach (mriedem/Kevin_Zheng) 1145-1215: Making extra specs less of a forgotten child (stephenfin) 1215-1230: The state of nova's documentation (stephenfin) 1230-1330: Lunch 1330-1400: Let's plan the next steps of the bandwidth support feature (gibi) 1400-1430: RMD - Resource Management Daemon Part I (dakshina-ilangov/IvensZambrano) 1430-1445: Integrating openstacksdk and replacing use of python-*client (efried, dustinc, mordred) 1445-1500: RMD - Resource Management Daemon Part II (dakshina-ilangov/IvensZambrano) 1500-beer: Placement XPROJ: https://etherpad.openstack.org/p/ptg-train-xproj-nova-placement (ordered as shown in etherpad) Friday: 0900-1000: Cyborg XPROJ (Ballroom 4!): https://etherpad.openstack.org/p/ptg-train-xproj-nova-cyborg 1015-1115: Ironic XPROJ: https://etherpad.openstack.org/p/ptg-train-xproj-nova-ironic 1115-1150: Cinder XPROJ: https://etherpad.openstack.org/p/ptg-train-xproj-nova-cinder in the Cinder room 203 they broadcast via microphones so not so portable. 1150-1200: Team Photo https://ethercalc.openstack.org/3qd1fj5f3tt3 1200-1230: API inconsistency cleanup (gmann) 1230-1330: Lunch 1330-1400: Glance topics - todo: dansmith to summarize the idea in the ML 1400-1515: Neutron XPROJ: https://etherpad.openstack.org/p/ptg-train-xproj-nova-neutron *1430-1440: Placement team picture 1515-1615: Keystone XPROJ: https://etherpad.openstack.org/p/ptg-train-xproj-nova-keystone 1615-1630: Compute capabilities traits placement request filter (mriedem) (aspiers sched) 1630-beer: Disabled compute service request filter (mriedem) (aspiers sched) Saturday: 0900-1000: [dtroyer][sean-k-mooney] 3rd party CI for NUMA/PCI/SRIOV (mriedem) https://etherpad.openstack.org/p/nova-ptg-train-ci 1000-1030: Clean up orphan instances, Add server sub-resource topology API (yonglihe, alex_xu) 1030-1045: Tech debt 1115-1130: Securing privsep (mdbooth) 1100-1115: StarlingX patches 1115-1130: Secure Boot support for QEMU- and KVM-based Nova instances (kashyap sched) 1130-1200: New virt driver for rsd 1200-1230: Train Theme setting https://etherpad.openstack.org/p/nova-train-themes 1230-1330: Lunch 1330-1345: Governance (single-company patch+approval, trusting cores) cont'd 1345-beer: Continue deferred discussions Deferred Governance: two cores from same company Mdbooth proposed words: https://etherpad.openstack.org/p/nova-ptg-train-governance mini-cores: can we trust them to be SMEs and not just shove in new code? (Isn't that the same question we ask for other cores?) From sneha.rai at hpe.com Tue May 14 05:01:36 2019 From: sneha.rai at hpe.com (RAI, SNEHA) Date: Tue, 14 May 2019 05:01:36 +0000 Subject: Help needed to Support Multi-attach feature In-Reply-To: <20190513200312.GA21325@sm-workstation> References: <20190510092600.r27zetl5e3k5ow5v@localhost> <20190513200312.GA21325@sm-workstation> Message-ID: Thanks Sean for your response. Setting virt_type to kvm doesn’t help. n-cpu service is failing to come up. Journalctl logs of n-cpu service: May 14 02:07:05 CSSOSBE04-B09 systemd[1]: Started Devstack devstack at n-cpu.service. 
May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: DEBUG os_vif [-] Loaded VIF plugin class '' with name 'ovs' {{(pid=15989) initialize /usr/local/lib/python2.7/dist-packages/os_vif/__init__.py:46}} May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: DEBUG os_vif [-] Loaded VIF plugin class '' with name 'linux_bridge' {{(pid=15989) initialize /usr/local/lib/python2.7/dist- May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: INFO os_vif [-] Loaded VIF plugins: ovs, linux_bridge May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: WARNING oslo_config.cfg [None req-9dc9d20c-b002-4b34-a123-81612cdc47fc None None] Option "use_neutron" from group "DEFAULT" is deprecated for removal ( May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: nova-network is deprecated, as are any related configuration options. May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ). Its value may be silently ignored in the future. May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: DEBUG oslo_policy.policy [None req-9dc9d20c-b002-4b34-a123-81612cdc47fc None None] The policy file policy.json could not be found. {{(pid=15989) load_rules /usr/local/lib/python2.7/dist- May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: INFO nova.virt.driver [None req-9dc9d20c-b002-4b34-a123-81612cdc47fc None None] Loading compute driver 'libvirt.LibvirtDriver' May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver [None req-9dc9d20c-b002-4b34-a123-81612cdc47fc None None] Unable to load the virtualization driver: ImportError: /usr/lib/x86_64-linux-gnu/libvirt.so.0: version `L May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver Traceback (most recent call last): May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/opt/stack/nova/nova/virt/driver.py", line 1700, in load_compute_driver May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver virtapi) May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/usr/local/lib/python2.7/dist-packages/oslo_utils/importutils.py", line 44, in import_object May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver return import_class(import_str)(*args, **kwargs) May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 346, in __init__ May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver libvirt = importutils.import_module('libvirt') May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/usr/local/lib/python2.7/dist-packages/oslo_utils/importutils.py", line 73, in import_module May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver __import__(import_str) May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/home/stack/.local/lib/python2.7/site-packages/libvirt.py", line 28, in May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver raise lib_e May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver ImportError: /usr/lib/x86_64-linux-gnu/libvirt.so.0: version `LIBVIRT_2.2.0' not found (required by /home/stack/.local/lib/python2.7/site-packages/libvirtmod.so) May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Main process exited, code=exited, status=1/FAILURE May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Unit entered failed state. 
May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Failed with result 'exit-code'. root at CSSOSBE04-B09:/etc# sudo systemctl status devstack at n-cpu.service ● devstack at n-cpu.service - Devstack devstack at n-cpu.service Loaded: loaded (/etc/systemd/system/devstack at n-cpu.service; enabled; vendor preset: enabled) Active: failed (Result: exit-code) since Tue 2019-05-14 02:07:08 IST; 7min ago Process: 15989 ExecStart=/usr/local/bin/nova-compute --config-file /etc/nova/nova-cpu.conf (code=exited, status=1/FAILURE) Main PID: 15989 (code=exited, status=1/FAILURE) May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver libvirt = importutils.import_module('libvirt') May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/usr/local/lib/python2.7/dist-packages/oslo_utils/importutils.py", line 73, in import_module May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver __import__(import_str) May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver File "/home/stack/.local/lib/python2.7/site-packages/libvirt.py", line 28, in May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver raise lib_e May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver ImportError: /usr/lib/x86_64-linux-gnu/libvirt.so.0: version `LIBVIRT_2.2.0' not found (required by /home/stack/.local/lib/python2.7/site-packages/libvirtmod.so) May 14 02:07:08 CSSOSBE04-B09 nova-compute[15989]: ERROR nova.virt.driver May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Main process exited, code=exited, status=1/FAILURE May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Unit entered failed state. May 14 02:07:08 CSSOSBE04-B09 systemd[1]: devstack at n-cpu.service: Failed with result 'exit-code'. Regards, Sneha Rai -----Original Message----- From: Sean McGinnis [mailto:sean.mcginnis at gmx.com] Sent: Tuesday, May 14, 2019 1:33 AM To: RAI, SNEHA Cc: Gorka Eguileor ; openstack-dev at lists.openstack.org Subject: Re: Help needed to Support Multi-attach feature On Fri, May 10, 2019 at 04:51:07PM +0000, RAI, SNEHA wrote: > Thanks Gorka for your response. > > I have changed the version of libvirt and qemu on my host and I am able to move past the previous error mentioned in my last email. > > Current versions of libvirt and qemu: > root at CSSOSBE04-B09:/etc# libvirtd --version libvirtd (libvirt) 1.3.1 > root at CSSOSBE04-B09:/etc# kvm --version QEMU emulator version 2.5.0 > (Debian 1:2.5+dfsg-5ubuntu10.36), Copyright (c) 2003-2008 Fabrice > Bellard > > Also, I made a change in /etc/nova/nova.conf and set virt_type=qemu. Earlier it was set to kvm. > I restarted all nova services post the changes but I can see one nova service was disabled and state was down. > Not sure if it is related or not, but I don't believe you want to change virt_type t0 "qemu". That should stay "kvm". -------------- next part -------------- An HTML attachment was scrubbed... URL: From rajiv.mucheli at gmail.com Tue May 14 05:36:53 2019 From: rajiv.mucheli at gmail.com (rajiv mucheli) Date: Tue, 14 May 2019 11:06:53 +0530 Subject: [Glance] Is Bug 1493122 fixed? Message-ID: Hi, Could you please let me know if https://bugs.launchpad.net/horizon/+bug/1493122 has been fixed? if not, in which release would it be fixed? or is there a workaround? 
I did follow-up on : https://blueprints.launchpad.net/glance/+spec/glance-quota-enhancements Regards, Rajiv -------------- next part -------------- An HTML attachment was scrubbed... URL: From victoria at vmartinezdelacruz.com Tue May 14 13:30:26 2019 From: victoria at vmartinezdelacruz.com (=?UTF-8?Q?Victoria_Mart=C3=ADnez_de_la_Cruz?=) Date: Tue, 14 May 2019 10:30:26 -0300 Subject: [manila] PTG Summary Message-ID: Hi all, Just reaching you out with a brief PTG summary for the Manila team. Thanks to everyone who attended and participated in-person and remote! Here's the PTG notes etherpad: https://etherpad.openstack.org/p/manila-ptg-train *High points* - We had a highly productive PTG, albeit shorter than the previous times we've met as a group. It reminded us of the old Design Summit times, although we weren't doing smaller time slots during the Summit week, but rather had a separate time to catch up. - We had ample time to do cross-project sync up; however, there were unavoidable overlaps. - It was also great to have many engineers travel to both the summit and the PTG. The following is a summary of notes organized with action items and owner/s. *Cross project goals* - Summary: - Identified current cross project goals, being them PDF rendering and IPv6 support - For the first case, we rely on the current documentation we have and on the docs team to guide us on the changes we need to do to have PDF rendering support - For the latter, we have been supporting IPv6 on manila for a while now and we have CI enabled to exercise this configuration. - Action: - Work with asettle to understand what it's need to be done to comply with the PDF rendering goal - Create an IPv6 only job in manila - many of our jobs run with 4+6, where we're currently doing IPv6 data path testing. - Owner: Not yet defined *Auto-snapshotting of manila shares* - Summary: - CERN had a use case that required cloud administrator driven automatic snapshots at pre-configured intervals, with pre-configured retention policies. - The team discussed ways the automation can occur with the existing API. - There are some backend specific extra-specs that can allow offloading the responsibility of periodic snapshots, and pruning of snapshots per retention policy. There is a disadvantage here that these snapshots are invisible to manila; and do not count against the project quotas - They also require knowing backend special sauce. - Snapshotting of multiple shares can be achieved with Share Groups - To make this occur without end-user actions, one would need to build tooling with these APIs - Maybe Mistral can be harnessed to do workflow scheduling and automation per business logic. - Action: CERN will explore these options and revert with their thoughts - Owner/s: CERN/Jose Castro Leon/josecastroleon *Manila in k8s* - Summary: - Robert Vasek/gman0 has an initial implementation of the CSI driver on the cloud-provider-openstack repo. 
We're seeking reviewers to merge it - https://github.com/kubernetes/cloud-provider-openstack/pull/536 - Next set of TODOs: - Robert's application to GSOC is pending, if he gets sponsorship, he might lead some of the following efforts: - e2e testing in the OpenLab CI for the manila-csi-provisioner - Testing with an NFS backend - AZ awareness / Topology based scheduling support - Snapshot support - Volume expansion - NoAuth mode for using the provisioner in single-tenant deployments - Manila differentiates abilities to take snapshots of shares, cloning these snapshots into new shares, mounting snapshots (with access control) and reverting shares to snapshots in place. CephFS supports snapshots, however, does not support rapid cloning - this feature gap exists for OpenStack use case as well. - Plans to build snapshot cloning capability into Manila and generically within the CSI provisioner were discussed. We'll evolve this design in the coming months. - Action Items: - Continue on plan to build and support manila-csi-provisioner (gman0, Tomas Smatena/tsmatena, Goutham Ravi/gouthamr, Tom Barron/tbarron, Victoria Martinez de la Cruz/vkmc, manila community) - Gather user feedback for the usecase for CEPH snapshots (josecastroleon and Xing Yang/xyang - Ask CERN and Huawei if they care about a slower "create-share-from-snapshot" over not having the ability at all. - Owners: as identified above ^ *OpenStack Client and OpenStack SDK* - Summary: - Manila lags in supporting OSC. Work began late in the Stein cycle to bridge this gap: - Specification: https://review.opendev.org/#/c/644218/ - Initial Code: https://review.opendev.org/#/c/642222/ - We have an Outreachy Intern starting with the manila team (Soledad Kuczala/s0ru) - Sofia Enriquez/enriquetaso and vkmc will be mentoring the intern - CERN may offer cycles to cover some of the development work - Amit Oren/amito submitted changes to add manila support to the openstacksdk: - *https://review.opendev.org/#/c/638782/* - Team agreed to use the existing manilaclient SDK (in python-manilaclient) for the OSC instead of using the openstacksdk right away until we achieve feature parity in the openstacksdk project - Action Items: - Start Working Group to plan/coordinate OSC implementation - Identify minimum set of commands to implement and extended set to achieve parity with existing client - Allow support for using "openstack share" and "manila" to invoke manila CLI commands - the latter is essential to support manila running standalone. - Deprecate the existing manila clients in favor of the newer OpenStack CLI (and corresponding manila bash shell equivalent) - Review the openstacksdk change and identify feature gaps - Owner/s: s0ru, enriquetaso, vkmc, amito, gouthamr, josecastroleon *SemVer in manila releases* - Summary: This was a recap of how we are versioning the manila deliverables as we release them - "manila" gets a major version bump every release - Client releases indicate backwards compatibility by not bumping up the major release version. 
Minor version (and micro-version) bumps are done per policy based off of what changes are being released - The general guideline to follow is that of pbr versioning notes here: https://docs.openstack.org/pbr/latest/user/semver.html - Defer to the judgement of the OpenStack release team when in doubt - Action items: None - Owner/s: None *Create share from snapshots in another pool or back end* - Summary: - This specification has new owners who explained how they plan to implement this feature - The main concerns for the proposed spec ( https://review.opendev.org/#/c/609537/) were around the user experience and administrator control of the scheduling - Existing scheduling capabilities will be used to determine when a snapshot has to be cloned into a new pool rather than its existing pool. - Users can influence the scheduler's decision by specifying the availability zone or a different share type - Implementation will be two-phased - a driver optimized approach followed by a generic approach - Action Items: - Re-target specification to Train, assign to new owners - Implement driver-driven snapshot cloning to different pool - Owner/s: Douglas Viroel/dviroel, Lucio Seki/lseki, Erlon Cruz/erlon *Security Service password* - Summary: - Manila stores the security service password as plaintext in its database. It also allows all users within the tenant to retrieve the password via the security service API. - https://bugs.launchpad.net/manila/+bug/1817316 - We discussed adding barbican and castellan support to manila and making it a hard dependency by including their clients into the requirements.txt - Appropriate key management software will be a soft dependency, and there will be a recommendation about which key management backend can be used - a soft dependency to manila. - Action Items: - Disallow retrieving password from the security services API - Investigate use of Barbican/Castellan to store security services API password - change representation in the database - Owner/s: Not yet identified *Edge Requirements and active/active* - Summary: - Active-active - Manila share manager service running active/active - this has been a request from a long time, from traditional data center workloads as well as for newer/Edge workloads - Ganesha active/active - this allows datapath high availability for the CephFS-NFS driver - Edge - The goal of edge is to distribute services better, with latency considerations. We rely in deployment tools such as TripleO to do this work. This implementation has started already in some other projects (e.g. Cinder). TripleO supports and edge topology already. It's mainly hyperconverged OpenStack at the edge. - Initial users are telco customers running NFVs, 5G - Other services have been configured to be deployed on the edge - Cinder started on this effort last cycle - Cinder - qualify at least one driver to work in an active-active volume service - Is tooz supported for cinder-volume at the edge? - What tooz backend is being deployed? 
-> etcd deployed at each edge - Action items: - Introduce a multi-node job in the manila repository to test active-active deployment in the gate - Audit the usage of oslo_concurrency based file locks in the share manager service, replace these with tooz based locks - Design a leader election based workflow for asynchronous/time based tasks occuring within the share manager - Owner/s: gouthamr *OpenStack technical vision* - Summary: - This was a discussion of the TC OpenStack Vision Document: https://governance.openstack.org/tc/reference/technical-vision.html - Tom submitted a proposal that is for review in https://review.openstack.org/#/c/636770/. Based on this we exchanged some ideas regarding the accuracy of this document. - Action items: Reviewers are needed to check this proposal. - Owner/s: tbarron *API Improvements: Pagination and Filtering* - Summary: In the last couple of cycles several bugs were filed wrt pagination and filtering for our API. Precisely, - Can't filter/list resources that have no key=value metadata/extra-specs (https://bugs.launchpad.net/manila/+bug/1782847) - Pagination does not speed up list queries ( https://bugs.launchpad.net/manila/+bug/1795463) - next share_links contains not supported marker field ( https://bugs.launchpad.net/manila/+bug/1819167) - Action items: Look for volunteers wishing to work on the reported bugs. Bug/1795463 is being followed up by carloss. - Owner/s: Each bug has its asignee *CI Improvements: Rationalizing our jobs* - Summary: This discussion focused on the current status of our CI. It's fairly complex and we noticed that we need to simplify it to focus our resources better. Main highlights are: - We test 8 FOSS/first party drivers (Generic DHSS=True, Generic DHSS=False, Container, LVM, ZFSOnLinux, CEPHFS-Native, CEPHFS-NFS, ~~~Dummy~~~) - We test two databases - mysql, postgresql - We run API and scenario tests (together in some cases, individually as "manila-tempest-dsvm-scenario" (Generic driver DHSS=True) - We run python2 and python3 jobs - We run same jobs in manila-tempest-plugin and manila repos - Action items: - Generic driver with DHSS=False will be removed; - the job for Generic driver with DHSS=True can be combined with dsvm-scenario. - Unify tempest api and scenario tests for all drivers, even if that means we must bump timeouts. - Owner/s: gouthamr *Generic Driver improvements* - Summary: - There are lot of issues (even for usage at the gate) wrt the Generic Driver and nobody has enough bandwitch to work on all of them. Some of the known issues are: - It has a single point of failure, since there is no HA for the share servers created by manila - Attach / re-attach issues - timeouts - We need to figure out is there are people using the generic driver, understand better if we need to keep it or if it can be substituted for a different driver. Ben Swartzlander had a "NextGen" generic driver that is stuck in review because there's no migration path to move shares between the two different implementations of the generic driver: - https://review.opendev.org/#/c/511038/ - There are some production users of the Generic Driver that seem to have solved the SPOF (Single point of failure) issues in private forks of manila. If they are interested in sharing the love upstream, the manila community will be very welcoming. - Action items: - Continue to look for alternatives to the generic driver for CI. - Discuss in following upstream Manila meetings. 
- Owner/s: None *Bugs* - Summary: - Force flag on the snapshot creation API ( https://bugs.launchpad.net/manila/+bug/1811336) Considered to be a doc bug. Need to clarify this in the bug report. Marked as a "low-hanging-fruit" for new volunteers. - Limit share size (https://bugs.launchpad.net/manila/+bug/1811943). The team is interested in this one, we replied to the reporter during the PTG. We need to keep the dicussion going and look for a volunteer to work on this. - Incompatible protocol/access types in the API ( https://bugs.launchpad.net/manila/+bug/1637542) - Action items: None in particular, triaged during the PTG - Owner/s: None Cheers, Victoria -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Tue May 14 18:27:11 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 14 May 2019 13:27:11 -0500 Subject: [all[tc][ptls] Success bot lives on! In-Reply-To: <20190514171411.dx5bv7a6epqlnqsz@yuggoth.org> References: <20190514171411.dx5bv7a6epqlnqsz@yuggoth.org> Message-ID: On 5/14/2019 12:14 PM, Jeremy Stanley wrote: > On 2019-05-14 15:59:22 +0000 (+0000), Alexandra Settle wrote: > [...] >> For those who don't know or remember what success bot is, it is a >> success IRC bot (*dramatic gasp*) that makes it simple to record >> "little moments of joy and progress" and share them. > [...] > > A subsequent addition also created a thanksbot mechanism, as > described in this SU article from last year: > > https://superuser.openstack.org/articles/thank_bot/ > > It's similarly under-utilized, but serves as a reminder that I > should be thanking people for their contributions with greater > frequency than I do. Jeremy, Thank you for the Thanks Bot reminder.  I had used it in the past and then totally forgot about it.  Think if we can start using these more regularly again and then share the notes regularly it would be nice. Jay From kennelson11 at gmail.com Tue May 14 18:39:29 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 14 May 2019 11:39:29 -0700 Subject: [release][ptl] Cycle Highlights Schedule Changes Message-ID: Hello! I wanted to bring your attention to the shift in the schedule for the collection of cycle highlights[1]. It used to be that collection started around m3 and concluded at RC1. However, not having them until then makes it difficult to get them to the marketing machine in time to process/ turn into a press release to celebrate our hard work. Now, with the changes, between m2 and m3 I will send out a reminder to get liaisons + PTLs to start thinking about them. From then on you are welcome to start adding them to your deliverable files (I suppose you could add them sooner if you want to be extra on top of things). But, more importantly.. *The new final deadline for cycle highlights will now be feature freeze[2] the week of R-5. * The process for submitting them remains the same[3]. If you have any questions please let me know or join us in the #openstack-release channel. Look forward to seeing what everyone has accomplished during Train :) Thanks! -Kendall (diablo_rojo) [1] https://releases.openstack.org/reference/process.html#between-milestone-2-and-milestone-3 [2] https://releases.openstack.org/train/schedule.html [3] https://docs.openstack.org/project-team-guide/release-management.html#cycle-highlights -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kennelson11 at gmail.com Tue May 14 18:57:25 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 14 May 2019 11:57:25 -0700 Subject: [PTL][SIG][WG] PTG Team Photos In-Reply-To: References: <981673d8-b095-4c30-a651-577d1c5286d3@www.fastmail.com> <30c8cbb5-b11b-be98-339d-ef6c5e35305b@gmail.com> <79bdd7d5-474e-3033-07ce-e9e40a527424@gmail.com> Message-ID: I was pretty sure it was for Designate, but I didn't want to make assumptions :) I'll get the folders added in a little while :) -Kendall (diablo_rojo) On Tue, May 14, 2019 at 10:57 AM Erik Olof Gunnar Andersson < eandersson at blizzard.com> wrote: > I heard that other guy is pretty great. :p The two man pictures are indeed > of me and mugsie for the Designate project. > > > The two last pictures with 3 people (me, Colin Gibbons and Spyros > Trigazis) is for the Magnum project. > > > > Best Regards, Erik Olof Gunnar Andersson > > > > *From:* Jay Bryant > *Sent:* Tuesday, May 14, 2019 10:48 AM > *To:* openstack-discuss at lists.openstack.org > *Subject:* Re: [PTL][SIG][WG] PTG Team Photos > > > > 2019-05-03 12.23.27 to 12.24.08 should be the Designate team. :-) > > Mugsie (Graham Hayes) is the PTL for that group and it is him and one > other guy. So I am pretty sure that is what it is. :-) > > Thanks for getting the pictures posted. They look good! > > Jay > > > > On 5/14/2019 12:36 PM, Kendall Nelson wrote: > > Hello! > > > > I have just about all the photos sorted into separate project team > folders[1]. There were a few groups that were not signed up and came for > photos anyway which is great, but being that I wasn't taking photos I don't > know exactly what teams they were. These teams that I was unable to place > just exist outside the team directories. If anyone wants to speak up and > help solve the mystery I would very much appreciate it! > > > > Enjoy! > > > > -Kendall (diablo_rojo) > > > > [1] > https://www.dropbox.com/sh/fydqjehy9h5y728/AAAEP6h_uK_6r1a9oh3aAF6Qa?dl=0 > > > > On Mon, May 13, 2019 at 4:00 PM Kendall Nelson > wrote: > > Sorting through them today, should have a link for everyone tomorrow. > > > > -Kendall > > > > On Fri, May 10, 2019 at 11:01 AM Jay Bryant wrote: > > Colleen, > > I haven't seen them made available anywhere yet so I don't think you > missed an e-mail. > > Jay > > On 5/10/2019 12:48 PM, Colleen Murphy wrote: > > On Thu, Mar 28, 2019, at 17:03, Kendall Nelson wrote: > >> Hello! > >> > >> If your team is attending the PTG and is interested in having a team > >> photo taken, here is the signup[1]! There are slots Thursday and Friday > >> from 10:00 AM to 4:30 PM. > >> > >> The location is TBD but will likely be close to where registration will > >> be. I'll send an email out the day before with a reminder of your time > >> slot and an exact location. > >> > >> -Kendall (diablo_rojo) > >> > >> [1] > https://docs.google.com/spreadsheets/d/1DgsRHVWW2YLv7ewfX0M21zWJRf4wUfPG4ff2V5XtaMg/edit?usp=sharing > >> > > Are the photos available somewhere now? I'm wondering if I missed an > email. > > > > Colleen > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue May 14 19:46:50 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 14 May 2019 19:46:50 +0000 Subject: [oslo][all] Ending courtesy pings In-Reply-To: References: Message-ID: <20190514194650.cuzrwon3aquhrfq4@yuggoth.org> On 2019-05-14 11:58:03 -0500 (-0500), Ben Nemec wrote: [...] 
> The recommendation was for interested parties to set up custom > highlights on the "#startmeeting oslo" (or whichever meeting) > command. [...] Cross-sections of our community have observed similar success with "group highlight" strings (infra-root, tc-members, zuul-maint and so on) where the folks who want to get notified as a group can opt to add these custom strings to their client configurations. > people didn't know how to configure their IRC client to do this. For those using WeeChat, the invocation could be something like this in your core buffer: /set weechat.look.highlight_regex #startmeeting (oslo|tripleo) /save Or you could similarly set the corresponding line in the [look] section of your ~/.weechat/weechat.conf file and then /reload it: highlight_regex = "#startmeeting (oslo|tripleo)" Extend the (Python flavored) regex however makes sense. https://www.weechat.org/files/doc/stable/weechat_user.en.html#option_weechat.look.highlight_regex > Once you do configure it, there's a testing problem in that you > don't get notified of your own messages, so you basically have to > wait for the next meeting and hope you got it right. Or pull > someone into a private channel and have them send a startmeeting > command, which is a hassle. It isn't terribly complicated, but if > it isn't tested then it's assumed broken. :-) Or temporarily add one for a meeting you know is about to happen on some channel to make sure you have the correct configuration option and formatting, at least. > The other concern was that this process would have to be done any > time someone changes IRC clients, whereas the ping list was a > central thing that always applies no matter where you're > connecting from. [...] I may be an atypical IRC user, but this is far from the most complicated part of my configuration which would need to be migrated to a new client. Then again, I've only changed IRC clients roughly every 8 years (so I guess I'm due to move to a 5th one next year). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jungleboyj at gmail.com Tue May 14 19:56:25 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 14 May 2019 14:56:25 -0500 Subject: [cinder] PTG and Forum Summary Message-ID: <94b8dece-7f4c-5160-fd87-045ae09b2b0f@gmail.com> All, I have now gone through all the notes and recordings from the OpenInfra Summit Forum sessions as well as the OpenStack PTG meetings.  You can see the summary of discussions and action items here [1] with links to the detailed etherpads and recordings organized chronologically. I have also gotten the Cinder Project Update [2] and Cinder On-boarding [3] session slides posted. All-in-all the PTG meetings and forum sessions were well attended and we had many productive discussions.  As indicated in my project update presentation, I feel the health of the Cinder community continues to be good.  We still have a diverse population (different geographies, different companies, different technologies) contributing to Cinder.  We also are seeing a consistent stream of user experience and stability improvements being proposed and merged.  The attendance at the PTG and Summit sessions supported the health status of our community. Thank you to everyone who was able to attend our sessions both in person and remotely!  
Make sure to mark your calendars for the planned mid-cycle meeting at the Lenovo site in RTP, August 21st - 23rd, 2019.  Hope to see many of you there! Jay (JungleboyJ) [1] https://wiki.openstack.org/wiki/CinderTrainSummitandPTGSummary [2] https://www.slideshare.net/JayBryant2/cinder-project-update-denver-summit-2019 [3] https://www.slideshare.net/JayBryant2/cinder-project-onboarding-openinfra-summit-denver-2019 From rosmaita.fossdev at gmail.com Tue May 14 20:19:01 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 14 May 2019 16:19:01 -0400 Subject: [Glance] Is Bug 1493122 fixed? In-Reply-To: References: Message-ID: <8c166d54-e20c-40e7-00b8-95b687647fa8@gmail.com> On 5/14/19 1:36 AM, rajiv mucheli wrote: > Hi, > > Could you please let me know > if https://bugs.launchpad.net/horizon/+bug/1493122 has been fixed? if > not, in which release would it be fixed? or is there a workaround? Bug #1493122, "There is no quota check for instance snapshot", is not fixed. Glance has something better, namely, user_storage_quota, which can be used to set an upper limit on the cumulative storage consumed by all images of a tenant across all stores (introduced in Havana). I think it's better because most likely users are charged by the amount of image storage they consume, not on a per-image basis. (Otherwise, all the Arch Linux users would be subsidizing the Windows users.) So the workaround is to set user_storage_quota in the Glance api-conf (the default value is 0 (unlimited)). That being said, it's really a hard limit, not a quota, because the same value applies to *all* projects. There's currently no support in Glance for applying different values to different projects. At the Denver PTG, people expressed interested in seeing the Keystone unified limits feature used to get real quotas into Glance. Someone even mentioned that they might have bandwidth to work on this during the Train cycle. As you may have noticed from Abhishek's PTG Summary and Train milestones email [1], the currently small Glance team has their hands full with other items for Train, so quotas won't happen in Train unless someone other than the usual suspects steps up. So if you're interested in helping out, or can direct some development resources to this effort, please put an item on the agenda for the weekly Glance meeting [0]. Meetings are 14:00 UTC Thursdays. cheers, brian [0] https://etherpad.openstack.org/p/glance-team-meeting-agenda [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006110.html > I did follow-up on : > > https://blueprints.launchpad.net/glance/+spec/glance-quota-enhancements > > Regards, > Rajiv From stig.openstack at telfer.org Tue May 14 20:24:57 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Tue, 14 May 2019 21:24:57 +0100 Subject: [scientific-sig] IRC Meeting - summit roundup, cycle planning Message-ID: <32435687-41DD-445D-BC6C-348BD2DC2855@telfer.org> Hi All - We have a Scientific SIG meeting at 2100 UTC (about 40 minutes time) in channel #openstack-meeting. Everyone is welcome. Today’s agenda is available here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_May_14th_2019 We’d like to recap the summit activities and discuss some new plans for the Train cycle. 
Cheers, Stig From mriedemos at gmail.com Tue May 14 20:34:10 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 14 May 2019 15:34:10 -0500 Subject: [watcher] Compute CDM builder issues (mostly perf related) Message-ID: Hi all, I was looking over the NovaClusterDataModelCollector code today and trying to learn more about how watcher builds the nova CDM (and when) and got digging into this change from Stein [1] where I noted what appear to be several issues. I'd like to enumerate a few of those issues here and then figure out how to proceed. 1. In general, a lot of this code for building the compute node model is based on at least using the 2.53 microversion (Pike) in nova where the hypervisor.id is a UUID - this is actually necessary for a multi-cell environment like CERN. The nova_client.api_version config option already defaults to 2.56 which was in Queens. I'm not sure what the compatibility matrix looks like for Watcher, but would it be possible for us to say that Watcher requires nova at least at Queens level API (so nova_client.api_version >= 2.60), add a release note and a "watcher-status upgrade check" if necessary. This might make things a bit cleaner in the nova CDM code to know we can rely on a given minimum version. 2. I had a question about when the nova CDM gets built now [2]. It looks like the nova CDM only gets built when there is an audit? But I thought the CDM was supposed to get built on start of the decision-engine service and then refreshed every hour (by default) on a periodic task or as notifications are processed that change the model. Does this mean the nova CDM is rebuilt fresh whenever there is an audit even if the audit is not scoped? If so, isn't that potentially inefficient (and an unnecessary load on the compute API every time an audit runs?). 3. The host_aggregates and availability_zone compute audit scopes don't appear to be documented in the docs or the API reference, just the spec [3]. Should I open a docs bug about what are the supported audit scopes and how they work (it looks like the host_aggregates scope works for aggregate ids or names and availability_zone scope works for AZ names). 4. There are a couple of issues with how the unscoped compute nodes are retrieved from nova [4]. a) With microversion 2.33 there is a server-side configurable limit applied when listing hypervisors (defaults to 1000). In a large cloud this could be a problem since the watch client-side code is not paging. b) The code is listing hypervisors with details, but then throwing away those details to just get the hypervisor_hostname, then iterating over each of those node names and getting the details per hypervisor again. I see why this is done because of the scope vs unscoped cases, but we could still optimize this I think (we might need some changes to python-novaclient for this though, which should be easy enough to add). 5. For each server on a node, we get the details of the server in separate API calls to nova [5]. Why can't we just do a GET /servers/detail and filter on "host" or "node" so it's a single API call to nova per hypervisor? I'm happy to work on any of this but if there are any reasons things need to be done this way please let me know before I get started. Also, how would the core team like these kinds of improvements tracked? With bugs? 
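For items 4 and 5 above, the consolidation being suggested looks roughly like the sketch below: page through the hypervisor list, then issue a single detailed server listing per hypervisor instead of one GET per server. This is only an illustration against python-novaclient's generic list interfaces, not a Watcher patch; the endpoint and credentials are placeholders and the paging keyword arguments should be double-checked against the client version in use:

    # Rough sketch of the consolidation suggested in items 4/5 (not Watcher
    # code). Endpoint/credentials are placeholders; verify the paging kwargs
    # against the installed python-novaclient version.
    from keystoneauth1 import loading, session
    from novaclient import client as nova_client

    loader = loading.get_plugin_loader('password')
    auth = loader.load_from_options(
        auth_url='http://controller:5000/v3',  # hypothetical endpoint
        username='watcher', password='secret', project_name='service',
        user_domain_name='Default', project_domain_name='Default')
    sess = session.Session(auth=auth)
    nova = nova_client.Client('2.56', session=sess)

    # 4a) page through hypervisors instead of relying on the first 1000 results
    hypervisors = []
    page = nova.hypervisors.list()  # detailed by default
    while page:
        hypervisors.extend(page)
        page = nova.hypervisors.list(marker=page[-1].id)

    # 5) one detailed server listing per hypervisor rather than one GET per server
    for hyp in hypervisors:
        servers = nova.servers.list(
            detailed=True,
            search_opts={'host': hyp.service['host'], 'all_tenants': True})
        for server in servers:
            pass  # feed the compute data model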
[1] https://review.opendev.org/#/c/640585/ [2] https://review.opendev.org/#/c/640585/10/watcher/decision_engine/model/collector/nova.py at 181 [3] https://specs.openstack.org/openstack/watcher-specs/specs/stein/implemented/scope-for-watcher-datamodel.html [4] https://review.opendev.org/#/c/640585/10/watcher/decision_engine/model/collector/nova.py at 257 [5] https://review.opendev.org/#/c/640585/10/watcher/decision_engine/model/collector/nova.py at 399 -- Thanks, Matt From najoy at cisco.com Tue May 14 21:16:52 2019 From: najoy at cisco.com (Naveen Joy (najoy)) Date: Tue, 14 May 2019 21:16:52 +0000 Subject: networking-vpp 19.04 is now available Message-ID: Hello All, We'd like to invite you all to try out networking-vpp 19.04. As many of you may already know, VPP is a fast user space forwarder based on the DPDK toolkit. VPP uses vector packet processing algorithms to minimize the CPU time spent on each packet to maximize throughput. Networking-vpp is a ML2 mechanism driver that controls VPP on your control and compute hosts to provide fast L2 forwarding under Neutron. This latest version is updated to work with VPP 19.04. In the 19.04 release, we've worked on making the below updates: - We've built an automated test pipeline using Tempest. We've identified and fixed bugs discovered during our integration test runs. We are currently investigating a bug, which causes a race condition in the agent. We hope to have a fix for this issue soon. - We've made it possible to overwrite the VPP repo path. VPP repo paths are constructed, pointing to upstream repos, based on OS and version requested. Now you can allow this to be redirected elsewhere. - We've updated the mac-ip permit list to allow the link-local IPv6 address prefix for neighbor discovery to enable seamless IPv6 networking. - We've worked on additional fixes for Python3 compatibility and enabled py3 tests in gerrit gating. - We've updated the ACL calls in vpp.py to tidy-up the arguments. We've worked on reordering vpp.py to group related functions, which is going to be helpful for further refactoring work in the future. - We've been doing the usual round of bug fixes and updates - the code will work with both VPP 19.01 and 19.04 and has been updated to keep up with Neutron Rocky and Stein. The README [1] explains how you can try out VPP using devstack: the devstack plugin will deploy the mechanism driver and VPP 19.04 and should give you a working system with a minimum of hassle. We will be continuing our development for VPP's 19.08 release. We welcome anyone who would like to come help us. -- Naveen & Ian [1] https://opendev.org/x/networking-vpp/src/branch/master/README.rst -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kennelson11 at gmail.com Tue May 14 21:24:31 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 14 May 2019 14:24:31 -0700 Subject: [First Contact] [SIG] Summit/Forum + PTG Summary In-Reply-To: <20190514150657.hshqfcjsa35t57yb@pacific.linksys.moosehall> References: <20190514150657.hshqfcjsa35t57yb@pacific.linksys.moosehall> Message-ID: Thanks for starting a list Adam :) On Tue, May 14, 2019 at 8:07 AM Adam Spiers wrote: > Kendall Nelson wrote: > >Forum Session (Welcoming New Contributors State of the Union and > >Deduplication of Efforts) > >----------------------------------------------------------------------------------------------------------------------------- > > > > >The biggest things that came out of this session were discussion about > >recording of onboarding sessions and a community goal of improving > >contributor documentation. > > > >Basically, we have never had the onboarding sessions recorded but if we > >could tt would really help new contributors even if they might get a > little > >stale before we are able to record new ones. > > +1: slightly out of date info is still usually better than none. > > This other mail thread in the last hour jogged my memory on some of the > other > details we discussed in this session: > > http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006224.html > > >During that chat, we learned > >that Octavia does somewhat regular calls in whch they do onboarding for > new > >contributors. I have asked for an outline to help encourage other > projects > >to do similar. > > > >As for per project contributor documentation, some projects have it and > >some don't. Some projects have it and its incomplete. bauzas volunteered > >to do an audit of which projects have it and which don't and to propose a > >community goal for it. As a part of that, we should probably decide on a > >list of bare minimum things to include. > > Few things off the top of my head: > > - Architectural overview > > - Quickstart for getting the code running in the simplest form > (even if this is just "use devstack with these parameters") > > - Overview of all the project's git repos, and the layout of the files > in each > > - How to run the various types of tests > > - How to find some easy dev tasks to get started with > I would also add what task trackers they use and the tags they use as an extension of your last bullet point. Also might include info about if they use specs or bps or neither for new features. -Kendall (diablo_rojo) -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Tue May 14 21:59:39 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 14 May 2019 16:59:39 -0500 Subject: Summit video website shenanigans Message-ID: <21ce1f4d-2e19-589f-3bce-44f411a22e67@gmail.com> On Sunday I was able to pull up several of the summit videos on youtube via the summit phone app, e.g. [1] but when trying to view the same videos from the summit video website I'm getting errors [2]. Is there just something still in progress with linking the videos on the https://www.openstack.org/videos/ site? Note that in one case 15 minutes of a talk was chopped (40 minute talk chopped to 25 minutes) [3]. I reckon I should take that up directly with the speakersupport at openstack.org team though? 
[1] https://www.youtube.com/watch?v=YdSVY4517sE [2] https://www.openstack.org/videos/summits/denver-2019/the-vision-for-openstack-clouds-explained [3] https://www.youtube.com/watch?v=OyNFIOSGjac -- Thanks, Matt From kennelson11 at gmail.com Tue May 14 22:42:46 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 14 May 2019 15:42:46 -0700 Subject: Summit video website shenanigans In-Reply-To: <21ce1f4d-2e19-589f-3bce-44f411a22e67@gmail.com> References: <21ce1f4d-2e19-589f-3bce-44f411a22e67@gmail.com> Message-ID: I let various Foundation staff know- Jimmy and others- and supposedly its been fixed now already? They are doing some more research to see if there are other videos facing the same issues, but it should all be fine now? -Kendall (diablo_rojo) On Tue, May 14, 2019 at 3:00 PM Matt Riedemann wrote: > On Sunday I was able to pull up several of the summit videos on youtube > via the summit phone app, e.g. [1] but when trying to view the same > videos from the summit video website I'm getting errors [2]. > > Is there just something still in progress with linking the videos on the > https://www.openstack.org/videos/ site? > > Note that in one case 15 minutes of a talk was chopped (40 minute talk > chopped to 25 minutes) [3]. I reckon I should take that up directly with > the speakersupport at openstack.org team though? > > [1] https://www.youtube.com/watch?v=YdSVY4517sE > [2] > > https://www.openstack.org/videos/summits/denver-2019/the-vision-for-openstack-clouds-explained > [3] https://www.youtube.com/watch?v=OyNFIOSGjac > > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Tue May 14 22:58:17 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 14 May 2019 17:58:17 -0500 Subject: [watcher] Compute CDM builder issues (mostly perf related) In-Reply-To: References: Message-ID: <9c229be7-52a6-8055-9261-eaef73318394@gmail.com> On 5/14/2019 3:34 PM, Matt Riedemann wrote: > 2. I had a question about when the nova CDM gets built now [2]. It looks > like the nova CDM only gets built when there is an audit? But I thought > the CDM was supposed to get built on start of the decision-engine > service and then refreshed every hour (by default) on a periodic task or > as notifications are processed that change the model. Does this mean the > nova CDM is rebuilt fresh whenever there is an audit even if the audit > is not scoped? If so, isn't that potentially inefficient (and an > unnecessary load on the compute API every time an audit runs?). Also, it looks like https://bugs.launchpad.net/watcher/+bug/1828582 is due to a regression caused by that change. The problem is a nova notification is received before the nova CDM is built which results in an AttributeError traceback in the decision-engine logs. Should we be building the nova CDM if nova is sending notifications and there is no model yet? Or should we just handle the case that the nova CDM hasn't been built yet when we start getting notifications (and before an audit builds the CDM)? -- Thanks, Matt From kennelson11 at gmail.com Tue May 14 23:05:39 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Tue, 14 May 2019 16:05:39 -0700 Subject: [all] [TC] [elections] Proposed Dates Message-ID: Hello Everyone! So, basically if we follow the usual formulas for when to hold the TC and PTL elections, they overlap a bit but not nicely. You can see what the dates would be if we exactly followed the usual timeline in this etherpad[1]. 
The proposal from the election officials is that we move the PTL nominations 1 week early and that gives them 1-week idle time in which we *could* do campaigning for PTLs the same way we do campaigning for TC seats. We definitely don't have to do campaigning, it could just be a week of idle time while TC campaigning happens. Setting Combined Election Summit is at: 2019-11-04 Release is at: 2019-10-14 Latest possible completion is at: 2019-09-23 Moving back to Tuesday: 2019-09-17 TC Election from 2019-09-10T23:45 to 2019-09-17T23:45 PTL Election from 2019-09-10T23:45 to 2019-09-17T23:45 TC Campaigning from 2019-09-03T23:45 to 2019-09-10T23:45 TC Nominations from 2019-08-27T23:45 to 2019-09-03T23:45 PTL Nominations from 2019-08-27T23:45 to 2019-09-03T23:45 Set email_deadline to 2019-09-03T00:00 Setting TC timeframe end to email_deadline Beginning of Stein Cycle @ 2018-08-10 00:00:00+00:00 End of Train cycle @ 2019-09-03 00:00:00+00:00 Election timeframe: 389 days, 0:00:00s This format makes it easier for election officials since we would only need to generate the electorate once for both elections. We can create whatever polls we need to for PTL elections at the same time we make the TC election poll. All in all less confusing for the electorate as well (hopefully). Thanks! -Kendall Nelson (diablo_rojo) & the election officials [1] https://etherpad.openstack.org/p/election-train-ptg -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Tue May 14 23:48:18 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 14 May 2019 18:48:18 -0500 Subject: [watcher] Compute CDM builder issues (mostly perf related) In-Reply-To: References: Message-ID: <599abe78-0239-d449-469b-7ed37bfa15dd@gmail.com> On 5/14/2019 3:34 PM, Matt Riedemann wrote: > 1. In general, a lot of this code for building the compute node model is > based on at least using the 2.53 microversion (Pike) in nova where the > hypervisor.id is a UUID - this is actually necessary for a multi-cell > environment like CERN. The nova_client.api_version config option already > defaults to 2.56 which was in Queens. I'm not sure what the > compatibility matrix looks like for Watcher, but would it be possible > for us to say that Watcher requires nova at least at Queens level API > (so nova_client.api_version >= 2.60), add a release note and a > "watcher-status upgrade check" if necessary. This might make things a > bit cleaner in the nova CDM code to know we can rely on a given minimum > version. I tried changing nova_client.api_version to a FloatOpt but that gets messy because of how things like 2.60 are handled (str(2.60) gets turned into '2.6' which is not what we'd want). I was hoping we could use FloatOpt with a min version to enforce the minimum required version, but I guess we could do this other ways in the client helper code itself by comparing to some minimum required version in the code. -- Thanks, Matt From ashlee at openstack.org Wed May 15 01:03:17 2019 From: ashlee at openstack.org (Ashlee Ferguson) Date: Tue, 14 May 2019 20:03:17 -0500 Subject: Open Infrastructure Summit CFP Open - Deadline: July 2 Message-ID: <1ED31B50-AE18-40D3-BB17-3D3E9FC5BD93@openstack.org> Hi everyone, The Call for Presentations (CFP) [1] for the Open Infrastructure Summit in Shanghai (November 4 - 6, 2019) [2] is open! Review the list of Tracks [3], and submit your presentations, panels, and workshops before July 2, 2019. 
Sessions will be presented in both Mandarin and English, so you may submit your presentation in either language. The content submission process for the Forum and Project Teams Gathering will be managed separately in the upcoming months. SUBMIT YOUR PRESENTATION [1] - Deadline July 2, 2019 at 11:59pm PT (July 3 at 6:59 UTC) Want to help shape the content for the Summit? The Programming Committee helps select sessions from the CFP for the Summit schedule. Nominate yourself or someone else for the Programming Committee [4] before May 20, 2019. Registration and Sponsorship • Shanghai Summit + PTG registration is available in the following currencies: • Register in USD [5] • Register in RMB (includes fapiao) [6] • Sponsorship opportunities [7] Please email speakersupport at openstack.org with any questions or feedback. Thanks, Ashlee [1] https://cfp.openstack.org/ [2] https://www.openstack.org/summit/shanghai-2019 [3] https://www.openstack.org/summit/shanghai-2019/summit-categories [4] http://bit.ly/ShanghaiProgrammingCommittee [5] https://app.eventxtra.link/registrations/6640a923-98d7-44c7-a623-1e2c9132b402?locale=en [6] https://app.eventxtra.link/registrations/f564960c-74f6-452d-b0b2-484386d33eb6?locale=en [7] https://www.openstack.org/summit/shanghai-2019/sponsors/ Ashlee Ferguson OpenStack Foundation ashlee at openstack.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed May 15 01:31:46 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 15 May 2019 02:31:46 +0100 Subject: Summit video website shenanigans In-Reply-To: References: <21ce1f4d-2e19-589f-3bce-44f411a22e67@gmail.com> Message-ID: <363a03ca4dfe607224762944e2d43d081d2bdc1c.camel@redhat.com> On Tue, 2019-05-14 at 15:42 -0700, Kendall Nelson wrote: > I let various Foundation staff know- Jimmy and others- and supposedly its > been fixed now already? They are doing some more research to see if there > are other videos facing the same issues, but it should all be fine now? the nova cells v2 video is still only 25 mins https://www.youtube.com/watch?v=OyNFIOSGjac so while https://www.openstack.org/videos/summits/denver-2019/the-vision-for-openstack-clouds-explained does seam to be fixed the missing 15mins for the start of the cellsv2 video has not been adressed > > -Kendall (diablo_rojo) > > On Tue, May 14, 2019 at 3:00 PM Matt Riedemann wrote: > > > On Sunday I was able to pull up several of the summit videos on youtube > > via the summit phone app, e.g. [1] but when trying to view the same > > videos from the summit video website I'm getting errors [2]. > > > > Is there just something still in progress with linking the videos on the > > https://www.openstack.org/videos/ site? > > > > Note that in one case 15 minutes of a talk was chopped (40 minute talk > > chopped to 25 minutes) [3]. I reckon I should take that up directly with > > the speakersupport at openstack.org team though? 
> > > > [1] https://www.youtube.com/watch?v=YdSVY4517sE > > [2] > > > > https://www.openstack.org/videos/summits/denver-2019/the-vision-for-openstack-clouds-explained > > [3] https://www.youtube.com/watch?v=OyNFIOSGjac > > > > -- > > > > Thanks, > > > > Matt > > > > From renat.akhmerov at gmail.com Wed May 15 05:15:30 2019 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Wed, 15 May 2019 12:15:30 +0700 Subject: [mistral] Proposing to have Mistral office hours weekly on Wed 8.00 UTC In-Reply-To: References: Message-ID: <91db14e5-23b9-43c6-8410-d773cfcd7e55@Spark> Hi, All people who are interested in any kind of discussions (technical, user questions) around Mistral are invited to participate regular Mistral office hours sessions starting next week. The proposed time slot for now is Wed 8.00 UTC. If you have other suggestions, let’s discuss. The Mistral IRC channel is #openstack-mistral. That doesn’t mean though that you can find us in the channel only within this hour once a week. Usually some of the Mistral contributors are there and available to talk. However, I’d like to renew office hours just so people know that somebody will be there for sure at this time. Since we haven’t had regular meetings for a few months the most important topics I’d like to propose are building a further roadmap and helping new contributors getting up to speed. But anything else is also welcome. Thanks Renat Akhmerov @Nokia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ghcks1000 at gmail.com Wed May 15 06:02:50 2019 From: ghcks1000 at gmail.com (Hochan Lee) Date: Wed, 15 May 2019 15:02:50 +0900 Subject: [Tacker][dev] Scaling and auto-healing functions for VNFFG Message-ID: <5cdbab7f.1c69fb81.1037b.51db@mx.google.com> Hello tacker team and all,   I'm hochan lee and graduate student from korea univ.   Our team is interested in SFC and VNFFG, especially HA of VNFFG.   We are intereseted in scaling and auto-healing functions for VNFFG proposed in Tacker Pike Specifications. https://specs.openstack.org/openstack/tacker-specs/specs/pike/vnffg-scaling.html https://specs.openstack.org/openstack/tacker-specs/specs/pike/vnffg-autohealing.html   We think these functions hadn't been developed yet. Are these features currently being developed?   If not, can we go on developing these features for contribution?   We wanna join next tacker weekly meeting and discuss them.   Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Wed May 15 08:29:00 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 15 May 2019 10:29:00 +0200 Subject: [PTL][SIG][WG] PTG Team Photos In-Reply-To: References: <981673d8-b095-4c30-a651-577d1c5286d3@www.fastmail.com> <30c8cbb5-b11b-be98-339d-ef6c5e35305b@gmail.com> Message-ID: So many great people, but Rain + Treva are winning it :) Thank you Kendall! On 5/14/19 7:36 PM, Kendall Nelson wrote: > Hello! > > I have just about all the photos sorted into separate project team folders[1]. > There were a few groups that were not signed up and came for photos anyway which > is great, but being that I wasn't taking photos I don't know exactly what teams > they were. These teams that I was unable to place just exist outside the team > directories. If anyone wants to speak up and help solve the mystery I would very > much appreciate it! > > Enjoy! 
> > -Kendall (diablo_rojo) > > [1] https://www.dropbox.com/sh/fydqjehy9h5y728/AAAEP6h_uK_6r1a9oh3aAF6Qa?dl=0 > > On Mon, May 13, 2019 at 4:00 PM Kendall Nelson > wrote: > > Sorting through them today, should have a link for everyone tomorrow. > > -Kendall > > On Fri, May 10, 2019 at 11:01 AM Jay Bryant > wrote: > > Colleen, > > I haven't seen them made available anywhere yet so I don't think you > missed an e-mail. > > Jay > > On 5/10/2019 12:48 PM, Colleen Murphy wrote: > > On Thu, Mar 28, 2019, at 17:03, Kendall Nelson wrote: > >> Hello! > >> > >> If your team is attending the PTG and is interested in having a team > >> photo taken, here is the signup[1]! There are slots Thursday and Friday > >> from 10:00 AM to 4:30 PM. > >> > >> The location is TBD but will likely be close to where registration will > >> be. I'll send an email out the day before with a reminder of your time > >> slot and an exact location. > >> > >> -Kendall (diablo_rojo) > >> > >> > [1]https://docs.google.com/spreadsheets/d/1DgsRHVWW2YLv7ewfX0M21zWJRf4wUfPG4ff2V5XtaMg/edit?usp=sharing > >> > > Are the photos available somewhere now? I'm wondering if I missed an > email. > > > > Colleen > > > From li.canwei2 at zte.com.cn Wed May 15 09:03:10 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Wed, 15 May 2019 17:03:10 +0800 (CST) Subject: =?UTF-8?B?UmU6W3dhdGNoZXJdIENvbXB1dGUgQ0RNIGJ1aWxkZXIgaXNzdWVzIChtb3N0bHkgcGVyZiByZWxhdGVkKQ==?= In-Reply-To: References: bde03f2f-c416-2bdb-285f-68c1eaa06a34@gmail.com Message-ID: <201905151703106287420@zte.com.cn> Hi all, I was looking over the NovaClusterDataModelCollector code today and trying to learn more about how watcher builds the nova CDM (and when) and got digging into this change from Stein [1] where I noted what appear to be several issues. I'd like to enumerate a few of those issues here and then figure out how to proceed. 1. In general, a lot of this code for building the compute node model is based on at least using the 2.53 microversion (Pike) in nova where the hypervisor.id is a UUID - this is actually necessary for a multi-cell environment like CERN. The nova_client.api_version config option already defaults to 2.56 which was in Queens. I'm not sure what the compatibility matrix looks like for Watcher, but would it be possible for us to say that Watcher requires nova at least at Queens level API (so nova_client.api_version >= 2.60), add a release note and a "watcher-status upgrade check" if necessary. This might make things a bit cleaner in the nova CDM code to know we can rely on a given minimum version. [licanwei]:We set the default nova api version to 2.56 , but it's better to add a release note 2. I had a question about when the nova CDM gets built now [2]. It looks like the nova CDM only gets built when there is an audit? But I thought the CDM was supposed to get built on start of the decision-engine service and then refreshed every hour (by default) on a periodic task or as notifications are processed that change the model. Does this mean the nova CDM is rebuilt fresh whenever there is an audit even if the audit is not scoped? If so, isn't that potentially inefficient (and an unnecessary load on the compute API every time an audit runs?). [licanwei]:Yes, the CDM will be built when the first audit being created. and don't rebuild if the next new audit with the same scope. 3. The host_aggregates and availability_zone compute audit scopes don't appear to be documented in the docs or the API reference, just the spec [3]. 
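To make the documentation item above a little more concrete: the compute audit scope is essentially a list of filter entries keyed by host_aggregates and availability_zones, matching on id or name. The snippet below is only a sketch of that general shape with made-up values; the exact schema (for example, whether these entries nest under a top-level "compute" key) should be confirmed against the scope spec and the audittemplate create help text before it is documented:

    # Sketch of the general shape of a compute audit scope, with made-up
    # values; verify the exact schema against the Stein scope spec before
    # relying on it.
    audit_scope = [
        {"host_aggregates": [
            {"id": 5},                # match an aggregate by id
            {"name": "rack42-aggr"},  # or by name (hypothetical)
        ]},
        {"availability_zones": [
            {"name": "AZ-edge-1"},    # hypothetical AZ name
        ]},
    ]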
Should I open a docs bug about what are the supported audit scopes and how they work (it looks like the host_aggregates scope works for aggregate ids or names and availability_zone scope works for AZ names). [licanwei]:There is an example in CLI command 'watcher help create audittemplate' and it's a good idea to documented these. 4. There are a couple of issues with how the unscoped compute nodes are retrieved from nova [4]. a) With microversion 2.33 there is a server-side configurable limit applied when listing hypervisors (defaults to 1000). In a large cloud this could be a problem since the watch client-side code is not paging. b) The code is listing hypervisors with details, but then throwing away those details to just get the hypervisor_hostname, then iterating over each of those node names and getting the details per hypervisor again. I see why this is done because of the scope vs unscoped cases, but we could still optimize this I think (we might need some changes to python-novaclient for this though, which should be easy enough to add). [licanwei]: Yes, If novaclient can do some changes, we can optimize the code. 5. For each server on a node, we get the details of the server in separate API calls to nova [5]. Why can't we just do a GET /servers/detail and filter on "host" or "node" so it's a single API call to nova per hypervisor? [licanwei] This also depends on novaclient. I'm happy to work on any of this but if there are any reasons things need to be done this way please let me know before I get started. Also, how would the core team like these kinds of improvements tracked? With bugs? [licanwei]: welcome to improve Watcher. bug or other kind is not important [1] https://review.opendev.org/#/c/640585/ [2] https://review.opendev.org/#/c/640585/10/watcher/decision_engine/model/collector/nova.py at 181 [3] https://specs.openstack.org/openstack/watcher-specs/specs/stein/implemented/scope-for-watcher-datamodel.html [4] https://review.opendev.org/#/c/640585/10/watcher/decision_engine/model/collector/nova.py at 257 [5] https://review.opendev.org/#/c/640585/10/watcher/decision_engine/model/collector/nova.py at 399 -- Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From li.canwei2 at zte.com.cn Wed May 15 09:13:57 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Wed, 15 May 2019 17:13:57 +0800 (CST) Subject: =?UTF-8?B?UmU6W3dhdGNoZXJdIENvbXB1dGUgQ0RNIGJ1aWxkZXIgaXNzdWVzIChtb3N0bHkgcGVyZiByZWxhdGVkKQ==?= In-Reply-To: <9c229be7-52a6-8055-9261-eaef73318394@gmail.com> References: bde03f2f-c416-2bdb-285f-68c1eaa06a34@gmail.com, 9c229be7-52a6-8055-9261-eaef73318394@gmail.com Message-ID: <201905151713578647919@zte.com.cn> On 5/14/2019 3:34 PM, Matt Riedemann wrote: > 2. I had a question about when the nova CDM gets built now [2]. It looks > like the nova CDM only gets built when there is an audit? But I thought > the CDM was supposed to get built on start of the decision-engine > service and then refreshed every hour (by default) on a periodic task or > as notifications are processed that change the model. Does this mean the > nova CDM is rebuilt fresh whenever there is an audit even if the audit > is not scoped? If so, isn't that potentially inefficient (and an > unnecessary load on the compute API every time an audit runs?). Also, it looks like https://bugs.launchpad.net/watcher/+bug/1828582 is due to a regression caused by that change. 
The problem is a nova notification is received before the nova CDM is built which results in an AttributeError traceback in the decision-engine logs. Should we be building the nova CDM if nova is sending notifications and there is no model yet? Or should we just handle the case that the nova CDM hasn't been built yet when we start getting notifications (and before an audit builds the CDM)? [licanwei]:please refer to https://github.com/openstack/watcher/blob/master/watcher/decision_engine/model/notification/nova.py#L144 When a nova notification is received before the nova CDM is built or no node in the CDM, the node will be add to the CDM. -- Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From li.canwei2 at zte.com.cn Wed May 15 09:19:10 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Wed, 15 May 2019 17:19:10 +0800 (CST) Subject: =?UTF-8?B?UmU6W3dhdGNoZXJdIENvbXB1dGUgQ0RNIGJ1aWxkZXIgaXNzdWVzIChtb3N0bHkgcGVyZiByZWxhdGVkKQ==?= In-Reply-To: <599abe78-0239-d449-469b-7ed37bfa15dd@gmail.com> References: bde03f2f-c416-2bdb-285f-68c1eaa06a34@gmail.com, 599abe78-0239-d449-469b-7ed37bfa15dd@gmail.com Message-ID: <201905151719101538176@zte.com.cn> On 5/14/2019 3:34 PM, Matt Riedemann wrote: > 1. In general, a lot of this code for building the compute node model is > based on at least using the 2.53 microversion (Pike) in nova where the > hypervisor.id is a UUID - this is actually necessary for a multi-cell > environment like CERN. The nova_client.api_version config option already > defaults to 2.56 which was in Queens. I'm not sure what the > compatibility matrix looks like for Watcher, but would it be possible > for us to say that Watcher requires nova at least at Queens level API > (so nova_client.api_version >= 2.60), add a release note and a > "watcher-status upgrade check" if necessary. This might make things a > bit cleaner in the nova CDM code to know we can rely on a given minimum > version. I tried changing nova_client.api_version to a FloatOpt but that gets messy because of how things like 2.60 are handled (str(2.60) gets turned into '2.6' which is not what we'd want). I was hoping we could use FloatOpt with a min version to enforce the minimum required version, but I guess we could do this other ways in the client helper code itself by comparing to some minimum required version in the code. [licanwei]: Maybe we can refer to https://github.com/openstack/watcher/blob/master/watcher/common/nova_helper.py#L714 Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From moguimar at redhat.com Wed May 15 09:37:04 2019 From: moguimar at redhat.com (Moises Guimaraes de Medeiros) Date: Wed, 15 May 2019 11:37:04 +0200 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> Message-ID: Should uncap patches be -W until next bandit release? Em ter, 14 de mai de 2019 às 17:26, Doug Hellmann escreveu: > Zane Bitter writes: > > > On 13/05/19 1:40 PM, Ben Nemec wrote: > >> > >> > >> On 5/13/19 12:23 PM, Ben Nemec wrote: > >>> Nefarious cap bandits are running amok in the OpenStack community! > >>> Won't someone take a stand against these villainous headwear thieves?! > >>> > >>> Oh, sorry, just pasted the elevator pitch for my new novel. ;-) > >>> > >>> Actually, this email is to summarize the plan we came up with in the > >>> Oslo meeting this morning. 
Since we have a bunch of projects affected > >>> by the Bandit breakage I wanted to make sure we had a common fix so we > >>> don't have a bunch of slightly different approaches in each project. > >>> The plan we agreed on in the meeting was to push a two patch series to > >>> each repo - one to cap bandit <1.6.0 and one to uncap it with a > >>> !=1.6.0 exclusion. The first should be merged immediately to unblock > >>> ci, and the latter can be rechecked once bandit 1.6.1 releases to > >>> verify that it fixes the problem for us. > > > > I take it that just blocking 1.6.0 in global-requirements isn't an > > option? (Would it not work, or just break every project's requirements > > job? I could live with the latter since they're broken anyway because of > > the sphinx issue below...) > > Because bandit is a "linter" it is in the blacklist in the requirements > repo, which means it is not constrained there. Projects are expected to > manage the versions of linters they use, and roll forward when they are > ready to deal with any new rules introduced by the linters (either by > following or disabling them). > > So, no, unfortunately we can't do this globally through the requirements > repo right now. > > -- > Doug > > -- Moisés Guimarães Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Wed May 15 09:38:42 2019 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 15 May 2019 11:38:42 +0200 Subject: [all] [TC] [elections] Proposed Dates In-Reply-To: References: Message-ID: Kendall Nelson wrote: > [...] > So, basically if we follow the usual formulas for when to hold the TC > and PTL elections, they overlap a bit but not nicely. You can see what > the dates would be if we exactly followed the usual timeline in this > etherpad[1]. > > The proposal from the election officials is that we move the PTL > nominations 1 week early and that gives them 1-week idle time in which > we *could* do campaigning for PTLs the same way we do campaigning for TC > seats. We definitely don't have to do campaigning, it could just be a > week of idle time while TC campaigning happens. > > Setting Combined Election > Summit is at: 2019-11-04 > Release is at: 2019-10-14 > Latest possible completion is at: 2019-09-23 > Moving back to Tuesday: 2019-09-17 > TC Election from 2019-09-10T23:45 to 2019-09-17T23:45 > PTL Election from 2019-09-10T23:45 to 2019-09-17T23:45 > TC Campaigning from 2019-09-03T23:45 to 2019-09-10T23:45 > TC Nominations from 2019-08-27T23:45 to 2019-09-03T23:45 > PTL Nominations from 2019-08-27T23:45 to 2019-09-03T23:45 > Set email_deadline to 2019-09-03T00:00 > Setting TC timeframe end to email_deadline > Beginning of Stein Cycle @ 2018-08-10 00:00:00+00:00 > End of Train cycle @ 2019-09-03 00:00:00+00:00 > Election timeframe: 389 days, 0:00:00s > > This format makes it easier for election officials since we would only > need to generate the electorate once for both elections. We can create > whatever polls we need to for PTL elections at the same time we make the > TC election poll. All in all less confusing for the electorate as well > (hopefully). We historically placed TC elections after PTL elections because some people found it useful to know if they were elected PTL before running (or not running) for a TC seat. Is that still a concern? If not, I think it's just easier to do both at the same time (nominations, campaigning, election). 
-- Thierry Carrez (ttx) From aspiers at suse.com Wed May 15 09:45:41 2019 From: aspiers at suse.com (Adam Spiers) Date: Wed, 15 May 2019 10:45:41 +0100 Subject: [Tacker][dev] Scaling and auto-healing functions for VNFFG In-Reply-To: <5cdbab7f.1c69fb81.1037b.51db@mx.google.com> References: <5cdbab7f.1c69fb81.1037b.51db@mx.google.com> Message-ID: <20190515094541.55hoaiuqqk5fbbu4@pacific.linksys.moosehall> Hochan Lee wrote: >Hello tacker team and all, >  >I'm hochan lee and graduate student from korea univ. >  >Our team is interested in SFC and VNFFG, especially HA of VNFFG. >  >We are intereseted in scaling and auto-healing functions for VNFFG proposed in Tacker Pike Specifications. >https://specs.openstack.org/openstack/tacker-specs/specs/pike/vnffg-scaling.html >https://specs.openstack.org/openstack/tacker-specs/specs/pike/vnffg-autohealing.html >  >We think these functions hadn't been developed yet. > >Are these features currently being developed? >  >If not, can we go on developing these features for contribution? >  >We wanna join next tacker weekly meeting and discuss them. These sound interesting. I know very little about NFV so I can't comment on the details, but I just wanted to make sure everyone involved is aware of these two Special Interest Groups: https://wiki.openstack.org/wiki/Auto-scaling_SIG https://wiki.openstack.org/wiki/Self-healing_SIG Maybe these SIGs can help you, e.g. if there is any cross-project coordination work required. Also it would be great if you could keep the SIGs updated on any results you achieve, since they are collecting documentation of use cases in order to make auto-scaling and self-healing easier for non-developers to achieve within OpenStack. From thierry at openstack.org Wed May 15 10:01:36 2019 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 15 May 2019 12:01:36 +0200 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190514143935.thuj6t7z6v4xoyay@mthode.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190514143935.thuj6t7z6v4xoyay@mthode.org> Message-ID: <90baa056-ab00-d2fc-f068-0a312ea775f7@openstack.org> Matthew Thode wrote: > [...] > I don't like the idea of conflating the stability promise of > upper-constraints.txt with the not quite fully tested-ness of adding > security updates after the fact (while we do some cross testing, we do > not and should not have 100% coverage, boiling the ocean). Me neither. > The only > way I can see this working is to have a separate file for security > updates. > > The idea I had (and don't like too much) is to do the following. > 1. Keep upper-constraints.txt as is > a. rename to tox-constraints possibly > 2. add a new file, let's call it 'security-updates.txt' > a. in this file goes security updates and all the knock on updates > that it causes (foo pulls in a new bersion of bar and baz). > b. the file needs to maintain co-installability of openstack. It is > laid over the upper-constraints file and tested the same way > upper-constraints is. This testing is NOT perfect. The generated > file could be called something like > 'somewhat-tested-secureconstraints.txt' > 3. global-requirements.txt remains the same (minimum not updated for > security issues) > > This would increase test sprawl quite a bit (tests need to be run on any > constraints change on this larger set). 
> This also sets up incrased work and scope for the requirements team. > Perhaps this could be a sub team type of item or something? I'm a bit worried that a security-updates.txt would promise more than we can deliver. While we may be able to track vulnerabilities in our direct dependencies, we can't rely on *them* to properly avoid depending on vulnerable second-level dependencies (and so on). And this solution does not cover non-Python dependencies. So saying "use this to be secure" is just misleading our users. Nothing short of full distribution security work can actually deliver on a "use this to be secure" promise. And that is definitely not the kind of effort we should tackle as a community imho. I understand the need to signal critical vulnerabilities in our dependencies to our users and distributions. Historically we have used the OSSN (OpenStack Security Notices) to draw attention to select, key vulnerabilities in our dependency chain: OSSN-0082 - Heap and Stack based buffer overflows in dnsmasq<2.78 OSSN-0044 - Older versions of noVNC allow session theft OSSN-0043 - glibc 'Ghost' vulnerability can allow remote code execution ... I would rather continue to use that mechanism to communicate about critical vulnerabilities in all our dependencies than implement a complex and costly process to only cover /some/ of our Python dependencies. -- Thierry Carrez (ttx) From sean.mcginnis at gmx.com Wed May 15 11:35:53 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 15 May 2019 06:35:53 -0500 Subject: [all] [TC] [elections] Proposed Dates In-Reply-To: References: Message-ID: <20190515113553.GA33223@smcginnis-mbp.local> > > > > This format makes it easier for election officials since we would only > > need to generate the electorate once for both elections. We can create > > whatever polls we need to for PTL elections at the same time we make the > > TC election poll. All in all less confusing for the electorate as well > > (hopefully). > > We historically placed TC elections after PTL elections because some people > found it useful to know if they were elected PTL before running (or not > running) for a TC seat. > > Is that still a concern? If not, I think it's just easier to do both at the > same time (nominations, campaigning, election). > > -- > Thierry Carrez (ttx) > I personally don't see this being an issue and think it could be beneficial from a community focus point of view. I wonder if that would cause some challenges for the election officials though. I would guess that the election tooling is currently only able to handle one type of election at a time. Sean From kalyani.rajkumar at bristol.ac.uk Wed May 15 12:24:04 2019 From: kalyani.rajkumar at bristol.ac.uk (Kalyani Rajkumar) Date: Wed, 15 May 2019 12:24:04 +0000 Subject: [networking-sfc] Unable to get Service Function Chain Mechanism working in Neutron Message-ID: Hi, I have been trying to enable the networking SFC mechanism in OpenStack. I have successfully created port pairs, port pair groups, port chain and a flow classifier. However, I am unable to get the service chain working. The architecture of the set up I have deployed is attached. I have used the queens version of OpenStack. The steps that I followed are as below. 
* Create port neutron port-create --name sfc-Network * Create VMs and attach the interfaces with them accordingly VM1 - P1 & P2; VM2 - P3 & P4; VM3 - P5 & P6 * Create port pairs neutron port-pair-create pp1 -- ingress p1 -- egress p2 neutron port-pair-create pp2 -- ingress p3 -- egress p4 neutron port-pair-create pp3 -- ingress p5 -- egress p6 * Create port pair groups neutron port-pair-group-create -- port-pair pp1 ppg1 neutron port-pair-group-create -- port-pair pp2 ppg2 neutron port-pair-group-create -- port-pair pp3 ppg3 * Create flow classifier neutron flow-classifier-create --source-ip-prefix --destination-ip-prefix --logical-source-port p1 fc1 * Create port chain neutron port-chain-create --port-pair-group ppg1 --port-pair-group ppg2 --port-pair-group ppg3 --flow-classifier fc1 pc1 I am testing this architecture by sending a ping request from VM1 to VM3. Therefore, the destination port is P6. If SFC is working correctly, I should be able to see the packets go through the VM2 to VM3 when I do a tcpdump in VM2. As I am new to OpenStack and SFC, I am not certain if this is logically correct. I would like to pose two questions. 1) All the VMs are on the same network, is it logically correct to expect the ping packets to be routed from VM1 > VM2 > VM3 in the SFC scenario? Because all the ports are on the same network, I get a ping response but it is not via VM2 even though the port chain is created through VM2. 2) If not, how do I make sure that the packets are routed through VM2 which is the second port pair in the port pair chain. Could it be something to do with the OpenVSwitch configuration? Any help would be highly appreciated. Regards, Kalyani Rajkumar High Performance Networks Group, University of Bristol -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Picture1.png Type: image/png Size: 12566 bytes Desc: Picture1.png URL: From doug at doughellmann.com Wed May 15 12:54:28 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 15 May 2019 08:54:28 -0400 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> Message-ID: Moises Guimaraes de Medeiros writes: > Should uncap patches be -W until next bandit release? I would expect them to fail the linter job until then, so I don't think that's strictly needed. > > Em ter, 14 de mai de 2019 às 17:26, Doug Hellmann > escreveu: > >> Zane Bitter writes: >> >> > On 13/05/19 1:40 PM, Ben Nemec wrote: >> >> >> >> >> >> On 5/13/19 12:23 PM, Ben Nemec wrote: >> >>> Nefarious cap bandits are running amok in the OpenStack community! >> >>> Won't someone take a stand against these villainous headwear thieves?! >> >>> >> >>> Oh, sorry, just pasted the elevator pitch for my new novel. ;-) >> >>> >> >>> Actually, this email is to summarize the plan we came up with in the >> >>> Oslo meeting this morning. Since we have a bunch of projects affected >> >>> by the Bandit breakage I wanted to make sure we had a common fix so we >> >>> don't have a bunch of slightly different approaches in each project. >> >>> The plan we agreed on in the meeting was to push a two patch series to >> >>> each repo - one to cap bandit <1.6.0 and one to uncap it with a >> >>> !=1.6.0 exclusion. 
The first should be merged immediately to unblock >> >>> ci, and the latter can be rechecked once bandit 1.6.1 releases to >> >>> verify that it fixes the problem for us. >> > >> > I take it that just blocking 1.6.0 in global-requirements isn't an >> > option? (Would it not work, or just break every project's requirements >> > job? I could live with the latter since they're broken anyway because of >> > the sphinx issue below...) >> >> Because bandit is a "linter" it is in the blacklist in the requirements >> repo, which means it is not constrained there. Projects are expected to >> manage the versions of linters they use, and roll forward when they are >> ready to deal with any new rules introduced by the linters (either by >> following or disabling them). >> >> So, no, unfortunately we can't do this globally through the requirements >> repo right now. >> >> -- >> Doug >> >> > > -- > > Moisés Guimarães > > Software Engineer > > Red Hat > > -- Doug From moguimar at redhat.com Wed May 15 12:55:03 2019 From: moguimar at redhat.com (Moises Guimaraes de Medeiros) Date: Wed, 15 May 2019 14:55:03 +0200 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> Message-ID: Doug, they pass now, and might fail once 1.6.1 is out and the behavior is not fixed, but that will probably need a recheck on a passed job. The -W would be just a reminder not to merge them by mistake. Em qua, 15 de mai de 2019 às 14:52, Doug Hellmann escreveu: > Moises Guimaraes de Medeiros writes: > > > Should uncap patches be -W until next bandit release? > > I would expect them to fail the linter job until then, so I don't think > that's strictly needed. > > > > > Em ter, 14 de mai de 2019 às 17:26, Doug Hellmann > > > escreveu: > > > >> Zane Bitter writes: > >> > >> > On 13/05/19 1:40 PM, Ben Nemec wrote: > >> >> > >> >> > >> >> On 5/13/19 12:23 PM, Ben Nemec wrote: > >> >>> Nefarious cap bandits are running amok in the OpenStack community! > >> >>> Won't someone take a stand against these villainous headwear > thieves?! > >> >>> > >> >>> Oh, sorry, just pasted the elevator pitch for my new novel. ;-) > >> >>> > >> >>> Actually, this email is to summarize the plan we came up with in the > >> >>> Oslo meeting this morning. Since we have a bunch of projects > affected > >> >>> by the Bandit breakage I wanted to make sure we had a common fix so > we > >> >>> don't have a bunch of slightly different approaches in each project. > >> >>> The plan we agreed on in the meeting was to push a two patch series > to > >> >>> each repo - one to cap bandit <1.6.0 and one to uncap it with a > >> >>> !=1.6.0 exclusion. The first should be merged immediately to unblock > >> >>> ci, and the latter can be rechecked once bandit 1.6.1 releases to > >> >>> verify that it fixes the problem for us. > >> > > >> > I take it that just blocking 1.6.0 in global-requirements isn't an > >> > option? (Would it not work, or just break every project's requirements > >> > job? I could live with the latter since they're broken anyway because > of > >> > the sphinx issue below...) > >> > >> Because bandit is a "linter" it is in the blacklist in the requirements > >> repo, which means it is not constrained there. Projects are expected to > >> manage the versions of linters they use, and roll forward when they are > >> ready to deal with any new rules introduced by the linters (either by > >> following or disabling them). 
> >> > >> So, no, unfortunately we can't do this globally through the requirements > >> repo right now. > >> > >> -- > >> Doug > >> > >> > > > > -- > > > > Moisés Guimarães > > > > Software Engineer > > > > Red Hat > > > > > > -- > Doug > -- Moisés Guimarães Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Wed May 15 12:58:52 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Wed, 15 May 2019 13:58:52 +0100 (BST) Subject: [placement][ptg] Summary of Summaries Message-ID: I produced several summary messages at the end of the PTG [1] but then disappeared into a dark corporate belly, so I thought it might be useful to do a quick check in to assert and verify some status. I'll be doing a full pupdate on Friday. If any of the below doesn't align with your memories, please let me know. At the end the PTG there were some open questions about priorities so I produced an etherpad listing all the existing RFE stories along with the new ones produced by discussions at the PTG and asked people to register their preferences. Thus far only three people (including me) have voted. Please look at https://etherpad.openstack.org/p/placement-ptg-train-rfe-voter to register your input. As things currently stand our priorities are what you would expect based on who is available to work and the work that needs to be done to be able to satisfy other projects' dependencies on placement: * Consumer types, spec at: https://review.opendev.org/#/c/654799/ * Nested things: * request group mapping in allocation candidates: https://review.opendev.org/#/c/657582/ * the remaining "nested magic: https://review.opendev.org/#/c/658510/ The latter is a WIP and will almost certainly need someone besides Eric to finish it off. Multiple inter-related features will come out of that. See the related story [2]. "support any trait in allocation candidates" and "support mixing required traits with any traits" are still under review but have been de-emphasized. If we're able to get to them that's great, but they are not critical. Same is also true for managing a local-to-placement container. Since the PTG several of us have recognized/acknowledged a thing we already pretty much knew: Queries for nested providers in a highly populated but sparsely used cloud will be less performant than desired. There are things we can do to monitor and fix this. Tetsuro has already started some changes [3] and in the gaps I'm working on making changes to placeload [4] to build nested providers to be used in the perfload job. "Support resource provider partitioning" is also being de-emphasized, but based on several recent conversations I suspect we will want it soon, and any performance improvements we get before then will be important. So, to summarize this summary of summaries: Look at https://etherpad.openstack.org/p/placement-ptg-train-rfe-voter and review the three specs above. If you think other priorities are necessary please speak up and explain why. Thanks to everyone for their participation in the PTG and especially the virtual pre-PTG. We got a lot figured out and were still able to attend other sessions. 
[1] * Nested magic: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005823.html * Shared disk: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005829.html * Consumer Types: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005878.html * Placement, Ironic, Blazar: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005880.html [2] https://storyboard.openstack.org/#!/story/2005575 [3] https://review.opendev.org/#/c/658977/ [4] https://pypi.org/project/placeload/ -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From amotoki at gmail.com Wed May 15 12:58:40 2019 From: amotoki at gmail.com (Akihiro Motoki) Date: Wed, 15 May 2019 21:58:40 +0900 Subject: [storyboard] email notification on stories/tasks of subscribed projects Message-ID: Hi, Is there a way to get email notification on stories/tasks of subscribed projects in storyboard? In launchpad, I configure launchpad bug notifications of all bug changes in interested projects. After migration to storyboard (for example, openstackclient), I failed to find a similar way. Thanks, Akihiro Motoki (irc: amotoki) -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Wed May 15 13:04:04 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Wed, 15 May 2019 22:04:04 +0900 Subject: [storyboard] email notification on stories/tasks of subscribed projects In-Reply-To: References: Message-ID: +1 It could be a great feature. On Wed, May 15, 2019 at 10:02 PM Akihiro Motoki wrote: > Hi, > > Is there a way to get email notification on stories/tasks of subscribed > projects in storyboard? > In launchpad, I configure launchpad bug notifications of all bug changes > in interested projects. > After migration to storyboard (for example, openstackclient), I failed to > find a similar way. > > Thanks, > Akihiro Motoki (irc: amotoki) > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From kchamart at redhat.com Wed May 15 09:24:56 2019 From: kchamart at redhat.com (Kashyap Chamarthy) Date: Wed, 15 May 2019 11:24:56 +0200 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' Message-ID: <20190515092456.GH17214@paraplu> [When replying, please keep us Cced, I'm not subscribed to the list.] Grab a cup of tea, slightly long e-mail. Some of us in the upstream Nova channel are splitting hairs over this seemingly small issue about potential "security concern" in representing some CPU flags as "traits"[0]. Context ------- The 'os-traits' project lets you report CPU flags as "traits", e.g. to configure a flavor to require the "AVX2" CPU flag: $> openstack flavor set 1 --property trait:HW_CPU_X86_AVX2=required And here's the list of current CPU traits: https://github.com/openstack/os-traits/blob/master/os_traits/hw/cpu/x86.py The other day I casually noticed that the above file is missing some important CPU flags, and proposed a couple of drive-by small changes. And that set off this massive bike-shedding exercise sized Jupier. 
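A concrete illustration of the two layers involved may help here. The flavor extra spec shown above is the operator-facing syntax; the scheduler ultimately turns it into a placement GET /allocation_candidates query, where a '!' prefix marks a forbidden trait. The sketch below is illustrative only: the endpoint, token and resource amounts are placeholders rather than anything from this thread, and it assumes a placement microversion recent enough (1.22 is used here) to understand forbidden traits.

# Illustrative only: ask placement for candidates that expose AVX2 and
# do *not* expose the AMD-SSBD trait. Endpoint and token are placeholders;
# a real client would authenticate through keystoneauth instead.
import requests

PLACEMENT = "http://placement.example.com/placement"  # placeholder endpoint
TOKEN = "REPLACE_ME"                                   # placeholder token

resp = requests.get(
    PLACEMENT + "/allocation_candidates",
    headers={
        "X-Auth-Token": TOKEN,
        # The '!' (forbidden) prefix needs a reasonably recent microversion.
        "OpenStack-API-Version": "placement 1.22",
    },
    params={
        "resources": "VCPU:1,MEMORY_MB:512,DISK_GB:10",
        "required": "HW_CPU_X86_AVX2,!HW_CPU_X86_AMD_SSBD",
    },
)
resp.raise_for_status()
print(len(resp.json()["provider_summaries"]), "candidate providers")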
Some of the following flags provide mitigation from Meltdown/Spectre, others reduce performance degradation, yet others are 'benign' features or are in a weird zone, like the just-off-the-press (for "ZombieLoad") CPU flag 'md-clear' -- which doesn't fix the vulnerability, but is a way to report to the operating system that the kernel is opportunistically issuing instructions (which, thankfully, already _exist_ in the CPU) needed for the fix. (See what all these acronyms mean in the rendered QEMU documentation here[1].)

    - Intel : PCID, STIBP, SPEC-CTRL, SSBD, PDPE1GB, MD-CLEAR
    - AMD   : IBPB, STIBP, VIRT-SSBD, AMD-SSBD, AMD-NO-SSB, PDPE1GB

Two things to distinguish
-------------------------

It's worth remembering that there are two things here: (a) allowing CPU flags via Nova's config attribute: `[libvirt]/cpu_model_extra_flags`; and (b) allowing scheduling of instances based on CPU traits. We're talking about case (b) here.

Possible "security issue"
-------------------------

As mentioned above, Nova has this ability to say: "DON'T land this instance on a host if it has $TRAIT". So, hypothetically, you can say: "don't land on the host that has the AMD-SSBD fix", which can be done (not yet, as the patch[2] is still being discussed) in one of two ways:

    - Via a placement request: `required=!HW_CPU_X86_AMD_SSBD`
    - From a flavor Extra Spec: `trait:HW_CPU_X86_AMD_SSBD=forbidden`

So, theoretically there is scope for "exploiting" the above (though it is non-trivial); however, that only becomes possible once Nova exposes these traits via the Compute nodes (which it doesn't yet).

Contention / unsolved question
------------------------------

Should we expose CPU flags (e.g. "SSBD", or "STIBP") that provide mitigation for CPU flaws as traits or not? It is a "policy" decision, and the 'traits' are "forever" (well, you can soft-deprecate them with a comment) once they're added, hence all the belaboring.

There's no consensus here. Some think that we should _not_ allow those CPU flags as traits which can 'allow' you to target vulnerable hosts. Some think it is okay to add these as granular CPU traits. (Have a gander at the discussion on this[2] change.)

Does the Security Team have any strong opinions?

On "generic" traits
-------------------

We also discussed whether it makes sense to add "generic roll-up traits" such as 'HW_CPU_HAS_SPECTRE_CURE' and 'HW_CPU_HAS_MELTDOWN_CURE'. However intuitively appealing that seems, the messy real world doesn't quite allow it (as there are multiple different Spectre flaws). So, for now we won't do these "generic" traits; but they can be added later, if we change our minds.

Next steps
----------

If there is consensus on dropping those CPU-flags-as-traits that let you target vulnerable hosts, drop them. And add only those CPU flags as traits that provide either 'features' (what's the definition?) or those that reduce performance degradation. Otherwise, add all the required CPU flags consistently to 'os-traits', and move on.

Another idea
------------

For "Meltdown" (and for other vulnerabilities; it should be case-by-case, based on fix availability), we can potentially make Nova check the 'sysfs' directory for vulnerabilities. And if it reports "Vulnerable" (instead of "Mitigation", as shown below):

    $> cat /sys/devices/system/cpu/vulnerabilities/meltdown
    Mitigation: PTI

Then we can print a log warning for the current release that the host is vulnerable and warn that future Nova will refuse to run VMs on it, and then next release, make it mandatory. Likewise for "Spectre":

$> grep .
/sys/devices/system/cpu/vulnerabilities/spectre_* /sys/devices/system/cpu/vulnerabilities/spectre_v1:Mitigation: __user pointer sanitization /sys/devices/system/cpu/vulnerabilities/spectre_v2:Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling Some think this is not "Nova's business", because: "just like how you don't want to stop based on CPU fan speed or temperature or firmware patch levels ...". But that argument doesn't quite apply, as CPU fan/speed are very different, and are not seen by the guest. If you take security seriously, it _is_ be fair game, IMHO, to make Nova warn (then stop) launching instances on Compute hosts with vulnerable hypervisors. But my umbilical cord isn't tied to this idea, just wanted to mention it for completeness' sake. [0] https://github.com/openstack/os-traits/ [1] https://qemu.weilnetz.de/doc/qemu-doc.html#important_005fcpu_005ffeatures_005fintel_005fx86 [2] https://review.opendev.org/#/c/655193/ Add CPU traits for Meltdown/Spectre mitigatio -- /kashyap From kchamart at redhat.com Wed May 15 13:11:09 2019 From: kchamart at redhat.com (Kashyap Chamarthy) Date: Wed, 15 May 2019 15:11:09 +0200 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> Message-ID: <20190515131109.GJ17214@paraplu> On Wed, May 15, 2019 at 11:49:03AM +0100, Sean Mooney wrote: > On Wed, 2019-05-15 at 11:24 +0200, Kashyap Chamarthy wrote: [...] > > Contention / unsolved question > > ------------------------------ > > > > Whether we should expose CPU flags (e.g. "SSBD", or "STIBP") that > > provide mitigation from CPU flaws as traits or not? It is a "policy" > > decision, and the 'traits' are "forever" (well, you can soft-deprecate > > them with a comment) once they're added, hence all the belaboring. > > > > There's no consensus here. Some think that we should _not_ allow those > > CPU flags as traits which can 'allow' you to target vulnerable hosts. > > for what its worth im in this camp and have said so in other places > where we have been disucssing it. Yep, noted. > > Some think it is okay to add these as granular CPU traits. (Have > > a gander at the discussion on this[2] change.) > > > > Does the Security Team has any strong opinions? [...] > > Next steps > > ---------- > > > > If there is consensus on dropping those CPU-flags-as-traits that let you > > target vulnerable hosts, drop them. And add only those CPU flags as > > traits that provide either 'features' (what's the definition?) or those > > that reduce performance degradation. > > > my vote is for only adding tratis for cpu featrue. Noted; I'd like to hear other opinions. (And note that the word "feature" can get fuzzy in this context, I'll assume we're using it somewhat loosely to include things that help with reducing perf degradation, etc.) > PCID is a CPU feautre that was designed as a performce optiomistation ... except that "feature" was a 'no-op' and it wasn't even _used_, until Linux 4.1.4 enabled it (in November 2017) for Meltdown mitigation. So the presence of PCID in the hardware didn't matter one whit all these decades. (Source: http://archive.is/ma8Iw.) > and several generation later also was found to be useful in reducing > the performace impacts of the sepcter mitigation Nit: Not Spectre, but Meltdown. [...] 
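For concreteness, the sysfs check proposed under "Another idea" above is tiny to prototype. A rough standalone sketch follows; it is not Nova code, the logging is only a placeholder, and it covers just whatever entries the running kernel happens to expose under that directory.

# Standalone sketch: read the kernel's view of CPU vulnerabilities and
# warn about any that are still reported as "Vulnerable".
import glob
import logging

LOG = logging.getLogger(__name__)

def unmitigated_cpu_flaws(base="/sys/devices/system/cpu/vulnerabilities"):
    flaws = {}
    for path in glob.glob(base + "/*"):
        try:
            with open(path) as handle:
                status = handle.read().strip()
        except OSError:
            continue  # older kernels do not expose these files
        if status.startswith("Vulnerable"):
            flaws[path.rsplit("/", 1)[-1]] = status
    return flaws

if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    for name, status in unmitigated_cpu_flaws().items():
        LOG.warning("Host reports unmitigated CPU flaw %s: %s", name, status)

Whether anything should act on that information is, of course, exactly the policy question being debated in this thread.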
> > Some think this is not "Nova's business", because: "just like how you > > don't want to stop based on CPU fan speed or temperature or firmware > > patch levels ...". > > i think it applies perfectly. It's a matter of scope. To be clear — I'm not "insisting" that it be done in Nova. Just thinking out loud. [...] > form a product perspective vendors shoudl ensure that they > provide tooling and software updated that are secure by default "Product perspective" is irrelevant here. Of course, it's obvious that vendors "should" provide the relevant tooling and sofware updates. > > But that argument doesn't quite apply, as CPU > > fan/speed are very different, and are not seen by the guest. If you > > take security seriously, it _is_ be fair game, IMHO, to make Nova warn > > (then stop) launching instances on Compute hosts with vulnerable Correcting myself: Okay, "stopping" / "refusing to launch" is too strict and unresonable; scratch that. (Because, as discussed before, there _are_ valid cases to be made that certain admins/operators intentionally will run on vulnerable hypervisors — e.g. because their CPUs are too old to receive microcode updates. Or may deliberately tolerate this risk, as they know their risk policy. Or they're running staging envs, or any number of other reasons.) > > hypervisors. > > the same aregument could be aplied to qemu or libvirt. No, that argument does not apply to QEMU or libvirt. Why? QEMU and libvirt are low-level primitives. They explicitly state that they don't, and will not, make such "policy" decisions. But Nova, as a management tool, _does_ make some policy decisions (e.g. how we generate a libvirt guest XML based on certain criteria, and others). And in this case, Nova _can_ take a stance that "orchestration tools" should do that — that's perfectly acceptable. [...] -- /kashyap From fungi at yuggoth.org Wed May 15 13:25:43 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 15 May 2019 13:25:43 +0000 Subject: [security-sig][nova] On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <20190515092456.GH17214@paraplu> References: <20190515092456.GH17214@paraplu> Message-ID: <20190515132543.kanmiyq6my5unhnc@yuggoth.org> On 2019-05-15 11:24:56 +0200 (+0200), Kashyap Chamarthy wrote: > [When replying, please keep us Cced, I'm not subscribed to the list.] [...] At Kashyap's request I have bounced this message and his other followup to the openstack-discuss ML. Please reply there rather than on the openstack-security ML which is only used for automated notifications these days. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Wed May 15 13:36:12 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 15 May 2019 13:36:12 +0000 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <90baa056-ab00-d2fc-f068-0a312ea775f7@openstack.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190514143935.thuj6t7z6v4xoyay@mthode.org> <90baa056-ab00-d2fc-f068-0a312ea775f7@openstack.org> Message-ID: <20190515133612.cklkcqnvgomolx5o@yuggoth.org> On 2019-05-15 12:01:36 +0200 (+0200), Thierry Carrez wrote: [...] 
> Nothing short of full distribution security work can actually > deliver on a "use this to be secure" promise. And that is > definitely not the kind of effort we should tackle as a community > imho. I concur. There is a reason the model we follow for our own stable branches (fork from a point in time and only backport critical fixes) is basically identical to how distros operate, except they do it on a *much* larger scale. To do this "properly" we would not only need to stay on top of vulnerabilities announced for our entire transitive Python dependency tree, but also fork the source code for all of it and then backport fixes to those forks. This is precisely creating a new (perhaps derivative) distro, and it's a ton of work I doubt anyone is eager to sign up for. > I understand the need to signal critical vulnerabilities in our > dependencies to our users and distributions. Historically we have > used the OSSN (OpenStack Security Notices) to draw attention to > select, key vulnerabilities in our dependency chain: > > OSSN-0082 - Heap and Stack based buffer overflows in dnsmasq<2.78 > OSSN-0044 - Older versions of noVNC allow session theft > OSSN-0043 - glibc 'Ghost' vulnerability can allow remote code execution > ... > > I would rather continue to use that mechanism to communicate about > critical vulnerabilities in all our dependencies than implement a > complex and costly process to only cover /some/ of our Python > dependencies. This isn't so much the problem at hand, it's that we have deployment projects who have decided that deploying from stable branch source with pip-installed python packages selected using the frozen stable upper-constraints.txt set is a model they're recommending to their users. Honestly I think the solution to *that* problem is to stop, or at least document clearly that it's a security-wise unsafe choice for anything besides a test/proof-of-concept environment. I get why they thought it could be a useful addition since it's basically how we test our stable branches, but it's really a frightening lapse in judgement which puts our users' systems at grave risk of compromise. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From cdent+os at anticdent.org Wed May 15 13:37:52 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Wed, 15 May 2019 14:37:52 +0100 (BST) Subject: [all][dev] python-etcd3 needs maintainers Message-ID: Going through my todo list after returning from travels and I'm reminded of: When I was at OpenInfraDays UK, Louis Taylor (aka kragniz) asked me if I could check with the OpenStack community to see if there are people who are actively using python-etcd3 [1] that are interested in helping to maintain it. It needs more attention than he is able to give. python-etcd3 is a "Python client for the etcd API v3" over GRPC so more v3-native than etcd3-gateway [2]. I've used python-etcd3 in etcd-compute and it works well. I've been hoping to contribute in my Copious Free Time but don't seem to have any. 
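For anyone who has not used the library, a minimal sketch of what client code looks like; the host and port below are just the usual etcd defaults, assumed for illustration, and nothing here is specific to etcd-compute.

# Minimal python-etcd3 usage: connect to a local etcd v3 endpoint,
# write a key, read it back, and list keys under a prefix.
import etcd3

client = etcd3.client(host="127.0.0.1", port=2379)

client.put("/demo/greeting", "hello")

value, metadata = client.get("/demo/greeting")
print(metadata.key.decode(), value.decode())

for value, metadata in client.get_prefix("/demo/"):
    print(metadata.key.decode(), value.decode())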
[1] https://github.com/kragniz/python-etcd3 [2] https://github.com/dims/etcd3-gateway -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From adam at sotk.co.uk Wed May 15 13:41:44 2019 From: adam at sotk.co.uk (adam at sotk.co.uk) Date: Wed, 15 May 2019 14:41:44 +0100 Subject: [storyboard] email notification on stories/tasks of subscribed projects In-Reply-To: References: Message-ID: <1db76780066130ccb661d2b1f632f163@sotk.co.uk> On 2019-05-15 13:58, Akihiro Motoki wrote: > Hi, > > Is there a way to get email notification on stories/tasks of > subscribed projects in storyboard? Yes, go to your preferences (https://storyboard.openstack.org/#!/profile/preferences) by clicking on your name in the top right, then Preferences. Scroll to the bottom and check the "Enable notification emails" checkbox, then click "Save". There's a UI bug where sometimes the displayed preferences will look like the save button didn't work, but rest assured that it did unless you get an error message. Once you've done this the email associated with your OpenID will receive notification emails for things you're subscribed to (which includes changes on stories/tasks related to projects you're subscribed to). Thanks, Adam (SotK) From davanum at gmail.com Wed May 15 14:01:12 2019 From: davanum at gmail.com (Davanum Srinivas) Date: Wed, 15 May 2019 10:01:12 -0400 Subject: [all][dev] python-etcd3 needs maintainers In-Reply-To: References: Message-ID: yay! happy to cede control of this :) On Wed, May 15, 2019 at 9:38 AM Chris Dent wrote: > > Going through my todo list after returning from travels and I'm > reminded of: > > When I was at OpenInfraDays UK, Louis Taylor (aka kragniz) asked me > if I could check with the OpenStack community to see if there are > people who are actively using python-etcd3 [1] that are interested in > helping to maintain it. It needs more attention than he is able to > give. > > python-etcd3 is a "Python client for the etcd API v3" over GRPC so > more v3-native than etcd3-gateway [2]. > > I've used python-etcd3 in etcd-compute and it works well. I've been > hoping to contribute in my Copious Free Time but don't seem to have > any. > > [1] https://github.com/kragniz/python-etcd3 > [2] https://github.com/dims/etcd3-gateway > > > -- > Chris Dent ٩◔̯◔۶ https://anticdent.org/ > freenode: cdent tw: @anticdent -- Davanum Srinivas :: https://twitter.com/dims -------------- next part -------------- An HTML attachment was scrubbed... URL: From soulxu at gmail.com Wed May 15 14:07:17 2019 From: soulxu at gmail.com (Alex Xu) Date: Wed, 15 May 2019 22:07:17 +0800 Subject: [nova] PTG aligning of nova spec: show-server-numa-topology In-Reply-To: References: Message-ID: John Garbutt 于2019年5月10日周五 下午11:28写道: > Hi, > > My main worry was to not expose host related information to end users, but > noting administrators probably do what the information. > > Looking again at the Stein spec we merged, the proposed policy rules > already take care of all that. > > I think the next step is to re-propose the spec for the Train release. I > couldn't find it, but maybe you have done that already? > > Thanks, > johnthetubaguy > > On Fri, 10 May 2019 at 08:51, yonglihe wrote: > >> Hi, Everyone >> >> I synced up with Alex about comments we got at PTG. It's a long >> discussion, I might lost something. >> >> What i got lists below, fix me: >> >> * Remove sockets >> * Remove thread_policy >> >> Not sure about following comments: >> >> >> * Remove the cpu topology from the proposal? 
>> * Using the cpu pinning info instead of cpu set? >> >> >> By apply the suggestion, the API ``GET /servers/{server_id}/topology`` >> response gonna to be like this, >> >> and let us align what it should be: >> >> { >> # overall policy: TOPOLOGY % 'index >> "nodes":[ >> { >> # Host Numa Node >> # control by policy TOPOLOGY % 'index:host_info' >> "host_numa_node": 3, >> # 0:5 means vcpu 0 pinning to pcpu 5 >> # control by policy TOPOLOGY % 'index:host_info' >> "cpu_pinning": {0:5, 1:6}, >> "vcpu_set": [0,1,2,3], >> "siblings": [[0,1],[2,3]], >> "memory_mb": 1024, >> "pagesize_kb": 4096, >> "cores": 2, >> # one core has at least one thread >> "threads": 2 >> > I'm not sure the pagesize_kb, cores and threads. I guess Sean will have comment on them. We can't get cores for threads for each numa node. The InstanceNUMACell.cpu_topology is empty only except for the dedicated cpu policy. And the Instance.cpu_topology is for the whole instance. And we don't support choice page size per numa node. Thanks for John and Sean's discussion in the PTG, and sorry for loss the conversation in PTG again. Actually, those are the questions Yongli is looking for. > } >> ... >> ], # nodes >> } >> >> >> links: >> >> ptg: >> https://etherpad.openstack.org/p/nova-ptg-train L334 >> >> spec review: >> >> https://review.opendev.org/#/c/612256/25/specs/stein/approved/show-server-numa-topology.rst >> >> code review: >> https://review.openstack.org/#/c/621476/ >> >> bp: >> https://blueprints.launchpad.net/nova/+spec/show-server-numa-topology >> >> >> Regards >> Yongli He >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed May 15 14:28:24 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 15 May 2019 09:28:24 -0500 Subject: [watcher] Compute CDM builder issues (mostly perf related) In-Reply-To: <201905151719101538176@zte.com.cn> References: <201905151719101538176@zte.com.cn> Message-ID: <290e8682-9a57-51e1-26b2-62f09fff4cd6@gmail.com> On 5/15/2019 4:19 AM, li.canwei2 at zte.com.cn wrote: > I tried changing nova_client.api_version to a FloatOpt but that gets > messy because of how things like 2.60 are handled (str(2.60) gets turned > into '2.6' which is not what we'd want). I was hoping we could use > FloatOpt with a min version to enforce the minimum required version, but > I guess we could do this other ways in the client helper code itself by > comparing to some minimum required version in the code. > [licanwei]: Maybe we can refer to > https://github.com/openstack/watcher/blob/master/watcher/common/nova_helper.py#L714 > I just did this which seems more explicit: https://review.opendev.org/#/c/659194/ That change leaves the default of 2.56 since the 2.56 code does version discovery so it's backward compatible, but I think we can assert that you need at least 2.53 because of how the scoped nova CDM code works (and to support nova deployments with multiple cells properly). Also note that 2.53 is pike-era nova and 2.56 is queens-era nova and those seem old enough that it's safe to require 2.53 as a minimum for watcher in train. 
-- Thanks, Matt From mriedemos at gmail.com Wed May 15 14:32:34 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 15 May 2019 09:32:34 -0500 Subject: [watcher] Compute CDM builder issues (mostly perf related) In-Reply-To: <201905151713578647919@zte.com.cn> References: <201905151713578647919@zte.com.cn> Message-ID: <0ee8e7df-09d4-46ce-e5bc-93854c6d44ef@gmail.com> On 5/15/2019 4:13 AM, li.canwei2 at zte.com.cn wrote: > [licanwei]:please refer to > https://github.com/openstack/watcher/blob/master/watcher/decision_engine/model/notification/nova.py#L144 > > > When a nova notification is received before the nova CDM is built or no > node in the CDM, > > the node will be add to the CDM. > That's not what's happening in this bug. We're getting an instance.update event from nova during scheduling/building of an instance before it has a host, so when this is called: https://github.com/openstack/watcher/blob/master/watcher/decision_engine/model/notification/nova.py#L220 node_uuid is None. Which means we never call get_or_create_node here: https://github.com/openstack/watcher/blob/master/watcher/decision_engine/model/notification/nova.py#L44 And then we blow up here: https://github.com/openstack/watcher/blob/master/watcher/decision_engine/model/notification/nova.py#L50 because self.cluster_data_model is None. Based on your earlier reply about when the nova CDM is built: > [licanwei]:Yes, the CDM will be built when the first audit being created. It seems the fix for this notification traceback bug is to just make sure the self.cluster_data_model is not None and return early if it's not, i.e. gracefully handle receiving notifications before we've ever done an audit and built the nova CDM. -- Thanks, Matt From cdent+os at anticdent.org Wed May 15 15:02:00 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Wed, 15 May 2019 16:02:00 +0100 (BST) Subject: [placement][ptg] Summary of Summaries In-Reply-To: References: Message-ID: On Wed, 15 May 2019, Chris Dent wrote: > Thanks to everyone for their participation in the PTG and especially > the virtual pre-PTG. We got a lot figured out and were still able to > attend other sessions. I forgot an important thing! Placement needs a logo. I got the sense from some foundation staff members that the option at this time was to adopt the logo (and thus stickers) of a project that has faded gently away. That doesn't make for much of an identity so I wonder, is there anyone who is able and willing to draw: An Australian magpie looking like a waiter (linen cloth over one wing) carrying a plate ? -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From openstack at nemebean.com Wed May 15 15:38:13 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 15 May 2019 10:38:13 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> Message-ID: Yeah, I've just been relying on our cores to not merge the uncap patches before we're ready. I'm fine with marking them WIP too though. On 5/15/19 7:55 AM, Moises Guimaraes de Medeiros wrote: > Doug, they pass now, and might fail once 1.6.1 is out and the behavior > is not fixed, but that will probably need a recheck on a passed job. The > -W would be just a reminder not to merge them by mistake. > > Em qua, 15 de mai de 2019 às 14:52, Doug Hellmann > escreveu: > > Moises Guimaraes de Medeiros > writes: > > > Should uncap patches be -W until next bandit release? 
> > I would expect them to fail the linter job until then, so I don't think > that's strictly needed. > > > > > Em ter, 14 de mai de 2019 às 17:26, Doug Hellmann > > > > escreveu: > > > >> Zane Bitter > writes: > >> > >> > On 13/05/19 1:40 PM, Ben Nemec wrote: > >> >> > >> >> > >> >> On 5/13/19 12:23 PM, Ben Nemec wrote: > >> >>> Nefarious cap bandits are running amok in the OpenStack > community! > >> >>> Won't someone take a stand against these villainous headwear > thieves?! > >> >>> > >> >>> Oh, sorry, just pasted the elevator pitch for my new novel. ;-) > >> >>> > >> >>> Actually, this email is to summarize the plan we came up > with in the > >> >>> Oslo meeting this morning. Since we have a bunch of projects > affected > >> >>> by the Bandit breakage I wanted to make sure we had a common > fix so we > >> >>> don't have a bunch of slightly different approaches in each > project. > >> >>> The plan we agreed on in the meeting was to push a two patch > series to > >> >>> each repo - one to cap bandit <1.6.0 and one to uncap it with a > >> >>> !=1.6.0 exclusion. The first should be merged immediately to > unblock > >> >>> ci, and the latter can be rechecked once bandit 1.6.1 > releases to > >> >>> verify that it fixes the problem for us. > >> > > >> > I take it that just blocking 1.6.0 in global-requirements isn't an > >> > option? (Would it not work, or just break every project's > requirements > >> > job? I could live with the latter since they're broken anyway > because of > >> > the sphinx issue below...) > >> > >> Because bandit is a "linter" it is in the blacklist in the > requirements > >> repo, which means it is not constrained there. Projects are > expected to > >> manage the versions of linters they use, and roll forward when > they are > >> ready to deal with any new rules introduced by the linters > (either by > >> following or disabling them). > >> > >> So, no, unfortunately we can't do this globally through the > requirements > >> repo right now. > >> > >> -- > >> Doug > >> > >> > > > > -- > > > > Moisés Guimarães > > > > Software Engineer > > > > Red Hat > > > > > > -- > Doug > > > > -- > > Moisés Guimarães > > Software Engineer > > Red Hat > > > From mthode at mthode.org Wed May 15 15:49:31 2019 From: mthode at mthode.org (Matthew Thode) Date: Wed, 15 May 2019 10:49:31 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> Message-ID: <20190515154931.xhyzutsecqgoqqbl@mthode.org> If it helps, upper-constraints still has not been updated (and is -W'd) https://review.opendev.org/658767 On 19-05-15 10:38:13, Ben Nemec wrote: > Yeah, I've just been relying on our cores to not merge the uncap patches > before we're ready. I'm fine with marking them WIP too though. > > On 5/15/19 7:55 AM, Moises Guimaraes de Medeiros wrote: > > Doug, they pass now, and might fail once 1.6.1 is out and the behavior > > is not fixed, but that will probably need a recheck on a passed job. The > > -W would be just a reminder not to merge them by mistake. > > > > Em qua, 15 de mai de 2019 às 14:52, Doug Hellmann > > escreveu: > > > > Moises Guimaraes de Medeiros > > writes: > > > > > Should uncap patches be -W until next bandit release? > > > > I would expect them to fail the linter job until then, so I don't think > > that's strictly needed. 
> > > > > > > > Em ter, 14 de mai de 2019 às 17:26, Doug Hellmann > > > > > > escreveu: > > > > > >> Zane Bitter > writes: > > >> > > >> > On 13/05/19 1:40 PM, Ben Nemec wrote: > > >> >> > > >> >> > > >> >> On 5/13/19 12:23 PM, Ben Nemec wrote: > > >> >>> Nefarious cap bandits are running amok in the OpenStack > > community! > > >> >>> Won't someone take a stand against these villainous headwear > > thieves?! > > >> >>> > > >> >>> Oh, sorry, just pasted the elevator pitch for my new novel. ;-) > > >> >>> > > >> >>> Actually, this email is to summarize the plan we came up > > with in the > > >> >>> Oslo meeting this morning. Since we have a bunch of projects > > affected > > >> >>> by the Bandit breakage I wanted to make sure we had a common > > fix so we > > >> >>> don't have a bunch of slightly different approaches in each > > project. > > >> >>> The plan we agreed on in the meeting was to push a two patch > > series to > > >> >>> each repo - one to cap bandit <1.6.0 and one to uncap it with a > > >> >>> !=1.6.0 exclusion. The first should be merged immediately to > > unblock > > >> >>> ci, and the latter can be rechecked once bandit 1.6.1 > > releases to > > >> >>> verify that it fixes the problem for us. > > >> > > > >> > I take it that just blocking 1.6.0 in global-requirements isn't an > > >> > option? (Would it not work, or just break every project's > > requirements > > >> > job? I could live with the latter since they're broken anyway > > because of > > >> > the sphinx issue below...) > > >> > > >> Because bandit is a "linter" it is in the blacklist in the > > requirements > > >> repo, which means it is not constrained there. Projects are > > expected to > > >> manage the versions of linters they use, and roll forward when > > they are > > >> ready to deal with any new rules introduced by the linters > > (either by > > >> following or disabling them). > > >> > > >> So, no, unfortunately we can't do this globally through the > > requirements > > >> repo right now. > > >> > > >> -- > > >> Doug > > >> > > >> > > > > > > -- > > > > > > Moisés Guimarães > > > > > > Software Engineer > > > > > > Red Hat > > > > > > > > > > -- Doug > > > > > > > > -- > > > > Moisés Guimarães > > > > Software Engineer > > > > Red Hat > > > > > > > -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From openstack at nemebean.com Wed May 15 16:18:51 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 15 May 2019 11:18:51 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: <20190515154931.xhyzutsecqgoqqbl@mthode.org> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <20190515154931.xhyzutsecqgoqqbl@mthode.org> Message-ID: On 5/15/19 10:49 AM, Matthew Thode wrote: > If it helps, upper-constraints still has not been updated (and is -W'd) > > https://review.opendev.org/658767 I'm a little confused by this patch. We don't use upper-constraints for linters or this probably wouldn't have broken us. It looks like that is just updating a test file? > > On 19-05-15 10:38:13, Ben Nemec wrote: >> Yeah, I've just been relying on our cores to not merge the uncap patches >> before we're ready. I'm fine with marking them WIP too though. 
>> >> On 5/15/19 7:55 AM, Moises Guimaraes de Medeiros wrote: >>> Doug, they pass now, and might fail once 1.6.1 is out and the behavior >>> is not fixed, but that will probably need a recheck on a passed job. The >>> -W would be just a reminder not to merge them by mistake. >>> >>> Em qua, 15 de mai de 2019 às 14:52, Doug Hellmann >> > escreveu: >>> >>> Moises Guimaraes de Medeiros >> > writes: >>> >>> > Should uncap patches be -W until next bandit release? >>> >>> I would expect them to fail the linter job until then, so I don't think >>> that's strictly needed. >>> >>> > >>> > Em ter, 14 de mai de 2019 às 17:26, Doug Hellmann >>> > >>> > escreveu: >>> > >>> >> Zane Bitter > writes: >>> >> >>> >> > On 13/05/19 1:40 PM, Ben Nemec wrote: >>> >> >> >>> >> >> >>> >> >> On 5/13/19 12:23 PM, Ben Nemec wrote: >>> >> >>> Nefarious cap bandits are running amok in the OpenStack >>> community! >>> >> >>> Won't someone take a stand against these villainous headwear >>> thieves?! >>> >> >>> >>> >> >>> Oh, sorry, just pasted the elevator pitch for my new novel. ;-) >>> >> >>> >>> >> >>> Actually, this email is to summarize the plan we came up >>> with in the >>> >> >>> Oslo meeting this morning. Since we have a bunch of projects >>> affected >>> >> >>> by the Bandit breakage I wanted to make sure we had a common >>> fix so we >>> >> >>> don't have a bunch of slightly different approaches in each >>> project. >>> >> >>> The plan we agreed on in the meeting was to push a two patch >>> series to >>> >> >>> each repo - one to cap bandit <1.6.0 and one to uncap it with a >>> >> >>> !=1.6.0 exclusion. The first should be merged immediately to >>> unblock >>> >> >>> ci, and the latter can be rechecked once bandit 1.6.1 >>> releases to >>> >> >>> verify that it fixes the problem for us. >>> >> > >>> >> > I take it that just blocking 1.6.0 in global-requirements isn't an >>> >> > option? (Would it not work, or just break every project's >>> requirements >>> >> > job? I could live with the latter since they're broken anyway >>> because of >>> >> > the sphinx issue below...) >>> >> >>> >> Because bandit is a "linter" it is in the blacklist in the >>> requirements >>> >> repo, which means it is not constrained there. Projects are >>> expected to >>> >> manage the versions of linters they use, and roll forward when >>> they are >>> >> ready to deal with any new rules introduced by the linters >>> (either by >>> >> following or disabling them). >>> >> >>> >> So, no, unfortunately we can't do this globally through the >>> requirements >>> >> repo right now. >>> >> >>> >> -- >>> >> Doug >>> >> >>> >> >>> > >>> > -- >>> > >>> > Moisés Guimarães >>> > >>> > Software Engineer >>> > >>> > Red Hat >>> > >>> > >>> >>> -- Doug >>> >>> >>> >>> -- >>> >>> Moisés Guimarães >>> >>> Software Engineer >>> >>> Red Hat >>> >>> >>> >> > From mthode at mthode.org Wed May 15 16:23:12 2019 From: mthode at mthode.org (Matthew Thode) Date: Wed, 15 May 2019 11:23:12 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <20190515154931.xhyzutsecqgoqqbl@mthode.org> Message-ID: <20190515162312.goxfuo3uoijlqj7l@mthode.org> On 19-05-15 11:18:51, Ben Nemec wrote: > > > On 5/15/19 10:49 AM, Matthew Thode wrote: > > If it helps, upper-constraints still has not been updated (and is -W'd) > > > > https://review.opendev.org/658767 > > I'm a little confused by this patch. 
We don't use upper-constraints for > linters or this probably wouldn't have broken us. It looks like that is just > updating a test file? > You're right, I'm not sure why that was done, commented on the review (going to suggest abandoning it). -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From doug at doughellmann.com Wed May 15 16:52:05 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 15 May 2019 12:52:05 -0400 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> Message-ID: Moises Guimaraes de Medeiros writes: > Doug, they pass now, and might fail once 1.6.1 is out and the behavior is > not fixed, but that will probably need a recheck on a passed job. The -W > would be just a reminder not to merge them by mistake. Oh, I guess I assumed we would only be going through this process for repos that are broken. It makes sense to be consistent across all of them, though, if that was the goal. -- Doug From jungleboyj at gmail.com Wed May 15 17:13:44 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 15 May 2019 12:13:44 -0500 Subject: [cinder] ACTION REQUIRED: Updating weekly meeting ping list Message-ID: <7cd8f3b7-e474-e76c-05b3-6083ee92d3f1@gmail.com> Team, There has been discussion recently about whether or not courtesy pings are good practice or not.  The decision from the Cinder team is that we feel they are useful and we want to continue to use them.  Our courtesy ping list, however, has become quite stale. So, we have decided to remove the existing list and allow people to add their names back in if they wish to continue to be pinged for the weekly meeting. I am going to keep the existing list one more week to give people time to see this e-mail and make changes.  If, however, your name is not on the new list that I have started here [1] the assumption will be that you don't want to be pinged when the Cinder meetings start. Thanks for your attention to this change.  In the future we will plan to refresh the ping list at the beginning of each release. Hope to see you in our weekly meetings! Thanks! Jay [1] https://etherpad.openstack.org/p/cinder-train-meetings From mss at datera.io Wed May 15 17:16:12 2019 From: mss at datera.io (Matt Smith) Date: Wed, 15 May 2019 12:16:12 -0500 Subject: [Cinder][Tempest] All-plugin is deprecated, what do we do now? Message-ID: <20190515171612.lkrv36grx2rjggh6@Matthews-MacBook-Pro-2.local> Hey Folks, First time posting to the mailing list, so hopefully I got this right. Anyways, Tempest complains when I use 'all-plugin' and says it's deprecated. Cinder has Tempest plugins that were triggered by using 'all-plugin' instead of 'all' and now I don't know how I'm supposed to run those plugins as well. My guess is that 'all' now includes plugins? Or maybe there's an alternative for running plugins? 
Regards, Matt --- Matthew Smith Software Engineer mss at datera.io 214-223-0887 From fungi at yuggoth.org Wed May 15 17:20:04 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 15 May 2019 17:20:04 +0000 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> Message-ID: <20190515172004.c3ilkvn5bql4g246@yuggoth.org> On 2019-05-15 12:52:05 -0400 (-0400), Doug Hellmann wrote: > Moises Guimaraes de Medeiros writes: > > > Doug, they pass now, and might fail once 1.6.1 is out and the behavior is > > not fixed, but that will probably need a recheck on a passed job. The -W > > would be just a reminder not to merge them by mistake. > > Oh, I guess I assumed we would only be going through this process for > repos that are broken. It makes sense to be consistent across all of > them, though, if that was the goal. Only doing it for projects which actually hit that problem seems like a reasonable approach, since we don't expect them to all coordinate on a common version of static analyzers and linters anyway (hence bandit being in the constraints blacklist to start with). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From aspiers at suse.com Wed May 15 17:37:03 2019 From: aspiers at suse.com (Adam Spiers) Date: Wed, 15 May 2019 18:37:03 +0100 Subject: [First Contact] [SIG] Summit/Forum + PTG Summary In-Reply-To: References: <20190514150657.hshqfcjsa35t57yb@pacific.linksys.moosehall> Message-ID: <20190515173703.uj2pigyevxm7x27n@arabian.linksys.moosehall> Kendall Nelson wrote: >Thanks for starting a list Adam :) [snipped] >> Few things off the top of my head: >> >> - Architectural overview >> >> - Quickstart for getting the code running in the simplest form >> (even if this is just "use devstack with these parameters") >> >> - Overview of all the project's git repos, and the layout of the files >> in each >> >> - How to run the various types of tests >> >> - How to find some easy dev tasks to get started with >> > >I would also add what task trackers they use and the tags they use as an >extension of your last bullet point. > >Also might include info about if they use specs or bps or neither for new >features. OK, this is now all in a new etherpad: https://etherpad.openstack.org/p/onboarding-docs-mvp Anyone else got any suggestions? From aspiers at suse.com Wed May 15 17:51:10 2019 From: aspiers at suse.com (Adam Spiers) Date: Wed, 15 May 2019 18:51:10 +0100 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: References: <20190503190538.GB3377@localhost.localdomain> Message-ID: <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> Hi Jim, Jim Rollenhagen wrote: >On Fri, May 3, 2019 at 3:05 PM Paul Belanger wrote: > >> On Fri, May 03, 2019 at 08:48:10PM +0200, Roman Gorshunov wrote: >> > Hello Jim, team, >> > >> > I'm from Airship project. I agree with archival of Github mirrors of >> > repositories. One small suggestion: could we have project descriptions >> > adjusted to point to the new location of the source code repository, >> > please? E.g. "The repo now lives at opendev.org/x/y". >> > >> This is something important to keep in mind from infra side, once the >> repo is read-only, we lose the ability to use the API to change it. 
>> >> From manage-projects.py POV, we can update the description before >> flipping the archive bit without issues, just need to make sure we have >> the ordering correct. > >Agree this is a good idea. Just checking you saw my reply to the same email from Paul? http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005846.html >There's been no objections to this plan for some time now. I might be missing some context, but I *think* my email could be interpreted as an objection based on the reasons listed in it. Also, the understanding I took away from the PTG session was that there was consensus *not* to archive repos, but rather to ensure that mirroring and redirects are set up properly. However I am of course very willing to be persuaded otherwise. Please could you take a look at that mail and let us know what you think? Thanks! Adam From molenkam at uwo.ca Wed May 15 17:51:31 2019 From: molenkam at uwo.ca (Gary Molenkamp) Date: Wed, 15 May 2019 17:51:31 +0000 Subject: [Cinder][Ceph] Migrating backend host Message-ID: <11c12057-6213-2073-8f73-a693ec798cd6@uwo.ca> I am moving my cinder-volume service from one controller to another, and I'm trying to determine the correct means to update all existing volume's back-end host reference.  I now have two cinder volume services running in front of the same ceph cluster, and I would like to retire the old cinder-volume service. For example, on my test cloud, "openstack volume service list": +------------------+----------------------------------+------+---------+-------+----------------------------+ | Binary           | Host                             | Zone | Status  | State | Updated At                 | +------------------+----------------------------------+------+---------+-------+----------------------------+ | cinder-scheduler | osdev-ctrl1     | nova | enabled | up    | 2019-05-15T17:44:05.000000 | | cinder-volume    | osdev-ctrl2 at rbd | nova | disabled | up    | 2019-05-15T17:44:04.000000 | | cinder-volume    | osdev-ctrl1 at rbd | nova | enabled | up    | 2019-05-15T17:44:00.000000 | +------------------+----------------------------------+------+---------+-------+----------------------------+ Now, an existing volume has a reference to the disabled cinder-volume:     "os-vol-host-attr:host          | osdev-ctrl2 at rbd#rbd" but this needs to be changed to:     "os-vol-host-attr:host          | osdev-ctrl1 at rbd#rbd" As both controllers are members of the same ceph cluster, an "openstack volume migrate" is not appropriate.  If it is appropriate, my testing has shown that it errors out and deletes the source volume from ceph. I can alter this field manually in the cinder database,  but in following the "don't mess with the data model" mantra, is there a means to do this from the cli? Thanks, Gary. Openstack release: Queens. 
Distro: Redhat (Centos) 7.6 -- Gary Molenkamp Computer Science/Science Technology Services Systems Administrator University of Western Ontario molenkam at uwo.ca http://www.csd.uwo.ca (519) 661-2111 x86882 (519) 661-3566 From jim at jimrollenhagen.com Wed May 15 18:05:52 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Wed, 15 May 2019 14:05:52 -0400 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> References: <20190503190538.GB3377@localhost.localdomain> <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> Message-ID: On Wed, May 15, 2019 at 1:51 PM Adam Spiers wrote: > Hi Jim, > > Jim Rollenhagen wrote: > >On Fri, May 3, 2019 at 3:05 PM Paul Belanger > wrote: > > > >> On Fri, May 03, 2019 at 08:48:10PM +0200, Roman Gorshunov wrote: > >> > Hello Jim, team, > >> > > >> > I'm from Airship project. I agree with archival of Github mirrors of > >> > repositories. One small suggestion: could we have project descriptions > >> > adjusted to point to the new location of the source code repository, > >> > please? E.g. "The repo now lives at opendev.org/x/y". > >> > > >> This is something important to keep in mind from infra side, once the > >> repo is read-only, we lose the ability to use the API to change it. > >> > >> From manage-projects.py POV, we can update the description before > >> flipping the archive bit without issues, just need to make sure we have > >> the ordering correct. > > > >Agree this is a good idea. > > Just checking you saw my reply to the same email from Paul? > > http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005846.html Sorry, yes I saw it, but then later mis-remembered it. :( > > > >There's been no objections to this plan for some time now. > > I might be missing some context, but I *think* my email could be > interpreted as an objection based on the reasons listed in it. > > Also, the understanding I took away from the PTG session was that > there was consensus *not* to archive repos, but rather to ensure that > mirroring and redirects are set up properly. However I am of course > very willing to be persuaded otherwise. > > Please could you take a look at that mail and let us know what you > think? Thanks! > So there's two things to do, in this order: 1) do a proper transfer of the Airship repos 2) Archive any other repos on github that are no longer in the openstack namespace. Has the airship team been working with the infra team on getting the transfer done? I would think that could be done quickly, and then we can proceed with archiving the others. // jim > Adam > -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Wed May 15 18:08:32 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 15 May 2019 13:08:32 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> Message-ID: <014592f6-6d03-a44b-cda3-0007ca2c3c29@nemebean.com> On 5/15/19 11:52 AM, Doug Hellmann wrote: > Moises Guimaraes de Medeiros writes: > >> Doug, they pass now, and might fail once 1.6.1 is out and the behavior is >> not fixed, but that will probably need a recheck on a passed job. The -W >> would be just a reminder not to merge them by mistake. > > Oh, I guess I assumed we would only be going through this process for > repos that are broken. 
It makes sense to be consistent across all of > them, though, if that was the goal. > The reason they pass right now is that there is no newer release than 1.6.0, so the != exclusion is effectively the same as the < cap. Once 1.6.1 releases that won't be the case, but in the meantime it means that both forms behave the same. The reason we did it this way is to prevent 1.6.1 from blocking all of the repos again if it doesn't fix the problem or introduces a new one. If so, it blocks the uncapping patches only and we can deal with it on our own schedule. From fungi at yuggoth.org Wed May 15 18:36:52 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 15 May 2019 18:36:52 +0000 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> References: <20190503190538.GB3377@localhost.localdomain> <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> Message-ID: <20190515183652.scwftsgzubdj7wg6@yuggoth.org> On 2019-05-15 18:51:10 +0100 (+0100), Adam Spiers wrote: [...] > Also, the understanding I took away from the PTG session was that > there was consensus *not* to archive repos, but rather to ensure that > mirroring and redirects are set up properly. However I am of course > very willing to be persuaded otherwise. [...] Just to back up Jim's clarification, the idea is to transfer/redirect any repos which request it, and archive the rest. Archival itself is a non-destructive operation anyway, and we should still be able to unarchive and transfer others later if asked. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Wed May 15 18:40:34 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 15 May 2019 18:40:34 +0000 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: <014592f6-6d03-a44b-cda3-0007ca2c3c29@nemebean.com> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <014592f6-6d03-a44b-cda3-0007ca2c3c29@nemebean.com> Message-ID: <20190515184034.ooivbn6btshi7nqn@yuggoth.org> On 2019-05-15 13:08:32 -0500 (-0500), Ben Nemec wrote: [...] > The reason we did it this way is to prevent 1.6.1 from blocking > all of the repos again if it doesn't fix the problem or introduces > a new one. If so, it blocks the uncapping patches only and we can > deal with it on our own schedule. Normally, if it had been treated like other linters, projects should have been guarding against unanticipated upgrades by specifying something like a <1.6.0 version and then expressly advancing that cap at the start of a new cycle when they're prepared to deal with fixing whatever problems are identified. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at nemebean.com Wed May 15 18:47:03 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 15 May 2019 13:47:03 -0500 Subject: [oslo][all] Ending courtesy pings In-Reply-To: <20190514194650.cuzrwon3aquhrfq4@yuggoth.org> References: <20190514194650.cuzrwon3aquhrfq4@yuggoth.org> Message-ID: On 5/14/19 2:46 PM, Jeremy Stanley wrote: > On 2019-05-14 11:58:03 -0500 (-0500), Ben Nemec wrote: > [...] 
>> The recommendation was for interested parties to set up custom >> highlights on the "#startmeeting oslo" (or whichever meeting) >> command. > [...] > > Cross-sections of our community have observed similar success with > "group highlight" strings (infra-root, tc-members, zuul-maint and so > on) where the folks who want to get notified as a group can opt to > add these custom strings to their client configurations. > >> people didn't know how to configure their IRC client to do this. > > For those using WeeChat, the invocation could be something like this > in your core buffer: > > /set weechat.look.highlight_regex #startmeeting (oslo|tripleo) > /save > > Or you could similarly set the corresponding line in the [look] > section of your ~/.weechat/weechat.conf file and then /reload it: > > highlight_regex = "#startmeeting (oslo|tripleo)" > > Extend the (Python flavored) regex however makes sense. > > https://www.weechat.org/files/doc/stable/weechat_user.en.html#option_weechat.look.highlight_regex One other thing I forgot to mention was a suggestion that having dev docs for some of the common IRC clients would be helpful with this. It wasn't terribly hard for me to do in Quassel, but I've heard from IRC Cloud users that they weren't able to find a way to do this. I've never used it so I'm not much help. > >> Once you do configure it, there's a testing problem in that you >> don't get notified of your own messages, so you basically have to >> wait for the next meeting and hope you got it right. Or pull >> someone into a private channel and have them send a startmeeting >> command, which is a hassle. It isn't terribly complicated, but if >> it isn't tested then it's assumed broken. :-) > > Or temporarily add one for a meeting you know is about to happen on > some channel to make sure you have the correct configuration option > and formatting, at least. True, and this is what I ended up doing. :-) > >> The other concern was that this process would have to be done any >> time someone changes IRC clients, whereas the ping list was a >> central thing that always applies no matter where you're >> connecting from. > [...] > > I may be an atypical IRC user, but this is far from the most > complicated part of my configuration which would need to be migrated > to a new client. Then again, I've only changed IRC clients roughly > every 8 years (so I guess I'm due to move to a 5th one next year). > Yeah, I can't say this is a huge issue for me personally, but it was mentioned in the meeting so I included it. From fungi at yuggoth.org Wed May 15 18:50:56 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 15 May 2019 18:50:56 +0000 Subject: [oslo][all] Ending courtesy pings In-Reply-To: References: <20190514194650.cuzrwon3aquhrfq4@yuggoth.org> Message-ID: <20190515185056.vg6wueqi6brnnjeh@yuggoth.org> On 2019-05-15 13:47:03 -0500 (-0500), Ben Nemec wrote: [...] > One other thing I forgot to mention was a suggestion that having > dev docs for some of the common IRC clients would be helpful with > this. It wasn't terribly hard for me to do in Quassel, but I've > heard from IRC Cloud users that they weren't able to find a way to > do this. I've never used it so I'm not much help. [...] Having a tips-n-tricks section in https://docs.openstack.org/infra/manual/irc.html for things like this might make sense, if folks want to propose additions there. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at nemebean.com Wed May 15 18:55:10 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 15 May 2019 13:55:10 -0500 Subject: [oslo][all] Ending courtesy pings In-Reply-To: References: Message-ID: <9953e02c-7c0a-e382-4c53-552bf337dbae@nemebean.com> On 5/14/19 12:42 PM, Morgan Fainberg wrote: > (curated at the start of each cycle administratively so inactive folks > weren't constantly pinged) We discussed this some in the Cinder meeting today and I suggested maybe clearing the list each cycle and requiring people to re-subscribe. That way we don't end up with stale entries and people being pinged undesirably. It's a little tricky for me to clean up the list manually since I don't want to remove liaisons, but I'm not sure some of them actually attend the meeting anymore. Plus that might encourage people to set up client-side notifications so they don't have to mess with it every six months. If the ping list ends up empty (or not) then it's a pretty good indication of people's preferences. From openstack at nemebean.com Wed May 15 19:07:41 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 15 May 2019 14:07:41 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: <20190515184034.ooivbn6btshi7nqn@yuggoth.org> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <014592f6-6d03-a44b-cda3-0007ca2c3c29@nemebean.com> <20190515184034.ooivbn6btshi7nqn@yuggoth.org> Message-ID: <1a187a01-d991-06a9-c002-967b803406ac@nemebean.com> On 5/15/19 1:40 PM, Jeremy Stanley wrote: > On 2019-05-15 13:08:32 -0500 (-0500), Ben Nemec wrote: > [...] >> The reason we did it this way is to prevent 1.6.1 from blocking >> all of the repos again if it doesn't fix the problem or introduces >> a new one. If so, it blocks the uncapping patches only and we can >> deal with it on our own schedule. > > Normally, if it had been treated like other linters, projects should > have been guarding against unanticipated upgrades by specifying > something like a <1.6.0 version and then expressly advancing that > cap at the start of a new cycle when they're prepared to deal with > fixing whatever problems are identified. > Yeah, I guess I don't know why we weren't doing that with bandit. Maybe just that it hadn't broken us previously, in which case we might want to drop the uncap patches entirely. From mriedemos at gmail.com Wed May 15 20:34:43 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 15 May 2019 15:34:43 -0500 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: <768df039575b8cad3ee3262230534d7309f9c09f.camel@redhat.com> References: <768df039575b8cad3ee3262230534d7309f9c09f.camel@redhat.com> Message-ID: <36739406-7b0f-8d76-5fb5-63548063aa37@gmail.com> On 5/13/2019 8:02 AM, Stephen Finucane wrote: >> * server migrate --live: deprecate the --live option and add a new >> --live-migration option and a --host option > Could I suggest we don't bother deprecating the '--live' option in 3.x > and simply rework it for 4.0? This is of course assuming the 4.0 > release date is months away and not years, of course. > But, why? If we're going to get some stuff into 3.x yet, why not deprecate the bad thing so it's there in the version most people will be using for a long time before re-writing their stuff to work with 4.0? The deprecation is the easy bit. 
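For anyone skimming the thread, a rough sketch of the two invocations being discussed (option names taken from the summary above; the host and server names here are made up; the exact syntax is whatever finally merges in the OSC patch): 

    # current, to-be-deprecated form: --live takes the target host as its value 
    openstack server migrate --live target-host01 my-server 

    # proposed form: --live-migration is a boolean flag, --host names the target 
    openstack server migrate --live-migration --host target-host01 my-server 
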
-- Thanks, Matt From mriedemos at gmail.com Wed May 15 20:37:19 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 15 May 2019 15:37:19 -0500 Subject: Any ideas on fixing bug 1827083 so we can merge code? In-Reply-To: References: <20190509135517.7j7ccyyxzp2yneun@yuggoth.org> Message-ID: On 5/11/2019 6:57 PM, Mohammed Naser wrote: > is it possible that this is because > `mirror.sjc1.vexxhost.openstack.org` does not actually have AAAA > records that causes this? > > On Thu, May 9, 2019 at 9:58 AM Jeremy Stanley wrote: >> On 2019-05-09 08:49:35 -0500 (-0500), Eric Fried wrote: >>> Have we tried changing the URI to >>> https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt >>> to avoid the redirecting? >>> >>> On 5/9/19 8:02 AM, Matt Riedemann wrote: >>>> I'm not sure what is causing the bug [1] but it's failing at a really >>>> high rate for about week now. Do we have ideas on the issue? Do we have >>>> thoughts on a workaround? Or should we disable the vexxhost-sjc1 >>>> provider until it's solved? >>>> >>>> [1]http://status.openstack.org/elastic-recheck/#1827083 >> I have to assume the bug report itself is misleading. Jobs should be >> using the on-disk copy of the requirements repository provided by >> Zuul for this and not retrieving that file over the network. However >> the problem is presumably DNS resolution not working at all on those >> nodes, so something is going to break at some point in the job in >> those cases regardless. >> -- >> Jeremy Stanley Just an update on this for those watching at home. Clark switched the vexxhost-sjc1 nodes to IPv4: https://review.opendev.org/#/c/659140/ https://review.opendev.org/#/c/659181/ And early results in logstash [1] show that might have worked around the problem for now. [1] http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22ERROR%3A%20Could%20not%20install%20packages%20due%20to%20an%20EnvironmentError%3A%20HTTPSConnectionPool(host%3D'git.openstack.org'%2C%20port%3D443)%3A%20Max%20retries%20exceeded%20with%20url%3A%20%2Fcgit%2Fopenstack%2Frequirements%2Fplain%2Fupper-constraints.txt%20(Caused%20by%20NewConnectionError('%3Cpip._vendor.urllib3.connection.VerifiedHTTPSConnection%20object%20at%5C%22%20AND%20message%3A%5C%22Failed%20to%20establish%20a%20new%20connection%3A%20%5BErrno%20-3%5D%20Temporary%20failure%20in%20name%20resolution'%2C))%5C%22%20AND%20tags%3A%5C%22console%5C%22%20AND%20voting%3A1&from=1d -- Thanks, Matt From mriedemos at gmail.com Wed May 15 20:45:35 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 15 May 2019 15:45:35 -0500 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: References: Message-ID: <02baeaf3-5202-d506-a312-70d5e9f68071@gmail.com> Regarding owners / progress on these... On 5/10/2019 11:48 AM, Dean Troyer wrote: > Also relevant to OSC but covered in a Nova Forum session[3,4], highlights: > > * boot-from-volume: Support type=image for the --block-device-mapping, > and Add a --boot-from-volume option which will translate to a root > --block-device-mapping using the provided --image value Artom signed up for this but I don't see patches yet. > > * server migrate --live: deprecate the --live option and add a new > --live-migration option and a --host option > > * compute migration: begin exposing this resource in the CLI As far as I know we don't have owners for these yet. I'm a bit swamped right now otherwise I'd start working on the live migration plan we agreed on at the forum [1]. 
I can take a crack at starting a patch for the live migration thing but will likely need some peer developer synergy action super happy fun time for things like docs/tests/reno. [1] https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps -- Thanks, Matt From mjturek at linux.vnet.ibm.com Wed May 15 20:49:04 2019 From: mjturek at linux.vnet.ibm.com (Michael Turek) Date: Wed, 15 May 2019 16:49:04 -0400 Subject: [devstack][tempest][neutron][ironic] Trouble debugging networking issues with devstack/lib/tempest Message-ID: <55e560da-1c96-e93a-8a1f-799a24cd23dc@linux.vnet.ibm.com> Hey all, We've been having networking issues with our ironic CI job and I'm a bit blocked. Included is a sample run of the job failing [0]. Our job creates a single flat public network at [1]. Everything goes fine until we hit 'lib/tempest'. The job fails trying to create a network [2]. At first I thought this could be because we configure a single physnet which is used by the 'public' network, however I've tried jumping on the node, deleting the 'public' network and creating the network mentioned in lib/tempest and it still fails with: Error while executing command: HttpException: 503, Unable to create the network. No tenant network is available for allocation. Does anyone have any insight into what we're not configuring properly? Any help would be greatly appreciated! Thanks! Mike Turek [0] https://dal05.objectstorage.softlayer.net/v1/AUTH_3d8e6ecb-f597-448c-8ec2-164e9f710dd6/pkvmci/ironic/71/658271/4/check-ironic/tempest-dsvm-ironic-agent_ipmitool/30c047e/devstacklog.txt.gz [1] https://github.com/openstack/devstack/blob/master/stack.sh#L1348 [2] https://github.com/openstack/devstack/blob/master/lib/tempest#L256 -------------- next part -------------- An HTML attachment was scrubbed... URL: From aspiers at suse.com Wed May 15 21:00:55 2019 From: aspiers at suse.com (Adam Spiers) Date: Wed, 15 May 2019 22:00:55 +0100 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: <20190515183652.scwftsgzubdj7wg6@yuggoth.org> References: <20190503190538.GB3377@localhost.localdomain> <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> <20190515183652.scwftsgzubdj7wg6@yuggoth.org> Message-ID: <20190515210055.qw3kmb5q3fcvntm4@pacific.linksys.moosehall> Jeremy Stanley wrote: >On 2019-05-15 18:51:10 +0100 (+0100), Adam Spiers wrote: >[...] >>Also, the understanding I took away from the PTG session was that >>there was consensus *not* to archive repos, but rather to ensure that >>mirroring and redirects are set up properly. However I am of course >>very willing to be persuaded otherwise. >[...] > >Just to back up Jim's clarification, the idea is to >transfer/redirect any repos which request it, and archive the rest. >Archival itself is a non-destructive operation anyway, and we should >still be able to unarchive and transfer others later if asked. Yes, that makes sense. Thanks both for the replies! From kennelson11 at gmail.com Wed May 15 21:32:03 2019 From: kennelson11 at gmail.com (Kendall Nelson) Date: Wed, 15 May 2019 14:32:03 -0700 Subject: Summit video website shenanigans In-Reply-To: <363a03ca4dfe607224762944e2d43d081d2bdc1c.camel@redhat.com> References: <21ce1f4d-2e19-589f-3bce-44f411a22e67@gmail.com> <363a03ca4dfe607224762944e2d43d081d2bdc1c.camel@redhat.com> Message-ID: Noted. Will pass that along as well. 
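For comparison, a minimal sketch of the tenant-network settings a stock devstack run normally ends up with (variable and option names below are the usual devstack/ML2 ones, shown only as an illustration; check what your job actually renders into the config): 

    # devstack local.conf 
    Q_ML2_TENANT_NETWORK_TYPE=vxlan 

    # which lands in ml2_conf.ini roughly as 
    [ml2] 
    tenant_network_types = vxlan 
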
Thanks for the heads up Sean :) -Kendall (diablo_rojo) On Tue, May 14, 2019 at 6:31 PM Sean Mooney wrote: > On Tue, 2019-05-14 at 15:42 -0700, Kendall Nelson wrote: > > I let various Foundation staff know- Jimmy and others- and supposedly its > > been fixed now already? They are doing some more research to see if there > > are other videos facing the same issues, but it should all be fine now? > the nova cells v2 video is still only 25 mins > https://www.youtube.com/watch?v=OyNFIOSGjac > so while > > https://www.openstack.org/videos/summits/denver-2019/the-vision-for-openstack-clouds-explained > does seam to be fixed the missing 15mins for the start of the cellsv2 > video has not been adressed > > > > > -Kendall (diablo_rojo) > > > > On Tue, May 14, 2019 at 3:00 PM Matt Riedemann > wrote: > > > > > On Sunday I was able to pull up several of the summit videos on youtube > > > via the summit phone app, e.g. [1] but when trying to view the same > > > videos from the summit video website I'm getting errors [2]. > > > > > > Is there just something still in progress with linking the videos on > the > > > https://www.openstack.org/videos/ site? > > > > > > Note that in one case 15 minutes of a talk was chopped (40 minute talk > > > chopped to 25 minutes) [3]. I reckon I should take that up directly > with > > > the speakersupport at openstack.org team though? > > > > > > [1] https://www.youtube.com/watch?v=YdSVY4517sE > > > [2] > > > > > > > https://www.openstack.org/videos/summits/denver-2019/the-vision-for-openstack-clouds-explained > > > [3] https://www.youtube.com/watch?v=OyNFIOSGjac > > > > > > -- > > > > > > Thanks, > > > > > > Matt > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From haleyb.dev at gmail.com Wed May 15 21:37:21 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Wed, 15 May 2019 17:37:21 -0400 Subject: [devstack][tempest][neutron][ironic] Trouble debugging networking issues with devstack/lib/tempest In-Reply-To: <55e560da-1c96-e93a-8a1f-799a24cd23dc@linux.vnet.ibm.com> References: <55e560da-1c96-e93a-8a1f-799a24cd23dc@linux.vnet.ibm.com> Message-ID: <9096f5ca-de93-9df4-7e7b-a48832c79643@gmail.com> On 5/15/19 4:49 PM, Michael Turek wrote: > Hey all, > > We've been having networking issues with our ironic CI job and I'm a bit > blocked. Included is a sample run of the job failing [0]. Our job > creates a single flat public network at [1]. Everything goes fine until > we hit 'lib/tempest'. The job fails trying to create a network [2]. > > At first I thought this could be because we configure a single physnet > which is used by the 'public' network, however I've tried jumping on the > node, deleting the 'public' network and creating the network mentioned > in lib/tempest and it still fails with: > > Error while executing command: HttpException: 503, Unable to create the network. No tenant network is available for allocation. > > > Does anyone have any insight into what we're not configuring properly? > Any help would be greatly appreciated! Just going based on your log: OPTS=tenant_network_types=flat But tenant flat networks are not supported. It's typically going to be vxlan or local. Did you change a setting or did this just start breaking recently? 
-Brian From openstack at fried.cc Wed May 15 21:50:58 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 15 May 2019 16:50:58 -0500 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <20190515131109.GJ17214@paraplu> References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> Message-ID: <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> (NB: I'm explicitly rendering "no opinion" on several items below so you know I didn't miss/ignore them.) > The other day I casually noticed that the above file is missing some > important CPU flags I think this is noteworthy. These traits are being proposed because you casually noticed they were missing, not because someone asked for them. We can invent use cases, but without demand we may just be spinning our wheels. >> So, theoretically there is scope for "exploiting" (but non-trivial) > it is trivial all you would have to do is I'm not a security guy, but I'm pretty sure it doesn't matter whether it's trivial; if it's possible at all, that's bad. That being the case, you don't even have to be able to target a vulnerable host for it to be a security problem. If my cloud is set up so that Joe Hacker is able to land his instance on a vulnerable host even by randomly trying, I done effed up already. >>> There's no consensus here. Some think that we should _not_ allow those >>> CPU flags as traits which can 'allow' you to target vulnerable hosts. >> >> for what its worth im in this camp and have said so in other places >> where we have been disucssing it. > > Yep, noted. My position is that it's not harmful to add them to os-traits; it's whether/how they're used in nova that needs some thought. >>> Does the Security Team has any strong opinions? Still hoping someone speaks up in this capacity... >>> If there is consensus on dropping those CPU-flags-as-traits that let you >>> target vulnerable hosts, drop them. And add only those CPU flags as >>> traits that provide either 'features' (what's the definition?) or those >>> that reduce performance degradation. >>> >> my vote is for only adding tratis for cpu featrue. > > Noted; I'd like to hear other opinions. (And note that the word > "feature" can get fuzzy in this context, I'll assume we're using it > somewhat loosely to include things that help with reducing perf > degradation, etc.) I abstain. Once again, presence in os-traits is harmless; use by nova is subject to further discussion. But we also don't have any demand (that I'm aware of). However, I'll state again for the record that vendor-specific "positive" traits (indicating "has mitigation", "not vulnerable", etc.) are nigh worthless for the Nova scheduling use case of "land me on a non-vulnerable host" because, until you can say required=in:HW_CPU_X86_INTEL_FIX,HW_CPU_X86_AMD_FIX, you would have to pick your CPU vendor ahead of time. >> PCID is a CPU feautre that was designed as a performce optiomistation I'm staying well away from the what-is-a-feature discussion, mainly out of ignorance. >>> Some think this is not "Nova's business", because: "just like how you >>> don't want to stop based on CPU fan speed or temperature or firmware >>> patch levels ...". IMO this (cpu flags/features/attributes, and even possibly firmware patch levels, though probably not fan speed or temperature) is a perfectly suitable use of traits. Not all traits have to feed into Nova scheduling decisions; they could also be used by e.g. 
external orchestrators. os-traits needs to have that more global not-just-Nova perspective. (Disclaimer: I'm a card-carrying "trait libertarian": freedom to do what makes sense with traits, as long as you're not hurting anyone and it's not costing the taxpayers.) > Okay, "stopping" / "refusing to launch" is too strict > and unresonable; scratch that. I agree with this, for all the reasons stated. > we can potentially make Nova > check the 'sysfs' directory for vulnerabilities. IMO this is still a good idea, but rather than warning / refusing to boot, we could expose a roll-up trait, subject to the strawman design below. To summarize my position on the os-traits side of things: - We can merge the feature-ish traits (assuming folks can agree on which ones those are). - We can merge the vulnerability traits as long as they come with nice comments explaining the potential security pitfalls around using them. - Or for all I care we can merge nothing, since we don't actually seem to have a demand for it. ========================== I'm going to dive into Nova-land now. The below would need a blueprint and a spec. And an owner. And it would be nice if it also had demand. If we want to make scheduling decisions based on vulnerabilities, it needs to be under the exclusive control of the admin. As mentioned above, exposing the traits and allowing untrusted/untrustworthy users to target vulnerable hosts is only marginally worse than having those vulnerable hosts available to said untrusted users at all. So if we are going to have virt drivers expose a VULNERABLE trait in any form, it should come with: 1) a config option in the spirit of: [scheduler] allow_scheduling_to_vulnerable_hosts = $bool (default: False) which, when False, causes the scheduler to add trait:VULNERABLE=forbidden to *all* GET /a_c requests. But we should generalize this to: (a) Maintain a hardcoded list of traits that represent vulnerabilities or other undesirables (b) Have the conf option be [scheduler]evil_trait_whitelist (c) Add [trait:$X=forbidden for $X in {(b) - (a)}] 2) a hard check to disallow trait:$X=required from *anywhere* (flavor, image, etc.) regardless of the conf option. Either reject the boot request or explicitly strip that out. For completeness, note that these traits need to be "negative" (i.e. "has vulnerability") so that we can forbid them in a list in the GET /a_c request. Because required=!INTEL_VULNERABLE,!AMD_VULNERABLE will correctly avoid vulnerable hosts from either vendor, but required=INTEL_FIXED,AMD_FIXED won't land anywhere, and we don't have required=in:INTEL_FIXED,AMD_FIXED yet. efried . From mriedemos at gmail.com Wed May 15 22:06:26 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 15 May 2019 17:06:26 -0500 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: <02baeaf3-5202-d506-a312-70d5e9f68071@gmail.com> References: <02baeaf3-5202-d506-a312-70d5e9f68071@gmail.com> Message-ID: On 5/15/2019 3:45 PM, Matt Riedemann wrote: >> >> * server migrate --live: deprecate the --live option and add a new >> --live-migration option and a --host option >> > > As far as I know we don't have owners for these yet. I'm a bit swamped > right now otherwise I'd start working on the live migration plan we > agreed on at the forum [1]. I can take a crack at starting a patch for > the live migration thing but will likely need some peer developer > synergy action super happy fun time for things like docs/tests/reno. 
> > [1] https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps Despite my dramatic cry for help I've managed to crank out a start for the live migration fix: https://review.opendev.org/#/c/659382/ -- Thanks, Matt From fungi at yuggoth.org Wed May 15 22:11:46 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 15 May 2019 22:11:46 +0000 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> Message-ID: <20190515221145.codjzvxitduj3bhw@yuggoth.org> On 2019-05-15 16:50:58 -0500 (-0500), Eric Fried wrote: > (NB: I'm explicitly rendering "no opinion" on several items below so you > know I didn't miss/ignore them.) [...] (NBNB: Kashyap asked to be Cc'd as a non-subscriber, so I have added him back but you may want to forward him a copy of your reply.) > >>> Does the Security Team has any strong opinions? > > Still hoping someone speaks up in this capacity... [...] I've added a link to this thread on the agenda for tomorrow's Security SIG meeting[*] in order to attempt to raise the visibility a bit with other members of the SIG. [*] http://eavesdrop.openstack.org/#Security_SIG_meeting -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From mnaser at vexxhost.com Wed May 15 22:20:39 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 16 May 2019 06:20:39 +0800 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <20190515221145.codjzvxitduj3bhw@yuggoth.org> References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> <20190515221145.codjzvxitduj3bhw@yuggoth.org> Message-ID: On Thu, May 16, 2019 at 6:14 AM Jeremy Stanley wrote: > > On 2019-05-15 16:50:58 -0500 (-0500), Eric Fried wrote: > > (NB: I'm explicitly rendering "no opinion" on several items below so you > > know I didn't miss/ignore them.) > [...] > > (NBNB: Kashyap asked to be Cc'd as a non-subscriber, so I have added > him back but you may want to forward him a copy of your reply.) > > > >>> Does the Security Team has any strong opinions? > > > > Still hoping someone speaks up in this capacity... > [...] > > I've added a link to this thread on the agenda for tomorrow's > Security SIG meeting[*] in order to attempt to raise the visibility > a bit with other members of the SIG. > > [*] http://eavesdrop.openstack.org/#Security_SIG_meeting > -- > Jeremy Stanley I'm actually on the side of adding all the traits (cpu flags) and letting the operator make sure that their cloud is patched. We don't want to make assumption on behalf of the user, if I am $chip_manufacturer and I want to use OpenStack to do CI for regression testing of these, then I don't have the ability to do it. The solution of introducing a flag that says "it's okay if it's vulnerable" opens a whole can of worms on a) keeping up to date with all thee different vulnerabilities and b) potentially causing a lot of upgrade surprises when all of a sudden the flag you relied on is now all of a sudden a "banned" one. 
I think we should empower our operators and let them decide what to do with their clouds. These recent CPU vulnerabilities are very 'massive' in terms of "PR" so usually most people know about them. -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From dms at danplanet.com Wed May 15 22:31:04 2019 From: dms at danplanet.com (Dan Smith) Date: Wed, 15 May 2019 15:31:04 -0700 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> (Eric Fried's message of "Wed, 15 May 2019 16:50:58 -0500") References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> Message-ID: > IMO this (cpu flags/features/attributes, and even possibly firmware > patch levels, though probably not fan speed or temperature) is a > perfectly suitable use of traits. Not all traits have to feed into Nova > scheduling decisions; they could also be used by e.g. external > orchestrators. os-traits needs to have that more global not-just-Nova > perspective. Clearly not everything has to feed into a Nova scheduling decision, by virtue of placement hoping to cater to things other than nova. That said, I do think that placement should try to avoid being "tags as a service" which this use-case is dangerously close to becoming, IMHO. >> Okay, "stopping" / "refusing to launch" is too strict >> and unresonable; scratch that. > > I agree with this, for all the reasons stated. Me too, and that'd be a Nova decision to do anything with the security flag or not. >> we can potentially make Nova >> check the 'sysfs' directory for vulnerabilities. > > IMO this is still a good idea, but rather than warning / refusing to > boot, we could expose a roll-up trait, subject to the strawman design > below. And I think it's a bad idea. Honestly, if we're going to do this, why not query yum/apt and set a trait for has-updates-pending? Or has-major-update-available? Or dell-tells-us-there-is-a-bios-update-for-this-machine? Where does it end? Obviously I think it's up to the placement team to decide if they're going to put has-updates-pending in the set of standard traits. I'd vote for no, and Jay will be turning over in his grave shortly. However, I strenuously object to Nova becoming the agent for everything on the compute node, software, hardware, etc. If we're going to peek into kernel updatey things, I don't see how we explain to the next person that it's not okay to check to see if firefox is up to date. Further, if we do get into this business, who is to say that in the future, Nova doesn't get a CVE for failing to notice and report something? Like, do we need to put nova in the embargo box since it claims to be able to tell you if your stuff is vulnerable or not? > To summarize my position on the os-traits side of things: > > - We can merge the feature-ish traits (assuming folks can agree on which > ones those are). > - We can merge the vulnerability traits as long as they come with nice > comments explaining the potential security pitfalls around using them. > - Or for all I care we can merge nothing, since we don't actually seem > to have a demand for it. Every vendor has a tool dedicated to monitoring for updates, applicable vulnerabilities, and for orchestrating that work. 
A deployment of any appreciable size monitors hardware inventory and can answer the questions of which hosts need a patch without having to ask Nova about it. There are plenty of reasons why you might not apply one update at all or on a specifc schedule. This is well outside of Nova's scope. > The below would need a blueprint and a spec. And an owner. And it would > be nice if it also had demand. > > If we want to make scheduling decisions based on vulnerabilities, it > needs to be under the exclusive control of the admin. As mentioned > above, exposing the traits and allowing untrusted/untrustworthy users to > target vulnerable hosts is only marginally worse than having those > vulnerable hosts available to said untrusted users at all. So if we are > going to have virt drivers expose a VULNERABLE trait in any form, it > should come with: Further, if placement is ever exposed to middle admins (i.e. domain admins, site admins in a larger deployment, etc) even read-only, presumably you'll need to be able to expose (or hide) the presence of a trait based on their security clearance. > 1) a config option in the spirit of: > > [scheduler] > allow_scheduling_to_vulnerable_hosts = $bool (default: False) > > which, when False, causes the scheduler to add > trait:VULNERABLE=forbidden to *all* GET /a_c requests. > > But we should generalize this to: > > (a) Maintain a hardcoded list of traits that represent vulnerabilities > or other undesirables > (b) Have the conf option be [scheduler]evil_trait_whitelist > (c) Add [trait:$X=forbidden for $X in {(b) - (a)}] > > 2) a hard check to disallow trait:$X=required from *anywhere* (flavor, > image, etc.) regardless of the conf option. Either reject the boot > request or explicitly strip that out. > > For completeness, note that these traits need to be "negative" (i.e. > "has vulnerability") so that we can forbid them in a list in the GET > /a_c request. Because required=!INTEL_VULNERABLE,!AMD_VULNERABLE will > correctly avoid vulnerable hosts from either vendor, but > required=INTEL_FIXED,AMD_FIXED won't land anywhere, and we don't have > required=in:INTEL_FIXED,AMD_FIXED yet. I'm strong -3 on exposing VULNERABLE or NOT_VULNERABLE and +2 on SUPPORTS_SOMEACTUALCPUFLAG. It's trivial today for an operator to nova-disable all computes, and start enabling them as they are patched (automatically, with their patching tool). --Dan From fungi at yuggoth.org Wed May 15 22:32:34 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 15 May 2019 22:32:34 +0000 Subject: [nova][security-sig] On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <20190515221145.codjzvxitduj3bhw@yuggoth.org> References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> <20190515221145.codjzvxitduj3bhw@yuggoth.org> Message-ID: <20190515223233.uuoqiw4l2fdt2pmo@yuggoth.org> On 2019-05-15 22:11:46 +0000 (+0000), Jeremy Stanley wrote: [...] > Kashyap asked to be Cc'd as a non-subscriber [...] Oops, as Eric rightly pointed out to me just now, Kashyap is a subscriber to openstack-discuss, just not to openstack-security where he originally sent it. My mistake! -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From mnaser at vexxhost.com Wed May 15 22:36:07 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 16 May 2019 06:36:07 +0800 Subject: [Cinder][Ceph] Migrating backend host In-Reply-To: <11c12057-6213-2073-8f73-a693ec798cd6@uwo.ca> References: <11c12057-6213-2073-8f73-a693ec798cd6@uwo.ca> Message-ID: On Thu, May 16, 2019 at 2:14 AM Gary Molenkamp wrote: > > I am moving my cinder-volume service from one controller to another, and > I'm trying to determine the correct means to update all existing > volume's back-end host reference. I now have two cinder volume services > running in front of the same ceph cluster, and I would like to retire > the old cinder-volume service. > > For example, on my test cloud, "openstack volume service list": > > +------------------+----------------------------------+------+---------+-------+----------------------------+ > | Binary | Host | Zone | Status | > State | Updated At | > +------------------+----------------------------------+------+---------+-------+----------------------------+ > | cinder-scheduler | osdev-ctrl1 | nova | enabled | up | > 2019-05-15T17:44:05.000000 | > | cinder-volume | osdev-ctrl2 at rbd | nova | disabled | up | > 2019-05-15T17:44:04.000000 | > | cinder-volume | osdev-ctrl1 at rbd | nova | enabled | up | > 2019-05-15T17:44:00.000000 | > +------------------+----------------------------------+------+---------+-------+----------------------------+ > > Now, an existing volume has a reference to the disabled cinder-volume: > "os-vol-host-attr:host | osdev-ctrl2 at rbd#rbd" > > but this needs to be changed to: > "os-vol-host-attr:host | osdev-ctrl1 at rbd#rbd" > > As both controllers are members of the same ceph cluster, an "openstack > volume migrate" is not appropriate. If it is appropriate, my testing > has shown that it errors out and deletes the source volume from ceph. > > I can alter this field manually in the cinder database, but in > following the "don't mess with the data model" mantra, is there a means > to do this from the cli? https://docs.openstack.org/cinder/stein/cli/cinder-manage.html#cinder-volume cinder-manage volume update_host --currenthost --newhost That should do it :) > Thanks, > Gary. > > Openstack release: Queens. > Distro: Redhat (Centos) 7.6 > > -- > Gary Molenkamp Computer Science/Science Technology Services > Systems Administrator University of Western Ontario > molenkam at uwo.ca http://www.csd.uwo.ca > (519) 661-2111 x86882 (519) 661-3566 > -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From duc.openstack at gmail.com Thu May 16 00:10:26 2019 From: duc.openstack at gmail.com (Duc Truong) Date: Wed, 15 May 2019 17:10:26 -0700 Subject: [User-committee] OpenStack User Survey 2019 In-Reply-To: <5CD581CF.6010306@openstack.org> References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> <5CD34F85.9010604@openstack.org> <5CD581CF.6010306@openstack.org> Message-ID: Hi Jimmy, The auto-scaling SIG would like to add a question to the user survey. Our question was drafted during our auto-scaling SIG meeting [1]. The question we would like to get added is as follows: If you are using auto-scaling in your OpenStack cloud, which services do you use as part of auto-scaling? 
[Select all that apply] Monasca Ceilometer Aodh Senlin Heat Other OpenStack service (please specify - e.g. Watcher, Vitrage, Congress) Custom application components Prometheus Other user-provided service (please specify) Thanks, Duc [1] http://eavesdrop.openstack.org/meetings/auto_scaling_sig/2019/auto_scaling_sig.2019-05-15-23.06.html From jwoh95 at dcn.ssu.ac.kr Thu May 16 02:07:10 2019 From: jwoh95 at dcn.ssu.ac.kr (Jaewook Oh) Date: Thu, 16 May 2019 11:07:10 +0900 Subject: [dev] [tacker] Prometheus Plugin for VNF Monitoring Cover Message-ID: Hello Tacker team, and Trinh. As I said in the vPTG, I want to cover the blueprint "*Prometheus Plugin for VNF Monitoring in Tacker VNF Manager*" ( https://blueprints.launchpad.net/tacker/+spec/prometheus-plugin) This blueprint is not updated for long time, and also no development is not in progress. So Trinh, can I update the spec and develop that plugin if you don't mind? Best Regards, Jaewook. ================================================ *Jaewook Oh* (오재욱) IISTRC - Internet Infra System Technology Research Center 369 Sangdo-ro, Dongjak-gu, 06978, Seoul, Republic of Korea Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 E-mail : jwoh95 at dcn.ssu.ac.kr ================================================ -------------- next part -------------- An HTML attachment was scrubbed... URL: From li.canwei2 at zte.com.cn Thu May 16 02:07:29 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Thu, 16 May 2019 10:07:29 +0800 (CST) Subject: =?UTF-8?B?UmU6W3dhdGNoZXJdIENvbXB1dGUgQ0RNIGJ1aWxkZXIgaXNzdWVzIChtb3N0bHkgcGVyZiByZWxhdGVkKQ==?= In-Reply-To: <290e8682-9a57-51e1-26b2-62f09fff4cd6@gmail.com> References: 201905151719101538176@zte.com.cn, 290e8682-9a57-51e1-26b2-62f09fff4cd6@gmail.com Message-ID: <201905161007296647068@zte.com.cn> On 5/15/2019 4:19 AM, li.canwei2 at zte.com.cn wrote: > I tried changing nova_client.api_version to a FloatOpt but that gets > messy because of how things like 2.60 are handled (str(2.60) gets turned > into '2.6' which is not what we'd want). I was hoping we could use > FloatOpt with a min version to enforce the minimum required version, but > I guess we could do this other ways in the client helper code itself by > comparing to some minimum required version in the code. > [licanwei]: Maybe we can refer to > https://github.com/openstack/watcher/blob/master/watcher/common/nova_helper.py#L714 > I just did this which seems more explicit: https://review.opendev.org/#/c/659194/ That change leaves the default of 2.56 since the 2.56 code does version discovery so it's backward compatible, but I think we can assert that you need at least 2.53 because of how the scoped nova CDM code works (and to support nova deployments with multiple cells properly). Also note that 2.53 is pike-era nova and 2.56 is queens-era nova and those seem old enough that it's safe to require 2.53 as a minimum for watcher in train. [licanwei]: Because we need to specify the destination host when migration, at least 2.56 is required. https://github.com/openstack/watcher/blob/master/watcher/common/nova_helper.py#L145 -- Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Thu May 16 02:10:18 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 16 May 2019 11:10:18 +0900 Subject: [dev] [tacker] Prometheus Plugin for VNF Monitoring Cover In-Reply-To: References: Message-ID: Hi Jaewook, Yes, please go ahead. I switched my focus to Searchlight and Telemetry projects recently. 
Thanks for doing this, On Thu, May 16, 2019 at 11:07 AM Jaewook Oh wrote: > Hello Tacker team, and Trinh. > > As I said in the vPTG, I want to cover the blueprint "*Prometheus Plugin > for VNF Monitoring in Tacker VNF Manager*" ( > https://blueprints.launchpad.net/tacker/+spec/prometheus-plugin) > > This blueprint is not updated for long time, and also no development is > not in progress. > So Trinh, can I update the spec and develop that plugin if you don't mind? > > Best Regards, > Jaewook. > > ================================================ > *Jaewook Oh* (오재욱) > IISTRC - Internet Infra System Technology Research Center > 369 Sangdo-ro, Dongjak-gu, > 06978, Seoul, Republic of Korea > Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 > E-mail : jwoh95 at dcn.ssu.ac.kr > ================================================ > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From jwoh95 at dcn.ssu.ac.kr Thu May 16 02:12:00 2019 From: jwoh95 at dcn.ssu.ac.kr (Jaewook Oh) Date: Thu, 16 May 2019 11:12:00 +0900 Subject: [dev] [tacker] Prometheus Plugin for VNF Monitoring Cover In-Reply-To: References: Message-ID: Thank you Trinh, Hope you have good result in your projects. BR, Jaewook. ================================================ *Jaewook Oh* (오재욱) IISTRC - Internet Infra System Technology Research Center 369 Sangdo-ro, Dongjak-gu, 06978, Seoul, Republic of Korea Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 E-mail : jwoh95 at dcn.ssu.ac.kr ================================================ 2019년 5월 16일 (목) 오전 11:10, Trinh Nguyen 님이 작성: > Hi Jaewook, > > Yes, please go ahead. I switched my focus to Searchlight and Telemetry > projects recently. > > Thanks for doing this, > > On Thu, May 16, 2019 at 11:07 AM Jaewook Oh wrote: > >> Hello Tacker team, and Trinh. >> >> As I said in the vPTG, I want to cover the blueprint "*Prometheus Plugin >> for VNF Monitoring in Tacker VNF Manager*" ( >> https://blueprints.launchpad.net/tacker/+spec/prometheus-plugin) >> >> This blueprint is not updated for long time, and also no development is >> not in progress. >> So Trinh, can I update the spec and develop that plugin if you don't mind? >> >> Best Regards, >> Jaewook. >> >> ================================================ >> *Jaewook Oh* (오재욱) >> IISTRC - Internet Infra System Technology Research Center >> 369 Sangdo-ro, Dongjak-gu, >> 06978, Seoul, Republic of Korea >> Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 >> E-mail : jwoh95 at dcn.ssu.ac.kr >> ================================================ >> > > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From li.canwei2 at zte.com.cn Thu May 16 02:15:03 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Thu, 16 May 2019 10:15:03 +0800 (CST) Subject: =?UTF-8?B?UmU6W3dhdGNoZXJdIENvbXB1dGUgQ0RNIGJ1aWxkZXIgaXNzdWVzIChtb3N0bHkgcGVyZiByZWxhdGVkKQ==?= In-Reply-To: <0ee8e7df-09d4-46ce-e5bc-93854c6d44ef@gmail.com> References: 201905151713578647919@zte.com.cn, 0ee8e7df-09d4-46ce-e5bc-93854c6d44ef@gmail.com Message-ID: <201905161015037217378@zte.com.cn> On 5/15/2019 4:13 AM, li.canwei2 at zte.com.cn wrote: > [licanwei]:please refer to > https://github.com/openstack/watcher/blob/master/watcher/decision_engine/model/notification/nova.py#L144 > > > When a nova notification is received before the nova CDM is built or no > node in the CDM, > > the node will be add to the CDM. 
> That's not what's happening in this bug. We're getting an instance.update event from nova during scheduling/building of an instance before it has a host, so when this is called: https://github.com/openstack/watcher/blob/master/watcher/decision_engine/model/notification/nova.py#L220 node_uuid is None. Which means we never call get_or_create_node here: https://github.com/openstack/watcher/blob/master/watcher/decision_engine/model/notification/nova.py#L44 And then we blow up here: https://github.com/openstack/watcher/blob/master/watcher/decision_engine/model/notification/nova.py#L50 because self.cluster_data_model is None. Based on your earlier reply about when the nova CDM is built: > [licanwei]:Yes, the CDM will be built when the first audit being created. It seems the fix for this notification traceback bug is to just make sure the self.cluster_data_model is not None and return early if it's not, i.e. gracefully handle receiving notifications before we've ever done an audit and built the nova CDM. [licanwei]: In this case, I think we can just ignore the exception. -- Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Thu May 16 02:17:08 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 16 May 2019 11:17:08 +0900 Subject: [dev] [tacker] Prometheus Plugin for VNF Monitoring Cover In-Reply-To: References: Message-ID: Thanks Jaewook. Please let me know the email that you're using in Gerrit so I can add you to the specs I'm working on. Is it Jaewook Oh or this email address? Bests, On Thu, May 16, 2019 at 11:12 AM Jaewook Oh wrote: > Thank you Trinh, > > Hope you have good result in your projects. > > BR, > Jaewook. > > ================================================ > *Jaewook Oh* (오재욱) > IISTRC - Internet Infra System Technology Research Center > 369 Sangdo-ro, Dongjak-gu, > 06978, Seoul, Republic of Korea > Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 > E-mail : jwoh95 at dcn.ssu.ac.kr > ================================================ > > > 2019년 5월 16일 (목) 오전 11:10, Trinh Nguyen 님이 작성: > >> Hi Jaewook, >> >> Yes, please go ahead. I switched my focus to Searchlight and Telemetry >> projects recently. >> >> Thanks for doing this, >> >> On Thu, May 16, 2019 at 11:07 AM Jaewook Oh wrote: >> >>> Hello Tacker team, and Trinh. >>> >>> As I said in the vPTG, I want to cover the blueprint "*Prometheus >>> Plugin for VNF Monitoring in Tacker VNF Manager*" ( >>> https://blueprints.launchpad.net/tacker/+spec/prometheus-plugin) >>> >>> This blueprint is not updated for long time, and also no development is >>> not in progress. >>> So Trinh, can I update the spec and develop that plugin if you don't >>> mind? >>> >>> Best Regards, >>> Jaewook. >>> >>> ================================================ >>> *Jaewook Oh* (오재욱) >>> IISTRC - Internet Infra System Technology Research Center >>> 369 Sangdo-ro, Dongjak-gu, >>> 06978, Seoul, Republic of Korea >>> Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 >>> E-mail : jwoh95 at dcn.ssu.ac.kr >>> ================================================ >>> >> >> >> -- >> *Trinh Nguyen* >> *www.edlab.xyz * >> >> -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Thu May 16 02:22:59 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 16 May 2019 11:22:59 +0900 Subject: [dev] [tacker] Prometheus Plugin for VNF Monitoring Cover In-Reply-To: References: Message-ID: okie, done. 
Added you to the patch on Gerrit and the Launchpad blueprint. Please tell me if you need any further assistance. On Thu, May 16, 2019 at 11:17 AM Trinh Nguyen wrote: > Thanks Jaewook. > > Please let me know the email that you're using in Gerrit so I can add you > to the specs I'm working on. Is it Jaewook Oh or > this email address? > > Bests, > > On Thu, May 16, 2019 at 11:12 AM Jaewook Oh wrote: > >> Thank you Trinh, >> >> Hope you have good result in your projects. >> >> BR, >> Jaewook. >> >> ================================================ >> *Jaewook Oh* (오재욱) >> IISTRC - Internet Infra System Technology Research Center >> 369 Sangdo-ro, Dongjak-gu, >> 06978, Seoul, Republic of Korea >> Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 >> E-mail : jwoh95 at dcn.ssu.ac.kr >> ================================================ >> >> >> 2019년 5월 16일 (목) 오전 11:10, Trinh Nguyen 님이 작성: >> >>> Hi Jaewook, >>> >>> Yes, please go ahead. I switched my focus to Searchlight and Telemetry >>> projects recently. >>> >>> Thanks for doing this, >>> >>> On Thu, May 16, 2019 at 11:07 AM Jaewook Oh >>> wrote: >>> >>>> Hello Tacker team, and Trinh. >>>> >>>> As I said in the vPTG, I want to cover the blueprint "*Prometheus >>>> Plugin for VNF Monitoring in Tacker VNF Manager*" ( >>>> https://blueprints.launchpad.net/tacker/+spec/prometheus-plugin) >>>> >>>> This blueprint is not updated for long time, and also no development is >>>> not in progress. >>>> So Trinh, can I update the spec and develop that plugin if you don't >>>> mind? >>>> >>>> Best Regards, >>>> Jaewook. >>>> >>>> ================================================ >>>> *Jaewook Oh* (오재욱) >>>> IISTRC - Internet Infra System Technology Research Center >>>> 369 Sangdo-ro, Dongjak-gu, >>>> 06978, Seoul, Republic of Korea >>>> Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 >>>> E-mail : jwoh95 at dcn.ssu.ac.kr >>>> ================================================ >>>> >>> >>> >>> -- >>> *Trinh Nguyen* >>> *www.edlab.xyz * >>> >>> > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From jwoh95 at dcn.ssu.ac.kr Thu May 16 02:27:29 2019 From: jwoh95 at dcn.ssu.ac.kr (Jaewook Oh) Date: Thu, 16 May 2019 11:27:29 +0900 Subject: [dev] [tacker] Prometheus Plugin for VNF Monitoring Cover In-Reply-To: References: Message-ID: Thank you Trinh, have a good day! ================================================ *Jaewook Oh* (오재욱) IISTRC - Internet Infra System Technology Research Center 369 Sangdo-ro, Dongjak-gu, 06978, Seoul, Republic of Korea Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 E-mail : jwoh95 at dcn.ssu.ac.kr ================================================ 2019년 5월 16일 (목) 오전 11:23, Trinh Nguyen 님이 작성: > okie, done. Added you to the patch on Gerrit and the Launchpad blueprint. > > Please tell me if you need any further assistance. > > On Thu, May 16, 2019 at 11:17 AM Trinh Nguyen > wrote: > >> Thanks Jaewook. >> >> Please let me know the email that you're using in Gerrit so I can add you >> to the specs I'm working on. Is it Jaewook Oh or >> this email address? >> >> Bests, >> >> On Thu, May 16, 2019 at 11:12 AM Jaewook Oh wrote: >> >>> Thank you Trinh, >>> >>> Hope you have good result in your projects. >>> >>> BR, >>> Jaewook. 
>>> >>> ================================================ >>> *Jaewook Oh* (오재욱) >>> IISTRC - Internet Infra System Technology Research Center >>> 369 Sangdo-ro, Dongjak-gu, >>> 06978, Seoul, Republic of Korea >>> Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 >>> E-mail : jwoh95 at dcn.ssu.ac.kr >>> ================================================ >>> >>> >>> 2019년 5월 16일 (목) 오전 11:10, Trinh Nguyen 님이 작성: >>> >>>> Hi Jaewook, >>>> >>>> Yes, please go ahead. I switched my focus to Searchlight and Telemetry >>>> projects recently. >>>> >>>> Thanks for doing this, >>>> >>>> On Thu, May 16, 2019 at 11:07 AM Jaewook Oh >>>> wrote: >>>> >>>>> Hello Tacker team, and Trinh. >>>>> >>>>> As I said in the vPTG, I want to cover the blueprint "*Prometheus >>>>> Plugin for VNF Monitoring in Tacker VNF Manager*" ( >>>>> https://blueprints.launchpad.net/tacker/+spec/prometheus-plugin) >>>>> >>>>> This blueprint is not updated for long time, and also no development >>>>> is not in progress. >>>>> So Trinh, can I update the spec and develop that plugin if you don't >>>>> mind? >>>>> >>>>> Best Regards, >>>>> Jaewook. >>>>> >>>>> ================================================ >>>>> *Jaewook Oh* (오재욱) >>>>> IISTRC - Internet Infra System Technology Research Center >>>>> 369 Sangdo-ro, Dongjak-gu, >>>>> 06978, Seoul, Republic of Korea >>>>> Tel : +82-2-820-0841 | Mobile : +82-10-9924-2618 >>>>> E-mail : jwoh95 at dcn.ssu.ac.kr >>>>> ================================================ >>>>> >>>> >>>> >>>> -- >>>> *Trinh Nguyen* >>>> *www.edlab.xyz * >>>> >>>> >> >> -- >> *Trinh Nguyen* >> *www.edlab.xyz * >> >> > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From huang.xiangdong at 99cloud.net Thu May 16 02:34:26 2019 From: huang.xiangdong at 99cloud.net (=?UTF-8?B?6buE5ZCR5Lic?=) Date: Thu, 16 May 2019 10:34:26 +0800 (GMT+08:00) Subject: =?UTF-8?B?W21hZ251bV0gTWFnbnVtIHN1cHBvcnQgRmVkb3JhIGNvcmVPUyBwbGFu?= Message-ID: Hi, All Fedora Atomic 29 released on 29 April and it's the last release of Fedora Atomic, Fedora community launched Fedora CoreOS as a t -------------- next part -------------- An HTML attachment was scrubbed... URL: From huang.xiangdong at 99cloud.net Thu May 16 02:39:09 2019 From: huang.xiangdong at 99cloud.net (=?UTF-8?B?6buE5ZCR5Lic?=) Date: Thu, 16 May 2019 10:39:09 +0800 (GMT+08:00) Subject: =?UTF-8?B?W21hZ251bV0gTWFnbnVtIHN1cHBvcnQgRmVkb3JhIGNvcmVPUyBwbGFu?= Message-ID: Hi, All Fedora Atomic 29 released on 29 April and it's the last release of Fedora Atomic, Fedora community launched Fedora CoreOS as a replacement of Fedora Atomic, Does Magnum have bp to supoort Fedora Core + k8s ? Regardsxiangdong -------------- next part -------------- An HTML attachment was scrubbed... URL: From feilong at catalyst.net.nz Thu May 16 02:47:43 2019 From: feilong at catalyst.net.nz (Feilong Wang) Date: Thu, 16 May 2019 14:47:43 +1200 Subject: [magnum] Magnum support Fedora coreOS plan In-Reply-To: References: Message-ID: Hi XingDong, We do have plan/goal to transit to Fedora CoreOS, but we don't have resource in community at this moment.  So if you're interested in this area, you can jump in #openstack-containers and we can discuss more details. Thanks. On 16/05/19 2:39 PM, 黄向东 wrote: > Hi, All > > Fedora Atomic 29 released on 29 April and it's the last release of > Fedora Atomic, > Fedora community launched Fedora CoreOS as a replacement of Fedora Atomic, > Does Magnum have bp to supoort Fedora Core + k8s ? 
> > Regards > xiangdong > -- Cheers & Best regards, Feilong Wang (王飞龙) -------------------------------------------------------------------------- Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From masayuki at igawa.io Thu May 16 02:57:54 2019 From: masayuki at igawa.io (Masayuki Igawa) Date: Thu, 16 May 2019 11:57:54 +0900 Subject: [Cinder][Tempest] All-plugin is deprecated, what do we do now? In-Reply-To: <20190515171612.lkrv36grx2rjggh6@Matthews-MacBook-Pro-2.local> References: <20190515171612.lkrv36grx2rjggh6@Matthews-MacBook-Pro-2.local> Message-ID: <98b2763d-6008-4c6a-87c3-7530e89463b8@www.fastmail.com> Hi Matt, > My guess is that 'all' now includes plugins? Yes, 'all' should run Tempest plugins too. https://opendev.org/openstack/tempest/src/branch/master/tox.ini#L67-L83 67 │ [testenv:all-plugin] 68 │ # DEPRECATED 69 │ # NOTE(andreaf) The all-plugin tox env uses sitepackages 70 │ # so that plugins installed outsite of Tempest virtual environment 71 │ # can be discovered. After the implementation during the Queens 72 │ # release cycle of the goal of moving Tempest plugins in dedicated 73 │ # git repos, this environment should not be used anymore. "all" 74 │ # should be used instead with the appropriate regex filtering. 75 │ sitepackages = True 76 │ # 'all' includes slow tests 77 │ setenv = 78 │ {[tempestenv]setenv} 79 │ OS_TEST_TIMEOUT={env:OS_TEST_TIMEOUT:1200} 80 │ deps = {[tempestenv]deps} 81 │ commands = 82 │ echo "WARNING: The all-plugin env is deprecated and will be removed" 83 │ echo "WARNING Please use the 'all' environment for Tempest plugins." -- Masayuki Igawa Key fingerprint = C27C 2F00 3A2A 999A 903A 753D 290F 53ED C899 BF89 On Thu, May 16, 2019, at 02:21, Matt Smith wrote: > Hey Folks, > > First time posting to the mailing list, so hopefully I got this right. > > Anyways, Tempest complains when I use 'all-plugin' and says it's > deprecated. Cinder has Tempest plugins that were triggered by using > 'all-plugin' instead of 'all' and now I don't know how I'm supposed to > run those plugins as well. > > My guess is that 'all' now includes plugins? Or maybe there's an > alternative for running plugins? > > Regards, > > Matt > > --- > Matthew Smith > Software Engineer > mss at datera.io > 214-223-0887 > > From yangyi01 at inspur.com Thu May 16 03:47:10 2019 From: yangyi01 at inspur.com (=?utf-8?B?WWkgWWFuZyAo5p2o54eaKS3kupHmnI3liqHpm4blm6I=?=) Date: Thu, 16 May 2019 03:47:10 +0000 Subject: [DVR config] Can we use drv_snat agent_mode in every compute node? Message-ID: <2106942ccb764c0ca618169098621a47@inspur.com> Hi, folks I saw somebody discussed distributed SNAT, but finally they didn’t make agreement on how to implement distributed SNAT, my question is can we use dvr_snat agent_mode in compute node? I understand dvr_snat only does snat but doesn’t do east west routing, right? Can we set dvr_snat and dvr in one compute node at the same time? It is equivalent to distributed SNAT if we can set drv_snat in every compute node, isn’t right? I know Opendaylight can do SNAT in compute node in distributed way, but one external router only can run in one compute node. 
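To make the question concrete, what I want to try is simply switching the L3 agent on every compute node into dvr_snat mode, i.e. something like the following in each compute node's l3_agent.ini (only an illustration of the idea, I haven't verified that this is a supported topology):

[DEFAULT]
# run east/west DVR routing plus centralized SNAT support on this compute node
agent_mode = dvr_snat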
I also see https://wiki.openstack.org/wiki/Dragonflow is trying to implement distributed SNAT, what are technical road blocks for distributed SNAT in openstack dvr? Do we have any good way to remove these road blocks? Thank you in advance and look forward to getting your replies and insights. Also attached official drv configuration guide for your reference. https://docs.openstack.org/neutron/stein/configuration/l3-agent.html agent_mode ¶ Type string Default legacy Valid Values dvr, dvr_snat, legacy, dvr_no_external The working mode for the agent. Allowed modes are: ‘legacy’ - this preserves the existing behavior where the L3 agent is deployed on a centralized networking node to provide L3 services like DNAT, and SNAT. Use this mode if you do not want to adopt DVR. ‘dvr’ - this mode enables DVR functionality and must be used for an L3 agent that runs on a compute host. ‘dvr_snat’ - this enables centralized SNAT support in conjunction with DVR. This mode must be used for an L3 agent running on a centralized node (or in single-host deployments, e.g. devstack). ‘dvr_no_external’ - this mode enables only East/West DVR routing functionality for a L3 agent that runs on a compute host, the North/South functionality such as DNAT and SNAT will be provided by the centralized network node that is running in ‘dvr_snat’ mode. This mode should be used when there is no external network connectivity on the compute host. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3600 bytes Desc: not available URL: From tony at bakeyournoodle.com Thu May 16 05:00:42 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Thu, 16 May 2019 15:00:42 +1000 Subject: [all] [TC] [elections] Proposed Dates In-Reply-To: <20190515113553.GA33223@smcginnis-mbp.local> References: <20190515113553.GA33223@smcginnis-mbp.local> Message-ID: <20190516050042.GA18431@thor.bakeyournoodle.com> On Wed, May 15, 2019 at 06:35:53AM -0500, Sean McGinnis wrote: > I personally don't see this being an issue and think it could be beneficial > from a community focus point of view. I wonder if that would cause some > challenges for the election officials though. I would guess that the election > tooling is currently only able to handle one type of election at a time. Yes, currently we have a couple of places that "do the right thing" based on the election type. At this point we'd create a new type called combined (or similar) and basically have it to the aggregate of the 2 existing types. While I don't think the coding will be too hard but it does add a small amount of risk during the election itself. Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From tony at bakeyournoodle.com Thu May 16 05:15:44 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Thu, 16 May 2019 15:15:44 +1000 Subject: [all] [TC] [elections] Proposed Dates In-Reply-To: References: Message-ID: <20190516051544.GB18431@thor.bakeyournoodle.com> On Wed, May 15, 2019 at 11:38:42AM +0200, Thierry Carrez wrote: > We historically placed TC elections after PTL elections because some people > found it useful to know if they were elected PTL before running (or not > running) for a TC seat. > > Is that still a concern? 
If not, I think it's just easier to do both at the > same time (nominations, campaigning, election). Personally I think it's less of an issue now that in the past. There are some good points to doing both elections at the same time. Given we've shifted event timing recently we're going to hit this more often going forward. However, if it is a concern how can we mitigate that? Moving the PTL elections forward to avoid the overlap makes them pretty early and extends the 'double PTL' time, moving the TC elections later reduces the time TC members have to make arrangements with employers to get approval to be in Shanghai. Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From tony at bakeyournoodle.com Thu May 16 05:41:54 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Thu, 16 May 2019 15:41:54 +1000 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> Message-ID: <20190516054153.GC18431@thor.bakeyournoodle.com> On Tue, May 14, 2019 at 11:09:26AM -0400, Zane Bitter wrote: > It's breaking the whole world and I'm actually not sure there's a good > reason for it. Who cares if sphinx 2.0 doesn't run on Python 2.7 when we set > and achieved a goal in Stein to only run docs jobs under Python 3? It's > unavoidable for stable/rocky and earlier but it seems like the pain on > master is not necessary. While we support python2 *anywhere* we need to do this. The current tools (both ours and the broader python ecosystem) need to have these markers. I apologise that we managed to mess this up we're looking at how we can avoid this in the future but we don't really get any kinda of signals about $library dropping support for $python_version. The py2 things is more visible than a py3 minor release but they're broadly the same thing Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From dharmendra.kushwaha at india.nec.com Thu May 16 06:39:11 2019 From: dharmendra.kushwaha at india.nec.com (Dharmendra Kushwaha) Date: Thu, 16 May 2019 06:39:11 +0000 Subject: [Tacker][dev] Scaling and auto-healing functions for VNFFG In-Reply-To: <5cdbab7f.1c69fb81.1037b.51db@mx.google.com> References: <5cdbab7f.1c69fb81.1037b.51db@mx.google.com> Message-ID: Dear Lee, Thanks for heading-up for these topics. Implementation got started earlier for these spacs(as in [1], [2]), but Currently no progress on that. That will be great if your team work on these features. Please feel free to lead this activity. [1]: https://review.opendev.org/#/c/484088/ [2]: https://review.opendev.org/#/c/495748/ Thanks & Regards Dharmendra Kushwaha ________________________________________ From: Hochan Lee Sent: Wednesday, May 15, 2019 11:32 AM To: openstack-discuss at lists.openstack.org Subject: [Tacker][dev] Scaling and auto-healing functions for VNFFG Hello tacker team and all, I'm hochan lee and graduate student from korea univ. Our team is interested in SFC and VNFFG, especially HA of VNFFG. We are intereseted in scaling and auto-healing functions for VNFFG proposed in Tacker Pike Specifications. 
https://specs.openstack.org/openstack/tacker-specs/specs/pike/vnffg-scaling.html https://specs.openstack.org/openstack/tacker-specs/specs/pike/vnffg-autohealing.html We think these functions hadn't been developed yet. Are these features currently being developed? If not, can we go on developing these features for contribution? We wanna join next tacker weekly meeting and discuss them. Thanks, From hberaud at redhat.com Thu May 16 09:29:05 2019 From: hberaud at redhat.com (Herve Beraud) Date: Thu, 16 May 2019 11:29:05 +0200 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: <20190516054153.GC18431@thor.bakeyournoodle.com> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <20190516054153.GC18431@thor.bakeyournoodle.com> Message-ID: Hello, To help us to be more reactive on similar issues related to requirements who drop python 2 (the sphinx use case) I've submit a patch https://review.opendev.org/659289 to schedule "check-requirements" daily. Normally with that if openstack/requirements add somes changes who risk to break our CI we will be informed quickly by this periodical job. I guess we will facing a many similar issues in the next month due to the python 2.7 final countdown and libs who will drop python 2.7 support. For the moment only submit my patch on oslo.log, but if it work, in a second time, we can copy it to all the oslo projects. I'm not a zuul expert and I don't know if my patch is correct or not, so please feel free to review it and to put comments to let me know how to proceed with periodic jobs. Also oslo core could check daily the result of this job to know if actions are needed and inform team via the ML or something like that in fix the issue efficiently. Thoughts? Yours Hervé. Le jeu. 16 mai 2019 à 07:44, Tony Breeds a écrit : > On Tue, May 14, 2019 at 11:09:26AM -0400, Zane Bitter wrote: > > > It's breaking the whole world and I'm actually not sure there's a good > > reason for it. Who cares if sphinx 2.0 doesn't run on Python 2.7 when we > set > > and achieved a goal in Stein to only run docs jobs under Python 3? It's > > unavoidable for stable/rocky and earlier but it seems like the pain on > > master is not necessary. > > While we support python2 *anywhere* we need to do this. The current > tools (both ours and the broader python ecosystem) need to have these > markers. > > I apologise that we managed to mess this up we're looking at how we can > avoid this in the future but we don't really get any kinda of signals > about $library dropping support for $python_version. The py2 things is > more visible than a py3 minor release but they're broadly the same thing > > Yours Tony. 
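For reference, the rough shape of such a periodic hook in a project's .zuul.yaml would be something like the below, assuming the requirements-check job (the one normally pulled in by the check-requirements template) can be attached to the periodic pipeline as-is; this is only a sketch of the idea, not necessarily what the patch above ends up looking like:

- project:
    periodic:
      jobs:
        # run the same requirements job daily, outside of the review workflow
        - requirements-check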
> -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Thu May 16 09:55:47 2019 From: hberaud at redhat.com (Herve Beraud) Date: Thu, 16 May 2019 11:55:47 +0200 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <20190516054153.GC18431@thor.bakeyournoodle.com> Message-ID: In other words, I propose to schedule a periodical requirements check on the oslo projects to detect as soon as possible CI errors related to requirements check (related to py2.7 support drop), and fix it as soon as possible to avoid to fix it during standard review process (patches related to common fix or feat). Le jeu. 16 mai 2019 à 11:29, Herve Beraud a écrit : > Hello, > > To help us to be more reactive on similar issues related to requirements > who drop python 2 (the sphinx use case) > I've submit a patch https://review.opendev.org/659289 to schedule > "check-requirements" daily. > > Normally with that if openstack/requirements add somes changes who risk to > break our CI we will be informed quickly by this periodical job. > > I guess we will facing a many similar issues in the next month due to the > python 2.7 final countdown and libs who will drop python 2.7 support. > > For the moment only submit my patch on oslo.log, but if it work, in a > second time, we can copy it to all the oslo projects. > > I'm not a zuul expert and I don't know if my patch is correct or not, so > please feel free to review it and to put comments to let me know how to > proceed with periodic jobs. > > Also oslo core could check daily the result of this job to know if actions > are needed and inform team via the ML or something like that in fix the > issue efficiently. > > Thoughts? > > Yours Hervé. > > > Le jeu. 16 mai 2019 à 07:44, Tony Breeds a > écrit : > >> On Tue, May 14, 2019 at 11:09:26AM -0400, Zane Bitter wrote: >> >> > It's breaking the whole world and I'm actually not sure there's a good >> > reason for it. Who cares if sphinx 2.0 doesn't run on Python 2.7 when >> we set >> > and achieved a goal in Stein to only run docs jobs under Python 3? It's >> > unavoidable for stable/rocky and earlier but it seems like the pain on >> > master is not necessary. >> >> While we support python2 *anywhere* we need to do this. The current >> tools (both ours and the broader python ecosystem) need to have these >> markers. 
>> >> I apologise that we managed to mess this up we're looking at how we can >> avoid this in the future but we don't really get any kinda of signals >> about $library dropping support for $python_version. The py2 things is >> more visible than a py3 minor release but they're broadly the same thing >> >> Yours Tony. >> > > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu May 16 10:01:22 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Thu, 16 May 2019 12:01:22 +0200 Subject: [DVR config] Can we use drv_snat agent_mode in every compute node? In-Reply-To: <2106942ccb764c0ca618169098621a47@inspur.com> References: <2106942ccb764c0ca618169098621a47@inspur.com> Message-ID: <8B2A08A6-A289-4E72-B98E-587C91C52C57@redhat.com> Hi, According to documentation which You cited even "‘dvr_snat’ - this enables centralized SNAT support in conjunction with DVR”. So yes, dvr_snat will do both, SNAT mode as well as DVR for E-W traffic. We are using it like that in some CI jobs for sure and it works. But I’m not 100% sure that this is “production ready” solution. > On 16 May 2019, at 05:47, Yi Yang (杨燚)-云服务集团 wrote: > > Hi, folks > > I saw somebody discussed distributed SNAT, but finally they didn’t make agreement on how to implement distributed SNAT, my question is can we use dvr_snat agent_mode in compute node? I understand dvr_snat only does snat but doesn’t do east west routing, right? Can we set dvr_snat and dvr in one compute node at the same time? It is equivalent to distributed SNAT if we can set drv_snat in every compute node, isn’t right? 
I know Opendaylight can do SNAT in compute node in distributed way, but one external router only can run in one compute node. > > I also see https://wiki.openstack.org/wiki/Dragonflow is trying to implement distributed SNAT, what are technical road blocks for distributed SNAT in openstack dvr? Do we have any good way to remove these road blocks? > > Thank you in advance and look forward to getting your replies and insights. > > Also attached official drv configuration guide for your reference. > > https://docs.openstack.org/neutron/stein/configuration/l3-agent.html > > agent_mode¶ > Type > string > > Default > legacy > > Valid Values > dvr, dvr_snat, legacy, dvr_no_external > > The working mode for the agent. Allowed modes are: ‘legacy’ - this preserves the existing behavior where the L3 agent is deployed on a centralized networking node to provide L3 services like DNAT, and SNAT. Use this mode if you do not want to adopt DVR. ‘dvr’ - this mode enables DVR functionality and must be used for an L3 agent that runs on a compute host. ‘dvr_snat’ - this enables centralized SNAT support in conjunction with DVR. This mode must be used for an L3 agent running on a centralized node (or in single-host deployments, e.g. devstack). ‘dvr_no_external’ - this mode enables only East/West DVR routing functionality for a L3 agent that runs on a compute host, the North/South functionality such as DNAT and SNAT will be provided by the centralized network node that is running in ‘dvr_snat’ mode. This mode should be used when there is no external network connectivity on the compute host. > — Slawek Kaplonski Senior software engineer Red Hat From cjeanner at redhat.com Thu May 16 10:40:25 2019 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Thu, 16 May 2019 12:40:25 +0200 Subject: [ptg][kolla][openstack-ansible][tripleo] PTG cross-project summary In-Reply-To: References: Message-ID: <0b78939a-ccb2-77fa-f2a4-d462576bcbb2@redhat.com> On 5/7/19 11:07 AM, Mark Goddard wrote: > Hi, > > This is a summary of the ad-hoc cross project session between the > OpenStack Ansible and Kolla teams. > > It occurred to me that our two teams likely face similar challenges, and > there are areas we could collaborate on. I've tagged TripleO also since > the same applies there. > > [Collaboration on approach to features] > This was my main reason for proposing the session - there are features > and changes that all deployment tools need to make. Examples coming up > include support for upgrade checkers and IPv6. Rather than work in > isolation and solve the same problem in different ways, perhaps we could > share our experiences. The implementations will differ, but providing a > reasonably consistent feel between deployment tools can't be a bad thing. > > As a test case, we briefly discussed our experience with the upgrade > checker support added in Stein, and found that our expectation of how it > would work was fairly aligned in the room, but not aligned with how I > understand it to actually work (it's more of a post-upgrade check than a > pre-upgrade check). Hello! I'm pretty sure the new Validation Framework can help here, since we intend to provide a pre|post deploy|update|upgrade way to run validations. Feel free to ping me if you want (Tengu on #tripleo) - or just ask questions in here :). 
Since we want to extend the framework to not only cover tripleo and openstack, that would be a good start with kolla imho :) > > I was also able to point the OSA team at the placement migration code > added to Kolla in the Stein release, which should save them some time, > and provide more eyes on our code. > > I'd like to pursue this more collaborative approach during the Train > release where it fits. Upgrade checkers seems a good place to start, but > am open to other ideas such as IPv6 or Python 3. > > [OSA in Kayobe] > This was my wildcard - add support for deploying OpenStack via OSA in > Kayobe as an alternative to Kolla Ansible. It could be a good fit for > those users who want to use OSA but don't have a provisioning system. > This wasn't true of anyone in the room, and lack of resources deters > from 'build it and they will come'. Still, the seed is planted, it may > yet grow. > > [Sharing Ansible roles] > mnaser had an interesting idea: add support for deploying kolla > containers to the OSA Ansible service roles. We could then use those > roles within Kolla Ansible to avoid duplication of code. There is > definitely some appeal to this in theory. In practice however I feel > that the two deployment models are sufficiently different that it would > add significantly complexity to both projects. Part of the (relative) > simplicity and regularity of Kolla Ansible is enabled by handing off > installation and other tasks to Kolla. > > One option that might work however is sharing some of the lower level > building blocks. mnaser offered to make a PoC for > using https://github.com/openstack/ansible-config_template to generate > configuration in Kolla Ansible in place of merge_config and merge_yaml. > It requires some changes to that role to support merging a list of > source template files. We'd also need to add an external dependency to > our 'monorepo', or 'vendor' the module - trade offs to make in > complexity vs. maintaining our own module. > > I'd like to thank the OSA team for hosting the discussion - it was great > to meet the team and share experience. > > Cheers, > Mark -- Cédric Jeanneret Software Engineer - OpenStack Platform Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From gchamoul at redhat.com Thu May 16 11:36:46 2019 From: gchamoul at redhat.com (=?utf-8?B?R2HDq2w=?= Chamoulaud) Date: Thu, 16 May 2019 13:36:46 +0200 Subject: [ptg][kolla][openstack-ansible][tripleo] PTG cross-project summary In-Reply-To: <0b78939a-ccb2-77fa-f2a4-d462576bcbb2@redhat.com> References: <0b78939a-ccb2-77fa-f2a4-d462576bcbb2@redhat.com> Message-ID: <20190516113646.c2ql7wxzf4h6d6y4@olivia.strider.local> On 16/May/2019 12:40, Cédric Jeanneret wrote: > > > On 5/7/19 11:07 AM, Mark Goddard wrote: > > Hi, > > > > This is a summary of the ad-hoc cross project session between the > > OpenStack Ansible and Kolla teams. > > > > It occurred to me that our two teams likely face similar challenges, and > > there are areas we could collaborate on. I've tagged TripleO also since > > the same applies there. > > > > [Collaboration on approach to features] > > This was my main reason for proposing the session - there are features > > and changes that all deployment tools need to make. Examples coming up > > include support for upgrade checkers and IPv6. 
Rather than work in > > isolation and solve the same problem in different ways, perhaps we could > > share our experiences. The implementations will differ, but providing a > > reasonably consistent feel between deployment tools can't be a bad thing. > > > > As a test case, we briefly discussed our experience with the upgrade > > checker support added in Stein, and found that our expectation of how it > > would work was fairly aligned in the room, but not aligned with how I > > understand it to actually work (it's more of a post-upgrade check than a > > pre-upgrade check). > > Hello! I'm pretty sure the new Validation Framework can help here, since > we intend to provide a pre|post deploy|update|upgrade way to run > validations. To be more precise, that's not really *new* as TripleO runs validations since Newton. > Feel free to ping me if you want (Tengu on #tripleo) - or just ask > questions in here :). > > Since we want to extend the framework to not only cover tripleo and > openstack, that would be a good start with kolla imho :) > > > > > I was also able to point the OSA team at the placement migration code > > added to Kolla in the Stein release, which should save them some time, > > and provide more eyes on our code. > > > > I'd like to pursue this more collaborative approach during the Train > > release where it fits. Upgrade checkers seems a good place to start, but > > am open to other ideas such as IPv6 or Python 3. > > > > [OSA in Kayobe] > > This was my wildcard - add support for deploying OpenStack via OSA in > > Kayobe as an alternative to Kolla Ansible. It could be a good fit for > > those users who want to use OSA but don't have a provisioning system. > > This wasn't true of anyone in the room, and lack of resources deters > > from 'build it and they will come'. Still, the seed is planted, it may > > yet grow. > > > > [Sharing Ansible roles] > > mnaser had an interesting idea: add support for deploying kolla > > containers to the OSA Ansible service roles. We could then use those > > roles within Kolla Ansible to avoid duplication of code. There is > > definitely some appeal to this in theory. In practice however I feel > > that the two deployment models are sufficiently different that it would > > add significantly complexity to both projects. Part of the (relative) > > simplicity and regularity of Kolla Ansible is enabled by handing off > > installation and other tasks to Kolla. > > > > One option that might work however is sharing some of the lower level > > building blocks. mnaser offered to make a PoC for > > using https://github.com/openstack/ansible-config_template to generate > > configuration in Kolla Ansible in place of merge_config and merge_yaml. > > It requires some changes to that role to support merging a list of > > source template files. We'd also need to add an external dependency to > > our 'monorepo', or 'vendor' the module - trade offs to make in > > complexity vs. maintaining our own module. > > > > I'd like to thank the OSA team for hosting the discussion - it was great > > to meet the team and share experience. > > > > Cheers, > > Mark > > -- > Cédric Jeanneret > Software Engineer - OpenStack Platform > Red Hat EMEA > https://www.redhat.com/ > Gaël, -- Gaël Chamoulaud (He/Him/His) .::. Senior Software Engineer .::. OpenStack .::. DFG UI & Validations .::. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: not available URL: From ekuvaja at redhat.com Thu May 16 11:48:30 2019 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Thu, 16 May 2019 12:48:30 +0100 Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast In-Reply-To: References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> Message-ID: On Tue, May 7, 2019 at 12:31 AM Tim Burke wrote: > > On 5/5/19 12:18 AM, Ghanshyam Mann wrote: > > Current integrated-gate jobs (tempest-full) is not so stable for various bugs specially timeout. We tried > to improve it via filtering the slow tests in the separate tempest-slow job but the situation has not been improved much. > > We talked about the Ideas to make it more stable and fast for projects especially when failure is not > related to each project. We are planning to split the integrated-gate template (only tempest-full job as > first step) per related services. > > Idea: > - Run only dependent service tests on project gate. > > I love this plan already. > > - Tempest gate will keep running all the services tests as the integrated gate at a centeralized place without any change in the current job. > - Each project can run the below mentioned template. > - All below template will be defined and maintained by QA team. > > My biggest regret is that I couldn't figure out how to do this myself. > Much thanks to the QA team! > > I would like to know each 6 services which run integrated-gate jobs > > 1."Integrated-gate-networking" (job to run on neutron gate) > Tests to run in this template: neutron APIs , nova APIs, keystone APIs ? All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for neutron gate: exlcude the cinder API tests, glance API tests, swift API tests, > > 2."Integrated-gate-storage" (job to run on cinder gate, glance gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, Nova APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for cinder, glance gate: excluded the neutron APIs tests, Keystone APIs tests > > 3. "Integrated-gate-object-storage" (job to run on swift gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for swift gate: excluded the neutron APIs tests, - Keystone APIs tests, - Nova APIs tests. > > This sounds great. My only question is why Cinder tests are still > included, but I trust that it's there for a reason and I'm just revealing > my own ignorance of Swift's consumers, however removed. > > Note: swift does not run integrated-gate as of now. > > Correct, and for all the reasons that you're seeking to address. Some > eight months ago I'd gotten tired of seeing spurious failures that had > nothing to do with Swift, and I was hard pressed to find an instance where > the tempest tests caught a regression or behavior change that wasn't > already caught by Swift's own functional tests. In short, the > signal-to-noise ratio for those particular tests was low enough that a > failure only told me "you should leave a recheck comment," so I proposed > https://review.opendev.org/#/c/601813/ . 
There was also a side benefit of > having our longest-running job change from legacy-tempest-dsvm-neutron-full > (at 90-100 minutes) to swift-probetests-centos-7 (at ~30 minutes), > tightening developer feedback loops. > > It sounds like this proposal addresses both concerns: by reducing the > scope of tests to what might actually exercise the Swift API (if > indirectly), the signal-to-noise ratio should be much better and the > wall-clock time will be reduced. > > 4. "Integrated-gate-compute" (job to run on Nova gate) > tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs and All scenario currently running in tempest-full in same way ( means non-slow and in serial) > Improvement for Nova gate: excluded the swift APIs tests(not running in current job but in future, it might), Keystone API tests. > > 5. "Integrated-gate-identity" (job to run on keystone gate) > Tests to run is : all as all project use keystone, we might need to run all tests as it is running in integrated-gate. > But does keystone is being unsed differently by all services? if no then, is it enough to run only single service tests say Nova or neutron ? > > 6. "Integrated-gate-placement" (job to run on placement gate) > Tests to run in this template: Nova APIs tests, Neutron APIs tests + scenario tests + any new service depends on placement APIs > Improvement for placement gate: excluded the glance APIs tests, cinder APIs tests, swift APIs tests, keystone APIs tests > > Thoughts on this approach? > > The important point is we must not lose the coverage of integrated testing per project. So I would like to > get each project view if we are missing any dependency (proposed tests removal) in above proposed templates. > > As far as Swift is aware, these dependencies seem accurate; at any rate, > *we* don't use anything other than Keystone, even by way of another API. > Further, Swift does not use particularly esoteric Keysonte APIs; I would be > OK with integrated-gate-identity not exercising Swift's API with the > assumption that some other (or indeed, almost *any* other) service would > likely exercise the parts that we care about. > > - https:/etherpad.openstack.org/p/qa-train-ptg > > -gmann > > > > While I'm all up for limiting the scope Tempest is targeting for each patch to save time and our precious infra resources I have feeling that we might end up missing something here. Honestly I'm not sure what that something would be and maybe it's me thinking the scopes wrong way around. For example: 4. "Integrated-gate-compute" (job to run on Nova gate) I'm not exactly sure what any given Nova patch would be able to break from Cinder, Glance or Neutron or on number 2 what Swift is depending on Glance and Cinder that we could break when we introduce a change. Shouldn't we be looking "What projects are consuming service X and target those Tempest tests"? In Glance perspective this would be (from core projects) Glance, Cinder, Nova; Cinder probably interested about Cinder, Glance and Nova (anyone else consuming Cinder?) etc. I'd like to propose approach where we define these jobs and run them in check for the start and let gate run full suites until we figure out are we catching something in gate we did not catch in check and once the understanding has been reached that we have sufficient coverage, we can go ahead and swap gate using those jobs as well. This approach would give us the benefit where the impact is highest until we are confident we got the coverage right. 
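As a rough per-project sketch of that staged approach (the scoped job name below is purely illustrative, whatever the final template/job names end up being):

- project:
    check:
      jobs:
        # new scoped job, run on every patch from day one
        - tempest-integrated-storage
    gate:
      jobs:
        # keep the full integrated run in gate until we trust the new coverage
        - tempest-full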
I think biggest issue is that for the transition period _everyone_ needs to understand that gate might catch something check did not and simple "recheck" might not be sufficient when tempest succeeded in check but failed in gate. Best, Erno "jokke_" Kuvaja -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekuvaja at redhat.com Thu May 16 12:10:45 2019 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Thu, 16 May 2019 13:10:45 +0100 Subject: [glance][interop] standardized image "name" ? In-Reply-To: <4234325e-7569-e11f-53e9-72f07ed8ce53@gmail.com> References: <939FEDBD-6E5E-43F2-AE1F-2FE71A71BF58@vmware.com> <20190408123255.vqwwvzzdt24tm3pq@yuggoth.org> <4234325e-7569-e11f-53e9-72f07ed8ce53@gmail.com> Message-ID: On Thu, Apr 18, 2019 at 1:41 PM Brian Rosmaita wrote: > I really need to get caught up on my ML reading. > > On 4/11/19 6:40 PM, Thomas Goirand wrote: > [snip] > > It's just a shame that Glance doesn't show MD5 and not sha512 sums by > > default... > > The Secure Hash Algorithm Support ("multihash") spec [0] for Glance was > implemented in Rocky and provides a self-describing secure hash on > images (in addition to the 'checksum', which is preserved for backward > compatability.) The default is SHA-512. See the Rocky release notes [1] > for some implementation details not covered by the spec. > > The multihash is displayed in the image-list and image-show API > responses since Images API v2.7, and in the glanceclient since 2.12.0. > > The glanceclient has been using the secure hash for download > verification since 2.13.0, with a fallback to the md5 'checksum' field > if the multihash isn't populated. (It also optionally allows fallback > to md5 if the algorithm for the secure hash isn't available to the > client; this option is off by default.) See the 2.13.0 release notes > [2] for details. > > [0] > > https://specs.openstack.org/openstack/glance-specs/specs/rocky/implemented/glance/multihash.html > [1] https://docs.openstack.org/releasenotes/glance/rocky.html#new-features > [2] > > https://docs.openstack.org/releasenotes/python-glanceclient/rocky.html#relnotes-2-13-0-stable-rocky > > > Cheers, > > > > Thomas Goirand (zigo) > I'm even much more late on this thread than Brian ever was. But as this clearly did sidetrack a bit from the original, I'm gonna chip in something that could help for the original request: Original topic/request was the exact first usecase Searchlight team was taking on with metadefs and Searchlight when it spun up within Glance. Image names are what they are, freeform text that doesn't need to be unique and likely will always be something you can get bit of an idea what it might contain but will never be your reliable source of information what that image actually contains. Please people, lets not try to reinvent the wheel with something that's really not sufficient for the purpose and start populating the metadata into the image records instead. That's why there is plenty of metadefs so that information is structured and can be easily parsed/searched by something like searchlight. If you look up for the first Searchlight updates and demo's, finding your specific version of OS from images or specific software stack (F.E. LAMP) preinstalled were there literally from the first release. Cheers, Erno "jokke_" Kuvaja -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marios at redhat.com Thu May 16 12:51:25 2019 From: marios at redhat.com (Marios Andreou) Date: Thu, 16 May 2019 15:51:25 +0300 Subject: [tripleo][ci] rdo cloud outage - maintenance Message-ID: There is ongoing maintenance outage in rdo cloud... this means there are currently NO 3rd party rdo jobs running on any tripleo project reviews. We also stopped the promoter about an hour ago in preparation so there are no promotions. The outage should be resolved today we will update when things are back to normal thanks (marios|ruck and panda|rover on #tripleo and #oooq if you want to ping us) -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Thu May 16 13:07:33 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 16 May 2019 14:07:33 +0100 (BST) Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> Message-ID: On Wed, 15 May 2019, Eric Fried wrote: > (NB: I'm explicitly rendering "no opinion" on several items below so you > know I didn't miss/ignore them.) I'm responding in this thread so that it's clear I'm not ignoring it. I don't have a strong opinion. I agree that availability of a trait in os-traits is not the same as nova reporting that trait when creating resource providers representing compute nodes. However, having something in os-traits that nobody is going to use is not without cost: Once something is in os-traits it must stay there forever. So if there's no pressing use case for these additions, maybe we just wait. Bit more within... > However, I'll state again for the record that vendor-specific "positive" > traits (indicating "has mitigation", "not vulnerable", etc.) are nigh > worthless for the Nova scheduling use case of "land me on a > non-vulnerable host" because, until you can say > required=in:HW_CPU_X86_INTEL_FIX,HW_CPU_X86_AMD_FIX, you would have to > pick your CPU vendor ahead of time. There's a spec for this, but it is currently on hold as there is neither immediate use cases demanding to be satisfied, nor anyone to do the work. https://review.opendev.org/649992 > (Disclaimer: I'm a card-carrying "trait libertarian": freedom to do what > makes sense with traits, as long as you're not hurting anyone and it's > not costing the taxpayers.) I guess that makes me a "trait anarcho communitarian". People should have the freedom to do what they like with traits and they aren't hurting anybody, but blessing a trait as official (by putting it in os-traits) is a strong signifier and has system-wide impacts that should be debated in ad-hoc committees endlessly until a consensus emerges which avoids anyone facepalming or rage quitting. >From a placement-the-service standpoint, it cares naught. It doesn't know what traits mean and cannot distinguish between official and custom traits when filtering candidates. It's important that placement be able to work easily with thousands or hundreds of thousands of traits. We very definitely do not wanting to making authorization decisions based on the value of a trait and the status of the requestor. As said elsewhere by several folk: It's how the other services use them that matters. I'm agnostic on nova reporting all the cpu flags/features/capabilities as traits. 
If it is going to do that, then having _those_ traits as members of os-traits is the right thing to do. I'm less agnostic on users ever needing or wanting to be aware of specific cpu features in order to get a satisfactory workload placement. I want to be able to request high performance without knowing the required underlying features. Flavors + traits (which I don't have to understand) gets us that, so ... cool. > If we want to make scheduling decisions based on vulnerabilities, it > needs to be under the exclusive control of the admin. Others have said this (at least Dan): This seems like something where something other than nova ought to handle it. A host which shouldn't be scheduled to should be disabled (as a service). -=-=- This thread and several other conversations about traits and resource classes have made it pretty clear that the knowledge and experience required to make good decisions about what names should be in os-traits and os-resource-classes (and the form the names should take) is not exactly overlapping with what's required to be a core on the placement service. How do people feel about the idea of forming a core group for those two repos that includes placement cores but has additions from nova (Dan, Kashyap and Sean would make good candidates) and other projects that consume them? Having that group wouldn't remove the need for these extended conversations but would help make sure the right people were aware of changes and participating. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From haleyb.dev at gmail.com Thu May 16 13:45:52 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Thu, 16 May 2019 09:45:52 -0400 Subject: [DVR config] Can we use drv_snat agent_mode in every compute node? In-Reply-To: <2106942ccb764c0ca618169098621a47@inspur.com> References: <2106942ccb764c0ca618169098621a47@inspur.com> Message-ID: Hi Yi, I'm a little confused by the question, comments inline. On 5/15/19 11:47 PM, Yi Yang (杨燚)-云服务集团 wrote: > Hi, folks > > I saw somebody discussed distributed SNAT, but finally they didn’t make > agreement on how to implement distributed SNAT, my question is can we > use dvr_snat agent_mode in compute node? I understand dvr_snat only does > snat but doesn’t do east west routing, right? Can we set dvr_snat and > dvr in one compute node at the same time? It is equivalent to > distributed SNAT if we can set drv_snat in every compute node, isn’t > right? I know Opendaylight can do SNAT in compute node in distributed > way, but one external router only can run in one compute node. Distributed SNAT is not available in neutron, there was a spec proposed recently though, https://review.opendev.org/#/c/658414 Regarding the agent_mode setting for L3, only one mode can be set at a time. Typically 'dvr_snat' is used on network nodes and 'dvr' on compute nodes because it leads to less resource usage (i.e. namespaces). The centralized part of the router hosting the default SNAT IP address will only be scheduled to one of the agents in 'dvr_snat' mode. All the DVR modes can do East/West routing when an instance is scheduled to the node, and two can do North/South - 'dvr_snat' using the default SNAT IP, and 'dvr' using a floating IP. 'dvr_no_external' can only do East/West. Hopefully that clarifies things. -Brian > I also see https://wiki.openstack.org/wiki/Dragonflow is trying to > implement distributed SNAT, what are technical road blocks for > distributed SNAT in openstack dvr? Do we have any good way to remove > these road blocks? 
> > Thank you in advance and look forward to getting your replies and insights. > > Also attached official drv configuration guide for your reference. > > https://docs.openstack.org/neutron/stein/configuration/l3-agent.html > > |agent_mode|¶ > > > Type > > string > > Default > > legacy > > Valid Values > > dvr, dvr_snat, legacy, dvr_no_external > > The working mode for the agent. Allowed modes are: ‘legacy’ - this > preserves the existing behavior where the L3 agent is deployed on a > centralized networking node to provide L3 services like DNAT, and SNAT. > Use this mode if you do not want to adopt DVR. ‘dvr’ - this mode enables > DVR functionality and must be used for an L3 agent that runs on a > compute host. ‘dvr_snat’ - this enables centralized SNAT support in > conjunction with DVR. This mode must be used for an L3 agent running on > a centralized node (or in single-host deployments, e.g. devstack). > ‘dvr_no_external’ - this mode enables only East/West DVR routing > functionality for a L3 agent that runs on a compute host, the > North/South functionality such as DNAT and SNAT will be provided by the > centralized network node that is running in ‘dvr_snat’ mode. This mode > should be used when there is no external network connectivity on the > compute host. > From mark at stackhpc.com Thu May 16 13:53:06 2019 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 16 May 2019 14:53:06 +0100 Subject: [ptg][kolla][openstack-ansible][tripleo] PTG cross-project summary In-Reply-To: <0b78939a-ccb2-77fa-f2a4-d462576bcbb2@redhat.com> References: <0b78939a-ccb2-77fa-f2a4-d462576bcbb2@redhat.com> Message-ID: On Thu, 16 May 2019 at 11:41, Cédric Jeanneret wrote: > > > On 5/7/19 11:07 AM, Mark Goddard wrote: > > Hi, > > > > This is a summary of the ad-hoc cross project session between the > > OpenStack Ansible and Kolla teams. > > > > It occurred to me that our two teams likely face similar challenges, and > > there are areas we could collaborate on. I've tagged TripleO also since > > the same applies there. > > > > [Collaboration on approach to features] > > This was my main reason for proposing the session - there are features > > and changes that all deployment tools need to make. Examples coming up > > include support for upgrade checkers and IPv6. Rather than work in > > isolation and solve the same problem in different ways, perhaps we could > > share our experiences. The implementations will differ, but providing a > > reasonably consistent feel between deployment tools can't be a bad thing. > > > > As a test case, we briefly discussed our experience with the upgrade > > checker support added in Stein, and found that our expectation of how it > > would work was fairly aligned in the room, but not aligned with how I > > understand it to actually work (it's more of a post-upgrade check than a > > pre-upgrade check). > > Hello! I'm pretty sure the new Validation Framework can help here, since > we intend to provide a pre|post deploy|update|upgrade way to run > validations. > > Feel free to ping me if you want (Tengu on #tripleo) - or just ask > questions in here :). > > Since we want to extend the framework to not only cover tripleo and > openstack, that would be a good start with kolla imho :) > > Hi Cedric. The validation framework is based around this new tempest ansible role, correct? Presumably each deployment tool would provide a its own entry point on top of that. How does the pre/post deploy etc affect what validations are run? 
Is that up to the deployment tool, or defined in the ansible role, or somewhere else? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mjturek at linux.vnet.ibm.com Thu May 16 13:55:38 2019 From: mjturek at linux.vnet.ibm.com (Michael Turek) Date: Thu, 16 May 2019 09:55:38 -0400 Subject: [devstack][tempest][neutron][ironic] Trouble debugging networking issues with devstack/lib/tempest In-Reply-To: <9096f5ca-de93-9df4-7e7b-a48832c79643@gmail.com> References: <55e560da-1c96-e93a-8a1f-799a24cd23dc@linux.vnet.ibm.com> <9096f5ca-de93-9df4-7e7b-a48832c79643@gmail.com> Message-ID: <465ee4e4-ae40-0a57-797e-34604da8da4c@linux.vnet.ibm.com> On 5/15/19 5:37 PM, Brian Haley wrote: > On 5/15/19 4:49 PM, Michael Turek wrote: >> Hey all, >> >> We've been having networking issues with our ironic CI job and I'm a >> bit blocked. Included is a sample run of the job failing [0]. Our job >> creates a single flat public network at [1]. Everything goes fine >> until we hit 'lib/tempest'. The job fails trying to create a network >> [2]. >> >> At first I thought this could be because we configure a single >> physnet which is used by the 'public' network, however I've tried >> jumping on the node, deleting the 'public' network and creating the >> network mentioned in lib/tempest and it still fails with: >> >> Error while executing command: HttpException: 503, Unable to create >> the network. No tenant network is available for allocation. >> >> >> Does anyone have any insight into what we're not configuring >> properly? Any help would be greatly appreciated! > > Just going based on your log: > > OPTS=tenant_network_types=flat > > But tenant flat networks are not supported.  It's typically going to > be vxlan or local. > > Did you change a setting or did this just start breaking recently? > > -Brian > Hey Brian, Thanks! The job had embarrassingly been broken on an issue with lib/ironic for quite some time so it's hard to say when the networking issues arose. The only thing that's changed between the last successful run and now (on our side), is that we cherry pick a patch into devstack [0]. Highly doubt it's related but figured I'd mention it. I'll investigate properly setting the tenant network types. Thank you for the lead, much appreciated! Thanks, Mike Turek [0] https://review.opendev.org/#/c/653463/ From cjeanner at redhat.com Thu May 16 13:59:08 2019 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Thu, 16 May 2019 15:59:08 +0200 Subject: [ptg][kolla][openstack-ansible][tripleo] PTG cross-project summary In-Reply-To: References: <0b78939a-ccb2-77fa-f2a4-d462576bcbb2@redhat.com> Message-ID: On 5/16/19 3:53 PM, Mark Goddard wrote: > > > On Thu, 16 May 2019 at 11:41, Cédric Jeanneret > wrote: > > > > On 5/7/19 11:07 AM, Mark Goddard wrote: > > Hi, > > > > This is a summary of the ad-hoc cross project session between the > > OpenStack Ansible and Kolla teams. > > > > It occurred to me that our two teams likely face similar > challenges, and > > there are areas we could collaborate on. I've tagged TripleO also > since > > the same applies there. > > > > [Collaboration on approach to features] > > This was my main reason for proposing the session - there are features > > and changes that all deployment tools need to make. Examples coming up > > include support for upgrade checkers and IPv6. Rather than work in > > isolation and solve the same problem in different ways, perhaps we > could > > share our experiences. 
The implementations will differ, but > providing a > > reasonably consistent feel between deployment tools can't be a bad > thing. > > > > As a test case, we briefly discussed our experience with the upgrade > > checker support added in Stein, and found that our expectation of > how it > > would work was fairly aligned in the room, but not aligned with how I > > understand it to actually work (it's more of a post-upgrade check > than a > > pre-upgrade check). > > Hello! I'm pretty sure the new Validation Framework can help here, since > we intend to provide a pre|post deploy|update|upgrade way to run > validations. > > Feel free to ping me if you want (Tengu on #tripleo) - or just ask > questions in here :). > > Since we want to extend the framework to not only cover tripleo and > openstack, that would be a good start with kolla imho :) > > Hi Cedric. The validation framework is based around this new tempest > ansible role, correct? Presumably each deployment tool would provide a > its own entry point on top of that. How does the pre/post deploy etc > affect what validations are run? Is that up to the deployment tool, or > defined in the ansible role, or somewhere else? So for now we "only" have a (fairly) good integration with the "openstack tripleo" subcommand[1]. Currently, we're mainly using plain ansible roles and playbook, being launched by Mistral. We intend to allow non-Mistral runs (Work In Progress) and, in not too far future hopefully, to provide a descent python library within the tripleo-validations package for a better integration. It's still early, we still need to put things together, but if we can already raise awareness and interest for this work, it will help getting more involvement and time in order to provide a great bundle :). If you want to know more, you can already have a look at the new doc[1]. Does it help a bit understanding the possibilities? Cheers, C. [1] https://docs.openstack.org/tripleo-docs/latest/validations/index.html (note: we will probably add some more inputs in there) -- Cédric Jeanneret Software Engineer - OpenStack Platform Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From ed at leafe.com Thu May 16 14:09:03 2019 From: ed at leafe.com (Ed Leafe) Date: Thu, 16 May 2019 09:09:03 -0500 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> Message-ID: On May 15, 2019, at 4:50 PM, Eric Fried wrote: > >>>> There's no consensus here. Some think that we should _not_ allow those >>>> CPU flags as traits which can 'allow' you to target vulnerable hosts. >>> >>> for what its worth im in this camp and have said so in other places >>> where we have been disucssing it. >> >> Yep, noted. > > My position is that it's not harmful to add them to os-traits; it's > whether/how they're used in nova that needs some thought. They may not be "harmful", but they set a very bad precedent. I don't want to see os-traits become "Oh, just dump the trait in there, and maybe someday someone will use it". 
-- Ed Leafe From ed at leafe.com Thu May 16 14:09:59 2019 From: ed at leafe.com (Ed Leafe) Date: Thu, 16 May 2019 09:09:59 -0500 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> Message-ID: <01340206-FE60-4AB8-8C66-66D4F04A3B28@leafe.com> On May 15, 2019, at 5:31 PM, Dan Smith wrote: > > That said, I do think that placement should try to avoid being "tags as a > service" which this use-case is dangerously close to becoming, IMHO. This. -- Ed Leafe From mthode at mthode.org Thu May 16 14:21:13 2019 From: mthode at mthode.org (Matthew Thode) Date: Thu, 16 May 2019 09:21:13 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <20190516054153.GC18431@thor.bakeyournoodle.com> Message-ID: <20190516142113.ve6zy6mv33letrd6@mthode.org> On 19-05-16 11:55:47, Herve Beraud wrote: > In other words, I propose to schedule a periodical requirements check on > the oslo projects to detect as soon as possible CI errors related to > requirements check (related to py2.7 support drop), and fix it as soon as > possible to avoid to fix it during standard review process (patches related > to common fix or feat). > > Le jeu. 16 mai 2019 à 11:29, Herve Beraud a écrit : > Would it be better to have one job that people monitor? requirements-tox-py27-check-uc may work for you (in the requirements project) as it tests co-installability. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From openstack at nemebean.com Thu May 16 15:21:04 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 16 May 2019 10:21:04 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: <20190516054153.GC18431@thor.bakeyournoodle.com> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <20190516054153.GC18431@thor.bakeyournoodle.com> Message-ID: <5351768d-cb8c-150b-14d9-ac2df33c22a7@nemebean.com> On 5/16/19 12:41 AM, Tony Breeds wrote: > On Tue, May 14, 2019 at 11:09:26AM -0400, Zane Bitter wrote: > >> It's breaking the whole world and I'm actually not sure there's a good >> reason for it. Who cares if sphinx 2.0 doesn't run on Python 2.7 when we set >> and achieved a goal in Stein to only run docs jobs under Python 3? It's >> unavoidable for stable/rocky and earlier but it seems like the pain on >> master is not necessary. > > While we support python2 *anywhere* we need to do this. The current > tools (both ours and the broader python ecosystem) need to have these > markers. > > I apologise that we managed to mess this up we're looking at how we can > avoid this in the future but we don't really get any kinda of signals > about $library dropping support for $python_version. The py2 things is > more visible than a py3 minor release but they're broadly the same thing The biggest problem here was the timing with the Bandit issue. Normally this would have only blocked patches that needed to change requirements, but because most of our repos needed a requirements change to unblock them it became a bigger issue than it normally would have been. 
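For concreteness, the kind of requirements change every project ended up needing looks roughly like this - the versions, exclusions and caps below are illustrative only, not the exact pins that merged in openstack/requirements:

    # environment markers splitting a dependency that dropped py2 support
    sphinx>=1.6.2,!=1.6.6,!=1.6.7,<2.0.0;python_version=='2.7'  # BSD
    sphinx>=1.6.2,!=1.6.6,!=1.6.7;python_version>='3.4'  # BSD
    # plus the cap that worked around the Bandit breakage
    bandit>=1.1.0,<1.6.0  # Apache-2.0

Because both changes had to land in each repo's requirements files at roughly the same time, the two problems compounded each other.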
That said, it would be nice if we could come up with a less intrusive way to handle this in the future. I'd rather not have to keep merging a ton of requirements patches when dependencies drop py2 support. From mthode at mthode.org Thu May 16 15:26:51 2019 From: mthode at mthode.org (Matthew Thode) Date: Thu, 16 May 2019 10:26:51 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: <5351768d-cb8c-150b-14d9-ac2df33c22a7@nemebean.com> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <20190516054153.GC18431@thor.bakeyournoodle.com> <5351768d-cb8c-150b-14d9-ac2df33c22a7@nemebean.com> Message-ID: <20190516152651.pptbo2cmh7hcazbs@mthode.org> On 19-05-16 10:21:04, Ben Nemec wrote: > > > On 5/16/19 12:41 AM, Tony Breeds wrote: > > On Tue, May 14, 2019 at 11:09:26AM -0400, Zane Bitter wrote: > > > > > It's breaking the whole world and I'm actually not sure there's a good > > > reason for it. Who cares if sphinx 2.0 doesn't run on Python 2.7 when we set > > > and achieved a goal in Stein to only run docs jobs under Python 3? It's > > > unavoidable for stable/rocky and earlier but it seems like the pain on > > > master is not necessary. > > > > While we support python2 *anywhere* we need to do this. The current > > tools (both ours and the broader python ecosystem) need to have these > > markers. > > > > I apologise that we managed to mess this up we're looking at how we can > > avoid this in the future but we don't really get any kinda of signals > > about $library dropping support for $python_version. The py2 things is > > more visible than a py3 minor release but they're broadly the same thing > > The biggest problem here was the timing with the Bandit issue. Normally this > would have only blocked patches that needed to change requirements, but > because most of our repos needed a requirements change to unblock them it > became a bigger issue than it normally would have been. > > That said, it would be nice if we could come up with a less intrusive way to > handle this in the future. I'd rather not have to keep merging a ton of > requirements patches when dependencies drop py2 support. > We are trying to determine if using constraints alone is sufficient. pip not having a depsolver strikes again. -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From openstack at nemebean.com Thu May 16 15:28:03 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 16 May 2019 10:28:03 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <20190516054153.GC18431@thor.bakeyournoodle.com> Message-ID: <6bf60a87-80d7-b158-ea9d-8b4576c9f86c@nemebean.com> On 5/16/19 4:29 AM, Herve Beraud wrote: > Hello, > > To help us to be more reactive on similar issues related to requirements > that drop python 2 (the sphinx use case), > I've submitted a patch https://review.opendev.org/659289 to schedule > "check-requirements" daily. > > With that in place, if openstack/requirements adds changes that risk > breaking our CI, we will be informed quickly by this periodic job. > > I guess we will be facing many similar issues in the coming months due to > the python 2.7 final countdown and the libs that will drop python 2.7 support.
> > For the moment I have only submitted my patch on oslo.log, but if it works, > we can copy it to all the oslo projects in a second step. > > I'm not a zuul expert and I don't know if my patch is correct or not, so > please feel free to review it and to put comments to let me know how to > proceed with periodic jobs. > > Also, oslo cores could check the result of this job daily to know if > actions are needed, inform the team via the ML or something like that, and > fix the issue efficiently. This is generally the problem with periodic jobs. People don't pay attention to them so issues still don't get noticed until they start breaking live patches. As I said in IRC, if you're willing to commit to checking the periodic jobs daily I'm okay with adding them. I know when dims was PTL he had nightly jobs running on all of the Oslo repos, but I think that was in his own private infra so I don't know that we could reuse what he had. > > Thoughts? > > Yours Hervé. > > > On Thu, 16 May 2019 at 07:44, Tony Breeds wrote: > > On Tue, May 14, 2019 at 11:09:26AM -0400, Zane Bitter wrote: > > > It's breaking the whole world and I'm actually not sure there's a > good > > reason for it. Who cares if sphinx 2.0 doesn't run on Python 2.7 > when we set > > and achieved a goal in Stein to only run docs jobs under Python > 3? It's > > unavoidable for stable/rocky and earlier but it seems like the > pain on > > master is not necessary. > > While we support python2 *anywhere* we need to do this.  The current > tools (both ours and the broader python ecosystem) need to have these > markers. > > I apologise that we managed to mess this up we're looking at how we can > avoid this in the future but we don't really get any kinda of signals > about $library dropping support for $python_version.  The py2 things is > more visible than a py3 minor release but they're broadly the same thing > > Yours Tony. > > > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > From allison at openstack.org Thu May 16 15:36:38 2019 From: allison at openstack.org (Allison Price) Date: Thu, 16 May 2019 10:36:38 -0500 Subject: [User-committee] OpenStack User Survey 2019 In-Reply-To: References: <5CC0732E.8020601@tipit.net> <74F9B988-972B-422F-94D1-E62A83FD87A7@openstack.org> <5CD34F85.9010604@openstack.org> <5CD581CF.6010306@openstack.org> Message-ID: <74DA0DC0-C7FD-465E-81EE-8784C7CA8DB2@openstack.org> Hi Duc, Thanks for sharing this. To date, we have limited adding questions to OpenStack project teams, UC, and TC. If we were going to open it up to Working Groups and SIGs, then we would need to open it up to all of the groups.
We have not done this in the past, as we were trying to avoid making the survey too long and complex. We could evaluate this for the next version of the User Survey that will open in August and communicate with all working groups and SIGs, but we don’t have enough time to open that process for the current version. I am happy to join the next auto-scaling SIG meeting to help identify a way to collect this feedback in the meantime if your team would like. Cheers, Allison > On May 15, 2019, at 7:10 PM, Duc Truong wrote: > > Hi Jimmy, > > The auto-scaling SIG would like to add a question to the user survey. > Our question was drafted during our auto-scaling SIG meeting [1]. > The question we would like to get added is as follows: > > If you are using auto-scaling in your OpenStack cloud, which services > do you use as part of auto-scaling? [Select all that apply] > > Monasca > Ceilometer > Aodh > Senlin > Heat > Other OpenStack service (please specify - e.g. Watcher, Vitrage, Congress) > Custom application components > Prometheus > Other user-provided service (please specify) > > Thanks, > > Duc > > [1] http://eavesdrop.openstack.org/meetings/auto_scaling_sig/2019/auto_scaling_sig.2019-05-15-23.06.html > From alifshit at redhat.com Thu May 16 15:41:11 2019 From: alifshit at redhat.com (Artom Lifshitz) Date: Thu, 16 May 2019 11:41:11 -0400 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: References: Message-ID: Top posting to change the topic slightly: same-company approvals for the stable branch. I brought this up at today's Nova meeting, and my takeaway was as follows. Please correct me if I got something wrong. 1. Judgment! Yes, this is hard to define. 2. If you feel uneasy about merging a thing, don't do it. 3. A backport by a stable-maint core counts as a proxy +2 if the backport is an obvious bugfix and is clean. 4. We still need 2 +2's on a patch, proxy +2's count towards that. 5. As on master, we should strive to avoid 100% same-company patches. 6. The original patch author counts. A patch from company A can be approved by 2 cores from company B. One of those cores can be the backporter, with the caveats in point 3. 7. All of this can be superseded by point 1 if necessary. On Fri, May 10, 2019 at 1:55 PM Ruby Loo wrote: > > > > On Sat, May 4, 2019 at 6:48 PM Eric Fried wrote: >> >> (NB: I tagged [all] because it would be interesting to know where other >> teams stand on this issue.) >> >> Etherpad: https://etherpad.openstack.org/p/nova-ptg-train-governance >> >> Summary: >> - There is a (currently unwritten? at least for Nova) rule that a patch >> should not be approved exclusively by cores from the same company. This >> is rife with nuance, including but not limited to: >> - Usually (but not always) relevant when the patch was proposed by >> member of same company >> - N/A for trivial things like typo fixes >> - The issue is: >> - Should the rule be abolished? and/or >> - Should the rule be written down? >> >> Consensus (not unanimous): >> - The rule should not be abolished. There are cases where both the >> impetus and the subject matter expertise for a patch all reside within >> one company. In such cases, at least one core from another company >> should still be engaged and provide a "procedural +2" - much like cores >> proxy SME +1s when there's no core with deep expertise. >> - If there is reasonable justification for bending the rules (e.g. 
typo >> fixes as noted above, some piece of work clearly not related to the >> company's interest, unwedging the gate, etc.) said justification should >> be clearly documented in review commentary. >> - The rule should not be documented (this email notwithstanding). This >> would either encourage loopholing or turn into a huge detailed legal >> tome that nobody will read. It would also *require* enforcement, which >> is difficult and awkward. Overall, we should be able to trust cores to >> act in good faith and in the appropriate spirit. >> >> efried >> . > > > In ironic-land, we documented this [1] many moons ago. Whether that is considered a rule or a guideline, I don't know, but we haven't been sued yet and I don't recall any heated arguments/incidents about it. :) > > --ruby > > [1] https://wiki.openstack.org/wiki/Ironic/CoreTeam#Other_notes -- Artom Lifshitz Software Engineer, OpenStack Compute DFG From openstack at fried.cc Thu May 16 15:42:47 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 16 May 2019 10:42:47 -0500 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> Message-ID: > I've added a link to this thread on the agenda for tomorrow's > Security SIG meeting This happened [1]. TL;DR: it does more potential good than harm to expose these traits ("scheduler roulette is not a security measure" --fungi). > Others have said this (at least Dan): This seems like something > where something other than nova ought to handle it. A host which > shouldn't be scheduled to should be disabled (as a service). WFM. Scrap strawman. Given that it's not considered a security issue, we could expose the (low-level, CPU flag) traits so that "other than nova" can use them. If we think there's demand. > How do people feel about the idea of forming a core group for those > two repos that includes placement cores but has additions from nova > (Dan, Kashyap and Sean would make good candidates) and other projects > that consume them? ++ efried [1] http://eavesdrop.openstack.org/irclogs/%23openstack-meeting/%23openstack-meeting.2019-05-16.log.html#t2019-05-16T15:06:24 From fungi at yuggoth.org Thu May 16 18:46:46 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 16 May 2019 18:46:46 +0000 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> Message-ID: <20190516184645.5gfjeao3heaifoyb@yuggoth.org> On 2019-05-16 10:42:47 -0500 (-0500), Eric Fried wrote: [...] > > I've added a link to this thread on the agenda for tomorrow's > > Security SIG meeting > > This happened [1]. TL;DR: it does more potential good than harm to > expose these traits ("scheduler roulette is not a security measure" > --fungi). [...] 
To reiterate my position from the SIG meeting, I only really care whether or not processes which need to know about CPU details for enabling modes to cope with these vulnerabilities have access to them (generally so they can drop their own inefficient mitigations when presented with CPU flags which indicate they're unnecessary because the relevant microcode has been installed on the host or the particular chip lacks that design flaw entirely). That is security-relevant. Whether users want to be able to make scheduling choices based on those same flags, and whether the operators of those environments want to grant them the ability to do so, isn't really a security-relevant discussion point. I support providing a means for users to get good performance on secure systems by default. Anyone who wants to knowingly choose less secure systems to gain a performance boost, or to intentionally shuffle specific customer workloads onto less secure parts of their infrastructure is welcome to those features, but I don't consider that to really be a security topic. At that point it's more of a discussion about people making (hopefully well-informed) trade-offs for the sake of performance and efficiency. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From mriedemos at gmail.com Thu May 16 20:33:39 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 16 May 2019 15:33:39 -0500 Subject: [nova] cross-cell resize review guide Message-ID: <3ce2cab8-94ed-272d-0cc8-db72ba330639@gmail.com> At the Train PTG people said it would be useful to have a review guide in the ML like what gibi did for the bandwidth provider series in Stein, so this is my attempt. First let me start by saying if I were you, I'd read the spec [1] and watch the presentation from the summit [2] since those should give you the high-level overview of what's going on. **Notes** * The current bottom of the series starts here [3]. That bottom change is actually a bug fix for a long-standing latent race issue in the resource tracker and a functional recreate test patch has already been merged. I ran into this during my functional testing later in the series when auditing resource tracker stuff. * I'm pushing as much non-cross-cell related stuff to the bottom of the series as possible, like refactoring utility methods and test fixture enhancements, and some bug fixes. For anything that would otherwise be unused outside of the series I don't move to the bottom. * The rest of the series works from the bottom up in that nothing "turns it all on" externally in the API until the end of the series. This means we can land code as we go with little risk of impacting existing behavior. Even then the new policy rule that enables this is disabled for all users by default. * The instance task_state flow, instance action records and notifications should be consistent to normal resize throughout to make sure it is transparent to the end user. * We do hard deletes of database records on rollback, confirm and revert. We have a copy of the instance in each DB while resized, but need to hard delete the one we aren't going to keep at our terminal state (otherwise you can't try to resize the server back to the source cell without archiving that DB). 
* Rather than have the computes RPC cast back and forth to each other, conductor will be orchestrating the compute service work using synchronous RPC calls with the long_rpc_timeout option. This is because we assume the computes can't communicate with each other over RPC, SSH or shared storage. **Testing** * About halfway through the series [4], once the code is plumbed to get the server to VERIFY_RESIZE state, I start a series of functional tests because for a lot of this I'm trying to rely on functional rather than unit tests. I do have unit tests in the majority of the patches but not all of them toward the end because frankly I don't want to spend a lot of time writing unit tests before the code is reviewed - the functional tests are much better at flushing out issues. * At the very end of the series I have a patch [5] which enables cross-cell resize and cold migration using the new nova-multi-cell job. I'm easing into this by only enabling a few tests at a time, flushing issues, and iterating that way since it makes debugging one failing test rather than 10 much easier (to trace the logs and such). So far I've gotten resize + confirm, cold migrate + confirm, and volume-backed resize + confirm passing in that job. I'm currently working on a cold migrate + revert test. **The main idea** We want to keep the new cross-cell resize flow as similar to traditional resize as possible, so you'll see a similar flow of: a) prep for resize on dest host This is just a resource claim and to create the migration context in the target cell DB. But otherwise it's very similar to prep_resize. b) resize from source host Power off the guest, disconnect volumes and interfaces from the source host, and do disk stuff (in this case create a temporary snapshot of the root disk for a non-volume-backed server). c) finish on the dest host Connect volumes, plug VIFs, and spawn a guest (using the snapshot created for a non-volume-backed server). Super easy right?! **Conductor** The big change with cross-cell resize is that (super)conductor is going to be doing the heavy lifting of orchestrating the resize across the two cells. For this I've created a main task (CrossCellMigrationTask) which then drives the sequential execution of sub-tasks, which for the most part mirror the steps above for the operations in the compute services. Also note that each task has a rollback method and as such if any task fails, we run the rollback of all the other tasks back to the beginning which means each task can just care about cleaning up anything it needs to rather than a giant single try/except in the main task. The first thing we have to do is copy the instance and its related records to the target cell database (TargetDBSetupTask). The copy of the instance created in the target cell DB will be marked with a new "hidden" field so that when listing servers in the API we'll filter it out. For the DB copy stuff there needed to be the DB schema change for that hidden column and then just some new versioned object DB API type methods to aid with the data transfer. After that we run the sub-tasks that mirror the compute service operations (prep on dest, resize from source, finish on dest). In between the compute services doing stuff we sometimes need to copy data around, e.g. after PrepResizeAtDestTask we need to copy the migration context from the target cell DB to the source cell DB so the API can handle the network-vif-plugged even when the guest is spawned on the dest host. 
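To make the task/rollback structure concrete, here is a minimal sketch of the orchestration pattern described above (the class and method names are illustrative, not the actual ones in the patch series):

    class MigrationSubTask(object):
        """One step of the cross-cell resize; subclasses do the real work."""

        def execute(self):
            raise NotImplementedError

        def rollback(self):
            """Clean up anything this task created; called on failure."""
            pass


    class CrossCellTaskRunner(object):
        def __init__(self, subtasks):
            # e.g. [TargetDBSetupTask(), PrepResizeAtDestTask(), ...]
            self.subtasks = subtasks
            self._completed = []

        def execute(self):
            for task in self.subtasks:
                try:
                    task.execute()
                    self._completed.append(task)
                except Exception:
                    # Unwind everything that already ran, most recent first,
                    # so each task only cleans up what it created itself.
                    for done in reversed(self._completed):
                        done.rollback()
                    raise

The point, as above, is that each sub-task only cleans up after itself and the driving task unwinds completed sub-tasks in reverse order, rather than relying on one giant try/except in the main task.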
At the very end once we've finished on the dest host, we update the instance mapping in the API DB to point at the target cell and swap the hidden values on the source/target cell instance so when listing servers we'll be showing the target cell instance in VERIFY_RESIZE state. **Scheduler** All move operations are currently restricted to the same cell (there is code in conductor that does this). If the request passes the new policy check, the RequestSpec is modified so the scheduler will not restrict to just hosts in the source cell. There is a new CrossCellWeigher added which by default will prefer hosts in the same cell if there is a choice, but there might not be a choice, e.g. an admin could target a host in another cell for a cold migration. Also, the weigher can be configured to prefer to move instances to other cells or be overridden on a per-aggregate basis. I talk about this in the video a bit, but the idea is leave the default to prefer hosts in the source cell but override via aggregates to drain instances out of old cells using the weigher. Also, a reminder that when the scheduler picks a primary host, any alternates are in the same cell. This isn't new but is good to remember. **Operation flow** 1. The API will check the new policy rule and if the request passes the policy check, the RequestSpec is modified to say cross-cell migration is OK (conductor and scheduler will look for this) and RPC cast to conductor (rather than RPC call like it does today). 2. The MigrationTask in conductor will RPC call the scheduler's select_destinations method as normal. a) If the selected target host is in another cell, kick off the CrossCellMigrationTask. b) If the selected target host is in the same cell as the instance, we do traditional resize as normal. 3. CrossCellMigrationTask will check that neutron and cinder are new enough for port bindings and volume attachment resources and then execute TargetDBSetupTask. 4. TargetDBSetupTask does the DB copy stuff mentioned above. 5. CrossCellMigrationTask then executes PrepResizeAtDestTask which will create (inactive) port bindings for the target host and empty volume attachments, then RPC call the dest compute to do the resize claim (this is needed for instances with PCI devices and NUMA topology). a) If all of that works, cool, continue. b) If any of that fails, cleanup and try an alternate host (note the alternate host processing is not implemented yet but shouldn't be hard). 6. Next PrepResizeAtSourceTask gets executed which creates a snapshot image for a non-volume-backed server, powers off the guest (but does not destroy the root disks), disconnects volumes and interfaces. At this step on the source compute we also activate the inactive dest host port bindings created earlier. 7. Next FinishResizeAtDestTask calls the dest compute to setup networking on the dest host (most just for PCI stuff), connect volumes, and spawn the guest (from root volume or temp snapshot). Once that is done the task will swap the hidden value on the instances and update the instance mapping in the API DB. At this point the instance is in VERIFY_RESIZE status. **Confirm/Revert** Confirm and revert are implemented similarly with new conductor tasks. Confirm is pretty simple in that it just cleans up the guest from the source compute and deletes the records from the source cell DB. 
Revert is a bit more work since we have to cleanup the dest compute host, adjust BDMs in the source cell DB if any volumes were attached to or detached from the instance while it was in VERIFY_RESIZE state (yes the API allows that), spawn the guest in the source compute and then twiddle the DB bits so the instance mapping is pointing at the source cell and the source instance is no longer hidden, and then cleanup the target cell DB. **Patch series structure** As noted I'm writing this from the bottom up, and even within sub-operations I'm following a pattern of (1) write the compute methods first and then (2) write the conductor task that will use them and integrate it into the main CrossCellMigrationTask. This means there will be multiple compute RPC API version bumps as new methods get added. The scheduler stuff and some API things happen before the functional testing starts so I can actually start functional testing with stubbing out as little as possible (I have to stub out the policy check in the functional test long before it's enabled in the API at the end of the series). The CrossCellWeigher comes late in the series because by default all in-tree weighers are enabled so I don't want to land that and have it running if there is no chance it will be used yet. It would be a no-op but still not much point in landing that early. Also towards the end of the series is the API code to handle routing external events to multiple cells [6]. That's actually pretty lightweight and we could move it up to merge earlier, it's just not really something that benefits the in-tree functional tests (but is required to make the tempest tests work). **Hairy things** * The patch that adds the Instance.hidden field [7] is a bit complicated just because we have to account for it in a few places (not only listing instances but counting quotas). I have functional testing for all of that though to make people feel better. * The DB copy stuff in the TargetDBSetupTask is not really complicated but there are a lot of instance-related records to account for so I've tried to do my best there and written a pretty extensive set of tests for it. There are things we don't copy over like pci_devices and services since those are only relevant to other resources, like the compute node, in the source cell DB. Also, I messed around with doing the inserts in a single transaction like was done here [8] but couldn't really get it to work so I just left a TODO for now (maybe melwitt can figure it out). The single transaction insert is mostly low priority for me since if anything in there fails we hard destroy the target cell instance which will cascade delete any related records in the target cell DB. * The functional tests themselves can be a little daunting at first but I've tried to fully document those to make it clear what is being setup, what is happening, and what is being asserted. There are still TODOs for more testing to be done as I get more review (and time). * I have not written a docs patch yet since it could change as the series starts getting reviewed, and there is likely a lot to document (the spec is big, this email is big). When I do I will likely try to add a sequence diagram as well (and write one for normal resize since we don't have one of those in our docs today - we can do that whenever). 
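As an aside on the scheduler piece described earlier, here is a toy sketch of the "prefer hosts in the source cell unless configured otherwise" weighing idea (plain illustrative Python, not the actual CrossCellWeigher interface or configuration):

    def weigh_hosts(host_states, source_cell_uuid, multiplier=1000000.0):
        """host_states: iterable of (host_name, cell_uuid) pairs.

        Returns hosts sorted best-first. A positive multiplier prefers
        staying in the source cell; flipping it negative would prefer
        moving instances out, e.g. to drain an old cell.
        """
        def score(host):
            _name, cell_uuid = host
            return multiplier * (1.0 if cell_uuid == source_cell_uuid else 0.0)

        return sorted(host_states, key=score, reverse=True)

    # weigh_hosts([('host1', 'cell1'), ('host2', 'cell2')], 'cell1')
    # keeps host1 (same cell) first with the default multiplier.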
If you have questions please reply to this email, ping me on IRC, or just ask in the code review - if you find I'm not replying in the review and rebasing over your comment, drop a -1 to make me notice (I'm currently up to 43 open changes so noticing random comments in a change in the middle is not obvious without a -1). If you made it this far, thanks, and remember that you asked for this review guide. :) [1] https://specs.openstack.org/openstack/nova-specs/specs/train/approved/cross-cell-resize.html [2] https://youtu.be/OyNFIOSGjac?t=143 [3] https://review.opendev.org/#/c/641806/ [4] https://review.opendev.org/#/c/636253/ [5] https://review.opendev.org/#/c/656656/ [6] https://review.opendev.org/#/c/658478/ [7] https://review.opendev.org/#/c/631123/ [8] https://review.opendev.org/#/c/586742/ -- Thanks, Matt From mriedemos at gmail.com Thu May 16 20:42:50 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 16 May 2019 15:42:50 -0500 Subject: [OSC][PTG] Summary: many things to do! In-Reply-To: References: <02baeaf3-5202-d506-a312-70d5e9f68071@gmail.com> Message-ID: <22dd6c3e-5c89-b282-1681-eb4e2d951943@gmail.com> On 5/16/2019 1:48 AM, Artem Goncharov wrote: > Artom signed up for this but I don't see patches yet. > > Here you are: > > Since I am not a full-time contributor it takes a bit more time, > otherwise I would have created patches for migration of all osc services > to sdk already ;-) I'm hoping there wasn't a misunderstanding with my comment. I literally meant "Artom" and not a typo for "Artem", because I'm talking about Artom Lifshitz specifically signing up [1] for the boot-from-volume stuff I mentioned. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005793.html -- Thanks, Matt From yangyi01 at inspur.com Fri May 17 00:26:08 2019 From: yangyi01 at inspur.com (=?utf-8?B?WWkgWWFuZyAo5p2o54eaKS3kupHmnI3liqHpm4blm6I=?=) Date: Fri, 17 May 2019 00:26:08 +0000 Subject: =?utf-8?B?562U5aSNOiBbRFZSIGNvbmZpZ10gQ2FuIHdlIHVzZSBkcnZfc25hdCBhZ2Vu?= =?utf-8?B?dF9tb2RlIGluIGV2ZXJ5IGNvbXB1dGUgbm9kZT8=?= In-Reply-To: <8B2A08A6-A289-4E72-B98E-587C91C52C57@redhat.com> References: <8B2A08A6-A289-4E72-B98E-587C91C52C57@redhat.com> Message-ID: <7fc153d465d549acb2b1457f26ced6e7@inspur.com> Slawomir, thanks a lot. -----邮件原件----- 发件人: Slawomir Kaplonski [mailto:skaplons at redhat.com] 发送时间: 2019年5月16日 18:01 收件人: Yi Yang (杨燚)-云服务集团 抄送: openstack-discuss at lists.openstack.org 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? 重要性: 高 Hi, According to documentation which You cited even "‘dvr_snat’ - this enables centralized SNAT support in conjunction with DVR”. So yes, dvr_snat will do both, SNAT mode as well as DVR for E-W traffic. We are using it like that in some CI jobs for sure and it works. But I’m not 100% sure that this is “production ready” solution. > On 16 May 2019, at 05:47, Yi Yang (杨燚)-云服务集团 wrote: > > Hi, folks > > I saw somebody discussed distributed SNAT, but finally they didn’t make agreement on how to implement distributed SNAT, my question is can we use dvr_snat agent_mode in compute node? I understand dvr_snat only does snat but doesn’t do east west routing, right? Can we set dvr_snat and dvr in one compute node at the same time? It is equivalent to distributed SNAT if we can set drv_snat in every compute node, isn’t right? I know Opendaylight can do SNAT in compute node in distributed way, but one external router only can run in one compute node. 
> > I also see https://wiki.openstack.org/wiki/Dragonflow is trying to implement distributed SNAT, what are technical road blocks for distributed SNAT in openstack dvr? Do we have any good way to remove these road blocks? > > Thank you in advance and look forward to getting your replies and insights. > > Also attached official drv configuration guide for your reference. > > https://docs.openstack.org/neutron/stein/configuration/l3-agent.html > > agent_mode¶ > Type > string > > Default > legacy > > Valid Values > dvr, dvr_snat, legacy, dvr_no_external > > The working mode for the agent. Allowed modes are: ‘legacy’ - this preserves the existing behavior where the L3 agent is deployed on a centralized networking node to provide L3 services like DNAT, and SNAT. Use this mode if you do not want to adopt DVR. ‘dvr’ - this mode enables DVR functionality and must be used for an L3 agent that runs on a compute host. ‘dvr_snat’ - this enables centralized SNAT support in conjunction with DVR. This mode must be used for an L3 agent running on a centralized node (or in single-host deployments, e.g. devstack). ‘dvr_no_external’ - this mode enables only East/West DVR routing functionality for a L3 agent that runs on a compute host, the North/South functionality such as DNAT and SNAT will be provided by the centralized network node that is running in ‘dvr_snat’ mode. This mode should be used when there is no external network connectivity on the compute host. > — Slawek Kaplonski Senior software engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3600 bytes Desc: not available URL: From yangyi01 at inspur.com Fri May 17 00:29:39 2019 From: yangyi01 at inspur.com (=?utf-8?B?WWkgWWFuZyAo5p2o54eaKS3kupHmnI3liqHpm4blm6I=?=) Date: Fri, 17 May 2019 00:29:39 +0000 Subject: =?utf-8?B?562U5aSNOiBbRFZSIGNvbmZpZ10gQ2FuIHdlIHVzZSBkcnZfc25hdCBhZ2Vu?= =?utf-8?B?dF9tb2RlIGluIGV2ZXJ5IGNvbXB1dGUgbm9kZT8=?= In-Reply-To: References: Message-ID: <67d4e0f3053949fc844b6d1d26f05559@inspur.com> Thanks Brian, your explanation clarified something, but I don't get the answer if we can have multiple compute nodes are configured to dvr_snat, for this case, SNAT IPs are obviously different. Why do we want to use network node if compute node can do everything? -----邮件原件----- 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] 发送时间: 2019年5月16日 21:46 收件人: Yi Yang (杨燚)-云服务集团 抄送: openstack-discuss at lists.openstack.org 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? Hi Yi, I'm a little confused by the question, comments inline. On 5/15/19 11:47 PM, Yi Yang (杨燚)-云服务集团 wrote: > Hi, folks > > I saw somebody discussed distributed SNAT, but finally they didn’t > make agreement on how to implement distributed SNAT, my question is > can we use dvr_snat agent_mode in compute node? I understand dvr_snat > only does snat but doesn’t do east west routing, right? Can we set > dvr_snat and dvr in one compute node at the same time? It is > equivalent to distributed SNAT if we can set drv_snat in every compute > node, isn’t right? I know Opendaylight can do SNAT in compute node in > distributed way, but one external router only can run in one compute node. Distributed SNAT is not available in neutron, there was a spec proposed recently though, https://review.opendev.org/#/c/658414 Regarding the agent_mode setting for L3, only one mode can be set at a time. 
Typically 'dvr_snat' is used on network nodes and 'dvr' on compute nodes because it leads to less resource usage (i.e. namespaces). The centralized part of the router hosting the default SNAT IP address will only be scheduled to one of the agents in 'dvr_snat' mode. All the DVR modes can do East/West routing when an instance is scheduled to the node, and two can do North/South - 'dvr_snat' using the default SNAT IP, and 'dvr' using a floating IP. 'dvr_no_external' can only do East/West. Hopefully that clarifies things. -Brian > I also see https://wiki.openstack.org/wiki/Dragonflow is trying to > implement distributed SNAT, what are technical road blocks for > distributed SNAT in openstack dvr? Do we have any good way to remove > these road blocks? > > Thank you in advance and look forward to getting your replies and insights. > > Also attached official drv configuration guide for your reference. > > https://docs.openstack.org/neutron/stein/configuration/l3-agent.html > > agent_mode¶ > > Type > > string > > Default > > legacy > > Valid Values > > dvr, dvr_snat, legacy, dvr_no_external > > The working mode for the agent. Allowed modes are: 'legacy' - this > preserves the existing behavior where the L3 agent is deployed on a > centralized networking node to provide L3 services like DNAT, and SNAT. > Use this mode if you do not want to adopt DVR. 'dvr' - this mode > enables DVR functionality and must be used for an L3 agent that runs > on a compute host. 'dvr_snat' - this enables centralized SNAT support > in conjunction with DVR. This mode must be used for an L3 agent > running on a centralized node (or in single-host deployments, e.g. devstack). > 'dvr_no_external' - this mode enables only East/West DVR routing > functionality for a L3 agent that runs on a compute host, the > North/South functionality such as DNAT and SNAT will be provided by > the centralized network node that is running in 'dvr_snat' mode. This > mode should be used when there is no external network connectivity on > the compute host. > -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3600 bytes Desc: not available URL: From missile0407 at gmail.com Fri May 17 01:44:46 2019 From: missile0407 at gmail.com (Eddie Yen) Date: Fri, 17 May 2019 09:44:46 +0800 Subject: [Kolla] Few questions about offline deploy & operation. Message-ID: Hi everyone, I'm new to using Kolla-ansible to deploy OpenStack. Everything works well, but I have a few questions about Kolla since I was using Fuel as a deploy tool before. The questions may be silly, but I really want to know since I didn't find much information to answer them. About offline deployment: I already know that we need to have a local OS repository (CentOS or Ubuntu), a local docker package repository, and a local docker registry. But one thing I'm not sure about is whether it will use pip to install python packages or not, because I found it will also install pip on the target node during bootstrap-servers. If so, which python packages should I prepare? About operation: 1. Is it possible to operate a single service across all OpenStack nodes? For example, using crm to check the status of MySQL or RabbitMQ services on all control nodes. 2. How can I maintain a Ceph OSD that goes down because of a disk issue? I know how to rebuild an OSD with Ceph commands, but I'm not sure how to do that if ceph-osd is running in a container. Many thanks, Eddie.
-------------- next part -------------- An HTML attachment was scrubbed... URL: From aaronzhu1121 at gmail.com Fri May 17 02:15:20 2019 From: aaronzhu1121 at gmail.com (Rong Zhu) Date: Fri, 17 May 2019 10:15:20 +0800 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: References: Message-ID: Hi Sergey, What is the process about rebuild the database? Thanks, Rong Zhu Sergey Nikitin 于2019年5月7日 周二00:59写道: > Hello Rong, > > Sorry for long response. I was on a trip during last 5 days. > > What I have found: > Lets take a look on this patch [1]. It must be a contribution of gengchc2, > but for some reasons it was matched to Yuval Brik [2] > I'm still trying to find a root cause of it, but anyway on this week we > are planing to rebuild our database to increase RAM. I checked statistics > of gengchc2 on clean database and it's complete correct. > So your problem will be solved in several days. It will take so long time > because full rebuild of DB takes 48 hours, but we need to test our > migration process first to keep zero down time. > I'll share a results with you here when the process will be finished. > Thank you for your patience. > > Sergey > > [1] https://review.opendev.org/#/c/627762/ > [2] > https://www.stackalytics.com/?user_id=jhamhader&project_type=all&release=all&metric=commits&company=&module=freezer-api > > > On Mon, May 6, 2019 at 6:30 AM Rong Zhu wrote: > >> Hi Sergey, >> >> Do we have any process about my colleague's data loss problem? >> >> Sergey Nikitin 于2019年4月29日 周一19:57写道: >> >>> Thank you for information! I will take a look >>> >>> On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu wrote: >>> >>>> Hi there, >>>> >>>> Recently we found we lost a person's data from our company at the >>>> stackalytics website. >>>> You can check the merged patch from [0], but there no date from >>>> the stackalytics website. >>>> >>>> stackalytics info as below: >>>> Company: ZTE Corporation >>>> Launchpad: 578043796-b >>>> Gerrit: gengchc2 >>>> >>>> Look forward to hearing from you! >>>> >>> >> Best Regards, >> Rong Zhu >> >>> >>>> -- >> Thanks, >> Rong Zhu >> > > > -- > Best Regards, > Sergey Nikitin > -- Thanks, Rong Zhu -------------- next part -------------- An HTML attachment was scrubbed... URL: From marios at redhat.com Fri May 17 06:42:17 2019 From: marios at redhat.com (Marios Andreou) Date: Fri, 17 May 2019 09:42:17 +0300 Subject: [tripleo][ci] rdo cloud outage - maintenance In-Reply-To: References: Message-ID: update - tripleo rdo 3rd party jobs and rdo workloads should all be running normally today. There *may* be more disruption later if another reboot is required but afaik that is yet to be determined. thanks, marios On Thu, May 16, 2019 at 3:51 PM Marios Andreou wrote: > There is ongoing maintenance outage in rdo cloud... this means there are > currently NO 3rd party rdo jobs running on any tripleo project reviews. > > We also stopped the promoter about an hour ago in preparation so there are > no promotions. The outage should be resolved today we will update when > things are back to normal > > thanks > > (marios|ruck and panda|rover on #tripleo and #oooq if you want to ping us) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Fri May 17 06:47:35 2019 From: zigo at debian.org (Thomas Goirand) Date: Fri, 17 May 2019 08:47:35 +0200 Subject: [glance][interop] standardized image "name" ? 
In-Reply-To: <4234325e-7569-e11f-53e9-72f07ed8ce53@gmail.com> References: <939FEDBD-6E5E-43F2-AE1F-2FE71A71BF58@vmware.com> <20190408123255.vqwwvzzdt24tm3pq@yuggoth.org> <4234325e-7569-e11f-53e9-72f07ed8ce53@gmail.com> Message-ID: <7893dbd2-acc1-692c-df38-29ec7c8a98e7@debian.org> On 4/18/19 2:37 PM, Brian Rosmaita wrote: > I really need to get caught up on my ML reading. > > On 4/11/19 6:40 PM, Thomas Goirand wrote: > [snip] >> It's just a shame that Glance doesn't show MD5 and not sha512 sums by >> default... > > The Secure Hash Algorithm Support ("multihash") spec [0] for Glance was > implemented in Rocky and provides a self-describing secure hash on > images (in addition to the 'checksum', which is preserved for backward > compatability.) The default is SHA-512. See the Rocky release notes [1] > for some implementation details not covered by the spec. > > The multihash is displayed in the image-list and image-show API > responses since Images API v2.7, and in the glanceclient since 2.12.0. That's the thing. "image show --long" continues to display the md5sum instead of the sha512. Cheers, Thomas Goirand (zigo) From dangtrinhnt at gmail.com Fri May 17 07:02:00 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Fri, 17 May 2019 16:02:00 +0900 Subject: [telemetry] Voting for a new meeting time In-Reply-To: References: Message-ID: Hi team, According to the poll [1], I will organize 2 meeting sessions on May 23rd: - Core contributors (mostly in APAC): 02:00 UTC - Cross-projects contributors (mostly in US or around that): 08:00 UTC Some core members or at least myself will be able to attend both meetings so I think it should be fine. I will put the meeting agenda here [2]. [1] https://doodle.com/poll/cd9d3ksvpms4frud [2] https://etherpad.openstack.org/p/telemetry-meeting-agenda Bests, On Fri, May 10, 2019 at 12:05 PM Trinh Nguyen wrote: > Hi team, > > As discussed, we should have a new meeting time so more contributors can > join. So please cast your vote in the link below *by the end of May 15th > (UTC).* > > https://doodle.com/poll/cd9d3ksvpms4frud > > One thing to keep in mind that I still want to keep the old meeting time > as an option, not because I'm biasing the APAC developers but because it is > the time that most of the active contributors (who actually pushing patches > and review) can join. > > When we have the results if we end up missing some contributors (I think > all of you are great!), no worries. We could try to create different > meetings for a different set of contributors, something like: > > - Developers: for bug triage, implementation, etc. > - Operators: input from operators are important too since we need real > use cases > - Cross-project: Telemetry may need to work with other teams > - Core team: for the core team to discuss the vision and goals, > planning > > > Okie, I know we cannot live without monitoring/logging so let's rock the > world guys!!! > > Bests > > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Fri May 17 07:33:16 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Fri, 17 May 2019 16:33:16 +0900 Subject: [telemetry] Team meetings on May 23rd 02:00 and 08:00 Message-ID: Hi team, I have sent this in another thread but I think It would better to do it in a separate email. 
According to the poll [1], I will organize 2 meeting sessions on May 23rd: - Core contributors (mostly in APAC): 02:00-03:00 UTC - Cross-projects contributors (mostly in the US or around): 08:00-09:00 UTC Some core members or at least myself will be able to attend both meetings so I think it should be fine. I draft the meeting agenda in [2]. Please check it out and input the topics that you want to discuss but remember the 1-hour meeting constraint. [1] https://doodle.com/poll/cd9d3ksvpms4frud [2] https://etherpad.openstack.org/p/telemetry-meeting-agenda Bests, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From geguileo at redhat.com Fri May 17 08:55:12 2019 From: geguileo at redhat.com (Gorka Eguileor) Date: Fri, 17 May 2019 10:55:12 +0200 Subject: [Cinder][Ceph] Migrating backend host In-Reply-To: References: <11c12057-6213-2073-8f73-a693ec798cd6@uwo.ca> Message-ID: <20190517085512.g747at5bm6lp2ugf@localhost> On 16/05, Mohammed Naser wrote: > On Thu, May 16, 2019 at 2:14 AM Gary Molenkamp wrote: > > > > I am moving my cinder-volume service from one controller to another, and > > I'm trying to determine the correct means to update all existing > > volume's back-end host reference. I now have two cinder volume services > > running in front of the same ceph cluster, and I would like to retire > > the old cinder-volume service. > > > > For example, on my test cloud, "openstack volume service list": > > > > +------------------+----------------------------------+------+---------+-------+----------------------------+ > > | Binary | Host | Zone | Status | > > State | Updated At | > > +------------------+----------------------------------+------+---------+-------+----------------------------+ > > | cinder-scheduler | osdev-ctrl1 | nova | enabled | up | > > 2019-05-15T17:44:05.000000 | > > | cinder-volume | osdev-ctrl2 at rbd | nova | disabled | up | > > 2019-05-15T17:44:04.000000 | > > | cinder-volume | osdev-ctrl1 at rbd | nova | enabled | up | > > 2019-05-15T17:44:00.000000 | > > +------------------+----------------------------------+------+---------+-------+----------------------------+ > > > > Now, an existing volume has a reference to the disabled cinder-volume: > > "os-vol-host-attr:host | osdev-ctrl2 at rbd#rbd" > > > > but this needs to be changed to: > > "os-vol-host-attr:host | osdev-ctrl1 at rbd#rbd" > > > > As both controllers are members of the same ceph cluster, an "openstack > > volume migrate" is not appropriate. If it is appropriate, my testing > > has shown that it errors out and deletes the source volume from ceph. > > > > I can alter this field manually in the cinder database, but in > > following the "don't mess with the data model" mantra, is there a means > > to do this from the cli? > > https://docs.openstack.org/cinder/stein/cli/cinder-manage.html#cinder-volume > > cinder-manage volume update_host --currenthost > --newhost > > That should do it :) > Hi, Since you've already create volumes in the new cinder-volume service Mohammed's suggestion is the right one. For the future, when you want to move the service to a new node, I would recommend you setting the 'backend_host' configuration option in the driver's section or the 'host' option in the DEFAULT section. That way you don't need to modify the database and the service will start with the same host as the old one regardless of where you are running it. 
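To illustrate, a minimal cinder.conf sketch of that recommendation (the backend section name and host value below are examples based on the output in this thread, not required values):

    [DEFAULT]
    enabled_backends = rbd
    # Alternatively, set host = ... here in DEFAULT to pin the whole service name.

    [rbd]
    volume_backend_name = rbd
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    # Pin the service identity so volumes stay on "osdev-ctrl1@rbd"
    # regardless of which node actually runs cinder-volume.
    backend_host = osdev-ctrl1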
This is the recommended way to deploy Cinder-Volume in Active-Passive, as you won't need to make any changes when you start the service on a new node. Cheers, Gorka. > > Thanks, > > Gary. > > > > Openstack release: Queens. > > Distro: Redhat (Centos) 7.6 > > > > -- > > Gary Molenkamp Computer Science/Science Technology Services > > Systems Administrator University of Western Ontario > > molenkam at uwo.ca http://www.csd.uwo.ca > > (519) 661-2111 x86882 (519) 661-3566 > > > > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. http://vexxhost.com > From lajos.katona at ericsson.com Fri May 17 10:48:32 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Fri, 17 May 2019 10:48:32 +0000 Subject: [keystone][placement][neutron][api-sig] http404 to NotFound, or how should a http json error body look like? Message-ID: <9ae10062-a9c8-3e76-15a1-da0745361c57@ericsson.com> Hi, Recently I planned to add tests to the neutron feature routed provider networks, which actually uses placement API to make possible the scheduling based on neutron segment information. I realized that the feature last worked on Queens. We can discuss separately the lack of test coverage, and similar things, but I would like to narrow or view now. The neutron bug report, with some background information: https://bugs.launchpad.net/neutron/+bug/1828543 After some debugging I realized that the problem comes from the fact that at a point neutron uses the information received in http404, like this: try:     return self._get(url).json() except keystoneauth1.exceptions.NotFound as e:     if 'foo' in e.details:         do_something() see: https://opendev.org/openstack/neutron-lib/src/branch/master/neutron_lib/placement/client.py#L405-L406 keystoneauth1 expects the http body in case of for example http404 to be something like this (python dict comes now): body= {'error': {'message': 'Foooooo', 'details': 'Baaaaar'}} see: https://opendev.org/openstack/keystoneauth/src/branch/master/keystoneauth1/exceptions/http.py#L406-L415 But placement started to adopt one suggestion from the API-SIG: http://specs.openstack.org/openstack/api-wg/guidelines/errors.html, and that is something like this (again python): body={'errors': [{'status': 404, 'title': 'Not Found', 'detail': 'The resource could not be found.... ', 'request_id': '...'}]} Shall I ask for help how I should make this bug in neutron fixed? As a quick and dirty solution in neutron I can use the response from the exception, but I feel that there should be a better way to fix this:-) Thanks in advance for the help Regards LAjos From kchamart at redhat.com Fri May 17 11:07:21 2019 From: kchamart at redhat.com (Kashyap Chamarthy) Date: Fri, 17 May 2019 13:07:21 +0200 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> Message-ID: <20190517110721.GA19519@paraplu> On Thu, May 16, 2019 at 10:42:47AM -0500, Eric Fried wrote: > > I've added a link to this thread on the agenda for tomorrow's > > Security SIG meeting > > This happened [1]. TL;DR: it does more potential good than harm to > expose these traits ("scheduler roulette is not a security measure" > --fungi). Thanks for the summary, Eric. I've just read the relevant IRC log discussion. 
Thanks to everyone who's chimed in (Jeremey, et al). > > Others have said this (at least Dan): This seems like something > > where something other than nova ought to handle it. A host which > > shouldn't be scheduled to should be disabled (as a service). > > WFM. Scrap strawman. ACK. > Given that it's not considered a security issue, we could expose the > (low-level, CPU flag) traits so that "other than nova" can use them. If > we think there's demand. Okay, so I take it that all the relevant low-level CPU flags (including things like SSBD, et al) as proposed here[2][3] can be added to 'os-traits'. And tools _other_ than Nova can consume, if need be. Correct me if I misparsed. > > How do people feel about the idea of forming a core group for those > > two repos that includes placement cores but has additions from nova > > (Dan, Kashyap and Sean would make good candidates) and other projects > > that consume them? I'm fine participating, if I can provide useful input. > ++ > > efried > > [1] > http://eavesdrop.openstack.org/irclogs/%23openstack-meeting/%23openstack-meeting.2019-05-16.log.html#t2019-05-16T15:06:24 [2] https://review.opendev.org/#/c/655193/4/os_traits/hw/cpu/x86.py [3] https://review.opendev.org/#/c/655193/4/os_traits/hw/cpu/amd.py -- /kashyap From cdent+os at anticdent.org Fri May 17 11:42:52 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 17 May 2019 12:42:52 +0100 (BST) Subject: [keystone][placement][neutron][api-sig] http404 to NotFound, or how should a http json error body look like? In-Reply-To: <9ae10062-a9c8-3e76-15a1-da0745361c57@ericsson.com> References: <9ae10062-a9c8-3e76-15a1-da0745361c57@ericsson.com> Message-ID: On Fri, 17 May 2019, Lajos Katona wrote: > keystoneauth1 expects the http body in case of for example http404 to be > something like this (python dict comes now): > body= {'error': {'message': 'Foooooo', 'details': 'Baaaaar'}} > see: > https://opendev.org/openstack/keystoneauth/src/branch/master/keystoneauth1/exceptions/http.py#L406-L415 > > But placement started to adopt one suggestion from the API-SIG: > http://specs.openstack.org/openstack/api-wg/guidelines/errors.html, > > and that is something like this (again python): > body={'errors': [{'status': 404, 'title': 'Not Found', 'detail': 'The > resource could not be found.... ', 'request_id': '...'}]} Thanks for all the detail in this message and on the bug report. It helps make understanding what's going on a lot easier. As you've discovered placement is following the guidelines for how errors are supposed to be formatted. If keystoneauth1 can't speak that format, that's probably the primary bug. However, it also sounds like you're encountering a few different evolutions in placement that may need to be addressed in older versions of neutron's placement client: * For quite a while placement was strict about responding to Accept headers appropriately. This was based on its interaction with Webob. If you didn't ask for json in the Accept header, errors could come in HTML or Text. The most reliable fix for this in any client of any API is to always send an Accept header that states how you want responses to be presented (in this case application/json). This can lead to interesting parsing troubles if you are rely on the bodies of responses. * In microversion 1.23 we started sending 'code' with error responses in an effort to avoid needing to parse error responses. I've got a few different suggestions that you might want to explore. 
None of them are a direct fix for the issue you are seeing but may lead to some ideas. First off, this would be a big change, but I do not think it is good practice to raise exceptions when getting 4xx responses. Instead in the neutron-lib placement client it would be better to branch on status code in the resp object. If it doesn't matter why you 404d, just that you did, you could log just that, not the detail. Another thing to think about is in the neutron placement client you have a get_inventory [1] method which has both resource provider uuid and resource class in the URL and thus can lead to the "two different types of 404s" issue that is making parsing the error response required. You could potentially avoid this by implementing get_inventories [2] which would only 404 on bad resource provider uuid and would return inventory for all resource classes on that rp. You can use PUT on the same URL to replace all inventory on that rp. Make sure you send Accept: application/json in all your requests in the client. Make keystoneauth1 interpret two diferent types of errors response: the one it does, and the one in the api-sig guidelines. Note that I've yet to see any context where there is more than one error in the list, so it is always errors[0] that gets inspected. On the placement side we should probably add codes to the few different places where a 404 can happen for different reasons (they are almost all combinations of not finding a resource provider or not finding a resource class). If that were present you could branch the library code on the code in the errors structure instead of the detail. However if its possible to avoid choosing which 404, that's probably better. I hope some of that helps. Hopefully some keystoneauth folk will chime in too. [1] https://opendev.org/openstack/neutron-lib/src/branch/master/neutron_lib/placement/client.py#L417k [2] https://developer.openstack.org/api-ref/placement/#resource-provider-inventories -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From mark at stackhpc.com Fri May 17 12:45:02 2019 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 17 May 2019 13:45:02 +0100 Subject: [ptg][kolla][openstack-ansible][tripleo] PTG cross-project summary In-Reply-To: References: <0b78939a-ccb2-77fa-f2a4-d462576bcbb2@redhat.com> Message-ID: On Thu, 16 May 2019 at 14:59, Cédric Jeanneret wrote: > > > On 5/16/19 3:53 PM, Mark Goddard wrote: > > > > > > On Thu, 16 May 2019 at 11:41, Cédric Jeanneret > > wrote: > > > > > > > > On 5/7/19 11:07 AM, Mark Goddard wrote: > > > Hi, > > > > > > This is a summary of the ad-hoc cross project session between the > > > OpenStack Ansible and Kolla teams. > > > > > > It occurred to me that our two teams likely face similar > > challenges, and > > > there are areas we could collaborate on. I've tagged TripleO also > > since > > > the same applies there. > > > > > > [Collaboration on approach to features] > > > This was my main reason for proposing the session - there are > features > > > and changes that all deployment tools need to make. Examples > coming up > > > include support for upgrade checkers and IPv6. Rather than work in > > > isolation and solve the same problem in different ways, perhaps we > > could > > > share our experiences. The implementations will differ, but > > providing a > > > reasonably consistent feel between deployment tools can't be a bad > > thing. 
> > > > > > As a test case, we briefly discussed our experience with the > upgrade > > > checker support added in Stein, and found that our expectation of > > how it > > > would work was fairly aligned in the room, but not aligned with > how I > > > understand it to actually work (it's more of a post-upgrade check > > than a > > > pre-upgrade check). > > > > Hello! I'm pretty sure the new Validation Framework can help here, > since > > we intend to provide a pre|post deploy|update|upgrade way to run > > validations. > > > > Feel free to ping me if you want (Tengu on #tripleo) - or just ask > > questions in here :). > > > > Since we want to extend the framework to not only cover tripleo and > > openstack, that would be a good start with kolla imho :) > > > > Hi Cedric. The validation framework is based around this new tempest > > ansible role, correct? Presumably each deployment tool would provide a > > its own entry point on top of that. How does the pre/post deploy etc > > affect what validations are run? Is that up to the deployment tool, or > > defined in the ansible role, or somewhere else? > > So for now we "only" have a (fairly) good integration with the > "openstack tripleo" subcommand[1]. > > Currently, we're mainly using plain ansible roles and playbook, being > launched by Mistral. We intend to allow non-Mistral runs (Work In > Progress) and, in not too far future hopefully, to provide a descent > python library within the tripleo-validations package for a better > integration. > It's still early, we still need to put things together, but if we can > already raise awareness and interest for this work, it will help getting > more involvement and time in order to provide a great bundle :). > > If you want to know more, you can already have a look at the new doc[1]. > > Does it help a bit understanding the possibilities? > > Thanks for following up Cedric, that is quite useful. I'll take a look through. > Cheers, > > C. > > [1] > https://docs.openstack.org/tripleo-docs/latest/validations/index.html > (note: we will probably add some more inputs in there) > -- > Cédric Jeanneret > Software Engineer - OpenStack Platform > Red Hat EMEA > https://www.redhat.com/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bluejay.ahn at gmail.com Fri May 17 12:49:50 2019 From: bluejay.ahn at gmail.com (Jaesuk Ahn) Date: Fri, 17 May 2019 21:49:50 +0900 Subject: [openstack-helm] List of the current meeting times, and opinion on adjustment Message-ID: Hi In PTG, I have put as one of agenda during openstack-helm PTG session, and it was then discussed during the Airship PTG session. The reason is, to encourage current/potential contributors from my region to show up at these community meetings, a little bit more preferable time would be great. (from https://etherpad.openstack.org/p/airship-ptg-train) - Meeting times - Use https://everytimezone.com/ - There is no good time for everyone - lots of folks west coast, europe, asia - Compromise -- two meeting times on same day, separated ~10 hours apart or some such - Outcome: +1 - use https://everytimezone.com/ to come up with a few options, including single meetings per week and multiple - Let people vote on it in the patchset to change the meeting time - Note: OSH to do something similar In reality, there are many people who are attending both openstack-helm and Airship meeting. I also totally agree with Pete (PTL) that it makes sense to ensure the meeting time for both projects work together. 
Therefore, FYI, here is the summary of meeting times for both Airship and OpenStack-Helm Here is the current Airship Meeting Time - Airship Design Call (Twice a Week untill airship 2.0 design settled) - Tuesday (1.5h) : 13:00 UTC (09:00 AM EST / 06:00 AM PST / 13:00 CEST / 22:00 Korea) - Thursday (1.5 h) : 15:00 UTC (11:00 AM EST / 08:00 AM PST / 15:00 CEST / 00:00 AM Korea Friday) - Airship Weekly IRC Meeting - Tuesday (1h) : 16:00 UTC (12:00 PM EST / 09:00 AM PST / 16:00 CEST / 01:00 AM Korea Wednesday) Current OpenStack-Helm Meeting Time - OSH Weekly IRC Meeting - Tuesday (1h) : 15:00 UTC (11:00 AM EST / 08:00 AM PST / 15:00 CEST / 00:00 AM Korea Friday) Now we have all the meeting time listed here in this email. I would like to discuss possible available options for the adjustment. I will start!. Here is my opinion. For me, first of all, 13:00 UTC (10PM) or 14:00 UTC (11PM) would be the reasonable time to have a weekly meetings (I am living in Korea). Current 15:00 UTC (midnight) is a bit difficult time to regularly attend the weekly meeting. It also make me difficult to ask someone else from my region to attend the meeting. Secondly, as mentioned during PTG, having two meetings on the same day will also be an option. Personally, I concern that having two meetings will put somewhat heavy burden on PTL and core reviewers. PLEASE share your opinion on possible options for meeting time. Thanks you. -- *Jaesuk Ahn*, Ph.D. Software R&D Center, SK Telecom -------------- next part -------------- An HTML attachment was scrubbed... URL: From petebirley+openstack-dev at gmail.com Fri May 17 13:06:35 2019 From: petebirley+openstack-dev at gmail.com (Pete Birley) Date: Fri, 17 May 2019 08:06:35 -0500 Subject: [openstack-discuss][openstack-helm] Nominating Itxaka Serrano Garcia to core review team Message-ID: Hi OpenStack-Helm Core Team, I would like to nominate a new core reviewer for OpenStack-Helm: Itxaka Serrano Garcia (igarcia at suse.com) Itxaka has been doing many reviews and contributed some critical patches to OpenStack Helm, helping make the project both more approachable and better validated through his great tempest work. Also, he is a great ambassador for the project both in IRC and community meetings. Voting is open for 7 days. Please reply with your +1 vote in favor or -1 as a veto vote. Regards, Pete Birley (portdirect) -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Fri May 17 13:55:14 2019 From: hberaud at redhat.com (Herve Beraud) Date: Fri, 17 May 2019 15:55:14 +0200 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: <6bf60a87-80d7-b158-ea9d-8b4576c9f86c@nemebean.com> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <20190516054153.GC18431@thor.bakeyournoodle.com> <6bf60a87-80d7-b158-ea9d-8b4576c9f86c@nemebean.com> Message-ID: Le jeu. 16 mai 2019 à 17:30, Ben Nemec a écrit : > > > On 5/16/19 4:29 AM, Herve Beraud wrote: > > Hello, > > > > To help us to be more reactive on similar issues related to requirements > > who drop python 2 (the sphinx use case) > > I've submit a patch https://review.opendev.org/659289 to schedule > > "check-requirements" daily. > > > > Normally with that if openstack/requirements add somes changes who risk > > to break our CI we will be informed quickly by this periodical job. 
> > > > I guess we will be facing many similar issues in the next months due to > > the python 2.7 final countdown and libs that will drop python 2.7 support. > > > > For the moment I have only submitted my patch on oslo.log, but if it works, > > we can then copy it to all the oslo projects. > > > > I'm not a zuul expert and I don't know if my patch is correct or not, so > > please feel free to review it and to leave comments to let me know how to > > proceed with periodic jobs. > > > > Also, Oslo cores could check the result of this job daily to know if > > actions are needed and inform the team via the ML or something like that to > > fix the issue efficiently. > > This is generally the problem with periodic jobs. People don't pay > attention to them so issues still don't get noticed until they start > breaking live patches. As I said in IRC, if you're willing to commit to > checking the periodic jobs daily I'm okay with adding them. > I'm OK with paying attention to and checking the periodic jobs, but sometimes I'll be away (PTO, etc.) and other people will need to pay attention during those periods. > I know when dims was PTL he had nightly jobs running on all of the Oslo > repos, but I think that was in his own private infra so I don't know > that we could reuse what he had. > > > > > Thoughts? > > > > Yours Hervé. > > > > > > Le jeu. 16 mai 2019 à 07:44, Tony Breeds > > a écrit : > > > > On Tue, May 14, 2019 at 11:09:26AM -0400, Zane Bitter wrote: > > > > > It's breaking the whole world and I'm actually not sure there's a > > good > > > reason for it. Who cares if sphinx 2.0 doesn't run on Python 2.7 > > when we set > > > and achieved a goal in Stein to only run docs jobs under Python > > 3? It's > > > unavoidable for stable/rocky and earlier but it seems like the > > pain on > > > master is not necessary. > > > > While we support python2 *anywhere* we need to do this. The current > > tools (both ours and the broader python ecosystem) need to have these > > markers. > > > > I apologise that we managed to mess this up. We're looking at how we can > > avoid this in the future, but we don't really get any kind of signal > > about $library dropping support for $python_version. The py2 thing is > > more visible than a py3 minor release, but they're broadly the same thing. > > > > Yours Tony. 
> > > > > > > > -- > > Hervé Beraud > > Senior Software Engineer > > Red Hat - Openstack Oslo > > irc: hberaud > > -----BEGIN PGP SIGNATURE----- > > > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > > v6rDpkeNksZ9fFSyoY2o > > =ECSj > > -----END PGP SIGNATURE----- > > > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From haleyb.dev at gmail.com Fri May 17 14:10:55 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Fri, 17 May 2019 10:10:55 -0400 Subject: =?UTF-8?B?UmU6IOetlOWkjTogW0RWUiBjb25maWddIENhbiB3ZSB1c2UgZHJ2X3Nu?= =?UTF-8?Q?at_agent=5fmode_in_every_compute_node=3f?= In-Reply-To: <67d4e0f3053949fc844b6d1d26f05559@inspur.com> References: <67d4e0f3053949fc844b6d1d26f05559@inspur.com> Message-ID: <279f1523-bfcd-9863-c5d6-7cb190f7142b@gmail.com> On 5/16/19 8:29 PM, Yi Yang (杨燚)-云服务集团 wrote: > Thanks Brian, your explanation clarified something, but I don't get the answer if we can have multiple compute nodes are configured to dvr_snat, for this case, SNAT IPs are obviously different. Why do we want to use network node if compute node can do everything? Hi Yi, There will only be one DVR SNAT IP allocated for a router on the external network, and only one router scheduled using it, so having dvr_snat mode on a compute node doesn't mean that North/South router will be local, only the East/West portion might be. Typically people choose to place these on separate systems since the requirements of the role are different - network node could have fewer cores and a 10G nic for higher bandwidth, compute node could have lots of cores for instances but maybe a 1G nic. There's no reason you can't run dvr_snat everywhere, I would just say it's not common. -Brian > -----邮件原件----- > 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] > 发送时间: 2019年5月16日 21:46 > 收件人: Yi Yang (杨燚)-云服务集团 > 抄送: openstack-discuss at lists.openstack.org > 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? 
> > Hi Yi, > > I'm a little confused by the question, comments inline. > > On 5/15/19 11:47 PM, Yi Yang (杨燚)-云服务集团 wrote: >> Hi, folks >> >> I saw somebody discussed distributed SNAT, but finally they didn’t >> make agreement on how to implement distributed SNAT, my question is >> can we use dvr_snat agent_mode in compute node? I understand dvr_snat >> only does snat but doesn’t do east west routing, right? Can we set >> dvr_snat and dvr in one compute node at the same time? It is >> equivalent to distributed SNAT if we can set drv_snat in every compute >> node, isn’t right? I know Opendaylight can do SNAT in compute node in >> distributed way, but one external router only can run in one compute node. > > Distributed SNAT is not available in neutron, there was a spec proposed recently though, https://review.opendev.org/#/c/658414 > > Regarding the agent_mode setting for L3, only one mode can be set at a time. Typically 'dvr_snat' is used on network nodes and 'dvr' on compute nodes because it leads to less resource usage (i.e. namespaces). > The centralized part of the router hosting the default SNAT IP address will only be scheduled to one of the agents in 'dvr_snat' mode. All the DVR modes can do East/West routing when an instance is scheduled to the node, and two can do North/South - 'dvr_snat' using the default SNAT IP, and 'dvr' using a floating IP. 'dvr_no_external' can only do East/West. > > Hopefully that clarifies things. > > -Brian > >> I also see https://wiki.openstack.org/wiki/Dragonflow is trying to >> implement distributed SNAT, what are technical road blocks for >> distributed SNAT in openstack dvr? Do we have any good way to remove >> these road blocks? >> >> Thank you in advance and look forward to getting your replies and insights. >> >> Also attached official drv configuration guide for your reference. >> >> https://docs.openstack.org/neutron/stein/configuration/l3-agent.html >> >> |agent_mode|¶ >> > DEFAULT.agent_mode> >> >> Type >> >> string >> >> Default >> >> legacy >> >> Valid Values >> >> dvr, dvr_snat, legacy, dvr_no_external >> >> The working mode for the agent. Allowed modes are: ‘legacy’ - this >> preserves the existing behavior where the L3 agent is deployed on a >> centralized networking node to provide L3 services like DNAT, and SNAT. >> Use this mode if you do not want to adopt DVR. ‘dvr’ - this mode >> enables DVR functionality and must be used for an L3 agent that runs >> on a compute host. ‘dvr_snat’ - this enables centralized SNAT support >> in conjunction with DVR. This mode must be used for an L3 agent >> running on a centralized node (or in single-host deployments, e.g. devstack). >> ‘dvr_no_external’ - this mode enables only East/West DVR routing >> functionality for a L3 agent that runs on a compute host, the >> North/South functionality such as DNAT and SNAT will be provided by >> the centralized network node that is running in ‘dvr_snat’ mode. This >> mode should be used when there is no external network connectivity on >> the compute host. >> From snikitin at mirantis.com Fri May 17 14:20:07 2019 From: snikitin at mirantis.com (Sergey Nikitin) Date: Fri, 17 May 2019 18:20:07 +0400 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: References: Message-ID: Testing of migration process shown us that we have to rebuild database "on live". Unfortunately it means that during rebuild data will be incomplete. I talked with the colleague who did it previously and he told me that it's normal procedure. 
I got these results on Monday and at this moment I'm waiting for weekend. It's better to rebuild database in Saturday and Sunday to do now affect much number of users. So by the end of this week everything will be completed. Thank you for patient. On Fri, May 17, 2019 at 6:15 AM Rong Zhu wrote: > Hi Sergey, > > What is the process about rebuild the database? > > Thanks, > Rong Zhu > > Sergey Nikitin 于2019年5月7日 周二00:59写道: > >> Hello Rong, >> >> Sorry for long response. I was on a trip during last 5 days. >> >> What I have found: >> Lets take a look on this patch [1]. It must be a contribution of >> gengchc2, but for some reasons it was matched to Yuval Brik [2] >> I'm still trying to find a root cause of it, but anyway on this week we >> are planing to rebuild our database to increase RAM. I checked statistics >> of gengchc2 on clean database and it's complete correct. >> So your problem will be solved in several days. It will take so long time >> because full rebuild of DB takes 48 hours, but we need to test our >> migration process first to keep zero down time. >> I'll share a results with you here when the process will be finished. >> Thank you for your patience. >> >> Sergey >> >> [1] https://review.opendev.org/#/c/627762/ >> [2] >> https://www.stackalytics.com/?user_id=jhamhader&project_type=all&release=all&metric=commits&company=&module=freezer-api >> >> >> On Mon, May 6, 2019 at 6:30 AM Rong Zhu wrote: >> >>> Hi Sergey, >>> >>> Do we have any process about my colleague's data loss problem? >>> >>> Sergey Nikitin 于2019年4月29日 周一19:57写道: >>> >>>> Thank you for information! I will take a look >>>> >>>> On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu >>>> wrote: >>>> >>>>> Hi there, >>>>> >>>>> Recently we found we lost a person's data from our company at the >>>>> stackalytics website. >>>>> You can check the merged patch from [0], but there no date from >>>>> the stackalytics website. >>>>> >>>>> stackalytics info as below: >>>>> Company: ZTE Corporation >>>>> Launchpad: 578043796-b >>>>> Gerrit: gengchc2 >>>>> >>>>> Look forward to hearing from you! >>>>> >>>> >>> Best Regards, >>> Rong Zhu >>> >>>> >>>>> -- >>> Thanks, >>> Rong Zhu >>> >> >> >> -- >> Best Regards, >> Sergey Nikitin >> > -- > Thanks, > Rong Zhu > -- Best Regards, Sergey Nikitin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Fri May 17 15:01:07 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Fri, 17 May 2019 17:01:07 +0200 Subject: =?UTF-8?Q?Re:_[openstack-helm]_List_of_the_current_meeting_times, _and_op?= =?UTF-8?Q?inion_on_adjustment?= In-Reply-To: References: Message-ID: Sadly, there is no magic with timezones :( I am fine with your proposal though. I believe that there could be another way: - If a conversation needs to happen "synchronously", we could use the office hours for that. The office hours are timezone alternated IIRC, but not yet "officialized" (they are not appearing in [1]). - Bring conversations that need to happen asynchronously over the ML. - Discuss/Decide things in reviews. That's technically totally workable globally, but it needs a change in mindset. It's also most likely people will get less active on IRC, which will impact the way we "feel" as a community. Regards, Jean-Philippe Evrard [1]: http://eavesdrop.openstack.org/#OpenStack-Helm_Team_Meeting -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nicolas.bock at suse.com Fri May 17 14:51:05 2019 From: nicolas.bock at suse.com (Nicolas Bock) Date: Fri, 17 May 2019 08:51:05 -0600 Subject: [openstack-discuss][openstack-helm] Nominating Itxaka Serrano Garcia to core review team In-Reply-To: References: Message-ID: <875zq9dtie.fsf@gmail.com> +1 On Fri, May 17 2019, Pete Birley wrote: > Hi OpenStack-Helm Core Team, > > I would like to nominate a new core reviewer for OpenStack-Helm: > > Itxaka Serrano Garcia (igarcia at suse.com) > > Itxaka has been doing many reviews and contributed some critical patches to > OpenStack Helm, helping make the project both more approachable and better > validated through his great tempest work. > > Also, he is a great ambassador for the project both in IRC and community > meetings. > > Voting is open for 7 days. Please reply with your +1 vote in favor or -1 > as a veto vote. > > Regards, > > Pete Birley (portdirect) From cdent+os at anticdent.org Fri May 17 15:34:13 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 17 May 2019 16:34:13 +0100 (BST) Subject: [placement] update 19-19 Message-ID: HTML: https://anticdent.org/placement-update-19-19.html Woo! Placement update 19-19. First one post PTG and Summit. Thanks to everyone who helped make it a useful event for Placement. Having the pre-PTG meant that we had addressed most issues prior to getting there meaning that people were freed up to work in other areas and the discussions we did have were highly coherent. Thanks, also, to everyone involved in getting placement deleted from nova. We did that while at the PTG and had a little [celebration](https://tank-binaries.s3.amazonaws.com/8e922a32c7ff4116a68d7309ec079ec4.jpe). # Most Important We're still working on narrowing priorities and focusing the details of those priorities. There's an [etherpad](https://etherpad.openstack.org/p/placement-ptg-train-rfe-voter) where we're taking votes on what's important. There are three specs in progress from that that need review and refinement. There are two others which have been put on the back burner (see specs section below). # What's Changed * We're now [running a subset](https://review.opendev.org/657077) of nova's functional tests in placement's gate. * osc-placement is using the PlacementFixture to run its functional tests making them _much_ faster. * There's a set of StoryBoard [worklists](https://docs.openstack.org/placement/latest/contributor/contributing.html#storyboard) that can be used to help find in progress work and new bugs. That section also describes how tags are used. * There's a [summary of summaries](http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006278.html) email message that summarizes and links to various results from the PTG. # Specs/Features As the summary of summaries points out, we have two major features this cycle, one of which is _large_: getting consumer types going and getting a whole suite of features going to support nested providers in a more effective fashion. * Support Consumer Types. This is very close with a few details to work out on what we're willing and able to query on. It only has reviews from me so far. * Spec for Nested Magic. This is associated with a [lengthy story](https://storyboard.openstack.org/#!/story/2005575) that includes visual artifacts from the PTG. It covers several related features to enable nested-related requirements from nova and neutron. It is a work in progress, with several unanswered questions. 
It is also something that efried started but will be unable to finish so the rest of us will need to finish it up as the questions get answered. And it also mostly subsumes a previous spec on [subtree affinity](https://review.opendev.org/#/c/650476/). (Eric, please correct me if I'm wrong on that.) * Resource provider - request group mapping in allocation candidate. This spec was copied over from nova. It is a requirement of the overall nested magic theme. While it has a well-defined and refined design, there's currently no one on the hook implement it. There are also two specs that are still live but de-prioritized: * support any trait in allocation candidates * support mixing required traits with any traits These and other features being considered can be found on the [feature worklist](https://storyboard.openstack.org/#!/worklist/594). Some non-placement specs are listed in the Other section below. # Stories/Bugs There are 23 stories in [the placement group](https://storyboard.openstack.org/#!/project_group/placement). 0 are [untagged](https://storyboard.openstack.org/#!/worklist/580). 4 are [bugs](https://storyboard.openstack.org/#!/worklist/574). 5 are [cleanups](https://storyboard.openstack.org/#!/worklist/575). 12 are [rfes](https://storyboard.openstack.org/#!/worklist/594). 2 are [docs](https://storyboard.openstack.org/#!/worklist/637). If you're interested in helping out with placement, those stories are good places to look. On launchpad: * Placement related nova [bugs not yet in progress](https://goo.gl/TgiPXb) on launchpad: 16. +3 * [In progress placement bugs](https://goo.gl/vzGGDQ) on launchpad: 6. +2. These are placement-related, in nova. Of those there two interesting ones to note: * nova placement api non-responsive due to eventlet error. When using placement-in-nova in stein, recent eventlet changes can cause issues. As I've mentioned on the bug the best way out of this problem is to use placement-in-placement but there are other solutions. * The allocation table has residual records when instance is evacuated and the source physical node is removed. This appears to be yet another issue related to orphaned allocations during one of the several move operations. The impact they are most concerned with, though, seems to be the common "When I bring up a new compute node with the same name there's an existing resource provider in the way" that happens because of the unique constrain on the rp name column. I'm still not sure that constraint is the right thing unless we want to make people's lives hard when they leave behind allocations. We may want to make it hard because it will impact quota... # osc-placement osc-placement is currently behind by 11 microversions. No change since the last report. Pending changes: _Note: a few of these having been sitting for some time with my +2 awaiting review by some other placement core. Please remember osc-placement when reviewing._ * Add 'resource provider inventory update' command (that helps with aggregate allocation ratios). * Add support for 1.22 microversion * Provide a useful message in the case of 500-error * Remove unused cruft from doc and releasenotes config * Improve aggregate version check error messages with min_version * Expose version error message generically # Main Themes Now that the PTG has passed some themes have emerged. Since the Nested Magic one is rather all encompassing and Cleanup is a catchall, I think we can consider three enough. 
If there's some theme that you think is critical that is being missed, let me know. For people coming from the nova-side of the world who need or want something like review runways to know where they should be focusing their review energy, consider these themes and the links within them as a runway. But don't forget bugs and everything else. ## Nested Magic At the PTG we decided that it was worth the effort, in both Nova and Placement, to make the push to make better use of nested providers — things like NUMA layouts, multiple devices, networks — while keeping the "simple" case working well. The general ideas for this are described in a [story](https://storyboard.openstack.org/#!/story/2005575) and an evolving [spec](https://review.opendev.org/658510). Some code has started, mostly to reveal issues: * Changing request group suffix to string * WIP: Allow RequestGroups without resources * Add NUMANetworkFixture for gabbits * Gabbi test cases for can_split ## Consumer Types Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. A [spec](https://review.opendev.org/654799) has started. There are some questions about request and response details that need to be resolved, but the overall concept is sound. ## Cleanup As we explore and extend nested functionality we'll need to do some work to make sure that the code is maintainable and has suitable performance. There's some work in progress for this that's important enough to call out as a theme: * Some work from Tetsuro exploring ways to remove redundancies in the code. There's a [related WIP](https://review.opendev.org/658778) * Enhance debug logging in allocation candidate handling * Start of a stack that will allow us to remove the protections against null root providers (which turns out is a pretty significant performance hit). * WIP: Optionally run a wsgi profiler when asked. This was used to find some of the above issues. Should we make it generally available or is it better as a thing to base off when exploring? Ed Leafe has also been doing some intriguing work on using graph databases with placement. It's not yet clear if or how it could be integrated with mainline placement, but there are likely many things to be learned from the experiment. # Other Placement * A suite of refactorings that given their lack of attention perhaps we don't need or want, but let's be explicit about that rather than ignoring the patches if that is indeed the case. * A start at some unit tests for the PlacementFixture which got lost in the run up to the PTG. They may be less of a requirement now that placement is running nova's functional tests. But again, we should be explicit about that decision. # Other Service Users New discoveries are added to the end. Merged stuff is removed. 
* Nova: Spec: Proposes NUMA topology with RPs * Nova: Spec: Virtual persistent memory libvirt driver implementation * Nova: Check compute_node existence in when nova-compute reports info to placement * Nova: spec: support virtual persistent memory * Workaround doubling allocations on resize * Nova: Pre-filter hosts based on multiattach volume support * Nova: Add flavor to requested_resources in RequestSpec * Blazar: Retry on inventory update conflict * Nova: count quota usage from placement * Nova: nova-manage: heal port allocations * Nova: Spec for a new nova virt driver to manage an RSD * Cyborg: Initial readme for nova pilot * Tempest: Add QoS policies and minimum bandwidth rule client * Nova-spec: Add PENDING vm state * nova-spec: Allow compute nodes to use DISK_GB from shared storage RP * nova-spec: RMD Plugin: Energy Efficiency using CPU Core P-State control * puppet: Debian: Add support for placement-api over uwsgi * nova-spec: Proposes NUMA affinity for vGPUs. This describes a legacy way of doing things because affinity in placement may be a ways off. But it also [may not be](https://review.openstack.org/650476). * Nova: heal allocations, --dry-run * Neutron: Fullstack test for placement sync * Watcher spec: Add Placement helper * Cyborg: Placement report * Nova: Spec to pre-filter disabled computes with placement * rpm-packaging: placement service * Delete resource providers for all nodes when deleting compute service # End I'm out of practice on these things. This one took a long time. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From openstack at fried.cc Fri May 17 16:25:24 2019 From: openstack at fried.cc (Eric Fried) Date: Fri, 17 May 2019 11:25:24 -0500 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <20190517110721.GA19519@paraplu> References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> <20190517110721.GA19519@paraplu> Message-ID: <49033c4d-bfe5-3493-926c-75804719b1be@fried.cc> > Okay, so I take it that all the relevant low-level CPU flags (including > things like SSBD, et al) as proposed here[2][3] can be added to > 'os-traits'. Yes, subject to already-noted namespacing and spelling issues. > And tools _other_ than Nova can consume, if need be. Nova should consume by having the driver expose the flags as appropriate. And switching on flaggage in domain xml if that's a thing. But that's all. No efforts to special-case scheduling decisions etc. efried . 
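To make that concrete, a tool sitting outside the scheduler could compare a host's reported traits against the mitigation-related ones it cares about. A minimal sketch, assuming the trait names below are placeholders for whatever spellings finally land in the os-traits reviews linked above:

    # Placeholder spellings -- the canonical names are whatever gets
    # merged into os-traits, not these.
    MITIGATION_TRAITS = {
        'HW_CPU_X86_SSBD',
        'HW_CPU_X86_MD_CLEAR',
    }

    def missing_mitigations(provider_traits):
        """Return the mitigation-related traits a resource provider
        does not expose.  provider_traits is the set of trait strings
        an external audit tool fetched for that provider."""
        return MITIGATION_TRAITS - set(provider_traits)

An operator script could then, for example, disable the compute service on any host where the returned set is non-empty, which keeps the policy decision out of Nova itself.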
From MM9745 at att.com Fri May 17 16:59:44 2019 From: MM9745 at att.com (MCEUEN, MATT) Date: Fri, 17 May 2019 16:59:44 +0000 Subject: [openstack-discuss][openstack-helm] Nominating Itxaka Serrano Garcia to core review team In-Reply-To: <875zq9dtie.fsf@gmail.com> References: <875zq9dtie.fsf@gmail.com> Message-ID: <7C64A75C21BB8D43BD75BB18635E4D897094C303@MOSTLS1MSGUSRFF.ITServices.sbc.com> +1 -----Original Message----- From: Nicolas Bock Sent: Friday, May 17, 2019 9:51 AM To: Pete Birley ; openstack-discuss at lists.openstack.org Subject: Re: [openstack-discuss][openstack-helm] Nominating Itxaka Serrano Garcia to core review team +1 On Fri, May 17 2019, Pete Birley wrote: > Hi OpenStack-Helm Core Team, > > I would like to nominate a new core reviewer for OpenStack-Helm: > > Itxaka Serrano Garcia (igarcia at suse.com) > > Itxaka has been doing many reviews and contributed some critical > patches to OpenStack Helm, helping make the project both more > approachable and better validated through his great tempest work. > > Also, he is a great ambassador for the project both in IRC and > community meetings. > > Voting is open for 7 days. Please reply with your +1 vote in favor or > -1 as a veto vote. > > Regards, > > Pete Birley (portdirect) From openstack at fried.cc Fri May 17 17:36:05 2019 From: openstack at fried.cc (Eric Fried) Date: Fri, 17 May 2019 12:36:05 -0500 Subject: [placement] update 19-19 In-Reply-To: References: Message-ID: > * >   Spec for Nested Magic. This is associated with a [lengthy >   story](https://storyboard.openstack.org/#!/story/2005575) that >   includes visual artifacts from the PTG. It covers several related >   features to enable nested-related requirements from nova and >   neutron. It is a work in progress, with several unanswered >   questions. It is also something that efried started but will be >   unable to finish so the rest of us will need to finish it up as >   the questions get answered. And it also mostly subsumes a previous >   spec on [subtree >   affinity](https://review.opendev.org/#/c/650476/). (Eric, please >   correct me if I'm wrong on that.) Thanks for the reminder, Chris, you're correct. I've abandoned that spec and invalidated its associated task (in story https://storyboard.openstack.org/#!/story/2005385 which I think makes the story be closed?). efried . From pierre at stackhpc.com Fri May 17 18:19:48 2019 From: pierre at stackhpc.com (Pierre Riteau) Date: Fri, 17 May 2019 19:19:48 +0100 Subject: [blazar] Limits to reservation usage Message-ID: Hello, We had a very interesting first Blazar IRC meeting for the Americas last week [1]. One of the topics discussed was enforcing limits to reservation usage. Currently, upstream Blazar doesn't provide ways to limit reservations per user or project. It is technically possible for users to reserve more resources than what their quotas allows them to use. The Chameleon project, which has been running Blazar in production for several years, has extended it to: a) enforce operator-defined limits to reservation length (e.g. 7 days) b) when creating or updating reservations, check whether the project has enough available Service Units (SU) in their allocation, which is stored in a custom external database George Turner from Indiana University Bloomington explained how Jetstream, if it were to use Blazar to share GPU resources, would have a similar requirement to check reservation usage against XSEDE allocations (which are again stored in a custom database). 
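In rough terms, the kind of check both sites need looks something like the sketch below; the endpoint, field names and the seven-day limit are purely illustrative, not anything Blazar or Chameleon actually ships:

    import datetime
    import requests

    MAX_LEASE_LENGTH = datetime.timedelta(days=7)   # operator-defined
    ALLOCATION_API = 'https://allocations.example.org/check'  # site-specific

    def check_reservation(project_id, start, end, su_cost):
        """Return (allowed, reason) for a proposed reservation."""
        if end - start > MAX_LEASE_LENGTH:
            return False, 'reservation exceeds the maximum allowed length'
        # Ask the site's external allocation backend whether enough
        # service units remain for this project.
        resp = requests.post(ALLOCATION_API, json={
            'project_id': project_id,
            'su_requested': su_cost,
        })
        result = resp.json()
        if not result.get('allowed', False):
            return False, result.get('reason', 'not enough service units left')
        return True, ''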
I am starting this thread to discuss how Blazar can support enforcing these kinds of limits, making sure that it is generic enough to be plugged with the various custom allocation backends in use. 1) Blazar should check Nova limits as a guide for limiting reservation usage at any point in time: if number of instances quota is 8, we shouldn't allow the user to reserve more than 8 instances at any point in time. 2) In addition, Blazar could use a quota to limit how much resources can be allocated in advance. Operators may be happy for projects to reserve 8 instances for a month, but not for a century. This could be expressed as a time dimension that would apply to the instance / cores / ram quotas. 3) If Blazar was making REST requests to a customisable endpoint on reservation creation / update, expecting to get a simple yes/no answer (with a human-friendly error message, like how much SUs are left compared to how much would be used), would people be motivated to write a small REST service making the link between Blazar and any custom allocation backend? Feel free to reply to this message or join our next meeting on Thursday May 23 at 1600 UTC. Cheers, Pierre [1] http://eavesdrop.openstack.org/meetings/blazar/2019/blazar.2019-05-09-16.01.log.html From mvanwinkle at salesforce.com Fri May 17 19:29:36 2019 From: mvanwinkle at salesforce.com (Matt Van Winkle) Date: Fri, 17 May 2019 14:29:36 -0500 Subject: =?UTF-8?Q?It=E2=80=99s_OpenStack_User_Survey_Time=21?= Message-ID: Hi everyone, If you operate OpenStack, it’s time to participate in the annual OpenStack User Survey . This is your opportunity to provide direct feedback to the OpenStack community, so we can better understand your environment and needs. We send all anonymous feedback directly to the project teams who work to improve the software and documentation. The survey will take less than 20 minutes. All participants who complete the OpenStack deployment survey will receive AUC status and discounted registration to the Shanghai Summit. Please let me or any members of the User Committee know if you have any questions. Cheers, VW -- Matt Van Winkle Senior Manager, Software Engineering | Salesforce -------------- next part -------------- An HTML attachment was scrubbed... URL: From cw at f00f.org Fri May 17 20:16:07 2019 From: cw at f00f.org (Chris Wedgwood) Date: Fri, 17 May 2019 13:16:07 -0700 Subject: [openstack-discuss][openstack-helm] Nominating Itxaka Serrano Garcia to core review team In-Reply-To: References: Message-ID: <20190517201607.GA14460@aether.stupidest.org> On Fri, May 17, 2019 at 08:06:35AM -0500, Pete Birley wrote: > I would like to nominate a new core reviewer for OpenStack-Helm: > > Itxaka Serrano Garcia (igarcia at suse.com) +1 From jungleboyj at gmail.com Fri May 17 20:32:21 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Fri, 17 May 2019 15:32:21 -0500 Subject: [nova][all][ptg] Summary: Same-Company Approvals In-Reply-To: <20190508154511.njvidentht4d4zim@pacific.linksys.moosehall> References: <20190508143923.bhmla62qi2p7yc7s@yuggoth.org> <20190508154511.njvidentht4d4zim@pacific.linksys.moosehall> Message-ID: On 5/8/2019 10:45 AM, Adam Spiers wrote: > Jeremy Stanley wrote: >> On 2019-05-07 15:06:10 -0500 (-0500), Jay Bryant wrote: >>> Cinder has been working with the same unwritten rules for quite some >>> time as well with minimal issues. >>> I think the concerns about not having it documented are warranted.  >>> We have had question about it in the past with no documentation to >>> point to.  
It is more or less lore that has been passed down over >>> the releases.  :-) >>> At a minimum, having this e-mail thread is helpful.  If, however, we >>> decide to document it I think we should have it consistent across >>> the teams that use the rule.  I would be happy to help draft/review >>> any such documentation. >> [...] >> >> I have a feeling that a big part of why it's gone undocumented for so >> long is that putting it in writing risks explicitly sending the >> message that we don't trust our contributors to act in the best >> interests of the project even if those are not aligned with the >> interests of their employer/sponsor. I think many of us attempt to >> avoid having all activity on a given patch come from people with the >> same funding affiliation so as to avoid giving the impression that >> any one organization is able to ram changes through with no >> oversight, but more because of the outward appearance than because we >> don't trust ourselves or our colleagues. >> Documenting our culture is a good thing, but embodying that >> documentation with this sort of nuance can be challenging. > > That's a good point.  Maybe that risk could be countered by explicitly > stating something like "this is not currently an issue within the > community, and it has rarely, if ever, been one in the past; therefore > this policy is a preemptive safeguard rather than a reactive one" ? This thread sparked discussion in the Cinder meeting this week and I will just share that here for completeness. Some of our newer members were not aware of this unwritten rule and it has been 'tribal' knowledge for the Cinder team for quite some time.  It also has not been an issue for many years. The long story short was: 'If it is a new feature it should be reviewed by at least one person from a different company than the author.  For bugs, cores should use their best judgement.' As to whether or not to document it, it looks from the mailing list thread like some teams have documented it and some haven't. So it would seem that letting what each team sees as most fit for their project may be the best answer. Jay From ekcs.openstack at gmail.com Fri May 17 22:00:02 2019 From: ekcs.openstack at gmail.com (Eric K) Date: Fri, 17 May 2019 15:00:02 -0700 Subject: [self-healing] live-migrate instance in response to fault signals In-Reply-To: <1640608910.1064843.1556812966934.JavaMail.zimbra@speichert.pl> References: <1640608910.1064843.1556812966934.JavaMail.zimbra@speichert.pl> Message-ID: On Thu, May 2, 2019 at 9:02 AM Daniel Speichert wrote: > > ----- Original Message ----- > > From: "Eric K" > > To: "openstack-discuss" > > Sent: Wednesday, May 1, 2019 4:59:57 PM > > Subject: [self-healing] live-migrate instance in response to fault signals > ... > > > > I just want to follow up to get more info on the context; > > specifically, which of the following pieces are the main difficulties? > > - detecting the failure/soft-fail/early failure indication > > - codifying how to respond to each failure scenario > > - triggering/executing the desired workflow > > - something else > > > > [1] https://etherpad.openstack.org/p/DEN-self-healing-SIG > > We currently attempt to do all of the above using less-than-optimal custom > scripts (using openstacksdk) and pipelines (running Ansible). > > I think there is tremendous value in developing at least one tested > way to do all of the above by connecting e.g. Monasca, Mistral and Nova > together to do the above. 
Maybe it's currently somewhat possible - then > it's more of a documentation issue that would benefit operators. Agreed. > > One of the derivative issues is the quality of live-migration in Nova. > (I don't have production-level experience with Rocky/Stein yet.) > When we do lots of live migrations, there is obviously a limit on the number > of live migrations happening at the same time (doing more would be counter > productive). These limits could be smarter/more dynamic in some cases. > There is no immediate action item here right now though. Any rough thoughts on which factors would be considered to decide an appropriate dynamic limit? I'm assuming something to do with network traffic? > > I would like to begin with putting together all the pieces that currently > work together and go from there - see what's missing. I hope to make progress on this too. Mistral workflows (including ansible playbooks) can be triggered via API. What's needed then is a mechanism to collect (pre) failure data (Monasca perhaps) and a mechanism that evaluates the data to decide what workflow/playbook to trigger (Monasca does threshold evaluation and raise alarms, Congress can process Monasca alarms then make contextual decision to trigger workflows/playbooks). The pieces starting from Monasca raising an alarm to Congress to Mistral are in place (though need to be better documented). But I am less clear on the sources of raw data needed and how to collect them in Monasca. > > -Daniel From colleen at gazlene.net Fri May 17 22:48:38 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 17 May 2019 15:48:38 -0700 Subject: [dev][keystone] Keystone Team Update - Week of 13 May 2019 Message-ID: # Keystone Team Update - Week of 13 May 2019 ## News Another quiet week as people gear up for summer holidays. ## Open Specs Train specs: https://bit.ly/2uZ2tRl Ongoing specs: https://bit.ly/2OyDLTh As we're gearing up for feature work for the cycle, please prioritize reviewing specs. ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 17 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 32 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ## Bugs This week we opened 10 new bugs and closed 1. Two of the bugs opened this week are for tracking deprecations and removals and will be closed at the end of the cycle, some others are for tracking work that was discussed at the PTG. 
Bugs opened (10) Bug #1829453 (keystone:Low) opened by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1829453 Bug #1829454 (keystone:Low) opened by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1829454 Bug #1829573 (keystone:Low) opened by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1829573 Bug #1829574 (keystone:Low) opened by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1829574 Bug #1829455 (keystone:Wishlist) opened by Vishakha Agarwal https://bugs.launchpad.net/keystone/+bug/1829455 Bug #1828783 (keystone:Undecided) opened by Akihiro Motoki https://bugs.launchpad.net/keystone/+bug/1828783 Bug #1829296 (keystone:Undecided) opened by Douglas Mendizábal https://bugs.launchpad.net/keystone/+bug/1829296 Bug #1829575 (keystonemiddleware:Low) opened by Colleen Murphy https://bugs.launchpad.net/keystonemiddleware/+bug/1829575 Bug #1828737 (oslo.policy:Undecided) opened by Doug Hellmann https://bugs.launchpad.net/oslo.policy/+bug/1828737 Bug #1828739 (oslo.policy:Undecided) opened by Doug Hellmann https://bugs.launchpad.net/oslo.policy/+bug/1828739 Bugs closed (1) Bug #1828783 (keystone:Undecided) https://bugs.launchpad.net/keystone/+bug/1828783 ## Milestone Outlook https://releases.openstack.org/train/schedule.html Our spec proposal freeze is three weeks away. If you want to get a major initiative done in keystone this cycle, now is the time to propose it. ## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter From bluejay.ahn at gmail.com Fri May 17 23:42:23 2019 From: bluejay.ahn at gmail.com (Jaesuk Ahn) Date: Sat, 18 May 2019 08:42:23 +0900 Subject: [openstack-helm] List of the current meeting times, and opinion on adjustment In-Reply-To: References: Message-ID: On Sat, May 18, 2019 at 12:05 AM Jean-Philippe Evrard < jean-philippe at evrard.me> wrote: > Sadly, there is no magic with timezones :( I am fine with your proposal > though. > True, there is really no magic with timezones, especially it involves everyone in the earth. > > I believe that there could be another way: > - If a conversation needs to happen "synchronously", we could use the > office hours for that. The office hours are timezone alternated IIRC, but > not yet "officialized" (they are not appearing in [1]). > - Bring conversations that need to happen asynchronously over the ML. > - Discuss/Decide things in reviews. > > I do like your idea. Bringing asynchronous conversation actively to ML and reviews. If one think posting on ML or writing reviews has an equal effect as talking directly to someone in weekly irc meeting, it would give good alternative way. I did not think of using official hours as a sort of irc meeting for people in different timezone. However, I guess it cloud be done. > That's technically totally workable globally, but it needs a change in > mindset. > It's also most likely people will get less active on IRC, which will > impact the way we "feel" as a community. > True, "feel" part is sometimes a very important factor for people. I think that being part of "official" activity is sometimes important not only for effective communication but also for "feeling" that I am part of community. Being able to attend "official" weekly meeting without too much effort (e.g. keep your eye open till midnight) is an weapon for me to bring more people into the community. I suppose we can do both. 
1) Trying to find if it is possible to have a slightly better official meeting time slot for Asian region 2) Discussing on more leveraging asynchronous way of communication and its effect on the community. Thank you for the insightful feedback. :) > > Regards, > Jean-Philippe Evrard > > > > [1]: http://eavesdrop.openstack.org/#OpenStack-Helm_Team_Meeting > -- *Jaesuk Ahn*, Ph.D. Software R&D Center, SK Telecom -------------- next part -------------- An HTML attachment was scrubbed... URL: From bluejay.ahn at gmail.com Fri May 17 23:47:24 2019 From: bluejay.ahn at gmail.com (Jaesuk Ahn) Date: Sat, 18 May 2019 08:47:24 +0900 Subject: [openstack-helm] List of the current meeting times, and opinion on adjustment In-Reply-To: References: Message-ID: *Jaesuk Ahn*, Ph.D. Software Labs, SK Telecom On Sat, May 18, 2019 at 8:42 AM Jaesuk Ahn wrote: > > > On Sat, May 18, 2019 at 12:05 AM Jean-Philippe Evrard < > jean-philippe at evrard.me> wrote: > >> Sadly, there is no magic with timezones :( I am fine with your proposal >> though. >> > > True, there is really no magic with timezones, especially it involves > everyone in the earth. > > >> >> I believe that there could be another way: >> - If a conversation needs to happen "synchronously", we could use the >> office hours for that. The office hours are timezone alternated IIRC, but >> not yet "officialized" (they are not appearing in [1]). >> - Bring conversations that need to happen asynchronously over the ML. >> - Discuss/Decide things in reviews. >> >> > I do like your idea. Bringing asynchronous conversation actively to ML and > reviews. If one think posting on ML or writing reviews has an equal effect > as talking directly to someone in weekly irc meeting, it would give good > alternative way. > > I did not think of using official hours as a sort of irc meeting for > people in different timezone. However, I guess it cloud be done. > > >> That's technically totally workable globally, but it needs a change in >> mindset. >> It's also most likely people will get less active on IRC, which will >> impact the way we "feel" as a community. >> > > True, "feel" part is sometimes a very important factor for people. I think > that being part of "official" activity is sometimes important not only for > effective communication but also for "feeling" that I am part of > community. Being able to attend "official" weekly meeting without too much > effort (e.g. keep your eye open till midnight) is an weapon for me to bring > more people into the community. > > I suppose we can do both. > 1) Trying to find if it is possible to have a slightly better official > meeting time slot for Asian region > 2) Discussing on more leveraging asynchronous way of communication and its > effect on the community. > > Thank you for the insightful feedback. :) > > > > > >> >> Regards, >> Jean-Philippe Evrard >> >> >> >> [1]: http://eavesdrop.openstack.org/#OpenStack-Helm_Team_Meeting >> > > > -- > *Jaesuk Ahn*, Ph.D. > Software R&D Center, SK Telecom > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Sat May 18 08:29:49 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Sat, 18 May 2019 17:29:49 +0900 Subject: [telemetry][storyboard] Telemetry project bug trackers have been moved to Storyboard Message-ID: Hi team, Great news!!!!! We have moved all of the telemetry project bug trackers to Storyboard. 
You can find the project group link in [1] The projects are: - openstack/telemetry-tempest-plugin - Tempest Plugin for Telemetry projects - openstack/telemetry-specs - OpenStack Telemetry Specifications - openstack/python-pankoclient - Client library for Panko API - openstack/python-aodhclient - Client library for Aodh API - openstack/panko - Event storage and REST API for Ceilometer - openstack/ceilometermiddleware - OpenStack Telemetry (Ceilometer) Middleware - openstack/ceilometer - OpenStack Telemetry (Ceilometer) - openstack/aodh - OpenStack Telemetry (Ceilometer) Alarming All existing bugs, blueprints will be migrated shortly next week or so. For now, please refrain from firing new bugs/issues on the launchpad. [1] https://storyboard.openstack.org/#!/project_group/ceilometer Bests, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Sat May 18 13:28:52 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Sat, 18 May 2019 13:28:52 +0000 Subject: [telemetry][storyboard] Telemetry project bug trackers have been moved to Storyboard In-Reply-To: References: Message-ID: <20190518132851.pkpie4zspc5klvsx@yuggoth.org> On 2019-05-18 17:29:49 +0900 (+0900), Trinh Nguyen wrote: [...] > All existing bugs, blueprints will be migrated shortly next week or so. [...] Yes, I'm getting the bug import set up now. The blueprint import may take a little longer since that's a new feature for our migration tool and, based on my initial testing, seems to need a bit more work before I'd be comfortable running it in production. I think it's very close to ready though. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dangtrinhnt at gmail.com Sat May 18 13:36:00 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Sat, 18 May 2019 22:36:00 +0900 Subject: [telemetry][storyboard] Telemetry project bug trackers have been moved to Storyboard In-Reply-To: <20190518132851.pkpie4zspc5klvsx@yuggoth.org> References: <20190518132851.pkpie4zspc5klvsx@yuggoth.org> Message-ID: Thank Jeremy for your support! On Sat, May 18, 2019 at 10:32 PM Jeremy Stanley wrote: > On 2019-05-18 17:29:49 +0900 (+0900), Trinh Nguyen wrote: > [...] > > All existing bugs, blueprints will be migrated shortly next week or so. > [...] > > Yes, I'm getting the bug import set up now. The blueprint import may > take a little longer since that's a new feature for our migration > tool and, based on my initial testing, seems to need a bit more work > before I'd be comfortable running it in production. I think it's > very close to ready though. > -- > Jeremy Stanley > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Sat May 18 13:35:46 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Sat, 18 May 2019 15:35:46 +0200 Subject: =?UTF-8?Q?Re:_[openstack-discuss][openstack-helm]_Nominating_Itxaka_Serr?= =?UTF-8?Q?ano_Garcia_to_core_review_team?= In-Reply-To: References: Message-ID: A very nice addition to the core team! (+1 for me, for what's its worth) It's nice to see an improvement in company diversity at the same time! Regards, Jean-Philippe Evrard (evrardjp) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wilkers.steve at gmail.com Sat May 18 14:16:14 2019 From: wilkers.steve at gmail.com (Steve Wilkerson) Date: Sat, 18 May 2019 09:16:14 -0500 Subject: [openstack-discuss][openstack-helm] Nominating Itxaka Serrano Garcia to core review team In-Reply-To: References: Message-ID: A big +1 from me. On Fri, May 17, 2019 at 8:11 AM Pete Birley < petebirley+openstack-dev at gmail.com> wrote: > Hi OpenStack-Helm Core Team, > > I would like to nominate a new core reviewer for OpenStack-Helm: > > Itxaka Serrano Garcia (igarcia at suse.com) > > Itxaka has been doing many reviews and contributed some critical patches > to OpenStack Helm, helping make the project both more approachable and > better validated through his great tempest work. > > Also, he is a great ambassador for the project both in IRC and community > meetings. > > Voting is open for 7 days. Please reply with your +1 vote in favor or -1 > as a veto vote. > > Regards, > > Pete Birley (portdirect) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amer.server.two at gmail.com Sat May 18 16:53:18 2019 From: amer.server.two at gmail.com (Amer Hwitat) Date: Sat, 18 May 2019 19:53:18 +0300 Subject: Fwd: Linux (RHEL 7.6 with OSP 14) Bugs In-Reply-To: References: Message-ID: Dears, I have the following Bugs that crashed my VM, I reported it to RH, they didn't answer, and banned my developer account, the Bug is when you disable the network on RHEL with OSP 14 installed all in one, it crashes the system, I had a 12GB RAM, with 8 CPUs on the VM, and I found out that this crash report pissed off someone in RH, because they called me, and said what do you want from me!!, what I need is a Simple reply, is this a bug or not. here is the problem: [root at localhost network-scripts]# systemctl status network -l ? network.service - LSB: Bring up/down networking Loaded: loaded (/etc/rc.d/init.d/network; bad; vendor preset: disabled) Active: failed (Result: exit-code) since Sat 2019-01-19 03:47:01 EST; 21s ago Docs: man:systemd-sysv-generator(8) Process: 86319 ExecStop=/etc/rc.d/init.d/network stop (code=exited, status=0/SUCCESS) Process: 86591 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=1/FAILURE) Tasks: 0 Jan 19 03:47:01 localhost.localdomain dhclient[86963]: Please report for this software via the Red Hat Bugzilla site: Jan 19 03:47:01 localhost.localdomain dhclient[86963]: http://bugzilla.redhat.com Jan 19 03:47:01 localhost.localdomain dhclient[86963]: ution. Jan 19 03:47:01 localhost.localdomain dhclient[86963]: exiting. Jan 19 03:47:01 localhost.localdomain network[86591]: failed. Jan 19 03:47:01 localhost.localdomain network[86591]: [FAILED] Jan 19 03:47:01 localhost.localdomain systemd[1]: network.service: control process exited, code=exited status=1 Jan 19 03:47:01 localhost.localdomain systemd[1]: Failed to start LSB: Bring up/down networking. Jan 19 03:47:01 localhost.localdomain systemd[1]: Unit network.service entered failed state. Jan 19 03:47:01 localhost.localdomain systemd[1]: network.service failed. [root at localhost network-scripts]# [root at localhost log]# Message from syslogd at localhost at Jan 23 02:23:31 ... kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [ovsdb-server:10088] [root at amer network-scripts]# Message from syslogd at amer at Jan 27 12:46:38 ... kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [nova-api:102738] Message from syslogd at amer at Jan 27 19:26:19 ... kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 26s! 
[swapper/5:0] Message from syslogd at amer at Jan 27 19:26:19 ... kernel:NMI watchdog: BUG: soft lockup - CPU#1 stuck for 27s! [dmeventd:71548] Message from syslogd at amer at Jan 27 19:27:30 ... kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [6_scheduler:64928] Message from syslogd at amer at Jan 27 19:31:25 ... kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [ksoftirqd/5:34] Message from syslogd at amer at Jan 27 19:32:42 ... kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 33s! [swift-object-up:11358] Message from syslogd at amer at Jan 27 19:33:55 ... kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 24s! [dmeventd:71548] Message from syslogd at amer at Jan 27 19:34:25 ... kernel:NMI watchdog: BUG: soft lockup - CPU#2 stuck for 65s! [kworker/2:0:59993] Message from syslogd at amer at Jan 27 19:37:50 ... kernel:NMI watchdog: BUG: soft lockup - CPU#2 stuck for 24s! [kworker/u256:3:8447] Message from syslogd at amer at Jan 27 19:37:50 ... kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [ksoftirqd/5:34] Message from syslogd at amer at Jan 27 19:37:51 ... kernel:NMI watchdog: BUG: soft lockup - CPU#0 stuck for 21s! [systemd:11968] The CPU has been disabled by the guest operating system. Power off or reset the virtual machine. snapshots attached [image: Red Hat Enterprise Linux 7 64-bit (2)-2019-01-28-03-57-27.png] [image: Red Hat Enterprise Linux 7 64-bit (2)-2019-01-28-04-26-41.png] [image: working solution.JPG] the last snapshot is from a successful installation of OSP 14 that specifically says that Kernel is not compatible with Firmware (Bios). I didn't test on Debian flavors but I think it's the same, the problem is with RabbitMQ heart beats, when the server is disconnected it times out causing this problem of kernel loop. Thanks and Best regards Amer -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Red Hat Enterprise Linux 7 64-bit (2)-2019-01-28-03-57-27.png Type: image/png Size: 26978 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Red Hat Enterprise Linux 7 64-bit (2)-2019-01-28-04-26-41.png Type: image/png Size: 386086 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: working solution.JPG Type: image/jpeg Size: 139627 bytes Desc: not available URL: From amotoki at gmail.com Sun May 19 04:06:47 2019 From: amotoki at gmail.com (Akihiro Motoki) Date: Sun, 19 May 2019 13:06:47 +0900 Subject: [storyboard] email notification on stories/tasks of subscribed projects In-Reply-To: <1db76780066130ccb661d2b1f632f163@sotk.co.uk> References: <1db76780066130ccb661d2b1f632f163@sotk.co.uk> Message-ID: Thanks for the information. I re-enabled email notification and then started to receive notifications. I am not sure why this solved the problem but it now works for me. 2019年5月15日(水) 22:43 : > On 2019-05-15 13:58, Akihiro Motoki wrote: > > Hi, > > > > Is there a way to get email notification on stories/tasks of > > subscribed projects in storyboard? > > Yes, go to your preferences > (https://storyboard.openstack.org/#!/profile/preferences) > by clicking on your name in the top right, then Preferences. > > Scroll to the bottom and check the "Enable notification emails" > checkbox, then > click "Save". 
There's a UI bug where sometimes the displayed preferences > will > look like the save button didn't work, but rest assured that it did > unless you > get an error message. > > Once you've done this the email associated with your OpenID will receive > notification emails for things you're subscribed to (which includes > changes on > stories/tasks related to projects you're subscribed to). > > Thanks, > > Adam (SotK) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From snikitin at mirantis.com Sun May 19 13:12:07 2019 From: snikitin at mirantis.com (Sergey Nikitin) Date: Sun, 19 May 2019 17:12:07 +0400 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: References: Message-ID: Hi, Rong, Database was rebuild and now stats o gengchc2 [1] is correct [2]. [1] https://www.stackalytics.com/?release=all&metric=commits&project_type=all&user_id=578043796-b [2] https://review.opendev.org/#/q/owner:gengchc2,n,z Sorry for delay, Sergey On Fri, May 17, 2019 at 6:20 PM Sergey Nikitin wrote: > Testing of migration process shown us that we have to rebuild database "on > live". > Unfortunately it means that during rebuild data will be incomplete. I > talked with the colleague who did it previously and he told me that it's > normal procedure. > I got these results on Monday and at this moment I'm waiting for weekend. > It's better to rebuild database in Saturday and Sunday to do now affect > much number of users. > So by the end of this week everything will be completed. Thank you for > patient. > > On Fri, May 17, 2019 at 6:15 AM Rong Zhu wrote: > >> Hi Sergey, >> >> What is the process about rebuild the database? >> >> Thanks, >> Rong Zhu >> >> Sergey Nikitin 于2019年5月7日 周二00:59写道: >> >>> Hello Rong, >>> >>> Sorry for long response. I was on a trip during last 5 days. >>> >>> What I have found: >>> Lets take a look on this patch [1]. It must be a contribution of >>> gengchc2, but for some reasons it was matched to Yuval Brik [2] >>> I'm still trying to find a root cause of it, but anyway on this week we >>> are planing to rebuild our database to increase RAM. I checked statistics >>> of gengchc2 on clean database and it's complete correct. >>> So your problem will be solved in several days. It will take so long >>> time because full rebuild of DB takes 48 hours, but we need to test our >>> migration process first to keep zero down time. >>> I'll share a results with you here when the process will be finished. >>> Thank you for your patience. >>> >>> Sergey >>> >>> [1] https://review.opendev.org/#/c/627762/ >>> [2] >>> https://www.stackalytics.com/?user_id=jhamhader&project_type=all&release=all&metric=commits&company=&module=freezer-api >>> >>> >>> On Mon, May 6, 2019 at 6:30 AM Rong Zhu wrote: >>> >>>> Hi Sergey, >>>> >>>> Do we have any process about my colleague's data loss problem? >>>> >>>> Sergey Nikitin 于2019年4月29日 周一19:57写道: >>>> >>>>> Thank you for information! I will take a look >>>>> >>>>> On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu >>>>> wrote: >>>>> >>>>>> Hi there, >>>>>> >>>>>> Recently we found we lost a person's data from our company at the >>>>>> stackalytics website. >>>>>> You can check the merged patch from [0], but there no date from >>>>>> the stackalytics website. >>>>>> >>>>>> stackalytics info as below: >>>>>> Company: ZTE Corporation >>>>>> Launchpad: 578043796-b >>>>>> Gerrit: gengchc2 >>>>>> >>>>>> Look forward to hearing from you! 
>>>>>> >>>>> >>>> Best Regards, >>>> Rong Zhu >>>> >>>>> >>>>>> -- >>>> Thanks, >>>> Rong Zhu >>>> >>> >>> >>> -- >>> Best Regards, >>> Sergey Nikitin >>> >> -- >> Thanks, >> Rong Zhu >> > > > -- > Best Regards, > Sergey Nikitin > -- Best Regards, Sergey Nikitin -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Sun May 19 21:15:27 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Sun, 19 May 2019 16:15:27 -0500 Subject: [openstack-dev] [Neutron] Train PTG Summary Message-ID: Dear Neutron team, Thank you very much for your hard work during the PTG in Denver. Even though it took place at the end of a very long week, we had a very productive meeting and we planned and prioritized a lot of work to be done during the cycle. Following below is a high level summary of the discussions we had. If there is something I left out, please reply to this email thread to add it. However, if you want to continue the discussion on any of the individual points summarized below, please start a new thread, so we don't have a lot of conversations going on attached to this update. You can find the etherpad we used during the PTG meetings here: https://etherpad.openstack.org/p/openstack-networking-train-ptg Retrospective ========== * The team considered positive points during the Stein cycle the following: - Implemented and merged all the targeted blueprints. - Minted several new core team members through a mentoring program. The new core reviewers are Nate Johnston, Hongbin Lu, Liu Yulong, Bernard Caffarelli (stable branches) and Ryan Tidwell (neutron-dynamic-routing) - Very good cross project cooperation with Nova ( https://blueprints.launchpad.net/neutron/+spec/strict-minimum-bandwidth-support) and StarlingX ( https://blueprints.launchpad.net/neutron/+spec/network-segment-range-management ) - The team got caught up with all the community goals - Added non-voting jobs from the Stadium projects, enabling the team to catch potential breakage due to changes in Neutron - Successfully forked the Ryu SDN framework, which is used by Neutron for Openflow programming. The original developer is not supporting the framework anymore, so the Neutron team forked it as os-ken ( https://opendev.org/openstack/os-ken) and adopted it seamlessly in the code * The team considered the following as improvement areas: - At the end of the cycle, we experienced failures in the gate that impacted the speed at which code was merged. Measures to solve this problem were later discussed in the "CI stability" section below - The team didn't make much progress adopting Storyboard. Given comments of lack of functionality from other projects, a decision was made to evaluate progress by other teams before moving ahead with Storyboard - Lost almost all the key contributors in the following Stadium projects: https://opendev.org/openstack/neutron-fwaas and https://opendev.org/openstack/neutron-vpnaas. Miguel Lavalle will talk to the remaining contributors to asses how to move forward - Not too much concrete progress was achieved by the performance and scalability sub-team. Please see the "Neutron performance and scaling up" section below for next steps - Engine facade adoption didn't make much progress due to the loss of all the members of the sub-team working on it. Miguel Lavalle will lead this effort during Train. Nate Johnston and Rodolfo Alonso volunteered to help. 
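For readers who have not followed the engine facade work closely: the goal is to move Neutron's database layer onto oslo.db's enginefacade, where reader and writer transactions are managed by context managers hung off the request context instead of ad-hoc session handling. The following is only a generic illustration of that target pattern, using a toy model and a throwaway SQLite file rather than anything from the Neutron tree:

# Illustrative only: the generic oslo.db enginefacade pattern that the
# Neutron adoption work is moving towards. Model and database are toys.
from oslo_context import context as oslo_context
from oslo_db.sqlalchemy import enginefacade
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base

BASE = declarative_base()


class Widget(BASE):
    __tablename__ = 'widgets'
    id = Column(Integer, primary_key=True)
    name = Column(String(64))


@enginefacade.transaction_context_provider
class Context(oslo_context.RequestContext):
    """A request context that can carry enginefacade transactions."""


# One transaction context per service; a real service would configure it
# from its configuration file rather than hard-coding a URL like this.
context_manager = enginefacade.transaction_context()
context_manager.configure(connection='sqlite:////tmp/enginefacade-demo.db')

ctx = Context()

# Writer: all DB updates for one operation share a single transaction.
with context_manager.writer.using(ctx) as session:
    BASE.metadata.create_all(session.get_bind())
    session.add(Widget(name='demo'))

# Reader: a read-only transaction that can be routed to a reader engine.
with context_manager.reader.using(ctx) as session:
    print(session.query(Widget).count())

The attraction of this pattern is that nested functions can transparently join the enclosing transaction through the context, which is also why adopting it touches so much existing code.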
The approach will be to break up the existing enginefacade patch into smaller, more easily implementable and reviewable chunks: https://review.opendev.org/#/c/545501/

Support more than one segment per network per host
========================================
The basic value proposition of routed networks is to allow deployers to offer their users a single "big virtual network" without the performance limitations of large L2 broadcast domains. This value proposition is currently limited by the fact that Neutron allows only one segment per network per host: https://github.com/openstack/neutron/blob/77fa7114f9ff67d43a1150b52001883fafb7f6c8/neutron/objects/subnet.py#L319-L328. As a consequence, as demand for IP addresses exceeds the limits of a reasonably sized subnet (a /22 subnet is the consensus upper limit), it becomes necessary to allow hosts to be connected to more than one segment in a routed network. David Bingham and Kris Lindgren (GoDaddy) have been working on PoC code to implement this (https://review.opendev.org/#/c/623115). This code has helped to uncover some key challenges:
* Change all code that assumes a 1-1 relationship between network and segment per host into a 1-many relationship.
* Generate IDs based on segment_id rather than network_id to be used in naming software bridges associated with the network segments.
* Ensure the new 1-many relationship (network -> segment) can be supported by ML2 driver implementors.
* Provide migration paths for current deployments of routed networks.
The agreements made were the following:
* We will write a spec reflecting the learnings of the PoC
* The spec will target all the supported ML2 backends, not only some of them
* We will modify and update ML2 interfaces to support the association of software bridges with segments, striving to provide backwards compatibility
* We will try to provide an automatic migration option that only requires re-starting the agents. If that proves not to be possible, a set of migration scripts and detailed instructions will be created
The first draft of the spec is already up for review: https://review.opendev.org/#/c/657170/

Neutron CI stability
==============
At the end of the Stein cycle the project experienced a significant impact due to CI instability. This situation has improved recently but there are still gains to be achieved. The team discussed two major areas of improvement: making sure we don't run more tests than are necessary (simplification of jobs) and fixing recurring problems.
- To help the conversation on simplification of jobs, Slawek Kaplonski shared this matrix showing what currently is being tested: https://ethercalc.openstack.org/neutron-ci-jobs
* One approach is reducing the services Neutron is tested with in integrated-gate jobs (tempest-full), which will reduce the number of failures not related to Neutron. Slawek Kaplonski represented Neutron in the QA PTG session where this approach was discussed. The proposal being socialized in the mailing list ( http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005871.html ) involves:
# Run only dependent service tests on project gate
# Tempest gate will keep running all the services tests as the integrated gate at a centralized place without any change in the current job
# Each project can run a simplified integrated gate job template tailored to its needs
# All the simplified integrated gate job templates will be defined and maintained by the QA team
# For Neutron there will be an "Integrated-gate-networking".
Tests to run in this template: Neutron APIs, Nova APIs, Keystone APIs. All scenarios currently running in tempest-full keep running in the same way (meaning non-slow and in serial). The improvement will be to exclude the Cinder API tests, Glance API tests and Swift API tests
* Another idea discussed was removing single node jobs that are very similar to multinode jobs
# One possibility is consolidating grenade jobs. There is a proposal being socialized in the mailing list: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006146.html
# Other consolidation of single node - multinode jobs will require stabilizing the corresponding multinode job
- One common source of problems is ssh failures in various scenario tests
* Several team members are working on different aspects of this issue
* Slawek Kaplonski is investigating authentication failures. As of the writing of this summary, it has been determined that there is a slowdown in the metadata service, either on the Neutron or the Nova side. Further investigation is going on
* Miguel Lavalle is adding tcpdump commands to router namespaces to investigate data path disruptions

networking-omnipath
================
networking-omnipath (https://opendev.org/x/networking-omnipath) is an ML2 mechanism driver that integrates the OpenStack Neutron API with an Omnipath backend. It enables an Omnipath switching fabric in an OpenStack cloud, and each network in the OpenStack networking realm corresponds to a virtual fabric on the Omnipath side.
- Manjeet Singh Bhatia proposed to make networking-omnipath a Neutron Stadium project
- The agreement was that Miguel Lavalle and Manjeet will work together in determining whether networking-omnipath meets the Stadium project criteria, as outlined here: https://docs.openstack.org/neutron/latest/contributor/stadium/governance.html#when-is-a-project-considered-part-of-the-stadium
- In case the criteria are not met, a remediation plan will be defined

Cross networking project topics
=======================
- Neutron is the only project not using WSGI
* We have to make it the default option in DevStack, although this will require some investigation
* We already have a check queue non-voting job for WSGI. It is failing constantly, although the failures are all due to a single test case (neutron_add_remove_fixed_ip). Miguel Lavalle will investigate and fix it
* Target is to adopt WSGI as the default by Train-2
- Adoption of neutron-classifier (https://opendev.org/x/neutron-classifier)
* David Shaughnessy has two PoC patches that demonstrate the adoption of neutron-classifier into Neutron's QoS. David will continue refining these patches and will bring them up for discussion in the QoS sub-team meeting on May 14th
* It was also agreed to start the process of adding neutron-classifier to the Neutron Stadium. David Shaughnessy and Miguel Lavalle will work on this per the criteria defined in https://docs.openstack.org/neutron/latest/contributor/stadium/governance.html#when-is-a-project-considered-part-of-the-stadium
- DHCP agent configured with mismatching domain and host entries
* Since the merge of https://review.opendev.org/#/c/571546/, there has been confusion about what exactly the dns_domain field of a network is for. Historically, dns_domain was for use with external DNS integration in the form of Designate, but that delineation has become muddied with the previously mentioned change.
* Miguel Lavalle will go back to the original spec of DNS integration and make a decision as to how to move forward - Neutron Events for smartNIC-baremetal use-case * In smartNIC baremetal usecase, Ironic need to know when agent is/is-not alive (since the neutron agent is running on the smartNIC) and when a port becomes up/down * General agreement was to leverage the existing notifier mechanism to emit this information for Ironic to consume (requires implementation of an API endpoint in Ironic). It was also agreed that a spec will be created * The notifications emitted can be leveraged by Ironic for other use-cases. In fact, in a lunch with Ironic team members (Julia Kreger, Devananda van der Veen and Harald Jensås), it was agreed to use use it also for the port bind/unbind completed notification. Neutron performance and scaling up =========================== - Recently, a performance and scalability sub-team ( http://eavesdrop.openstack.org/#Neutron_Performance_sub-team_Meeting) has been formed to explore ways to improve performance overall - One of the activities of this sub-team has been adding osprofiler to the Neutron Rally job (https://review.opendev.org/#/c/615350). Sample result reports can be seen here: http://logs.openstack.org/50/615350/38/check/neutron-rally-task/0a4b791/results/report.html.gz#/NeutronNetworks.create_and_delete_ports/output and http://logs.openstack.org/50/615350/38/check/neutron-rally-task/0a4b791/results/report.html.gz#/NeutronNetworks.create_and_delete_subnets/output - Reports analysis indicate that port creation takes on average in the order of 9 seconds, even without assigning IP addresses to it and without binding it. The team decided to concentrate its efforts in improving the entire port creation - binding - wiring cycle. One step necessary for this is the addition of a Rally scenario, which Bence Romsics volunteered to develop. - Another area of activity has been EnOS ( https://github.com/BeyondTheClouds/enos ), which is a framework that deploys OpenStack (using Kolla Ansible) and then runs Rally based performance experiments on that deployment ( https://enos.readthedocs.io/en/stable/benchmarks/index.html) * The deployment can take place on VMs (Vagrant provider) or in large clouds such as Grid5000 testbed: https://www.grid5000.fr/w/Grid5000:Home * The Neutron performance sub-team and the EnOS developers are cooperating to define a performance experiment at scale * To that end, Miguel Lavalle has built a "big PC" with an AMD Threadripper 2950x processor (16 cores, 32 threads) and 64 GB of memory. This machine will be used to experiment with deployments in VMs to refine the scenarios to be tested, with the additional benefit that the Rally results will not be affected by variability in the OpenStack CI infrastructure. * Once the Neutron and EnOS team reach an agreement on the scenarios to be tested, an experiment will be run Grid5000 * The EnOS team delivered on May 6th the version that supports the Stein release - Miguel Lavalle will create a wiki page to record a performance baseline and track subsequent progress DVR Enhancements =============== - Supporting allowed_address_pairs for DVR is a longstanding issue for DVR: https://bugs.launchpad.net/neutron/+bug/1774459. 
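As a quick reminder of what the feature is for: allowed-address-pairs lets additional IP (and optionally MAC) addresses, typically a VRRP or other shared virtual IP, send traffic through an existing port despite Neutron's anti-spoofing rules; the DVR problem tracked in the bug is about routing traffic to such an address correctly when the router is distributed. A minimal openstacksdk sketch of setting the pairs, in which the cloud name, port name and addresses are invented for illustration:

# Minimal sketch: allow a shared virtual IP (e.g. a VRRP VIP) to be used
# on an existing port. Cloud name, port name and addresses are made up.
import openstack

conn = openstack.connect(cloud='example-cloud')  # assumes a clouds.yaml entry

port = conn.network.find_port('web-1-port')

conn.network.update_port(
    port,
    allowed_address_pairs=[{'ip_address': '192.0.2.50'}],
)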
There are two patches up for review to address the allowed_address_pairs issue:
* https://review.opendev.org/#/c/616272/
* https://review.opendev.org/#/c/651905/
- The team discussed the current state of DVR functionality and identified the following missing features that would be beneficial for operators:
* Distributed ingress/egress for IPv6. Distributed ingress/egress (AKA "fast-exit") would be implemented for IPv6. This would bypass the centralized router in a network node
* Support for smart-NIC offloads. This involves pushing all DVR forwarding policy into OVS and implementing it via OpenFlow
* Distributed DHCP. Rather than having DHCP for a given network be answered centrally, OpenFlow rules will be programmed into OVS to provide static, locally generated responses to DHCP requests received on br-int
* Distributed SNAT. This involves allowing SNAT to happen directly on the compute node instead of centralizing it on a network node.
* There was agreement that these features are needed and Ryan Tidwell agreed to develop a spec as the next step. The spec is already up for review: https://review.opendev.org/#/c/658414
- networking-ovn team members pointed out that some of the above features are already implemented in their Stadium project. This led to a discussion of why we should duplicate efforts implementing the same features, and to the idea of instead exploring the possibility of a convergence between the ML2 / agents based reference implementation and the ML2 / OVN implementation.
* This discussion is particularly relevant in a context where the OpenStack community is rationalizing its size and contributors are scarcer
* Such a convergence would most likely play out over several development cycles
* The team agreed to explore how to achieve this convergence. To move forward, we will need visibility and certainty that the following is feasible:
# Feature parity with what the reference implementation offers today
# Ability to support all the backends in the current reference implementation
# Offer verifiable substantial gains in performance and scalability compared to the current reference implementation
# Broaden the community of developers contributing to the ML2 / OVN implementation
* To move ahead in the exploration of this convergence, the following actions were agreed:
# Benchmarking of the two implementations will be carried out with EnOS, as part of the performance and scaling up activities described above
# Write the necessary specs to address feature parity, support all the backends in the current reference implementation and provide migration paths
# An item will be added to the weekly Neutron meeting to track progress
# Make every decision along this exploration process with approval of the broader community

Policy topics / API
==============
- Keystone now has a system scope. A system-scoped token implies the user has authorization to act on the deployment system. These tokens are useful for interacting with resources that affect the deployment as a whole, or that expose resources that may otherwise violate project or domain isolation
* Currently in Neutron, if users have the admin role, they can access all resources
* In order to maintain alignment with the community, Akihiro Motoki will review the Neutron code and determine how the admin role is used to interact with deployment resources.
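To make the system-scope distinction concrete, here is a small, purely hypothetical pair of oslo.policy rule definitions (these are not Neutron's actual defaults, and the rule names are invented): a deployment-wide operation gated on a system-scoped admin token next to an ordinary project-scoped rule:

# Hypothetical example only -- not Neutron's real policy defaults.
from oslo_policy import policy

rules = [
    # Deployment-wide operation: requires an admin with a system-scoped
    # token (e.g. one obtained with OS_SYSTEM_SCOPE=all).
    policy.DocumentedRuleDefault(
        name='example:list_all_agents',
        check_str='role:admin and system_scope:all',
        description='List agents across the whole deployment.',
        operations=[{'path': '/v2.0/agents', 'method': 'GET'}],
        scope_types=['system'],
    ),
    # Project-level operation: any member of the owning project.
    policy.DocumentedRuleDefault(
        name='example:get_network',
        check_str='role:member and project_id:%(project_id)s',
        description='Show a network owned by the requesting project.',
        operations=[{'path': '/v2.0/networks/{id}', 'method': 'GET'}],
        scope_types=['project'],
    ),
]


def list_rules():
    """The kind of hook a service exposes so oslo.policy can load rules."""
    return rules

Roughly speaking, the review mentioned above is about deciding which of today's admin-only checks belong in the first, deployment-wide category.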
He will also monitor how Nova's adoption of the system scope progresses - During the policy-in-code work, some improvements and clean ups were left pending, which are Items 2.3, 2.4 and 4 in https://etherpad.openstack.org/p/neutron-policy-in-code - The Neutron approach to use new extensions to make any change to the ReST API discoverable, has resulted in the proliferation of "shim extensions" to introduce small changes such as the addition of an attribute * To eliminate this issue, Akihiro Motoki proposed to maintain the overall extensions approach but micro version the extensions so that every feature added does not result in another extension * The counter argument from some in the room was: "extensions are messy, but it's a static mess. Putting in Micro versions creates a mess in the code with lots of conditionals on micro version enablement" * It was decided to explore simple versioning of extensions. The details will be fleshed out in the following spec: https://review.opendev.org/#/c/656955 Neutron - Nova cross project planning ============================= This session was summarized in the following messages to the mailing list: - http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005844.html summarizes the following topics * Optional NUMA affinity for neutron ports * Track neutron ports in placement * os-vif to be extended to contain new fields to record the connectiviy type and ml2 driver that bound the vif * Boot vms with unaddressed port - Leaking resources when ports are deleted out-of-band is summarized in this thread: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005837.html - Melanie Witt asked if Neutron would support implementing transferring ownership of its resources. The answer was yes and as next step, she is going to send a message to the mailing list to define the next steps Code streamlining proposals ====================== - Streamlining IPAM flow. As a result of bulk port creation work done in Stein by Nate Johnston, it is clear that there are opportunities to improve the IPAM code. The team brainstormed several possible approaches and the following proposals were put forward: * When allocating blocks of IP addresses where strategy is 'next ip' then ensure it happens as a single SQL insert * Create bulk versions of allocate_ip_from_port_and_store etc. so that bulk can be preserved when pushed down to the IPAM driver to take advantage of the previous item * Add profiling code to the IPAM call so that we can log the time duration for IPAM execution, as a PoC - Streamlining use of remote groups in security groups. Nate Johnston pointed out that there is a performance hit when using security groups that are keyed to a remote_group_id, because when a port is added to a remote group it triggers security group rule updates for all of the members of the security group. On deployments with 150+ ports, this can take up to 5 mins to bring up the port * After discussion, the proposed next step is for Nate Johnston to create a PoC for a new approach where a nested security group creates a new iptables table/ovs flow table (let's call it a subtable) that can be used as an abstraction for the nested group relationship. 
Then the IP addresses of the primary security group will jump to the new table, and the new table can represent the contents of the remote security group # In a case where a primary security group has 170 members and lists itself as a remote security group (indicating members can all talk amongst themselves) when adding an instance to the security group that causes 171 updates, since each member needs the address of the new instance and a record needs to be created for the new one # With the proposed approach there would only be 2 updates: creating an entry for the new instance to jump to the subtable representing the remote security group, and adding an entry to the subtable Train community goals ================= The two community goals accepted or Train are: - PDF doc generation for project docs: https://review.opendev.org/#/c/647712/ * Akihiro Motoki will track this goal - IPv6 support and testing goal: https://review.opendev.org/#/c/653545/ * Good blog entry on overcoming metadata service shortcomings in this scenario: https://superuser.openstack.org/articles/deploying-ipv6-only-tenants-with-openstack/ neutron-lib topics ============= - To help expedite the merging the of neutron-lib consumption patches it was proposed to the team that neutron-lib-current projects must get their dependencies for devstack based testing jobs from source, instead of PyPI. * For an example of an incident motivating this proposal, please see: https://bugs.launchpad.net/tricircle/+bug/1816644 * This refers to inter-project dependencies, for example networking-l2gw depending on networking-sfc. It does not apply to *-lib projects, those will still be based on PyPI release * The team agreed to this proposal * When creating a new stable branch the Zuul config would need to be updated to point to the stable releases of the other projects it depends on. May include a periodic job that involves testing master and stable branches against PyPI packages * Boden Russel will make a list of what jobs need to be updated in projects that consume neutron-lib (superset of stadium) - Boden reminded the team we have a work items list for neutron-lib https://etherpad.openstack.org/p/neutron-lib-volunteers-and-punch-list Requests for enhancement ===================== - Improve extraroute API * Current extraroute API does not allow atomic additions/deletions of particular routing table entries. In the current API the routes attribute of a router (containing all routing table entries) must be updated at once, leading to race conditions on the client side * The team debated several alternatives: an API extension that makes routers extra routes first level resources, solve the concurrency issue though "compare and swap" approach, seek input from API work group or provide better guidelines for the use of the current API * The decision was made to move ahead with a spec proposing extra routes as first level API resources. That spec can be found here: https://review.opendev.org/#/c/655680 - Decouple placement reporting service plugin from ML2 * The placement reporter service plugin as merged in Stein depends on ML2. The improvement idea is to decouple it, by a driver pattern as in the QoS service plugin * We currently don't have a use case for this decoupling. 
As a consequence, it was decided to postpone it Various topics ========== - Migration of stadium projects CI jobs to python 3 * We have an etherpad recording the work items: https://etherpad.openstack.org/p/neutron_stadium_python3_status * Lajos Katona will take care of networking-odl * Miguel Lavalle will talk to Takashi Yamamoto about networking-midonet * Nate Johnston will continue working on networking-bagpipe and neutron-fwaas patches * A list of projects beyond the Stadium will be collected as part of the effort for neutron-lib to start pulling requirements from source - Removal of deprecated "of_interface" option * The option was deprecated in Pike * In some cases, deployers might experience a few seconds of data plane down time when the OVS agent is restarted without the option * A message was sent to the ML warning of this possible effect: http://lists.openstack.org/pipermail/openstack-dev/2018-September/134938.html. There has been no reaction from the community * We will move ahead with the removal of the option. Patch is here: https://review.opendev.org/#/c/599496 Status and health of some Stadium and non-Stadium projects ============================================== - Some projects have experienced loss of development team: * networking-old. In this case, Ericsson is interested in continuing maintaining the project. The key contact is Lajos Katona * networking-l2gw is also interesting for Ericsson (Lajos Katona). Over the pas few cycles the project has been maintained by Ricardo Noriega of Red Hat. Miguel Lavalle will organize a meeting with Lajos and Ricardo to decide how to move ahead with this project * neutron-fwaas. In this case, Miguel Lavalle will send a message to the mailing list describing the status of the project and requesting parties interested in continuing maintaining the project -------------- next part -------------- An HTML attachment was scrubbed... URL: From yangyi01 at inspur.com Mon May 20 00:07:38 2019 From: yangyi01 at inspur.com (=?utf-8?B?WWkgWWFuZyAo5p2o54eaKS3kupHmnI3liqHpm4blm6I=?=) Date: Mon, 20 May 2019 00:07:38 +0000 Subject: =?utf-8?B?562U5aSNOiDnrZTlpI06IFtEVlIgY29uZmlnXSBDYW4gd2UgdXNlIGRydl9z?= =?utf-8?B?bmF0IGFnZW50X21vZGUgaW4gZXZlcnkgY29tcHV0ZSBub2RlPw==?= In-Reply-To: <279f1523-bfcd-9863-c5d6-7cb190f7142b@gmail.com> References: <67d4e0f3053949fc844b6d1d26f05559@inspur.com> <279f1523-bfcd-9863-c5d6-7cb190f7142b@gmail.com> Message-ID: <58f85a3e3f1449cebdf59f7e16e7090e@inspur.com> Brian, thank for your reply. So if I configure 3 compute nodes of many compute node as drv_snat, it doesn't have substantial difference from the case that I configure 3 single network nodes as snat gateway except deployment difference, right? Another question, it doesn't use HA even if we have multiple dvr_snat nodes, right? If enable l3_ha, I think one external router will be scheduled in multiple (2 at least) dvr_snat nodes, for that case, IPs of these HA routers for this one router are same one and are activated by VRRP, right? For l3_ha, two or multiple HA l3 nodes must be in the same L2 network because it uses VRRP (keepalived) to share a VIP, right? For that case, how can we make sure VRRP can work well across leaf switches in a L3 leaf-spine network (servers are connected to leaf switch by L2)? -----邮件原件----- 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] 发送时间: 2019年5月17日 22:11 收件人: Yi Yang (杨燚)-云服务集团 抄送: openstack-discuss at lists.openstack.org 主题: Re: 答复: [DVR config] Can we use drv_snat agent_mode in every compute node? 
On 5/16/19 8:29 PM, Yi Yang (杨燚)-云服务集团 wrote: > Thanks Brian, your explanation clarified something, but I don't get the answer if we can have multiple compute nodes are configured to dvr_snat, for this case, SNAT IPs are obviously different. Why do we want to use network node if compute node can do everything? Hi Yi, There will only be one DVR SNAT IP allocated for a router on the external network, and only one router scheduled using it, so having dvr_snat mode on a compute node doesn't mean that North/South router will be local, only the East/West portion might be. Typically people choose to place these on separate systems since the requirements of the role are different - network node could have fewer cores and a 10G nic for higher bandwidth, compute node could have lots of cores for instances but maybe a 1G nic. There's no reason you can't run dvr_snat everywhere, I would just say it's not common. -Brian > -----邮件原件----- > 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] > 发送时间: 2019年5月16日 21:46 > 收件人: Yi Yang (杨燚)-云服务集团 > 抄送: openstack-discuss at lists.openstack.org > 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? > > Hi Yi, > > I'm a little confused by the question, comments inline. > > On 5/15/19 11:47 PM, Yi Yang (杨燚)-云服务集团 wrote: >> Hi, folks >> >> I saw somebody discussed distributed SNAT, but finally they didn’t >> make agreement on how to implement distributed SNAT, my question is >> can we use dvr_snat agent_mode in compute node? I understand dvr_snat >> only does snat but doesn’t do east west routing, right? Can we set >> dvr_snat and dvr in one compute node at the same time? It is >> equivalent to distributed SNAT if we can set drv_snat in every >> compute node, isn’t right? I know Opendaylight can do SNAT in compute >> node in distributed way, but one external router only can run in one compute node. > > Distributed SNAT is not available in neutron, there was a spec > proposed recently though, https://review.opendev.org/#/c/658414 > > Regarding the agent_mode setting for L3, only one mode can be set at a time. Typically 'dvr_snat' is used on network nodes and 'dvr' on compute nodes because it leads to less resource usage (i.e. namespaces). > The centralized part of the router hosting the default SNAT IP address will only be scheduled to one of the agents in 'dvr_snat' mode. All the DVR modes can do East/West routing when an instance is scheduled to the node, and two can do North/South - 'dvr_snat' using the default SNAT IP, and 'dvr' using a floating IP. 'dvr_no_external' can only do East/West. > > Hopefully that clarifies things. > > -Brian > >> I also see https://wiki.openstack.org/wiki/Dragonflow is trying to >> implement distributed SNAT, what are technical road blocks for >> distributed SNAT in openstack dvr? Do we have any good way to remove >> these road blocks? >> >> Thank you in advance and look forward to getting your replies and insights. >> >> Also attached official drv configuration guide for your reference. >> >> https://docs.openstack.org/neutron/stein/configuration/l3-agent.html >> >> |agent_mode|¶ >> > # >> DEFAULT.agent_mode> >> >> Type >> >> string >> >> Default >> >> legacy >> >> Valid Values >> >> dvr, dvr_snat, legacy, dvr_no_external >> >> The working mode for the agent. Allowed modes are: ‘legacy’ - this >> preserves the existing behavior where the L3 agent is deployed on a >> centralized networking node to provide L3 services like DNAT, and SNAT. >> Use this mode if you do not want to adopt DVR. 
‘dvr’ - this mode >> enables DVR functionality and must be used for an L3 agent that runs >> on a compute host. ‘dvr_snat’ - this enables centralized SNAT support >> in conjunction with DVR. This mode must be used for an L3 agent >> running on a centralized node (or in single-host deployments, e.g. devstack). >> ‘dvr_no_external’ - this mode enables only East/West DVR routing >> functionality for a L3 agent that runs on a compute host, the >> North/South functionality such as DNAT and SNAT will be provided by >> the centralized network node that is running in ‘dvr_snat’ mode. This >> mode should be used when there is no external network connectivity on >> the compute host. >> -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3600 bytes Desc: not available URL: From rtidwell at suse.com Mon May 20 00:39:43 2019 From: rtidwell at suse.com (Ryan Tidwell) Date: Sun, 19 May 2019 19:39:43 -0500 Subject: [neutron] Bug deputy report 2019-05-14 - 2019-05-20 Message-ID: <0939e28a-91c7-a14d-edfd-41b98bb1e31f@suse.com> Hello Neutrinos, Here is this week's bug deputy report for neutron-related issues. *High* N/A* * *Medium* * https://bugs.launchpad.net/neutron/+bug/1829332 - neutron-server report DhcpPortInUse ERROR log o https://review.opendev.org/659523 and https://review.opendev.org/659524 have been proposed against neutron and neutron-lib respectively. * https://bugs.launchpad.net/neutron/+bug/1829261 - Duplicate quota entry for project_id/resource causes inconsistent behaviour * https://bugs.launchpad.net/neutron/+bug/1829357 - Firewall-as-a-Service (FWaaS) v2 scenario in neutron o This is a documentation issue, the docs suggest commands be invoked via neutron CLI when this is not supported. https://review.opendev.org/659721 has been proposed to fix this. *  https://bugs.launchpad.net/neutron/+bug/1829387 - no way for non admin users to get networks *  https://bugs.launchpad.net/neutron/+bug/1829414 - Attribute filtering should be based on all objects instead of only first o https://review.opendev.org/659617 has been proposed to address this. *Low* N/A* * *RFE* * https://bugs.launchpad.net/neutron/+bug/1829449*- *Implement consistency check and self-healing for SDN-managed fabrics o A potentially interesting RFE around enabling feedback mechanisms for ML2 mech drivers. I'm not sure what to make of this yet, but I think it warrants some further discussion. ** *Filed and Fix Released: * * https://bugs.launchpad.net/neutron/+bug/1829304 - Neutron returns HttpException: 500 on certain operations with modified list of policies for non-admin user o https://review.opendev.org/#/c/659397/ was merged on master, with backports proposed to stable/stein and stable/rocky *Unassigned:* https://bugs.launchpad.net/neutron/+bug/1829261 and https://bugs.launchpad.net/neutron/+bug/1829387 -Ryan Tidwell -------------- next part -------------- An HTML attachment was scrubbed... URL: From aaronzhu1121 at gmail.com Mon May 20 00:43:19 2019 From: aaronzhu1121 at gmail.com (Rong Zhu) Date: Mon, 20 May 2019 08:43:19 +0800 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: References: Message-ID: Hi Sergey, Thanks for your help. Now the numbers are correctly. Sergey Nikitin 于2019年5月19日 周日21:12写道: > Hi, Rong, > > Database was rebuild and now stats o gengchc2 [1] is correct [2]. 
> > [1] > https://www.stackalytics.com/?release=all&metric=commits&project_type=all&user_id=578043796-b > [2] https://review.opendev.org/#/q/owner:gengchc2,n,z > > Sorry for delay, > Sergey > > > > > On Fri, May 17, 2019 at 6:20 PM Sergey Nikitin > wrote: > >> Testing of migration process shown us that we have to rebuild database >> "on live". >> Unfortunately it means that during rebuild data will be incomplete. I >> talked with the colleague who did it previously and he told me that it's >> normal procedure. >> I got these results on Monday and at this moment I'm waiting for weekend. >> It's better to rebuild database in Saturday and Sunday to do now affect >> much number of users. >> So by the end of this week everything will be completed. Thank you for >> patient. >> >> On Fri, May 17, 2019 at 6:15 AM Rong Zhu wrote: >> >>> Hi Sergey, >>> >>> What is the process about rebuild the database? >>> >>> Thanks, >>> Rong Zhu >>> >>> Sergey Nikitin 于2019年5月7日 周二00:59写道: >>> >>>> Hello Rong, >>>> >>>> Sorry for long response. I was on a trip during last 5 days. >>>> >>>> What I have found: >>>> Lets take a look on this patch [1]. It must be a contribution of >>>> gengchc2, but for some reasons it was matched to Yuval Brik [2] >>>> I'm still trying to find a root cause of it, but anyway on this week we >>>> are planing to rebuild our database to increase RAM. I checked statistics >>>> of gengchc2 on clean database and it's complete correct. >>>> So your problem will be solved in several days. It will take so long >>>> time because full rebuild of DB takes 48 hours, but we need to test our >>>> migration process first to keep zero down time. >>>> I'll share a results with you here when the process will be finished. >>>> Thank you for your patience. >>>> >>>> Sergey >>>> >>>> [1] https://review.opendev.org/#/c/627762/ >>>> [2] >>>> https://www.stackalytics.com/?user_id=jhamhader&project_type=all&release=all&metric=commits&company=&module=freezer-api >>>> >>>> >>>> On Mon, May 6, 2019 at 6:30 AM Rong Zhu wrote: >>>> >>>>> Hi Sergey, >>>>> >>>>> Do we have any process about my colleague's data loss problem? >>>>> >>>>> Sergey Nikitin 于2019年4月29日 周一19:57写道: >>>>> >>>>>> Thank you for information! I will take a look >>>>>> >>>>>> On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu >>>>>> wrote: >>>>>> >>>>>>> Hi there, >>>>>>> >>>>>>> Recently we found we lost a person's data from our company at the >>>>>>> stackalytics website. >>>>>>> You can check the merged patch from [0], but there no date from >>>>>>> the stackalytics website. >>>>>>> >>>>>>> stackalytics info as below: >>>>>>> Company: ZTE Corporation >>>>>>> Launchpad: 578043796-b >>>>>>> Gerrit: gengchc2 >>>>>>> >>>>>>> Look forward to hearing from you! >>>>>>> >>>>>> >>>>> Best Regards, >>>>> Rong Zhu >>>>> >>>>>> >>>>>>> -- >>>>> Thanks, >>>>> Rong Zhu >>>>> >>>> >>>> >>>> -- >>>> Best Regards, >>>> Sergey Nikitin >>>> >>> -- >>> Thanks, >>> Rong Zhu >>> >> >> >> -- >> Best Regards, >> Sergey Nikitin >> > > > -- > Best Regards, > Sergey Nikitin > -- Thanks, Rong Zhu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cheng1.li at intel.com Mon May 20 02:19:05 2019 From: cheng1.li at intel.com (Li, Cheng1) Date: Mon, 20 May 2019 02:19:05 +0000 Subject: [openstack-helm] List of the current meeting times, and opinion on adjustment In-Reply-To: References: Message-ID: +1 Trying to find if it is possible to have a slightly better official meeting time slot for Asian region Thanks, Cheng From: Jaesuk Ahn [mailto:bluejay.ahn at gmail.com] Sent: Saturday, May 18, 2019 7:42 AM To: Jean-Philippe Evrard Cc: openstack-discuss at lists.openstack.org Subject: Re: [openstack-helm] List of the current meeting times, and opinion on adjustment On Sat, May 18, 2019 at 12:05 AM Jean-Philippe Evrard > wrote: Sadly, there is no magic with timezones :( I am fine with your proposal though. True, there is really no magic with timezones, especially it involves everyone in the earth. I believe that there could be another way: - If a conversation needs to happen "synchronously", we could use the office hours for that. The office hours are timezone alternated IIRC, but not yet "officialized" (they are not appearing in [1]). - Bring conversations that need to happen asynchronously over the ML. - Discuss/Decide things in reviews. I do like your idea. Bringing asynchronous conversation actively to ML and reviews. If one think posting on ML or writing reviews has an equal effect as talking directly to someone in weekly irc meeting, it would give good alternative way. I did not think of using official hours as a sort of irc meeting for people in different timezone. However, I guess it cloud be done. That's technically totally workable globally, but it needs a change in mindset. It's also most likely people will get less active on IRC, which will impact the way we "feel" as a community. True, "feel" part is sometimes a very important factor for people. I think that being part of "official" activity is sometimes important not only for effective communication but also for "feeling" that I am part of community. Being able to attend "official" weekly meeting without too much effort (e.g. keep your eye open till midnight) is an weapon for me to bring more people into the community. I suppose we can do both. 1) Trying to find if it is possible to have a slightly better official meeting time slot for Asian region 2) Discussing on more leveraging asynchronous way of communication and its effect on the community. Thank you for the insightful feedback. :) Regards, Jean-Philippe Evrard [1]: http://eavesdrop.openstack.org/#OpenStack-Helm_Team_Meeting -- Jaesuk Ahn, Ph.D. Software R&D Center, SK Telecom -------------- next part -------------- An HTML attachment was scrubbed... URL: From anlin.kong at gmail.com Mon May 20 04:11:09 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Mon, 20 May 2019 16:11:09 +1200 Subject: [autohealing] Demo of Application Autohealing in OpenStack (Heat + Octavia + Aodh) Message-ID: Please see the demo here: https://youtu.be/dXsGnbr7DfM --- Best regards, Lingxian Kong Catalyst Cloud -------------- next part -------------- An HTML attachment was scrubbed... URL: From anlin.kong at gmail.com Mon May 20 04:15:41 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Mon, 20 May 2019 16:15:41 +1200 Subject: [autohealing] Demo of Application Autohealing in OpenStack (Heat + Octavia + Aodh) In-Reply-To: References: Message-ID: Recommend to watch in a 1080p video quality. 
--- Best regards, Lingxian Kong Catalyst Cloud On Mon, May 20, 2019 at 4:11 PM Lingxian Kong wrote: > Please see the demo here: https://youtu.be/dXsGnbr7DfM > > --- > Best regards, > Lingxian Kong > Catalyst Cloud > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Mon May 20 04:22:05 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 20 May 2019 13:22:05 +0900 Subject: [autohealing] Demo of Application Autohealing in OpenStack (Heat + Octavia + Aodh) In-Reply-To: References: Message-ID: It's great! Thanks Lingxian. On Mon, May 20, 2019, 13:20 Lingxian Kong wrote: > Recommend to watch in a 1080p video quality. > > --- > Best regards, > Lingxian Kong > Catalyst Cloud > > > On Mon, May 20, 2019 at 4:11 PM Lingxian Kong > wrote: > >> Please see the demo here: https://youtu.be/dXsGnbr7DfM >> >> --- >> Best regards, >> Lingxian Kong >> Catalyst Cloud >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon May 20 07:12:56 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Mon, 20 May 2019 09:12:56 +0200 Subject: [DVR config] Can we use drv_snat agent_mode in every compute node? In-Reply-To: <58f85a3e3f1449cebdf59f7e16e7090e@inspur.com> References: <67d4e0f3053949fc844b6d1d26f05559@inspur.com> <279f1523-bfcd-9863-c5d6-7cb190f7142b@gmail.com> <58f85a3e3f1449cebdf59f7e16e7090e@inspur.com> Message-ID: <1B6127C7-2794-40F4-BEED-6CD40DDB4BD9@redhat.com> Hi, > On 20 May 2019, at 02:07, Yi Yang (杨燚)-云服务集团 wrote: > > Brian, thank for your reply. So if I configure 3 compute nodes of many compute node as drv_snat, it doesn't have substantial difference from the case that I configure 3 single network nodes as snat gateway except deployment difference, right? Another question, it doesn't use HA even if we have multiple dvr_snat nodes, right? If enable l3_ha, I think one external router will be scheduled in multiple (2 at least) dvr_snat nodes, for that case, IPs of these HA routers for this one router are same one and are activated by VRRP, right? For l3_ha, two or multiple HA l3 nodes must be in the same L2 network because it uses VRRP (keepalived) to share a VIP, right? For that case, how can we make sure VRRP can work well across leaf switches in a L3 leaf-spine network (servers are connected to leaf switch by L2)? That is correct what You are saying. In DVR-HA case, SNAT nodes are working in same way like in “standard” L3HA. So it’s active-backup config and keepalived is deciding which node is active. Neutron creates “HA network” for tenant to use for keepalived. It can be e.g. vxlan network and that way You will have L2 between such nodes (routers). > > -----邮件原件----- > 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] > 发送时间: 2019年5月17日 22:11 > 收件人: Yi Yang (杨燚)-云服务集团 > 抄送: openstack-discuss at lists.openstack.org > 主题: Re: 答复: [DVR config] Can we use drv_snat agent_mode in every compute node? > > On 5/16/19 8:29 PM, Yi Yang (杨燚)-云服务集团 wrote: >> Thanks Brian, your explanation clarified something, but I don't get the answer if we can have multiple compute nodes are configured to dvr_snat, for this case, SNAT IPs are obviously different. Why do we want to use network node if compute node can do everything? 
> > Hi Yi, > > There will only be one DVR SNAT IP allocated for a router on the external network, and only one router scheduled using it, so having dvr_snat mode on a compute node doesn't mean that North/South router will be local, only the East/West portion might be. > > Typically people choose to place these on separate systems since the requirements of the role are different - network node could have fewer cores and a 10G nic for higher bandwidth, compute node could have lots of cores for instances but maybe a 1G nic. There's no reason you can't run dvr_snat everywhere, I would just say it's not common. > > -Brian > > >> -----邮件原件----- >> 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] >> 发送时间: 2019年5月16日 21:46 >> 收件人: Yi Yang (杨燚)-云服务集团 >> 抄送: openstack-discuss at lists.openstack.org >> 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? >> >> Hi Yi, >> >> I'm a little confused by the question, comments inline. >> >> On 5/15/19 11:47 PM, Yi Yang (杨燚)-云服务集团 wrote: >>> Hi, folks >>> >>> I saw somebody discussed distributed SNAT, but finally they didn’t >>> make agreement on how to implement distributed SNAT, my question is >>> can we use dvr_snat agent_mode in compute node? I understand dvr_snat >>> only does snat but doesn’t do east west routing, right? Can we set >>> dvr_snat and dvr in one compute node at the same time? It is >>> equivalent to distributed SNAT if we can set drv_snat in every >>> compute node, isn’t right? I know Opendaylight can do SNAT in compute >>> node in distributed way, but one external router only can run in one compute node. >> >> Distributed SNAT is not available in neutron, there was a spec >> proposed recently though, https://review.opendev.org/#/c/658414 >> >> Regarding the agent_mode setting for L3, only one mode can be set at a time. Typically 'dvr_snat' is used on network nodes and 'dvr' on compute nodes because it leads to less resource usage (i.e. namespaces). >> The centralized part of the router hosting the default SNAT IP address will only be scheduled to one of the agents in 'dvr_snat' mode. All the DVR modes can do East/West routing when an instance is scheduled to the node, and two can do North/South - 'dvr_snat' using the default SNAT IP, and 'dvr' using a floating IP. 'dvr_no_external' can only do East/West. >> >> Hopefully that clarifies things. >> >> -Brian >> >>> I also see https://wiki.openstack.org/wiki/Dragonflow is trying to >>> implement distributed SNAT, what are technical road blocks for >>> distributed SNAT in openstack dvr? Do we have any good way to remove >>> these road blocks? >>> >>> Thank you in advance and look forward to getting your replies and insights. >>> >>> Also attached official drv configuration guide for your reference. >>> >>> https://docs.openstack.org/neutron/stein/configuration/l3-agent.html >>> >>> |agent_mode|¶ >>> >> # >>> DEFAULT.agent_mode> >>> >>> Type >>> >>> string >>> >>> Default >>> >>> legacy >>> >>> Valid Values >>> >>> dvr, dvr_snat, legacy, dvr_no_external >>> >>> The working mode for the agent. Allowed modes are: ‘legacy’ - this >>> preserves the existing behavior where the L3 agent is deployed on a >>> centralized networking node to provide L3 services like DNAT, and SNAT. >>> Use this mode if you do not want to adopt DVR. ‘dvr’ - this mode >>> enables DVR functionality and must be used for an L3 agent that runs >>> on a compute host. ‘dvr_snat’ - this enables centralized SNAT support >>> in conjunction with DVR. 
This mode must be used for an L3 agent >>> running on a centralized node (or in single-host deployments, e.g. devstack). >>> ‘dvr_no_external’ - this mode enables only East/West DVR routing >>> functionality for a L3 agent that runs on a compute host, the >>> North/South functionality such as DNAT and SNAT will be provided by >>> the centralized network node that is running in ‘dvr_snat’ mode. This >>> mode should be used when there is no external network connectivity on >>> the compute host. >>> — Slawek Kaplonski Senior software engineer Red Hat From yangyi01 at inspur.com Mon May 20 07:33:26 2019 From: yangyi01 at inspur.com (=?utf-8?B?WWkgWWFuZyAo5p2o54eaKS3kupHmnI3liqHpm4blm6I=?=) Date: Mon, 20 May 2019 07:33:26 +0000 Subject: =?utf-8?B?562U5aSNOiBbRFZSIGNvbmZpZ10gQ2FuIHdlIHVzZSBkcnZfc25hdCBhZ2Vu?= =?utf-8?B?dF9tb2RlIGluIGV2ZXJ5IGNvbXB1dGUgbm9kZT8=?= In-Reply-To: <1B6127C7-2794-40F4-BEED-6CD40DDB4BD9@redhat.com> References: <67d4e0f3053949fc844b6d1d26f05559@inspur.com> <279f1523-bfcd-9863-c5d6-7cb190f7142b@gmail.com> <58f85a3e3f1449cebdf59f7e16e7090e@inspur.com> <1B6127C7-2794-40F4-BEED-6CD40DDB4BD9@redhat.com> Message-ID: <55f84d63363640b480ff5bfd6013e895@inspur.com> Hi, Slawomir, do you mean VRRP over VXLAN? I mean servers in leaf switch are attached to the leaf switch by VLAN and servers handle VxLAN encap and decap, for such case, how can leaf-spine transport a L2 packet to another server in another leaf switch? -----邮件原件----- 发件人: Slawomir Kaplonski [mailto:skaplons at redhat.com] 发送时间: 2019年5月20日 15:13 收件人: Yi Yang (杨燚)-云服务集团 抄送: haleyb.dev at gmail.com; openstack-discuss at lists.openstack.org 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? 重要性: 高 Hi, > On 20 May 2019, at 02:07, Yi Yang (杨燚)-云服务集团 wrote: > > Brian, thank for your reply. So if I configure 3 compute nodes of many compute node as drv_snat, it doesn't have substantial difference from the case that I configure 3 single network nodes as snat gateway except deployment difference, right? Another question, it doesn't use HA even if we have multiple dvr_snat nodes, right? If enable l3_ha, I think one external router will be scheduled in multiple (2 at least) dvr_snat nodes, for that case, IPs of these HA routers for this one router are same one and are activated by VRRP, right? For l3_ha, two or multiple HA l3 nodes must be in the same L2 network because it uses VRRP (keepalived) to share a VIP, right? For that case, how can we make sure VRRP can work well across leaf switches in a L3 leaf-spine network (servers are connected to leaf switch by L2)? That is correct what You are saying. In DVR-HA case, SNAT nodes are working in same way like in “standard” L3HA. So it’s active-backup config and keepalived is deciding which node is active. Neutron creates “HA network” for tenant to use for keepalived. It can be e.g. vxlan network and that way You will have L2 between such nodes (routers). > > -----邮件原件----- > 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] > 发送时间: 2019年5月17日 22:11 > 收件人: Yi Yang (杨燚)-云服务集团 > 抄送: openstack-discuss at lists.openstack.org > 主题: Re: 答复: [DVR config] Can we use drv_snat agent_mode in every compute node? > > On 5/16/19 8:29 PM, Yi Yang (杨燚)-云服务集团 wrote: >> Thanks Brian, your explanation clarified something, but I don't get the answer if we can have multiple compute nodes are configured to dvr_snat, for this case, SNAT IPs are obviously different. Why do we want to use network node if compute node can do everything? 
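For reference, one way to confirm that the keepalived/VRRP traffic between the dvr_snat nodes is carried inside a tunnelled tenant network (as described above), rather than needing stretched L2 in the fabric, is to look at the HA network neutron creates for the project. With admin credentials, something like:

    openstack network show "HA network tenant <project-id>" -c provider:network_type -c provider:segmentation_id

If that reports vxlan, the VRRP hellos are encapsulated between the nodes' tunnel endpoints, so the leaf-spine underlay only has to route the VTEP traffic. (Command shown as an illustration; the exact HA network name/ID is easiest to grab from "openstack network list" as admin.)
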
> > Hi Yi, > > There will only be one DVR SNAT IP allocated for a router on the external network, and only one router scheduled using it, so having dvr_snat mode on a compute node doesn't mean that North/South router will be local, only the East/West portion might be. > > Typically people choose to place these on separate systems since the requirements of the role are different - network node could have fewer cores and a 10G nic for higher bandwidth, compute node could have lots of cores for instances but maybe a 1G nic. There's no reason you can't run dvr_snat everywhere, I would just say it's not common. > > -Brian > > >> -----邮件原件----- >> 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] >> 发送时间: 2019年5月16日 21:46 >> 收件人: Yi Yang (杨燚)-云服务集团 >> 抄送: openstack-discuss at lists.openstack.org >> 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? >> >> Hi Yi, >> >> I'm a little confused by the question, comments inline. >> >> On 5/15/19 11:47 PM, Yi Yang (杨燚)-云服务集团 wrote: >>> Hi, folks >>> >>> I saw somebody discussed distributed SNAT, but finally they didn’t >>> make agreement on how to implement distributed SNAT, my question is >>> can we use dvr_snat agent_mode in compute node? I understand >>> dvr_snat only does snat but doesn’t do east west routing, right? Can >>> we set dvr_snat and dvr in one compute node at the same time? It is >>> equivalent to distributed SNAT if we can set drv_snat in every >>> compute node, isn’t right? I know Opendaylight can do SNAT in >>> compute node in distributed way, but one external router only can run in one compute node. >> >> Distributed SNAT is not available in neutron, there was a spec >> proposed recently though, https://review.opendev.org/#/c/658414 >> >> Regarding the agent_mode setting for L3, only one mode can be set at a time. Typically 'dvr_snat' is used on network nodes and 'dvr' on compute nodes because it leads to less resource usage (i.e. namespaces). >> The centralized part of the router hosting the default SNAT IP address will only be scheduled to one of the agents in 'dvr_snat' mode. All the DVR modes can do East/West routing when an instance is scheduled to the node, and two can do North/South - 'dvr_snat' using the default SNAT IP, and 'dvr' using a floating IP. 'dvr_no_external' can only do East/West. >> >> Hopefully that clarifies things. >> >> -Brian >> >>> I also see https://wiki.openstack.org/wiki/Dragonflow is trying to >>> implement distributed SNAT, what are technical road blocks for >>> distributed SNAT in openstack dvr? Do we have any good way to remove >>> these road blocks? >>> >>> Thank you in advance and look forward to getting your replies and insights. >>> >>> Also attached official drv configuration guide for your reference. >>> >>> https://docs.openstack.org/neutron/stein/configuration/l3-agent.html >>> >>> |agent_mode|¶ >>> >> l >>> # >>> DEFAULT.agent_mode> >>> >>> Type >>> >>> string >>> >>> Default >>> >>> legacy >>> >>> Valid Values >>> >>> dvr, dvr_snat, legacy, dvr_no_external >>> >>> The working mode for the agent. Allowed modes are: ‘legacy’ - this >>> preserves the existing behavior where the L3 agent is deployed on a >>> centralized networking node to provide L3 services like DNAT, and SNAT. >>> Use this mode if you do not want to adopt DVR. ‘dvr’ - this mode >>> enables DVR functionality and must be used for an L3 agent that runs >>> on a compute host. ‘dvr_snat’ - this enables centralized SNAT >>> support in conjunction with DVR. 
This mode must be used for an L3 >>> agent running on a centralized node (or in single-host deployments, e.g. devstack). >>> ‘dvr_no_external’ - this mode enables only East/West DVR routing >>> functionality for a L3 agent that runs on a compute host, the >>> North/South functionality such as DNAT and SNAT will be provided by >>> the centralized network node that is running in ‘dvr_snat’ mode. >>> This mode should be used when there is no external network >>> connectivity on the compute host. >>> — Slawek Kaplonski Senior software engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3600 bytes Desc: not available URL: From kchamart at redhat.com Mon May 20 08:07:51 2019 From: kchamart at redhat.com (Kashyap Chamarthy) Date: Mon, 20 May 2019 10:07:51 +0200 Subject: On reporting CPU flags that provide mitiation (to CVE flaws) as Nova 'traits' In-Reply-To: <49033c4d-bfe5-3493-926c-75804719b1be@fried.cc> References: <20190515092456.GH17214@paraplu> <1e4f1df66115fd8e96b8aec3a679b25534c66541.camel@redhat.com> <20190515131109.GJ17214@paraplu> <60a1e97d-c9ac-469f-3c16-39e89347acb3@fried.cc> <20190517110721.GA19519@paraplu> <49033c4d-bfe5-3493-926c-75804719b1be@fried.cc> Message-ID: <20190520080751.GB19519@paraplu> On Fri, May 17, 2019 at 11:25:24AM -0500, Eric Fried wrote: > > Okay, so I take it that all the relevant low-level CPU flags (including > > things like SSBD, et al) as proposed here[2][3] can be added to > > 'os-traits'. > > Yes, subject to already-noted namespacing and spelling issues. Noted. > > And tools _other_ than Nova can consume, if need be. > > Nova should consume by having the driver expose the flags as > appropriate. And switching on flaggage in domain xml if that's a thing. > But that's all. No efforts to special-case scheduling decisions etc. Nod; thanks for clarifying, Eric. -- /kashyap From mark at stackhpc.com Mon May 20 10:31:48 2019 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 20 May 2019 11:31:48 +0100 Subject: [Kolla] Few questions about offline deploy & operation. In-Reply-To: References: Message-ID: On Fri, 17 May 2019 at 02:46, Eddie Yen wrote: > Hi everyone, > > I'm a newbie of using Kolla-ansible to deploy OpenStack. > Everything works well. But I have few questions about Kolla since I was > using Fuel as deploy tool before. > The questions may silly, but I really want to know since I didn't found > much informations to solve my questions.. > > > > About offline deployment: > > I already known that we need have OS local repository (CentOS or Ubuntu), > docker local package repository, and local docker registry. > But one thing I'm not sure that it will use pip to install python packages > or not. Because I found it will also install pip into target node during > bootstrap-servers. > If so, which python package should I prepare for? > > If you are using bootstrap-servers, you will need the pip and docker Python packages available for installation. Are you building your own images? If so you will probably find it easier to use binary images than source since these use OS distro packages. > About operation: > > 1. Is it possible to operate a single service to whole OpenStack nodes? > Like using crm to check status about MySQL or RabbitMQ services on all > control nodes. > > We don't have a simple way to manage the whole cluster of services from one place. 
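In practice a short loop over the hosts covers most of these checks. A rough sketch, assuming SSH access and the default Kolla container names (adjust hosts and names to your inventory):

    for host in controller1 controller2 controller3; do
        ssh $host "docker ps --format '{{.Names}}: {{.Status}}' | egrep 'mariadb|rabbitmq'"
    done

For service-level state you can also exec into a container, e.g. "docker exec rabbitmq rabbitmqctl cluster_status".
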
You can use the docker CLI, perhaps configured for TCP transport if you want remote control, or just via SSH if you don't want to expose it to the network. > 2. How can I maintenance the Ceph OSD if OSD down caused by disk issue? > I know how to rebuild OSD by Ceph commands but I'm not sure how to do if > ceph-osd running in container. > > I don't tend to use Ceph in kolla, so I can't really answer this one. I couldn't see anything about this in the docs. > Many thanks, > Eddie. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Mon May 20 12:30:51 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Mon, 20 May 2019 14:30:51 +0200 Subject: [all][dev] python-etcd3 needs maintainers In-Reply-To: References: Message-ID: >> When I was at OpenInfraDays UK, Louis Taylor (aka kragniz) asked me >> if I could check with the OpenStack community to see if there are >> people who are actively using python-etcd3 [1] that are interested in >> helping to maintain it. It needs more attention than he is able to >> give. >> It's also in use in ansible etcd3 module. Those interested in this module should ... have a look at python-etcd3. I don't have much time to maintain this too :( From lajos.katona at ericsson.com Mon May 20 12:39:05 2019 From: lajos.katona at ericsson.com (Lajos Katona) Date: Mon, 20 May 2019 12:39:05 +0000 Subject: [keystone][placement][neutron][api-sig] http404 to NotFound, or how should a http json error body look like? In-Reply-To: References: <9ae10062-a9c8-3e76-15a1-da0745361c57@ericsson.com> Message-ID: Thanks Chris for the reply, let's wait for keystone folks for comments. On 2019. 05. 17. 13:42, Chris Dent wrote: > On Fri, 17 May 2019, Lajos Katona wrote: > >> keystoneauth1 expects the http body in case of for example http404 to be >> something like this (python dict comes now): >> body= {'error': {'message': 'Foooooo', 'details': 'Baaaaar'}} >> see: >> https://protect2.fireeye.com/url?k=e824ec9f-b4f7fcf7-e824ac04-8691b328a8b8-1c7b10b9a8d8e9c8&q=1&u=https%3A%2F%2Fopendev.org%2Fopenstack%2Fkeystoneauth%2Fsrc%2Fbranch%2Fmaster%2Fkeystoneauth1%2Fexceptions%2Fhttp.py%23L406-L415 >> >> >> But placement started to adopt one suggestion from the API-SIG: >> http://specs.openstack.org/openstack/api-wg/guidelines/errors.html, >> >> and that is something like this (again python): >> body={'errors': [{'status': 404, 'title': 'Not Found', 'detail': 'The >> resource could not be found.... ', 'request_id': '...'}]} > > Thanks for all the detail in this message and on the bug report. It > helps make understanding what's going on a lot easier. > > As you've discovered placement is following the guidelines for how > errors are supposed to be formatted. If keystoneauth1 can't speak > that format, that's probably the primary bug. > > However, it also sounds like you're encountering a few different > evolutions in placement that may need to be addressed in older > versions of neutron's placement client: > > * For quite a while placement was strict about responding to Accept >   headers appropriately. This was based on its interaction with >   Webob. If you didn't ask for json in the Accept header, errors >   could come in HTML or Text. The most reliable fix for this in any >   client of any API is to always send an Accept header that states >   how you want responses to be presented (in this case >   application/json). This can lead to interesting parsing troubles >   if you are rely on the bodies of responses. 
> > * In microversion 1.23 we started sending 'code' with error >   responses in an effort to avoid needing to parse error responses. > > I've got a few different suggestions that you might want to explore. > None of them are a direct fix for the issue you are seeing but may > lead to some ideas. > > First off, this would be a big change, but I do not think it is good > practice to raise exceptions when getting 4xx responses. Instead in > the neutron-lib placement client it would be better to branch on > status code in the resp object. > > If it doesn't matter why you 404d, just that you did, you could log > just that, not the detail. > > Another thing to think about is in the neutron placement client you > have a get_inventory [1] method which has both resource provider > uuid and resource class in the URL and thus can lead to the "two > different types of 404s" issue that is making parsing the error > response required. You could potentially avoid this by implementing > get_inventories [2] which would only 404 on bad resource provider > uuid and would return inventory for all resource classes on that rp. > You can use PUT on the same URL to replace all inventory on that > rp. > > Make sure you send Accept: application/json in all your requests in > the client. > > Make keystoneauth1 interpret two diferent types of errors response: > the one it does, and the one in the api-sig guidelines. Note that > I've yet to see any context where there is more than one error in > the list, so it is always errors[0] that gets inspected. > > On the placement side we should probably add codes to the few > different places where a 404 can happen for different reasons (they > are almost all combinations of not finding a resource provider or > not finding a resource class). If that were present you could branch > the library code on the code in the errors structure instead of the > detail. However if its possible to avoid choosing which 404, that's > probably better. > > I hope some of that helps. Hopefully some keystoneauth folk will > chime in too. > > [1] > https://protect2.fireeye.com/url?k=301a51a6-6cc941ce-301a113d-8691b328a8b8-87f5962f00f50154&q=1&u=https%3A%2F%2Fopendev.org%2Fopenstack%2Fneutron-lib%2Fsrc%2Fbranch%2Fmaster%2Fneutron_lib%2Fplacement%2Fclient.py%23L417k > > > [2] > https://developer.openstack.org/api-ref/placement/#resource-provider-inventories From missile0407 at gmail.com Mon May 20 13:35:37 2019 From: missile0407 at gmail.com (Eddie Yen) Date: Mon, 20 May 2019 21:35:37 +0800 Subject: [Kolla] Few questions about offline deploy & operation. In-Reply-To: References: Message-ID: Hi Mark, glad to see your reply. 1. For python repository, only need are pip and python-docker during bootstrapping servers. Sounds good. And I didn't build my own images. I'm using the official binary docker images. I'll try to do these on test site. 2. Roger that. Seems little less convenient but still easy to manage by checking container's status and cluster status for each Openstack services (like Nova, Neutron, etc.). 3. Okay, I'll try to look inside the container to check about ceph-osd container. Anyway, still appreciate your reply. And also apologize my bad English. Hope this project can be better and more powerful. Have a nice day, Eddie. Mark Goddard 於 2019年5月20日 週一 下午6:32寫道: > On Fri, 17 May 2019 at 02:46, Eddie Yen wrote: > >> Hi everyone, >> >> I'm a newbie of using Kolla-ansible to deploy OpenStack. >> Everything works well. 
But I have few questions about Kolla since I was >> using Fuel as deploy tool before. >> The questions may silly, but I really want to know since I didn't found >> much informations to solve my questions.. >> >> >> >> About offline deployment: >> >> I already known that we need have OS local repository (CentOS or Ubuntu), >> docker local package repository, and local docker registry. >> But one thing I'm not sure that it will use pip to install python >> packages or not. Because I found it will also install pip into target node >> during bootstrap-servers. >> If so, which python package should I prepare for? >> >> > If you are using bootstrap-servers, you will need the pip and docker > Python packages available for installation. > > Are you building your own images? If so you will probably find it easier > to use binary images than source since these use OS distro packages. > > >> About operation: >> >> 1. Is it possible to operate a single service to whole OpenStack nodes? >> Like using crm to check status about MySQL or RabbitMQ services on all >> control nodes. >> >> > We don't have a simple way to manage the whole cluster of services from > one place. You can use the docker CLI, perhaps configured for TCP transport > if you want remote control, or just via SSH if you don't want to expose it > to the network. > >> 2. How can I maintenance the Ceph OSD if OSD down caused by disk issue? >> I know how to rebuild OSD by Ceph commands but I'm not sure how to do if >> ceph-osd running in container. >> >> > I don't tend to use Ceph in kolla, so I can't really answer this one. I > couldn't see anything about this in the docs. > > >> Many thanks, >> Eddie. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lyarwood at redhat.com Mon May 20 14:08:02 2019 From: lyarwood at redhat.com (Lee Yarwood) Date: Mon, 20 May 2019 15:08:02 +0100 Subject: [nova][cinder][ptg] Summary: Swap volume woes In-Reply-To: References: <20190506131834.nyc7k7qltdsmamuq@lyarwood.usersys.redhat.com> Message-ID: <20190520140802.q55pxpdwjnzf7ri5@lyarwood.usersys.redhat.com> On 08-05-19 11:03:17, Matt Riedemann wrote: > On 5/6/2019 8:18 AM, Lee Yarwood wrote: > > - Deprecate the existing swap volume API in Train, remove in U. > > I don't remember this coming up. Apologies, I think I trying to suggest that we could deprecate the underlying swap volume logic and replace it with some basic attachment update logic while keeping the same URI etc. IIRC you suggested this would be useful during the session. > Deprecation is one thing if we have an alternative, but removal isn't > really an option. Yes we have 410'ed some REST APIs for removed > services (nova-network, nova-cells) but for the most part we're > married to our REST APIs so we can deprecate things to signal "don't > use these anymore" but that doesn't mean we can just delete them. > This is why we require a spec for all API changes, because of said > marriage. Understood and I have every intention of writing this up in a spec now I'm back from PTO. Cheers, -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: not available URL: From jacob.anders.au at gmail.com Mon May 20 14:46:56 2019 From: jacob.anders.au at gmail.com (Jacob Anders) Date: Tue, 21 May 2019 00:46:56 +1000 Subject: [ironic] IRC meeting timing Message-ID: Hi All, I would be keen to participate in the Ironic IRC meetings, however the current timing of the meeting is quite unfavourable to those based in the Asia Pacific region. I'm wondering - would you be open to consider changing the timing to either: - 2000hrs UTC: UTC (Time Zone) Monday, 20 May 2019 at 8:00:00 pm UTC (Sydney/Australia) Tuesday, 21 May 2019 at 6:00:00 am AEST UTC+10 hours (Germany/Berlin) Monday, 20 May 2019 at 10:00:00 pm CEST UTC+2 hours (USA/California) Monday, 20 May 2019 at 1:00:00 pm PDT UTC-7 hours - alternating between two different times to accommodate different timezones? For example 1300hrs and 2000hrs UTC? Thank you. Best regards, Jacob -------------- next part -------------- An HTML attachment was scrubbed... URL: From colleen at gazlene.net Mon May 20 14:49:22 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Mon, 20 May 2019 07:49:22 -0700 Subject: =?UTF-8?Q?Re:_[keystone][placement][neutron][api-sig]_http404_to_NotFoun?= =?UTF-8?Q?d,_or_how_should_a_http_json_error_body_look_like=3F?= In-Reply-To: References: <9ae10062-a9c8-3e76-15a1-da0745361c57@ericsson.com> Message-ID: <93c95d69-c87a-4d4d-bf10-3b6b293b8a6a@www.fastmail.com> On Mon, May 20, 2019, at 05:40, Lajos Katona wrote: > Thanks Chris for the reply, let's wait for keystone folks for comments. Hi! > > On 2019. 05. 17. 13:42, Chris Dent wrote: > > On Fri, 17 May 2019, Lajos Katona wrote: > > > >> keystoneauth1 expects the http body in case of for example http404 to be > >> something like this (python dict comes now): > >> body= {'error': {'message': 'Foooooo', 'details': 'Baaaaar'}} > >> see: > >> https://protect2.fireeye.com/url?k=e824ec9f-b4f7fcf7-e824ac04-8691b328a8b8-1c7b10b9a8d8e9c8&q=1&u=https%3A%2F%2Fopendev.org%2Fopenstack%2Fkeystoneauth%2Fsrc%2Fbranch%2Fmaster%2Fkeystoneauth1%2Fexceptions%2Fhttp.py%23L406-L415 > >> > >> > >> But placement started to adopt one suggestion from the API-SIG: > >> http://specs.openstack.org/openstack/api-wg/guidelines/errors.html, > >> > >> and that is something like this (again python): > >> body={'errors': [{'status': 404, 'title': 'Not Found', 'detail': 'The > >> resource could not be found.... ', 'request_id': '...'}]} > > > > Thanks for all the detail in this message and on the bug report. It > > helps make understanding what's going on a lot easier. > > > > As you've discovered placement is following the guidelines for how > > errors are supposed to be formatted. If keystoneauth1 can't speak > > that format, that's probably the primary bug. > > [snipped] > > > > Make keystoneauth1 interpret two diferent types of errors response: > > the one it does, and the one in the api-sig guidelines. Note that > > I've yet to see any context where there is more than one error in > > the list, so it is always errors[0] that gets inspected. We'll happily accept patches to keystoneauth that make it compliant with the API-SIG's guidelines (as long as it is backwards compatible). I gotta say, though, this guideline on error handling really takes me aback. Why should a single HTTP request ever result in a list of errors, plural? Is there any standard, pattern, or example *outside* of OpenStack where this is done or recommended? Why? 
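That said, to be concrete about the first point, the sort of backwards-compatible handling I would expect such a patch to add is roughly this (a hand-written sketch, not actual keystoneauth code):

    def _extract_error_message(body):
        # Current keystoneauth-style body: {'error': {'message': ..., 'details': ...}}
        error = body.get('error')
        if isinstance(error, dict):
            return error.get('message'), error.get('details')
        # API-SIG guideline style: {'errors': [{'title': ..., 'detail': ...}, ...]}
        errors = body.get('errors')
        if isinstance(errors, list) and errors:
            return errors[0].get('title'), errors[0].get('detail')
        return None, None

i.e. keep the existing shape working and only fall back to the guideline shape when it is present.
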
Colleen From gsteinmuller at vexxhost.com Mon May 20 14:54:39 2019 From: gsteinmuller at vexxhost.com (=?UTF-8?Q?Guilherme_Steinm=C3=BCller?=) Date: Mon, 20 May 2019 11:54:39 -0300 Subject: [openstack-ansible][monasca][zaqar][watcher][searchlight] Retirement of unused OpenStack Ansible roles In-Reply-To: References: <236ef912-21c5-4345-98ce-067499921af1@www.fastmail.com> <604fd001-f9aa-4f32-8c19-fdd19a0a458c@www.fastmail.com> Message-ID: Hello all. So, we already have some retired roles. For now, to merge the last patch ( https://review.opendev.org/#/c/650422/ ) we need to finish the retirement of os_monasca-* roles ( https://review.opendev.org/#/c/653630/ , https://review.opendev.org/#/c/653631/ ). I am going to proceed with these 2 monasca roles. I think if anyone wants in the future to maintain those, the retirement could be reverted. Best regards Guilherme On Mon, May 13, 2019 at 4:31 AM Trinh Nguyen wrote: > Hi Jean-Philippe, > > Thanks for the information. Sure, let's me look at the role for sometimes > and get back to you if I need help. > > Bests, > > > > On Mon, May 13, 2019 at 4:25 PM Jean-Philippe Evrard < > jean-philippe at evrard.me> wrote: > >> >> >> On Wed, May 8, 2019, at 19:05, Trinh Nguyen wrote: >> > Hi all, >> > >> > I would love to take care of the searchlight roles. Are there any >> specific requirements I need to keep in mind? >> > >> > Bests, >> > >> > >> > >> > -- >> > *Trinh Nguyen* >> > _www.edlab.xyz_ >> > >> >> Hello, >> >> Great news! >> Searchlight role has been unmaintained for a while. The code is still >> using old elastic search versions, and is following relatively old >> standards. We are looking for someone ready to first step into the code to >> fix the deployment and add functional test coverage (for example, add >> tempest testing). This might require refreshing the role to our latest >> openstack-ansible standards too (we can discuss this in a different email >> or on our channel). >> >> When this would be done, we would be hoping you'd accept to be core on >> this role, so you can monitor the role, and ensure it's always working >> fine, and behave the way you expect it to be. >> >> Regards, >> Jean-Philippe Evrard (evrardjp) >> >> > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Mon May 20 14:59:37 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Mon, 20 May 2019 15:59:37 +0100 (BST) Subject: [keystone][placement][neutron][api-sig] http404 to NotFound, or how should a http json error body look like? In-Reply-To: <93c95d69-c87a-4d4d-bf10-3b6b293b8a6a@www.fastmail.com> References: <9ae10062-a9c8-3e76-15a1-da0745361c57@ericsson.com> <93c95d69-c87a-4d4d-bf10-3b6b293b8a6a@www.fastmail.com> Message-ID: On Mon, 20 May 2019, Colleen Murphy wrote: >>> Make keystoneauth1 interpret two diferent types of errors response: >>> the one it does, and the one in the api-sig guidelines. Note that >>> I've yet to see any context where there is more than one error in >>> the list, so it is always errors[0] that gets inspected. > > We'll happily accept patches to keystoneauth that make it compliant with the API-SIG's guidelines (as long as it is backwards compatible). > > I gotta say, though, this guideline on error handling really takes me aback. Why should a single HTTP request ever result in a list of errors, plural? Is there any standard, pattern, or example *outside* of OpenStack where this is done or recommended? Why? 
I can't remember the exact details (it was 4 years ago [1]) but I think the rationale was that if there was a call behind the call it would be useful and important to be able to report a stack of errors: I called neutron and it failed like this, and I failed as a result of that failure, like this. I agree it is a bit weird but it seems the guideline had some acclaim at the time so... Ed or Monty may have additional recollections.k [1] https://review.opendev.org/#/c/167793/ -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From cdent+os at anticdent.org Mon May 20 15:24:21 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Mon, 20 May 2019 16:24:21 +0100 (BST) Subject: [placement] Poll for what do with the meeting Message-ID: In today's placement meeting it was agreed to have a poll to decide what to do about the meeting. I've created a public poll at https://civs.cs.cornell.edu/cgi-bin/vote.pl?id=E_9599a2647c319fd4&akey=12a23953ab33e056 It will run until the end of this week. If you are participating in the placement project, or want to, please register your preferences so we can decide how to proceed. Thanks. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From bcafarel at redhat.com Mon May 20 16:33:41 2019 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Mon, 20 May 2019 18:33:41 +0200 Subject: [neutron] Bug deputy report 2019-05-14 - 2019-05-20 In-Reply-To: <0939e28a-91c7-a14d-edfd-41b98bb1e31f@suse.com> References: <0939e28a-91c7-a14d-edfd-41b98bb1e31f@suse.com> Message-ID: On Mon, 20 May 2019 at 02:43, Ryan Tidwell wrote: > Hello Neutrinos, > > Here is this week's bug deputy report for neutron-related issues. > > *High* > > N/A > > *Medium* > > - https://bugs.launchpad.net/neutron/+bug/1829332 - neutron-server > report DhcpPortInUse ERROR log > - https://review.opendev.org/659523 and > https://review.opendev.org/659524 have been proposed against > neutron and neutron-lib respectively. > > > - https://bugs.launchpad.net/neutron/+bug/1829261 - Duplicate quota > entry for project_id/resource causes inconsistent behaviour > - https://bugs.launchpad.net/neutron/+bug/1829357 - > Firewall-as-a-Service (FWaaS) v2 scenario in neutron > - This is a documentation issue, the docs suggest commands be > invoked via neutron CLI when this is not supported. > https://review.opendev.org/659721 has been proposed to fix this. > - https://bugs.launchpad.net/neutron/+bug/1829387 - no way for non > admin users to get networks > - https://bugs.launchpad.net/neutron/+bug/1829414 - Attribute > filtering should be based on all objects instead of only first > - https://review.opendev.org/659617 has been proposed to address > this. > > *Low* > > N/A > > *RFE* > > - https://bugs.launchpad.net/neutron/+bug/1829449* - *Implement > consistency check and self-healing for SDN-managed fabrics > - A potentially interesting RFE around enabling feedback mechanisms > for ML2 mech drivers. I'm not sure what to make of this yet, but I think it > warrants some further discussion. 
> > > *Filed and Fix Released: * > > - https://bugs.launchpad.net/neutron/+bug/1829304 - Neutron returns > HttpException: 500 on certain operations with modified list of policies for > non-admin user > - https://review.opendev.org/#/c/659397/ was merged on master, with > backports proposed to stable/stein and stable/rocky > > *Unassigned:* > > https://bugs.launchpad.net/neutron/+bug/1829261 and > https://bugs.launchpad.net/neutron/+bug/1829387 > > -Ryan Tidwell > Also https://bugs.launchpad.net/neutron/+bug/1829042 - Some API requests (GET networks) fail with "Accept: application/json; charset=utf-8" header and WebOb>=1.8.0 -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From arne.wiebalck at cern.ch Mon May 20 16:33:58 2019 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Mon, 20 May 2019 18:33:58 +0200 Subject: [ironic][ops] Taking ironic nodes out of production Message-ID: Dear all, One of the discussions at the PTG in Denver raised the need for a mechanism to take ironic nodes out of production (a task for which the currently available 'maintenance' flag does not seem appropriate [1]). The use case there is an unhealthy physical node in state 'active', i.e. associated with an instance. The request is then to enable an admin to mark such a node as 'faulty' or 'in quarantine' with the aim of not returning the node to the pool of available nodes once the hosted instance is deleted. A very similar use case which came up independently is node retirement: it should be possible to mark nodes ('active' or not) as being 'up for retirement' to prepare the eventual removal from ironic. As in the example above, ('active') nodes marked this way should not become eligible for instance scheduling again, but automatic cleaning, for instance, should still be possible. In an effort to cover these use cases by a more general "quarantine/retirement" feature: - are there additional use cases which could profit from such a "take a node out of service" mechanism? - would these use cases put additional constraints on how the feature should look like (e.g.: "should not prevent cleaning") - are there other characteristics such a feature should have (e.g.: "finding these nodes should be supported by the cli") Let me know if you have any thoughts on this. Cheers, Arne [1] https://etherpad.openstack.org/p/DEN-train-ironic-ptg, l. 360 From openstack at nemebean.com Mon May 20 16:48:28 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 20 May 2019 11:48:28 -0500 Subject: [oslo] Courtesy ping changes Message-ID: <4d50450b-61fe-ac98-5144-d360de91020c@nemebean.com> Important: Action is required if you want to continue receiving courtesy pings. Read on for details. This is an oslo-specific followup to [0]. There's a lot of good discussion there if you're interested in the background for this email. The TLDR version is that we're going to keep the Oslo courtesy ping list because a number of Oslo contributors have expressed their preference for it. However, we are making some changes. First, the ping list will be cleared at the start of each cycle. This should prevent us from pinging people who no longer work on Oslo (which is almost certainly happening right now). Everyone who wants a courtesy ping will need to re-opt-in each cycle. We'll work out a transition process so people don't just stop receiving pings. Second, the ping list is going to move from the script in oslo.tools to the meeting agenda[1] on the wiki. 
There's no need for Oslo core signoff on ping list changes and that makes it a waste of time for both the cores and the people looking to make changes to the list. This does mean we'll lose the automatic wrapping of the list, but once we clean up the stale entries I suspect we won't need to wrap it as much anyway. I will continue to use the existing ping list for the next two weeks to give everyone a chance to add their name to the new list. I've seeded the new list with a couple of people who had expressed interest in continuing to receive pings, but if anyone else wants to continue getting them please add yourself to the list in [1] (see the Agenda Template section). I'm intentionally _not_ adding all of the active Oslo cores on the assumption that you may prefer to set up your own notification method. I might automatically carry over cores from cycle to cycle since they are presumably still interested in Oslo and chose that notification method, but we'll worry about that at the start of next cycle. I think that covers the current plan for courtesy pings in Oslo. If you have any comments or concerns please let me know. Otherwise expect this new system to take effect in 3 weeks. Thanks. -Ben 0: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006235.html 1: https://wiki.openstack.org/wiki/Meetings/Oslo From openstack at nemebean.com Mon May 20 17:02:08 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 20 May 2019 12:02:08 -0500 Subject: [oslo][requirements] Bandit Strategy In-Reply-To: <1a187a01-d991-06a9-c002-967b803406ac@nemebean.com> References: <4638b722-fff4-8387-726e-b75800f59186@nemebean.com> <755b2762-d3ff-2cf6-07a9-17f03c538ccd@redhat.com> <014592f6-6d03-a44b-cda3-0007ca2c3c29@nemebean.com> <20190515184034.ooivbn6btshi7nqn@yuggoth.org> <1a187a01-d991-06a9-c002-967b803406ac@nemebean.com> Message-ID: On 5/15/19 2:07 PM, Ben Nemec wrote: > > > On 5/15/19 1:40 PM, Jeremy Stanley wrote: >> On 2019-05-15 13:08:32 -0500 (-0500), Ben Nemec wrote: >> [...] >>> The reason we did it this way is to prevent 1.6.1 from blocking >>> all of the repos again if it doesn't fix the problem or introduces >>> a new one. If so, it blocks the uncapping patches only and we can >>> deal with it on our own schedule. >> >> Normally, if it had been treated like other linters, projects should >> have been guarding against unanticipated upgrades by specifying >> something like a <1.6.0 version and then expressly advancing that >> cap at the start of a new cycle when they're prepared to deal with >> fixing whatever problems are identified. >> > Yeah, I guess I don't know why we weren't doing that with bandit. Maybe > just that it hadn't broken us previously, in which case we might want to > drop the uncap patches entirely. > We discussed this in the Oslo meeting and agreed to leave the cap in place until we choose to move to a newer version of bandit. That brings bandit into alignment with the rest of our linters. I'll go through and abandon the existing uncap patches unless someone objects. From rtidwell at suse.com Mon May 20 18:32:30 2019 From: rtidwell at suse.com (Ryan Tidwell) Date: Mon, 20 May 2019 13:32:30 -0500 Subject: [neutron] Bug deputy report 2019-05-14 - 2019-05-20 In-Reply-To: References: <0939e28a-91c7-a14d-edfd-41b98bb1e31f@suse.com> Message-ID: <6915cbfa-acc1-ea5f-cbe6-cf3b48b41293@suse.com> I missed that one. Thanks Bernard! 
-Ryan On 5/20/19 11:33 AM, Bernard Cafarelli wrote: > On Mon, 20 May 2019 at 02:43, Ryan Tidwell > wrote: > > Hello Neutrinos, > > Here is this week's bug deputy report for neutron-related issues. > > *High* > > N/A* > * > > *Medium* > > * https://bugs.launchpad.net/neutron/+bug/1829332 - > neutron-server report DhcpPortInUse ERROR log > o https://review.opendev.org/659523 and > https://review.opendev.org/659524 have been proposed > against neutron and neutron-lib respectively. > > * https://bugs.launchpad.net/neutron/+bug/1829261 - Duplicate > quota entry for project_id/resource causes inconsistent behaviour > * https://bugs.launchpad.net/neutron/+bug/1829357 - > Firewall-as-a-Service (FWaaS) v2 scenario in neutron > o This is a documentation issue, the docs suggest commands > be invoked via neutron CLI when this is not supported. > https://review.opendev.org/659721 has been proposed to fix > this. > *  https://bugs.launchpad.net/neutron/+bug/1829387 - no way for > non admin users to get networks > *  https://bugs.launchpad.net/neutron/+bug/1829414 - Attribute > filtering should be based on all objects instead of only first > o https://review.opendev.org/659617 has been proposed to > address this. > > *Low* > > N/A* > * > > *RFE* > > * https://bugs.launchpad.net/neutron/+bug/1829449*- *Implement > consistency check and self-healing for SDN-managed fabrics > o A potentially interesting RFE around enabling feedback > mechanisms for ML2 mech drivers. I'm not sure what to make > of this yet, but I think it warrants some further discussion. > > *Filed and Fix Released: > * > > * https://bugs.launchpad.net/neutron/+bug/1829304 - Neutron > returns HttpException: 500 on certain operations with modified > list of policies for non-admin user > o https://review.opendev.org/#/c/659397/ was merged on > master, with backports proposed to stable/stein and > stable/rocky > > *Unassigned:* > > https://bugs.launchpad.net/neutron/+bug/1829261 and > https://bugs.launchpad.net/neutron/+bug/1829387 > > -Ryan Tidwell > > Also https://bugs.launchpad.net/neutron/+bug/1829042 - Some API > requests (GET networks) fail with "Accept: application/json; > charset=utf-8" header and WebOb>=1.8.0 > > > -- > Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam47priya at gmail.com Mon May 20 20:17:20 2019 From: sam47priya at gmail.com (Sam P) Date: Mon, 20 May 2019 22:17:20 +0200 Subject: [masakari] propose to skip 5/21 IRC meeting Message-ID: <03F109B3-9844-4BCE-82B9-ABF5D46B9DB5@gmail.com> Hi all, Since most of the members are absent in this week, I would like to skip 5/21 irc meeting. Please use this mailing for share any issues or discussions for masakari. Best regards, Sampath From sorrison at gmail.com Mon May 20 22:18:16 2019 From: sorrison at gmail.com (Sam Morrison) Date: Tue, 21 May 2019 08:18:16 +1000 Subject: [cinder] Help with a review please In-Reply-To: References: <55F040AF-16C8-4029-B306-7E81B4BE191A@gmail.com> Message-ID: Thanks Jay, unfortunately being in Australia means the meeting is at 2am which isn’t really practical for me to attend. I’ll respond to the reviews, I understand there is a way now to do this with the API which I guess means this won’t get in. I was just trying to make the api easier to use. Cheers, Sam > On 9 May 2019, at 1:35 am, Jay Bryant wrote: > > Sam, > > Thank you for reaching out to the mailing list on this issue. I am sorry that the review has been stuck in something of a limbo for quite some time. 
This is not the developer experience we strive for as a team. > > Since it appears that we are having trouble reaching agreement as to whether this is a good change I would recommend bringing this topic up at our next weekly meeting so that we can all work out the details together. > > If you would like to discuss this issue please add it to the agenda for the next meeting [1]. > > Thanks! > > Jay > > [1] https://etherpad.openstack.org/p/cinder-train-meetings > > On 5/8/2019 2:51 AM, Sam Morrison wrote: >> Hi, >> >> I’ve had a review going on for over 8 months now [1] and would love to get this in, it’s had +2s over the period and keeps getting nit picked, finally being knocked back due to no spec which there now is [2] >> This is now stalled itself after having a +2 and it is very depressing. >> >> I have had generally positive experiences contributing to openstack but this has been a real pain, is there something I can do to make this go smoother? >> >> Thanks, >> Sam >> >> >> [1] https://review.opendev.org/#/c/599866/ >> [2] https://review.opendev.org/#/c/645056/ > From tony at bakeyournoodle.com Tue May 21 01:37:14 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Tue, 21 May 2019 11:37:14 +1000 Subject: [Release-job-failures] Release of openstack/puppet-vswitch failed In-Reply-To: References: Message-ID: <20190521013714.GA15808@thor.bakeyournoodle.com> On Thu, May 16, 2019 at 11:11:20PM +0000, zuul at openstack.org wrote: > Build failed. > > - release-openstack-puppet http://logs.openstack.org/56/56ba10b449c6bed9d468d90b12ee8046c77cbb29/release/release-openstack-puppet/fed1fe4/ : POST_FAILURE in 2m 46s I didn't see this one discussed. From[1]: Forge API failed to upload tarball with code: 400 errors: The dependency 'camptocamp/kmod' in metadata.json is empty. All dependencies must have a 'version_requirement'. Yours Tony. [1] http://logs.openstack.org/56/56ba10b449c6bed9d468d90b12ee8046c77cbb29/release/release-openstack-puppet/fed1fe4/job-output.txt.gz#_2019-05-16_23_10_47_384070 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From liang.a.fang at intel.com Tue May 21 02:36:45 2019 From: liang.a.fang at intel.com (Fang, Liang A) Date: Tue, 21 May 2019 02:36:45 +0000 Subject: Devstack cannot deploy an openstack environment in master branch Message-ID: Hi Devstack cannot deploy an openstack environment in master branch. It always stick in starlingx horizon. 
Here are some logs: Every 10.0s: cat stack.sh.log.summary Mon May 20 19:29:52 2019 2019-05-21 02:14:20.727 | stack.sh log /opt/stack/logs/stack.sh.log.2019-05-20-191420 2019-05-21 02:14:21.969 | Installing package prerequisites 2019-05-21 02:15:31.441 | Starting RabbitMQ 2019-05-21 02:15:36.870 | Installing OpenStack project source 2019-05-21 02:17:07.343 | Installing Tempest 2019-05-21 02:17:45.010 | Configuring and starting MySQL 2019-05-21 02:17:50.678 | Starting Keystone 2019-05-21 02:19:42.599 | Configuring Horizon 2019-05-21 02:19:51.117 | Configuring Glance 2019-05-21 02:19:57.830 | Configuring Neutron 2019-05-21 02:20:31.993 | Configuring Cinder 2019-05-21 02:20:38.434 | Configuring placement 2019-05-21 02:20:51.430 | Configuring Nova 2019-05-21 02:21:49.456 | Starting Glance 2019-05-21 02:21:51.681 | Uploading images 2019-05-21 02:21:54.614 | Starting Nova API 2019-05-21 02:21:57.146 | Starting Neutron 2019-05-21 02:21:59.473 | Starting Placement 2019-05-21 02:22:03.514 | Creating initial neutron network elements 2019-05-21 02:22:28.396 | Starting Nova 2019-05-21 02:22:45.696 | Starting Cinder 2019-05-21 02:22:50.307 | Starting Horizon ++:: openstack --os-cloud devstack-admin --os-region RegionOne compute service list --host ubuntu16vmliang --service nova-compute -c ID -f value +:: ID= +:: [[ '' == '' ]] +:: sleep 1 +functions:wait_for_compute:449 rval=124 +functions:wait_for_compute:461 time_stop wait_for_service +functions-common:time_stop:2317 local name +functions-common:time_stop:2318 local end_time +functions-common:time_stop:2319 local elapsed_time +functions-common:time_stop:2320 local total +functions-common:time_stop:2321 local start_time +functions-common:time_stop:2323 name=wait_for_service +functions-common:time_stop:2324 start_time=1558405396860 +functions-common:time_stop:2326 [[ -z 1558405396860 ]] ++functions-common:time_stop:2329 date +%s%3N +functions-common:time_stop:2329 end_time=1558405457017 +functions-common:time_stop:2330 elapsed_time=60157 +functions-common:time_stop:2331 total=5598 +functions-common:time_stop:2333 _TIME_START[$name]= +functions-common:time_stop:2334 _TIME_TOTAL[$name]=65755 +functions:wait_for_compute:463 [[ 124 != 0 ]] +functions:wait_for_compute:464 echo 'Didn'\''t find service registered by hostname after 60 seconds' Didn't find service registered by hostname after 60 seconds +functions:wait_for_compute:465 openstack --os-cloud devstack-admin --os-region RegionOne compute service list +----+------------------+-----------------+----------+---------+-------+----------------------------+ | ID | Binary | Host | Zone | Status | State | Updated At | +----+------------------+-----------------+----------+---------+-------+----------------------------+ | 2 | nova-scheduler | ubuntu16vmliang | internal | enabled | up | 2019-05-21T02:24:16.000000 | | 5 | nova-consoleauth | ubuntu16vmliang | internal | enabled | up | 2019-05-21T02:24:17.000000 | | 6 | nova-conductor | ubuntu16vmliang | internal | enabled | up | 2019-05-21T02:24:09.000000 | | 1 | nova-conductor | ubuntu16vmliang | internal | enabled | up | 2019-05-21T02:24:09.000000 | +----+------------------+-----------------+----------+---------+-------+----------------------------+ +functions:wait_for_compute:467 return 124 +lib/nova:is_nova_ready:1 exit_trap +./stack.sh:exit_trap:525 local r=124 ++./stack.sh:exit_trap:526 jobs -p +./stack.sh:exit_trap:526 jobs= +./stack.sh:exit_trap:529 [[ -n '' ]] +./stack.sh:exit_trap:535 '[' -f /tmp/tmp.IbKRmNIAwU ']' +./stack.sh:exit_trap:536 rm 
/tmp/tmp.IbKRmNIAwU +./stack.sh:exit_trap:540 kill_spinner +./stack.sh:kill_spinner:435 '[' '!' -z '' ']' +./stack.sh:exit_trap:542 [[ 124 -ne 0 ]] +./stack.sh:exit_trap:543 echo 'Error on exit' Error on exit +./stack.sh:exit_trap:545 type -p generate-subunit +./stack.sh:exit_trap:546 generate-subunit 1558404858 599 fail +./stack.sh:exit_trap:548 [[ -z /opt/stack/logs ]] +./stack.sh:exit_trap:551 /opt/stack/devstack/tools/worlddump.py -d /opt/stack/logs neutron-dhcp-agent: no process found neutron-l3-agent: no process found neutron-metadata-agent: no process found neutron-openvswitch-agent: no process found stack at ubuntu16vmliang:~/devstack$ +./stack.sh:exit_trap:560 exit 124 Regards Liang -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Tue May 21 05:36:00 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Tue, 21 May 2019 14:36:00 +0900 Subject: OpenInfra Days Vietnam 2019 (Hanoi) - Call For Presentations Message-ID: Hello, Hope you're doing well :) The OpenInfra Days Vietnam 2019 [1] is looking for speakers in many different topics (e.g., container, CI, deployment, edge computing, etc.). If you would love to have a taste of Hanoi, the capital of Vietnam, please join us this one-day event and submit your presentation [2]. *- Date:* 24 AUGUST 2019 *- Location:* INTERCONTINENTAL HANOI LANDMARK72, HANOI, VIETNAM Especially this time, we're honored to have the Upstream Institute Training [3] hosted by the OpenStack Foundation on the next day (25 August 2019). [1] http://day.vietopeninfra.org/ [2] https://forms.gle/iiRBxxyRv1mGFbgi7 [3] https://docs.openstack.org/upstream-training/upstream-training-content.html See you in Hanoi! Bests, On behalf of the VietOpenInfra Group. -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From bdobreli at redhat.com Tue May 21 08:13:08 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Tue, 21 May 2019 10:13:08 +0200 Subject: [ironic][ops] Taking ironic nodes out of production In-Reply-To: References: Message-ID: <08cb8294-04c8-e4ba-78c0-dec00f87156a@redhat.com> [CC'ed edge-computing at lists.openstack.org] On 20.05.2019 18:33, Arne Wiebalck wrote: > Dear all, > > One of the discussions at the PTG in Denver raised the need for > a mechanism to take ironic nodes out of production (a task for > which the currently available 'maintenance' flag does not seem > appropriate [1]). > > The use case there is an unhealthy physical node in state 'active', > i.e. associated with an instance. The request is then to enable an > admin to mark such a node as 'faulty' or 'in quarantine' with the > aim of not returning the node to the pool of available nodes once > the hosted instance is deleted. > > A very similar use case which came up independently is node > retirement: it should be possible to mark nodes ('active' or not) > as being 'up for retirement' to prepare the eventual removal from > ironic. As in the example above, ('active') nodes marked this way > should not become eligible for instance scheduling again, but > automatic cleaning, for instance, should still be possible. > > In an effort to cover these use cases by a more general > "quarantine/retirement" feature: > > - are there additional use cases which could profit from such a >   "take a node out of service" mechanism? 
There are security related examples described in the Edge Security Challenges whitepaper [0] drafted by k8s IoT SIG [1], like in the chapter 2 Trusting hardware, whereby "GPS coordinate changes can be used to force a shutdown of an edge node". So a node may be taken out of service as an indicator of a particular condition of edge hardware. [0] https://docs.google.com/document/d/1iSIk8ERcheehk0aRG92dfOvW5NjkdedN8F7mSUTr-r0/edit#heading=h.xf8mdv7zexgq [1] https://github.com/kubernetes/community/tree/master/wg-iot-edge > > - would these use cases put additional constraints on how the >   feature should look like (e.g.: "should not prevent cleaning") > > - are there other characteristics such a feature should have >   (e.g.: "finding these nodes should be supported by the cli") > > Let me know if you have any thoughts on this. > > Cheers, >  Arne > > > [1] https://etherpad.openstack.org/p/DEN-train-ironic-ptg, l. 360 > -- Best regards, Bogdan Dobrelya, Irc #bogdando From bence.romsics at gmail.com Tue May 21 08:17:20 2019 From: bence.romsics at gmail.com (Bence Romsics) Date: Tue, 21 May 2019 10:17:20 +0200 Subject: [heat][neutron] improving extraroute support In-Reply-To: References: <8b4f8152-b6d0-2145-104b-300bfd479ca8@redhat.com> <7998fbfd-1262-237e-8c59-a96dec00f8eb@redhat.com> <8748D38C-4ACA-4823-9C24-52260AC8A058@redhat.com> Message-ID: Hi All, Some of you may not be aware yet that a new concern was raised regarding the extraroute improvement plans just after the last neutron session was closed on the PTG. It seems we have a tradeoff between the support for the use case of tracking multiple needs for the same extra route or keeping the virtual router abstraction as simple as it was in the past. I'm raising the question of this tradeoff here in the mailing list because this (I hope) seems to be the last cross-project question of this topic. If we could find a cross-project consensus on this I could continue making progress inside each project without need for further cross-project coordination. Please help me find this consensus. I don't want to unnecessarily repeat arguments already made. I think the question is clearly formulated in the comments of patch sets 5, 6 and 8 of the below neutron-spec: https://review.opendev.org/655680 Improve Extraroute API All opinions, comments, questions are welcome there. Thanks in advance, Bence (rubasov) From tobias.urdin at binero.se Tue May 21 09:58:51 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Tue, 21 May 2019 11:58:51 +0200 Subject: [Release-job-failures] Release of openstack/puppet-vswitch failed In-Reply-To: <20190521013714.GA15808@thor.bakeyournoodle.com> References: <20190521013714.GA15808@thor.bakeyournoodle.com> Message-ID: <1caeeb4b-6648-ec01-2e06-b9ce8fe19cfb@binero.se> Hello Tony, I've submitted a patch [1] that adds the first version as the requirement for that module since the actual usage hasn't change it's up to the deployment to know which version it requires. We then need to backport this and tag a new minor bugfix release so that the module is properly released. I've added going through all modules and verifying that all dependencies has a version_requirement field set but I don't have to time going through it right away unfortunately. Best regards Tobias [1] https://review.opendev.org/#/c/660326/ On 05/21/2019 03:42 AM, Tony Breeds wrote: > On Thu, May 16, 2019 at 11:11:20PM +0000, zuul at openstack.org wrote: >> Build failed. 
>> >> - release-openstack-puppet http://logs.openstack.org/56/56ba10b449c6bed9d468d90b12ee8046c77cbb29/release/release-openstack-puppet/fed1fe4/ : POST_FAILURE in 2m 46s > I didn't see this one discussed. From[1]: > > Forge API failed to upload tarball with code: 400 errors: The dependency > 'camptocamp/kmod' in metadata.json is empty. All dependencies must have > a 'version_requirement'. > > Yours Tony. > > [1] http://logs.openstack.org/56/56ba10b449c6bed9d468d90b12ee8046c77cbb29/release/release-openstack-puppet/fed1fe4/job-output.txt.gz#_2019-05-16_23_10_47_384070 From tobias.urdin at binero.se Tue May 21 11:53:49 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Tue, 21 May 2019 13:53:49 +0200 Subject: [Release-job-failures] Release of openstack/puppet-vswitch failed In-Reply-To: <1caeeb4b-6648-ec01-2e06-b9ce8fe19cfb@binero.se> References: <20190521013714.GA15808@thor.bakeyournoodle.com> <1caeeb4b-6648-ec01-2e06-b9ce8fe19cfb@binero.se> Message-ID: I validated all modules and it's only the puppet-vswitch module having this issue. I've proposed two more patches [1] to change so the metadata-json-lint tool that runs in CI tests the dependencies where this issue would have been caught. Best regards Tobias [1] https://review.opendev.org/#/q/topic:metadata-strict-dep On 05/21/2019 12:07 PM, Tobias Urdin wrote: > Hello Tony, > > I've submitted a patch [1] that adds the first version as the > requirement for that > module since the actual usage hasn't change it's up to the deployment to > know which version it requires. > > We then need to backport this and tag a new minor bugfix release so that > the module is properly released. > > I've added going through all modules and verifying that all dependencies > has a version_requirement field set but > I don't have to time going through it right away unfortunately. > > Best regards > Tobias > > [1] https://review.opendev.org/#/c/660326/ > > On 05/21/2019 03:42 AM, Tony Breeds wrote: >> On Thu, May 16, 2019 at 11:11:20PM +0000, zuul at openstack.org wrote: >>> Build failed. >>> >>> - release-openstack-puppet http://logs.openstack.org/56/56ba10b449c6bed9d468d90b12ee8046c77cbb29/release/release-openstack-puppet/fed1fe4/ : POST_FAILURE in 2m 46s >> I didn't see this one discussed. From[1]: >> >> Forge API failed to upload tarball with code: 400 errors: The dependency >> 'camptocamp/kmod' in metadata.json is empty. All dependencies must have >> a 'version_requirement'. >> >> Yours Tony. >> >> [1] http://logs.openstack.org/56/56ba10b449c6bed9d468d90b12ee8046c77cbb29/release/release-openstack-puppet/fed1fe4/job-output.txt.gz#_2019-05-16_23_10_47_384070 > > From rosmaita.fossdev at gmail.com Tue May 21 12:42:11 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 21 May 2019 08:42:11 -0400 Subject: [ops] database archiving tool In-Reply-To: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> References: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> Message-ID: <0b771521-da88-7eb8-cd61-a60bdb999520@gmail.com> On 5/9/19 11:14 AM, Pierre-Samuel LE STANG wrote: > Hi all, > > At OVH we needed to write our own tool that archive data from OpenStack > databases to prevent some side effect related to huge tables (slower response > time, changing MariaDB query plan) and to answer to some legal aspects. [snip] Please make sure your tool takes into account OSSN-0075 [0]. Please read [1] to understand how the glance-manage tool currently deals with this issue. 
[0] https://wiki.openstack.org/wiki/OSSN/OSSN-0075 [1] https://docs.openstack.org/glance/latest/admin/db.html#database-maintenance [snip] > Thanks in advance for your feedbacks. Happy to help! brian From Arkady.Kanevsky at dell.com Tue May 21 12:55:06 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Tue, 21 May 2019 12:55:06 +0000 Subject: [Edge-computing] [ironic][ops] Taking ironic nodes out of production In-Reply-To: <6A205BFA-881E-4D2D-9A7D-E35935F6631B@est.tech> References: <08cb8294-04c8-e4ba-78c0-dec00f87156a@redhat.com> <6A205BFA-881E-4D2D-9A7D-E35935F6631B@est.tech> Message-ID: <09e4bfaa95404bcfba37ee63f6bf1189@AUSX13MPS304.AMER.DELL.COM> Let's dig deeper into requirements. I see three distinct use cases: 1. put node into maintenance mode. Say to upgrade FW/BIOS or any other life-cycle event. It stays in ironic cluster but it is no longer in use by the rest of openstack, like Nova. 2. Put node into "fail" state. That is remove from usage, remove from Ironic cluster. What cleanup, operator would like/can do is subject to failure. Depending on the node type it may need to be "replaced". 3. Put node into "available" to other usage. What cleanup operator wants to do will need to be defined. This is very similar step as used for Baremetal as a Service as node is reassigned back into available pool. Depending on the next usage of a node it may stay in the Ironic cluster or may be removed from it. Once removed it can be "retired" or used for any other purpose. Thanks, Arkady -----Original Message----- From: Christopher Price Sent: Tuesday, May 21, 2019 3:26 AM To: Bogdan Dobrelya; openstack-discuss at lists.openstack.org; edge-computing at lists.openstack.org Subject: Re: [Edge-computing] [ironic][ops] Taking ironic nodes out of production [EXTERNAL EMAIL] I would add that something as simple as an operator policy could/should be able to remove hardware from an operational domain. It does not specifically need to be a fault or retirement, it may be as simple as repurposing to a different operational domain. From an OpenStack perspective this should not require any special handling from "retirement", it's just to know that there may be time constraints implied in a policy change that could potentially be ignored in a "retirement scenario". Further, at least in my imagination, one might be reallocating hardware from one Ironic domain to another which may have implications on how we best bring a new node online. (or not, I'm no expert) / Chris On 2019-05-21, 09:16, "Bogdan Dobrelya" wrote: [CC'ed edge-computing at lists.openstack.org] On 20.05.2019 18:33, Arne Wiebalck wrote: > Dear all, > > One of the discussions at the PTG in Denver raised the need for > a mechanism to take ironic nodes out of production (a task for > which the currently available 'maintenance' flag does not seem > appropriate [1]). > > The use case there is an unhealthy physical node in state 'active', > i.e. associated with an instance. The request is then to enable an > admin to mark such a node as 'faulty' or 'in quarantine' with the > aim of not returning the node to the pool of available nodes once > the hosted instance is deleted. > > A very similar use case which came up independently is node > retirement: it should be possible to mark nodes ('active' or not) > as being 'up for retirement' to prepare the eventual removal from > ironic. 
As in the example above, ('active') nodes marked this way > should not become eligible for instance scheduling again, but > automatic cleaning, for instance, should still be possible. > > In an effort to cover these use cases by a more general > "quarantine/retirement" feature: > > - are there additional use cases which could profit from such a > "take a node out of service" mechanism? There are security related examples described in the Edge Security Challenges whitepaper [0] drafted by k8s IoT SIG [1], like in the chapter 2 Trusting hardware, whereby "GPS coordinate changes can be used to force a shutdown of an edge node". So a node may be taken out of service as an indicator of a particular condition of edge hardware. [0] https://docs.google.com/document/d/1iSIk8ERcheehk0aRG92dfOvW5NjkdedN8F7mSUTr-r0/edit#heading=h.xf8mdv7zexgq [1] https://github.com/kubernetes/community/tree/master/wg-iot-edge > > - would these use cases put additional constraints on how the > feature should look like (e.g.: "should not prevent cleaning") > > - are there other characteristics such a feature should have > (e.g.: "finding these nodes should be supported by the cli") > > Let me know if you have any thoughts on this. > > Cheers, > Arne > > > [1] https://etherpad.openstack.org/p/DEN-train-ironic-ptg, l. 360 > -- Best regards, Bogdan Dobrelya, Irc #bogdando _______________________________________________ Edge-computing mailing list Edge-computing at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing _______________________________________________ Edge-computing mailing list Edge-computing at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing From rosmaita.fossdev at gmail.com Tue May 21 13:31:19 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 21 May 2019 09:31:19 -0400 Subject: [dev][all] note to non-core reviewers in all projects Message-ID: Hello everyone, A recent spate of +1 reviews with no comments on patches has put me into grumpy-old-man mode. A +1 with no comments is completely useless (unless you have a review on a previous patch set with comments that have been addressed by the author). I already know you're a smart person (you figured out how to sign the CLA and navigate gerrit -- lots of people can't or won't do that), but all your non-comment +1 tells me is that you are in favor of the patch. That doesn't give me any information, because I already know that the author is in favor of the patch, so that makes two of you out of about 1,168 reviewers. That's not exactly a groundswell of support. When you post your +1, please leave a comment explaining why you approve, or at least what in particular you looked at in the patch that gave you a favorable impression. This whole open source community thing is a collaborative effort, so please collaborate! You comment does not have to be profound. Even just saying that you checked that the release note or docs on the patch rendered correctly in HTML is very helpful. The same thing goes for leaving a -1 on a patch. Don't just drop a -1 bomb with no explanation. The kind of review that will put you on track for becoming core in a project is what johnthetubaguy calls a "thoughtful -1", that is, a negative review that clearly explains what the problem is and points the author in a good direction to fix it. That's all I have to say. I now return to my normal sunny disposition. 
cheers, brian From pierre-samuel.le-stang at corp.ovh.com Tue May 21 13:33:16 2019 From: pierre-samuel.le-stang at corp.ovh.com (Pierre-Samuel LE STANG) Date: Tue, 21 May 2019 15:33:16 +0200 Subject: [ops] database archiving tool In-Reply-To: <0b771521-da88-7eb8-cd61-a60bdb999520@gmail.com> References: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> <0b771521-da88-7eb8-cd61-a60bdb999520@gmail.com> Message-ID: <20190521133316.6qw4ss2t7fz4yuej@corp.ovh.com> Hi Brian Thanks for pointing that out! The tool is able to exclude a table from a database so it's possible to exclude by default glance.images in the config file with an explicit message pointing to the OSSN-0075. A more robust solution may be to hard code the glance.images exclusion and add a boolean flag set to false by default that you may enable to delete the table. Regards, -- PS Brian Rosmaita wrote on mar. [2019-mai-21 08:42:11 -0400]: > On 5/9/19 11:14 AM, Pierre-Samuel LE STANG wrote: > > Hi all, > > > > At OVH we needed to write our own tool that archive data from OpenStack > > databases to prevent some side effect related to huge tables (slower response > > time, changing MariaDB query plan) and to answer to some legal aspects. > [snip] > > Please make sure your tool takes into account OSSN-0075 [0]. Please > read [1] to understand how the glance-manage tool currently deals with > this issue. > > [0] https://wiki.openstack.org/wiki/OSSN/OSSN-0075 > [1] > https://docs.openstack.org/glance/latest/admin/db.html#database-maintenance > > [snip] > > Thanks in advance for your feedbacks. > > Happy to help! > brian > > > From marios at redhat.com Tue May 21 14:26:01 2019 From: marios at redhat.com (Marios Andreou) Date: Tue, 21 May 2019 17:26:01 +0300 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: On Tue, May 21, 2019 at 4:34 PM Brian Rosmaita wrote: > Hello everyone, > > A recent spate of +1 reviews with no comments on patches has put me into > grumpy-old-man mode. > > A +1 with no comments is completely useless (unless you have a review on > a previous patch set with comments that have been addressed by the > author). I already know you're a smart person (you figured out how to > sign the CLA and navigate gerrit -- lots of people can't or won't do > that), but all your non-comment +1 tells me is that you are in favor of > the patch. That doesn't give me any information, because I already know > that the author is in favor of the patch, so that makes two of you out > of about 1,168 reviewers. That's not exactly a groundswell of support. > > When you post your +1, please leave a comment explaining why you > approve, or at least what in particular you looked at in the patch that > gave you a favorable impression. This whole open source community thing > is a collaborative effort, so please collaborate! You comment does not > have to be profound. Even just saying that you checked that the release > note or docs on the patch rendered correctly in HTML is very helpful. > > The same thing goes for leaving a -1 on a patch. Don't just drop a -1 > bomb with no explanation. The kind of review that will put you on track > for becoming core in a project is what johnthetubaguy calls a > "thoughtful -1", that is, a negative review that clearly explains what > the problem is and points the author in a good direction to fix it. > whilst i agree on all you wrote, imo a -1 with no comments is worse than a +1 with no comments. 
If you dislike my patch enough to -1 it tell me what i need to change in order to fix and get your vote thanks, marios > > That's all I have to say. I now return to my normal sunny disposition. > > cheers, > brian > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sombrafam at gmail.com Tue May 21 14:45:29 2019 From: sombrafam at gmail.com (Erlon Cruz) Date: Tue, 21 May 2019 11:45:29 -0300 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: The +1 means in a general sense "I agree with the fix/feature conveyed at the code and don't see anything that opposes it being merged". So, I don't think there will always be an explanation for the agreement. You don't need to rely on +1s, but you can use them as a 'heat' factor that shows you how many people care about it and have reviewed it in one way or another, maybe not a thorough review, but whatever they could contribute. This seems totally harmless and having one more nitpick rule would just make new contributors' lives harder. On Tue, May 21, 2019 at 11:31, Marios Andreou wrote: > > > On Tue, May 21, 2019 at 4:34 PM Brian Rosmaita wrote: > >> Hello everyone, >> >> A recent spate of +1 reviews with no comments on patches has put me into >> grumpy-old-man mode. >> >> A +1 with no comments is completely useless (unless you have a review on >> a previous patch set with comments that have been addressed by the >> author). I already know you're a smart person (you figured out how to >> sign the CLA and navigate gerrit -- lots of people can't or won't do >> that), but all your non-comment +1 tells me is that you are in favor of >> the patch. That doesn't give me any information, because I already know >> that the author is in favor of the patch, so that makes two of you out >> of about 1,168 reviewers. That's not exactly a groundswell of support. >> >> When you post your +1, please leave a comment explaining why you >> approve, or at least what in particular you looked at in the patch that >> gave you a favorable impression. This whole open source community thing >> is a collaborative effort, so please collaborate! You comment does not >> have to be profound. Even just saying that you checked that the release >> note or docs on the patch rendered correctly in HTML is very helpful. >> >> The same thing goes for leaving a -1 on a patch. Don't just drop a -1 >> bomb with no explanation. The kind of review that will put you on track >> for becoming core in a project is what johnthetubaguy calls a >> "thoughtful -1", that is, a negative review that clearly explains what >> the problem is and points the author in a good direction to fix it. >> > > whilst i agree on all you wrote, imo a -1 with no comments is worse than > a +1 with no comments. If you dislike my patch enough to -1 it tell me what > i need to change in order to fix and get your vote > > thanks, marios > > >> >> That's all I have to say. I now return to my normal sunny disposition. >> >> cheers, >> brian >> >> -------------- next part -------------- An HTML attachment was scrubbed...
URL: From mnaser at vexxhost.com Tue May 21 14:47:13 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 21 May 2019 22:47:13 +0800 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: On Tue, May 21, 2019 at 9:37 PM Brian Rosmaita wrote: > > Hello everyone, > > A recent spate of +1 reviews with no comments on patches has put me into > grumpy-old-man mode. > > A +1 with no comments is completely useless (unless you have a review on > a previous patch set with comments that have been addressed by the > author). I already know you're a smart person (you figured out how to > sign the CLA and navigate gerrit -- lots of people can't or won't do > that), but all your non-comment +1 tells me is that you are in favor of > the patch. That doesn't give me any information, because I already know > that the author is in favor of the patch, so that makes two of you out > of about 1,168 reviewers. That's not exactly a groundswell of support. I feel a bit different about this. Most cores probably leave a +2 with no comment to merge code, the implication that a non-core needs to add more comments to it seems a bit 'unfair'. I think this is something where we need to put our judgement in. It may seem that we get a bunch of +1s, but if there is also other -1s that have value and comments, then I think it's okay that we get +1s. tl;dr: I don't think we should say "if you're going to give a +1, leave a comment why you agree", IMHO. > When you post your +1, please leave a comment explaining why you > approve, or at least what in particular you looked at in the patch that > gave you a favorable impression. This whole open source community thing > is a collaborative effort, so please collaborate! You comment does not > have to be profound. Even just saying that you checked that the release > note or docs on the patch rendered correctly in HTML is very helpful. > > The same thing goes for leaving a -1 on a patch. Don't just drop a -1 > bomb with no explanation. The kind of review that will put you on track > for becoming core in a project is what johnthetubaguy calls a > "thoughtful -1", that is, a negative review that clearly explains what > the problem is and points the author in a good direction to fix it. > > That's all I have to say. I now return to my normal sunny disposition. > > cheers, > brian > -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From rfolco at redhat.com Tue May 21 14:51:38 2019 From: rfolco at redhat.com (Rafael Folco) Date: Tue, 21 May 2019 11:51:38 -0300 Subject: [tripleo] TripleO CI Summary: Sprint 30 Message-ID: Greetings, The TripleO CI team has completed Sprint 30 / Unified Sprint 9 (Apr 25 thru May 15). The following is a summary of completed work during this sprint cycle: - Tested and created container build jobs for RDO on RHEL 7 in the internal instance of Software Factory. - Moved collect-logs into a separate repository to be consumed by other projects. - Ran a test day for zuul reproducer, filing and fixing bugs. - Continued on-boarding baremetal hardware to the internal Software Factory for running promotion jobs. 
- Promotion status: green on all branches at most of the sprint - PTG was attended by weshay, marios, sagi, quique The planned work for the next sprint [1] are: - Complete RDO on RHEL7 work by having an independent pipeline running container and image build, standalone and ovb featureset001 jobs. - Bootstrap OSP 15 standalone job on RHEL8 running in the internal Software Factory. The Ruck and Rover for this sprint are Marios Andreou (marios) and Sorin Gabriele Cerami (panda). Please direct questions or queries to them regarding CI status or issues in #tripleo, ideally to whomever has the ‘|ruck’ suffix on their nick. Ruck/rover notes are being tracked in etherpad [2]. Thanks, rfolco [1] https://tree.taiga.io/project/tripleo-ci-board/taskboard/unified-sprint-10 [2] https://etherpad.openstack.org/p/ruckroversprint10 -------------- next part -------------- An HTML attachment was scrubbed... URL: From gr at ham.ie Tue May 21 14:55:50 2019 From: gr at ham.ie (Graham Hayes) Date: Tue, 21 May 2019 15:55:50 +0100 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: <86e1fbea-d077-5615-cfab-be19c66f12c6@ham.ie> On 21/05/2019 15:47, Mohammed Naser wrote: > On Tue, May 21, 2019 at 9:37 PM Brian Rosmaita > wrote: >> >> Hello everyone, >> >> A recent spate of +1 reviews with no comments on patches has put me into >> grumpy-old-man mode. >> >> A +1 with no comments is completely useless (unless you have a review on >> a previous patch set with comments that have been addressed by the >> author). I already know you're a smart person (you figured out how to >> sign the CLA and navigate gerrit -- lots of people can't or won't do >> that), but all your non-comment +1 tells me is that you are in favor of >> the patch. That doesn't give me any information, because I already know >> that the author is in favor of the patch, so that makes two of you out >> of about 1,168 reviewers. That's not exactly a groundswell of support. > > I feel a bit different about this. > > Most cores probably leave a +2 with no comment to merge code, the implication > that a non-core needs to add more comments to it seems a bit 'unfair'. > > I think this is something where we need to put our judgement in. It may seem > that we get a bunch of +1s, but if there is also other -1s that have value and > comments, then I think it's okay that we get +1s. > > tl;dr: I don't think we should say "if you're going to give a +1, > leave a comment > why you agree", IMHO. If you have no history reviewing code in the project, yes, you do need to leave a comment. I see patches that get pushed up, and then a flurry of +1's with no coments arrive from people that don't have a history of good reviews, I will discount them. I am assuming that before someone +1s they do some sort of check, be that reading the diff line by line and agreeing with the code + direction of the patch, or running devstack and seeing if it ran, and provided the fix or feature - if you have just say what you did. >> When you post your +1, please leave a comment explaining why you >> approve, or at least what in particular you looked at in the patch that >> gave you a favorable impression. This whole open source community thing >> is a collaborative effort, so please collaborate! You comment does not >> have to be profound. Even just saying that you checked that the release >> note or docs on the patch rendered correctly in HTML is very helpful. >> >> The same thing goes for leaving a -1 on a patch. 
Don't just drop a -1 >> bomb with no explanation. The kind of review that will put you on track >> for becoming core in a project is what johnthetubaguy calls a >> "thoughtful -1", that is, a negative review that clearly explains what >> the problem is and points the author in a good direction to fix it. >> >> That's all I have to say. I now return to my normal sunny disposition. >> >> cheers, >> brian >> > > -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From skaplons at redhat.com Tue May 21 15:08:36 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Tue, 21 May 2019 17:08:36 +0200 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: <8D6AF7A0-9B61-407F-8C8D-45D1E4130CCC@redhat.com> Hi, > On 21 May 2019, at 16:47, Mohammed Naser wrote: > > On Tue, May 21, 2019 at 9:37 PM Brian Rosmaita > wrote: >> >> Hello everyone, >> >> A recent spate of +1 reviews with no comments on patches has put me into >> grumpy-old-man mode. >> >> A +1 with no comments is completely useless (unless you have a review on >> a previous patch set with comments that have been addressed by the >> author). I already know you're a smart person (you figured out how to >> sign the CLA and navigate gerrit -- lots of people can't or won't do >> that), but all your non-comment +1 tells me is that you are in favor of >> the patch. That doesn't give me any information, because I already know >> that the author is in favor of the patch, so that makes two of you out >> of about 1,168 reviewers. That's not exactly a groundswell of support. > > I feel a bit different about this. > > Most cores probably leave a +2 with no comment to merge code, the implication > that a non-core needs to add more comments to it seems a bit 'unfair'. > > I think this is something where we need to put our judgement in. It may seem > that we get a bunch of +1s, but if there is also other -1s that have value and > comments, then I think it's okay that we get +1s. > > tl;dr: I don't think we should say "if you're going to give a +1, > leave a comment > why you agree", IMHO. I agree with that. +1 without comment still can have some value. But we should say “if you’re going to gibe a -1, please leave a comment”. That’s really important IMO. > >> When you post your +1, please leave a comment explaining why you >> approve, or at least what in particular you looked at in the patch that >> gave you a favorable impression. This whole open source community thing >> is a collaborative effort, so please collaborate! You comment does not >> have to be profound. Even just saying that you checked that the release >> note or docs on the patch rendered correctly in HTML is very helpful. >> >> The same thing goes for leaving a -1 on a patch. Don't just drop a -1 >> bomb with no explanation. The kind of review that will put you on track >> for becoming core in a project is what johnthetubaguy calls a >> "thoughtful -1", that is, a negative review that clearly explains what >> the problem is and points the author in a good direction to fix it. >> >> That's all I have to say. I now return to my normal sunny disposition. >> >> cheers, >> brian >> > > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. 
http://vexxhost.com — Slawek Kaplonski Senior software engineer Red Hat From ed at leafe.com Tue May 21 15:10:23 2019 From: ed at leafe.com (Ed Leafe) Date: Tue, 21 May 2019 10:10:23 -0500 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: <26D36A7A-A1A5-4315-928F-6D682117EA5C@leafe.com> On May 21, 2019, at 8:31 AM, Brian Rosmaita wrote: > > When you post your +1, please leave a comment explaining why you > approve, or at least what in particular you looked at in the patch that > gave you a favorable impression. This whole open source community thing > is a collaborative effort, so please collaborate! You comment does not > have to be profound. Even just saying that you checked that the release > note or docs on the patch rendered correctly in HTML is very helpful. I have to disagree with this, especially when you are a known member of a project. Adding a +1 with no comment means "I've looked at this patch, and it seems fine to me". Many cores have to budget their time for reviews, and if they see a patch with a few +1s from team members, it is a sign that there is nothing obviously wrong with it, and it might be a good candidate for them to review. > The same thing goes for leaving a -1 on a patch. Don't just drop a -1 > bomb with no explanation. The kind of review that will put you on track > for becoming core in a project is what johnthetubaguy calls a > "thoughtful -1", that is, a negative review that clearly explains what > the problem is and points the author in a good direction to fix it. This part I totally agree with. It doesn't help to say "something is wrong with this"; you need to spell out what it is that you feel is wrong. -- Ed Leafe From juliaashleykreger at gmail.com Tue May 21 15:13:59 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 21 May 2019 08:13:59 -0700 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: <8D6AF7A0-9B61-407F-8C8D-45D1E4130CCC@redhat.com> References: <8D6AF7A0-9B61-407F-8C8D-45D1E4130CCC@redhat.com> Message-ID: I've likely just not had enough coffee this morning to grok this thread, but I'd like to point out that we have a guide for how to review[0]. I've found this document very useful to help provide people with context and re-calibrate their review behaviors such that they are not detrimental to the community and collaboration. /me goes back to coffeee [0]: https://docs.openstack.org/project-team-guide/review-the-openstack-way.html. On Tue, May 21, 2019 at 8:09 AM Slawomir Kaplonski wrote: > > Hi, > > > On 21 May 2019, at 16:47, Mohammed Naser wrote: > > > > On Tue, May 21, 2019 at 9:37 PM Brian Rosmaita > > wrote: > >> > >> Hello everyone, > >> > >> A recent spate of +1 reviews with no comments on patches has put me into > >> grumpy-old-man mode. > >> > >> A +1 with no comments is completely useless (unless you have a review on > >> a previous patch set with comments that have been addressed by the > >> author). I already know you're a smart person (you figured out how to > >> sign the CLA and navigate gerrit -- lots of people can't or won't do > >> that), but all your non-comment +1 tells me is that you are in favor of > >> the patch. That doesn't give me any information, because I already know > >> that the author is in favor of the patch, so that makes two of you out > >> of about 1,168 reviewers. That's not exactly a groundswell of support. > > > > I feel a bit different about this. 
> > > > Most cores probably leave a +2 with no comment to merge code, the implication > > that a non-core needs to add more comments to it seems a bit 'unfair'. > > > > I think this is something where we need to put our judgement in. It may seem > > that we get a bunch of +1s, but if there is also other -1s that have value and > > comments, then I think it's okay that we get +1s. > > > > tl;dr: I don't think we should say "if you're going to give a +1, > > leave a comment > > why you agree", IMHO. > > I agree with that. +1 without comment still can have some value. > But we should say “if you’re going to gibe a -1, please leave a comment”. That’s really important IMO. > > > > >> When you post your +1, please leave a comment explaining why you > >> approve, or at least what in particular you looked at in the patch that > >> gave you a favorable impression. This whole open source community thing > >> is a collaborative effort, so please collaborate! You comment does not > >> have to be profound. Even just saying that you checked that the release > >> note or docs on the patch rendered correctly in HTML is very helpful. > >> > >> The same thing goes for leaving a -1 on a patch. Don't just drop a -1 > >> bomb with no explanation. The kind of review that will put you on track > >> for becoming core in a project is what johnthetubaguy calls a > >> "thoughtful -1", that is, a negative review that clearly explains what > >> the problem is and points the author in a good direction to fix it. > >> > >> That's all I have to say. I now return to my normal sunny disposition. > >> > >> cheers, > >> brian > >> > > > > > > -- > > Mohammed Naser — vexxhost > > ----------------------------------------------------- > > D. 514-316-8872 > > D. 800-910-1726 ext. 200 > > E. mnaser at vexxhost.com > > W. http://vexxhost.com > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > From bdobreli at redhat.com Tue May 21 15:23:38 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Tue, 21 May 2019 17:23:38 +0200 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: <8D6AF7A0-9B61-407F-8C8D-45D1E4130CCC@redhat.com> References: <8D6AF7A0-9B61-407F-8C8D-45D1E4130CCC@redhat.com> Message-ID: <4ce86f28-b370-0d10-fb93-09d329d982f4@redhat.com> On 21.05.2019 17:08, Slawomir Kaplonski wrote: > Hi, > >> On 21 May 2019, at 16:47, Mohammed Naser wrote: >> >> On Tue, May 21, 2019 at 9:37 PM Brian Rosmaita >> wrote: >>> >>> Hello everyone, >>> >>> A recent spate of +1 reviews with no comments on patches has put me into >>> grumpy-old-man mode. >>> >>> A +1 with no comments is completely useless (unless you have a review on >>> a previous patch set with comments that have been addressed by the >>> author). I already know you're a smart person (you figured out how to >>> sign the CLA and navigate gerrit -- lots of people can't or won't do >>> that), but all your non-comment +1 tells me is that you are in favor of >>> the patch. That doesn't give me any information, because I already know >>> that the author is in favor of the patch, so that makes two of you out >>> of about 1,168 reviewers. That's not exactly a groundswell of support. >> >> I feel a bit different about this. >> >> Most cores probably leave a +2 with no comment to merge code, the implication >> that a non-core needs to add more comments to it seems a bit 'unfair'. >> >> I think this is something where we need to put our judgement in. 
It may seem >> that we get a bunch of +1s, but if there is also other -1s that have value and >> comments, then I think it's okay that we get +1s. >> >> tl;dr: I don't think we should say "if you're going to give a +1, >> leave a comment >> why you agree", IMHO. > > I agree with that. +1 without comment still can have some value. > But we should say “if you’re going to gibe a -1, please leave a comment”. That’s really important IMO. +1 > >> >>> When you post your +1, please leave a comment explaining why you >>> approve, or at least what in particular you looked at in the patch that >>> gave you a favorable impression. This whole open source community thing >>> is a collaborative effort, so please collaborate! You comment does not >>> have to be profound. Even just saying that you checked that the release >>> note or docs on the patch rendered correctly in HTML is very helpful. >>> >>> The same thing goes for leaving a -1 on a patch. Don't just drop a -1 >>> bomb with no explanation. The kind of review that will put you on track >>> for becoming core in a project is what johnthetubaguy calls a >>> "thoughtful -1", that is, a negative review that clearly explains what >>> the problem is and points the author in a good direction to fix it. >>> >>> That's all I have to say. I now return to my normal sunny disposition. >>> >>> cheers, >>> brian >>> >> >> >> -- >> Mohammed Naser — vexxhost >> ----------------------------------------------------- >> D. 514-316-8872 >> D. 800-910-1726 ext. 200 >> E. mnaser at vexxhost.com >> W. http://vexxhost.com > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > -- Best regards, Bogdan Dobrelya, Irc #bogdando From jungleboyj at gmail.com Tue May 21 15:24:29 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 21 May 2019 10:24:29 -0500 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: > When you post your +1, please leave a comment explaining why you > approve, or at least what in particular you looked at in the patch that > gave you a favorable impression. This whole open source community thing > is a collaborative effort, so please collaborate! You comment does not > have to be profound. Even just saying that you checked that the release > note or docs on the patch rendered correctly in HTML is very helpful. I do not leave reviews without some sort of comment.  When I was mentored into doing reviews the expectation was that you at least leave some sort of comment with any review.  Also, as Graham noted, especially for people who are newer to the project this helps give information on their review.  This is another one of those 'tribal knowledge' items so I am not going to get too passionate about +1's with or without comments. > The same thing goes for leaving a -1 on a patch. Don't just drop a -1 > bomb with no explanation. The kind of review that will put you on track > for becoming core in a project is what johnthetubaguy calls a > "thoughtful -1", that is, a negative review that clearly explains what > the problem is and points the author in a good direction to fix it. This obviously is  a requirement and it is just rude to -1 with no additional direction. > That's all I have to say. I now return to my normal sunny disposition. 
> > cheers, > brian > From jayachander.it at gmail.com Tue May 21 15:30:15 2019 From: jayachander.it at gmail.com (Jay See) Date: Tue, 21 May 2019 17:30:15 +0200 Subject: [OpenStack][Foreman][MAAS][Juju][Kubernetes][Docker] OpenStack deployment on Bare Metal Message-ID: Hi, I am trying to deploy OpenStack cloud , before proceeding with deployment I wanted to take suggestion from people using some of these technologies. *Setup / plan:* I will be deploying OpenStack cloud using 3 nodes (servers), later on I will be adding more nodes. I want automatically install and provision the OS on the new nodes. I want scale up my cloud with new nodes in near future. If there are any issues in existing OpenStack, I would like to fix them or patch the fixes without much trouble. (I might be using wrong terminology - sorry for that) 1. Which tool is better for Bare Metal OS installation? I have looked around and found Foremen and MaaS. All the servers we are going to deploy will be running on Ubuntu (16.04 /18.04 LTS). If you have any other suggestions please let me know. 2. Do you suggest me to use Juju to deploy the OpenStack or do all the manual installation for the OpenStack as mentioned in OpenStack installation guide? 3. I am bit confused about choosing Kubernetes. If I am going with Juju, can I still use Kubernetes? If yes, if there any documentation please guide me in right direction. 4. I was thinking of going with OpenStack installation over Kubernetes. Is this a right decision? or Do I need to do some research between Kubernetes and Docker, find out which one is better? or Just install OpenStack without any containers. 5. I could not find installation of OpenStack with Kubernetes or Docker. If you know something, could you share the link? I don't have bigger picture at the moment. If some tools might help in near future. or If you can give any other suggestions, please let me know. Thanks and Best regards, Jayachander. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Tue May 21 15:41:49 2019 From: aschultz at redhat.com (Alex Schultz) Date: Tue, 21 May 2019 09:41:49 -0600 Subject: [tripleo] retiring ansible-role-k8s-* repos Message-ID: FYI, We're retiring the following repos that were used during the initial investigation of k8s + tripleo. openstack/ansible-role-k8s-cinder openstack/ansible-role-k8s-cookiecutter openstack/ansible-role-k8s-glance openstack/ansible-role-k8s-keystone openstack/ansible-role-k8s-mariadb openstack/ansible-role-k8s-rabbitmq openstack/ansible-role-k8s-tripleo Please let me know if there are any objections. We're shooting for the end of this week (by May 24th) depending on patch merging. Thanks, -Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Tue May 21 15:58:50 2019 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 21 May 2019 11:58:50 -0400 Subject: [tripleo] retiring ansible-role-k8s-* repos In-Reply-To: References: Message-ID: On Tue, May 21, 2019 at 11:56 AM Alex Schultz wrote: > FYI, > > We're retiring the following repos that were used during the initial > investigation of k8s + tripleo. > > openstack/ansible-role-k8s-cinder > openstack/ansible-role-k8s-cookiecutter > openstack/ansible-role-k8s-glance > openstack/ansible-role-k8s-keystone > openstack/ansible-role-k8s-mariadb > openstack/ansible-role-k8s-rabbitmq > openstack/ansible-role-k8s-tripleo > > Please let me know if there are any objections. 
We're shooting for the > end of this week (by May 24th) depending on patch merging. > +1, they haven't been used and nobody is working on those (they were supposed to be apb roles). However, I still have some hope we can one day move out the ansible tasks from THT into ansible-role-tripleo-* (e.g; ansible-role-tripleo-keystone); but it's a different story. -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From akekane at redhat.com Tue May 21 16:03:59 2019 From: akekane at redhat.com (Abhishek Kekane) Date: Tue, 21 May 2019 21:33:59 +0530 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: IMO, if you put +1 to a patch you should put some comments like, verified code as per standards, tested the functionality by applying patch in local environment, works but refactoring is possible. The reason behind adding such comments is that it will help to built trust between core and non-core reviewers. This will also help the non-cores to speedup their way to become cores. Thanks, Abhishek On Tue, 21 May, 2019, 20:58 Jay Bryant, wrote: > > > When you post your +1, please leave a comment explaining why you > > approve, or at least what in particular you looked at in the patch that > > gave you a favorable impression. This whole open source community thing > > is a collaborative effort, so please collaborate! You comment does not > > have to be profound. Even just saying that you checked that the release > > note or docs on the patch rendered correctly in HTML is very helpful. > I do not leave reviews without some sort of comment. When I was > mentored into doing reviews the expectation was that you at least leave > some sort of comment with any review. Also, as Graham noted, especially > for people who are newer to the project this helps give information on > their review. This is another one of those 'tribal knowledge' items so > I am not going to get too passionate about +1's with or without comments. > > The same thing goes for leaving a -1 on a patch. Don't just drop a -1 > > bomb with no explanation. The kind of review that will put you on track > > for becoming core in a project is what johnthetubaguy calls a > > "thoughtful -1", that is, a negative review that clearly explains what > > the problem is and points the author in a good direction to fix it. > This obviously is a requirement and it is just rude to -1 with no > additional direction. > > That's all I have to say. I now return to my normal sunny disposition. > > > > cheers, > > brian > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbooth at redhat.com Tue May 21 16:16:05 2019 From: mbooth at redhat.com (Matthew Booth) Date: Tue, 21 May 2019 17:16:05 +0100 Subject: [nova] stable-maint is especially unhealthily RH-centric Message-ID: During the trifecta discussions at PTG I was only considering nova-core. I didn't appreciate at the time how bad the situation is for nova-stable-maint. nova-stable-maint currently consists of: https://review.opendev.org/#/admin/groups/540,members https://review.opendev.org/#/admin/groups/530,members Not Red Hat: Claudiu Belu -> Inactive? Matt Riedemann John Garbutt Matthew Treinish Red Hat: Dan Smith Lee Yarwood Sylvain Bauza Tony Breeds Melanie Witt Alan Pevec Chuck Short Flavio Percoco Tony Breeds This leaves Nova entirely dependent on Matt Riedemann, John Garbutt, and Matthew Treinish to land patches in stable, which isn't a great situation. 
With Matt R temporarily out of action that's especially bad. Looking for constructive suggestions. I'm obviously in favour of relaxing the trifecta rules, but adding some non-RH stable cores also seems like it would be a generally healthy thing for the project to do. Matt -- Matthew Booth Red Hat OpenStack Engineer, Compute DFG Phone: +442070094448 (UK) From rico.lin.guanyu at gmail.com Tue May 21 16:20:00 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 22 May 2019 00:20:00 +0800 Subject: [auto-scaling] PTG summary Message-ID: Sorry for the late summary here. As the following action should get started. Let me try to point out actions and discussions we have from Summit + PTG. If you care about auto-scaling with/around OpenStack please join us All our discussion should be able to found in Etherpad https://etherpad.openstack.org/p/DEN-auto-scaling-SIG For more information about auto-scaling SIG: - repo: https://opendev.org/openstack/auto-scaling-sig - storyboard: https://storyboard.openstack.org/#!/project/openstack/auto-scaling-sig - irc: #openstack-auto-scaling - meetings: http://eavesdrop.openstack.org/#Auto-scaling_SIG_Meeting - use the ics file so you don't miss the schedule Call for help: we just formed so we need reviewers and helpers. Let us know if you would like to be core reviewer for SIG repo, would like to help on any tasks, or whould like th share any idea with us. As we try to find out what's the most important action we should take to initial this SIG, here are some init scope we will try to improve on: - Documentation: As we all agreed, document to record the current auto-scaling situation and collect use cases or relative resources, so users can have a better landscape. We will have to ask other teams to help to provide related use cases as one of our first steps. Here are some identified tasks: https://storyboard.openstack.org/#!/story/2005751 - Integration tests: Once we set up a basic test environment (which will happen soon), we will also try to talk with projects on having a cross-project gating CI job. At PTG, we already consult some projects and generally got positive feedback. It will be harder to create a CI job for autoscaling OpenStack services like nova-compute, so that part is not included in the initial plan for now. Here are tasks for now, and please suggest more if you see available: https://storyboard.openstack.org/#!/story/2005752 - Initial repo setup: Thanks for Adam, we now have our repository and irc bot ready, if you check out https://docs.openstack.org/auto-scaling-sig/latest/ and https://opendev.org/openstack/auto-scaling-sig you will find the doc and repo entry is ready. - Make a list to identify missing features: Which we should add into the docs to really identify missing features that might be some potential targets we can ask if project teams would like to have. Or help users to avoid mistaken how OpenStack autoscaling might works. - Consistent our discussion and task tracking: We not set up missions and looks like we have a lot of things need to be done in this cycle, so we should keep tracking and make sure we not missing anything - Collect Use Cases and user feedbacks: We need to identify use cases and any feedback, so if you like to add any use case or scenario to our document, please help with added it in https://opendev.org/openstack/auto-scaling-sig/src/branch/master/use-cases -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From christopher.price at est.tech Tue May 21 08:26:25 2019 From: christopher.price at est.tech (Christopher Price) Date: Tue, 21 May 2019 08:26:25 +0000 Subject: [Edge-computing] [ironic][ops] Taking ironic nodes out of production In-Reply-To: <08cb8294-04c8-e4ba-78c0-dec00f87156a@redhat.com> References: <08cb8294-04c8-e4ba-78c0-dec00f87156a@redhat.com> Message-ID: <6A205BFA-881E-4D2D-9A7D-E35935F6631B@est.tech> I would add that something as simple as an operator policy could/should be able to remove hardware from an operational domain. It does not specifically need to be a fault or retirement, it may be as simple as repurposing to a different operational domain. From an OpenStack perspective this should not require any special handling from "retirement", it's just to know that there may be time constraints implied in a policy change that could potentially be ignored in a "retirement scenario". Further, at least in my imagination, one might be reallocating hardware from one Ironic domain to another which may have implications on how we best bring a new node online. (or not, I'm no expert) / Chris On 2019-05-21, 09:16, "Bogdan Dobrelya" wrote: [CC'ed edge-computing at lists.openstack.org] On 20.05.2019 18:33, Arne Wiebalck wrote: > Dear all, > > One of the discussions at the PTG in Denver raised the need for > a mechanism to take ironic nodes out of production (a task for > which the currently available 'maintenance' flag does not seem > appropriate [1]). > > The use case there is an unhealthy physical node in state 'active', > i.e. associated with an instance. The request is then to enable an > admin to mark such a node as 'faulty' or 'in quarantine' with the > aim of not returning the node to the pool of available nodes once > the hosted instance is deleted. > > A very similar use case which came up independently is node > retirement: it should be possible to mark nodes ('active' or not) > as being 'up for retirement' to prepare the eventual removal from > ironic. As in the example above, ('active') nodes marked this way > should not become eligible for instance scheduling again, but > automatic cleaning, for instance, should still be possible. > > In an effort to cover these use cases by a more general > "quarantine/retirement" feature: > > - are there additional use cases which could profit from such a > "take a node out of service" mechanism? There are security related examples described in the Edge Security Challenges whitepaper [0] drafted by k8s IoT SIG [1], like in the chapter 2 Trusting hardware, whereby "GPS coordinate changes can be used to force a shutdown of an edge node". So a node may be taken out of service as an indicator of a particular condition of edge hardware. [0] https://docs.google.com/document/d/1iSIk8ERcheehk0aRG92dfOvW5NjkdedN8F7mSUTr-r0/edit#heading=h.xf8mdv7zexgq [1] https://github.com/kubernetes/community/tree/master/wg-iot-edge > > - would these use cases put additional constraints on how the > feature should look like (e.g.: "should not prevent cleaning") > > - are there other characteristics such a feature should have > (e.g.: "finding these nodes should be supported by the cli") > > Let me know if you have any thoughts on this. > > Cheers, > Arne > > > [1] https://etherpad.openstack.org/p/DEN-train-ironic-ptg, l. 
360 > -- Best regards, Bogdan Dobrelya, Irc #bogdando _______________________________________________ Edge-computing mailing list Edge-computing at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing From rony.khan at novotel-bd.com Tue May 21 14:51:44 2019 From: rony.khan at novotel-bd.com (Md. Farhad Hasan Khan) Date: Tue, 21 May 2019 20:51:44 +0600 Subject: neutron network namespaces not create Message-ID: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC74@Email.novotel-bd.com> Hi, I can create router from horizon. But network namespaces not created. I check with # ip netns list command. Not found router ID, but showing in horizon. Here is some log I get from neutron: #cat /var/log/neutron/l3-agent.log 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent ri.delete() 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent router_id) 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.force_reraise() 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.ha_network_added() 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:35:44.711 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for cad85ce0-6624-42ff-b42b-09480aea2613, action None: BridgeDoesNotExist: Bridge br-int does not exist. 
2019-05-21 17:35:44.711 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup: BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent self.ha_network_added() 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent ri.delete() 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent router_id) 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.force_reraise() 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.ha_network_added() 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent self.ha_network_added() 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent ri.delete() 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 
cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent router_id) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.force_reraise() 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.ha_network_added() 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent self.ha_network_added() 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent ri.delete() 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 
cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent router_id) 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.force_reraise() 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.ha_network_added() 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent self.ha_network_added() 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.delete() 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 
cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent router_id) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.force_reraise() 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.ha_network_added() 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:13.024 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. 
Performing router cleanup
2019-05-21 17:37:14.358 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup

Thanks & B'Rgds,
Rony
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rony.khan at novotel-bd.com  Tue May 21 14:53:27 2019
From: rony.khan at novotel-bd.com (Md. Farhad Hasan Khan)
Date: Tue, 21 May 2019 20:53:27 +0600
Subject: neutron network namespaces not create
Message-ID: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC75@Email.novotel-bd.com>

Hi,
I can create a router from Horizon, but its network namespace is not created. I checked with the # ip netns list command: the router ID is not found there, although the router shows up in Horizon.

Here is some log I get from neutron:

#cat /var/log/neutron/l3-agent.log

2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613'
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent     ri.delete()
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent     self.disable_keepalived()
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent     shutil.rmtree(conf_dir)
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent     onerror(os.listdir, path, sys.exc_info())
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent     names = os.listdir(path)
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613'
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist.
2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent router_id) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.force_reraise() 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.ha_network_added() 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent 2019-05-21 17:37:13.024 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup 2019-05-21 17:37:14.358 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. 
Performing router cleanup Thanks & B'Rgds, Rony -------------- next part -------------- An HTML attachment was scrubbed... URL: From elmiko at redhat.com Tue May 21 15:13:09 2019 From: elmiko at redhat.com (Michael McCune) Date: Tue, 21 May 2019 11:13:09 -0400 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: On Tue, May 21, 2019 at 10:34 AM Marios Andreou wrote: > whilst i agree on all you wrote, imo a -1 with no comments is worse than a +1 with no comments. If you dislike my patch enough to -1 it tell me what i need to change in order to fix and get your vote i tend to agree with this sentiment. i am ok to have a +1 with no comment, especially from experienced team members, but a -1 should never have no comment. peace o/ From fungi at yuggoth.org Tue May 21 16:37:10 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 21 May 2019 16:37:10 +0000 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: <20190521163710.nujmle4dknr5cqgv@yuggoth.org> On 2019-05-21 09:31:19 -0400 (-0400), Brian Rosmaita wrote: > A recent spate of +1 reviews with no comments on patches has put > me into grumpy-old-man mode. Welcome to the club, we probably have an extra lapel pin for you around here somewhere. ;) > A +1 with no comments is completely useless (unless you have a > review on a previous patch set with comments that have been > addressed by the author). [...] I think it's far more contextually nuanced than that. When I see a naked +1 comment on a change from a reviewer I recognize as having provided insightful feedback in the past, I tend to give that some weight. Ideally a +1 carries with it an implicit "this change makes sense to me, is a good idea for the project, and I don't see any obvious flaws in it." If I only ever see (or only mostly see) them from someone who I don't recall commenting usefully in recent history, I basically ignore it. The implication of the +1 for that reviewer is still the same, it's just that I don't have a lot of faith in their ability to judge the validity of the change if they haven't demonstrated that ability to me. > When you post your +1, please leave a comment explaining why you > approve, or at least what in particular you looked at in the patch > that gave you a favorable impression. This whole open source > community thing is a collaborative effort, so please collaborate! > You comment does not have to be profound. Even just saying that > you checked that the release note or docs on the patch rendered > correctly in HTML is very helpful. [,...] In an ideal world where we all have plenty of time to review changes and whatever else needs doing, this would be great. You know better than most, I suspect, what it's like to be a core reviewer on a project with a very high change-to-reviewer ratio. I don't personally think it's a healthy thing to hold reviews from newer or less frequent reviewers to a *higher* standard than we do for our core reviewers on projects. The goal is to improve our software, and to do that we need a constant injection of contributors with an understanding of that software and our processes and workflows. We should be giving them the courtesy and respect of assuming they have performed diligence in the course of a review, as encouragement to get more involved and feel a part of the project. 
As project leaders, it falls upon us to make the determination as to when feedback from a reviewer is pertinent and when it isn't, *even* if it requires us to keep a bit of context. But more importantly, we should be setting the example for how we'd like new folks to review changes, not simply telling them. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From juliaashleykreger at gmail.com Tue May 21 16:43:18 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 21 May 2019 09:43:18 -0700 Subject: [ironic] IRC meeting timing In-Reply-To: References: Message-ID: >From experience and as the chair, I'd kind of like to avoid jumping between two different meeting times unless they are times that would be reasonably possible for me to make. That being said, I wonder if it is really the only way forward given existing time zone separations. I think the best thing we can do is actually poll the community using a doodle poll with 1 hour blocks over the course of say, two days in UTC. This would allow people to populate and represent possible times and from there if we have two peak areas of availability we might want to consider it. Jacob, would you be willing to create a doodle poll in the UTC time zone with ?48 (eek) options? over the course of two days and share that link on the mailing list and we can bring it up in the next weekly meeting? -Julia On Mon, May 20, 2019 at 7:47 AM Jacob Anders wrote: > > Hi All, > > I would be keen to participate in the Ironic IRC meetings, however the current timing of the meeting is quite unfavourable to those based in the Asia Pacific region. > > I'm wondering - would you be open to consider changing the timing to either: > > - 2000hrs UTC: > UTC (Time Zone) Monday, 20 May 2019 at 8:00:00 pm UTC > (Sydney/Australia) Tuesday, 21 May 2019 at 6:00:00 am AEST UTC+10 hours > (Germany/Berlin) Monday, 20 May 2019 at 10:00:00 pm CEST UTC+2 hours > (USA/California) Monday, 20 May 2019 at 1:00:00 pm PDT UTC-7 hours > > - alternating between two different times to accommodate different timezones? For example 1300hrs and 2000hrs UTC? > > Thank you. > > Best regards, > Jacob From morgan.fainberg at gmail.com Tue May 21 17:25:57 2019 From: morgan.fainberg at gmail.com (Morgan Fainberg) Date: Tue, 21 May 2019 10:25:57 -0700 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: <20190521163710.nujmle4dknr5cqgv@yuggoth.org> References: <20190521163710.nujmle4dknr5cqgv@yuggoth.org> Message-ID: I 100% agree with Jeremy here. I also want to add that a +1 without comment, even from a reviewer with no history, is an indicator that the code was read. Everyone has to start somewhere. I am 100% ok with some bare/naked +1s as long as eventually those folks give quality feedback (even as far as a -1 with a "hey this looks off" or a no-score with a question). If it is a pattern of just +1 to their friends/coworkers patches without review, I'll eventually ignore the +1s. In short, I encourage +1 even without comment if it brings in new contributors / reviewers. I hope that if there is a subsequent -1 on a patch (or other comments) the reviewers giving the initial +1 will read the comments and understand the reasoning for the -1 (-2, -1 workflow, whatever). I think what I am saying is: I hope we (as a community) don't chase folks off because they +1'd without comment. 
Everyone starts somewhere On Tue, May 21, 2019 at 9:42 AM Jeremy Stanley wrote: > On 2019-05-21 09:31:19 -0400 (-0400), Brian Rosmaita wrote: > > A recent spate of +1 reviews with no comments on patches has put > > me into grumpy-old-man mode. > > Welcome to the club, we probably have an extra lapel pin for you > around here somewhere. ;) > > > A +1 with no comments is completely useless (unless you have a > > review on a previous patch set with comments that have been > > addressed by the author). > [...] > > I think it's far more contextually nuanced than that. When I see a > naked +1 comment on a change from a reviewer I recognize as having > provided insightful feedback in the past, I tend to give that some > weight. Ideally a +1 carries with it an implicit "this change makes > sense to me, is a good idea for the project, and I don't see any > obvious flaws in it." If I only ever see (or only mostly see) them > from someone who I don't recall commenting usefully in recent > history, I basically ignore it. The implication of the +1 for that > reviewer is still the same, it's just that I don't have a lot of > faith in their ability to judge the validity of the change if they > haven't demonstrated that ability to me. > > > When you post your +1, please leave a comment explaining why you > > approve, or at least what in particular you looked at in the patch > > that gave you a favorable impression. This whole open source > > community thing is a collaborative effort, so please collaborate! > > You comment does not have to be profound. Even just saying that > > you checked that the release note or docs on the patch rendered > > correctly in HTML is very helpful. > [,...] > > In an ideal world where we all have plenty of time to review changes > and whatever else needs doing, this would be great. You know better > than most, I suspect, what it's like to be a core reviewer on a > project with a very high change-to-reviewer ratio. I don't > personally think it's a healthy thing to hold reviews from newer or > less frequent reviewers to a *higher* standard than we do for our > core reviewers on projects. The goal is to improve our software, and > to do that we need a constant injection of contributors with an > understanding of that software and our processes and workflows. We > should be giving them the courtesy and respect of assuming they have > performed diligence in the course of a review, as encouragement to > get more involved and feel a part of the project. > > As project leaders, it falls upon us to make the determination as to > when feedback from a reviewer is pertinent and when it isn't, *even* > if it requires us to keep a bit of context. But more importantly, we > should be setting the example for how we'd like new folks to review > changes, not simply telling them. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Tue May 21 17:28:00 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 21 May 2019 10:28:00 -0700 Subject: [Edge-computing] [ironic][ops] Taking ironic nodes out of production In-Reply-To: <6A205BFA-881E-4D2D-9A7D-E35935F6631B@est.tech> References: <08cb8294-04c8-e4ba-78c0-dec00f87156a@redhat.com> <6A205BFA-881E-4D2D-9A7D-E35935F6631B@est.tech> Message-ID: On Tue, May 21, 2019 at 9:34 AM Christopher Price wrote: > > I would add that something as simple as an operator policy could/should be able to remove hardware from an operational domain. 
It does not specifically need to be a fault or retirement, it may be as simple as repurposing to a different operational domain. From an OpenStack perspective this should not require any special handling from "retirement", it's just to know that there may be time constraints implied in a policy change that could potentially be ignored in a "retirement scenario".
>
> Further, at least in my imagination, one might be reallocating hardware from one Ironic domain to another which may have implications on how we best bring a new node online. (or not, I'm no expert)

You raise a really good point, and we've had some past discussions from a standpoint of leasing hardware between clusters. One was ultimately to allow for a federated model where ironic could talk to ironic, however... that wasn't a very well received idea because it would mean ironic could become aware of other ironics... And soon ironic takes over the rest of the world.

>
> / Chris
> [trim]

From juliaashleykreger at gmail.com  Tue May 21 17:33:10 2019
From: juliaashleykreger at gmail.com (Julia Kreger)
Date: Tue, 21 May 2019 10:33:10 -0700
Subject: [Edge-computing] [ironic][ops] Taking ironic nodes out of production
In-Reply-To: <09e4bfaa95404bcfba37ee63f6bf1189@AUSX13MPS304.AMER.DELL.COM>
References: <08cb8294-04c8-e4ba-78c0-dec00f87156a@redhat.com>
 <6A205BFA-881E-4D2D-9A7D-E35935F6631B@est.tech>
 <09e4bfaa95404bcfba37ee63f6bf1189@AUSX13MPS304.AMER.DELL.COM>
Message-ID:

On Tue, May 21, 2019 at 5:55 AM wrote:
>
> Let's dig deeper into requirements.
> I see three distinct use cases:
> 1. put node into maintenance mode. Say to upgrade FW/BIOS or any other life-cycle event. It stays in ironic cluster but it is no longer in use by the rest of openstack, like Nova.
> 2. Put node into "fail" state. That is remove from usage, remove from Ironic cluster. What cleanup, operator would like/can do is subject to failure. Depending on the node type it may need to be "replaced".

Or troubleshot by a human, and could be returned to a non-failure state. I think largely the only way we as developers could support that is to allow for hook scripts to be called upon entering/exiting such a state. That being said, at least from what Beth was saying at the PTG, this seems to be one of the most important states.

> 3. Put node into "available" to other usage. What cleanup operator wants to do will need to be defined. This is very similar step as used for Baremetal as a Service as node is reassigned back into available pool. Depending on the next usage of a node it may stay in the Ironic cluster or may be removed from it. Once removed it can be "retired" or used for any other purpose.

Do you mean "unprovision" a node and move it through cleaning? I'm not sure I understand what you're trying to get across. There is a case where a node would have been moved to a "failed" state, and could be "unprovisioned".
It does not specifically need to be a fault or retirement, it may be as simple as repurposing to a different operational domain. From an OpenStack perspective this should not require any special handling from "retirement", it's just to know that there may be time constraints implied in a policy change that could potentially be ignored in a "retirement scenario". > > Further, at least in my imagination, one might be reallocating hardware from one Ironic domain to another which may have implications on how we best bring a new node online. (or not, I'm no expert) > > / Chris > > On 2019-05-21, 09:16, "Bogdan Dobrelya" wrote: > > [CC'ed edge-computing at lists.openstack.org] > > On 20.05.2019 18:33, Arne Wiebalck wrote: > > Dear all, > > > > One of the discussions at the PTG in Denver raised the need for > > a mechanism to take ironic nodes out of production (a task for > > which the currently available 'maintenance' flag does not seem > > appropriate [1]). > > > > The use case there is an unhealthy physical node in state 'active', > > i.e. associated with an instance. The request is then to enable an > > admin to mark such a node as 'faulty' or 'in quarantine' with the > > aim of not returning the node to the pool of available nodes once > > the hosted instance is deleted. > > > > A very similar use case which came up independently is node > > retirement: it should be possible to mark nodes ('active' or not) > > as being 'up for retirement' to prepare the eventual removal from > > ironic. As in the example above, ('active') nodes marked this way > > should not become eligible for instance scheduling again, but > > automatic cleaning, for instance, should still be possible. > > > > In an effort to cover these use cases by a more general > > "quarantine/retirement" feature: > > > > - are there additional use cases which could profit from such a > > "take a node out of service" mechanism? > > There are security related examples described in the Edge Security > Challenges whitepaper [0] drafted by k8s IoT SIG [1], like in the > chapter 2 Trusting hardware, whereby "GPS coordinate changes can be used > to force a shutdown of an edge node". So a node may be taken out of > service as an indicator of a particular condition of edge hardware. > > [0] > https://docs.google.com/document/d/1iSIk8ERcheehk0aRG92dfOvW5NjkdedN8F7mSUTr-r0/edit#heading=h.xf8mdv7zexgq > [1] https://github.com/kubernetes/community/tree/master/wg-iot-edge > > > > > - would these use cases put additional constraints on how the > > feature should look like (e.g.: "should not prevent cleaning") > > > > - are there other characteristics such a feature should have > > (e.g.: "finding these nodes should be supported by the cli") > > > > Let me know if you have any thoughts on this. > > > > Cheers, > > Arne > > > > > > [1] https://etherpad.openstack.org/p/DEN-train-ironic-ptg, l. 
360 > > > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > _______________________________________________ > Edge-computing mailing list > Edge-computing at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing > > > _______________________________________________ > Edge-computing mailing list > Edge-computing at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing From eandersson at blizzard.com Tue May 21 18:26:59 2019 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Tue, 21 May 2019 18:26:59 +0000 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: <20190521163710.nujmle4dknr5cqgv@yuggoth.org> Message-ID: 100% agree here with Morgan. +1 is great, especially for projects with fewer regular reviewers where more eyes can help a lot. Best Regards, Erik Olof Gunnar Andersson From: Morgan Fainberg Sent: Tuesday, May 21, 2019 10:26 AM To: Jeremy Stanley Cc: openstack-discuss at lists.openstack.org Subject: Re: [dev][all] note to non-core reviewers in all projects I 100% agree with Jeremy here. I also want to add that a +1 without comment, even from a reviewer with no history, is an indicator that the code was read. Everyone has to start somewhere. I am 100% ok with some bare/naked +1s as long as eventually those folks give quality feedback (even as far as a -1 with a "hey this looks off" or a no-score with a question). If it is a pattern of just +1 to their friends/coworkers patches without review, I'll eventually ignore the +1s. In short, I encourage +1 even without comment if it brings in new contributors / reviewers. I hope that if there is a subsequent -1 on a patch (or other comments) the reviewers giving the initial +1 will read the comments and understand the reasoning for the -1 (-2, -1 workflow, whatever). I think what I am saying is: I hope we (as a community) don't chase folks off because they +1'd without comment. Everyone starts somewhere On Tue, May 21, 2019 at 9:42 AM Jeremy Stanley > wrote: On 2019-05-21 09:31:19 -0400 (-0400), Brian Rosmaita wrote: > A recent spate of +1 reviews with no comments on patches has put > me into grumpy-old-man mode. Welcome to the club, we probably have an extra lapel pin for you around here somewhere. ;) > A +1 with no comments is completely useless (unless you have a > review on a previous patch set with comments that have been > addressed by the author). [...] I think it's far more contextually nuanced than that. When I see a naked +1 comment on a change from a reviewer I recognize as having provided insightful feedback in the past, I tend to give that some weight. Ideally a +1 carries with it an implicit "this change makes sense to me, is a good idea for the project, and I don't see any obvious flaws in it." If I only ever see (or only mostly see) them from someone who I don't recall commenting usefully in recent history, I basically ignore it. The implication of the +1 for that reviewer is still the same, it's just that I don't have a lot of faith in their ability to judge the validity of the change if they haven't demonstrated that ability to me. > When you post your +1, please leave a comment explaining why you > approve, or at least what in particular you looked at in the patch > that gave you a favorable impression. This whole open source > community thing is a collaborative effort, so please collaborate! > You comment does not have to be profound. 
Even just saying that > you checked that the release note or docs on the patch rendered > correctly in HTML is very helpful. [,...] In an ideal world where we all have plenty of time to review changes and whatever else needs doing, this would be great. You know better than most, I suspect, what it's like to be a core reviewer on a project with a very high change-to-reviewer ratio. I don't personally think it's a healthy thing to hold reviews from newer or less frequent reviewers to a *higher* standard than we do for our core reviewers on projects. The goal is to improve our software, and to do that we need a constant injection of contributors with an understanding of that software and our processes and workflows. We should be giving them the courtesy and respect of assuming they have performed diligence in the course of a review, as encouragement to get more involved and feel a part of the project. As project leaders, it falls upon us to make the determination as to when feedback from a reviewer is pertinent and when it isn't, *even* if it requires us to keep a bit of context. But more importantly, we should be setting the example for how we'd like new folks to review changes, not simply telling them. -- Jeremy Stanley -------------- next part -------------- An HTML attachment was scrubbed... URL: From Arkady.Kanevsky at dell.com Tue May 21 19:00:41 2019 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Tue, 21 May 2019 19:00:41 +0000 Subject: [Edge-computing] [ironic][ops] Taking ironic nodes out of production In-Reply-To: References: <08cb8294-04c8-e4ba-78c0-dec00f87156a@redhat.com> <6A205BFA-881E-4D2D-9A7D-E35935F6631B@est.tech> <09e4bfaa95404bcfba37ee63f6bf1189@AUSX13MPS304.AMER.DELL.COM> Message-ID: Inline response -----Original Message----- From: Julia Kreger Sent: Tuesday, May 21, 2019 12:33 PM To: Kanevsky, Arkady Cc: Christopher Price; Bogdan Dobrelya; openstack-discuss; edge-computing at lists.openstack.org Subject: Re: [Edge-computing] [ironic][ops] Taking ironic nodes out of production [EXTERNAL EMAIL] On Tue, May 21, 2019 at 5:55 AM wrote: > > Let's dig deeper into requirements. > I see three distinct use cases: > 1. put node into maintenance mode. Say to upgrade FW/BIOS or any other life-cycle event. It stays in ironic cluster but it is no longer in use by the rest of openstack, like Nova. > 2. Put node into "fail" state. That is remove from usage, remove from Ironic cluster. What cleanup, operator would like/can do is subject to failure. Depending on the node type it may need to be "replaced". Or troubleshooted by a human, and could be returned to a non-failure state. I think largely the only way we as developers could support that is allow for hook scripts to be called upon entering/exiting such a state. That being said, At least from what Beth was saying at the PTG, this seems to be one of the most important states. > 3. Put node into "available" to other usage. What cleanup operator wants to do will need to be defined. This is very similar step as used for Baremetal as a Service as node is reassigned back into available pool. Depending on the next usage of a node it may stay in the Ironic cluster or may be removed from it. Once removed it can be "retired" or used for any other purpose. Do you mean "unprovision" a node and move it through cleaning? I'm not sure I understand what your trying to get across. There is a case where a node would have been moved to a "failed" state, and could be "unprovisioned". 
If we reach the point where we are able to unprovision, it seems like we might be able to re-deploy, so maybe the option is to automatically move to state which is kind of like bucket for broken nodes? AK: Before node is removed from Ironic some level of cleanup is expected. Especially if node is to be reused as Chris stated. I assume that that cleanup will be done by Ironic. What you do with the node after it is outside of Ironic is out of scope. > > Thanks, > Arkady > > -----Original Message----- > From: Christopher Price > Sent: Tuesday, May 21, 2019 3:26 AM > To: Bogdan Dobrelya; openstack-discuss at lists.openstack.org; > edge-computing at lists.openstack.org > Subject: Re: [Edge-computing] [ironic][ops] Taking ironic nodes out of > production > > > [EXTERNAL EMAIL] > > I would add that something as simple as an operator policy could/should be able to remove hardware from an operational domain. It does not specifically need to be a fault or retirement, it may be as simple as repurposing to a different operational domain. From an OpenStack perspective this should not require any special handling from "retirement", it's just to know that there may be time constraints implied in a policy change that could potentially be ignored in a "retirement scenario". > > Further, at least in my imagination, one might be reallocating > hardware from one Ironic domain to another which may have implications > on how we best bring a new node online. (or not, I'm no expert) end dubious thought stream> > > / Chris > > On 2019-05-21, 09:16, "Bogdan Dobrelya" wrote: > > [CC'ed edge-computing at lists.openstack.org] > > On 20.05.2019 18:33, Arne Wiebalck wrote: > > Dear all, > > > > One of the discussions at the PTG in Denver raised the need for > > a mechanism to take ironic nodes out of production (a task for > > which the currently available 'maintenance' flag does not seem > > appropriate [1]). > > > > The use case there is an unhealthy physical node in state 'active', > > i.e. associated with an instance. The request is then to enable an > > admin to mark such a node as 'faulty' or 'in quarantine' with the > > aim of not returning the node to the pool of available nodes once > > the hosted instance is deleted. > > > > A very similar use case which came up independently is node > > retirement: it should be possible to mark nodes ('active' or not) > > as being 'up for retirement' to prepare the eventual removal from > > ironic. As in the example above, ('active') nodes marked this way > > should not become eligible for instance scheduling again, but > > automatic cleaning, for instance, should still be possible. > > > > In an effort to cover these use cases by a more general > > "quarantine/retirement" feature: > > > > - are there additional use cases which could profit from such a > > "take a node out of service" mechanism? > > There are security related examples described in the Edge Security > Challenges whitepaper [0] drafted by k8s IoT SIG [1], like in the > chapter 2 Trusting hardware, whereby "GPS coordinate changes can be used > to force a shutdown of an edge node". So a node may be taken out of > service as an indicator of a particular condition of edge hardware. 
> > [0] > https://docs.google.com/document/d/1iSIk8ERcheehk0aRG92dfOvW5NjkdedN8F7mSUTr-r0/edit#heading=h.xf8mdv7zexgq > [1] > https://github.com/kubernetes/community/tree/master/wg-iot-edge > > > > > - would these use cases put additional constraints on how the > > feature should look like (e.g.: "should not prevent cleaning") > > > > - are there other characteristics such a feature should have > > (e.g.: "finding these nodes should be supported by the cli") > > > > Let me know if you have any thoughts on this. > > > > Cheers, > > Arne > > > > > > [1] https://etherpad.openstack.org/p/DEN-train-ironic-ptg, l. 360 > > > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > _______________________________________________ > Edge-computing mailing list > Edge-computing at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing > > > _______________________________________________ > Edge-computing mailing list > Edge-computing at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing From skaplons at redhat.com Tue May 21 20:13:04 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Tue, 21 May 2019 22:13:04 +0200 Subject: neutron network namespaces not create In-Reply-To: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC74@Email.novotel-bd.com> References: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC74@Email.novotel-bd.com> Message-ID: <08D0BBB7-C082-4953-9AFC-B06F13880C44@redhat.com> Hi, Error which You pasted looks that is related to moment when router is deleted. Do You have any errors in neutron-server and/or neutron-l3-agent logs during creation of the router? Can You check with command “neutron l3-agent-list-hosting-router ” by which l3 agent router should be hosted? What kind of router Your are creating? Is it HA, DVR, DVR-HA router? Or maybe Legacy? > On 21 May 2019, at 16:51, Md. Farhad Hasan Khan wrote: > > Hi, > I can create router from horizon. But network namespaces not created. I check with # ip netns list command. Not found router ID, but showing in horizon. > > Here is some log I get from neutron: > > > #cat /var/log/neutron/l3-agent.log > > > 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:35:44.711 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for cad85ce0-6624-42ff-b42b-09480aea2613, action None: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:35:44.711 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:08.029 52248 ERROR 
neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:12.640 52248 ERROR 
neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:13.024 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup > 2019-05-21 17:37:14.358 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup > > > Thanks & B’Rgds, > Rony — Slawek Kaplonski Senior software engineer Red Hat From mriedemos at gmail.com Tue May 21 20:26:17 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 21 May 2019 15:26:17 -0500 Subject: [nova] stable-maint is especially unhealthily RH-centric In-Reply-To: References: Message-ID: <86241a3b-be28-c83f-7c35-386946c3cdc8@gmail.com> On 5/21/2019 11:16 AM, Matthew Booth wrote: > Not Red Hat: > Claudiu Belu -> Inactive? > Matt Riedemann > John Garbutt > Matthew Treinish Sean McGinnis is on the release management team which is a (grand)parent group to nova-stable-maint and Sean reviews nova stable changes from time to time or as requested, but he's currently in the same boat as me. > > Red Hat: > Dan Smith > Lee Yarwood > Sylvain Bauza > Tony Breeds > Melanie Witt > Alan Pevec > Chuck Short > Flavio Percoco Alan, Chuck and Flavio are all in the parent stable-maint-core group but also inactive as far as I know. FWIW the most active nova stable cores are myself, Lee and Melanie. I ping Dan and Sylvain from time to time as needed on specific changes or if I'm trying to flush a branch for a release. > Tony Breeds > > This leaves Nova entirely dependent on Matt Riedemann, John Garbutt, > and Matthew Treinish to land patches in stable, which isn't a great > situation. With Matt R temporarily out of action that's especially > bad. This is a bit of an exaggeration. What you mean is that it leaves backports from Red Hat stuck(ish) because we want to avoid two RH cores from approving the backport. However, it doesn't mean 2 RH cores can't approve a backport from someone else, like something I backport for example. > > Looking for constructive suggestions. I'm obviously in favour of > relaxing the trifecta rules, but adding some non-RH stable cores also > seems like it would be a generally healthy thing for the project to > do. I've started a conversation about this within the nova-stable-maint team but until there are changes I think it's fair to say if you really need something that is backported from RH (like Lee backports something) then we can ping non-RH people to approve (like mtreinish or johnthetubaguy) or wait for me to get out of /dev/jail. -- Thanks, Matt From michaelr at catalyst.net.nz Wed May 22 01:19:40 2019 From: michaelr at catalyst.net.nz (Michael Richardson) Date: Wed, 22 May 2019 13:19:40 +1200 Subject: neutron network namespaces not create In-Reply-To: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC75@Email.novotel-bd.com> References: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC75@Email.novotel-bd.com> Message-ID: Hi Rony, Based on this line: > BridgeDoesNotExist: Bridge br-int does not exist ...it looks as if your SDN-provider-of-choice (OVS ?) may need some attention. As a wild stab in the dark, is the bridge defined in bridge_mappings, for the default bridge, present, and available to be patched (hypothetically via OVS, or another provider) to br-int ? Cheers, Michael. On 22/05/19 2:53 AM, Md. Farhad Hasan Khan wrote: > Hi, > > I can create router from horizon. But network namespaces not created. I > check with # ip netns list command. 
Not found router ID, but showing in > horizon. > > Here is some log I get from neutron: > > #cat /var/log/neutron/l3-agent.log > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Error > while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: > [Errno 2] No such file or directory: > '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback > (most recent call last): > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, > in _router_added > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent     ri.delete() > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line > 452, in delete > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > self.disable_keepalived() > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line > 178, in disable_keepalived > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > shutil.rmtree(conf_dir) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > onerror(os.listdir, path, sys.exc_info()) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent     names = > os.listdir(path) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent OSError: > [Errno 2] No such file or directory: > '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Failed to > process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: > BridgeDoesNotExist: Bridge br-int does not exist. 
> > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback > (most recent call last): > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, > in _process_router_update > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > self._process_router_if_compatible(router) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, > in _process_router_if_compatible > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > self._process_added_router(router) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, > in _process_added_router > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > self._router_added(router['id'], router) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, > in _router_added > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent     router_id) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in > __exit__ > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > self.force_reraise() > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in > force_reraise > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > six.reraise(self.type_, self.value, self.tb) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, > in _router_added > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > ri.initialize(self.process_monitor) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line > 128, in initialize > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > self.ha_network_added() > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line > 198, in ha_network_added > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > mtu=self.ha_port.get('mtu')) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", > line 263, in plug > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent     bridge, > namespace, prefix, mtu) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", > line 346, in plug_new > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > self.check_bridge_exists(bridge) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent   File > "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", > line 221, in check_bridge_exists > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent     raise > exceptions.BridgeDoesNotExist(bridge=bridge) > > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > BridgeDoesNotExist: Bridge br-int does not exist. 
> > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > > 2019-05-21 17:37:13.024 52248 WARNING neutron.agent.l3.agent [-] Info > for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. > Performing router cleanup > > 2019-05-21 17:37:14.358 52248 WARNING neutron.agent.l3.agent [-] Info > for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. > Performing router cleanup > > Thanks & B’Rgds, > > Rony > -- Michael Richardson Catalyst Cloud || Catalyst IT Limited 150-154 Willis Street, PO Box 11-053 Wellington New Zealand http://catalyst.net.nz GPG: 0530 4686 F996 4E2C 5DC7 6327 5C98 5EED A302 From li.canwei2 at zte.com.cn Wed May 22 02:47:27 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Wed, 22 May 2019 10:47:27 +0800 (CST) Subject: =?UTF-8?B?W1dhdGNoZXJdIHRlYW0gbWVldGluZyBhbmQgYWdlbmRh?= Message-ID: <201905221047274670348@zte.com.cn> Hi, Watcher will have a meeting at 08:00 UTC today in the #openstack-meeting-alt channel. The agenda is available on https://wiki.openstack.org/wiki/Watcher_Meeting_Agenda#05.2F22.2F2019 feel free to add any additional items. Thanks! Canwei Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony at bakeyournoodle.com Wed May 22 03:02:04 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Wed, 22 May 2019 13:02:04 +1000 Subject: [dev][requirements] Upcoming changes to constraints handling in tox.ini Message-ID: <20190522030203.GD15808@thor.bakeyournoodle.com> Hi folks, This is a heads-up to describe 3 sets of changes you'll start seeing starting next week. 1) lower-constraints.txt handling TL;DR: Make sure projects do not specify a constraint file in install_command 2) Switch to the new canonical constraints URL on master TR;DR: Make sure you use https://releases.openstack.org/constraints/upper/master 3) Switch to the new canonical constraints URL on stable branches TR;DR: Make sure you use https://releases.openstack.org/constraints/upper/$series These will be generated from a member of the requirements team[1], and will be on the gerrit topic constraints-updates. We'll start next week to give y'all a few days to digest this email Now for slightly more details: 1) lower-constraints.txt handling If a projects has an install_command that includes a constraint file *and* the also have a lower-constraints test env what may happen is something like: given a tox.ini that contains: --- [testenv] install_command = pip install -U -c upper-constraints.txt {opts} {deps} deps = -r requirements.txt -r test-requirements.txt [testenv:lower-constraints] deps = -c lower-constrtaints.txt -r requirements.txt -r test-requirements.txt --- pyXX: virtualenv -p $python_version .tox/$testenv_name .tox/$testenv_name/bin/pip install -U -c upper-constratints.txt -r requirements.txt -rtest-requirements.txt .tox/$testenv_name/bin/pip install . lower-constraints: virtualenv -p $python_version .tox/$testenv_name .tox/$testenv_name/bin/pip install -U -c upper-constratints.txt -c lower-constrtaints.txt -r requirements.txt -rtest-requirements.txt .tox/$testenv_name/bin/pip install . pip will constrain a library to the first item it encounters, which pretty much means that the lower-constraints testenv is still testing against upper-constraints.txt[3] The fix is to move all constraints into deps rather than install_command. This impacts ~40 projects and will be fixed by hand, rather than a tool so we can preserve each projects style for tox.ini. 
There's a pretty good chance that this change will break the lower-constraints testenv. Project teams are encouraged to help identify the incorrect values in lower-constraints; otherwise the requirements team will pick the current value in upper-constraints, which almost certainly raises a value well past the real minimum. https://review.opendev.org/#/c/601188/ is the change we did for keystone way back when.

Some projects have upper-constraints.txt in install_command but explicitly override it in lower-constraints.txt. This works and won't be changed at this point, but ideally all projects will use the same format to reduce the impact of switching between projects.

2) Switch to the new canonical constraints URL on master

Way back at the first Denver PTG the requirements, release and infrastructure teams discussed avoiding gitweb URLs for constraints. This was discussed[4] in more detail after the liberty EOL broke consumers still wanting to use tox, as https://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=stable/liberty 'vanished'. The idea was re-raised[5] in Feb. this year, where we devised a plan to create URLs that abstract the git hosting and allow us to switch between branches and tags at the right times in the OpenStack life-cycle.

Other, more dynamic, ideas were discussed but over-complicate things and make OpenStack slightly harder to understand or create more work. For example we could:

a. Add an install_script that looks up $metadata and grabs the right constraints
   We're trying to avoid scripts like this as it very rapidly gets us to the point where we're doing "managed copy-and-paste" between projects *or* projects' install scripts diverge.

b. Add to tox itself the ability to derive the URL from $metadata
   This is doable but frankly requires more time than we have right now.
So what we have is:

redirect 301 /constraints/upper/master http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=master
redirect 301 /constraints/upper/train http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=master
redirect 301 /constraints/upper/stein http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=stable/stein
redirect 301 /constraints/upper/rocky http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=stable/rocky
redirect 301 /constraints/upper/queens http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=stable/queens
redirect 301 /constraints/upper/pike http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=stable/pike
redirect 301 /constraints/upper/ocata http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=stable/ocata
redirect 301 /constraints/upper/newton http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=stable/newton
redirect 301 /constraints/upper/mitaka http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=mitaka-eol
redirect 301 /constraints/upper/liberty http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=liberty-eol
redirect 301 /constraints/upper/kilo http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=kilo-eol
redirect 301 /constraints/upper/juno http://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt?h=juno-eol

[Yes, I know we need to use opendev; that's tomorrow's problem]

so that means that:

http{s,}://releases.openstack.org/constraints/upper/master
http{s,}://releases.openstack.org/constraints/upper/$series

will always do the right thing based on the state of each OpenStack release. So we're going to generate changes that point to the right place.

For master we have 2 options:

a. Use .../train
   If we use train in the constraints URLs now, things will work, and when we branch for 'U' we'd need to update master to point to the right series name. This is very doable. The release team already generates a change on master at this point in time to update reno. We also need to update the new stable branch so that .gitreview points to the new branch.

b. Use .../master
   If we use 'master' in the constraints URLs, master will always work, and when we branch for 'U' we'd need to update the stable/train branch to point to train. The release team already does this today along with the .gitreview change.

So the bottom line is that using master (over train) is *slightly* less work for the release team and *slightly* less work for each project team; as such, that's what we're going to do.

There are several changes in the works that set the constraints URL to either https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt or https://releases.openstack.org/constraints/upper/train. These will work and will not break anything when the project branches; however, for the sake of OpenStack-wide consistency and to avoid merge conflicts, project teams are discouraged from merging them; if the change also fixes other references to git.openstack.org, just have the tox.ini updates removed.
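For completeness, once the dust settles a project's deps stanza would reference the new URL roughly like this (again just a sketch, not the exact change that will be proposed):

---
[testenv]
deps =
  -c https://releases.openstack.org/constraints/upper/master
  -r requirements.txt
  -r test-requirements.txt
---

On a stable branch the only difference is the series name at the end of the URL (for example .../constraints/upper/stein), which is what makes the branch-time update described in option b a trivial one-line change.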
Once tox.ini is done the requirements team will look for additional places where the 'old style' urls are used and follow-up but right now there's a lot of noise to sift through 3) Switch to the new canonical constraints URL on stable branches Hopefully with all the background from 2 above this should be pretty self-explanatory. This is however much lower priority and probably wont start until the bulk of 1 and 2 is complete Yours Tony. [1] https://review.opendev.org/#/admin/groups/131,members [2] [2] Probably me actually ;P [3] There is some nuance here but this is the general rule [4] In the thread that starts with: http://lists.openstack.org/pipermail/openstack-dev/2017-September/122333.html [5] In the thread that starts with: http://lists.openstack.org/pipermail/openstack-discuss/2019-February/002682.html -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From skaplons at redhat.com Wed May 22 04:50:42 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Wed, 22 May 2019 06:50:42 +0200 Subject: neutron network namespaces not create In-Reply-To: References: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC75@Email.novotel-bd.com> Message-ID: Hi, Yes, Michael is right. This error message is probably suspicious :) So, in ML2/OVS its neutron-ovs-agent who should create br-int during the start. It’s in [1]. So is neutron-openvswitch-agent running on Your node? Or maybe if You are using some other solution rather than ML2/OVS, maybe You should change to other than “openvswitch” interface driver in neutron-l3-agent’s config? [1] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L1074 > On 22 May 2019, at 03:19, Michael Richardson wrote: > > Hi Rony, > > Based on this line: > > BridgeDoesNotExist: Bridge br-int does not exist > > ...it looks as if your SDN-provider-of-choice (OVS ?) may need some attention. As a wild stab in the dark, is the bridge defined in bridge_mappings, for the default bridge, present, and available to be patched (hypothetically via OVS, or another provider) to br-int ? > > Cheers, > Michael. > > > On 22/05/19 2:53 AM, Md. Farhad Hasan Khan wrote: >> Hi, >> I can create router from horizon. But network namespaces not created. I check with # ip netns list command. Not found router ID, but showing in horizon. 
>> Here is some log I get from neutron: >> #cat /var/log/neutron/l3-agent.log >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.delete() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent router_id) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.force_reraise() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:13.024 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. 
Performing router cleanup >> 2019-05-21 17:37:14.358 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup >> Thanks & B’Rgds, >> Rony > > -- > Michael Richardson > Catalyst Cloud || Catalyst IT Limited > 150-154 Willis Street, PO Box 11-053 Wellington New Zealand > http://catalyst.net.nz > GPG: 0530 4686 F996 4E2C 5DC7 6327 5C98 5EED A302 > — Slawek Kaplonski Senior software engineer Red Hat From ifatafekn at gmail.com Wed May 22 05:39:26 2019 From: ifatafekn at gmail.com (Ifat Afek) Date: Wed, 22 May 2019 08:39:26 +0300 Subject: [vitrage][ptl][tc] stepping down as Vitrage PTL Message-ID: Hi, As I have taken on a new role in my company, I will not be able to continue serving as the Vitrage PTL. I’ve been the PTL of Vitrage from the day it started (back then in Mitaka), and it has been a real pleasure for me. Helping a project grow from an idea and a set of diagrams to a production-grade service was an amazing experience. I am honored to have worked with excellent developers, both Vitrage core contributors and other community members. I learned a lot, and also managed to have fun along the way :-) I would like to take this opportunity to thank everyone who contributed to the success of Vitrage – either by writing code, suggesting new use cases, participating in our discussions, or helping out when our gate was broken. Eyal Bar-Ilan (irc: eyalb), a Vitrage core contributor who was part of the team from the very beginning, will be replacing me as the new Vitrage PTL [1]. I’m sure he will make an excellent PTL, as someone who knows every piece of the code and is tightly connected to the community. I will still be around to help if needed. I wish Eyal lots of luck in his new role! Ifat [1] https://review.opendev.org/#/c/660563/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dirk at dmllr.de Wed May 22 06:45:29 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Wed, 22 May 2019 08:45:29 +0200 Subject: [dev][requirements] Upcoming changes to constraints handling in tox.ini In-Reply-To: <20190522030203.GD15808@thor.bakeyournoodle.com> References: <20190522030203.GD15808@thor.bakeyournoodle.com> Message-ID: Hi Tony, Thanks for the write-up. > 2) Switch to the new canonical constraints URL on master At the last Denver PTG we also discussed the switch from UPPER_CONSTRAINTS_FILE environment variable to TOX_CONSTRAINTS_FILE. As this change and the switch from UPPER_CONSTRAINTS_FILE to TOX_CONSTRAINTS_FILE would touch the very same line of text in the tox.ini, I would suggest that we combine that into one review as that is ~ 300 reviews less to conflict-merge and resolve when both would happen independently at the same time. I started the patch series to add TOX_CONSTRAINTS_FILE in addition to UPPER_CONSTRAINTS_FILE so that lower-constraints setting looks less odd here: https://review.opendev.org/657886 https://review.opendev.org/660187 Would be good to get this in in-time so that requirements team can do both changes in one review set. 
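As a rough sketch of the combined result (an illustration only, not the literal content of the reviews above), a project's deps line would then end up looking something like:

  deps =
    -c{env:TOX_CONSTRAINTS_FILE:{env:UPPER_CONSTRAINTS_FILE:https://releases.openstack.org/constraints/upper/master}}
    -r{toxinidir}/requirements.txt

i.e. the new TOX_CONSTRAINTS_FILE variable is honoured first, the old UPPER_CONSTRAINTS_FILE still works as a fallback, and the canonical URL is the default.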
Thanks, Dirk From renat.akhmerov at gmail.com Wed May 22 06:57:40 2019 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Wed, 22 May 2019 13:57:40 +0700 Subject: [mistral] Reminder: office hours in ~60 mins (8.00 UTC) In-Reply-To: <34fb190d-45db-401a-a09d-e0e1a43fc878@Spark> References: <34fb190d-45db-401a-a09d-e0e1a43fc878@Spark> Message-ID: Hi, Just a reminder that we’ll have an office hours sessions in ~60 mins (8.00 UTC) in #openstack-mistral. If you’re interested in chatting about anything regarding Mistral, welcome to join. Thanks Renat Akhmerov @Nokia -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Wed May 22 07:09:25 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 22 May 2019 09:09:25 +0200 Subject: [ops] database archiving tool In-Reply-To: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> References: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> Message-ID: <011a09f0-3e55-ff57-b5a0-00a3567061ad@debian.org> On 5/9/19 5:14 PM, Pierre-Samuel LE STANG wrote: > Hi all, > > At OVH we needed to write our own tool that archive data from OpenStack > databases to prevent some side effect related to huge tables (slower response > time, changing MariaDB query plan) and to answer to some legal aspects. > > So we started to write a python tool which is called OSArchiver that I briefly > presented at Denver few days ago in the "Optimizing OpenStack at large scale" > talk. We think that this tool could be helpful to other and are ready to open > source it, first we would like to get the opinion of the ops community about > that tool. > > To sum-up OSArchiver is written to work regardless of Openstack project. The > tool relies on the fact that soft deleted data are recognizable because of > their 'deleted' column which is set to 1 or uuid and 'deleted_at' column which > is set to the date of deletion. > > The points to have in mind about OSArchiver: > * There is no knowledge of business objects > * One table might be archived if it contains 'deleted' column > * Children rows are archived before parents rows > * A row can not be deleted if it fails to be archived > > Here are features already implemented: > * Archive data in an other database and/or file (actually SQL and CSV > formats are supported) to be easily imported > * Delete data from Openstack databases > * Customizable (retention, exclude DBs, exclude tables, bulk insert/delete) > * Multiple archiving configuration > * Dry-run mode > * Easily extensible, you can add your own destination module (other file > format, remote storage etc...) > * Archive and/or delete only mode > > It also means that by design you can run osarchiver not only on OpenStack > databases but also on archived OpenStack databases. > > Thanks in advance for your feedbacks. > Hi Pierre, That's really the kind of project that I would prefer not to have to exist. By this I mean, it'd be a lot nicer if this could be taken care of project by project, with something like what Nova does (ie: nova-manage db archive_deleted_rows). In such configuration, that's typically something that could be added as a cron job, automatically configured by packages. Now, a question for other OPS reading this thread: how long should be the retention? In Debian, we use to have the unsaid policy that we don't want too much retention, to project privacy. Though in operation, we may need at least a few weeks of history, so we can do support. 
If I was to configure a cron job for nova, for example, what parameter should I set to --before (when #556751 is merged)? My instinct would be: nova-manage db archive_deleted_rows \ --before $(date -d "-1 month" +%Y-%m-%d) Your thoughts everyone? Cheers, Thomas Goirand (zigo) From sfinucan at redhat.com Wed May 22 08:26:19 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Wed, 22 May 2019 09:26:19 +0100 Subject: [dev][requirements] Upcoming changes to constraints handling in tox.ini In-Reply-To: <20190522030203.GD15808@thor.bakeyournoodle.com> References: <20190522030203.GD15808@thor.bakeyournoodle.com> Message-ID: <2e3fa8543cde73f3b93566c0a5b89c30c8d6b42b.camel@redhat.com> On Wed, 2019-05-22 at 13:02 +1000, Tony Breeds wrote: > Hi folks, > This is a heads-up to describe 3 sets of changes you'll start seeing > starting next week. > > 1) lower-constraints.txt handling > TL;DR: Make sure projects do not specify a constraint file in install_command > 2) Switch to the new canonical constraints URL on master > TR;DR: Make sure you use https://releases.openstack.org/constraints/upper/master > 3) Switch to the new canonical constraints URL on stable branches > TR;DR: Make sure you use https://releases.openstack.org/constraints/upper/$series > > These will be generated from a member of the requirements team[1], and > will be on the gerrit topic constraints-updates. We'll start next week > to give y'all a few days to digest this email All looks good to me. I'd been fixing this in a piecemeal fashion with oslo but who knows what other projects do iffy stuff here. I realize this is bound to be controversial, but would it be possible to just auto-merge these patches assuming they pass CI? We've had a lot of these initiatives before and, invariably, there are some projects that won't get around to merging these for a long time (if ever). We had to do this recently with the opendev updates to the '.gitreview' files (I think?) so there is precedent here. Stephen > [1] https://review.opendev.org/#/admin/groups/131,members [2] > [2] Probably me actually ;P From pierre-samuel.le-stang at corp.ovh.com Wed May 22 08:34:37 2019 From: pierre-samuel.le-stang at corp.ovh.com (Pierre-Samuel LE STANG) Date: Wed, 22 May 2019 10:34:37 +0200 Subject: [ops] database archiving tool In-Reply-To: <011a09f0-3e55-ff57-b5a0-00a3567061ad@debian.org> References: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> <011a09f0-3e55-ff57-b5a0-00a3567061ad@debian.org> Message-ID: <20190522083437.u4hneebmjualmd32@corp.ovh.com> Thomas Goirand wrote on mer. [2019-mai-22 09:09:25 +0200]: > On 5/9/19 5:14 PM, Pierre-Samuel LE STANG wrote: > > Hi all, > > > > At OVH we needed to write our own tool that archive data from OpenStack > > databases to prevent some side effect related to huge tables (slower response > > time, changing MariaDB query plan) and to answer to some legal aspects. > > > > So we started to write a python tool which is called OSArchiver that I briefly > > presented at Denver few days ago in the "Optimizing OpenStack at large scale" > > talk. We think that this tool could be helpful to other and are ready to open > > source it, first we would like to get the opinion of the ops community about > > that tool. > > > > To sum-up OSArchiver is written to work regardless of Openstack project. The > > tool relies on the fact that soft deleted data are recognizable because of > > their 'deleted' column which is set to 1 or uuid and 'deleted_at' column which > > is set to the date of deletion. 
> > > > The points to have in mind about OSArchiver: > > * There is no knowledge of business objects > > * One table might be archived if it contains 'deleted' column > > * Children rows are archived before parents rows > > * A row can not be deleted if it fails to be archived > > > > Here are features already implemented: > > * Archive data in an other database and/or file (actually SQL and CSV > > formats are supported) to be easily imported > > * Delete data from Openstack databases > > * Customizable (retention, exclude DBs, exclude tables, bulk insert/delete) > > * Multiple archiving configuration > > * Dry-run mode > > * Easily extensible, you can add your own destination module (other file > > format, remote storage etc...) > > * Archive and/or delete only mode > > > > It also means that by design you can run osarchiver not only on OpenStack > > databases but also on archived OpenStack databases. > > > > Thanks in advance for your feedbacks. > > > > Hi Pierre, > > That's really the kind of project that I would prefer not to have to > exist. By this I mean, it'd be a lot nicer if this could be taken care > of project by project, with something like what Nova does (ie: > nova-manage db archive_deleted_rows). > > In such configuration, that's typically something that could be added as > a cron job, automatically configured by packages. > > Now, a question for other OPS reading this thread: how long should be > the retention? In Debian, we use to have the unsaid policy that we don't > want too much retention, to project privacy. Though in operation, we may > need at least a few weeks of history, so we can do support. If I was to > configure a cron job for nova, for example, what parameter should I set > to --before (when #556751 is merged)? My instinct would be: > > nova-manage db archive_deleted_rows \ > --before $(date -d "-1 month" +%Y-%m-%d) > > Your thoughts everyone? Hi Thomas, Thanks for your feedback, I really appreciate it. The tool is designed to be customized and to fit your needs. It means that you can have one configuration per project or one configuration for all projects. So you might imagine having a configuration for glance which exclude images table and one configuration for nova with a higher or lower retention. -- PS From thierry at openstack.org Wed May 22 08:56:02 2019 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 22 May 2019 10:56:02 +0200 Subject: [vitrage][ptl][tc] stepping down as Vitrage PTL In-Reply-To: References: Message-ID: Ifat Afek wrote: > As I have taken on a new role in my company, I will not be able to > continue serving as the Vitrage PTL. > [...] > Eyal Bar-Ilan (irc: eyalb), a Vitrage core contributor who was part of > the team from the very beginning, will be replacing me as the new > Vitrage PTL [1]. I’m sure he will make an excellent PTL, as someone who > knows every piece of the code and is tightly connected to the community. > I will still be around to help if needed. > > I wish Eyal lots of luck in his new role! Thanks Ifat for all your help driving Vitrage so far, and thanks to Eyal for stepping up ! 
-- Thierry Carrez (ttx) From bdobreli at redhat.com Wed May 22 09:01:08 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Wed, 22 May 2019 11:01:08 +0200 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: <20190521163710.nujmle4dknr5cqgv@yuggoth.org> References: <20190521163710.nujmle4dknr5cqgv@yuggoth.org> Message-ID: <0bdfa780-d629-3401-7df1-54a96aa1b6ea@redhat.com> On 21.05.2019 18:37, Jeremy Stanley wrote: > this change makes > sense to me, is a good idea for the project, and I don't see any > obvious flaws in it. Would be nice to have this as a default message proposed by gerrit for +1 action. So there never be emptiness and everyone gets happy, by default! -- Best regards, Bogdan Dobrelya, Irc #bogdando From kchamart at redhat.com Wed May 22 09:45:49 2019 From: kchamart at redhat.com (Kashyap Chamarthy) Date: Wed, 22 May 2019 11:45:49 +0200 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: <20190521163710.nujmle4dknr5cqgv@yuggoth.org> References: <20190521163710.nujmle4dknr5cqgv@yuggoth.org> Message-ID: <20190522094549.GC19519@paraplu> On Tue, May 21, 2019 at 04:37:10PM +0000, Jeremy Stanley wrote: > On 2019-05-21 09:31:19 -0400 (-0400), Brian Rosmaita wrote: [...] > > A +1 with no comments is completely useless (unless you have a > > review on a previous patch set with comments that have been > > addressed by the author). > [...] > > I think it's far more contextually nuanced than that. When I see a > naked +1 comment on a change from a reviewer I recognize as having > provided insightful feedback in the past, I tend to give that some > weight. Ideally a +1 carries with it an implicit "this change makes > sense to me, is a good idea for the project, and I don't see any > obvious flaws in it." If I only ever see (or only mostly see) them > from someone who I don't recall commenting usefully in recent > history, I basically ignore it. The implication of the +1 for that > reviewer is still the same, it's just that I don't have a lot of > faith in their ability to judge the validity of the change if they > haven't demonstrated that ability to me. /me nods in agreement. One way I look at it (as someone who also participates in projects that use mailing list-based patch workflows) is: Would you feel comfortable sending a flurry of 'naked +1' (or "ACK") e-mails to patches or patch series on a publicly archived mailing list? Occasionally, yes, depending on the shared understanding between regular contributors, and the nuanced context Jeremey notes above. But most reviewers (new, or otherwise) would feel awkward to do that, and instead leave an observation (and more often: a "thoughtful assent"). The nature of Gerrit is such that ("each change is an island") it encourages one to just push a simple button to give assent or dissent, without additional rationale. > > When you post your +1, please leave a comment explaining why you > > approve, or at least what in particular you looked at in the patch > > that gave you a favorable impression. This whole open source > > community thing is a collaborative effort, so please collaborate! > > You comment does not have to be profound. Even just saying that > > you checked that the release note or docs on the patch rendered > > correctly in HTML is very helpful. > > > [,...] > > In an ideal world where we all have plenty of time to review changes > and whatever else needs doing, this would be great. 
You know better > than most, I suspect, what it's like to be a core reviewer on a > project with a very high change-to-reviewer ratio. I don't > personally think it's a healthy thing to hold reviews from newer or > less frequent reviewers to a *higher* standard than we do for our > core reviewers on projects. The goal is to improve our software, and > to do that we need a constant injection of contributors with an > understanding of that software and our processes and workflows. We > should be giving them the courtesy and respect of assuming they have > performed diligence in the course of a review, as encouragement to > get more involved and feel a part of the project. > > As project leaders, it falls upon us to make the determination as to > when feedback from a reviewer is pertinent and when it isn't, *even* > if it requires us to keep a bit of context. But more importantly, we > should be setting the example for how we'd like new folks to review > changes, not simply telling them. Very well articulated. -- /kashyap From stig.openstack at telfer.org Wed May 22 09:47:04 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Wed, 22 May 2019 10:47:04 +0100 Subject: [scientific-sig] IRC Meeting today - SDN for scientific OpenStack Message-ID: <9DFFDFF9-4571-4DF2-86AD-73CD578D4FD4@telfer.org> Hi All - We have an IRC meeting today at 1100 UTC (about an hour’s time) in channel #openstack-meeting. Everyone is welcome. Today’s agenda is here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_May_22nd_2019 We’d like to focus today on issues with SDN and research computing use cases. Please come along with war stories and experiences to share. Cheers, Stig -------------- next part -------------- An HTML attachment was scrubbed... URL: From natal at redhat.com Wed May 22 10:02:33 2019 From: natal at redhat.com (=?UTF-8?Q?Natal_Ng=C3=A9tal?=) Date: Wed, 22 May 2019 12:02:33 +0200 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: On Tue, May 21, 2019 at 3:33 PM Brian Rosmaita wrote: > A recent spate of +1 reviews with no comments on patches has put me into > grumpy-old-man mode. What do you mean by this? I make several code review and each time I add +1 that will mean I have made a code review. > A +1 with no comments is completely useless (unless you have a review on > a previous patch set with comments that have been addressed by the > author). I already know you're a smart person (you figured out how to > sign the CLA and navigate gerrit -- lots of people can't or won't do > that), but all your non-comment +1 tells me is that you are in favor of > the patch. That doesn't give me any information, because I already know > that the author is in favor of the patch, so that makes two of you out > of about 1,168 reviewers. That's not exactly a groundswell of support. I don't make code review to improve my statics, however sometimes I add a +1 and I have don't saw an error and another saw an error. That doesn't means I have don't read the code, sometimes during a code review we can also make a fail. This why it's good to have many different person make the code review on a patch. > When you post your +1, please leave a comment explaining why you > approve, or at least what in particular you looked at in the patch that > gave you a favorable impression. This whole open source community thing > is a collaborative effort, so please collaborate! You comment does not > have to be profound. 
Even just saying that you checked that the release > note or docs on the patch rendered correctly in HTML is very helpful. For me a +1 without comment is ok. The +1 is implicit that mean looks good to merge, but must be review by another person. Impose to add a comment for each review, it's for me a nonsense and a bad idea. Which type of comment do you will? I mean if it's only to add in the comment, looks good to merge, that change nothing. > The same thing goes for leaving a -1 on a patch. Don't just drop a -1 > bomb with no explanation. The kind of review that will put you on track > for becoming core in a project is what johnthetubaguy calls a > "thoughtful -1", that is, a negative review that clearly explains what > the problem is and points the author in a good direction to fix it. I totally agree a -1 must be coming with a comment, it's a nonsense to have a -1 without explanation. From jacob.anders.au at gmail.com Wed May 22 10:57:05 2019 From: jacob.anders.au at gmail.com (Jacob Anders) Date: Wed, 22 May 2019 20:57:05 +1000 Subject: [ironic] IRC meeting timing In-Reply-To: References: Message-ID: Hi Julia, Thank you for your response. On Wed, May 22, 2019 at 2:43 AM Julia Kreger wrote: > I think the best thing we can do is actually poll the community using > a doodle poll with 1 hour blocks over the course of say, two days in > UTC. This would allow people to populate and represent possible times > and from there if we have two peak areas of availability we might want > to consider it. > > Jacob, would you be willing to create a doodle poll in the UTC time > zone with ?48 (eek) options? over the course of two days and share > that link on the mailing list and we can bring it up in the next > weekly meeting? > Of course. Here's the poll: https://doodle.com/poll/bv9a4qyqy44wiq92 I'll look forward to the feedback from the group. Best Regards, Jacob On Mon, May 20, 2019 at 7:47 AM Jacob Anders > wrote: > > > > Hi All, > > > > I would be keen to participate in the Ironic IRC meetings, however the > current timing of the meeting is quite unfavourable to those based in the > Asia Pacific region. > > > > I'm wondering - would you be open to consider changing the timing to > either: > > > > - 2000hrs UTC: > > UTC (Time Zone) Monday, 20 May 2019 at 8:00:00 pm UTC > > (Sydney/Australia) Tuesday, 21 May 2019 at 6:00:00 am AEST UTC+10 hours > > (Germany/Berlin) Monday, 20 May 2019 at 10:00:00 pm CEST UTC+2 hours > > (USA/California) Monday, 20 May 2019 at 1:00:00 pm PDT UTC-7 hours > > > > - alternating between two different times to accommodate different > timezones? For example 1300hrs and 2000hrs UTC? > > > > Thank you. > > > > Best regards, > > Jacob > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zigo at debian.org Wed May 22 11:24:19 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 22 May 2019 13:24:19 +0200 Subject: [ops] database archiving tool In-Reply-To: <20190522083437.u4hneebmjualmd32@corp.ovh.com> References: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> <011a09f0-3e55-ff57-b5a0-00a3567061ad@debian.org> <20190522083437.u4hneebmjualmd32@corp.ovh.com> Message-ID: On 5/22/19 10:34 AM, Pierre-Samuel LE STANG wrote: > Thomas Goirand wrote on mer. [2019-mai-22 09:09:25 +0200]: >> Hi Pierre, >> >> That's really the kind of project that I would prefer not to have to >> exist. 
By this I mean, it'd be a lot nicer if this could be taken care >> of project by project, with something like what Nova does (ie: >> nova-manage db archive_deleted_rows). >> >> In such configuration, that's typically something that could be added as >> a cron job, automatically configured by packages. >> >> Now, a question for other OPS reading this thread: how long should be >> the retention? In Debian, we use to have the unsaid policy that we don't >> want too much retention, to project privacy. Though in operation, we may >> need at least a few weeks of history, so we can do support. If I was to >> configure a cron job for nova, for example, what parameter should I set >> to --before (when #556751 is merged)? My instinct would be: >> >> nova-manage db archive_deleted_rows \ >> --before $(date -d "-1 month" +%Y-%m-%d) >> >> Your thoughts everyone? > > Hi Thomas, > > Thanks for your feedback, I really appreciate it. The tool is designed to be > customized and to fit your needs. It means that you can have one configuration > per project or one configuration for all projects. > > So you might imagine having a configuration for glance which exclude images > table and one configuration for nova with a higher or lower retention. > > -- > PS This looks super nice then! Will you provide a standard configuration for every OpenStack project? It'd be nice if your package had a conf.d folder where one could drop the config for every project. That way, every OpenStack project package could drop a configuration there. Cheers, Thomas Goirand (zigo) From witold.bedyk at suse.com Wed May 22 11:28:37 2019 From: witold.bedyk at suse.com (Witek Bedyk) Date: Wed, 22 May 2019 13:28:37 +0200 Subject: [vitrage][ptl][tc] stepping down as Vitrage PTL In-Reply-To: References: Message-ID: <1c2b3d3e-e437-c9fc-3391-6ce92a5a4e8b@suse.com> Thanks Ifat, I wish you a lot of success in the new role! Congratulations Eyal. Looking forward to working together. Cheers Witek On 5/22/19 7:39 AM, Ifat Afek wrote: > Hi, > > > As I have taken on a new role in my company, I will not be able to > continue serving as the Vitrage PTL. > > I’ve been the PTL of Vitrage from the day it started (back then in > Mitaka), and it has been a real pleasure for me. Helping a project grow > from an idea and a set of diagrams to a production-grade service was an > amazing experience. I am honored to have worked with excellent > developers, both Vitrage core contributors and other community members. > I learned a lot, and also managed to have fun along the way :-) > > I would like to take this opportunity to thank everyone who contributed > to the success of Vitrage – either by writing code, suggesting new use > cases, participating in our discussions, or helping out when our gate > was broken. > > > Eyal Bar-Ilan (irc: eyalb), a Vitrage core contributor who was part of > the team from the very beginning, will be replacing me as the new > Vitrage PTL [1]. I’m sure he will make an excellent PTL, as someone who > knows every piece of the code and is tightly connected to the community. > I will still be around to help if needed. > > I wish Eyal lots of luck in his new role! 
> > > Ifat > > > [1] https://review.opendev.org/#/c/660563/ > > From pierre-samuel.le-stang at corp.ovh.com Wed May 22 11:49:02 2019 From: pierre-samuel.le-stang at corp.ovh.com (Pierre-Samuel LE STANG) Date: Wed, 22 May 2019 13:49:02 +0200 Subject: [ops] database archiving tool In-Reply-To: References: <20190509151428.im2c6dbxpv6hwhyo@corp.ovh.com> <011a09f0-3e55-ff57-b5a0-00a3567061ad@debian.org> <20190522083437.u4hneebmjualmd32@corp.ovh.com> Message-ID: <20190522114902.2gqgporssjuvnrxt@corp.ovh.com> Thomas Goirand wrote on mer. [2019-mai-22 13:24:19 +0200]: > On 5/22/19 10:34 AM, Pierre-Samuel LE STANG wrote: > > Thomas Goirand wrote on mer. [2019-mai-22 09:09:25 +0200]: > >> Hi Pierre, > >> > >> That's really the kind of project that I would prefer not to have to > >> exist. By this I mean, it'd be a lot nicer if this could be taken care > >> of project by project, with something like what Nova does (ie: > >> nova-manage db archive_deleted_rows). > >> > >> In such configuration, that's typically something that could be added as > >> a cron job, automatically configured by packages. > >> > >> Now, a question for other OPS reading this thread: how long should be > >> the retention? In Debian, we use to have the unsaid policy that we don't > >> want too much retention, to project privacy. Though in operation, we may > >> need at least a few weeks of history, so we can do support. If I was to > >> configure a cron job for nova, for example, what parameter should I set > >> to --before (when #556751 is merged)? My instinct would be: > >> > >> nova-manage db archive_deleted_rows \ > >> --before $(date -d "-1 month" +%Y-%m-%d) > >> > >> Your thoughts everyone? > > > > Hi Thomas, > > > > Thanks for your feedback, I really appreciate it. The tool is designed to be > > customized and to fit your needs. It means that you can have one configuration > > per project or one configuration for all projects. > > > > So you might imagine having a configuration for glance which exclude images > > table and one configuration for nova with a higher or lower retention. > > > > -- > > PS > > This looks super nice then! > > Will you provide a standard configuration for every OpenStack project? > It'd be nice if your package had a conf.d folder where one could drop > the config for every project. That way, every OpenStack project package > could drop a configuration there. > > Cheers, > > Thomas Goirand (zigo) > I will provide a default configuration file that might be used as a template or reference. I did not consider adding the conf.d folder mechanism as all the different configurations can be written in the same config file but that sounds good to make the things clearer an easier. I'm still focus on making the code opensource as soon as it's done we will be able to make the tool evolve. -- PS From a.settle at outlook.com Wed May 22 12:12:36 2019 From: a.settle at outlook.com (Alexandra Settle) Date: Wed, 22 May 2019 12:12:36 +0000 Subject: [dev][all][ptls] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: Replying to the last thread and top posting both on purpose as I want to point out something slightly differently but I am taking this platform as a way to create further discussion... The topic of +1 and -1 was brought up in a number of times at the most recent PTG. In the past it has been somewhat obvious that we HAVE had individuals who provide a +1 or -1 to improve their review stats. 
Now, whilst we *now* have the guide to reviewing as Julia pointed out [0] we originally created a community that kind of had to make up the meanings behind a +1 and -1. I am speaking generally here, some projects definitely had defined rules as to what these reviews meant, but not all did. This obviously left this very open to interpretation. To become a core review in many projects, the barrier to entry was general "you must provide quality reviews". That's great, except what ended up happening is we had waves of both individuals providing mass +1's and -1's. Neither were quality. When we (the docs team) were looking at who to offer core responsibilities to, we used to look specifically at the amount of +1's and -1's. At that point in time, a -1 was seen to be a review of "higher quality" because it meant somehow that the individual's critical review, automatically meant "critical thinking". All this did was create a negative review culture ("a -1 is better than a +1"). I know I used to be highly conscious of how many +1's I offered before I became core, I would often avoid providing a +1 even if the patch was fine so my stats would not be messed up. Which is pretty messed up. We have the review document in the project team guide, but what are we doing to socialise this? This thread currently is divulging into personal opinions and experiences (myself included here) but it looks like we need to work on some actions. A few questions: 1. Who knew about the "How to Review Changes the OpenStack Way" document before this thread? * If you did, how did you find the content? Did you find it easy to understand and follow? * If you did not, would you have found this document helpful when you first started reviewing? 2. PTLs - do you share this document with new contributors that are reappearing? 3. Is this socialised in the Upstream Institute meeting? (@diablo) I don't expect a flood of answers to these questions, but it's important that these are on our mind. Cheers, Alex IRC: asettle Twitter: @dewsday [0]: https://docs.openstack.org/project-team-guide/review-the-openstack-way.html On 22/05/2019 11:02, Natal Ngétal wrote: On Tue, May 21, 2019 at 3:33 PM Brian Rosmaita wrote: A recent spate of +1 reviews with no comments on patches has put me into grumpy-old-man mode. What do you mean by this? I make several code review and each time I add +1 that will mean I have made a code review. A +1 with no comments is completely useless (unless you have a review on a previous patch set with comments that have been addressed by the author). I already know you're a smart person (you figured out how to sign the CLA and navigate gerrit -- lots of people can't or won't do that), but all your non-comment +1 tells me is that you are in favor of the patch. That doesn't give me any information, because I already know that the author is in favor of the patch, so that makes two of you out of about 1,168 reviewers. That's not exactly a groundswell of support. I don't make code review to improve my statics, however sometimes I add a +1 and I have don't saw an error and another saw an error. That doesn't means I have don't read the code, sometimes during a code review we can also make a fail. This why it's good to have many different person make the code review on a patch. When you post your +1, please leave a comment explaining why you approve, or at least what in particular you looked at in the patch that gave you a favorable impression. 
This whole open source community thing is a collaborative effort, so please collaborate! You comment does not have to be profound. Even just saying that you checked that the release note or docs on the patch rendered correctly in HTML is very helpful. For me a +1 without comment is ok. The +1 is implicit that mean looks good to merge, but must be review by another person. Impose to add a comment for each review, it's for me a nonsense and a bad idea. Which type of comment do you will? I mean if it's only to add in the comment, looks good to merge, that change nothing. The same thing goes for leaving a -1 on a patch. Don't just drop a -1 bomb with no explanation. The kind of review that will put you on track for becoming core in a project is what johnthetubaguy calls a "thoughtful -1", that is, a negative review that clearly explains what the problem is and points the author in a good direction to fix it. I totally agree a -1 must be coming with a comment, it's a nonsense to have a -1 without explanation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Wed May 22 13:30:55 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 22 May 2019 21:30:55 +0800 Subject: [vitrage][ptl][tc] stepping down as Vitrage PTL In-Reply-To: References: Message-ID: Thanks Ifat, You have done a great job with Vitrage, and I wish you all the best in the new role. And Congrats Eyal, looking forward to working with you for cross-project works:) On Wed, May 22, 2019 at 1:43 PM Ifat Afek wrote: > Hi, > > > As I have taken on a new role in my company, I will not be able to > continue serving as the Vitrage PTL. > > I’ve been the PTL of Vitrage from the day it started (back then in > Mitaka), and it has been a real pleasure for me. Helping a project grow > from an idea and a set of diagrams to a production-grade service was an > amazing experience. I am honored to have worked with excellent developers, > both Vitrage core contributors and other community members. I learned a > lot, and also managed to have fun along the way :-) > > I would like to take this opportunity to thank everyone who contributed to > the success of Vitrage – either by writing code, suggesting new use cases, > participating in our discussions, or helping out when our gate was broken. > > > Eyal Bar-Ilan (irc: eyalb), a Vitrage core contributor who was part of the > team from the very beginning, will be replacing me as the new Vitrage PTL > [1]. I’m sure he will make an excellent PTL, as someone who knows every > piece of the code and is tightly connected to the community. I will still > be around to help if needed. > > I wish Eyal lots of luck in his new role! > > > Ifat > > > [1] https://review.opendev.org/#/c/660563/ > > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.rydberg at citynetwork.eu Wed May 22 13:56:36 2019 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Wed, 22 May 2019 15:56:36 +0200 Subject: [sigs][publiccloud][publiccloud-wg] Reminder meeting tomorrow afternoon for Public Cloud WG/SIG Message-ID: <8997dd7a-935e-6e5f-9df0-cdfb9f30e5b7@citynetwork.eu> Hi all, This is a reminder for tomorrows meeting for the Public Cloud WG/SIG - 1400 UTC in #openstack-publiccloud. The main focus for the meeting will be continues discussions regarding the billing initiative. 
More information about that at https://etherpad.openstack.org/p/publiccloud-sig-billing-implementation-proposal Agenda at: https://etherpad.openstack.org/p/publiccloud-wg See you all tomorrow! Cheers, Tobias -- Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED From i at liuyulong.me Wed May 22 03:26:16 2019 From: i at liuyulong.me (=?utf-8?B?TElVIFl1bG9uZw==?=) Date: Wed, 22 May 2019 11:26:16 +0800 Subject: [openstack-dev] [Neutron] Train PTG Summary In-Reply-To: References: Message-ID: Thanks Miguel for the detail summary. ------------------ Original ------------------ From: "Miguel Lavalle"; Date: Mon, May 20, 2019 05:15 AM To: "openstack-discuss"; Subject: [openstack-dev] [Neutron] Train PTG Summary Dear Neutron team, Thank you very much for your hard work during the PTG in Denver. Even though it took place at the end of a very long week, we had a very productive meeting and we planned and prioritized a lot of work to be done during the cycle. Following below is a high level summary of the discussions we had. If there is something I left out, please reply to this email thread to add it. However, if you want to continue the discussion on any of the individual points summarized below, please start a new thread, so we don't have a lot of conversations going on attached to this update. You can find the etherpad we used during the PTG meetings here: https://etherpad.openstack.org/p/openstack-networking-train-ptg Retrospective ========== * The team considered positive points during the Stein cycle the following: - Implemented and merged all the targeted blueprints. - Minted several new core team members through a mentoring program. The new core reviewers are Nate Johnston, Hongbin Lu, Liu Yulong, Bernard Caffarelli (stable branches) and Ryan Tidwell (neutron-dynamic-routing) - Very good cross project cooperation with Nova (https://blueprints.launchpad.net/neutron/+spec/strict-minimum-bandwidth-support) and StarlingX (https://blueprints.launchpad.net/neutron/+spec/network-segment-range-management) - The team got caught up with all the community goals - Added non-voting jobs from the Stadium projects, enabling the team to catch potential breakage due to changes in Neutron - Successfully forked the Ryu SDN framework, which is used by Neutron for Openflow programming. The original developer is not supporting the framework anymore, so the Neutron team forked it as os-ken (https://opendev.org/openstack/os-ken) and adopted it seamlessly in the code * The team considered the following as improvement areas: - At the end of the cycle, we experienced failures in the gate that impacted the speed at which code was merged. Measures to solve this problem were later discussed in the "CI stability" section below - The team didn't make much progress adopting Storyboard. Given comments of lack of functionality from other projects, a decision was made to evaluate progress by other teams before moving ahead with Storyboard - Lost almost all the key contributors in the following Stadium projects: https://opendev.org/openstack/neutron-fwaas and https://opendev.org/openstack/neutron-vpnaas. Miguel Lavalle will talk to the remaining contributors to asses how to move forward - Not too much concrete progress was achieved by the performance and scalability sub-team. 
Please see the "Neutron performance and scaling up" section below for next steps - Engine facade adoption didn't make much progress due to the loss of all the members of the sub-team working on it. Miguel Lavalle will lead this effort during Train. Nate Johnston and Rodolfo Alonso volunteered to help. The approach will be to break up this patch into smaller, more easily implementable and reviewable chunks: https://review.opendev.org/#/c/545501/ Support more than one segment per network per host ======================================== The basic value proposition of routed networks is to allow deployers to offer their users a single "big virtual network" without the performance limitations of large L2 broadcast domains. This value proposition is currently limited by the fact that Neutron allows only one segment per network per host: https://github.com/openstack/neutron/blob/77fa7114f9ff67d43a1150b52001883fafb7f6c8/neutron/objects/subnet.py#L319-L328. As a consequence, as demand of IP addresses exceeds the limits of a reasonably sized subnets (/22 subnets is a consensus on the upper limit), it becomes necessary to allow hosts to be connected to more than one segment in a routed network. David Bingham and Kris Lindgren (GoDaddy) have been working on PoC code to implement this (https://review.opendev.org/#/c/623115). This code has helped to uncover some key challenges: * Change all code that assumes a 1-1 relationship between network and segment per host into a 1-many relationship. * Generate IDs based on segment_id rather than network_id to be used in naming software bridges associated with the network segments. * Ensure new 1-many relationship (network -> segment) can be supported by ML2 drivers implementors. * Provide migration paths for current deployments of routed networks. The agreements made were the following: * We will write a spec reflecting the learnings of the PoC * The spec will target all the supported ML2 backends, not only some of them * We will modify and update ML2 interfaces to support the association of software bridges with segments, striving to provide backwards compatibility * We will try to provide an automatic migration option that only requires re-starting the agents. If that proves not to be possible, a set of migration scripts and detailed instructions will be created The first draft of the spec is already up for review: https://review.opendev.org/#/c/657170/ Neutron CI stability ============== At the end of the Stein cycle the project experienced a significant impact due to CI instability. This situation has improved recently but there is still gains to be achieved. The team discussed to major areas of improvement: make sure we don't have more tests that are necessary (simplification of jobs) and fix recurring problems. - To help the conversation on simplification of jobs, Slawek Kaplonski shared this matrix showing what currently is being tested: https://ethercalc.openstack.org/neutron-ci-jobs * One approach is reducing the services Neutron is tested with in integrated-gate jobs (tempest-full), which will reduce the number of failures not related to Neutron. Slawek Kaplonski represented Neutron in the QA PTG session where this approach was discussed. 
The proposal being socialized in the mailing list (http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005871.html) involves: # Run only dependent service tests on project gate # Tempest gate will keep running all the services tests as the integrated gate at a centralized place without any change in the current job # Each project can run a simplified integrated gate job template tailored to its needs # All the simplified integrated gate job templates will be defined and maintained by QA team # For Neutron there will be an "Integrated-gate-networking". Tests to run in this template: Neutron APIs, Nova APIs, Keystone APIs. All scenarios currently running in tempest-full in the same way (means non-slow and in serial). The improvement will be to exclude the Cinder API tests, Glance API tests and Swift API tests * Another idea discussed was removing single node jobs that are very similar to multinode jobs # One possibility is consolidating grenade jobs. There is a proposal being socialized in the mailing list: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006146.html # Other consolidation of single node - multinode jobs will require stabilizing the corresponding multinode job - One common source of problems is ssh failures in various scenario tests * Several team members are working on different aspects of this issue * Slawek Kaplonski is investigating authentication failures. As of the writing of this summary, it has been determined that there is a slowdown in the metadata service, either on the Neutron or the Nova side. Further investigation is going on * Miguel Lavalle is adding tcpdump commands to router namespaces to investigate data path disruptions networking-omnipath ================ networking-omnipath (https://opendev.org/x/networking-omnipath) is an ML2 mechanism driver that integrates the OpenStack Neutron API with an Omnipath backend. It enables Omnipath switching fabric in an OpenStack cloud and each network in the OpenStack networking realm corresponds to a virtual fabric on the Omnipath side. - Manjeet Singh Bhatia proposed to make networking-omnipath a Neutron Stadium project - The agreement was that Miguel Lavalle and Manjeet will work together in determining whether networking-omnipath meets the Stadium project criteria, as outlined here: https://docs.openstack.org/neutron/latest/contributor/stadium/governance.html#when-is-a-project-considered-part-of-the-stadium - In case the criteria are not met, a remediation plan will be defined Cross networking project topics ======================= - Cross networking project topics * Neutron is the only project not using WSGI * We have to make it the default option in DevStack, although this will require some investigation * We already have a check queue non-voting job for WSGI. It is failing constantly, although the failures are all due to a single test case (neutron_add_remove_fixed_ip). Miguel Lavalle will investigate and fix it * Target is to adopt WSGI as the default by Train-2 - Adoption of neutron-classifier (https://opendev.org/x/neutron-classifier) * David Shaughnessy has two PoC patches that demonstrate the adoption of neutron-classifier into Neutron's QoS. David will continue refining these patches and will bring them up for discussion in the QoS sub-team meeting on May 14th * It was also agreed to start the process of adding neutron-classifier to the Neutron Stadium.
David Shaughnessy and Miguel Lavalle will work on this per the criteria defined in https://docs.openstack.org/neutron/latest/contributor/stadium/governance.html#when-is-a-project-considered-part-of-the-stadium - DHCP agent configured with mismatching domain and host entries * Since the merge of https://review.opendev.org/#/c/571546/, we have a confusion about what exactly the dns_domain field of a network is for. Historically, dns_domain for use with external DNS integration in the form of designate, but that delineation has become muddied with the previously mentioned change. * Miguel Lavalle will go back to the original spec of DNS integration and make a decision as to how to move forward - Neutron Events for smartNIC-baremetal use-case * In smartNIC baremetal usecase, Ironic need to know when agent is/is-not alive (since the neutron agent is running on the smartNIC) and when a port becomes up/down * General agreement was to leverage the existing notifier mechanism to emit this information for Ironic to consume (requires implementation of an API endpoint in Ironic). It was also agreed that a spec will be created * The notifications emitted can be leveraged by Ironic for other use-cases. In fact, in a lunch with Ironic team members (Julia Kreger, Devananda van der Veen and Harald Jensås), it was agreed to use use it also for the port bind/unbind completed notification. Neutron performance and scaling up =========================== - Recently, a performance and scalability sub-team (http://eavesdrop.openstack.org/#Neutron_Performance_sub-team_Meeting) has been formed to explore ways to improve performance overall - One of the activities of this sub-team has been adding osprofiler to the Neutron Rally job (https://review.opendev.org/#/c/615350). Sample result reports can be seen here: http://logs.openstack.org/50/615350/38/check/neutron-rally-task/0a4b791/results/report.html.gz#/NeutronNetworks.create_and_delete_ports/output and http://logs.openstack.org/50/615350/38/check/neutron-rally-task/0a4b791/results/report.html.gz#/NeutronNetworks.create_and_delete_subnets/output - Reports analysis indicate that port creation takes on average in the order of 9 seconds, even without assigning IP addresses to it and without binding it. The team decided to concentrate its efforts in improving the entire port creation - binding - wiring cycle. One step necessary for this is the addition of a Rally scenario, which Bence Romsics volunteered to develop. - Another area of activity has been EnOS (https://github.com/BeyondTheClouds/enos ), which is a framework that deploys OpenStack (using Kolla Ansible) and then runs Rally based performance experiments on that deployment (https://enos.readthedocs.io/en/stable/benchmarks/index.html) * The deployment can take place on VMs (Vagrant provider) or in large clouds such as Grid5000 testbed: https://www.grid5000.fr/w/Grid5000:Home * The Neutron performance sub-team and the EnOS developers are cooperating to define a performance experiment at scale * To that end, Miguel Lavalle has built a "big PC" with an AMD Threadripper 2950x processor (16 cores, 32 threads) and 64 GB of memory. This machine will be used to experiment with deployments in VMs to refine the scenarios to be tested, with the additional benefit that the Rally results will not be affected by variability in the OpenStack CI infrastructure. 
* Once the Neutron and EnOS team reach an agreement on the scenarios to be tested, an experiment will be run Grid5000 * The EnOS team delivered on May 6th the version that supports the Stein release - Miguel Lavalle will create a wiki page to record a performance baseline and track subsequent progress DVR Enhancements =============== - Supporting allowed_address_pairs for DVR is a longstanding issue for DVR: https://bugs.launchpad.net/neutron/+bug/1774459. There are to patches up for review to address this issue: * https://review.opendev.org/#/c/616272/ * https://review.opendev.org/#/c/651905/ - The team discussed the current state of DVR functionality and identified the following missing features that would be beneficial for operators: * Distributed ingress/egress for IPv6. Distributed ingress/egress (AKA "fast-exit") would be implemented for IPv6. This would bypass the centralized router in a network node * Support for smart-NIC offloads. This involves pushing all DVR forwarding policy into OVS and implementing it via OpenFlow * Distributed DHCP. Rather than having DHCP for a given network be answered centrally, OpenFlow rules will be programmed into OVS to provide static, locally generated responses to DHCP requests received on br-int * Distributed SNAT. This involves allowing SNAT to happen directly on the compute node instead of centralizing it on a network node. * There was agreement that these features are needed and Ryan Tidwell agreed to develop a spec as the next step. The spec is already up for review: https://review.opendev.org/#/c/658414 - networking-ovn team members pointed out that some of the above features are already implemented in their Stadium project. This led to the discussion of why duplicate efforts implementing the same features and instead explore the possibility of a convergence between the ML2 / agents based reference implementation and the ML2 / OVN implementation. * This discussion is particularly relevant in the context where the OpenStack community is rationalizing its size and contributors are scarcer * Such a convergence would most likely play out over several development cycles * The team agreed to explore how to achieve this convergence. To move forward, we will need visibility and certainty that the following is feasible: # Feature parity with what the reference implementation offers today # Ability to support all the backends in the current reference implementation # Offer verifiable substantial gains in performance and scalability compared to the current reference implementation # Broaden the community of developers contributing to the ML2 / OVN implementation * To move ahead in the exploration of this convergence, three actions were agreed: # Benchmarking of the two implementations will be carried out with EnOS, as part of the performance and scaling up activities described above # Write the necessary specs to address feature parity, support all the backends in the current reference implementation and provide migration paths # An item will be added to the weekly Neutron meeting to track progress # Make every decision along this exploration process with approval of the broader community Policy topics / API ============== - Keystone has a now a system scope. A system-scoped token implies the user has authorization to act on the deployment system. 
Neutron - Nova cross project planning
=============================
This session was summarized in the following messages to the mailing list:
- http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005844.html summarizes the following topics
  * Optional NUMA affinity for neutron ports
  * Track neutron ports in placement
  * os-vif to be extended to contain new fields to record the connectivity type and the ml2 driver that bound the vif
  * Boot VMs with unaddressed port
- Leaking resources when ports are deleted out-of-band is summarized in this thread: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005837.html
- Melanie Witt asked if Neutron would support transferring ownership of its resources. The answer was yes, and as a next step she is going to send a message to the mailing list to define the next steps

Code streamlining proposals
======================
- Streamlining the IPAM flow. As a result of the bulk port creation work done in Stein by Nate Johnston, it is clear that there are opportunities to improve the IPAM code. The team brainstormed several possible approaches and the following proposals were put forward:
  * When allocating blocks of IP addresses where the strategy is 'next ip', ensure it happens as a single SQL insert
  * Create bulk versions of allocate_ip_from_port_and_store etc. so that bulk can be preserved when pushed down to the IPAM driver, to take advantage of the previous item
  * Add profiling code to the IPAM call so that we can log the time duration of IPAM execution, as a PoC (see the sketch after this section)
- Streamlining the use of remote groups in security groups. Nate Johnston pointed out that there is a performance hit when using security groups that are keyed to a remote_group_id, because when a port is added to a remote group it triggers security group rule updates for all of the members of the security group. On deployments with 150+ ports, this can take up to 5 minutes to bring up the port
  * After discussion, the proposed next step is for Nate Johnston to create a PoC for a new approach where a nested security group creates a new iptables table/OVS flow table (let's call it a subtable) that can be used as an abstraction for the nested group relationship. Then the IP addresses of the primary security group will jump to the new table, and the new table can represent the contents of the remote security group
    # In a case where a primary security group has 170 members and lists itself as a remote security group (indicating members can all talk amongst themselves), adding an instance to the security group causes 171 updates, since each member needs the address of the new instance and a record needs to be created for the new one
    # With the proposed approach there would only be 2 updates: creating an entry for the new instance to jump to the subtable representing the remote security group, and adding an entry to the subtable
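A minimal sketch of the IPAM profiling PoC idea, assuming a simple decorator around whatever IPAM entry point ends up being measured (the decorator name and hook point are assumptions, not something decided at the PTG):

    import functools
    import time

    from oslo_log import log as logging

    LOG = logging.getLogger(__name__)

    def log_ipam_duration(func):
        """Log how long a single IPAM call takes (PoC only)."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                return func(*args, **kwargs)
            finally:
                LOG.debug("IPAM call %s took %.3f seconds",
                          func.__name__, time.monotonic() - start)
        return wrapper

    # Hypothetical usage: decorate the IPAM method being measured, for
    # example an allocate_ip_from_port_and_store-style entry point.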
Train community goals
=================
The two community goals accepted for Train are:
- PDF doc generation for project docs: https://review.opendev.org/#/c/647712/
  * Akihiro Motoki will track this goal
- IPv6 support and testing goal: https://review.opendev.org/#/c/653545/
  * Good blog entry on overcoming metadata service shortcomings in this scenario: https://superuser.openstack.org/articles/deploying-ipv6-only-tenants-with-openstack/

neutron-lib topics
=============
- To help expedite the merging of neutron-lib consumption patches, it was proposed to the team that neutron-lib-current projects must get their dependencies for devstack based testing jobs from source, instead of PyPI.
  * For an example of an incident motivating this proposal, please see: https://bugs.launchpad.net/tricircle/+bug/1816644
  * This refers to inter-project dependencies, for example networking-l2gw depending on networking-sfc. It does not apply to *-lib projects; those will still be based on PyPI releases
  * The team agreed to this proposal
  * When creating a new stable branch, the Zuul config would need to be updated to point to the stable releases of the other projects it depends on. This may include a periodic job that involves testing master and stable branches against PyPI packages
  * Boden Russel will make a list of what jobs need to be updated in projects that consume neutron-lib (superset of the Stadium)
- Boden reminded the team we have a work items list for neutron-lib: https://etherpad.openstack.org/p/neutron-lib-volunteers-and-punch-list

Requests for enhancement
=====================
- Improve the extraroute API
  * The current extraroute API does not allow atomic additions/deletions of particular routing table entries. In the current API the routes attribute of a router (containing all routing table entries) must be updated at once, leading to race conditions on the client side (the sketch after this section illustrates the problem)
  * The team debated several alternatives: an API extension that makes a router's extra routes first-level resources, solving the concurrency issue through a "compare and swap" approach, seeking input from the API working group, or providing better guidelines for the use of the current API
  * The decision was made to move ahead with a spec proposing extra routes as first-level API resources. That spec can be found here: https://review.opendev.org/#/c/655680
- Decouple the placement reporting service plugin from ML2
  * The placement reporter service plugin as merged in Stein depends on ML2. The improvement idea is to decouple it via a driver pattern, as in the QoS service plugin
  * We currently don't have a use case for this decoupling. As a consequence, it was decided to postpone it
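For context on the extraroute race mentioned above, this is roughly the whole-list read-modify-write that clients have to do today (a sketch only; the cloud name and addresses are placeholders). Two clients doing this concurrently can silently overwrite each other's routes:

    import openstack

    conn = openstack.connect(cloud='example-cloud')  # placeholder cloud name
    router = conn.network.find_router('router1')

    # Read the full routes list, append one entry, write the full list back.
    # If another client updated the router between the read and the write,
    # its change is lost, hence the desire for atomic add/remove operations.
    new_routes = list(router.routes or []) + [
        {'destination': '10.0.3.0/24', 'nexthop': '192.0.2.10'},
    ]
    conn.network.update_router(router, routes=new_routes)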
Various topics
==========
- Migration of Stadium projects CI jobs to Python 3
  * We have an etherpad recording the work items: https://etherpad.openstack.org/p/neutron_stadium_python3_status
  * Lajos Katona will take care of networking-odl
  * Miguel Lavalle will talk to Takashi Yamamoto about networking-midonet
  * Nate Johnston will continue working on networking-bagpipe and neutron-fwaas patches
  * A list of projects beyond the Stadium will be collected as part of the effort for neutron-lib to start pulling requirements from source
- Removal of deprecated "of_interface" option
  * The option was deprecated in Pike
  * In some cases, deployers might experience a few seconds of data plane down time when the OVS agent is restarted without the option
  * A message was sent to the ML warning of this possible effect: http://lists.openstack.org/pipermail/openstack-dev/2018-September/134938.html. There has been no reaction from the community
  * We will move ahead with the removal of the option. The patch is here: https://review.opendev.org/#/c/599496

Status and health of some Stadium and non-Stadium projects
==============================================
- Some projects have experienced the loss of their development team:
  * networking-odl. In this case, Ericsson is interested in continuing to maintain the project. The key contact is Lajos Katona
  * networking-l2gw is also of interest to Ericsson (Lajos Katona). Over the past few cycles the project has been maintained by Ricardo Noriega of Red Hat. Miguel Lavalle will organize a meeting with Lajos and Ricardo to decide how to move ahead with this project
  * neutron-fwaas. In this case, Miguel Lavalle will send a message to the mailing list describing the status of the project and requesting parties interested in continuing to maintain the project
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rony.khan at novotel-bd.com Wed May 22 06:37:39 2019
From: rony.khan at novotel-bd.com (Md. Farhad Hasan Khan)
Date: Wed, 22 May 2019 12:37:39 +0600
Subject: neutron network namespaces not create
In-Reply-To: <08D0BBB7-C082-4953-9AFC-B06F13880C44@redhat.com>
References: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC74@Email.novotel-bd.com> <08D0BBB7-C082-4953-9AFC-B06F13880C44@redhat.com>
Message-ID: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC9C@Email.novotel-bd.com>

Hi,
We are using L3 agent HA. We are creating router in a project from horizon. We didn’t get any error in neutron -server log. Only getting log in l3-agent.log.

[root at controller1 neutron]# cat /var/log/neutron/l3-agent.log |grep 2b39246b-5cd5-461b-babb-f37aae38c25f
2019-05-22 11:16:29.041 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist.
2019-05-22 11:16:29.111 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:29.111 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:29.112 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:32.223 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:32.280 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:32.280 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:32.280 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:35.035 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:35.103 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:35.103 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:35.104 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:37.590 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:37.656 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:37.656 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:37.657 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:43.952 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 
2019-05-22 11:16:44.022 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:44.022 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:44.023 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:46.491 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:46.550 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:46.550 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:46.551 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:54.852 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:54.912 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:54.912 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:54.912 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:01.427 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:17:01.494 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:17:01.494 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:17:01.495 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:03.975 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 
2019-05-22 11:17:04.037 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:17:04.037 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:17:04.037 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:06.665 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:17:06.737 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:17:06.737 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:17:06.738 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:09.306 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:17:09.363 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:17:09.363 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:17:09.364 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:09.364 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for 2b39246b-5cd5-461b-babb-f37aae38c25f, action None: BridgeDoesNExist: Bridge br-int does not exist. 2019-05-22 11:17:09.365 52248 WARNING neutron.agent.l3.agent [-] Info for router 2b39246b-5cd5-461b-babb-f37aae38c25f was not found. Performing router cleanup: BridgeesNotExist: Bridge br-int does not exist. 2019-05-22 11:17:15.145 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:17:15.203 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:17:15.203 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:17:15.204 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:15.205 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for 2b39246b-5cd5-461b-babb-f37aae38c25f, action None: BridgeDoesNExist: Bridge br-int does not exist. 
2019-05-22 11:17:15.205 52248 WARNING neutron.agent.l3.agent [-] Info for router 2b39246b-5cd5-461b-babb-f37aae38c25f was not found. Performing router cleanup: BridgeesNotExist: Bridge br-int does not exist. [root at controller1 neutron]# #cat /var/log/neutron/l3-agent.log |grep 2b39246b-5cd5-461b-babb-f37aae38c25f [root at controller1 neutron]# tailf /var/log/neutron/l3-agent.log 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exist 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent 2019-05-22 12:04:29.356 52248 WARNING neutron.agent.l3.agent [-] Info for router 2cb935c7-a007-4fcc-82ce-7f2a6108b12d was not found. Performing router cleanup 2019-05-22 12:04:33.125 52248 WARNING neutron.agent.l3.agent [-] Info for router 2cb935c7-a007-4fcc-82ce-7f2a6108b12d was not found. Performing router cleanup 2019-05-22 12:24:26.582 52248 WARNING neutron.agent.l3.agent [-] Info for router b9ecd487-336a-4c5f-94c1-2b4fe7079a93 was not found. Performing router cleanup ^Z [2]+ Stopped tailf /var/log/neutron/l3-agent.log -----Original Message----- From: Slawomir Kaplonski [mailto:skaplons at redhat.com] Sent: Wednesday, May 22, 2019 2:13 AM To: Md. Farhad Hasan Khan Cc: openstack-discuss at lists.openstack.org Subject: Re: neutron network namespaces not create Hi, Error which You pasted looks that is related to moment when router is deleted. Do You have any errors in neutron-server and/or neutron-l3-agent logs during creation of the router? Can You check with command “neutron l3-agent-list-hosting-router ” by which l3 agent router should be hosted? What kind of router Your are creating? Is it HA, DVR, DVR-HA router? Or maybe Legacy? > On 21 May 2019, at 16:51, Md. Farhad Hasan Khan wrote: > > Hi, > I can create router from horizon. But network namespaces not created. I check with # ip netns list command. Not found router ID, but showing in horizon. > > Here is some log I get from neutron: > > > #cat /var/log/neutron/l3-agent.log > > > 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:35:44.711 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for cad85ce0-6624-42ff-b42b-09480aea2613, action None: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:35:44.711 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:08.029 52248 ERROR 
neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:12.640 52248 ERROR 
neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:13.024 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup > 2019-05-21 17:37:14.358 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup > > > Thanks & B’Rgds, > Rony — Slawek Kaplonski Senior software engineer Red Hat From rony.khan at brilliant.com.bd Wed May 22 06:45:09 2019 From: rony.khan at brilliant.com.bd (Md. Farhad Hasan Khan) Date: Wed, 22 May 2019 12:45:09 +0600 Subject: neutron network namespaces not create References: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC74@Email.novotel-bd.com> <08D0BBB7-C082-4953-9AFC-B06F13880C44@redhat.com> Message-ID: <01f501d51069$e5fbc2b0$b1f34810$@brilliant.com.bd> Hi, We are using L3 agent HA. We are creating router in a project from horizon. We didn’t get any error in neutron -server log. Only getting log in l3-agent.log. >From horizon we can see router interfaces HA port status down [root at controller1 neutron]# cat /var/log/neutron/l3-agent.log |grep 2b39246b-5cd5-461b-babb-f37aae38c25f 2019-05-22 11:16:29.041 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:29.111 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:29.111 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:29.112 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:32.223 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:32.280 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:32.280 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:32.280 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:35.035 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:35.103 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:35.103 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:35.104 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 
2019-05-22 11:16:37.590 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:37.656 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:37.656 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:37.657 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:43.952 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:44.022 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:44.022 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:44.023 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:46.491 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:46.550 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:46.550 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:46.551 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:16:54.852 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:16:54.912 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:16:54.912 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:16:54.912 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:01.427 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 
2019-05-22 11:17:01.494 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:17:01.494 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:17:01.495 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:03.975 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:17:04.037 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:17:04.037 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:17:04.037 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:06.665 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:17:06.737 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:17:06.737 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:17:06.738 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:09.306 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 2019-05-22 11:17:09.363 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' 2019-05-22 11:17:09.363 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' 2019-05-22 11:17:09.364 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. 2019-05-22 11:17:09.364 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for 2b39246b-5cd5-461b-babb-f37aae38c25f, action None: BridgeDoesNExist: Bridge br-int does not exist. 2019-05-22 11:17:09.365 52248 WARNING neutron.agent.l3.agent [-] Info for router 2b39246b-5cd5-461b-babb-f37aae38c25f was not found. Performing router cleanup: BridgeesNotExist: Bridge br-int does not exist. 2019-05-22 11:17:15.145 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 
2019-05-22 11:17:15.203 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f'
2019-05-22 11:17:15.203 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2'
2019-05-22 11:17:15.204 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist.
2019-05-22 11:17:15.205 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for 2b39246b-5cd5-461b-babb-f37aae38c25f, action None: BridgeDoesNExist: Bridge br-int does not exist.
2019-05-22 11:17:15.205 52248 WARNING neutron.agent.l3.agent [-] Info for router 2b39246b-5cd5-461b-babb-f37aae38c25f was not found. Performing router cleanup: BridgeesNotExist: Bridge br-int does not exist.
[root at controller1 neutron]# #cat /var/log/neutron/l3-agent.log |grep 2b39246b-5cd5-461b-babb-f37aae38c25f
[root at controller1 neutron]# tailf /var/log/neutron/l3-agent.log
2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug
2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu)
2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new
2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge)
2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exist
2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge)
2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist.
2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent
2019-05-22 12:04:29.356 52248 WARNING neutron.agent.l3.agent [-] Info for router 2cb935c7-a007-4fcc-82ce-7f2a6108b12d was not found. Performing router cleanup
2019-05-22 12:04:33.125 52248 WARNING neutron.agent.l3.agent [-] Info for router 2cb935c7-a007-4fcc-82ce-7f2a6108b12d was not found. Performing router cleanup
2019-05-22 12:24:26.582 52248 WARNING neutron.agent.l3.agent [-] Info for router b9ecd487-336a-4c5f-94c1-2b4fe7079a93 was not found. Performing router cleanup
^Z
[2]+ Stopped tailf /var/log/neutron/l3-agent.log

-----Original Message-----
From: Slawomir Kaplonski [mailto:skaplons at redhat.com]
Sent: Wednesday, May 22, 2019 2:13 AM
To: Md. Farhad Hasan Khan
Cc: openstack-discuss at lists.openstack.org
Subject: Re: neutron network namespaces not create

Hi,

The error you pasted looks like it is related to the moment when the router is deleted.
Do you have any errors in the neutron-server and/or neutron-l3-agent logs during creation of the router?
Can you check with the command “neutron l3-agent-list-hosting-router ” which L3 agent the router should be hosted by?
What kind of router are you creating? Is it an HA, DVR, or DVR-HA router? Or maybe legacy?

> On 21 May 2019, at 16:51, Md. Farhad Hasan Khan wrote:
>
> Hi,
> I can create router from horizon. But network namespaces not created. I check with # ip netns list command. Not found router ID, but showing in horizon.
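A minimal sketch of the checks asked about above, assuming the standard neutron/openstack CLI clients and taking the router ID from the quoted May 21 logs (substitute the ID of the router you just created):

# which L3 agent is (or should be) hosting the router
neutron l3-agent-list-hosting-router cad85ce0-6624-42ff-b42b-09480aea2613
# router type: the "ha" and "distributed" fields are visible to admin users
openstack router show cad85ce0-6624-42ff-b42b-09480aea2613 -c ha -c distributed -c status
# the qrouter-<router_id> namespace should appear on the hosting node
ip netns | grep cad85ce0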
> > Here is some log I get from neutron: > > > #cat /var/log/neutron/l3-agent.log > > > 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:35:44.711 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for cad85ce0-6624-42ff-b42b-09480aea2613, action None: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:35:44.711 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:08.029 52248 ERROR 
neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.delete() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent > 2019-05-21 17:37:12.640 52248 ERROR 
neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent router_id) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.force_reraise() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.ha_network_added() > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent
> 2019-05-21 17:37:13.024 52248 WARNING neutron.agent.l3.agent [-] Info
> for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found.
> Performing router cleanup
> 2019-05-21 17:37:14.358 52248 WARNING neutron.agent.l3.agent [-] Info
> for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found.
> Performing router cleanup
>
>
> Thanks & B’Rgds,
> Rony

—
Slawek Kaplonski
Senior software engineer
Red Hat

From skaplons at redhat.com Wed May 22 07:00:26 2019
From: skaplons at redhat.com (Slawomir Kaplonski)
Date: Wed, 22 May 2019 09:00:26 +0200
Subject: neutron network namespaces not create
In-Reply-To: <01f501d51069$e5fbc2b0$b1f34810$@brilliant.com.bd>
References: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC74@Email.novotel-bd.com> <08D0BBB7-C082-4953-9AFC-B06F13880C44@redhat.com> <01f501d51069$e5fbc2b0$b1f34810$@brilliant.com.bd>
Message-ID: 

Hi,

In line with Michael's email, can you also check whether neutron-openvswitch-agent is running on your node? br-int should be created on the node when neutron-ovs-agent starts; see [1]. Or, if you are using some solution other than ML2/OVS, maybe you should change the interface driver in neutron-l3-agent's config to something other than "openvswitch"?

[1] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L1074

> On 22 May 2019, at 08:45, Md. Farhad Hasan Khan wrote:
>
> Hi,
> We are using L3 agent HA. We are creating router in a project from horizon. We didn’t get any error in neutron -server log. Only getting log in l3-agent.log.
> From horizon we can see router interfaces HA port status down
>
>
> [root at controller1 neutron]# cat /var/log/neutron/l3-agent.log |grep 2b39246b-5cd5-461b-babb-f37aae38c25f
> 2019-05-22 11:16:29.041 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist.
> 2019-05-22 11:16:29.111 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f'
> 2019-05-22 11:16:29.111 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2'
> 2019-05-22 11:16:29.112 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist.
> 2019-05-22 11:16:32.223 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist.
> 2019-05-22 11:16:32.280 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f'
> 2019-05-22 11:16:32.280 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2'
> 2019-05-22 11:16:32.280 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist.
> 2019-05-22 11:16:35.035 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist.
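A minimal sketch of those checks on the node that should host the router; the service name and config path are assumptions for a typical RDO/ML2-OVS install and may differ on other distributions:

# is the Open vSwitch agent running and reported alive?
systemctl status neutron-openvswitch-agent
openstack network agent list | grep -i "open vswitch"
# br-int is created by the OVS agent at startup; exit code 0 means the bridge exists, 2 means it does not
ovs-vsctl br-exists br-int; echo $?
# for ML2/OVS the L3 agent should use the openvswitch interface driver
grep interface_driver /etc/neutron/l3_agent.ini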
> 2019-05-22 11:16:35.103 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' > 2019-05-22 11:16:35.103 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' > 2019-05-22 11:16:35.104 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. > 2019-05-22 11:16:37.590 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. > 2019-05-22 11:16:37.656 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' > 2019-05-22 11:16:37.656 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' > 2019-05-22 11:16:37.657 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. > 2019-05-22 11:16:43.952 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. > 2019-05-22 11:16:44.022 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' > 2019-05-22 11:16:44.022 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' > 2019-05-22 11:16:44.023 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. > 2019-05-22 11:16:46.491 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. > 2019-05-22 11:16:46.550 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' > 2019-05-22 11:16:46.550 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' > 2019-05-22 11:16:46.551 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. > 2019-05-22 11:16:54.852 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 
> 2019-05-22 11:16:54.912 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' > 2019-05-22 11:16:54.912 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' > 2019-05-22 11:16:54.912 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. > 2019-05-22 11:17:01.427 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. > 2019-05-22 11:17:01.494 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' > 2019-05-22 11:17:01.494 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' > 2019-05-22 11:17:01.495 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. > 2019-05-22 11:17:03.975 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. > 2019-05-22 11:17:04.037 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' > 2019-05-22 11:17:04.037 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' > 2019-05-22 11:17:04.037 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. > 2019-05-22 11:17:06.665 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. > 2019-05-22 11:17:06.737 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' > 2019-05-22 11:17:06.737 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' > 2019-05-22 11:17:06.738 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. > 2019-05-22 11:17:09.306 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. 
> 2019-05-22 11:17:09.363 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' > 2019-05-22 11:17:09.363 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' > 2019-05-22 11:17:09.364 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. > 2019-05-22 11:17:09.364 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for 2b39246b-5cd5-461b-babb-f37aae38c25f, action None: BridgeDoesNExist: Bridge br-int does not exist. > 2019-05-22 11:17:09.365 52248 WARNING neutron.agent.l3.agent [-] Info for router 2b39246b-5cd5-461b-babb-f37aae38c25f was not found. Performing router cleanup: BridgeesNotExist: Bridge br-int does not exist. > 2019-05-22 11:17:15.145 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge br-intoes not exist. > 2019-05-22 11:17:15.203 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router 2b39246b-5cd5-461b-babb-f37aae38c25f: OSError: [Errno 2] No such file or dictory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c25f' > 2019-05-22 11:17:15.203 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/2b39246b-5cd5-461b-babb-f37aae38c2' > 2019-05-22 11:17:15.204 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 2b39246b-5cd5-461b-babb-f37aae38c25f: BridgeDoesNotExist: Bridge bint does not exist. > 2019-05-22 11:17:15.205 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for 2b39246b-5cd5-461b-babb-f37aae38c25f, action None: BridgeDoesNExist: Bridge br-int does not exist. > 2019-05-22 11:17:15.205 52248 WARNING neutron.agent.l3.agent [-] Info for router 2b39246b-5cd5-461b-babb-f37aae38c25f was not found. Performing router cleanup: BridgeesNotExist: Bridge br-int does not exist. > [root at controller1 neutron]# #cat /var/log/neutron/l3-agent.log |grep 2b39246b-5cd5-461b-babb-f37aae38c25f > [root at controller1 neutron]# tailf /var/log/neutron/l3-agent.log > 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug > 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) > 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new > 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) > 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exist > 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) > 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. > 2019-05-22 12:04:28.315 52248 ERROR neutron.agent.l3.agent > 2019-05-22 12:04:29.356 52248 WARNING neutron.agent.l3.agent [-] Info for router 2cb935c7-a007-4fcc-82ce-7f2a6108b12d was not found. 
Performing router cleanup > 2019-05-22 12:04:33.125 52248 WARNING neutron.agent.l3.agent [-] Info for router 2cb935c7-a007-4fcc-82ce-7f2a6108b12d was not found. Performing router cleanup > 2019-05-22 12:24:26.582 52248 WARNING neutron.agent.l3.agent [-] Info for router b9ecd487-336a-4c5f-94c1-2b4fe7079a93 was not found. Performing router cleanup ^Z > [2]+ Stopped tailf /var/log/neutron/l3-agent.log > > > > > > -----Original Message----- > From: Slawomir Kaplonski [mailto:skaplons at redhat.com] > Sent: Wednesday, May 22, 2019 2:13 AM > To: Md. Farhad Hasan Khan > Cc: openstack-discuss at lists.openstack.org > Subject: Re: neutron network namespaces not create > > Hi, > > Error which You pasted looks that is related to moment when router is deleted. > Do You have any errors in neutron-server and/or neutron-l3-agent logs during creation of the router? > Can You check with command “neutron l3-agent-list-hosting-router ” by which l3 agent router should be hosted? > What kind of router Your are creating? Is it HA, DVR, DVR-HA router? Or maybe Legacy? > >> On 21 May 2019, at 16:51, Md. Farhad Hasan Khan wrote: >> >> Hi, >> I can create router from horizon. But network namespaces not created. I check with # ip netns list command. Not found router ID, but showing in horizon. >> >> Here is some log I get from neutron: >> >> >> #cat /var/log/neutron/l3-agent.log >> >> >> 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:35:44.646 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent ri.delete() >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:35:44.709 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: 
BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent router_id) >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.force_reraise() >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:35:44.710 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:35:44.711 52248 WARNING neutron.agent.l3.agent [-] Hit retry limit with router update for cad85ce0-6624-42ff-b42b-09480aea2613, action None: BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:35:44.711 52248 WARNING neutron.agent.l3.agent [-] Info for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. Performing router cleanup: BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:37:05.705 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent ri.delete() >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:05.761 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent router_id) >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.force_reraise() >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:05.762 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:07.975 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent ri.delete() >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 
17:37:08.029 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent router_id) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.force_reraise() >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:37:08.029 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:37:10.277 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent ri.delete() >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:10.351 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent router_id) >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.force_reraise() >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:10.352 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent [-] Error while initializing router cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:12.575 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.delete() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 
17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent router_id) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.force_reraise() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:13.024 52248 WARNING neutron.agent.l3.agent [-] Info >> for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. >> Performing router cleanup >> 2019-05-21 17:37:14.358 52248 WARNING neutron.agent.l3.agent [-] Info >> for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. >> Performing router cleanup >> >> >> Thanks & B’Rgds, >> Rony > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > — Slawek Kaplonski Senior software engineer Red Hat From openstack at nemebean.com Wed May 22 14:53:29 2019 From: openstack at nemebean.com (Ben Nemec) Date: Wed, 22 May 2019 09:53:29 -0500 Subject: [dev][all][ptls] note to non-core reviewers in all projects In-Reply-To: References: Message-ID: <7ebee0cb-a42b-b6df-8690-39124fb7096d@nemebean.com> On 5/22/19 7:12 AM, Alexandra Settle wrote: > Replying to the last thread and top posting both on purpose as I want to > point out something slightly differently but I am taking this platform > as a way to create further discussion... > > The topic of +1 and -1 was brought up in a number of times at the most > recent PTG. In the past it has been somewhat obvious that we HAVE had > individuals who provide a +1 or -1 to improve their review stats. Now, > whilst we *now* have the guide to reviewing as Julia pointed out [0] we > originally created a community that kind of had to make up the meanings > behind a +1 and -1. I am speaking generally here, some projects > definitely had defined rules as to what these reviews meant, but not all > did. This obviously left this very open to interpretation. > > To become a core review in many projects, the barrier to entry was > general "you must provide quality reviews". That's great, except what > ended up happening is we had waves of both individuals providing mass > +1's and -1's. Neither were quality. > > When we (the docs team) were looking at who to offer core > responsibilities to, we used to look specifically at the amount of +1's > and -1's. At that point in time, a -1 was seen to be a review of "higher > quality" because it meant somehow that the individual's critical review, > automatically meant "critical thinking". All this did was create a > negative review culture ("a -1 is better than a +1"). I know I used to > be highly conscious of how many +1's I offered before I became core, I > would often avoid providing a +1 even if the patch was fine so my stats > would not be messed up. Which is pretty messed up. Definitely, and this is why I avoid mentioning review stats when proposing someone as core. In fact, I barely look at review stats when making the decision. Basically I just verify that they've been doing reviews, and if so I move on. It's far more important to me that someone demonstrates an understanding of Oslo and OpenStack. -1's are one way to do that, but they're not the only way. Nit picky -1's are _not_ a way to do that, and in fact they demonstrate a lack of understanding of where we're trying to go as a community. Followup patches to fix the nits are terrific way to ingratiate yourself though *hint hint*. :-) I know I've seen blog posts, but do we have any official-ish documentation of best practices for becoming a core reviewer in OpenStack? I realize some of it is project-specific, but it seems like there would be some commonalities we could document. > > We have the review document in the project team guide, but what are we > doing to socialise this? 
This thread currently is divulging into > personal opinions and experiences (myself included here) but it looks > like we need to work on some actions. > > A few questions: > > 1. Who knew about the "How to Review Changes the OpenStack Way" > document before this thread? I did, but I'm pretty sure I was involved in the discussion that led to the creation of it so I may not be typical. :-) > 1. If you did, how did you find the content? Did you find it easy > to understand and follow? > 2. If you did not, would you have found this document helpful when > you first started reviewing? > 2. PTLs - do you share this document with new contributors that are > reappearing? I haven't been, but I will try to start. Also I need to update https://docs.openstack.org/project-team-guide/oslo.html It still talks about oslo-incubator. o.O > 3. Is this socialised in the Upstream Institute meeting? (@diablo) > > I don't expect a flood of answers to these questions, but it's important > that these are on our mind. > > Cheers, > > Alex > > IRC: asettle > Twitter: @dewsday > > [0]: > https://docs.openstack.org/project-team-guide/review-the-openstack-way.html > > On 22/05/2019 11:02, Natal Ngétal wrote: >> On Tue, May 21, 2019 at 3:33 PM Brian Rosmaita >> wrote: >>> A recent spate of +1 reviews with no comments on patches has put me into >>> grumpy-old-man mode. >> What do you mean by this? I make several code review and each time I >> add +1 that will mean I have made a code review. >> >>> A +1 with no comments is completely useless (unless you have a review on >>> a previous patch set with comments that have been addressed by the >>> author). I already know you're a smart person (you figured out how to >>> sign the CLA and navigate gerrit -- lots of people can't or won't do >>> that), but all your non-comment +1 tells me is that you are in favor of >>> the patch. That doesn't give me any information, because I already know >>> that the author is in favor of the patch, so that makes two of you out >>> of about 1,168 reviewers. That's not exactly a groundswell of support. >> I don't make code review to improve my statics, however sometimes I >> add a +1 and I have don't saw an error and another saw an error. >> That doesn't means I have don't read the code, sometimes during a code >> review we can also make a fail. This why it's good to have many >> different person make the code review on a patch. >> >>> When you post your +1, please leave a comment explaining why you >>> approve, or at least what in particular you looked at in the patch that >>> gave you a favorable impression. This whole open source community thing >>> is a collaborative effort, so please collaborate! You comment does not >>> have to be profound. Even just saying that you checked that the release >>> note or docs on the patch rendered correctly in HTML is very helpful. >> For me a +1 without comment is ok. The +1 is implicit that mean looks >> good to merge, but must be review by another person. Impose to add a >> comment for each review, it's for me a nonsense and a bad idea. >> Which type of comment do you will? I mean if it's only to add in the >> comment, looks good to merge, that change nothing. >> >>> The same thing goes for leaving a -1 on a patch. Don't just drop a -1 >>> bomb with no explanation. 
The kind of review that will put you on track >>> for becoming core in a project is what johnthetubaguy calls a >>> "thoughtful -1", that is, a negative review that clearly explains what >>> the problem is and points the author in a good direction to fix it. >> I totally agree a -1 must be coming with a comment, it's a nonsense to >> have a -1 without explanation. >> From fungi at yuggoth.org Wed May 22 15:18:36 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 22 May 2019 15:18:36 +0000 Subject: [dev][requirements] Upcoming changes to constraints handling in tox.ini In-Reply-To: <2e3fa8543cde73f3b93566c0a5b89c30c8d6b42b.camel@redhat.com> References: <20190522030203.GD15808@thor.bakeyournoodle.com> <2e3fa8543cde73f3b93566c0a5b89c30c8d6b42b.camel@redhat.com> Message-ID: <20190522151836.fc3wpuiuwlwa5kki@yuggoth.org> On 2019-05-22 09:26:19 +0100 (+0100), Stephen Finucane wrote: [...] > I realize this is bound to be controversial, but would it be > possible to just auto-merge these patches assuming they pass CI? > We've had a lot of these initiatives before and, invariably, there > are some projects that won't get around to merging these for a > long time (if ever). We had to do this recently with the opendev > updates to the '.gitreview' files (I think?) so there is precedent > here. Well, there were two approaches we used in the OpenDev migration: 1. Backward-compatible mass changes which fixed things we knew would otherwise break were proposed, given a brief opportunity for projects to review and approve or -2, and then at an pre-announced deadline any which were still open but passing their jobs and had no blocking votes were bulk-approved by a Gerrit administrator who temporarily elevated their access to act as a core reviewer for all projects. More specifically, this was the changes to replace git:// URLs with https:// because we were dropping support for the protocol. 2. Non-backward-compatible mass changes which fixed things we knew would otherwise be broken by the transition were committed directly into the on-disk copies of repositories in Gerrit while the service was offline for maintenance, entirely bypassing CI and code review. These were changes for things like .gitreview files and zuul pipelines/jobs/playbooks/roles. I think something similar to #1 might be appropriate here. I could see, for example, requiring Gerrit ACLs for official OpenStack deliverable repositories to inherit from a parent ACL (Gerrit supports this) which includes core reviewer permissions for a group that the Release team can temporarily add themselves to, for the purposes of bulk approving relevant changes at or shortly following the coordinated release. The release process they follow already involves some automated group updates for reassigning control of branches, so this probably wouldn't be too hard to incorporate. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From johnsomor at gmail.com Wed May 22 16:17:15 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 22 May 2019 09:17:15 -0700 Subject: [autohealing] Demo of Application Autohealing in OpenStack (Heat + Octavia + Aodh) In-Reply-To: References: Message-ID: Very cool, thanks for sharing! On Sun, May 19, 2019 at 9:24 PM Trinh Nguyen wrote: > > It's great! Thanks Lingxian. > > On Mon, May 20, 2019, 13:20 Lingxian Kong wrote: >> >> Recommend to watch in a 1080p video quality. 
>> >> --- >> Best regards, >> Lingxian Kong >> Catalyst Cloud >> >> >> On Mon, May 20, 2019 at 4:11 PM Lingxian Kong wrote: >>> >>> Please see the demo here: https://youtu.be/dXsGnbr7DfM >>> >>> --- >>> Best regards, >>> Lingxian Kong >>> Catalyst Cloud From alifshit at redhat.com Wed May 22 16:34:22 2019 From: alifshit at redhat.com (Artom Lifshitz) Date: Wed, 22 May 2019 12:34:22 -0400 Subject: [nova] stable-maint is especially unhealthily RH-centric In-Reply-To: <86241a3b-be28-c83f-7c35-386946c3cdc8@gmail.com> References: <86241a3b-be28-c83f-7c35-386946c3cdc8@gmail.com> Message-ID: On Tue, May 21, 2019, 16:32 Matt Riedemann, wrote: > On 5/21/2019 11:16 AM, Matthew Booth wrote: > > Not Red Hat: > > Claudiu Belu -> Inactive? > > Matt Riedemann > > John Garbutt > > Matthew Treinish > > Sean McGinnis is on the release management team which is a (grand)parent > group to nova-stable-maint and Sean reviews nova stable changes from > time to time or as requested, but he's currently in the same boat as me. > Wait, did I miss something? We at RH were told it was business as usual with respect to upstream community collaboration with Huawei. > > > > Red Hat: > > Dan Smith > > Lee Yarwood > > Sylvain Bauza > > Tony Breeds > > Melanie Witt > > Alan Pevec > Chuck Short > > Flavio Percoco > > Alan, Chuck and Flavio are all in the parent stable-maint-core group but > also inactive as far as I know. FWIW the most active nova stable cores > are myself, Lee and Melanie. I ping Dan and Sylvain from time to time as > needed on specific changes or if I'm trying to flush a branch for a > release. > > > Tony Breeds > > > > This leaves Nova entirely dependent on Matt Riedemann, John Garbutt, > > and Matthew Treinish to land patches in stable, which isn't a great > > situation. With Matt R temporarily out of action that's especially > > bad. > > This is a bit of an exaggeration. What you mean is that it leaves > backports from Red Hat stuck(ish) because we want to avoid two RH cores > from approving the backport. However, it doesn't mean 2 RH cores can't > approve a backport from someone else, like something I backport for > example. > > > > > Looking for constructive suggestions. I'm obviously in favour of > > relaxing the trifecta rules, but adding some non-RH stable cores also > > seems like it would be a generally healthy thing for the project to > > do. > > I've started a conversation about this within the nova-stable-maint team > but until there are changes I think it's fair to say if you really need > something that is backported from RH (like Lee backports something) then > we can ping non-RH people to approve (like mtreinish or johnthetubaguy) > or wait for me to get out of /dev/jail. > > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Wed May 22 16:35:37 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 22 May 2019 11:35:37 -0500 Subject: [Infra][ironic][Release-job-failures] release-post job for openstack/releases failed Message-ID: <20190522163537.GA28637@sm-workstation> There was a failure in the post job for ironic-inspector in stable/stein. The error was after tagging during the log collection task. "rsync: connection unexpectedly closed" http://logs.openstack.org/8f/8f71bcfb8d02cae91d0e49e5813c03c745745c92/release-post/tag-releases/52abf46/job-output.txt.gz#_2019-05-22_16_08_25_140267 It appears all necessary tasks completed though, so I believe this should be safe to ignore. 
This failures caused the doc publishing job to be skipped, but we had other releases that would have picked that up. Just sharing in case we run into something similar again and need to start tracking down root cause of the rsync failure. If anyone sees anything unusual that may be a side effect of this, please let us know and we can try to investigate. Sean ----- Forwarded message from zuul at openstack.org ----- Date: Wed, 22 May 2019 16:08:49 +0000 From: zuul at openstack.org To: release-job-failures at lists.openstack.org Subject: [Release-job-failures] release-post job for openstack/releases failed Reply-To: openstack-discuss at lists.openstack.org Build failed. - tag-releases http://logs.openstack.org/8f/8f71bcfb8d02cae91d0e49e5813c03c745745c92/release-post/tag-releases/52abf46/ : POST_FAILURE in 4m 48s - publish-tox-docs-static publish-tox-docs-static : SKIPPED _______________________________________________ Release-job-failures mailing list Release-job-failures at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/release-job-failures ----- End forwarded message ----- From sean.mcginnis at gmx.com Wed May 22 16:38:41 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 22 May 2019 11:38:41 -0500 Subject: [nova] stable-maint is especially unhealthily RH-centric In-Reply-To: References: <86241a3b-be28-c83f-7c35-386946c3cdc8@gmail.com> Message-ID: <20190522163840.GB28637@sm-workstation> On Wed, May 22, 2019 at 12:34:22PM -0400, Artom Lifshitz wrote: > On Tue, May 21, 2019, 16:32 Matt Riedemann, wrote: > > > On 5/21/2019 11:16 AM, Matthew Booth wrote: > > > Not Red Hat: > > > Claudiu Belu -> Inactive? > > > Matt Riedemann > > > John Garbutt > > > Matthew Treinish > > > > Sean McGinnis is on the release management team which is a (grand)parent > > group to nova-stable-maint and Sean reviews nova stable changes from > > time to time or as requested, but he's currently in the same boat as me. > > > > Wait, did I miss something? We at RH were told it was business as usual > with respect to upstream community collaboration with Huawei. > > We had been told internally to hold off for awhile, but I believe the OSF and our internal teams have done all the due diligence they've needed to make sure we are in the clear as far as participating upstream. Our team should be more active again. Great to hear that is what RH has communicated as well. If anyone else has any concerns I can try to track down answers. Sean From sundar.nadathur at intel.com Wed May 22 16:58:43 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Wed, 22 May 2019 16:58:43 +0000 Subject: [cyborg] No Zoom meeting today Message-ID: <1CC272501B5BC543A05DB90AA509DED527574E7A@fmsmsx122.amr.corp.intel.com> Cancelled after mutual agreement in IRC meeting because there is no specific agenda. Regards, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmendiza at redhat.com Wed May 22 19:09:23 2019 From: dmendiza at redhat.com (=?UTF-8?Q?Douglas_Mendiz=c3=a1bal?=) Date: Wed, 22 May 2019 14:09:23 -0500 Subject: [castellan-ui] Retiring castellan-ui Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Hello openstack-discuss, The Barbican Team is announcing the retirement of castellan-ui. As far as I can tell there is no interest in the community to continue to develop this Horizon plugin. This does not affect the castellan library [1], which is still supported . 
Regards, Douglas Mendizábal [1] https://opendev.org/openstack/castellan -----BEGIN PGP SIGNATURE----- iQEzBAEBCAAdFiEEan2ddQosxMRNS/FejiZC4mXuYkoFAlzlnmMACgkQjiZC4mXu YkovUAf9EnSKc14kVSYl2EKeVKGfHtX22D1oezwi9xDcM4Y+XzY+nkYRsikkrQ4Y usCqzKgZYzTX1A52nns4ORpGcQL7/E1hEbEm54PE3und9CH0ORA8WnfIwKfDSfH/ DzyVyUKUxrcHp8CHkqsQUun3wFbPYlfl/Pptk76Xj0RnGz0YnLncHLx1uN2ucqjL olLXjwyxVDGJsnZJ5zs/T0PzzYnLGD8UCp8YzTF3aZQpehXsTpEUL+GL5EDic9IG 2FlVrFqIVgxFm6XeQCpxt5uHy2Lxpqp9WcTSmgyd71HDnuIyZ+iqAHY+R1fT/avV ajaQV8FiMvdrhYFx6ujqnjpVAqB3hQ== =eqNk -----END PGP SIGNATURE----- From fungi at yuggoth.org Wed May 22 19:14:01 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 22 May 2019 19:14:01 +0000 Subject: [dev][all] note to non-core reviewers in all projects In-Reply-To: <0bdfa780-d629-3401-7df1-54a96aa1b6ea@redhat.com> References: <20190521163710.nujmle4dknr5cqgv@yuggoth.org> <0bdfa780-d629-3401-7df1-54a96aa1b6ea@redhat.com> Message-ID: <20190522191400.c7fkibziyxxqceie@yuggoth.org> On 2019-05-22 11:01:08 +0200 (+0200), Bogdan Dobrelya wrote: > On 21.05.2019 18:37, Jeremy Stanley wrote: > > this change makes > > sense to me, is a good idea for the project, and I don't see any > > obvious flaws in it. > > Would be nice to have this as a default message proposed by gerrit for +1 > action. So there never be emptiness and everyone gets happy, by default! The [label "Code-Review"] section of the All-Projects ACL for our Gerrit deployment[*] defines a +1 as indicating "Looks good to me, but someone else must approve." This doesn't get included directly into comment text of course but is shown as a tooltip in the vote selection modal of the WebUI. I'm not sure if that actually needs to be changed, but it *is* configurable if this is something we truly desire. [*] https://docs.openstack.org/infra/system-config/gerrit.html#access-controls -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rafaelweingartner at gmail.com Wed May 22 19:38:12 2019 From: rafaelweingartner at gmail.com (=?UTF-8?Q?Rafael_Weing=C3=A4rtner?=) Date: Wed, 22 May 2019 16:38:12 -0300 Subject: [telemetry] Voting for a new meeting time In-Reply-To: References: Message-ID: Hello guys, Sadly, I will not be able to attend our meeting at 23/05/2019 02:00-03:00 UTC. I checked the telemetry vision proposals, and I would vote (if it ever comes to something like that) to options A and B. Option C would be very interesting, but I am not sure if it is feasible in the short/medium term. Options D and E seem to be the path that Telemetry was taking as we briefly discussed in our last meeting; that would have a serious impact on many production environments. Therefore, I (we) think that might be better to avoid them at all cost. On Fri, May 17, 2019 at 4:05 AM Trinh Nguyen wrote: > Hi team, > > According to the poll [1], I will organize 2 meeting sessions on May 23rd: > > - Core contributors (mostly in APAC): 02:00 UTC > - Cross-projects contributors (mostly in US or around that): 08:00 UTC > > Some core members or at least myself will be able to attend both meetings > so I think it should be fine. I will put the meeting agenda here [2]. 
> > [1] https://doodle.com/poll/cd9d3ksvpms4frud > [2] https://etherpad.openstack.org/p/telemetry-meeting-agenda > > Bests, > > On Fri, May 10, 2019 at 12:05 PM Trinh Nguyen > wrote: > >> Hi team, >> >> As discussed, we should have a new meeting time so more contributors can >> join. So please cast your vote in the link below *by the end of May 15th >> (UTC).* >> >> https://doodle.com/poll/cd9d3ksvpms4frud >> >> One thing to keep in mind that I still want to keep the old meeting time >> as an option, not because I'm biasing the APAC developers but because it is >> the time that most of the active contributors (who actually pushing patches >> and review) can join. >> >> When we have the results if we end up missing some contributors (I think >> all of you are great!), no worries. We could try to create different >> meetings for a different set of contributors, something like: >> >> - Developers: for bug triage, implementation, etc. >> - Operators: input from operators are important too since we need >> real use cases >> - Cross-project: Telemetry may need to work with other teams >> - Core team: for the core team to discuss the vision and goals, >> planning >> >> >> Okie, I know we cannot live without monitoring/logging so let's rock the >> world guys!!! >> >> Bests >> >> >> -- >> *Trinh Nguyen* >> *www.edlab.xyz * >> >> > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -- Rafael Weingärtner -------------- next part -------------- An HTML attachment was scrubbed... URL: From aspiers at suse.com Wed May 22 21:39:27 2019 From: aspiers at suse.com (Adam Spiers) Date: Wed, 22 May 2019 22:39:27 +0100 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: References: Message-ID: <20190522213927.iuty4y5mrgw7dmjt@pacific.linksys.moosehall> There are still issues. For example nova is not showing any commits since April: https://www.stackalytics.com/?metric=commits&release=train&project_type=all&module=nova Rong Zhu wrote: >Hi Sergey, > >Thanks for your help. Now the numbers are correctly. > > >Sergey Nikitin 于2019年5月19日 周日21:12写道: > >> Hi, Rong, >> >> Database was rebuild and now stats o gengchc2 [1] is correct [2]. >> >> [1] >> https://www.stackalytics.com/?release=all&metric=commits&project_type=all&user_id=578043796-b >> [2] https://review.opendev.org/#/q/owner:gengchc2,n,z >> >> Sorry for delay, >> Sergey >> >> >> >> >> On Fri, May 17, 2019 at 6:20 PM Sergey Nikitin >> wrote: >> >>> Testing of migration process shown us that we have to rebuild database >>> "on live". >>> Unfortunately it means that during rebuild data will be incomplete. I >>> talked with the colleague who did it previously and he told me that it's >>> normal procedure. >>> I got these results on Monday and at this moment I'm waiting for weekend. >>> It's better to rebuild database in Saturday and Sunday to do now affect >>> much number of users. >>> So by the end of this week everything will be completed. Thank you for >>> patient. >>> >>> On Fri, May 17, 2019 at 6:15 AM Rong Zhu wrote: >>> >>>> Hi Sergey, >>>> >>>> What is the process about rebuild the database? >>>> >>>> Thanks, >>>> Rong Zhu >>>> >>>> Sergey Nikitin 于2019年5月7日 周二00:59写道: >>>> >>>>> Hello Rong, >>>>> >>>>> Sorry for long response. I was on a trip during last 5 days. >>>>> >>>>> What I have found: >>>>> Lets take a look on this patch [1]. 
It must be a contribution of >>>>> gengchc2, but for some reasons it was matched to Yuval Brik [2] >>>>> I'm still trying to find a root cause of it, but anyway on this week we >>>>> are planing to rebuild our database to increase RAM. I checked statistics >>>>> of gengchc2 on clean database and it's complete correct. >>>>> So your problem will be solved in several days. It will take so long >>>>> time because full rebuild of DB takes 48 hours, but we need to test our >>>>> migration process first to keep zero down time. >>>>> I'll share a results with you here when the process will be finished. >>>>> Thank you for your patience. >>>>> >>>>> Sergey >>>>> >>>>> [1] https://review.opendev.org/#/c/627762/ >>>>> [2] >>>>> https://www.stackalytics.com/?user_id=jhamhader&project_type=all&release=all&metric=commits&company=&module=freezer-api >>>>> >>>>> >>>>> On Mon, May 6, 2019 at 6:30 AM Rong Zhu wrote: >>>>> >>>>>> Hi Sergey, >>>>>> >>>>>> Do we have any process about my colleague's data loss problem? >>>>>> >>>>>> Sergey Nikitin 于2019年4月29日 周一19:57写道: >>>>>> >>>>>>> Thank you for information! I will take a look >>>>>>> >>>>>>> On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu >>>>>>> wrote: >>>>>>> >>>>>>>> Hi there, >>>>>>>> >>>>>>>> Recently we found we lost a person's data from our company at the >>>>>>>> stackalytics website. >>>>>>>> You can check the merged patch from [0], but there no date from >>>>>>>> the stackalytics website. >>>>>>>> >>>>>>>> stackalytics info as below: >>>>>>>> Company: ZTE Corporation >>>>>>>> Launchpad: 578043796-b >>>>>>>> Gerrit: gengchc2 >>>>>>>> >>>>>>>> Look forward to hearing from you! >>>>>>>> >>>>>>> >>>>>> Best Regards, >>>>>> Rong Zhu >>>>>> >>>>>>> >>>>>>>> -- >>>>>> Thanks, >>>>>> Rong Zhu >>>>>> >>>>> >>>>> >>>>> -- >>>>> Best Regards, >>>>> Sergey Nikitin >>>>> >>>> -- >>>> Thanks, >>>> Rong Zhu >>>> >>> >>> >>> -- >>> Best Regards, >>> Sergey Nikitin >>> >> >> >> -- >> Best Regards, >> Sergey Nikitin >> >-- >Thanks, >Rong Zhu From dirk at dmllr.de Wed May 22 21:49:55 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Wed, 22 May 2019 23:49:55 +0200 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> Message-ID: Hi Jeremy, > I agree, that is the point I was trying to make... or moreso that > the "snapshot in time" is the purpose upper-constraints.txt was > intended to serve for stable branches so we can keep them... stable. I would replace "stable" with "working" in that sentence (e.g. existing test coverage still passes). I do agree with that in general. > On the other hand "user expects it to be updated asap on security > vulnerabilities" sounds like a misconception we need to better > document isn't the reason we have that mechanism. I am not sure this was implied. The current suggested mode was to clear out known security vulnerabilities, which is a different angle than guaranteeing that everything is immediately secure. Fixing known issues still means there might be unfixed issues left. For me the point of that exercise is not to ensure that pure pip/upper-constraints built environments are *secure*, but that they still *work*. 
I would not want to extrapolate that just because we're fixing known vulnerable versions in our dependency chain by fixed versions that we're magically secure. > contemporary branches. Many (I expect most?) of our external Python > dependencies do not follow a similar pattern For many projects with security problems there is actually a sane backporting strategy implemented. As an example there is django, where we opt for the LTS version and that one is very cautiously maintained with only security and critical bugfixes (data-loss/data corruption type). I would rather up the constraints on stable branches to verify that our test coverage is still passing with that change than to rely on distributors magically doing better backports of the security fixes than upstream does and then not shipping that to their customers without finding that the security patch actually broke some useful functionality of the project. django might be a good example and a bad example here, because I actually know that horizon *ignores* upper-constraints so the django version we're having in upper-constraints is not the one that is actually being used in the testing, which is another reason why I would like to make it actually match* the actually used version because thats the one the code is most likely working with and it then serves at least as a documentation type thing. > Yes, and those vendors (at least for the versions of their distros > we claim to test against) generally maintain a snapshot-in-time fork > of those packages and selectively backport fixes to them, which is > *why* we can depend on them being a generally stable test bed for > us. I do happen to work for one of those vendors, and yes, upper-constraints is an important consideration for the choice of versions that are being shipped. I happen to come accross a situation every once in a while where a security fix the upstream project wasn't such a super great thing and caused regressions in some other project somewhere that was accidentally or intentionally using some of the previously vulnerable functionality. OpenStack can be such a case. Thats another reason why I would like to ensure that the "stable version with the vulnerable behavior fixed" version is actually used in testing so that we know it doesn't break functionality and that also OpenStack does not introduce changes that rely on behavior that has been patched out by vulnerability fixes in the downstream vendor distributions. A security fix could intentionally change the observed behavior of the dependency because that is the essential part of the security fix. One example of that could be that it rejects a certain invalid input that it sloppily ignored beforehand. By running our testing against the version that still allows the invalid input we're not finding the issue. Distribution vendors that are careful and provide security fixes to their user/customer base will do the security backport and release that as an update, either with some testing or with the implicit assumption that necessary cross-checking has already been performed. If it does cause a problem, its is going to be a security fix that explodes right into the face of the user of OpenStack in this case. Eventually, every once in a while there is a production outage due to that, and customers are generally not happy about that. the naive reaction of the user is either "Ok, this vendor is not able to patch things properly" or "OpenStack is broken by the security fix". Both are uncomfortable situations to be in. 
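(Illustrative sketch of the kind of behaviour change described above, where a security release starts rejecting input that the pinned, still-vulnerable release silently accepted. The function names below are made up; no real library is being modelled.)

    # Toy example only -- not a real dependency.
    def parse_endpoint_old(value):
        # pinned, vulnerable release: sloppily accepts malformed input
        return value.strip()

    def parse_endpoint_patched(value):
        # security release: the fix is precisely to reject such input
        if "://" not in value:
            raise ValueError("malformed endpoint rejected")
        return value.strip()

    # A consumer that accidentally relies on the lenient behaviour keeps
    # passing CI (which installs the old pin) and only breaks once the
    # distro ships the patched package:
    assert parse_endpoint_old("example.org:8080") == "example.org:8080"
    # parse_endpoint_patched("example.org:8080")  # -> ValueError in production
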
I don't ask you to walk in the shoes of the distribution vendor that goes through that customer conversation, but I would like to ask you to consider the case that OpenStack needs to *deal* with behavior changes in its dependencies that are introduced as an essential part or the side effects of security fixes being done there. I can give some theoretical examples. There were cases where dependencies ran code that left files with insecure permissions. Quite an obvious problem, and quite easy to fix. Suddenly files have secure permissions and openstack breaks. There were cases where a url was sanitized and slightly malformatted urls were rejected, and openstack generated those and fell into the trap. Yes, all of that can be fixed by patches, but there is no testing of those backports because theyre being executed against the *still vulnerable* version. > Right, again we seem to agree on the risk, just not the source of > the problem. I continue to argue that the underlying issue is the > choice to reuse the existing upper-constraints.txt (which was > invented for a different, conflicting purpose) rather than creating > a solid solution to their problem. Perhaps the projects that currently use upper constraints don't care about a secure virtualenv/container build, and thats fine. It still does have a point to test against the versions end users will most likely have, and they most likely have security fixed versions (because they're good users and run against a stable security maintained enterprise operating system). We'd be doing ourselves a favor by testing a situation that is coming close to the end user situation in our CI. Yes, we can debate that upping "requests" to a 3 version newer version is stretching the limits. I think thats a very useful conversation to have, and it doesn't take me a lot to feel convinced that its going outside stable policy. It isn't the only case however. > OpenStack chose to use pip to *test* its Python dependency chain so > that we can evaluate newer versions of those dependencies than the > distros in question are carrying. So that would not change then at all. with by-demand including version updates of our dependencies we're still going to do exactly that. > The stable branches are meant as a place for distro package > maintainers to collaborate on selective backports of patches for > their packaged versions of our software, and suddenly starting to > have them depend on newer versions of external dependencies which > don't follow the same branching model and cadence creates new > challenges for verifying that critical and security fixes for *our* > software continues to remain compatible with deps contemporary to > when we initially created those branches. I'm not saying we should bump all of our 500+ dependencies regularly on stable branches. the number of packages the pypi tool "safety" considers vulnerable is a very small quantity of dependencies that are in that bucket, I would say less than 1%.Lets not make the problem bigger than it is. > And I think we're better off picking a different solution for > coordinating security updates to external dependencies of our stable > branches, rather than trying to turn upper-constraints.txt into that > while destabilizing its intended use. The intended use was to keep stable branches "stable" (or in my interpretation: "working"). distribution vendors don't have much choice. they do have to patch security vulnerabilities. It is their core responsibility. 
It makes their life easier by having upstream testing and known working against those patched versions (which can not 100% be reflected in the current pip installed fashion, but we can approximate that once we agree on the policy that would describe that OpenStack CI is testing what users are likely going to hit). Greetings, Dirk From dirk at dmllr.de Wed May 22 21:52:47 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Wed, 22 May 2019 23:52:47 +0200 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190514143935.thuj6t7z6v4xoyay@mthode.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190514143935.thuj6t7z6v4xoyay@mthode.org> Message-ID: Hi Matthew, > 2. add a new file, let's call it 'security-updates.txt' maybe better call it updates-for-known-insecure-versions.txt ;-) > b. the file needs to maintain co-installability of openstack. It is > laid over the upper-constraints file and tested the same way > upper-constraints is. This testing is NOT perfect. The generated > file could be called something like > 'somewhat-tested-secureconstraints.txt' coinstallability is a problem, but I think its not the main one. But I agree we can try that. > This also sets up incrased work and scope for the requirements team. > Perhaps this could be a sub team type of item or something? Allowing for additions there doesn't immediately increase work. unless there is somebody actually proposing a change to review, that is. It doesn"t make the team magically fulfill the promise - the policy change would allow the review team to accept such a review as it is within policy. From dirk at dmllr.de Wed May 22 21:58:27 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Wed, 22 May 2019 23:58:27 +0200 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <90baa056-ab00-d2fc-f068-0a312ea775f7@openstack.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190514143935.thuj6t7z6v4xoyay@mthode.org> <90baa056-ab00-d2fc-f068-0a312ea775f7@openstack.org> Message-ID: Hi, > dependencies, we can't rely on *them* to properly avoid depending on > vulnerable second-level dependencies (and so on). And this solution does > not cover non-Python dependencies. So saying "use this to be secure" is > just misleading our users. We typically don't use pip constraints for non-python dependencies, so that is an orthogonal problem. Yes, it doesn't tell you that you need to install kernel updates to be secure but since we're not documenting the kernel version that OpenStack requires you to be having installed to begin with its not really part of the problem scope.. > Nothing short of full distribution security work can actually deliver on > a "use this to be secure" promise. And that is definitely not the kind > of effort we should tackle as a community imho. See my previous reply to Jeremy. I'm not asking for the community to suddenly replace the paid work of distribution vendors. 
What I was aiming at is that we could align our testing coverage with the situation that end users might likely end up, which is something along the lines of "the upper-constraints version of the dependency + distro specific bugfixes + customer/user requested bugfixes + security backports". Hopefully the vendor follows good practices and upstreams those changes, which means a stable release of the version that OpenStack depends on would pretty closely match end user situation, which is good because we want to test that functionality in our CI. > I would rather continue to use that mechanism to communicate about > critical vulnerabilities in all our dependencies than implement a > complex and costly process to only cover /some/ of our Python dependencies. I think announcing such changes in our upper-constraints via OSSN is an excellent idea, totally on board with that. Greetings, Dirk From dirk at dmllr.de Wed May 22 22:05:41 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Thu, 23 May 2019 00:05:41 +0200 Subject: [glance][interop] standardized image "name" ? In-Reply-To: <7893dbd2-acc1-692c-df38-29ec7c8a98e7@debian.org> References: <939FEDBD-6E5E-43F2-AE1F-2FE71A71BF58@vmware.com> <20190408123255.vqwwvzzdt24tm3pq@yuggoth.org> <4234325e-7569-e11f-53e9-72f07ed8ce53@gmail.com> <7893dbd2-acc1-692c-df38-29ec7c8a98e7@debian.org> Message-ID: Hi zigo, > > The multihash is displayed in the image-list and image-show API > > responses since Images API v2.7, and in the glanceclient since 2.12.0. > That's the thing. "image show --long" continues to display the md5sum > instead of the sha512. is there a bug reference for this problem.. ? Thanks, Dirk From mthode at mthode.org Wed May 22 22:12:16 2019 From: mthode at mthode.org (Matthew Thode) Date: Wed, 22 May 2019 17:12:16 -0500 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190514143935.thuj6t7z6v4xoyay@mthode.org> Message-ID: <20190522221216.gjnxfgs25n7s7syj@mthode.org> On 19-05-22 23:52:47, Dirk Müller wrote: > Hi Matthew, > > > 2. add a new file, let's call it 'security-updates.txt' > > maybe better call it updates-for-known-insecure-versions.txt ;-) > > > b. the file needs to maintain co-installability of openstack. It is > > laid over the upper-constraints file and tested the same way > > upper-constraints is. This testing is NOT perfect. The generated > > file could be called something like > > 'somewhat-tested-secureconstraints.txt' > > coinstallability is a problem, but I think its not the main one. But I > agree we can try that. > > > This also sets up incrased work and scope for the requirements team. > > Perhaps this could be a sub team type of item or something? > > Allowing for additions there doesn't immediately increase work. unless > there is somebody actually proposing a change to review, that is. It > doesn"t make the team magically fulfill the promise - the policy change > would allow the review team to accept such a review as it is within > policy. These are all true, but even before changing anything we'd still have to document the policy. Perhaps that's the next step. Do you mind generating a policy change and proposing it (to this thread) for review? -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From mriedemos at gmail.com Wed May 22 22:13:48 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 22 May 2019 17:13:48 -0500 Subject: [nova] Validation for requested host/node on server create Message-ID: <78fa937a-beb6-c63d-01a0-40e6519928be@gmail.com> It seems we've come to an impasse on this change [1] because of a concern about where to validate the requested host and/or hypervisor_hostname. The change is currently validating in the API to provide a fast fail 400 response to the user if the host and/or node don't exist. The concern is that the lookup for the compute node will be done again in the scheduler [2]. Also, if the host is not provided, then to validate the node we have to iterate the cells looking for the given compute node (we could use placement though, more on that below). I've added this to the nova meeting agenda for tomorrow but wanted to try and enumerate what I see are the options before the meeting so we don't have to re-cap all of this during the meeting. The options as I see them are: 1. Omit the validation in the API and let the scheduler do the validation. Pros: no performance impact in the API when creating server(s) Cons: if the host/node does not exist, the user will get a 202 response and eventually a NoValidHost error which is not a great user experience, although it is what happens today with the availability_zone forced host/node behavior we already have [3] so maybe it's acceptable. 2. Only validate host in the API since we can look up the HostMapping in the API DB. If the user also provided a node then we'd just throw that on the RequestSpec and let the scheduler code validate it. Pros: basic validation for the simple and probably most widely used case since for non-baremetal instances the host and node are going to be the same Cons: still could have a late failure in the scheduler with NoValidHost error; does not cover the case that only node (no host) is specified 3. Validate both the host and node in the API. This can be broken down: a) If only host is specified, do #2 above. b) If only node is specified, iterate the cells looking for the node (or query a resource provider with that name in placement which would avoid down cell issues) c) If both host and node is specified, get the HostMapping and from that lookup the ComputeNode in the given cell (per the HostMapping) Pros: fail fast behavior in the API if either the host and/or node do not exist Cons: performance hit in the API to validate the host/node and redundancy with the scheduler to find the ComputeNode to get its uuid for the in_tree filtering on GET /allocation_candidates. Note that if we do find the ComputeNode in the API, we could also (later?) make a change to the Destination object to add a node_uuid field so we can pass that through on the RequestSpec from API->conductor->scheduler and that should remove the need for the duplicate query in the scheduler code for the in_tree logic. I'm personally in favor of option 3 since we know that users hate NoValidHost errors and we have ways to mitigate the performance overhead of that validation. Note that this isn't necessarily something that has to happen in the same change that introduces the host/hypervisor_hostname parameters to the API. If we do the validation in the API I'd probably split the validation logic into it's own patch to make it easier to test and review on its own. 
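(For concreteness, a rough sketch of what the option 3 fast-fail validation could look like; this is illustrative pseudo-code, not the actual proposed patch, and the helper name is made up. The nova object and exception names are written from memory.)

    from nova import context as nova_context
    from nova import exception
    from nova import objects

    def validate_forced_destination(ctxt, host=None, node=None):
        # Raise early (translated to a 400 in the API layer) if the
        # requested host and/or node does not exist.
        if host:
            # 3a / 3c: the host mapping lives in the API DB;
            # HostMappingNotFound -> 400
            mapping = objects.HostMapping.get_by_host(ctxt, host)
            if node:
                with nova_context.target_cell(ctxt, mapping.cell_mapping) as cctxt:
                    # ComputeNodeNotFound -> 400
                    objects.ComputeNode.get_by_host_and_nodename(cctxt, host, node)
        elif node:
            # 3b: no host given -- iterate the cells (or ask placement for a
            # resource provider with that name) looking for the node
            ...
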
[1] https://review.opendev.org/#/c/645520/ [2] https://github.com/openstack/nova/blob/2e85453879533af0b4d0e1178797d26f026a9423/nova/scheduler/utils.py#L528 [3] https://docs.openstack.org/nova/latest/admin/availability-zones.html -- Thanks, Matt From fungi at yuggoth.org Wed May 22 22:49:31 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 22 May 2019 22:49:31 +0000 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> Message-ID: <20190522224930.yy35h7imhedm2lyy@yuggoth.org> On 2019-05-22 23:49:55 +0200 (+0200), Dirk Müller wrote: [...snip bits about pragmatic compromise over absolutes...] > Perhaps the projects that currently use upper constraints don't > care about a secure virtualenv/container build, and thats fine. It > still does have a point to test against the versions end users > will most likely have, and they most likely have security fixed > versions (because they're good users and run against a stable > security maintained enterprise operating system). We'd be doing > ourselves a favor by testing a situation that is coming close to > the end user situation in our CI. [...] Doing conformance testing on those distros with their packaged versions of our external dependencies would much more closely approximate what I think you want than testing with a shifting set of old-and-new Python dependencies installed from PyPI. It would probably also be easier to maintain over the long haul. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dirk at dmllr.de Wed May 22 22:53:25 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Thu, 23 May 2019 00:53:25 +0200 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190522221216.gjnxfgs25n7s7syj@mthode.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190514143935.thuj6t7z6v4xoyay@mthode.org> <20190522221216.gjnxfgs25n7s7syj@mthode.org> Message-ID: Hi Matthew, > document the policy. Perhaps that's the next step. Do you mind > generating a policy change and proposing it (to this thread) for review? I suggest to discuss it in gerrit: https://review.opendev.org/#/c/660855/ Greetings, Dirk From dirk at dmllr.de Wed May 22 23:09:58 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Thu, 23 May 2019 01:09:58 +0200 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190522224930.yy35h7imhedm2lyy@yuggoth.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190522224930.yy35h7imhedm2lyy@yuggoth.org> Message-ID: Hi Jeremy, > Doing conformance testing on those distros with their packaged > versions of our external dependencies would much more closely > approximate what I think you want I think that would also work. Would the community be interested in solving conformance incompatibilities when purely vendored versions are used? I somehow have doubts. 
Would we track the vendored version/releases in a constraints file to ensure gating issues are not creeping in? All the existing tooling is around tracking lower and upper constraints as defined by pip and our opendev defined wheel mirrors. Unless we have a tool that translate pip install commands into the respective distribution equivalent, such a vendored-test also adds significant drag for projects : maintaining two different ways to install things and for X number of vendors to cross-check and help debug solve integration issues. Plus the amount of extra CI load this might cause. Not a fun task. Considering that I would prefer to volunteer maintaining a pypi/pip wheel fork of the ~5 dependencies with security vulnerabilities that we care about and pull those in instead of exposing the full scope of X vendors downstream specific patching issues to us as a community. Greetings, Dirk From melwittt at gmail.com Wed May 22 23:33:17 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 22 May 2019 16:33:17 -0700 Subject: [nova] Validation for requested host/node on server create In-Reply-To: <78fa937a-beb6-c63d-01a0-40e6519928be@gmail.com> References: <78fa937a-beb6-c63d-01a0-40e6519928be@gmail.com> Message-ID: <306acabc-601f-0689-6988-c99a00fcdfdb@gmail.com> On Wed, 22 May 2019 17:13:48 -0500, Matt Riedemann wrote: > It seems we've come to an impasse on this change [1] because of a > concern about where to validate the requested host and/or > hypervisor_hostname. > > The change is currently validating in the API to provide a fast fail 400 > response to the user if the host and/or node don't exist. The concern is > that the lookup for the compute node will be done again in the scheduler > [2]. Also, if the host is not provided, then to validate the node we > have to iterate the cells looking for the given compute node (we could > use placement though, more on that below). > > I've added this to the nova meeting agenda for tomorrow but wanted to > try and enumerate what I see are the options before the meeting so we > don't have to re-cap all of this during the meeting. > > The options as I see them are: > > 1. Omit the validation in the API and let the scheduler do the validation. > > Pros: no performance impact in the API when creating server(s) > > Cons: if the host/node does not exist, the user will get a 202 response > and eventually a NoValidHost error which is not a great user experience, > although it is what happens today with the availability_zone forced > host/node behavior we already have [3] so maybe it's acceptable. > > 2. Only validate host in the API since we can look up the HostMapping in > the API DB. If the user also provided a node then we'd just throw that > on the RequestSpec and let the scheduler code validate it. > > Pros: basic validation for the simple and probably most widely used case > since for non-baremetal instances the host and node are going to be the same > > Cons: still could have a late failure in the scheduler with NoValidHost > error; does not cover the case that only node (no host) is specified > > 3. Validate both the host and node in the API. This can be broken down: > > a) If only host is specified, do #2 above. 
> b) If only node is specified, iterate the cells looking for the node (or > query a resource provider with that name in placement which would avoid > down cell issues) > c) If both host and node is specified, get the HostMapping and from that > lookup the ComputeNode in the given cell (per the HostMapping) > > Pros: fail fast behavior in the API if either the host and/or node do > not exist > > Cons: performance hit in the API to validate the host/node and > redundancy with the scheduler to find the ComputeNode to get its uuid > for the in_tree filtering on GET /allocation_candidates. > > Note that if we do find the ComputeNode in the API, we could also > (later?) make a change to the Destination object to add a node_uuid > field so we can pass that through on the RequestSpec from > API->conductor->scheduler and that should remove the need for the > duplicate query in the scheduler code for the in_tree logic. > > I'm personally in favor of option 3 since we know that users hate > NoValidHost errors and we have ways to mitigate the performance overhead > of that validation. Count me in the option 3 boat too, for the same reasons. Rather avoid NoValidHost and there's mitigation we can do for the perf issue. -melanie > Note that this isn't necessarily something that has to happen in the > same change that introduces the host/hypervisor_hostname parameters to > the API. If we do the validation in the API I'd probably split the > validation logic into it's own patch to make it easier to test and > review on its own. > > [1] https://review.opendev.org/#/c/645520/ > [2] > https://github.com/openstack/nova/blob/2e85453879533af0b4d0e1178797d26f026a9423/nova/scheduler/utils.py#L528 > [3] https://docs.openstack.org/nova/latest/admin/availability-zones.html > From fungi at yuggoth.org Wed May 22 23:53:50 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 22 May 2019 23:53:50 +0000 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190522224930.yy35h7imhedm2lyy@yuggoth.org> Message-ID: <20190522235350.yw4dn5cgwemhmtak@yuggoth.org> On 2019-05-23 01:09:58 +0200 (+0200), Dirk Müller wrote: > Hi Jeremy, > > > Doing conformance testing on those distros with their packaged > > versions of our external dependencies would much more closely > > approximate what I think you want > > I think that would also work. Would the community be interested > in solving conformance incompatibilities when purely vendored > versions are used? I somehow have doubts. Would we track > the vendored version/releases in a constraints file to ensure > gating issues are not creeping in? I don't know that we need to if the goal is to let us know (e.g. with a periodic job) that a distro we care about has upgraded a dependency in a way that our stable branch targeting that distro version no longer works with. > All the existing tooling is around tracking lower and upper > constraints as defined by pip and our opendev defined wheel > mirrors. > > Unless we have a tool that translate pip install commands into the > respective distribution equivalent, such a vendored-test also adds > significant drag for projects : maintaining two different ways to > install things and for X number of vendors to cross-check and help > debug solve integration issues. Plus the amount of extra CI load > this might cause. 
Not a fun task. DevStack used to support this, but it does indeed seem to have been refactored out some time ago. Reintroducing that, or something like it, could be an alternative solution though. > Considering that I would prefer to volunteer maintaining a > pypi/pip wheel fork of the ~5 dependencies with security > vulnerabilities that we care about and pull those in instead of > exposing the full scope of X vendors downstream specific patching > issues to us as a community. Do we really only care about 5 out of our many hundreds of external Python dependencies? Or is it that we should assume over years of maintenance, fewer than one percent of them will discover vulnerabilities? At any rate, I'm not opposed to the experiment as long as we can still also run jobs for our original frozen dependency sets (so that our stable branches don't inadvertently develop a requirement for a new feature in newer versions of these dependencies) *and* as long as we make it clear to users that this is not a substitute for running on security-supported distro packages (where someone more accountable and read-in than the OpenStack project is backporting patches for vulnerabilities to forks of those dependencies). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dirk at dmllr.de Thu May 23 00:24:02 2019 From: dirk at dmllr.de (=?UTF-8?B?RGlyayBNw7xsbGVy?=) Date: Thu, 23 May 2019 02:24:02 +0200 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190522235350.yw4dn5cgwemhmtak@yuggoth.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190522224930.yy35h7imhedm2lyy@yuggoth.org> <20190522235350.yw4dn5cgwemhmtak@yuggoth.org> Message-ID: Hi Jeremy, > DevStack used to support this, but it does indeed seem to have been > refactored out some time ago. Reintroducing that, or something like > it, could be an alternative solution though. Most of interesting test coverage is in the project's functional test jobs as well, so "just" devstack alone isn't enough, all the projects need to support this variation as well. > Do we really only care about 5 out of our many hundreds of external > Python dependencies? So running safety against our stable branch spills out this list of packages: ansible-runner cryptography django numpy msgpack pyOpenSSL urllib3 requests plus transitive (docker, zhmcclient, kubernetes). numpy is hopefully to ignore in our use case, and django is used unconstrained anyway in the gate, so I would imho remove it from this list. msgpack is an API change, so unsuitable as well. transitive changes are not needed if we have backports in the gate. so this totals in ~5 packages to maintain. I am not looking at branches that are under extended maintenance fwiw. > dependency sets (so that our stable branches don't inadvertently > develop a requirement for a new feature in newer versions of these > dependencies) lower-constraints.txt jobs try to ensure that this won't happen, assuming that we somehow bump the version numbers to X.X.X.post1, so thats not an additional thing to worry about. 
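(For reference, the kind of check described above -- comparing the pins in upper-constraints.txt against known fixed-in versions, which is roughly what "safety" automates -- can be sketched in a few lines. The advisory data below is only an example, not a complete database.)

    # Minimal sketch: flag constraint pins that are older than a known
    # fixed-in release.
    from packaging.version import Version

    ADVISORIES = {
        'requests': Version('2.20.0'),  # e.g. CVE-2018-18074 was fixed here
    }

    def vulnerable_pins(path):
        for line in open(path):
            req = line.split('#')[0].split(';')[0].strip()
            if '===' not in req:
                continue
            name, pinned = req.split('===')
            fixed_in = ADVISORIES.get(name.lower())
            if fixed_in and Version(pinned) < fixed_in:
                yield name, pinned, fixed_in

    for name, pinned, fixed_in in vulnerable_pins('upper-constraints.txt'):
        print('%s===%s is older than the fixing release %s'
              % (name, pinned, fixed_in))
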
> *and* as long as we make it clear to users that this > is not a substitute for running on security-supported distro > packages (where someone more accountable and read-in than the > OpenStack project is backporting patches for vulnerabilities to > forks of those dependencies). I'm not aware of any statement anywhere that we'd be security maintaining a distro? Where would we state that? in the project's documentation/README? Greetings, Dirk From dangtrinhnt at gmail.com Thu May 23 01:44:48 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 23 May 2019 10:44:48 +0900 Subject: [telemetry] Team meetings on May 23rd 02:00 and 08:00 In-Reply-To: References: Message-ID: Meeting in 15m :) On Fri, May 17, 2019 at 4:33 PM Trinh Nguyen wrote: > Hi team, > > I have sent this in another thread but I think It would better to do it in > a separate email. > > According to the poll [1], I will organize 2 meeting sessions on May 23rd: > > - Core contributors (mostly in APAC): 02:00-03:00 UTC > - Cross-projects contributors (mostly in the US or around): > 08:00-09:00 UTC > > Some core members or at least myself will be able to attend both meetings > so I think it should be fine. I draft the meeting agenda in [2]. Please > check it out and input the topics that you want to discuss but remember the > 1-hour meeting constraint. > > [1] https://doodle.com/poll/cd9d3ksvpms4frud > [2] https://etherpad.openstack.org/p/telemetry-meeting-agenda > > Bests, > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Thu May 23 01:58:34 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 22 May 2019 18:58:34 -0700 Subject: [nova][dev][ops] server status when compute host is down Message-ID: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Hey all, I'm looking for feedback around whether we can improve how we show server status in server list and server show when the compute host it resides on is down. When a compute host goes down while a server on it was previously running, the server status continues to show as ACTIVE in a server list. This is because the power state and status is adjusted by a periodic task run by nova-compute, so if nova-compute is down, it cannot update those states. So, for an end user, when they do a server list, they see their server as ACTIVE when it's actually powered off. We have another field called 'host_status' available since API microversion 2.16 [1] which is controlled by policy and defaults to admin, which is capable of showing the server status as UNKNOWN if the field is specified, for example: nova list --fields id,name,status,task_state,power_state,networks,host_status This is cool, but it is only available to admin by default, and it requires that the end user adds the field to their CLI command in the --fields option. Question: do people think we should make the server status field reflect UNKNOWN as well, if the 'host_status' is UNKNOWN? And if so, should it be controlled by policy or no? Normally, we do not expose compute host details to non-admin in the API by default, but I noticed recently that our "down cells" support will show server status as UNKNOWN if a server is in a down cell [2]. So I wondered if it would be considered OK to show UNKNOWN if a host is down we well, without defaulting it to admin-only. 
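(Concretely, the behaviour being asked about amounts to something like the following -- a simplified sketch, not the actual nova view code:)

    # If the compute service on the host is down, surface UNKNOWN instead of
    # the stale ACTIVE that the periodic task can no longer correct.
    def displayed_status(server_status, host_status):
        if host_status == 'UNKNOWN':
            return 'UNKNOWN'
        return server_status

    displayed_status('ACTIVE', 'UP')       # -> 'ACTIVE'
    displayed_status('ACTIVE', 'UNKNOWN')  # -> 'UNKNOWN'
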
I would really appreciate if people could share their opinion here and if consensus is in support, I will move forward with proposing a change accordingly. Cheers, -melanie [1] https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id14 [2] https://github.com/openstack/nova/blob/66a77f2fb75bbb9daebdca1cad0255ecafe41e92/nova/api/openstack/compute/views/servers.py#L108 From mthode at mthode.org Thu May 23 02:16:24 2019 From: mthode at mthode.org (Matthew Thode) Date: Wed, 22 May 2019 21:16:24 -0500 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: <20190522235350.yw4dn5cgwemhmtak@yuggoth.org> References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190522224930.yy35h7imhedm2lyy@yuggoth.org> <20190522235350.yw4dn5cgwemhmtak@yuggoth.org> Message-ID: <20190523021624.yi64clmzbbtfo53w@mthode.org> On 19-05-22 23:53:50, Jeremy Stanley wrote: > On 2019-05-23 01:09:58 +0200 (+0200), Dirk Müller wrote: > > Hi Jeremy, > > > > > Doing conformance testing on those distros with their packaged > > > versions of our external dependencies would much more closely > > > approximate what I think you want > > > > I think that would also work. Would the community be interested > > in solving conformance incompatibilities when purely vendored > > versions are used? I somehow have doubts. Would we track > > the vendored version/releases in a constraints file to ensure > > gating issues are not creeping in? > > I don't know that we need to if the goal is to let us know (e.g. > with a periodic job) that a distro we care about has upgraded a > dependency in a way that our stable branch targeting that distro > version no longer works with. > > > All the existing tooling is around tracking lower and upper > > constraints as defined by pip and our opendev defined wheel > > mirrors. > > > > Unless we have a tool that translate pip install commands into the > > respective distribution equivalent, such a vendored-test also adds > > significant drag for projects : maintaining two different ways to > > install things and for X number of vendors to cross-check and help > > debug solve integration issues. Plus the amount of extra CI load > > this might cause. Not a fun task. > > DevStack used to support this, but it does indeed seem to have been > refactored out some time ago. Reintroducing that, or something like > it, could be an alternative solution though. > > > Considering that I would prefer to volunteer maintaining a > > pypi/pip wheel fork of the ~5 dependencies with security > > vulnerabilities that we care about and pull those in instead of > > exposing the full scope of X vendors downstream specific patching > > issues to us as a community. > > Do we really only care about 5 out of our many hundreds of external > Python dependencies? Or is it that we should assume over years of > maintenance, fewer than one percent of them will discover > vulnerabilities? 
At any rate, I'm not opposed to the experiment as > long as we can still also run jobs for our original frozen > dependency sets (so that our stable branches don't inadvertently > develop a requirement for a new feature in newer versions of these > dependencies) *and* as long as we make it clear to users that this > is not a substitute for running on security-supported distro > packages (where someone more accountable and read-in than the > OpenStack project is backporting patches for vulnerabilities to > forks of those dependencies). I don't know if we only care about certain things. But if we go forward with this accepting changes that pass tests and go into another constraints file (not upper-constraints) and not actively submitting them ourselves. More opportunistic then active on our part. The other constraints file should have a LARGE warning saying that this is not a substitute for actual security backports as it is not complete (is opportunistic). -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From tony at bakeyournoodle.com Thu May 23 02:24:01 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Thu, 23 May 2019 12:24:01 +1000 Subject: [dev][requirements] Upcoming changes to constraints handling in tox.ini In-Reply-To: <20190522151836.fc3wpuiuwlwa5kki@yuggoth.org> References: <20190522030203.GD15808@thor.bakeyournoodle.com> <2e3fa8543cde73f3b93566c0a5b89c30c8d6b42b.camel@redhat.com> <20190522151836.fc3wpuiuwlwa5kki@yuggoth.org> Message-ID: <20190523022401.GG15808@thor.bakeyournoodle.com> On Wed, May 22, 2019 at 03:18:36PM +0000, Jeremy Stanley wrote: > On 2019-05-22 09:26:19 +0100 (+0100), Stephen Finucane wrote: > [...] > > I realize this is bound to be controversial, but would it be > > possible to just auto-merge these patches assuming they pass CI? > > We've had a lot of these initiatives before and, invariably, there > > are some projects that won't get around to merging these for a > > long time (if ever). We had to do this recently with the opendev > > updates to the '.gitreview' files (I think?) so there is precedent > > here. > > Well, there were two approaches we used in the OpenDev migration: > > 1. Backward-compatible mass changes which fixed things we knew would > otherwise break were proposed, given a brief opportunity for > projects to review and approve or -2, and then at an pre-announced > deadline any which were still open but passing their jobs and had no > blocking votes were bulk-approved by a Gerrit administrator who > temporarily elevated their access to act as a core reviewer for all > projects. More specifically, this was the changes to replace git:// > URLs with https:// because we were dropping support for the > protocol. > > 2. Non-backward-compatible mass changes which fixed things we knew > would otherwise be broken by the transition were committed directly > into the on-disk copies of repositories in Gerrit while the service > was offline for maintenance, entirely bypassing CI and code review. > These were changes for things like .gitreview files and zuul > pipelines/jobs/playbooks/roles. Yeah clearly not this one ;P > I think something similar to #1 might be appropriate here. 
I could > see, for example, requiring Gerrit ACLs for official OpenStack > deliverable repositories to inherit from a parent ACL (Gerrit > supports this) which includes core reviewer permissions for a group > that the Release team can temporarily add themselves to, We could use "Project Bootstrappers" for this, which I think has all the right permissions. It would be a matter of adding the right people to that group for a short period of time. It'd be the requirements team rather than release team. > for the > purposes of bulk approving relevant changes at or shortly following > the coordinated release. The release process they follow already > involves some automated group updates for reassigning control of > branches, so this probably wouldn't be too hard to incorporate. We could add a group similar to "Project Bootstrappers" but with slightly fewer permissions if we think using this approach from time-to-time is an acceptable idea. Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From tony at bakeyournoodle.com Thu May 23 02:26:00 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Thu, 23 May 2019 12:26:00 +1000 Subject: [dev][requirements] Upcoming changes to constraints handling in tox.ini In-Reply-To: References: <20190522030203.GD15808@thor.bakeyournoodle.com> Message-ID: <20190523022559.GH15808@thor.bakeyournoodle.com> On Wed, May 22, 2019 at 08:45:29AM +0200, Dirk Müller wrote: > Hi Tony, > > Thanks for the write-up. > > > 2) Switch to the new canonical constraints URL on master > > At the last Denver PTG we also discussed the switch from > UPPER_CONSTRAINTS_FILE environment variable > to TOX_CONSTRAINTS_FILE. As this change and the switch from > UPPER_CONSTRAINTS_FILE to TOX_CONSTRAINTS_FILE > would touch the very same line of text in the tox.ini, I would suggest > that we combine that into one review as that is ~ 300 reviews > less to conflict-merge and resolve when both would happen > independently at the same time. > > I started the patch series to add TOX_CONSTRAINTS_FILE in addition to > UPPER_CONSTRAINTS_FILE so that lower-constraints > setting looks less odd here: > > https://review.opendev.org/657886 > https://review.opendev.org/660187 > > Would be good to get this in in-time so that requirements team can do > both changes in one review set. Yup if they're approved when we start we can do this at the same time. It's just another line in the shell script :) Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From rony.khan at brilliant.com.bd Thu May 23 04:29:08 2019 From: rony.khan at brilliant.com.bd (Md. Farhad Hasan Khan) Date: Thu, 23 May 2019 10:29:08 +0600 Subject: neutron network namespaces not create In-Reply-To: References: <7FDE461E16EF1E4587C3A3333C7DA2D90E8C5DAC75@Email.novotel-bd.com> Message-ID: <014901d51120$10934410$31b9cc30$@brilliant.com.bd> Hi, Thanks a lot Slawomir & Michael for your quick response. I found that my neutron-server is not getting able to create namespace. I restart all service but not work. After that I reboot three controller one by one. After that it start creating namespaces. 
Thanks & B'Rgds, Rony -----Original Message----- From: Slawomir Kaplonski [mailto:skaplons at redhat.com] Sent: Wednesday, May 22, 2019 10:51 AM To: OpenStack Discuss Subject: Re: neutron network namespaces not create Hi, Yes, Michael is right. This error message is probably suspicious :) So, in ML2/OVS its neutron-ovs-agent who should create br-int during the start. It’s in [1]. So is neutron-openvswitch-agent running on Your node? Or maybe if You are using some other solution rather than ML2/OVS, maybe You should change to other than “openvswitch” interface driver in neutron-l3-agent’s config? [1] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L1074 > On 22 May 2019, at 03:19, Michael Richardson wrote: > > Hi Rony, > > Based on this line: > > BridgeDoesNotExist: Bridge br-int does not exist > > ...it looks as if your SDN-provider-of-choice (OVS ?) may need some attention. As a wild stab in the dark, is the bridge defined in bridge_mappings, for the default bridge, present, and available to be patched (hypothetically via OVS, or another provider) to br-int ? > > Cheers, > Michael. > > > On 22/05/19 2:53 AM, Md. Farhad Hasan Khan wrote: >> Hi, >> I can create router from horizon. But network namespaces not created. I check with # ip netns list command. Not found router ID, but showing in horizon. >> Here is some log I get from neutron: >> #cat /var/log/neutron/l3-agent.log >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Error while deleting router cad85ce0-6624-42ff-b42b-09480aea2613: OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.delete() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 452, in delete >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.disable_keepalived() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 178, in disable_keepalived >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent shutil.rmtree(conf_dir) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 239, in rmtree >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent onerror(os.listdir, path, sys.exc_info()) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib64/python2.7/shutil.py", line 237, in rmtree >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent names = os.listdir(path) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent OSError: [Errno 2] No such file or directory: '/var/lib/neutron/ha_confs/cad85ce0-6624-42ff-b42b-09480aea2613' >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: cad85ce0-6624-42ff-b42b-09480aea2613: BridgeDoesNotExist: Bridge br-int does not exist. 
>> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent Traceback (most recent call last): >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 562, in _process_router_update >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_router_if_compatible(router) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 481, in _process_router_if_compatible >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._process_added_router(router) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 486, in _process_added_router >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self._router_added(router['id'], router) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 375, in _router_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent router_id) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__ >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.force_reraise() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent six.reraise(self.type_, self.value, self.tb) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent ri.initialize(self.process_monitor) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 128, in initialize >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.ha_network_added() >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 198, in ha_network_added >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent mtu=self.ha_port.get('mtu')) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 263, in plug >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent bridge, namespace, prefix, mtu) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 346, in plug_new >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent self.check_bridge_exists(bridge) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/interface.py", line 221, in check_bridge_exists >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent raise exceptions.BridgeDoesNotExist(bridge=bridge) >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent BridgeDoesNotExist: Bridge br-int does not exist. >> 2019-05-21 17:37:12.640 52248 ERROR neutron.agent.l3.agent >> 2019-05-21 17:37:13.024 52248 WARNING neutron.agent.l3.agent [-] Info >> for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. 
>> Performing router cleanup >> 2019-05-21 17:37:14.358 52248 WARNING neutron.agent.l3.agent [-] Info >> for router cad85ce0-6624-42ff-b42b-09480aea2613 was not found. >> Performing router cleanup Thanks & B’Rgds, Rony > > -- > Michael Richardson > Catalyst Cloud || Catalyst IT Limited > 150-154 Willis Street, PO Box 11-053 Wellington New Zealand > http://catalyst.net.nz > GPG: 0530 4686 F996 4E2C 5DC7 6327 5C98 5EED A302 > — Slawek Kaplonski Senior software engineer Red Hat From snikitin at mirantis.com Thu May 23 06:51:05 2019 From: snikitin at mirantis.com (Sergey Nikitin) Date: Thu, 23 May 2019 10:51:05 +0400 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: <20190522213927.iuty4y5mrgw7dmjt@pacific.linksys.moosehall> References: <20190522213927.iuty4y5mrgw7dmjt@pacific.linksys.moosehall> Message-ID: Thank you for message! yes, I guess new train release wasn't added into repos (just on drop down). I'll fix it now. On Thu, May 23, 2019 at 1:39 AM Adam Spiers wrote: > There are still issues. For example nova is not showing any commits > since April: > > > https://www.stackalytics.com/?metric=commits&release=train&project_type=all&module=nova > > Rong Zhu wrote: > >Hi Sergey, > > > >Thanks for your help. Now the numbers are correctly. > > > > > >Sergey Nikitin 于2019年5月19日 周日21:12写道: > > > >> Hi, Rong, > >> > >> Database was rebuild and now stats o gengchc2 [1] is correct [2]. > >> > >> [1] > >> > https://www.stackalytics.com/?release=all&metric=commits&project_type=all&user_id=578043796-b > >> [2] https://review.opendev.org/#/q/owner:gengchc2,n,z > >> > >> Sorry for delay, > >> Sergey > >> > >> > >> > >> > >> On Fri, May 17, 2019 at 6:20 PM Sergey Nikitin > >> wrote: > >> > >>> Testing of migration process shown us that we have to rebuild database > >>> "on live". > >>> Unfortunately it means that during rebuild data will be incomplete. I > >>> talked with the colleague who did it previously and he told me that > it's > >>> normal procedure. > >>> I got these results on Monday and at this moment I'm waiting for > weekend. > >>> It's better to rebuild database in Saturday and Sunday to do now affect > >>> much number of users. > >>> So by the end of this week everything will be completed. Thank you for > >>> patient. > >>> > >>> On Fri, May 17, 2019 at 6:15 AM Rong Zhu > wrote: > >>> > >>>> Hi Sergey, > >>>> > >>>> What is the process about rebuild the database? > >>>> > >>>> Thanks, > >>>> Rong Zhu > >>>> > >>>> Sergey Nikitin 于2019年5月7日 周二00:59写道: > >>>> > >>>>> Hello Rong, > >>>>> > >>>>> Sorry for long response. I was on a trip during last 5 days. > >>>>> > >>>>> What I have found: > >>>>> Lets take a look on this patch [1]. It must be a contribution of > >>>>> gengchc2, but for some reasons it was matched to Yuval Brik [2] > >>>>> I'm still trying to find a root cause of it, but anyway on this week > we > >>>>> are planing to rebuild our database to increase RAM. I checked > statistics > >>>>> of gengchc2 on clean database and it's complete correct. > >>>>> So your problem will be solved in several days. It will take so long > >>>>> time because full rebuild of DB takes 48 hours, but we need to test > our > >>>>> migration process first to keep zero down time. > >>>>> I'll share a results with you here when the process will be finished. > >>>>> Thank you for your patience. 
> >>>>> > >>>>> Sergey > >>>>> > >>>>> [1] https://review.opendev.org/#/c/627762/ > >>>>> [2] > >>>>> > https://www.stackalytics.com/?user_id=jhamhader&project_type=all&release=all&metric=commits&company=&module=freezer-api > >>>>> > >>>>> > >>>>> On Mon, May 6, 2019 at 6:30 AM Rong Zhu > wrote: > >>>>> > >>>>>> Hi Sergey, > >>>>>> > >>>>>> Do we have any process about my colleague's data loss problem? > >>>>>> > >>>>>> Sergey Nikitin 于2019年4月29日 周一19:57写道: > >>>>>> > >>>>>>> Thank you for information! I will take a look > >>>>>>> > >>>>>>> On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu > >>>>>>> wrote: > >>>>>>> > >>>>>>>> Hi there, > >>>>>>>> > >>>>>>>> Recently we found we lost a person's data from our company at the > >>>>>>>> stackalytics website. > >>>>>>>> You can check the merged patch from [0], but there no date from > >>>>>>>> the stackalytics website. > >>>>>>>> > >>>>>>>> stackalytics info as below: > >>>>>>>> Company: ZTE Corporation > >>>>>>>> Launchpad: 578043796-b > >>>>>>>> Gerrit: gengchc2 > >>>>>>>> > >>>>>>>> Look forward to hearing from you! > >>>>>>>> > >>>>>>> > >>>>>> Best Regards, > >>>>>> Rong Zhu > >>>>>> > >>>>>>> > >>>>>>>> -- > >>>>>> Thanks, > >>>>>> Rong Zhu > >>>>>> > >>>>> > >>>>> > >>>>> -- > >>>>> Best Regards, > >>>>> Sergey Nikitin > >>>>> > >>>> -- > >>>> Thanks, > >>>> Rong Zhu > >>>> > >>> > >>> > >>> -- > >>> Best Regards, > >>> Sergey Nikitin > >>> > >> > >> > >> -- > >> Best Regards, > >> Sergey Nikitin > >> > >-- > >Thanks, > >Rong Zhu > -- Best Regards, Sergey Nikitin -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Thu May 23 07:40:02 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Thu, 23 May 2019 16:40:02 +0900 Subject: [telemetry] Team meetings on May 23rd 02:00 and 08:00 In-Reply-To: References: Message-ID: The 2nd meeting is in 20m :) On Thu, May 23, 2019 at 10:44 AM Trinh Nguyen wrote: > Meeting in 15m :) > > On Fri, May 17, 2019 at 4:33 PM Trinh Nguyen > wrote: > >> Hi team, >> >> I have sent this in another thread but I think It would better to do it >> in a separate email. >> >> According to the poll [1], I will organize 2 meeting sessions on May 23rd: >> >> - Core contributors (mostly in APAC): 02:00-03:00 UTC >> - Cross-projects contributors (mostly in the US or around): >> 08:00-09:00 UTC >> >> Some core members or at least myself will be able to attend both meetings >> so I think it should be fine. I draft the meeting agenda in [2]. Please >> check it out and input the topics that you want to discuss but remember the >> 1-hour meeting constraint. >> >> [1] https://doodle.com/poll/cd9d3ksvpms4frud >> [2] https://etherpad.openstack.org/p/telemetry-meeting-agenda >> >> Bests, >> >> -- >> *Trinh Nguyen* >> *www.edlab.xyz * >> >> > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From elfosardo at gmail.com Thu May 23 08:18:11 2019 From: elfosardo at gmail.com (Riccardo Pittau) Date: Thu, 23 May 2019 10:18:11 +0200 Subject: [Infra][ironic][Release-job-failures] release-post job for openstack/releases failed In-Reply-To: <20190522163537.GA28637@sm-workstation> References: <20190522163537.GA28637@sm-workstation> Message-ID: Hi Sean, I just started noticing the same error in other jobs, for example this: http://logs.openstack.org/08/660708/1/check/build-openstack-api-ref/4a33c64/job-output.txt.gz#_2019-05-23_08_03_42_370782 This is of course making the entire Zuul check routine fail. Riccardo On Wed, 22 May 2019 at 18:42, Sean McGinnis wrote: > > There was a failure in the post job for ironic-inspector in stable/stein. The > error was after tagging during the log collection task. > > "rsync: connection unexpectedly closed" > > http://logs.openstack.org/8f/8f71bcfb8d02cae91d0e49e5813c03c745745c92/release-post/tag-releases/52abf46/job-output.txt.gz#_2019-05-22_16_08_25_140267 > > It appears all necessary tasks completed though, so I believe this should be > safe to ignore. This failures caused the doc publishing job to be skipped, but > we had other releases that would have picked that up. > > Just sharing in case we run into something similar again and need to start > tracking down root cause of the rsync failure. If anyone sees anything unusual > that may be a side effect of this, please let us know and we can try to > investigate. > > Sean > > ----- Forwarded message from zuul at openstack.org ----- > > Date: Wed, 22 May 2019 16:08:49 +0000 > From: zuul at openstack.org > To: release-job-failures at lists.openstack.org > Subject: [Release-job-failures] release-post job for openstack/releases failed > Reply-To: openstack-discuss at lists.openstack.org > > Build failed. > > - tag-releases http://logs.openstack.org/8f/8f71bcfb8d02cae91d0e49e5813c03c745745c92/release-post/tag-releases/52abf46/ : POST_FAILURE in 4m 48s > - publish-tox-docs-static publish-tox-docs-static : SKIPPED > > _______________________________________________ > Release-job-failures mailing list > Release-job-failures at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/release-job-failures > > ----- End forwarded message ----- > From elfosardo at gmail.com Thu May 23 09:36:14 2019 From: elfosardo at gmail.com (Riccardo Pittau) Date: Thu, 23 May 2019 11:36:14 +0200 Subject: [Infra][ironic][Release-job-failures] release-post job for openstack/releases failed In-Reply-To: References: <20190522163537.GA28637@sm-workstation> Message-ID: Hello again! I think this change https://review.opendev.org/659975 broke a couple of things. For example, when running build-openstack-api-ref, the tox_envlist is correctly set to api-ref, overwriting the value of docs from the parent openstack-tox-docs That means that tox will never build docs, so the doc/build dir doesn't exist, that's why the rsync error afterwards. The fetch-sphinx-output role should reflect that or at least evaluate the presence of the dir. Thanks, Riccardo On Thu, 23 May 2019 at 10:18, Riccardo Pittau wrote: > > Hi Sean, > > I just started noticing the same error in other jobs, for example this: > http://logs.openstack.org/08/660708/1/check/build-openstack-api-ref/4a33c64/job-output.txt.gz#_2019-05-23_08_03_42_370782 > > This is of course making the entire Zuul check routine fail. 
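A minimal sketch of the kind of guard Riccardo describes, written here as a standalone Python check rather than the actual Ansible role, and assuming the conventional doc/build/html and api-ref/build/html output paths:

import os
import sys

def sphinx_output_dirs(workdir="."):
    # Which directory exists depends on the tox env that ran (docs vs api-ref).
    candidates = ("doc/build/html", "api-ref/build/html")
    return [d for d in candidates if os.path.isdir(os.path.join(workdir, d))]

found = sphinx_output_dirs()
if not found:
    sys.exit("no sphinx output present; skip collection instead of failing rsync")
print("collecting: " + ", ".join(found))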
> > Riccardo > > On Wed, 22 May 2019 at 18:42, Sean McGinnis wrote: > > > > There was a failure in the post job for ironic-inspector in stable/stein. The > > error was after tagging during the log collection task. > > > > "rsync: connection unexpectedly closed" > > > > http://logs.openstack.org/8f/8f71bcfb8d02cae91d0e49e5813c03c745745c92/release-post/tag-releases/52abf46/job-output.txt.gz#_2019-05-22_16_08_25_140267 > > > > It appears all necessary tasks completed though, so I believe this should be > > safe to ignore. This failures caused the doc publishing job to be skipped, but > > we had other releases that would have picked that up. > > > > Just sharing in case we run into something similar again and need to start > > tracking down root cause of the rsync failure. If anyone sees anything unusual > > that may be a side effect of this, please let us know and we can try to > > investigate. > > > > Sean > > > > ----- Forwarded message from zuul at openstack.org ----- > > > > Date: Wed, 22 May 2019 16:08:49 +0000 > > From: zuul at openstack.org > > To: release-job-failures at lists.openstack.org > > Subject: [Release-job-failures] release-post job for openstack/releases failed > > Reply-To: openstack-discuss at lists.openstack.org > > > > Build failed. > > > > - tag-releases http://logs.openstack.org/8f/8f71bcfb8d02cae91d0e49e5813c03c745745c92/release-post/tag-releases/52abf46/ : POST_FAILURE in 4m 48s > > - publish-tox-docs-static publish-tox-docs-static : SKIPPED > > > > _______________________________________________ > > Release-job-failures mailing list > > Release-job-failures at lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/release-job-failures > > > > ----- End forwarded message ----- > > From frickler at offenerstapel.de Thu May 23 10:04:15 2019 From: frickler at offenerstapel.de (Jens Harbott) Date: Thu, 23 May 2019 10:04:15 +0000 Subject: [Infra][ironic][Release-job-failures] release-post job for openstack/releases failed In-Reply-To: References: <20190522163537.GA28637@sm-workstation> Message-ID: On Thu, 2019-05-23 at 11:36 +0200, Riccardo Pittau wrote: > Hello again! > > I think this change https://review.opendev.org/659975 broke a couple > of things. > > For example, when running build-openstack-api-ref, the tox_envlist is > correctly set to api-ref, overwriting the value of docs from the > parent openstack-tox-docs > That means that tox will never build docs, so the doc/build dir > doesn't exist, that's why the rsync error afterwards. > > The fetch-sphinx-output role should reflect that or at least evaluate > the presence of the dir. I think the correct fix is to keep the removed variable as proposed in https://review.opendev.org/660962 . Sorry for merging the broken patch in the first place. Jens From mbooth at redhat.com Thu May 23 10:11:19 2019 From: mbooth at redhat.com (Matthew Booth) Date: Thu, 23 May 2019 11:11:19 +0100 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: On Thu, 23 May 2019 at 03:02, melanie witt wrote: > > Hey all, > > I'm looking for feedback around whether we can improve how we show > server status in server list and server show when the compute host it > resides on is down. > > When a compute host goes down while a server on it was previously > running, the server status continues to show as ACTIVE in a server list. 
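To see the mismatch being described, a rough sketch using openstacksdk and the host_status field discussed just below; "mycloud" is a placeholder clouds.yaml entry, and host_status only appears if policy allows it:

import openstack

conn = openstack.connect(cloud="mycloud")
# host_status is included in /servers/detail from compute microversion 2.16 on
resp = conn.compute.get("/servers/detail",
                        headers={"X-OpenStack-Nova-API-Version": "2.16"})
for server in resp.json().get("servers", []):
    if server.get("host_status") == "UNKNOWN":
        # status can still say ACTIVE even though the compute host is unreachable
        print("%s: status=%s, host_status=UNKNOWN" % (server["name"], server["status"]))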
> This is because the power state and status is adjusted by a periodic > task run by nova-compute, so if nova-compute is down, it cannot update > those states. > > So, for an end user, when they do a server list, they see their server > as ACTIVE when it's actually powered off. > > We have another field called 'host_status' available since API > microversion 2.16 [1] which is controlled by policy and defaults to > admin, which is capable of showing the server status as UNKNOWN if the > field is specified, for example: > > nova list --fields > id,name,status,task_state,power_state,networks,host_status > > This is cool, but it is only available to admin by default, and it > requires that the end user adds the field to their CLI command in the > --fields option. > > Question: do people think we should make the server status field reflect > UNKNOWN as well, if the 'host_status' is UNKNOWN? And if so, should it > be controlled by policy or no? > > Normally, we do not expose compute host details to non-admin in the API > by default, but I noticed recently that our "down cells" support will > show server status as UNKNOWN if a server is in a down cell [2]. So I > wondered if it would be considered OK to show UNKNOWN if a host is down > we well, without defaulting it to admin-only. +1 from me. This seems to have confused users in the past and honest is better than potentially wrong, imho. I can't think of a reason why this information 'leak' would cause any problems. Can anybody else? Matt > > I would really appreciate if people could share their opinion here and > if consensus is in support, I will move forward with proposing a change > accordingly. > > Cheers, > -melanie > > [1] > https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id14 > [2] > https://github.com/openstack/nova/blob/66a77f2fb75bbb9daebdca1cad0255ecafe41e92/nova/api/openstack/compute/views/servers.py#L108 > -- Matthew Booth Red Hat OpenStack Engineer, Compute DFG Phone: +442070094448 (UK) From openstack at fried.cc Thu May 23 10:36:32 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 23 May 2019 05:36:32 -0500 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: +1 from me too. > I can't think of a reason why > this information 'leak' would cause any problems. Can anybody else? Me neither. But if controlled by policy, the paranoid admin can decide. efried -------------- next part -------------- An HTML attachment was scrubbed... URL: From witold.bedyk at suse.com Thu May 23 11:24:37 2019 From: witold.bedyk at suse.com (Witek Bedyk) Date: Thu, 23 May 2019 13:24:37 +0200 Subject: [monasca] Proposing Akhil Jain for Monasca core team Message-ID: <8bfbc2a3-03a4-d469-f723-09aba5b2b96b@suse.com> Hello team, I would like to propose Akhil Jain to join the Monasca core team. Akhil has added authenticated webhook notification support [1] and worked on integrating Monasca with Congress, where he's also the core reviewer. His work has been presented at the last Open Infrastructure Summit in Denver [2]. Akhil has good understanding of Monasca and OpenStack architecture and constantly helps us in providing sensible reviews. I'm sure Monasca project will benefit from this nomination. 
Cheers Witek [1] https://storyboard.openstack.org/#!/story/2003105 [2] https://www.openstack.org/summit/denver-2019/summit-schedule/events/23261/policy-driven-fault-management-of-nfv-eco-system From balazs.gibizer at ericsson.com Thu May 23 11:26:05 2019 From: balazs.gibizer at ericsson.com (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Thu, 23 May 2019 11:26:05 +0000 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: <1558610762.28465.6@smtp.office365.com> On Thu, May 23, 2019 at 3:58 AM, melanie witt wrote: > > Question: do people think we should make the server status field > reflect UNKNOWN as well, if the 'host_status' is UNKNOWN? And if so, > should it be controlled by policy or no? > Works for me. Cheers, gibi From surya.seetharaman9 at gmail.com Thu May 23 11:39:31 2019 From: surya.seetharaman9 at gmail.com (Surya Seetharaman) Date: Thu, 23 May 2019 13:39:31 +0200 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: On Thu, May 23, 2019 at 3:59 AM melanie witt wrote: > Hey all, > > Question: do people think we should make the server status field reflect > UNKNOWN as well, if the 'host_status' is UNKNOWN? And if so, should it > be controlled by policy or no? > > +1 to doing this with a policy. I would prefer giving the ability/choice to the operators to opt-out of it if they want to. ----------- Regards, Surya. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sundar.nadathur at intel.com Thu May 23 12:00:26 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Thu, 23 May 2019 12:00:26 +0000 Subject: [nova] [cyborg] Impact of moving bind to compute Message-ID: <1CC272501B5BC543A05DB90AA509DED52757522F@fmsmsx122.amr.corp.intel.com> Hi, The feedback in the Nova - Cyborg interaction spec [1] is to move the call for creating/binding accelerator requests (ARQs) from the conductor (just before the call to build_and_run_instance, [2]) to the compute manager (just before spawn, without holding the build sempahore [3]). The point where the results of the bind are needed is in the virt driver [4] - that is not changing. The reason for the move is to enable Cyborg to notify Nova [5] instead of Nova virt driver polling Cyborg, thus making the interaction similar to other services like Neutron. The binding involves device preparation by Cyborg, which may take some time (ballpark: milliseconds to few seconds to perhaps 10s of seconds - of course devices vary a lot). We want to overlap as much of this as possible with other tasks, by starting the binding as early as possible and making it asynchronous, so that bulk VM creation rate etc. are not affected. These considerations are probably specific to Cyborg, so trying to make it uniform with other projects deserve a closer look before we commit to it. Moving the binding from [2] to [3] reduces this overlap. I did some measurements of the time window from [2] to [3]: it was consistently between 20 and 50 milliseconds, whether I launched 1 VM at a time, 2 at a time, etc. This seems acceptable. But this was just in a two-node deployment. Are there situations where this window could get much larger (thus reducing the overlap)? Such as in larger deployments, or issues with RabbitMQ messaging, etc. 
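As a toy illustration of the overlap Sundar describes (not the actual Nova/Cyborg code paths): kick off the bind early, keep doing the rest of the boot preparation, and only block on the result right before the virt driver needs it.

import concurrent.futures
import time

def bind_arqs(instance_uuid):
    time.sleep(2)          # stand-in for device preparation/programming in Cyborg
    return {"instance": instance_uuid, "state": "Bound"}

def other_boot_prep(instance_uuid):
    time.sleep(1)          # stand-in for network/volume/image preparation
    return "prep done for %s" % instance_uuid

with concurrent.futures.ThreadPoolExecutor() as pool:
    bind = pool.submit(bind_arqs, "fake-uuid")   # start binding as early as possible
    print(other_boot_prep("fake-uuid"))          # overlaps with the bind
    print(bind.result(timeout=30))               # only wait where the result is needed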
Are there larger considerations of performance or scaling for this approach? Thanks in advance. [1] https://review.opendev.org/#/c/603955/ [2] https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L1501 [3] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1882 [4] https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L3215 [5] https://wiki.openstack.org/wiki/Nova/ExternalEventAPI Regards, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL:
From doug at stackhpc.com Thu May 23 12:23:31 2019 From: doug at stackhpc.com (Doug Szumski) Date: Thu, 23 May 2019 13:23:31 +0100 Subject: [monasca] Proposing Akhil Jain for Monasca core team In-Reply-To: <8bfbc2a3-03a4-d469-f723-09aba5b2b96b@suse.com> References: <8bfbc2a3-03a4-d469-f723-09aba5b2b96b@suse.com> Message-ID: <60faa8a1-51ea-bea6-93f0-855fdc0ccde9@stackhpc.com> On 23/05/2019 12:24, Witek Bedyk wrote: > Hello team, > > I would like to propose Akhil Jain to join the Monasca core team. +1, thanks for your contributions Akhil! > Akhil has added authenticated webhook notification support [1] and > worked on integrating Monasca with Congress, where he's also the core > reviewer. His work has been presented at the last Open Infrastructure > Summit in Denver [2]. > > Akhil has good understanding of Monasca and OpenStack architecture and > constantly helps us in providing sensible reviews. I'm sure Monasca > project will benefit from this nomination. > > Cheers > Witek > > [1] https://storyboard.openstack.org/#!/story/2003105 > [2] > https://www.openstack.org/summit/denver-2019/summit-schedule/events/23261/policy-driven-fault-management-of-nfv-eco-system >
From ikuo.otani.rw at hco.ntt.co.jp Thu May 23 12:58:54 2019 From: ikuo.otani.rw at hco.ntt.co.jp (Ikuo Otani) Date: Thu, 23 May 2019 21:58:54 +0900 Subject: [cyborg][nova][sdk]Cyborgclient integration Message-ID: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> Hi, I am a Cyborg member and have taken on the role of integrating openstacksdk and replacing the use of python-*client. Related blueprint: https://blueprints.launchpad.net/nova/+spec/openstacksdk-in-nova My question is: when should the first code be uploaded to gerrit? From my understanding, we should update the cyborg client library following the openstacksdk rules, and have it reviewed in gerrit by Eric Fried. I'm sorry if I misunderstand. Thanks in advance, Ikuo NTT Network Service Systems Laboratories Server Network Innovation Project Ikuo Otani TEL: +81-422-59-4140 Email: ikuo.otani.rw at hco.ntt.co.jp
From marcin.juszkiewicz at linaro.org Thu May 23 13:27:58 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Thu, 23 May 2019 15:27:58 +0200 Subject: [all][qinling] Please check your README files Message-ID: The Train cycle is supposed to be the "we really go for Python 3" cycle, contrary to the previous "we do not have to care about Python 3" ones. I am working on switching Kolla images to Python 3 by default [1]. It is a job I would not recommend even to my potential enemies. Misc projects fail in random ways (I sent a bunch of patches). 1. https://review.opendev.org/#/c/642375 Quite a common issue is related to README files. I know that it is the XXI century and UTF-8 is the encoding most people use here. But Python tooling is not so advanced yet and loves to explode when it sees characters outside of US-ASCII encoding. Please check your README files. And move them to US-ASCII if needed.
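If it helps, a quick way to spot the offending bytes (just a local helper sketch, not part of any gate job):

import sys

def check_ascii(path="README.rst"):
    with open(path, "rb") as f:
        data = f.read()
    bad = [(i, b) for i, b in enumerate(data) if b > 0x7f]
    for offset, byte in bad:
        print("%s: non-ASCII byte 0x%02x at offset %d" % (path, byte, offset))
    return not bad

if __name__ == "__main__":
    sys.exit(0 if check_ascii(*sys.argv[1:]) else 1)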
I already fixed few projects [2][3][4][5] but would love to see developers fixing their code too. 2. https://review.opendev.org/644531 3. https://review.opendev.org/644533 4. https://review.opendev.org/644535 5. https://review.opendev.org/644536 From muroi.masahito at lab.ntt.co.jp Thu May 23 13:39:23 2019 From: muroi.masahito at lab.ntt.co.jp (Masahito MUROI) Date: Thu, 23 May 2019 22:39:23 +0900 Subject: [blazar] Limits to reservation usage In-Reply-To: References: Message-ID: Hello, Thanks for nice feedback from first US timezome meeting. IMO, it's good to introduce limits or quota for Blazar's top level resources, Leases, Hosts, FloatingIP, and etc. It makes possible to handle any type of resource_type in reservations. The additional limitation for length of reservations the Chameleon cloud does looks good to me as an additional quota mechanism. best regards, Masahito On 2019/05/18 3:19, Pierre Riteau wrote: > Hello, > > We had a very interesting first Blazar IRC meeting for the Americas > last week [1]. > > One of the topics discussed was enforcing limits to reservation usage. > Currently, upstream Blazar doesn't provide ways to limit reservations > per user or project. It is technically possible for users to reserve > more resources than what their quotas allows them to use. > > The Chameleon project, which has been running Blazar in production for > several years, has extended it to: > > a) enforce operator-defined limits to reservation length (e.g. 7 days) > b) when creating or updating reservations, check whether the project > has enough available Service Units (SU) in their allocation, which is > stored in a custom external database > > George Turner from Indiana University Bloomington explained how > Jetstream, if it were to use Blazar to share GPU resources, would have > a similar requirement to check reservation usage against XSEDE > allocations (which are again stored in a custom database). > > I am starting this thread to discuss how Blazar can support enforcing > these kinds of limits, making sure that it is generic enough to be > plugged with the various custom allocation backends in use. > > 1) Blazar should check Nova limits as a guide for limiting reservation > usage at any point in time: if number of instances quota is 8, we > shouldn't allow the user to reserve more than 8 instances at any point > in time. > > 2) In addition, Blazar could use a quota to limit how much resources > can be allocated in advance. Operators may be happy for projects to > reserve 8 instances for a month, but not for a century. This could be > expressed as a time dimension that would apply to the instance / cores > / ram quotas. > > 3) If Blazar was making REST requests to a customisable endpoint on > reservation creation / update, expecting to get a simple yes/no answer > (with a human-friendly error message, like how much SUs are left > compared to how much would be used), would people be motivated to > write a small REST service making the link between Blazar and any > custom allocation backend? > > Feel free to reply to this message or join our next meeting on > Thursday May 23 at 1600 UTC. 
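A sketch of the yes/no check described in point 3 above, as seen from Blazar's side; the endpoint URL and payload shape are made up here, not an existing interface:

import requests

def check_allocation(endpoint, project_id, start, end, resources):
    payload = {"project_id": project_id, "start": start, "end": end,
               "resources": resources}
    resp = requests.post(endpoint, json=payload, timeout=10)
    resp.raise_for_status()
    body = resp.json()   # e.g. {"allowed": false, "message": "not enough SUs left"}
    return body.get("allowed", False), body.get("message", "")

allowed, message = check_allocation("https://allocations.example.org/check",
                                    "abc123", "2019-06-01T00:00",
                                    "2019-06-08T00:00", {"hosts": 8})
if not allowed:
    print("reservation rejected: %s" % message)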
> > Cheers, > Pierre > > [1] http://eavesdrop.openstack.org/meetings/blazar/2019/blazar.2019-05-09-16.01.log.html > > > From bxzhu_5355 at 163.com Thu May 23 13:46:06 2019 From: bxzhu_5355 at 163.com (Boxiang Zhu) Date: Thu, 23 May 2019 21:46:06 +0800 (CST) Subject: [nova] Validation for requested host/node on server create Message-ID: <59ee8034.193bc.16ae4f11882.Coremail.bxzhu_5355@163.com> +1 for option 3, too. Check host/hypervisor_hostname fisrt in API layer so that we will not create a "ERROR" vm with "NoValidHost" exception. >> It seems we've come to an impasse on this change [1] because of a >> concern about where to validate the requested host and/or >> hypervisor_hostname. >> >> The change is currently validating in the API to provide a fast fail 400 >> response to the user if the host and/or node don't exist. The concern is >> that the lookup for the compute node will be done again in the scheduler >> [2]. Also, if the host is not provided, then to validate the node we >> have to iterate the cells looking for the given compute node (we could >> use placement though, more on that below). >> >> I've added this to the nova meeting agenda for tomorrow but wanted to >> try and enumerate what I see are the options before the meeting so we >> don't have to re-cap all of this during the meeting. >> >> The options as I see them are: >> >> 1. Omit the validation in the API and let the scheduler do the validation. >> >> Pros: no performance impact in the API when creating server(s) >> >> Cons: if the host/node does not exist, the user will get a 202 response >> and eventually a NoValidHost error which is not a great user experience, >> although it is what happens today with the availability_zone forced >> host/node behavior we already have [3] so maybe it's acceptable. >> >> 2. Only validate host in the API since we can look up the HostMapping in >> the API DB. If the user also provided a node then we'd just throw that >> on the RequestSpec and let the scheduler code validate it. >> >> Pros: basic validation for the simple and probably most widely used case >> since for non-baremetal instances the host and node are going to be the same >> >> Cons: still could have a late failure in the scheduler with NoValidHost >> error; does not cover the case that only node (no host) is specified >> >> 3. Validate both the host and node in the API. This can be broken down: >> >> a) If only host is specified, do #2 above. >> b) If only node is specified, iterate the cells looking for the node (or >> query a resource provider with that name in placement which would avoid >> down cell issues) >> c) If both host and node is specified, get the HostMapping and from that >> lookup the ComputeNode in the given cell (per the HostMapping) >> >> Pros: fail fast behavior in the API if either the host and/or node do >> not exist >> >> Cons: performance hit in the API to validate the host/node and >> redundancy with the scheduler to find the ComputeNode to get its uuid >> for the in_tree filtering on GET /allocation_candidates. >> >> Note that if we do find the ComputeNode in the API, we could also >> (later?) make a change to the Destination object to add a node_uuid IMHO, is it better to call it compute_node_uuid? : ) >> field so we can pass that through on the RequestSpec from >> API->conductor->scheduler and that should remove the need for the >> duplicate query in the scheduler code for the in_tree logic. 
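For option 3b, a rough sketch of the placement lookup (placeholder cloud and node names; the real change would live in nova-api rather than a script):

import openstack

conn = openstack.connect(cloud="mycloud")
node_name = "compute-1"
# GET /resource_providers?name=<node> returns an empty list if the node is unknown
resp = conn.session.get("/resource_providers?name=" + node_name,
                        endpoint_filter={"service_type": "placement"})
providers = resp.json().get("resource_providers", [])
if not providers:
    print("fail fast with 400: compute node %r does not exist" % node_name)
else:
    print("node exists, provider uuid %s" % providers[0]["uuid"])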
>> >> I'm personally in favor of option 3 since we know that users hate >> NoValidHost errors and we have ways to mitigate the performance overhead >> of that validation. > >Count me in the option 3 boat too, for the same reasons. Rather avoid >NoValidHost and there's mitigation we can do for the perf issue. > >-melanie > >> Note that this isn't necessarily something that has to happen in the >> same change that introduces the host/hypervisor_hostname parameters to >> the API. If we do the validation in the API I'd probably split the >> validation logic into it's own patch to make it easier to test and >> review on its own. >> >> [1] https://review.opendev.org/#/c/645520/ >> [2] >> https://github.com/openstack/nova/blob/2e85453879533af0b4d0e1178797d26f026a9423/nova/scheduler/utils.py#L528 >> [3] https://docs.openstack.org/nova/latest/admin/availability-zones.html >> -- Boxiang -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at nemebean.com Thu May 23 13:50:31 2019 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 23 May 2019 08:50:31 -0500 Subject: [all][qinling] Please check your README files In-Reply-To: References: Message-ID: <2d210f25-db54-5822-f54f-28283adbadbd@nemebean.com> On 5/23/19 8:27 AM, Marcin Juszkiewicz wrote: > Train cycle is supposed to be "we really go for Python 3" cycle. > Contrary to previous "we do not have to care about Python 3" ones. > > I am working on switching Kolla images into Python 3 by default [1]. It > is a job I would not recommend even to my potential enemies. Misc > projects fail into random ways (sent a bunch of patches). > > 1. https://review.opendev.org/#/c/642375 > > Quite common issue is related to README files. I know that we have XXI > century and UTF-8 is encoding for most of people here. But Python tools > are not so advanced so far and loves to explode when see characters > outside of US-ASCII encoding. Hmm, is this because of https://bugs.launchpad.net/pbr/+bug/1704472 ? If so, we should just fix it in pbr. I have a patch up to do that (https://review.opendev.org/#/c/564874) but I haven't gotten around to writing tests for it. I'll try to get that done shortly. > > Please check your README files. And move them to US-ASCII if needed. > > I already fixed few projects [2][3][4][5] but would love to see > developers fixing their code too. > > 2. https://review.opendev.org/644531 > 3. https://review.opendev.org/644533 > 4. https://review.opendev.org/644535 > 5. https://review.opendev.org/644536 > From marcin.juszkiewicz at linaro.org Thu May 23 13:57:01 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Thu, 23 May 2019 15:57:01 +0200 Subject: [all][qinling] Please check your README files In-Reply-To: <2d210f25-db54-5822-f54f-28283adbadbd@nemebean.com> References: <2d210f25-db54-5822-f54f-28283adbadbd@nemebean.com> Message-ID: <178883ca-fe3f-582f-fe6b-1482e3ca0411@linaro.org> W dniu 23.05.2019 o 15:50, Ben Nemec pisze: > On 5/23/19 8:27 AM, Marcin Juszkiewicz wrote: >> Quite common issue is related to README files. I know that we have XXI >> century and UTF-8 is encoding for most of people here. But Python tools >> are not so advanced so far and loves to explode when see characters >> outside of US-ASCII encoding. > > Hmm, is this because of https://bugs.launchpad.net/pbr/+bug/1704472 ? > > If so, we should just fix it in pbr. I have a patch up to do that > (https://review.opendev.org/#/c/564874) but I haven't gotten around to > writing tests for it. 
I'll try to get that done shortly. Thanks! I did not digged in whole stack as there were several other things to debug. From dms at danplanet.com Thu May 23 14:05:19 2019 From: dms at danplanet.com (Dan Smith) Date: Thu, 23 May 2019 07:05:19 -0700 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: (Surya Seetharaman's message of "Thu, 23 May 2019 13:39:31 +0200") References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: > Question: do people think we should make the server status field > reflect UNKNOWN as well, if the 'host_status' is UNKNOWN? And if so, > should it be controlled by policy or no? Do we have other things that change *value* depending on policy? I was thinking that was one of the situations the policy people (i.e. Matt) have avoided in the past. Also, AFAIK, our documentation specifies (and existing behavior is) to only return UNKNOWN in the case where we return a partial instance because we couldn't look up the rest of the details from the cell. This would break that relationship, and I'm not sure how people would know that they shouldn't expect a full instance record, other than to poke it with a stick to see if it contains certain properties. > +1 to doing this with a policy. I would prefer giving the > ability/choice to the operators to opt-out of it if they want to. In general, I think we should try to avoid leaking things about the infrastructure to regular users. In the case of a cell being down, we couldn't really fake it because we don't have much of the information available to us. I agree that a host being down is not that different from a cell being down from the perspective of a user, but I also think that allowing operators to opt-in to such a disclosure would be better, although as above, I start to worry about the degrees of freedom in the response. My biggest concern, which came out during the host status discussion, is that we should *not* say the instance is "down" just because the compute service is unreachable. Saying it's in "unknown" state is better. I'd like to hear from some more operators about whether they would opt-in to this unknown-state behavior for compute host down-age. Specifically, whether they want customer instances to show as "unknown" state while they're doing an upgrade that otherwise wouldn't impact the instance's health. --Dan From openstack at fried.cc Thu May 23 14:10:27 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 23 May 2019 09:10:27 -0500 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> Message-ID: Hi Ikuo. I'm glad you're interested in helping out with this effort. I'm trying to understand where you intend to make your changes, assuming you're coming at this purely from a cyborg perspective: - If in Nova, this isn't necessary because there's no python-cyborgclient integration there. Communication from Nova to Cyborg is being added as part of blueprint nova-cyborg-interaction [0], but it'll be done without python-cyborgclient from the start. The patch at [1] sets up direct REST API communication through a KSA adapter. Once we have base openstacksdk enablement in Nova [2] we can simply s/get_ksa_adapter/get_sdk_adapter/ at [3]. And in the future as the sdk starts supporting richer methods for cyborg, we can replace the direct REST calls in that file (see [4] later in that series to get an idea of what kinds of calls those will be). 
- If in the cyborg CLI, I'm afraid I have very little context there. There's a (nearly-official [5]) community push to make all CLIs OSC-based. I'm not sure what state the cyborg CLI is in, but I would have expected it will need brand new work to expose the v2 changes being done for Train. From that perspective I would say: do that via OSC. But that's not really related to bp/openstacksdk-in-nova. - If in python-cyborgclient, again I lack background, but I don't think there's a need to make changes here. The point of bp/openstacksdk-in-nova (or openstacksdk-anywhere) is to get rid of usages of client libraries like python-cyborgclient. Where is python-cyborgclient being used today? If it's just in the CLI, and you can make the above (conversion to OSC) happen, it may be possible to retire python-cyborgclient entirely \o/ Now, if you're generally available to contribute to either bp/nova-cyborg-interaction (by helping with the nova code) or bp/openstack-sdk-in-nova (on non-cyborg-related aspects), I would be delighted to get you ramped up. We could sure use the help. Please let me know if you're interested. Thanks, efried [0] https://blueprints.launchpad.net/nova/+spec/nova-cyborg-interaction [1] https://review.opendev.org/#/c/631242/ [2] https://review.opendev.org/#/c/643664/ [3] https://review.opendev.org/#/c/631242/19/nova/accelerator/cyborg.py at 23 [4] https://review.opendev.org/#/c/631245/13/nova/accelerator/cyborg.py [5] https://review.opendev.org/#/c/639376/ On 5/23/19 7:58 AM, Ikuo Otani wrote: > Hi, > > I am a Cyborg member and take the role of integrating openstacksdk and replacing use of python-*client. > Related bluespec: > https://blueprints.launchpad.net/nova/+spec/openstacksdk-in-nova > > My question is: > When the first code should be uploaded to gerrit? > From my understanding, we should update cyborg client library referring to openstacksdk rule, > and make it reviewed in gerrit by Eric Fried. > I'm sorry if I misunderstand. > > Thanks in advance, > Ikuo > > NTT Network Service Systems Laboratories > Server Network Innovation Project > Ikuo Otani > TEL: +81-422-59-4140 > Email: ikuo.otani.rw at hco.ntt.co.jp > > > > From surya.seetharaman9 at gmail.com Thu May 23 14:46:21 2019 From: surya.seetharaman9 at gmail.com (Surya Seetharaman) Date: Thu, 23 May 2019 16:46:21 +0200 Subject: [nova] Validation for requested host/node on server create In-Reply-To: <78fa937a-beb6-c63d-01a0-40e6519928be@gmail.com> References: <78fa937a-beb6-c63d-01a0-40e6519928be@gmail.com> Message-ID: On Thu, May 23, 2019 at 12:16 AM Matt Riedemann wrote: > > 1. Omit the validation in the API and let the scheduler do the validation. > > Pros: no performance impact in the API when creating server(s) > > Cons: if the host/node does not exist, the user will get a 202 response > and eventually a NoValidHost error which is not a great user experience, > although it is what happens today with the availability_zone forced > host/node behavior we already have [3] so maybe it's acceptable. > What I had in mind when suggesting this was to actually return a Host/NodeNotFound exception from the host_manager [1] instead of confusing that with the NoValidHost exception when its actually not a NoValidHost (as this is usually associated with host capacity) if the host or node doesn't exist. I know that it has already been implemented as a NoValidHost [2] but we could change this. > 3. Validate both the host and node in the API. This can be broken down: > > a) If only host is specified, do #2 above. 
> b) If only node is specified, iterate the cells looking for the node (or > query a resource provider with that name in placement which would avoid > down cell issues) > c) If both host and node is specified, get the HostMapping and from that > lookup the ComputeNode in the given cell (per the HostMapping) > > Pros: fail fast behavior in the API if either the host and/or node do > not exist > > Cons: performance hit in the API to validate the host/node and > redundancy with the scheduler to find the ComputeNode to get its uuid > for the in_tree filtering on GET /allocation_candidates. > I don't mind if we did this as long as don't hit all the cells twice (in api and scheduler) which like you said could be avoided by going to placement. > > Note that if we do find the ComputeNode in the API, we could also > (later?) make a change to the Destination object to add a node_uuid > field so we can pass that through on the RequestSpec from > API->conductor->scheduler and that should remove the need for the > duplicate query in the scheduler code for the in_tree logic. > > I guess we discussed this in a similar(ly different) situation and decided against it [3]. [1] https://github.com/openstack/nova/blob/c7e9e667426a6d88d396a59cb40d30763a3265f9/nova/scheduler/host_manager.py#L660 [2] https://github.com/openstack/nova/blob/c7e9e667426a6d88d396a59cb40d30763a3265f9/nova/scheduler/utils.py#L533 [3] https://review.opendev.org/#/c/646029/4/specs/train/approved/use-placement-in-tree.rst at 58 -- Regards, Surya. -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Thu May 23 15:02:02 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 23 May 2019 16:02:02 +0100 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> Message-ID: On Thu, 2019-05-23 at 09:10 -0500, Eric Fried wrote: > Hi Ikuo. I'm glad you're interested in helping out with this effort. > > I'm trying to understand where you intend to make your changes, assuming > you're coming at this purely from a cyborg perspective: > > - If in Nova, this isn't necessary because there's no > python-cyborgclient integration there. Communication from Nova to Cyborg > is being added as part of blueprint nova-cyborg-interaction [0], but > it'll be done without python-cyborgclient from the start. The patch at > [1] sets up direct REST API communication through a KSA adapter. Once we > have base openstacksdk enablement in Nova [2] we can simply > s/get_ksa_adapter/get_sdk_adapter/ at [3]. And in the future as the sdk > starts supporting richer methods for cyborg, we can replace the direct > REST calls in that file (see [4] later in that series to get an idea of > what kinds of calls those will be). > > - If in the cyborg CLI, I'm afraid I have very little context there. > There's a (nearly-official [5]) community push to make all CLIs > OSC-based. I'm not sure what state the cyborg CLI is in, but I would > have expected it will need brand new work to expose the v2 changes being > done for Train. From that perspective I would say: do that via OSC. But > that's not really related to bp/openstacksdk-in-nova. given the cyborg api has 2 enpoint with like 10 effective action you can prefrom crun operation on acclerator + deployable + program it shoudl be trivail to add port to an osc pugin at this point. 
if you add the api support to the openstack sdk first you can kill two birds with one stone and just make the osc plugin for cyborg call the sdk functions, then completely drop python-cyborgclient. from a nova integration point of view we could then also just use the sdk functions. > > - If in python-cyborgclient, again I lack background, but I don't think > there's a need to make changes here. The point of > bp/openstacksdk-in-nova (or openstacksdk-anywhere) is to get rid of > usages of client libraries like python-cyborgclient. Where is > python-cyborgclient being used today? If it's just in the CLI, and you > can make the above (conversion to OSC) happen, it may be possible to > retire python-cyborgclient entirely \o/ > > Now, if you're generally available to contribute to either > bp/nova-cyborg-interaction (by helping with the nova code) or > bp/openstack-sdk-in-nova (on non-cyborg-related aspects), I would be > delighted to get you ramped up. We could sure use the help. Please let > me know if you're interested. > > Thanks, > efried > > [0] https://blueprints.launchpad.net/nova/+spec/nova-cyborg-interaction > [1] https://review.opendev.org/#/c/631242/ > [2] https://review.opendev.org/#/c/643664/ > [3] https://review.opendev.org/#/c/631242/19/nova/accelerator/cyborg.py at 23 > [4] https://review.opendev.org/#/c/631245/13/nova/accelerator/cyborg.py > [5] https://review.opendev.org/#/c/639376/ > > On 5/23/19 7:58 AM, Ikuo Otani wrote: > > Hi, > > > > I am a Cyborg member and take the role of integrating openstacksdk and replacing use of python-*client. > > Related bluespec: > > https://blueprints.launchpad.net/nova/+spec/openstacksdk-in-nova > > > > My question is: > > When the first code should be uploaded to gerrit? > > From my understanding, we should update cyborg client library referring to openstacksdk rule, > > and make it reviewed in gerrit by Eric Fried. > > I'm sorry if I misunderstand. > > > > Thanks in advance, > > Ikuo > > > > NTT Network Service Systems Laboratories > > Server Network Innovation Project > > Ikuo Otani > > TEL: +81-422-59-4140 > > Email: ikuo.otani.rw at hco.ntt.co.jp > > > > > > > >
From openstack at fried.cc Thu May 23 15:33:37 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 23 May 2019 10:33:37 -0500 Subject: [all][docs] season of docs In-Reply-To: <674e3d5d525a550c50f03a4be8ff01c37451d8dd.camel@redhat.com> Message-ID: > Just to close this off, we never got to finish the application for > this. It was quite involved, as promised, and Summit/PTG work took > priority. > Hopefully we'll be able to try again next year. Thanks to all who > provided suggestions for things to work on. I assume this doesn't stop us from making docs a focus this release, and from leaning on guidelines like: > I can only really speak for nova and oslo. For nova, I'd like to see > us better align with the documentation style used in Django, which is > described in the below article: > > https://jacobian.org/2009/nov/10/what-to-write/ As you know (but for others' awareness) Nova has a cycle theme for this [1]. Just need some bodies to throw at it...
efried [1] https://review.opendev.org/#/c/657171/2/priorities/train-priorities.rst at 37 From sfinucan at redhat.com Thu May 23 15:41:21 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Thu, 23 May 2019 16:41:21 +0100 Subject: [all][docs] season of docs In-Reply-To: References: Message-ID: <15318177f0628071c2bd51b6b6289c40204c3484.camel@redhat.com> On Thu, 2019-05-23 at 10:33 -0500, Eric Fried wrote: > > Just to close this off, we never got to finish the application for > > this. It was quite involved, as promised, and Summit/PTG work took > > priority. > > Hopefully we'll be able to try again next year. Thanks to all who > > provided suggestions for things to work on. > > I assume this doesn't stop us from making docs a focus this release, and > from leaning on guidelines like: Not at all. This was just a chance to get even more eyes on this, but we should be able to make a good hand of this ourselves over the course of the cycle. > > I can only really speak for nova and oslo. For nova, I'd like to see > > us better align with the documentation style used in Django, which is > > described in the below article: > > > > https://jacobian.org/2009/nov/10/what-to-write/ > > As you know (but for others' awareness) Nova has a cycle theme for this > [1]. Just need some bodies to throw at it... > > efried > > [1] > https://review.opendev.org/#/c/657171/2/priorities/train-priorities.rst at 37 > > From dtroyer at gmail.com Thu May 23 16:08:06 2019 From: dtroyer at gmail.com (Dean Troyer) Date: Thu, 23 May 2019 11:08:06 -0500 Subject: [glance][interop] standardized image "name" ? In-Reply-To: <7893dbd2-acc1-692c-df38-29ec7c8a98e7@debian.org> References: <939FEDBD-6E5E-43F2-AE1F-2FE71A71BF58@vmware.com> <20190408123255.vqwwvzzdt24tm3pq@yuggoth.org> <4234325e-7569-e11f-53e9-72f07ed8ce53@gmail.com> <7893dbd2-acc1-692c-df38-29ec7c8a98e7@debian.org> Message-ID: On Fri, May 17, 2019 at 1:49 AM Thomas Goirand wrote: > On 4/18/19 2:37 PM, Brian Rosmaita wrote: > > The multihash is displayed in the image-list and image-show API > > responses since Images API v2.7, and in the glanceclient since 2.12.0. > > That's the thing. "image show --long" continues to display the md5sum > instead of the sha512. We are gearing up to do a backward-compat-breaking OSC4 release soon, this would be the right time to make this change if someone wants to follow-up on it... dt -- Dean Troyer dtroyer at gmail.com From mriedemos at gmail.com Thu May 23 16:09:53 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 23 May 2019 11:09:53 -0500 Subject: Summit video website shenanigans In-Reply-To: References: <21ce1f4d-2e19-589f-3bce-44f411a22e67@gmail.com> Message-ID: On 5/14/2019 5:42 PM, Kendall Nelson wrote: > They are doing some more research to see if there are other videos > facing the same issues, but it should all be fine now? This one also appears to still be busted and/or lost. https://www.openstack.org/summit/denver-2019/summit-schedule/events/23234/openstack-troubleshooting-field-survival-guide -- Thanks, Matt From mriedemos at gmail.com Thu May 23 16:37:03 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 23 May 2019 11:37:03 -0500 Subject: [nova] Validation for requested host/node on server create In-Reply-To: References: <78fa937a-beb6-c63d-01a0-40e6519928be@gmail.com> Message-ID: On 5/23/2019 9:46 AM, Surya Seetharaman wrote: > 1. Omit the validation in the API and let the scheduler do the > validation. 
> > Pros: no performance impact in the API when creating server(s) > > Cons: if the host/node does not exist, the user will get a 202 response > and eventually a NoValidHost error which is not a great user > experience, > although it is what happens today with the availability_zone forced > host/node behavior we already have [3] so maybe it's acceptable. > > > > What I had in mind when suggesting this was to actually return a > Host/NodeNotFound exception from the host_manager [1] instead of > confusing that with the NoValidHost exception when its actually not a > NoValidHost (as this is usually associated with host capacity) if the > host or node doesn't exist. I know that it has already been implemented > as a NoValidHost [2] but we could change this. The point is by the time we hit this, we've given the user a 202 and eventually scheduling is going to fail. It doesn't matter if it's NoValidHost or HostNotFound or NodeNotFound or MyToiletIsBroken, the server is going to go to ERROR state and the user has to figure out why from the fault information which is poor UX IMO. > > 3. Validate both the host and node in the API. This can be broken down: > > a) If only host is specified, do #2 above. > b) If only node is specified, iterate the cells looking for the node > (or > query a resource provider with that name in placement which would avoid > down cell issues) > c) If both host and node is specified, get the HostMapping and from > that > lookup the ComputeNode in the given cell (per the HostMapping) > > Pros: fail fast behavior in the API if either the host and/or node do > not exist > > Cons: performance hit in the API to validate the host/node and > redundancy with the scheduler to find the ComputeNode to get its uuid > for the in_tree filtering on GET /allocation_candidates. > > > I don't mind if we did this as long as don't hit all the cells twice (in > api and scheduler) which like you said could be avoided by going to > placement. Yeah I think we can more efficiently check for the node using placement (this was Sean Mooney's idea while talking about it yesterday in IRC). > > > Note that if we do find the ComputeNode in the API, we could also > (later?) make a change to the Destination object to add a node_uuid > field so we can pass that through on the RequestSpec from > API->conductor->scheduler and that should remove the need for the > duplicate query in the scheduler code for the in_tree logic. > > > I guess we discussed this in a similar(ly different) situation and > decided against it [3]. I'm having a hard time dredging up the context on that conversation but unless I'm mistaken I think that was talking about the RequestGroup vs the Destination object. Because of when and where the RequestGroup stuff happens today, we can't really use that from the API to set in_tree early, which is why the API code is only setting the RequestSpec.requested_destination (Destination object) field with the host/node values. -- Thanks, Matt From jimmy at openstack.org Thu May 23 16:49:59 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Thu, 23 May 2019 11:49:59 -0500 Subject: Fwd: Summit video website shenanigans In-Reply-To: <022A1281-449C-4E58-9648-038604714BA3@openstack.org> References: <022A1281-449C-4E58-9648-038604714BA3@openstack.org> Message-ID: <5CE6CF37.6020504@openstack.org> Hey Matt, I responded to the cut-off video via speaker support. We had 8 videos that were truncated, but most were only by a few seconds. 
Yours was the unfortunate 15 minute chop and I'm afraid that one is irretrievable. Look over the option I sent in speaker support ticket and let me know if that's something you're interested in. Re: https://www.openstack.org/summit/denver-2019/summit-schedule/events/23234/openstack-troubleshooting-field-survival-guide This one appears to be blocked as a duplicate video by YouTube. I'm trying to sort this out as we speak. I'll update this thread as soon as I have an answer. If anyone else out there notices any other shenanigans, please don't hesitate to respond to this thread or directly via speakersupport at openstack.org where we have our devs and Foundation support staff all looking out. Cheers, Jimmy Allison Price wrote: > > > >> Begin forwarded message: >> >> *From: *Matt Riedemann > >> *Subject: **Re: Summit video website shenanigans* >> *Date: *May 23, 2019 at 11:09:53 AM CDT >> *To: *"openstack-discuss at lists.openstack.org >> " >> > > >> >> On 5/14/2019 5:42 PM, Kendall Nelson wrote: >>> They are doing some more research to see if there are other videos >>> facing the same issues, but it should all be fine now? >> >> This one also appears to still be busted and/or lost. >> >> https://www.openstack.org/summit/denver-2019/summit-schedule/events/23234/openstack-troubleshooting-field-survival-guide >> >> -- >> >> Thanks, >> >> Matt >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu May 23 16:50:25 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 23 May 2019 11:50:25 -0500 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: On 5/23/2019 9:05 AM, Dan Smith wrote: >> Question: do people think we should make the server status field >> reflect UNKNOWN as well, if the 'host_status' is UNKNOWN? And if so, >> should it be controlled by policy or no? > > Do we have other things that change *value* depending on policy? I was > thinking that was one of the situations the policy people (i.e. Matt) > have avoided in the past. > > Also, AFAIK, our documentation specifies (and existing behavior is) to > only return UNKNOWN in the case where we return a partial instance > because we couldn't look up the rest of the details from the cell. This > would break that relationship, and I'm not sure how people would know > that they shouldn't expect a full instance record, other than to poke it > with a stick to see if it contains certain properties. > >> +1 to doing this with a policy. I would prefer giving the >> ability/choice to the operators to opt-out of it if they want to. > > In general, I think we should try to avoid leaking things about the > infrastructure to regular users. In the case of a cell being down, we > couldn't really fake it because we don't have much of the information > available to us. I agree that a host being down is not that different > from a cell being down from the perspective of a user, but I also think > that allowing operators to opt-in to such a disclosure would be better, > although as above, I start to worry about the degrees of freedom in the > response. > > My biggest concern, which came out during the host status discussion, is > that we should *not* say the instance is "down" just because the compute > service is unreachable. Saying it's in "unknown" state is better. 
> > I'd like to hear from some more operators about whether they would > opt-in to this unknown-state behavior for compute host > down-age. Specifically, whether they want customer instances to show as > "unknown" state while they're doing an upgrade that otherwise wouldn't > impact the instance's health. > > --Dan > Agree with Dan that I'd like some operator input on this thread before we consider making a change in behavior. Changing the UNKNOWN status based on down cell vs compute service is down is also confusing as Dan mentions above because vm_state being UNKNOWN is only new as of Stein and is only for the down cell case. With the 'nova list --fields' thing aside, we already have a workaround for this today, right? If I'm an operator and want to expose this information to my users, I configure nova's policy to have: "os_compute_api:servers:show:host_status": "rule:admin_or_owner" And then the user, with the proper microversion, can see the host status if the cloud allows it. As an aside, I now realize we have a nasty performance regression since Stein [1] when listing servers with details concerning this host_status field. The code used to rely on this method [2] to cache the host status information per host when iterating over a list of instances but now it fetches it per host per instance in the view builder [3]. Granted by default policy this would only affect performance for an admin, but if I'm an admin listing 1000 servers across all tenants using "nova list --all-tenants" (which is going to use a microversion high enough to hit this) it could be a noticeable slow down compared to before Stein. I'll open a bug. [1] https://review.opendev.org/#/c/584590/ [2] https://github.com/openstack/nova/blob/c7e9e667426a6d88d396a59cb40d30763a3265f9/nova/compute/api.py#L4926 [3] https://github.com/openstack/nova/blob/c7e9e667426a6d88d396a59cb40d30763a3265f9/nova/api/openstack/compute/views/servers.py#L325 -- Thanks, Matt From elmiko at redhat.com Thu May 23 16:53:44 2019 From: elmiko at redhat.com (Michael McCune) Date: Thu, 23 May 2019 12:53:44 -0400 Subject: [keystone][placement][neutron][api-sig] http404 to NotFound, or how should a http json error body look like? In-Reply-To: <93c95d69-c87a-4d4d-bf10-3b6b293b8a6a@www.fastmail.com> References: <9ae10062-a9c8-3e76-15a1-da0745361c57@ericsson.com> <93c95d69-c87a-4d4d-bf10-3b6b293b8a6a@www.fastmail.com> Message-ID: On Mon, May 20, 2019 at 10:59 AM Colleen Murphy wrote: > I gotta say, though, this guideline on error handling really takes me aback. Why should a single HTTP request ever result in a list of errors, plural? Is there any standard, pattern, or example *outside* of OpenStack where this is done or recommended? Why? hi Colleen, i will attempt to answer your questions, but it has been quite awhile since i (apparently) approved that guideline. i don't remember the specifics of that discussion, but as Chris pointed out the idea was that there are conditions where a single call will result in several errors that /may/ be of interest to caller. we decided that adding a list for this object was a low bar to pass for being able to support those conditions. for example, a call to sahara to deploy a hadoop cluster might result in several nodes of the cluster failing to deploy and thus an error for a call to "create an entire hadoop cluster" might need to return a dense error message. 
(note, i don't think this is how it's actually done, but i just wanted to provide an example and i happen to know sahara better than other services) to your question about other examples outside of openstack, i can't provide anything further. i vaguely recall that we had a suggested format that included the list of errors, but in looking back through my materials i can't find it. i want to say that it was something like a json-home (it wasn't json-home, but similar in spirit) type spec in which it was suggested. these guidelines are not written in stone, and perhaps we need to revisit this one. although given that folks are now implementing it, that might be troublesome in other ways. i hope that helps at least shed some light on the thought process. peace o/ From donny at fortnebula.com Thu May 23 16:55:40 2019 From: donny at fortnebula.com (Donny Davis) Date: Thu, 23 May 2019 12:55:40 -0400 Subject: [all][docs] season of docs In-Reply-To: <15318177f0628071c2bd51b6b6289c40204c3484.camel@redhat.com> References: <15318177f0628071c2bd51b6b6289c40204c3484.camel@redhat.com> Message-ID: Right now we have several docs to maintain, and users have several docs to sort through. My observation of the issue has been that beginners don't know what they don't know. There has to be a way of lowering the barrier to entry without rendering the docs useless for people past their first deployment. In the context of this discussion I am talking about all of the guides, but not really the content. The content we have really is not that bad. I am more interested about how that content is found and presented to the user. Is there a facility we can use that would just scope the docs to a point of view? Maybe like a tag in sphinx? Docs and "cloudy understanding" have been the barrier to entry in Openstack for many shops for a long time, so its a great conversation for us to be having. ~/Donny Davis On Thu, May 23, 2019 at 11:45 AM Stephen Finucane wrote: > On Thu, 2019-05-23 at 10:33 -0500, Eric Fried wrote: > > > Just to close this off, we never got to finish the application for > > > this. It was quite involved, as promised, and Summit/PTG work took > > > priority. > > > Hopefully we'll be able to try again next year. Thanks to all who > > > provided suggestions for things to work on. > > > > I assume this doesn't stop us from making docs a focus this release, and > > from leaning on guidelines like: > > Not at all. This was just a chance to get even more eyes on this, but > we should be able to make a good hand of this ourselves over the course > of the cycle. > > > > I can only really speak for nova and oslo. For nova, I'd like to see > > > us better align with the documentation style used in Django, which is > > > described in the below article: > > > > > > https://jacobian.org/2009/nov/10/what-to-write/ > > > > As you know (but for others' awareness) Nova has a cycle theme for this > > [1]. Just need some bodies to throw at it... > > > > efried > > > > [1] > > > https://review.opendev.org/#/c/657171/2/priorities/train-priorities.rst at 37 > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mriedemos at gmail.com Thu May 23 17:00:23 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 23 May 2019 12:00:23 -0500 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: On 5/23/2019 11:50 AM, Matt Riedemann wrote: > As an aside, I now realize we have a nasty performance regression since > Stein [1] when listing servers with details concerning this host_status > field. The code used to rely on this method [2] to cache the host status > information per host when iterating over a list of instances but now it > fetches it per host per instance in the view builder [3]. Granted by > default policy this would only affect performance for an admin, but if > I'm an admin listing 1000 servers across all tenants using "nova list > --all-tenants" (which is going to use a microversion high enough to hit > this) it could be a noticeable slow down compared to before Stein. I'll > open a bug. > > [1] https://review.opendev.org/#/c/584590/ > [2] > https://github.com/openstack/nova/blob/c7e9e667426a6d88d396a59cb40d30763a3265f9/nova/compute/api.py#L4926 > > [3] > https://github.com/openstack/nova/blob/c7e9e667426a6d88d396a59cb40d30763a3265f9/nova/api/openstack/compute/views/servers.py#L325 https://bugs.launchpad.net/nova/+bug/1830260 -- Thanks, Matt From elod.illes at ericsson.com Thu May 23 17:01:01 2019 From: elod.illes at ericsson.com (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Thu, 23 May 2019 17:01:01 +0000 Subject: [nova] stable-maint is especially unhealthily RH-centric In-Reply-To: References: Message-ID: <171a6b9f-909c-a715-4957-18cbc91ac685@ericsson.com> Hi, I was thinking to answer here... What I see is that more projects struggle sometimes with having enough and quick stable reviews as cores are busy with the development track. I know that to become a stable core for a project needs to be a core of that project already and to know stable rules. My question is: wouldn't it be possible to someone who proves that he/she knows and follows the stable rules could be a stable core without being a core before? Maybe it could be enough or just make the projects life easier if for example one +2 could come from a person who is 'just' a stable core and one is necessary to come from a core both in stable and in 'master' on that project. I'm writing this because I mostly deal with stable patches (reviewing + bugfix backporting + fixing periodic job problems in various projects, also in nova) and for example I would volunteer to help with stable reviews as I am aware of stable rules (at least I believe so :)). I'm working like this because my employer, Ericsson, wants to strengthen stable and extended maintenance of OpenStack, too. What do you think about this kind of stable cores? Thanks, Előd On 2019. 05. 21. 18:16, Matthew Booth wrote: > During the trifecta discussions at PTG I was only considering > nova-core. I didn't appreciate at the time how bad the situation is > for nova-stable-maint. nova-stable-maint currently consists of: > > https://protect2.fireeye.com/url?k=cef2d936-9226d444-cef299ad-86859b2931b3-fc35df5ae8e953f2&q=1&u=https%3A%2F%2Freview.opendev.org%2F%23%2Fadmin%2Fgroups%2F540%2Cmembers > https://protect2.fireeye.com/url?k=2f4a8d4b-739e8039-2f4acdd0-86859b2931b3-622999115b951b2e&q=1&u=https%3A%2F%2Freview.opendev.org%2F%23%2Fadmin%2Fgroups%2F530%2Cmembers > > Not Red Hat: > Claudiu Belu -> Inactive? 
> Matt Riedemann > John Garbutt > Matthew Treinish > > Red Hat: > Dan Smith > Lee Yarwood > Sylvain Bauza > Tony Breeds > Melanie Witt > Alan Pevec > Chuck Short > Flavio Percoco > Tony Breeds > > This leaves Nova entirely dependent on Matt Riedemann, John Garbutt, > and Matthew Treinish to land patches in stable, which isn't a great > situation. With Matt R temporarily out of action that's especially > bad. > > Looking for constructive suggestions. I'm obviously in favour of > relaxing the trifecta rules, but adding some non-RH stable cores also > seems like it would be a generally healthy thing for the project to > do. > > Matt From melwittt at gmail.com Thu May 23 17:01:59 2019 From: melwittt at gmail.com (melanie witt) Date: Thu, 23 May 2019 10:01:59 -0700 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: <290d877a-4084-2d0c-3070-3eacc7487249@gmail.com> + at openstack-discuss, ML was accidentally dropped from Mohammed's reply On Thu, 23 May 2019 20:45:13 +0800, Mohammed Naser wrote: > > > On Thu, May 23, 2019, 10:00 AM melanie witt > wrote: > > Hey all, > > I'm looking for feedback around whether we can improve how we show > server status in server list and server show when the compute host it > resides on is down. > > When a compute host goes down while a server on it was previously > running, the server status continues to show as ACTIVE in a server > list. > This is because the power state and status is adjusted by a periodic > task run by nova-compute, so if nova-compute is down, it cannot update > those states. > > So, for an end user, when they do a server list, they see their server > as ACTIVE when it's actually powered off. > > We have another field called 'host_status' available since API > microversion 2.16 [1] which is controlled by policy and defaults to > admin, which is capable of showing the server status as UNKNOWN if the > field is specified, for example: > > nova list --fields > id,name,status,task_state,power_state,networks,host_status > > This is cool, but it is only available to admin by default, and it > requires that the end user adds the field to their CLI command in the > --fields option. > > Question: do people think we should make the server status field > reflect > UNKNOWN as well, if the 'host_status' is UNKNOWN? And if so, should it > be controlled by policy or no? > > > I'm in support of this. I also agree on the fact it should be controlled > by policy however not to get too ahead of my self: > > - I think it should ideally be defaulted to true > - ideally this should be an API level policy and not compute level > > Normally, we do not expose compute host details to non-admin in the API > by default, but I noticed recently that our "down cells" support will > show server status as UNKNOWN if a server is in a down cell [2]. So I > wondered if it would be considered OK to show UNKNOWN if a host is down > we well, without defaulting it to admin-only. > > I would really appreciate if people could share their opinion here and > if consensus is in support, I will move forward with proposing a change > accordingly. 
> > Cheers, > -melanie > > > Thank you for proposing this > > [1] > https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id14 > [2] > https://github.com/openstack/nova/blob/66a77f2fb75bbb9daebdca1cad0255ecafe41e92/nova/api/openstack/compute/views/servers.py#L108 > From iain.macdonnell at oracle.com Thu May 23 17:26:18 2019 From: iain.macdonnell at oracle.com (iain.macdonnell at oracle.com) Date: Thu, 23 May 2019 10:26:18 -0700 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: On 5/23/19 3:11 AM, Matthew Booth wrote: > On Thu, 23 May 2019 at 03:02, melanie witt wrote: >> >> Hey all, >> >> I'm looking for feedback around whether we can improve how we show >> server status in server list and server show when the compute host it >> resides on is down. >> >> When a compute host goes down while a server on it was previously >> running, the server status continues to show as ACTIVE in a server list. >> This is because the power state and status is adjusted by a periodic >> task run by nova-compute, so if nova-compute is down, it cannot update >> those states. >> >> So, for an end user, when they do a server list, they see their server >> as ACTIVE when it's actually powered off. >> >> We have another field called 'host_status' available since API >> microversion 2.16 [1] which is controlled by policy and defaults to >> admin, which is capable of showing the server status as UNKNOWN if the >> field is specified, for example: >> >> nova list --fields >> id,name,status,task_state,power_state,networks,host_status >> >> This is cool, but it is only available to admin by default, and it >> requires that the end user adds the field to their CLI command in the >> --fields option. >> >> Question: do people think we should make the server status field reflect >> UNKNOWN as well, if the 'host_status' is UNKNOWN? And if so, should it >> be controlled by policy or no? >> >> Normally, we do not expose compute host details to non-admin in the API >> by default, but I noticed recently that our "down cells" support will >> show server status as UNKNOWN if a server is in a down cell [2]. So I >> wondered if it would be considered OK to show UNKNOWN if a host is down >> we well, without defaulting it to admin-only. > > +1 from me. This seems to have confused users in the past and honest > is better than potentially wrong, imho. I can't think of a reason why > this information 'leak' would cause any problems. Can anybody else? Agreed. I don't think that a server status of "UNKNOWN" really constitutes "exposing compute host details". It's not sharing anything about *why* the server status is unknown - it's just not pretending that the last known status is still valid, when that may or may not actually be true. Or is the proposal to expose host_status where it would not normally be visible? It seems that the the down-host scenario is basically the same as down-cell, as far as being able to ascertain server status, so it seems to make sense to use the same indicator. 
~iain From sundar.nadathur at intel.com Thu May 23 17:51:46 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Thu, 23 May 2019 17:51:46 +0000 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> Message-ID: <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> > -----Original Message----- > From: Eric Fried > Sent: Thursday, May 23, 2019 7:10 AM > To: openstack-discuss at lists.openstack.org > Subject: Re: [cyborg][nova][sdk]Cyborgclient integration > > Hi Ikuo. I'm glad you're interested in helping out with this effort. > > I'm trying to understand where you intend to make your changes, assuming > you're coming at this purely from a cyborg perspective: > > - If in Nova, this isn't necessary because there's no python-cyborgclient Yes, the changes would not be in Nova. Perhaps Ikuo mentioned Nova as a reference. > - If in the cyborg CLI, I'm afraid I have very little context there. > There's a (nearly-official [5]) community push to make all CLIs OSC-based. I'm > not sure what state the cyborg CLI is in, but I would have expected it will need > brand new work to expose the v2 changes being done for Train. From that > perspective I would say: do that via OSC. But that's not really related to > bp/openstacksdk-in-nova. We need changes in Cyborg CLI for version 2 (which enables Nova integration, and introduces new objects in the process). New CLIs are supposed to be based on OpenStackClient [1]. Not sure if that is mandatory, but seems like a good idea if we have to redo significant parts of the CLI anyway. > - If in python-cyborgclient, again I lack background, but I don't think there's a > need to make changes here. The point of bp/openstacksdk-in-nova (or > openstacksdk-anywhere) is to get rid of usages of client libraries like python- > cyborgclient. Where is python-cyborgclient being used today? If it's just in the > CLI, and you can make the above (conversion to OSC) happen, it may be > possible to retire python-cyborgclient entirely \o/ The python-cyborgclient [2] is being used by the cyborg CLI. No other OpenStack services call into Cyborg using the client. The CLI from python-cyborgclient is based on the OpenStack CLI syntax. However, AFAICS, it is not a plugin into OSC. For example, it does not do "from openstackclient import shell". The list of OpenStack CLI plugins [3] does not include Cyborg. Does this mean that python-cyborgclient should be rewritten as an OSC plugin? IIUC, the push towards OpenStack SDK [4] is unrelated to OpenStack CLI. It becomes relevant only if some other service wants to call into Cyborg. > Thanks, > efried > > [0] https://blueprints.launchpad.net/nova/+spec/nova-cyborg-interaction > On 5/23/19 7:58 AM, Ikuo Otani wrote: > > Hi, > > > > I am a Cyborg member and take the role of integrating openstacksdk and > replacing use of python-*client. > > Related bluespec: > > https://blueprints.launchpad.net/nova/+spec/openstacksdk-in-nova > > > > My question is: > > When the first code should be uploaded to gerrit? > > From my understanding, we should update cyborg client library > > referring to openstacksdk rule, and make it reviewed in gerrit by Eric Fried. > > I'm sorry if I misunderstand. 
> > > > Thanks in advance, > > Ikuo [1] https://docs.openstack.org/python-openstackclient/latest/ [2] https://github.com/openstack/python-cyborgclient [3] https://docs.openstack.org/python-openstackclient/latest/contributor/plugins.html [4] https://docs.openstack.org/openstacksdk/latest/ Regards, Sundar From smooney at redhat.com Thu May 23 18:10:45 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 23 May 2019 19:10:45 +0100 Subject: [nova] stable-maint is especially unhealthily RH-centric In-Reply-To: <171a6b9f-909c-a715-4957-18cbc91ac685@ericsson.com> References: <171a6b9f-909c-a715-4957-18cbc91ac685@ericsson.com> Message-ID: <402fcdc60e2c81b1ad920664daff16e5e2df809e.camel@redhat.com> On Thu, 2019-05-23 at 17:01 +0000, Elõd Illés wrote: > Hi, > > I was thinking to answer here... What I see is that more projects > struggle sometimes with having enough and quick stable reviews as cores > are busy with the development track. I know that to become a stable core > for a project needs to be a core of that project already and to know > stable rules. My question is: wouldn't it be possible to someone who > proves that he/she knows and follows the stable rules could be a stable > core without being a core before? we have stable cores in nova that are not cores on nova. the cirtria for stable cores are similar to normal cores show up, do good reviews, and always keep in mind if the backport followst the stable policy. if you do that then you can become a stable core without ever needing to review a patch on master. that said reviews on master never hurt. > Maybe it could be enough or just make > the projects life easier if for example one +2 could come from a person > who is 'just' a stable core and one is necessary to come from a core > both in stable and in 'master' on that project. > > I'm writing this because I mostly deal with stable patches (reviewing + > bugfix backporting + fixing periodic job problems in various projects, > also in nova) and for example I would volunteer to help with stable > reviews as I am aware of stable rules (at least I believe so :)). I'm > working like this because my employer, Ericsson, wants to strengthen > stable and extended maintenance of OpenStack, too. > > What do you think about this kind of stable cores? > > Thanks, > > Előd > > > > On 2019. 05. 21. 18:16, Matthew Booth wrote: > > During the trifecta discussions at PTG I was only considering > > nova-core. I didn't appreciate at the time how bad the situation is > > for nova-stable-maint. nova-stable-maint currently consists of: > > > > https://protect2.fireeye.com/url?k=cef2d936-9226d444-cef299ad-86859b2931b3-fc35df5ae8e953f2&q=1&u=https%3A%2F%2Freview.opendev.org%2F%23%2Fadmin%2Fgroups%2F540%2Cmembers > > https://protect2.fireeye.com/url?k=2f4a8d4b-739e8039-2f4acdd0-86859b2931b3-622999115b951b2e&q=1&u=https%3A%2F%2Freview.opendev.org%2F%23%2Fadmin%2Fgroups%2F530%2Cmembers > > > > Not Red Hat: > > Claudiu Belu -> Inactive? > > Matt Riedemann > > John Garbutt > > Matthew Treinish > > > > Red Hat: > > Dan Smith > > Lee Yarwood > > Sylvain Bauza > > Tony Breeds > > Melanie Witt > > Alan Pevec > > Chuck Short > > Flavio Percoco > > Tony Breeds > > > > This leaves Nova entirely dependent on Matt Riedemann, John Garbutt, > > and Matthew Treinish to land patches in stable, which isn't a great > > situation. With Matt R temporarily out of action that's especially > > bad. > > > > Looking for constructive suggestions. 
I'm obviously in favour of > > relaxing the trifecta rules, but adding some non-RH stable cores also > > seems like it would be a generally healthy thing for the project to > > do. > > > > Matt > > From mriedemos at gmail.com Thu May 23 18:12:16 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 23 May 2019 13:12:16 -0500 Subject: [nova] stable-maint is especially unhealthily RH-centric In-Reply-To: <171a6b9f-909c-a715-4957-18cbc91ac685@ericsson.com> References: <171a6b9f-909c-a715-4957-18cbc91ac685@ericsson.com> Message-ID: <3e1db2c6-6fab-0cd8-a975-c2dc246a8b8b@gmail.com> On 5/23/2019 12:01 PM, Elõd Illés wrote: > I know that to become a stable core > for a project needs to be a core of that project already and to know > stable rules. My question is: wouldn't it be possible to someone who > proves that he/she knows and follows the stable rules could be a stable > core without being a core before? This isn't true. tonyb and lyarwood are not nova core but are stable core for nova. -- Thanks, Matt From joseph.davis at suse.com Thu May 23 18:25:06 2019 From: joseph.davis at suse.com (Joseph Davis) Date: Thu, 23 May 2019 11:25:06 -0700 Subject: [monasca] Proposing Akhil Jain for Monasca core team Message-ID: <4f3c194a-d891-1b92-b4b2-ab4ccda9d770@suse.com> +1 to what Witek wrote. :) Date: Thu, 23 May 2019 13:23:31 +0100 From: Doug Szumski To: Witek Bedyk, openstack-discuss at lists.openstack.org Subject: Re: [monasca] Proposing Akhil Jain for Monasca core team Message-ID:<60faa8a1-51ea-bea6-93f0-855fdc0ccde9 at stackhpc.com> Content-Type: text/plain; charset=utf-8; format=flowed On 23/05/2019 12:24, Witek Bedyk wrote: > Hello team, > > I would like to propose Akhil Jain to join the Monasca core team. +1, thanks for your contributions Akhil! > Akhil has added authenticated webhook notification support [1] and > worked on integrating Monasca with Congress, where he's also the core > reviewer. His work has been presented at the last Open Infrastructure > Summit in Denver [2]. > > Akhil has good understanding of Monasca and OpenStack architecture and > constantly helps us in providing sensible reviews. I'm sure Monasca > project will benefit from this nomination. > > Cheers > Witek > > [1]https://storyboard.openstack.org/#!/story/2003105 > [2] > https://www.openstack.org/summit/denver-2019/summit-schedule/events/23261/policy-driven-fault-management-of-nfv-eco-system > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu May 23 18:32:56 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 23 May 2019 13:32:56 -0500 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: On 5/22/2019 8:58 PM, melanie witt wrote: > So, for an end user, when they do a server list, they see their server > as ACTIVE when it's actually powered off. Well, it might be powered off, we don't know. If nova-compute is down the guest could still be running if the hypervisor is running. 
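As a concrete illustration of the host_status field and the os_compute_api:servers:show:host_status policy rule discussed in this thread, here is a minimal client-side sketch using python-novaclient; the auth values are placeholders, and it assumes the operator has opened that policy rule up to server owners:

from keystoneauth1 import loading, session
from novaclient import client as nova_client

# Placeholder credentials; any keystoneauth session works here.
loader = loading.get_plugin_loader('password')
auth = loader.load_from_options(auth_url='http://controller:5000/v3',
                                username='demo', password='secret',
                                project_name='demo',
                                user_domain_id='default',
                                project_domain_id='default')
sess = session.Session(auth=auth)

# 2.16 is the first microversion that returns host_status in server details.
nova = nova_client.Client('2.16', session=sess)

for server in nova.servers.list():
    # host_status is only present when policy allows it for this user.
    if getattr(server, 'host_status', None) == 'UNKNOWN':
        print('%s shows status %s but its compute host state is UNKNOWN'
              % (server.name, server.status))

This is only a sketch of the kind of client-side tooling being talked about here; nothing in it is part of nova itself.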
> > We have another field called 'host_status' available since API > microversion 2.16 [1] which is controlled by policy and defaults to > admin, which is capable of showing the server status as UNKNOWN if the > field is specified, for example: > > nova list --fields > id,name,status,task_state,power_state,networks,host_status > > This is cool, but it is only available to admin by default, and it > requires that the end user adds the field to their CLI command in the > --fields option. As I said elsewhere in this thread, if you're proposing to add a new policy rule to change the 'status' field based on host_status, why not just tell people to open up the policy rule we already have for the host_status field so non-admins can see it in their server details? This sounds like an education problem more than a technical problem to me. Also, --fields is one thing on one interface to the API. Microversions are opt-in on purpose to avoid backward incompatible and behavior changes to the client, so if the client has a need to know this information, they can opt into getting it via the host_status field by using the 2.16 microversion or higher. That's the case for any microversion that adds new fields like the embedded instance.flavor details in 2.47 - we didn't just say "let's add a new policy rule to expose those details". > > Question: do people think we should make the server status field reflect > UNKNOWN as well, if the 'host_status' is UNKNOWN? And if so, should it > be controlled by policy or no? I'm going to vote no given we have a way to determine this already, as noted above. > > Normally, we do not expose compute host details to non-admin in the API > by default, but I noticed recently that our "down cells" support will > show server status as UNKNOWN if a server is in a down cell [2]. So I > wondered if it would be considered OK to show UNKNOWN if a host is down > we well, without defaulting it to admin-only. The down-cell UNKNOWN stuff is also opt-in behavior using the 2.69 microversion. I would likely only get behind changing the behavior of the 'status' field based on the compute service status in a new microversion, and then we have to talk about whether or not the response should mirror the down-cell case where we return partial results. That all sounds like a lot more work than just educating people about the host_status field and the existing policy rule to expose it. -- Thanks, Matt From dms at danplanet.com Thu May 23 18:47:16 2019 From: dms at danplanet.com (Dan Smith) Date: Thu, 23 May 2019 11:47:16 -0700 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: (Matt Riedemann's message of "Thu, 23 May 2019 13:32:56 -0500") References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: > As I said elsewhere in this thread, if you're proposing to add a new > policy rule to change the 'status' field based on host_status, why not > just tell people to open up the policy rule we already have for the > host_status field so non-admins can see it in their server details? > This sounds like an education problem more than a technical problem to > me. Yeah, I'm much more in favor of this, unsurprisingly. It also avoids the case where a script is polling for an instance's state, and if it becomes anything other than ACTIVE, it takes action or wakes someone up. 
If you've just taken the compute service down for an upgrade (or rabbit took a dump) you don't end up freaking out because "the instance has changed state" which is what that looks like from the outside. If you _want_ to take action based on the host's state, then you look at that attribute (if allowed) and make decisions thusly. > Also, --fields is one thing on one interface to the API. Microversions > are opt-in on purpose to avoid backward incompatible and behavior > changes to the client, so if the client has a need to know this > information, they can opt into getting it via the host_status field by > using the 2.16 microversion or higher. That's the case for any > microversion that adds new fields like the embedded instance.flavor > details in 2.47 - we didn't just say "let's add a new policy rule to > expose those details". Clearly we couldn't return the UNKNOWN state if the request was from before whatever microversion we enable this in. > The down-cell UNKNOWN stuff is also opt-in behavior using the 2.69 > microversion. I would likely only get behind changing the behavior of > the 'status' field based on the compute service status in a new > microversion, and then we have to talk about whether or not the > response should mirror the down-cell case where we return partial > results. That all sounds like a lot more work than just educating > people about the host_status field and the existing policy rule to > expose it. I actually think if we're going to do this, we *should* make compute-down mirror cell-down in terms of what we return. I think that's unfortunate, mind you, but otherwise we'd be effectively re-writing what we said in the down-cell microversion, going from "If it's UNKNOWN, expect the instance to look like the minimal version" to "Well, that depends...". It would mean that something using the later microversion would no longer be able to check for UNKNOWN to determine if there's a full instance to look at, and instead would have to poke for keys. --Dan From mriedemos at gmail.com Thu May 23 18:53:43 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 23 May 2019 13:53:43 -0500 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: <6d57aca2-d5c4-54b1-46f6-216382394842@gmail.com> On 5/23/2019 1:47 PM, Dan Smith wrote: > It also avoids the case where a script is polling for an instance's > state, and if it becomes anything other than ACTIVE, it takes action or > wakes someone up. If you've just taken the compute service down for an > upgrade (or rabbit took a dump) you don't end up freaking out because > "the instance has changed state" which is what that looks like from the > outside. If you_want_ to take action based on the host's state, then > you look at that attribute (if allowed) and make decisions thusly. This raises another concern - if the UNKNOWN status is not baked into the instance.vm_state itself, then what do you do about notifications that nova is sending out? Would those also be checking the host status and changing the instance status in the notification payload to UNKNOWN? Anyway, it's stuff like this that requires a lot more thought than just deciding on a whim that we'd like some behavior change in the API (note the original email nor any of the people agreeing with it in this thread said anything about a new microversion). 
Rather than deal with all of these side effects, just explain to people that need this information how to configure their cloud to expose it and how to write their client side tooling to get it. -- Thanks, Matt From iain.macdonnell at oracle.com Thu May 23 18:56:34 2019 From: iain.macdonnell at oracle.com (iain.macdonnell at oracle.com) Date: Thu, 23 May 2019 11:56:34 -0700 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: On 5/23/19 11:32 AM, Matt Riedemann wrote: > As I said elsewhere in this thread, if you're proposing to add a new > policy rule to change the 'status' field based on host_status, why not > just tell people to open up the policy rule we already have for the > host_status field so non-admins can see it in their server details? This > sounds like an education problem more than a technical problem to me. Because *that* implies revealing infrastructure status details to end-users, which is probably not desirable in a lot of cases. Isn't this as simple as not lying to the user about the *server* status when it cannot be ascertained for any reason? In that case, the user should be given (only) that information, but not any "dirty laundry" about what caused it.... Even if the admin doesn't care about revealing infrastructure status, the end-user shouldn't have to know that server_status can't be trusted, and that they have to check other fields to figure out if it's reliable or not at any given time. ~iain From openstack at fried.cc Thu May 23 19:09:57 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 23 May 2019 14:09:57 -0500 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> Message-ID: <37d29b24-e4ea-58e4-fbf4-180d9fc6d18b@fried.cc> > We need changes in Cyborg CLI for version 2 (which enables Nova integration, and introduces new objects in the process). New CLIs are supposed to be based on OpenStackClient [1]. Not sure if that is mandatory, but seems like a good idea if we have to redo significant parts of the CLI anyway. +1 > The CLI from python-cyborgclient is based on the OpenStack CLI syntax. However, AFAICS, it is not a plugin into OSC. For example, it does not do "from openstackclient import shell". The list of OpenStack CLI plugins [3] does not include Cyborg. > > Does this mean that python-cyborgclient should be rewritten as an OSC plugin? Ultimately yes*. Both can exist concurrently, so for the sake of not accruing further technical debt, you might consider starting the OSC plugin piece with v2 and porting v1 functionality gradually/later. (Or maybe not at all. Is the v1 API a real thing that's going to be maintained long term?) Eventually the goal would be to retire the non-OSC CLI pieces. *or starting a new osc-cyborg project for the plugin > IIUC, the push towards OpenStack SDK [4] is unrelated to OpenStack CLI. It becomes relevant only if some other service wants to call into Cyborg. Agreed. I think the overlap is that a given OSC plugin can (with at least an undercurrent of "should") use the SDK. Since cyborg is starting both OSC and v2 from scratch, I would definitely say write the OSC code to talk through SDK. efried . 
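For illustration, a rough sketch of what a v2 cyborg command written directly as an OSC plugin on top of openstacksdk could look like; the command class name, the /v2/accelerators path and the 'accelerator' service type are assumptions made for the example, not an existing cyborg CLI:

from osc_lib.command import command
import openstack


class ListAccelerators(command.Lister):
    """List accelerators (illustrative sketch only)."""

    def take_action(self, parsed_args):
        # A real plugin would reuse self.app.client_manager instead of
        # opening a new connection; openstack.connect() reads clouds.yaml
        # or OS_* environment variables.
        conn = openstack.connect()
        # Cyborg v2 is not modelled in the SDK yet, so this goes through
        # the raw keystoneauth session; path and service_type are assumed.
        resp = conn.session.get(
            '/v2/accelerators',
            endpoint_filter={'service_type': 'accelerator'})
        accelerators = resp.json().get('accelerators', [])
        columns = ('uuid', 'name')
        return columns, ([a.get(col) for col in columns]
                         for a in accelerators)

The point of the sketch is that the cliff command class stays thin and every API call goes through the SDK, so retiring python-cyborgclient later would not break the CLI.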
From m.andre at redhat.com Thu May 23 19:37:31 2019 From: m.andre at redhat.com (=?UTF-8?Q?Martin_Andr=C3=A9?=) Date: Thu, 23 May 2019 21:37:31 +0200 Subject: [kolla] Stepping down from core reviewer Message-ID: Hi all, It became clear over the past few months I no longer have the time to contribute to Kolla in a meaningful way and would like to step down from core reviewer. It was an honor to be part of this great team, you fools who trusted me enough to give me +2 powers. Thanks, and long live Kolla! Martin From melwittt at gmail.com Thu May 23 20:08:38 2019 From: melwittt at gmail.com (melanie witt) Date: Thu, 23 May 2019 13:08:38 -0700 Subject: [nova][dev][ops] server status when compute host is down In-Reply-To: References: <065da98d-300d-00ca-83ee-f6e9dc458277@gmail.com> Message-ID: <3c74024f-aa0d-12a6-b5bb-54ceebc07c64@gmail.com> On Thu, 23 May 2019 11:56:34 -0700, Iain Macdonnell wrote: > > > On 5/23/19 11:32 AM, Matt Riedemann wrote: >> As I said elsewhere in this thread, if you're proposing to add a new >> policy rule to change the 'status' field based on host_status, why not >> just tell people to open up the policy rule we already have for the >> host_status field so non-admins can see it in their server details? This >> sounds like an education problem more than a technical problem to me. > > Because *that* implies revealing infrastructure status details to > end-users, which is probably not desirable in a lot of cases. This is a good point. If an operator were to enable 'host_status' via policy, end users would also get to see host_status UP and DOWN, which is typically not desired by cloud admins. There's currently no option for exposing only UNKNOWN, as a small but helpful bit of info for end users. > Isn't this as simple as not lying to the user about the *server* status > when it cannot be ascertained for any reason? In that case, the user > should be given (only) that information, but not any "dirty laundry" > about what caused it.... > > Even if the admin doesn't care about revealing infrastructure status, > the end-user shouldn't have to know that server_status can't be trusted, > and that they have to check other fields to figure out if it's reliable > or not at any given time. And yes, I was thinking about it more simply, and the replies on this thread have led me to think that if we could show the cosmetic-only status of UNKNOWN for nova-compute communication interruptions, similar to what we do for down cells, we would not put a policy control on it (since UNKNOWN is not leaking infra details). And not make any changes to notifications etc, just a cosmetic-only UNKNOWN status implemented at the REST API layer if host_status is UNKNOWN. I was thinking maybe we'd leave server status alone if host_status is UP or DOWN since its status should be reflected in those cases as-is. Assuming we could move forward without a policy control on it, I think the only remaining concern would be the collision of UNKNOWN status with down cells where for down cells, some server attributes are not available. Personally, this doesn't seem like a major problem to me since UNKNOWN implies an uncertain state, in general. But maybe I'm wrong. How important is the difference? Finally, it sounds like the consensus is that if we do decide to make this change, we would need a new microversion to account for server status being able to be UNKNOWN if host_status is UNKNOWN. 
-melanie From dtroyer at gmail.com Thu May 23 20:23:16 2019 From: dtroyer at gmail.com (Dean Troyer) Date: Thu, 23 May 2019 15:23:16 -0500 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> Message-ID: On Thu, May 23, 2019 at 1:03 PM Nadathur, Sundar wrote: > The python-cyborgclient [2] is being used by the cyborg CLI. No other OpenStack services call into Cyborg using the client. > > The CLI from python-cyborgclient is based on the OpenStack CLI syntax. However, AFAICS, it is not a plugin into OSC. For example, it does not do "from openstackclient import shell". The list of OpenStack CLI plugins [3] does not include Cyborg. > > Does this mean that python-cyborgclient should be rewritten as an OSC plugin? This is one option and the option that lets the Cyborg team have total control over their CLI. Some teams want that, other teams do not. > IIUC, the push towards OpenStack SDK [4] is unrelated to OpenStack CLI. It becomes relevant only if some other service wants to call into Cyborg. Yes and no. The two things are generally independent, however they will eventually fit together in that we want OSC to use the SDK for all back-end work soon(TM), depending on when we get an SDK 1.0 release. At the 2018 Denver PTG we started thinking about what OSC plugins that use the SDK would look like, and the only things left in the plugin itself would be the cliff command classes. Since SDK is accepting direct support for official projects directly in the repo, OSC will consider doing the same for commands that do not require any additional dependencies, ie if Cyborg were completely backed by the SDK we would consider adding its commands directly to the OSC repo. This is a significant change for OSC, and would come with one really large caveat: commands must maintain the same level of consistency that the rest of the commands in the OSC repo maintain. ie, 'update' is not a verb, resources do not contain hyphens in their names, etc. There are projects that have deviated from these rules in their plugins, and there they are free to do that, incurring only the wrath or disdain or joy of their users for being different. That is not the case for commands contained in the OSC repo itself. dt -- Dean Troyer dtroyer at gmail.com From jrist at redhat.com Thu May 23 20:35:47 2019 From: jrist at redhat.com (Jason Rist) Date: Thu, 23 May 2019 14:35:47 -0600 Subject: Retiring TripleO-UI - no longer supported Message-ID: <3924F5DE-314C-4D41-8CEA-DCF7A2A2CDEA@redhat.com> Hi everyone - I’m writing the list to announce that we are retiring TripleO-UI and it will no longer be supported. It’s already deprecated in Zuul and removed from requirements, so I’ve submitted a patch to remove all code. https://review.opendev.org/661113 Thanks, Jason Jason Rist Red Hat jrist / knowncitizen ` -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marcin.juszkiewicz at linaro.org Thu May 23 20:59:43 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Thu, 23 May 2019 22:59:43 +0200 Subject: [kolla] Stepping down from core reviewer In-Reply-To: References: Message-ID: <9e5c24ce-e4bf-4843-c728-34903228e936@linaro.org> W dniu 23.05.2019 o 21:37, Martin André pisze: > It became clear over the past few months I no longer have the time to > contribute to Kolla in a meaningful way and would like to step down > from core reviewer. It was an honor to be part of this great team, > you fools who trusted me enough to give me +2 powers. Thanks, and > long live Kolla! Thank you for all your contributions. You were great help when I started playing with Kolla and later. From aspiers at suse.com Thu May 23 21:32:20 2019 From: aspiers at suse.com (Adam Spiers) Date: Thu, 23 May 2019 22:32:20 +0100 Subject: [vitrage][ptl][tc] stepping down as Vitrage PTL In-Reply-To: References: Message-ID: <20190523213220.uj4l3xi33zektu7x@arabian.linksys.moosehall> Thank you Ifat for all your great work! If we had not had a great conversation in the sun during OpenStack Day Israel in June 2017, the idea for creating the Self-healing SIG would never have been born! So I would like to say an extra special thanks for this. It has been a pleasure working with you ever since then. Of course I am sad to see you go, but I'm sure the new role will be an exciting new opportunity for you, and that you leave us in very safe hands with Eyal. All the best, and stay in touch! Adam :-) Ifat Afek wrote: >As I have taken on a new role in my company, I will not be able to continue >serving as the Vitrage PTL. > >I’ve been the PTL of Vitrage from the day it started (back then in Mitaka), >and it has been a real pleasure for me. Helping a project grow from an idea >and a set of diagrams to a production-grade service was an amazing >experience. I am honored to have worked with excellent developers, both >Vitrage core contributors and other community members. I learned a lot, and >also managed to have fun along the way :-) > >I would like to take this opportunity to thank everyone who contributed to >the success of Vitrage – either by writing code, suggesting new use cases, >participating in our discussions, or helping out when our gate was broken. > > >Eyal Bar-Ilan (irc: eyalb), a Vitrage core contributor who was part of the >team from the very beginning, will be replacing me as the new Vitrage PTL >[1]. I’m sure he will make an excellent PTL, as someone who knows every >piece of the code and is tightly connected to the community. I will still >be around to help if needed. > >I wish Eyal lots of luck in his new role! > > >Ifat > > >[1] https://review.opendev.org/#/c/660563/ From mriedemos at gmail.com Thu May 23 21:40:17 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 23 May 2019 16:40:17 -0500 Subject: [watcher] Question about baremetal node support in nova CDM Message-ID: While working on [1] I noticed that the NovaHelper.get_compute_node_by_hostname method is making 3 API calls to get the details about a hypervisor: 1. listing hypervisors and then filtering client-side by the compute service host - this fails if there is not exactly one hypervisor for the given compute service host name 2. search for the hypervisor given the hypervisor_hostname to get the id 3. get the hypervisor details using the id My patch is collapsing 2 and 3 into a single API call to nova. The question I had was if we need that first call. 
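For reference, a sketch of what collapsing steps 2 and 3 can look like with python-novaclient; it assumes a client new enough to negotiate microversion 2.53, where search() can return full hypervisor details via detailed=True:

def get_compute_node_by_name(nova, node_name):
    # One call instead of search() followed by get(): with microversion
    # 2.53 the search can return the detailed hypervisor records directly.
    hypervisors = nova.hypervisors.search(node_name, detailed=True)
    # The name match is a pattern match server-side, so keep only exact
    # hypervisor_hostname matches.
    return [h for h in hypervisors
            if h.hypervisor_hostname == node_name]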
For all non-ironic baremetal nodes, the compute service host and node hypervisor_hostname should be identical, so technically we could just search for the hypervisor details with the compute service hostname. Only in the case of ironic would we potentially get more than one hypervisor (node) for a single compute service (host). I don't think the nova CDM code really handles baremetal instances at all because it's not handling this kind of host:node 1:M cardinality elsewhere in the code, but you also can't do most things to optimize like cold or live migrating baremetal instances. I'm not exactly sure how Watcher deals with ironic but I know there is the separate BaremetalClusterDataModelCollector so I'm assuming watcher just optimizes for baremetal outside of the nova compute API? If this is true, then we can get rid of that first API all noted above and I don't need to write a nova spec to add a host filter parameter to the GET /os-hypervisors/detail API. [1] https://review.opendev.org/#/c/661121/2/watcher/common/nova_helper.py at 65 -- Thanks, Matt From aspiers at suse.com Thu May 23 21:42:00 2019 From: aspiers at suse.com (Adam Spiers) Date: Thu, 23 May 2019 22:42:00 +0100 Subject: [self-healing] [autohealing] Demo of Application Autohealing in OpenStack (Heat + Octavia + Aodh) In-Reply-To: References: Message-ID: <20190523214200.scfzjes42uulhn23@arabian.linksys.moosehall> Lingxian Kong wrote: >On Mon, May 20, 2019 at 4:11 PM Lingxian Kong wrote: >> Please see the demo here: https://youtu.be/dXsGnbr7DfM > >Recommend to watch in a 1080p video quality. This is great - thanks a lot for sharing this! It's exactly the kind of cross-project integration which the self-healing SIG wants to make more easy for operators to implement. Would you be willing to submit some documentation to the SIG[0] explaining the use case and how to set it up? We have a short guide on how to contribute new use cases[1] but if you need any help with this I would be very happy to assist! [0] https://docs.openstack.org/self-healing-sig/latest/ [1] https://docs.openstack.org/self-healing-sig/latest/meta/CONTRIBUTING.html#use-cases Please also feel free to join us in the #openstack-self-healing IRC channel. Thanks, Adam From aspiers at suse.com Thu May 23 21:53:35 2019 From: aspiers at suse.com (Adam Spiers) Date: Thu, 23 May 2019 22:53:35 +0100 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: References: <20190522213927.iuty4y5mrgw7dmjt@pacific.linksys.moosehall> Message-ID: <20190523215335.w3e5cnqt5tl7f2wr@arabian.linksys.moosehall> Thanks for looking at this. Maybe I'm just being too impatient and the data is still synchronising, but now I only see 4 commits to nova in May, and there have definitely been a *lot* more than that :-) https://opendev.org/openstack/nova/commits/branch/master Sergey Nikitin wrote: >Thank you for message! >yes, I guess new train release wasn't added into repos (just on drop down). >I'll fix it now. > >On Thu, May 23, 2019 at 1:39 AM Adam Spiers wrote: > >> There are still issues. For example nova is not showing any commits >> since April: >> >> >> https://www.stackalytics.com/?metric=commits&release=train&project_type=all&module=nova >> >> Rong Zhu wrote: >> >Hi Sergey, >> > >> >Thanks for your help. Now the numbers are correctly. >> > >> > >> >Sergey Nikitin 于2019年5月19日 周日21:12写道: >> > >> >> Hi, Rong, >> >> >> >> Database was rebuild and now stats o gengchc2 [1] is correct [2]. 
>> >> >> >> [1] >> >> >> https://www.stackalytics.com/?release=all&metric=commits&project_type=all&user_id=578043796-b >> >> [2] https://review.opendev.org/#/q/owner:gengchc2,n,z >> >> >> >> Sorry for delay, >> >> Sergey >> >> >> >> >> >> >> >> >> >> On Fri, May 17, 2019 at 6:20 PM Sergey Nikitin >> >> wrote: >> >> >> >>> Testing of migration process shown us that we have to rebuild database >> >>> "on live". >> >>> Unfortunately it means that during rebuild data will be incomplete. I >> >>> talked with the colleague who did it previously and he told me that >> it's >> >>> normal procedure. >> >>> I got these results on Monday and at this moment I'm waiting for >> weekend. >> >>> It's better to rebuild database in Saturday and Sunday to do now affect >> >>> much number of users. >> >>> So by the end of this week everything will be completed. Thank you for >> >>> patient. >> >>> >> >>> On Fri, May 17, 2019 at 6:15 AM Rong Zhu >> wrote: >> >>> >> >>>> Hi Sergey, >> >>>> >> >>>> What is the process about rebuild the database? >> >>>> >> >>>> Thanks, >> >>>> Rong Zhu >> >>>> >> >>>> Sergey Nikitin 于2019年5月7日 周二00:59写道: >> >>>> >> >>>>> Hello Rong, >> >>>>> >> >>>>> Sorry for long response. I was on a trip during last 5 days. >> >>>>> >> >>>>> What I have found: >> >>>>> Lets take a look on this patch [1]. It must be a contribution of >> >>>>> gengchc2, but for some reasons it was matched to Yuval Brik [2] >> >>>>> I'm still trying to find a root cause of it, but anyway on this week >> we >> >>>>> are planing to rebuild our database to increase RAM. I checked >> statistics >> >>>>> of gengchc2 on clean database and it's complete correct. >> >>>>> So your problem will be solved in several days. It will take so long >> >>>>> time because full rebuild of DB takes 48 hours, but we need to test >> our >> >>>>> migration process first to keep zero down time. >> >>>>> I'll share a results with you here when the process will be finished. >> >>>>> Thank you for your patience. >> >>>>> >> >>>>> Sergey >> >>>>> >> >>>>> [1] https://review.opendev.org/#/c/627762/ >> >>>>> [2] >> >>>>> >> https://www.stackalytics.com/?user_id=jhamhader&project_type=all&release=all&metric=commits&company=&module=freezer-api >> >>>>> >> >>>>> >> >>>>> On Mon, May 6, 2019 at 6:30 AM Rong Zhu >> wrote: >> >>>>> >> >>>>>> Hi Sergey, >> >>>>>> >> >>>>>> Do we have any process about my colleague's data loss problem? >> >>>>>> >> >>>>>> Sergey Nikitin 于2019年4月29日 周一19:57写道: >> >>>>>> >> >>>>>>> Thank you for information! I will take a look >> >>>>>>> >> >>>>>>> On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu >> >>>>>>> wrote: >> >>>>>>> >> >>>>>>>> Hi there, >> >>>>>>>> >> >>>>>>>> Recently we found we lost a person's data from our company at the >> >>>>>>>> stackalytics website. >> >>>>>>>> You can check the merged patch from [0], but there no date from >> >>>>>>>> the stackalytics website. >> >>>>>>>> >> >>>>>>>> stackalytics info as below: >> >>>>>>>> Company: ZTE Corporation >> >>>>>>>> Launchpad: 578043796-b >> >>>>>>>> Gerrit: gengchc2 >> >>>>>>>> >> >>>>>>>> Look forward to hearing from you! 
>> >>>>>>>> >> >>>>>>> >> >>>>>> Best Regards, >> >>>>>> Rong Zhu >> >>>>>> >> >>>>>>> >> >>>>>>>> -- >> >>>>>> Thanks, >> >>>>>> Rong Zhu >> >>>>>> >> >>>>> >> >>>>> >> >>>>> -- >> >>>>> Best Regards, >> >>>>> Sergey Nikitin >> >>>>> >> >>>> -- >> >>>> Thanks, >> >>>> Rong Zhu >> >>>> >> >>> >> >>> >> >>> -- >> >>> Best Regards, >> >>> Sergey Nikitin >> >>> >> >> >> >> >> >> -- >> >> Best Regards, >> >> Sergey Nikitin >> >> >> >-- >> >Thanks, >> >Rong Zhu >> > > >-- >Best Regards, >Sergey Nikitin From mriedemos at gmail.com Thu May 23 22:02:15 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 23 May 2019 17:02:15 -0500 Subject: [nova] Validation for requested host/node on server create In-Reply-To: <78fa937a-beb6-c63d-01a0-40e6519928be@gmail.com> References: <78fa937a-beb6-c63d-01a0-40e6519928be@gmail.com> Message-ID: On 5/22/2019 5:13 PM, Matt Riedemann wrote: > 3. Validate both the host and node in the API. This can be broken down: > > a) If only host is specified, do #2 above. > b) If only node is specified, iterate the cells looking for the node (or > query a resource provider with that name in placement which would avoid > down cell issues) > c) If both host and node is specified, get the HostMapping and from that > lookup the ComputeNode in the given cell (per the HostMapping) > > Pros: fail fast behavior in the API if either the host and/or node do > not exist > > Cons: performance hit in the API to validate the host/node and > redundancy with the scheduler to find the ComputeNode to get its uuid > for the in_tree filtering on GET /allocation_candidates. > > Note that if we do find the ComputeNode in the API, we could also > (later?) make a change to the Destination object to add a node_uuid > field so we can pass that through on the RequestSpec from > API->conductor->scheduler and that should remove the need for the > duplicate query in the scheduler code for the in_tree logic. > > I'm personally in favor of option 3 since we know that users hate > NoValidHost errors and we have ways to mitigate the performance overhead > of that validation. > > Note that this isn't necessarily something that has to happen in the > same change that introduces the host/hypervisor_hostname parameters to > the API. If we do the validation in the API I'd probably split the > validation logic into it's own patch to make it easier to test and > review on its own. > > [1] https://review.opendev.org/#/c/645520/ > [2] > https://github.com/openstack/nova/blob/2e85453879533af0b4d0e1178797d26f026a9423/nova/scheduler/utils.py#L528 > > [3] https://docs.openstack.org/nova/latest/admin/availability-zones.html Per the nova meeting today [1] it sounds like we're going to go with option 3 and do the validation in the API - check hostmapping for the host, check placement for the node, we can optimize the redundant scheduler calculation for in_tree later. For review and test sanity I ask that the API validation code comes in a separate patch in the series. [1] http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-05-23-21.00.log.html#l-104 -- Thanks, Matt From anlin.kong at gmail.com Thu May 23 22:23:06 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Fri, 24 May 2019 10:23:06 +1200 Subject: [all][qinling] Please check your README files In-Reply-To: References: Message-ID: Thanks for the reminder, Marcin. Qinling issue should be fixed in https://review.opendev.org/#/c/661005 (Tip hat to Gaëtan Trellu!) 
--- Best regards, Lingxian Kong Catalyst Cloud On Fri, May 24, 2019 at 1:35 AM Marcin Juszkiewicz < marcin.juszkiewicz at linaro.org> wrote: > Train cycle is supposed to be "we really go for Python 3" cycle. > Contrary to previous "we do not have to care about Python 3" ones. > > I am working on switching Kolla images into Python 3 by default [1]. It > is a job I would not recommend even to my potential enemies. Misc > projects fail into random ways (sent a bunch of patches). > > 1. https://review.opendev.org/#/c/642375 > > Quite common issue is related to README files. I know that we have XXI > century and UTF-8 is encoding for most of people here. But Python tools > are not so advanced so far and loves to explode when see characters > outside of US-ASCII encoding. > > Please check your README files. And move them to US-ASCII if needed. > > I already fixed few projects [2][3][4][5] but would love to see > developers fixing their code too. > > 2. https://review.opendev.org/644531 > 3. https://review.opendev.org/644533 > 4. https://review.opendev.org/644535 > 5. https://review.opendev.org/644536 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Thu May 23 23:16:39 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 23 May 2019 23:16:39 +0000 Subject: [all][requirements][stable] requests version bump on stable brances {pike|queens} for CVE-2018-18074 In-Reply-To: References: <20190507203022.ctlwkqh4awa5z3ez@mthode.org> <20190508142758.gbio47mo3f7pfpgz@yuggoth.org> <20190514123155.xbj5srhhnrmg2h2y@yuggoth.org> <20190522224930.yy35h7imhedm2lyy@yuggoth.org> <20190522235350.yw4dn5cgwemhmtak@yuggoth.org> Message-ID: <20190523231638.2diw2leubtmsmzng@yuggoth.org> On 2019-05-23 02:24:02 +0200 (+0200), Dirk Müller wrote: [...] > Most of interesting test coverage is in the project's functional > test jobs as well, so "just" devstack alone isn't enough, all the > projects need to support this variation as well. [...] I'm also not sure running a second copy of *all* our jobs is sensible either. A balance must be struck for testing enough that we'll catch likely issues while acknowledging that we simply can't test everything. > So running safety against our stable branch spills out this list > of packages: [...] Out of curiosity, which branch was it? Stein? Rocky? Queens? I have a feeling the further back you go, the longer that list is going to get. Also as the safety tool grows in popularity it may see increased coverage for less common, more fringe dependencies. > transitive changes are not needed if we have backports in the > gate. [...] I'm not sure what that means. Can you rephrase it? > lower-constraints.txt jobs try to ensure that this won't happen, > assuming that we somehow bump the version numbers to X.X.X.post1, > so thats not an additional thing to worry about. [...] Not all projects have lower constraints jobs, and they aren't required to use consistent versions, so running integration tests with lower constraints is a non-starter and therefore not a replacement for the frozen central upper-constraints.txt file. > I'm not aware of any statement anywhere that we'd be security > maintaining a distro? Where would we state that? in the project's > documentation/README? http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006087.html Apparently we don't have to claim it's a secure deployment solution for some operators to just assume it is. 
I'm saying as a community we need to do a better job of communicating to our users that production deployments from stable branch sources need to get their external dependencies from a security-managed platform, that installing the versions we test with is not and ultimately *can not* be secure. This is similar to the concerns raised with the TC which resulted in the 2017-05-30 Guidelines for Managing Releases of Binary Artifacts resolution: https://governance.openstack.org/tc/resolutions/20170530-binary-artifacts.html Adding comments in our constraints files is certainly one measure we can (and likely should) take, but more importantly we need our deployment projects who provide such options to get the word out that this model of operation is wholly unsuitable for production use. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From tony at bakeyournoodle.com Thu May 23 23:55:26 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Fri, 24 May 2019 09:55:26 +1000 Subject: [nova] stable-maint is especially unhealthily RH-centric In-Reply-To: <171a6b9f-909c-a715-4957-18cbc91ac685@ericsson.com> References: <171a6b9f-909c-a715-4957-18cbc91ac685@ericsson.com> Message-ID: <20190523235523.GB4763@thor.bakeyournoodle.com> On Thu, May 23, 2019 at 05:01:01PM +0000, Elõd Illés wrote: > Hi, > > I was thinking to answer here... What I see is that more projects > struggle sometimes with having enough and quick stable reviews as cores > are busy with the development track. I know that to become a stable core > for a project needs to be a core of that project already and to know > stable rules. My question is: wouldn't it be possible to someone who > proves that he/she knows and follows the stable rules could be a stable > core without being a core before? Maybe it could be enough or just make > the projects life easier if for example one +2 could come from a person > who is 'just' a stable core and one is necessary to come from a core > both in stable and in 'master' on that project. Sean and Matt have both answered this. I'd just like to add that the Extended Maintenance policy was designed to encourage this so by all means go forth and do good stable reviews :) Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From info at dantalion.nl Fri May 24 06:39:11 2019 From: info at dantalion.nl (info at dantalion.nl) Date: Fri, 24 May 2019 08:39:11 +0200 Subject: [watcher] Question about baremetal node support in nova CDM In-Reply-To: References: Message-ID: <53585cb2-a207-58ff-588a-6c9694f8245f@dantalion.nl> Hello, I haven't investigated thoroughly but I suspect that the bare metal nodes returned from the hypervisor calls are not handled by the nova CDM just like you expect. I think we should look into if bare metal nodes are stored in the compute_model as I think it would more sense to filter them out. Overall I think the way Watcher handles bare metal nodes must be analysed and improved for example the saving energy strategy uses the ironic client directly and there is currently not a single strategy that accesses the baremetal_model. Maybe I have time to test my suspicions in a test environment with both ironic and hypervisor nodes next week. 
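As a rough illustration of the kind of filtering I have in mind, a minimal sketch only, assuming the hypervisor objects returned by novaclient expose a hypervisor_type attribute and that it reports 'ironic' for bare metal nodes:

    def filter_out_baremetal(hypervisors):
        """Drop ironic-managed nodes before they reach the compute model."""
        return [hyp for hyp in hypervisors
                if getattr(hyp, 'hypervisor_type', '').lower() != 'ironic']

Whether that is the right place to do it depends on how the nova CDM is supposed to treat bare metal, which is exactly the open question here.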
On 5/23/19 11:40 PM, Matt Riedemann wrote: > While working on [1] I noticed that the > NovaHelper.get_compute_node_by_hostname method is making 3 API calls to > get the details about a hypervisor: > > 1. listing hypervisors and then filtering client-side by the compute > service host - this fails if there is not exactly one hypervisor for the > given compute service host name > > 2. search for the hypervisor given the hypervisor_hostname to get the id > > 3. get the hypervisor details using the id > > My patch is collapsing 2 and 3 into a single API call to nova. > > The question I had was if we need that first call. For all non-ironic > baremetal nodes, the compute service host and node hypervisor_hostname > should be identical, so technically we could just search for the > hypervisor details with the compute service hostname. > > Only in the case of ironic would we potentially get more than one > hypervisor (node) for a single compute service (host). > > I don't think the nova CDM code really handles baremetal instances at > all because it's not handling this kind of host:node 1:M cardinality > elsewhere in the code, but you also can't do most things to optimize > like cold or live migrating baremetal instances. > > I'm not exactly sure how Watcher deals with ironic but I know there is > the separate BaremetalClusterDataModelCollector so I'm assuming watcher > just optimizes for baremetal outside of the nova compute API? > > If this is true, then we can get rid of that first API all noted above > and I don't need to write a nova spec to add a host filter parameter to > the GET /os-hypervisors/detail API. > > [1] > https://review.opendev.org/#/c/661121/2/watcher/common/nova_helper.py at 65 > From tony at bakeyournoodle.com Fri May 24 06:39:20 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Fri, 24 May 2019 16:39:20 +1000 Subject: [tripleo][Release-job-failures] Tag of openstack/tripleo-upgrade failed In-Reply-To: References: Message-ID: <20190524063920.GC4763@thor.bakeyournoodle.com> On Thu, May 23, 2019 at 11:58:20PM +0000, zuul at openstack.org wrote: > Build failed. > > - publish-openstack-releasenotes-python3 http://logs.openstack.org/e3/e350fb6648441ce0b33f7960cee7ba81083b1adb/tag/publish-openstack-releasenotes-python3/6524ace/ : FAILURE in 5m 42s This failed because the releasenotes don't build[1]: Could not import extension openstackdocstheme (exception: No module named 'openstackdocstheme' The publish of the sdist and wheels worked. https://review.opendev.org/661203, should fix the failure but we probably also want to add releasenotes to the check (and gate) pipelines Yours Tony. [1] http://logs.openstack.org/e3/e350fb6648441ce0b33f7960cee7ba81083b1adb/tag/publish-openstack-releasenotes-python3/6524ace/job-output.txt.gz#_2019-05-23_23_57_25_508296 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From marcin.juszkiewicz at linaro.org Fri May 24 08:19:49 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Fri, 24 May 2019 10:19:49 +0200 Subject: [all][qinling] Please check your README files In-Reply-To: References: Message-ID: <3b416622-495d-713c-7ab8-6f46a3295dca@linaro.org> W dniu 24.05.2019 o 00:23, Lingxian Kong pisze: > Thanks for the reminder, Marcin. Qinling issue should be fixed in > https://review.opendev.org/#/c/661005 (Tip hat to Gaëtan Trellu!) Thanks! 
I hope that PBR issue gets fixes soon, then openstack/requirements gets PBR version bump so we can revert that change to show IPA characters again. From marcin.juszkiewicz at linaro.org Fri May 24 08:20:58 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Fri, 24 May 2019 10:20:58 +0200 Subject: [all][qinling] Please check your README files In-Reply-To: <2d210f25-db54-5822-f54f-28283adbadbd@nemebean.com> References: <2d210f25-db54-5822-f54f-28283adbadbd@nemebean.com> Message-ID: <76f8a665-20f9-eadd-0ba5-bcf0dd10c66d@linaro.org> W dniu 23.05.2019 o 15:50, Ben Nemec pisze: > Hmm, is this because of https://bugs.launchpad.net/pbr/+bug/1704472 ? > > If so, we should just fix it in pbr. I have a patch up to do that > (https://review.opendev.org/#/c/564874) but I haven't gotten around to > writing tests for it. I'll try to get that done shortly. I provided better description example for that patch. Based on changes done in some projects. u'UTF-8 description can contain misc Unicode “quotes”, ’apostrophes’, multiple dots like “…“, misc dashes like “–“ for example. Some projects also use IPA to show pronounciation of their name so chars like ”ʃŋ” can happen.' From elod.illes at ericsson.com Fri May 24 08:58:50 2019 From: elod.illes at ericsson.com (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Fri, 24 May 2019 08:58:50 +0000 Subject: [nova] stable-maint is especially unhealthily RH-centric In-Reply-To: <20190523235523.GB4763@thor.bakeyournoodle.com> References: <171a6b9f-909c-a715-4957-18cbc91ac685@ericsson.com> <20190523235523.GB4763@thor.bakeyournoodle.com> Message-ID: Thanks for the responses and sorry for the misleading info. Thanks Tony, will do. :) Thanks, Előd On 2019. 05. 24. 1:55, Tony Breeds wrote: > On Thu, May 23, 2019 at 05:01:01PM +0000, Elõd Illés wrote: >> Hi, >> >> I was thinking to answer here... What I see is that more projects >> struggle sometimes with having enough and quick stable reviews as cores >> are busy with the development track. I know that to become a stable core >> for a project needs to be a core of that project already and to know >> stable rules. My question is: wouldn't it be possible to someone who >> proves that he/she knows and follows the stable rules could be a stable >> core without being a core before? Maybe it could be enough or just make >> the projects life easier if for example one +2 could come from a person >> who is 'just' a stable core and one is necessary to come from a core >> both in stable and in 'master' on that project. > Sean and Matt have both answered this. I'd just like to add that the > Extended Maintenance policy was designed to encourage this so by all > means go forth and do good stable reviews :) > > Yours Tony. From hberaud at redhat.com Fri May 24 09:04:07 2019 From: hberaud at redhat.com (Herve Beraud) Date: Fri, 24 May 2019 11:04:07 +0200 Subject: [all][qinling] Please check your README files In-Reply-To: <3b416622-495d-713c-7ab8-6f46a3295dca@linaro.org> References: <3b416622-495d-713c-7ab8-6f46a3295dca@linaro.org> Message-ID: Le ven. 24 mai 2019 à 10:26, Marcin Juszkiewicz < marcin.juszkiewicz at linaro.org> a écrit : > W dniu 24.05.2019 o 00:23, Lingxian Kong pisze: > > Thanks for the reminder, Marcin. Qinling issue should be fixed in > > https://review.opendev.org/#/c/661005 (Tip hat to Gaëtan Trellu!) > > Thanks! > > I hope that PBR issue gets fixes soon, then openstack/requirements gets > PBR version bump so we can revert that change to show IPA characters again. 
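For anyone following along, the failure mode being discussed appears to come from reading the description file with the locale's default codec, which is often plain ASCII in minimal container images. A minimal sketch of the safer pattern, as an illustration only and not the actual pbr patch:

    import io

    def read_description(path='README.rst'):
        # An explicit encoding avoids UnicodeDecodeError on characters such
        # as typographic quotes or IPA letters like "ʃŋ".
        with io.open(path, encoding='utf-8') as handle:
            return handle.read()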
> > Hello, I've added some changes related to your comments to tests more characters who need UTF8. -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From kalyani.rajkumar at bristol.ac.uk Fri May 24 10:13:05 2019 From: kalyani.rajkumar at bristol.ac.uk (Kalyani Rajkumar) Date: Fri, 24 May 2019 10:13:05 +0000 Subject: [networking-sfc] Unable to get Service Function Chain Mechanism working in Neutron In-Reply-To: References: Message-ID: Hi, I would like some help regarding the networking-SFC in openstack. I have been trying to set it up but I am not able to see any packets in the VMs in the service chain when I do a ping command from the source VM to the destination VM even though I am getting a ping response. The following is what I see for the IP addresses of the VMs when I do ovs-ofctl dump-flows br-int. cookie=0x51e24153cd662cb7, duration=76955.198s, table=24, n_packets=13, n_bytes=546, priority=2,arp,in_port="qvoc5a16c34-53",arp_spa=50.50.50.29 actions=resubmit(,25) cookie=0x51e24153cd662cb7, duration=76955.179s, table=24, n_packets=5, n_bytes=210, priority=2,arp,in_port="qvo0edc6dab-9c",arp_spa=50.50.50.19 actions=resubmit(,25) cookie=0x51e24153cd662cb7, duration=76955.169s, table=24, n_packets=5, n_bytes=210, priority=2,arp,in_port="qvo3f5fdc8e-56",arp_spa=50.50.50.13 actions=resubmit(,25) cookie=0x51e24153cd662cb7, duration=76955.154s, table=24, n_packets=10, n_bytes=420, priority=2,arp,in_port="qvo36c64023-a8",arp_spa=50.50.50.11 actions=resubmit(,25) cookie=0x51e24153cd662cb7, duration=76810.903s, table=24, n_packets=5, n_bytes=210, priority=2,arp,in_port="qvo55b6db77-73",arp_spa=50.50.50.14 actions=resubmit(,25) cookie=0x51e24153cd662cb7, duration=76810.894s, table=24, n_packets=23, n_bytes=966, priority=2,arp,in_port="qvoaebad029-52",arp_spa=50.50.50.3 actions=resubmit(,25) I am following the steps from the following tutorial https://www.openstack.org/assets/presentation-media/SFC-for-OpenStack-Austin-Aummit-publich.pdf. I installed networking-sfc version 6.0.0 for Openstack Queens as per https://docs.openstack.org/networking-sfc/latest/install/install.html. Kindly let me know if there is an alternate way of achieving the SFC mechanism or if I am missing something. Regards, Kalyani From: Kalyani Rajkumar Sent: 15 May 2019 13:24 To: openstack-discuss at lists.openstack.org Subject: [networking-sfc] Unable to get Service Function Chain Mechanism working in Neutron Hi, I have been trying to enable the networking SFC mechanism in OpenStack. I have successfully created port pairs, port pair groups, port chain and a flow classifier. However, I am unable to get the service chain working. 
The architecture of the set up I have deployed is attached. I have used the queens version of OpenStack. The steps that I followed are as below. * Create port neutron port-create --name sfc-Network * Create VMs and attach the interfaces with them accordingly VM1 - P1 & P2; VM2 - P3 & P4; VM3 - P5 & P6 * Create port pairs neutron port-pair-create pp1 -- ingress p1 -- egress p2 neutron port-pair-create pp2 -- ingress p3 -- egress p4 neutron port-pair-create pp3 -- ingress p5 -- egress p6 * Create port pair groups neutron port-pair-group-create -- port-pair pp1 ppg1 neutron port-pair-group-create -- port-pair pp2 ppg2 neutron port-pair-group-create -- port-pair pp3 ppg3 * Create flow classifier neutron flow-classifier-create --source-ip-prefix --destination-ip-prefix --logical-source-port p1 fc1 * Create port chain neutron port-chain-create --port-pair-group ppg1 --port-pair-group ppg2 --port-pair-group ppg3 --flow-classifier fc1 pc1 I am testing this architecture by sending a ping request from VM1 to VM3. Therefore, the destination port is P6. If SFC is working correctly, I should be able to see the packets go through the VM2 to VM3 when I do a tcpdump in VM2. As I am new to OpenStack and SFC, I am not certain if this is logically correct. I would like to pose two questions. 1) All the VMs are on the same network, is it logically correct to expect the ping packets to be routed from VM1 > VM2 > VM3 in the SFC scenario? Because all the ports are on the same network, I get a ping response but it is not via VM2 even though the port chain is created through VM2. 2) If not, how do I make sure that the packets are routed through VM2 which is the second port pair in the port pair chain. Could it be something to do with the OpenVSwitch configuration? Any help would be highly appreciated. Regards, Kalyani Rajkumar High Performance Networks Group, University of Bristol -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.macnaughton at canonical.com Fri May 24 10:30:14 2019 From: chris.macnaughton at canonical.com (Chris MacNaughton) Date: Fri, 24 May 2019 12:30:14 +0200 Subject: [charms] Proposing Sahid Orentino Ferdjaoui to the Charms core team Message-ID: <17abd9ed-e76d-52b3-29b1-6d6ae75161bf@canonical.com> Hello all, I would like to propose Sahid Orentino Ferdjaoui as a member of the Charms core team. Chris MacNaughton -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: pEpkey.asc Type: application/pgp-keys Size: 2480 bytes Desc: not available URL: From alex.kavanagh at canonical.com Fri May 24 10:35:37 2019 From: alex.kavanagh at canonical.com (Alex Kavanagh) Date: Fri, 24 May 2019 11:35:37 +0100 Subject: [OpenStack][Foreman][MAAS][Juju][Kubernetes][Docker] OpenStack deployment on Bare Metal In-Reply-To: References: Message-ID: Hi Jayachander I thought I'd have a go at answering your questions, but (please note) that I'm doing so from the perspective of only having worked on the "Canonical" stack, so I can't answer with respect to other deployment stacks, so obviously read the replies with respect to that. So the Canonical OpenStack is called "CDO" meaning "Canonical Distribution of OpenStack": you can get a fully supported installation at https://www.ubuntu.com/openstack, but also DIY with all the bits of software is very, very, possible, as everything is Free Software/Open Source. 
So the Stack is MaaS running the servers, using Juju to deploy and manage the lifecycle of OpenStack: On Tue, May 21, 2019 at 4:42 PM Jay See wrote: > Hi, > > I am trying to deploy OpenStack cloud , before proceeding with deployment > I wanted to take suggestion from people using some of these technologies. > > *Setup / plan:* I will be deploying OpenStack cloud using 3 nodes > (servers), later on I will be adding more nodes. I want automatically > install and provision the OS on the new nodes. I want scale up my cloud > with new nodes in near future. If there are any issues in existing > OpenStack, I would like to fix them or patch the fixes without much > trouble. (I might be using wrong terminology - sorry for that) > 3 nodes for a full, production, deployment might be a bit tight. However, if this is just exploration, then it would be an acceptable starting point. Most of the CDO OpenStack Juju bundles start at 4 machines. > 1. Which tool is better for Bare Metal OS installation? > I have looked around and found Foremen and MaaS. All the servers we > are going to deploy will be running on Ubuntu (16.04 /18.04 LTS). If you > have any other suggestions please let me know. > Well, I'd obviously say MaaS (https://maas.io/) :) MaaS itself needs a small server for itself exclusive of the nodes, if you'd like the hardware managed in that form. This means that for a 3 node cluster, you'd need a 4th machine for MaaS itself to run on. > 2. Do you suggest me to use Juju to deploy the OpenStack or do all the > manual installation for the OpenStack as mentioned in OpenStack > installation guide? > So that depends on what you are trying to achieve. The Juju OpenStack charms work together to install and configure OpenStack to provide an (up to and including an HA) solution. They take a coordinated view on how to configure a system (including a set of defaults that the OpenStack charms team have arrived at as provided solid solutions), and offer some higher level config values (in the charms) to modify those systems. The charms also deal with upgrading OpenStack systems between versions and generally deal with most of the lifecycle stuff. MaaS, Juju and the charms are very much "managing" the system with you; you are not configuring individual config files on nova, neutron, ceph, rabbitmq-server, vault, barbican, swift, etc. It could feel constraining, as you can't go in and change individual config files, but the charms solution is designed to work as a 'set'. > 3. I am bit confused about choosing Kubernetes. If I am going with Juju, > can I still use Kubernetes? If yes, if there any documentation please guide > me in right direction. > I'm not entirely sure what you mean here? Do you mean "running Kubernetes alongside OpenStack for other users to use docker on the same hardware", or "running the OpenStack control plane in kubernetes/docker"? If you mean the former: There are a set of charms that do Kubernetes (or K8s as it's often called) that work with Juju / MaaS. So kubernetes can be installed/managed by Juju on a MaaS cluster. This model (as Juju deployments are called) can be run alongside other models, and CDO is also a Juju model. More info here: https://www.ubuntu.com/kubernetes If you mean the latter (i.e. running the control plane in containers), then Juju charms take a different approach. Many of the services can be run in LXD containers, so that multiple control plane services can be run on the same physical server. 
The Juju tool knows how to 'talk' to MaaS and LXD to seamlessly run charms inside containers. > > 4. I was thinking of going with OpenStack installation over Kubernetes. Is > this a right decision? > or > Do I need to do some research between Kubernetes and Docker, find out > which one is better? > or > Just install OpenStack without any containers. > Or MaaS + Ubuntu 18.04 machines with LXD + Juju + OpenStack charms -- gives you services in LXD containers :) > > 5. I could not find installation of OpenStack with Kubernetes or Docker. > If you know something, could you share the link? > > I don't have bigger picture at the moment. If some tools might help in > near future. or If you can give any other suggestions, please let me know. > Well, configuring OpenStack is pretty complicated :) And that's because there is a lot there. There are multiple ways of configuring it, and there is also the Ansible installation, Docker (kolla), and TrippleO. Please see here for more details: https://docs.openstack.org/stein/deploy/ For further details on deploying Maas, Juju and the OpenStack charms, then 1. Welcome to the OpenStack charm guide: https://docs.openstack.org/charm-guide/latest/ 2. OpenStack Charms Deployment Guide: https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/stein/ Also, feel free to come to #openstack-charms on Freenode where there are always devs/support for the OpenStack charms available. Hope this helps, Best regards Alex. > > Thanks and Best regards, > Jayachander. > -- Alex Kavanagh - Software Engineer OpenStack Engineering - Data Centre Development - Canonical Ltd -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.kavanagh at canonical.com Fri May 24 10:38:30 2019 From: alex.kavanagh at canonical.com (Alex Kavanagh) Date: Fri, 24 May 2019 11:38:30 +0100 Subject: [charms] Proposing Sahid Orentino Ferdjaoui to the Charms core team In-Reply-To: <17abd9ed-e76d-52b3-29b1-6d6ae75161bf@canonical.com> References: <17abd9ed-e76d-52b3-29b1-6d6ae75161bf@canonical.com> Message-ID: On Fri, May 24, 2019 at 11:37 AM Chris MacNaughton < chris.macnaughton at canonical.com> wrote: > Hello all, > > I would like to propose Sahid Orentino Ferdjaoui as a member of the Charms > core team. > +1 from me. Sahid has been a solid contributor and definitely would be a welcome addition to the core team. > Chris MacNaughton > -- Alex Kavanagh - Software Engineer OpenStack Engineering - Data Centre Development - Canonical Ltd -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Fri May 24 11:12:14 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 24 May 2019 12:12:14 +0100 Subject: [tripleo][kolla][osa][nova][deployment] Removing nova-consoleauth Message-ID: In my continued efforts to remove as much nova code as possible in one cycle, I've set my sights on the 'nova-consoleauth' service. Since Rocky [1], 'nova-consoleauth' is no longer needed and we now store and retrieve tokens from the database. The only reason to still deploy 'nova-consoleauth' was to support cells v1 or to provide a window where existing tokens could continue to be validated before everything switched over to the new model, but we're also in the process of removing cells v1 [2] and two cycles in quite a large window in which to migrate things. 
I've the work done from the nova side but before we can merge anything, we need to remove support for nova-consoleauth from the various deployment projects and anything else that relies on it at the moment. I have talked to some folks internally about doing this in TripleO this cycle and I have a (likely wrong) patch proposed against Kolla [3] for this, but I haven't been able to figure out how/if OSA are deploying the service and would appreciate some help here. I'd also like it if people could let me know if there are any other potential blockers out there that we should be aware of before we proceed with this. Cheers, Stephen [1] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html [2] https://blueprints.launchpad.net/nova/+spec/remove-cells-v1/ [3] https://review.opendev.org/#/c/661251/ From noonedeadpunk at ya.ru Fri May 24 11:42:23 2019 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Fri, 24 May 2019 14:42:23 +0300 Subject: [tripleo][kolla][osa][nova][deployment] Removing nova-consoleauth In-Reply-To: References: Message-ID: <44847531558698143@sas2-985f744271ca.qloud-c.yandex.net> Hi, OSA already dropped the 'nova-consoleauth' service with this patch [1]. It was also backported to stein, so since rocky we do not deploy it anymore. But still thanks for the notification. [1] https://review.opendev.org/#/c/649202/ 24.05.2019, 14:16, "Stephen Finucane" : > In my continued efforts to remove as much nova code as possible in one > cycle, I've set my sights on the 'nova-consoleauth' service. Since > Rocky [1], 'nova-consoleauth' is no longer needed and we now store and > retrieve tokens from the database. The only reason to still deploy > 'nova-consoleauth' was to support cells v1 or to provide a window where > existing tokens could continue to be validated before everything > switched over to the new model, but we're also in the process of > removing cells v1 [2] and two cycles in quite a large window in which > to migrate things. > > I've the work done from the nova side but before we can merge anything, > we need to remove support for nova-consoleauth from the various > deployment projects and anything else that relies on it at the moment. > I have talked to some folks internally about doing this in TripleO this > cycle and I have a (likely wrong) patch proposed against Kolla [3] for > this, but I haven't been able to figure out how/if OSA are deploying > the service and would appreciate some help here. I'd also like it if > people could let me know if there are any other potential blockers out > there that we should be aware of before we proceed with this. > > Cheers, > Stephen > > [1] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html > [2] https://blueprints.launchpad.net/nova/+spec/remove-cells-v1/ > [3] https://review.opendev.org/#/c/661251/ --  Kind Regards, Dmitriy Rabotyagov From emilien at redhat.com Fri May 24 12:00:55 2019 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 24 May 2019 08:00:55 -0400 Subject: [tripleo][kolla][osa][nova][deployment] Removing nova-consoleauth In-Reply-To: <44847531558698143@sas2-985f744271ca.qloud-c.yandex.net> References: <44847531558698143@sas2-985f744271ca.qloud-c.yandex.net> Message-ID: Thanks to @Martin Schuppert , the work has been done in TripleO. On Fri, May 24, 2019 at 7:51 AM Dmitriy Rabotyagov wrote: > Hi, > > OSA already dropped the 'nova-consoleauth' service with this patch [1]. 
> It was also backported to stein, so since rocky we do not deploy it > anymore. > > But still thanks for the notification. > > [1] https://review.opendev.org/#/c/649202/ > > 24.05.2019, 14:16, "Stephen Finucane" : > > In my continued efforts to remove as much nova code as possible in one > > cycle, I've set my sights on the 'nova-consoleauth' service. Since > > Rocky [1], 'nova-consoleauth' is no longer needed and we now store and > > retrieve tokens from the database. The only reason to still deploy > > 'nova-consoleauth' was to support cells v1 or to provide a window where > > existing tokens could continue to be validated before everything > > switched over to the new model, but we're also in the process of > > removing cells v1 [2] and two cycles in quite a large window in which > > to migrate things. > > > > I've the work done from the nova side but before we can merge anything, > > we need to remove support for nova-consoleauth from the various > > deployment projects and anything else that relies on it at the moment. > > I have talked to some folks internally about doing this in TripleO this > > cycle and I have a (likely wrong) patch proposed against Kolla [3] for > > this, but I haven't been able to figure out how/if OSA are deploying > > the service and would appreciate some help here. I'd also like it if > > people could let me know if there are any other potential blockers out > > there that we should be aware of before we proceed with this. > > > > Cheers, > > Stephen > > > > [1] > https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html > > [2] https://blueprints.launchpad.net/nova/+spec/remove-cells-v1/ > > [3] https://review.opendev.org/#/c/661251/ > > -- > Kind Regards, > Dmitriy Rabotyagov > > > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Fri May 24 12:09:43 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 24 May 2019 13:09:43 +0100 (BST) Subject: [placement] update 19-20 Message-ID: HTML: https://anticdent.org/placement-update-19-20.html Placement update 19-20. Lots of cleanups in progress, laying in the groundwork to do the nested magic work (see themes below). The poll to determine [what to do with the weekly meeting](https://civs.cs.cornell.edu/cgi-bin/vote.pl?id=E_9599a2647c319fd4&akey=12a23953ab33e056) will close at the end of today. Thus far the leader is office hours. Whatever the outcome, the meeting that would happen this coming Monday is cancelled because many people will be having a holiday. # Most Important The [spec for nested magic](https://review.opendev.org/658510) is ready for more robust review. Since most of the work happening in placement this cycle is described by that spec, getting it reviewed well and quickly is important. Generally speaking: review things. This is, and always will be, the most important thing to do. # What's Changed * os-resource-classes 0.4.0 was released, promptly breaking the placement gate (tests are broken not os-resource-classes). [Fixes underway](https://review.opendev.org/661131). * [Null root provider protections](https://review.opendev.org/657716) have been removed and a blocker migration and status check added. This removes a few now redundant joins in the SQL queries which should help with our ongoing efforts to speed up and simplify getting allocation candidates. 
* I had suggested an additional core group for os-traits and os-resource-classes but after discussion with various people it was decided it's easier/better to be aware of the right subject matter experts and call them in to the reviews when required. # Specs/Features * Support Consumer Types. This is very close with a few details to work out on what we're willing and able to query on. It's a week later and it still only has reviews from me so far. * Spec for Nested Magic. Un-wipped. * Resource provider - request group mapping in allocation candidate. This spec was copied over from nova. It is a requirement of the overall nested magic theme. While it has a well-defined and refined design, there's currently no one on the hook implement it. These and other features being considered can be found on the [feature worklist](https://storyboard.openstack.org/#!/worklist/594). Some non-placement specs are listed in the Other section below. # Stories/Bugs (Numbers in () are the change since the last pupdate.) There are 20 (-3) stories in [the placement group](https://storyboard.openstack.org/#!/project_group/placement). 0 are [untagged](https://storyboard.openstack.org/#!/worklist/580). 2 (-2) are [bugs](https://storyboard.openstack.org/#!/worklist/574). 5 are [cleanups](https://storyboard.openstack.org/#!/worklist/575). 11 (-1) are [rfes](https://storyboard.openstack.org/#!/worklist/594). 2 are [docs](https://storyboard.openstack.org/#!/worklist/637). If you're interested in helping out with placement, those stories are good places to look. On launchpad: * Placement related nova [bugs not yet in progress](https://goo.gl/TgiPXb) on launchpad: 16 (0). * Placement related nova [in progress bugs](https://goo.gl/vzGGDQ) on launchpad: 7 (+1). # osc-placement osc-placement is currently behind by 11 microversions. No change since the last report. Pending changes: * Add 'resource provider inventory update' command (that helps with aggregate allocation ratios). * Add support for 1.22 microversion * Provide a useful message in the case of 500-error # Main Themes ## Nested Magic At the PTG we decided that it was worth the effort, in both Nova and Placement, to make the push to make better use of nested providers — things like NUMA layouts, multiple devices, networks — while keeping the "simple" case working well. The general ideas for this are described in a [story](https://storyboard.openstack.org/#!/story/2005575) and an evolving [spec](https://review.opendev.org/658510). Some code has started, mostly to reveal issues: * Changing request group suffix to string * WIP: Allow RequestGroups without resources * Add NUMANetworkFixture for gabbits * Gabbi test cases for can_split ## Consumer Types Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. A [spec](https://review.opendev.org/654799) has started. There are some questions about request and response details that need to be resolved, but the overall concept is sound. ## Cleanup As we explore and extend nested functionality we'll need to do some work to make sure that the code is maintainable and has suitable performance. There's some work in progress for this that's important enough to call out as a theme: * Some work from Tetsuro exploring ways to remove redundancies in the code. There's a [stack of good improvements](https://review.opendev.org/658778). * WIP: Optionally run a wsgi profiler when asked. This was used to find some of the above issues. 
Should we make it generally available or is it better as a thing to base off when exploring? * Avoid traversing summaries in _check_traits_for_alloc_request Ed Leafe has also been doing some intriguing work on using graph databases with placement. It's not yet clear if or how it could be integrated with mainline placement, but there are likely many things to be learned from the experiment. # Other Placement Miscellaneous changes can be found in [the usual place](https://review.opendev.org/#/q/project:openstack/placement+status:open). There are several [os-traits changes](https://review.opendev.org/#/q/project:openstack/os-traits+status:open) being discussed. # Other Service Users New discoveries are added to the end. Merged stuff is removed. Starting with the next pupdate I'll also be removing anything that has had no reviews and no activity from the author in 4 weeks. Otherwise these lists get too long and uselessly noisy. * Nova: Spec: Proposes NUMA topology with RPs * Nova: Spec: Virtual persistent memory libvirt driver implementation * Nova: Check compute_node existence in when nova-compute reports info to placement * Nova: spec: support virtual persistent memory * Workaround doubling allocations on resize * Nova: Pre-filter hosts based on multiattach volume support * Nova: Add flavor to requested_resources in RequestSpec * Blazar: Retry on inventory update conflict * Nova: count quota usage from placement * Nova: nova-manage: heal port allocations * Nova: Spec for a new nova virt driver to manage an RSD * Cyborg: Initial readme for nova pilot * Tempest: Add QoS policies and minimum bandwidth rule client * Nova-spec: Add PENDING vm state * nova-spec: Allow compute nodes to use DISK_GB from shared storage RP * nova-spec: RMD Plugin: Energy Efficiency using CPU Core P-State control * nova-spec: Proposes NUMA affinity for vGPUs. This describes a legacy way of doing things because affinity in placement may be a ways off. But it also [may not be](https://review.openstack.org/650476). * Nova: heal allocations, --dry-run * Watcher spec: Add Placement helper * Cyborg: Placement report * Nova: Spec to pre-filter disabled computes with placement * rpm-packaging: placement service * Delete resource providers for all nodes when deleting compute service * nova fix for: Drop source node allocations if finish_resize fails * neutron: Add devstack plugin for placement service plugin * ansible: Add playbook to test placement * nova: WIP: Hey let's support routed networks y'all! # End As indicated above, I'm going to tune these pupdates to make sure they are reporting only active links. This doesn't mean stalled out stuff will be ignored, just that it won't come back on the lists until someone does some work related to it. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From snikitin at mirantis.com Fri May 24 12:16:03 2019 From: snikitin at mirantis.com (Sergey Nikitin) Date: Fri, 24 May 2019 16:16:03 +0400 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: <20190523215335.w3e5cnqt5tl7f2wr@arabian.linksys.moosehall> References: <20190522213927.iuty4y5mrgw7dmjt@pacific.linksys.moosehall> <20190523215335.w3e5cnqt5tl7f2wr@arabian.linksys.moosehall> Message-ID: Hi, Yes, data synchronization takes up to 24 hours. So we have to wait. I'll inform you when the process will be finished. On Fri, May 24, 2019 at 1:53 AM Adam Spiers wrote: > Thanks for looking at this. 
Maybe I'm just being too impatient and > the data is still synchronising, but now I only see 4 commits to nova > in May, and there have definitely been a *lot* more than that :-) > > https://opendev.org/openstack/nova/commits/branch/master > > Sergey Nikitin wrote: > >Thank you for message! > >yes, I guess new train release wasn't added into repos (just on drop > down). > >I'll fix it now. > > > >On Thu, May 23, 2019 at 1:39 AM Adam Spiers wrote: > > > >> There are still issues. For example nova is not showing any commits > >> since April: > >> > >> > >> > https://www.stackalytics.com/?metric=commits&release=train&project_type=all&module=nova > >> > >> Rong Zhu wrote: > >> >Hi Sergey, > >> > > >> >Thanks for your help. Now the numbers are correctly. > >> > > >> > > >> >Sergey Nikitin 于2019年5月19日 周日21:12写道: > >> > > >> >> Hi, Rong, > >> >> > >> >> Database was rebuild and now stats o gengchc2 [1] is correct [2]. > >> >> > >> >> [1] > >> >> > >> > https://www.stackalytics.com/?release=all&metric=commits&project_type=all&user_id=578043796-b > >> >> [2] https://review.opendev.org/#/q/owner:gengchc2,n,z > >> >> > >> >> Sorry for delay, > >> >> Sergey > >> >> > >> >> > >> >> > >> >> > >> >> On Fri, May 17, 2019 at 6:20 PM Sergey Nikitin < > snikitin at mirantis.com> > >> >> wrote: > >> >> > >> >>> Testing of migration process shown us that we have to rebuild > database > >> >>> "on live". > >> >>> Unfortunately it means that during rebuild data will be incomplete. > I > >> >>> talked with the colleague who did it previously and he told me that > >> it's > >> >>> normal procedure. > >> >>> I got these results on Monday and at this moment I'm waiting for > >> weekend. > >> >>> It's better to rebuild database in Saturday and Sunday to do now > affect > >> >>> much number of users. > >> >>> So by the end of this week everything will be completed. Thank you > for > >> >>> patient. > >> >>> > >> >>> On Fri, May 17, 2019 at 6:15 AM Rong Zhu > >> wrote: > >> >>> > >> >>>> Hi Sergey, > >> >>>> > >> >>>> What is the process about rebuild the database? > >> >>>> > >> >>>> Thanks, > >> >>>> Rong Zhu > >> >>>> > >> >>>> Sergey Nikitin 于2019年5月7日 周二00:59写道: > >> >>>> > >> >>>>> Hello Rong, > >> >>>>> > >> >>>>> Sorry for long response. I was on a trip during last 5 days. > >> >>>>> > >> >>>>> What I have found: > >> >>>>> Lets take a look on this patch [1]. It must be a contribution of > >> >>>>> gengchc2, but for some reasons it was matched to Yuval Brik [2] > >> >>>>> I'm still trying to find a root cause of it, but anyway on this > week > >> we > >> >>>>> are planing to rebuild our database to increase RAM. I checked > >> statistics > >> >>>>> of gengchc2 on clean database and it's complete correct. > >> >>>>> So your problem will be solved in several days. It will take so > long > >> >>>>> time because full rebuild of DB takes 48 hours, but we need to > test > >> our > >> >>>>> migration process first to keep zero down time. > >> >>>>> I'll share a results with you here when the process will be > finished. > >> >>>>> Thank you for your patience. > >> >>>>> > >> >>>>> Sergey > >> >>>>> > >> >>>>> [1] https://review.opendev.org/#/c/627762/ > >> >>>>> [2] > >> >>>>> > >> > https://www.stackalytics.com/?user_id=jhamhader&project_type=all&release=all&metric=commits&company=&module=freezer-api > >> >>>>> > >> >>>>> > >> >>>>> On Mon, May 6, 2019 at 6:30 AM Rong Zhu > >> wrote: > >> >>>>> > >> >>>>>> Hi Sergey, > >> >>>>>> > >> >>>>>> Do we have any process about my colleague's data loss problem? 
> >> >>>>>> > >> >>>>>> Sergey Nikitin 于2019年4月29日 周一19:57写道: > >> >>>>>> > >> >>>>>>> Thank you for information! I will take a look > >> >>>>>>> > >> >>>>>>> On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu < > aaronzhu1121 at gmail.com> > >> >>>>>>> wrote: > >> >>>>>>> > >> >>>>>>>> Hi there, > >> >>>>>>>> > >> >>>>>>>> Recently we found we lost a person's data from our company at > the > >> >>>>>>>> stackalytics website. > >> >>>>>>>> You can check the merged patch from [0], but there no date from > >> >>>>>>>> the stackalytics website. > >> >>>>>>>> > >> >>>>>>>> stackalytics info as below: > >> >>>>>>>> Company: ZTE Corporation > >> >>>>>>>> Launchpad: 578043796-b > >> >>>>>>>> Gerrit: gengchc2 > >> >>>>>>>> > >> >>>>>>>> Look forward to hearing from you! > >> >>>>>>>> > >> >>>>>>> > >> >>>>>> Best Regards, > >> >>>>>> Rong Zhu > >> >>>>>> > >> >>>>>>> > >> >>>>>>>> -- > >> >>>>>> Thanks, > >> >>>>>> Rong Zhu > >> >>>>>> > >> >>>>> > >> >>>>> > >> >>>>> -- > >> >>>>> Best Regards, > >> >>>>> Sergey Nikitin > >> >>>>> > >> >>>> -- > >> >>>> Thanks, > >> >>>> Rong Zhu > >> >>>> > >> >>> > >> >>> > >> >>> -- > >> >>> Best Regards, > >> >>> Sergey Nikitin > >> >>> > >> >> > >> >> > >> >> -- > >> >> Best Regards, > >> >> Sergey Nikitin > >> >> > >> >-- > >> >Thanks, > >> >Rong Zhu > >> > > > > > >-- > >Best Regards, > >Sergey Nikitin > -- Best Regards, Sergey Nikitin -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at binero.se Fri May 24 12:22:02 2019 From: tobias.urdin at binero.se (Tobias Urdin) Date: Fri, 24 May 2019 14:22:02 +0200 Subject: [tripleo][kolla][osa][nova][deployment] Removing nova-consoleauth In-Reply-To: References: <44847531558698143@sas2-985f744271ca.qloud-c.yandex.net> Message-ID: Hello, Seems like we have missed deprecating this behavior in Puppet OpenStack in Stein. I've pushed two patches to have the functionality removed and classes/params deprecated as per our process [1]. I haven't researched further if any more action is required, so please let me know if something needs to be added. Best regards [1] https://review.opendev.org/#/q/topic:deprecate-consoleauth On 05/24/2019 02:04 PM, Emilien Macchi wrote: > Thanks to @Martin Schuppert , the work > has been done in TripleO. > > On Fri, May 24, 2019 at 7:51 AM Dmitriy Rabotyagov > > wrote: > > Hi, > > OSA already dropped the 'nova-consoleauth' service with this patch > [1]. > It was also backported to stein, so since rocky we do not deploy > it anymore. > > But still thanks for the notification. > > [1] https://review.opendev.org/#/c/649202/ > > > 24.05.2019, 14:16, "Stephen Finucane" >: > > In my continued efforts to remove as much nova code as possible > in one > > cycle, I've set my sights on the 'nova-consoleauth' service. Since > > Rocky [1], 'nova-consoleauth' is no longer needed and we now > store and > > retrieve tokens from the database. The only reason to still deploy > > 'nova-consoleauth' was to support cells v1 or to provide a > window where > > existing tokens could continue to be validated before everything > > switched over to the new model, but we're also in the process of > > removing cells v1 [2] and two cycles in quite a large window in > which > > to migrate things. > > > > I've the work done from the nova side but before we can merge > anything, > > we need to remove support for nova-consoleauth from the various > > deployment projects and anything else that relies on it at the > moment. 
> > I have talked to some folks internally about doing this in > TripleO this > > cycle and I have a (likely wrong) patch proposed against Kolla > [3] for > > this, but I haven't been able to figure out how/if OSA are deploying > > the service and would appreciate some help here. I'd also like it if > > people could let me know if there are any other potential > blockers out > > there that we should be aware of before we proceed with this. > > > > Cheers, > > Stephen > > > > [1] > https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html > > [2] https://blueprints.launchpad.net/nova/+spec/remove-cells-v1/ > > > [3] https://review.opendev.org/#/c/661251/ > > > -- > Kind Regards, > Dmitriy Rabotyagov > > > > > -- > Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Fri May 24 13:05:34 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Fri, 24 May 2019 08:05:34 -0500 Subject: Fwd: Summit video website shenanigans In-Reply-To: <5CE6CF37.6020504@openstack.org> References: <022A1281-449C-4E58-9648-038604714BA3@openstack.org> <5CE6CF37.6020504@openstack.org> Message-ID: <5CE7EC1E.1000002@openstack.org> Alrighty, this video is fixed: https://www.openstack.org/videos/summits/denver-2019/openstack-troubleshooting-field-survival-guide-1 Thanks again for pointing it out! Let us know if you have any further troubles. Cheers, Jimmy > Jimmy McArthur > May 23, 2019 at 11:49 AM > Hey Matt, > > I responded to the cut-off video via speaker support. We had 8 videos > that were truncated, but most were only by a few seconds. Yours was > the unfortunate 15 minute chop and I'm afraid that one is > irretrievable. Look over the option I sent in speaker support ticket > and let me know if that's something you're interested in. > > Re: > https://www.openstack.org/summit/denver-2019/summit-schedule/events/23234/openstack-troubleshooting-field-survival-guide > > This one appears to be blocked as a duplicate video by YouTube. I'm > trying to sort this out as we speak. I'll update this thread as soon > as I have an answer. > > If anyone else out there notices any other shenanigans, please don't > hesitate to respond to this thread or directly via > speakersupport at openstack.org where we have our devs and Foundation > support staff all looking out. > > Cheers, > Jimmy > > Allison Price wrote: > > Allison Price > May 23, 2019 at 11:21 AM > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Fri May 24 13:26:25 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Fri, 24 May 2019 15:26:25 +0200 Subject: [DVR config] Can we use drv_snat agent_mode in every compute node? In-Reply-To: <55f84d63363640b480ff5bfd6013e895@inspur.com> References: <67d4e0f3053949fc844b6d1d26f05559@inspur.com> <279f1523-bfcd-9863-c5d6-7cb190f7142b@gmail.com> <58f85a3e3f1449cebdf59f7e16e7090e@inspur.com> <1B6127C7-2794-40F4-BEED-6CD40DDB4BD9@redhat.com> <55f84d63363640b480ff5bfd6013e895@inspur.com> Message-ID: Hi, I’m not expert in spine-leaf topology TBH but I know that for L3HA neutron creates for each tenant, “tenant network” which usually is vxlan or gre tunnels network. And this works like any other vxlan network created in neutron. So tunnels are established between nodes using L3 and it transports “tenant L2” inside vxlan packets, right? In the same way works this network created for L3 HA needs. 
And it transports VRRP packets inside this tunnel network (which often is vxlan network). > On 20 May 2019, at 09:33, Yi Yang (杨燚)-云服务集团 wrote: > > Hi, Slawomir, do you mean VRRP over VXLAN? I mean servers in leaf switch are attached to the leaf switch by VLAN and servers handle VxLAN encap and decap, for such case, how can leaf-spine transport a L2 packet to another server in another leaf switch? > > -----邮件原件----- > 发件人: Slawomir Kaplonski [mailto:skaplons at redhat.com] > 发送时间: 2019年5月20日 15:13 > 收件人: Yi Yang (杨燚)-云服务集团 > 抄送: haleyb.dev at gmail.com; openstack-discuss at lists.openstack.org > 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? > 重要性: 高 > > Hi, > >> On 20 May 2019, at 02:07, Yi Yang (杨燚)-云服务集团 wrote: >> >> Brian, thank for your reply. So if I configure 3 compute nodes of many compute node as drv_snat, it doesn't have substantial difference from the case that I configure 3 single network nodes as snat gateway except deployment difference, right? Another question, it doesn't use HA even if we have multiple dvr_snat nodes, right? If enable l3_ha, I think one external router will be scheduled in multiple (2 at least) dvr_snat nodes, for that case, IPs of these HA routers for this one router are same one and are activated by VRRP, right? For l3_ha, two or multiple HA l3 nodes must be in the same L2 network because it uses VRRP (keepalived) to share a VIP, right? For that case, how can we make sure VRRP can work well across leaf switches in a L3 leaf-spine network (servers are connected to leaf switch by L2)? > > That is correct what You are saying. In DVR-HA case, SNAT nodes are working in same way like in “standard” L3HA. So it’s active-backup config and keepalived is deciding which node is active. > Neutron creates “HA network” for tenant to use for keepalived. It can be e.g. vxlan network and that way You will have L2 between such nodes (routers). > >> >> -----邮件原件----- >> 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] >> 发送时间: 2019年5月17日 22:11 >> 收件人: Yi Yang (杨燚)-云服务集团 >> 抄送: openstack-discuss at lists.openstack.org >> 主题: Re: 答复: [DVR config] Can we use drv_snat agent_mode in every compute node? >> >> On 5/16/19 8:29 PM, Yi Yang (杨燚)-云服务集团 wrote: >>> Thanks Brian, your explanation clarified something, but I don't get the answer if we can have multiple compute nodes are configured to dvr_snat, for this case, SNAT IPs are obviously different. Why do we want to use network node if compute node can do everything? >> >> Hi Yi, >> >> There will only be one DVR SNAT IP allocated for a router on the external network, and only one router scheduled using it, so having dvr_snat mode on a compute node doesn't mean that North/South router will be local, only the East/West portion might be. >> >> Typically people choose to place these on separate systems since the requirements of the role are different - network node could have fewer cores and a 10G nic for higher bandwidth, compute node could have lots of cores for instances but maybe a 1G nic. There's no reason you can't run dvr_snat everywhere, I would just say it's not common. >> >> -Brian >> >> >>> -----邮件原件----- >>> 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] >>> 发送时间: 2019年5月16日 21:46 >>> 收件人: Yi Yang (杨燚)-云服务集团 >>> 抄送: openstack-discuss at lists.openstack.org >>> 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? >>> >>> Hi Yi, >>> >>> I'm a little confused by the question, comments inline. 
>>> >>> On 5/15/19 11:47 PM, Yi Yang (杨燚)-云服务集团 wrote: >>>> Hi, folks >>>> >>>> I saw somebody discussed distributed SNAT, but finally they didn’t >>>> make agreement on how to implement distributed SNAT, my question is >>>> can we use dvr_snat agent_mode in compute node? I understand >>>> dvr_snat only does snat but doesn’t do east west routing, right? Can >>>> we set dvr_snat and dvr in one compute node at the same time? It is >>>> equivalent to distributed SNAT if we can set drv_snat in every >>>> compute node, isn’t right? I know Opendaylight can do SNAT in >>>> compute node in distributed way, but one external router only can run in one compute node. >>> >>> Distributed SNAT is not available in neutron, there was a spec >>> proposed recently though, https://review.opendev.org/#/c/658414 >>> >>> Regarding the agent_mode setting for L3, only one mode can be set at a time. Typically 'dvr_snat' is used on network nodes and 'dvr' on compute nodes because it leads to less resource usage (i.e. namespaces). >>> The centralized part of the router hosting the default SNAT IP address will only be scheduled to one of the agents in 'dvr_snat' mode. All the DVR modes can do East/West routing when an instance is scheduled to the node, and two can do North/South - 'dvr_snat' using the default SNAT IP, and 'dvr' using a floating IP. 'dvr_no_external' can only do East/West. >>> >>> Hopefully that clarifies things. >>> >>> -Brian >>> >>>> I also see https://wiki.openstack.org/wiki/Dragonflow is trying to >>>> implement distributed SNAT, what are technical road blocks for >>>> distributed SNAT in openstack dvr? Do we have any good way to remove >>>> these road blocks? >>>> >>>> Thank you in advance and look forward to getting your replies and insights. >>>> >>>> Also attached official drv configuration guide for your reference. >>>> >>>> https://docs.openstack.org/neutron/stein/configuration/l3-agent.html >>>> >>>> |agent_mode|¶ >>>> >>> l >>>> # >>>> DEFAULT.agent_mode> >>>> >>>> Type >>>> >>>> string >>>> >>>> Default >>>> >>>> legacy >>>> >>>> Valid Values >>>> >>>> dvr, dvr_snat, legacy, dvr_no_external >>>> >>>> The working mode for the agent. Allowed modes are: ‘legacy’ - this >>>> preserves the existing behavior where the L3 agent is deployed on a >>>> centralized networking node to provide L3 services like DNAT, and SNAT. >>>> Use this mode if you do not want to adopt DVR. ‘dvr’ - this mode >>>> enables DVR functionality and must be used for an L3 agent that runs >>>> on a compute host. ‘dvr_snat’ - this enables centralized SNAT >>>> support in conjunction with DVR. This mode must be used for an L3 >>>> agent running on a centralized node (or in single-host deployments, e.g. devstack). >>>> ‘dvr_no_external’ - this mode enables only East/West DVR routing >>>> functionality for a L3 agent that runs on a compute host, the >>>> North/South functionality such as DNAT and SNAT will be provided by >>>> the centralized network node that is running in ‘dvr_snat’ mode. >>>> This mode should be used when there is no external network >>>> connectivity on the compute host. 
>>>> > > — > Slawek Kaplonski > Senior software engineer > Red Hat > — Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Fri May 24 13:47:11 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Fri, 24 May 2019 15:47:11 +0200 Subject: [heat][neutron] improving extraroute support In-Reply-To: References: <8b4f8152-b6d0-2145-104b-300bfd479ca8@redhat.com> <7998fbfd-1262-237e-8c59-a96dec00f8eb@redhat.com> <8748D38C-4ACA-4823-9C24-52260AC8A058@redhat.com> Message-ID: <7E842116-486D-4992-BCDB-3EF00410109B@redhat.com> Hi Bence, I posted my comment there (in PS6). > On 21 May 2019, at 10:17, Bence Romsics wrote: > > Hi All, > > Some of you may not be aware yet that a new concern was raised > regarding the extraroute improvement plans just after the last neutron > session was closed on the PTG. > > It seems we have a tradeoff between the support for the use case of > tracking multiple needs for the same extra route or keeping the > virtual router abstraction as simple as it was in the past. > > I'm raising the question of this tradeoff here in the mailing list > because this (I hope) seems to be the last cross-project question of > this topic. If we could find a cross-project consensus on this I could > continue making progress inside each project without need for further > cross-project coordination. Please help me find this consensus. > > I don't want to unnecessarily repeat arguments already made. I think > the question is clearly formulated in the comments of patch sets 5, 6 > and 8 of the below neutron-spec: > > https://review.opendev.org/655680 Improve Extraroute API > > All opinions, comments, questions are welcome there. > > Thanks in advance, > Bence (rubasov) > — Slawek Kaplonski Senior software engineer Red Hat From david.ames at canonical.com Fri May 24 14:57:20 2019 From: david.ames at canonical.com (David Ames) Date: Fri, 24 May 2019 07:57:20 -0700 Subject: [charms] Proposing Sahid Orentino Ferdjaoui to the Charms core team In-Reply-To: <17abd9ed-e76d-52b3-29b1-6d6ae75161bf@canonical.com> References: <17abd9ed-e76d-52b3-29b1-6d6ae75161bf@canonical.com> Message-ID: +1 On Fri, May 24, 2019 at 3:34 AM Chris MacNaughton wrote: > > Hello all, > > I would like to propose Sahid Orentino Ferdjaoui as a member of the Charms core team. > > Chris MacNaughton From sfinucan at redhat.com Fri May 24 15:09:25 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 24 May 2019 16:09:25 +0100 Subject: [tripleo][kolla][osa][nova][deployment] Removing nova-consoleauth In-Reply-To: References: <44847531558698143@sas2-985f744271ca.qloud-c.yandex.net> Message-ID: Thanks for the replies, all. Looks like we're in pretty good shape to proceed with the removal this cycle. I've since drafted patches to deprecate the service in RDO/OSP too [1]. Other distros should probably do the same at some point in the next few weeks. Stephen [1] https://review.rdoproject.org/r/20922 On Fri, 2019-05-24 at 14:22 +0200, Tobias Urdin wrote: > Hello, > > > > Seems like we have missed deprecating this behavior in Puppet > OpenStack in Stein. > > I've pushed two patches to have the functionality removed and > classes/params deprecated as per our process [1]. > > > > I haven't researched further if any more action is required, so > please let me know if something needs to be added. 
> > Best regards > > > > [1] https://review.opendev.org/#/q/topic:deprecate-consoleauth > > > > On 05/24/2019 02:04 PM, Emilien Macchi > wrote: > > > > > > > > > Thanks to > > @Martin Schuppert , the work has been done in TripleO. > > > > > > > > > > On Fri, May 24, 2019 at 7:51 > > AM Dmitriy Rabotyagov wrote: > > > > > > > > > Hi, > > > > > > > > > > > > OSA already dropped the 'nova-consoleauth' service with > > > this > > > patch [1]. > > > > > > It was also backported to stein, so since rocky we do > > > not > > > deploy it anymore. > > > > > > > > > > > > But still thanks for the notification. > > > > > > > > > > > > [1] > > > https://review.opendev.org/#/c/649202/ > > > > > > > > > > > > 24.05.2019, 14:16, "Stephen Finucane" < > > > sfinucan at redhat.com>: > > > > > > > In my continued efforts to remove as much nova code > > > as > > > possible in one > > > > > > > cycle, I've set my sights on the 'nova-consoleauth' > > > service. Since > > > > > > > Rocky [1], 'nova-consoleauth' is no longer needed and > > > we > > > now store and > > > > > > > retrieve tokens from the database. The only reason to > > > still deploy > > > > > > > 'nova-consoleauth' was to support cells v1 or to > > > provide > > > a window where > > > > > > > existing tokens could continue to be validated before > > > everything > > > > > > > switched over to the new model, but we're also in the > > > process of > > > > > > > removing cells v1 [2] and two cycles in quite a large > > > window in which > > > > > > > to migrate things. > > > > > > > > > > > > > > I've the work done from the nova side but before we > > > can > > > merge anything, > > > > > > > we need to remove support for nova-consoleauth from > > > the > > > various > > > > > > > deployment projects and anything else that relies on > > > it > > > at the moment. > > > > > > > I have talked to some folks internally about doing > > > this > > > in TripleO this > > > > > > > cycle and I have a (likely wrong) patch proposed > > > against > > > Kolla [3] for > > > > > > > this, but I haven't been able to figure out how/if > > > OSA > > > are deploying > > > > > > > the service and would appreciate some help here. I'd > > > also > > > like it if > > > > > > > people could let me know if there are any other > > > potential > > > blockers out > > > > > > > there that we should be aware of before we proceed > > > with > > > this. > > > > > > > > > > > > > > Cheers, > > > > > > > Stephen > > > > > > > > > > > > > > [1] > > > https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html > > > > > > > [2] > > > > > > https://blueprints.launchpad.net/nova/+spec/remove-cells-v1/ > > > > > > > [3] > > > https://review.opendev.org/#/c/661251/ > > > > > > > > > > > > -- > > > > > > Kind Regards, > > > > > > Dmitriy Rabotyagov > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Emilien Macchi > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Fri May 24 15:21:49 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 24 May 2019 10:21:49 -0500 Subject: [tripleo][kolla][osa][nova][deployment] Removing nova-consoleauth In-Reply-To: References: <44847531558698143@sas2-985f744271ca.qloud-c.yandex.net> Message-ID: <8529cd2e-0c03-843b-f5f5-e7c0f55f4a03@gmail.com> On 5/24/2019 7:22 AM, Tobias Urdin wrote: > Seems like we have missed deprecating this behavior in Puppet OpenStack > in Stein. 
> I've pushed two patches to have the functionality removed and > classes/params deprecated as per our process [1]. > > I haven't researched further if any more action is required, so please > let me know if something needs to be added. A check was added to "nova-status upgrade check" in Stein (and backported to Rocky) [1] so I don't know if you're running those upgrade checks but that could be potentially useful during an upgrade, but at the same time if you're just not deploying the service anymore (in Train?) then I'm not sure how useful it would be. [1] https://review.opendev.org/#/c/611214/ -- Thanks, Matt From aj at suse.com Fri May 24 15:31:04 2019 From: aj at suse.com (Andreas Jaeger) Date: Fri, 24 May 2019 17:31:04 +0200 Subject: Retiring TripleO-UI - no longer supported In-Reply-To: <3924F5DE-314C-4D41-8CEA-DCF7A2A2CDEA@redhat.com> References: <3924F5DE-314C-4D41-8CEA-DCF7A2A2CDEA@redhat.com> Message-ID: On 5/23/19 10:35 PM, Jason Rist wrote: > Hi everyone - I’m writing the list to announce that we are retiring > TripleO-UI and it will no longer be supported. It’s already deprecated > in Zuul and removed from requirements, so I’ve submitted a patch to > remove all code.  > > https://review.opendev.org/661113 We discussed in IRC briefly, but I like to bring this up here as well: So, you plan to stop development for tripleo-ui on all stable branches as well, is that really your intention? Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From jrist at redhat.com Fri May 24 15:57:27 2019 From: jrist at redhat.com (Jason Rist) Date: Fri, 24 May 2019 09:57:27 -0600 Subject: Retiring TripleO-UI - no longer supported In-Reply-To: References: <3924F5DE-314C-4D41-8CEA-DCF7A2A2CDEA@redhat.com> Message-ID: No, there might be stable branch work, but going forward no additional features or work will be done against master, and I will additionally be retiring associated projects such as openstack/ansible-role-tripleo-ui and code relating to tripleo-ui in puppet-triple and tripleoclient. I will follow-up on this thread with additional links. -J Jason Rist Red Hat jrist / knowncitizen ` > On May 24, 2019, at 9:31 AM, Andreas Jaeger wrote: > > On 5/23/19 10:35 PM, Jason Rist wrote: >> Hi everyone - I’m writing the list to announce that we are retiring >> TripleO-UI and it will no longer be supported. It’s already deprecated >> in Zuul and removed from requirements, so I’ve submitted a patch to >> remove all code. >> >> https://review.opendev.org/661113 > > We discussed in IRC briefly, but I like to bring this up here as well: > > So, you plan to stop development for tripleo-ui on all stable branches > as well, is that really your intention? > > Andreas > -- > Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah > HRB 21284 (AG Nürnberg) > GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mark at stackhpc.com Fri May 24 16:05:12 2019 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 24 May 2019 17:05:12 +0100 Subject: [kolla] Stepping down from core reviewer In-Reply-To: References: Message-ID: On Thu, 23 May 2019, 20:38 Martin André, wrote: > Hi all, > > It became clear over the past few months I no longer have the time to > contribute to Kolla in a meaningful way and would like to step down > from core reviewer. It was an honor to be part of this great team, you > fools who trusted me enough to give me +2 powers. Thanks, and long > live Kolla! > > Martin > Very sorry to hear this Martin. Thanks for your contributions to the project over the years. If you find you have time again in future you will of course be welcome to rejoin the core team. I will update the group in Gerrit. Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri May 24 17:24:32 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 24 May 2019 17:24:32 +0000 Subject: [all][qinling] Please check your README files In-Reply-To: <3b416622-495d-713c-7ab8-6f46a3295dca@linaro.org> References: <3b416622-495d-713c-7ab8-6f46a3295dca@linaro.org> Message-ID: <20190524172431.l7isfxobtzm7quwj@yuggoth.org> On 2019-05-24 10:19:49 +0200 (+0200), Marcin Juszkiewicz wrote: [...] > I hope that PBR issue gets fixes soon, then openstack/requirements > gets PBR version bump so we can revert that change to show IPA > characters again. The PBR constraint in global requirements is really only used in places like integration testing environments where it's explicitly preinstalled. For situations where it gets pulled in by setuptools (which is most times where a job pip installs a project using it), all that really matters is what the latest version allowed by the setup_requires list argument to the setuptools.setup() function call in its setup.py file is. That call never gets constrained anyway. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From snikitin at mirantis.com Fri May 24 18:03:58 2019 From: snikitin at mirantis.com (Sergey Nikitin) Date: Fri, 24 May 2019 22:03:58 +0400 Subject: [stackalytics] Reported numbers seem inaccurate In-Reply-To: References: <20190522213927.iuty4y5mrgw7dmjt@pacific.linksys.moosehall> <20190523215335.w3e5cnqt5tl7f2wr@arabian.linksys.moosehall> Message-ID: Hi, Looks like data finally was processed! [1] Thank your for your notification! If you will find some other problems please let me know. Sergey [1] https://www.stackalytics.com/?metric=commits&release=train&project_type=all&module=nova On Fri, May 24, 2019 at 4:16 PM Sergey Nikitin wrote: > Hi, > Yes, data synchronization takes up to 24 hours. So we have to wait. > I'll inform you when the process will be finished. > > On Fri, May 24, 2019 at 1:53 AM Adam Spiers wrote: > >> Thanks for looking at this. Maybe I'm just being too impatient and >> the data is still synchronising, but now I only see 4 commits to nova >> in May, and there have definitely been a *lot* more than that :-) >> >> https://opendev.org/openstack/nova/commits/branch/master >> >> Sergey Nikitin wrote: >> >Thank you for message! >> >yes, I guess new train release wasn't added into repos (just on drop >> down). >> >I'll fix it now. >> > >> >On Thu, May 23, 2019 at 1:39 AM Adam Spiers wrote: >> > >> >> There are still issues. 
For example nova is not showing any commits >> >> since April: >> >> >> >> >> >> >> https://www.stackalytics.com/?metric=commits&release=train&project_type=all&module=nova >> >> >> >> Rong Zhu wrote: >> >> >Hi Sergey, >> >> > >> >> >Thanks for your help. Now the numbers are correctly. >> >> > >> >> > >> >> >Sergey Nikitin 于2019年5月19日 周日21:12写道: >> >> > >> >> >> Hi, Rong, >> >> >> >> >> >> Database was rebuild and now stats o gengchc2 [1] is correct [2]. >> >> >> >> >> >> [1] >> >> >> >> >> >> https://www.stackalytics.com/?release=all&metric=commits&project_type=all&user_id=578043796-b >> >> >> [2] https://review.opendev.org/#/q/owner:gengchc2,n,z >> >> >> >> >> >> Sorry for delay, >> >> >> Sergey >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Fri, May 17, 2019 at 6:20 PM Sergey Nikitin < >> snikitin at mirantis.com> >> >> >> wrote: >> >> >> >> >> >>> Testing of migration process shown us that we have to rebuild >> database >> >> >>> "on live". >> >> >>> Unfortunately it means that during rebuild data will be >> incomplete. I >> >> >>> talked with the colleague who did it previously and he told me that >> >> it's >> >> >>> normal procedure. >> >> >>> I got these results on Monday and at this moment I'm waiting for >> >> weekend. >> >> >>> It's better to rebuild database in Saturday and Sunday to do now >> affect >> >> >>> much number of users. >> >> >>> So by the end of this week everything will be completed. Thank you >> for >> >> >>> patient. >> >> >>> >> >> >>> On Fri, May 17, 2019 at 6:15 AM Rong Zhu >> >> wrote: >> >> >>> >> >> >>>> Hi Sergey, >> >> >>>> >> >> >>>> What is the process about rebuild the database? >> >> >>>> >> >> >>>> Thanks, >> >> >>>> Rong Zhu >> >> >>>> >> >> >>>> Sergey Nikitin 于2019年5月7日 周二00:59写道: >> >> >>>> >> >> >>>>> Hello Rong, >> >> >>>>> >> >> >>>>> Sorry for long response. I was on a trip during last 5 days. >> >> >>>>> >> >> >>>>> What I have found: >> >> >>>>> Lets take a look on this patch [1]. It must be a contribution of >> >> >>>>> gengchc2, but for some reasons it was matched to Yuval Brik [2] >> >> >>>>> I'm still trying to find a root cause of it, but anyway on this >> week >> >> we >> >> >>>>> are planing to rebuild our database to increase RAM. I checked >> >> statistics >> >> >>>>> of gengchc2 on clean database and it's complete correct. >> >> >>>>> So your problem will be solved in several days. It will take so >> long >> >> >>>>> time because full rebuild of DB takes 48 hours, but we need to >> test >> >> our >> >> >>>>> migration process first to keep zero down time. >> >> >>>>> I'll share a results with you here when the process will be >> finished. >> >> >>>>> Thank you for your patience. >> >> >>>>> >> >> >>>>> Sergey >> >> >>>>> >> >> >>>>> [1] https://review.opendev.org/#/c/627762/ >> >> >>>>> [2] >> >> >>>>> >> >> >> https://www.stackalytics.com/?user_id=jhamhader&project_type=all&release=all&metric=commits&company=&module=freezer-api >> >> >>>>> >> >> >>>>> >> >> >>>>> On Mon, May 6, 2019 at 6:30 AM Rong Zhu >> >> wrote: >> >> >>>>> >> >> >>>>>> Hi Sergey, >> >> >>>>>> >> >> >>>>>> Do we have any process about my colleague's data loss problem? >> >> >>>>>> >> >> >>>>>> Sergey Nikitin 于2019年4月29日 周一19:57写道: >> >> >>>>>> >> >> >>>>>>> Thank you for information! 
I will take a look >> >> >>>>>>> >> >> >>>>>>> On Mon, Apr 29, 2019 at 3:47 PM Rong Zhu < >> aaronzhu1121 at gmail.com> >> >> >>>>>>> wrote: >> >> >>>>>>> >> >> >>>>>>>> Hi there, >> >> >>>>>>>> >> >> >>>>>>>> Recently we found we lost a person's data from our company at >> the >> >> >>>>>>>> stackalytics website. >> >> >>>>>>>> You can check the merged patch from [0], but there no date >> from >> >> >>>>>>>> the stackalytics website. >> >> >>>>>>>> >> >> >>>>>>>> stackalytics info as below: >> >> >>>>>>>> Company: ZTE Corporation >> >> >>>>>>>> Launchpad: 578043796-b >> >> >>>>>>>> Gerrit: gengchc2 >> >> >>>>>>>> >> >> >>>>>>>> Look forward to hearing from you! >> >> >>>>>>>> >> >> >>>>>>> >> >> >>>>>> Best Regards, >> >> >>>>>> Rong Zhu >> >> >>>>>> >> >> >>>>>>> >> >> >>>>>>>> -- >> >> >>>>>> Thanks, >> >> >>>>>> Rong Zhu >> >> >>>>>> >> >> >>>>> >> >> >>>>> >> >> >>>>> -- >> >> >>>>> Best Regards, >> >> >>>>> Sergey Nikitin >> >> >>>>> >> >> >>>> -- >> >> >>>> Thanks, >> >> >>>> Rong Zhu >> >> >>>> >> >> >>> >> >> >>> >> >> >>> -- >> >> >>> Best Regards, >> >> >>> Sergey Nikitin >> >> >>> >> >> >> >> >> >> >> >> >> -- >> >> >> Best Regards, >> >> >> Sergey Nikitin >> >> >> >> >> >-- >> >> >Thanks, >> >> >Rong Zhu >> >> >> > >> > >> >-- >> >Best Regards, >> >Sergey Nikitin >> > > > -- > Best Regards, > Sergey Nikitin > -- Best Regards, Sergey Nikitin -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Fri May 24 19:27:18 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 24 May 2019 14:27:18 -0500 Subject: [watcher] Question about baremetal node support in nova CDM In-Reply-To: <53585cb2-a207-58ff-588a-6c9694f8245f@dantalion.nl> References: <53585cb2-a207-58ff-588a-6c9694f8245f@dantalion.nl> Message-ID: <89845749-876d-e7ac-bd49-9c262ed56e43@gmail.com> On 5/24/2019 1:39 AM, info at dantalion.nl wrote: > I think we should look into if bare metal > nodes are stored in the compute_model as I think it would more sense to > filter them out. The tricky thing with this would be there isn't a great way to identify a baremetal node from a kvm node, for example. There is a hypervisor_type column on the compute_nodes table in the nova cell DB, but it's not exposed in the API. Two obvious differences would be: 1. The hypervisor_hostname on an ironic node in the os-hypervisors API is a UUID rather than a normal hostname. That could be one way to try and identify an ironic node (hypervisor). 2. For servers, the associated flavor should have a CUSTOM resource class extra spec associated with it and the VCPU, DISK_GB, and MEMORY_MB resource classes should also be zero'ed out in the flavor per [1]. The server OS-EXT-SRV-ATTR:hypervisor_hostname field would also be a UUID like above (the UUID is the ironic node ID). [1] https://docs.openstack.org/ironic/latest/install/configure-nova-flavors.html -- Thanks, Matt From jp.methot at planethoster.info Fri May 24 19:46:35 2019 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Fri, 24 May 2019 15:46:35 -0400 Subject: [ops] [nova] Wrong network interface model virtio1.0-net given to libvirt Message-ID: <6BC66D84-D349-465C-A3B3-392E7E6EA4B7@planethoster.info> Hi, We’re setting up 4 new compute nodes in our openstack Pike setup and 2 of them are behaving strangely, despite having the same hardware as the others. 
When I try to create a new instance on these compute nodes, the instance fail to spawn and the following error message appears in the log : 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager Traceback (most recent call last): 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2203, in _build_resources 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager yield resources 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2018, in _build_and_run_instance 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager block_device_info=block_device_info) 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2898, in spawn 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager block_device_info=block_device_info) 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5220, in _get_guest_xml 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager context) 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5035, in _get_guest_config 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager flavor, virt_type, self._host) 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/vif.py", line 558, in get_config 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager vnic_type) 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/vif.py", line 521, in _get_config_os_vif 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager inst_type, virt_type, vnic_type) 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/vif.py", line 134, in get_base_config 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager virt=virt_type) 2019-05-24 15:13:35.559 16944 ERROR nova.compute.manager UnsupportedHardware: Requested hardware 'virtio1.0-net' is not supported by the 'kvm' virt driver After doing some research on google, I understand that this happens when the network interface model is defined as virtio1.0-net in the xml that’s fed to libvirt and libvirt does not consider that a valid name for a network interface model. That said, why is Nova passing that model name to libvirt? I understand that it comes from libosinfo, but how is Nova getting that (wrong) information? If I look on other compute nodes, the model for the network interface is simply virtio. Best regards, Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Fri May 24 20:04:52 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 24 May 2019 15:04:52 -0500 Subject: [nova] Should we have a known issue reno for bug 1829062 in 19.0.1 (stein)? Message-ID: I've got a release request for stein 19.0.1 [1] but am holding it up to figure out if we should have a known issue release note for the nova-api + wsgi + eventlet monkey patch bug(s) [2][3]. 
[4] contains a workaround to disable the eventlet monkeypatching which it sounds like StarlingX is using for now, but is not really something we're recommending for production (setting OS_NOVA_DISABLE_EVENTLET_PATCHING=1). Sean Mooney has another workaround [5]. Should we try to clean that up for a known issue before we release 19.0.1 given it's the first release since the Stein GA? I tend to think "yes". [1] https://review.opendev.org/#/c/661376/ [2] https://bugs.launchpad.net/nova/+bug/1829062 [3] https://bugs.launchpad.net/nova/+bug/1825584 [4] https://review.opendev.org/#/c/647310/ [5] https://bugs.launchpad.net/nova/+bug/1829062/comments/7 -- Thanks, Matt From gagehugo at gmail.com Fri May 24 20:44:15 2019 From: gagehugo at gmail.com (Gage Hugo) Date: Fri, 24 May 2019 15:44:15 -0500 Subject: [Security SIG] Weekly Newsletter - May 23rd 2019 Message-ID: At the Denver Summit, one of the forum sessions was a PTL Tips & Tricks session[0] where one topic was sending out a project update email. Other projects/SIGs seem to do this from time-to-time (this idea was mostly inspired by Keystone's weekly newsletter, thanks cmurphy!) and the plan for the Security SIG to do something similar was discussed during this week's meeting and seemed to have unanimous approval. So starting this week, the Security SIG will begin sending out a weekly newsletter, the overall goal of this is to provide updates to the happenings of the Security SIG as well as provide insight to the current security happenings within OpenStack. As the amount of content varies week to week, the occurrence may be tweaked in the future to something bi-weekly or monthly as we see how this goes. [0] https://etherpad.openstack.org/p/DEN-ptl-tips-and-tricks If there's anything else you would like to see here or feedback you'd like to give, please feel free to respond here, reach out via IRC in #openstack-security, and/or comment in the newsletter etherpad here: https://etherpad.openstack.org/p/security-sig-newsletter. Thanks! # Week of: 23 May 2019 - Security SIG Meeting Info: http://eavesdrop.openstack.org/#Security_SIG_meeting - Weekly on Thursday at 1500 UTC in #openstack-meeting - Agenda: https://etherpad.openstack.org/p/security-agenda - https://security.openstack.org/ - https://wiki.openstack.org/wiki/Security-SIG ## Meeting Notes - Summary: http://eavesdrop.openstack.org/meetings/security/2019/security.2019-05-23-15.00.txt - TL;DR: During this week's meeting, we discussed the two bugs/stories listed below, as well as the idea of sending out some Security SIG newsletter. ## VMT Bug List A full list of publicly marked security issues can be found here: https://bugs.launchpad.net/ossa/ Updates from this week: - Security Group filtering hides rules from user Edit: https://bugs.launchpad.net/ossa/+bug/1824248 - This was made public this week, and multiple fixes have been submitted. - SQL Injection vulnerability in node_cache: https://storyboard.openstack.org/#!/story/2005678 - Made public this week, multiple fixes have been submitted/merged -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongbin034 at gmail.com Sat May 25 04:05:41 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Sat, 25 May 2019 00:05:41 -0400 Subject: [infra] api reference not updated Message-ID: Hi infra team, I have a patch to update the Zun API reference: https://review.opendev.org/#/c/658722/ . The patch is merged but the API reference [1] doesn't seem pick up the change. Is anything wrong in somewhere? 
[1] https://developer.openstack.org/api-ref/application-container/ Best regards, Hongbin -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhang.lei.fly+os-discuss at gmail.com Sat May 25 04:56:53 2019 From: zhang.lei.fly+os-discuss at gmail.com (Jeffrey Zhang) Date: Sat, 25 May 2019 12:56:53 +0800 Subject: [tripleo][kolla][osa][nova][deployment] Removing nova-consoleauth In-Reply-To: References: Message-ID: Hi Stephen, Thanks for the patch in Kolla side. But we need remove the code in kolla-ansible at first, then remove kolla image. Otherwise, kolla-ansible CI will be busted. I pushed the kolla-ansible side patch[1]. Could you add a Depends-On for your patch? Thanks, Jeffrey4l [1] https://review.opendev.org/#/c/661427/ On Fri, May 24, 2019 at 7:14 PM Stephen Finucane wrote: > In my continued efforts to remove as much nova code as possible in one > cycle, I've set my sights on the 'nova-consoleauth' service. Since > Rocky [1], 'nova-consoleauth' is no longer needed and we now store and > retrieve tokens from the database. The only reason to still deploy > 'nova-consoleauth' was to support cells v1 or to provide a window where > existing tokens could continue to be validated before everything > switched over to the new model, but we're also in the process of > removing cells v1 [2] and two cycles in quite a large window in which > to migrate things. > > I've the work done from the nova side but before we can merge anything, > we need to remove support for nova-consoleauth from the various > deployment projects and anything else that relies on it at the moment. > I have talked to some folks internally about doing this in TripleO this > cycle and I have a (likely wrong) patch proposed against Kolla [3] for > this, but I haven't been able to figure out how/if OSA are deploying > the service and would appreciate some help here. I'd also like it if > people could let me know if there are any other potential blockers out > there that we should be aware of before we proceed with this. > > Cheers, > Stephen > > [1] > https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/convert-consoles-to-objects.html > [2] https://blueprints.launchpad.net/nova/+spec/remove-cells-v1/ > [3] https://review.opendev.org/#/c/661251/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aj at suse.com Sat May 25 09:41:30 2019 From: aj at suse.com (Andreas Jaeger) Date: Sat, 25 May 2019 11:41:30 +0200 Subject: [infra] api reference not updated In-Reply-To: References: Message-ID: <6f55d5bf-72a1-541c-568e-ebb78cdb1599@suse.com> On 25/05/2019 06.05, Hongbin Lu wrote: > Hi infra team, > > I have a patch to update the Zun API > reference: https://review.opendev.org/#/c/658722/ . The patch is merged > but the API reference [1] doesn't seem pick up the change. Is anything > wrong in somewhere? > > [1] https://developer.openstack.org/api-ref/application-container/ The page was last updated "25 May 2019, 07.00.39 CES" according to page info. Looking at the log files via http://zuul.openstack.org/builds?job_name=publish-api-ref&project=openstack%2Fzun shows that the last run succeeded. When your change merged, we had a bug that was fixed with https://review.opendev.org/#/c/659976/ So, looks like all is fine again, Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 
5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From colleen at gazlene.net Sat May 25 13:47:54 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Sat, 25 May 2019 06:47:54 -0700 Subject: [dev][keystone] M-1 check-in and retrospective meeting Message-ID: Hi team, During the PTG, we agreed to have milestone-ly check-ins in order to try to keep momentum going throughout the cycle. Milestone 1 is already nearly upon us, so it's time to schedule this meeting. I'd like to schedule a two-hour video call during which we'll conduct a brief retrospective of the cycle so far, review our past action items, and refine and reevaluate our plans for the rest of the cycle. I've created a doodle poll[1] to schedule the session for either the week of M-1[2] or the following week. If you have questions, concerns, or thoughts about this meeting, let's discuss it in this thread (or you can message me privately). Colleen [1] https://doodle.com/poll/hyibxqp9h8sgz56p [2] https://releases.openstack.org/train/schedule.html From hongbin034 at gmail.com Sat May 25 14:23:09 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Sat, 25 May 2019 10:23:09 -0400 Subject: [infra] api reference not updated In-Reply-To: <6f55d5bf-72a1-541c-568e-ebb78cdb1599@suse.com> References: <6f55d5bf-72a1-541c-568e-ebb78cdb1599@suse.com> Message-ID: Cool. It is working now. Thanks. On Sat., May 25, 2019, 5:41 a.m. Andreas Jaeger wrote: > On 25/05/2019 06.05, Hongbin Lu wrote: > > Hi infra team, > > > > I have a patch to update the Zun API > > reference: https://review.opendev.org/#/c/658722/ . The patch is merged > > but the API reference [1] doesn't seem pick up the change. Is anything > > wrong in somewhere? > > > > [1] https://developer.openstack.org/api-ref/application-container/ > > The page was last updated "25 May 2019, 07.00.39 CES" according to page > info. > > Looking at the log files via > > http://zuul.openstack.org/builds?job_name=publish-api-ref&project=openstack%2Fzun > > shows that the last run succeeded. > > When your change merged, we had a bug that was fixed with > https://review.opendev.org/#/c/659976/ > > So, looks like all is fine again, > > Andreas > -- > Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi > SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah > HRB 21284 (AG Nürnberg) > GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doka.ua at gmx.com Sat May 25 18:22:41 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Sat, 25 May 2019 21:22:41 +0300 Subject: [neutron] set/find dhcp-server address Message-ID: <5f65989d-1616-a5f5-1c90-a5f6e6e364fe@gmx.com> Dear colleagues, is there way to explicitly assign DHCP address when creating subnet? The issue is that it isn't always first address from allocation pool, e.g. 
$ openstack port list +--------------------------------------+-------+-------------------+------------------------------------------------------------------------------+--------+ | ID | Name | MAC Address | Fixed IP Addresses | Status | +--------------------------------------+-------+-------------------+------------------------------------------------------------------------------+--------+ | 0897bcc4-6cad-479c-8743-ca7cc5a57271 | | 72:d0:1c:d1:6b:51 | ip_address='172.16.53.3', subnet_id='20329549-124c-484d-8278-edca9829e262' | ACTIVE | | | | | ip_address='172.16.54.2', subnet_id='07249cd3-11a9-4da7-a4db-bd838aa8c4e7' | | both subnet have similar configuration of allocation pool (172.16.xx.2-254/24) and there are two different addresses for DHCP in every subnet. This makes a trouble during project generation with pre-assigned addresses for servers if the pre-assigned address is same as [surprisigly, non-first] address of DHCP namespace. And, may be, there is a way to determine this address in more simple way than looking into 'openstack port list' output, searching for port (a) without name and (b) with multiple addresses from all belonging subnets :) At the moment, 'openstack subnet show' say nothing about assigned DHCP-address. Thank you! -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison -------------- next part -------------- An HTML attachment was scrubbed... URL: From doka.ua at gmx.com Sat May 25 20:19:32 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Sat, 25 May 2019 23:19:32 +0300 Subject: [neutron] set/find dhcp-server address In-Reply-To: <5f65989d-1616-a5f5-1c90-a5f6e6e364fe@gmx.com> References: <5f65989d-1616-a5f5-1c90-a5f6e6e364fe@gmx.com> Message-ID: <2e8c680b-d392-6468-bcc2-44449cc30084@gmx.com> Hi, it seems I wasn't first who asked for this - https://wiki.openstack.org/wiki/Neutron/enable-to-set-dhcp-port-attributes and it seems there was no progress on this? Is it possible to at least include DHCP address in output of 'subnet show' API call? The shortest way I've found is: * openstack port list --project ... --device-owner network:dhcp and then for **every port** in resulting list * openstack port show in order to extract 'Fixed IP Addresses' attribute for analysis Too much calls, isn't it? On 5/25/19 9:22 PM, Volodymyr Litovka wrote: > Dear colleagues, > > is there way to explicitly assign DHCP address when creating subnet? > The issue is that it isn't always first address from allocation pool, e.g. > $ openstack port list > +--------------------------------------+-------+-------------------+------------------------------------------------------------------------------+--------+ > | ID | Name | MAC Address | Fixed IP Addresses | Status | > +--------------------------------------+-------+-------------------+------------------------------------------------------------------------------+--------+ > | 0897bcc4-6cad-479c-8743-ca7cc5a57271 | | 72:d0:1c:d1:6b:51 | ip_address='172.16.53.3', subnet_id='20329549-124c-484d-8278-edca9829e262' | ACTIVE | > | | | | ip_address='172.16.54.2', subnet_id='07249cd3-11a9-4da7-a4db-bd838aa8c4e7' | | > both subnet have similar configuration of allocation pool > (172.16.xx.2-254/24) and there are two different addresses for DHCP in > every subnet. > > This makes a trouble during project generation with pre-assigned > addresses for servers if the pre-assigned address is same as > [surprisigly, non-first] address of DHCP namespace. 
> > And, may be, there is a way to determine this address in more simple > way than looking into 'openstack port list' output, searching for port > (a) without name and (b) with multiple addresses from all belonging > subnets :) At the moment, 'openstack subnet show' say nothing about > assigned DHCP-address. > > Thank you! > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." -- Thomas Edison -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Sun May 26 07:07:03 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Sun, 26 May 2019 09:07:03 +0200 Subject: [neutron] set/find dhcp-server address In-Reply-To: <2e8c680b-d392-6468-bcc2-44449cc30084@gmx.com> References: <5f65989d-1616-a5f5-1c90-a5f6e6e364fe@gmx.com> <2e8c680b-d392-6468-bcc2-44449cc30084@gmx.com> Message-ID: <3954208F-442A-4519-AAEE-80AB6E5C15B2@redhat.com> Hi, If You do something like: openstack port list --network d79eea02-31dc-45c7-bd48-d98af46fd2d5 --device-owner network:dhcp Then You will get only dhcp ports from specific network. And Fixed IP are by default displayed on this list. Is this enough “workaround” for You? > On 25 May 2019, at 22:19, Volodymyr Litovka wrote: > > Hi, > > it seems I wasn't first who asked for this - https://wiki.openstack.org/wiki/Neutron/enable-to-set-dhcp-port-attributes and it seems there was no progress on this? > > Is it possible to at least include DHCP address in output of 'subnet show' API call? > > The shortest way I've found is: > * openstack port list --project ... --device-owner network:dhcp > and then for **every port** in resulting list > * openstack port show > in order to extract 'Fixed IP Addresses' attribute for analysis > > Too much calls, isn't it? > > On 5/25/19 9:22 PM, Volodymyr Litovka wrote: >> Dear colleagues, >> >> is there way to explicitly assign DHCP address when creating subnet? The issue is that it isn't always first address from allocation pool, e.g. >> $ openstack port list >> +--------------------------------------+-------+-------------------+------------------------------------------------------------------------------+--------+ >> | ID | Name | MAC Address | Fixed IP Addresses | Status | >> +--------------------------------------+-------+-------------------+------------------------------------------------------------------------------+--------+ >> | 0897bcc4-6cad-479c-8743-ca7cc5a57271 | | 72:d0:1c:d1:6b:51 | ip_address='172.16.53.3', subnet_id='20329549-124c-484d-8278-edca9829e262' | ACTIVE | >> | | | | ip_address='172.16.54.2', subnet_id='07249cd3-11a9-4da7-a4db-bd838aa8c4e7' | | >> >> both subnet have similar configuration of allocation pool (172.16.xx.2-254/24) and there are two different addresses for DHCP in every subnet. >> >> This makes a trouble during project generation with pre-assigned addresses for servers if the pre-assigned address is same as [surprisigly, non-first] address of DHCP namespace. >> >> And, may be, there is a way to determine this address in more simple way than looking into 'openstack port list' output, searching for port (a) without name and (b) with multiple addresses from all belonging subnets :) At the moment, 'openstack subnet show' say nothing about assigned DHCP-address. >> >> Thank you! >> >> -- >> Volodymyr Litovka >> "Vision without Execution is Hallucination." -- Thomas Edison >> > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." 
-- Thomas Edison > — Slawek Kaplonski Senior software engineer Red Hat From yangyi01 at inspur.com Sun May 26 23:54:13 2019 From: yangyi01 at inspur.com (=?utf-8?B?WWkgWWFuZyAo5p2o54eaKS3kupHmnI3liqHpm4blm6I=?=) Date: Sun, 26 May 2019 23:54:13 +0000 Subject: =?utf-8?B?562U5aSNOiBbRFZSIGNvbmZpZ10gQ2FuIHdlIHVzZSBkcnZfc25hdCBhZ2Vu?= =?utf-8?B?dF9tb2RlIGluIGV2ZXJ5IGNvbXB1dGUgbm9kZT8=?= In-Reply-To: References: <67d4e0f3053949fc844b6d1d26f05559@inspur.com> <279f1523-bfcd-9863-c5d6-7cb190f7142b@gmail.com> <58f85a3e3f1449cebdf59f7e16e7090e@inspur.com> <1B6127C7-2794-40F4-BEED-6CD40DDB4BD9@redhat.com> <55f84d63363640b480ff5bfd6013e895@inspur.com> Message-ID: No, please read https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/14/html-single/spine_leaf_networking/index, L3HA won't handle VRRP over VXLAN. 1.4. Spine-leaf limitations Some roles, such as the Controller role, use virtual IP addresses and clustering. The mechanism behind this functionality requires layer-2 network connectivity between these nodes. These nodes are all be placed within the same leaf. Similar restrictions apply to Networker nodes. The network service implements highly-available default paths in the network using Virtual Router Redundancy Protocol (VRRP). Since VRRP uses a virtual router IP address, you must connect master and backup nodes to the same L2 network segment. When using tenant or provider networks with VLAN segmentation, you must share the particular VLANs between all Networker and Compute nodes. -----邮件原件----- 发件人: Slawomir Kaplonski [mailto:skaplons at redhat.com] 发送时间: 2019年5月24日 21:26 收件人: Yi Yang (杨燚)-云服务集团 抄送: haleyb.dev at gmail.com; openstack-discuss at lists.openstack.org 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? Hi, I’m not expert in spine-leaf topology TBH but I know that for L3HA neutron creates for each tenant, “tenant network” which usually is vxlan or gre tunnels network. And this works like any other vxlan network created in neutron. So tunnels are established between nodes using L3 and it transports “tenant L2” inside vxlan packets, right? In the same way works this network created for L3 HA needs. And it transports VRRP packets inside this tunnel network (which often is vxlan network). > On 20 May 2019, at 09:33, Yi Yang (杨燚)-云服务集团 wrote: > > Hi, Slawomir, do you mean VRRP over VXLAN? I mean servers in leaf switch are attached to the leaf switch by VLAN and servers handle VxLAN encap and decap, for such case, how can leaf-spine transport a L2 packet to another server in another leaf switch? > > -----邮件原件----- > 发件人: Slawomir Kaplonski [mailto:skaplons at redhat.com] > 发送时间: 2019年5月20日 15:13 > 收件人: Yi Yang (杨燚)-云服务集团 > 抄送: haleyb.dev at gmail.com; openstack-discuss at lists.openstack.org > 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? > 重要性: 高 > > Hi, > >> On 20 May 2019, at 02:07, Yi Yang (杨燚)-云服务集团 wrote: >> >> Brian, thank for your reply. So if I configure 3 compute nodes of many compute node as drv_snat, it doesn't have substantial difference from the case that I configure 3 single network nodes as snat gateway except deployment difference, right? Another question, it doesn't use HA even if we have multiple dvr_snat nodes, right? If enable l3_ha, I think one external router will be scheduled in multiple (2 at least) dvr_snat nodes, for that case, IPs of these HA routers for this one router are same one and are activated by VRRP, right? 
For l3_ha, two or multiple HA l3 nodes must be in the same L2 network because it uses VRRP (keepalived) to share a VIP, right? For that case, how can we make sure VRRP can work well across leaf switches in a L3 leaf-spine network (servers are connected to leaf switch by L2)? > > That is correct what You are saying. In DVR-HA case, SNAT nodes are working in same way like in “standard” L3HA. So it’s active-backup config and keepalived is deciding which node is active. > Neutron creates “HA network” for tenant to use for keepalived. It can be e.g. vxlan network and that way You will have L2 between such nodes (routers). > >> >> -----邮件原件----- >> 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] >> 发送时间: 2019年5月17日 22:11 >> 收件人: Yi Yang (杨燚)-云服务集团 >> 抄送: openstack-discuss at lists.openstack.org >> 主题: Re: 答复: [DVR config] Can we use drv_snat agent_mode in every compute node? >> >> On 5/16/19 8:29 PM, Yi Yang (杨燚)-云服务集团 wrote: >>> Thanks Brian, your explanation clarified something, but I don't get the answer if we can have multiple compute nodes are configured to dvr_snat, for this case, SNAT IPs are obviously different. Why do we want to use network node if compute node can do everything? >> >> Hi Yi, >> >> There will only be one DVR SNAT IP allocated for a router on the external network, and only one router scheduled using it, so having dvr_snat mode on a compute node doesn't mean that North/South router will be local, only the East/West portion might be. >> >> Typically people choose to place these on separate systems since the requirements of the role are different - network node could have fewer cores and a 10G nic for higher bandwidth, compute node could have lots of cores for instances but maybe a 1G nic. There's no reason you can't run dvr_snat everywhere, I would just say it's not common. >> >> -Brian >> >> >>> -----邮件原件----- >>> 发件人: Brian Haley [mailto:haleyb.dev at gmail.com] >>> 发送时间: 2019年5月16日 21:46 >>> 收件人: Yi Yang (杨燚)-云服务集团 >>> 抄送: openstack-discuss at lists.openstack.org >>> 主题: Re: [DVR config] Can we use drv_snat agent_mode in every compute node? >>> >>> Hi Yi, >>> >>> I'm a little confused by the question, comments inline. >>> >>> On 5/15/19 11:47 PM, Yi Yang (杨燚)-云服务集团 wrote: >>>> Hi, folks >>>> >>>> I saw somebody discussed distributed SNAT, but finally they didn’t >>>> make agreement on how to implement distributed SNAT, my question is >>>> can we use dvr_snat agent_mode in compute node? I understand >>>> dvr_snat only does snat but doesn’t do east west routing, right? >>>> Can we set dvr_snat and dvr in one compute node at the same time? >>>> It is equivalent to distributed SNAT if we can set drv_snat in >>>> every compute node, isn’t right? I know Opendaylight can do SNAT in >>>> compute node in distributed way, but one external router only can run in one compute node. >>> >>> Distributed SNAT is not available in neutron, there was a spec >>> proposed recently though, https://review.opendev.org/#/c/658414 >>> >>> Regarding the agent_mode setting for L3, only one mode can be set at a time. Typically 'dvr_snat' is used on network nodes and 'dvr' on compute nodes because it leads to less resource usage (i.e. namespaces). >>> The centralized part of the router hosting the default SNAT IP address will only be scheduled to one of the agents in 'dvr_snat' mode. All the DVR modes can do East/West routing when an instance is scheduled to the node, and two can do North/South - 'dvr_snat' using the default SNAT IP, and 'dvr' using a floating IP. 
'dvr_no_external' can only do East/West. >>> >>> Hopefully that clarifies things. >>> >>> -Brian >>> >>>> I also see https://wiki.openstack.org/wiki/Dragonflow is trying to >>>> implement distributed SNAT, what are technical road blocks for >>>> distributed SNAT in openstack dvr? Do we have any good way to >>>> remove these road blocks? >>>> >>>> Thank you in advance and look forward to getting your replies and insights. >>>> >>>> Also attached official drv configuration guide for your reference. >>>> >>>> https://docs.openstack.org/neutron/stein/configuration/l3-agent.htm >>>> l >>>> >>>> |agent_mode|¶ >>>> >>> m >>>> l >>>> # >>>> DEFAULT.agent_mode> >>>> >>>> Type >>>> >>>> string >>>> >>>> Default >>>> >>>> legacy >>>> >>>> Valid Values >>>> >>>> dvr, dvr_snat, legacy, dvr_no_external >>>> >>>> The working mode for the agent. Allowed modes are: ‘legacy’ - this >>>> preserves the existing behavior where the L3 agent is deployed on a >>>> centralized networking node to provide L3 services like DNAT, and SNAT. >>>> Use this mode if you do not want to adopt DVR. ‘dvr’ - this mode >>>> enables DVR functionality and must be used for an L3 agent that >>>> runs on a compute host. ‘dvr_snat’ - this enables centralized SNAT >>>> support in conjunction with DVR. This mode must be used for an L3 >>>> agent running on a centralized node (or in single-host deployments, e.g. devstack). >>>> ‘dvr_no_external’ - this mode enables only East/West DVR routing >>>> functionality for a L3 agent that runs on a compute host, the >>>> North/South functionality such as DNAT and SNAT will be provided by >>>> the centralized network node that is running in ‘dvr_snat’ mode. >>>> This mode should be used when there is no external network >>>> connectivity on the compute host. >>>> > > — > Slawek Kaplonski > Senior software engineer > Red Hat > — Slawek Kaplonski Senior software engineer Red Hat -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3600 bytes Desc: not available URL: From li.canwei2 at zte.com.cn Mon May 27 01:52:15 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Mon, 27 May 2019 09:52:15 +0800 (CST) Subject: =?UTF-8?B?UmU6W3dhdGNoZXJdIFF1ZXN0aW9uIGFib3V0IGJhcmVtZXRhbCBub2RlIHN1cHBvcnQgaW4gbm92YSBDRE0=?= In-Reply-To: References: f6025d11-8756-f709-f7eb-89020701f607@gmail.com Message-ID: <201905270952159932327@zte.com.cn> refer to the comment by hidekazu, there is the case 1:M host:nodes when VMware vCenter driver is used. We can remove the first API call if we restrict Nova driver such as KVM only. Thanks licanwei -------------- next part -------------- An HTML attachment was scrubbed... URL: From miguel at mlavalle.com Mon May 27 03:10:04 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Sun, 26 May 2019 22:10:04 -0500 Subject: [openstack-dev] [neutron] Cancelling Neutron weekly meeting on May 27th Message-ID: Hi Neutrinos, May 27th is a holiday in the USA, so we will cancel our weekly meeting. We will resume on Tuesday June 4th Best regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Mon May 27 04:23:49 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 27 May 2019 13:23:49 +0900 Subject: [neutron][qa] Grenade jobs in check queue In-Reply-To: References: Message-ID: <16af787befa.f7ac691e96217.2071645019220374467@ghanshyammann.com> ---- On Sat, 11 May 2019 16:05:05 +0900 Slawomir Kaplonski wrote ---- > Hi, > > In Neutron team we are thinking about limiting a bit number of CI jobs which we are running. > Currently we have e.g. 4 different grenade jobs: > > neutron-grenade This should go away in the start of U cycle when we will drop the py27 support. > grenade-py3 > neutron-grenade-multinode > neutron-grenade-dvr-multinode > > And jobs grenade-py3 and neutron-grenade-multinode are almost the same (same python version, same L2 agent, same legacy L3 agent, same fw driver). Only difference between those 2 jobs is that one of them is single and one is multinode (2) job. > So I thought that maybe we can use only one of them and I wanted to remove grenade-py3 and left only multinode job in check queue. But grenade-py3 comes from "integrated-gate-py3” template so we probably shouldn’t remove this one. > Can we run only grenade-py3 and drop neutron-grenade-multinode? Or maybe we can change grenade-py3 to be multinode job and then drop neutron-grenade-multinode? Making grenade-py3 multinode jobs might cause the issue in other project gates. IMO, we should keep testing multinode and single node scenario separately. One thing you can do is not t run the multinode job on gate pipeline, running it on check pipeline only should be enough. -gmann > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > > From natal at redhat.com Mon May 27 07:18:19 2019 From: natal at redhat.com (=?UTF-8?Q?Natal_Ng=C3=A9tal?=) Date: Mon, 27 May 2019 09:18:19 +0200 Subject: About the constraints url. Message-ID: Hi everyone, I'm little lost or totally lost, actually. Sorry if a mail have already sent, I have not find it. If I have understood correctly, the url to use for the constraints is: https://releases.openstack.org/constraints/upper/master One question, this rule is the same for all projects? I'm also bit confused, I know the oslo projects must be use this url, and I have a patch is blocked for this reason: https://review.opendev.org/#/c/658296/ The url is not good in master also, so I have make a patch for it: https://review.opendev.org/#/c/661100/ So why I'm confused, because in the same project, few patches with the bad url was validate or merge: https://review.opendev.org/#/c/655650/ https://review.opendev.org/#/c/655649/ https://review.opendev.org/#/c/655648/ https://review.opendev.org/#/c/655640/ So which url must be used for constraints and this a global for all openstack projects? Then can we be careful, when we review a patch about this and not mixed up both url. From sfinucan at redhat.com Mon May 27 07:33:28 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Mon, 27 May 2019 08:33:28 +0100 Subject: About the constraints url. In-Reply-To: References: Message-ID: <85cf109a1db24ef70f89d6e11c3effe76b05402a.camel@redhat.com> On Mon, 2019-05-27 at 09:18 +0200, Natal Ngétal wrote: > Hi everyone, > > I'm little lost or totally lost, actually. Sorry if a mail have > already sent, I have not find it. If I have understood correctly, the > url to use for the constraints is: > > https://releases.openstack.org/constraints/upper/master > > One question, this rule is the same for all projects? 
I'm also bit > confused, I know the oslo projects must be use this url, and I have a > patch is blocked for this reason: > > https://review.opendev.org/#/c/658296/ > > The url is not good in master also, so I have make a patch for it: > > https://review.opendev.org/#/c/661100/ > > So why I'm confused, because in the same project, few patches with the > bad url was validate or merge: > > https://review.opendev.org/#/c/655650/ > https://review.opendev.org/#/c/655649/ > https://review.opendev.org/#/c/655648/ > https://review.opendev.org/#/c/655640/ > > So which url must be used for constraints and this a global for all > openstack projects? Then can we be careful, when we review a patch > about this and not mixed up both url. The wording there is admittedly confusing: these URLs aren't "wrong", per se, they're just not what we're aiming for going forward. You probably want to look at this email: http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006478.html The reason for the -1s is presumably so we can do things correctly now rather than having to fix them again in the future. Hope that makes sense, Stephen From berndbausch at gmail.com Mon May 27 07:43:11 2019 From: berndbausch at gmail.com (Bernd Bausch) Date: Mon, 27 May 2019 16:43:11 +0900 Subject: [cinder] ceph multiattach details? Message-ID: <2569f62a-1d30-2468-b019-06d99a819f82@gmail.com> The Stein release notes mention that the RBD driver now supports multiattach, but i have not found any details. Are there limitations? Is there a need to configure anything? In the RBD driver , I find this: def _enable_multiattach(self, volume): multipath_feature_exclusions = [ self.rbd.RBD_FEATURE_JOURNALING, self.rbd.RBD_FEATURE_FAST_DIFF, self.rbd.RBD_FEATURE_OBJECT_MAP, self.rbd.RBD_FEATURE_EXCLUSIVE_LOCK, ] This seems to mean that journaling and other features (to me, it's not quite clear what they are) will be automatically disabled when switching on multiattachment. Further down in the code I see that replication and multiattach are mutually exclusive. Is there some documentation about the Ceph multiattach feature, even an email thread? Thanks, Bernd || -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Mon May 27 09:07:39 2019 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 27 May 2019 10:07:39 +0100 Subject: [kolla] Virtual PTG Message-ID: Hi, Just a reminder for the Kolla virtual PTG this week on Tuesday 28th and Wednesday 29th May, 12:00 - 16:00 UTC. We're planning to meet virtually using Google meet [1] for voice and/or video, but please let me know if this is likely to be a problem for you. Everyone is welcome to attend, regardless of how long you've been working with the project or your level of contribution so far. If you can't make the full session, you could join for part of it to say hello to the team. We'll be discussing potential new features, planning, and prioritising work for the Train cycle. We'll also take a step back and look at the current state of the project, and ask if we should be doing anything differently. I've put up a rough agenda on the etherpad [2]. It's more of an ordering than a timed schedule - let's just go through the list. If you can't make the whole session and there's something in particular you'd like to discuss, let me know and I will pin it to a specific time. 
See you tomorrow, Mark (mgoddard) [1] https://meet.google.com/pbo-boob-csh?hs=122 [2] https://etherpad.openstack.org/p/kolla-train-ptg -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Mon May 27 09:10:19 2019 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 27 May 2019 10:10:19 +0100 Subject: [kayobe] No meeting today Message-ID: Hi, It's a public holiday in the UK today so there won't be a kayobe meeting. We'll meet again on 10th June but please discuss issues in IRC before then if necessary. Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Mon May 27 09:19:00 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Mon, 27 May 2019 10:19:00 +0100 (BST) Subject: [placement] reminder: no meeting today May 27 Message-ID: There will be no Placement/Scheduler meeting today (May 27th) for two reasons: * It's a holiday in a few different places. * The poll to see what to do with the meeting has resulted in switching to office hours: https://civs.cs.cornell.edu/cgi-bin/results.pl?id=E_9599a2647c319fd4 I'll set up a doodle or similar so we can figure out when to schedule the office hours. They will not be at at the same time as the meeting as "meeting at a different time" was the second most popular choice. Thanks. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From gmann at ghanshyammann.com Mon May 27 09:31:46 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 27 May 2019 18:31:46 +0900 Subject: [qa][ptg][patrole] RBAC testing improvement ideas for Patrole In-Reply-To: <7a5dcff9-ca99-496e-a022-f06830fd03a5@Spark> References: <16a86d4834e.e46610fc23956.8020827235456111857@ghanshyammann.com> <7a5dcff9-ca99-496e-a022-f06830fd03a5@Spark> Message-ID: <16af8a1ad82.c87c40f9104767.7556785085526055329@ghanshyammann.com> ---- On Mon, 06 May 2019 10:36:00 +0900 Sergey Vilgelm wrote ---- > Hi, Gmann, thank you so much. > 1. I’m not sure that I understood the #1. Do you mean that oslo.policy will raise a special exceptions for successful and unsuccessful verification if the flag is set? So a service will see the exception and just return it. And Patorle can recognize those exceptions? Yeah, your understanding is correct. The special exception can be recognized by the error code. > I’m totally agree with using one job for one services, It can give us a possibility to temporary disable some services and allow patches for other services to be tested and merged. > 2. +1 for the option 2. We can decrease the number of jobs and have just one job for one services, but we need to think about how to separate the logs. IMO we need to extend the `action` decorator to run a test 9 times (depends on the configuration) and memorize all results for all combinations and use something like `if not all(results): raise PatroleException()` yeah, that is one option to implement that or another option is using testscenarios where we can define the test scenarios per role. That way we can dynamically generate the different tests for each configured role and have their separate results too. -gmann > > -- Sergey Vilgelm https://www.vilgelm.info > On May 5, 2019, 2:15 AM -0500, Ghanshyam Mann , wrote: > Patrole is emerging as a good tool for RBAC testing. AT&T already running it on their production cloud and > we have got a good amount of interest/feedback from other operators. 
> > We had few discussions regarding the Patrole testing improvement during PTG among QA, Nova, Keystone team. > I am writing the summary of those discussions below and would like to get the opinion from Felipe & Sergey also. > > 1. How to improve the Patrole testing time: > Currently Patrole test perform the complete API operaion which takes time and make Patrole testing > very long. Patrole is responsible to test the policies only so does not need to wait for API complete operation > to be completed. > John has a good idea to handle that via flag. If that flag is enabled (per service and disabled by default) then > oslo.policy can return some different error code on success (other than 403). The API can return the response > with that error code which can be treated as pass case in Patrole. > Morgan raises a good point on making it per API call than global. We can do that as next step and let's > start with the global flag per service as of now? > - https://etherpad.openstack.org/p/ptg-train-xproj-nova-keystone > > Another thing we should improve in current Patrole jobs is to separate the jobs per service. Currently, all 5 services > are installed and run in a single job. Running all on Patrole gate is good but the project side gate does not need to run > any other service tests. For example, patrole-keystone which can install the only keystone and run only > keystone tests. This way project can reuse the patrole jobs only and does not need to prepare a separate job. > > 2. How to run patrole tests with all negative, positive combination for all scope + defaults roles combinations: > - Current jobs patrole-admin/member/reader are able to test the negative pattern. For example: > patrole-member job tests the admin APIs in a negative way and make sure test is passed only if member > role gets 403. > - As we have scope_type support also we need to extend the jobs to run for all 9 combinations of 3 scopes > (system, project, domain) and 3 roles(admin, member, reader). > - option1: running 9 different jobs with each combination as we do have currently > for admin, member, reader role. The issue with this approach is gate will take a lot of time to > run these 9 jobs separately. > - option2: Run all the 9 combinations in a single job with running the tests in the loop with different > combination of scope_roles. This might require the current config option [role] to convert to list type > and per service so that the user can configure what all default roles are available for corresponding service. > This option can save a lot of time to avoid devstack installation time as compared to 9 different jobs option. > > -gmann > > > From gmann at ghanshyammann.com Mon May 27 09:43:35 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 27 May 2019 18:43:35 +0900 Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast In-Reply-To: References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> Message-ID: <16af8ac7fc7.fa901245105341.2925519493395080868@ghanshyammann.com> ---- On Tue, 07 May 2019 07:06:23 +0900 Morgan Fainberg wrote ---- > > > On Sun, May 5, 2019 at 12:19 AM Ghanshyam Mann wrote: > > For the "Integrated-gate-identity", I have a slight worry that we might lose some coverage with this change. I am unsure of how varied the use of Keystone is outside of KeystoneMiddleware (i.e. 
token validation) consumption that all services perform, Heat (not part of the integrated gate) and it's usage of Trusts, and some newer emerging uses such as "look up limit data" (potentially in Train, would be covered by Nova). Worst case, we could run all the integrated tests for Keystone changes (at least initially) until we have higher confidence and minimize the tests once we have a clearer audit of how the services use Keystone. The changes would speed up/minimize the usage for the other services directly and Keystone can follow down the line. > I want to be as close to 100% sure we're not going to suddenly break everyone because of some change we land. Keystone fortunately and unfortunately sits below most other services in an OpenStack deployment and is heavily relied throughout almost every single request. > --Morgan Thanks Morgan. That was what we were worried during PTG discussion. I agree with your point about not to lose coverage and first get to know how Keystone is being used by each service. Let's keep running the all service tests for keystone gate as of now and later we can shorten the tests run based on the clarity of usage. -gmann > Current integrated-gate jobs (tempest-full) is not so stable for various bugs specially timeout. We tried > to improve it via filtering the slow tests in the separate tempest-slow job but the situation has not been improved much. > > We talked about the Ideas to make it more stable and fast for projects especially when failure is not > related to each project. We are planning to split the integrated-gate template (only tempest-full job as > first step) per related services. > > Idea: > - Run only dependent service tests on project gate. > - Tempest gate will keep running all the services tests as the integrated gate at a centeralized place without any change in the current job. > - Each project can run the below mentioned template. > - All below template will be defined and maintained by QA team. > > I would like to know each 6 services which run integrated-gate jobs > > 1."Integrated-gate-networking" (job to run on neutron gate) > Tests to run in this template: neutron APIs , nova APIs, keystone APIs ? All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for neutron gate: exlcude the cinder API tests, glance API tests, swift API tests, > > 2."Integrated-gate-storage" (job to run on cinder gate, glance gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, Nova APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for cinder, glance gate: excluded the neutron APIs tests, Keystone APIs tests > > 3. "Integrated-gate-object-storage" (job to run on swift gate) > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > Improvement for swift gate: excluded the neutron APIs tests, - Keystone APIs tests, - Nova APIs tests. > Note: swift does not run integrated-gate as of now. > > 4. "Integrated-gate-compute" (job to run on Nova gate) > tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs and All scenario currently running in tempest-full in same way ( means non-slow and in serial) > Improvement for Nova gate: excluded the swift APIs tests(not running in current job but in future, it might), Keystone API tests. > > 5. 
"Integrated-gate-identity" (job to run on keystone gate) > Tests to run is : all as all project use keystone, we might need to run all tests as it is running in integrated-gate. > But does keystone is being unsed differently by all services? if no then, is it enough to run only single service tests say Nova or neutron ? > > 6. "Integrated-gate-placement" (job to run on placement gate) > Tests to run in this template: Nova APIs tests, Neutron APIs tests + scenario tests + any new service depends on placement APIs > Improvement for placement gate: excluded the glance APIs tests, cinder APIs tests, swift APIs tests, keystone APIs tests > > Thoughts on this approach? > > The important point is we must not lose the coverage of integrated testing per project. So I would like to > get each project view if we are missing any dependency (proposed tests removal) in above proposed templates. > > - https://etherpad.openstack.org/p/qa-train-ptg > > -gmann > > > From gmann at ghanshyammann.com Mon May 27 11:12:50 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 27 May 2019 20:12:50 +0900 Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast In-Reply-To: References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> Message-ID: <16af8fe335c.ad61d719109006.9112028995288147526@ghanshyammann.com> ---- On Tue, 07 May 2019 08:25:11 +0900 Tim Burke wrote ---- > > > On 5/5/19 12:18 AM, Ghanshyam Mann wrote: > Current integrated-gate jobs (tempest-full) is not so stable for various bugs specially timeout. We triedto improve it via filtering the slow tests in the separate tempest-slow job but the situation has not been improved much.We talked about the Ideas to make it more stable and fast for projects especially when failure is notrelated to each project. We are planning to split the integrated-gate template (only tempest-full job asfirst step) per related services. Idea:- Run only dependent service tests on project gate. I love this plan already. > - Tempest gate will keep running all the services tests as the integrated gate at a centeralized place without any change in the current job.- Each project can run the below mentioned template. - All below template will be defined and maintained by QA team. My biggest regret is that I couldn't figure out how to do this myself. Much thanks to the QA team! > I would like to know each 6 services which run integrated-gate jobs1."Integrated-gate-networking" (job to run on neutron gate) Tests to run in this template: neutron APIs , nova APIs, keystone APIs ? All scenario currently running in tempest-full in the same way ( means non-slow and in serial)Improvement for neutron gate: exlcude the cinder API tests, glance API tests, swift API tests,2."Integrated-gate-storage" (job to run on cinder gate, glance gate)Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, Nova APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial)Improvement for cinder, glance gate: excluded the neutron APIs tests, Keystone APIs tests3. "Integrated-gate-object-storage" (job to run on swift gate)Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial)Improvement for swift gate: excluded the neutron APIs tests, - Keystone APIs tests, - Nova APIs tests. This sounds great. 
My only question is why Cinder tests are still included, but I trust that it's there for a reason and I'm just revealing my own ignorance of Swift's consumers, however removed. As Cinder use Swift as one of the backend, I think it is worth running cinder tests on the swift gate. But honestly saying I am covering the most possible broader coverage so that we do not lose any coverage among dependent services. Later we can always optimize this template more once we are very clear about the isolation of services. > Note: swift does not run integrated-gate as of now. Yeah, Kota too brought this in QA team. We suggested to wait for the stability of integrated-gate (this mailing thread) and after that, swift can add the integrated-gate-* template. > Correct, and for all the reasons that you're seeking to address. Some eight months ago I'd gotten tired of seeing spurious failures that had nothing to do with Swift, and I was hard pressed to find an instance where the tempest tests caught a regression or behavior change that wasn't already caught by Swift's own functional tests. In short, the signal-to-noise ratio for those particular tests was low enough that a failure only told me "you should leave a recheck comment," so I proposed https://review.opendev.org/#/c/601813/ . There was also a side benefit of having our longest-running job change from legacy-tempest-dsvm-neutron-full (at 90-100 minutes) to swift-probetests-centos-7 (at ~30 minutes), tightening developer feedback loops. > It sounds like this proposal addresses both concerns: by reducing the scope of tests to what might actually exercise the Swift API (if indirectly), the signal-to-noise ratio should be much better and the wall-clock time will be reduced. True, many other project gate faces a similar problem. Let's see how much this idea can improve integrated gate testing. Thanks for confirmation from the Swift side. -gmann > > 4. "Integrated-gate-compute" (job to run on Nova gate)tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs and All scenario currently running in tempest-full in same way ( means non-slow and in serial)Improvement for Nova gate: excluded the swift APIs tests(not running in current job but in future, it might), Keystone API tests. 5. "Integrated-gate-identity" (job to run on keystone gate)Tests to run is : all as all project use keystone, we might need to run all tests as it is running in integrated-gate.But does keystone is being unsed differently by all services? if no then, is it enough to run only single service tests say Nova or neutron ?6. "Integrated-gate-placement" (job to run on placement gate)Tests to run in this template: Nova APIs tests, Neutron APIs tests + scenario tests + any new service depends on placement APIs Improvement for placement gate: excluded the glance APIs tests, cinder APIs tests, swift APIs tests, keystone APIs testsThoughts on this approach?The important point is we must not lose the coverage of integrated testing per project. So I would like toget each project view if we are missing any dependency (proposed tests removal) in above proposed templates. As far as Swift is aware, these dependencies seem accurate; at any rate, *we* don't use anything other than Keystone, even by way of another API. Further, Swift does not use particularly esoteric Keysonte APIs; I would be OK with integrated-gate-identity not exercising Swift's API with the assumption that some other (or indeed, almost *any* other) service would likely exercise the parts that we care about. 
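To make the per-service selection a little more concrete, here is a rough sketch under the assumption that each template simply keeps a set of tempest test trees (the directory names are real tempest paths, but the mapping and the regex handling are illustrative only, not the final job definitions):

# Rough sketch of the per-service selection idea: map each proposed
# template to the tempest test trees it keeps and build an include
# regex that a job could hand to stestr/tempest run.
TEMPLATE_TESTS = {
    'integrated-gate-networking': ['tempest.api.network',
                                   'tempest.api.compute',
                                   'tempest.api.identity',
                                   'tempest.scenario'],
    'integrated-gate-storage': ['tempest.api.volume',
                                'tempest.api.image',
                                'tempest.api.object_storage',
                                'tempest.api.compute',
                                'tempest.scenario'],
    'integrated-gate-object-storage': ['tempest.api.object_storage',
                                       'tempest.api.volume',
                                       'tempest.api.image',
                                       'tempest.scenario'],
    'integrated-gate-compute': ['tempest.api.compute',
                                'tempest.api.volume',
                                'tempest.api.image',
                                'tempest.api.network',
                                'tempest.scenario'],
}


def include_regex(template):
    """Return a selection regex for the tests kept by the given template."""
    return '|'.join(r'^%s\.' % tree.replace('.', r'\.')
                    for tree in TEMPLATE_TESTS[template])


print(include_regex('integrated-gate-storage'))

However the final jobs end up expressing it (include regex, blacklist file, or separate tox environments), the useful property is that the mapping lives in one central place maintained by the QA team and projects just consume the template.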
> - https:/etherpad.openstack.org/p/qa-train-ptg -gmann From gmann at ghanshyammann.com Mon May 27 11:35:25 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 27 May 2019 20:35:25 +0900 Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast In-Reply-To: References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> Message-ID: <16af912e1b1.12abbb746109900.1643886338503700452@ghanshyammann.com> ---- On Thu, 16 May 2019 20:48:30 +0900 Erno Kuvaja wrote ---- > > On Tue, May 7, 2019 at 12:31 AM Tim Burke wrote: > > > On 5/5/19 12:18 AM, Ghanshyam Mann wrote: > Current integrated-gate jobs (tempest-full) is not so stable for various bugs specially timeout. We triedto improve it via filtering the slow tests in the separate tempest-slow job but the situation has not been improved much.We talked about the Ideas to make it more stable and fast for projects especially when failure is notrelated to each project. We are planning to split the integrated-gate template (only tempest-full job asfirst step) per related services. Idea:- Run only dependent service tests on project gate. I love this plan already. > - Tempest gate will keep running all the services tests as the integrated gate at a centeralized place without any change in the current job.- Each project can run the below mentioned template. - All below template will be defined and maintained by QA team. My biggest regret is that I couldn't figure out how to do this myself. Much thanks to the QA team! > I would like to know each 6 services which run integrated-gate jobs1."Integrated-gate-networking" (job to run on neutron gate) Tests to run in this template: neutron APIs , nova APIs, keystone APIs ? All scenario currently running in tempest-full in the same way ( means non-slow and in serial)Improvement for neutron gate: exlcude the cinder API tests, glance API tests, swift API tests,2."Integrated-gate-storage" (job to run on cinder gate, glance gate)Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, Nova APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial)Improvement for cinder, glance gate: excluded the neutron APIs tests, Keystone APIs tests3. "Integrated-gate-object-storage" (job to run on swift gate)Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial)Improvement for swift gate: excluded the neutron APIs tests, - Keystone APIs tests, - Nova APIs tests. This sounds great. My only question is why Cinder tests are still included, but I trust that it's there for a reason and I'm just revealing my own ignorance of Swift's consumers, however removed. > Note: swift does not run integrated-gate as of now. Correct, and for all the reasons that you're seeking to address. Some eight months ago I'd gotten tired of seeing spurious failures that had nothing to do with Swift, and I was hard pressed to find an instance where the tempest tests caught a regression or behavior change that wasn't already caught by Swift's own functional tests. In short, the signal-to-noise ratio for those particular tests was low enough that a failure only told me "you should leave a recheck comment," so I proposed https://review.opendev.org/#/c/601813/ . 
There was also a side benefit of having our longest-running job change from legacy-tempest-dsvm-neutron-full (at 90-100 minutes) to swift-probetests-centos-7 (at ~30 minutes), tightening developer feedback loops. > It sounds like this proposal addresses both concerns: by reducing the scope of tests to what might actually exercise the Swift API (if indirectly), the signal-to-noise ratio should be much better and the wall-clock time will be reduced. > > 4. "Integrated-gate-compute" (job to run on Nova gate)tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs and All scenario currently running in tempest-full in same way ( means non-slow and in serial)Improvement for Nova gate: excluded the swift APIs tests(not running in current job but in future, it might), Keystone API tests. 5. "Integrated-gate-identity" (job to run on keystone gate)Tests to run is : all as all project use keystone, we might need to run all tests as it is running in integrated-gate.But does keystone is being unsed differently by all services? if no then, is it enough to run only single service tests say Nova or neutron ?6. "Integrated-gate-placement" (job to run on placement gate)Tests to run in this template: Nova APIs tests, Neutron APIs tests + scenario tests + any new service depends on placement APIs Improvement for placement gate: excluded the glance APIs tests, cinder APIs tests, swift APIs tests, keystone APIs testsThoughts on this approach?The important point is we must not lose the coverage of integrated testing per project. So I would like toget each project view if we are missing any dependency (proposed tests removal) in above proposed templates. As far as Swift is aware, these dependencies seem accurate; at any rate, *we* don't use anything other than Keystone, even by way of another API. Further, Swift does not use particularly esoteric Keysonte APIs; I would be OK with integrated-gate-identity not exercising Swift's API with the assumption that some other (or indeed, almost *any* other) service would likely exercise the parts that we care about. > - https:/etherpad.openstack.org/p/qa-train-ptg -gmann > While I'm all up for limiting the scope Tempest is targeting for each patch to save time and our precious infra resources I have feeling that we might end up missing something here. Honestly I'm not sure what that something would be and maybe it's me thinking the scopes wrong way around. > For example:4. "Integrated-gate-compute" (job to run on Nova gate) > I'm not exactly sure what any given Nova patch would be able to break from Cinder, Glance or Neutron or on number 2 what Swift is depending on Glance and Cinder that we could break when we introduce a change. There can be various scenario where these services are cross-dependent. It is difficult to judge the isolation among them. For example, multi-attach feature depends on Nova as well as Cinder to work correctly. Either side change can break this feature. > > Shouldn't we be looking "What projects are consuming service X and target those Tempest tests"? In Glance perspective this would be (from core projects) Glance, Cinder, Nova; Cinder probably interested about Cinder, Glance and Nova (anyone else consuming Cinder?) etc. I agree on your point of more optimize the testing base on consumer only. But there are few cross service call among consumer and consumed services. For example, Nova and Cinder call back to each other in case of the Swap volume feature. 
To be honest, I want to cover the most broader possible coverage with consumer and consumed services cross-testing. There is a possibility of optimizing it more but that has the risk of losing some coverage and introducing a regression. That risk is more dangerous and we should avoid that until we are very clear about service isolation. > > I'd like to propose approach where we define these jobs and run them in check for the start and let gate run full suites until we figure out are we catching something in gate we did not catch in check and once the understanding has been reached that we have sufficient coverage, we can go ahead and swap gate using those jobs as well. This approach would give us the benefit where the impact is highest until we are confident we got the coverage right. I think biggest issue is that for the transition period _everyone_ needs to understand that gate might catch something check did not and simple "recheck" might not be sufficient when tempest succeeded in check but failed in gate. > I like your idea of testing this idea as experimental way before actual migration. But I am worried about how to do that. There are two challenges here- 1. Any job in gate pipeline has to run in check pipeline first. Replacing integrated-gate to integrated-gate-* in check pipeline only need exception in that process. 2. how to get the matrix of failure-gap between check and gate pipeline due to this change? OpenStack health dashboard does not collect the check pipeline data. -gmann > Best, > Erno "jokke_" Kuvaja > From doka.ua at gmx.com Mon May 27 13:44:24 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Mon, 27 May 2019 16:44:24 +0300 Subject: [neutron] set/find dhcp-server address In-Reply-To: <3954208F-442A-4519-AAEE-80AB6E5C15B2@redhat.com> References: <5f65989d-1616-a5f5-1c90-a5f6e6e364fe@gmx.com> <2e8c680b-d392-6468-bcc2-44449cc30084@gmx.com> <3954208F-442A-4519-AAEE-80AB6E5C15B2@redhat.com> Message-ID: <4350515b-e5f1-f9c0-7f4a-ccd2ff9541c2@gmx.com> Hi Slawomir, yes, thanks, it works: neutron.list_ports(retrieve_all=False, network_id='2697930d-65f2-4a7a-b360-91d75cc8750d', device_owner='network:dhcp') Thank you. On 5/26/19 10:07 AM, Slawomir Kaplonski wrote: > Hi, > > If You do something like: > > openstack port list --network d79eea02-31dc-45c7-bd48-d98af46fd2d5 --device-owner network:dhcp > > Then You will get only dhcp ports from specific network. And Fixed IP are by default displayed on this list. Is this enough “workaround” for You? > > >> On 25 May 2019, at 22:19, Volodymyr Litovka wrote: >> >> Hi, >> >> it seems I wasn't first who asked for this - https://wiki.openstack.org/wiki/Neutron/enable-to-set-dhcp-port-attributes and it seems there was no progress on this? >> >> Is it possible to at least include DHCP address in output of 'subnet show' API call? >> >> The shortest way I've found is: >> * openstack port list --project ... --device-owner network:dhcp >> and then for **every port** in resulting list >> * openstack port show >> in order to extract 'Fixed IP Addresses' attribute for analysis >> >> Too much calls, isn't it? >> >> On 5/25/19 9:22 PM, Volodymyr Litovka wrote: >>> Dear colleagues, >>> >>> is there way to explicitly assign DHCP address when creating subnet? The issue is that it isn't always first address from allocation pool, e.g. 
>>> $ openstack port list >>> +--------------------------------------+-------+-------------------+------------------------------------------------------------------------------+--------+ >>> | ID | Name | MAC Address | Fixed IP Addresses | Status | >>> +--------------------------------------+-------+-------------------+------------------------------------------------------------------------------+--------+ >>> | 0897bcc4-6cad-479c-8743-ca7cc5a57271 | | 72:d0:1c:d1:6b:51 | ip_address='172.16.53.3', subnet_id='20329549-124c-484d-8278-edca9829e262' | ACTIVE | >>> | | | | ip_address='172.16.54.2', subnet_id='07249cd3-11a9-4da7-a4db-bd838aa8c4e7' | | >>> >>> both subnet have similar configuration of allocation pool (172.16.xx.2-254/24) and there are two different addresses for DHCP in every subnet. >>> >>> This makes a trouble during project generation with pre-assigned addresses for servers if the pre-assigned address is same as [surprisigly, non-first] address of DHCP namespace. >>> >>> And, may be, there is a way to determine this address in more simple way than looking into 'openstack port list' output, searching for port (a) without name and (b) with multiple addresses from all belonging subnets :) At the moment, 'openstack subnet show' say nothing about assigned DHCP-address. >>> >>> Thank you! >>> >>> -- >>> Volodymyr Litovka >>> "Vision without Execution is Hallucination." -- Thomas Edison >>> >> -- >> Volodymyr Litovka >> "Vision without Execution is Hallucination." -- Thomas Edison >> > — > Slawek Kaplonski > Senior software engineer > Red Hat > > -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon May 27 13:48:14 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 27 May 2019 14:48:14 +0100 Subject: About the constraints url. In-Reply-To: <85cf109a1db24ef70f89d6e11c3effe76b05402a.camel@redhat.com> References: <85cf109a1db24ef70f89d6e11c3effe76b05402a.camel@redhat.com> Message-ID: On Mon, 2019-05-27 at 08:33 +0100, Stephen Finucane wrote: > On Mon, 2019-05-27 at 09:18 +0200, Natal Ngétal wrote: > > Hi everyone, > > > > I'm little lost or totally lost, actually. Sorry if a mail have > > already sent, I have not find it. If I have understood correctly, the > > url to use for the constraints is: > > > > https://releases.openstack.org/constraints/upper/master > > > > One question, this rule is the same for all projects? I'm also bit > > confused, I know the oslo projects must be use this url, and I have a > > patch is blocked for this reason: > > > > https://review.opendev.org/#/c/658296/ > > > > The url is not good in master also, so I have make a patch for it: > > > > https://review.opendev.org/#/c/661100/ > > > > So why I'm confused, because in the same project, few patches with the > > bad url was validate or merge: > > > > https://review.opendev.org/#/c/655650/ > > https://review.opendev.org/#/c/655649/ > > https://review.opendev.org/#/c/655648/ > > https://review.opendev.org/#/c/655640/ > > > > So which url must be used for constraints and this a global for all > > openstack projects? Then can we be careful, when we review a patch > > about this and not mixed up both url. > > The wording there is admittedly confusing: these URLs aren't "wrong", > per se, they're just not what we're aiming for going forward. 
> You > probably want to look at this email: > > http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006478.html > > The reason for the -1s is presumably so we can do things correctly now > rather than having to fix them again in the future. why would we prefer the release repo over git? i would much prefer if we coninued to use the direct opendev based git urls instead of relying on the redirect as we do here https://github.com/openstack/os-vif/blob/master/tox.ini#L12 https://releases.openstack.org/constraints/upper/master is just a redirect to https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt which is rather inefficient. would a better soluton to the EOL branches not be dont delete the branch in the first place and just merge a commit declaring it eol. with extended maintenance i think that makes even more sense now then it did before. as a side note in the gate jobs we should also set the UPPER_CONSTRAINTS_FILE env to point a the copy created by the zuul cloner rather then relying on either approch. im not going to -1 patches that update to the https://releases.openstack.org/constraints/upper/$series form although i would prefer people do not do it on os-vif until the http{s,}://releases.openstack.org/constraints/upper/master redirect actully use opendev.org as we have already made that switch and i dont want to go form 0 redirects to 2. i would like to know why we continue to kill the stable branches for when the go EOL as that is the root of one of the issues raised. Tony do you have insight into that? os-vif's lower constratis job is also not broken in they way that was described in the email so if an automated patch is recived that would regress that ill -2 it and then fix it up. > > Hope that makes sense, > Stephen > > From arshad.alam.ansari at hpe.com Mon May 27 11:40:37 2019 From: arshad.alam.ansari at hpe.com (Ansari, Arshad) Date: Mon, 27 May 2019 11:40:37 +0000 Subject: stable/newton devstack installation is failing on ubuntu 14.04.6 Message-ID: Hi, I am trying to install stable/newton devstack on Ubuntu 14.04.6 and getting below error:- 2019-05-20 05:21:05.321 | full installdeps: -chttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt, -r/opt/stack/tempest/requirements.txt 2019-05-20 05:21:06.347 | 2019-05-20 05:21:06.347 | =================================== log end ==================================== 2019-05-20 05:21:06.347 | ERROR: could not install deps [-chttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt, -r/opt/stack/tempest/requirements.txt]; v = InvocationError(u'/opt/stack/tempest/.tox/tempest/bin/pip install -chttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt -r/opt/stack/tempest/requirements.txt', 2) 2019-05-20 05:21:06.347 | ___________________________________ summary ____________________________________ 2019-05-20 05:21:06.347 | ERROR: full: could not install deps [-chttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt, -r/opt/stack/tempest/requirements.txt]; v = InvocationError(u'/opt/stack/tempest/.tox/tempest/bin/pip install -chttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt -r/opt/stack/tempest/requirements.txt', 2) SSLError: hostname 'git.openstack.org' doesn't match either of 'developer.openstack.org', 'www.developer.openstack.org' You are using pip version 8.1.2, however version 19.1.1 is available. Please let me know if other details are required. 
Thanks, Arshad Alam Ansari -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Mon May 27 20:22:34 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 27 May 2019 20:22:34 +0000 Subject: [infra] stable/newton devstack installation is failing on ubuntu 14.04.6 In-Reply-To: References: Message-ID: <20190527202234.v7r6fhoh3dh6zl53@yuggoth.org> On 2019-05-27 11:40:37 +0000 (+0000), Ansari, Arshad wrote: [...] > SSLError: hostname 'git.openstack.org' doesn't match either of > 'developer.openstack.org', 'www.developer.openstack.org' [...] The old https://git.openstack.org/ service moved to https://opendev.org/ as of 2019-04-20. Compatibility redirects are provided at the old name but are on a shared IP address so a client supporting SNI (IETF RFC 3546 section 3.1 "Server Name Indication") is required to see the correct hostname on the redirect. SNI is ~16 years old now (RFC 3546 is circa June 2003), but a number of HTTPS libraries took a while to catch up so may still not support it on some older platforms. We can look into moving this vhost to a dedicated server/IP address or to be the default vhost on the current address, but backporting any of the master branch fixes for updating to the new URL would be another option if the Python interpreter and/or pip can't be upgraded to use SNI. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From tobias.rydberg at citynetwork.eu Mon May 27 20:23:07 2019 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Mon, 27 May 2019 22:23:07 +0200 Subject: [sigs][publiccloud][publiccloud-wg][publiccloud-sig][billing] Meeting tomorrow continues discussions billing initiative Message-ID: <67bf157a-d50b-cab3-2777-6b0b5b415b4c@citynetwork.eu> Hi all, Thanks for a good meeting last Thursday! As decided during last meeting, we will try to keep up the pace here and have meeting every week before summer vacations. Tomorrow at 1400 UTC in IRC #openstack-publiccloud we will continue where we left last week. Please read the meeting logs from last meeting [0] and the etherpad regarding this topic and initiative [1]. Talk to you all tomorrow! Cheers, Tobias [0] http://eavesdrop.openstack.org/meetings/publiccloud_wg/2019/publiccloud_wg.2019-05-23-14.00.log.html [1] https://etherpad.openstack.org/p/publiccloud-sig-billing-implementation-proposal -- Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED From thiagocmartinsc at gmail.com Mon May 27 21:31:29 2019 From: thiagocmartinsc at gmail.com (=?UTF-8?B?TWFydGlueCAtIOOCuOOCp+ODvOODoOOCug==?=) Date: Mon, 27 May 2019 17:31:29 -0400 Subject: [cinder] ceph multiattach details? In-Reply-To: <2569f62a-1d30-2468-b019-06d99a819f82@gmail.com> References: <2569f62a-1d30-2468-b019-06d99a819f82@gmail.com> Message-ID: Hello, I'm very curious about this as well! It would be awesome to support Cinder multi-attach when using Ceph... If the code is already there, how to use it?! Cheers, Thiago On Mon, 27 May 2019 at 03:52, Bernd Bausch wrote: > The Stein release notes mention that the RBD driver now supports > multiattach, but i have not found any details. Are there limitations? Is > there a need to configure anything? 
> > In the RBD driver > , > I find this: > > def _enable_multiattach(self, volume): > multipath_feature_exclusions = [ > self.rbd.RBD_FEATURE_JOURNALING, > self.rbd.RBD_FEATURE_FAST_DIFF, > self.rbd.RBD_FEATURE_OBJECT_MAP, > self.rbd.RBD_FEATURE_EXCLUSIVE_LOCK, > ] > > This seems to mean that journaling and other features (to me, it's not > quite clear what they are) will be automatically disabled when switching on > multiattachment. > > Further down in the code I see that replication and multiattach are > mutually exclusive. > > Is there some documentation about the Ceph multiattach feature, even an > email thread? > > Thanks, > > Bernd > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony at bakeyournoodle.com Mon May 27 22:48:47 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Tue, 28 May 2019 08:48:47 +1000 Subject: About the constraints url. In-Reply-To: References: <85cf109a1db24ef70f89d6e11c3effe76b05402a.camel@redhat.com> Message-ID: <20190527224846.GD4763@thor.bakeyournoodle.com> On Mon, May 27, 2019 at 02:48:14PM +0100, Sean Mooney wrote: > > The wording there is admittedly confusing: these URLs aren't "wrong", > > per se, they're just not what we're aiming for going forward. > > You > > probably want to look at this email: > > > > http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006478.html > > > > The reason for the -1s is presumably so we can do things correctly now > > rather than having to fix them again in the future. > why would we prefer the release repo over git? There are a couple of reasons. 1. Using release.o.o allows us is maintain the project state in one location therefore removing a race around branch time. When pointing directly at git the constraints file for $branch doesn't exist until the requirements repo branches. So if $project merges an update to tox.ini pointing that the constraints file for $branch that project on the new stable branch is running without constraints .... Hmm actually this in more nuanced that I have previously understood as I think we might actually be safe in the gate .... I'll look into that. 2. Using releases.o.o allows the requirements team to EOL branches without breaking tox files in git / on pypi. Now clearly neither of these really apply to master, so there it basically comes down to consistency with stable branches where points 1 and 2 make more sense. > i would much prefer if we coninued to use the direct opendev based git urls > instead of relying on the redirect as we do here > https://github.com/openstack/os-vif/blob/master/tox.ini#L12 > > https://releases.openstack.org/constraints/upper/master is just a redirect to > https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt > which is rather inefficient. would a better soluton to the EOL branches not be > dont delete the branch in the first place and just merge a commit declaring it eol. > > with extended maintenance i think that makes even more sense now then it did before. From where I sit, there are lots of things was *can do* and each has pros and cons. None is right or wrong none is the clear winner. Regardless of which we pick there will be knock-on technical impacts. In Sydney, Dublin we had those discussions, in Vancouver we chose to use the tag/delete process. 
I don't want to sound "ranty" and I don't want to close down a conversation but I also don't want to keep talking about it :/ > as a side note in the gate jobs we should also set the UPPER_CONSTRAINTS_FILE env to point a the > copy created by the zuul cloner rather then relying on either approch. Just for clarity s/should also// Redirects aren't used in the gate. > im not going to -1 patches that update to the > https://releases.openstack.org/constraints/upper/$series form although i would prefer > people do not do it on os-vif until the http{s,}://releases.openstack.org/constraints/upper/master > redirect actully use opendev.org as we have already made that switch and i dont want to go form 0 redirects > to 2. So https://review.opendev.org/#/c/660553/ is ready to merge which does this. It has 3 +2's and just didn't get a +W because Friday. I'd expect this to merge "real soon now". As you point out/imply this will take you from 0 to 1 redirect for tox runs outside of the gate. So I'm going to propose the change in os-vif along with all the others. Feel free to add a Depends-On or -W until you're happy with it. > i would like to know why we continue to kill the stable branches for when the go EOL as that is the > root of one of the issues raised. Tony do you have insight into that? The bottom line is that as a whole the community decided at some point we run out of time/energy/resources/spoons to maintain those older branches in the gate and at that point a strong signal for that is desirable. The signal we've chosen is the tag/delete process. The EM policy gives many project teams the ability to make those choices for themselves on a schedule that makes sense to them[1]. In this regard requirements is "just a project"[2] that would like to be able to make that call also. > os-vif's lower constratis job is also not broken in they way that was described in the email so if an automated patch is > recived that would regress that ill -2 it and then fix it up. So the fixing of lower-constraints is being done by hand not a script/tool and I can confirm os-vif isn't in the list of projects that gets an update ;P I do have *other* questions about os-vif's tox.ini, they're totally style related so I'll fire off a change as a place to have that discussion at some stage this week. Yours Tony. [1] Without the general schedule outlined in https://docs.openstack.org/project-team-guide/stable-branches.html [2] Okay it's not "just" a project as it has far reaching impacts when it goes EOL but we're trying to reduce/minimise them and the scope for mistakes. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From tony at bakeyournoodle.com Mon May 27 22:58:16 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Tue, 28 May 2019 08:58:16 +1000 Subject: About the constraints url. In-Reply-To: References: Message-ID: <20190527225815.GE4763@thor.bakeyournoodle.com> On Mon, May 27, 2019 at 09:18:19AM +0200, Natal Ngétal wrote: > Hi everyone, > > I'm little lost or totally lost, actually. Sorry if a mail have > already sent, I have not find it. If I have understood correctly, the > url to use for the constraints is: > > https://releases.openstack.org/constraints/upper/master Correct, that is the preferred URL on master with https://releases.openstack.org/constraints/upper/$series being the form on stable branches. > One question, this rule is the same for all projects? 
I'm also bit > confused, I know the oslo projects must be use this url, and I have a > patch is blocked for this reason: > > https://review.opendev.org/#/c/658296/ This is on stable/stein where we need it to match the form above to enable os-vif to out live openstack-requirements > The url is not good in master also, so I have make a patch for it: > > https://review.opendev.org/#/c/661100/ Thanks, the requirements team has committed to making this change, so you can expect a conflicting change in the near future ;P > So why I'm confused, because in the same project, few patches with the > bad url was validate or merge: > > https://review.opendev.org/#/c/655650/ > https://review.opendev.org/#/c/655649/ > https://review.opendev.org/#/c/655648/ > https://review.opendev.org/#/c/655640/ > > So which url must be used for constraints and this a global for all > openstack projects? Then can we be careful, when we review a patch > about this and not mixed up both url. The bottom line is these patches were generated by a well meaning but slightly out of touch member of the community. They're not wrong, they're just not the preferred direction for the requirements team. on *master* you can use the git URLs and ignore/abandon the changes the requirements team generate. I'll be a little sad if you do as then master is more different to a stable branch but nothing will break if you point directly at opedev.org/.../master/.../upper-constraints.txt. The same isn't true for stable/$series. Using git will break just not for somewhere between 1-4 years from now. Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From doug at doughellmann.com Mon May 27 23:26:11 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Mon, 27 May 2019 19:26:11 -0400 Subject: About the constraints url. In-Reply-To: <20190527224846.GD4763@thor.bakeyournoodle.com> References: <85cf109a1db24ef70f89d6e11c3effe76b05402a.camel@redhat.com> <20190527224846.GD4763@thor.bakeyournoodle.com> Message-ID: <948BA185-B6F5-4FD9-A268-CB69D6EA0E01@doughellmann.com> > On May 27, 2019, at 6:48 PM, Tony Breeds wrote: > > On Mon, May 27, 2019 at 02:48:14PM +0100, Sean Mooney wrote: > >>> The wording there is admittedly confusing: these URLs aren't "wrong", >>> per se, they're just not what we're aiming for going forward. >>> You >>> probably want to look at this email: >>> >>> http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006478.html >>> >>> The reason for the -1s is presumably so we can do things correctly now >>> rather than having to fix them again in the future. >> why would we prefer the release repo over git? > > There are a couple of reasons. > > 1. Using release.o.o allows us is maintain the project state in one > location therefore removing a race around branch time. When pointing > directly at git the constraints file for $branch doesn't exist until the > requirements repo branches. So if $project merges an update to tox.ini > pointing that the constraints file for $branch that project on the new > stable branch is running without constraints .... Hmm actually this in > more nuanced that I have previously understood as I think we might > actually be safe in the gate .... I'll look into that. The gate always uses a local copy of the constraints list. 
The race condition is on developers’ local systems, where the old style URL will point to something that does not exist until the requirements repo is branched. Doug From tony at bakeyournoodle.com Mon May 27 23:28:49 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Tue, 28 May 2019 09:28:49 +1000 Subject: About the constraints url. In-Reply-To: <948BA185-B6F5-4FD9-A268-CB69D6EA0E01@doughellmann.com> References: <85cf109a1db24ef70f89d6e11c3effe76b05402a.camel@redhat.com> <20190527224846.GD4763@thor.bakeyournoodle.com> <948BA185-B6F5-4FD9-A268-CB69D6EA0E01@doughellmann.com> Message-ID: <20190527232848.GF4763@thor.bakeyournoodle.com> On Mon, May 27, 2019 at 07:26:11PM -0400, Doug Hellmann wrote: > The gate always uses a local copy of the constraints list. The race > condition is on developers’ local systems, where the old style URL > will point to something that does not exist until the requirements > repo is branched. I assume that when zuul tries to get the stable/$series branch from openstack/requirements if that branch doesn't exist it just gets master right? Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From emilien at redhat.com Mon May 27 23:35:41 2019 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 27 May 2019 19:35:41 -0400 Subject: [tripleo] Configuration Management without Puppet Message-ID: (First of all: thanks to Alex Schultz who initiated the discussion during the last PTG). Context: With the containerization of the services in TripleO, our usage of Puppet is pretty much limited to laying down parameters into configuration files. Our long term goal is to reduce our number of tools in TripleO and converge toward more Ansible and we are currently investigating how to make it happen for Configuration Management, in a backward compatible and non-disruptive way. Problems: - Our current interface to configure things is very tight to Puppet and Hiera, which would make it complicated to bind it with Hiera. Some of us have tried (to create some Hiera module in Ansible) but clearly this is not the path we want to take as far I know. - We don't use the logic (generally) in the Puppet modules and again only care about the configuration providers (in puppet-openstacklib and for some services, templates files), so the Puppet modules now do too much for what we need in TripleO. Proposals: - Create an interface in THT that will define what the configuration looks like for each service. The format will be simple and easy to consume from any tool (Puppet to start for backward compatibility but also Ansible and why not etcd one day). Example with Glance in deployment/glance/glance-api-container-puppet.yaml: param_config: map_merge: - glance_api_config: DEFAULT: enable_v2_api: true log_dir: /var/log/glance oslo_middleware: enable_proxy_headers_parsing: true - glance_image_import_config: image_import_opts: image_import_plugins: {get_param: GlanceImageImportPlugins} https://review.opendev.org/#/c/660791/29/deployment/glance/glance-api-container-puppet.yaml It's not tight to Hiera and it's generating param_config which is JSON and consumable by Ansible or etcd later. 
Here, glance_api_config and glance_image_import_config are the Puppet providers which configure a specific file for Glance but we could imagine a binding for Ansible like: glance_api_config: /etc/glance/glance-api.conf - Move the services to use this new interface and I think it'll take multiple cycles to do that due to our amount of services. Things I haven't figured out (yet): - I'm still working on figuring out how we will handle ExtraConfig, service_config_settings and all the Hiera datafiles that we support in puppet/role.role.j2.yaml (ideas welcome). I can only imagine something like: ExtraConfig: - glance_api_config: DEFAULT: log_dir: /var/log/glance-2 Which somehow would override a previous hash. The blocker I found is that map_merge doesn't do deep merges, so if you merge glance_api_config, you overrides all keys... which is problematic if you only want to update one key. I'll investigate that further. - Backward compatibility for things like "ExtraConfig: glance::api::workers". Since we wouldn't rely on the classes anymore, our users calling the parameters from the classes will be broken. This also needs to be investigated. - non Puppet OpenStack modules like mysql, apache, etc, mostly use erb templates. This is, for now, out of scope and the plan is to look at OpenStack services first. But ideas welcome here as well. Of course we would prefer to consume upstream roles if they are well tested & maintained. If you want to review the ongoing work, here are the links: https://review.opendev.org/#/c/660726 https://review.opendev.org/#/c/661377 https://review.opendev.org/#/c/661093 https://review.opendev.org/#/c/660791 Thanks for reading that far, feel free to comment and contribute. I plan to continue this work and evaluate if all of this is actually worth it. -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Mon May 27 23:37:26 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Mon, 27 May 2019 19:37:26 -0400 Subject: About the constraints url. In-Reply-To: <20190527232848.GF4763@thor.bakeyournoodle.com> References: <85cf109a1db24ef70f89d6e11c3effe76b05402a.camel@redhat.com> <20190527224846.GD4763@thor.bakeyournoodle.com> <948BA185-B6F5-4FD9-A268-CB69D6EA0E01@doughellmann.com> <20190527232848.GF4763@thor.bakeyournoodle.com> Message-ID: > On May 27, 2019, at 7:28 PM, Tony Breeds wrote: > >> On Mon, May 27, 2019 at 07:26:11PM -0400, Doug Hellmann wrote: >> >> The gate always uses a local copy of the constraints list. The race >> condition is on developers’ local systems, where the old style URL >> will point to something that does not exist until the requirements >> repo is branched. > > I assume that when zuul tries to get the stable/$series branch from > openstack/requirements if that branch doesn't exist it just gets master > right? > > Yours Tony. Yes, it always works right when running under zuul. Doug From dsneddon at redhat.com Mon May 27 23:49:09 2019 From: dsneddon at redhat.com (Dan Sneddon) Date: Mon, 27 May 2019 16:49:09 -0700 Subject: [tripleo] Configuration Management without Puppet In-Reply-To: References: Message-ID: This looks very promising, thank you. One concern I have is that we maintain some of the troubleshooting ability that the hieradata files give us today in the long term. The hieradata files on overcloud nodes make it convenient to glean facts about the deployed host. They also act as a canary for when the overcloud Heat stacks are out of sync with the deployed hosts. 
Once we make the param_config consumable by Ansible or etcd, please keep supportability and troubleshooting in mind. Having local config data in a convenient place on the deployed host is important. On Mon, May 27, 2019 at 4:36 PM Emilien Macchi wrote: > (First of all: thanks to Alex Schultz who initiated the discussion during > the last PTG). > > Context: > With the containerization of the services in TripleO, our usage of Puppet > is pretty much limited to laying down parameters into configuration files. > Our long term goal is to reduce our number of tools in TripleO and > converge toward more Ansible and we are currently investigating how to make > it happen for Configuration Management, in a backward compatible and > non-disruptive way. > > Problems: > - Our current interface to configure things is very tight to Puppet and > Hiera, which would make it complicated to bind it with Hiera. Some of us > have tried (to create some Hiera module in Ansible) but clearly this is not > the path we want to take as far I know. > - We don't use the logic (generally) in the Puppet modules and again only > care about the configuration providers (in puppet-openstacklib and for some > services, templates files), so the Puppet modules now do too much for what > we need in TripleO. > > Proposals: > - Create an interface in THT that will define what the configuration looks > like for each service. The format will be simple and easy to consume from > any tool (Puppet to start for backward compatibility but also Ansible and > why not etcd one day). > > Example with Glance in deployment/glance/glance-api-container-puppet.yaml: > param_config: > map_merge: > - glance_api_config: > DEFAULT: > enable_v2_api: true > log_dir: /var/log/glance > oslo_middleware: > enable_proxy_headers_parsing: true > - glance_image_import_config: > image_import_opts: > image_import_plugins: {get_param: > GlanceImageImportPlugins} > > > https://review.opendev.org/#/c/660791/29/deployment/glance/glance-api-container-puppet.yaml > > It's not tight to Hiera and it's generating param_config which is JSON and > consumable by Ansible or etcd later. > Here, glance_api_config and glance_image_import_config are the Puppet > providers which configure a specific file for Glance but we could imagine a > binding for Ansible like: > glance_api_config: /etc/glance/glance-api.conf > > - Move the services to use this new interface and I think it'll take > multiple cycles to do that due to our amount of services. > > Things I haven't figured out (yet): > - I'm still working on figuring out how we will handle ExtraConfig, > service_config_settings and all the Hiera datafiles that we support in > puppet/role.role.j2.yaml (ideas welcome). > I can only imagine something like: > ExtraConfig: > - glance_api_config: > DEFAULT: > log_dir: /var/log/glance-2 > > Which somehow would override a previous hash. The blocker I found is that > map_merge doesn't do deep merges, so if you merge glance_api_config, you > overrides all keys... which is problematic if you only want to update one > key. I'll investigate that further. > > - Backward compatibility for things like "ExtraConfig: > glance::api::workers". Since we wouldn't rely on the classes anymore, our > users calling the parameters from the classes will be broken. This also > needs to be investigated. > > - non Puppet OpenStack modules like mysql, apache, etc, mostly use erb > templates. This is, for now, out of scope and the plan is to look at > OpenStack services first. But ideas welcome here as well. 
Of course we > would prefer to consume upstream roles if they are well tested & maintained. > > If you want to review the ongoing work, here are the links: > https://review.opendev.org/#/c/660726 > https://review.opendev.org/#/c/661377 > https://review.opendev.org/#/c/661093 > https://review.opendev.org/#/c/660791 > > Thanks for reading that far, feel free to comment and contribute. I plan > to continue this work and evaluate if all of this is actually worth it. > -- > Emilien Macchi > -- Dan Sneddon | Senior Principal Software Engineer dsneddon at redhat.com | redhat.com/cloud dsneddon:irc | @dxs:twitter -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Mon May 27 23:58:40 2019 From: emilien at redhat.com (Emilien Macchi) Date: Mon, 27 May 2019 19:58:40 -0400 Subject: [tripleo] Configuration Management without Puppet In-Reply-To: References: Message-ID: On Mon, May 27, 2019 at 7:49 PM Dan Sneddon wrote: > This looks very promising, thank you. One concern I have is that we > maintain some of the troubleshooting ability that the hieradata files give > us today in the long term. > > The hieradata files on overcloud nodes make it convenient to glean facts > about the deployed host. They also act as a canary for when the overcloud > Heat stacks are out of sync with the deployed hosts. > > Once we make the param_config consumable by Ansible or etcd, please keep > supportability and troubleshooting in mind. Having local config data in a > convenient place on the deployed host is important. > You made very good points and I think we'll maintain an optional puppet step where our operators can apply any Hiera like before; at least until we think our operators have switched to the new interface at some point. For debugging, we could think of the same kind of usage, where you have your custom-params.yaml like : --- glance_api_config: DEFAULT: enable_v2_api: true log_dir: /var/log/glance We would merge the hashes and lay down the config (with some hierarchy like before with Hiera datafiles). Also a note on etcd; even if the data would be consumed from etcd by services (currently not supported by oslo-config AFIK), the data would be available in config-download when generating the playbooks & data, so we could make it so you can modify the data and run the playbooks to change any config. Thanks for your feedback, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From henry at thebonaths.com Tue May 28 02:05:12 2019 From: henry at thebonaths.com (Henry Bonath) Date: Mon, 27 May 2019 22:05:12 -0400 Subject: [openstack-ansible] Installing Third-Party drivers into the Cinder-Volume container during playbook execution Message-ID: Hello, I asked this into IRC but I thought this might be a more appropriate place to ask considering the IRC channel usage over the weekend. If I wanted to deploy a third party driver along with my Cinder-Volume container, is there a built-in mechanism for doing so? (I am specifically wanting to use: https://github.com/iXsystems/cinder) I am able to configure a cinder-backend in the "openstack_user_config.yml" file which works perfectly if I let it fail during the first run, then copy the driver into the containers and run "os-cinder-install.yml" a second time. I've found that you guys have built similar stuff into the system (e.g. Horizon custom Theme installation via .tgz) and was curious if there is a similar mechanism for Cinder Drivers that may be undocumented. 
http://paste.openstack.org/show/752132/ This is an example of my working config, which relies on the driver being copied into the /openstack/venvs/cinder-19.x.x.x/lib/python2.7/site-packages/cinder/volume/drivers/ixsystems/ folder. Thanks in advance! From emccormick at cirrusseven.com Tue May 28 02:30:04 2019 From: emccormick at cirrusseven.com (Erik McCormick) Date: Mon, 27 May 2019 22:30:04 -0400 Subject: [cinder] ceph multiattach details? In-Reply-To: References: <2569f62a-1d30-2468-b019-06d99a819f82@gmail.com> Message-ID: On Mon, May 27, 2019, 5:33 PM Martinx - ジェームズ wrote: > Hello, > > I'm very curious about this as well! > > It would be awesome to support Cinder multi-attach when using Ceph... If > the code is already there, how to use it?! > > Cheers, > Thiago > > On Mon, 27 May 2019 at 03:52, Bernd Bausch wrote: > >> The Stein release notes mention that the RBD driver now supports >> multiattach, but i have not found any details. Are there limitations? Is >> there a need to configure anything? >> >> In the RBD driver >> , >> I find this: >> >> def _enable_multiattach(self, volume): >> multipath_feature_exclusions = [ >> self.rbd.RBD_FEATURE_JOURNALING, >> self.rbd.RBD_FEATURE_FAST_DIFF, >> self.rbd.RBD_FEATURE_OBJECT_MAP, >> self.rbd.RBD_FEATURE_EXCLUSIVE_LOCK, >> ] >> >> This seems to mean that journaling and other features (to me, it's not >> quite clear what they are) will be automatically disabled when switching on >> multiattachment. >> >> Further down in the code I see that replication and multiattach are >> mutually exclusive. >> >> Is there some documentation about the Ceph multiattach feature, even an >> email thread? >> >> Thanks, >> >> Bernd >> > There isn't really a Ceph multi-attach feature using Cinder. The code comment is stating that, while the Openstack side of things is in place, Ceph doesn't yet support it with RBD due to replication issues with multiple clients. The Ceph community is aware of it, but has thus far focused on CephFS as the shared file system instead. This could possibly be used with the NFS Cinder driver talking to Ganesha with CephFS mounted. You may also want to look at Openstack's Manilla project to orchestrate that. -Erik > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue May 28 08:19:17 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 28 May 2019 09:19:17 +0100 Subject: About the constraints url. In-Reply-To: <20190527232848.GF4763@thor.bakeyournoodle.com> References: <85cf109a1db24ef70f89d6e11c3effe76b05402a.camel@redhat.com> <20190527224846.GD4763@thor.bakeyournoodle.com> <948BA185-B6F5-4FD9-A268-CB69D6EA0E01@doughellmann.com> <20190527232848.GF4763@thor.bakeyournoodle.com> Message-ID: On Tue, 2019-05-28 at 09:28 +1000, Tony Breeds wrote: > On Mon, May 27, 2019 at 07:26:11PM -0400, Doug Hellmann wrote: > > > The gate always uses a local copy of the constraints list. this is almost always true. howver the tempest jobs use a tempest tox env to create a venv used to generate teh tempst config https://github.com/openstack/devstack/blob/master/lib/tempest#L592 we then uncondtionally use master upper constaits to reinstall the deps form the local zuul clonned copy of the requirement repo. https://github.com/openstack/devstack/blob/master/lib/tempest#L595-L597 that first invocation of tox does not use the local requirement repo and hits the interwebs as i have seen this cause gate failures due to dns resoltuion issues in the past. 
I'm not really sure why we do it this way, to be honest. For the tox jobs a non-default constraints file is only used if tox_constraints_file is defined:
https://github.com/openstack-infra/zuul-jobs/blob/d1465e8b1b5ba8af4619f7bb95551cbdbb42c059/roles/tox/tasks/main.yaml#L6-L24
We don't define that in our base tox jobs in the zuul-jobs repo:
https://github.com/openstack-infra/zuul-jobs/blob/ed1323d09616230eb44636570231b761351c845c/zuul.yaml#L86-L143
but we do set it in our openstack-tox job in the openstack-zuul-jobs repo, so we should be safe in that regard:
https://github.com/openstack/openstack-zuul-jobs/blob/c716915911a612ea351ab6b1718c3239e900c469/zuul.d/jobs.yaml#L343-L378
So it is only where we directly invoke tox in devstack or custom jobs that we need to be careful.

> > The race
> > condition is on developers' local systems, where the old style URL
> > will point to something that does not exist until the requirements
> > repo is branched.
>
> I assume that when zuul tries to get the stable/$series branch from
> openstack/requirements if that branch doesn't exist it just gets master
> right?
>
> Yours Tony.

From smooney at redhat.com Tue May 28 08:57:10 2019
From: smooney at redhat.com (Sean Mooney)
Date: Tue, 28 May 2019 09:57:10 +0100
Subject: About the constraints url.
In-Reply-To: <20190527224846.GD4763@thor.bakeyournoodle.com>
References: <85cf109a1db24ef70f89d6e11c3effe76b05402a.camel@redhat.com> <20190527224846.GD4763@thor.bakeyournoodle.com>
Message-ID: <81cd6f89e6e76507ead9f5468ca270ca42022634.camel@redhat.com>

On Tue, 2019-05-28 at 08:48 +1000, Tony Breeds wrote:
> On Mon, May 27, 2019 at 02:48:14PM +0100, Sean Mooney wrote:
> > > The wording there is admittedly confusing: these URLs aren't "wrong",
> > > per se, they're just not what we're aiming for going forward. You
> > > probably want to look at this email:
> > >
> > > http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006478.html
> > >
> > > The reason for the -1s is presumably so we can do things correctly now
> > > rather than having to fix them again in the future.
> >
> > why would we prefer the release repo over git?
>
> There are a couple of reasons.
>
> 1. Using release.o.o allows us to maintain the project state in one
> location, therefore removing a race around branch time. When pointing
> directly at git, the constraints file for $branch doesn't exist until the
> requirements repo branches. So if $project merges an update to tox.ini
> pointing at the constraints file for $branch, that project on the new
> stable branch is running without constraints .... Hmm, actually this is
> more nuanced than I had previously understood, as I think we might
> actually be safe in the gate .... I'll look into that.
>
> 2. Using releases.o.o allows the requirements team to EOL branches
> without breaking tox files in git / on pypi.
>
> Now clearly neither of these really apply to master, so there it
> basically comes down to consistency with stable branches, where points 1
> and 2 make more sense.

OK, I won't stand in the way of progress if it does in fact remove operational hassle for the requirements team :)

> > I would much prefer if we continued to use the direct opendev-based git URLs
> > instead of relying on the redirect as we do here:
> > https://github.com/openstack/os-vif/blob/master/tox.ini#L12
> >
> > https://releases.openstack.org/constraints/upper/master is just a redirect to
> > https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt
> > which is rather inefficient. Would a better solution to the EOL branches not be
> > to not delete the branch in the first place and just merge a commit declaring it EOL?
> >
> > With extended maintenance I think that makes even more sense now than it did before.
>
> From where I sit, there are lots of things we *can do* and each has
> pros and cons. None is right or wrong, none is the clear winner.
> Regardless of which we pick there will be knock-on technical impacts.
> In Sydney, Dublin we had those discussions, in Vancouver we chose to use
> the tag/delete process.
>
> I don't want to sound "ranty" and I don't want to close down a
> conversation but I also don't want to keep talking about it :/
>
> > as a side note, in the gate jobs we should also set the UPPER_CONSTRAINTS_FILE env to point at the
> > copy created by the zuul cloner rather than relying on either approach.
>
> Just for clarity s/should also// we do it for the tox jobs

But we don't define it in the env of all jobs like devstack; as a result there are edge cases where the local copy is not always used, and I have seen intermittent gate failures due to DNS or other network issues as a result. Specifically this call in devstack:
https://github.com/openstack/devstack/blob/master/lib/tempest#L592
So when I meant "set it" I meant actually export UPPER_CONSTRAINTS_FILE=/opt/stack/... so that any casual invocation would use the local copy. That said, looking at the devstack source code, all direct invocations need a different constraints file than the rest of the job, and I don't think any other project invokes tox directly in the gate without using the tox zuul jobs, so setting UPPER_CONSTRAINTS_FILE=/opt/stack/... won't actually change the behavior in a constructive way, so we can probably ignore this.

> Redirects aren't used in the gate.
>
> > I'm not going to -1 patches that update to the
> > https://releases.openstack.org/constraints/upper/$series form, although I would prefer
> > people do not do it on os-vif until the http{s,}://releases.openstack.org/constraints/upper/master
> > redirect actually uses opendev.org, as we have already made that switch and I don't want to go from 0 redirects
> > to 2.
>
> So https://review.opendev.org/#/c/660553/ is ready to merge which does
> this. It has 3 +2's and just didn't get a +W because Friday. I'd
> expect this to merge "real soon now". As you point out/imply this will
> take you from 0 to 1 redirect for tox runs outside of the gate.
>
> So I'm going to propose the change in os-vif along with all the
> others. Feel free to add a Depends-On or -W until you're happy with
> it.

It's fine, I can live with one :) I have had issues in the past with clients not liking to follow multiple redirects from behind a proxy, but I have not seen that in 2 or 3 years and I no longer code behind a proxy. One minor thing for people that use squid or other caching proxies is that since we redirect to https:// they will no longer get any caching. We do this if you connect directly too:

sean at pop-os:~/temp$ git clone http://opendev.org/openstack/os-vif
Cloning into 'os-vif'...
warning: redirecting to https://opendev.org/openstack/os-vif/
remote: Enumerating objects: 2274, done.
remote: Counting objects: 100% (2274/2274), done.
remote: Compressing objects: 100% (867/867), done.
remote: Total 2274 (delta 1395), reused 2169 (delta 1371)
Receiving objects: 100% (2274/2274), 430.83 KiB | 1.17 MiB/s, done.
Resolving deltas: 100% (1395/1395), done.

I understand why we do this, e.g. secure by default, but it's one of the downsides of our current redirects.
> > > i would like to know why we continue to kill the stable branches for when the go EOL as that is the > > root of one of the issues raised. Tony do you have insight into that? > > The bottom line is that as a whole the community decided at some point > we run out of time/energy/resources/spoons to maintain those older > branches in the gate and at that point a strong signal for that is > desirable. The signal we've chosen is the tag/delete process. The EM > policy gives many project teams the ability to make those choices for > themselves on a schedule that makes sense to them[1]. In this regard > requirements is "just a project"[2] that would like to be able to make > that call also. > > > os-vif's lower constratis job is also not broken in they way that was described in the email so if an automated > > patch is > > recived that would regress that ill -2 it and then fix it up. > > So the fixing of lower-constraints is being done by hand not a > script/tool and I can confirm os-vif isn't in the list of projects that > gets an update ;P :) > > I do have *other* questions about os-vif's tox.ini, they're totally > style related so I'll fire off a change as a place to have that > discussion at some stage this week. sure although we deliberately chose to implement things differently in a few places to try and keep our tox config cleaner/simpler. if you have improvement they are welcome although i think stylistically the os-vif tox.ini is cleaner then most :) > > Yours Tony. > > [1] Without the general schedule outlined in https://docs.openstack.org/project-team-guide/stable-branches.html > [2] Okay it's not "just" a project as it has far reaching impacts when > it goes EOL but we're trying to reduce/minimise them and the scope for > mistakes. From bdobreli at redhat.com Tue May 28 09:31:07 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Tue, 28 May 2019 11:31:07 +0200 Subject: [tripleo] Configuration Management without Puppet In-Reply-To: References: Message-ID: On 28.05.2019 1:35, Emilien Macchi wrote: > (First of all: thanks to Alex Schultz who initiated the discussion > during the last PTG). > > Context: > With the containerization of the services in TripleO, our usage of > Puppet is pretty much limited to laying down parameters into > configuration files. > Our long term goal is to reduce our number of tools in TripleO and > converge toward more Ansible and we are currently investigating how to > make it happen for Configuration Management, in a backward compatible > and non-disruptive way. > > Problems: > - Our current interface to configure things is very tight to Puppet and > Hiera, which would make it complicated to bind it with Hiera. Some of us > have tried (to create some Hiera module in Ansible) but clearly this is > not the path we want to take as far I know. > - We don't use the logic (generally) in the Puppet modules and again > only care about the configuration providers (in puppet-openstacklib and > for some services, templates files), so the Puppet modules now do too > much for what we need in TripleO. > > Proposals: > - Create an interface in THT that will define what the configuration > looks like for each service. The format will be simple and easy to > consume from any tool (Puppet to start for backward compatibility but > also Ansible and why not etcd one day). 
> > Example with Glance in deployment/glance/glance-api-container-puppet.yaml: >         param_config: >           map_merge: >             - glance_api_config: >                 DEFAULT: >                     enable_v2_api: true >                     log_dir: /var/log/glance >                 oslo_middleware: >                   enable_proxy_headers_parsing: true >             - glance_image_import_config: >                 image_import_opts: >                   image_import_plugins: {get_param: > GlanceImageImportPlugins} > > https://review.opendev.org/#/c/660791/29/deployment/glance/glance-api-container-puppet.yaml > > It's not tight to Hiera and it's generating param_config which is JSON > and consumable by Ansible or etcd later. > Here, glance_api_config and glance_image_import_config are the Puppet > providers which configure a specific file for Glance but we could > imagine a binding for Ansible like: >   glance_api_config: /etc/glance/glance-api.conf > > - Move the services to use this new interface and I think it'll take > multiple cycles to do that due to our amount of services. > > Things I haven't figured out (yet): > - I'm still working on figuring out how we will handle ExtraConfig, > service_config_settings and all the Hiera datafiles that we support in > puppet/role.role.j2.yaml (ideas welcome). > I can only imagine something like: >           ExtraConfig: >             - glance_api_config: >                 DEFAULT: >                     log_dir: /var/log/glance-2 > > Which somehow would override a previous hash. The blocker I found is > that map_merge doesn't do deep merges, so if you merge > glance_api_config, you overrides all keys... which is problematic if you > only want to update one key. I'll investigate that further. > > - Backward compatibility for things like "ExtraConfig: > glance::api::workers". Since we wouldn't rely on the classes anymore, > our users calling the parameters from the classes will be broken. This > also needs to be investigated. > > - non Puppet OpenStack modules like mysql, apache, etc, mostly use erb > templates. This is, for now, out of scope and the plan is to look at > OpenStack services first. But ideas welcome here as well. Of course we > would prefer to consume upstream roles if they are well tested & maintained. I believe we should "translate" hiera & puppet configurations we have in t-h-t directly into k8s-native (or at least looking-like) YAML structures, to have those consumable either as k8s config-maps or ansible vars, or etcd keys transparently and interchangeable. That would allow us to omit any additional translation work and intermediate data abstractions in future, when/if we'd decided to convert t-h-t to manage container pods and/or deploy it via k8s operators framework. I know there had been the translation work done for adapting ceph-ansible data structures (right, not really puppet...) into Rook [0], an operator for Kubernetes clusters. So perhaps we should add the ceph clusters deployment automation folks and teams in the loop and leverage their experience with the subject. [0] https://github.com/rook/rook > > If you want to review the ongoing work, here are the links: > https://review.opendev.org/#/c/660726 > https://review.opendev.org/#/c/661377 > https://review.opendev.org/#/c/661093 > https://review.opendev.org/#/c/660791 > > Thanks for reading that far, feel free to comment and contribute. I plan > to continue this work and evaluate if all of this is actually worth it. 
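(As an illustration of the deep-merge gap mentioned above: outside of Heat, Ansible can already do this with its combine filter, so a per-service hash and an ExtraConfig-style override could be merged per key rather than per hash. A minimal sketch, not TripleO code; the variable names are only examples:)

    - hosts: localhost
      gather_facts: false
      vars:
        glance_api_config:
          DEFAULT:
            enable_v2_api: true
            log_dir: /var/log/glance
          oslo_middleware:
            enable_proxy_headers_parsing: true
        glance_api_extra_config:
          DEFAULT:
            log_dir: /var/log/glance-2
      tasks:
        - name: Deep-merge the service defaults with the operator override
          set_fact:
            glance_api_merged: "{{ glance_api_config | combine(glance_api_extra_config, recursive=True) }}"

        # glance_api_merged keeps enable_v2_api and the oslo_middleware section;
        # only DEFAULT/log_dir is replaced with /var/log/glance-2
        - debug:
            var: glance_api_merged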
> -- > Emilien Macchi -- Best regards, Bogdan Dobrelya, Irc #bogdando From marcin.juszkiewicz at linaro.org Tue May 28 10:19:18 2019 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Tue, 28 May 2019 12:19:18 +0200 Subject: [kolla][neutron][networking-infoblox] Python3 issue: "TypeError: Unicode-objects must be encoded before hashing" In-Reply-To: <42626a00-df14-3d9b-e52c-1dfc3eeb639f@linaro.org> References: <1d56ad05-9fa4-16b7-5cbe-af5c339f58b1@linaro.org> <42626a00-df14-3d9b-e52c-1dfc3eeb639f@linaro.org> Message-ID: <2c558490-753d-ccfa-6068-653d73844ad8@linaro.org> W dniu 07.05.2019 o 09:34, Marcin Juszkiewicz pisze: > W dniu 07.05.2019 o 08:42, Marcin Juszkiewicz pisze: >> I am working on making Kolla images Python 3 only. So far images are py3 >> but then there are issues during deployment phase which I do not know >> how to solve. >> >> https://review.opendev.org/#/c/642375/ is a patch. >> >> 'kolla-ansible-ubuntu-source' CI job deploys using Ubuntu 18.04 based >> images. And fails. >> >> Log [1] shows something which looks like 'works in py2, not tested with py3' >> code: >> >> 1. http://logs.openstack.org/75/642375/19/check/kolla-ansible-ubuntu-source/40878ed/primary/logs/ansible/deploy >> >> > >> " File \"/var/lib/kolla/venv/lib/python3.6/site-packages/networking_infoblox/neutron/common/utils.py\", line 374, in get_hash", >> " return hashlib.md5(str(time.time())).hexdigest()", >> "TypeError: Unicode-objects must be encoded before hashing" >> ], Looking at progress of https://review.opendev.org/#/c/657578/ patch I decided to ignore existance of 'networking_infoblox' project. Added all core reviewers and got completely ignored on one line patch making that thing work with Python 3. Hope that other x/* projects are better maintained ;d From berndbausch at gmail.com Tue May 28 12:29:27 2019 From: berndbausch at gmail.com (Bernd Bausch) Date: Tue, 28 May 2019 21:29:27 +0900 Subject: [cinder] ceph multiattach details? In-Reply-To: References: <2569f62a-1d30-2468-b019-06d99a819f82@gmail.com> Message-ID: <8c70bb02-a959-96fd-382e-26f8816aad5d@gmail.com> Thanks for clarifying this Erik. Let me point out then that the release notes [1] are worded rather unequivocally, and what you are saying contradicts them: RBD driver has added multiattach support. It should be noted that replication and multiattach are mutually exclusive, so a single RBD volume can only be configured to support one of these features at a time. Additionally, RBD image features are not preserved which prevents a volume being retyped from multiattach to another type. This limitation is temporary and will be addressed soon. Bernd. On 5/28/2019 11:30 AM, Erik McCormick wrote: > There isn't really a Ceph multi-attach feature using Cinder [1] https://docs.openstack.org/releasenotes/cinder/stein.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From emccormick at cirrusseven.com Tue May 28 12:57:17 2019 From: emccormick at cirrusseven.com (Erik McCormick) Date: Tue, 28 May 2019 08:57:17 -0400 Subject: [cinder] ceph multiattach details? In-Reply-To: <8c70bb02-a959-96fd-382e-26f8816aad5d@gmail.com> References: <2569f62a-1d30-2468-b019-06d99a819f82@gmail.com> <8c70bb02-a959-96fd-382e-26f8816aad5d@gmail.com> Message-ID: On Tue, May 28, 2019, 8:29 AM Bernd Bausch wrote: > Thanks for clarifying this Erik. Let me point out then that the release > notes [1] are worded rather unequivocally, and what you are saying > contradicts them: > > RBD driver has added multiattach support. 
It should be noted that > replication and multiattach are mutually exclusive, so a single RBD volume > can only be configured to support one of these features at a time. > Additionally, RBD image features are not preserved which prevents a volume > being retyped from multiattach to another type. This limitation is > temporary and will be addressed soon. > > Bernd. > I haven't even looked at Stein yet so I could be wrong, but I thin this is referring to replication like RBD mirroring. There's still an issue in Ceph, as far as I know, where ceph's object replication would have a problem with multi-attach. I haven't come across any Ceph release note to say this has been addressed. Hopefully someone from the Cinder team can straighten us out. -Erik > On 5/28/2019 11:30 AM, Erik McCormick wrote: > > There isn't really a Ceph multi-attach feature using Cinder > > [1] https://docs.openstack.org/releasenotes/cinder/stein.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mihalis68 at gmail.com Tue May 28 13:04:05 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 28 May 2019 09:04:05 -0400 Subject: [ops] last weeks meetups team meeting minutes Message-ID: Minutes for last weeks brief openstack ops meetups team meeting are here: Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-05-21-14.05.html 10:24 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-05-21-14.05.txt 10:24 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-05-21-14.05.log.html Also please note: detailed planning has started for the NYC meetup in September, please add your ideas for sessions topics and +1 ones already there that you would like to attend, see https://etherpad.openstack.org/p/NYC19-OPS-MEETUP our next meeting is in just under an hour on #openstack-operators, see you there! Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Tue May 28 13:21:44 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 28 May 2019 22:21:44 +0900 Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast In-Reply-To: <16af8ac7fc7.fa901245105341.2925519493395080868@ghanshyammann.com> References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> <16af8ac7fc7.fa901245105341.2925519493395080868@ghanshyammann.com> Message-ID: <16afe9a95fa.ff3031a3146625.6650617153857325463@ghanshyammann.com> ---- On Mon, 27 May 2019 18:43:35 +0900 Ghanshyam Mann wrote ---- > ---- On Tue, 07 May 2019 07:06:23 +0900 Morgan Fainberg wrote ---- > > > > > > On Sun, May 5, 2019 at 12:19 AM Ghanshyam Mann wrote: > > > > For the "Integrated-gate-identity", I have a slight worry that we might lose some coverage with this change. I am unsure of how varied the use of Keystone is outside of KeystoneMiddleware (i.e. token validation) consumption that all services perform, Heat (not part of the integrated gate) and it's usage of Trusts, and some newer emerging uses such as "look up limit data" (potentially in Train, would be covered by Nova). Worst case, we could run all the integrated tests for Keystone changes (at least initially) until we have higher confidence and minimize the tests once we have a clearer audit of how the services use Keystone. 
The changes would speed up/minimize the usage for the other services directly and Keystone can follow down the line. > > I want to be as close to 100% sure we're not going to suddenly break everyone because of some change we land. Keystone fortunately and unfortunately sits below most other services in an OpenStack deployment and is heavily relied throughout almost every single request. > > --Morgan > > > Thanks Morgan. That was what we were worried during PTG discussion. I agree with your point about not to lose coverage and first get to know how Keystone is being used by each service. Let's keep running the all service tests for keystone gate as of now and later we can shorten the tests run based on the clarity of usage. We can disable the ssh validation for "Integrated-gate-identity" which keystone does not need to care about. This can save the keystone gate for ssh timeout failure. -gmann > > -gmann > > > > Current integrated-gate jobs (tempest-full) is not so stable for various bugs specially timeout. We tried > > to improve it via filtering the slow tests in the separate tempest-slow job but the situation has not been improved much. > > > > We talked about the Ideas to make it more stable and fast for projects especially when failure is not > > related to each project. We are planning to split the integrated-gate template (only tempest-full job as > > first step) per related services. > > > > Idea: > > - Run only dependent service tests on project gate. > > - Tempest gate will keep running all the services tests as the integrated gate at a centeralized place without any change in the current job. > > - Each project can run the below mentioned template. > > - All below template will be defined and maintained by QA team. > > > > I would like to know each 6 services which run integrated-gate jobs > > > > 1."Integrated-gate-networking" (job to run on neutron gate) > > Tests to run in this template: neutron APIs , nova APIs, keystone APIs ? All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > > Improvement for neutron gate: exlcude the cinder API tests, glance API tests, swift API tests, > > > > 2."Integrated-gate-storage" (job to run on cinder gate, glance gate) > > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, Nova APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > > Improvement for cinder, glance gate: excluded the neutron APIs tests, Keystone APIs tests > > > > 3. "Integrated-gate-object-storage" (job to run on swift gate) > > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > > Improvement for swift gate: excluded the neutron APIs tests, - Keystone APIs tests, - Nova APIs tests. > > Note: swift does not run integrated-gate as of now. > > > > 4. "Integrated-gate-compute" (job to run on Nova gate) > > tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs and All scenario currently running in tempest-full in same way ( means non-slow and in serial) > > Improvement for Nova gate: excluded the swift APIs tests(not running in current job but in future, it might), Keystone API tests. > > > > 5. "Integrated-gate-identity" (job to run on keystone gate) > > Tests to run is : all as all project use keystone, we might need to run all tests as it is running in integrated-gate. > > But does keystone is being unsed differently by all services? 
if no then, is it enough to run only single service tests say Nova or neutron ? > > > > 6. "Integrated-gate-placement" (job to run on placement gate) > > Tests to run in this template: Nova APIs tests, Neutron APIs tests + scenario tests + any new service depends on placement APIs > > Improvement for placement gate: excluded the glance APIs tests, cinder APIs tests, swift APIs tests, keystone APIs tests > > > > Thoughts on this approach? > > > > The important point is we must not lose the coverage of integrated testing per project. So I would like to > > get each project view if we are missing any dependency (proposed tests removal) in above proposed templates. > > > > - https://etherpad.openstack.org/p/qa-train-ptg > > > > -gmann > > > > > > > From sean.mcginnis at gmx.com Tue May 28 13:33:23 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 28 May 2019 08:33:23 -0500 Subject: [cinder] ceph multiattach details? In-Reply-To: References: <2569f62a-1d30-2468-b019-06d99a819f82@gmail.com> <8c70bb02-a959-96fd-382e-26f8816aad5d@gmail.com> Message-ID: <20190528133322.GA23003@sm-workstation> On Tue, May 28, 2019 at 08:57:17AM -0400, Erik McCormick wrote: > On Tue, May 28, 2019, 8:29 AM Bernd Bausch wrote: > > > Thanks for clarifying this Erik. Let me point out then that the release > > notes [1] are worded rather unequivocally, and what you are saying > > contradicts them: > > > > RBD driver has added multiattach support. It should be noted that > > replication and multiattach are mutually exclusive, so a single RBD volume > > can only be configured to support one of these features at a time. > > Additionally, RBD image features are not preserved which prevents a volume > > being retyped from multiattach to another type. This limitation is > > temporary and will be addressed soon. > > > > Bernd. > > > I haven't even looked at Stein yet so I could be wrong, but I thin this is > referring to replication like RBD mirroring. There's still an issue in > Ceph, as far as I know, where ceph's object replication would have a > problem with multi-attach. I haven't come across any Ceph release note to > say this has been addressed. > > Hopefully someone from the Cinder team can straighten us out. > > -Erik Multiattach support has indeed been enabled for RBD as of the Stein release. Though there are the known caveats that you point out. I thought there was a pending patch to add some details on this to to the RBD driver configuration reference, but I am not finding anything at the moment. I don't have all the details on that myself, but hopefully one of the RBD driver maintainers can chime in here with better details. Sean From mthode at mthode.org Tue May 28 14:40:30 2019 From: mthode at mthode.org (Matthew Thode) Date: Tue, 28 May 2019 09:40:30 -0500 Subject: [requirements][tooz] uncap grpcio Message-ID: <20190528144030.iifm6omp2cwphgv7@mthode.org> Hi all, Requirements is looking to raise the version of grpcio to 1.21.1. grpcio has been uncapped in requirements for six months now and currently only tooz is blocking the usage of newer versions. Following the unblocking we'd need a release in order to actually unblock things in gate. It looks like it was fixed in 1.18.0, if you need to mask version between (and including) 1.16.0 and 1.17.x feel free to submit the patches to global-requirements. 
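(For illustration, such a mask would be a single specifier line in global-requirements along these lines; a sketch only, the exact entry should follow whatever is already in the file:)

    grpcio!=1.16.*,!=1.17.*  # mask the releases affected by bug 1808046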
https://bugs.launchpad.net/python-tooz/+bug/1808046 70f144abde14e07d7f9620a2abb563ed16ef8c63 (in openstack/tooz) Thanks, -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From openstack at nemebean.com Tue May 28 14:54:50 2019 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 28 May 2019 09:54:50 -0500 Subject: [requirements][tooz] uncap grpcio In-Reply-To: <20190528144030.iifm6omp2cwphgv7@mthode.org> References: <20190528144030.iifm6omp2cwphgv7@mthode.org> Message-ID: <5bb280a1-efc5-627d-8652-e58b3fc2e6ce@nemebean.com> On 5/28/19 9:40 AM, Matthew Thode wrote: > Hi all, > > Requirements is looking to raise the version of grpcio to 1.21.1. > grpcio has been uncapped in requirements for six months now and > currently only tooz is blocking the usage of newer versions. It hasn't been uncapped for six months. The commit was initially proposed to gerrit six months ago, but it was only merged a couple of weeks ago: https://review.opendev.org/#/c/625010/ We have a patch up to uncap grpcio in tooz: https://review.opendev.org/#/c/659590 It was previously blocked by an unrelated issue with the gate jobs, but that has since been fixed. I tweaked the patch to address the nits that had been pointed out and it should be good to go now. > Following the unblocking we'd need a release in order to actually > unblock things in gate. > > It looks like it was fixed in 1.18.0, if you need to mask version > between (and including) 1.16.0 and 1.17.x feel free to submit the > patches to global-requirements. The exclusions were part of the uncap patch in g-r. In tooz we're just bumping the minimum to 1.18.0 to avoid problems if they would happen to do another 1.16 or 1.17 bugfix release. > > https://bugs.launchpad.net/python-tooz/+bug/1808046 > 70f144abde14e07d7f9620a2abb563ed16ef8c63 (in openstack/tooz) > > Thanks, > From mthode at mthode.org Tue May 28 15:02:34 2019 From: mthode at mthode.org (Matthew Thode) Date: Tue, 28 May 2019 10:02:34 -0500 Subject: [requirements][tooz] uncap grpcio In-Reply-To: <5bb280a1-efc5-627d-8652-e58b3fc2e6ce@nemebean.com> References: <20190528144030.iifm6omp2cwphgv7@mthode.org> <5bb280a1-efc5-627d-8652-e58b3fc2e6ce@nemebean.com> Message-ID: <20190528150234.ccsfmlrqzsy7gyvl@mthode.org> On 19-05-28 09:54:50, Ben Nemec wrote: > > > On 5/28/19 9:40 AM, Matthew Thode wrote: > > Hi all, > > > > Requirements is looking to raise the version of grpcio to 1.21.1. > > grpcio has been uncapped in requirements for six months now and > > currently only tooz is blocking the usage of newer versions. > > It hasn't been uncapped for six months. The commit was initially proposed to > gerrit six months ago, but it was only merged a couple of weeks ago: > https://review.opendev.org/#/c/625010/ > > We have a patch up to uncap grpcio in tooz: > https://review.opendev.org/#/c/659590 > > It was previously blocked by an unrelated issue with the gate jobs, but that > has since been fixed. I tweaked the patch to address the nits that had been > pointed out and it should be good to go now. > > > Following the unblocking we'd need a release in order to actually > > unblock things in gate. > > > > It looks like it was fixed in 1.18.0, if you need to mask version > > between (and including) 1.16.0 and 1.17.x feel free to submit the > > patches to global-requirements. > > The exclusions were part of the uncap patch in g-r. 
In tooz we're just > bumping the minimum to 1.18.0 to avoid problems if they would happen to do > another 1.16 or 1.17 bugfix release. > > > > > https://bugs.launchpad.net/python-tooz/+bug/1808046 > > 70f144abde14e07d7f9620a2abb563ed16ef8c63 (in openstack/tooz) > > > > Thanks, > > Cool, will be watching https://review.opendev.org/#/c/659590 -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From gergely.csatari at nokia.com Tue May 28 15:31:12 2019 From: gergely.csatari at nokia.com (Csatari, Gergely (Nokia - HU/Budapest)) Date: Tue, 28 May 2019 15:31:12 +0000 Subject: [ironic][edge]: Recap of PTG discussions Message-ID: Hi, There was a one hour discussion with Julia from Ironic with the Edge Computing Group [1]. In this mail I try to conclude what was discussed and ask some clarification questions. Current Ironic uses DHCP for hardware provisioning, therefore it requires DHCP relay enabled on the whole path to the edge cloud instances. There are two alternatives to solve this: 1) Virtual media support [2] where the ip configuration is embedded into a virtual image what is booted via the board management interface 2) Redfish support, however the state and support of redfish for host management is not clear. Is there already a specification has been added for redfish support? Upgrade of edge cloud infrastructures: - Firmware upgrade should be supported by Ironic. Is this something on its way or is this a new need? - Operating system and infra update can be solved using Fenix [3], however handling several edge cloud instances from a central location needs new features. Handling of failed servers: - A monitoring system or the operator should provide the input to mark a server as failed - Ironic can power down the failed servers and have the definition of a maintenance state - Discussed in [4] Additional ideas what we half discussed: - Running Ironic containers in a switch with the images hosted by Swift somewhere else. Are there any concerns about this idea? Any missing features from somewhere? [1]: https://etherpad.openstack.org/p/edge-wg-ptg-preparation-denver-2019 [2]: https://specs.openstack.org/openstack/ironic-specs/specs/not-implemented/L3-based-deployment.html [3]: https://wiki.openstack.org/wiki/Fenix [4]: http://lists.openstack.org/pipermail/edge-computing/2019-May/000582.html Br, Gerg0 From rtidwell at suse.com Tue May 28 15:35:07 2019 From: rtidwell at suse.com (Ryan Tidwell) Date: Tue, 28 May 2019 10:35:07 -0500 Subject: [neutron] [OVN] ML2+OVS+DVR convergence with OVN spec Message-ID: <85615a29-4b1b-bae5-2a14-4b625edf4f28@suse.com> Hello neutrinos, As discussed recently at the Denver PTG [1] and in the neutron drivers meeting last Friday May 24th [2], I have started on a spec for ML2+OVS and OVN convergence [3]. It is in very rough shape at the moment, but I have pushed a rough outline so this can be developed as collaboratively as possible starting now. I personally don't have all the information to fill out the spec right at the moment but I'm sure it can be found across the community of folks working on neutron and OVN, so please feel free to comment and add relevant information to the spec. 
-Ryan Tidwell [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006408.html [2] http://eavesdrop.openstack.org/meetings/neutron_drivers/2019/neutron_drivers.2019-05-24-14.00.log.txt [3] https://review.opendev.org/#/c/658414 From cboylan at sapwetik.org Tue May 28 16:11:44 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Tue, 28 May 2019 09:11:44 -0700 Subject: [tripleo] Configuration Management without Puppet In-Reply-To: References: Message-ID: On Mon, May 27, 2019, at 4:36 PM, Emilien Macchi wrote: > (First of all: thanks to Alex Schultz who initiated the discussion > during the last PTG). > > Context: > With the containerization of the services in TripleO, our usage of > Puppet is pretty much limited to laying down parameters into > configuration files. > Our long term goal is to reduce our number of tools in TripleO and > converge toward more Ansible and we are currently investigating how to > make it happen for Configuration Management, in a backward compatible > and non-disruptive way. > > Problems: > - Our current interface to configure things is very tight to Puppet and > Hiera, which would make it complicated to bind it with Hiera. Some of > us have tried (to create some Hiera module in Ansible) but clearly this > is not the path we want to take as far I know. > - We don't use the logic (generally) in the Puppet modules and again > only care about the configuration providers (in puppet-openstacklib and > for some services, templates files), so the Puppet modules now do too > much for what we need in TripleO. > snip I'm not sure if this is useful but what the Infra team did was to transplant all of its hiera data into /etc/ansible/hosts/host_vars and /etc/ansible/hosts/group_vars. Then we updated our puppetry to pull hiera data out of there. This means that puppet and ansible read the same data sources which has made transitioning things easy for us. Though I think the ansible-role-puppet role copies hiera data sources from /etc/ansible/hosts to the appropriate puppet hiera location on the remote source. But you mostly don't have to think about that and there is no double accounting from an ops perspective. Clark From donny at fortnebula.com Tue May 28 16:40:09 2019 From: donny at fortnebula.com (Donny Davis) Date: Tue, 28 May 2019 12:40:09 -0400 Subject: [tripleo] Configuration Management without Puppet In-Reply-To: References: Message-ID: Wouldn't it be easier just to use OSA once things get past laying down the OS and the relevant configs for the network interfaces. It seems like building it all in ansible has already been done, and much of the work could be used. I understand that both have different opinions on how to deploy openstack, but it would lower the lift in getting an ansible based deployment operational much sooner. On Tue, May 28, 2019 at 12:17 PM Clark Boylan wrote: > On Mon, May 27, 2019, at 4:36 PM, Emilien Macchi wrote: > > (First of all: thanks to Alex Schultz who initiated the discussion > > during the last PTG). > > > > Context: > > With the containerization of the services in TripleO, our usage of > > Puppet is pretty much limited to laying down parameters into > > configuration files. > > Our long term goal is to reduce our number of tools in TripleO and > > converge toward more Ansible and we are currently investigating how to > > make it happen for Configuration Management, in a backward compatible > > and non-disruptive way. 
> > > > Problems: > > - Our current interface to configure things is very tight to Puppet and > > Hiera, which would make it complicated to bind it with Hiera. Some of > > us have tried (to create some Hiera module in Ansible) but clearly this > > is not the path we want to take as far I know. > > - We don't use the logic (generally) in the Puppet modules and again > > only care about the configuration providers (in puppet-openstacklib and > > for some services, templates files), so the Puppet modules now do too > > much for what we need in TripleO. > > > > snip > > I'm not sure if this is useful but what the Infra team did was to > transplant all of its hiera data into /etc/ansible/hosts/host_vars and > /etc/ansible/hosts/group_vars. Then we updated our puppetry to pull hiera > data out of there. This means that puppet and ansible read the same data > sources which has made transitioning things easy for us. > > Though I think the ansible-role-puppet role copies hiera data sources from > /etc/ansible/hosts to the appropriate puppet hiera location on the remote > source. But you mostly don't have to think about that and there is no > double accounting from an ops perspective. > > Clark > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cboylan at sapwetik.org Tue May 28 16:45:21 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Tue, 28 May 2019 09:45:21 -0700 Subject: [tripleo] Configuration Management without Puppet In-Reply-To: References: Message-ID: <7186dc36-c547-4c35-8b05-ef030c714252@www.fastmail.com> On Tue, May 28, 2019, at 9:40 AM, Donny Davis wrote: > Wouldn't it be easier just to use OSA once things get past laying down > the OS and the relevant configs for the network interfaces. It seems > like building it all in ansible has already been done, and much of the > work could be used. I understand that both have different opinions on > how to deploy openstack, but it would lower the lift in getting an > ansible based deployment operational much sooner. Note sure if this was directed to my comments about what Infra has done or if you mean to suggest this for Tripleo. In the Infra case we do not deploy openstack. We deploy a ton of software on top of OpenStack so OSA isn't relevant for our uses. > > > > > On Tue, May 28, 2019 at 12:17 PM Clark Boylan wrote: > > On Mon, May 27, 2019, at 4:36 PM, Emilien Macchi wrote: > > > (First of all: thanks to Alex Schultz who initiated the discussion > > > during the last PTG). > > > > > > Context: > > > With the containerization of the services in TripleO, our usage of > > > Puppet is pretty much limited to laying down parameters into > > > configuration files. > > > Our long term goal is to reduce our number of tools in TripleO and > > > converge toward more Ansible and we are currently investigating how to > > > make it happen for Configuration Management, in a backward compatible > > > and non-disruptive way. > > > > > > Problems: > > > - Our current interface to configure things is very tight to Puppet and > > > Hiera, which would make it complicated to bind it with Hiera. Some of > > > us have tried (to create some Hiera module in Ansible) but clearly this > > > is not the path we want to take as far I know. > > > - We don't use the logic (generally) in the Puppet modules and again > > > only care about the configuration providers (in puppet-openstacklib and > > > for some services, templates files), so the Puppet modules now do too > > > much for what we need in TripleO. 
> > > > > > > snip > > > > I'm not sure if this is useful but what the Infra team did was to transplant all of its hiera data into /etc/ansible/hosts/host_vars and /etc/ansible/hosts/group_vars. Then we updated our puppetry to pull hiera data out of there. This means that puppet and ansible read the same data sources which has made transitioning things easy for us. > > > > Though I think the ansible-role-puppet role copies hiera data sources from /etc/ansible/hosts to the appropriate puppet hiera location on the remote source. But you mostly don't have to think about that and there is no double accounting from an ops perspective. > > > > Clark > > From donny at fortnebula.com Tue May 28 16:52:34 2019 From: donny at fortnebula.com (Donny Davis) Date: Tue, 28 May 2019 12:52:34 -0400 Subject: [tripleo] Configuration Management without Puppet In-Reply-To: <7186dc36-c547-4c35-8b05-ef030c714252@www.fastmail.com> References: <7186dc36-c547-4c35-8b05-ef030c714252@www.fastmail.com> Message-ID: I was directing this at 3O, but didn't want to jump into the middle of the thread. Thanks for helping to clarify. ~/D On Tue, May 28, 2019 at 12:48 PM Clark Boylan wrote: > On Tue, May 28, 2019, at 9:40 AM, Donny Davis wrote: > > Wouldn't it be easier just to use OSA once things get past laying down > > the OS and the relevant configs for the network interfaces. It seems > > like building it all in ansible has already been done, and much of the > > work could be used. I understand that both have different opinions on > > how to deploy openstack, but it would lower the lift in getting an > > ansible based deployment operational much sooner. > > Note sure if this was directed to my comments about what Infra has done or > if you mean to suggest this for Tripleo. In the Infra case we do not deploy > openstack. We deploy a ton of software on top of OpenStack so OSA isn't > relevant for our uses. > > > > > > > > > > > On Tue, May 28, 2019 at 12:17 PM Clark Boylan > wrote: > > > On Mon, May 27, 2019, at 4:36 PM, Emilien Macchi wrote: > > > > (First of all: thanks to Alex Schultz who initiated the discussion > > > > during the last PTG). > > > > > > > > Context: > > > > With the containerization of the services in TripleO, our usage of > > > > Puppet is pretty much limited to laying down parameters into > > > > configuration files. > > > > Our long term goal is to reduce our number of tools in TripleO and > > > > converge toward more Ansible and we are currently investigating how > to > > > > make it happen for Configuration Management, in a backward > compatible > > > > and non-disruptive way. > > > > > > > > Problems: > > > > - Our current interface to configure things is very tight to Puppet > and > > > > Hiera, which would make it complicated to bind it with Hiera. Some > of > > > > us have tried (to create some Hiera module in Ansible) but clearly > this > > > > is not the path we want to take as far I know. > > > > - We don't use the logic (generally) in the Puppet modules and > again > > > > only care about the configuration providers (in puppet-openstacklib > and > > > > for some services, templates files), so the Puppet modules now do > too > > > > much for what we need in TripleO. > > > > > > > > > > snip > > > > > > I'm not sure if this is useful but what the Infra team did was to > transplant all of its hiera data into /etc/ansible/hosts/host_vars and > /etc/ansible/hosts/group_vars. Then we updated our puppetry to pull hiera > data out of there. 
This means that puppet and ansible read the same data > sources which has made transitioning things easy for us. > > > > > > Though I think the ansible-role-puppet role copies hiera data sources > from /etc/ansible/hosts to the appropriate puppet hiera location on the > remote source. But you mostly don't have to think about that and there is > no double accounting from an ops perspective. > > > > > > Clark > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Tue May 28 17:05:41 2019 From: aschultz at redhat.com (Alex Schultz) Date: Tue, 28 May 2019 11:05:41 -0600 Subject: [tripleo] Configuration Management without Puppet In-Reply-To: References: Message-ID: On Tue, May 28, 2019 at 10:50 AM Donny Davis wrote: > Wouldn't it be easier just to use OSA once things get past laying down the > OS and the relevant configs for the network interfaces. It seems like > building it all in ansible has already been done, and much of the work > could be used. I understand that both have different opinions on how to > deploy openstack, but it would lower the lift in getting an ansible based > deployment operational much sooner. > > > > It's not easier because we then have to translate our current puppet/hiera configuration into their ansible vars. We're trying to come up with a generic construct that still allows end users to override specific configuration items without having to continue to use puppet to lay down the configuration. The proposed format maps more to what oslo.config needs and less about the implementation of what is ultimately doing the configuration writing. We still also need to support backwards compatibility to a certain extent. We unfortunately cannot just drop in OSA because we have our own opinionated assumptions about how we do containers/orchestration so for this conversation it's just about simplifying or reorganizing the configuration bits. In the longer term if we decided that we wanted to stop using configuration files and switch to "something else", we wouldn't then need to figure out how to rip out OSA (if they don't support the "something else"). Thanks, -Alex > > > On Tue, May 28, 2019 at 12:17 PM Clark Boylan > wrote: > >> On Mon, May 27, 2019, at 4:36 PM, Emilien Macchi wrote: >> > (First of all: thanks to Alex Schultz who initiated the discussion >> > during the last PTG). >> > >> > Context: >> > With the containerization of the services in TripleO, our usage of >> > Puppet is pretty much limited to laying down parameters into >> > configuration files. >> > Our long term goal is to reduce our number of tools in TripleO and >> > converge toward more Ansible and we are currently investigating how to >> > make it happen for Configuration Management, in a backward compatible >> > and non-disruptive way. >> > >> > Problems: >> > - Our current interface to configure things is very tight to Puppet and >> > Hiera, which would make it complicated to bind it with Hiera. Some of >> > us have tried (to create some Hiera module in Ansible) but clearly this >> > is not the path we want to take as far I know. >> > - We don't use the logic (generally) in the Puppet modules and again >> > only care about the configuration providers (in puppet-openstacklib and >> > for some services, templates files), so the Puppet modules now do too >> > much for what we need in TripleO. 
>> > >> >> snip >> >> I'm not sure if this is useful but what the Infra team did was to >> transplant all of its hiera data into /etc/ansible/hosts/host_vars and >> /etc/ansible/hosts/group_vars. Then we updated our puppetry to pull hiera >> data out of there. This means that puppet and ansible read the same data >> sources which has made transitioning things easy for us. >> >> Though I think the ansible-role-puppet role copies hiera data sources >> from /etc/ansible/hosts to the appropriate puppet hiera location on the >> remote source. But you mostly don't have to think about that and there is >> no double accounting from an ops perspective. >> >> Clark >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From dsneddon at redhat.com Tue May 28 17:23:40 2019 From: dsneddon at redhat.com (Dan Sneddon) Date: Tue, 28 May 2019 10:23:40 -0700 Subject: [ironic][edge]: Recap of PTG discussions In-Reply-To: References: Message-ID: On Tue, May 28, 2019 at 8:33 AM Csatari, Gergely (Nokia - HU/Budapest) < gergely.csatari at nokia.com> wrote: > Hi, > > There was a one hour discussion with Julia from Ironic with the Edge > Computing Group [1]. In this mail I try to conclude what was discussed and > ask some clarification questions. > > Current Ironic uses DHCP for hardware provisioning, therefore it requires > DHCP relay enabled on the whole path to the edge cloud instances. There are > two alternatives to solve this: > 1) Virtual media support [2] where the ip configuration is embedded into a > virtual image what is booted via the board management interface > 2) Redfish support, however the state and support of redfish for host > management is not clear. Is there already a specification has been added > for redfish support? > > Upgrade of edge cloud infrastructures: > - Firmware upgrade should be supported by Ironic. Is this something on > its way or is this a new need? > - Operating system and infra update can be solved using Fenix [3], > however handling several edge cloud instances from a central location needs > new features. > > Handling of failed servers: > - A monitoring system or the operator should provide the input to mark a > server as failed > - Ironic can power down the failed servers and have the definition of a > maintenance state > - Discussed in [4] > > Additional ideas what we half discussed: > - Running Ironic containers in a switch with the images hosted by Swift > somewhere else. Are there any concerns about this idea? Any missing > features from somewhere? > > [1]: https://etherpad.openstack.org/p/edge-wg-ptg-preparation-denver-2019 > [2]: > https://specs.openstack.org/openstack/ironic-specs/specs/not-implemented/L3-based-deployment.html > [3]: https://wiki.openstack.org/wiki/Fenix > [4]: > http://lists.openstack.org/pipermail/edge-computing/2019-May/000582.html > > Br, > Gerg0 > I have researched putting Ironic into a container on a switch. However, Ironic has historically required DHCP services, which would also be running inside the same container. In order to respond to DHCP requests, the container must be able to listen on the network for DHCP requests. Not all switches will allow a container which is attached directly to the VLAN interfaces and can receive traffic to the broadcast MAC address. If you have a switch which allows you to listen to MAC broadcasts inside a container then this should be feasible. Also note that Ironic is not monolithic. 
There are separate functions for API (which would not live on the switch), Ironic Inspector, and Ironic Conductor. When using DHCP for Ironic Inspection, Ironic provides DHCP using its own dnsmasq process. When using DHCP for deploying a node, the DHCP services are provided by Neutron. It would be best to avoid DHCP in this scenario so that neither inspection nor deployment required a DHCP server. However, as you note we are working on booting without DHCP, which will make it much easier to host an Ironic service inside a container on a switch or router. Without DHCP, the Ironic container must still be reachable from the outside, but only by its IP address. -- Dan Sneddon | Senior Principal Software Engineer dsneddon at redhat.com | redhat.com/cloud dsneddon:irc | @dxs:twitter -------------- next part -------------- An HTML attachment was scrubbed... URL: From mthode at mthode.org Tue May 28 17:27:41 2019 From: mthode at mthode.org (Matthew Thode) Date: Tue, 28 May 2019 12:27:41 -0500 Subject: [requirements][nova][neutron] updating sqlalchemy to 1.3.4 Message-ID: <20190528172741.rj3zokt352dm5n2o@mthode.org> so, there are no requirements type reasons why we can't merge this now, but functional tests seem to be failing for nova/neutron (tempest). Would those teams be able to test? https://review.opendev.org/651591 is the test patch (now updated for 1.3.4) Some failures show up here as well. http://logs.openstack.org/39/661539/1/check/tempest-full/b127828/testr_results.html.gz -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From blake at platform9.com Tue May 28 17:41:12 2019 From: blake at platform9.com (Blake Covarrubias) Date: Tue, 28 May 2019 10:41:12 -0700 Subject: [neutron] set/find dhcp-server address In-Reply-To: <4350515b-e5f1-f9c0-7f4a-ccd2ff9541c2@gmx.com> References: <5f65989d-1616-a5f5-1c90-a5f6e6e364fe@gmx.com> <2e8c680b-d392-6468-bcc2-44449cc30084@gmx.com> <3954208F-442A-4519-AAEE-80AB6E5C15B2@redhat.com> <4350515b-e5f1-f9c0-7f4a-ccd2ff9541c2@gmx.com> Message-ID: <22F66DD7-57FC-46EC-8AFD-668620195A3F@platform9.com> Hi Volodymyr, There is an undocumented process to reserve IP addresses for use by DHCP servers [1]. It works by creating port with the desired IP and a device owner of “reserved_dhcp_port." openstack port create dhcp-port1 --network --fixed-ip subnet=,ip-address=192.0.2.2 --device reserved_dhcp_port You can either create the port prior to enabling DHCP on the subnet, or delete an existing DHCP port after creating the reserved port. In either case Neutron should utilize the reserved port on subsequent DHCP agent creation. Does this satisfy your use-case? [1] https://github.com/openstack/neutron/blob/31fd237/neutron/agent/linux/dhcp.py#L1417-L1419 — Blake Covarrubias > On May 27, 2019, at 6:44 AM, Volodymyr Litovka wrote: > > Hi Slawomir, > > yes, thanks, it works: > neutron.list_ports(retrieve_all=False, > network_id='2697930d-65f2-4a7a-b360-91d75cc8750d', > device_owner='network:dhcp') > Thank you. > > On 5/26/19 10:07 AM, Slawomir Kaplonski wrote: >> Hi, >> >> If You do something like: >> >> openstack port list --network d79eea02-31dc-45c7-bd48-d98af46fd2d5 --device-owner network:dhcp >> >> Then You will get only dhcp ports from specific network. And Fixed IP are by default displayed on this list. Is this enough “workaround” for You? 
>> >> >>> On 25 May 2019, at 22:19, Volodymyr Litovka wrote: >>> >>> Hi, >>> >>> it seems I wasn't first who asked for this - https://wiki.openstack.org/wiki/Neutron/enable-to-set-dhcp-port-attributes and it seems there was no progress on this? >>> >>> Is it possible to at least include DHCP address in output of 'subnet show' API call? >>> >>> The shortest way I've found is: >>> * openstack port list --project ... --device-owner network:dhcp >>> and then for **every port** in resulting list >>> * openstack port show >>> in order to extract 'Fixed IP Addresses' attribute for analysis >>> >>> Too much calls, isn't it? >>> >>> On 5/25/19 9:22 PM, Volodymyr Litovka wrote: >>>> Dear colleagues, >>>> >>>> is there way to explicitly assign DHCP address when creating subnet? The issue is that it isn't always first address from allocation pool, e.g. >>>> $ openstack port list >>>> +--------------------------------------+-------+-------------------+------------------------------------------------------------------------------+--------+ >>>> | ID | Name | MAC Address | Fixed IP Addresses | Status | >>>> +--------------------------------------+-------+-------------------+------------------------------------------------------------------------------+--------+ >>>> | 0897bcc4-6cad-479c-8743-ca7cc5a57271 | | 72:d0:1c:d1:6b:51 | ip_address='172.16.53.3', subnet_id='20329549-124c-484d-8278-edca9829e262' | ACTIVE | >>>> | | | | ip_address='172.16.54.2', subnet_id='07249cd3-11a9-4da7-a4db-bd838aa8c4e7' | | >>>> >>>> both subnet have similar configuration of allocation pool (172.16.xx.2-254/24) and there are two different addresses for DHCP in every subnet. >>>> >>>> This makes a trouble during project generation with pre-assigned addresses for servers if the pre-assigned address is same as [surprisigly, non-first] address of DHCP namespace. >>>> >>>> And, may be, there is a way to determine this address in more simple way than looking into 'openstack port list' output, searching for port (a) without name and (b) with multiple addresses from all belonging subnets :) At the moment, 'openstack subnet show' say nothing about assigned DHCP-address. >>>> >>>> Thank you! >>>> >>>> -- >>>> Volodymyr Litovka >>>> "Vision without Execution is Hallucination." -- Thomas Edison >>>> >>> -- >>> Volodymyr Litovka >>> "Vision without Execution is Hallucination." -- Thomas Edison >>> >> — >> Slawek Kaplonski >> Senior software engineer >> Red Hat >> >> > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." -- Thomas Edison -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Tue May 28 18:15:58 2019 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 28 May 2019 14:15:58 -0400 Subject: [tripleo] Configuration Management without Puppet In-Reply-To: References: Message-ID: On Tue, May 28, 2019 at 12:24 PM Clark Boylan wrote: [...] > I'm not sure if this is useful but what the Infra team did was to > transplant all of its hiera data into /etc/ansible/hosts/host_vars and > /etc/ansible/hosts/group_vars. Then we updated our puppetry to pull hiera > data out of there. This means that puppet and ansible read the same data > sources which has made transitioning things easy for us. > Yes this is useful to know, thanks for sharing. 
In our case, we are not so worried about the source of data (it can be anywhere really); but our challenge is how we manage the data in our Heat templates and how the way it is exposed to our end-users will remain backward compatible. Example: Right now, if you want to override parameter DEFAULT/foo in Glance API config file you would do: ControllerExtraConfig: glance::config::glance_api_config: DEFAULT/foo: value: bar With the new interface proposal it would be: ControllerExtraConfig: glance_api_config: DEFAULT: foo: bar And the second challenge here is to maintain the hierarchies that we have between layers of configuration (per service, per role, per nodes, etc), knowing that Heat doesn't have a hash deep_merge function (yet) AFIK (happy to be corrected here). I think we can try to keep hiera now (which provides a nice way to do parameters hierarchy) and keep the focus on stop using the Puppet OpenStack modules logic, just build a new interface which directly lays down the configuration (with Puppet now but something else later). -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From thiagocmartinsc at gmail.com Tue May 28 19:07:26 2019 From: thiagocmartinsc at gmail.com (=?UTF-8?B?TWFydGlueCAtIOOCuOOCp+ODvOODoOOCug==?=) Date: Tue, 28 May 2019 15:07:26 -0400 Subject: [cinder] ceph multiattach details? In-Reply-To: References: <2569f62a-1d30-2468-b019-06d99a819f82@gmail.com> Message-ID: Last time I checked OpenStack Ansible, Manila wasn't there... I believe that they added support for it in Stein but, I'm not sure if it supports CephFS backend (and the required Ceph Metadata Containers, since CephFS needs if, I believe). I'll definitely give it a try! Initially, I was planning to multi-attach an RDB block device against 2 or more Instances and run OCFS2 on top of it but, Manila with CephFS looks way simpler. Cheers! Thiago On Mon, 27 May 2019 at 22:30, Erik McCormick wrote: > > > On Mon, May 27, 2019, 5:33 PM Martinx - ジェームズ > wrote: > >> Hello, >> >> I'm very curious about this as well! >> >> It would be awesome to support Cinder multi-attach when using Ceph... If >> the code is already there, how to use it?! >> >> Cheers, >> Thiago >> >> On Mon, 27 May 2019 at 03:52, Bernd Bausch wrote: >> >>> The Stein release notes mention that the RBD driver now supports >>> multiattach, but i have not found any details. Are there limitations? Is >>> there a need to configure anything? >>> >>> In the RBD driver >>> , >>> I find this: >>> >>> def _enable_multiattach(self, volume): >>> multipath_feature_exclusions = [ >>> self.rbd.RBD_FEATURE_JOURNALING, >>> self.rbd.RBD_FEATURE_FAST_DIFF, >>> self.rbd.RBD_FEATURE_OBJECT_MAP, >>> self.rbd.RBD_FEATURE_EXCLUSIVE_LOCK, >>> ] >>> >>> This seems to mean that journaling and other features (to me, it's not >>> quite clear what they are) will be automatically disabled when switching on >>> multiattachment. >>> >>> Further down in the code I see that replication and multiattach are >>> mutually exclusive. >>> >>> Is there some documentation about the Ceph multiattach feature, even an >>> email thread? >>> >>> Thanks, >>> >>> Bernd >>> >> > There isn't really a Ceph multi-attach feature using Cinder. The code > comment is stating that, while the Openstack side of things is in place, > Ceph doesn't yet support it with RBD due to replication issues with > multiple clients. The Ceph community is aware of it, but has thus far > focused on CephFS as the shared file system instead. 
> > This could possibly be used with the NFS Cinder driver talking to Ganesha > with CephFS mounted. You may also want to look at Openstack's Manilla > project to orchestrate that. > > -Erik > >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Tue May 28 19:19:08 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 28 May 2019 14:19:08 -0500 Subject: [watcher] Question about baremetal node support in nova CDM In-Reply-To: <201905270952159932327@zte.com.cn> References: <201905270952159932327@zte.com.cn> Message-ID: <9eb193c0-1313-2ce4-649d-cc01b04cb285@gmail.com> On 5/26/2019 8:52 PM, li.canwei2 at zte.com.cn wrote: > refer to the comment by hidekazu, there is the case 1:M host:nodes when > VMware vCenter driver is used. As mentioned in the commit message, this hasn't been the case for the vCenter driver in nova since kilo: https://review.opendev.org/#/c/103916/ -- Thanks, Matt From noonedeadpunk at ya.ru Tue May 28 20:11:47 2019 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Tue, 28 May 2019 23:11:47 +0300 Subject: [cinder] ceph multiattach details? In-Reply-To: References: <2569f62a-1d30-2468-b019-06d99a819f82@gmail.com> Message-ID: <499521559074307@myt1-06117f29c1ea.qloud-c.yandex.net> Hi, Yep, OSA has manila support since stein: https://docs.openstack.org/openstack-ansible/latest/contributor/testing.html#maturity-matrix So you may give it a try. Feedback regarding the role is highly appreciated:) 28.05.2019, 22:13, "Martinx - ジェームズ" : > Last time I checked OpenStack Ansible, Manila wasn't there... I believe that they added support for it in Stein but, I'm not sure if it supports CephFS backend (and the required Ceph Metadata Containers, since CephFS needs if, I believe). > > I'll definitely give it a try! > > Initially, I was planning to multi-attach an RDB block device against 2 or more Instances and run OCFS2 on top of it but, Manila with CephFS looks way simpler. > > Cheers! > Thiago > > On Mon, 27 May 2019 at 22:30, Erik McCormick wrote: >> On Mon, May 27, 2019, 5:33 PM Martinx - ジェームズ wrote: >>> Hello, >>> >>>  I'm very curious about this as well! >>> >>>  It would be awesome to support Cinder multi-attach when using Ceph... If the code is already there, how to use it?! >>> >>> Cheers, >>> Thiago >>> >>> On Mon, 27 May 2019 at 03:52, Bernd Bausch wrote: >>>> The Stein release notes mention that the RBD driver now supports multiattach, but i have not found any details. Are there limitations? Is there a need to configure anything? >>>> >>>> In the RBD driver, I find this: >>>> >>>>    def _enable_multiattach(self, volume):        multipath_feature_exclusions = [            self.rbd.RBD_FEATURE_JOURNALING,            self.rbd.RBD_FEATURE_FAST_DIFF,            self.rbd.RBD_FEATURE_OBJECT_MAP,            self.rbd.RBD_FEATURE_EXCLUSIVE_LOCK,        ] >>>> >>>> This seems to mean that journaling and other features (to me, it's not quite clear what they are) will be automatically disabled when switching on multiattachment. >>>> >>>> Further down in the code I see that replication and multiattach are mutually exclusive. >>>> >>>> Is there some documentation about the Ceph multiattach feature, even an email thread? >>>> >>>> Thanks, >>>> >>>> Bernd >> >> There isn't really a Ceph multi-attach feature using Cinder. The code comment is stating that, while the Openstack side of things is in place, Ceph doesn't yet support it with RBD due to replication issues with multiple clients. 
The Ceph community is aware of it, but has thus far focused on CephFS as the shared file system instead. >> >> This could possibly be used with the NFS Cinder driver talking to Ganesha with CephFS mounted. You may also want to look at Openstack's Manilla project to orchestrate that. >> >> -Erik >>>> --  Kind Regards, Dmitriy Rabotyagov From corey.bryant at canonical.com Tue May 28 20:27:51 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Tue, 28 May 2019 16:27:51 -0400 Subject: [charms] Proposing Sahid Orentino Ferdjaoui to the Charms core team In-Reply-To: <17abd9ed-e76d-52b3-29b1-6d6ae75161bf@canonical.com> References: <17abd9ed-e76d-52b3-29b1-6d6ae75161bf@canonical.com> Message-ID: On Fri, May 24, 2019 at 6:35 AM Chris MacNaughton < chris.macnaughton at canonical.com> wrote: > Hello all, > > I would like to propose Sahid Orentino Ferdjaoui as a member of the Charms > core team. > +1 Sahid is a solid contributor and I'm confident he'll use caution, ask questions, and pull the right people in if needed. Corey > Chris MacNaughton > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cboylan at sapwetik.org Tue May 28 20:51:07 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Tue, 28 May 2019 13:51:07 -0700 Subject: Gerrit Downtime May 31, 2019 beginning 1500UTC Message-ID: <62b711c9-7e0e-4220-8769-29cf2cfad0aa@www.fastmail.com> Hello everyone, We'll be taking a short (hopefully no longer than an hour) Gerrit downtime so that we can rename some projects this Friday, May 31, 2019 beginning at 1500UTC. Some of these renames are cleanups after the great OpenDev git migration and others aren't, but all of them require us to stop Gerrit to update the database and contents on disk. If you'd like specifics on what repos are being renamed or what the process looks like, you can follow along at: https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Upcoming_Project_Renames https://etherpad.openstack.org/p/project-renames-2019-05-31 As always feel free to bring up questions or concerns and we'll do our best to answer/address them. Thank you for your patience and sorry for the interruption, Clark From Hunter90960 at mail.com Tue May 28 22:46:30 2019 From: Hunter90960 at mail.com (Hunter Nins) Date: Wed, 29 May 2019 00:46:30 +0200 Subject: [barbican] dev: Using barbican in media center/box product Message-ID: An HTML attachment was scrubbed... URL: From Hunter90960 at mail.com Tue May 28 22:49:21 2019 From: Hunter90960 at mail.com (Hunter Nins) Date: Wed, 29 May 2019 00:49:21 +0200 Subject: [barbican] dev: Using Barbican for media box/center unattended cert, key (KEK) updates, etc. Message-ID: Hello, Pardon: the first attempt was html-formatted. I am currently working on a customized media center/box product for my employer. It's basically a Raspberry Pi 3b+ running Raspian, configured to auto-update periodically via `apt`. The device accesses binaries for proprietary applications via a private, secured `apt` repo, using a pre-installed certificate on the device. Right now, the certificate on the device is set to never expire, but going forward, we'd like to configure the certificate to expire every 4 months. We also plan to deploy a unique certificate per device we ship, so the certs can be revoked if the tamper mechanism is triggered (i.e. if the customer rips open the box, it blows a fuse that is attached to a ADC chip, and the device reports in s/w that the sensor has been tripped). 
Finally, we anticipate some customers leaving the device offline, and only updating once every year (allowing for time for the cert to expire). Is there a way I could use Barbican to: * Update the certs for apt-repo access on the device periodically. * Setup key-encryption-keys (KEK) on the device, if we need the device to be able to download sensitive data, such as an in-memory cached copy of customer info. * Provide a mechanism for a new key to be deployed on the device if the currently-used key has expired (i.e. the user hasn't connected the device to the internet for more than 4 months). * Allow keys to be tracked, revoked, and de-commissioned. Thank you for your time and assistance. From absubram at cisco.com Tue May 28 22:56:50 2019 From: absubram at cisco.com (Abishek Subramanian (absubram)) Date: Tue, 28 May 2019 22:56:50 +0000 Subject: [ironic][inspector] Can inspector create port groups or add a port group to the baremetal ports? Message-ID: <7C4F5599-6CDD-43AA-8E05-61823A907A2C@contoso.com> Hi all, I’m looking at port-groups/ bonding options with ironic and I’ve looked at this documentation - https://docs.openstack.org/ironic/rocky/admin/portgroups.html Everything listed there checks out and works when manually attempted. The next thing I’d like to do is see if the port group creation and addition of the port group to the baremetal ports can be done via inspector. Is this feature available? If not, once I create the port group and then run inspection, will the inspector apply the port group to any baremetal ports that it creates? Appreciate any help with this regard. Thanks! Abishek -------------- next part -------------- An HTML attachment was scrubbed... URL: From kaifeng.w at gmail.com Wed May 29 02:30:22 2019 From: kaifeng.w at gmail.com (Kaifeng Wang) Date: Wed, 29 May 2019 10:30:22 +0800 Subject: [ironic][inspector] Can inspector create port groups or add a port group to the baremetal ports? In-Reply-To: <7C4F5599-6CDD-43AA-8E05-61823A907A2C@contoso.com> References: <7C4F5599-6CDD-43AA-8E05-61823A907A2C@contoso.com> Message-ID: Hi Abishek, Abishek Subramanian (absubram) 于2019年5月29日 周三上午7:01写道: > Hi all, > > > > I’m looking at port-groups/ bonding options with ironic and I’ve looked at > this documentation - > https://docs.openstack.org/ironic/rocky/admin/portgroups.html > > > > > > Everything listed there checks out and works when manually attempted. > > The next thing I’d like to do is see if the port group creation and > addition of the port group to the baremetal ports can be done via inspector. > > > > Is this feature available? > AFAIK inspector doesn’t support port group enrollment or update at the moment. If not, once I create the port group and then run inspection, will the > inspector apply the port group to any baremetal ports that it creates? > > > This is not supported as well, even if you have pre-created a port group, it’s a per node value, thus can’t be used in a generic way. There are simple rules support in the inspector, but I suspect it could be done via rules. The caveat is the combination of ports and mode is flexible, and this is not something can be probed during inspection. Maybe we’ll need to write a plugin to get some predefined support, e.g., create a port group with mode 0 for all active ports. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amansi26 at in.ibm.com Wed May 29 05:01:54 2019 From: amansi26 at in.ibm.com (Aman Kumar Sinha26) Date: Wed, 29 May 2019 05:01:54 +0000 Subject: PowerVM-CI is failing while stacking devstack Message-ID: An HTML attachment was scrubbed... URL: From absubram at cisco.com Wed May 29 05:19:08 2019 From: absubram at cisco.com (Abishek Subramanian (absubram)) Date: Wed, 29 May 2019 05:19:08 +0000 Subject: [ironic][inspector] Can inspector create port groups or add a port group to the baremetal ports? In-Reply-To: References: <7C4F5599-6CDD-43AA-8E05-61823A907A2C@contoso.com> Message-ID: <5977EE55-4788-4CAF-A77A-64DC0A1D8968@cisco.com> Thank you Kaifeng! From: Kaifeng Wang Date: Tuesday, May 28, 2019 at 7:30 PM To: Abishek Subramanian Cc: "openstack-discuss at lists.openstack.org" Subject: Re: [ironic][inspector] Can inspector create port groups or add a port group to the baremetal ports? Hi Abishek, Abishek Subramanian (absubram) >于2019年5月29日 周三上午7:01写道: Hi all, I’m looking at port-groups/ bonding options with ironic and I’ve looked at this documentation - https://docs.openstack.org/ironic/rocky/admin/portgroups.html Everything listed there checks out and works when manually attempted. The next thing I’d like to do is see if the port group creation and addition of the port group to the baremetal ports can be done via inspector. Is this feature available? AFAIK inspector doesn’t support port group enrollment or update at the moment. If not, once I create the port group and then run inspection, will the inspector apply the port group to any baremetal ports that it creates? This is not supported as well, even if you have pre-created a port group, it’s a per node value, thus can’t be used in a generic way. There are simple rules support in the inspector, but I suspect it could be done via rules. The caveat is the combination of ports and mode is flexible, and this is not something can be probed during inspection. Maybe we’ll need to write a plugin to get some predefined support, e.g., create a port group with mode 0 for all active ports. -------------- next part -------------- An HTML attachment was scrubbed... URL: From renat.akhmerov at gmail.com Wed May 29 07:00:29 2019 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Wed, 29 May 2019 14:00:29 +0700 Subject: [mistral] Office hours in 1 hour at #openstack-mistral In-Reply-To: <735dc87c-efa8-4804-a3ff-7ed506884249@Spark> References: <735dc87c-efa8-4804-a3ff-7ed506884249@Spark> Message-ID: Hi, We’ll have an Office Hours session in 1 hour at #openstack-mistral. Welcome to join. Thanks Renat Akhmerov @Nokia -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtantsur at redhat.com Wed May 29 07:06:23 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Wed, 29 May 2019 09:06:23 +0200 Subject: [ironic][inspector] Can inspector create port groups or add a port group to the baremetal ports? In-Reply-To: <7C4F5599-6CDD-43AA-8E05-61823A907A2C@contoso.com> References: <7C4F5599-6CDD-43AA-8E05-61823A907A2C@contoso.com> Message-ID: Hi, On 5/29/19 12:56 AM, Abishek Subramanian (absubram) wrote: > Hi all, > > I’m looking at port-groups/ bonding options with ironic and I’ve looked at this > documentation - https://docs.openstack.org/ironic/rocky/admin/portgroups.html > > Everything listed there checks out and works when manually attempted. > > The next thing I’d like to do is see if the port group creation and addition of > the port group to the baremetal ports can be done via inspector. 
> > Is this feature available? You can probably write a processing hook (ironic-inspector plugin) to do that for you. But there is nothing out-of-box. > > If not, once I create the port group and then run inspection, will the inspector > apply the port group to any baremetal ports that it creates? No. Dmitry > > Appreciate any help with this regard. > > Thanks! > > Abishek > From bcafarel at redhat.com Wed May 29 08:26:01 2019 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Wed, 29 May 2019 10:26:01 +0200 Subject: [requirements][nova][neutron] updating sqlalchemy to 1.3.4 In-Reply-To: <20190528172741.rj3zokt352dm5n2o@mthode.org> References: <20190528172741.rj3zokt352dm5n2o@mthode.org> Message-ID: Replying to list for visibility On Tue, 28 May 2019 at 19:31, Matthew Thode wrote: > so, there are no requirements type reasons why we can't merge this now, > but functional tests seem to be failing for nova/neutron (tempest). > Would those teams be able to test? > > https://review.opendev.org/651591 is the test patch (now updated for > 1.3.4) > > Some failures show up here as well. > > > http://logs.openstack.org/39/661539/1/check/tempest-full/b127828/testr_results.html.gz Thanks for filling the neutron-lib new version tag [0], it should be indeed enough for sqlalchemy 1.3 support (thanks to ralonsoh) For interested folks, [1] is the relevant fix in neutron-lib [0] https://review.opendev.org/#/c/661839 / [1] https://review.opendev.org/#/c/651584/ -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From aj at suse.com Wed May 29 08:26:22 2019 From: aj at suse.com (Andreas Jaeger) Date: Wed, 29 May 2019 10:26:22 +0200 Subject: [infra] Retire x/pbrx and openstack-infra/opendev-website-repos Message-ID: We will retire x/pbrx and openstack-infra/opendev-website-repos: pbrx was an attempt at container building tools which has been superceded by the zuul image building jobs and Dockerfiles. opendev-website is no longer needed since the content is directly served by gitea. First change for this: https://review.opendev.org/661910 -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From rico.lin.guanyu at gmail.com Wed May 29 08:51:09 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 29 May 2019 16:51:09 +0800 Subject: [auto-scaling] SIG meeting in 10 mins, welcome to join us Message-ID: Dear all We will have SIG meeting in 10 mins, so welcome to join us in #openstack-auto-scaling channel -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From info at dantalion.nl Wed May 29 09:10:10 2019 From: info at dantalion.nl (info at dantalion.nl) Date: Wed, 29 May 2019 11:10:10 +0200 Subject: [watcher] Question about baremetal node support in nova CDM In-Reply-To: <89845749-876d-e7ac-bd49-9c262ed56e43@gmail.com> References: <53585cb2-a207-58ff-588a-6c9694f8245f@dantalion.nl> <89845749-876d-e7ac-bd49-9c262ed56e43@gmail.com> Message-ID: If the hypervisor_type is only available in the nova cell DB how does the OpenStackClient determine the hypervisor type? If I execute 'openstack hypervisor list' then the Hypervisor Type column is visible and correctly distinguishes between ironic and qemu hyervisors. 
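Assuming that column really is coming back from GET /os-hypervisors/detail (which is the call behind 'openstack hypervisor list'), filtering on it from Python would look roughly like this untested openstacksdk sketch; the cloud name is a placeholder:

    import openstack

    conn = openstack.connect(cloud='mycloud')   # placeholder clouds.yaml entry

    # GET /os-hypervisors/detail is what "openstack hypervisor list" uses;
    # each record carries hypervisor_type, so bare metal nodes can be skipped.
    for hv in conn.compute.hypervisors(details=True):
        if hv.hypervisor_type == 'ironic':
            continue
        print(hv.name, hv.hypervisor_type)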
On 5/24/19 9:27 PM, Matt Riedemann wrote: > On 5/24/2019 1:39 AM, info at dantalion.nl wrote: >> I think we should look into if bare metal >> nodes are stored in the compute_model as I think it would more sense to >> filter them out. > > The tricky thing with this would be there isn't a great way to identify > a baremetal node from a kvm node, for example. There is a > hypervisor_type column on the compute_nodes table in the nova cell DB, > but it's not exposed in the API. > > Two obvious differences would be: > > 1. The hypervisor_hostname on an ironic node in the os-hypervisors API > is a UUID rather than a normal hostname. That could be one way to try > and identify an ironic node (hypervisor). > > 2. For servers, the associated flavor should have a CUSTOM resource > class extra spec associated with it and the VCPU, DISK_GB, and MEMORY_MB > resource classes should also be zero'ed out in the flavor per [1]. The > server OS-EXT-SRV-ATTR:hypervisor_hostname field would also be a UUID > like above (the UUID is the ironic node ID). > > [1] > https://docs.openstack.org/ironic/latest/install/configure-nova-flavors.html > > From li.canwei2 at zte.com.cn Wed May 29 09:32:00 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Wed, 29 May 2019 17:32:00 +0800 (CST) Subject: =?UTF-8?B?UmU6W3dhdGNoZXJdIFF1ZXN0aW9uIGFib3V0IGJhcmVtZXRhbCBub2RlIHN1cHBvcnQgaW4gbm92YSBDRE0=?= In-Reply-To: <9eb193c0-1313-2ce4-649d-cc01b04cb285@gmail.com> References: 201905270952159932327@zte.com.cn, 9eb193c0-1313-2ce4-649d-cc01b04cb285@gmail.com Message-ID: <201905291732005257968@zte.com.cn> Thank you Matt, then I think we can remove the first API call. 主 题 :Re: [watcher] Question about baremetal node support in nova CDM On 5/26/2019 8:52 PM, li.canwei2 at zte.com.cn wrote: > refer to the comment by hidekazu, there is the case 1:M host:nodes when > VMware vCenter driver is used. As mentioned in the commit message, this hasn't been the case for the vCenter driver in nova since kilo: https://review.opendev.org/#/c/103916/ -- Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From vmarkov at mirantis.com Wed May 29 14:29:20 2019 From: vmarkov at mirantis.com (Vadym Markov) Date: Wed, 29 May 2019 17:29:20 +0300 Subject: [dev][horizon][docs][i18n]Horizon Taiwanese locale issue Message-ID: Hi, There is an issue with Taiwanese locale at least in Horizon. It would be good to get a feedback from i18n team according to OpenStack docs too. Speaking about Horizon, Chinese locales were renamed in Django 1.7 [1]. Zh-cn became zh-Hans, and zh-tw became zh-Hant. Old names were marked as deprecated and finally removed in Django 1.11. Current supported Django versions are 1.11 and 2.0 [2] It is not an issue for Chinese because django locale middleware does silent fallback from zh-cn to zh-Hans. But for Taiwanese it treats zh-tw as some unknown variant of zh-* and also does fallback to default zh-cn, which is wrong. Reported bug: https://bugs.launchpad.net/horizon/+bug/1830886 Proposed solution is to migrate to new locale names everywhere in Horizon. Here are proposed steps to fix this issue: 1. Zanata jobs reconfiguration 2. Django-related backend stuff. Biggest, but easiest part because it is well documented 3. AngularJS. Quite old version is used, so we can hit some well forgotten issues. Also, testing is complicated. 4. Horizon.js components. It is the smallest part Other question will be thorougful testing of the bugfix. 
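As a rough illustration of step 2, the settings-level part of the rename is small; this is only a sketch, not an actual Horizon patch, and the label strings are illustrative:

    # Django >= 1.11 only recognises the renamed Chinese locale codes,
    # so the settings move from zh-cn / zh-tw to zh-hans / zh-hant.
    # The corresponding catalog directories become locale/zh_Hans and
    # locale/zh_Hant.
    LANGUAGES = (
        # ... other locales unchanged ...
        ('zh-hans', 'Chinese (Simplified)'),
        ('zh-hant', 'Chinese (Traditional)'),
    )

The larger effort is steps 1, 3 and 4, where the old codes are baked into job configuration and JavaScript.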
[1] https://docs.djangoproject.com/en/dev/releases/1.7/#language-codes-zh-cn-zh-tw-and-fy-nl [2] https://docs.openstack.org/horizon/latest/install/system-requirements.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed May 29 14:53:52 2019 From: gmann at ghanshyammann.com (gmann at ghanshyammann.com) Date: Wed, 29 May 2019 23:53:52 +0900 Subject: [nova] API updates week 19-22 Message-ID: <16b04154bbb.12990f3e52722.158470004843561321@ghanshyammann.com> Hi All, Please find the Nova API updates of this week. Weekly Office Hour: =============== Canceled the API office hours as no response on ML - http://lists.openstack.org/pipermail/openstack-discuss/2019-March/004336.html API Related BP : ============ Code Ready for Review: ------------------------------ 1. Support adding description while locking an instance: - Topic: https://review.opendev.org/#/q/topic:bp/add-locked-reason+(status:open+OR+status:merged) - Weekly Progress: Only OSC patch is not merged (all other code is merged). waiting for python-novaclient version update after release. 2. Add host and hypervisor_hostname flag to create server - Topic: https://review.opendev.org/#/q/topic:bp/add-host-and-hypervisor-hostname-flag-to-create-server+(status:open+OR+status:merged) - Weekly Progress: Code is up for review 3. Detach and attach boot volumes: - Topic: https://review.openstack.org/#/q/topic:bp/detach-boot-volume+(status:open+OR+status:merged) - Weekly Progress: Spec is merged but the code is in the merge-conflict state. need rebase and update the patch. Spec Ready for Review: ----------------------------- 1. Nova API policy improvement - Spec: https://review.openstack.org/#/c/547850/ - PoC: https://review.openstack.org/#/q/topic:bp/policy-default-refresh+(status:open+OR+status:merged) - Weekly Progress: Spec has been updated with Train PTG discussion. Ready to re-review. 2. Support for changing deleted_on_termination after boot -Spec: https://review.openstack.org/#/c/580336/ - Weekly Progress: lyarwood commented on spec about the PTG discussion and mentioned to update the spec accordingly. 3. Nova API cleanup - Spec: https://review.openstack.org/#/c/603969/ - Weekly Progress: Spec has been updated with Train PTG discussion. Alex is +2 on this, need another +2. 4. Specifying az when restore shelved server - Spec: https://review.openstack.org/#/c/624689/ - Weekly Progress: No updates on this. 5. default value of swap in the show flavor details API - Spec: https://review.openstack.org/#/c/648919/ - Weekly Progress: This has been merged in API cleanup Spec(603969) 6. Support delete_on_termination in volume attach api -Spec: https://review.openstack.org/#/c/612949/ - Weekly Progress: Ready for review. 7. Add API ref guideline for body text - ~8 api-ref are left to fix. Previously approved Spec needs to be re-proposed for Train: --------------------------------------------------------------------------- 1. Servers Ips non-unique network names : - https://blueprints.launchpad.net/nova/+spec/servers-ips-non-unique-network-names - https://review.openstack.org/#/q/topic:bp/servers-ips-non-unique-network-names+(status:open+OR+status:merged) 2. Volume multiattach enhancements: - https://blueprints.launchpad.net/nova/+spec/volume-multiattach-enhancements - https://review.openstack.org/#/q/topic:bp/volume-multiattach-enhancements+(status:open+OR+status:merged) Bugs: ==== No progress report in this week. 
NOTE- There might be some bug which are not tagged as 'api' or 'api-ref', those are not in above list. Tag such bugs so that we can keep our eyes. -gmann From aj at suse.com Wed May 29 15:35:30 2019 From: aj at suse.com (Andreas Jaeger) Date: Wed, 29 May 2019 17:35:30 +0200 Subject: [OpenStack-I18n] [dev][horizon][docs][i18n]Horizon Taiwanese locale issue In-Reply-To: References: Message-ID: <0ffdf54c-e0c2-c23a-a09a-90ddd79fe839@suse.com> On 29/05/2019 16.29, Vadym Markov wrote: > Hi, > > > There is an issue with Taiwanese locale at least in Horizon. It would be > good to get a feedback from i18n team according to OpenStack docs too. > > > Speaking about Horizon, Chinese locales were renamed in Django 1.7 [1]. > Zh-cn became zh-Hans, and zh-tw became zh-Hant. Old names were marked as > deprecated and finally removed in Django 1.11. Current supported Django > versions are 1.11 and 2.0 [2] What does this mean for stable branches? Is that a change only for master (= train) or also for rocky and stein (those two are the only branches under active translation)? What is the situation about documents that are translated via sphinx? Is there a similar change needed? > > > It is not an issue for Chinese because django locale middleware does > silent fallback from zh-cn to zh-Hans. But for Taiwanese it treats zh-tw > as some unknown variant of zh-* and also does fallback to default zh-cn, > which is wrong. > > > Reported bug: https://bugs.launchpad.net/horizon/+bug/1830886 > > > Proposed solution is to migrate to new locale names everywhere in > Horizon. Here are proposed steps to fix this issue: > > 1. > > Zanata jobs reconfiguration > > 2. > > Django-related backend stuff. Biggest, but easiest part because it > is well documented > > 3. > > AngularJS. Quite old version is used, so we can hit some well > forgotten issues. Also, testing is complicated. > > 4. > > Horizon.js components. It is the smallest part > > > > > Other question will be thorougful testing of the bugfix. > > > [1] > https://docs.djangoproject.com/en/dev/releases/1.7/#language-codes-zh-cn-zh-tw-and-fy-nl > > [2] > https://docs.openstack.org/horizon/latest/install/system-requirements.html Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From sc at linux.it Wed May 29 15:38:46 2019 From: sc at linux.it (Stefano Canepa) Date: Wed, 29 May 2019 16:38:46 +0100 Subject: [monasca] Proposing Akhil Jain for Monasca core team In-Reply-To: <8bfbc2a3-03a4-d469-f723-09aba5b2b96b@suse.com> References: <8bfbc2a3-03a4-d469-f723-09aba5b2b96b@suse.com> Message-ID: On Thu, 23 May 2019 at 12:30, Witek Bedyk wrote: > Hello team, > > I would like to propose Akhil Jain to join the Monasca core team. > > +1 and many thanks to Akhil. All the best sc -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed May 29 16:12:20 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 29 May 2019 11:12:20 -0500 Subject: [watcher] Question about baremetal node support in nova CDM In-Reply-To: <201905291732005257968@zte.com.cn> References: <201905291732005257968@zte.com.cn> Message-ID: On 5/29/2019 4:32 AM, li.canwei2 at zte.com.cn wrote: > then I think we can remove the first API call. OK, cool. 
I'll do it in a separate follow up change so it doesn't make the existing patch more complicated and because then we can revert it if it causes a problem without reverting the whole other optimization. -- Thanks, Matt From mriedemos at gmail.com Wed May 29 16:21:47 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 29 May 2019 11:21:47 -0500 Subject: [watcher] Question about baremetal node support in nova CDM In-Reply-To: References: <53585cb2-a207-58ff-588a-6c9694f8245f@dantalion.nl> <89845749-876d-e7ac-bd49-9c262ed56e43@gmail.com> Message-ID: <20ab6323-ff51-ff64-cb82-92bc1bb2d1b7@gmail.com> On 5/29/2019 4:10 AM, info at dantalion.nl wrote: > If the hypervisor_type is only available in the nova cell DB how does > the OpenStackClient determine the hypervisor type? > > If I execute 'openstack hypervisor list' then the Hypervisor Type column > is visible and correctly distinguishes between ironic and qemu hyervisors. I was mistaken, hypervisor_type is in the API response [1] so that's why you're seeing it in the OSC CLI. I must have missed it when looking at the API reference. [1] https://github.com/openstack/nova/blob/d4f58f5eb6e68d0348868efc45212b70feb0bde1/nova/api/openstack/compute/hypervisors.py#L69 -- Thanks, Matt From yadav.akshay58 at gmail.com Wed May 29 05:24:16 2019 From: yadav.akshay58 at gmail.com (Akki yadav) Date: Wed, 29 May 2019 10:54:16 +0530 Subject: OPENSTACK-HELM : INGRESS CONTROLLER PODS STUCK IN PENDING Message-ID: Hello team, I am running the following command in order to deploy the ingress controller: - ./tools/deployment/multinode/020-ingress.sh It makes two pods ingress-error-pages-899888c7-pnxhv and ingress-error-pages-899888c7-w5d5b which remains in pending state and the script ends after 900 seconds stating in "kubectl describe" that : - Warning FailedScheduling 60s (x3 over 2m12s) default-scheduler 0/3 nodes are available: 3 node(s) didn't match node selector. As only default labels are associated with the cluster nodes, do i need to add any specific one ? If yes, Please tell me where and how to add the same. Please guide me how to resolve this issue. Thanks in advance. Regards, Akshay -------------- next part -------------- An HTML attachment was scrubbed... URL: From Anirudh.Gupta at hsc.com Wed May 29 07:30:48 2019 From: Anirudh.Gupta at hsc.com (Anirudh Gupta) Date: Wed, 29 May 2019 07:30:48 +0000 Subject: [Airship-Seaworthy] Deployment of Airship-Seaworthy on Virtual Environment Message-ID: Hi Team, We want to test Production Ready Airship-Seaworthy in our virtual environment The link followed is https://airship-treasuremap.readthedocs.io/en/latest/seaworthy.html As per the document we need 6 DELL R720xd bare-metal servers: 3 control, and 3 compute nodes. But we need to deploy our setup on Virtual Environment. Does Airship-Seaworthy support Installation on Virtual Environment? We have 2 Rack Servers with Dual-CPU Intel® Xeon® E5 26xx with 16 cores each and 128 GB RAM. Is it possible that we can create Virtual Machines on them and set up the complete environment. In that case, what possible infrastructure do we require for setting up the complete setup. Looking forward for your response. Regards अनिरुद्ध गुप्ता (वरिष्ठ अभियंता) Hughes Systique Corporation D-23,24 Infocity II, Sector 33, Gurugram, Haryana 122001 DISCLAIMER: This electronic message and all of its contents, contains information which is privileged, confidential or otherwise protected from disclosure. 
The information contained in this electronic mail transmission is intended for use only by the individual or entity to which it is addressed. If you are not the intended recipient or may have received this electronic mail transmission in error, please notify the sender immediately and delete / destroy all copies of this electronic mail transmission without disclosing, copying, distributing, forwarding, printing or retaining any part of it. Hughes Systique accepts no responsibility for loss or damage arising from the use of the information transmitted by this email including damage from virus. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gouthampravi at gmail.com Wed May 29 18:51:26 2019 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Wed, 29 May 2019 11:51:26 -0700 Subject: [manila] Weekly Meeting on May 30th canceled Message-ID: Hello Zorillas, We have no items on the agenda for the weekly meeting tomorrow (30th May 2019) [1] and several of the folks on the team are either out of office or traveling. So, I'd like to cancel this week's meeting. Next week (6th June 2019), we'd welcome back folks that attended Cephalocon, KubeCon+CloudNativeCon and OpenStack Days @ CERN and have them share their updates. I'm also looking forward to officially introducing our Outreachy intern and have the awesome Outreachy mentors join us in that meeting. If you have any questions or concerns, please post them here, or on freenode's #openstack-manila. Thanks, Goutham [1] https://wiki.openstack.org/wiki/Manila/Meetings From melwittt at gmail.com Wed May 29 19:33:28 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 29 May 2019 12:33:28 -0700 Subject: [nova] Should we have a known issue reno for bug 1829062 in 19.0.1 (stein)? In-Reply-To: References: Message-ID: <141a35e2-df16-1af2-b281-ae19f7e2e33a@gmail.com> On 5/24/19 1:04 PM, Matt Riedemann wrote: > I've got a release request for stein 19.0.1 [1] but am holding it up to > figure out if we should have a known issue release note for the nova-api > + wsgi + eventlet monkey patch bug(s) [2][3]. [4] contains a workaround > to disable the eventlet monkeypatching which it sounds like StarlingX is > using for now, but is not really something we're recommending for > production (setting OS_NOVA_DISABLE_EVENTLET_PATCHING=1). Sean Mooney > has another workaround [5]. Should we try to clean that up for a known > issue before we release 19.0.1 given it's the first release since the > Stein GA? I tend to think "yes". I concur. I think a known issue reno would be worthwhile as this is a complex issue that we know some are facing already, and having some documentation around it would be helpful. 
-melanie > [1] https://review.opendev.org/#/c/661376/ > [2] https://bugs.launchpad.net/nova/+bug/1829062 > [3] https://bugs.launchpad.net/nova/+bug/1825584 > [4] https://review.opendev.org/#/c/647310/ > [5] https://bugs.launchpad.net/nova/+bug/1829062/comments/7 > From robson.rbarreto at gmail.com Wed May 29 19:34:18 2019 From: robson.rbarreto at gmail.com (Robson Ramos Barreto) Date: Wed, 29 May 2019 16:34:18 -0300 Subject: OPENSTACK-HELM : INGRESS CONTROLLER PODS STUCK IN PENDING In-Reply-To: References: Message-ID: Hi, I'm trying with helm too and I have these node labels: kubectl label nodes --all openstack-control-plane=enabled kubectl label nodes --all openstack-compute-node=enabled kubectl label nodes --all openvswitch=enabled kubectl label nodes --all linuxbridge=enabled kubectl label nodes --all ceph-mon=enabled kubectl label nodes --all ceph-osd=enabled kubectl label nodes --all ceph-mds=enabled kubectl label nodes --all ceph-rgw=enabled kubectl label nodes --all ceph-mgr=enabled >From kubectl describe pod you are able to see the needed node selector Regards On Wed, May 29, 2019 at 3:57 PM Akki yadav wrote: > Hello team, > I am running the following command in order to deploy the ingress > controller: > - ./tools/deployment/multinode/020-ingress.sh > It makes two pods ingress-error-pages-899888c7-pnxhv and > ingress-error-pages-899888c7-w5d5b which remains in pending state and the > script ends after 900 seconds stating in "kubectl describe" that : > - Warning FailedScheduling 60s (x3 over > 2m12s) default-scheduler 0/3 nodes are available: 3 node(s) didn't match > node selector. > > As only default labels are associated with the cluster nodes, do i need to > add any specific one ? If yes, Please tell me where and how to add the same. > > Please guide me how to resolve this issue. > Thanks in advance. > > Regards, > Akshay > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed May 29 20:16:19 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 29 May 2019 15:16:19 -0500 Subject: [watcher] Question about baremetal node support in nova CDM In-Reply-To: References: Message-ID: <9a8c18e4-a262-5ed3-d37c-42466a096d37@gmail.com> On 5/23/2019 4:40 PM, Matt Riedemann wrote: > Only in the case of ironic would we potentially get more than one > hypervisor (node) for a single compute service (host). As I was working on the change, I realized that while this is true, it's not strictly what this API could return: GET /os-hypervisors/detail?hypervisor_hostname_pattern=$node_hostname The problem is that is a fuzzy match in the DB query [1] so if you have some computes like this: compute1 compute10 compute100 And we're searching on compute1 we could get back 3 hypervisors which would break on the watcher side, so we can't really use that. There was a similar problem in some nova CLIs [2] which essentially does the same kind of strict hostname checking that watcher already does. So unfortunately this means I can't replace that code in watcher just yet. But if the compute API adds a service host filter to the hypervisors API we could push the strict service hostname filtering into the server rather than the client (watcher). I can draft a nova spec for that. 
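To illustrate the client-side strictness this implies, here is a sketch of the exact-match filtering -- not Watcher's actual code; the cloud name and hostname are placeholders, and the hypervisor_hostname_pattern filter needs compute API microversion 2.53 or later:

    import openstack

    conn = openstack.connect(cloud='mycloud')   # placeholder
    node_hostname = 'compute1'                  # placeholder

    # The server-side pattern match is fuzzy (compute1 also matches
    # compute10, compute100, ...), so keep only the exact hit on the client.
    resp = conn.compute.get('/os-hypervisors/detail',
                            params={'hypervisor_hostname_pattern': node_hostname},
                            microversion='2.53')
    matches = [hv for hv in resp.json()['hypervisors']
               if hv['hypervisor_hostname'] == node_hostname]
    if len(matches) != 1:
        raise RuntimeError('expected exactly 1 hypervisor for %s, found %d'
                           % (node_hostname, len(matches)))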
[1] https://github.com/openstack/nova/blob/d4f58f5eb6e68d0348868efc45212b70feb0bde1/nova/db/sqlalchemy/api.py#L676 [2] https://review.opendev.org/#/c/520187/ -- Thanks, Matt From zbitter at redhat.com Wed May 29 20:35:31 2019 From: zbitter at redhat.com (Zane Bitter) Date: Wed, 29 May 2019 16:35:31 -0400 Subject: [kolla][magnum] Cluster creation failed due to "Waiting for Kubernetes API..." In-Reply-To: References: <1f5506ea-add1-749d-b6c3-1040776b0ff4@catalyst.net.nz> <54760998-DCF6-4E01-85C8-BB3F5879A14C@stackhpc.com> Message-ID: <329cb15b-1645-72af-cb5a-e9f143b0d0e7@redhat.com> On 20/02/19 2:15 PM, Mark Goddard wrote: > Hi, I think we've hit this, and John Garbutt has added the following > configuration for Kolla Ansible in /etc/kolla/config/heat.conf: > > [DEFAULT] > region_name_for_services=RegionOne > > > We'll need a patch in kolla ansible to do that without custom config > changes. > Mark > > On Wed, 20 Feb 2019 at 11:05, Bharat Kunwar > wrote: > > Hi Giuseppe, > > What version of heat are you running? > > Can you check if you have this patch merged? > https://review.openstack.org/579485 > > https://review.openstack.org/579485 This patch caused a regression (in combination with corresponding patches to os-collect-config and heat-agents) due to weird things that happen in os-apply-config (https://bugs.launchpad.net/os-apply-config/+bug/1830967). Details are here: https://storyboard.openstack.org/#!/story/2005797 I've proposed a fix, and once that merges the workaround suggested above will no longer be needed. (Although setting the region name explicitly is a Good Thing to do anyway.) cheers, Zane. > Bharat > > Sent from my iPhone > > On 20 Feb 2019, at 10:38, Giuseppe Sannino > > > wrote: > >> Hi Feilong, Bharat, >> thanks for your answer. >> >> @Feilong, >> From /etc/kolla/heat-engine/heat.conf I see: >> [clients_keystone] >> auth_uri = http://10.1.7.201:5000 >> >> This should map into auth_url within the k8s master. >> Within the k8s master in /etc/os-collect-config.conf  I see: >> >> [heat] >> auth_url = http://10.1.7.201:5000/v3/ >> : >> : >> resource_name = kube-master >> region_name = null >> >> >> and from /etc/sysconfig/heat-params (among the others): >> : >> REGION_NAME="RegionOne" >> : >> AUTH_URL="http://10.1.7.201:5000/v3" >> >> This URL corresponds to the "public" Heat endpoint >> openstack endpoint list | grep heat >> | 3d5f58c43f6b44f6b54990d6fd9ff55d | RegionOne | heat         | >> orchestration   | True    | internal  | >> http://10.1.7.200:8004/v1/%(tenant_id)s   | >> | 8c2492cb0ddc48ca94942a4a299a88dc | RegionOne | heat-cfn     | >> cloudformation  | True    | internal  | http://10.1.7.200:8000/v1 >>                | >> | b164c4618a784da9ae14da75a6c764a3 | RegionOne | heat         | >> orchestration   | True    | public    | >> http://10.1.7.201:8004/v1/%(tenant_id)s   | >> | da203f7d337b4587a0f5fc774c993390 | RegionOne | heat         | >> orchestration   | True    | admin     | >> http://10.1.7.200:8004/v1/%(tenant_id)s   | >> | e0d3743e7c604e5c8aa4684df2d1ce53 | RegionOne | heat-cfn     | >> cloudformation  | True    | public    | http://10.1.7.201:8000/v1 >>                | >> | efe0b8418aa24dfca33c243e7eed7e90 | RegionOne | heat-cfn     | >> cloudformation  | True    | admin     | http://10.1.7.200:8000/v1 >>                | >> >> Connectivity tests: >> [fedora at kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ ping 10.1.7.201 >> PING 10.1.7.201 (10.1.7.201) 56(84) bytes of data. 
>> 64 bytes from 10.1.7.201 : icmp_seq=1 ttl=63 >> time=0.285 ms >> >> [fedora at kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ curl >> http://10.1.7.201:5000/v3/ >> {"version": {"status": "stable", "updated": >> "2018-10-15T00:00:00Z", "media-types": [{"base": >> "application/json", "type": >> "application/vnd.openstack.identity-v3+json"}], "id": "v3.11", >> "links": [{"href": "http://10.1.7.201:5000/v3/", "rel": "self"}]}} >> >> >> Apparently, I can reach such endpoint from within the k8s master >> >> >> @Bharat, >> that file seems to be properly conifugured to me as well. >> The problem pointed by "systemctl status heat-container-agent" is >> with: >> >> Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal >> runc[2837]: publicURL endpoint for orchestration service in null >> region not found >> Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal >> runc[2837]: Source [heat] Unavailable. >> Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal >> runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping >> Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal >> runc[2837]: publicURL endpoint for orchestration service in null >> region not found >> Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal >> runc[2837]: Source [heat] Unavailable. >> Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal >> runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping >> >> >> Still no way forward from my side. >> >> /Giuseppe >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> On Tue, 19 Feb 2019 at 22:16, Bharat Kunwar > > wrote: >> >> I have the same problem. Weird thing is >> /etc/sysconfig/heat-params has region_name specified in my case! >> >> Sent from my iPhone >> >> On 19 Feb 2019, at 22:00, Feilong Wang >> > wrote: >> >>> Can you talk to the Heat API from your master node? >>> >>> >>> On 20/02/19 6:43 AM, Giuseppe Sannino wrote: >>>> Hi all...again, >>>> I managed to get over the previous issue by "not disabling" >>>> the TLS in the cluster template. >>>> From the cloud-init-output.log I see: >>>> Cloud-init v. 17.1 running 'modules:final' at Tue, 19 Feb >>>> 2019 17:03:53 +0000. Up 38.08 seconds. >>>> Cloud-init v. 17.1 finished at Tue, 19 Feb 2019 17:13:22 >>>> +0000. Datasource DataSourceEc2.  Up 607.13 seconds >>>> >>>> But the cluster creation keeps on failing. >>>> From the journalctl -f I see a possible issue: >>>> Feb 19 17:42:38 >>>> kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: >>>> publicURL endpoint for orchestration service in null region >>>> not found >>>> Feb 19 17:42:38 >>>> kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: >>>> Source [heat] Unavailable. >>>> Feb 19 17:42:38 >>>> kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: >>>> /var/lib/os-collect-config/local-data not found. Skipping >>>> >>>> anyone familiar with this problem ? >>>> >>>> Thanks as usual. >>>> /Giuseppe >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> On Tue, 19 Feb 2019 at 17:35, Giuseppe Sannino >>>> >>> > wrote: >>>> >>>> Hi all, >>>> need an help. >>>> I deployed an AIO via Kolla on a baremetal node. Here >>>> some information about the deployment: >>>> --------------- >>>> kolla-ansible: 7.0.1 >>>> openstack_release: Rocky >>>> kolla_base_distro: centos >>>> kolla_install_type: source >>>> TLS: disabled >>>> --------------- >>>> >>>> >>>> VMs spawn without issue but I can't make the "Kubernetes >>>> cluster creation" successfully. 
It fails due to "Time out" >>>> >>>> I managed to log into Kuber Master and from the >>>> cloud-init-output.log I can see: >>>> + echo 'Waiting for Kubernetes API...' >>>> Waiting for Kubernetes API... >>>> ++ curl --silent http://127.0.0.1:8080/healthz >>>> + '[' ok = '' ']' >>>> + sleep 5 >>>> >>>> >>>> Checking via systemctl and journalctl I see: >>>> [fedora at kube-clsuter-qamdealetlbi-master-0 log]$ >>>> systemctl status kube-apiserver >>>> ● kube-apiserver.service - kubernetes-apiserver >>>>    Loaded: loaded >>>> (/etc/systemd/system/kube-apiserver.service; enabled; >>>> vendor preset: disabled) >>>>    Active: failed (Result: exit-code) since Tue >>>> 2019-02-19 15:31:41 UTC; 45min ago >>>>   Process: 3796 ExecStart=/usr/bin/runc --systemd-cgroup >>>> run kube-apiserver (code=exited, status=1/FAILURE) >>>>  Main PID: 3796 (code=exited, status=1/FAILURE) >>>> >>>> Feb 19 15:31:40 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> kube-apiserver.service: Main process exited, >>>> code=exited, status=1/FAILURE >>>> Feb 19 15:31:40 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> kube-apiserver.service: Failed with result 'exit-code'. >>>> Feb 19 15:31:41 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> kube-apiserver.service: Service RestartSec=100ms >>>> expired, scheduling restart. >>>> Feb 19 15:31:41 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> kube-apiserver.service: Scheduled restart job, restart >>>> counter is at 6. >>>> Feb 19 15:31:41 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> Stopped kubernetes-apiserver. >>>> Feb 19 15:31:41 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> kube-apiserver.service: Start request repeated too quickly. >>>> Feb 19 15:31:41 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> kube-apiserver.service: Failed with result 'exit-code'. >>>> Feb 19 15:31:41 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> Failed to start kubernetes-apiserver. >>>> >>>> [fedora at kube-clsuter-qamdealetlbi-master-0 log]$ sudo >>>> journalctl -u kube-apiserver >>>> -- Logs begin at Tue 2019-02-19 15:21:36 UTC, end at Tue >>>> 2019-02-19 16:17:00 UTC. -- >>>> Feb 19 15:31:33 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> Started kubernetes-apiserver. >>>> Feb 19 15:31:34 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: >>>> Flag --insecure-bind-address has been deprecated, This >>>> flag will be removed in a future version. >>>> Feb 19 15:31:34 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: >>>> Flag --insecure-port has been deprecated, This flag will >>>> be removed in a future version. >>>> Feb 19 15:31:35 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: >>>> Error: error creating self-signed certificates: open >>>> /var/run/kubernetes/apiserver.crt: permission denied >>>> : >>>> : >>>> : >>>> Feb 19 15:31:35 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: >>>> error: error creating self-signed certificates: open >>>> /var/run/kubernetes/apiserver.crt: permission denied >>>> Feb 19 15:31:35 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> kube-apiserver.service: Main process exited, >>>> code=exited, status=1/FAILURE >>>> Feb 19 15:31:35 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> kube-apiserver.service: Failed with result 'exit-code'. 
>>>> Feb 19 15:31:35 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> kube-apiserver.service: Service RestartSec=100ms >>>> expired, scheduling restart. >>>> Feb 19 15:31:35 >>>> kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: >>>> kube-apiserver.service: Scheduled restart job, restart >>>> counter is at 1. >>>> >>>> >>>> May I ask for an help on this ? >>>> >>>> Many thanks >>>> /Giuseppe >>>> >>>> >>>> >>>> >>> -- >>> Cheers & Best regards, >>> Feilong Wang (王飞龙) >>> -------------------------------------------------------------------------- >>> Senior Cloud Software Engineer >>> Tel: +64-48032246 >>> Email:flwang at catalyst.net.nz >>> Catalyst IT Limited >>> Level 6, Catalyst House, 150 Willis Street, Wellington >>> -------------------------------------------------------------------------- >> From mthode at mthode.org Wed May 29 20:53:52 2019 From: mthode at mthode.org (Matthew Thode) Date: Wed, 29 May 2019 15:53:52 -0500 Subject: [requirements][kuryr][flame] openshift dificulties Message-ID: <20190529205352.f2dxzckgvfavbvtv@mthode.org> Openshift upstream is giving us difficulty as they are capping the version of urllib3 and kubernetes we are using. -urllib3===1.25.3 +urllib3===1.24.3 -kubernetes===9.0.0 +kubernetes===8.0.1 I've opened an issue with them but not had much luck there (and their prefered solution just pushes the can down the road). https://github.com/openshift/openshift-restclient-python/issues/289 What I'd us to do is move off of openshift as our usage doesn't seem too much. openstack/kuryr-tempest-plugin uses it for one import (and just one function with that import). I'm not sure exactly what you are doing with it but would it be too much to ask to move to something else? x/flame has it in it's constraints but I don't see any actual usage, so perhaps it's a false flag. Please let me know what you think -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From wilkers.steve at gmail.com Wed May 29 21:04:42 2019 From: wilkers.steve at gmail.com (Steve Wilkerson) Date: Wed, 29 May 2019 16:04:42 -0500 Subject: OPENSTACK-HELM : INGRESS CONTROLLER PODS STUCK IN PENDING In-Reply-To: References: Message-ID: As Robson mentioned, please make sure you have the correct labels applied to your nodes. Depending how you're deploying Kubernetes, you may have to manually add these labels. The default node selector key/value labels for the ingress controller can be found here: https://github.com/openstack/openstack-helm-infra/blob/master/ingress/values.yaml#L121-L127 >From what I see, you likely don't have these labels applied. Please apply them and see if this fixes your issue. 
Steve On Wed, May 29, 2019 at 2:38 PM Robson Ramos Barreto < robson.rbarreto at gmail.com> wrote: > Hi, > > I'm trying with helm too and I have these node labels: > > kubectl label nodes --all openstack-control-plane=enabled > kubectl label nodes --all openstack-compute-node=enabled > kubectl label nodes --all openvswitch=enabled > kubectl label nodes --all linuxbridge=enabled > kubectl label nodes --all ceph-mon=enabled > kubectl label nodes --all ceph-osd=enabled > kubectl label nodes --all ceph-mds=enabled > kubectl label nodes --all ceph-rgw=enabled > kubectl label nodes --all ceph-mgr=enabled > > From kubectl describe pod you are able to see the needed node selector > > Regards > > On Wed, May 29, 2019 at 3:57 PM Akki yadav > wrote: > >> Hello team, >> I am running the following command in order to deploy the ingress >> controller: >> - ./tools/deployment/multinode/020-ingress.sh >> It makes two pods ingress-error-pages-899888c7-pnxhv and >> ingress-error-pages-899888c7-w5d5b which remains in pending state and the >> script ends after 900 seconds stating in "kubectl describe" that : >> - Warning FailedScheduling 60s (x3 over >> 2m12s) default-scheduler 0/3 nodes are available: 3 node(s) didn't match >> node selector. >> >> As only default labels are associated with the cluster nodes, do i need >> to add any specific one ? If yes, Please tell me where and how to add the >> same. >> >> Please guide me how to resolve this issue. >> Thanks in advance. >> >> Regards, >> Akshay >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed May 29 21:46:29 2019 From: melwittt at gmail.com (melanie witt) Date: Wed, 29 May 2019 14:46:29 -0700 Subject: [nova] Should we have a known issue reno for bug 1829062 in 19.0.1 (stein)? In-Reply-To: <141a35e2-df16-1af2-b281-ae19f7e2e33a@gmail.com> References: <141a35e2-df16-1af2-b281-ae19f7e2e33a@gmail.com> Message-ID: On 5/29/19 12:33 PM, melanie witt wrote: > On 5/24/19 1:04 PM, Matt Riedemann wrote: >> I've got a release request for stein 19.0.1 [1] but am holding it up >> to figure out if we should have a known issue release note for the >> nova-api + wsgi + eventlet monkey patch bug(s) [2][3]. [4] contains a >> workaround to disable the eventlet monkeypatching which it sounds like >> StarlingX is using for now, but is not really something we're >> recommending for production (setting >> OS_NOVA_DISABLE_EVENTLET_PATCHING=1). Sean Mooney has another >> workaround [5]. Should we try to clean that up for a known issue >> before we release 19.0.1 given it's the first release since the Stein >> GA? I tend to think "yes". > > I concur. I think a known issue reno would be worthwhile as this is a > complex issue that we know some are facing already, and having some > documentation around it would be helpful. I've proposed a known issue reno here, ready for feedback and reviews: https://review.opendev.org/662095 -melanie >> [1] https://review.opendev.org/#/c/661376/ >> [2] https://bugs.launchpad.net/nova/+bug/1829062 >> [3] https://bugs.launchpad.net/nova/+bug/1825584 >> [4] https://review.opendev.org/#/c/647310/ >> [5] https://bugs.launchpad.net/nova/+bug/1829062/comments/7 >> > From openstack at fried.cc Wed May 29 22:29:45 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 29 May 2019 17:29:45 -0500 Subject: PowerVM-CI is failing while stacking devstack In-Reply-To: References: Message-ID: Hi Aman. [3] indicates that your nova-compute service did not start properly. 
You'll want to look at the logs for that service to see why. efried On 5/29/19 12:01 AM, Aman Kumar Sinha26 wrote: > Hi OpenStack Community, >   > Initially, I was facing an issue while stacking devstack [1][4]. In > order to resolve the issue, I added [2] in devstack/functions-common > under the function _run_under_systemd. Now I am hitting with a new issue > [3]. Can you help me with some suggestion to resolve the error? > FYI, we are facing this issue on PowerVM-CI. > I tried installing stable/stein as well as master. But facing the same > error logs. local.conf looks good to me. >   >   > [1] http://paste.openstack.org/show/752148/ > [2] http://paste.openstack.org/show/752149/ > [3] http://paste.openstack.org/show/752147/ > [4] > http://184.172.12.213/66/661466/3/check/nova-in-tree-pvm/8bff237/logs/console.txt.gz > From cboylan at sapwetik.org Wed May 29 22:30:20 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 29 May 2019 15:30:20 -0700 Subject: [requirements][kuryr][flame] openshift dificulties In-Reply-To: <20190529205352.f2dxzckgvfavbvtv@mthode.org> References: <20190529205352.f2dxzckgvfavbvtv@mthode.org> Message-ID: <68198405-8373-4524-9d58-481b55e19722@www.fastmail.com> On Wed, May 29, 2019, at 1:54 PM, Matthew Thode wrote: > Openshift upstream is giving us difficulty as they are capping the > version of urllib3 and kubernetes we are using. > > -urllib3===1.25.3 > +urllib3===1.24.3 > -kubernetes===9.0.0 > +kubernetes===8.0.1 > > I've opened an issue with them but not had much luck there (and their > prefered solution just pushes the can down the road). > > https://github.com/openshift/openshift-restclient-python/issues/289 > > What I'd us to do is move off of openshift as our usage doesn't seem too > much. > > openstack/kuryr-tempest-plugin uses it for one import (and just one > function with that import). I'm not sure exactly what you are doing > with it but would it be too much to ask to move to something else? > > x/flame has it in it's constraints but I don't see any actual usage, so > perhaps it's a false flag. > > Please let me know what you think I think part of the issue here is that the kubernetes lib is generated code and not "curated" with backward compatibility in mind. That said it is worth noting that the infra team found that openshift lib + kubernetes lib < 9.0 does not work with the ansible k8s modules (there is a bug in the blocking calls which causes them to go out to lunch and never return). Depending on where we pull these libraries in we may need to override anyway for things to function at all. Clark From manuel.sb at garvan.org.au Thu May 30 01:24:41 2019 From: manuel.sb at garvan.org.au (Manuel Sopena Ballesteros) Date: Thu, 30 May 2019 01:24:41 +0000 Subject: How to migrate a vm to a different project? Message-ID: <9D8A2486E35F0941A60430473E29F15B017EA9DC04@mxdb2.ad.garvan.unsw.edu.au> Dear Openstack community, I have one vm under the admin project I would like to migrate to migrate to a different project. Reason is the vm user wants to manage it himself. I have created a project and a user/password for that user and now I need to migrate the vm. I have not been able to find a way to do this, so I was thinking in doing this: Create an image from the vm Shelve the vm (I don't have more hosts to run it because it has a x4 GPUs attached to it and don't have spare host with free resources) Create a new vm based on image as the new user/project Destroy the shelved vm Would this work? Please correct me if there is an easier/safer way of doing this. 
Thank you very much Manuel Sopena Ballesteros Big Data Engineer | Kinghorn Centre for Clinical Genomics [cid:image001.png at 01D4C835.ED3C2230] a: 384 Victoria Street, Darlinghurst NSW 2010 p: +61 2 9355 5760 | +61 4 12 123 123 e: manuel.sb at garvan.org.au Like us on Facebook | Follow us on Twitter and LinkedIn NOTICE Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 7579 bytes Desc: image001.png URL: From Hunter90960 at mail.com Thu May 30 01:52:01 2019 From: Hunter90960 at mail.com (Hunter Nins) Date: Thu, 30 May 2019 03:52:01 +0200 Subject: [barbican] dev: Using Barbican for media box/center unattended cert, key (KEK) updates, etc. In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From Hunter90960 at mail.com Thu May 30 01:53:01 2019 From: Hunter90960 at mail.com (Hunter Nins) Date: Thu, 30 May 2019 03:53:01 +0200 Subject: [barbican] dev: Using Barbican for media box/center unattended cert, key (KEK) updates, etc. In-Reply-To: References: Message-ID: 2nd try. Including link to my Stackoverflow post to centrlize responses. stackoverflow.com/questions/56360728/unattended-automated-linux-device-key-management-certs-for-accessing-update-ser     Sent: Tuesday, May 28, 2019 at 3:49 PM From: "Hunter Nins" To: openstack-discuss at lists.openstack.org Subject: [barbican] dev: Using Barbican for media box/center unattended cert, key (KEK) updates, etc. Hello, Pardon: the first attempt was html-formatted. I am currently working on a customized media center/box product for my employer. It's basically a Raspberry Pi 3b+ running Raspian, configured to auto-update periodically via `apt`. The device accesses binaries for proprietary applications via a private, secured `apt` repo, using a pre-installed certificate on the device. Right now, the certificate on the device is set to never expire, but going forward, we'd like to configure the certificate to expire every 4 months. We also plan to deploy a unique certificate per device we ship, so the certs can be revoked if the tamper mechanism is triggered (i.e. if the customer rips open the box, it blows a fuse that is attached to a ADC chip, and the device reports in s/w that the sensor has been tripped). Finally, we anticipate some customers leaving the device offline, and only updating once every year (allowing for time for the cert to expire). Is there a way I could use Barbican to: * Update the certs for apt-repo access on the device periodically. * Setup key-encryption-keys (KEK) on the device, if we need the device to be able to download sensitive data, such as an in-memory cached copy of customer info. * Provide a mechanism for a new key to be deployed on the device if the currently-used key has expired (i.e. the user hasn't connected the device to the internet for more than 4 months). 
* Allow keys to be tracked, revoked, and de-commissioned. Thank you for your time and assistance.   From cheng1.li at intel.com Thu May 30 02:59:20 2019 From: cheng1.li at intel.com (Li, Cheng1) Date: Thu, 30 May 2019 02:59:20 +0000 Subject: [Airship-Seaworthy] Deployment of Airship-Seaworthy on Virtual Environment In-Reply-To: References: Message-ID: I have the same question. I haven’t seen any docs which guides how to deploy airsloop/air-seaworthy in virtual env. I am trying to deploy airsloop on libvirt/kvm driven virtual env. Two VMs, one for genesis, the other for compute. Virtualbmc for ipmi simulation. The genesis.sh scripts has been run on genesis node without error. But deploy_site fails at prepare_and_deploy_nodes task(action ‘set_node_boot’ timeout). I am still investigating this issue. It will be great if we have official document for this scenario. Thanks, Cheng From: Anirudh Gupta [mailto:Anirudh.Gupta at hsc.com] Sent: Wednesday, May 29, 2019 3:31 PM To: airship-discuss at lists.airshipit.org; airship-announce at lists.airshipit.org; openstack-dev at lists.openstack.org; openstack at lists.openstack.org Subject: [Airship-Seaworthy] Deployment of Airship-Seaworthy on Virtual Environment Hi Team, We want to test Production Ready Airship-Seaworthy in our virtual environment The link followed is https://airship-treasuremap.readthedocs.io/en/latest/seaworthy.html As per the document we need 6 DELL R720xd bare-metal servers: 3 control, and 3 compute nodes. But we need to deploy our setup on Virtual Environment. Does Airship-Seaworthy support Installation on Virtual Environment? We have 2 Rack Servers with Dual-CPU Intel® Xeon® E5 26xx with 16 cores each and 128 GB RAM. Is it possible that we can create Virtual Machines on them and set up the complete environment. In that case, what possible infrastructure do we require for setting up the complete setup. Looking forward for your response. Regards अनिरुद्ध गुप्ता (वरिष्ठ अभियंता) Hughes Systique Corporation D-23,24 Infocity II, Sector 33, Gurugram, Haryana 122001 DISCLAIMER: This electronic message and all of its contents, contains information which is privileged, confidential or otherwise protected from disclosure. The information contained in this electronic mail transmission is intended for use only by the individual or entity to which it is addressed. If you are not the intended recipient or may have received this electronic mail transmission in error, please notify the sender immediately and delete / destroy all copies of this electronic mail transmission without disclosing, copying, distributing, forwarding, printing or retaining any part of it. Hughes Systique accepts no responsibility for loss or damage arising from the use of the information transmitted by this email including damage from virus. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sundar.nadathur at intel.com Thu May 30 03:51:46 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Thu, 30 May 2019 03:51:46 +0000 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> Message-ID: <1CC272501B5BC543A05DB90AA509DED527576F84@fmsmsx122.amr.corp.intel.com> > From: Dean Troyer > Sent: Thursday, May 23, 2019 1:23 PM > Subject: Re: [cyborg][nova][sdk]Cyborgclient integration > > On Thu, May 23, 2019 at 1:03 PM Nadathur, Sundar > wrote: > > IIUC, the push towards OpenStack SDK [4] is unrelated to OpenStack CLI. It > becomes relevant only if some other service wants to call into Cyborg. > > Yes and no. The two things are generally independent, however they will > eventually fit together in that we want OSC to use the SDK for all back-end > work soon(TM), depending on when we get an SDK 1.0 release. > > At the 2018 Denver PTG we started thinking about what OSC plugins that use > the SDK would look like, and the only things left in the plugin itself would be > the cliff command classes. Since SDK is accepting direct support for official > projects directly in the repo, OSC will consider doing the same for commands > that do not require any additional dependencies, ie if Cyborg were completely > backed by the SDK we would consider adding its commands directly to the > OSC repo. We do intend to adopt the SDK. > This is a significant change for OSC, and would come with one really large > caveat: commands must maintain the same level of consistency that the rest > of the commands in the OSC repo maintain. ie, 'update' > is not a verb, resources do not contain hyphens in their names, etc. > There are projects that have deviated from these rules in their plugins, and > there they are free to do that, incurring only the wrath or disdain or joy of > their users for being different. That is not the case for commands contained > in the OSC repo itself. We are trying to understand the pros and cons of doing the OSC plugin vs integrating directly into OSC repo. 1. Consistency requirements are fine. Presumably we should follow [1]. They seem to be loose guidelines than specific requirements. I suppose the command syntax would look like: $ openstack accelerator create device-profile 2. When you say, " This is a significant change for OSC ", could you tell us what the risks are for direct integration? The repo [2] seems to have entries for compute, network, identity, etc. So, it doesn’t look like uncharted waters. [1] https://docs.openstack.org/python-openstackclient/latest/contributor/humaninterfaceguide.html [2] https://github.com/openstack/python-openstackclient/tree/master/openstackclient > dt Thanks & Regards, Sundar From akhil.jain at india.nec.com Thu May 30 07:23:31 2019 From: akhil.jain at india.nec.com (AKHIL Jain) Date: Thu, 30 May 2019 07:23:31 +0000 Subject: [monasca] Proposing Akhil Jain for Monasca core team In-Reply-To: References: <8bfbc2a3-03a4-d469-f723-09aba5b2b96b@suse.com>, Message-ID: Thanks, Witek and whole Monasca team for giving me this opportunity. I will do my best to uphold your support. 
-Akhil ________________________________________ From: Stefano Canepa Sent: Wednesday, May 29, 2019 9:08:46 PM To: Witek Bedyk Cc: openstack-discuss at lists.openstack.org Subject: Re: [monasca] Proposing Akhil Jain for Monasca core team On Thu, 23 May 2019 at 12:30, Witek Bedyk > wrote: Hello team, I would like to propose Akhil Jain to join the Monasca core team. +1 and many thanks to Akhil. All the best sc From smooney at redhat.com Thu May 30 09:47:57 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 30 May 2019 10:47:57 +0100 Subject: How to migrate a vm to a different project? In-Reply-To: <9D8A2486E35F0941A60430473E29F15B017EA9DC04@mxdb2.ad.garvan.unsw.edu.au> References: <9D8A2486E35F0941A60430473E29F15B017EA9DC04@mxdb2.ad.garvan.unsw.edu.au> Message-ID: <68996ada8a95156c4480da35ec763500e3432092.camel@redhat.com> On Thu, 2019-05-30 at 01:24 +0000, Manuel Sopena Ballesteros wrote: > Dear Openstack community, > > I have one vm under the admin project I would like to migrate to migrate to a different project. Reason is the vm user > wants to manage it himself. I have created a project and a user/password for that user and now I need to migrate the > vm. > > I have not been able to find a way to do this, so I was thinking in doing this: > > Create an image from the vm > Shelve the vm (I don't have more hosts to run it because it has a x4 GPUs attached to it and don't have spare host > with free resources) > Create a new vm based on image as the new user/project > Destroy the shelved vm > > Would this work? you could achive a similar effect by snapshoting the vm. deleting the origninal and then creating a new vm form the snapshot. in either case you will need to share the flavor and image/snapshot with the other users project and then you can create the new vm. i assume you chose to shevle the vm and then delete it later so that you could role back if needed? anyway if you are fine with destroying the original vm and creating a new one form the snapshot of the vm, and the new project has acess to the correct flavor then yes the workflow you suggest should work. > > Please correct me if there is an easier/safer way of doing this. > > Thank you very much > > Manuel Sopena Ballesteros > > Big Data Engineer | Kinghorn Centre for Clinical Genomics > > [cid:image001.png at 01D4C835.ED3C2230] > > a: 384 Victoria Street, Darlinghurst NSW 2010 > p: +61 2 9355 5760 | +61 4 12 123 123 > e: manuel.sb at garvan.org.au > > Like us on Facebook | Follow us on Twitter > and LinkedIn > > NOTICE > Please consider the environment before printing this email. This message and any attachments are intended for the > addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended > recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this > message in error please notify us at once by return email and then delete both messages. We accept no liability for > the distribution of viruses or similar in electronic communications. This notice should not be removed. 
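For completeness, the snapshot/share/recreate flow discussed above maps to roughly the following openstack CLI steps; the flavor, image and project identifiers are placeholders and this is only a sketch of the approach, not a tested procedure:

  # as admin: snapshot the existing instance
  openstack server image create --name gpu-vm-snapshot <server-uuid>
  # make the snapshot visible to the destination project
  openstack image set --shared gpu-vm-snapshot
  openstack image add project gpu-vm-snapshot <destination-project-id>
  # grant the destination project access to the flavor (only needed if it is non-public)
  openstack flavor set --project <destination-project-id> <gpu-flavor>
  # as the destination user: accept the shared image and boot from it
  openstack image set --accept <image-id>
  openstack server create --flavor <gpu-flavor> --image gpu-vm-snapshot new-gpu-vm
  # once the new instance is verified, delete the original (or the shelved copy)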
From smooney at redhat.com Thu May 30 10:00:55 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 30 May 2019 11:00:55 +0100 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: <1CC272501B5BC543A05DB90AA509DED527576F84@fmsmsx122.amr.corp.intel.com> References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> <1CC272501B5BC543A05DB90AA509DED527576F84@fmsmsx122.amr.corp.intel.com> Message-ID: <8457085b04e80164fe45eb54a3b54afab7c10bfb.camel@redhat.com> On Thu, 2019-05-30 at 03:51 +0000, Nadathur, Sundar wrote: > > From: Dean Troyer > > Sent: Thursday, May 23, 2019 1:23 PM > > Subject: Re: [cyborg][nova][sdk]Cyborgclient integration > > > > On Thu, May 23, 2019 at 1:03 PM Nadathur, Sundar > > wrote: > > > IIUC, the push towards OpenStack SDK [4] is unrelated to OpenStack CLI. It > > > > becomes relevant only if some other service wants to call into Cyborg. > > > > Yes and no. The two things are generally independent, however they will > > eventually fit together in that we want OSC to use the SDK for all back-end > > work soon(TM), depending on when we get an SDK 1.0 release. > > > > At the 2018 Denver PTG we started thinking about what OSC plugins that use > > the SDK would look like, and the only things left in the plugin itself would be > > the cliff command classes. Since SDK is accepting direct support for official > > projects directly in the repo, OSC will consider doing the same for commands > > that do not require any additional dependencies, ie if Cyborg were completely > > backed by the SDK we would consider adding its commands directly to the > > OSC repo. > > We do intend to adopt the SDK. > > > This is a significant change for OSC, and would come with one really large > > caveat: commands must maintain the same level of consistency that the rest > > of the commands in the OSC repo maintain. ie, 'update' > > is not a verb, resources do not contain hyphens in their names, etc. > > There are projects that have deviated from these rules in their plugins, and > > there they are free to do that, incurring only the wrath or disdain or joy of > > their users for being different. That is not the case for commands contained > > in the OSC repo itself. > > We are trying to understand the pros and cons of doing the OSC plugin vs integrating directly into OSC repo. > > 1. Consistency requirements are fine. Presumably we should follow [1]. They seem to be loose guidelines than specific > requirements. I suppose the command syntax would look like: > $ openstack accelerator create device-profile i think this should be "openstack acclerator device-profile create" looking how "security group rule create" works the action is always last in the osc commands > > 2. When you say, " This is a significant change for OSC ", could you tell us what the risks are for direct > integration? The repo [2] seems to have entries for compute, network, identity, etc. So, it doesn’t look like > uncharted waters. osc has the core service integreated in tree but everything else is done via a plugin including things like ironic and placement. if we were to start osc again we would either have made everything plugins or done everything intree. as it stands we have been moving towrord the everything is a plugin side of that scale. even for nova we have been considering if a plugin would be better in order to allow use more easily close the gap between the cli and api. 
the main disadvantage to integrating osc intree will be review time. e.g. as the cyborg core team ye will not have +2 rights on osc but if you have your own osc plugin then the cyborg core team can also manage the cli for cyborg. > > [1] https://docs.openstack.org/python-openstackclient/latest/contributor/humaninterfaceguide.html > [2] https://github.com/openstack/python-openstackclient/tree/master/openstackclient > > > dt > > Thanks & Regards, > Sundar From ekuvaja at redhat.com Thu May 30 11:30:39 2019 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Thu, 30 May 2019 12:30:39 +0100 Subject: [Glance] Removal of glance-replicator Message-ID: Hi all, The replicator has been purely depending of the Images API v1 and thus not operational at all for good couple of cycles now. As a part of the v1 cleanup we are going to remove the replicator command as well. If the replicator is needed (currently doesn't seem to as no-one even noticed that it deasn't work since we removed v1 api), lets not port the original just to v2 but open a spec for it and take the opportunity to come up with robust solution that will serve us well into the future. Thanks, Erno "jokke" Kuvaja -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekuvaja at redhat.com Thu May 30 11:43:16 2019 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Thu, 30 May 2019 12:43:16 +0100 Subject: [Glance] Deprecation and removal of sheepdog store Message-ID: Hola, Due to the stopped development and maintenance of sheepdog [0] we made the call to deprecate the store driver for removal. As the sheepdog job has not been working and there has been no interest to fix it and sheepdog upstream development has dried out as well, we are looking to remove the Glance driver for it at early U. This means that if there is someone who is using the sheepdog store for images, Train will be the last release to fix the driver and fork it for your own repo for your own maintenance. We are also looking into killing the failing test job for the same reason [1] [0] http://lists.wpkg.org/pipermail/sheepdog/2019-March/068451.html [1] https://review.opendev.org/#/c/660450/ Best, Erno "jokke" Kuvaja -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Thu May 30 12:52:01 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 30 May 2019 13:52:01 +0100 (BST) Subject: [ops][nova][placement] NUMA topology vs non-NUMA workloads Message-ID: This message is primarily addressed at operators, and of those, operators who are interested in effectively managing and mixing workloads that care about NUMA with workloads that do not. There are some questions within, after some background to explain the issue. At the PTG, Nova and Placement developers made a commitment to more effectively manage NUMA topologies within Nova and Placement. On the placement side this resulted in a spec which proposed several features that would enable more expressive queries when requesting allocation candidates (places for workloads to go), resulting in fewer late scheduling failures. At first there was one spec that discussed all the features. This morning it was split in two because one of the features is proving hard to resolve. Those two specs can be found at: * https://review.opendev.org/658510 (has all the original discussion) * https://review.opendev.org/662191 (the less contentious features split out) After much discussion, we would prefer to not do the feature discussed in 658510. 
Called 'can_split', it would allow specified classes of resource (notably VCPU and memory) to be split across multiple numa nodes when each node can only contribute a portion of the required resources and where those resources are modelled as inventory on the NUMA nodes, not the host at large. While this is a good idea in principle it turns out (see the spec) to cause many issues that require changes throughout the ecosystem, for example enforcing pinned cpus for workloads that would normally float. It's possible to make the changes, but it would require additional contributors to join the effort, both in terms of writing the code and understanding the many issues. So the questions: * How important, in your cloud, is it to co-locate guests needing a NUMA topology with guests that do not? A review of documentation (upstream and vendor) shows differing levels of recommendation on this, but in many cases the recommendation is to not do it. * If your answer to the above is "we must be able to do that": How important is it that your cloud be able to pack workloads as tight as possible? That is: If there are two NUMA nodes and each has 2 VCPU free, should a 4 VCPU demanding non-NUMA workload be able to land there? Or would you prefer that not happen? * If the answer to the first question is "we can get by without that" is it satisfactory to be able to configure some hosts as NUMA aware and others as not, as described in the "NUMA topology with RPs" spec [1]? In this set up some non-NUMA workloads could end up on a NUMA host (unless otherwise excluded by traits or aggregates), but only when there was contiguous resource available. This latter question articulates the current plan unless responses to this message indicate it simply can't work or legions of assistance shows up. Note that even if we don't do can_split, we'll still be enabling significant progress with the other features described in the second spec [2]. Thanks for your help in moving us in the right direction. 
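For anyone less familiar with the terminology above, a guest "needing a NUMA topology" is typically one whose flavor carries extra specs along these lines (values purely illustrative); the questions above are about mixing such guests with ordinary, non-NUMA flavors on the same hosts:

  openstack flavor set numa-flavor \
    --property hw:numa_nodes=2 \
    --property hw:numa_cpus.0=0,1 --property hw:numa_cpus.1=2,3 \
    --property hw:numa_mem.0=2048 --property hw:numa_mem.1=2048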
[1] https://review.opendev.org/552924 [2] https://review.opendev.org/662191 -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From mriedemos at gmail.com Thu May 30 13:11:12 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 30 May 2019 08:11:12 -0500 Subject: stable/newton devstack installation is failing on ubuntu 14.04.6 In-Reply-To: References: Message-ID: <8962c449-6610-d75d-60cf-ad85cf22dd0f@gmail.com> On 5/27/2019 6:40 AM, Ansari, Arshad wrote: > I am trying to install stable/newton devstack on Ubuntu 14.04.6 and > getting below error:- > > 2019-05-20 05:21:05.321 | full installdeps: > -chttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt, > -r/opt/stack/tempest/requirements.txt > > 2019-05-20 05:21:06.347 | > > 2019-05-20 05:21:06.347 | =================================== log end > ==================================== > > 2019-05-20 05:21:06.347 | ERROR: could not install deps > [-chttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt, > -r/opt/stack/tempest/requirements.txt]; v = > InvocationError(u'/opt/stack/tempest/.tox/tempest/bin/pip install > -chttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt > -r/opt/stack/tempest/requirements.txt', 2) > > 2019-05-20 05:21:06.347 | ___________________________________ summary > ____________________________________ > > 2019-05-20 05:21:06.347 | ERROR:   full: could not install deps > [-chttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt, > -r/opt/stack/tempest/requirements.txt]; v = > InvocationError(u'/opt/stack/tempest/.tox/tempest/bin/pip install > -chttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt > -r/opt/stack/tempest/requirements.txt', 2) > > SSLError: hostname 'git.openstack.org' doesn't match either of > 'developer.openstack.org', 'www.developer.openstack.org' > > You are using pip version 8.1.2, however version 19.1.1 is available. > > Please let me know if other details are required. > This is probably due to the opendev rename that happened a few weeks ago. I'm surprised there is still a stable/newton branch open for devstack since most if not all of the other service projects that devstack relies on have already end of life'd the stable/newton branch, so I would expect the devstack newton branch should also be end of life. -- Thanks, Matt From ed at leafe.com Thu May 30 13:31:12 2019 From: ed at leafe.com (Ed Leafe) Date: Thu, 30 May 2019 08:31:12 -0500 Subject: [placement] Office Hours Message-ID: <923884A4-E427-439E-AD76-9DDBB45550D9@leafe.com> A recent poll of our members resulted in the decision to no longer hold our regular weekly meetings, but to instead switch to an Office Hour format. Now the question is when to hold them, and how often. A time between 1400—2200 UTC would work well for US-based people, while times between 0900—1700 UTC would be best for Europe people, and 0100—0900 UTC for Asia-Pacific. It would seem that holding it at 1600 UTC would be a good compromise for both US and Europe, but that leaves out the other half of the world. So perhaps our members in that side of the globe might be interested in holding their own Office Hour? So I'd like your input. Respond with your preference, and after a week if there is an overwhelming favorite, we'll go with that. If there is a range of preferences, we can hold another poll to settle on one. I'll start: my preference would be to hold weekly office hours every Wednesday at 1600 UTC. 
-- Ed Leafe From alee at redhat.com Thu May 30 14:17:32 2019 From: alee at redhat.com (Ade Lee) Date: Thu, 30 May 2019 10:17:32 -0400 Subject: [barbican] dev: Using Barbican for media box/center unattended cert, key (KEK) updates, etc. In-Reply-To: References: Message-ID: <3e1c9fb4395cdeb4293f64b0ebbfdd3e33e47b68.camel@redhat.com> See responses below: On Thu, 2019-05-30 at 03:53 +0200, Hunter Nins wrote: > 2nd try. > > Including link to my Stackoverflow post to centrlize responses. > > stackoverflow.com/questions/56360728/unattended-automated-linux- > device-key-management-certs-for-accessing-update-ser > > > > Sent: Tuesday, May 28, 2019 at 3:49 PM > From: "Hunter Nins" > To: openstack-discuss at lists.openstack.org > Subject: [barbican] dev: Using Barbican for media box/center > unattended cert, key (KEK) updates, etc. > Hello, > > Pardon: the first attempt was html-formatted. > > I am currently working on a customized media center/box product for > my employer. It's basically a Raspberry Pi 3b+ running Raspian, > configured to auto-update periodically via `apt`. The device accesses > binaries for proprietary applications via a private, secured `apt` > repo, using a pre-installed certificate on the device. > > Right now, the certificate on the device is set to never expire, but > going forward, we'd like to configure the certificate to expire every > 4 months. We also plan to deploy a unique certificate per device we > ship, so the certs can be revoked if the tamper mechanism is > triggered (i.e. if the customer rips open the box, it blows a fuse > that is attached to a ADC chip, and the device reports in s/w that > the sensor has been tripped). Finally, we anticipate some customers > leaving the device offline, and only updating once every year > (allowing for time for the cert to expire). > > Is there a way I could use Barbican to: > * Update the certs for apt-repo access on the device periodically. Barbican used to have an interface to issue certs, but this was removed. Therefore barbican is simply a service to generate and store secrets. You could use something like certmonger. certmonger is a client side daemon that generates cert requests and submits them to a CA. It then tracks those certs and requests new ones when the certs are going to expire. > * Setup key-encryption-keys (KEK) on the device, if we need the > device to be able to download sensitive data, such as an in-memory > cached copy of customer info. To use barbican, you need to be able to authenticate and retrieve something like a keystone token. Once you have that, you can use barbican to generate key encryption keys (which would be stored in the barbican database) and download them to the device using the secret retrieval API. Do you need/want the KEK's escrowed like this though? > * Provide a mechanism for a new key to be deployed on the device if > the currently-used key has expired (i.e. the user hasn't connected > the device to the internet for more than 4 months). Barbican has no mechanism for this. This is client side tooling that would need to be written. You'd need to think about authentication. > * Allow keys to be tracked, revoked, and de-commissioned. > Same as above. Barbican has no mechanism for this. > Thank you for your time and assistance. 
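To make the store-and-retrieve flow described above a little more concrete, a rough python-barbicanclient sketch follows; the keystone endpoint, credentials and secret name are placeholders, and error handling is omitted:

  from keystoneauth1 import identity, session
  from barbicanclient import client

  auth = identity.Password(auth_url='https://keystone.example.com/v3',
                           username='device-svc', password='...',
                           project_name='devices',
                           user_domain_name='Default',
                           project_domain_name='Default')
  barbican = client.Client(session=session.Session(auth=auth))

  # ask barbican to generate and store a 256-bit AES key-encryption-key
  order = barbican.orders.create_key(name='device-kek',
                                     algorithm='aes', bit_length=256)
  order_ref = order.submit()

  # later, once the order is complete, fetch the secret payload on the device
  secret_ref = barbican.orders.get(order_ref).secret_ref
  kek_bytes = barbican.secrets.get(secret_ref).payload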
> > From balazs.gibizer at est.tech Thu May 30 14:37:48 2019 From: balazs.gibizer at est.tech (=?Windows-1252?Q?Bal=E1zs_Gibizer?=) Date: Thu, 30 May 2019 14:37:48 +0000 Subject: [placement] Office Hours In-Reply-To: <923884A4-E427-439E-AD76-9DDBB45550D9@leafe.com> References: <923884A4-E427-439E-AD76-9DDBB45550D9@leafe.com> Message-ID: <1559227066.23481.3@smtp.office365.com> On Thu, May 30, 2019 at 3:31 PM, Ed Leafe wrote: > A recent poll of our members resulted in the decision to no longer > hold our regular weekly meetings, but to instead switch to an Office > Hour format. Now the question is when to hold them, and how often. > > A time between 1400—2200 UTC would work well for US-based people, > while times between 0900—1700 UTC would be best for Europe people, > and 0100—0900 UTC for Asia-Pacific. > > It would seem that holding it at 1600 UTC would be a good compromise > for both US and Europe, but that leaves out the other half of the > world. So perhaps our members in that side of the globe might be > interested in holding their own Office Hour? > > So I'd like your input. Respond with your preference, and after a > week if there is an overwhelming favorite, we'll go with that. If > there is a range of preferences, we can hold another poll to settle > on one. > > I'll start: my preference would be to hold weekly office hours every > Wednesday at 1600 UTC. 16:00 UTC means 18:00 CEST starting time for me during the summer. I would prefer at least one hour earlier if possible but I can manage the 16:00 UTC as well if majority wants that. So my preference is 15:00 UTC. Cheers, gibi > > > -- Ed Leafe > > > > > > From jean-philippe at evrard.me Thu May 30 14:58:34 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Thu, 30 May 2019 16:58:34 +0200 Subject: =?UTF-8?Q?Re:_[openstack-ansible]_Installing_Third-Party_drivers_into_th?= =?UTF-8?Q?e_Cinder-Volume_container_during_playbook_execution?= In-Reply-To: References: Message-ID: On Tue, May 28, 2019, at 04:10, Henry Bonath wrote: > Hello, I asked this into IRC but I thought this might be a more > appropriate place to ask considering the IRC channel usage over the > weekend. > > If I wanted to deploy a third party driver along with my Cinder-Volume > container, is there a built-in mechanism for doing so? (I am > specifically wanting to use: https://github.com/iXsystems/cinder) > > I am able to configure a cinder-backend in the > "openstack_user_config.yml" file which works perfectly if I let it > fail during the first run, then copy the driver into the containers > and run "os-cinder-install.yml" a second time. > > I've found that you guys have built similar stuff into the system > (e.g. Horizon custom Theme installation via .tgz) and was curious if > there is a similar mechanism for Cinder Drivers that may be > undocumented. > > http://paste.openstack.org/show/752132/ > This is an example of my working config, which relies on the driver > being copied into the > /openstack/venvs/cinder-19.x.x.x/lib/python2.7/site-packages/cinder/volume/drivers/ixsystems/ > folder. > > Thanks in advance! > > I suppose the community would be okay to have this in tree, so no need for a third party system here (and no need to maintain this on your own, separately). However... if it's just about copying the content of this repo, did you think of packaging this, and publish it to pypi ? This way you could just pip install the necessary package into your cinder venv... 
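As a sketch of what that packaging could look like, a minimal setup.py is enough; the distribution and package names below are invented, and the volume_driver option in cinder.conf would then point at wherever the installed driver class is importable from instead of at files copied into the venv:

  from setuptools import setup, find_packages

  setup(
      name='cinder-ixsystems-driver',          # placeholder distribution name
      version='0.1.0',
      description='Out-of-tree iXsystems volume driver for Cinder',
      packages=find_packages(include=['ixsystems_cinder*']),
      install_requires=['requests'],           # whatever the driver actually needs
  )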
Regards, Jean-Philippe Evrard (evrardjp) From mdulko at redhat.com Thu May 30 15:07:54 2019 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Thu, 30 May 2019 17:07:54 +0200 Subject: [requirements][kuryr][flame] openshift dificulties In-Reply-To: <20190529205352.f2dxzckgvfavbvtv@mthode.org> References: <20190529205352.f2dxzckgvfavbvtv@mthode.org> Message-ID: On Wed, 2019-05-29 at 15:53 -0500, Matthew Thode wrote: > Openshift upstream is giving us difficulty as they are capping the > version of urllib3 and kubernetes we are using. > > -urllib3===1.25.3 > +urllib3===1.24.3 > -kubernetes===9.0.0 > +kubernetes===8.0.1 > > I've opened an issue with them but not had much luck there (and their > prefered solution just pushes the can down the road). > > https://github.com/openshift/openshift-restclient-python/issues/289 > > What I'd us to do is move off of openshift as our usage doesn't seem too > much. > > openstack/kuryr-tempest-plugin uses it for one import (and just one > function with that import). I'm not sure exactly what you are doing > with it but would it be too much to ask to move to something else? >From Kuryr side it's not really much effort, we can switch to bare REST calls, but obviously we prefer the client. If there's much support for getting rid of it, we can do the switch. > x/flame has it in it's constraints but I don't see any actual usage, so > perhaps it's a false flag. > > Please let me know what you think > From mthode at mthode.org Thu May 30 15:17:39 2019 From: mthode at mthode.org (Matthew Thode) Date: Thu, 30 May 2019 10:17:39 -0500 Subject: [requirements][kuryr][flame] openshift dificulties In-Reply-To: References: <20190529205352.f2dxzckgvfavbvtv@mthode.org> Message-ID: <20190530151739.nfzrqfstlb2sbrq5@mthode.org> On 19-05-30 17:07:54, Michał Dulko wrote: > On Wed, 2019-05-29 at 15:53 -0500, Matthew Thode wrote: > > Openshift upstream is giving us difficulty as they are capping the > > version of urllib3 and kubernetes we are using. > > > > -urllib3===1.25.3 > > +urllib3===1.24.3 > > -kubernetes===9.0.0 > > +kubernetes===8.0.1 > > > > I've opened an issue with them but not had much luck there (and their > > prefered solution just pushes the can down the road). > > > > https://github.com/openshift/openshift-restclient-python/issues/289 > > > > What I'd us to do is move off of openshift as our usage doesn't seem too > > much. > > > > openstack/kuryr-tempest-plugin uses it for one import (and just one > > function with that import). I'm not sure exactly what you are doing > > with it but would it be too much to ask to move to something else? > > From Kuryr side it's not really much effort, we can switch to bare REST > calls, but obviously we prefer the client. If there's much support for > getting rid of it, we can do the switch. > Right now Kyryr is only using it in that one place and it's blocking the update of urllib3 and kubernetes for the rest of openstack. So if it's not too much trouble it'd be nice to have happen. > > x/flame has it in it's constraints but I don't see any actual usage, so > > perhaps it's a false flag. > > > > Please let me know what you think > > -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From jim at jimrollenhagen.com Thu May 30 15:27:10 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Thu, 30 May 2019 11:27:10 -0400 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: References: <20190503190538.GB3377@localhost.localdomain> <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> Message-ID: On Wed, May 15, 2019 at 2:05 PM Jim Rollenhagen wrote: > On Wed, May 15, 2019 at 1:51 PM Adam Spiers wrote: > >> Hi Jim, >> >> Jim Rollenhagen wrote: >> >On Fri, May 3, 2019 at 3:05 PM Paul Belanger >> wrote: >> > >> >> On Fri, May 03, 2019 at 08:48:10PM +0200, Roman Gorshunov wrote: >> >> > Hello Jim, team, >> >> > >> >> > I'm from Airship project. I agree with archival of Github mirrors of >> >> > repositories. One small suggestion: could we have project >> descriptions >> >> > adjusted to point to the new location of the source code repository, >> >> > please? E.g. "The repo now lives at opendev.org/x/y". >> >> > >> >> This is something important to keep in mind from infra side, once the >> >> repo is read-only, we lose the ability to use the API to change it. >> >> >> >> From manage-projects.py POV, we can update the description before >> >> flipping the archive bit without issues, just need to make sure we have >> >> the ordering correct. >> > >> >Agree this is a good idea. >> >> Just checking you saw my reply to the same email from Paul? >> >> >> http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005846.html > > > Sorry, yes I saw it, but then later mis-remembered it. :( > > >> >> >> >There's been no objections to this plan for some time now. >> >> I might be missing some context, but I *think* my email could be >> interpreted as an objection based on the reasons listed in it. >> >> Also, the understanding I took away from the PTG session was that >> there was consensus *not* to archive repos, but rather to ensure that >> mirroring and redirects are set up properly. However I am of course >> very willing to be persuaded otherwise. >> >> Please could you take a look at that mail and let us know what you >> think? Thanks! >> > > So there's two things to do, in this order: > > 1) do a proper transfer of the Airship repos > 2) Archive any other repos on github that are no longer in the openstack > namespace. > > Has the airship team been working with the infra team on getting > the transfer done? I would think that could be done quickly, and then > we can proceed with archiving the others. > Bumping this thread. Looks like the airship transfer is done[0], is that correct? I don't believe step 2 has been done yet; is there someone from the infra team that can do that or help me do it? [0] https://github.com/openstack/?utf8=%E2%9C%93&q=airship&type=&language= // jim -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Thu May 30 15:47:56 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 30 May 2019 10:47:56 -0500 Subject: [nova] Spec review sprint Tuesday June 04 Message-ID: <52df6449-5d49-ee77-5309-90f2cd90283c@fried.cc> Hi all. We would like to do a nova-specs review push next Tuesday, June 4th. If you own one or more specs, please try to polish them and address any outstanding downvotes before Tuesday; and on Tuesday, please try to be available in #openstack-nova (or paying close attention to gerrit) to discuss them if needed. 
If you are a nova reviewer, contributor, or stakeholder, please try to spend a good chunk of your upstream time on Tuesday reviewing open Train specs [1]. Thanks, efried [1] Approximately: https://review.opendev.org/#/q/project:openstack/nova-specs+status:open+path:%255Especs/train/approved/.* From cboylan at sapwetik.org Thu May 30 16:00:20 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Thu, 30 May 2019 09:00:20 -0700 Subject: =?UTF-8?Q?Re:_[tc][all]_Github_mirroring_(or_lack_thereof)_for_unofficia?= =?UTF-8?Q?l_projects?= In-Reply-To: References: <20190503190538.GB3377@localhost.localdomain> <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> Message-ID: <8d81b9a7-b460-43e1-a774-9bd65ee42143@www.fastmail.com> On Thu, May 30, 2019, at 8:28 AM, Jim Rollenhagen wrote: > On Wed, May 15, 2019 at 2:05 PM Jim Rollenhagen wrote: > > On Wed, May 15, 2019 at 1:51 PM Adam Spiers wrote: > >> Hi Jim, > >> > >> Jim Rollenhagen wrote: > >> >On Fri, May 3, 2019 at 3:05 PM Paul Belanger wrote: > >> > > >> >> On Fri, May 03, 2019 at 08:48:10PM +0200, Roman Gorshunov wrote: > >> >> > Hello Jim, team, > >> >> > > >> >> > I'm from Airship project. I agree with archival of Github mirrors of > >> >> > repositories. One small suggestion: could we have project descriptions > >> >> > adjusted to point to the new location of the source code repository, > >> >> > please? E.g. "The repo now lives at opendev.org/x/y". > >> >> > > >> >> This is something important to keep in mind from infra side, once the > >> >> repo is read-only, we lose the ability to use the API to change it. > >> >> > >> >> From manage-projects.py POV, we can update the description before > >> >> flipping the archive bit without issues, just need to make sure we have > >> >> the ordering correct. > >> > > >> >Agree this is a good idea. > >> > >> Just checking you saw my reply to the same email from Paul? > >> > >> http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005846.html > > > > Sorry, yes I saw it, but then later mis-remembered it. :( > >> > >> > >> >There's been no objections to this plan for some time now. > >> > >> I might be missing some context, but I *think* my email could be > >> interpreted as an objection based on the reasons listed in it. > >> > >> Also, the understanding I took away from the PTG session was that > >> there was consensus *not* to archive repos, but rather to ensure that > >> mirroring and redirects are set up properly. However I am of course > >> very willing to be persuaded otherwise. > >> > >> Please could you take a look at that mail and let us know what you > >> think? Thanks! > > > > So there's two things to do, in this order: > > > > 1) do a proper transfer of the Airship repos > > 2) Archive any other repos on github that are no longer in the openstack namespace. > > Has the airship team been working with the infra team on getting > > the transfer done? I would think that could be done quickly, and then > > we can proceed with archiving the others. > > Bumping this thread. > > Looks like the airship transfer is done[0], is that correct? Correct. > > I don't believe step 2 has been done yet; is there someone from > the infra team that can do that or help me do it? If you provide us with the canonical list of things to archive I think we can probably script that up or do lots of clicking depending on the size of the list I guess. 
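For reference, "scripting that up" against the GitHub REST API is roughly this small; the token, org and repository list are placeholders, the description text follows the earlier suggestion in this thread, and the description is updated before the archive flag because archived repos become read-only:

  import requests

  TOKEN = '...'              # token with admin rights on the repositories
  ORG = 'openstack'          # assumption: the repos to archive still live here
  REPOS = ['example-repo']   # the canonical list to be provided

  for name in REPOS:
      url = 'https://api.github.com/repos/%s/%s' % (ORG, name)
      headers = {'Authorization': 'token %s' % TOKEN,
                 'Accept': 'application/vnd.github.v3+json'}
      # the new-home namespace below is a placeholder; not every repo maps to x/
      requests.patch(url, headers=headers, json={
          'description': 'OBSOLETE: this repo now lives at '
                         'https://opendev.org/x/%s' % name}).raise_for_status()
      requests.patch(url, headers=headers,
                     json={'archived': True}).raise_for_status()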
> > [0] https://github.com/openstack/?utf8=%E2%9C%93&q=airship&type=&language= > > // jim From km.giuseppesannino at gmail.com Thu May 30 16:53:37 2019 From: km.giuseppesannino at gmail.com (Giuseppe Sannino) Date: Thu, 30 May 2019 18:53:37 +0200 Subject: [rally] Is there a way to log command printout in rally logs? Message-ID: Hi all, I'm new in Rally, so please apologize in advance if the question is a no-sense. As per the subject, I wonder if there is a way in Rally to collect the printouts from the openstack command execution. For example, when I execute the "NovaHypervisors.list_and_get_hypervisors" scenario I'd like to see and inspect (in a file or on the screen) the actual result of the "nova hypervisor-list" command. Many thanks in advance BR /Giuseppe -------------- next part -------------- An HTML attachment was scrubbed... URL: From elmiko at redhat.com Thu May 30 17:08:27 2019 From: elmiko at redhat.com (Michael McCune) Date: Thu, 30 May 2019 13:08:27 -0400 Subject: [keystone][placement][neutron][api-sig] http404 to NotFound, or how should a http json error body look like? In-Reply-To: References: <9ae10062-a9c8-3e76-15a1-da0745361c57@ericsson.com> <93c95d69-c87a-4d4d-bf10-3b6b293b8a6a@www.fastmail.com> Message-ID: hi, just wanted to post a followup to this discussion. during our office hours today we discussed this conversation again, i think the ultimate output is that i will take a look at adding a patch for keystoneauth to be more tolerant of the error payload variances. thanks again for the discussions =) peace o/ From robson.rbarreto at gmail.com Thu May 30 17:29:22 2019 From: robson.rbarreto at gmail.com (Robson Ramos Barreto) Date: Thu, 30 May 2019 14:29:22 -0300 Subject: [openstack-helm] external ceph Message-ID: Hi All, I'd like to know how can I integrate openstack deployment with my existing ceph cluster using openstack helm to be used for both k8s dynamic provision (PVC) and as storage backend to cinder, glance and nova. I'm trying deploying openstack with ceph[1] to understanding the needs but it's a little painful I'd appreciate any help Thank you [1] https://docs.openstack.org/openstack-helm/latest/install/developer/deploy-with-ceph.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Thu May 30 18:06:59 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 30 May 2019 18:06:59 +0000 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: <8d81b9a7-b460-43e1-a774-9bd65ee42143@www.fastmail.com> References: <20190503190538.GB3377@localhost.localdomain> <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> <8d81b9a7-b460-43e1-a774-9bd65ee42143@www.fastmail.com> Message-ID: <20190530180658.xgpcy35au72ccmzt@yuggoth.org> On 2019-05-30 09:00:20 -0700 (-0700), Clark Boylan wrote: [...] > If you provide us with the canonical list of things to archive I > think we can probably script that up or do lots of clicking > depending on the size of the list I guess. [...] Alternatively, I's like to believe we're at the point where we can add other interested parties to the curating group for the openstack org on GH, at which point any of them could volunteer to do the archiving. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Thu May 30 18:14:32 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 30 May 2019 18:14:32 +0000 Subject: stable/newton devstack installation is failing on ubuntu 14.04.6 In-Reply-To: <8962c449-6610-d75d-60cf-ad85cf22dd0f@gmail.com> References: <8962c449-6610-d75d-60cf-ad85cf22dd0f@gmail.com> Message-ID: <20190530181432.wsvzgtf23f5xy45t@yuggoth.org> On 2019-05-30 08:11:12 -0500 (-0500), Matt Riedemann wrote: > On 5/27/2019 6:40 AM, Ansari, Arshad wrote: > > I am trying to install stable/newton devstack on Ubuntu 14.04.6 and > > getting below error:- [...] > This is probably due to the opendev rename that happened a few weeks ago. > I'm surprised there is still a stable/newton branch open for devstack since > most if not all of the other service projects that devstack relies on have > already end of life'd the stable/newton branch, so I would expect the > devstack newton branch should also be end of life. Also it warrants pointing out that Ubuntu 14.04 is past end of support from Canonical. As far as I know, versions of Ubuntu still considered supported can do SNI and will not exhibit this bug. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From ikuo.otani.rw at hco.ntt.co.jp Thu May 30 09:02:57 2019 From: ikuo.otani.rw at hco.ntt.co.jp (Ikuo Otani) Date: Thu, 30 May 2019 18:02:57 +0900 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> Message-ID: <002101d516c6$7979d1a0$6c6d74e0$@hco.ntt.co.jp> Hi Eric, I'm sorry for late reply. Your kind explanation made me understand. Our situation is as Sundar mentioned. I am a beginner for developing openstack, but I will try to make cyborgclient more elaborate anyway! Thanks, Ikuo -----Original Message----- From: Eric Fried Sent: Thursday, May 23, 2019 11:10 PM To: openstack-discuss at lists.openstack.org Subject: Re: [cyborg][nova][sdk]Cyborgclient integration Hi Ikuo. I'm glad you're interested in helping out with this effort. I'm trying to understand where you intend to make your changes, assuming you're coming at this purely from a cyborg perspective: - If in Nova, this isn't necessary because there's no python-cyborgclient integration there. Communication from Nova to Cyborg is being added as part of blueprint nova-cyborg-interaction [0], but it'll be done without python-cyborgclient from the start. The patch at [1] sets up direct REST API communication through a KSA adapter. Once we have base openstacksdk enablement in Nova [2] we can simply s/get_ksa_adapter/get_sdk_adapter/ at [3]. And in the future as the sdk starts supporting richer methods for cyborg, we can replace the direct REST calls in that file (see [4] later in that series to get an idea of what kinds of calls those will be). - If in the cyborg CLI, I'm afraid I have very little context there. There's a (nearly-official [5]) community push to make all CLIs OSC-based. I'm not sure what state the cyborg CLI is in, but I would have expected it will need brand new work to expose the v2 changes being done for Train. From that perspective I would say: do that via OSC. But that's not really related to bp/openstacksdk-in-nova. - If in python-cyborgclient, again I lack background, but I don't think there's a need to make changes here. 
The point of bp/openstacksdk-in-nova (or openstacksdk-anywhere) is to get rid of usages of client libraries like python-cyborgclient. Where is python-cyborgclient being used today? If it's just in the CLI, and you can make the above (conversion to OSC) happen, it may be possible to retire python-cyborgclient entirely \o/ Now, if you're generally available to contribute to either bp/nova-cyborg-interaction (by helping with the nova code) or bp/openstack-sdk-in-nova (on non-cyborg-related aspects), I would be delighted to get you ramped up. We could sure use the help. Please let me know if you're interested. Thanks, efried [0] https://blueprints.launchpad.net/nova/+spec/nova-cyborg-interaction [1] https://review.opendev.org/#/c/631242/ [2] https://review.opendev.org/#/c/643664/ [3] https://review.opendev.org/#/c/631242/19/nova/accelerator/cyborg.py at 23 [4] https://review.opendev.org/#/c/631245/13/nova/accelerator/cyborg.py [5] https://review.opendev.org/#/c/639376/ On 5/23/19 7:58 AM, Ikuo Otani wrote: > Hi, > > I am a Cyborg member and take the role of integrating openstacksdk and replacing use of python-*client. > Related bluespec: > https://blueprints.launchpad.net/nova/+spec/openstacksdk-in-nova > > My question is: > When the first code should be uploaded to gerrit? > From my understanding, we should update cyborg client library > referring to openstacksdk rule, and make it reviewed in gerrit by Eric Fried. > I'm sorry if I misunderstand. > > Thanks in advance, > Ikuo > > NTT Network Service Systems Laboratories Server Network Innovation > Project Ikuo Otani > TEL: +81-422-59-4140 > Email: ikuo.otani.rw at hco.ntt.co.jp > > > > From Anirudh.Gupta at hsc.com Thu May 30 11:29:08 2019 From: Anirudh.Gupta at hsc.com (Anirudh Gupta) Date: Thu, 30 May 2019 11:29:08 +0000 Subject: [Airship-Seaworthy] Deployment of Airship-Seaworthy on Virtual Environment In-Reply-To: References: Message-ID: Hi Team, I am trying to create Airship-Seaworthy from the link https://airship-treasuremap.readthedocs.io/en/latest/seaworthy.html It requires 6 DELL R720xd bare-metal servers: 3 control, and 3 compute nodes to be configured, but there is no documentation of how to install and getting started with Airship-Seaworthy. Do we need to follow the “Getting Started” section mentioned in Airsloop or will there be any difference in case of Seaworthy. https://airship-treasuremap.readthedocs.io/en/latest/airsloop.html#getting-started Also what all configurations need to be run from the 3 controller nodes and what needs to be run from 3 computes? Regards अनिरुद्ध गुप्ता (वरिष्ठ अभियंता) From: Li, Cheng1 Sent: 30 May 2019 08:29 To: Anirudh Gupta ; airship-discuss at lists.airshipit.org; airship-announce at lists.airshipit.org; openstack-dev at lists.openstack.org; openstack at lists.openstack.org Subject: RE: [Airship-Seaworthy] Deployment of Airship-Seaworthy on Virtual Environment I have the same question. I haven’t seen any docs which guides how to deploy airsloop/air-seaworthy in virtual env. I am trying to deploy airsloop on libvirt/kvm driven virtual env. Two VMs, one for genesis, the other for compute. Virtualbmc for ipmi simulation. The genesis.sh scripts has been run on genesis node without error. But deploy_site fails at prepare_and_deploy_nodes task(action ‘set_node_boot’ timeout). I am still investigating this issue. It will be great if we have official document for this scenario. 
Thanks, Cheng From: Anirudh Gupta [mailto:Anirudh.Gupta at hsc.com] Sent: Wednesday, May 29, 2019 3:31 PM To: airship-discuss at lists.airshipit.org; airship-announce at lists.airshipit.org; openstack-dev at lists.openstack.org; openstack at lists.openstack.org Subject: [Airship-Seaworthy] Deployment of Airship-Seaworthy on Virtual Environment Hi Team, We want to test Production Ready Airship-Seaworthy in our virtual environment The link followed is https://airship-treasuremap.readthedocs.io/en/latest/seaworthy.html As per the document we need 6 DELL R720xd bare-metal servers: 3 control, and 3 compute nodes. But we need to deploy our setup on Virtual Environment. Does Airship-Seaworthy support Installation on Virtual Environment? We have 2 Rack Servers with Dual-CPU Intel® Xeon® E5 26xx with 16 cores each and 128 GB RAM. Is it possible that we can create Virtual Machines on them and set up the complete environment. In that case, what possible infrastructure do we require for setting up the complete setup. Looking forward for your response. Regards अनिरुद्ध गुप्ता (वरिष्ठ अभियंता) Hughes Systique Corporation D-23,24 Infocity II, Sector 33, Gurugram, Haryana 122001 DISCLAIMER: This electronic message and all of its contents, contains information which is privileged, confidential or otherwise protected from disclosure. The information contained in this electronic mail transmission is intended for use only by the individual or entity to which it is addressed. If you are not the intended recipient or may have received this electronic mail transmission in error, please notify the sender immediately and delete / destroy all copies of this electronic mail transmission without disclosing, copying, distributing, forwarding, printing or retaining any part of it. Hughes Systique accepts no responsibility for loss or damage arising from the use of the information transmitted by this email including damage from virus. DISCLAIMER: This electronic message and all of its contents, contains information which is privileged, confidential or otherwise protected from disclosure. The information contained in this electronic mail transmission is intended for use only by the individual or entity to which it is addressed. If you are not the intended recipient or may have received this electronic mail transmission in error, please notify the sender immediately and delete / destroy all copies of this electronic mail transmission without disclosing, copying, distributing, forwarding, printing or retaining any part of it. Hughes Systique accepts no responsibility for loss or damage arising from the use of the information transmitted by this email including damage from virus. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu May 30 19:01:28 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 30 May 2019 14:01:28 -0500 Subject: [nova] Quality warning patch for xenapi driver Message-ID: <76d3402c-4e3f-5d75-2292-577f017ce8a0@gmail.com> As discussed during today's nova meeting [1] I've posted a change [2] to the xenapi driver such that it logs a warning on startup saying its not tested (no 3rd party CI) and as such may be subject to deprecation in the future. This is kind of the first step in formally deprecating the driver. If you have a vested interest in maintaining the xenapi driver please speak up here and/or on the change. 
[1] http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-05-30-14.00.log.html#l-43 [2] https://review.opendev.org/#/c/662295/ -- Thanks, Matt From MM9745 at att.com Thu May 30 19:04:56 2019 From: MM9745 at att.com (MCEUEN, MATT) Date: Thu, 30 May 2019 19:04:56 +0000 Subject: [infra] Elections for Airship Message-ID: <7C64A75C21BB8D43BD75BB18635E4D89709A2256@MOSTLS1MSGUSRFF.ITServices.sbc.com> OpenStack Infra team, As the Airship project works to finalize our governance and elected positions [1], we need to be ready to hold our first elections. I wanted to reach out and ask for any experience, guidance, materials, or tooling you can share that would help this run correctly and smoothly? This is an area where the Airship team doesn't have much experience so we may not know the right questions to ask. Aside from a member of the Airship community creating a poll in CIVS [2], is there anything else you would recommend? Is there any additional tooling in place in the OpenStack world? Any potential pitfalls, or other hard-won advice for us? [1] https://review.opendev.org/#/c/653865/ [2] https://civs.cs.cornell.edu/ Thank you for your help! Matt From mriedemos at gmail.com Thu May 30 19:06:36 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 30 May 2019 14:06:36 -0500 Subject: [nova] libvirt+virtuozzo CI is broken Message-ID: <7fb7636c-01b6-fd10-6d81-1c2f95ace61c@gmail.com> I noticed this today: http://openstack-3rd-party-virtuozzo-ci-logs.virtuozzo.com/24/638324/30/check/check-dsvm-tempest-vz7-exe-minimal/620b3db/console.html.gz 2019-05-24 02:16:51.833 | + :pre_test_hook:L13: git fetch https://review.opendev.org/p/openstack/nova refs/changes/86/506686/4 2019-05-24 02:17:01.100 | From https://review.opendev.org/p/openstack/nova 2019-05-24 02:17:01.102 | * branch refs/changes/86/506686/4 -> FETCH_HEAD 2019-05-24 02:17:01.109 | + :pre_test_hook:L14: git cherry-pick FETCH_HEAD 2019-05-24 02:17:01.311 | error: could not apply 789a1f2... don't add device address if there is no any units 2019-05-24 02:17:01.313 | hint: after resolving the conflicts, mark the corrected paths 2019-05-24 02:17:01.315 | hint: with 'git add ' or 'git rm ' 2019-05-24 02:17:01.318 | hint: and commit the result with 'git commit' Looks like the CI is broken because this change is in merge conflict: https://review.opendev.org/#/c/506686/ Does anyone plan on working on that? It's had a -1 on it since August of 2018. Is that change really needed for the CI to work? -- Thanks, Matt From dtroyer at gmail.com Thu May 30 19:07:14 2019 From: dtroyer at gmail.com (Dean Troyer) Date: Thu, 30 May 2019 14:07:14 -0500 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: <1CC272501B5BC543A05DB90AA509DED527576F84@fmsmsx122.amr.corp.intel.com> References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> <1CC272501B5BC543A05DB90AA509DED527576F84@fmsmsx122.amr.corp.intel.com> Message-ID: On Wed, May 29, 2019 at 10:51 PM Nadathur, Sundar wrote: > We are trying to understand the pros and cons of doing the OSC plugin vs integrating directly into OSC repo. > > 1. Consistency requirements are fine. Presumably we should follow [1]. They seem to be loose guidelines than specific requirements. I suppose the command syntax would look like: > $ openstack accelerator create device-profile Yes, that is the standard. I did not write it using RFC-style 'MUST' language, but I do consider the command structure a hard requirement for in-repo commands. 
That is the entire point of OSC's existence. > 2. When you say, " This is a significant change for OSC ", could you tell us what the risks are for direct integration? The repo [2] seems to have entries for compute, network, identity, etc. So, it doesn’t look like uncharted waters. Without reciting the entire history, what is in the repo is all that existed when I initially developed the command set, we later added Network API support. Everything else has been plugins since then for various reasons. The risk is that you lose absolute control over your CLI. Everything from the naming of resources (this is the hardest part by far) to the option names and verbs used must follow the guidelines. Not all teams want or can accept that. Others found freedom there, including the freedom to be heavily involved in OSC directly (Keystone is the prime example) and set their path within those guidelines. Neutron also spent a large amount of time and effort getting their core commands implemented in-repo, plus there is a Neutron plugin for a number of extensions. dt -- Dean Troyer dtroyer at gmail.com From dtroyer at gmail.com Thu May 30 19:15:40 2019 From: dtroyer at gmail.com (Dean Troyer) Date: Thu, 30 May 2019 14:15:40 -0500 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: <8457085b04e80164fe45eb54a3b54afab7c10bfb.camel@redhat.com> References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> <1CC272501B5BC543A05DB90AA509DED527576F84@fmsmsx122.amr.corp.intel.com> <8457085b04e80164fe45eb54a3b54afab7c10bfb.camel@redhat.com> Message-ID: On Thu, May 30, 2019 at 5:00 AM Sean Mooney wrote: > > $ openstack accelerator create device-profile > i think this should be "openstack acclerator device-profile create" looking how > "security group rule create" works the action is always last in the osc commands The resource name should be clear and descriptive and unique. Depending on the set of resources you have in whole you may want 'accelerator profile', I would leave 'device' out as it does not really add anything unless you also have other types of profiles. > the main disadvantage to integrating osc intree will be review time. e.g. as the cyborg core team > ye will not have +2 rights on osc but if you have your own osc plugin then the cyborg core team > can also manage the cli for cyborg. This is true, the number of people reviewing regularly on OSC outside their specific project commands is small. dt -- Dean Troyer dtroyer at gmail.com From jim at jimrollenhagen.com Thu May 30 19:15:59 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Thu, 30 May 2019 15:15:59 -0400 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: <20190530180658.xgpcy35au72ccmzt@yuggoth.org> References: <20190503190538.GB3377@localhost.localdomain> <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> <8d81b9a7-b460-43e1-a774-9bd65ee42143@www.fastmail.com> <20190530180658.xgpcy35au72ccmzt@yuggoth.org> Message-ID: On Thu, May 30, 2019 at 2:18 PM Jeremy Stanley wrote: > On 2019-05-30 09:00:20 -0700 (-0700), Clark Boylan wrote: > [...] > > If you provide us with the canonical list of things to archive I > > think we can probably script that up or do lots of clicking > > depending on the size of the list I guess. > [...] 
> > Alternatively, I's like to believe we're at the point where we can > add other interested parties to the curating group for the openstack > org on GH, at which point any of them could volunteer to do the > archiving. > Thanks Clark/Jeremy. I'll make a list tomorrow, as we'll need that in either case. :) // jim -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Thu May 30 19:23:43 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 30 May 2019 15:23:43 -0400 Subject: [tc][ops] reviving osops- repos Message-ID: Hi everyone, So we've realized that we have a few operational tools that we keep around that live all over the place and it would probably be the best for the community to have access to them. I did a bit of research and it looks like the openstack/osops-* repos used to exist between operators to maintain things like this, however, they've gone pretty abandoned over time. I'd like to pick up this effort and as part of doing that, I'd like to move them into a GitHub organization and rename them out of x/ to make them not seem so .. 'dead'. I've requested to move x/osops-* to openstack-operators/* in GitHub so that I can setup the appropriate mirroring in post pipeline (and then propose a patch to rename things inside Gerrit as well). There was concern around the idea of using 'openstack' as it's a copyrighted term and a TC discussion was brought up as an idea which makes sense. I'd like to ask if everyone else feels like it's okay so move things into that org and host them there. Thanks, Mohammed -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From sean.mcginnis at gmx.com Thu May 30 19:30:15 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 30 May 2019 14:30:15 -0500 Subject: [infra] Elections for Airship In-Reply-To: <7C64A75C21BB8D43BD75BB18635E4D89709A2256@MOSTLS1MSGUSRFF.ITServices.sbc.com> References: <7C64A75C21BB8D43BD75BB18635E4D89709A2256@MOSTLS1MSGUSRFF.ITServices.sbc.com> Message-ID: <20190530193015.GA16128@sm-workstation> On Thu, May 30, 2019 at 07:04:56PM +0000, MCEUEN, MATT wrote: > OpenStack Infra team, > > As the Airship project works to finalize our governance and elected positions [1], we need to be ready to hold our first elections. I wanted to reach out and ask for any experience, guidance, materials, or tooling you can share that would help this run correctly and smoothly? This is an area where the Airship team doesn't have much experience so we may not know the right questions to ask. > > Aside from a member of the Airship community creating a poll in CIVS [2], is there anything else you would recommend? Is there any additional tooling in place in the OpenStack world? Any potential pitfalls, or other hard-won advice for us? > > [1] https://review.opendev.org/#/c/653865/ > [2] https://civs.cs.cornell.edu/ > > Thank you for your help! > Matt > Hey Matt, You might be able to find some useful process information and some tooling for how the OpenStack community has done elections here: https://opendev.org/openstack/election Sean From fungi at yuggoth.org Thu May 30 20:55:52 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 30 May 2019 20:55:52 +0000 Subject: [tc][ops] reviving osops- repos In-Reply-To: References: Message-ID: <20190530205552.falsvxcegehtyuge@yuggoth.org> On 2019-05-30 15:23:43 -0400 (-0400), Mohammed Naser wrote: [...] 
> I've requested to move x/osops-* to openstack-operators/* in > GitHub so that I can setup the appropriate mirroring in post > pipeline (and then propose a patch to rename things inside Gerrit > as well). [...] My only real concern (voiced already with perhaps far less brevity in #openstack-tc) is that we should avoid leaving the impression that software written by "operators" isn't good enough for the openstack/ Git namespace. I want to be sure we remain clear that OpenStack was, and still is, written by its operators/users, not by some separate and nebulous cloud of "developers." We already have plenty of repositories under openstack/ which are maintained by SIGs and UC WGs/teams, so not everything there needs to be a deliverable governed by the OpenStack TC anyway. If this really is a collection of software written and maintained by the operators of OpenStack then it should be fine alongside the rest of the official OpenStack software. If it's not, then perhaps calling it the "openstack-operators" organization is... misleading? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From mnaser at vexxhost.com Thu May 30 21:12:28 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 30 May 2019 17:12:28 -0400 Subject: [tc][ops] reviving osops- repos In-Reply-To: <20190530205552.falsvxcegehtyuge@yuggoth.org> References: <20190530205552.falsvxcegehtyuge@yuggoth.org> Message-ID: On Thu, May 30, 2019 at 4:59 PM Jeremy Stanley wrote: > > On 2019-05-30 15:23:43 -0400 (-0400), Mohammed Naser wrote: > [...] > > I've requested to move x/osops-* to openstack-operators/* in > > GitHub so that I can setup the appropriate mirroring in post > > pipeline (and then propose a patch to rename things inside Gerrit > > as well). > [...] > > My only real concern (voiced already with perhaps far less brevity > in #openstack-tc) is that we should avoid leaving the impression > that software written by "operators" isn't good enough for the > openstack/ Git namespace. I want to be sure we remain clear that > OpenStack was, and still is, written by its operators/users, not by > some separate and nebulous cloud of "developers." I would love to personally make it live under the openstack/ namespace however I do feel like that does make it 'pretty darn official' and the quality of the tools there isn't something we probably want to put our name under yet. A lot of it is really old (2+ years old) and it probably needs a bit of time to get into a state where it's something that we can make official. Personally, I think that repo should be nothing but a 'buffer' between project features and tools needed by deployers, a lot of the things there seem to be there because of bugs (i.e. orphaned resource cleanup -- which should ideally be cleaned up by the service itself or warned there), clean-up disks for deleted VMs that were not removed, etc. > We already have plenty of repositories under openstack/ which are > maintained by SIGs and UC WGs/teams, so not everything there needs > to be a deliverable governed by the OpenStack TC anyway. If this > really is a collection of software written and maintained by the > operators of OpenStack then it should be fine alongside the rest of > the official OpenStack software. If it's not, then perhaps calling > it the "openstack-operators" organization is... misleading? 
> -- > Jeremy Stanley -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From openstack at fried.cc Thu May 30 21:14:55 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 30 May 2019 16:14:55 -0500 Subject: [nova] libvirt+virtuozzo CI is broken In-Reply-To: <7fb7636c-01b6-fd10-6d81-1c2f95ace61c@gmail.com> References: <7fb7636c-01b6-fd10-6d81-1c2f95ace61c@gmail.com> Message-ID: > Looks like the CI is broken because this change is in merge conflict: > > https://review.opendev.org/#/c/506686/ > > Does anyone plan on working on that? It's had a -1 on it since August of > 2018. Is that change really needed for the CI to work? This work was done in [1], so I abandoned the above change. Of course, that won't help the CI to pass, since it appears to be hardcoding that patch number somewhere. Anyone know where the setup lives for this? (Or better yet, who owns it?) efried [1] https://review.opendev.org/#/c/611974/ From fungi at yuggoth.org Thu May 30 21:22:07 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 30 May 2019 21:22:07 +0000 Subject: [nova] libvirt+virtuozzo CI is broken In-Reply-To: References: <7fb7636c-01b6-fd10-6d81-1c2f95ace61c@gmail.com> Message-ID: <20190530212207.ly54le7ai6jq3jlt@yuggoth.org> On 2019-05-30 16:14:55 -0500 (-0500), Eric Fried wrote: [...] > Anyone know where the setup lives for this? (Or better yet, who > owns it?) [...] Is it maybe the same as: https://wiki.openstack.org/wiki/ThirdPartySystems/Virtuozzo_CI -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at fried.cc Thu May 30 21:29:50 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 30 May 2019 16:29:50 -0500 Subject: [nova] libvirt+virtuozzo CI is broken In-Reply-To: <20190530212207.ly54le7ai6jq3jlt@yuggoth.org> References: <7fb7636c-01b6-fd10-6d81-1c2f95ace61c@gmail.com> <20190530212207.ly54le7ai6jq3jlt@yuggoth.org> Message-ID: <57d84059-a638-296e-4159-6f589b9f8b1a@fried.cc> > Is it maybe the same as: > > https://wiki.openstack.org/wiki/ThirdPartySystems/Virtuozzo_CI > Thanks Jeremy, seems worth a try. CC'ing... Hello Virtuozzo CI maintainers! Please read [1] and [2] regarding the broken-ness of the libvirt+virtuozzo CI. Our speculation is that the CI environment is hardcoding a particular patch as a dependency, and breaking because that patch is (was) in merge conflict. Because that same code fix was developed elsewhere, and has merged, you should be able to remove the dependency entirely from your CI setup. Thanks, efried [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006746.html [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006754.html From openstack at fried.cc Thu May 30 21:53:21 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 30 May 2019 16:53:21 -0500 Subject: [nova] Spec review sprint Tuesday June 04 In-Reply-To: <52df6449-5d49-ee77-5309-90f2cd90283c@fried.cc> References: <52df6449-5d49-ee77-5309-90f2cd90283c@fried.cc> Message-ID: Here's a slightly tighter dashboard, filtering out specs with -W. 23 total as of right now. https://review.opendev.org/#/q/project:openstack/nova-specs+status:open+path:%255Especs/train/approved/.*+NOT+label:Workflow-1 On 5/30/19 10:47 AM, Eric Fried wrote: > Hi all. 
We would like to do a nova-specs review push next Tuesday, June 4th. > > If you own one or more specs, please try to polish them and address any > outstanding downvotes before Tuesday; and on Tuesday, please try to be > available in #openstack-nova (or paying close attention to gerrit) to > discuss them if needed. > > If you are a nova reviewer, contributor, or stakeholder, please try to > spend a good chunk of your upstream time on Tuesday reviewing open Train > specs [1]. > > Thanks, > efried > > [1] Approximately: > https://review.opendev.org/#/q/project:openstack/nova-specs+status:open+path:%255Especs/train/approved/.* > From openstack at fried.cc Thu May 30 21:55:31 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 30 May 2019 16:55:31 -0500 Subject: [placement] Office Hours In-Reply-To: <1559227066.23481.3@smtp.office365.com> References: <923884A4-E427-439E-AD76-9DDBB45550D9@leafe.com> <1559227066.23481.3@smtp.office365.com> Message-ID: +1 for 1500 UTC Wednesdays. From mikal at stillhq.com Thu May 30 22:03:37 2019 From: mikal at stillhq.com (Michael Still) Date: Fri, 31 May 2019 08:03:37 +1000 Subject: [Glance] Removal of glance-replicator In-Reply-To: References: Message-ID: As the original author I vote that we just remove it. If someone wants that thing again they can deal with it then. Michael On Thu, May 30, 2019 at 9:41 PM Erno Kuvaja wrote: > Hi all, > > The replicator has been purely depending of the Images API v1 and thus not > operational at all for good couple of cycles now. As a part of the v1 > cleanup we are going to remove the replicator command as well. > > If the replicator is needed (currently doesn't seem to as no-one even > noticed that it deasn't work since we removed v1 api), lets not port the > original just to v2 but open a spec for it and take the opportunity to come > up with robust solution that will serve us well into the future. > > Thanks, > Erno "jokke" Kuvaja > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sundar.nadathur at intel.com Thu May 30 22:21:58 2019 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Thu, 30 May 2019 22:21:58 +0000 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> <1CC272501B5BC543A05DB90AA509DED527576F84@fmsmsx122.amr.corp.intel.com> <8457085b04e80164fe45eb54a3b54afab7c10bfb.camel@redhat.com> Message-ID: <1CC272501B5BC543A05DB90AA509DED5275774BC@fmsmsx122.amr.corp.intel.com> > -----Original Message----- > From: Dean Troyer > Sent: Thursday, May 30, 2019 12:16 PM > > On Thu, May 30, 2019 at 5:00 AM Sean Mooney > wrote: > > > $ openstack accelerator create device-profile > > i think this should be "openstack acclerator device-profile create" > > looking how "security group rule create" works the action is always > > last in the osc commands The osc documentation [1] says the syntax should be 'object-1 action object-2'. Your other points are well-taken. [1] https://docs.openstack.org/python-openstackclient/latest/contributor/humaninterfaceguide.html#command-structure > The resource name should be clear and descriptive and unique. > Depending on the set of resources you have in whole you may want > 'accelerator profile', I would leave 'device' out as it does not really add > anything unless you also have other types of profiles. The object itself is called a device profile, in the specs and in code. 
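
Whichever command naming wins out, the plumbing underneath is the same cliff/osc-lib pattern either way. A minimal, hypothetical sketch of such a plugin command (module layout, the client_manager attribute and the return shape are all made up; only the structure is the point):

    from osc_lib.command import command

    class CreateDeviceProfile(command.ShowOne):
        """Create a new accelerator device profile (illustrative only)."""

        def get_parser(self, prog_name):
            parser = super(CreateDeviceProfile, self).get_parser(prog_name)
            parser.add_argument('name', help='Name of the device profile')
            return parser

        def take_action(self, parsed_args):
            # Assumes the plugin registers an 'accelerator' attribute on the
            # client_manager via its openstack.cli.extension entry point, and
            # that the hypothetical client call returns a dict.
            client = self.app.client_manager.accelerator
            profile = client.create_device_profile(name=parsed_args.name)
            return zip(*sorted(profile.items()))

Wired up through a setup.cfg entry point group such as openstack.accelerator.v2, this surfaces as whichever command string the team settles on.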
> > the main disadvantage to integrating osc intree will be review time. > > e.g. as the cyborg core team ye will not have +2 rights on osc but if > > you have your own osc plugin then the cyborg core team can also manage > the cli for cyborg. > > This is true, the number of people reviewing regularly on OSC outside their > specific project commands is small. This may be the clinching argument. Also, Sean's observation that "as it stands we have been moving [toward] the everything is a plugin side of that scale." Since we need to deliver the client by Train, and the Cyborg team doing that is also doing other activities, perhaps we should keep the timeline as the main factor. Thanks, Sean, Dean and Eric for your inputs. Cyborg folks, it is time to weigh in. > dt Thanks & Regards, Sundar From fungi at yuggoth.org Thu May 30 22:36:25 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 30 May 2019 22:36:25 +0000 Subject: Elections for Airship In-Reply-To: <7C64A75C21BB8D43BD75BB18635E4D89709A2256@MOSTLS1MSGUSRFF.ITServices.sbc.com> References: <7C64A75C21BB8D43BD75BB18635E4D89709A2256@MOSTLS1MSGUSRFF.ITServices.sbc.com> Message-ID: <20190530223625.7ao2hmxlrrj3ny4b@yuggoth.org> On 2019-05-30 19:04:56 +0000 (+0000), MCEUEN, MATT wrote: > OpenStack Infra team, The OpenStack Infrastructure team hasn't been officially involved in running technical elections for OpenStack for several years now (subject tag removed accordingly). With the advent of Gerrit's REST API, contributor data can be queried and assembled anonymously by anyone. While I happen to be involved in these activities for longer than that's been the case, I'll be answering while wearing my OpenStack Technical Election Official hat throughout the remainder of this reply. > As the Airship project works to finalize our governance and > elected positions [1], we need to be ready to hold our first > elections. I wanted to reach out and ask for any experience, > guidance, materials, or tooling you can share that would help this > run correctly and smoothly? This is an area where the Airship team > doesn't have much experience so we may not know the right > questions to ask. > > Aside from a member of the Airship community creating a poll in > CIVS [2], is there anything else you would recommend? Is there any > additional tooling in place in the OpenStack world? Any potential > pitfalls, or other hard-won advice for us? [...] As Sean mentioned in his reply, the OpenStack community has been building and improving tooling in the openstack/election Git repository on OpenDev over the past few years. The important bits (in my opinion) center around querying Gerrit for a list of contributors whose changes have merged to sets of official project repositories within a qualifying date range. I've recently been assisting StarlingX's election officials with a similar request, and do have some recommendations. Probably the best place to start is adding an official structured dataset with your team/project information following the same schema used by OpenStack[0] and now StarlingX[1], then applying a couple of feature patches[2][3] (if they haven't merged by the time you read this) to the openstack/election master branch. 
After that, you ought to be able to run something along the lines of: tox -e venv -- owners --after 2018-05-30 --before 2019-05-31 --nonmember --outdir airship-electorate --projects ../../airship/governance/projects.yaml --ref master (Note that the --after and --before dates work like in Gerrit's query language and carry with them an implied midnight UTC, so one is the actual start date but the other is the day after the end date; "on or after" and "before but not on" is how I refer to them in prose.) You'll see the resulting airship-electorate directory includes a lot of individual files. There are two basic types: .yaml files which are structured data meant for human auditing as well as scripted analysis, and .txt files which are a strict list of one Gerrit preferred E-mail address per line for each voter (the format expected by the https://civs.cs.cornell.edu/ voting service). It's probably also obvious that there are sets of these named for each team in your governance, as well as a set which start with underscore (_). The former represent contributions to the deliverable repositories of each team, while the latter are produced from an aggregate of all deliverable repositories for all teams (this is what you might use for electing an Airship-wide governing body). There are a couple of extra underscore files... _duplicate_owners.yaml includes information on deduplicated entries for contributors where the script was able to detect more than one Gerrit account for the same individual, while the _invites.csv file isn't really election-related at all and is what the OSF normally feeds into the automation which sends event discounts to contributors. In case you're curious about the _invites.csv file, the first column is the OSF member ID (if known) or 0 (if no matching membership was found), the second column is the display name from Gerrit, the third column is the preferred E-mail address from Gerrit (this corresponds to the address used for the _electorate.txt file), and any subsequent columns are the extra non-preferred addresses configured in Gerrit for that account. Please don't hesitate to follow up with any additional questions you might have! [0] https://opendev.org/openstack/governance/src/branch/master/reference/projects.yaml [1] https://opendev.org/starlingx/governance/src/branch/master/reference/tsc/projects.yaml [2] https://review.opendev.org/661647 [3] https://review.opendev.org/661648 -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From David.Paterson at dell.com Thu May 30 21:50:47 2019 From: David.Paterson at dell.com (David.Paterson at dell.com) Date: Thu, 30 May 2019 21:50:47 +0000 Subject: [ironic][edge]: Recap of PTG discussions In-Reply-To: References: Message-ID: <9734b5a6cf23459890adae6ac4bab7c7@AUSX13MPC106.AMER.DELL.COM> There is an ironic redfish driver but it's still a WIP. Right now, I believe you can control power and manage boot mode (pxe, media...) with Redfish driver but implementing BIOS and RAID support is still ongoing. 
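
For anyone wanting to try the pieces that do work today, enrolling a node with the redfish hardware type and exercising power/boot management looks roughly like this (the BMC address, credentials and system ID are placeholders, not taken from any real deployment):

    openstack baremetal node create --name edge-node-0 --driver redfish \
        --driver-info redfish_address=https://10.0.0.50 \
        --driver-info redfish_system_id=/redfish/v1/Systems/1 \
        --driver-info redfish_username=admin \
        --driver-info redfish_password=secret
    openstack baremetal node power on edge-node-0
    openstack baremetal node boot device set edge-node-0 pxe

The BIOS/RAID and firmware pieces are the parts still in flight, as noted above.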
Re: firmware, there is an ironic spec for Dell EMC hardware here: https://github.com/openstack/ironic-specs/blob/master/specs/approved/drac-firmware-update-spec.rst Thanks, dp -----Original Message----- From: Csatari, Gergely (Nokia - HU/Budapest) Sent: Tuesday, May 28, 2019 11:31 AM To: edge-computing at lists.openstack.org; openstack-discuss at lists.openstack.org Subject: [Edge-computing] [ironic][edge]: Recap of PTG discussions [EXTERNAL EMAIL] Hi, There was a one hour discussion with Julia from Ironic with the Edge Computing Group [1]. In this mail I try to conclude what was discussed and ask some clarification questions. Current Ironic uses DHCP for hardware provisioning, therefore it requires DHCP relay enabled on the whole path to the edge cloud instances. There are two alternatives to solve this: 1) Virtual media support [2] where the ip configuration is embedded into a virtual image what is booted via the board management interface 2) Redfish support, however the state and support of redfish for host management is not clear. Is there already a specification has been added for redfish support? Upgrade of edge cloud infrastructures: - Firmware upgrade should be supported by Ironic. Is this something on its way or is this a new need? - Operating system and infra update can be solved using Fenix [3], however handling several edge cloud instances from a central location needs new features. Handling of failed servers: - A monitoring system or the operator should provide the input to mark a server as failed - Ironic can power down the failed servers and have the definition of a maintenance state - Discussed in [4] Additional ideas what we half discussed: - Running Ironic containers in a switch with the images hosted by Swift somewhere else. Are there any concerns about this idea? Any missing features from somewhere? [1]: https://etherpad.openstack.org/p/edge-wg-ptg-preparation-denver-2019 [2]: https://specs.openstack.org/openstack/ironic-specs/specs/not-implemented/L3-based-deployment.html [3]: https://wiki.openstack.org/wiki/Fenix [4]: http://lists.openstack.org/pipermail/edge-computing/2019-May/000582.html Br, Gerg0 _______________________________________________ Edge-computing mailing list Edge-computing at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing From katonalala at gmail.com Fri May 31 05:59:09 2019 From: katonalala at gmail.com (Lajos Katona) Date: Fri, 31 May 2019 07:59:09 +0200 Subject: [keystone][placement][neutron][api-sig] http404 to NotFound, or how should a http json error body look like? In-Reply-To: References: <9ae10062-a9c8-3e76-15a1-da0745361c57@ericsson.com> <93c95d69-c87a-4d4d-bf10-3b6b293b8a6a@www.fastmail.com> Message-ID: Hi Michael, Thanks. That would be the best. I checked another way to do the error processing on the client side like nova does (see the wip patches: https://review.opendev.org/662204 & https://review.opendev.org/662205) but that would mean everybody has to do the same who start using an API which was changed to list of errors. Regards Lajos Michael McCune ezt írta (időpont: 2019. máj. 30., Cs, 19:15): > hi, just wanted to post a followup to this discussion. > > during our office hours today we discussed this conversation again, i > think the ultimate output is that i will take a look at adding a patch > for keystoneauth to be more tolerant of the error payload variances. > > thanks again for the discussions =) > > peace o/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aj at suse.com Fri May 31 06:40:24 2019 From: aj at suse.com (Andreas Jaeger) Date: Fri, 31 May 2019 08:40:24 +0200 Subject: Retiring TripleO-UI - no longer supported In-Reply-To: References: <3924F5DE-314C-4D41-8CEA-DCF7A2A2CDEA@redhat.com> Message-ID: <2f23d85a-a917-23ab-0317-71e1a58e5dbe@suse.com> On 24/05/2019 17.57, Jason Rist wrote: > No, there might be stable branch work, but going forward no additional > features or work will be done against master, and I will additionally be > retiring associated projects such as openstack/ansible-role-tripleo-ui > and code relating to tripleo-ui in puppet-triple and tripleoclient.  I > will follow-up on this thread with additional links. Won't you need ansible-role-tripleo-ui for stable branches? puppet-tripleo and pyyton-tripleoclient have branches, so you can remove from master. But since you want to maintain stable branches of tripleo-ui longer, I guess you need support for it in repos that are not branched. Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From Tim.Bell at cern.ch Fri May 31 07:51:03 2019 From: Tim.Bell at cern.ch (Tim Bell) Date: Fri, 31 May 2019 07:51:03 +0000 Subject: [ops][nova][placement] NUMA topology vs non-NUMA workloads In-Reply-To: References: Message-ID: <62C35C83-F933-4D8F-967C-0EBD9FC2C842@cern.ch> Chris, From the CERN set up, I think there are dedicated cells for NUMA optimised configurations (but maybe one of the engineers on the team could confirm to be sure) Q: How important, in your cloud, is it to co-locate guests needing a NUMA topology with guests that do not? A review of documentation (upstream and vendor) shows differing levels of recommendation on this, but in many cases the recommendation is to not do it. A: no co-location currently Q: If your answer to the above is "we must be able to do that": How important is it that your cloud be able to pack workloads as tight as possible? That is: If there are two NUMA nodes and each has 2 VCPU free, should a 4 VCPU demanding non-NUMA workload be able to land there? Or would you prefer that not happen? A: not applicable Q: If the answer to the first question is "we can get by without that" is it satisfactory to be able to configure some hosts as NUMA aware and others as not, as described in the "NUMA topology with RPs" spec [1]? In this set up some non-NUMA workloads could end up on a NUMA host (unless otherwise excluded by traits or aggregates), but only when there was contiguous resource available. A: I think this would be OK Tim -----Original Message----- From: Chris Dent Reply-To: "openstack-discuss at lists.openstack.org" Date: Thursday, 30 May 2019 at 14:57 To: "OpenStack-discuss at lists.openstack.org" Subject: [ops][nova][placement] NUMA topology vs non-NUMA workloads This message is primarily addressed at operators, and of those, operators who are interested in effectively managing and mixing workloads that care about NUMA with workloads that do not. There are some questions within, after some background to explain the issue. At the PTG, Nova and Placement developers made a commitment to more effectively manage NUMA topologies within Nova and Placement. 
On the placement side this resulted in a spec which proposed several features that would enable more expressive queries when requesting allocation candidates (places for workloads to go), resulting in fewer late scheduling failures. At first there was one spec that discussed all the features. This morning it was split in two because one of the features is proving hard to resolve. Those two specs can be found at: * https://review.opendev.org/658510 (has all the original discussion) * https://review.opendev.org/662191 (the less contentious features split out) After much discussion, we would prefer to not do the feature discussed in 658510. Called 'can_split', it would allow specified classes of resource (notably VCPU and memory) to be split across multiple numa nodes when each node can only contribute a portion of the required resources and where those resources are modelled as inventory on the NUMA nodes, not the host at large. While this is a good idea in principle it turns out (see the spec) to cause many issues that require changes throughout the ecosystem, for example enforcing pinned cpus for workloads that would normally float. It's possible to make the changes, but it would require additional contributors to join the effort, both in terms of writing the code and understanding the many issues. So the questions: * How important, in your cloud, is it to co-locate guests needing a NUMA topology with guests that do not? A review of documentation (upstream and vendor) shows differing levels of recommendation on this, but in many cases the recommendation is to not do it. * If your answer to the above is "we must be able to do that": How important is it that your cloud be able to pack workloads as tight as possible? That is: If there are two NUMA nodes and each has 2 VCPU free, should a 4 VCPU demanding non-NUMA workload be able to land there? Or would you prefer that not happen? * If the answer to the first question is "we can get by without that" is it satisfactory to be able to configure some hosts as NUMA aware and others as not, as described in the "NUMA topology with RPs" spec [1]? In this set up some non-NUMA workloads could end up on a NUMA host (unless otherwise excluded by traits or aggregates), but only when there was contiguous resource available. This latter question articulates the current plan unless responses to this message indicate it simply can't work or legions of assistance shows up. Note that even if we don't do can_split, we'll still be enabling significant progress with the other features described in the second spec [2]. Thanks for your help in moving us in the right direction. [1] https://review.opendev.org/552924 [2] https://review.opendev.org/662191 -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From arne.wiebalck at cern.ch Fri May 31 08:20:47 2019 From: arne.wiebalck at cern.ch (Arne Wiebalck) Date: Fri, 31 May 2019 10:20:47 +0200 Subject: [ops][nova][placement] NUMA topology vs non-NUMA workloads In-Reply-To: <62C35C83-F933-4D8F-967C-0EBD9FC2C842@cern.ch> References: <62C35C83-F933-4D8F-967C-0EBD9FC2C842@cern.ch> Message-ID: <039da9d7-43b1-0f4b-0551-693a86f9cb5e@cern.ch> On 31.05.19 09:51, Tim Bell wrote: > Chris, > > From the CERN set up, I think there are dedicated cells for NUMA optimised configurations (but maybe one of the engineers on the team could confirm to be sure) This is correct: we have dedicated cells for NUMA aware guests (and hence do not mix NUMA aware and NUMA unaware guests on the same set of hosts). 
Arne > > Q: How important, in your cloud, is it to co-locate guests needing a > NUMA topology with guests that do not? A review of documentation > (upstream and vendor) shows differing levels of recommendation on > this, but in many cases the recommendation is to not do it. > > A: no co-location currently > > Q: If your answer to the above is "we must be able to do that": How > important is it that your cloud be able to pack workloads as tight > as possible? That is: If there are two NUMA nodes and each has 2 > VCPU free, should a 4 VCPU demanding non-NUMA workload be able to > land there? Or would you prefer that not happen? > > A: not applicable > > Q: If the answer to the first question is "we can get by without > that" is it satisfactory to be able to configure some hosts as NUMA > aware and others as not, as described in the "NUMA topology with > RPs" spec [1]? In this set up some non-NUMA workloads could end up > on a NUMA host (unless otherwise excluded by traits or aggregates), > but only when there was contiguous resource available. > > A: I think this would be OK > > Tim > -----Original Message----- > From: Chris Dent > Reply-To: "openstack-discuss at lists.openstack.org" > Date: Thursday, 30 May 2019 at 14:57 > To: "OpenStack-discuss at lists.openstack.org" > Subject: [ops][nova][placement] NUMA topology vs non-NUMA workloads > > > This message is primarily addressed at operators, and of those, > operators who are interested in effectively managing and mixing > workloads that care about NUMA with workloads that do not. There are > some questions within, after some background to explain the issue. > > At the PTG, Nova and Placement developers made a commitment to more > effectively manage NUMA topologies within Nova and Placement. On the > placement side this resulted in a spec which proposed several > features that would enable more expressive queries when requesting > allocation candidates (places for workloads to go), resulting in > fewer late scheduling failures. > > At first there was one spec that discussed all the features. This > morning it was split in two because one of the features is proving > hard to resolve. Those two specs can be found at: > > * https://review.opendev.org/658510 (has all the original discussion) > * https://review.opendev.org/662191 (the less contentious features split out) > > After much discussion, we would prefer to not do the feature > discussed in 658510. Called 'can_split', it would allow specified > classes of resource (notably VCPU and memory) to be split across > multiple numa nodes when each node can only contribute a portion of > the required resources and where those resources are modelled as > inventory on the NUMA nodes, not the host at large. > > While this is a good idea in principle it turns out (see the spec) > to cause many issues that require changes throughout the ecosystem, > for example enforcing pinned cpus for workloads that would normally > float. It's possible to make the changes, but it would require > additional contributors to join the effort, both in terms of writing > the code and understanding the many issues. > > So the questions: > > * How important, in your cloud, is it to co-locate guests needing a > NUMA topology with guests that do not? A review of documentation > (upstream and vendor) shows differing levels of recommendation on > this, but in many cases the recommendation is to not do it. 
> > * If your answer to the above is "we must be able to do that": How > important is it that your cloud be able to pack workloads as tight > as possible? That is: If there are two NUMA nodes and each has 2 > VCPU free, should a 4 VCPU demanding non-NUMA workload be able to > land there? Or would you prefer that not happen? > > * If the answer to the first question is "we can get by without > that" is it satisfactory to be able to configure some hosts as NUMA > aware and others as not, as described in the "NUMA topology with > RPs" spec [1]? In this set up some non-NUMA workloads could end up > on a NUMA host (unless otherwise excluded by traits or aggregates), > but only when there was contiguous resource available. > > This latter question articulates the current plan unless responses > to this message indicate it simply can't work or legions of > assistance shows up. Note that even if we don't do can_split, we'll > still be enabling significant progress with the other features > described in the second spec [2]. > > Thanks for your help in moving us in the right direction. > > [1] https://review.opendev.org/552924 > [2] https://review.opendev.org/662191 > -- > Chris Dent ٩◔̯◔۶ https://anticdent.org/ > freenode: cdent > From lucasagomes at gmail.com Fri May 31 08:38:09 2019 From: lucasagomes at gmail.com (Lucas Alvares Gomes) Date: Fri, 31 May 2019 09:38:09 +0100 Subject: [neutron][networking-ovn] Core team updates Message-ID: Hi all, I'd like to welcome Jakub Libosvar to the networking-ovn core team. The team was in need for more reviewers with +2/+A power and Jakub's reviews have been super high quality [0][1]. He's also helping the project out in many other different efforts such as bringing in the full stack test suit and bug fixes. Also, Miguel Ajo has changed focus from OVN/networking-ovn and is been dropped from the core team. Of course, we will welcome him back when his activity picks back up again. Thank you Jakub and Miguel! [0] https://www.stackalytics.com/report/contribution/networking-ovn/30 [1] https://www.stackalytics.com/report/contribution/networking-ovn/90 Cheers, Lucas From jean-philippe at evrard.me Fri May 31 09:50:35 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Fri, 31 May 2019 11:50:35 +0200 Subject: [openstack-helm] external ceph In-Reply-To: References: Message-ID: <804442a9-0a7e-4f02-9753-d5c3483b95de@www.fastmail.com> I think there is a documentation that was started by Jay [1] This should get you started. Don't hesiate to ask questions on the channel :) [1]: https://review.opendev.org/#/c/586992/ From skaplons at redhat.com Fri May 31 10:35:08 2019 From: skaplons at redhat.com (Slawomir Kaplonski) Date: Fri, 31 May 2019 12:35:08 +0200 Subject: Integration of osprofiler in rally results page Message-ID: <1F182B59-AB12-4C33-8F28-F9F0BF057030@redhat.com> Hi, Recently thx to Andrey Kurilin and Ilya Shakhat we integrated in neutron-rally-task job results from osprofiler that profiler results are available in “Scenario data” tab for every iteration of every scenario. Example of how it looks is available at [1]. Patch with changes required to enable it in rally job is in [2]. 
It doesn’t require many changes and job isn’t run much longer so maybe it will be useful for other projects also :) [1] http://logs.openstack.org/50/615350/44/check/neutron-rally-task/e453164/results/report.html.gz#/NeutronNetworks.create_and_list_routers/output [2] https://review.opendev.org/#/c/615350/ — Slawek Kaplonski Senior software engineer Red Hat From doug at doughellmann.com Fri May 31 11:50:45 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 31 May 2019 07:50:45 -0400 Subject: [tc][ops] reviving osops- repos In-Reply-To: References: <20190530205552.falsvxcegehtyuge@yuggoth.org> Message-ID: Mohammed Naser writes: > On Thu, May 30, 2019 at 4:59 PM Jeremy Stanley wrote: >> >> On 2019-05-30 15:23:43 -0400 (-0400), Mohammed Naser wrote: >> [...] >> > I've requested to move x/osops-* to openstack-operators/* in >> > GitHub so that I can setup the appropriate mirroring in post >> > pipeline (and then propose a patch to rename things inside Gerrit >> > as well). >> [...] >> >> My only real concern (voiced already with perhaps far less brevity >> in #openstack-tc) is that we should avoid leaving the impression >> that software written by "operators" isn't good enough for the >> openstack/ Git namespace. I want to be sure we remain clear that >> OpenStack was, and still is, written by its operators/users, not by >> some separate and nebulous cloud of "developers." > > I would love to personally make it live under the openstack/ namespace > however I do feel like that does make it 'pretty darn official' and the quality > of the tools there isn't something we probably want to put our name under > yet. A lot of it is really old (2+ years old) and it probably needs a > bit of time > to get into a state where it's something that we can make official. Oh, my. No. Just, no. All software starts out as crap, and most of the stuff running in production all around the world is still crap to some degree or another (my own code very much not excluded). The parts that are less crap today got that way through the efforts of people collaborating to improve them from the beginning (thank you to everyone who has reviewed my code), not by waiting until they were "good enough" to share. The tools in the osops repo are not special in this regard, and we shouldn't treat them as though they are. > Personally, I think that repo should be nothing but a 'buffer' between project > features and tools needed by deployers, a lot of the things there seem to be > there because of bugs (i.e. orphaned resource cleanup -- which should ideally > be cleaned up by the service itself or warned there), clean-up disks for deleted > VMs that were not removed, etc. That's a great vision. I love the idea of a "sandbox" (or several) for exploring those sorts of improvements. >> We already have plenty of repositories under openstack/ which are >> maintained by SIGs and UC WGs/teams, so not everything there needs >> to be a deliverable governed by the OpenStack TC anyway. If this >> really is a collection of software written and maintained by the >> operators of OpenStack then it should be fine alongside the rest of >> the official OpenStack software. If it's not, then perhaps calling >> it the "openstack-operators" organization is... misleading? My vote is a hard "no" on an openstack-operators/ namespace because I'm tired of perpetuating the idea that "operators" cannot (or should not) be "contributors" to openstack/. They *must* be contributors, because that's the only way open source works over the log term. 
-- Doug From emilien at redhat.com Fri May 31 12:10:48 2019 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 31 May 2019 08:10:48 -0400 Subject: Retiring TripleO-UI - no longer supported In-Reply-To: <2f23d85a-a917-23ab-0317-71e1a58e5dbe@suse.com> References: <3924F5DE-314C-4D41-8CEA-DCF7A2A2CDEA@redhat.com> <2f23d85a-a917-23ab-0317-71e1a58e5dbe@suse.com> Message-ID: On Fri, May 31, 2019 at 2:49 AM Andreas Jaeger wrote: > On 24/05/2019 17.57, Jason Rist wrote: > > No, there might be stable branch work, but going forward no additional > > features or work will be done against master, and I will additionally be > > retiring associated projects such as openstack/ansible-role-tripleo-ui > > and code relating to tripleo-ui in puppet-triple and tripleoclient. I > > will follow-up on this thread with additional links. > > Won't you need ansible-role-tripleo-ui for stable branches? > no, this repo was never used anywhere. -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon at csail.mit.edu Fri May 31 12:35:01 2019 From: jon at csail.mit.edu (Jonathan D. Proulx) Date: Fri, 31 May 2019 08:35:01 -0400 Subject: [tc][ops] reviving osops- repos In-Reply-To: References: <20190530205552.falsvxcegehtyuge@yuggoth.org> Message-ID: <20190531123501.tawgvqgsw6yle2nu@csail.mit.edu> On Fri, May 31, 2019 at 07:50:45AM -0400, Doug Hellmann wrote: :My vote is a hard "no" on an openstack-operators/ namespace because I'm :tired of perpetuating the idea that "operators" cannot (or should not) :be "contributors" to openstack/. They *must* be contributors, because :that's the only way open source works over the log term. what Doug said! The most important and perhaps most difficult thing in OpenSource is the bravery to be bad in public. This is why the worst thing in OpenSource is shaming people for being bad in public. The internet has no end of my "dumb" questions and "stupid" statements and virtually everything I know about software and operations is because of them. So let's dare to be bad :) -Jon From mnaser at vexxhost.com Fri May 31 12:51:18 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 31 May 2019 08:51:18 -0400 Subject: [tc][ops] reviving osops- repos In-Reply-To: <20190531123501.tawgvqgsw6yle2nu@csail.mit.edu> References: <20190530205552.falsvxcegehtyuge@yuggoth.org> <20190531123501.tawgvqgsw6yle2nu@csail.mit.edu> Message-ID: On Fri, May 31, 2019 at 8:35 AM Jonathan D. Proulx wrote: > > On Fri, May 31, 2019 at 07:50:45AM -0400, Doug Hellmann wrote: > > :My vote is a hard "no" on an openstack-operators/ namespace because I'm > :tired of perpetuating the idea that "operators" cannot (or should not) > :be "contributors" to openstack/. They *must* be contributors, because > :that's the only way open source works over the log term. > > what Doug said! > > The most important and perhaps most difficult thing in OpenSource is > the bravery to be bad in public. This is why the worst thing in > OpenSource is shaming people for being bad in public. That sounds fair. It seems there is significant thoughts from within the community that this should continue to live under the openstack/ namespace. How do we make this happen then? I'm struggling to find the correct platform to put this under in order to make this live under openstack/ > The internet has no end of my "dumb" questions and "stupid" statements > and virtually everything I know about software and operations is > because of them. 
> > So let's dare to be bad :) > > -Jon -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From mriedemos at gmail.com Fri May 31 13:39:22 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 31 May 2019 08:39:22 -0500 Subject: Integration of osprofiler in rally results page In-Reply-To: <1F182B59-AB12-4C33-8F28-F9F0BF057030@redhat.com> References: <1F182B59-AB12-4C33-8F28-F9F0BF057030@redhat.com> Message-ID: <7642c6ad-57f5-2142-aa4a-be6fa78df2dd@gmail.com> On 5/31/2019 5:35 AM, Slawomir Kaplonski wrote: > Recently thx to Andrey Kurilin and Ilya Shakhat we integrated in neutron-rally-task job results from osprofiler that profiler results are available in “Scenario data” tab for every iteration of every scenario. Example of how it looks is available at [1]. Patch with changes required to enable it in rally job is in [2]. It doesn’t require many changes and job isn’t run much longer so maybe it will be useful for other projects also:) > > [1]http://logs.openstack.org/50/615350/44/check/neutron-rally-task/e453164/results/report.html.gz#/NeutronNetworks.create_and_list_routers/output > [2]https://review.opendev.org/#/c/615350/ Thanks for sharing this, it's pretty cool. Nova doesn't run any rally jobs but I'd be interested in adding a rally job to nova's experimental queue if we had something like this because there are times when working on a change which might have performance impacts (either positive or negative) and it would be nice to be able to see this kind of data. -- Thanks, Matt From doug at doughellmann.com Fri May 31 13:49:24 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 31 May 2019 09:49:24 -0400 Subject: [tc][ops] reviving osops- repos In-Reply-To: References: <20190530205552.falsvxcegehtyuge@yuggoth.org> <20190531123501.tawgvqgsw6yle2nu@csail.mit.edu> Message-ID: Mohammed Naser writes: > On Fri, May 31, 2019 at 8:35 AM Jonathan D. Proulx wrote: >> >> On Fri, May 31, 2019 at 07:50:45AM -0400, Doug Hellmann wrote: >> >> :My vote is a hard "no" on an openstack-operators/ namespace because I'm >> :tired of perpetuating the idea that "operators" cannot (or should not) >> :be "contributors" to openstack/. They *must* be contributors, because >> :that's the only way open source works over the log term. >> >> what Doug said! >> >> The most important and perhaps most difficult thing in OpenSource is >> the bravery to be bad in public. This is why the worst thing in >> OpenSource is shaming people for being bad in public. > > That sounds fair. It seems there is significant thoughts from within > the community > that this should continue to live under the openstack/ namespace. > > How do we make this happen then? I'm struggling to find the correct platform to > put this under in order to make this live under openstack/ What part is hard, defining an owner? -- Doug From cdent+os at anticdent.org Fri May 31 13:51:51 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 31 May 2019 14:51:51 +0100 (BST) Subject: [placement] update 19-21 Message-ID: HTML: https://anticdent.org/placement-update-19-21.html And here we have placement update 19-21. # Most Important The [spec for nested magic](https://review.opendev.org/658510) has been split. The [second half](https://review.opendev.org/662191) contains the parts which should be relatively straightforward. The original retains functionality that may be too complex. 
An email has been [addressed to operators](http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006726.html) asking for feedback. Those two specs represent a significant portion of the work planned this cycle. Getting them reviewed and merged is a good thing to do. # What's Changed * A few small refactorings plus the removal of [null provider protections](https://review.opendev.org/657716) has improved performance when retrieving 10,000 allocation candidates significantly: from around 36 seconds to 6 seconds. * We've chosen to switch to office hours. Ed has started an [email thread](http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006728.html) to determine when they should be. * Tetsuro's changes to add a RequestGroupSearchContext have merged. These simplify state management throughout the processing of individual request groups, and help avoid redundant queries. * Most of the code for [counting (nova) quota usage from placement](https://review.opendev.org/#/q/topic:bp/count-quota-usage-from-placement) has merged. * [Microversion 1.33](https://docs.openstack.org/placement/latest/placement-api-microversion-history.html#support-string-request-group-suffixes) of the placement API has merged. This allows more expressive suffixes on granular request groups (e.g., `resources_COMPUTE` in addition to `resources1`). # Specs/Features * Support Consumer Types. This is very close with a few details to work out on what we're willing and able to query on. It's two weeks later and still only has reviews from me. * Spec for can_split part of nested magic. We're unlikely to do this at this point, unless response to the [email asking for input](http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006726.html) is large and insisting. * Spec for nested magic 1. The easier parts of nested magic: same_subtree, resource request groups, verbose suffixes (already merged as 1.33). * Resource provider - request group mapping in allocation candidate. There's a [WIP](https://review.opendev.org/#/c/662245/) that looks promising, but we ought to make sure the spec is presenting the desired outcome. These and other features being considered can be found on the [feature worklist](https://storyboard.openstack.org/#!/worklist/594). Some non-placement specs are listed in the Other section below. Note that nova will be having a spec-review-sprint this coming Tuesday. if you're doing that, spending a bit of time on the placement specs would be great too. # Stories/Bugs (Numbers in () are the change since the last pupdate.) There are 19 (-1) stories in [the placement group](https://storyboard.openstack.org/#!/project_group/placement). 0 (0) are [untagged](https://storyboard.openstack.org/#!/worklist/580). 2 (0) are [bugs](https://storyboard.openstack.org/#!/worklist/574). 4 (-1) are [cleanups](https://storyboard.openstack.org/#!/worklist/575). 11 (0) are [rfes](https://storyboard.openstack.org/#!/worklist/594). 2 (0) are [docs](https://storyboard.openstack.org/#!/worklist/637). If you're interested in helping out with placement, those stories are good places to look. * Placement related nova [bugs not yet in progress](https://goo.gl/TgiPXb) on launchpad: 16 (0). * Placement related nova [in progress bugs](https://goo.gl/vzGGDQ) on launchpad: 7 (0). # osc-placement osc-placement is currently behind by 12 microversions. No change since the last report. 
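(As a concrete aside on what newer microversions actually buy: the string
request group suffixes that landed in 1.33, mentioned above, let a granular
allocation candidates request read like the rough sketch below. This is only
an illustrative snippet using plain `requests`; the endpoint URL and token are
made-up placeholders, and the resource/trait choices are arbitrary.)

```python
# Illustrative only: the endpoint and token below are placeholders, not real.
import requests

PLACEMENT = 'http://placement.example.com/placement'  # assumed endpoint
TOKEN = 'an-admin-token'                               # assumed token

resp = requests.get(
    PLACEMENT + '/allocation_candidates',
    headers={
        'X-Auth-Token': TOKEN,
        'OpenStack-API-Version': 'placement 1.33',
        'Accept': 'application/json',
    },
    params={
        # 1.33 allows string suffixes like _COMPUTE and _NET in addition
        # to the older numeric ones (resources1, resources2, ...).
        'resources_COMPUTE': 'VCPU:2,MEMORY_MB:2048',
        'required_COMPUTE': 'HW_CPU_X86_AVX2',
        'resources_NET': 'NET_BW_EGR_KILOBIT_PER_SEC:1000',
        # group_policy is required whenever more than one suffixed
        # group is present in the request.
        'group_policy': 'none',
    },
)
resp.raise_for_status()
print(len(resp.json()['allocation_requests']), 'candidates')
```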
_Note_: Based on conversations that we see on reviews, explicitly trying to chase microversions when patching the plugin may not be aligned with the point of OSC. We're trying to make a humane interface to getting stuff done with placement. Different microversions allow different takes on stuff. It's the stuff that matters, not the microversion. Pending Changes: * Add 'resource provider inventory update' command (that helps with aggregate allocation ratios). * Add support for 1.22 microversion. (So, for example, what matters here is that support for forbidden traits is being added. That it is in microversion 1.22 ought to be incidental (to the user of the client).) * Provide a useful message in the case of 500-error # Main Themes ## Nested Magic The overview of the features encapsulated by the term "nested magic" are in a [story](https://storyboard.openstack.org/#!/story/2005575). There is some in progress code, some of it WIPs to expose issues: * WIP: Allow RequestGroups without resources * Add NUMANetworkFixture for gabbits * Gabbi test cases for can_split * WIP: Implement allocation candidate mappings ## Consumer Types Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. A [spec](https://review.opendev.org/654799) has started. There are some questions about request and response details that need to be resolved, but the overall concept is sound. ## Cleanup We continue to do cleanup work to lay in reasonable foundations for the nested work above. As a nice bonus, we keep eking out additional performance gains too. * Optionally run a wsgi profiler when asked. This has proven useful to find issues. The change has been unwipped and augumented with some documentation. * Add olso.middleware.cors to conf generator * Modernize CORS config and setup. Ed Leafe has also been doing some intriguing work on using graph databases with placement. It's not yet clear if or how it could be integrated with mainline placement, but there are likely many things to be learned from the experiment. # Other Placement Miscellaneous changes can be found in [the usual place](https://review.opendev.org/#/q/project:openstack/placement+status:open). There are several [os-traits changes](https://review.opendev.org/#/q/project:openstack/os-traits+status:open) being discussed. # Other Service Users New discoveries are added to the end. Merged stuff is removed. As announced last week, anything that has had no activity in 4 weeks has been removed (many have been removed). * Nova: spec: support virtual persistent memory * Nova: Pre-filter hosts based on multiattach volume support * Nova: Add flavor to requested_resources in RequestSpec * Blazar: Retry on inventory update conflict * Nova: count quota usage from placement * Nova: nova-manage: heal port allocations * Tempest: Add QoS policies and minimum bandwidth rule client * nova-spec: Allow compute nodes to use DISK_GB from shared storage RP * Cyborg: Placement report * Nova: Spec to pre-filter disabled computes with placement * rpm-packaging: placement service * Delete resource providers for all nodes when deleting compute service * nova fix for: Drop source node allocations if finish_resize fails * nova: WIP: Hey let's support routed networks y'all! * Fixes to keystoneauth to deal with the style of error responses that placement uses. 
* starlingx: Add placement chart patch to openstack-helm * helm: WIP: add placement chart * starlingx: Add stx-placement docker image directives files * kolla-ansible: Add a explanatory note for "placement_api_port" * neutron-spec: L3 agent capacity and scheduling # End That's a lot of reviewing. Please help out where you can. Your reward will be brief but sincere moments of joy. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From mnaser at vexxhost.com Fri May 31 13:52:30 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 31 May 2019 09:52:30 -0400 Subject: [tc][ops] reviving osops- repos In-Reply-To: References: <20190530205552.falsvxcegehtyuge@yuggoth.org> <20190531123501.tawgvqgsw6yle2nu@csail.mit.edu> Message-ID: On Fri., May 31, 2019, 9:49 a.m. Doug Hellmann, wrote: > Mohammed Naser writes: > > > On Fri, May 31, 2019 at 8:35 AM Jonathan D. Proulx > wrote: > >> > >> On Fri, May 31, 2019 at 07:50:45AM -0400, Doug Hellmann wrote: > >> > >> :My vote is a hard "no" on an openstack-operators/ namespace because I'm > >> :tired of perpetuating the idea that "operators" cannot (or should not) > >> :be "contributors" to openstack/. They *must* be contributors, because > >> :that's the only way open source works over the log term. > >> > >> what Doug said! > >> > >> The most important and perhaps most difficult thing in OpenSource is > >> the bravery to be bad in public. This is why the worst thing in > >> OpenSource is shaming people for being bad in public. > > > > That sounds fair. It seems there is significant thoughts from within > > the community > > that this should continue to live under the openstack/ namespace. > > > > How do we make this happen then? I'm struggling to find the correct > platform to > > put this under in order to make this live under openstack/ > > What part is hard, defining an owner? > Yes. A SIG? A WG? I'm not sure where to put it under to make it official under OpenStack namespace. I'm happy to hear suggestions -- > Doug > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Fri May 31 14:01:27 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 31 May 2019 10:01:27 -0400 Subject: [tc][ops] reviving osops- repos In-Reply-To: References: <20190530205552.falsvxcegehtyuge@yuggoth.org> <20190531123501.tawgvqgsw6yle2nu@csail.mit.edu> Message-ID: <755C37D0-89A7-4335-BCE1-388A5598F7BA@doughellmann.com> > On May 31, 2019, at 9:52 AM, Mohammed Naser wrote: > > > > On Fri., May 31, 2019, 9:49 a.m. Doug Hellmann, > wrote: > Mohammed Naser > writes: > > > On Fri, May 31, 2019 at 8:35 AM Jonathan D. Proulx > wrote: > >> > >> On Fri, May 31, 2019 at 07:50:45AM -0400, Doug Hellmann wrote: > >> > >> :My vote is a hard "no" on an openstack-operators/ namespace because I'm > >> :tired of perpetuating the idea that "operators" cannot (or should not) > >> :be "contributors" to openstack/. They *must* be contributors, because > >> :that's the only way open source works over the log term. > >> > >> what Doug said! > >> > >> The most important and perhaps most difficult thing in OpenSource is > >> the bravery to be bad in public. This is why the worst thing in > >> OpenSource is shaming people for being bad in public. > > > > That sounds fair. It seems there is significant thoughts from within > > the community > > that this should continue to live under the openstack/ namespace. > > > > How do we make this happen then? 
I'm struggling to find the correct platform to > > put this under in order to make this live under openstack/ > > What part is hard, defining an owner? > > Yes. A SIG? A WG? I thought we had converted everything that wasn’t a Board working group to a SIG. A SIG seems appropriate for this, in any case. Doug > > I'm not sure where to put it under to make it official under OpenStack namespace. > > I'm happy to hear suggestions > > -- > Doug -------------- next part -------------- An HTML attachment was scrubbed... URL: From emccormick at cirrusseven.com Fri May 31 14:24:36 2019 From: emccormick at cirrusseven.com (Erik McCormick) Date: Fri, 31 May 2019 10:24:36 -0400 Subject: [tc][ops] reviving osops- repos In-Reply-To: References: <20190530205552.falsvxcegehtyuge@yuggoth.org> <20190531123501.tawgvqgsw6yle2nu@csail.mit.edu> Message-ID: On Fri, May 31, 2019 at 9:54 AM Mohammed Naser wrote: > > > > On Fri., May 31, 2019, 9:49 a.m. Doug Hellmann, wrote: >> >> Mohammed Naser writes: >> >> > On Fri, May 31, 2019 at 8:35 AM Jonathan D. Proulx wrote: >> >> >> >> On Fri, May 31, 2019 at 07:50:45AM -0400, Doug Hellmann wrote: >> >> >> >> :My vote is a hard "no" on an openstack-operators/ namespace because I'm >> >> :tired of perpetuating the idea that "operators" cannot (or should not) >> >> :be "contributors" to openstack/. They *must* be contributors, because >> >> :that's the only way open source works over the log term. >> >> >> >> what Doug said! >> >> >> >> The most important and perhaps most difficult thing in OpenSource is >> >> the bravery to be bad in public. This is why the worst thing in >> >> OpenSource is shaming people for being bad in public. >> > >> > That sounds fair. It seems there is significant thoughts from within >> > the community >> > that this should continue to live under the openstack/ namespace. >> > >> > How do we make this happen then? I'm struggling to find the correct platform to >> > put this under in order to make this live under openstack/ >> >> What part is hard, defining an owner? > > > Yes. A SIG? A WG? > > I'm not sure where to put it under to make it official under OpenStack namespace. > > I'm happy to hear suggestions > Back in the old days (*shakes cane at you kids on his lawn*) when these repos were started, there really weren't WGs or SIGs. There were projects. So there's a project [1]. So either: A) Make a SIG out of that and assign the repos to the sig, or B) Maybe add it under / rename the Ops Docs SIG [2] as it might bring more eyes to both things which serve the same folks. I personally don't care about the location for the sake of feeling "included." A repo is a repo to me. However, from a marketing standpoint, I think having it in openstack/ would increase the visibility and garner both more contributors and users. We currently talk about Ops Docs things during the Ops Meetup team meeting on Tuesdays at 10am EDT (15:00 UTC), so if anyone would like to discuss this as an agenda item during that meeting I can add it to the agenda. Let me know if there's interest. 
[1] https://wiki.openstack.org/wiki/Osops [2] https://wiki.openstack.org/wiki/Operation_Docs_SIG -Erik >> -- >> Doug From mnaser at vexxhost.com Fri May 31 15:19:05 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 31 May 2019 11:19:05 -0400 Subject: [nova] dropping 2013.1 tag on pypi Message-ID: Hi everyone, I've recently had some folks reach out who were quite confused by the fact that searching for 'nova' on pypi shows a 2013 release, and then hitting the correct path: https://pypi.org/project/nova/19.0.0/ shows that there is a new version available. I know usually deleting releases is a bit of a 'don't do this' thing, but in this case, that's a 6 year old unsupported/never-to-be-used again release and I think to make life easier for consumers, we should maybe drop it. Thoughts? Regards, Mohammed -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From miguel at mlavalle.com Fri May 31 15:20:46 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Fri, 31 May 2019 10:20:46 -0500 Subject: [neutron][networking-ovn] Core team updates In-Reply-To: References: Message-ID: Great addition to the team core team. Congratulations! Also, sad to see that my "tocayo" Miguel has moved to other endeavors. He is great, though, and I know he will be successful in whatever he is doing Regards The other Miguel On Fri, May 31, 2019 at 3:42 AM Lucas Alvares Gomes wrote: > Hi all, > > I'd like to welcome Jakub Libosvar to the networking-ovn core team. > The team was in need for more reviewers with +2/+A power and Jakub's > reviews have been super high quality [0][1]. He's also helping the > project out in many other different efforts such as bringing in the > full stack test suit and bug fixes. > > Also, Miguel Ajo has changed focus from OVN/networking-ovn and is been > dropped from the core team. Of course, we will welcome him back when > his activity picks back up again. > > Thank you Jakub and Miguel! > > [0] https://www.stackalytics.com/report/contribution/networking-ovn/30 > [1] https://www.stackalytics.com/report/contribution/networking-ovn/90 > > Cheers, > Lucas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gsteinmuller at vexxhost.com Fri May 31 15:30:18 2019 From: gsteinmuller at vexxhost.com (=?UTF-8?Q?Guilherme_Steinm=C3=BCller?=) Date: Fri, 31 May 2019 12:30:18 -0300 Subject: [nova] dropping 2013.1 tag on pypi In-Reply-To: References: Message-ID: +1 On Fri, May 31, 2019, 12:22 Mohammed Naser wrote: > Hi everyone, > > I've recently had some folks reach out who were quite confused by the > fact that searching for 'nova' on pypi shows a 2013 release, and then > hitting the correct path: > > https://pypi.org/project/nova/19.0.0/ > > shows that there is a new version available. I know usually deleting > releases is a bit of a 'don't do this' thing, but in this case, that's > a 6 year old unsupported/never-to-be-used again release and I think to > make life easier for consumers, we should maybe drop it. > > Thoughts? > > Regards, > Mohammed > > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. http://vexxhost.com > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From colleen at gazlene.net Fri May 31 15:30:48 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 31 May 2019 08:30:48 -0700 Subject: [dev][keystone] M-1 check-in and retrospective meeting In-Reply-To: References: Message-ID: <627ae3a7-b998-4323-8981-2d1cd7bc3085@www.fastmail.com> On Sat, May 25, 2019, at 06:51, Colleen Murphy wrote: > Hi team, > > During the PTG, we agreed to have milestone-ly check-ins in order to > try to keep momentum going throughout the cycle. Milestone 1 is already > nearly upon us, so it's time to schedule this meeting. I'd like to > schedule a two-hour video call during which we'll conduct a brief > retrospective of the cycle so far, review our past action items, and > refine and reevaluate our plans for the rest of the cycle. > > I've created a doodle poll[1] to schedule the session for either the > week of M-1[2] or the following week. > > If you have questions, concerns, or thoughts about this meeting, let's > discuss it in this thread (or you can message me privately). > > Colleen > > [1] https://doodle.com/poll/hyibxqp9h8sgz56p > [2] https://releases.openstack.org/train/schedule.html > > Mark your calendars: the poll shows that the best day for this is June 11, 15:00-17:00 UTC (starting an hour before our regular meeting). Colleen From juliaashleykreger at gmail.com Fri May 31 16:09:53 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 31 May 2019 09:09:53 -0700 Subject: [ironic][edge]: Recap of PTG discussions In-Reply-To: References: Message-ID: On Tue, May 28, 2019 at 8:31 AM Csatari, Gergely (Nokia - HU/Budapest) wrote: > > Hi, > > There was a one hour discussion with Julia from Ironic with the Edge Computing Group [1]. In this mail I try to conclude what was discussed and ask some clarification questions. > > Current Ironic uses DHCP for hardware provisioning, therefore it requires DHCP relay enabled on the whole path to the edge cloud instances. There are two alternatives to solve this: > 1) Virtual media support [2] where the ip configuration is embedded into a virtual image what is booted via the board management interface > 2) Redfish support, however the state and support of redfish for host management is not clear. Is there already a specification has been added for redfish support? I don't quite remember discussing redfish in this regard. Is there something your expecting redfish to be able to provide in this regard? Redfish virtual media is something we're working on. > > Upgrade of edge cloud infrastructures: > - Firmware upgrade should be supported by Ironic. Is this something on its way or is this a new need? The ilo hardware type supports OOB firmware upgrades. iRMC has code up in review, and Dell has a posted specification. Work is going into the redfish libraries to help support this so we will likely see something for redfish at some point, but it may also be that because of vendor differences, that we may not be able to provide a generic surface through which to provide firmware update capabilities through the generic hardware type. > - Operating system and infra update can be solved using Fenix [3], however handling several edge cloud instances from a central location needs new features. 
> > Handling of failed servers: > - A monitoring system or the operator should provide the input to mark a server as failed > - Ironic can power down the failed servers and have the definition of a maintenance state > - Discussed in [4] > > Additional ideas what we half discussed: > - Running Ironic containers in a switch with the images hosted by Swift somewhere else. Are there any concerns about this idea? Any missing features from somewhere? > > [1]: https://etherpad.openstack.org/p/edge-wg-ptg-preparation-denver-2019 > [2]: https://specs.openstack.org/openstack/ironic-specs/specs/not-implemented/L3-based-deployment.html > [3]: https://wiki.openstack.org/wiki/Fenix > [4]: http://lists.openstack.org/pipermail/edge-computing/2019-May/000582.html > > Br, > Gerg0 > From juliaashleykreger at gmail.com Fri May 31 16:18:28 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 31 May 2019 09:18:28 -0700 Subject: [ironic][edge]: Recap of PTG discussions In-Reply-To: <9734b5a6cf23459890adae6ac4bab7c7@AUSX13MPC106.AMER.DELL.COM> References: <9734b5a6cf23459890adae6ac4bab7c7@AUSX13MPC106.AMER.DELL.COM> Message-ID: On Thu, May 30, 2019 at 3:37 PM wrote: > > There is an ironic redfish driver but it's still a WIP. Right now, I believe you can control power and manage boot mode (pxe, media...) with Redfish driver but implementing BIOS and RAID support is still ongoing. Power, boot mode, boot device (pxe, disk), inspection, and bios settings are present in Ironic today for the redfish hardware type. Sensor data collection, RAID, virtual media, and firmware management are hopefully things we evolve in the next cycle or two. > > Re: firmware, there is an ironic spec for Dell EMC hardware here: https://github.com/openstack/ironic-specs/blob/master/specs/approved/drac-firmware-update-spec.rst > > Thanks, > dp From fungi at yuggoth.org Fri May 31 16:18:49 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 31 May 2019 16:18:49 +0000 Subject: [tc][ops] reviving osops- repos In-Reply-To: References: <20190530205552.falsvxcegehtyuge@yuggoth.org> Message-ID: <20190531161848.h35lh7njqpn7atn3@yuggoth.org> On 2019-05-31 07:50:45 -0400 (-0400), Doug Hellmann wrote: [...] > I'm tired of perpetuating the idea that "operators" cannot (or > should not) be "contributors" to openstack/. They *must* be > contributors, because that's the only way open source works over > the log term. As I said in my previous post, OpenStack was built by its operators and continues to be maintained by them. That there are some in our midst who feel separate from and disenfranchised by the process through which OpenStack is developed is disappointing, but I don't believe they are representative of OpenStack operators as a whole. I would further argue that operators of OpenStack are and have always been its primary contributors. Following the true spirit of open source, operators got together and built automation they needed to make their jobs easier (or even possible). To me a key signal of this heritage was the early choice to standardize on Python, a runtime-interpreted scripting language enforcing readability, historically used by operators for automating systems administration tasks. This was almost certainly not a coincidence. Find a bug? You can read the source code right there on the server's filesystem, modify it in place, test that it's fixed without having to manually recompile anything, and then share the solution with the community. 
(Tweaking software in production is of course usually not the *best* way to fix bugs, but when you're in a tight spot because it's 3am and customers are blowing up the support line and your pager won't shut up, it's attractive and expedient.) We should celebrate this heritage, and always remind ourselves that OpenStack emerged out of a shared desire for operational automation. To assume that operators are not the people making OpenStack day in and day out is to discredit the work we all do. "Operators" aren't some separate set of people from "developers." Operators are us. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at fried.cc Fri May 31 16:31:32 2019 From: openstack at fried.cc (Eric Fried) Date: Fri, 31 May 2019 11:31:32 -0500 Subject: [nova] dropping 2013.1 tag on pypi In-Reply-To: References: Message-ID: Wow, that *is* confusing. Yes, please delete the tag. (Note to other readers: Even with mnaser's instructions it took me a bit to find the "real" nova by expanding the "Release history" using the button on the left: https://pypi.org/project/nova/#history) efried On 5/31/19 10:19 AM, Mohammed Naser wrote: > Hi everyone, > > I've recently had some folks reach out who were quite confused by the > fact that searching for 'nova' on pypi shows a 2013 release, and then > hitting the correct path: > > https://pypi.org/project/nova/19.0.0/ > > shows that there is a new version available. I know usually deleting > releases is a bit of a 'don't do this' thing, but in this case, that's > a 6 year old unsupported/never-to-be-used again release and I think to > make life easier for consumers, we should maybe drop it. > > Thoughts? > > Regards, > Mohammed > > From fungi at yuggoth.org Fri May 31 16:41:02 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 31 May 2019 16:41:02 +0000 Subject: [uc][tc][ops] reviving osops- repos In-Reply-To: References: <20190530205552.falsvxcegehtyuge@yuggoth.org> <20190531123501.tawgvqgsw6yle2nu@csail.mit.edu> Message-ID: <20190531164102.5lwt2jyxk24u3vdz@yuggoth.org> On 2019-05-31 10:24:36 -0400 (-0400), Erik McCormick wrote: [...] > there's a project [1]. > > So either: > A) Make a SIG out of that and assign the repos to the sig, or > B) Maybe add it under / rename the Ops Docs SIG [2] as it might bring > more eyes to both things which serve the same folks. [...] I'd also be perfectly fine with C) say that it's being vouched for by the UC through its Osops project, stick these repos in a list *somewhere* as a durable record of that, and let decisions about project vs. SIG decision be independent of the repository naming decision. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Fri May 31 17:18:46 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 31 May 2019 17:18:46 +0000 Subject: [nova] dropping 2013.1 tag on pypi In-Reply-To: References: Message-ID: <20190531171846.zgvggddfmtfrnyqt@yuggoth.org> On 2019-05-31 11:19:05 -0400 (-0400), Mohammed Naser wrote: > I've recently had some folks reach out who were quite confused by the > fact that searching for 'nova' on pypi shows a 2013 release, and then > hitting the correct path: > > https://pypi.org/project/nova/19.0.0/ > > shows that there is a new version available. 
I know usually deleting > releases is a bit of a 'don't do this' thing, but in this case, that's > a 6 year old unsupported/never-to-be-used again release and I think to > make life easier for consumers, we should maybe drop it. We've been deleting any of the old date-based releases of official OpenStack deliverables from PyPI as we come across them. For projects which recently (re)started publishing releases there this may have been missed. At least in the case of nova it clearly was. I have manually deleted the 2013.1 release of nova from PyPI now. Don't hesitate to let me know if you spot anything similar on others and I'm happy to do the same to them as well. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at medberry.net Fri May 31 17:41:50 2019 From: openstack at medberry.net (David Medberry) Date: Fri, 31 May 2019 11:41:50 -0600 Subject: [uc][tc][ops] reviving osops- repos In-Reply-To: <20190531164102.5lwt2jyxk24u3vdz@yuggoth.org> References: <20190530205552.falsvxcegehtyuge@yuggoth.org> <20190531123501.tawgvqgsw6yle2nu@csail.mit.edu> <20190531164102.5lwt2jyxk24u3vdz@yuggoth.org> Message-ID: I came to this thread late, but I'm very glad for the clarity brought to the discussion. Yes, yes, yes, leave this in openstack. Yes, operators wrote/write/will write OpenStack (but yes, also, there are full-time devels that never run an openstack for others. I'm not sure anyone is arguing against that.) Yes, this occurred well before the emergence of SIGs and maybe that ownership question is a valid concern. Who has commit/+2 rights to this repo once it exists (again). I don't have a stake in this game but maybe there should be a poll somewhere originating in openstack.org that goes out and queries anyone registered with openstack.org to find out if a) they are still involved b) if they self identify as an operator c) if they self identify are they interested in joining a SIG focused on ops. OpenStack operators are already having some trouble reaching/contacting/keeping involved other operators. Anything we can do to unite in force will be a good thing. -dave "med" medberry From jim at jimrollenhagen.com Fri May 31 18:08:31 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Fri, 31 May 2019 14:08:31 -0400 Subject: [tc][all] Github mirroring (or lack thereof) for unofficial projects In-Reply-To: References: <20190503190538.GB3377@localhost.localdomain> <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> <8d81b9a7-b460-43e1-a774-9bd65ee42143@www.fastmail.com> <20190530180658.xgpcy35au72ccmzt@yuggoth.org> Message-ID: On Thu, May 30, 2019 at 3:15 PM Jim Rollenhagen wrote: > On Thu, May 30, 2019 at 2:18 PM Jeremy Stanley wrote: > >> On 2019-05-30 09:00:20 -0700 (-0700), Clark Boylan wrote: >> [...] >> > If you provide us with the canonical list of things to archive I >> > think we can probably script that up or do lots of clicking >> > depending on the size of the list I guess. >> [...] >> >> Alternatively, I's like to believe we're at the point where we can >> add other interested parties to the curating group for the openstack >> org on GH, at which point any of them could volunteer to do the >> archiving. >> > > Thanks Clark/Jeremy. I'll make a list tomorrow, as we'll > need that in either case. 
:) > I think what we want is to archive all Github repos in the openstack, openstack-dev, and openstack-infra orgs, which don't have something with the same name on Gitea in the openstack namespace. Is that right? If so, here's the list I came up with: http://paste.openstack.org/show/752373/ And the code, in case I win the lottery and disappear: http://paste.openstack.org/show/752374/ // jim > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dtroyer at gmail.com Fri May 31 18:18:56 2019 From: dtroyer at gmail.com (Dean Troyer) Date: Fri, 31 May 2019 13:18:56 -0500 Subject: [cyborg][nova][sdk]Cyborgclient integration In-Reply-To: <1CC272501B5BC543A05DB90AA509DED5275774BC@fmsmsx122.amr.corp.intel.com> References: <00be01d51167$46a0e1b0$d3e2a510$@hco.ntt.co.jp> <1CC272501B5BC543A05DB90AA509DED5275753DD@fmsmsx122.amr.corp.intel.com> <1CC272501B5BC543A05DB90AA509DED527576F84@fmsmsx122.amr.corp.intel.com> <8457085b04e80164fe45eb54a3b54afab7c10bfb.camel@redhat.com> <1CC272501B5BC543A05DB90AA509DED5275774BC@fmsmsx122.amr.corp.intel.com> Message-ID: On Thu, May 30, 2019 at 5:22 PM Nadathur, Sundar wrote: > The osc documentation [1] says the syntax should be 'object-1 action object-2'. Your other points are well-taken. > > [1] https://docs.openstack.org/python-openstackclient/latest/contributor/humaninterfaceguide.html#command-structure For commands that involve two resources, yes. There are only a handful of commands with two resources (objects). > The object itself is called a device profile, in the specs and in code. To be honest I don't care what the specs or code call it, what is important is what the users will know it as. 'device profile' needs a qualifier it is too generic. 'accelerator device profile' and 'accelerator profile' mean the same thing to me if you decide that the type of device you are referring to is an accelerator. The only way to stick with 'device profile' is to add an option defining the type, similar to how the limits and quota commands work. > > This is true, the number of people reviewing regularly on OSC outside their > > specific project commands is small. > > This may be the clinching argument. Also, Sean's observation that "as it stands we have been moving [toward] the everything is a plugin side of that scale." Since we need to deliver the client by Train, and the Cyborg team doing that is also doing other activities, perhaps we should keep the timeline as the main factor. Sean does not speak for the OSC team, that seems to be a sentiment expressed last fall by some Nova devs. You do what is right for your team, I want you to have correct information to make that decision. I am not aware of any actual effort to remove any of the commands from the OSC repo either now or in the forseeable future. dt -- Dean Troyer dtroyer at gmail.com From mark at stackhpc.com Fri May 31 18:21:27 2019 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 31 May 2019 19:21:27 +0100 Subject: [kolla] Virtual PTG summary Message-ID: Hi, This week we held a virtual PTG. The etherpad [1] includes the agenda and notes. Here is my summary of what was discussed, decisions and actions. # Cross-project We started the sessions with a visitor from Tripleo - Tengu - who is working on validations [2]. Validations are a way of checking a control plane at various points in its life cycle, usually to check that some operation was completed successfully. 
Tengu described the project and explained how there was interest in making it a more generic framework that could be used by kolla ansible and other deployment projects. On our side there was no objection but it would require an owner. I offered to help Tengu with a PoC to see how it would work. # General Next we moved onto general project matters. ## Sustainability We've seen a slow decline in contribution over the past few cycles, as have many other projects. Part of this is natural for a more mature project that now mostly does what is required of it. Still, its good to reflect on sustainability, and plan ahead for leaner times. As a project we have a huge support matrix, with hundreds of container images, multiple OS distributions, different CPU architectures and a plethora of configuration options. We decided that a good starting point would be to define some categories of support: * images we maintain * images we more or less care about * the rest (once broken we may or may not work on fixing) The wording may need some work :) We can extend this also to OS distros, CPU arches, service deployment, and features. A key factor here will be the level of test coverage. We would document this on the wiki or docs.o.o. Ultimately this is all best effort, given that we're an open source project with a community of volunteers. A way to improve sustainability we discussed is through trimming the feature matrix. * OracleLinux - this is small but non-zero maintenance overhead. It's a candidate because we don't see people using it and Oracle maintainers left the project. * Debian binary images - not seeing many users of these. * Ceph - this is well used but does require maintenance. We support integration with external ceph clusters, and have discussed switching to recommending ceph-ansible, with a documented, automated and tested migration path. * kolla-cli - this project was added a few cycles ago, then the maintainers left the community. We haven't released it, CI is broken and have heard no complaints. For features which do not require much code we could disable testing, and move them to an explicit 'unmaintained' status. Alternatively, we could deprecate then remove them. I will follow up with emails asking for community feedback on the impact of removal. ## Core team We're looking out for potential core reviewers. If you'd like to join the team, speak to one of us and we can help you on the path. The main factor here is quality, thoughtful reviews. ## Meetings Recently attendance at IRC meetings has been low. We decided to move the meetings to #openstack-kolla to try to get more people involved. ## Kayobe Kayobe seeks to become an official project. This could be under kolla project governance, or as a separate project - the main deciding factor will be the kolla team. I will send an email about this separately. ## Priorities I would like us to agree as a team on some priorities for the Train cycle. I will send out a separate mail about this. # Kolla (images) ## Python 3 images In the Train cycle all projects should be moving to support python 3 only. We therefore need to build images based on python 3. hrw has been working on this [3] for Ubuntu and Debian source images. We will also need to switch to python 3 for CentOS/RHEL source images. This work needs an owner. A related issue is that of CentOS 8, which will support only python 3. I will follow up with a mail about this. Binary images (RPM/deb) depend on the distros and their plans for python 3. 
There is some python 3 work in kolla-ansible, to ensure we can execute ansible via python both locally and on remote hosts. ## Health checks Tripleo builds some healthchecks [4] into their docker images. This allows Docker to restart a service if it is deemed to be unhealthy. We agreed it would be nice to see support for this in kolla but did not discuss in depth. ## Ansible version upgrade Ansible moves on, and so must we. The version of ansible in kolla_toolbox is now rather old (2.2), and our minimum ansible version for kolla-ansible (2.5) is also getting old. It is likely we will need to update some of the custom ansible modules in order to do this. We may be able to replace others with modules from Ansible (e.g. kolla_keystone_user, kolla_keystone_service). mnasiadka offered to pick this up after the Ceph Nautilus upgrade. ## Fluentd upgrade The fluentd service needs an update - we are using 0.12.something. Needs an owner. ## Machine readable image config The issues this would solve have now mostly been fixed in tools/version-check.py, so we'll probably leave it. ## Buildah support At the Denver summit there was some interest in support for buildah as an alternative to docker for building images. I am led to believe tripleo already does this, so will ask how they do it. # Kolla Ansible ## Test coverage We made some good progress on improving CI test coverage during the Stein cycle, adding these jobs: * Cinder LVM * Scenario NFV * Major version upgrades * Zun During the Train cycle we aim to add these: * MariaDB: https://review.opendev.org/655663 * Monasca: https://review.opendev.org/649893 (WIP) * Ironic: https://review.opendev.org/568829 * Ceph upgrade: https://review.opendev.org/658132 * Tempest: https://review.opendev.org/402122 (WIP) Other candidates include magnum, octavia and prometheus. ## Nova cells v2 We had a long discussion about nova cells v2 [5]. Thanks to jroll and johnthetubaguy for getting involved. We generally agreed that this should be part of a wider assessment of scalability in kolla-ansible, although I was keen to treat different aspects of this separately during development. There are a number of ways to approach a multi-cell deployment, particularly in relation to how the per-cell infrastructure (DB, MQ, nova-conductor) are deployed. We discussed building a flexible mechanism for stamping out this infrastructure to arbitrary locations, then being able to point services in each cell at a given infrastructure location. In an effort to make this as simple to use for deployers as possible, we discussed defining a reference large scale cloud architecture. Our first pass at this is as follows: * API controllers x3+: APIs, super conductors, Galera, RabbitMQ, Keystone, etc * Cell controller clusters x3+: cell conductors, Galera, RabbitMQ, Glance, maybe neutron? (or one per "failure domain / AZ") * Cell computes: nova compute, neutron agents We would aim to add a CI job based on a two cell cloud using this architecture. We moved on to operations in a multi-cell world. It should be easy to add API controllers, cell controllers and compute nodes. We would want the ability to operate on the cloud as a whole, or individual cells using --limit. Upgrades seem likely to introduce challenges, particularly if we want to upgrade cells individually. It's likely CERN and others have some good experience we could benefit from here. The spec [6] needs an update based on this discussion, but comments there are welcome. 
## Others We ran out of time before discussing the other kolla-ansible issues. Please update the etherpad [1] if you have thoughts on any of these. # Thanks Thanks to everyone who attended the kolla PTG. A conference call is certainly more challenging than a face to face dIscussion, but it was also nice to not have to fly people around the world for a few design discussions. I feel we did make progress on a number of important issues. If anyone has feedback on how we could improve next time, please get in touch. Cheers, Mark [1] https://etherpad.openstack.org/p/kolla-train-ptg [2] https://docs.openstack.org/tripleo-validations/latest/readme.html [3] https://review.opendev.org/642375 [4] https://github.com/openstack/tripleo-common/tree/master/healthcheck [5] https://blueprints.launchpad.net/kolla-ansible/+spec/support-nova-cells [6] https://review.openstack.org/616645 From mark at stackhpc.com Fri May 31 18:28:13 2019 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 31 May 2019 19:28:13 +0100 Subject: [docs][kolla] Project deploy guide Message-ID: Hi, It was recently noticed that the kolla project deploy guide [1] was out of sync with the kolla-ansible docs [2]. This turned out to be because we broke the deploy-guide build. We're working on fixing this. This made me question why we have a separate deploy guide, which duplicates some but not all of the kolla-ansible docs. Tripleo links to their own install guide. Is there any reason why we (kolla) should not do the same? Honestly, I was not even aware of the deploy guide before this episode, perhaps I have missed the history behind it. Cheers, Mark [1] https://docs.openstack.org/project-deploy-guide/kolla-ansible/stein/ [2] https://docs.openstack.org/kolla-ansible/stein/user/quickstart.html From haleyb.dev at gmail.com Fri May 31 18:44:15 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Fri, 31 May 2019 14:44:15 -0400 Subject: [neutron] [OVN] ML2+OVS+DVR convergence with OVN spec In-Reply-To: <85615a29-4b1b-bae5-2a14-4b625edf4f28@suse.com> References: <85615a29-4b1b-bae5-2a14-4b625edf4f28@suse.com> Message-ID: Hi Ryan, On 5/28/19 11:35 AM, Ryan Tidwell wrote: > Hello neutrinos, > > As discussed recently at the Denver PTG [1] and in the neutron drivers > meeting last Friday May 24th [2], I have started on a spec for ML2+OVS > and OVN convergence [3]. It is in very rough shape at the moment, but I > have pushed a rough outline so this can be developed as collaboratively > as possible starting now. I personally don't have all the information to > fill out the spec right at the moment but I'm sure it can be found > across the community of folks working on neutron and OVN, so please feel > free to comment and add relevant information to the spec. Thanks for starting this, would be good to decide where we want to be, so we don't possibly duplicate effort unnecessarily. I'd be happy to add more information to the spec, or start others if we need to dig deeper into specific things. Thanks again! 
-Brian > [1] > http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006408.html > > [2] > http://eavesdrop.openstack.org/meetings/neutron_drivers/2019/neutron_drivers.2019-05-24-14.00.log.txt > > [3] https://review.opendev.org/#/c/658414 > > > From rtidwell at suse.com Fri May 31 19:56:07 2019 From: rtidwell at suse.com (Ryan Tidwell) Date: Fri, 31 May 2019 14:56:07 -0500 Subject: [neutron] [OVN] ML2+OVS+DVR convergence with OVN spec In-Reply-To: References: <85615a29-4b1b-bae5-2a14-4b625edf4f28@suse.com> Message-ID: Brian, Thanks, any help is greatly appreciated. Feel free to add to the spec, the current patch set is purposefully very raw. I'm hoping this can be a collaborative effort. I do expect to find specific things to drill into that will warrant more specs/RFE's, but I view the exercise of filling out this umbrella spec as a sort of "discovery" phase that will help us find those things. -Ryan On 5/31/19 1:44 PM, Brian Haley wrote: > Hi Ryan, > > On 5/28/19 11:35 AM, Ryan Tidwell wrote: >> Hello neutrinos, >> >> As discussed recently at the Denver PTG [1] and in the neutron drivers >> meeting last Friday May 24th [2], I have started on a spec for ML2+OVS >> and OVN convergence [3]. It is in very rough shape at the moment, but I >> have pushed a rough outline so this can be developed as collaboratively >> as possible starting now. I personally don't have all the information to >> fill out the spec right at the moment but I'm sure it can be found >> across the community of folks working on neutron and OVN, so please feel >> free to comment and add relevant information to the spec. > > Thanks for starting this, would be good to decide where we want to be, > so we don't possibly duplicate effort unnecessarily. > > I'd be happy to add more information to the spec, or start others if > we need to dig deeper into specific things. > > Thanks again! > > -Brian > >> [1] >> http://lists.openstack.org/pipermail/openstack-discuss/2019-May/006408.html >> >> >> [2] >> http://eavesdrop.openstack.org/meetings/neutron_drivers/2019/neutron_drivers.2019-05-24-14.00.log.txt >> >> >> [3] https://review.opendev.org/#/c/658414 >> >> >> > > From mnaser at vexxhost.com Fri May 31 22:41:05 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 31 May 2019 18:41:05 -0400 Subject: [openstack-ansible][powervm] dropping support Message-ID: Hi everyone, I've pushed up a patch to propose dropping support for PowerVM support inside OpenStack Ansible. There has been no work done on this for a few years now, the configured compute driver is the incorrect one for ~2 years now which indicates that no one has been able to use it for that long. It would be nice to have this driver however given the infrastructure we have upstream, there would be no way for us to effectively test it and bring it back to functional state. I'm proposing that we remove the code here: https://review.opendev.org/662587 powervm: drop support If you're using this code and would like to contribute to fixing it and (somehow) adding coverage, please reach out, otherwise, we'll drop this code to clean things up. Thanks, Mohammed -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. 
http://vexxhost.com From cboylan at sapwetik.org Fri May 31 23:50:10 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Fri, 31 May 2019 16:50:10 -0700 Subject: =?UTF-8?Q?Re:_[tc][all]_Github_mirroring_(or_lack_thereof)_for_unofficia?= =?UTF-8?Q?l_projects?= In-Reply-To: References: <20190503190538.GB3377@localhost.localdomain> <20190515175110.26i2xuclkksgx744@arabian.linksys.moosehall> <8d81b9a7-b460-43e1-a774-9bd65ee42143@www.fastmail.com> <20190530180658.xgpcy35au72ccmzt@yuggoth.org> Message-ID: On Fri, May 31, 2019, at 11:09 AM, Jim Rollenhagen wrote: > On Thu, May 30, 2019 at 3:15 PM Jim Rollenhagen wrote: > > On Thu, May 30, 2019 at 2:18 PM Jeremy Stanley wrote: > >> On 2019-05-30 09:00:20 -0700 (-0700), Clark Boylan wrote: > >> [...] > >> > If you provide us with the canonical list of things to archive I > >> > think we can probably script that up or do lots of clicking > >> > depending on the size of the list I guess. > >> [...] > >> > >> Alternatively, I's like to believe we're at the point where we can > >> add other interested parties to the curating group for the openstack > >> org on GH, at which point any of them could volunteer to do the > >> archiving. > > > > Thanks Clark/Jeremy. I'll make a list tomorrow, as we'll > > need that in either case. :) > > I think what we want is to archive all Github repos in the > openstack, openstack-dev, and openstack-infra orgs, > which don't have something with the same name on > Gitea in the openstack namespace. Is that right? Close, I think we can archive all repos in openstack-dev and openstack-infra. Part of the repo renames we did today were to get the repos that were left behind in those two orgs into their longer term homes. Then any project in https://github.com/openstack that is not in https://opendev.org/openstack can be archived in Github too. > > If so, here's the list I came up with: > http://paste.openstack.org/show/752373/ > > And the code, in case I win the lottery and disappear: > http://paste.openstack.org/show/752374/ > > // jim From jlibosva at redhat.com Fri May 31 08:53:30 2019 From: jlibosva at redhat.com (Jakub Libosvar) Date: Fri, 31 May 2019 10:53:30 +0200 Subject: [neutron][networking-ovn] Core team updates In-Reply-To: References: Message-ID: <2e3ac83e-63bd-2107-2d41-943d483b0687@redhat.com> Thanks for your trust! I'll try to do my best! Looking forward to our future collaboration. Jakub On 31/05/2019 10:38, Lucas Alvares Gomes wrote: > Hi all, > > I'd like to welcome Jakub Libosvar to the networking-ovn core team. > The team was in need for more reviewers with +2/+A power and Jakub's > reviews have been super high quality [0][1]. He's also helping the > project out in many other different efforts such as bringing in the > full stack test suit and bug fixes. > > Also, Miguel Ajo has changed focus from OVN/networking-ovn and is been > dropped from the core team. Of course, we will welcome him back when > his activity picks back up again. > > Thank you Jakub and Miguel! > > [0] https://www.stackalytics.com/report/contribution/networking-ovn/30 > [1] https://www.stackalytics.com/report/contribution/networking-ovn/90 > > Cheers, > Lucas >