From gmann at ghanshyammann.com Wed Jan 1 01:03:22 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 31 Dec 2019 19:03:22 -0600 Subject: [qa] QA Office hour new timing for 2020 Message-ID: <16f5ea0ef6f.1010065b175496.3744470588375359656@ghanshyammann.com> Hello Everyone, We have a few members who have actively started contributing to QA from India as well as from the Europe TZ. The current office hour time is convenient only for CT and JST, and is very difficult for members from India and Europe to join. I would like to adjust the office hour timing to include all four TZs. To do that, someone has to wake up early or stay late at night :). I gave preference to the new members and selected Tuesday 13.30 UTC [1], which will be early morning for me and late night in Tokyo. Let me know if there is any objection or a better suggestion. Accordingly, I will make the new time effective from 7th Jan. I am also cancelling the 1st Jan office hour, and wish you all a very happy new year. [1] https://www.timeanddate.com/worldclock/meetingdetails.html?year=2020&month=1&day=7&hour=13&min=30&sec=0&p1=265&p2=204&p3=771&p4=248&iv=1800 -gmann
From gouthampravi at gmail.com Wed Jan 1 14:39:02 2020 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Wed, 1 Jan 2020 20:09:02 +0530 Subject: [manila] No IRC/Community meeting on 2nd January 2020 Message-ID: Hello Zorillas, Due to the holidays we're expecting multiple community members (myself included) to be unavailable for the weekly IRC meeting tomorrow (2nd January 2020, 15:00 UTC), so we'll skip it. Please add any agenda items to the next meeting (9th January 2020, 15:00 UTC) [1]. Happy New Year, and the rest of your holidays! Thanks, Goutham [1] https://wiki.openstack.org/wiki/Manila/Meetings -------------- next part -------------- An HTML attachment was scrubbed... URL:
From tim at swiftstack.com Thu Jan 2 06:10:37 2020 From: tim at swiftstack.com (Tim Burke) Date: Wed, 01 Jan 2020 22:10:37 -0800 Subject: [stein][cinder][backup][swift] issue In-Reply-To: References: Message-ID: Hi Ignazio, That's expected behavior with rados gateway. They follow S3's lead and have a unified namespace for containers across all tenants. From their documentation [0]: If a container with the same name already exists, and the user is the container owner then the operation will succeed. Otherwise the operation will fail. FWIW, that's very much a Ceph-ism -- Swift proper allows each tenant full and independent control over their namespace. Tim [0] https://docs.ceph.com/docs/mimic/radosgw/swift/containerops/#http-response On Mon, 2019-12-30 at 15:48 +0100, Ignazio Cassano wrote: > Hello All, > I configured openstack stein on centos 7 witch ceph. > Cinder works fine and object storage on ceph seems to work fine: I > can clreate containers, volume etc .....
> > I configured cinder backup on swift (but swift is using ceph rados > gateway) : > > backup_driver = cinder.backup.drivers.swift.SwiftBackupDriver > swift_catalog_info = object-store:swift:publicURL > backup_swift_enable_progress_timer = True > #backup_swift_url = http://10.102.184.190:8080/v1/AUTH_ > backup_swift_auth_url = http://10.102.184.190:5000/v3 > backup_swift_auth = per_user > backup_swift_auth_version = 1 > backup_swift_user = admin > backup_swift_user_domain = default > #backup_swift_key = > #backup_swift_container = volumebackups > backup_swift_object_size = 52428800 > #backup_swift_project = > #backup_swift_project_domain = > backup_swift_retry_attempts = 3 > backup_swift_retry_backoff = 2 > backup_compression_algorithm = zlib > > If I run a backup as user admin, it creates a container named > "volumebackups". > If I run a backup as user demo and I do not specify a container name, > it tires to write on volumebackups and gives some errors: > > ClientException: Container PUT failed: > http://10.102.184.190:8080/swift/v1/AUTH_964f343cf5164028a803db91488bdb01/volumebackups > 409 Conflict BucketAlreadyExists > > > Does it mean I cannot use the same containers name on differents > projects ? > > My ceph.conf is configured for using keystone: > [client.rgw.tst2-osctrl01] > rgw_frontends = "civetweb port=10.102.184.190:8080" > # Keystone information > rgw keystone api version = 3 > rgw keystone url = http://10.102.184.190:5000 > rgw keystone admin user = admin > rgw keystone admin password = password > rgw keystone admin domain = default > rgw keystone admin project = admin > rgw swift account in url = true > rgw keystone implicit tenants = true > > > > Any help, please ? > Best Regards > Ignazio
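A quick way to see the flat-namespace behaviour Tim describes, and one possible workaround, might look like the following (the project and container names are only illustrative and follow the example above):

$ openstack --os-project-name admin container create volumebackups   # succeeds, admin becomes the container owner
$ openstack --os-project-name demo container create volumebackups    # fails against radosgw with 409 Conflict (BucketAlreadyExists)

# Until implicit tenants are enabled on the gateway, giving each project its own
# backup container avoids the name clash, e.g.:
$ openstack volume backup create --container demo-volumebackups <volume-id>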
From ignaziocassano at gmail.com Thu Jan 2 06:38:09 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 2 Jan 2020 07:38:09 +0100 Subject: [stein][cinder][backup][swift] issue In-Reply-To: References: Message-ID: Many Thanks, Tim Ignazio Il giorno gio 2 gen 2020 alle ore 07:10 Tim Burke ha scritto: > Hi Ignazio, > > That's expected behavior with rados gateway. They follow S3's lead and > have a unified namespace for containers across all tenants. From their > documentation [0]: > > If a container with the same name already exists, and the user is > the container owner then the operation will succeed. Otherwise the > operation will fail. > > FWIW, that's very much a Ceph-ism -- Swift proper allows each tenant > full and independent control over their namespace. > > Tim > > [0] > https://docs.ceph.com/docs/mimic/radosgw/swift/containerops/#http-response > > On Mon, 2019-12-30 at 15:48 +0100, Ignazio Cassano wrote: > > Hello All, > > I configured openstack stein on centos 7 witch ceph. > > Cinder works fine and object storage on ceph seems to work fine: I > > can clreate containers, volume etc ..... > > > > I configured cinder backup on swift (but swift is using ceph rados > > gateway) : > > > > backup_driver = cinder.backup.drivers.swift.SwiftBackupDriver > > swift_catalog_info = object-store:swift:publicURL > > backup_swift_enable_progress_timer = True > > #backup_swift_url = http://10.102.184.190:8080/v1/AUTH_ > > backup_swift_auth_url = http://10.102.184.190:5000/v3 > > backup_swift_auth = per_user > > backup_swift_auth_version = 1 > > backup_swift_user = admin > > backup_swift_user_domain = default > > #backup_swift_key = > > #backup_swift_container = volumebackups > > backup_swift_object_size = 52428800 > > #backup_swift_project = > > #backup_swift_project_domain = > > backup_swift_retry_attempts = 3 > > backup_swift_retry_backoff = 2 > > backup_compression_algorithm = zlib > > > > If I run a backup as user admin, it creates a container named > > "volumebackups". > > If I run a backup as user demo and I do not specify a container name, > > it tires to write on volumebackups and gives some errors: > > > > ClientException: Container PUT failed: > > http://10.102.184.190:8080/swift/v1/AUTH_964f343cf5164028a803db91488bdb01/volumebackups > > 409 Conflict BucketAlreadyExists > > > > > > Does it mean I cannot use the same containers name on differents > > projects ?
> > > > My ceph.conf is configured for using keystone: > > [client.rgw.tst2-osctrl01] > > rgw_frontends = "civetweb port=10.102.184.190:8080" > > # Keystone information > > rgw keystone api version = 3 > > rgw keystone url = http://10.102.184.190:5000 > > rgw keystone admin user = admin > > rgw keystone admin password = password > > rgw keystone admin domain = default > > rgw keystone admin project = admin > > rgw swift account in url = true > > rgw keystone implicit tenants = true > > > > > > > > Any help, please ? > > Best Regards > > Ignazio > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From svyas at redhat.com Thu Jan 2 12:27:52 2020 From: svyas at redhat.com (Soniya Vyas) Date: Thu, 2 Jan 2020 17:57:52 +0530 Subject: openstack-discuss Digest, Vol 15, Issue 1 In-Reply-To: References: Message-ID: > Message: 4 > Date: Tue, 31 Dec 2019 19:03:22 -0600 > From: Ghanshyam Mann > To: "openstack-discuss" > Subject: [qa] QA Office hour new timing for 2020 > Message-ID: > <16f5ea0ef6f.1010065b175496.3744470588375359656 at ghanshyammann.com> > Content-Type: text/plain; charset="UTF-8" > > Hello Everyone, > > We have few members who actively started contribution to QA from India as well as from Europe TZ. > The current office hour time is more convenient for CT and JST only and very difficult for members from India > Europe to join. > > I would like to adjust office hour timing to include all four TZs. To do that someone has to wake up early or stay > late at night :). I gave preference to new members and selected Tuesday 13.30 UTC [1]. which will be an early morning > for me and late-night in Tokyo. > > let me know if any objection or better suggestion. Accordingly, I will make the new time effective from 7th Jan. > > Also cancelling the 1st Jan office hour and wish you all a very happy new year. > > [1] https://www.timeanddate.com/worldclock/meetingdetails.html?year=2020&month=1&day=7&hour=13&min=30&sec=0&p1=265&p2=204&p3=771&p4=248&iv=1800 Thanks a lot to whole QA team and Ghanshyam Mann for considering our timing issues. Looking forward to join QA Office hours meeting. Regards, Soniya Vyas From mnaser at vexxhost.com Thu Jan 2 20:02:57 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 2 Jan 2020 15:02:57 -0500 Subject: [openstack-ansible] strange execution delays In-Reply-To: References: Message-ID: Hi Joe, Those timeouts re almost 99% the reason behind this issue. I'd suggest restarting systemd-logind and seeing how that fares: systemctl restart systemd-logind If the issue persists or happens again, I'm not sure, but those timeouts are 100% a cause of issue here. Thanks, Mohammed On Mon, Dec 30, 2019 at 2:51 PM Joe Topjian wrote: > > Hi Mohammad, > >> Do you have any PAM modules that might be hitting some sorts of >> external API for auditing purposes that may be throttling you? > > > Not unless OSA would have configured something. The deployment is *very* standard, heavily leveraging default values. > > DNS of each container is configured to use LXC host for resolution. The host is using the systemd-based resolver, but is pointing to a local, dedicated upstream resolver. I want to point the problem there, but we've run into this issue in two different locations, one of which has an upstream DNS resolver that I'm confident does not throttle requests. But, hey, it's DNS - maybe it's still the cause. > >> >> How is systemd-logind feeling? Anything odd in your system logs? > > > Yes. 
We have a feeling it's *something* with systemd, but aren't exactly sure what. Affected containers' logs end up with a lot of the following entries: > > Dec 3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: Successful su for root by root > Dec 3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: + ??? root:root > Dec 3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: pam_unix(su:session): session opened for user root by (uid=0) > Dec 3 20:30:27 infra1-repo-container-a0f194b3 dbus-daemon[47]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms) > Dec 3 20:30:42 infra1-repo-container-a0f194b3 su[4170]: pam_systemd(su:session): Failed to create session: Connection timed out > Dec 3 20:30:43 infra1-repo-container-a0f194b3 su[4170]: pam_unix(su:session): session closed for user root > > But we aren't sure if those timeouts are a symptom of cause. > > Thanks for your help! > > Joe -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. https://vexxhost.com From openstack at nemebean.com Thu Jan 2 20:20:53 2020 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 2 Jan 2020 14:20:53 -0600 Subject: [oslo][kolla][requirements][release][infra] Hit by an old, fixed bug In-Reply-To: References: <20191230150137.GA9057@sm-workstation> Message-ID: <79cddc25-88e0-b5dd-8b8a-17cf14b9c4b1@nemebean.com> On 12/30/19 9:52 AM, Radosław Piliszek wrote: > Thanks, Sean! I knew I was missing something really basic! > I was under the impression that 9.x is Stein, like it happens with > main projects (major=branch). > I could not find any doc explaining oslo.messaging versioning, perhaps > Oslo could release 9.5.1 off the stein branch? Oslo for the most part follows semver, so we only bump major versions when there is a breaking change. We bump minor versions each release so we can do bugfix releases on the previous stable branch without stepping on master releases. The underlying cause of this is likely that I'm way behind on releasing the Oslo stable branches. It's high on my todo list now that most people are back from holidays and will be around to help out if a release breaks something. However, anyone can propose a release[0][1] (contrary to what [0] suggests), so if the necessary fix is already on stable/stein and just hasn't been released yet please feel free to do that. You'll just need a +1 from either myself or hberaud (the Oslo release liaison) before the release team will approve it. 0: https://releases.openstack.org/reference/using.html#requesting-a-release 1: https://releases.openstack.org/reference/using.html#using-new-release-command > > The issue remains that, even though oslo backports bugfixes into their > stable branches, kolla (and very possibly other deployment solutions) > no longer benefit from them. > > -yoctozepto > > pon., 30 gru 2019 o 16:01 Sean McGinnis napisał(a): >> >> On Sun, Dec 29, 2019 at 09:41:45PM +0100, Radosław Piliszek wrote: >>> Hi Folks, >>> >>> as the subject goes, my installation has been hit by an old bug: >>> https://bugs.launchpad.net/oslo.messaging/+bug/1828841 >>> (bug details not important, linked here for background) >>> >>> I am using Stein, deployed with recent Kolla-built source-based images >>> (with only slight modifications compared to vanilla ones). 
>>> Kolla's procedure for building source-based images considers upper >>> constraints, which, unfortunately, turned out to be lagging behind a >>> few releases w.r.t. oslo.messaging at least. >>> The fix was in 9.7.0 released on May 21, u-c still point to 9.5.0 from >>> Feb 26 and the latest of Stein is 9.8.0 from Jul 18. >>> >>> It seems oslo.messaging is missing from the automatic updates that bot proposes: >>> https://review.opendev.org/#/q/owner:%22OpenStack+Proposal+Bot%22+project:openstack/requirements+branch:stable/stein >>> >>> Per: >>> https://opendev.org/openstack/releases/src/branch/master/doc/source/reference/reviewer_guide.rst#release-jobs >>> this upper-constraint proposal should be happening for all releases. >>> >> >> This is normal and what is expected. >> >> Requirements are only updated for the branch in which those releases happen. So >> if there is a release of oslo.messaging for stable/train, only the stable/train >> upper constraints are updated for that new release. The stable/stein branch >> will not be affected because that shows what the tested upper constraints were >> for that branch. >> >> The last stable/stein release for oslo.messaging was 9.5.0: >> >> https://opendev.org/openstack/releases/src/branch/master/deliverables/stein/oslo.messaging.yaml#L49 >> >> And 9.5.0 is what is set in the stable/stein upper-constraints: >> >> https://opendev.org/openstack/requirements/src/branch/stable/stein/upper-constraints.txt#L146 >> >> To get that raised, whatever necessary bugfixes that are required in >> oslo.messaging would need to be backported per-cycle until stable/stein (as in, >> if it was in current master, it would need to be backported and merged to >> stable/train first, then stable/stein), and once merged a stable release would >> need to be proposed for that branch's version of the library. >> >> Once that stable release is done, that will propose the update to the upper >> constraint for the given branch. >> >>> I would be glad if someone investigated why it happens(/ed) and >>> audited whether other OpenStack projects don't need updating as well >>> to avoid running on old deps when new are awaiting for months. :-) >>> Please note this might apply to other branches as well. >>> >>> PS: for some reason oslo.messaging Stein release notes ( >>> https://docs.openstack.org/releasenotes/oslo.messaging/stein.html ) >>> are stuck at 9.5.0 as well, this could be right (I did not inspect the >>> sources) but I am adding this in PS so you have more things to >>> correlate if they need be. >>> >> >> Again, as expected. The last stable/stein release was 9.5.0, so that is correct >> that the release notes for stein only show up to that point. > From joe at topjian.net Thu Jan 2 20:54:25 2020 From: joe at topjian.net (Joe Topjian) Date: Thu, 2 Jan 2020 13:54:25 -0700 Subject: [openstack-ansible] strange execution delays In-Reply-To: References: Message-ID: Hi Mohammad, Restarting of systemd-logind would sometimes hang indefinitely, which is why we've defaulted to just a hard stop/start of the container. The problem then slowly begins to creep up again. If you haven't seen this behavior, then that's still helpful. We'll scour the environment trying to find *something* that might be causing it. Thanks, Joe On Thu, Jan 2, 2020 at 1:03 PM Mohammed Naser wrote: > Hi Joe, > > Those timeouts re almost 99% the reason behind this issue. 
I'd > suggest restarting systemd-logind and seeing how that fares: > > systemctl restart systemd-logind > > If the issue persists or happens again, I'm not sure, but those > timeouts are 100% a cause of issue here. > > Thanks, > Mohammed > > On Mon, Dec 30, 2019 at 2:51 PM Joe Topjian wrote: > > > > Hi Mohammad, > > > >> Do you have any PAM modules that might be hitting some sorts of > >> external API for auditing purposes that may be throttling you? > > > > > > Not unless OSA would have configured something. The deployment is *very* > standard, heavily leveraging default values. > > > > DNS of each container is configured to use LXC host for resolution. The > host is using the systemd-based resolver, but is pointing to a local, > dedicated upstream resolver. I want to point the problem there, but we've > run into this issue in two different locations, one of which has an > upstream DNS resolver that I'm confident does not throttle requests. But, > hey, it's DNS - maybe it's still the cause. > > > >> > >> How is systemd-logind feeling? Anything odd in your system logs? > > > > > > Yes. We have a feeling it's *something* with systemd, but aren't exactly > sure what. Affected containers' logs end up with a lot of the following > entries: > > > > Dec 3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: Successful su > for root by root > > Dec 3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: + ??? root:root > > Dec 3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: > pam_unix(su:session): session opened for user root by (uid=0) > > Dec 3 20:30:27 infra1-repo-container-a0f194b3 dbus-daemon[47]: [system] > Failed to activate service 'org.freedesktop.systemd1': timed out > (service_start_timeout=25000ms) > > Dec 3 20:30:42 infra1-repo-container-a0f194b3 su[4170]: > pam_systemd(su:session): Failed to create session: Connection timed out > > Dec 3 20:30:43 infra1-repo-container-a0f194b3 su[4170]: > pam_unix(su:session): session closed for user root > > > > But we aren't sure if those timeouts are a symptom of cause. > > > > Thanks for your help! > > > > Joe > > > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. https://vexxhost.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Thu Jan 2 21:29:42 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Thu, 2 Jan 2020 16:29:42 -0500 Subject: [openstack-ansible] strange execution delays In-Reply-To: References: Message-ID: I'd suggest looking at the dbus logs too which can include some interesting things, but yeah, this is certainly the source of your issues so I would dig around dbus/systemd-logind Good luck and keep us updated! On Thu, Jan 2, 2020 at 3:54 PM Joe Topjian wrote: > > Hi Mohammad, > > Restarting of systemd-logind would sometimes hang indefinitely, which is why we've defaulted to just a hard stop/start of the container. The problem then slowly begins to creep up again. > > If you haven't seen this behavior, then that's still helpful. We'll scour the environment trying to find *something* that might be causing it. > > Thanks, > Joe > > > On Thu, Jan 2, 2020 at 1:03 PM Mohammed Naser wrote: >> >> Hi Joe, >> >> Those timeouts re almost 99% the reason behind this issue. 
I'd >> suggest restarting systemd-logind and seeing how that fares: >> >> systemctl restart systemd-logind >> >> If the issue persists or happens again, I'm not sure, but those >> timeouts are 100% a cause of issue here. >> >> Thanks, >> Mohammed >> >> On Mon, Dec 30, 2019 at 2:51 PM Joe Topjian wrote: >> > >> > Hi Mohammad, >> > >> >> Do you have any PAM modules that might be hitting some sorts of >> >> external API for auditing purposes that may be throttling you? >> > >> > >> > Not unless OSA would have configured something. The deployment is *very* standard, heavily leveraging default values. >> > >> > DNS of each container is configured to use LXC host for resolution. The host is using the systemd-based resolver, but is pointing to a local, dedicated upstream resolver. I want to point the problem there, but we've run into this issue in two different locations, one of which has an upstream DNS resolver that I'm confident does not throttle requests. But, hey, it's DNS - maybe it's still the cause. >> > >> >> >> >> How is systemd-logind feeling? Anything odd in your system logs? >> > >> > >> > Yes. We have a feeling it's *something* with systemd, but aren't exactly sure what. Affected containers' logs end up with a lot of the following entries: >> > >> > Dec 3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: Successful su for root by root >> > Dec 3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: + ??? root:root >> > Dec 3 20:30:17 infra1-repo-container-a0f194b3 su[4170]: pam_unix(su:session): session opened for user root by (uid=0) >> > Dec 3 20:30:27 infra1-repo-container-a0f194b3 dbus-daemon[47]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms) >> > Dec 3 20:30:42 infra1-repo-container-a0f194b3 su[4170]: pam_systemd(su:session): Failed to create session: Connection timed out >> > Dec 3 20:30:43 infra1-repo-container-a0f194b3 su[4170]: pam_unix(su:session): session closed for user root >> > >> > But we aren't sure if those timeouts are a symptom of cause. >> > >> > Thanks for your help! >> > >> > Joe >> >> >> >> -- >> Mohammed Naser — vexxhost >> ----------------------------------------------------- >> D. 514-316-8872 >> D. 800-910-1726 ext. 200 >> E. mnaser at vexxhost.com >> W. https://vexxhost.com -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. https://vexxhost.com From matt at oliver.net.au Thu Jan 2 22:32:43 2020 From: matt at oliver.net.au (Matthew Oliver) Date: Fri, 3 Jan 2020 09:32:43 +1100 Subject: [stein][cinder][backup][swift] issue In-Reply-To: References: Message-ID: Tim, as always, has hit the nail on the head. By default rgw doesn't use explicit tenants. if you want to use RGW and explicit tenants.. ie no global container namespace, then you need to add: rgw keystone implicit tenants = true To you rgw client configuration in ceph.conf. See: https://docs.ceph.com/docs/master/radosgw/multitenancy/#swift-with-keystone Not sure what happens to existing containers when you enable this option, because I think my default things are considered to be in the 'default' tenant. matt On Thu, Jan 2, 2020 at 5:40 PM Ignazio Cassano wrote: > Many Thanks, Tim > Ignazio > > Il giorno gio 2 gen 2020 alle ore 07:10 Tim Burke ha > scritto: > >> Hi Ignazio, >> >> That's expected behavior with rados gateway. They follow S3's lead and >> have a unified namespace for containers across all tenants. 
From their >> documentation [0]: >> >> If a container with the same name already exists, and the user is >> the container owner then the operation will succeed. Otherwise the >> operation will fail. >> >> FWIW, that's very much a Ceph-ism -- Swift proper allows each tenant >> full and independent control over their namespace. >> >> Tim >> >> [0] >> https://docs.ceph.com/docs/mimic/radosgw/swift/containerops/#http-response >> >> On Mon, 2019-12-30 at 15:48 +0100, Ignazio Cassano wrote: >> > Hello All, >> > I configured openstack stein on centos 7 witch ceph. >> > Cinder works fine and object storage on ceph seems to work fine: I >> > can clreate containers, volume etc ..... >> > >> > I configured cinder backup on swift (but swift is using ceph rados >> > gateway) : >> > >> > backup_driver = cinder.backup.drivers.swift.SwiftBackupDriver >> > swift_catalog_info = object-store:swift:publicURL >> > backup_swift_enable_progress_timer = True >> > #backup_swift_url = http://10.102.184.190:8080/v1/AUTH_ >> > backup_swift_auth_url = http://10.102.184.190:5000/v3 >> > backup_swift_auth = per_user >> > backup_swift_auth_version = 1 >> > backup_swift_user = admin >> > backup_swift_user_domain = default >> > #backup_swift_key = >> > #backup_swift_container = volumebackups >> > backup_swift_object_size = 52428800 >> > #backup_swift_project = >> > #backup_swift_project_domain = >> > backup_swift_retry_attempts = 3 >> > backup_swift_retry_backoff = 2 >> > backup_compression_algorithm = zlib >> > >> > If I run a backup as user admin, it creates a container named >> > "volumebackups". >> > If I run a backup as user demo and I do not specify a container name, >> > it tires to write on volumebackups and gives some errors: >> > >> > ClientException: Container PUT failed: >> > >> http://10.102.184.190:8080/swift/v1/AUTH_964f343cf5164028a803db91488bdb01/volumebackups >> > 409 Conflict BucketAlreadyExists >> > >> > >> > Does it mean I cannot use the same containers name on differents >> > projects ? >> > >> > My ceph.conf is configured for using keystone: >> > [client.rgw.tst2-osctrl01] >> > rgw_frontends = "civetweb port=10.102.184.190:8080" >> > # Keystone information >> > rgw keystone api version = 3 >> > rgw keystone url = http://10.102.184.190:5000 >> > rgw keystone admin user = admin >> > rgw keystone admin password = password >> > rgw keystone admin domain = default >> > rgw keystone admin project = admin >> > rgw swift account in url = true >> > rgw keystone implicit tenants = true >> > >> > >> > >> > Any help, please ? >> > Best Regards >> > Ignazio >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL:
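For reference, the setting Matthew describes would normally go into the rgw client section of ceph.conf on the gateway node, and the radosgw service typically needs a restart before it takes effect (the section and unit names below follow Ignazio's example and may differ per deployment):

[client.rgw.tst2-osctrl01]
rgw keystone implicit tenants = true

$ systemctl restart ceph-radosgw@rgw.tst2-osctrl01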
From ignaziocassano at gmail.com Fri Jan 3 09:28:57 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 3 Jan 2020 10:28:57 +0100 Subject: [stein][cinder][backup][swift] issue In-Reply-To: References: Message-ID: Thanks, Matt. When I add rgw keystone implicit tenants = true, containers are created with the project id/name. Regards Ignazio Il giorno gio 2 gen 2020 alle ore 23:32 Matthew Oliver ha scritto: > Tim, as always, has hit the nail on the head. By default rgw doesn't use > explicit tenants. > if you want to use RGW and explicit tenants..
ie no global container > namespace, then you need to add: > > rgw keystone implicit tenants = true > > To you rgw client configuration in ceph.conf. > > See: > https://docs.ceph.com/docs/master/radosgw/multitenancy/#swift-with-keystone > > Not sure what happens to existing containers when you enable this option, > because I think my default things are considered to be in the 'default' > tenant. > > matt > > On Thu, Jan 2, 2020 at 5:40 PM Ignazio Cassano > wrote: > >> Many Thanks, Tim >> Ignazio >> >> Il giorno gio 2 gen 2020 alle ore 07:10 Tim Burke >> ha scritto: >> >>> Hi Ignazio, >>> >>> That's expected behavior with rados gateway. They follow S3's lead and >>> have a unified namespace for containers across all tenants. From their >>> documentation [0]: >>> >>> If a container with the same name already exists, and the user is >>> the container owner then the operation will succeed. Otherwise the >>> operation will fail. >>> >>> FWIW, that's very much a Ceph-ism -- Swift proper allows each tenant >>> full and independent control over their namespace. >>> >>> Tim >>> >>> [0] >>> >>> https://docs.ceph.com/docs/mimic/radosgw/swift/containerops/#http-response >>> >>> On Mon, 2019-12-30 at 15:48 +0100, Ignazio Cassano wrote: >>> > Hello All, >>> > I configured openstack stein on centos 7 witch ceph. >>> > Cinder works fine and object storage on ceph seems to work fine: I >>> > can clreate containers, volume etc ..... >>> > >>> > I configured cinder backup on swift (but swift is using ceph rados >>> > gateway) : >>> > >>> > backup_driver = cinder.backup.drivers.swift.SwiftBackupDriver >>> > swift_catalog_info = object-store:swift:publicURL >>> > backup_swift_enable_progress_timer = True >>> > #backup_swift_url = http://10.102.184.190:8080/v1/AUTH_ >>> > backup_swift_auth_url = http://10.102.184.190:5000/v3 >>> > backup_swift_auth = per_user >>> > backup_swift_auth_version = 1 >>> > backup_swift_user = admin >>> > backup_swift_user_domain = default >>> > #backup_swift_key = >>> > #backup_swift_container = volumebackups >>> > backup_swift_object_size = 52428800 >>> > #backup_swift_project = >>> > #backup_swift_project_domain = >>> > backup_swift_retry_attempts = 3 >>> > backup_swift_retry_backoff = 2 >>> > backup_compression_algorithm = zlib >>> > >>> > If I run a backup as user admin, it creates a container named >>> > "volumebackups". >>> > If I run a backup as user demo and I do not specify a container name, >>> > it tires to write on volumebackups and gives some errors: >>> > >>> > ClientException: Container PUT failed: >>> > >>> http://10.102.184.190:8080/swift/v1/AUTH_964f343cf5164028a803db91488bdb01/volumebackups >>> > 409 Conflict BucketAlreadyExists >>> > >>> > >>> > Does it mean I cannot use the same containers name on differents >>> > projects ? >>> > >>> > My ceph.conf is configured for using keystone: >>> > [client.rgw.tst2-osctrl01] >>> > rgw_frontends = "civetweb port=10.102.184.190:8080" >>> > # Keystone information >>> > rgw keystone api version = 3 >>> > rgw keystone url = http://10.102.184.190:5000 >>> > rgw keystone admin user = admin >>> > rgw keystone admin password = password >>> > rgw keystone admin domain = default >>> > rgw keystone admin project = admin >>> > rgw swift account in url = true >>> > rgw keystone implicit tenants = true >>> > >>> > >>> > >>> > Any help, please ? >>> > Best Regards >>> > Ignazio >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From katonalala at gmail.com Fri Jan 3 09:56:33 2020 From: katonalala at gmail.com (Lajos Katona) Date: Fri, 3 Jan 2020 10:56:33 +0100 Subject: About the use of security groups with neutron ports In-Reply-To: <000401d5bcc3$3f9ebd30$bedc3790$@gmail.com> References: <003d01d5bc42$2af8ceb0$80ea6c10$@gmail.com> <582E6225-F178-401A-A1D4-A52484B76DD9@redhat.com> <000401d5bcc3$3f9ebd30$bedc3790$@gmail.com> Message-ID: Hi, General answer: if you check your processes running on the host you will see which config files are used: $ ps -ef |grep neutron-server lajoska+ 32072 1 2 09:51 ? 00:00:03 /usr/bin/python3.6 /usr/local/bin/neutron-server --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --config-file /etc/neutron/taas_plugin.ini .... Similarly you can check your ovs-agent: $ ps -ef |grep neutron-openvswitch-agent .... For the documentation of the config files check the configuration reference: https://docs.openstack.org/neutron/latest/configuration/config.html (this is the latest, so I suppose you need some older one like train or similar) Regards Lajos ezt írta (időpont: 2019. dec. 27., P, 15:42): > Thank you very much, Slawek. > > > > In case I have multiple configuration files, how to know which one is > currently loaded in neutron? > > For example, in my environment I have: > > - ml2_conf.ini > - ml2_conf_odl.ini > - ml2_conf_sriov.ini > - openvswitch_agent.ini > - sriov_agent.ini > > > > > > [root at overcloud-controller-0 cbis-admin]# cd /etc/neutron/plugins/ml2/ > > [root at overcloud-controller-0 ml2]# ls > > ml2_conf.ini ml2_conf_odl.ini ml2_conf_sriov.ini openvswitch_agent.ini > sriov_agent.ini > > > > > > Which one of these is used? > > > > Cheers, > > Ahmed > > > > > > > > -----Original Message----- > From: Slawek Kaplonski > Sent: Friday, December 27, 2019 10:28 AM > To: ahmed.zaky.abdallah at gmail.com > Cc: openstack-discuss at lists.openstack.org > Subject: Re: About the use of security groups with neutron ports > > > > Hi, > > > > > On 27 Dec 2019, at 00:14, ahmed.zaky.abdallah at gmail.com wrote: > > > > > > Hi All, > > > > > > I am trying to wrap my head around something I came across in one of the > OpenStack deployments. I am running Telco VNFs one of them is having > different VMs using SR-IOV interfaces. > > > > > > On one of my VNFs on Openstack, I defined a wrong IPv6 Gm bearer > interface to be exactly the same as the IPv6 Gateway. As I hate > re-onboarding, I decided to embark on a journey of changing the IPv6 of the > Gm bearer interface manually on the application side, everything went on > fine. > > > > > > After two weeks, my customer started complaining about one way RTP flow. > The customer was reluctant to blame the operation I carried out because > everything worked smooth after my modification. > > > After days of investigation, I remembered that I have port-security > enabled and this means AAP “Allowed-Address-Pairs” are defined per vPort > (AAP contain the floating IP address of the VM so that the security to > allow traffic to and from this VIP). I gave it a try and edited AAP > “Allowed-Address-Pairs” to include the correct new IPv6 address. Doing that > everything started working fine. > > > > > > The only logical explanation at that time is security group rules are > really invoked. > > > > > > Now, I am trying to understand how the iptables are really invoked. 
I > did some digging and it seems like we can control the firewall drivers on > two levels: > > > > > > • Nova compute > > > • ML2 plugin > > > > > > I was curious to check nova.conf and it has already the following line: > firewall_driver=nova.virt.firewall.NoopFirewallDriver > > > > > > However, checking the ml2 plugin configuration, the following is found: > > > > > > 230 [securitygroup] > > > 231 > > > 232 # > > > 233 # From neutron.ml2 > > > 234 # > > > 235 > > > 236 # Driver for security groups firewall in the L2 agent (string > value) > > > 237 #firewall_driver = > > > 238 firewall_driver = openvswitch > > > > > > So, I am jumping to a conclusion that ml2 plugin is the one responsible > for enforcing the firewall rules in my case. > > > > > > Have you had a similar experience? > > > Is my assumption correct: If I comment out the ml2 plugin firewall > driver then the port security carries no sense at all and security groups > won’t be invoked? > > > > Firewall_driver config option has to be set to some value. You can set > “noop” as firewall_driver to completely disable this feature for all ports. > > But please remember that You need to set it on agent’s side so it’s on > compute nodes, not on neutron-server side. > > Also, if You want to disable it only for some ports, You can set > “port_security_enabled” to False and than SG will not be applied for such > port and You will not need to configure any additional IPs in allowed > address pairs for this port. > > > > > > > > Cheers, > > > Ahmed > > > > — > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Arkady.Kanevsky at dell.com Fri Jan 3 21:19:22 2020 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Fri, 3 Jan 2020 21:19:22 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management Message-ID: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> Fellow Open Stackers, I have been thinking on how to handle SmartNICs, GPUs, FPGA handling across different projects within OpenStack with Cyborg taking a leading role in it. Cyborg is important project and address accelerator devices that are part of the server and potentially switches and storage. It is address 3 different use cases and users there are all grouped into single project. 1. Application user need to program a portion of the device under management, like GPU, or SmartNIC for that app usage. Having a common way to do it across different device families and across different vendor is very important. And that has to be done every time a VM is deploy that need usage of a device. That is tied with VM scheduling. 2. Administrator need to program the whole device for specific usage. That covers the scenario when device can only support single tenant or single use case. That is done once during OpenStack deployment but may need reprogramming to configure device for different usage. May or may not require reboot of the server. 3. Administrator need to setup device for its use, like burning specific FW on it. This is typically done as part of server life-cycle event. The first 2 cases cover application life cycle of device usage. The last one covers device life cycle independently how it is used. Managing life cycle of devices is Ironic responsibility, One cannot and should not manage lifecycle of server components independently. 
Managing server devices outside server management violates customer service agreements with server vendors and breaks server support agreements. Nova and Neutron are getting info about all devices and their capabilities from Ironic; that they use for scheduling. We should avoid creating new project for every new component of the server and modify nova and neuron for each new device. (the same will also apply to cinder and manila if smart devices used in its data/control path on a server). Finally we want Cyborg to be able to be used in standalone capacity, say for Kubernetes. Thus, I propose that Cyborg cover use cases 1 & 2, and Ironic would cover use case 3. Thus, move all device Life-cycle code from Cyborg to Ironic. Concentrate Cyborg of fulfilling the first 2 use cases. Simplify integration with Nova and Neutron for using these accelerators to use existing Ironic mechanism for it. Create idempotent calls for use case 1 so Nova and Neutron can use it as part of VM deployment to ensure that devices are programmed for VM under scheduling need. Create idempotent call(s) for use case 2 for TripleO to setup device for single accelerator usage of a node. [Propose similar model for CNI integration.] Let the discussion start! Thanks., Arkady -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhipengh512 at gmail.com Sat Jan 4 01:53:10 2020 From: zhipengh512 at gmail.com (Zhipeng Huang) Date: Sat, 4 Jan 2020 09:53:10 +0800 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> Message-ID: Hi Arkady, Thanks for your interest in Cyborg project :) I would like to point out that when we initiated the project there are two specific use cases we want to cover: the accelerators attached locally (via PCIe or other bus type) or remotely (via Ethernet or other fabric type). For the latter one, it is clear that its life cycle is independent from the server (like block device managed by Cinder). For the former one however, its life cycle is not dependent on server for all kinds of accelerators either. For example we already have PCIe based AI accelerator cards or Smart NICs that could be power on/off when the server is on all the time. Therefore it is not a good idea to move all the life cycle management part into Ironic for the above mentioned reasons. Ironic integration is very important for the standalone usage of Cyborg for Kubernetes, Envoy (TLS acceleration) and others alike. Hope this answers your question :) On Sat, Jan 4, 2020 at 5:23 AM wrote: > Fellow Open Stackers, > > I have been thinking on how to handle SmartNICs, GPUs, FPGA handling > across different projects within OpenStack with Cyborg taking a leading > role in it. > > > > Cyborg is important project and address accelerator devices that are part > of the server and potentially switches and storage. > > It is address 3 different use cases and users there are all grouped into > single project. > > > > 1. Application user need to program a portion of the device under > management, like GPU, or SmartNIC for that app usage. Having a common way > to do it across different device families and across different vendor is > very important. And that has to be done every time a VM is deploy that need > usage of a device. That is tied with VM scheduling. > 2. Administrator need to program the whole device for specific usage. 
> That covers the scenario when device can only support single tenant or > single use case. That is done once during OpenStack deployment but may need > reprogramming to configure device for different usage. May or may not > require reboot of the server. > 3. Administrator need to setup device for its use, like burning > specific FW on it. This is typically done as part of server life-cycle > event. > > > > The first 2 cases cover application life cycle of device usage. > > The last one covers device life cycle independently how it is used. > > > > Managing life cycle of devices is Ironic responsibility, One cannot and > should not manage lifecycle of server components independently. Managing > server devices outside server management violates customer service > agreements with server vendors and breaks server support agreements. > > Nova and Neutron are getting info about all devices and their capabilities > from Ironic; that they use for scheduling. We should avoid creating new > project for every new component of the server and modify nova and neuron > for each new device. (the same will also apply to cinder and manila if > smart devices used in its data/control path on a server). > > Finally we want Cyborg to be able to be used in standalone capacity, say > for Kubernetes. > > > > Thus, I propose that Cyborg cover use cases 1 & 2, and Ironic would cover > use case 3. > > Thus, move all device Life-cycle code from Cyborg to Ironic. > > Concentrate Cyborg of fulfilling the first 2 use cases. > > Simplify integration with Nova and Neutron for using these accelerators to > use existing Ironic mechanism for it. > > Create idempotent calls for use case 1 so Nova and Neutron can use it as > part of VM deployment to ensure that devices are programmed for VM under > scheduling need. > > Create idempotent call(s) for use case 2 for TripleO to setup device for > single accelerator usage of a node. > > [Propose similar model for CNI integration.] > > > > Let the discussion start! > > > > Thanks., > Arkady > -- Zhipeng (Howard) Huang Principle Engineer OpenStack, Kubernetes, CNCF, LF Edge, ONNX, Kubeflow, OpenSDS, Open Service Broker API, OCP, Hyperledger, ETSI, SNIA, DMTF, W3C -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcm at jonmasters.org Sat Jan 4 04:44:24 2020 From: jcm at jonmasters.org (Jon Masters) Date: Fri, 3 Jan 2020 23:44:24 -0500 Subject: [kolla] neutron-l3-agent namespace NAT table not working? Message-ID: Hi there, I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the iptables rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or SNAT applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING chains (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log entries. It's as if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is driving me crazy :) Anyone got some quick suggestions? (assume I tried the obvious stuff). Jon. -- Computer Architect -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From radoslaw.piliszek at gmail.com Sat Jan 4 09:35:52 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Sat, 4 Jan 2020 10:35:52 +0100 Subject: [oslo][kolla][requirements][release][infra] Hit by an old, fixed bug In-Reply-To: <79cddc25-88e0-b5dd-8b8a-17cf14b9c4b1@nemebean.com> References: <20191230150137.GA9057@sm-workstation> <79cddc25-88e0-b5dd-8b8a-17cf14b9c4b1@nemebean.com> Message-ID: Thanks, Ben. That doc preamble really made me think not to cross the holy ground of release proposals. :-) I proposed release [1] and added you and Hervé as reviewers. [1] https://review.opendev.org/701080 -yoctozepto czw., 2 sty 2020 o 21:20 Ben Nemec napisał(a): > > > > On 12/30/19 9:52 AM, Radosław Piliszek wrote: > > Thanks, Sean! I knew I was missing something really basic! > > I was under the impression that 9.x is Stein, like it happens with > > main projects (major=branch). > > I could not find any doc explaining oslo.messaging versioning, perhaps > > Oslo could release 9.5.1 off the stein branch? > > Oslo for the most part follows semver, so we only bump major versions > when there is a breaking change. We bump minor versions each release so > we can do bugfix releases on the previous stable branch without stepping > on master releases. > > The underlying cause of this is likely that I'm way behind on releasing > the Oslo stable branches. It's high on my todo list now that most people > are back from holidays and will be around to help out if a release > breaks something. > > However, anyone can propose a release[0][1] (contrary to what [0] > suggests), so if the necessary fix is already on stable/stein and just > hasn't been released yet please feel free to do that. You'll just need a > +1 from either myself or hberaud (the Oslo release liaison) before the > release team will approve it. > > 0: https://releases.openstack.org/reference/using.html#requesting-a-release > 1: > https://releases.openstack.org/reference/using.html#using-new-release-command > > > > > The issue remains that, even though oslo backports bugfixes into their > > stable branches, kolla (and very possibly other deployment solutions) > > no longer benefit from them. > > > > -yoctozepto > > > > pon., 30 gru 2019 o 16:01 Sean McGinnis napisał(a): > >> > >> On Sun, Dec 29, 2019 at 09:41:45PM +0100, Radosław Piliszek wrote: > >>> Hi Folks, > >>> > >>> as the subject goes, my installation has been hit by an old bug: > >>> https://bugs.launchpad.net/oslo.messaging/+bug/1828841 > >>> (bug details not important, linked here for background) > >>> > >>> I am using Stein, deployed with recent Kolla-built source-based images > >>> (with only slight modifications compared to vanilla ones). > >>> Kolla's procedure for building source-based images considers upper > >>> constraints, which, unfortunately, turned out to be lagging behind a > >>> few releases w.r.t. oslo.messaging at least. > >>> The fix was in 9.7.0 released on May 21, u-c still point to 9.5.0 from > >>> Feb 26 and the latest of Stein is 9.8.0 from Jul 18. > >>> > >>> It seems oslo.messaging is missing from the automatic updates that bot proposes: > >>> https://review.opendev.org/#/q/owner:%22OpenStack+Proposal+Bot%22+project:openstack/requirements+branch:stable/stein > >>> > >>> Per: > >>> https://opendev.org/openstack/releases/src/branch/master/doc/source/reference/reviewer_guide.rst#release-jobs > >>> this upper-constraint proposal should be happening for all releases. > >>> > >> > >> This is normal and what is expected. 
> >> > >> Requirements are only updated for the branch in which those releases happen. So > >> if there is a release of oslo.messaging for stable/train, only the stable/train > >> upper constraints are updated for that new release. The stable/stein branch > >> will not be affected because that shows what the tested upper constraints were > >> for that branch. > >> > >> The last stable/stein release for oslo.messaging was 9.5.0: > >> > >> https://opendev.org/openstack/releases/src/branch/master/deliverables/stein/oslo.messaging.yaml#L49 > >> > >> And 9.5.0 is what is set in the stable/stein upper-constraints: > >> > >> https://opendev.org/openstack/requirements/src/branch/stable/stein/upper-constraints.txt#L146 > >> > >> To get that raised, whatever necessary bugfixes that are required in > >> oslo.messaging would need to be backported per-cycle until stable/stein (as in, > >> if it was in current master, it would need to be backported and merged to > >> stable/train first, then stable/stein), and once merged a stable release would > >> need to be proposed for that branch's version of the library. > >> > >> Once that stable release is done, that will propose the update to the upper > >> constraint for the given branch. > >> > >>> I would be glad if someone investigated why it happens(/ed) and > >>> audited whether other OpenStack projects don't need updating as well > >>> to avoid running on old deps when new are awaiting for months. :-) > >>> Please note this might apply to other branches as well. > >>> > >>> PS: for some reason oslo.messaging Stein release notes ( > >>> https://docs.openstack.org/releasenotes/oslo.messaging/stein.html ) > >>> are stuck at 9.5.0 as well, this could be right (I did not inspect the > >>> sources) but I am adding this in PS so you have more things to > >>> correlate if they need be. > >>> > >> > >> Again, as expected. The last stable/stein release was 9.5.0, so that is correct > >> that the release notes for stein only show up to that point. > > From skaplons at redhat.com Sat Jan 4 09:46:12 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Sat, 4 Jan 2020 10:46:12 +0100 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: Message-ID: Hi, Is this qrouter namespace created with all those rules in container or in the host directly? Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? > On 4 Jan 2020, at 05:44, Jon Masters wrote: > > Hi there, > > I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the iptables rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or SNAT applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING chains (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log entries. It's as if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is driving me crazy :) > > Anyone got some quick suggestions? (assume I tried the obvious stuff). > > Jon. 
> > -- > Computer Architect — Slawek Kaplonski Senior software engineer Red Hat From ahmed.zaky.abdallah at gmail.com Sat Jan 4 11:46:36 2020 From: ahmed.zaky.abdallah at gmail.com (Ahmed ZAKY) Date: Sat, 4 Jan 2020 12:46:36 +0100 Subject: About the use of security groups with neutron ports In-Reply-To: References: <003d01d5bc42$2af8ceb0$80ea6c10$@gmail.com> <582E6225-F178-401A-A1D4-A52484B76DD9@redhat.com> <000401d5bcc3$3f9ebd30$bedc3790$@gmail.com> Message-ID: Thank you, Lajos. Kind regards, Ahmed On Fri, 3 Jan 2020, 10:56 Lajos Katona, wrote: > Hi, > > General answer: > if you check your processes running on the host you will see which config > files are used: > $ ps -ef |grep neutron-server > lajoska+ 32072 1 2 09:51 ? 00:00:03 /usr/bin/python3.6 > /usr/local/bin/neutron-server --config-file /etc/neutron/neutron.conf > --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --config-file > /etc/neutron/taas_plugin.ini > .... > > Similarly you can check your ovs-agent: > $ ps -ef |grep neutron-openvswitch-agent > .... > > For the documentation of the config files check the configuration > reference: > https://docs.openstack.org/neutron/latest/configuration/config.html (this > is the latest, so I suppose you need some older one like train or similar) > > Regards > Lajos > > ezt írta (időpont: 2019. dec. 27., P, > 15:42): > >> Thank you very much, Slawek. >> >> >> >> In case I have multiple configuration files, how to know which one is >> currently loaded in neutron? >> >> For example, in my environment I have: >> >> - ml2_conf.ini >> - ml2_conf_odl.ini >> - ml2_conf_sriov.ini >> - openvswitch_agent.ini >> - sriov_agent.ini >> >> >> >> >> >> [root at overcloud-controller-0 cbis-admin]# cd /etc/neutron/plugins/ml2/ >> >> [root at overcloud-controller-0 ml2]# ls >> >> ml2_conf.ini ml2_conf_odl.ini ml2_conf_sriov.ini >> openvswitch_agent.ini sriov_agent.ini >> >> >> >> >> >> Which one of these is used? >> >> >> >> Cheers, >> >> Ahmed >> >> >> >> >> >> >> >> -----Original Message----- >> From: Slawek Kaplonski >> Sent: Friday, December 27, 2019 10:28 AM >> To: ahmed.zaky.abdallah at gmail.com >> Cc: openstack-discuss at lists.openstack.org >> Subject: Re: About the use of security groups with neutron ports >> >> >> >> Hi, >> >> >> >> > On 27 Dec 2019, at 00:14, ahmed.zaky.abdallah at gmail.com wrote: >> >> > >> >> > Hi All, >> >> > >> >> > I am trying to wrap my head around something I came across in one of >> the OpenStack deployments. I am running Telco VNFs one of them is having >> different VMs using SR-IOV interfaces. >> >> > >> >> > On one of my VNFs on Openstack, I defined a wrong IPv6 Gm bearer >> interface to be exactly the same as the IPv6 Gateway. As I hate >> re-onboarding, I decided to embark on a journey of changing the IPv6 of the >> Gm bearer interface manually on the application side, everything went on >> fine. >> >> > >> >> > After two weeks, my customer started complaining about one way RTP >> flow. The customer was reluctant to blame the operation I carried out >> because everything worked smooth after my modification. >> >> > After days of investigation, I remembered that I have port-security >> enabled and this means AAP “Allowed-Address-Pairs” are defined per vPort >> (AAP contain the floating IP address of the VM so that the security to >> allow traffic to and from this VIP). I gave it a try and edited AAP >> “Allowed-Address-Pairs” to include the correct new IPv6 address. Doing that >> everything started working fine. 
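For anyone hitting the same one-way traffic symptom, the allowed address pairs on a port can be inspected and extended with the standard client; the port ID and address below are placeholders:

    $ openstack port show <port-id> -c allowed_address_pairs
    $ openstack port set --allowed-address ip-address=<new-IPv6-VIP> <port-id>

With port security enabled, traffic using an address that is neither the port's fixed IP nor listed in its allowed address pairs is dropped by the anti-spoofing rules, which is consistent with the behaviour described above.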
>> >> > >> >> > The only logical explanation at that time is security group rules are >> really invoked. >> >> > >> >> > Now, I am trying to understand how the iptables are really invoked. I >> did some digging and it seems like we can control the firewall drivers on >> two levels: >> >> > >> >> > • Nova compute >> >> > • ML2 plugin >> >> > >> >> > I was curious to check nova.conf and it has already the following line: >> firewall_driver=nova.virt.firewall.NoopFirewallDriver >> >> > >> >> > However, checking the ml2 plugin configuration, the following is found: >> >> > >> >> > 230 [securitygroup] >> >> > 231 >> >> > 232 # >> >> > 233 # From neutron.ml2 >> >> > 234 # >> >> > 235 >> >> > 236 # Driver for security groups firewall in the L2 agent (string >> value) >> >> > 237 #firewall_driver = >> >> > 238 firewall_driver = openvswitch >> >> > >> >> > So, I am jumping to a conclusion that ml2 plugin is the one responsible >> for enforcing the firewall rules in my case. >> >> > >> >> > Have you had a similar experience? >> >> > Is my assumption correct: If I comment out the ml2 plugin firewall >> driver then the port security carries no sense at all and security groups >> won’t be invoked? >> >> >> >> Firewall_driver config option has to be set to some value. You can set >> “noop” as firewall_driver to completely disable this feature for all ports. >> >> But please remember that You need to set it on agent’s side so it’s on >> compute nodes, not on neutron-server side. >> >> Also, if You want to disable it only for some ports, You can set >> “port_security_enabled” to False and than SG will not be applied for such >> port and You will not need to configure any additional IPs in allowed >> address pairs for this port. >> >> >> >> > >> >> > Cheers, >> >> > Ahmed >> >> >> >> — >> >> Slawek Kaplonski >> >> Senior software engineer >> >> Red Hat >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Sat Jan 4 12:56:19 2020 From: smooney at redhat.com (Sean Mooney) Date: Sat, 04 Jan 2020 12:56:19 +0000 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: Message-ID: On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: > Hi, > > Is this qrouter namespace created with all those rules in container or in the host directly? > Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? in kolla the l3 agent should be running with net=host so the container should be useing the hosts root namespace and it will create network namespaces as needed for the different routers. the ip table rules should be in the router sub namespaces. > > > On 4 Jan 2020, at 05:44, Jon Masters wrote: > > > > Hi there, > > > > I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the iptables > > rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or SNAT > > applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING chains > > (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log entries. It's as > > if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is driving me > > crazy :) > > > > Anyone got some quick suggestions? (assume I tried the obvious stuff). > > > > Jon. 
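That is easy to confirm on a kolla node: because the agent container shares the host network namespace, the same qrouter namespaces should be visible from both sides. A rough check, assuming the default kolla container name (adjust if yours differs):

    $ ip netns list                                   # on the host
    $ docker exec neutron_l3_agent ip netns list      # from inside the L3 agent container
    # both listings should show the same qrouter-<uuid> entries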
> > > > -- > > Computer Architect > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > From jcm at jonmasters.org Sat Jan 4 15:39:43 2020 From: jcm at jonmasters.org (Jon Masters) Date: Sat, 4 Jan 2020 10:39:43 -0500 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: Message-ID: Excuse top posting on my phone. Also, yes, the namespaces are as described. It’s just that the (correct) nat rules for the qrouter netns are never running, in spite of the two interfaces existing in that ns and correctly attached to the vswitch. -- Computer Architect > On Jan 4, 2020, at 07:56, Sean Mooney wrote: > > On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: >> Hi, >> >> Is this qrouter namespace created with all those rules in container or in the host directly? >> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? > in kolla the l3 agent should be running with net=host so the container should be useing the hosts > root namespace and it will create network namespaces as needed for the different routers. > > the ip table rules should be in the router sub namespaces. > >> >>>> On 4 Jan 2020, at 05:44, Jon Masters wrote: >>> >>> Hi there, >>> >>> I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the iptables >>> rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or SNAT >>> applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING chains >>> (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log entries. It's as >>> if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is driving me >>> crazy :) >>> >>> Anyone got some quick suggestions? (assume I tried the obvious stuff). >>> >>> Jon. >>> >>> -- >>> Computer Architect >> >> — >> Slawek Kaplonski >> Senior software engineer >> Red Hat >> >> > From jcm at jonmasters.org Sun Jan 5 19:04:25 2020 From: jcm at jonmasters.org (Jon Masters) Date: Sun, 5 Jan 2020 14:04:25 -0500 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: Message-ID: This turns out to a not well documented bug in the CentOS7.7 kernel that causes exactly nat rules not to run as I was seeing. Oh dear god was this nasty as whatever to find and workaround. -- Computer Architect > On Jan 4, 2020, at 10:39, Jon Masters wrote: > > Excuse top posting on my phone. Also, yes, the namespaces are as described. It’s just that the (correct) nat rules for the qrouter netns are never running, in spite of the two interfaces existing in that ns and correctly attached to the vswitch. > > -- > Computer Architect > > >>> On Jan 4, 2020, at 07:56, Sean Mooney wrote: >>> >>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: >>> Hi, >>> >>> Is this qrouter namespace created with all those rules in container or in the host directly? >>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? >> in kolla the l3 agent should be running with net=host so the container should be useing the hosts >> root namespace and it will create network namespaces as needed for the different routers. >> >> the ip table rules should be in the router sub namespaces. >> >>> >>>>> On 4 Jan 2020, at 05:44, Jon Masters wrote: >>>> >>>> Hi there, >>>> >>>> I've got a weird problem with the neutron-l3-agent container on my deployment. 
It comes up, sets up the iptables >>>> rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or SNAT >>>> applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING chains >>>> (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log entries. It's as >>>> if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is driving me >>>> crazy :) >>>> >>>> Anyone got some quick suggestions? (assume I tried the obvious stuff). >>>> >>>> Jon. >>>> >>>> -- >>>> Computer Architect >>> >>> — >>> Slawek Kaplonski >>> Senior software engineer >>> Red Hat >>> >>> >> From laurentfdumont at gmail.com Sun Jan 5 23:50:51 2020 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Sun, 5 Jan 2020 18:50:51 -0500 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: Message-ID: Do you happen to have the bug ID for Centos? On Sun, Jan 5, 2020 at 2:11 PM Jon Masters wrote: > This turns out to a not well documented bug in the CentOS7.7 kernel that > causes exactly nat rules not to run as I was seeing. Oh dear god was this > nasty as whatever to find and workaround. > > -- > Computer Architect > > > > On Jan 4, 2020, at 10:39, Jon Masters wrote: > > > > Excuse top posting on my phone. Also, yes, the namespaces are as > described. It’s just that the (correct) nat rules for the qrouter netns are > never running, in spite of the two interfaces existing in that ns and > correctly attached to the vswitch. > > > > -- > > Computer Architect > > > > > >>> On Jan 4, 2020, at 07:56, Sean Mooney wrote: > >>> > >>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: > >>> Hi, > >>> > >>> Is this qrouter namespace created with all those rules in container or > in the host directly? > >>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter > namespace? > >> in kolla the l3 agent should be running with net=host so the container > should be useing the hosts > >> root namespace and it will create network namespaces as needed for the > different routers. > >> > >> the ip table rules should be in the router sub namespaces. > >> > >>> > >>>>> On 4 Jan 2020, at 05:44, Jon Masters wrote: > >>>> > >>>> Hi there, > >>>> > >>>> I've got a weird problem with the neutron-l3-agent container on my > deployment. It comes up, sets up the iptables > >>>> rules in the qrouter namespace (and I can see these using "ip > netns...") but traffic isn't having DNAT or SNAT > >>>> applied. What's most strange is that manually adding a LOG jump > target to the iptables nat PRE/POSTROUTING chains > >>>> (after enabling nf logging sent to the host kernel, confirmed that > works) doesn't result in any log entries. It's as > >>>> if the nat table isn't being applied at all for any packets > traversing the qrouter namespace. This is driving me > >>>> crazy :) > >>>> > >>>> Anyone got some quick suggestions? (assume I tried the obvious stuff). > >>>> > >>>> Jon. > >>>> > >>>> -- > >>>> Computer Architect > >>> > >>> — > >>> Slawek Kaplonski > >>> Senior software engineer > >>> Red Hat > >>> > >>> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jan.vondra at ultimum.io Mon Jan 6 00:25:55 2020 From: jan.vondra at ultimum.io (Jan Vondra) Date: Mon, 6 Jan 2020 01:25:55 +0100 Subject: [kolla] neutron-l3-agent namespace NAT table not working? 
In-Reply-To: References: Message-ID: Could you send us more details about your deployment - e.g. kolla version and image info? And please try to check neutron-openvswitch-agent log - errors regarding applying iptables rules should be there. I've encountered similar behavior when trying to run a nftables OS image (Debian 10) on iptables OS image (Ubuntu 16.04). You can try it by running sudo update-alternatives --query iptables If it's the case, option to force legacy iptables has been added - https://review.opendev.org/#/c/685967/. Best regards, Jan Dne po 6. 1. 2020 0:56 uživatel Laurent Dumont napsal: > Do you happen to have the bug ID for Centos? > > On Sun, Jan 5, 2020 at 2:11 PM Jon Masters wrote: > >> This turns out to a not well documented bug in the CentOS7.7 kernel that >> causes exactly nat rules not to run as I was seeing. Oh dear god was this >> nasty as whatever to find and workaround. >> >> -- >> Computer Architect >> >> >> > On Jan 4, 2020, at 10:39, Jon Masters wrote: >> > >> > Excuse top posting on my phone. Also, yes, the namespaces are as >> described. It’s just that the (correct) nat rules for the qrouter netns are >> never running, in spite of the two interfaces existing in that ns and >> correctly attached to the vswitch. >> > >> > -- >> > Computer Architect >> > >> > >> >>> On Jan 4, 2020, at 07:56, Sean Mooney wrote: >> >>> >> >>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: >> >>> Hi, >> >>> >> >>> Is this qrouter namespace created with all those rules in container >> or in the host directly? >> >>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter >> namespace? >> >> in kolla the l3 agent should be running with net=host so the container >> should be useing the hosts >> >> root namespace and it will create network namespaces as needed for >> the different routers. >> >> >> >> the ip table rules should be in the router sub namespaces. >> >> >> >>> >> >>>>> On 4 Jan 2020, at 05:44, Jon Masters wrote: >> >>>> >> >>>> Hi there, >> >>>> >> >>>> I've got a weird problem with the neutron-l3-agent container on my >> deployment. It comes up, sets up the iptables >> >>>> rules in the qrouter namespace (and I can see these using "ip >> netns...") but traffic isn't having DNAT or SNAT >> >>>> applied. What's most strange is that manually adding a LOG jump >> target to the iptables nat PRE/POSTROUTING chains >> >>>> (after enabling nf logging sent to the host kernel, confirmed that >> works) doesn't result in any log entries. It's as >> >>>> if the nat table isn't being applied at all for any packets >> traversing the qrouter namespace. This is driving me >> >>>> crazy :) >> >>>> >> >>>> Anyone got some quick suggestions? (assume I tried the obvious >> stuff). >> >>>> >> >>>> Jon. >> >>>> >> >>>> -- >> >>>> Computer Architect >> >>> >> >>> — >> >>> Slawek Kaplonski >> >>> Senior software engineer >> >>> Red Hat >> >>> >> >>> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcm at jonmasters.org Mon Jan 6 02:26:28 2020 From: jcm at jonmasters.org (Jon Masters) Date: Sun, 5 Jan 2020 21:26:28 -0500 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: Message-ID: There’s no bug ID that I’m aware of. But I’ll go look for one or file one. -- Computer Architect > On Jan 5, 2020, at 18:51, Laurent Dumont wrote: > >  > Do you happen to have the bug ID for Centos? 
> >> On Sun, Jan 5, 2020 at 2:11 PM Jon Masters wrote: >> This turns out to a not well documented bug in the CentOS7.7 kernel that causes exactly nat rules not to run as I was seeing. Oh dear god was this nasty as whatever to find and workaround. >> >> -- >> Computer Architect >> >> >> > On Jan 4, 2020, at 10:39, Jon Masters wrote: >> > >> > Excuse top posting on my phone. Also, yes, the namespaces are as described. It’s just that the (correct) nat rules for the qrouter netns are never running, in spite of the two interfaces existing in that ns and correctly attached to the vswitch. >> > >> > -- >> > Computer Architect >> > >> > >> >>> On Jan 4, 2020, at 07:56, Sean Mooney wrote: >> >>> >> >>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: >> >>> Hi, >> >>> >> >>> Is this qrouter namespace created with all those rules in container or in the host directly? >> >>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? >> >> in kolla the l3 agent should be running with net=host so the container should be useing the hosts >> >> root namespace and it will create network namespaces as needed for the different routers. >> >> >> >> the ip table rules should be in the router sub namespaces. >> >> >> >>> >> >>>>> On 4 Jan 2020, at 05:44, Jon Masters wrote: >> >>>> >> >>>> Hi there, >> >>>> >> >>>> I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the iptables >> >>>> rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or SNAT >> >>>> applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING chains >> >>>> (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log entries. It's as >> >>>> if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is driving me >> >>>> crazy :) >> >>>> >> >>>> Anyone got some quick suggestions? (assume I tried the obvious stuff). >> >>>> >> >>>> Jon. >> >>>> >> >>>> -- >> >>>> Computer Architect >> >>> >> >>> — >> >>> Slawek Kaplonski >> >>> Senior software engineer >> >>> Red Hat >> >>> >> >>> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrei.perepiolkin at open-e.com Mon Jan 6 04:51:26 2020 From: andrei.perepiolkin at open-e.com (Andrei Perapiolkin) Date: Mon, 6 Jan 2020 06:51:26 +0200 Subject: [kolla] Quick start: ansible deploy failure In-Reply-To: References: Message-ID: <88226158-d15c-7f23-e692-aa461f6d8549@open-e.com> Hello, Im following quick start guide on deploying Kolla ansible and getting failure on deploy stage: https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html kolla-ansible -i ./multinode deploy TASK [mariadb : Creating haproxy mysql user] ******************************************************************************************************************************** fatal: [control01]: FAILED! => {"changed": false, "msg": "Can not parse the inner module output: localhost | SUCCESS => {\n    \"changed\": false, \n    \"user\": \"haproxy\"\n}\n"} I deploy to Centos7 with latest updates. [user at master ~]$ pip list DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. 
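Before settling on a kernel bug, it is also worth ruling out the legacy/nf_tables iptables mismatch mentioned above, on both sides of the container boundary. A rough check (update-alternatives applies to Debian/Ubuntu hosts; iptables 1.8+ prints the backend in its version string, older releases are legacy-only):

    $ iptables -V                                     # on the host
    $ docker exec neutron_l3_agent iptables -V        # inside the agent container, name may differ
    $ sudo update-alternatives --query iptables       # Debian/Ubuntu hosts only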
More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support Package                          Version -------------------------------- ---------- ansible                          2.9.1 Babel                            2.8.0 backports.ssl-match-hostname     3.7.0.1 certifi                          2019.11.28 cffi                             1.13.2 chardet                          3.0.4 configobj                        4.7.2 cryptography                     2.8 debtcollector                    1.22.0 decorator                        3.4.0 docker                           4.1.0 enum34                           1.1.6 funcsigs                         1.0.2 httplib2                         0.9.2 idna                             2.8 iniparse                         0.4 ipaddress                        1.0.23 IPy                              0.75 iso8601                          0.1.12 Jinja2                           2.10.3 jmespath                         0.9.4 kitchen                          1.1.1 kolla-ansible                    9.0.0 MarkupSafe                       1.1.1 monotonic                        1.5 netaddr                          0.7.19 netifaces                        0.10.9 oslo.config                      6.12.0 oslo.i18n                        3.25.0 oslo.utils                       3.42.1 paramiko                         2.1.1 pbr                              5.4.4 perf                             0.1 pip                              19.3.1 ply                              3.4 policycoreutils-default-encoding 0.1 pyasn1                           0.1.9 pycparser                        2.19 pycurl                           7.19.0 pygobject                        3.22.0 pygpgme                          0.3 pyliblzma                        0.5.3 pyparsing                        2.4.6 python-linux-procfs              0.4.9 pytz                             2019.3 pyudev                           0.15 pyxattr                          0.5.1 PyYAML                           5.2 requests                         2.22.0 rfc3986                          1.3.2 schedutils                       0.4 seobject                         0.1 sepolicy                         1.1 setuptools                       44.0.0 six                              1.13.0 slip                             0.4.0 slip.dbus                        0.4.0 stevedore                        1.31.0 urlgrabber                       3.10 urllib3                          1.25.7 websocket-client                 0.57.0 wrapt                            1.11.2 yum-metadata-parser              1.1.4 and it looks like Im not alone with such issue: https://q.cnblogs.com/q/125213/ Thanks for your attention, Andrei Perepiolkin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rico.lin.guanyu at gmail.com Mon Jan 6 06:41:00 2020 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Mon, 6 Jan 2020 14:41:00 +0800 Subject: [meta-sig][multi-arch] propose forming a Multi-arch SIG In-Reply-To: References: <20191121001509.GB976114@fedora19.localdomain> <20191217043512.GA2367741@fedora19.localdomain> Message-ID: Hi all, according to our doodle result, I propose a patch [1] to settle down our initial meeting schedules as *2020/01/07 Tuesday 0800 UTC on #openstack-meeting-alt and 1500 UTC on #openstack-meeting.* I assume we can use our initial meetings to discuss about SIG setup details (schedules, chairs, etc.), and general goals we should set and initial action we need to takes. please join us if you got time. Also, let me know if anything is wrong with the above schedule. [1] https://review.opendev.org/#/c/701147/ On Tue, Dec 17, 2019 at 2:11 PM Rico Lin wrote: > From this ML, and some IRC and Wechat discussions. I put most of the > information I collected in [1]. > At this point, we can tell there's a lot of works already in progress in > this community. So I think we can definitely get benefits from this SIG. > > Here are things we need to settle at this point: > > - *SIG chairs*: We need multiple SIG chairs who can help to drive SIG > goals and host meetings/events. *Put your name under `SIG chairs:` if > you're interested*. I will propose my name on the create SIG patch > since I'm interested in helping set this SIG up and we need to fillup > something there. But that won't block you from signing up. And I'm more > than happy if we can have more people rush in for the chair role:). > - *First meeting schedule*: I create polling for meeting time [2]. *Please > pick your favorite for our first meeting time* (And potentially our > long term meeting schedule, but let's discuss that in the meeting). I pick > the second week of Jan. because some might be on their vacation in the > following two weeks. As for the location, I do like to suggest we use > #openstack-meeting, so we might be able to get more people's attention. > From the experience of other SIGs, to run a meeting on your own IRC > channel, make it harder for new community members to join. > - *Resources*: We need to find out who or which organization is also > interested in this. Right now, I believe we need more servers to run tests, > and people to help on making test jobs, feedbacks, or any other tasks. So > please help to forward the etherpad([1]) and add on more information that I > fail to mention:) If you can find organizations that might be interested in > donating servers, I can help to reach out too. *So sign up and provide > any information that you think will helps:)* > - *Build and trace*: We definitely need to target all the above > works(from the previous replies) in this SIG, and (like Ian mentioned) to > work on the test infrastructure. And these make great first step tasks for > SIG. And to track all jobs, I think it will be reasonable to create a > Storyboard for this SIG and document those tasks in one Storyboard. > > All the above tasks IMO don't need to wait for the first meeting to happen > before them, so If anyone likes to put their effort on any of them or like > to suggest more initial tasks, you're the most welcome here! 
> > [1] https://etherpad.openstack.org/p/Multi-arch > [2] https://doodle.com/poll/8znyzc57skqkryv8 > > On Tue, Dec 17, 2019 at 12:45 PM Ian Wienand wrote: > >> On Tue, Nov 26, 2019 at 11:33:16AM +0000, Jonathan Rosser wrote: >> > openstack-ansible is ready to go on arm CI but in order to make the >> jobs run >> > in a reasonable time and not simply timeout a source of pre-built arm >> python >> > wheels is needed. It would be a shame to let the work that got >> contributed >> > to OSA for arm just rot. >> >> So ARM64 wheels are still a work-in-progress, but in the mean time we >> have merged a change to install a separate queue for ARM64 jobs [1]. >> Jobs in the "check-arm64" queue will be implicitly non-voting (Zuul >> isn't configured to add +-1 votes for this queue) but importantly will >> run asynchronously to the regular queue. Thus if there's very high >> demand, or any intermittent instability your gates won't be held up. >> >> [2] is an example of using this in diskimage-builder. >> >> Of course you *can* put ARM64 jobs in your gate queues as voting jobs, >> but just be aware with only 8 nodes available at this time, it could >> easily become a bottle-neck to merging code. >> >> The "check-arm64" queue is designed to be an automatically-running >> half-way point as we (hopefully) scale up support (like wheel builds >> and mirrors) and resources further. >> >> Thanks, >> >> -i >> >> [1] https://review.opendev.org/#/c/698606/ >> [2] https://review.opendev.org/#/c/676111/ >> >> >> > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Mon Jan 6 07:04:14 2020 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Mon, 6 Jan 2020 15:04:14 +0800 Subject: [auto-scaling][self-healing] Discussion to merge two SIG to one In-Reply-To: References: <1b39c3fe-22a1-c84c-ed13-05fbd9360d7d@suse.com> Message-ID: Hi guys, I send out a new schedule patch [1], please take a look to see if that works for you. Which proposed 2020/01/07 Tuesday 1400 UTC on irc #openstack-meeting as our first combined meeting schedule. [1] https://review.opendev.org/701137 On Wed, Dec 18, 2019 at 11:46 AM Rico Lin wrote: > To further push this task. I would like to propose we pick a new joint > meeting schedule for both SIGs together. > > The first steps should be we share same meeting time and schedule, also > share same event plan (as Witek suggested). And we can go from there to > discuss if we need further plans. > I also would like to suggest we move our meeting place to > #openstack-meeting so we can have chance to have more people to join. > Let's have a quick doodle polling for time, > https://doodle.com/poll/98nrf8iibr7zv3kt > Please join that doodle survey if you're interested in join us:) > > > On Thu, Nov 28, 2019 at 4:57 PM Rico Lin > wrote: > >> >> >> On Thu, Nov 28, 2019 at 4:37 PM Witek Bedyk >> wrote: >> > >> > Hi, >> > how about starting with joining the SIGs meeting times and organizing >> > the Forum and PTG events together? The repositories and wiki pages could >> > stay as they are and refer to each other. >> > >> I think even if we merged two SIG, repositories should stay separated as >> they're now. IMO we can simply rename openstack/auto-scaling-sig >> to openstack/auto-scaling and so as to self-healing. >> Or just keep it the same will be fine IMO. 
>> We don't need a new repo for the new SIG (at least not for now). >> >> I do like the idea to start with joining the SIGs meeting times and >> organizing the Forum and PTG events together. >> One more proposal in my mind will be, join the channel for IRC. >> > >> > I think merging is good if you have an idea how to better structure the >> > content, and time to review the existing one and do all the formal >> > stuff. Just gluing the documents won't help. >> Totally agree with this point! >> > >> > Cheers >> > Witek >> > >> >> >> -- >> May The Force of OpenStack Be With You, >> Rico Lin >> irc: ricolin >> > > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Mon Jan 6 09:08:46 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Mon, 6 Jan 2020 10:08:46 +0100 Subject: [kolla] Quick start: ansible deploy failure In-Reply-To: <88226158-d15c-7f23-e692-aa461f6d8549@open-e.com> References: <88226158-d15c-7f23-e692-aa461f6d8549@open-e.com> Message-ID: Hi Andrei, I see you use kolla-ansible for Train, yet it looks as if you are deploying Stein there. Could you confirm that? If you prefer to deploy Stein, please use the Stein branch of kolla-ansible or analogically the 8.* releases from PyPI. Otherwise try deploying Train. -yoctozepto pon., 6 sty 2020 o 05:58 Andrei Perapiolkin napisał(a): > > Hello, > > > Im following quick start guide on deploying Kolla ansible and getting failure on deploy stage: > > https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html > > kolla-ansible -i ./multinode deploy > > TASK [mariadb : Creating haproxy mysql user] ******************************************************************************************************************************** > > fatal: [control01]: FAILED! => {"changed": false, "msg": "Can not parse the inner module output: localhost | SUCCESS => {\n \"changed\": false, \n \"user\": \"haproxy\"\n}\n"} > > > I deploy to Centos7 with latest updates. > > > [user at master ~]$ pip list > DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. 
More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support > Package Version > -------------------------------- ---------- > ansible 2.9.1 > Babel 2.8.0 > backports.ssl-match-hostname 3.7.0.1 > certifi 2019.11.28 > cffi 1.13.2 > chardet 3.0.4 > configobj 4.7.2 > cryptography 2.8 > debtcollector 1.22.0 > decorator 3.4.0 > docker 4.1.0 > enum34 1.1.6 > funcsigs 1.0.2 > httplib2 0.9.2 > idna 2.8 > iniparse 0.4 > ipaddress 1.0.23 > IPy 0.75 > iso8601 0.1.12 > Jinja2 2.10.3 > jmespath 0.9.4 > kitchen 1.1.1 > kolla-ansible 9.0.0 > MarkupSafe 1.1.1 > monotonic 1.5 > netaddr 0.7.19 > netifaces 0.10.9 > oslo.config 6.12.0 > oslo.i18n 3.25.0 > oslo.utils 3.42.1 > paramiko 2.1.1 > pbr 5.4.4 > perf 0.1 > pip 19.3.1 > ply 3.4 > policycoreutils-default-encoding 0.1 > pyasn1 0.1.9 > pycparser 2.19 > pycurl 7.19.0 > pygobject 3.22.0 > pygpgme 0.3 > pyliblzma 0.5.3 > pyparsing 2.4.6 > python-linux-procfs 0.4.9 > pytz 2019.3 > pyudev 0.15 > pyxattr 0.5.1 > PyYAML 5.2 > requests 2.22.0 > rfc3986 1.3.2 > schedutils 0.4 > seobject 0.1 > sepolicy 1.1 > setuptools 44.0.0 > six 1.13.0 > slip 0.4.0 > slip.dbus 0.4.0 > stevedore 1.31.0 > urlgrabber 3.10 > urllib3 1.25.7 > websocket-client 0.57.0 > wrapt 1.11.2 > yum-metadata-parser 1.1.4 > > > and it looks like Im not alone with such issue: https://q.cnblogs.com/q/125213/ > > > Thanks for your attention, > > Andrei Perepiolkin From radoslaw.piliszek at gmail.com Mon Jan 6 09:11:48 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Mon, 6 Jan 2020 10:11:48 +0100 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: Message-ID: If it's RHEL kernel's bug, then Red Hat would likely want to know about it (if not knowing already). I have my kolla deployment on c7.7 and I don't encounter this issue, though there is a pending kernel update so now I'm worried about applying it... -yoctozepto pon., 6 sty 2020 o 03:34 Jon Masters napisał(a): > > There’s no bug ID that I’m aware of. But I’ll go look for one or file one. > > -- > Computer Architect > > > On Jan 5, 2020, at 18:51, Laurent Dumont wrote: > >  > Do you happen to have the bug ID for Centos? > > On Sun, Jan 5, 2020 at 2:11 PM Jon Masters wrote: >> >> This turns out to a not well documented bug in the CentOS7.7 kernel that causes exactly nat rules not to run as I was seeing. Oh dear god was this nasty as whatever to find and workaround. >> >> -- >> Computer Architect >> >> >> > On Jan 4, 2020, at 10:39, Jon Masters wrote: >> > >> > Excuse top posting on my phone. Also, yes, the namespaces are as described. It’s just that the (correct) nat rules for the qrouter netns are never running, in spite of the two interfaces existing in that ns and correctly attached to the vswitch. >> > >> > -- >> > Computer Architect >> > >> > >> >>> On Jan 4, 2020, at 07:56, Sean Mooney wrote: >> >>> >> >>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: >> >>> Hi, >> >>> >> >>> Is this qrouter namespace created with all those rules in container or in the host directly? >> >>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? >> >> in kolla the l3 agent should be running with net=host so the container should be useing the hosts >> >> root namespace and it will create network namespaces as needed for the different routers. >> >> >> >> the ip table rules should be in the router sub namespaces. 
>> >> >> >>> >> >>>>> On 4 Jan 2020, at 05:44, Jon Masters wrote: >> >>>> >> >>>> Hi there, >> >>>> >> >>>> I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the iptables >> >>>> rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or SNAT >> >>>> applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING chains >> >>>> (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log entries. It's as >> >>>> if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is driving me >> >>>> crazy :) >> >>>> >> >>>> Anyone got some quick suggestions? (assume I tried the obvious stuff). >> >>>> >> >>>> Jon. >> >>>> >> >>>> -- >> >>>> Computer Architect >> >>> >> >>> — >> >>> Slawek Kaplonski >> >>> Senior software engineer >> >>> Red Hat >> >>> >> >>> >> >> >> From thierry at openstack.org Mon Jan 6 09:40:54 2020 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 6 Jan 2020 10:40:54 +0100 Subject: [cloudkitty] Stepping down from PTL In-Reply-To: <6a879c96c9aa82cc31f4ffde7a6b2663@objectif-libre.com> References: <6a879c96c9aa82cc31f4ffde7a6b2663@objectif-libre.com> Message-ID: <601c77c1-f725-f4dc-7a24-da7627f6d998@openstack.org> Luka Peschke wrote: > I'm moving to a new position that doesn't involve OpenStack, and won't > leave me the required time to be Cloudkitty's PTL. This is why I have to > step down from the PTL position. jferrieu will take my position for the > end of the U cycle (he's been a major contributor recently), with the > help of huats, who's been the Cloudkitty PTL before me, and has been > around in the community for a long time. > > I've been the PTL for two and a half cycles, and I think that it is a > good thing for the project to take a new lead, with a new vision. > > I'm grateful for my experience within the OpenStack community. Sorry to see you go, Luka! To make the transition official, could you propose a change to update the PTL name at: https://opendev.org/openstack/governance/src/branch/master/reference/projects.yaml#L161 Thanks in advance, -- Thierry Carrez (ttx) From arnaud.morin at gmail.com Mon Jan 6 09:34:08 2020 From: arnaud.morin at gmail.com (Arnaud Morin) Date: Mon, 6 Jan 2020 09:34:08 +0000 Subject: [neutron][nova][cinder][glance][largescale-sig] Documentation update for large-scale Message-ID: <20200106093408.GH1174@sync> Hey all, With the new "Large scale SIG", we were thinking about updating documentation to help operators setting up large deployments. To do so, we would like to propose, at least, some documentation changes to identify options that affect large scale. The plan is to have a small note on some options and eventually a link to a specific page for large scale (see attachments). I know that nova started working on collecting those parameters here: https://bugs.launchpad.net/nova/+bug/1838819 Do you know if something similar exists on other projects? Moreover, we would like to collect more parameters that could be tuned on a large scale deployment, for every project. So, if you have any, feel free to answer to this mail or add some info on the following etherpad: https://etherpad.openstack.org/p/large-scale-sig-documentation Thanks for your help! Regards. The large-scale team. -- Arnaud Morin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: before.png Type: application/octet-stream Size: 42004 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: after.png Type: application/octet-stream Size: 59687 bytes Desc: not available URL: From smooney at redhat.com Mon Jan 6 11:40:15 2020 From: smooney at redhat.com (Sean Mooney) Date: Mon, 06 Jan 2020 11:40:15 +0000 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: Message-ID: <9a331abcc2d5eaf119dc1c1903c3405024ce84a8.camel@redhat.com> On Mon, 2020-01-06 at 10:11 +0100, Radosław Piliszek wrote: > If it's RHEL kernel's bug, then Red Hat would likely want to know > about it (if not knowing already). > I have my kolla deployment on c7.7 and I don't encounter this issue, > though there is a pending kernel update so now I'm worried about > applying it... it sound more like a confilct between legacy iptables and the new nftables based replacement. if you mix the two then it will appear as if the rules are installed but only some of the rules will run. so the container images and the host need to be both configured to use the same versions. that said fi you are using centos images on a centos host they should be providing your usnign centos 7 or centos 8 on both. if you try to use centos 7 image on a centos 8 host or centos 8 images on a centos 7 host it would likely have issues due to the fact centos 8 uses a differt iptables implemeantion > > -yoctozepto > > pon., 6 sty 2020 o 03:34 Jon Masters napisał(a): > > > > There’s no bug ID that I’m aware of. But I’ll go look for one or file one. > > > > -- > > Computer Architect > > > > > > On Jan 5, 2020, at 18:51, Laurent Dumont wrote: > > > >  > > Do you happen to have the bug ID for Centos? > > > > On Sun, Jan 5, 2020 at 2:11 PM Jon Masters wrote: > > > > > > This turns out to a not well documented bug in the CentOS7.7 kernel that causes exactly nat rules not to run as I > > > was seeing. Oh dear god was this nasty as whatever to find and workaround. > > > > > > -- > > > Computer Architect > > > > > > > > > > On Jan 4, 2020, at 10:39, Jon Masters wrote: > > > > > > > > Excuse top posting on my phone. Also, yes, the namespaces are as described. It’s just that the (correct) nat > > > > rules for the qrouter netns are never running, in spite of the two interfaces existing in that ns and correctly > > > > attached to the vswitch. > > > > > > > > -- > > > > Computer Architect > > > > > > > > > > > > > > On Jan 4, 2020, at 07:56, Sean Mooney wrote: > > > > > > > > > > > > On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: > > > > > > Hi, > > > > > > > > > > > > Is this qrouter namespace created with all those rules in container or in the host directly? > > > > > > Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? > > > > > > > > > > in kolla the l3 agent should be running with net=host so the container should be useing the hosts > > > > > root namespace and it will create network namespaces as needed for the different routers. > > > > > > > > > > the ip table rules should be in the router sub namespaces. > > > > > > > > > > > > > > > > > > > On 4 Jan 2020, at 05:44, Jon Masters wrote: > > > > > > > > > > > > > > Hi there, > > > > > > > > > > > > > > I've got a weird problem with the neutron-l3-agent container on my deployment. 
It comes up, sets up the > > > > > > > iptables > > > > > > > rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or > > > > > > > SNAT > > > > > > > applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING > > > > > > > chains > > > > > > > (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log > > > > > > > entries. It's as > > > > > > > if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is > > > > > > > driving me > > > > > > > crazy :) > > > > > > > > > > > > > > Anyone got some quick suggestions? (assume I tried the obvious stuff). > > > > > > > > > > > > > > Jon. > > > > > > > > > > > > > > -- > > > > > > > Computer Architect > > > > > > > > > > > > — > > > > > > Slawek Kaplonski > > > > > > Senior software engineer > > > > > > Red Hat > > > > > > > > > > > > > > From skaplons at redhat.com Mon Jan 6 11:50:11 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 6 Jan 2020 12:50:11 +0100 Subject: [neutron][nova][cinder][glance][largescale-sig] Documentation update for large-scale In-Reply-To: <20200106093408.GH1174@sync> References: <20200106093408.GH1174@sync> Message-ID: <515FFAFC-83C7-4E4F-9B43-19186FE86C1F@redhat.com> Hi, I just opened similar bug for Neutron to track this from Neutron perspective also. It’s here: https://bugs.launchpad.net/neutron/+bug/1858419 I will also raise this on our next team meeting. > On 6 Jan 2020, at 10:34, Arnaud Morin wrote: > > Hey all, > > With the new "Large scale SIG", we were thinking about updating > documentation to help operators setting up large deployments. > To do so, we would like to propose, at least, some documentation changes > to identify options that affect large scale. > The plan is to have a small note on some options and eventually a link > to a specific page for large scale (see attachments). > > I know that nova started working on collecting those parameters here: > https://bugs.launchpad.net/nova/+bug/1838819 > > Do you know if something similar exists on other projects? > > Moreover, we would like to collect more parameters that could be tuned > on a large scale deployment, for every project. > So, if you have any, feel free to answer to this mail or add some info > on the following etherpad: > https://etherpad.openstack.org/p/large-scale-sig-documentation > > Thanks for your help! > > Regards. > The large-scale team. > > -- > Arnaud Morin > > — Slawek Kaplonski Senior software engineer Red Hat From jan.vondra at ultimum.io Mon Jan 6 12:02:51 2020 From: jan.vondra at ultimum.io (Jan Vondra) Date: Mon, 6 Jan 2020 13:02:51 +0100 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: <9a331abcc2d5eaf119dc1c1903c3405024ce84a8.camel@redhat.com> References: <9a331abcc2d5eaf119dc1c1903c3405024ce84a8.camel@redhat.com> Message-ID: po 6. 1. 2020 v 12:46 odesílatel Sean Mooney napsal: > > On Mon, 2020-01-06 at 10:11 +0100, Radosław Piliszek wrote: > > If it's RHEL kernel's bug, then Red Hat would likely want to know > > about it (if not knowing already). > > I have my kolla deployment on c7.7 and I don't encounter this issue, > > though there is a pending kernel update so now I'm worried about > > applying it... > it sound more like a confilct between legacy iptables and the new nftables based replacement. > if you mix the two then it will appear as if the rules are installed but only some of the rules will run. 
> so the container images and the host need to be both configured to use the same versions. > > that said fi you are using centos images on a centos host they should be providing your usnign centos 7 or centos 8 on > both. if you try to use centos 7 image on a centos 8 host or centos 8 images on a centos 7 host it would likely have > issues due to the fact centos 8 uses a differt iptables implemeantion > As I wrote before this scenario has already been covered in following patches: https://review.opendev.org/#/c/685967/ https://review.opendev.org/#/c/683679/ To force iptables legacy in neutron containers put following line into globals.yml file: neutron_legacy_iptables: "yes" Beware currently there is an issue in applying changes in enviromental variables for already running containers so you may have to manually delete neutron containers and recreate them using reconfigure or - if possible - destroy and redeploy whole deployment. J.V. From jcm at jonmasters.org Mon Jan 6 12:13:22 2020 From: jcm at jonmasters.org (Jon Masters) Date: Mon, 6 Jan 2020 04:13:22 -0800 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: <9a331abcc2d5eaf119dc1c1903c3405024ce84a8.camel@redhat.com> References: <9a331abcc2d5eaf119dc1c1903c3405024ce84a8.camel@redhat.com> Message-ID: I did specifically check for such a conflict tho before proceeding down the path I went :) -- Computer Architect > On Jan 6, 2020, at 03:40, Sean Mooney wrote: > > On Mon, 2020-01-06 at 10:11 +0100, Radosław Piliszek wrote: >> If it's RHEL kernel's bug, then Red Hat would likely want to know >> about it (if not knowing already). >> I have my kolla deployment on c7.7 and I don't encounter this issue, >> though there is a pending kernel update so now I'm worried about >> applying it... > it sound more like a confilct between legacy iptables and the new nftables based replacement. > if you mix the two then it will appear as if the rules are installed but only some of the rules will run. > so the container images and the host need to be both configured to use the same versions. > > that said fi you are using centos images on a centos host they should be providing your usnign centos 7 or centos 8 on > both. if you try to use centos 7 image on a centos 8 host or centos 8 images on a centos 7 host it would likely have > issues due to the fact centos 8 uses a differt iptables implemeantion > >> >> -yoctozepto >> >> pon., 6 sty 2020 o 03:34 Jon Masters napisał(a): >>> >>> There’s no bug ID that I’m aware of. But I’ll go look for one or file one. >>> >>> -- >>> Computer Architect >>> >>> >>>> On Jan 5, 2020, at 18:51, Laurent Dumont wrote: >>> >>>  >>> Do you happen to have the bug ID for Centos? >>> >>> On Sun, Jan 5, 2020 at 2:11 PM Jon Masters wrote: >>>> >>>> This turns out to a not well documented bug in the CentOS7.7 kernel that causes exactly nat rules not to run as I >>>> was seeing. Oh dear god was this nasty as whatever to find and workaround. >>>> >>>> -- >>>> Computer Architect >>>> >>>> >>>>> On Jan 4, 2020, at 10:39, Jon Masters wrote: >>>>> >>>>> Excuse top posting on my phone. Also, yes, the namespaces are as described. It’s just that the (correct) nat >>>>> rules for the qrouter netns are never running, in spite of the two interfaces existing in that ns and correctly >>>>> attached to the vswitch. 
>>>>> >>>>> -- >>>>> Computer Architect >>>>> >>>>> >>>>>>> On Jan 4, 2020, at 07:56, Sean Mooney wrote: >>>>>>> >>>>>>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: >>>>>>> Hi, >>>>>>> >>>>>>> Is this qrouter namespace created with all those rules in container or in the host directly? >>>>>>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? >>>>>> >>>>>> in kolla the l3 agent should be running with net=host so the container should be useing the hosts >>>>>> root namespace and it will create network namespaces as needed for the different routers. >>>>>> >>>>>> the ip table rules should be in the router sub namespaces. >>>>>> >>>>>>> >>>>>>>>> On 4 Jan 2020, at 05:44, Jon Masters wrote: >>>>>>>> >>>>>>>> Hi there, >>>>>>>> >>>>>>>> I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the >>>>>>>> iptables >>>>>>>> rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or >>>>>>>> SNAT >>>>>>>> applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING >>>>>>>> chains >>>>>>>> (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log >>>>>>>> entries. It's as >>>>>>>> if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is >>>>>>>> driving me >>>>>>>> crazy :) >>>>>>>> >>>>>>>> Anyone got some quick suggestions? (assume I tried the obvious stuff). >>>>>>>> >>>>>>>> Jon. >>>>>>>> >>>>>>>> -- >>>>>>>> Computer Architect >>>>>>> >>>>>>> — >>>>>>> Slawek Kaplonski >>>>>>> Senior software engineer >>>>>>> Red Hat >>>>>>> >>>>>>> >> >> > From radoslaw.piliszek at gmail.com Mon Jan 6 12:33:02 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Mon, 6 Jan 2020 13:33:02 +0100 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: <9a331abcc2d5eaf119dc1c1903c3405024ce84a8.camel@redhat.com> Message-ID: Folks, this seems to be about C7, not C8, and "neutron_legacy_iptables" does not apply here. @Jon - what is the kernel bug you mentioned but never referenced? -yoctozepto pon., 6 sty 2020 o 13:13 Jon Masters napisał(a): > > I did specifically check for such a conflict tho before proceeding down the path I went :) > > -- > Computer Architect > > > > On Jan 6, 2020, at 03:40, Sean Mooney wrote: > > > > On Mon, 2020-01-06 at 10:11 +0100, Radosław Piliszek wrote: > >> If it's RHEL kernel's bug, then Red Hat would likely want to know > >> about it (if not knowing already). > >> I have my kolla deployment on c7.7 and I don't encounter this issue, > >> though there is a pending kernel update so now I'm worried about > >> applying it... > > it sound more like a confilct between legacy iptables and the new nftables based replacement. > > if you mix the two then it will appear as if the rules are installed but only some of the rules will run. > > so the container images and the host need to be both configured to use the same versions. > > > > that said fi you are using centos images on a centos host they should be providing your usnign centos 7 or centos 8 on > > both. 
if you try to use centos 7 image on a centos 8 host or centos 8 images on a centos 7 host it would likely have > > issues due to the fact centos 8 uses a differt iptables implemeantion > > > >> > >> -yoctozepto > >> > >> pon., 6 sty 2020 o 03:34 Jon Masters napisał(a): > >>> > >>> There’s no bug ID that I’m aware of. But I’ll go look for one or file one. > >>> > >>> -- > >>> Computer Architect > >>> > >>> > >>>> On Jan 5, 2020, at 18:51, Laurent Dumont wrote: > >>> > >>>  > >>> Do you happen to have the bug ID for Centos? > >>> > >>> On Sun, Jan 5, 2020 at 2:11 PM Jon Masters wrote: > >>>> > >>>> This turns out to a not well documented bug in the CentOS7.7 kernel that causes exactly nat rules not to run as I > >>>> was seeing. Oh dear god was this nasty as whatever to find and workaround. > >>>> > >>>> -- > >>>> Computer Architect > >>>> > >>>> > >>>>> On Jan 4, 2020, at 10:39, Jon Masters wrote: > >>>>> > >>>>> Excuse top posting on my phone. Also, yes, the namespaces are as described. It’s just that the (correct) nat > >>>>> rules for the qrouter netns are never running, in spite of the two interfaces existing in that ns and correctly > >>>>> attached to the vswitch. > >>>>> > >>>>> -- > >>>>> Computer Architect > >>>>> > >>>>> > >>>>>>> On Jan 4, 2020, at 07:56, Sean Mooney wrote: > >>>>>>> > >>>>>>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: > >>>>>>> Hi, > >>>>>>> > >>>>>>> Is this qrouter namespace created with all those rules in container or in the host directly? > >>>>>>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? > >>>>>> > >>>>>> in kolla the l3 agent should be running with net=host so the container should be useing the hosts > >>>>>> root namespace and it will create network namespaces as needed for the different routers. > >>>>>> > >>>>>> the ip table rules should be in the router sub namespaces. > >>>>>> > >>>>>>> > >>>>>>>>> On 4 Jan 2020, at 05:44, Jon Masters wrote: > >>>>>>>> > >>>>>>>> Hi there, > >>>>>>>> > >>>>>>>> I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the > >>>>>>>> iptables > >>>>>>>> rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or > >>>>>>>> SNAT > >>>>>>>> applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING > >>>>>>>> chains > >>>>>>>> (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log > >>>>>>>> entries. It's as > >>>>>>>> if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is > >>>>>>>> driving me > >>>>>>>> crazy :) > >>>>>>>> > >>>>>>>> Anyone got some quick suggestions? (assume I tried the obvious stuff). > >>>>>>>> > >>>>>>>> Jon. > >>>>>>>> > >>>>>>>> -- > >>>>>>>> Computer Architect > >>>>>>> > >>>>>>> — > >>>>>>> Slawek Kaplonski > >>>>>>> Senior software engineer > >>>>>>> Red Hat > >>>>>>> > >>>>>>> > >> > >> > > From mark at stackhpc.com Mon Jan 6 13:38:19 2020 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 6 Jan 2020 13:38:19 +0000 Subject: [kolla] Adding Dincer Celik to kolla-core and kolla-ansible-core Message-ID: Hi, I recently proposed to the existing cores that we add Dincer Celik (osmanlicilegi) to the kolla-core and kolla-ansible-core groups and we agreed to go ahead. Thanks for your contribution to the project so far Dincer, I'm glad to have you on the team. 
Cheers, Mark From hberaud at redhat.com Mon Jan 6 14:16:38 2020 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 6 Jan 2020 15:16:38 +0100 Subject: [oslo][kolla][requirements][release][infra] Hit by an old, fixed bug In-Reply-To: References: <20191230150137.GA9057@sm-workstation> <79cddc25-88e0-b5dd-8b8a-17cf14b9c4b1@nemebean.com> Message-ID: Thanks Radosław for the heads up, I validated the new release. Le sam. 4 janv. 2020 à 10:39, Radosław Piliszek a écrit : > Thanks, Ben. That doc preamble really made me think not to cross the > holy ground of release proposals. :-) > > I proposed release [1] and added you and Hervé as reviewers. > > [1] https://review.opendev.org/701080 > > -yoctozepto > > czw., 2 sty 2020 o 21:20 Ben Nemec napisał(a): > > > > > > > > On 12/30/19 9:52 AM, Radosław Piliszek wrote: > > > Thanks, Sean! I knew I was missing something really basic! > > > I was under the impression that 9.x is Stein, like it happens with > > > main projects (major=branch). > > > I could not find any doc explaining oslo.messaging versioning, perhaps > > > Oslo could release 9.5.1 off the stein branch? > > > > Oslo for the most part follows semver, so we only bump major versions > > when there is a breaking change. We bump minor versions each release so > > we can do bugfix releases on the previous stable branch without stepping > > on master releases. > > > > The underlying cause of this is likely that I'm way behind on releasing > > the Oslo stable branches. It's high on my todo list now that most people > > are back from holidays and will be around to help out if a release > > breaks something. > > > > However, anyone can propose a release[0][1] (contrary to what [0] > > suggests), so if the necessary fix is already on stable/stein and just > > hasn't been released yet please feel free to do that. You'll just need a > > +1 from either myself or hberaud (the Oslo release liaison) before the > > release team will approve it. > > > > 0: > https://releases.openstack.org/reference/using.html#requesting-a-release > > 1: > > > https://releases.openstack.org/reference/using.html#using-new-release-command > > > > > > > > The issue remains that, even though oslo backports bugfixes into their > > > stable branches, kolla (and very possibly other deployment solutions) > > > no longer benefit from them. > > > > > > -yoctozepto > > > > > > pon., 30 gru 2019 o 16:01 Sean McGinnis > napisał(a): > > >> > > >> On Sun, Dec 29, 2019 at 09:41:45PM +0100, Radosław Piliszek wrote: > > >>> Hi Folks, > > >>> > > >>> as the subject goes, my installation has been hit by an old bug: > > >>> https://bugs.launchpad.net/oslo.messaging/+bug/1828841 > > >>> (bug details not important, linked here for background) > > >>> > > >>> I am using Stein, deployed with recent Kolla-built source-based > images > > >>> (with only slight modifications compared to vanilla ones). > > >>> Kolla's procedure for building source-based images considers upper > > >>> constraints, which, unfortunately, turned out to be lagging behind a > > >>> few releases w.r.t. oslo.messaging at least. > > >>> The fix was in 9.7.0 released on May 21, u-c still point to 9.5.0 > from > > >>> Feb 26 and the latest of Stein is 9.8.0 from Jul 18. 
> > >>> > > >>> It seems oslo.messaging is missing from the automatic updates that > bot proposes: > > >>> > https://review.opendev.org/#/q/owner:%22OpenStack+Proposal+Bot%22+project:openstack/requirements+branch:stable/stein > > >>> > > >>> Per: > > >>> > https://opendev.org/openstack/releases/src/branch/master/doc/source/reference/reviewer_guide.rst#release-jobs > > >>> this upper-constraint proposal should be happening for all releases. > > >>> > > >> > > >> This is normal and what is expected. > > >> > > >> Requirements are only updated for the branch in which those releases > happen. So > > >> if there is a release of oslo.messaging for stable/train, only the > stable/train > > >> upper constraints are updated for that new release. The stable/stein > branch > > >> will not be affected because that shows what the tested upper > constraints were > > >> for that branch. > > >> > > >> The last stable/stein release for oslo.messaging was 9.5.0: > > >> > > >> > https://opendev.org/openstack/releases/src/branch/master/deliverables/stein/oslo.messaging.yaml#L49 > > >> > > >> And 9.5.0 is what is set in the stable/stein upper-constraints: > > >> > > >> > https://opendev.org/openstack/requirements/src/branch/stable/stein/upper-constraints.txt#L146 > > >> > > >> To get that raised, whatever necessary bugfixes that are required in > > >> oslo.messaging would need to be backported per-cycle until > stable/stein (as in, > > >> if it was in current master, it would need to be backported and > merged to > > >> stable/train first, then stable/stein), and once merged a stable > release would > > >> need to be proposed for that branch's version of the library. > > >> > > >> Once that stable release is done, that will propose the update to the > upper > > >> constraint for the given branch. > > >> > > >>> I would be glad if someone investigated why it happens(/ed) and > > >>> audited whether other OpenStack projects don't need updating as well > > >>> to avoid running on old deps when new are awaiting for months. :-) > > >>> Please note this might apply to other branches as well. > > >>> > > >>> PS: for some reason oslo.messaging Stein release notes ( > > >>> https://docs.openstack.org/releasenotes/oslo.messaging/stein.html ) > > >>> are stuck at 9.5.0 as well, this could be right (I did not inspect > the > > >>> sources) but I am adding this in PS so you have more things to > > >>> correlate if they need be. > > >>> > > >> > > >> Again, as expected. The last stable/stein release was 9.5.0, so that > is correct > > >> that the release notes for stein only show up to that point. 
> > > > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Mon Jan 6 14:32:28 2020 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 6 Jan 2020 15:32:28 +0100 Subject: [largescale-sig] Meeting summary and next actions In-Reply-To: <3c3a6232-9a3b-d240-ab82-c7ac4997f5c0@openstack.org> References: <3c3a6232-9a3b-d240-ab82-c7ac4997f5c0@openstack.org> Message-ID: <06e5f16f-dfa4-8189-da7b-ad2250df8125@openstack.org> Thierry Carrez wrote: > [...] > The next meeting will happen on January 15, at 9:00 UTC on > #openstack-meeting. Oops, some unexpected travel came up and I won't be available to chair the meeting on that date. We can either: 1- keep the meeting, with someone else chairing. I can help with posting the agenda before and the summary after, just need someone to start the meeting and lead it -- any volunteer? 2- move the meeting to January 22, but we may lose Chinese participants to new year preparations... Thoughts? -- Thierry Carrez (ttx) From haleyb.dev at gmail.com Mon Jan 6 15:15:53 2020 From: haleyb.dev at gmail.com (Brian Haley) Date: Mon, 6 Jan 2020 10:15:53 -0500 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: <9a331abcc2d5eaf119dc1c1903c3405024ce84a8.camel@redhat.com> Message-ID: On 1/6/20 7:33 AM, Radosław Piliszek wrote: > Folks, this seems to be about C7, not C8, and > "neutron_legacy_iptables" does not apply here. > @Jon - what is the kernel bug you mentioned but never referenced? There was a previous kernel bug in a Centos kernel that broke DNAT, https://bugs.launchpad.net/neutron/+bug/1776778 but don't know if this is the same issue. I would have hoped no one was using that kernel by now, and/or it was blacklisted. -Brian > pon., 6 sty 2020 o 13:13 Jon Masters napisał(a): >> >> I did specifically check for such a conflict tho before proceeding down the path I went :) >> >> -- >> Computer Architect >> >> >>> On Jan 6, 2020, at 03:40, Sean Mooney wrote: >>> >>> On Mon, 2020-01-06 at 10:11 +0100, Radosław Piliszek wrote: >>>> If it's RHEL kernel's bug, then Red Hat would likely want to know >>>> about it (if not knowing already). >>>> I have my kolla deployment on c7.7 and I don't encounter this issue, >>>> though there is a pending kernel update so now I'm worried about >>>> applying it... >>> it sound more like a confilct between legacy iptables and the new nftables based replacement. >>> if you mix the two then it will appear as if the rules are installed but only some of the rules will run. >>> so the container images and the host need to be both configured to use the same versions. 
>>> >>> that said fi you are using centos images on a centos host they should be providing your usnign centos 7 or centos 8 on >>> both. if you try to use centos 7 image on a centos 8 host or centos 8 images on a centos 7 host it would likely have >>> issues due to the fact centos 8 uses a differt iptables implemeantion >>> >>>> >>>> -yoctozepto >>>> >>>> pon., 6 sty 2020 o 03:34 Jon Masters napisał(a): >>>>> >>>>> There’s no bug ID that I’m aware of. But I’ll go look for one or file one. >>>>> >>>>> -- >>>>> Computer Architect >>>>> >>>>> >>>>>> On Jan 5, 2020, at 18:51, Laurent Dumont wrote: >>>>> >>>>>  >>>>> Do you happen to have the bug ID for Centos? >>>>> >>>>> On Sun, Jan 5, 2020 at 2:11 PM Jon Masters wrote: >>>>>> >>>>>> This turns out to a not well documented bug in the CentOS7.7 kernel that causes exactly nat rules not to run as I >>>>>> was seeing. Oh dear god was this nasty as whatever to find and workaround. >>>>>> >>>>>> -- >>>>>> Computer Architect >>>>>> >>>>>> >>>>>>> On Jan 4, 2020, at 10:39, Jon Masters wrote: >>>>>>> >>>>>>> Excuse top posting on my phone. Also, yes, the namespaces are as described. It’s just that the (correct) nat >>>>>>> rules for the qrouter netns are never running, in spite of the two interfaces existing in that ns and correctly >>>>>>> attached to the vswitch. >>>>>>> >>>>>>> -- >>>>>>> Computer Architect >>>>>>> >>>>>>> >>>>>>>>> On Jan 4, 2020, at 07:56, Sean Mooney wrote: >>>>>>>>> >>>>>>>>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> Is this qrouter namespace created with all those rules in container or in the host directly? >>>>>>>>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? >>>>>>>> >>>>>>>> in kolla the l3 agent should be running with net=host so the container should be useing the hosts >>>>>>>> root namespace and it will create network namespaces as needed for the different routers. >>>>>>>> >>>>>>>> the ip table rules should be in the router sub namespaces. >>>>>>>> >>>>>>>>> >>>>>>>>>>> On 4 Jan 2020, at 05:44, Jon Masters wrote: >>>>>>>>>> >>>>>>>>>> Hi there, >>>>>>>>>> >>>>>>>>>> I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the >>>>>>>>>> iptables >>>>>>>>>> rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or >>>>>>>>>> SNAT >>>>>>>>>> applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING >>>>>>>>>> chains >>>>>>>>>> (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log >>>>>>>>>> entries. It's as >>>>>>>>>> if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is >>>>>>>>>> driving me >>>>>>>>>> crazy :) >>>>>>>>>> >>>>>>>>>> Anyone got some quick suggestions? (assume I tried the obvious stuff). >>>>>>>>>> >>>>>>>>>> Jon. 
>>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Computer Architect >>>>>>>>> >>>>>>>>> — >>>>>>>>> Slawek Kaplonski >>>>>>>>> Senior software engineer >>>>>>>>> Red Hat >>>>>>>>> >>>>>>>>> >>>> >>>> >>> > From Arkady.Kanevsky at dell.com Mon Jan 6 17:07:17 2020 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Mon, 6 Jan 2020 17:07:17 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> Message-ID: <7b55e3b28d644492a846fdb10f7b127b@AUSX13MPS308.AMER.DELL.COM> Zhipeng, Thanks for quick feedback. Where is accelerating device is running? I am aware of 3 possibilities: servers, storage, switches. In each one of them the device is managed as part of server, storage box or switch. The core of my message is separation of device life cycle management in the “box” where it is placed, from the programming the device as needed per application (VM, container). Thanks, Arkady From: Zhipeng Huang Sent: Friday, January 3, 2020 7:53 PM To: Kanevsky, Arkady Cc: OpenStack Discuss Subject: Re: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management [EXTERNAL EMAIL] Hi Arkady, Thanks for your interest in Cyborg project :) I would like to point out that when we initiated the project there are two specific use cases we want to cover: the accelerators attached locally (via PCIe or other bus type) or remotely (via Ethernet or other fabric type). For the latter one, it is clear that its life cycle is independent from the server (like block device managed by Cinder). For the former one however, its life cycle is not dependent on server for all kinds of accelerators either. For example we already have PCIe based AI accelerator cards or Smart NICs that could be power on/off when the server is on all the time. Therefore it is not a good idea to move all the life cycle management part into Ironic for the above mentioned reasons. Ironic integration is very important for the standalone usage of Cyborg for Kubernetes, Envoy (TLS acceleration) and others alike. Hope this answers your question :) On Sat, Jan 4, 2020 at 5:23 AM > wrote: Fellow Open Stackers, I have been thinking on how to handle SmartNICs, GPUs, FPGA handling across different projects within OpenStack with Cyborg taking a leading role in it. Cyborg is important project and address accelerator devices that are part of the server and potentially switches and storage. It is address 3 different use cases and users there are all grouped into single project. 1. Application user need to program a portion of the device under management, like GPU, or SmartNIC for that app usage. Having a common way to do it across different device families and across different vendor is very important. And that has to be done every time a VM is deploy that need usage of a device. That is tied with VM scheduling. 2. Administrator need to program the whole device for specific usage. That covers the scenario when device can only support single tenant or single use case. That is done once during OpenStack deployment but may need reprogramming to configure device for different usage. May or may not require reboot of the server. 3. Administrator need to setup device for its use, like burning specific FW on it. This is typically done as part of server life-cycle event. The first 2 cases cover application life cycle of device usage. The last one covers device life cycle independently how it is used. 
Managing life cycle of devices is Ironic responsibility, One cannot and should not manage lifecycle of server components independently. Managing server devices outside server management violates customer service agreements with server vendors and breaks server support agreements. Nova and Neutron are getting info about all devices and their capabilities from Ironic; that they use for scheduling. We should avoid creating new project for every new component of the server and modify nova and neuron for each new device. (the same will also apply to cinder and manila if smart devices used in its data/control path on a server). Finally we want Cyborg to be able to be used in standalone capacity, say for Kubernetes. Thus, I propose that Cyborg cover use cases 1 & 2, and Ironic would cover use case 3. Thus, move all device Life-cycle code from Cyborg to Ironic. Concentrate Cyborg of fulfilling the first 2 use cases. Simplify integration with Nova and Neutron for using these accelerators to use existing Ironic mechanism for it. Create idempotent calls for use case 1 so Nova and Neutron can use it as part of VM deployment to ensure that devices are programmed for VM under scheduling need. Create idempotent call(s) for use case 2 for TripleO to setup device for single accelerator usage of a node. [Propose similar model for CNI integration.] Let the discussion start! Thanks., Arkady -- Zhipeng (Howard) Huang Principle Engineer OpenStack, Kubernetes, CNCF, LF Edge, ONNX, Kubeflow, OpenSDS, Open Service Broker API, OCP, Hyperledger, ETSI, SNIA, DMTF, W3C -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon Jan 6 19:51:24 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 6 Jan 2020 20:51:24 +0100 Subject: [neutron] Bug deputy report - week of 30th December Message-ID: <40343C7D-8C58-4D56-A1E5-D3F72C90D7F1@redhat.com> Hi, I was on bug deputy last week. It was pretty quiet week with only few bugs reported. Below is my summary of it. Critical: https://bugs.launchpad.net/neutron/+bug/1858260 - Upstream CI neutron-tempest-plugin-* fails - I marked it as critical as it cause gate failures, I will have to take a look at it closer next week, Medium: https://bugs.launchpad.net/neutron/+bug/1858086 - qrouter's local link route cannot be restored - confirmed by me on local env, it would be good if someone from L3 subteam can take a look at it, Undecided and others: https://bugs.launchpad.net/neutron/+bug/1858377 - probably bug for openstackclient rather than neutron, but I would like to wait for confirmation from bug reporter first, https://bugs.launchpad.net/neutron/+bug/1858262 - duplicate of other bug, https://bugs.launchpad.net/neutron/+bug/1858419 - docs bug to update config options for large scale deployments, see http://lists.openstack.org/pipermail/openstack-discuss/2020-January/011820.html for details — Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Mon Jan 6 19:56:36 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 6 Jan 2020 20:56:36 +0100 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: <9a331abcc2d5eaf119dc1c1903c3405024ce84a8.camel@redhat.com> Message-ID: Hi, > On 6 Jan 2020, at 16:15, Brian Haley wrote: > > On 1/6/20 7:33 AM, Radosław Piliszek wrote: >> Folks, this seems to be about C7, not C8, and >> "neutron_legacy_iptables" does not apply here. >> @Jon - what is the kernel bug you mentioned but never referenced? 
> > There was a previous kernel bug in a Centos kernel that broke DNAT, https://bugs.launchpad.net/neutron/+bug/1776778 but don't know if this is the same issue. I would have hoped no one was using that kernel by now, and/or it was blacklisted. This one also came to my mind when I read about kernel bug here. But this old bug was affecting only DNAT on dvr routers IIRC so IMO it doesn’t seems like same issue. > > -Brian > >> pon., 6 sty 2020 o 13:13 Jon Masters napisał(a): >>> >>> I did specifically check for such a conflict tho before proceeding down the path I went :) >>> >>> -- >>> Computer Architect >>> >>> >>>> On Jan 6, 2020, at 03:40, Sean Mooney wrote: >>>> >>>> On Mon, 2020-01-06 at 10:11 +0100, Radosław Piliszek wrote: >>>>> If it's RHEL kernel's bug, then Red Hat would likely want to know >>>>> about it (if not knowing already). >>>>> I have my kolla deployment on c7.7 and I don't encounter this issue, >>>>> though there is a pending kernel update so now I'm worried about >>>>> applying it... >>>> it sound more like a confilct between legacy iptables and the new nftables based replacement. >>>> if you mix the two then it will appear as if the rules are installed but only some of the rules will run. >>>> so the container images and the host need to be both configured to use the same versions. >>>> >>>> that said fi you are using centos images on a centos host they should be providing your usnign centos 7 or centos 8 on >>>> both. if you try to use centos 7 image on a centos 8 host or centos 8 images on a centos 7 host it would likely have >>>> issues due to the fact centos 8 uses a differt iptables implemeantion >>>> >>>>> >>>>> -yoctozepto >>>>> >>>>> pon., 6 sty 2020 o 03:34 Jon Masters napisał(a): >>>>>> >>>>>> There’s no bug ID that I’m aware of. But I’ll go look for one or file one. >>>>>> >>>>>> -- >>>>>> Computer Architect >>>>>> >>>>>> >>>>>>> On Jan 5, 2020, at 18:51, Laurent Dumont wrote: >>>>>> >>>>>>  >>>>>> Do you happen to have the bug ID for Centos? >>>>>> >>>>>> On Sun, Jan 5, 2020 at 2:11 PM Jon Masters wrote: >>>>>>> >>>>>>> This turns out to a not well documented bug in the CentOS7.7 kernel that causes exactly nat rules not to run as I >>>>>>> was seeing. Oh dear god was this nasty as whatever to find and workaround. >>>>>>> >>>>>>> -- >>>>>>> Computer Architect >>>>>>> >>>>>>> >>>>>>>> On Jan 4, 2020, at 10:39, Jon Masters wrote: >>>>>>>> >>>>>>>> Excuse top posting on my phone. Also, yes, the namespaces are as described. It’s just that the (correct) nat >>>>>>>> rules for the qrouter netns are never running, in spite of the two interfaces existing in that ns and correctly >>>>>>>> attached to the vswitch. >>>>>>>> >>>>>>>> -- >>>>>>>> Computer Architect >>>>>>>> >>>>>>>> >>>>>>>>>> On Jan 4, 2020, at 07:56, Sean Mooney wrote: >>>>>>>>>> >>>>>>>>>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> Is this qrouter namespace created with all those rules in container or in the host directly? >>>>>>>>>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? >>>>>>>>> >>>>>>>>> in kolla the l3 agent should be running with net=host so the container should be useing the hosts >>>>>>>>> root namespace and it will create network namespaces as needed for the different routers. >>>>>>>>> >>>>>>>>> the ip table rules should be in the router sub namespaces. 
>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> On 4 Jan 2020, at 05:44, Jon Masters wrote: >>>>>>>>>>> >>>>>>>>>>> Hi there, >>>>>>>>>>> >>>>>>>>>>> I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the >>>>>>>>>>> iptables >>>>>>>>>>> rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having DNAT or >>>>>>>>>>> SNAT >>>>>>>>>>> applied. What's most strange is that manually adding a LOG jump target to the iptables nat PRE/POSTROUTING >>>>>>>>>>> chains >>>>>>>>>>> (after enabling nf logging sent to the host kernel, confirmed that works) doesn't result in any log >>>>>>>>>>> entries. It's as >>>>>>>>>>> if the nat table isn't being applied at all for any packets traversing the qrouter namespace. This is >>>>>>>>>>> driving me >>>>>>>>>>> crazy :) >>>>>>>>>>> >>>>>>>>>>> Anyone got some quick suggestions? (assume I tried the obvious stuff). >>>>>>>>>>> >>>>>>>>>>> Jon. >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Computer Architect >>>>>>>>>> >>>>>>>>>> — >>>>>>>>>> Slawek Kaplonski >>>>>>>>>> Senior software engineer >>>>>>>>>> Red Hat >>>>>>>>>> >>>>>>>>>> >>>>> >>>>> >>>> > — Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Mon Jan 6 20:05:53 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 6 Jan 2020 21:05:53 +0100 Subject: [all][neutron][neutron-fwaas] Maintainers needed In-Reply-To: <20191119102615.oq46xojyhoybulna@skaplons-mac> References: <20191119102615.oq46xojyhoybulna@skaplons-mac> Message-ID: Hi, Just as a reminder, we are still looking for maintainers who want to keep neutron-fwaas project alive. As it was written in my previous email, we will mark this project as deprecated. So please reply to this email or contact me directly if You are interested in maintaining this project. > On 19 Nov 2019, at 11:26, Slawek Kaplonski wrote: > > Hi, > > Over the past couple of cycles we have noticed that new contributions and > maintenance efforts for neutron-fwaas project were almost non existent. > This impacts patches for bug fixes, new features and reviews. The Neutron > core team is trying to at least keep the CI of this project healthy, but we > don’t have enough knowledge about the details of the neutron-fwaas > code base to review more complex patches. > > During the PTG in Shanghai we discussed that with operators and TC members > during the forum session [1] and later within the Neutron team during the > PTG session [2]. > > During these discussions, with the help of operators and TC members, we reached > the conclusion that we need to have someone responsible for maintaining project. > This doesn’t mean that the maintainer needs to spend full time working on this > project. Rather, we need someone to be the contact person for the project, who > takes care of the project’s CI and review patches. Of course that’s only a > minimal requirement. If the new maintainer works on new features for the > project, it’s even better :) > > If we don’t have any new maintainer(s) before milestone Ussuri-2, which is > Feb 10 - Feb 14 according to [3], we will need to mark neutron-fwaas > as deprecated and in “V” cycle we will propose to move the project > from the Neutron stadium, hosted in the “openstack/“ namespace, to the > unofficial projects hosted in the “x/“ namespace. > > So if You are using this project now, or if You have customers who are > using it, please consider the possibility of maintaining it. 
Otherwise, > please be aware that it is highly possible that the project will be > deprecated and moved out from the official OpenStack projects. > > [1] > https://etherpad.openstack.org/p/PVG-Neutron-stadium-projects-the-path-forward > [2] https://etherpad.openstack.org/p/Shanghai-Neutron-Planning-restored - > Lines 379-421 > [3] https://releases.openstack.org/ussuri/schedule.html > > -- > Slawek Kaplonski > Senior software engineer > Red Hat — Slawek Kaplonski Senior software engineer Red Hat From skaplons at redhat.com Mon Jan 6 20:06:13 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 6 Jan 2020 21:06:13 +0100 Subject: [all][neutron][neutron-vpnaas] Maintainers needed In-Reply-To: <20191119104137.pkra6hehfhdjjhh3@skaplons-mac> References: <20191119104137.pkra6hehfhdjjhh3@skaplons-mac> Message-ID: Hi, Just as a reminder, we are still looking for maintainers who want to keep neutron-vpnaas project alive. As it was written in my previous email, we will mark this project as deprecated. So please reply to this email or contact me directly if You are interested in maintaining this project. > On 19 Nov 2019, at 11:41, Slawek Kaplonski wrote: > > Hi, > > Over the past couple of cycles we have noticed that new contributions and > maintenance efforts for neutron-vpnaas were almost non existent. > This impacts patches for bug fixes, new features and reviews. The Neutron > core team is trying to at least keep the CI of this project healthy, but we > don’t have enough knowledge about the details of the neutron-vpnaas > code base to review more complex patches. > > During the PTG in Shanghai we discussed that with operators and TC members > during the forum session [1] and later within the Neutron team during the > PTG session [2]. > > During these discussions, with the help of operators and TC members, we reached > the conclusion that we need to have someone responsible for maintaining project. > This doesn’t mean that the maintainer needs to spend full time working on this > project. Rather, we need someone to be the contact person for the project, who > takes care of the project’s CI and review patches. Of course that’s only a > minimal requirement. If the new maintainer works on new features for the > project, it’s even better :) > > If we don’t have any new maintainer(s) before milestone Ussuri-2, which is > Feb 10 - Feb 14 according to [3], we will need to mark neutron-vpnaas > as deprecated and in “V” cycle we will propose to move the project > from the Neutron stadium, hosted in the “openstack/“ namespace, to the > unofficial projects hosted in the “x/“ namespace. > > So if You are using this project now, or if You have customers who are > using it, please consider the possibility of maintaining it. Otherwise, > please be aware that it is highly possible that the project will be > deprecated and moved out from the official OpenStack projects. > > [1] > https://etherpad.openstack.org/p/PVG-Neutron-stadium-projects-the-path-forward > [2] https://etherpad.openstack.org/p/Shanghai-Neutron-Planning-restored - > Lines 379-421 > [3] https://releases.openstack.org/ussuri/schedule.html > > -- > Slawek Kaplonski > Senior software engineer > Red Hat — Slawek Kaplonski Senior software engineer Red Hat From neil at tigera.io Mon Jan 6 20:38:11 2020 From: neil at tigera.io (Neil Jerram) Date: Mon, 6 Jan 2020 20:38:11 +0000 Subject: [all] Is there something I can do to get a simple fix done? Message-ID: I'm struggling to say this positively, but... 
it feels like OpenStack promotes refactoring work that will likely break something, but is very slow when a corresponding fix is needed, even when the fix is trivial. Is there something we could do to get fixes done more quickly when needed? My case in point: my team's networking plugin (networking-calico) does not do "extraroutes", and so was broken by some python-openstackclient change (possibly [1]) that wrongly assumed that. I posted a fix [2] that passed CI on 21st October, and asked for it to be reviewed on IRC a couple of days later. Édouard Thuleau posted a similar fix [3] on 4th December, and we agreed that his was better, so I abandoned mine. His fix attracted a +2 on 11th December, but has been sitting like that ever since. It's a fix that I would expect to be simple to review, so I wonder if there's something else we could have done here to get this moving? Or if there is a systemic problem here that deserves discussion? [1] https://opendev.org/openstack/python-openstackclient/commit/c44f26eb7e41c28bb13ef9bd31c8ddda9e638862 [2] https://review.opendev.org/#/c/685312/ [3] https://review.opendev.org/#/c/697240/ Many thanks, Neil -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Mon Jan 6 21:29:27 2020 From: openstack at fried.cc (Eric Fried) Date: Mon, 6 Jan 2020 15:29:27 -0600 Subject: [all] Is there something I can do to get a simple fix done? In-Reply-To: References: Message-ID: Neil- > Édouard Thuleau posted a similar fix [3] on 4th > December, and we agreed that his was better, so I abandoned mine. His > fix attracted a +2 on 11th December, but has been sitting like that ever > since. Expecting any fix - even a trivial one - to get merged in less than a month when that month includes most of December is expecting a lot. That said... > It's a fix that I would expect to be simple to review, so I wonder if > there's something else we could have done here to get this moving?  Or > if there is a systemic problem here that deserves discussion? In this specific case I think the issue is a dearth of "core hours" available to the python-openstackclient project. Dean (dtroyer) and Monty (mordred) are the main cores there, and their time is *very* divided. Absent some signal of urgency to garner their attention and cause them to prioritize it over other work, a given change has a decent chance of languishing indefinitely, particularly as other more urgent work never ceases to pile up. > I'm struggling to say this positively, but... it feels like OpenStack > promotes refactoring work that will likely break something, but is very > slow when a corresponding fix is needed, even when the fix is trivial. > Is there something we could do to get fixes done more quickly when needed? Presumably this generalization is based on more than just the above fix. Certainly it doesn't apply to all patches in all projects. But to the extent that it is true, it often has the same root cause: shortage of maintainers. This is a message that needs to be taken back to companies that continue to expect OpenStack to be maintained without investing in the human power necessary to create cores. #brokenrecord efried . 
From juliaashleykreger at gmail.com Mon Jan 6 21:32:57 2020 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Mon, 6 Jan 2020 13:32:57 -0800 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: <7b55e3b28d644492a846fdb10f7b127b@AUSX13MPS308.AMER.DELL.COM> References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <7b55e3b28d644492a846fdb10f7b127b@AUSX13MPS308.AMER.DELL.COM> Message-ID: Greetings Arkady, I think your message makes a very good case and raises a point that I've been trying to type out for the past hour, but with only different words. We have multiple USER driven interactions with a similarly desired, if not the exact same desired end result where different paths can be taken, as we perceive use cases from "As a user, I would like a VM with a configured accelerator", "I would like any compute resource (VM or Baremetal), with a configured accelerator", to "As an administrator, I need to reallocate a baremetal node for this different use, so my user can leverage its accelerator once they know how and are ready to use it.", and as suggested "I as a user want baremetal with k8s and configured accelerators." And I suspect this diversity of use patterns is where things begin to become difficult. As such I believe, we in essence, have a question of a support or compatibility matrix that definitely has gaps depending on "how" the "user" wants or needs to achieve their goals. And, I think where this entire discussion _can_ go sideways is... (from what I understand) some of these devices need to be flashed by the application user with firmware on demand to meet the user's needs, which is where lifecycle and support interactions begin to become... conflicted. Further complicating matters is the "Metal to Tenant" use cases where the user requesting the machine is not an administrator, but has some level of inherent administrative access to all Operating System accessible devices once their OS has booted. Which makes me wonder "What if the cloud administrators WANT to block the tenant's direct ability to write/flash firmware into accelerator/smartnic/etc?" I suspect if cloud administrators want to block such hardware access, vendors will want to support such a capability. Blocking such access inherently forces some actions into hardware management/maintenance workflows, and may ultimately may cause some of a support matrix's use cases to be unsupportable, again ultimately depending on what exactly the user is attempting to achieve. Going back to the suggestions in the original email, They seem logical to me in terms of the delineation and separation of responsibilities as we present a cohesive solution the users of our software. Greetings Zhipeng, Is there any documentation at present that details the desired support and use cases? I think this would at least help my understanding, since everything that requires the power to be on would still need to be integrated with-in workflows for eventual tighter integration. Also, has Cyborg drafted any plans or proposals for integration? -Julia On Mon, Jan 6, 2020 at 9:14 AM wrote: > > Zhipeng, > > Thanks for quick feedback. > > Where is accelerating device is running? I am aware of 3 possibilities: servers, storage, switches. > > In each one of them the device is managed as part of server, storage box or switch. 
> > > > The core of my message is separation of device life cycle management in the “box” where it is placed, from the programming the device as needed per application (VM, container). > > > > Thanks, > Arkady > > > > From: Zhipeng Huang > Sent: Friday, January 3, 2020 7:53 PM > To: Kanevsky, Arkady > Cc: OpenStack Discuss > Subject: Re: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management > > > > [EXTERNAL EMAIL] > > Hi Arkady, > > > > Thanks for your interest in Cyborg project :) I would like to point out that when we initiated the project there are two specific use cases we want to cover: the accelerators attached locally (via PCIe or other bus type) or remotely (via Ethernet or other fabric type). > > > > For the latter one, it is clear that its life cycle is independent from the server (like block device managed by Cinder). For the former one however, its life cycle is not dependent on server for all kinds of accelerators either. For example we already have PCIe based AI accelerator cards or Smart NICs that could be power on/off when the server is on all the time. > > > > Therefore it is not a good idea to move all the life cycle management part into Ironic for the above mentioned reasons. Ironic integration is very important for the standalone usage of Cyborg for Kubernetes, Envoy (TLS acceleration) and others alike. > > > > Hope this answers your question :) > > > > On Sat, Jan 4, 2020 at 5:23 AM wrote: > > Fellow Open Stackers, > > I have been thinking on how to handle SmartNICs, GPUs, FPGA handling across different projects within OpenStack with Cyborg taking a leading role in it. > > > > Cyborg is important project and address accelerator devices that are part of the server and potentially switches and storage. > > It is address 3 different use cases and users there are all grouped into single project. > > > > Application user need to program a portion of the device under management, like GPU, or SmartNIC for that app usage. Having a common way to do it across different device families and across different vendor is very important. And that has to be done every time a VM is deploy that need usage of a device. That is tied with VM scheduling. > Administrator need to program the whole device for specific usage. That covers the scenario when device can only support single tenant or single use case. That is done once during OpenStack deployment but may need reprogramming to configure device for different usage. May or may not require reboot of the server. > Administrator need to setup device for its use, like burning specific FW on it. This is typically done as part of server life-cycle event. > > > > The first 2 cases cover application life cycle of device usage. > > The last one covers device life cycle independently how it is used. > > > > Managing life cycle of devices is Ironic responsibility, One cannot and should not manage lifecycle of server components independently. Managing server devices outside server management violates customer service agreements with server vendors and breaks server support agreements. > > Nova and Neutron are getting info about all devices and their capabilities from Ironic; that they use for scheduling. We should avoid creating new project for every new component of the server and modify nova and neuron for each new device. (the same will also apply to cinder and manila if smart devices used in its data/control path on a server). > > Finally we want Cyborg to be able to be used in standalone capacity, say for Kubernetes. 
> > > > Thus, I propose that Cyborg cover use cases 1 & 2, and Ironic would cover use case 3. > > Thus, move all device Life-cycle code from Cyborg to Ironic. > > Concentrate Cyborg of fulfilling the first 2 use cases. > > Simplify integration with Nova and Neutron for using these accelerators to use existing Ironic mechanism for it. > > Create idempotent calls for use case 1 so Nova and Neutron can use it as part of VM deployment to ensure that devices are programmed for VM under scheduling need. > > Create idempotent call(s) for use case 2 for TripleO to setup device for single accelerator usage of a node. > > [Propose similar model for CNI integration.] > > > > Let the discussion start! > > > > Thanks., > Arkady > > > > > -- > > Zhipeng (Howard) Huang > > > > Principle Engineer > > OpenStack, Kubernetes, CNCF, LF Edge, ONNX, Kubeflow, OpenSDS, Open Service Broker API, OCP, Hyperledger, ETSI, SNIA, DMTF, W3C > > From jcm at jonmasters.org Mon Jan 6 22:27:43 2020 From: jcm at jonmasters.org (Jon Masters) Date: Mon, 6 Jan 2020 14:27:43 -0800 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: <9a331abcc2d5eaf119dc1c1903c3405024ce84a8.camel@redhat.com> Message-ID: https://bugs.launchpad.net/kolla/+bug/1858505 On Mon, Jan 6, 2020 at 11:56 AM Slawek Kaplonski wrote: > Hi, > > > On 6 Jan 2020, at 16:15, Brian Haley wrote: > > > > On 1/6/20 7:33 AM, Radosław Piliszek wrote: > >> Folks, this seems to be about C7, not C8, and > >> "neutron_legacy_iptables" does not apply here. > >> @Jon - what is the kernel bug you mentioned but never referenced? > > > > There was a previous kernel bug in a Centos kernel that broke DNAT, > https://bugs.launchpad.net/neutron/+bug/1776778 but don't know if this is > the same issue. I would have hoped no one was using that kernel by now, > and/or it was blacklisted. > > This one also came to my mind when I read about kernel bug here. But this > old bug was affecting only DNAT on dvr routers IIRC so IMO it doesn’t seems > like same issue. > > > > > -Brian > > > >> pon., 6 sty 2020 o 13:13 Jon Masters napisał(a): > >>> > >>> I did specifically check for such a conflict tho before proceeding > down the path I went :) > >>> > >>> -- > >>> Computer Architect > >>> > >>> > >>>> On Jan 6, 2020, at 03:40, Sean Mooney wrote: > >>>> > >>>> On Mon, 2020-01-06 at 10:11 +0100, Radosław Piliszek wrote: > >>>>> If it's RHEL kernel's bug, then Red Hat would likely want to know > >>>>> about it (if not knowing already). > >>>>> I have my kolla deployment on c7.7 and I don't encounter this issue, > >>>>> though there is a pending kernel update so now I'm worried about > >>>>> applying it... > >>>> it sound more like a confilct between legacy iptables and the new > nftables based replacement. > >>>> if you mix the two then it will appear as if the rules are installed > but only some of the rules will run. > >>>> so the container images and the host need to be both configured to > use the same versions. > >>>> > >>>> that said fi you are using centos images on a centos host they should > be providing your usnign centos 7 or centos 8 on > >>>> both. if you try to use centos 7 image on a centos 8 host or centos 8 > images on a centos 7 host it would likely have > >>>> issues due to the fact centos 8 uses a differt iptables implemeantion > >>>> > >>>>> > >>>>> -yoctozepto > >>>>> > >>>>> pon., 6 sty 2020 o 03:34 Jon Masters > napisał(a): > >>>>>> > >>>>>> There’s no bug ID that I’m aware of. But I’ll go look for one or > file one. 
> >>>>>> > >>>>>> -- > >>>>>> Computer Architect > >>>>>> > >>>>>> > >>>>>>> On Jan 5, 2020, at 18:51, Laurent Dumont > wrote: > >>>>>> > >>>>>>  > >>>>>> Do you happen to have the bug ID for Centos? > >>>>>> > >>>>>> On Sun, Jan 5, 2020 at 2:11 PM Jon Masters > wrote: > >>>>>>> > >>>>>>> This turns out to a not well documented bug in the CentOS7.7 > kernel that causes exactly nat rules not to run as I > >>>>>>> was seeing. Oh dear god was this nasty as whatever to find and > workaround. > >>>>>>> > >>>>>>> -- > >>>>>>> Computer Architect > >>>>>>> > >>>>>>> > >>>>>>>> On Jan 4, 2020, at 10:39, Jon Masters wrote: > >>>>>>>> > >>>>>>>> Excuse top posting on my phone. Also, yes, the namespaces are as > described. It’s just that the (correct) nat > >>>>>>>> rules for the qrouter netns are never running, in spite of the > two interfaces existing in that ns and correctly > >>>>>>>> attached to the vswitch. > >>>>>>>> > >>>>>>>> -- > >>>>>>>> Computer Architect > >>>>>>>> > >>>>>>>> > >>>>>>>>>> On Jan 4, 2020, at 07:56, Sean Mooney > wrote: > >>>>>>>>>> > >>>>>>>>>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> Is this qrouter namespace created with all those rules in > container or in the host directly? > >>>>>>>>>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter > namespace? > >>>>>>>>> > >>>>>>>>> in kolla the l3 agent should be running with net=host so the > container should be useing the hosts > >>>>>>>>> root namespace and it will create network namespaces as needed > for the different routers. > >>>>>>>>> > >>>>>>>>> the ip table rules should be in the router sub namespaces. > >>>>>>>>> > >>>>>>>>>> > >>>>>>>>>>>> On 4 Jan 2020, at 05:44, Jon Masters > wrote: > >>>>>>>>>>> > >>>>>>>>>>> Hi there, > >>>>>>>>>>> > >>>>>>>>>>> I've got a weird problem with the neutron-l3-agent container > on my deployment. It comes up, sets up the > >>>>>>>>>>> iptables > >>>>>>>>>>> rules in the qrouter namespace (and I can see these using "ip > netns...") but traffic isn't having DNAT or > >>>>>>>>>>> SNAT > >>>>>>>>>>> applied. What's most strange is that manually adding a LOG > jump target to the iptables nat PRE/POSTROUTING > >>>>>>>>>>> chains > >>>>>>>>>>> (after enabling nf logging sent to the host kernel, confirmed > that works) doesn't result in any log > >>>>>>>>>>> entries. It's as > >>>>>>>>>>> if the nat table isn't being applied at all for any packets > traversing the qrouter namespace. This is > >>>>>>>>>>> driving me > >>>>>>>>>>> crazy :) > >>>>>>>>>>> > >>>>>>>>>>> Anyone got some quick suggestions? (assume I tried the obvious > stuff). > >>>>>>>>>>> > >>>>>>>>>>> Jon. > >>>>>>>>>>> > >>>>>>>>>>> -- > >>>>>>>>>>> Computer Architect > >>>>>>>>>> > >>>>>>>>>> — > >>>>>>>>>> Slawek Kaplonski > >>>>>>>>>> Senior software engineer > >>>>>>>>>> Red Hat > >>>>>>>>>> > >>>>>>>>>> > >>>>> > >>>>> > >>>> > > > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > -- Computer Architect -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at fried.cc Mon Jan 6 22:48:19 2020 From: openstack at fried.cc (Eric Fried) Date: Mon, 6 Jan 2020 16:48:19 -0600 Subject: [cliff][docs][requirements] new cliff versions causes docs to fail to build In-Reply-To: <20191222175308.juzyu6grndfcf2ez@mthode.org> References: <20191222175308.juzyu6grndfcf2ez@mthode.org> Message-ID: On 12/22/19 11:53 AM, Matthew Thode wrote: > Looks like some things changed in the new version that we depended upon > and are now causing failures. > > Exception occurred: > File "/home/zuul/src/opendev.org/openstack/python-openstackclient/.tox/docs/lib/python3.6/site-packages/cliff/sphinxext.py", line 245, in _load_app > if not issubclass(cliff_app_class, app.App): > TypeError: issubclass() arg 1 must be a class > This should have been fixed by [1], which is in cliff since 2.14.0. The python-openstackclient docs target (which IIUC still uses the def in tox.ini?) pulls in requirements.txt which lists cliff!=2.9.0,>=2.8.0 # Apache-2.0 and upper-constraints, which is at 2.16.0. All that seems copacetic to me. I also can't reproduce the failure locally building python-openstackclient docs from scratch. What/where/how were you building when you encountered this? efried [1] https://review.opendev.org/#/c/614218/ From openstack at fried.cc Mon Jan 6 22:59:45 2020 From: openstack at fried.cc (Eric Fried) Date: Mon, 6 Jan 2020 16:59:45 -0600 Subject: [cliff][docs][requirements] new cliff versions causes docs to fail to build In-Reply-To: References: <20191222175308.juzyu6grndfcf2ez@mthode.org> Message-ID: <1f796271-40f9-f93b-17b8-9ed30c91e51a@fried.cc> > cliff!=2.9.0,>=2.8.0 # Apache-2.0 I guess it wouldn't hurt to bump this to >=2.14.0 efried . From andrei.perepiolkin at open-e.com Tue Jan 7 06:28:09 2020 From: andrei.perepiolkin at open-e.com (Andrei Perapiolkin) Date: Tue, 7 Jan 2020 08:28:09 +0200 Subject: [kolla] Quick start: ansible deploy failure In-Reply-To: References: <88226158-d15c-7f23-e692-aa461f6d8549@open-e.com> Message-ID: Hi Radosław, Thanks for answering me. Yes I was deploying "Stein". And Yes, after setting openstack_release to Train error disappeared. Many thanks again Radosław. Andrei Perepiolkin On 1/6/20 11:08 AM, Radosław Piliszek wrote: > Hi Andrei, > > I see you use kolla-ansible for Train, yet it looks as if you are > deploying Stein there. > Could you confirm that? > If you prefer to deploy Stein, please use the Stein branch of > kolla-ansible or analogically the 8.* releases from PyPI. > Otherwise try deploying Train. > > -yoctozepto > > pon., 6 sty 2020 o 05:58 Andrei Perapiolkin > napisał(a): >> Hello, >> >> >> Im following quick start guide on deploying Kolla ansible and getting failure on deploy stage: >> >> https://docs.openstack.org/kolla-ansible/latest/user/quickstart.html >> >> kolla-ansible -i ./multinode deploy >> >> TASK [mariadb : Creating haproxy mysql user] ******************************************************************************************************************************** >> >> fatal: [control01]: FAILED! => {"changed": false, "msg": "Can not parse the inner module output: localhost | SUCCESS => {\n \"changed\": false, \n \"user\": \"haproxy\"\n}\n"} >> >> >> I deploy to Centos7 with latest updates. >> >> >> [user at master ~]$ pip list >> DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. 
More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support >> Package Version >> -------------------------------- ---------- >> ansible 2.9.1 >> Babel 2.8.0 >> backports.ssl-match-hostname 3.7.0.1 >> certifi 2019.11.28 >> cffi 1.13.2 >> chardet 3.0.4 >> configobj 4.7.2 >> cryptography 2.8 >> debtcollector 1.22.0 >> decorator 3.4.0 >> docker 4.1.0 >> enum34 1.1.6 >> funcsigs 1.0.2 >> httplib2 0.9.2 >> idna 2.8 >> iniparse 0.4 >> ipaddress 1.0.23 >> IPy 0.75 >> iso8601 0.1.12 >> Jinja2 2.10.3 >> jmespath 0.9.4 >> kitchen 1.1.1 >> kolla-ansible 9.0.0 >> MarkupSafe 1.1.1 >> monotonic 1.5 >> netaddr 0.7.19 >> netifaces 0.10.9 >> oslo.config 6.12.0 >> oslo.i18n 3.25.0 >> oslo.utils 3.42.1 >> paramiko 2.1.1 >> pbr 5.4.4 >> perf 0.1 >> pip 19.3.1 >> ply 3.4 >> policycoreutils-default-encoding 0.1 >> pyasn1 0.1.9 >> pycparser 2.19 >> pycurl 7.19.0 >> pygobject 3.22.0 >> pygpgme 0.3 >> pyliblzma 0.5.3 >> pyparsing 2.4.6 >> python-linux-procfs 0.4.9 >> pytz 2019.3 >> pyudev 0.15 >> pyxattr 0.5.1 >> PyYAML 5.2 >> requests 2.22.0 >> rfc3986 1.3.2 >> schedutils 0.4 >> seobject 0.1 >> sepolicy 1.1 >> setuptools 44.0.0 >> six 1.13.0 >> slip 0.4.0 >> slip.dbus 0.4.0 >> stevedore 1.31.0 >> urlgrabber 3.10 >> urllib3 1.25.7 >> websocket-client 0.57.0 >> wrapt 1.11.2 >> yum-metadata-parser 1.1.4 >> >> >> and it looks like Im not alone with such issue: https://q.cnblogs.com/q/125213/ >> >> >> Thanks for your attention, >> >> Andrei Perepiolkin From jiaopengju at cmss.chinamobile.com Tue Jan 7 07:43:47 2020 From: jiaopengju at cmss.chinamobile.com (jiaopengju) Date: Tue, 07 Jan 2020 15:43:47 +0800 Subject: [largescale-sig] Meeting summary and next actions In-Reply-To: <06e5f16f-dfa4-8189-da7b-ad2250df8125@openstack.org> References: <3c3a6232-9a3b-d240-ab82-c7ac4997f5c0@openstack.org> <06e5f16f-dfa4-8189-da7b-ad2250df8125@openstack.org> Message-ID: 2- move the meeting to January 22, but we may lose Chinese participants to new year preparations... Thank you ttx. The second option is OK for me, I will be online on January 22. -- Pengju Jiao(jiaopengju) 在 2020/1/6 下午10:32,“Thierry Carrez” 写入: Thierry Carrez wrote: > [...] > The next meeting will happen on January 15, at 9:00 UTC on > #openstack-meeting. Oops, some unexpected travel came up and I won't be available to chair the meeting on that date. We can either: 1- keep the meeting, with someone else chairing. I can help with posting the agenda before and the summary after, just need someone to start the meeting and lead it -- any volunteer? 2- move the meeting to January 22, but we may lose Chinese participants to new year preparations... Thoughts? -- Thierry Carrez (ttx) From radoslaw.piliszek at gmail.com Tue Jan 7 07:51:10 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 7 Jan 2020 08:51:10 +0100 Subject: [kolla] neutron-l3-agent namespace NAT table not working? In-Reply-To: References: <9a331abcc2d5eaf119dc1c1903c3405024ce84a8.camel@redhat.com> Message-ID: Thanks, Jon, though it's still too general to deduce anything more. Please see my comments on bug. -yoctozepto pon., 6 sty 2020 o 23:27 Jon Masters napisał(a): > > https://bugs.launchpad.net/kolla/+bug/1858505 > > On Mon, Jan 6, 2020 at 11:56 AM Slawek Kaplonski wrote: >> >> Hi, >> >> > On 6 Jan 2020, at 16:15, Brian Haley wrote: >> > >> > On 1/6/20 7:33 AM, Radosław Piliszek wrote: >> >> Folks, this seems to be about C7, not C8, and >> >> "neutron_legacy_iptables" does not apply here. 
>> >> @Jon - what is the kernel bug you mentioned but never referenced? >> > >> > There was a previous kernel bug in a Centos kernel that broke DNAT, https://bugs.launchpad.net/neutron/+bug/1776778 but don't know if this is the same issue. I would have hoped no one was using that kernel by now, and/or it was blacklisted. >> >> This one also came to my mind when I read about kernel bug here. But this old bug was affecting only DNAT on dvr routers IIRC so IMO it doesn’t seems like same issue. >> >> > >> > -Brian >> > >> >> pon., 6 sty 2020 o 13:13 Jon Masters napisał(a): >> >>> >> >>> I did specifically check for such a conflict tho before proceeding down the path I went :) >> >>> >> >>> -- >> >>> Computer Architect >> >>> >> >>> >> >>>> On Jan 6, 2020, at 03:40, Sean Mooney wrote: >> >>>> >> >>>> On Mon, 2020-01-06 at 10:11 +0100, Radosław Piliszek wrote: >> >>>>> If it's RHEL kernel's bug, then Red Hat would likely want to know >> >>>>> about it (if not knowing already). >> >>>>> I have my kolla deployment on c7.7 and I don't encounter this issue, >> >>>>> though there is a pending kernel update so now I'm worried about >> >>>>> applying it... >> >>>> it sound more like a confilct between legacy iptables and the new nftables based replacement. >> >>>> if you mix the two then it will appear as if the rules are installed but only some of the rules will run. >> >>>> so the container images and the host need to be both configured to use the same versions. >> >>>> >> >>>> that said fi you are using centos images on a centos host they should be providing your usnign centos 7 or centos 8 on >> >>>> both. if you try to use centos 7 image on a centos 8 host or centos 8 images on a centos 7 host it would likely have >> >>>> issues due to the fact centos 8 uses a differt iptables implemeantion >> >>>> >> >>>>> >> >>>>> -yoctozepto >> >>>>> >> >>>>> pon., 6 sty 2020 o 03:34 Jon Masters napisał(a): >> >>>>>> >> >>>>>> There’s no bug ID that I’m aware of. But I’ll go look for one or file one. >> >>>>>> >> >>>>>> -- >> >>>>>> Computer Architect >> >>>>>> >> >>>>>> >> >>>>>>> On Jan 5, 2020, at 18:51, Laurent Dumont wrote: >> >>>>>> >> >>>>>>  >> >>>>>> Do you happen to have the bug ID for Centos? >> >>>>>> >> >>>>>> On Sun, Jan 5, 2020 at 2:11 PM Jon Masters wrote: >> >>>>>>> >> >>>>>>> This turns out to a not well documented bug in the CentOS7.7 kernel that causes exactly nat rules not to run as I >> >>>>>>> was seeing. Oh dear god was this nasty as whatever to find and workaround. >> >>>>>>> >> >>>>>>> -- >> >>>>>>> Computer Architect >> >>>>>>> >> >>>>>>> >> >>>>>>>> On Jan 4, 2020, at 10:39, Jon Masters wrote: >> >>>>>>>> >> >>>>>>>> Excuse top posting on my phone. Also, yes, the namespaces are as described. It’s just that the (correct) nat >> >>>>>>>> rules for the qrouter netns are never running, in spite of the two interfaces existing in that ns and correctly >> >>>>>>>> attached to the vswitch. >> >>>>>>>> >> >>>>>>>> -- >> >>>>>>>> Computer Architect >> >>>>>>>> >> >>>>>>>> >> >>>>>>>>>> On Jan 4, 2020, at 07:56, Sean Mooney wrote: >> >>>>>>>>>> >> >>>>>>>>>> On Sat, 2020-01-04 at 10:46 +0100, Slawek Kaplonski wrote: >> >>>>>>>>>> Hi, >> >>>>>>>>>> >> >>>>>>>>>> Is this qrouter namespace created with all those rules in container or in the host directly? >> >>>>>>>>>> Do You have qr-xxx and qg-xxx ports from br-int in this qrouter namespace? 
>> >>>>>>>>>
>> >>>>>>>>> in kolla the l3 agent should be running with net=host so the container should be using the host's
>> >>>>>>>>> root namespace and it will create network namespaces as needed for the different routers.
>> >>>>>>>>>
>> >>>>>>>>> the ip table rules should be in the router sub namespaces.
>> >>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>>> On 4 Jan 2020, at 05:44, Jon Masters wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>> Hi there,
>> >>>>>>>>>>>
>> >>>>>>>>>>> I've got a weird problem with the neutron-l3-agent container on my deployment. It comes up, sets up the
>> >>>>>>>>>>> iptables rules in the qrouter namespace (and I can see these using "ip netns...") but traffic isn't having
>> >>>>>>>>>>> DNAT or SNAT applied. What's most strange is that manually adding a LOG jump target to the iptables nat
>> >>>>>>>>>>> PRE/POSTROUTING chains (after enabling nf logging sent to the host kernel, confirmed that works) doesn't
>> >>>>>>>>>>> result in any log entries. It's as if the nat table isn't being applied at all for any packets traversing
>> >>>>>>>>>>> the qrouter namespace. This is driving me crazy :)
>> >>>>>>>>>>>
>> >>>>>>>>>>> Anyone got some quick suggestions? (assume I tried the obvious stuff).
>> >>>>>>>>>>>
>> >>>>>>>>>>> Jon.
>> >>>>>>>>>>>
>> >>>>>>>>>>> --
>> >>>>>>>>>>> Computer Architect
>> >>>>>>>>>>
>> >>>>>>>>>> —
>> >>>>>>>>>> Slawek Kaplonski
>> >>>>>>>>>> Senior software engineer
>> >>>>>>>>>> Red Hat
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >
>>
>> —
>> Slawek Kaplonski
>> Senior software engineer
>> Red Hat
>>
>
>
> --
> Computer Architect

From martialmichel at datamachines.io  Tue Jan  7 02:38:32 2020
From: martialmichel at datamachines.io (Martial Michel)
Date: Mon, 6 Jan 2020 21:38:32 -0500
Subject: [Scientific] Scientific SIG meeting January 7th 2100 UTC
Message-ID:

The first meeting of the new year at 2100 UTC on Tuesday, January 7th.
Mostly an Any Other Business meeting to get its participants back from the
holidays, as reflected by the agenda :)
https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_7th_2020
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sshnaidm at redhat.com  Tue Jan  7 11:20:04 2020
From: sshnaidm at redhat.com (Sagi Shnaidman)
Date: Tue, 7 Jan 2020 13:20:04 +0200
Subject: [all][tripleo][openstack-ansible] Openstack Ansible modules - next steps
In-Reply-To:
References:
Message-ID:

Hi,

last meeting was pretty short, with only 2 participants due to the holidays,
so I think we can discuss the same agenda this week. A reminder that we agreed
to move the Ansible modules after 13 January.

- what is the best strategy for freezing the current modules in Ansible? A few
patches were merged just recently [1], so it seems like "freezing" doesn't
really work.
- python versions support in modules - keeping history when moving modules and other topics [2] Please add your questions to "Open discussion" section if there are some. Thanks [1] https://github.com/ansible/ansible/commits/devel/lib/ansible/modules/cloud/openstack [2] https://etherpad.openstack.org/p/openstack-ansible-modules On Fri, Dec 13, 2019 at 12:00 AM Sagi Shnaidman wrote: > Hi, all > short minutes from the meeting today about moving of Openstack Ansible > modules to Openstack. > > 1. Because of some level of uncertainty and different opinions, the > details of treatment of old modules will be under discussion in ML. I'll > send a mail about this topic. > 2. We agreed to have modules under "openstack." namespace and named > "cloud". So regular modules will be named like "openstack.cloud.os_server" > for example. > 3. We agreed to keep Ansible modules as thin as possible, putting the > logic into SDK. > 4. Also we will keep compatibility with as much Ansible versions as > possible. > 5. We agreed to have manual releases of Ansible modules as much as we > need. Similarly as it's done with SDK. > > Logs: > http://eavesdrop.openstack.org/meetings/api_sig/2019/api_sig.2019-12-12-16.00.log.html > Minutes: > http://eavesdrop.openstack.org/meetings/api_sig/2019/api_sig.2019-12-12-16.00.html > Etherpad: https://etherpad.openstack.org/p/openstack-ansible-modules > > Next time: Thursday 19 Dec 2019 4.00 PM UTC. > > Thanks > > On Fri, Dec 6, 2019 at 12:03 AM Sagi Shnaidman > wrote: > >> Hi, all >> short minutes from the meeting today about Openstack Ansible modules. >> >> 1. Ansible 2.10 is going to move all modules to collections, so Openstack >> modules should find a new home in Openstack repos. >> 2. Namespace for openstack modules will be named "openstack.". What is >> coming after the dot is still under discussion. >> 3. Current modules will be migrated to collections in "openstack." as is >> with their names and will be still available for playbooks (via >> symlinking). It will avoid breaking people that use in their playbooks os_* >> modules now. >> 4. Old modules will be frozen after migrations and all development work >> will go in the new modules which will live aside. >> 5. Critical bugfixes to 2.9 versions will be done via Ansible GitHub repo >> as usual and synced manually to "openstack." collection. It must be a very >> exceptional case. >> 6. Migrations are set for mid of January 2020 approximately. >> 7. Modules should stay compatible with last Ansible and collections API >> changed. >> 8. Because current old modules are licensed with GPL and license of >> Openstack is Apache2, we need to figure out if we can either relicense them >> or develop new ones with different license or to continue to work on new >> ones with GPL in SIG repo. Agreed to ask on legal-discuss ML. >> >> Long minutes: >> http://eavesdrop.openstack.org/meetings/api_sig/2019/api_sig.2019-12-05-16.00.html >> Logs: >> http://eavesdrop.openstack.org/meetings/api_sig/2019/api_sig.2019-12-05-16.00.log.html >> >> Etherpad: https://etherpad.openstack.org/p/openstack-ansible-modules >> Next time Thursday 12 Dec 2019 4.00 PM UTC. >> >> Thanks >> >> On Tue, Dec 3, 2019 at 8:18 PM Sagi Shnaidman >> wrote: >> >>> Hi, all >>> In the meeting today we agreed to meet every Thursday starting *this >>> week* at 4.00 PM UTC on #openstack-sdks channel on Freenode. We'll >>> discuss everything related to Openstack Ansible modules. 
>>> Agenda and topics are in the etherpad: >>> https://etherpad.openstack.org/p/openstack-ansible-modules >>> (I've created a new one, because we don't limit to Ironic modules only, >>> it's about all of them in general) >>> >>> Short minutes from meeting today: >>> Organizational: >>> 1. We meet every Thursday from this week at 4.00 PM UTC on >>> #openstack-sdks >>> 2. Interested parties for now are: Ironic, Tripleo, Openstack-Ansible, >>> Kolla-ansible, OpenstackSDK teams. Feel free to join and add yourself in >>> the etherpad. [1] >>> 3. We'll track our work in Storyboard for ansible-collections-openstack >>> (in progress) >>> 4. Openstack Ansible modules will live as collections under Ansible SIG >>> in repo openstack/ansible-collections-openstack [2] because there are >>> issues with different licensing: GPLv3 for Ansible in upstream and >>> Openstack license (Apache2). >>> 5. Ansible upstream Openstack modules will be merge-frozen when we'll >>> have our collections fully working and will be deprecated from Ansible at >>> some point in the future. >>> 6. Openstack Ansible collections will be published to Galaxy. >>> 7. There is a list of people that can be pinged for reviews in >>> ansible-collections-openstack project, feel free to join there [1] >>> >>> Technical: >>> 1. We use openstacksdk instead of [project]client modules. >>> 2. We will rename modules to be more like os_[service_type] named, >>> examples are in Ironic modules etherpad [3] >>> >>> Logs from meeting today you can find here: >>> http://eavesdrop.openstack.org/meetings/ansible_sig/2019/ansible_sig.2019-12-03-15.01.log.html >>> Please feel free to participate and add topics to agenda. [1] >>> >>> [1] https://etherpad.openstack.org/p/openstack-ansible-modules >>> [2] https://review.opendev.org/#/c/684740/ >>> [3] https://etherpad.openstack.org/p/ironic-ansible-modules >>> >>> Thanks >>> >>> On Wed, Nov 27, 2019 at 7:57 PM Sagi Shnaidman >>> wrote: >>> >>>> Hi, all >>>> >>>> in the light of finding the new home place for openstack related >>>> ansible modules [1] I'd like to discuss the best strategy to create Ironic >>>> ansible modules. Existing Ironic modules in Ansible repo don't cover even >>>> half of Ironic functionality, don't fit current needs and definitely >>>> require an additional work. There are a few topics that require attention >>>> and better be solved before modules are written to save additional work. We >>>> prepared an etherpad [2] with all these questions and if you have ideas or >>>> suggestions on how it should look you're welcome to update it. >>>> We'd like to decide the final place for them, name conventions (the >>>> most complex one!), what they should look like and how better to implement. >>>> Anybody interested in Ansible and baremetal management in Openstack, >>>> you're more than welcome to contribute. >>>> >>>> Thanks >>>> >>>> [1] https://review.opendev.org/#/c/684740/ >>>> [2] https://etherpad.openstack.org/p/ironic-ansible-modules >>>> >>>> -- >>>> Best regards >>>> Sagi Shnaidman >>>> >>> >>> >>> -- >>> Best regards >>> Sagi Shnaidman >>> >> >> >> -- >> Best regards >> Sagi Shnaidman >> > > > -- > Best regards > Sagi Shnaidman > -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jean-philippe at evrard.me Tue Jan 7 12:26:10 2020 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 07 Jan 2020 13:26:10 +0100 Subject: [tc] Expediting patch to unblock governance's CI Message-ID: <58bd7e162eb81f2bc86792a5642e0a1e09d68d99.camel@evrard.me> Hello everyone, Our governance repo's testing is currently failing, and I have patches to fix it (short term [1] and long term). Because it's been a while it's in there (and got recently updated), I will now merge the short term fix, even if we don't have enough votes (nor time). It will unblock many patches and prevent useless rechecks, at the expense of being out of policy. I will gladly take the blame, should there be any. Regards, JP [1] https://review.opendev.org/#/c/700422/ From jichenjc at cn.ibm.com Tue Jan 7 13:02:56 2020 From: jichenjc at cn.ibm.com (Chen CH Ji) Date: Tue, 7 Jan 2020 13:02:56 +0000 Subject: IBM z/VM CI is planning to migrate new environment In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From kotobi at dkrz.de Tue Jan 7 15:14:34 2020 From: kotobi at dkrz.de (Amjad Kotobi) Date: Tue, 7 Jan 2020 16:14:34 +0100 Subject: [neutron][rabbitmq] Neutron-server service shows deprecated "AMQPDeprecationWarning" Message-ID: <4D3B074F-09F2-48BE-BD61-5D34CBFE509E@dkrz.de> Hi, Today we are facing losing connection of neutron especially during instance creation or so as “systemctl status neutron-server” shows below message be deprecated in amqp 2.2.0. Since amqp 2.0 you have to explicitly call Connection.connect() before using the connection. W_FORCE_CONNECT.format(attr=attr))) /usr/lib/python2.7/site-packages/amqp/connection.py:304: AMQPDeprecationWarning: The .transport attribute on the connection was accessed before the connection was established. This is supported for now, but will be deprecated in amqp 2.2.0. Since amqp 2.0 you have to explicitly call Connection.connect() before using the connection. W_FORCE_CONNECT.format(attr=attr))) OpenStack release which we are running is “Pike”. Is there any way to remedy this? Thanks Amjad -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Tue Jan 7 15:15:21 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 7 Jan 2020 09:15:21 -0600 Subject: [cliff][docs][requirements] new cliff versions causes docs to fail to build In-Reply-To: References: <20191222175308.juzyu6grndfcf2ez@mthode.org> Message-ID: <20200107151521.GA349057@sm-workstation> On Mon, Jan 06, 2020 at 04:48:19PM -0600, Eric Fried wrote: > On 12/22/19 11:53 AM, Matthew Thode wrote: > > Looks like some things changed in the new version that we depended upon > > and are now causing failures. > > > > Exception occurred: > > File "/home/zuul/src/opendev.org/openstack/python-openstackclient/.tox/docs/lib/python3.6/site-packages/cliff/sphinxext.py", line 245, in _load_app > > if not issubclass(cliff_app_class, app.App): > > TypeError: issubclass() arg 1 must be a class > > > > This should have been fixed by [1], which is in cliff since 2.14.0. The > python-openstackclient docs target (which IIUC still uses the def in > tox.ini?) pulls in requirements.txt which lists > > cliff!=2.9.0,>=2.8.0 # Apache-2.0 > > and upper-constraints, which is at 2.16.0. All that seems copacetic to > me. I also can't reproduce the failure locally building > python-openstackclient docs from scratch. > > What/where/how were you building when you encountered this? 
> > efried > > [1] https://review.opendev.org/#/c/614218/ Part of this could be that cliff is still capped since the newest release has some issues that have yet to be addressed. The upper constraint can't be raised until they are (which also likely means blacklisting this version), but I haven't seen any activity there yet. So the fix that was supposed to handle this doesn't appear to have actually done so. http://lists.openstack.org/pipermail/openstack-discuss/2019-December/011741.html From sean.mcginnis at gmx.com Tue Jan 7 15:22:37 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 7 Jan 2020 09:22:37 -0600 Subject: [cliff][docs][requirements] new cliff versions causes docs to fail to build In-Reply-To: <20200107151521.GA349057@sm-workstation> References: <20191222175308.juzyu6grndfcf2ez@mthode.org> <20200107151521.GA349057@sm-workstation> Message-ID: <20200107152237.GA349707@sm-workstation> > > > TypeError: issubclass() arg 1 must be a class > > > > > > > This should have been fixed by [1], which is in cliff since 2.14.0. The > > python-openstackclient docs target (which IIUC still uses the def in > > tox.ini?) pulls in requirements.txt which lists > > > > cliff!=2.9.0,>=2.8.0 # Apache-2.0 > > > > and upper-constraints, which is at 2.16.0. All that seems copacetic to > > me. I also can't reproduce the failure locally building > > python-openstackclient docs from scratch. > > > > What/where/how were you building when you encountered this? > > > > efried > > > > [1] https://review.opendev.org/#/c/614218/ > > Part of this could be that cliff is still capped since the newest release has > some issues that have yet to be addressed. The upper constraint can't be raised > until they are (which also likely means blacklisting this version), but I > haven't seen any activity there yet. > > So the fix that was supposed to handle this doesn't appear to have actually > done so. > Or rather, it had fixed it, but somehow in the latest 2.17.0 release, the one unrelated change included somehow broke it again. https://github.com/openstack/cliff/compare/2.16.0...2.17.0 From sfinucan at redhat.com Tue Jan 7 16:44:37 2020 From: sfinucan at redhat.com (Stephen Finucane) Date: Tue, 07 Jan 2020 16:44:37 +0000 Subject: [cliff][docs][requirements] new cliff versions causes docs to fail to build In-Reply-To: <20200107152237.GA349707@sm-workstation> References: <20191222175308.juzyu6grndfcf2ez@mthode.org> <20200107151521.GA349057@sm-workstation> <20200107152237.GA349707@sm-workstation> Message-ID: On Tue, 2020-01-07 at 09:22 -0600, Sean McGinnis wrote: > > > > TypeError: issubclass() arg 1 must be a class > > > > > > > > > > This should have been fixed by [1], which is in cliff since 2.14.0. The > > > python-openstackclient docs target (which IIUC still uses the def in > > > tox.ini?) pulls in requirements.txt which lists > > > > > > cliff!=2.9.0,>=2.8.0 # Apache-2.0 > > > > > > and upper-constraints, which is at 2.16.0. All that seems copacetic to > > > me. I also can't reproduce the failure locally building > > > python-openstackclient docs from scratch. > > > > > > What/where/how were you building when you encountered this? > > > > > > efried > > > > > > [1] https://review.opendev.org/#/c/614218/ > > > > Part of this could be that cliff is still capped since the newest release has > > some issues that have yet to be addressed. The upper constraint can't be raised > > until they are (which also likely means blacklisting this version), but I > > haven't seen any activity there yet. 
> >
> > So the fix that was supposed to handle this doesn't appear to have actually
> > done so.
> >
>
> Or rather, it had fixed it, but somehow in the latest 2.17.0 release, the one
> unrelated change included somehow broke it again.
>
> https://github.com/openstack/cliff/compare/2.16.0...2.17.0

Commit 8bcd068e876ddd48ae61c1803449d666f5e28ba0, a.k.a. cliff 2.17.0, is not the
commit you (should have been) looking for. As noted at [1], we appear to have
tagged a commit from the review branch instead of one from master, which means
2.17.0 is based on code from master shortly after 2.11.0 (!) was released. I
suggest we blacklist 2.17.0 and issue a new 2.17.1 or 2.18.0 release post-haste.

That's not all though. For some daft reason, the 'python-rsdclient' project has
imported argparse's 'HelpFormatter' from 'cliff._argparse' instead of 'argparse'.
They need to stop doing this, because commit 584352dcd008d58c433136539b22a6ae9d6c45cc
of cliff means this will no longer work. Just import argparse directly.

Stephen

[1] https://review.opendev.org/#/c/698485/1/deliverables/ussuri/cliff.yaml at 13

From gmann at ghanshyammann.com  Tue Jan  7 17:16:19 2020
From: gmann at ghanshyammann.com (Ghanshyam Mann)
Date: Tue, 07 Jan 2020 11:16:19 -0600
Subject: [qa][infra][stable] Stable branches gate status: tempest-full-* jobs failing for stable/ocata|pike|queens
Message-ID: <16f8101d6ea.be1780a3214520.3007727257147254758@ghanshyammann.com>

Hello Everyone,

tempest-full-* jobs are failing on stable/queens, stable/pike, and
stable/ocata (legacy-tempest-dsvm-neutron-full-ocata) [1]. Please hold any
recheck until the fix is merged.

whoami-rajat reported the tempest-full-queens-py3 job failure, and while
debugging we found that the same jobs are failing for pike and ocata (the job
name there is legacy-tempest-dsvm-neutron-full-ocata). The failure is a
"Timeout on connecting the vnc console url", because the 'n-cauth' service,
which is still required on these stable branches, is not running. In Ussuri
that service has been removed from nova.

'n-cauth' was removed from ENABLED_SERVICES recently in
https://review.opendev.org/#/c/700217/ , which affected only the stable
branches up to queens. stable/rocky|stein keep working because there we have
moved the service enablement from devstack-gate's test matrix to the devstack
base job [2]. Patch [2] was not backported to stable/queens and stable/pike,
and I am not sure why.

We have two ways to fix the stable branch gates:

1. Re-enable n-cauth in devstack-gate. Hopefully none of the other removed
   services cause a problem.
   pros: easy to fix, fixes all three stable branches.
   patch - https://review.opendev.org/#/c/701404/

2. Backport 546765 [2] to stable/queens and stable/pike.
   pros: this removes the dependency on the test matrix, which is the overall
   goal of removing the d-g dependency.
   cons: it cannot be backported to stable/ocata, as there are no zuulv3 base
   jobs there. That branch is already EM; does anyone still care about it?

I think for fixing the gate (Tempest master and stable/queens|pike|ocata) we
can go with option 1 and later backport the devstack migration.
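For anyone who wants to reproduce or verify this locally on one of the
affected branches, the missing piece is only the console auth service; a
minimal local.conf sketch (assuming a devstack checkout of the corresponding
stable branch):

  [[local|localrc]]
  # nova-consoleauth (n-cauth) is still required by queens/pike/ocata but is
  # no longer part of devstack's default ENABLED_SERVICES
  enable_service n-cauth

Zuulv3-native jobs (rocky onwards) can pin the same thing through the devstack
base job's devstack_services variable instead of the devstack-gate test
matrix, which is roughly what option 2 amounts to.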
[1] - http://zuul.openstack.org/builds?job_name=tempest-full-queens-py3 - http://zuul.openstack.org/builds?job_name=tempest-full-pike - http://zuul.openstack.org/builds?job_name=legacy-tempest-dsvm-neutron-full-ocata - reported bug - https://bugs.launchpad.net/devstack/+bug/1858666 [2] https://review.opendev.org/#/c/546765/ -gmann From rosmaita.fossdev at gmail.com Tue Jan 7 19:11:40 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 7 Jan 2020 14:11:40 -0500 Subject: [cinder] first meeting of 2020 tomorrow (8 January) Message-ID: <9fc10836-8805-daca-10f9-a615a152949f@gmail.com> Just wanted to send a quick reminder that the first Cinder team meeting of 2020 well be held tomorrow (8 January) on the usual day (Wednesday) at the usual time (1400 UTC) and in the usual place (#openstack-meeting-4). https://etherpad.openstack.org/p/cinder-ussuri-meetings See you there! brian From aj at suse.com Tue Jan 7 19:50:51 2020 From: aj at suse.com (Andreas Jaeger) Date: Tue, 7 Jan 2020 20:50:51 +0100 Subject: [infra] Retire x/dox Message-ID: <30c7199c-fc6c-7f02-d764-4e51ee9a9cfd@suse.com> The x/dox repo is unused, let's retire it. I'll put up changes with topic "retire-dox", Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From ahmed.zaky.abdallah at gmail.com Tue Jan 7 20:20:37 2020 From: ahmed.zaky.abdallah at gmail.com (Ahmed ZAKY) Date: Tue, 7 Jan 2020 21:20:37 +0100 Subject: VM boot volume disappears from compute's multipath Daemon when NetApp controller is placed offline Message-ID: I have a setup where each VM gets assigned two vDisks, one encrypted boot volume and another storage volume. Storage used is NetApp (tripleo-netapp). With two controllers on NetApp side working in active/ active mode. My test case goes as follows: - I stop one of the active controllers. - I stop one of my VMs using OpenStack server stop - I then start my VM one more time using OpenStack server start. - VM fails to start. 
Here're my findings, hope someone can help explain the behaviour seen below:

My VM: vel1bgw01-MCM2, it is running on compute overcloud-sriovperformancecompute-3.localdomain

[root at overcloud-controller-0 (vel1asbc01) cbis-admin]# openstack server show vel1bgw01-MCM2
+--------------------------------------+------------------------------------------------------------+
| Field                                | Value                                                      |
+--------------------------------------+------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                     |
| OS-EXT-AZ:availability_zone          | zone1                                                      |
| OS-EXT-SRV-ATTR:host                 | overcloud-sriovperformancecompute-3.localdomain            |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | overcloud-sriovperformancecompute-3.localdomain            |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000e93                                          |
| OS-EXT-STS:power_state               | Running                                                    |
| OS-EXT-STS:task_state                | None                                                       |
| OS-EXT-STS:vm_state                  | active                                                     |
| OS-SRV-USG:launched_at               | 2019-12-18T15:49:37.000000                                 |
| OS-SRV-USG:terminated_at             | None                                                       |
| accessIPv4                           |                                                            |
| accessIPv6                           |                                                            |
| addresses                            | SBC01_MGW01_TIPC=192.168.48.22; SBC01_MGW01_DATAPATH_MATE=192.168.16.11; SBC01_MGW01_DATAPATH=192.168.32.8 |
| config_drive                         | True                                                       |
| created                              | 2019-12-18T15:49:16Z                                       |
| flavor                               | SBC_MCM (asbc_mcm)                                         |
| hostId                               | 7886df0f7a3d4e131304a8eb860e6a704c5fda2a7ed751b544ff2bf5   |
| id                                   | 5c70a984-89a9-44ce-876d-9e2e568eb819                       |
| image                                |                                                            |
| key_name                             | CBAM-b5fd59a066e8450ca9f104a69da5a043-Keypair              |
| name                                 | vel1bgw01-MCM2                                             |
| os-extended-volumes:volumes_attached | [{u'id': u'717e5744-4786-42dc-9e3e-3c5e6994c482'}, {u'id': u'd6cf0cf9-36d1-4b62-86b4-faa4a6642166'}] |
| progress                             | 0                                                          |
| project_id                           | 41777c6f1e7b4f8d8fd76b5e0f67e5e8                           |
| properties                           |                                                            |
| security_groups                      | [{u'name': u'vel1bgw01-TIPC-Security-Group'}]              |
| status                               | ACTIVE                                                     |
| updated                              | 2020-01-07T17:18:32Z                                       |
| user_id                              | be13deba85794016a00fec9d18c5d7cf                           |
+--------------------------------------+------------------------------------------------------------+

*It is mapped to the following vDisks (seen using virsh list on compute-3)*:

- dm-uuid-mpath-3600a098000d9818b0000185c5dfa0714 -> Boot Volume
- dm-uuid-mpath-3600a098000d9818b000018565dfa069e -> Storage volume

717e5744-4786-42dc-9e3e-3c5e6994c482
d6cf0cf9-36d1-4b62-86b4-faa4a6642166
Name: crypt-dm-uuid-mpath-3600a098000d9818b0000185c5dfa0714
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 0
Major, minor: 253, 5
Number of targets: 1
UUID: CRYPT-LUKS1-769cc20bc5af469c8c9075a2a6fc4aa0-crypt-dm-uuid-mpath-*3600a098000d9818b0000185c5dfa0714*

Name: *mpathpy*
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 32
Major, minor: 253, *4*
Number of targets: 1
UUID: mpath-*3600a098000d9818b0000185c5dfa0714*

Name: crypt-dm-uuid-mpath-3600a098000d9818b000018565dfa069e
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 0
Major, minor: 253, 7
Number of targets: 1
UUID: CRYPT-LUKS1-4015c585a0df4074821ca312c4caacca-crypt-dm-uuid-mpath-*3600a098000d9818b000018565dfa069e*

Name: *mpathpz*
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 28
Major, minor: 253, *6*
Number of targets: 1
UUID: mpath-*3600a098000d9818b000018565dfa069e*

This means boot volume is represented by dm-4 while storage volume is represented by dm-6.

Dumping the multipath daemon on the controller shows that at a steady running state both DMs are accounted for (see below).

multipathd> show maps
name    sysfs uuid
*mpathpy dm-4  3600a098000d9818b0000185c5dfa0714*
*mpathpz dm-6  3600a098000d9818b000018565dfa069e*
mpathqi dm-12 3600a098000d9818b000018df5dfafd40
mpathqj dm-13 3600a098000d9818b000018de5dfafd10
mpathpw dm-0  3600a098000d9818b000018425dfa059f
mpathpx dm-1  3600a098000d9818b0000184c5dfa05fc
mpathqk dm-16 3600a098000d9818b000018eb5dfafe80
mpathql dm-17 3600a098000d9818b000018e95dfafe26
mpathqh dm-9  3600a098000d9818b000018c65dfafa91

These vDisks are mapped to the following multipaths:

multipathd> show topology
mpathpy (3600a098000d9818b0000185c5dfa0714) dm-4 NETAPP ,INF-01-00
size=21G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=14 status=active
| |- 30:0:0:82 sdm 8:192 active ready running
| `- 32:0:0:82 sdk 8:160 active ready running
`-+- policy='service-time 0' prio=0 status=enabled
  |- 33:0:0:82 sdn 8:208 failed faulty running
  `- 31:0:0:82 sdl 8:176 failed faulty running
mpathpz (3600a098000d9818b000018565dfa069e) dm-6 NETAPP ,INF-01-00
size=10G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=14 status=active
| |- 30:0:0:229 sdr 65:16 active ready running
| `- 32:0:0:229 sdp 8:240 active ready running
`-+- policy='service-time 0' prio=0 status=enabled
  |- 31:0:0:229 sdo 8:224 failed faulty running
  `- 33:0:0:229 sdq 65:0 failed faulty running

Now,
it starts getting very interesting: if I shut down controller-A from the NetApp side, dm-4 disappears, while dm-6 keeps running and correctly detects that the active path is now controller-B and the standby path is controller-A, which is displayed as failed:

multipathd> show topology
mpathpz (3600a098000d9818b000018565dfa069e) dm-6 NETAPP ,INF-01-00
size=10G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| |- 30:0:0:229 sdr 65:16 failed faulty running
| `- 32:0:0:229 sdp 8:240 failed faulty running
`-+- policy='service-time 0' prio=11 status=active
  |- 31:0:0:229 sdo 8:224 active ready running
  `- 33:0:0:229 sdq 65:0 active ready running

multipathd> show maps
name    sysfs uuid
*mpathpz dm-6  3600a098000d9818b000018565dfa069e*
mpathqi dm-12 3600a098000d9818b000018df5dfafd40
mpathqj dm-13 3600a098000d9818b000018de5dfafd10
mpathpw dm-0  3600a098000d9818b000018425dfa059f
mpathpx dm-1  3600a098000d9818b0000184c5dfa05fc
mpathqk dm-16 3600a098000d9818b000018eb5dfafe80
mpathql dm-17 3600a098000d9818b000018e95dfafe26
mpathqg dm-8  3600a098000d9818b000018c75dfafac0
mpathqh dm-9  3600a098000d9818b000018c65dfafa91

If I restore controller-A into service from the NetApp side and instead fail only the paths to controller-A from multipathd, everything works fine: dm-4 is still present and the VM can be put into service.

multipathd> fail path sdk
ok
multipathd> fail path sdm
ok
mpathpy (3600a098000d9818b0000185c5dfa0714) dm-4 NETAPP ,INF-01-00
size=21G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| |- 32:0:0:82 sdk 8:160 failed faulty running
| `- 30:0:0:82 sdm 8:192 failed faulty running
`-+- policy='service-time 0' prio=9 status=active
  |- 31:0:0:82 sdl 8:176 active ready running
  `- 33:0:0:82 sdn 8:208 active ready running
multipathd> reinstate path sdk
ok
multipathd> reinstate path sdm
ok
mpathpy (3600a098000d9818b0000185c5dfa0714) dm-4 NETAPP ,INF-01-00
size=21G features='4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handle' hwhandler='1 rdac' wp=rw
|-+- policy='service-time 0' prio=14 status=active
| |- 32:0:0:82 sdk 8:160 active ready running
| `- 30:0:0:82 sdm 8:192 active ready running
`-+- policy='service-time 0' prio=9 status=enabled
  |- 31:0:0:82 sdl 8:176 active ready running
  `- 33:0:0:82 sdn 8:208 active ready running

It is observed that in the working case the storage volume disappears (which seems normal), and the instance also vanishes completely from the virsh list; no trace can be found at the KVM level if we run ps -def | grep fd | grep . However, the boot volume is always present in the multipathd records when we stop the VM under normal conditions without stopping a NetApp controller.

Any ideas?

Kind regards,
Ahmed
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From openstack at nemebean.com Tue Jan 7 21:59:57 2020 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 7 Jan 2020 15:59:57 -0600 Subject: [neutron][rabbitmq][oslo] Neutron-server service shows deprecated "AMQPDeprecationWarning" In-Reply-To: <4D3B074F-09F2-48BE-BD61-5D34CBFE509E@dkrz.de> References: <4D3B074F-09F2-48BE-BD61-5D34CBFE509E@dkrz.de> Message-ID: <294c93b5-0ddc-284b-34a1-ffce654ba047@nemebean.com> On 1/7/20 9:14 AM, Amjad Kotobi wrote: > Hi, > > Today we are facing losing connection of neutron especially during > instance creation or so as “systemctl status neutron-server” shows below > message > > be deprecated in amqp 2.2.0. > Since amqp 2.0 you have to explicitly call Connection.connect() > before using the connection. > W_FORCE_CONNECT.format(attr=attr))) > /usr/lib/python2.7/site-packages/amqp/connection.py:304: > AMQPDeprecationWarning: The .transport attribute on the connection was > accessed before > the connection was established.  This is supported for now, but will > be deprecated in amqp 2.2.0. > Since amqp 2.0 you have to explicitly call Connection.connect() > before using the connection. > W_FORCE_CONNECT.format(attr=attr))) It looks like this is a red herring, but it should be fixed in the current oslo.messaging pike release. See [0] and the related bug. 0: https://review.opendev.org/#/c/605324/ > > OpenStack release which we are running is “Pike”. > > Is there any way to remedy this? I don't think this should be a fatal problem in and of itself so I suspect it's masking something else. However, I would recommend updating to the latest pike release of oslo.messaging where the deprecated feature is not used. If that doesn't fix the problem, please send us whatever errors remain after this one is eliminated. > > Thanks > Amjad From Arkady.Kanevsky at dell.com Tue Jan 7 23:17:25 2020 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Tue, 7 Jan 2020 23:17:25 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <7b55e3b28d644492a846fdb10f7b127b@AUSX13MPS308.AMER.DELL.COM> Message-ID: <582d2544d3d74fe7beef50aaaa35d558@AUSX13MPS308.AMER.DELL.COM> Excellent points Julia. It is hard to image that any production env of any customer will allow anybody but administrator to update FW on any device at any time. The security implication are huge. Cheers, Arkady -----Original Message----- From: Julia Kreger Sent: Monday, January 6, 2020 3:33 PM To: Kanevsky, Arkady Cc: Zhipeng Huang; openstack-discuss Subject: Re: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management [EXTERNAL EMAIL] Greetings Arkady, I think your message makes a very good case and raises a point that I've been trying to type out for the past hour, but with only different words. We have multiple USER driven interactions with a similarly desired, if not the exact same desired end result where different paths can be taken, as we perceive use cases from "As a user, I would like a VM with a configured accelerator", "I would like any compute resource (VM or Baremetal), with a configured accelerator", to "As an administrator, I need to reallocate a baremetal node for this different use, so my user can leverage its accelerator once they know how and are ready to use it.", and as suggested "I as a user want baremetal with k8s and configured accelerators." And I suspect this diversity of use patterns is where things begin to become difficult. 
As such I believe, we in essence, have a question of a support or compatibility matrix that definitely has gaps depending on "how" the "user" wants or needs to achieve their goals. And, I think where this entire discussion _can_ go sideways is... (from what I understand) some of these devices need to be flashed by the application user with firmware on demand to meet the user's needs, which is where lifecycle and support interactions begin to become... conflicted. Further complicating matters is the "Metal to Tenant" use cases where the user requesting the machine is not an administrator, but has some level of inherent administrative access to all Operating System accessible devices once their OS has booted. Which makes me wonder "What if the cloud administrators WANT to block the tenant's direct ability to write/flash firmware into accelerator/smartnic/etc?" I suspect if cloud administrators want to block such hardware access, vendors will want to support such a capability. Blocking such access inherently forces some actions into hardware management/maintenance workflows, and may ultimately may cause some of a support matrix's use cases to be unsupportable, again ultimately depending on what exactly the user is attempting to achieve. Going back to the suggestions in the original email, They seem logical to me in terms of the delineation and separation of responsibilities as we present a cohesive solution the users of our software. Greetings Zhipeng, Is there any documentation at present that details the desired support and use cases? I think this would at least help my understanding, since everything that requires the power to be on would still need to be integrated with-in workflows for eventual tighter integration. Also, has Cyborg drafted any plans or proposals for integration? -Julia On Mon, Jan 6, 2020 at 9:14 AM wrote: > > Zhipeng, > > Thanks for quick feedback. > > Where is accelerating device is running? I am aware of 3 possibilities: servers, storage, switches. > > In each one of them the device is managed as part of server, storage box or switch. > > > > The core of my message is separation of device life cycle management in the “box” where it is placed, from the programming the device as needed per application (VM, container). > > > > Thanks, > Arkady > > > > From: Zhipeng Huang > Sent: Friday, January 3, 2020 7:53 PM > To: Kanevsky, Arkady > Cc: OpenStack Discuss > Subject: Re: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] > accelerators management > > > > [EXTERNAL EMAIL] > > Hi Arkady, > > > > Thanks for your interest in Cyborg project :) I would like to point out that when we initiated the project there are two specific use cases we want to cover: the accelerators attached locally (via PCIe or other bus type) or remotely (via Ethernet or other fabric type). > > > > For the latter one, it is clear that its life cycle is independent from the server (like block device managed by Cinder). For the former one however, its life cycle is not dependent on server for all kinds of accelerators either. For example we already have PCIe based AI accelerator cards or Smart NICs that could be power on/off when the server is on all the time. > > > > Therefore it is not a good idea to move all the life cycle management part into Ironic for the above mentioned reasons. Ironic integration is very important for the standalone usage of Cyborg for Kubernetes, Envoy (TLS acceleration) and others alike. 
> > > > Hope this answers your question :) > > > > On Sat, Jan 4, 2020 at 5:23 AM wrote: > > Fellow Open Stackers, > > I have been thinking on how to handle SmartNICs, GPUs, FPGA handling across different projects within OpenStack with Cyborg taking a leading role in it. > > > > Cyborg is important project and address accelerator devices that are part of the server and potentially switches and storage. > > It is address 3 different use cases and users there are all grouped into single project. > > > > Application user need to program a portion of the device under management, like GPU, or SmartNIC for that app usage. Having a common way to do it across different device families and across different vendor is very important. And that has to be done every time a VM is deploy that need usage of a device. That is tied with VM scheduling. > Administrator need to program the whole device for specific usage. That covers the scenario when device can only support single tenant or single use case. That is done once during OpenStack deployment but may need reprogramming to configure device for different usage. May or may not require reboot of the server. > Administrator need to setup device for its use, like burning specific FW on it. This is typically done as part of server life-cycle event. > > > > The first 2 cases cover application life cycle of device usage. > > The last one covers device life cycle independently how it is used. > > > > Managing life cycle of devices is Ironic responsibility, One cannot and should not manage lifecycle of server components independently. Managing server devices outside server management violates customer service agreements with server vendors and breaks server support agreements. > > Nova and Neutron are getting info about all devices and their capabilities from Ironic; that they use for scheduling. We should avoid creating new project for every new component of the server and modify nova and neuron for each new device. (the same will also apply to cinder and manila if smart devices used in its data/control path on a server). > > Finally we want Cyborg to be able to be used in standalone capacity, say for Kubernetes. > > > > Thus, I propose that Cyborg cover use cases 1 & 2, and Ironic would cover use case 3. > > Thus, move all device Life-cycle code from Cyborg to Ironic. > > Concentrate Cyborg of fulfilling the first 2 use cases. > > Simplify integration with Nova and Neutron for using these accelerators to use existing Ironic mechanism for it. > > Create idempotent calls for use case 1 so Nova and Neutron can use it as part of VM deployment to ensure that devices are programmed for VM under scheduling need. > > Create idempotent call(s) for use case 2 for TripleO to setup device for single accelerator usage of a node. > > [Propose similar model for CNI integration.] > > > > Let the discussion start! 
> > > > Thanks., > Arkady > > > > > -- > > Zhipeng (Howard) Huang > > > > Principle Engineer > > OpenStack, Kubernetes, CNCF, LF Edge, ONNX, Kubeflow, OpenSDS, Open > Service Broker API, OCP, Hyperledger, ETSI, SNIA, DMTF, W3C > > From fungi at yuggoth.org Tue Jan 7 23:51:39 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 7 Jan 2020 23:51:39 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: <582d2544d3d74fe7beef50aaaa35d558@AUSX13MPS308.AMER.DELL.COM> References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <7b55e3b28d644492a846fdb10f7b127b@AUSX13MPS308.AMER.DELL.COM> <582d2544d3d74fe7beef50aaaa35d558@AUSX13MPS308.AMER.DELL.COM> Message-ID: <20200107235139.2l5iw2fumgsfoz5u@yuggoth.org> On 2020-01-07 23:17:25 +0000 (+0000), Arkady.Kanevsky at dell.com wrote: > It is hard to image that any production env of any customer will > allow anybody but administrator to update FW on any device at any > time. The security implication are huge. [...] I thought this was precisely the point of exposing FPGA hardware into server instances. Or do you not count programming those as "updating firmware?" -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From aj at suse.com Wed Jan 8 08:35:36 2020 From: aj at suse.com (Andreas Jaeger) Date: Wed, 8 Jan 2020 09:35:36 +0100 Subject: [infra] Retire openstack/js-openstack-lib repository Message-ID: The js-openstack-lib repository is orphaned and has not seen any real merges or contributions since February 2017, I propose to retire it. I'll send retirement changes using topic retire-js-openstack-lib, Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From radoslaw.piliszek at gmail.com Wed Jan 8 09:21:06 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 8 Jan 2020 10:21:06 +0100 Subject: [infra] Retire openstack/js-openstack-lib repository In-Reply-To: References: Message-ID: Are there any alternatives? I would be glad to pick this up because I planned some integrations like this on my own. -yoctozepto śr., 8 sty 2020 o 09:48 Andreas Jaeger napisał(a): > > The js-openstack-lib repository is orphaned and has not seen any real > merges or contributions since February 2017, I propose to retire it. > > I'll send retirement changes using topic retire-js-openstack-lib, > > Andreas > -- > Andreas Jaeger aj at suse.com Twitter: jaegerandi > SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg > (HRB 36809, AG Nürnberg) GF: Felix Imendörffer > GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB > From aj at suse.com Wed Jan 8 09:26:14 2020 From: aj at suse.com (Andreas Jaeger) Date: Wed, 8 Jan 2020 10:26:14 +0100 Subject: [infra] Retire openstack/js-openstack-lib repository In-Reply-To: References: Message-ID: On 08/01/2020 10.21, Radosław Piliszek wrote: > Are there any alternatives? > I would be glad to pick this up because I planned some integrations > like this on my own. If you want to pick this up, best discuss with Clark as Infra PTL. 
We can keep it if there is real interest, Andreas > -yoctozepto > > śr., 8 sty 2020 o 09:48 Andreas Jaeger napisał(a): >> >> The js-openstack-lib repository is orphaned and has not seen any real >> merges or contributions since February 2017, I propose to retire it. >> >> I'll send retirement changes using topic retire-js-openstack-lib, >> >> Andreas >> -- >> Andreas Jaeger aj at suse.com Twitter: jaegerandi >> SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg >> (HRB 36809, AG Nürnberg) GF: Felix Imendörffer >> GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB >> -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From radoslaw.piliszek at gmail.com Wed Jan 8 09:32:37 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 8 Jan 2020 10:32:37 +0100 Subject: [infra] Retire openstack/js-openstack-lib repository In-Reply-To: References: Message-ID: Thanks, Andreas. Will do. I thought it might be also wise to preserve this since there are posts now and then that horizon is reaching its limit and a JS lib might be beneficial for any possible replacement (as it can run from the browser). Though I have no idea what the state of this library is. OTOH, quick google search reveals that alternatives do not seem better at the first glance. The only promising one was https://github.com/pkgcloud/pkgcloud but it is not OS-centric and has therefore different goals. -yoctozepto śr., 8 sty 2020 o 10:26 Andreas Jaeger napisał(a): > > On 08/01/2020 10.21, Radosław Piliszek wrote: > > Are there any alternatives? > > I would be glad to pick this up because I planned some integrations > > like this on my own. > > If you want to pick this up, best discuss with Clark as Infra PTL. We > can keep it if there is real interest, > > Andreas > > > -yoctozepto > > > > śr., 8 sty 2020 o 09:48 Andreas Jaeger napisał(a): > >> > >> The js-openstack-lib repository is orphaned and has not seen any real > >> merges or contributions since February 2017, I propose to retire it. > >> > >> I'll send retirement changes using topic retire-js-openstack-lib, > >> > >> Andreas > >> -- > >> Andreas Jaeger aj at suse.com Twitter: jaegerandi > >> SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg > >> (HRB 36809, AG Nürnberg) GF: Felix Imendörffer > >> GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB > >> > > > -- > Andreas Jaeger aj at suse.com Twitter: jaegerandi > SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg > (HRB 36809, AG Nürnberg) GF: Felix Imendörffer > GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From ileixe at gmail.com Wed Jan 8 10:08:59 2020 From: ileixe at gmail.com (=?UTF-8?B?7JaR7Jyg7ISd?=) Date: Wed, 8 Jan 2020 19:08:59 +0900 Subject: [neutron][ironic] dynamic routing protocol status for routed network Message-ID: Hi, For ironic flat provider network, I found it's hard to scale manually (for many racks). So I'm trying to use routed network for the purpose. One blur thing for routed network is how we can handle segment connectivity. >From reference (https://docs.openstack.org/neutron/latest/admin/config-routed-networks.html), there is future work about dynamic routing protocol though, I could not find any hints for the functionality. 
How do you guys using routed network handle the segments' connectivity? Is there any project to speak subnet info via BGP? Thanks in advance. From skatsaounis at admin.grnet.gr Wed Jan 8 11:25:08 2020 From: skatsaounis at admin.grnet.gr (Stamatis Katsaounis) Date: Wed, 8 Jan 2020 11:25:08 +0000 Subject: [charms][watcher] OpenStack Watcher Charm Message-ID: <159661b1-7edf-e55d-c7b9-cf3b97bffffb@admin.grnet.gr> Hi all, Purpose of this email is to let you know that we released an unofficial charm of OpenStack Watcher [1]. This charm gave us the opportunity to deploy OpenStack Watcher to our charmed OpenStack deployment. After seeing value in it, we decided to publish it through GRNET GitHub Organization account for several reasons. First of all, we would love to get feedback on it as it is our first try on creating an OpenStack reactive charm. Secondly, we would be glad to see other OpenStack operators deploy Watcher and share with us knowledge on the project and possible use cases. Finally, it would be ideal to come up with an official OpenStack Watcher charm repository under charmers umbrella. By doing this, another OpenStack project is going to be available not only for Train version but for any future version of OpenStack. Most important, the CI tests are going to ensure that the code is not broken and persuade other operators to use it. Before closing my email, I would like to give some insight on the architecture of the code base and the deployment process. To begin with, charm-watcher is based on other reactive OpenStack charms. During its deployment Barbican, Designate, Octavia and other charms' code bases were counseled. Furthermore, the structure is the same as any official OpenStack charm, of course without functional tests, which is something we cannot provide. Speaking about the deployment process, apart from having a basic charmed OpenStack deployment, operator has to change two tiny configuration options on Nova cloud controller and Cinder. As explained in the Watcher configuration guide, special care has to be done with Oslo notifications for Nova and Cinder [2]. In order to achieve that in charmed OpenStack some issues were met and solved with the following patches [3], [4], [5], [6]. With these patches, operator can set the extra Oslo configuration and this is the only extra configuration needs to take place. Finally, with [7] Keystone charm can accept a relation with Watcher charm instead of ignoring it. To be able to deploy GRNET Watcher charm on Train, patches [3], [4], [5] and [7] have to be back-ported to stable/19.10 branch but that will require the approval of charmers team. Please let me know if such an option is available and in that case I am going to open the relevant patches. Furthermore, if you think that it could be a good option to create a spec and then introduce an official Watcher charm, I would love to help on that. I wish all a happy new year and I am looking forward to your response and possible feedback. PS. If we could have an Ubuntu package for watcher-dashboard [8] like octavia-dashboard [9] we would release a charm for it as well. 
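PS2. To make the "two tiny configuration options" mentioned above concrete:
what the patched charms ultimately need to render on the Nova cloud controller
(and, for the driver part, on Cinder) is along these lines. This is only a
sketch; the authoritative option list is in the Watcher configuration guide [2].

  # nova.conf
  [notifications]
  notify_on_state_change = vm_and_task_state
  [oslo_messaging_notifications]
  driver = messagingv2

  # cinder.conf
  [oslo_messaging_notifications]
  driver = messagingv2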
Best regards, Stamatis Katsaounis [1] https://github.com/grnet/charm-watcher [2] https://docs.openstack.org/watcher/latest/configuration/configuring.html#configure-nova-notifications [3] https://review.opendev.org/#/c/699079/ [4] https://review.opendev.org/#/c/699081/ [5] https://review.opendev.org/#/c/699657/ [6] https://github.com/juju/charm-helpers/pull/405 [7] https://review.opendev.org/#/c/699082/ [8] https://github.com/openstack/watcher-dashboard [9] https://launchpad.net/ubuntu/+source/octavia-dashboard -- [cid:part10.101270FA.EDBB3A4A at admin.grnet.gr] Stamatis Katsaounis DevOps Engineer t :(+30) 210 7471130 (ext. 483) f : + 30 210 7474490 GRNET | Networking Research and Education www.grnet.gr | 7, Kifisias Av., 115 23, Athens [Facebook icon] [Twitter icon] [Youtube icon] [LinkedIn icon] [Instagram icon] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: elggckbbbdlhkdig.png Type: image/png Size: 4948 bytes Desc: elggckbbbdlhkdig.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: bopemjoginlmgoci.png Type: image/png Size: 645 bytes Desc: bopemjoginlmgoci.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: gpjbaodeiokhnjgn.png Type: image/png Size: 661 bytes Desc: gpjbaodeiokhnjgn.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: oichedjkimmmjbge.png Type: image/png Size: 738 bytes Desc: oichedjkimmmjbge.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: edpopdhgfeakkkhe.png Type: image/png Size: 716 bytes Desc: edpopdhgfeakkkhe.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ajmnellhaofbfpnj.png Type: image/png Size: 653 bytes Desc: ajmnellhaofbfpnj.png URL: From moreira.belmiro.email.lists at gmail.com Wed Jan 8 12:20:06 2020 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Wed, 8 Jan 2020 13:20:06 +0100 Subject: [largescale-sig] Meeting summary and next actions In-Reply-To: <06e5f16f-dfa4-8189-da7b-ad2250df8125@openstack.org> References: <3c3a6232-9a3b-d240-ab82-c7ac4997f5c0@openstack.org> <06e5f16f-dfa4-8189-da7b-ad2250df8125@openstack.org> Message-ID: Hi Thierry, all, I'm OK with both dates. If you agree to keep the meeting on January 15 I can chair it. cheers, Belmiro On Mon, Jan 6, 2020 at 3:41 PM Thierry Carrez wrote: > Thierry Carrez wrote: > > [...] > > The next meeting will happen on January 15, at 9:00 UTC on > > #openstack-meeting. > > Oops, some unexpected travel came up and I won't be available to chair > the meeting on that date. We can either: > > 1- keep the meeting, with someone else chairing. I can help with posting > the agenda before and the summary after, just need someone to start the > meeting and lead it -- any volunteer? > > 2- move the meeting to January 22, but we may lose Chinese participants > to new year preparations... > > Thoughts? > > -- > Thierry Carrez (ttx) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.carpen at cineca.it Wed Jan 8 13:01:35 2020 From: m.carpen at cineca.it (mcarpene) Date: Wed, 8 Jan 2020 14:01:35 +0100 Subject: OIDC/OAuth2 token introspection in Keystone Message-ID: Hi all, my question is: could OS Keystone support OIDC/OAuth2 token introspection/validation. 
I mean for example executing a swift command via CLI adding a OIDC token bearer as a parameter to the swift command. In this case Keystone should validate the OIDC token towards and external IdP (using introspection endpoint/protocol for oidc). Is this currently supported, or eventually would be done in the near future? thanks Michele -- Michele Carpené SuperComputing Applications and Innovation Department CINECA - via Magnanelli, 6/3, 40033 Casalecchio di Reno (Bologna) - ITALY Tel: +39 051 6171730 Fax: +39 051 6132198 Skype: mcarpene http://www.hpc.cineca.it/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knikolla at bu.edu Wed Jan 8 15:28:14 2020 From: knikolla at bu.edu (Nikolla, Kristi) Date: Wed, 8 Jan 2020 15:28:14 +0000 Subject: OIDC/OAuth2 token introspection in Keystone In-Reply-To: References: Message-ID: <338A6D25-9DBF-492D-A94C-14E4A311FBE7@bu.edu> Hi Michele, We just approved a feature request for that [0], however it was merged to backlog, meaning no specific timeline for it being implemented yet. With the current implementation, you can use OAuth 2.0 Access Tokens with Keystone, however the token introspection endpoint will be used, therefore only the claims contained in the access token will be returned. I am assuming your question is with regards to the userinfo endpoint and OIDC claims, which we do not currently support. [0]. https://review.opendev.org/#/c/373983/ On Jan 8, 2020, at 8:01 AM, mcarpene > wrote: Hi all, my question is: could OS Keystone support OIDC/OAuth2 token introspection/validation. I mean for example executing a swift command via CLI adding a OIDC token bearer as a parameter to the swift command. In this case Keystone should validate the OIDC token towards and external IdP (using introspection endpoint/protocol for oidc). Is this currently supported, or eventually would be done in the near future? thanks Michele -- Michele Carpené SuperComputing Applications and Innovation Department CINECA - via Magnanelli, 6/3, 40033 Casalecchio di Reno (Bologna) - ITALY Tel: +39 051 6171730 Fax: +39 051 6132198 Skype: mcarpene http://www.hpc.cineca.it/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knikolla at bu.edu Wed Jan 8 15:43:31 2020 From: knikolla at bu.edu (Nikolla, Kristi) Date: Wed, 8 Jan 2020 15:43:31 +0000 Subject: OIDC/OAuth2 token introspection in Keystone In-Reply-To: <099190af-4c18-ce7a-3cb4-6e2ee033a07c@cineca.it> References: <338A6D25-9DBF-492D-A94C-14E4A311FBE7@bu.edu> <099190af-4c18-ce7a-3cb4-6e2ee033a07c@cineca.it> Message-ID: <3AFA4803-A34B-4ACE-96CC-63C2A4186922@bu.edu> There is an patch to improve the documentation for using the CLI with OIDC, but it hasn't merged yet. See here https://review.opendev.org/#/c/693838 Keystoneauth has plugins in place for authenticating with the OIDC IdP in multiple ways, including using an access token, see here https://github.com/openstack/keystoneauth/blob/master/keystoneauth1/identity/v3/oidc.py Best, Kristi On Jan 8, 2020, at 10:31 AM, mcarpene > wrote: Many thanks Nikolla, I was able to federate using OIDC IdP via the dashboard. I meant the problem is authenticating via CLI providing a OIDC token via command line, but maybe you already answered to my request. BR, Michele On 08/01/20 16:28, wrote: Hi Michele, We just approved a feature request for that [0], however it was merged to backlog, meaning no specific timeline for it being implemented yet. 
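For the CLI part of your question, keystoneauth already ships OpenID Connect
auth plugins, so you can pass a bearer token on the command line today. A rough
sketch (the identity provider and protocol names below are placeholders for
whatever you registered in Keystone):

  openstack --os-auth-type v3oidcaccesstoken \
      --os-auth-url https://keystone.example.org:5000/v3 \
      --os-identity-provider myidp \
      --os-protocol openid \
      --os-access-token "$OIDC_ACCESS_TOKEN" \
      --os-project-name myproject \
      --os-project-domain-name Default \
      token issue

The Keystone token that is returned can then be used with swift and the other
clients as usual.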
With the current implementation, you can use OAuth 2.0 Access Tokens with Keystone, however the token introspection endpoint will be used, therefore only the claims contained in the access token will be returned. I am assuming your question is with regards to the userinfo endpoint and OIDC claims, which we do not currently support. [0]. https://review.opendev.org/#/c/373983/ On Jan 8, 2020, at 8:01 AM, mcarpene > wrote: Hi all, my question is: could OS Keystone support OIDC/OAuth2 token introspection/validation. I mean for example executing a swift command via CLI adding a OIDC token bearer as a parameter to the swift command. In this case Keystone should validate the OIDC token towards and external IdP (using introspection endpoint/protocol for oidc). Is this currently supported, or eventually would be done in the near future? thanks Michele -- Michele Carpené SuperComputing Applications and Innovation Department CINECA - via Magnanelli, 6/3, 40033 Casalecchio di Reno (Bologna) - ITALY Tel: +39 051 6171730 Fax: +39 051 6132198 Skype: mcarpene http://www.hpc.cineca.it/ -- Michele Carpené SuperComputing Applications and Innovation Department CINECA - via Magnanelli, 6/3, 40033 Casalecchio di Reno (Bologna) - ITALY Tel: +39 051 6171730 Fax: +39 051 6132198 Skype: mcarpene http://www.hpc.cineca.it/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From shubjero at gmail.com Wed Jan 8 16:25:35 2020 From: shubjero at gmail.com (shubjero) Date: Wed, 8 Jan 2020 11:25:35 -0500 Subject: Compute node NIC bonding for increased instance throughput Message-ID: Good day, I have a question for the OpenStack community, hopefully someone can help me out here. Goal ------------------ Provision an NFS instance capable of providing 20Gbps of network throughput to be used by multiple other instances within the same project/network. Background ------------------ We run an OpenStack Stein cluster on Ubuntu 18.04. Our Neutron architecture is using openvswitch and GRE. Our compute nodes have two 10G NIC's and are configured in a layer3+4 LACP to the Top of Rack switch. Observations ------------------ Successfully see 20Gbps of traffic balanced across both slaves in the bond when performing iperf3 tests at the *baremetal/os/ubuntu* layer with two other compute nodes as iperf3 clients. Problem ------------------ We are unable to achieve 20Gbps at the instance level. We have tried multiple iperf3 connections from multiple other instances on different compute nodes and we are only able to reach 10Gbps and notice that traffic is not utilizing both slaves in the bond. One slave gets all of the traffic while the other slave sits basically idle. I have some configuration output here: http://paste.openstack.org/show/QdQq76q6VI1XN5tLW0xH/ Any help would be appreciated! Jared Baker Cloud Architect, Ontario Institute for Cancer Research From Arkady.Kanevsky at dell.com Wed Jan 8 16:31:49 2020 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Wed, 8 Jan 2020 16:31:49 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: <20200107235139.2l5iw2fumgsfoz5u@yuggoth.org> References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <7b55e3b28d644492a846fdb10f7b127b@AUSX13MPS308.AMER.DELL.COM> <582d2544d3d74fe7beef50aaaa35d558@AUSX13MPS308.AMER.DELL.COM> <20200107235139.2l5iw2fumgsfoz5u@yuggoth.org> Message-ID: <2706c21c3f7d4203a8a20342f8f6a68c@AUSX13MPS308.AMER.DELL.COM> Jeremy, Correct. 
programming devices and "updating firmware" I count as separate activities. Similar to CPU or GPU. -----Original Message----- From: Jeremy Stanley Sent: Tuesday, January 7, 2020 5:52 PM To: openstack-discuss at lists.openstack.org Subject: Re: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management On 2020-01-07 23:17:25 +0000 (+0000), Arkady.Kanevsky at dell.com wrote: > It is hard to image that any production env of any customer will allow > anybody but administrator to update FW on any device at any time. The > security implication are huge. [...] I thought this was precisely the point of exposing FPGA hardware into server instances. Or do you not count programming those as "updating firmware?" -- Jeremy Stanley From sinan at turka.nl Wed Jan 8 16:43:06 2020 From: sinan at turka.nl (Sinan Polat) Date: Wed, 8 Jan 2020 17:43:06 +0100 Subject: Compute node NIC bonding for increased instance throughput In-Reply-To: References: Message-ID: Hi Jared, A single stream will utilize just 1 link. Have you tried with multiple streams using different sources? What do you mean with layer3+4. Do you mean the xmit hash policy? Sinan > Op 8 jan. 2020 om 17:25 heeft shubjero het volgende geschreven: > > Good day, > > I have a question for the OpenStack community, hopefully someone can > help me out here. > > Goal > ------------------ > Provision an NFS instance capable of providing 20Gbps of network > throughput to be used by multiple other instances within the same > project/network. > > Background > ------------------ > We run an OpenStack Stein cluster on Ubuntu 18.04. Our Neutron > architecture is using openvswitch and GRE. Our compute nodes have two > 10G NIC's and are configured in a layer3+4 LACP to the Top of Rack > switch. > > Observations > ------------------ > Successfully see 20Gbps of traffic balanced across both slaves in the > bond when performing iperf3 tests at the *baremetal/os/ubuntu* layer > with two other compute nodes as iperf3 clients. > > Problem > ------------------ > We are unable to achieve 20Gbps at the instance level. We have tried > multiple iperf3 connections from multiple other instances on different > compute nodes and we are only able to reach 10Gbps and notice that > traffic is not utilizing both slaves in the bond. One slave gets all > of the traffic while the other slave sits basically idle. > > I have some configuration output here: > http://paste.openstack.org/show/QdQq76q6VI1XN5tLW0xH/ > > Any help would be appreciated! > > Jared Baker > Cloud Architect, Ontario Institute for Cancer Research > From tpb at dyncloud.net Wed Jan 8 16:54:12 2020 From: tpb at dyncloud.net (Tom Barron) Date: Wed, 8 Jan 2020 11:54:12 -0500 Subject: [Manila] First meeting of 2020 Message-ID: <20200108165412.jtzxx425wfzq6um7@barron.net> Hey Zorillas! Just a reminder of our first meeting in 2020, 9 January at 1500 UTC on Freenode #openstack-meeting-alt. Feel free to update the agenda [1]. -- Tom [1] https://wiki.openstack.org/wiki/Manila/Meetings#Next_meeting From pierre at stackhpc.com Wed Jan 8 17:31:10 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Wed, 8 Jan 2020 18:31:10 +0100 Subject: [scientific][www] Unable to download OpenStack for Scientific Research book Message-ID: Hello, I tried to download the book at https://www.openstack.org/science/ but the link doesn't work. Could this please be fixed? I looked on openstack.org for a contact address, but couldn't find one. Please let me know if there is a specific address I should use next time. 
Thanks, Pierre From cboylan at sapwetik.org Wed Jan 8 17:35:04 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 08 Jan 2020 09:35:04 -0800 Subject: [all][infra] Compressed job log artifacts not loading in browser Message-ID: <0e443056-2eb3-4bd2-9d25-6a1b55d214ea@www.fastmail.com> Over the holidays the infra team noticed that some of our release artifacts were in the wrong file format. On further investigation we discovered the reason for this was some swift implementations were inflating compressed tarballs when retrieved by clients not setting accept-encoding: gzip. We then ended up with tar files when we expected .tar.gz format files. This behavior seems to be controlled by the content-encoding we set on object upload. If we tell swift that the object is a gzip'd file on upload then the webservers helpfully inflate them when a client retrieves them. In order to fix our release artifacts we have updated our swift upload tooling to stop setting content-type on gzip files. This forces the swift implementation to return the files in the same format they are uploaded when retrieved. A side effect of this is that any gzip'd files (like testr_results.html.gz) are no longer automatically decompressed for you when you retrieve them. We have fixes up to handle common occurrences of this at https://review.opendev.org/#/c/701578/ and https://review.opendev.org/701282 (note the second case is already handled on OpenDev's Zuul). If you run into other files you expect to be browesable but get compressed instead, the fix is to stop compressing the files explicitly in the job. We will still upload files in compressed form to swift for efficiency but we should operate on these files as if they were uncompressed. Then for files that should be compressed (like .tar.gz files) we can pass them through as is avoiding any format change problems. Clark From mordred at inaugust.com Wed Jan 8 17:39:18 2020 From: mordred at inaugust.com (Monty Taylor) Date: Wed, 8 Jan 2020 12:39:18 -0500 Subject: [infra] Retire openstack/js-openstack-lib repository In-Reply-To: References: Message-ID: <2AA78965-B496-41D9-A41B-DF75694A3EB9@inaugust.com> > On Jan 8, 2020, at 4:32 AM, Radosław Piliszek wrote: > > Thanks, Andreas. Will do. > > I thought it might be also wise to preserve this since there are posts > now and then that horizon is reaching its limit and a JS lib might be > beneficial for any possible replacement (as it can run from the > browser) Said this in IRC, but for the mailing list - I’d be happy to accept it into the SDK project as a deliverable if you wanted to take it on. From what I can tell it does process clouds.yaml files - so it might be a nice way for us to verify good cross-language support for that format. (Should probably also add support for things like os-service-types and the well-known api discovery that have come since this library was last worked on) It would be nice to keep it and move it forward if it’s solid and a thing that’s valuable to people. > Though I have no idea what the state of this library is. OTOH, quick > google search reveals that alternatives do not seem better at the > first glance. > The only promising one was https://github.com/pkgcloud/pkgcloud but it > is not OS-centric and has therefore different goals. > > -yoctozepto > > śr., 8 sty 2020 o 10:26 Andreas Jaeger napisał(a): >> >> On 08/01/2020 10.21, Radosław Piliszek wrote: >>> Are there any alternatives? >>> I would be glad to pick this up because I planned some integrations >>> like this on my own. 
>> >> If you want to pick this up, best discuss with Clark as Infra PTL. We >> can keep it if there is real interest, >> >> Andreas >> >>> -yoctozepto >>> >>> śr., 8 sty 2020 o 09:48 Andreas Jaeger napisał(a): >>>> >>>> The js-openstack-lib repository is orphaned and has not seen any real >>>> merges or contributions since February 2017, I propose to retire it. >>>> >>>> I'll send retirement changes using topic retire-js-openstack-lib, >>>> >>>> Andreas >>>> -- >>>> Andreas Jaeger aj at suse.com Twitter: jaegerandi >>>> SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg >>>> (HRB 36809, AG Nürnberg) GF: Felix Imendörffer >>>> GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB >>>> >> >> >> -- >> Andreas Jaeger aj at suse.com Twitter: jaegerandi >> SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg >> (HRB 36809, AG Nürnberg) GF: Felix Imendörffer >> GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB > > From fungi at yuggoth.org Wed Jan 8 18:47:45 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 8 Jan 2020 18:47:45 +0000 Subject: [scientific][www] Unable to download OpenStack for Scientific Research book In-Reply-To: References: Message-ID: <20200108184745.fs6d4p4udr7deffp@yuggoth.org> On 2020-01-08 18:31:10 +0100 (+0100), Pierre Riteau wrote: > I tried to download the book at https://www.openstack.org/science/ > but the link doesn't work. Could this please be fixed? I've personally reported it to the webmasters for the www.openstack.org site. In the meantime, a bit of searching turns up https://www.openstack.org/assets/science/CrossroadofCloudandHPC.pdf which will redirect to a working copy. As I wrote this, Wes Wilson pointed out to me that there's also a 6x9in "printable" version at https://www.openstack.org/assets/science/CrossroadofCloudandHPC-Print.pdf and preprinted copies for purchase at https://www.amazon.com/dp/1978244703/ if that's more your speed. > I looked on openstack.org for a contact address, but couldn't find > one. Please let me know if there is a specific address I should use > next time. Yes, I know they're working on getting something added. They've generally been relying on E-mails to summitapp at openstack.org or bugs filed at https://bugs.launchpad.net/openstack-org/+filebug but I gather they're creating a support at openstack.org address or something along those lines to mention in page footers on the site soon. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at fried.cc Wed Jan 8 19:45:37 2020 From: openstack at fried.cc (Eric Fried) Date: Wed, 8 Jan 2020 13:45:37 -0600 Subject: [cliff][docs][requirements] new cliff versions causes docs to fail to build In-Reply-To: References: <20191222175308.juzyu6grndfcf2ez@mthode.org> <20200107151521.GA349057@sm-workstation> <20200107152237.GA349707@sm-workstation> Message-ID: <28c1b684-7597-99cf-42bb-4995e0aa9f54@fried.cc> > I suggest we blacklist 2.17.0 I see you did this here [1] (merged) > and issue > a new 2.17.1 or 2.18.0 release post-haste. and this here [2] (open) > That's not all though. For some daft reason, the 'python-rsdclient' > projects has imported argparses 'HelpFormatter' from 'cliff._argparse' > instead of 'argparse'. They need to stop doing this because commit > 584352dcd008d58c433136539b22a6ae9d6c45cc of cliff means this will no > longer work. 
Just import argparse directly. I didn't see an open review for this so I made one [3]. I'm not sure how we should deal with python-rsdclient's cliff req, though. Should we blacklist the bad version in accordance with [1], or remove the req entirely since that was the only thing in the project that referenced it? And should that happen in the same patch or separately? efried [1] https://review.opendev.org/#/c/701406/ [2] https://review.opendev.org/#/c/701405/ [3] https://review.opendev.org/#/c/701599/ From openstack at fried.cc Wed Jan 8 19:53:40 2020 From: openstack at fried.cc (Eric Fried) Date: Wed, 8 Jan 2020 13:53:40 -0600 Subject: [all][infra] Compressed job log artifacts not loading in browser In-Reply-To: <0e443056-2eb3-4bd2-9d25-6a1b55d214ea@www.fastmail.com> References: <0e443056-2eb3-4bd2-9d25-6a1b55d214ea@www.fastmail.com> Message-ID: Since it was not obvious to me, and thus might not be obvious to others, this > If you run into other files you expect to be browesable but get compressed instead, the fix is to stop compressing the files explicitly in the job. means that (many? most? all?) legacy jobs are impacted, and the remedy is to fix the individual jobs as noted, or (preferably) port them to zuulv3. efried . From rosmaita.fossdev at gmail.com Wed Jan 8 20:43:33 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 8 Jan 2020 15:43:33 -0500 Subject: [cinder] ussuri virtual mid-cycle poll Message-ID: As determined at the virtual PTG (which seems like it happened only a few weeks ago), we'll be doing a two-phase virtual mid-cycle, meeting for two hours the week of 20 January (before the spec freeze) and again around the week of 16 March (the Cinder new feature status checkpoint). There's a poll up to determine a suitable time for the first virtual mid-cycle meeting: https://doodle.com/poll/n3tmq8ep43dyi7tv Please fill out the poll as soon as you can. If all the times are horrible for you, please suggest an alternative in a comment on the poll. The poll will close at 23:59 UTC on Saturday 11 January. (I know it's soon, but that way we'll have time to make adjustments if necessary.) cheers, brian From radoslaw.piliszek at gmail.com Wed Jan 8 21:03:48 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 8 Jan 2020 22:03:48 +0100 Subject: [infra] Retire openstack/js-openstack-lib repository In-Reply-To: <2AA78965-B496-41D9-A41B-DF75694A3EB9@inaugust.com> References: <2AA78965-B496-41D9-A41B-DF75694A3EB9@inaugust.com> Message-ID: While the project is not well-documented (for any potential user), the code looks quite nice (well-structured, test-covered and documented). I checked with nodejs6 (old obsoleted) as this was what functional tests jobs were mentioning and I did not want any surprises. Yet it failed to properly interpret Stein endpoints. First issue is that it requires unversioned keystone url passed to it. Then it started failing on something less obvious and I am too tired today to debug it. :-) Deps are partially deprecated, some have been replaced, some have security issues. Based on first impression I see it fit for keeping as a deliverable but it needs some work to bring it back in shape. It makes sense to go to SDK project, albeit it requires nodejs familiarity in addition to general API/SDK building knowledge. PS: I noticed nodejs 8 is already EOL (this year) and it seems to be the max in infra. I would appreciate any help with getting nodejs 10 and 12 into infra. 
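For a single project this can probably already be handled per job; a rough, untested sketch, assuming the zuul-jobs nodejs jobs and their documented `node_version` variable (the job name below is made up):

- job:
    name: js-openstack-lib-nodejs10-npm-test
    parent: nodejs-npm-run-test
    vars:
      node_version: 10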
-yoctozepto śr., 8 sty 2020 o 18:39 Monty Taylor napisał(a): > > > > > On Jan 8, 2020, at 4:32 AM, Radosław Piliszek wrote: > > > > Thanks, Andreas. Will do. > > > > I thought it might be also wise to preserve this since there are posts > > now and then that horizon is reaching its limit and a JS lib might be > > beneficial for any possible replacement (as it can run from the > > browser) > > Said this in IRC, but for the mailing list - I’d be happy to accept it into the SDK project as a deliverable if you wanted to take it on. From what I can tell it does process clouds.yaml files - so it might be a nice way for us to verify good cross-language support for that format. (Should probably also add support for things like os-service-types and the well-known api discovery that have come since this library was last worked on) It would be nice to keep it and move it forward if it’s solid and a thing that’s valuable to people. > > > Though I have no idea what the state of this library is. OTOH, quick > > google search reveals that alternatives do not seem better at the > > first glance. > > The only promising one was https://github.com/pkgcloud/pkgcloud but it > > is not OS-centric and has therefore different goals. > > > > -yoctozepto > > > > śr., 8 sty 2020 o 10:26 Andreas Jaeger napisał(a): > >> > >> On 08/01/2020 10.21, Radosław Piliszek wrote: > >>> Are there any alternatives? > >>> I would be glad to pick this up because I planned some integrations > >>> like this on my own. > >> > >> If you want to pick this up, best discuss with Clark as Infra PTL. We > >> can keep it if there is real interest, > >> > >> Andreas > >> > >>> -yoctozepto > >>> > >>> śr., 8 sty 2020 o 09:48 Andreas Jaeger napisał(a): > >>>> > >>>> The js-openstack-lib repository is orphaned and has not seen any real > >>>> merges or contributions since February 2017, I propose to retire it. > >>>> > >>>> I'll send retirement changes using topic retire-js-openstack-lib, > >>>> > >>>> Andreas > >>>> -- > >>>> Andreas Jaeger aj at suse.com Twitter: jaegerandi > >>>> SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg > >>>> (HRB 36809, AG Nürnberg) GF: Felix Imendörffer > >>>> GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB > >>>> > >> > >> > >> -- > >> Andreas Jaeger aj at suse.com Twitter: jaegerandi > >> SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg > >> (HRB 36809, AG Nürnberg) GF: Felix Imendörffer > >> GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB > > > > > From fungi at yuggoth.org Wed Jan 8 21:12:08 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 8 Jan 2020 21:12:08 +0000 Subject: [infra] Retire openstack/js-openstack-lib repository In-Reply-To: References: <2AA78965-B496-41D9-A41B-DF75694A3EB9@inaugust.com> Message-ID: <20200108211208.mnvspulwlghdyyz5@yuggoth.org> On 2020-01-08 22:03:48 +0100 (+0100), Radosław Piliszek wrote: [...] > I noticed nodejs 8 is already EOL (this year) and it seems to be > the max in infra. I would appreciate any help with getting nodejs > 10 and 12 into infra. [...] Can you be more specific? Zuul will obviously allow you to install anything you like in a job, so presumably you're finding some defaults hard-coded somewhere we should reevaluate? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From ekultails at gmail.com Wed Jan 8 21:49:15 2020 From: ekultails at gmail.com (Luke Short) Date: Wed, 8 Jan 2020 16:49:15 -0500 Subject: [tripleo] Use Podman 1.6 in CI Message-ID: Hey folks, We have been running into a situation where an older version of Podman is used in CI that has consistent failures. It has problems deleting storage associated with a container. A possible workaround (originally created by Damien) can be found here [1]. A few of us have been in talks with the Podman team about this problem and have tested with a newer version of it (1.6.4, to be exact) and found that it is no longer an issue. The ideal situation is to simply use this newer version of Podman instead of adding hacky workarounds that we will soon revert. However, I am unsure about how we would go about doing this upstream. RHEL will soon get an updated Podman version but CentOS always lags behind. Even CentOS 8 Stream does not contain the newer version [2]. The question/ask I have is: can we ship/use a newer version of Podman in our upstream CI? Or should we continue our efforts on making a workaround? 1. https://review.opendev.org/#/c/698999/ 2. http://mirror.centos.org/centos/8-stream/AppStream/x86_64/os/Packages/ Sincerely, Luke Short -------------- next part -------------- An HTML attachment was scrubbed... URL: From cboylan at sapwetik.org Wed Jan 8 21:55:57 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Wed, 08 Jan 2020 13:55:57 -0800 Subject: [infra] Retire openstack/js-openstack-lib repository In-Reply-To: <20200108211208.mnvspulwlghdyyz5@yuggoth.org> References: <2AA78965-B496-41D9-A41B-DF75694A3EB9@inaugust.com> <20200108211208.mnvspulwlghdyyz5@yuggoth.org> Message-ID: <8679edfb-a26a-4292-9886-3c71cec21f83@www.fastmail.com> On Wed, Jan 8, 2020, at 1:12 PM, Jeremy Stanley wrote: > On 2020-01-08 22:03:48 +0100 (+0100), Radosław Piliszek wrote: > [...] > > I noticed nodejs 8 is already EOL (this year) and it seems to be > > the max in infra. I would appreciate any help with getting nodejs > > 10 and 12 into infra. > [...] > > Can you be more specific? Zuul will obviously allow you to install > anything you like in a job, so presumably you're finding some > defaults hard-coded somewhere we should reevaluate? We even supply a role from zuul-jobs to install nodejs from nodesource for you, https://zuul-ci.org/docs/zuul-jobs/js-roles.html#role-install-nodejs. This can install any nodejs version available from nodesource for the current platform. Clark From aschultz at redhat.com Wed Jan 8 22:18:23 2020 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 8 Jan 2020 15:18:23 -0700 Subject: [tripleo] tripleo-operator-ansible start and request for input Message-ID: Hello folks, I've begun the initial work on the tripleo-operator-ansible collection [0]. At the start of this work, I've chosen the undercloud installation [1] as the first role, to figure out how we want end users to consume these roles. I wanted to bring up this initial implementation so that we can discuss how folks will include these roles. The initial implementation is a wrapper around the tripleoclient command as run via openstackclient. This means that the 'tripleo-undercloud' role provides implementations for 'openstack undercloud backup', 'openstack undercloud install', and 'openstack undercloud upgrade'.
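To give a rough sense of what "a wrapper around the tripleoclient command" means in practice, a hypothetical, simplified install task for such a role could look like the sketch below (this is only an illustration, not the code under review; the actual task files are in [1], and the mapping of the debug variable onto the global '--debug' flag is an assumption):

# Hypothetical sketch only; see [1] for the real role implementation.
- name: Run the undercloud installation
  command: >-
    openstack
    {{ '--debug' if tripleo_undercloud_debug | default(false) else '' }}
    undercloud install
  changed_when: true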
In terms of naming conventions, I'm proposing that we would name the roles "tripleo-" with the last part of the command action being an "action". Examples:

"openstack undercloud *" ->
  role: tripleo-undercloud
  action: (backup|install|upgrade)

"openstack undercloud minion *" ->
  role: tripleo-undercloud-minion
  action: (install|upgrade)

"openstack overcloud *" ->
  role: tripleo-overcloud
  action: (deploy|delete|export)

"openstack overcloud node *" ->
  role: tripleo-overcloud-node
  action: (import|introspect|provision|unprovision)

For the end-user interface, I've got two proposals out for possible implementations.

Tasks from method:
The initial commit proposes that we would require the end user to use an include_role/tasks_from call to perform the desired action. For example:

- hosts: undercloud
  gather_facts: true
  tasks:
    - name: Install undercloud
      collections:
        - tripleo.operator
      import_role:
        name: tripleo-undercloud
        tasks_from: install
      vars:
        tripleo_undercloud_debug: true

Variable switch method:
I've also proposed an alternative implementation [2] that would use include_role but require the end user to set a specific variable to change whether the role runs 'install', 'backup' or 'upgrade'. With this patch the playbook would look something like:

- hosts: undercloud
  gather_facts: true
  tasks:
    - name: Install undercloud
      collections:
        - tripleo.operator
      import_role:
        name: tripleo-undercloud
      vars:
        tripleo_undercloud_action: install
        tripleo_undercloud_debug: true

I would like to solicit feedback on which one of these is the preferred integration method when calling these roles. I have two patches up in tripleo-quickstart-extras to show how these calls could be run. The "Tasks from method" can be viewed here [3]. The "Variable switch method" can be viewed here [4]. I can see pros and cons for both methods.

My take would be:

Tasks from method:
Pros:
- action is a bit more explicit
- dynamic logic left up to the playbook/consumer.
- May not have a 'default' action (as main.yml is empty, though it could be implemented).
- tasks_from would be a global implementation across all roles rather than having a changing variable name.

Cons:
- internal task file names must be known by the consumer (though IMHO this is no different than the variable name + values in the other implementation)
- role/action inclusion is not dynamic in the role (it can be in the playbook)

Variable switch method:
Pros:
- inclusion of the role by default runs an install
- action can be dynamically changed from the calling playbook via an Ansible var
- structure of the task files is internal to the role and the user of the role need not know the filenames/structure.

Cons:
- calling playbook is not explicit in that the action can be switched dynamically (e.g. intentionally or accidentally because it is dynamic)
- implementer must know to configure a variable called `tripleo_undercloud_action` to switch between install/backup/upgrade actions
- variable names are likely different depending on the role

My personal preference might be to use the "Tasks from method" because it would lend itself to the same implementation across all roles and the dynamic logic is left to the playbook rather than internally in the role.
For example, we'd end up with something like:

- hosts: undercloud
  gather_facts: true
  collections:
    - tripleo.operator
  tasks:
    - name: Install undercloud
      import_role:
        name: tripleo-undercloud
        tasks_from: install
      vars:
        tripleo_undercloud_debug: true
    - name: Upload images
      import_role:
        name: tripleo-overcloud-images
        tasks_from: upload
      vars:
        tripleo_overcloud_images_debug: true
    - name: Import nodes
      import_role:
        name: tripleo-overcloud-node
        tasks_from: import
      vars:
        tripleo_overcloud_node_debug: true
        tripleo_overcloud_node_import_file: instack.json
    - name: Introspect nodes
      import_role:
        name: tripleo-overcloud-node
        tasks_from: introspect
      vars:
        tripleo_overcloud_node_debug: true
        tripleo_overcloud_node_introspect_all_manageable: True
        tripleo_overcloud_node_introspect_provide: True
    - name: Overcloud deploy
      import_role:
        name: tripleo-overcloud
        tasks_from: deploy
      vars:
        tripleo_overcloud_debug: true
        tripleo_overcloud_deploy_environment_files:
          - /home/stack/params.yaml

The same general tasks performed via the "Variable switch method" would look something like:

- hosts: undercloud
  gather_facts: true
  collections:
    - tripleo.operator
  tasks:
    - name: Install undercloud
      import_role:
        name: tripleo-undercloud
      vars:
        tripleo_undercloud_action: install
        tripleo_undercloud_debug: true
    - name: Upload images
      import_role:
        name: tripleo-overcloud-images
      vars:
        tripleo_overcloud_images_action: upload
        tripleo_overcloud_images_debug: true
    - name: Import nodes
      import_role:
        name: tripleo-overcloud-node
      vars:
        tripleo_overcloud_node_action: import
        tripleo_overcloud_node_debug: true
        tripleo_overcloud_node_import_file: instack.json
    - name: Introspect nodes
      import_role:
        name: tripleo-overcloud-node
      vars:
        tripleo_overcloud_node_action: introspect
        tripleo_overcloud_node_debug: true
        tripleo_overcloud_node_introspect_all_manageable: True
        tripleo_overcloud_node_introspect_provide: True
    - name: Overcloud deploy
      import_role:
        name: tripleo-overcloud
      vars:
        tripleo_overcloud_action: deploy
        tripleo_overcloud_debug: true
        tripleo_overcloud_deploy_environment_files:
          - /home/stack/params.yaml

Thoughts?

Thanks,
-Alex

[0] https://blueprints.launchpad.net/tripleo/+spec/tripleo-operator-ansible
[1] https://review.opendev.org/#/c/699311/
[2] https://review.opendev.org/#/c/701628/
[3] https://review.opendev.org/#/c/701034/
[4] https://review.opendev.org/#/c/701628/

From fungi at yuggoth.org Wed Jan 8 22:25:24 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 8 Jan 2020 22:25:24 +0000 Subject: [tripleo] Use Podman 1.6 in CI In-Reply-To: References: Message-ID: <20200108222524.jxhxlxzuvxt3mazw@yuggoth.org> On 2020-01-08 16:49:15 -0500 (-0500), Luke Short wrote: > We have been running into a situation where an older version of > Podman is used in CI that has consistent failures. It has problems > deleting storage associated with a container.
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From aschultz at redhat.com Wed Jan 8 22:35:03 2020 From: aschultz at redhat.com (Alex Schultz) Date: Wed, 8 Jan 2020 15:35:03 -0700 Subject: [tripleo] Use Podman 1.6 in CI In-Reply-To: <20200108222524.jxhxlxzuvxt3mazw@yuggoth.org> References: <20200108222524.jxhxlxzuvxt3mazw@yuggoth.org> Message-ID: On Wed, Jan 8, 2020 at 3:30 PM Jeremy Stanley wrote: > > On 2020-01-08 16:49:15 -0500 (-0500), Luke Short wrote: > > We have been running into a situation where an older version of > > Podman is used in CI that has consistent failures. It has problems > > deleting storage associated with a container. > [...] > > The question/ask I have is can we ship/use a newer version of > > Podman in our upstream CI? Or should we continue our efforts on > > making a workaround? > [...] > > This sounds like a problem users of your software could encounter in > production. If so, how does only fixing it in CI jobs help your > users? It seems like time might be better spent fixing the problem > for everyone. Btw fixing CI implies fixing for everyone. In other words, how do we make it available for everyone (including CI). This is one of those ecosystem things because we (tripleo/openstack) don't necessarily ship it but we do need to use it. I'm uncertain of the centos7/podman 1.6 support and which branches are affected by this? This might be a better question for RDO. Thanks, -Alex > -- > Jeremy Stanley From ekultails at gmail.com Wed Jan 8 22:42:31 2020 From: ekultails at gmail.com (Luke Short) Date: Wed, 8 Jan 2020 17:42:31 -0500 Subject: [tripleo] tripleo-operator-ansible start and request for input In-Reply-To: References: Message-ID: Hey Alex, This is a great starting point! Thanks for sharing. I personally prefer the variables approach. This more-so aligns with the best practices for creating an Ansible role. An operator could provide a single variables file that has the booleans set for what they want configured. This also makes it easier to include/import the role once and then have it do multiple actions. For example, tripleo-undercloud can be used for both installation and backup. CI could even consume this to run "all the things" from the role by setting those extra variables. We provide the playbooks and the operators configure the variables for their environment. Long-term, I see TripleO consuming a single or few straight-forward Ansible variables files that define the entire deployment as opposed to the giant monster that Heat templates have become. Those are just my initial thoughts on the matter. I am interested to see what others think as well. Sincerely, Luke Short On Wed, Jan 8, 2020 at 5:20 PM Alex Schultz wrote: > [Hello folks, > > I've begun the basic start of the tripleo-operator-ansible collection > work[0]. At the start of this work, I've chosen the undercloud > installation[1] as the first role to use to figure out how we the end > user's to consume these roles. I wanted to bring up this initial > implementation so that we can discuss how folks will include these > roles. The initial implementation is a wrapper around the > tripleoclient command as run via openstackclient. This means that the > 'tripleo-undercloud' role provides implementations for 'openstack > undercloud backup', 'openstack undercloud install', and 'openstack > undercloud upgrade'. 
> > In terms of naming conventions, I'm proposing that we would name the > roles "tripleo-" with the last part of the command > action being an "action". Examples: > > "openstack undercloud *" -> > role: tripleo-undercloud > action: (backup|install|upgrade) > > "openstack undercloud minion *" -> > role: tripleo-undercloud-minion > action: (install|upgrade) > > "openstack overcloud *" -> > role: tripleo-overcloud > action: (deploy|delete|export) > > "openstack overcloud node *" -> > role: tripleo-overcloud-node > action: (import|introspect|provision|unprovision) > > In terms of end user interface, I've got two proposals out in terms of > possible implementations. > > Tasks from method: > The initial commit propose that we would require the end user to use > an include_role/tasks_from call to perform the desired action. For > example: > > - hosts: undercloud > gather_facts: true > tasks: > - name: Install undercloud > collections: > - tripleo.operator > import_role: > name: tripleo-undercloud > tasks_from: install > vars: > tripleo_undercloud_debug: true > > Variable switch method: > I've also proposed an alternative implementation[2] that would use > include_role but require the end user to set a specific variable to > change if the role runs 'install', 'backup' or 'upgrade'. With this > patch the playbook would look something like: > > - hosts: undercloud > gather_facts: true > tasks: > - name: Install undercloud > collections: > - tripleo.operator > import_role: > name: tripleo-undercloud > vars: > tripleo_undercloud_action: install > tripleo_undercloud_debug: true > > I would like to solicit feedback on which one of these is the > preferred integration method when calling these roles. I have two > patches up in tripleo-quickstart-extras to show how these calls could > be run. The "Tasks from method" can be viewed here[3]. The "Variable > switch method" can be viewed here[4]. I can see pros and cons for > both methods. > > My take would be: > > Tasks from method: > Pros: > - action is a bit more explicit > - dynamic logic left up to the playbook/consumer. > - May not have a 'default' action (as main.yml is empty, though it > could be implemented). > - tasks_from would be a global implementation across all roles rather > than having a changing variable name. > > Cons: > - internal task file names must be known by the consumer (though IMHO > this is no different than the variable name + values in the other > implementation) > - role/action inclusions is not dynamic in the role (it can be in the > playbook) > > Variable switch method: > Pros: > - inclusion of the role by default runs an install > - action can be dynamically changed from the calling playbook via an > ansible var > - structure of the task files is internal to the role and the user of > the role need not know the filenames/structure. > > Cons: > - calling playbook is not explicit in that the action can be switched > dynamically (e.g. intentionally or accidentally because it is dynamic) > - implementer must know to configure a variable called > `tripleo_undercloud_action` to switch between install/backup/upgrade > actions > - variable names are likely different depending on the role > > My personal preference might be to use the "Tasks from method" because > it would lend itself to the same implementation across all roles and > the dynamic logic is left to the playbook rather than internally in > the role. 
For example, we'd end up with something like: > > - hosts: undercloud > gather_facts: true > collections: > - tripleo.operator > tasks: > - name: Install undercloud > import_role: > name: tripleo-undercloud > tasks_from: install > vars: > tripleo_undercloud_debug: true > - name: Upload images > import_role: > name: tripleo-overcloud-images > tasks_from: upload > vars: > tripleo_overcloud_images_debug: true > - name: Import nodes > import_role: > name: tripleo-overcloud-node > tasks_from: import > vars: > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_import_file: instack.json > - name: Introspect nodes > import_role: > name: tripleo-overcloud-node > tasks_from: introspect > vars: > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_introspect_all_manageable: True > tripleo_overcloud_node_introspect_provide: True > - name: Overcloud deploy > import_role: > name: tripleo-overcloud > tasks_from: deploy > vars: > tripleo_overcloud_debug: true > tripleo_overcloud_deploy_environment_files: > - /home/stack/params.yaml > > The same general tasks performed via the "Variable switch method" > would look something like: > > - hosts: undercloud > gather_facts: true > collections: > - tripleo.operator > tasks: > - name: Install undercloud > import_role: > name: tripleo-undercloud > vars: > tripleo_undercloud_action: install > tripleo_undercloud_debug: true > - name: Upload images > import_role: > name: tripleo-overcloud-images > vars: > tripleo_overcloud_images_action: upload > tripleo_overcloud_images_debug: true > - name: Import nodes > import_role: > name: tripleo-overcloud-node > vars: > tripleo_overcloud_node_action: import > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_import_file: instack.json > - name: Introspect nodes > import_role: > name: tripleo-overcloud-node > vars: > tripleo_overcloud_node_action: introspect > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_introspect_all_manageable: True > tripleo_overcloud_node_introspect_provide: True > - name: Overcloud deploy > import_role: > name: tripleo-overcloud > vars: > tripleo_overcloud_action: deploy > tripleo_overcloud_debug: true > tripleo_overcloud_deploy_environment_files: > - /home/stack/params.yaml > > Thoughts? > > Thanks, > -Alex > > [0] > https://blueprints.launchpad.net/tripleo/+spec/tripleo-operator-ansible > [1] https://review.opendev.org/#/c/699311/ > [2] https://review.opendev.org/#/c/701628/ > [3] https://review.opendev.org/#/c/701034/ > [4] https://review.opendev.org/#/c/701628/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Wed Jan 8 22:54:26 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 8 Jan 2020 22:54:26 +0000 Subject: [tripleo] Use Podman 1.6 in CI In-Reply-To: References: <20200108222524.jxhxlxzuvxt3mazw@yuggoth.org> Message-ID: <20200108225426.7jqu7mquf7ktxkqx@yuggoth.org> On 2020-01-08 15:35:03 -0700 (-0700), Alex Schultz wrote: > On Wed, Jan 8, 2020 at 3:30 PM Jeremy Stanley wrote: > > On 2020-01-08 16:49:15 -0500 (-0500), Luke Short wrote: > > > We have been running into a situation where an older version of > > > Podman is used in CI that has consistent failures. It has problems > > > deleting storage associated with a container. > > [...] > > > The question/ask I have is can we ship/use a newer version of > > > Podman in our upstream CI? Or should we continue our efforts on > > > making a workaround? > > [...] 
> > > > This sounds like a problem users of your software could encounter in > > production. If so, how does only fixing it in CI jobs help your > > users? It seems like time might be better spent fixing the problem > > for everyone. > > Btw fixing CI implies fixing for everyone. In other words, how do we > make it available for everyone (including CI). This is one of those > ecosystem things because we (tripleo/openstack) don't necessarily ship > it but we do need to use it. I'm uncertain of the centos7/podman 1.6 > support and which branches are affected by this? This might be a > better question for RDO. I see, "ship/use a newer version of Podman in our upstream CI" didn't seem to necessarily imply getting a newer version of Podman into RDO/TripleO and the hands of its users. I have a bit of a knee-jerk reaction whenever I see someone talk about "fixing CI" when the underlying problem is in the software being tested and not the CI jobs. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From ekultails at gmail.com Wed Jan 8 23:07:17 2020 From: ekultails at gmail.com (Luke Short) Date: Wed, 8 Jan 2020 18:07:17 -0500 Subject: [tripleo] Use Podman 1.6 in CI In-Reply-To: <20200108225426.7jqu7mquf7ktxkqx@yuggoth.org> References: <20200108222524.jxhxlxzuvxt3mazw@yuggoth.org> <20200108225426.7jqu7mquf7ktxkqx@yuggoth.org> Message-ID: Hey folks, Thank you for all of the feedback so far. The goal is definitely to fix this everywhere we can, no just in CI. Sorry for my poor choice of words. I will migrate this discussion over to the RDO community. Sincerely, Luke Short On Wed, Jan 8, 2020 at 5:55 PM Jeremy Stanley wrote: > On 2020-01-08 15:35:03 -0700 (-0700), Alex Schultz wrote: > > On Wed, Jan 8, 2020 at 3:30 PM Jeremy Stanley wrote: > > > On 2020-01-08 16:49:15 -0500 (-0500), Luke Short wrote: > > > > We have been running into a situation where an older version of > > > > Podman is used in CI that has consistent failures. It has problems > > > > deleting storage associated with a container. > > > [...] > > > > The question/ask I have is can we ship/use a newer version of > > > > Podman in our upstream CI? Or should we continue our efforts on > > > > making a workaround? > > > [...] > > > > > > This sounds like a problem users of your software could encounter in > > > production. If so, how does only fixing it in CI jobs help your > > > users? It seems like time might be better spent fixing the problem > > > for everyone. > > > > Btw fixing CI implies fixing for everyone. In other words, how do we > > make it available for everyone (including CI). This is one of those > > ecosystem things because we (tripleo/openstack) don't necessarily ship > > it but we do need to use it. I'm uncertain of the centos7/podman 1.6 > > support and which branches are affected by this? This might be a > > better question for RDO. > > I see, "ship/use a newer version of Podman in our upstream CI" > didn't seem to necessarily imply getting a newer version of Podman > into RDO/TripleO and the hands of its users. I have a bit of a > knee-jerk reaction whenever I see someone talk about "fixing CI" > when the underlying problem is in the software being tested and not > the CI jobs. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ssbarnea at redhat.com Wed Jan 8 23:16:22 2020 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Wed, 8 Jan 2020 23:16:22 +0000 Subject: [tripleo] Use Podman 1.6 in CI In-Reply-To: <20200108225426.7jqu7mquf7ktxkqx@yuggoth.org> References: <20200108222524.jxhxlxzuvxt3mazw@yuggoth.org> <20200108225426.7jqu7mquf7ktxkqx@yuggoth.org> Message-ID: <84998B48-B820-4931-A35A-31AD98EF8A2A@redhat.com> One of the things I am working on is to add CI jobs on podman project itself that builds rpms packages for all supported systems and tests them, final goal being to test them with openstack. I am not done yet but got a good progress: https://review.rdoproject.org/zuul/buildsets?project=containers%2Flibpod https://github.com/containers/libpod/pull/4815 - current WIP (after merging few others) Since I started working on this I found several bugs in podman, so I think that the effort would pay off. Cheers Sorin > On 8 Jan 2020, at 22:54, Jeremy Stanley wrote: > > ewer version of Podman in our upstream CI" > didn't seem to necessarily imply getting a newer version of Podman > into RDO/TripleO and the hands of its users. I have a bit of a > knee-jerk reaction -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Wed Jan 8 23:21:23 2020 From: emilien at redhat.com (Emilien Macchi) Date: Wed, 8 Jan 2020 18:21:23 -0500 Subject: [tripleo] tripleo-operator-ansible start and request for input In-Reply-To: References: Message-ID: On Wed, Jan 8, 2020 at 5:25 PM Alex Schultz wrote: > [Hello folks, > > I've begun the basic start of the tripleo-operator-ansible collection > work[0]. At the start of this work, I've chosen the undercloud > installation[1] as the first role to use to figure out how we the end > user's to consume these roles. I wanted to bring up this initial > implementation so that we can discuss how folks will include these > roles. The initial implementation is a wrapper around the > tripleoclient command as run via openstackclient. This means that the > 'tripleo-undercloud' role provides implementations for 'openstack > undercloud backup', 'openstack undercloud install', and 'openstack > undercloud upgrade'. > > In terms of naming conventions, I'm proposing that we would name the > roles "tripleo-" with the last part of the command > action being an "action". Examples: > > "openstack undercloud *" -> > role: tripleo-undercloud > action: (backup|install|upgrade) > > "openstack undercloud minion *" -> > role: tripleo-undercloud-minion > action: (install|upgrade) > > "openstack overcloud *" -> > role: tripleo-overcloud > action: (deploy|delete|export) > > "openstack overcloud node *" -> > role: tripleo-overcloud-node > action: (import|introspect|provision|unprovision) > Another technically valid option could be: "openstack overcloud node *" to role: tripleo-overcloud action: node/import|node/introspect, etc. The role could have tasks/node/import.yml, tasks/node/introspect.yml, etc. It's to me another option to consider so we reduce the number of roles (and therefore LOC involved). > > In terms of end user interface, I've got two proposals out in terms of > possible implementations. > > Tasks from method: > The initial commit propose that we would require the end user to use > an include_role/tasks_from call to perform the desired action. 
For > example: > > - hosts: undercloud > gather_facts: true > tasks: > - name: Install undercloud > collections: > - tripleo.operator > import_role: > name: tripleo-undercloud > tasks_from: install > vars: > tripleo_undercloud_debug: true > > Variable switch method: > I've also proposed an alternative implementation[2] that would use > include_role but require the end user to set a specific variable to > change if the role runs 'install', 'backup' or 'upgrade'. With this > patch the playbook would look something like: > > - hosts: undercloud > gather_facts: true > tasks: > - name: Install undercloud > collections: > - tripleo.operator > import_role: > name: tripleo-undercloud > vars: > tripleo_undercloud_action: install > tripleo_undercloud_debug: true > > I would like to solicit feedback on which one of these is the > preferred integration method when calling these roles. I have two > patches up in tripleo-quickstart-extras to show how these calls could > be run. The "Tasks from method" can be viewed here[3]. The "Variable > switch method" can be viewed here[4]. I can see pros and cons for > both methods. > > My take would be: > > Tasks from method: > Pros: > - action is a bit more explicit > - dynamic logic left up to the playbook/consumer. > - May not have a 'default' action (as main.yml is empty, though it > could be implemented). > - tasks_from would be a global implementation across all roles rather > than having a changing variable name. > Not sure but it might be slightly faster as well, since we directly import what we need. I prefer this proposal as well also because I've already seen this pattern in tripleo-ansible. > > Cons: > - internal task file names must be known by the consumer (though IMHO > this is no different than the variable name + values in the other > implementation) > - role/action inclusions is not dynamic in the role (it can be in the > playbook) > > Variable switch method: > Pros: > - inclusion of the role by default runs an install > - action can be dynamically changed from the calling playbook via an > ansible var > - structure of the task files is internal to the role and the user of > the role need not know the filenames/structure. > > Cons: > - calling playbook is not explicit in that the action can be switched > dynamically (e.g. intentionally or accidentally because it is dynamic) > - implementer must know to configure a variable called > `tripleo_undercloud_action` to switch between install/backup/upgrade > actions > - variable names are likely different depending on the role > > My personal preference might be to use the "Tasks from method" because > it would lend itself to the same implementation across all roles and > the dynamic logic is left to the playbook rather than internally in > the role. 
For example, we'd end up with something like: > > - hosts: undercloud > gather_facts: true > collections: > - tripleo.operator > tasks: > - name: Install undercloud > import_role: > name: tripleo-undercloud > tasks_from: install > vars: > tripleo_undercloud_debug: true > - name: Upload images > import_role: > name: tripleo-overcloud-images > tasks_from: upload > vars: > tripleo_overcloud_images_debug: true > - name: Import nodes > import_role: > name: tripleo-overcloud-node > tasks_from: import > vars: > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_import_file: instack.json > - name: Introspect nodes > import_role: > name: tripleo-overcloud-node > tasks_from: introspect > vars: > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_introspect_all_manageable: True > tripleo_overcloud_node_introspect_provide: True > - name: Overcloud deploy > import_role: > name: tripleo-overcloud > tasks_from: deploy > vars: > tripleo_overcloud_debug: true > tripleo_overcloud_deploy_environment_files: > - /home/stack/params.yaml > > The same general tasks performed via the "Variable switch method" > would look something like: > > - hosts: undercloud > gather_facts: true > collections: > - tripleo.operator > tasks: > - name: Install undercloud > import_role: > name: tripleo-undercloud > vars: > tripleo_undercloud_action: install > tripleo_undercloud_debug: true > - name: Upload images > import_role: > name: tripleo-overcloud-images > vars: > tripleo_overcloud_images_action: upload > tripleo_overcloud_images_debug: true > - name: Import nodes > import_role: > name: tripleo-overcloud-node > vars: > tripleo_overcloud_node_action: import > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_import_file: instack.json > - name: Introspect nodes > import_role: > name: tripleo-overcloud-node > vars: > tripleo_overcloud_node_action: introspect > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_introspect_all_manageable: True > tripleo_overcloud_node_introspect_provide: True > - name: Overcloud deploy > import_role: > name: tripleo-overcloud > vars: > tripleo_overcloud_action: deploy > tripleo_overcloud_debug: true > tripleo_overcloud_deploy_environment_files: > - /home/stack/params.yaml > > Thoughts? > > Thanks, > -Alex > > [0] > https://blueprints.launchpad.net/tripleo/+spec/tripleo-operator-ansible > [1] https://review.opendev.org/#/c/699311/ > [2] https://review.opendev.org/#/c/701628/ > [3] https://review.opendev.org/#/c/701034/ > [4] https://review.opendev.org/#/c/701628/ > > > Nice work Alex! -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Wed Jan 8 23:31:40 2020 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Thu, 09 Jan 2020 00:31:40 +0100 Subject: [tc] January meeting agenda Message-ID: <1d35cdc723dbd4d50ab6a933b6a6a2c8a8ee4153.camel@evrard.me> Hello everyone, Our next meeting is happening next week Thursday (the 16th), and the agenda is, as usual, on the wiki! Here is a primer of the agenda for this month: - report on large scale sig -- how does this fly and how/what are the action items. - report on the vision reflection update - report on the analysis of the survey - report on the convo for Telemetry with Catalyst -- where are we now? What are the next steps (Gnocchi fork)? - report on multi-arch SIG - report on infra liaison and static hosting -- check if there is progress - report on stable branch policy work. 
- report on the oslo metrics project -- has code appeared since last convo? - report on the community goals for U and V, py2 drop, and goal select process schedule. - report on release naming - report on the ideas repo See you there! Regards, Jean-Philippe Evrard (evrardjp) From rico.lin.guanyu at gmail.com Thu Jan 9 05:35:57 2020 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Thu, 9 Jan 2020 13:35:57 +0800 Subject: [Multi-Arch SIG] summary and actions from last meeting Message-ID: Hi all, Thanks to all who signed up and helped to form the Multi-Arch SIG. We hosted our two initial meetings this week [1] and both were successful. Anyone who cares about multi-arch (for example, ARM support in the community) should feel free to join our future meetings [2]. Here are some actions from this week's meetings: * Create StoryBoard for Multi-Arch SIG (ricolin) * Build multi-arch SIG repo (ricolin) * Collect ppc64le actions and resources in the community if any turn up (mrda) * Help with documentation once the Multi-Arch SIG repo is ready (jeremyfreudberg) * Update governance-sigs to give a clearer description of the Multi-Arch SIG (jeremyfreudberg) There are two doc ideas: `oh, on arm64 you need to do XYZ in other way - here is how and why` and `use cases, who, and issues they had`. Both of them seem like great docs to start with, so I assume they will be something this SIG will try to work on as one of our first-step goals. We need more people to help and more resources in CI as well. There are a lot of WIP resources and CI jobs in our community, and all of them are collected in our etherpad [3] now. Feel free to update that etherpad and also sign your or your organization's name on it. Please join us:) Once we have our StoryBoard ready, we should be able to create tasks on it so everyone can create and track tasks like CI jobs, documentation, etc. Last but not least, we are looking for people to nominate or volunteer for chair roles (there can be multiple chairs). Some really experienced community members have been nominated and I will check with them whether they are willing to take the role. On the other hand, I'm volunteering to apply for one of the SIG chair seats and help to build this SIG, but I'm happy to give it to others if we can have more people sign up for that role:) So let me know if you're interested. [1] http://eavesdrop.openstack.org/meetings/multi_arch/2020/ [2] http://eavesdrop.openstack.org/#Multi-Arch_SIG_Meeting [3] https://etherpad.openstack.org/p/Multi-arch -- May The Force of OpenStack Be With You, *Rico Lin* irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From ykarel at redhat.com Thu Jan 9 07:04:39 2020 From: ykarel at redhat.com (Yatin Karel) Date: Thu, 9 Jan 2020 12:34:39 +0530 Subject: [tripleo] Use Podman 1.6 in CI In-Reply-To: References: <20200108222524.jxhxlxzuvxt3mazw@yuggoth.org> <20200108225426.7jqu7mquf7ktxkqx@yuggoth.org> Message-ID: Hi Luke Short, On Thu, Jan 9, 2020 at 4:40 AM Luke Short wrote: > > Hey folks, > > Thank you for all of the feedback so far. The goal is definitely to fix this everywhere we can, no just in CI. Sorry for my poor choice of words. I will migrate this discussion over to the RDO community. > So if I understand the problem correctly, podman 1.6.4 is needed to fix some race issues. The corresponding bug https://bugs.launchpad.net/tripleo/+bug/1856324 mainly referred to CentOS7 jobs/users, as most of the upstream work/development is around CentOS7, but considering the efforts around CentOS8 I will try to put info related to both wrt RDO.
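As a side note, until a fixed package is available everywhere, a cheap guard is to check the installed version up front. A minimal, hypothetical Ansible sketch (not actual TripleO or RDO tooling) could be:

# Hypothetical pre-flight check, not part of TripleO/RDO.
- name: Get installed podman version
  command: podman --version
  register: podman_version_out
  changed_when: false

- name: Fail early if podman is older than 1.6.4
  assert:
    that:
      - podman_version_out.stdout.split() | last is version('1.6.4', '>=')
    fail_msg: >-
      podman {{ podman_version_out.stdout.split() | last }} is affected by
      the container storage removal race; 1.6.4 or newer is needed.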
With respect to CentOS8:- So plan for master is to move to CentOS8, but CentOS8 is still not completely ready, it's WIP. Current status and issues can be found with [1][2]. wrt podman version, as soon as job/users start consuming CentOS8, podman version whatever shipped with it will be available, most likely it will be podman-1.4.2-5 looking at Stream content [5], which might be updated with future updates/releases. I guess similar race issue might be hitting in Train as well, so with respect to Train, there is also plan to add CentOS8 support for Train in addition to CentOS7 as a follow up/parallel to master efforts. Now with respect to CentOS7:- Current podman version we have in RDO is 1.5.1-3 for both train and master. There was an attempt [3] in past from @Emilien Macchi to update podman to 1.6.1 in RDO but there were some issues running on CentOS7 and we didn't hear much from container Team on how to move forward, we can attempt again to see if > 1.6.1 is working which mostly depends on Container Teams plan for podman and CentOS7. In RDO we use the builds done by Container Team and last successful build on CBS is 1.6.2[4]. [1] https://lists.rdoproject.org/pipermail/dev/2020-January/009230.html [2] https://trello.com/c/fv3u22df/709-centos8-move-to-centos8 [3] https://review.rdoproject.org/r/#/c/23449/ [4] https://cbs.centos.org/koji/packageinfo?packageID=6853 [5] http://mirror.centos.org/centos/8-stream/AppStream/x86_64/os/Packages/ > Sincerely, > Luke Short > > On Wed, Jan 8, 2020 at 5:55 PM Jeremy Stanley wrote: >> >> On 2020-01-08 15:35:03 -0700 (-0700), Alex Schultz wrote: >> > On Wed, Jan 8, 2020 at 3:30 PM Jeremy Stanley wrote: >> > > On 2020-01-08 16:49:15 -0500 (-0500), Luke Short wrote: >> > > > We have been running into a situation where an older version of >> > > > Podman is used in CI that has consistent failures. It has problems >> > > > deleting storage associated with a container. >> > > [...] >> > > > The question/ask I have is can we ship/use a newer version of >> > > > Podman in our upstream CI? Or should we continue our efforts on >> > > > making a workaround? >> > > [...] >> > > >> > > This sounds like a problem users of your software could encounter in >> > > production. If so, how does only fixing it in CI jobs help your >> > > users? It seems like time might be better spent fixing the problem >> > > for everyone. >> > >> > Btw fixing CI implies fixing for everyone. In other words, how do we >> > make it available for everyone (including CI). This is one of those >> > ecosystem things because we (tripleo/openstack) don't necessarily ship >> > it but we do need to use it. I'm uncertain of the centos7/podman 1.6 >> > support and which branches are affected by this? This might be a >> > better question for RDO. >> >> I see, "ship/use a newer version of Podman in our upstream CI" >> didn't seem to necessarily imply getting a newer version of Podman >> into RDO/TripleO and the hands of its users. I have a bit of a >> knee-jerk reaction whenever I see someone talk about "fixing CI" >> when the underlying problem is in the software being tested and not >> the CI jobs. 
>> -- >> Jeremy Stanley Thanks and Regards Yatin Karel From radoslaw.piliszek at gmail.com Thu Jan 9 07:58:32 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 9 Jan 2020 08:58:32 +0100 Subject: [infra] Retire openstack/js-openstack-lib repository In-Reply-To: <8679edfb-a26a-4292-9886-3c71cec21f83@www.fastmail.com> References: <2AA78965-B496-41D9-A41B-DF75694A3EB9@inaugust.com> <20200108211208.mnvspulwlghdyyz5@yuggoth.org> <8679edfb-a26a-4292-9886-3c71cec21f83@www.fastmail.com> Message-ID: Best infra team around, you go to sleep and the problem is solved. :-) Thanks for the link. I was meaning these templates: https://opendev.org/openstack/openstack-zuul-jobs/src/branch/master/zuul.d/project-templates.yaml which reference nodejs up to 8. I see zuul is already using the same jobs referenced in those templates but with node 10 so it presumably works which is great indeed: https://opendev.org/zuul/zuul/src/branch/master/.zuul.yaml#L212 The most nodejs-scary part is included in infra docs: https://docs.openstack.org/infra/manual/creators.html#central-config-exceptions which reference nodejs4 (exorcists required immediately). -yoctozepto śr., 8 sty 2020 o 23:03 Clark Boylan napisał(a): > > On Wed, Jan 8, 2020, at 1:12 PM, Jeremy Stanley wrote: > > On 2020-01-08 22:03:48 +0100 (+0100), Radosław Piliszek wrote: > > [...] > > > I noticed nodejs 8 is already EOL (this year) and it seems to be > > > the max in infra. I would appreciate any help with getting nodejs > > > 10 and 12 into infra. > > [...] > > > > Can you be more specific? Zuul will obviously allow you to install > > anything you like in a job, so presumably you're finding some > > defaults hard-coded somewhere we should reevaluate? > > We even supply a role from zuul-jobs to install nodejs from nodesource for you, https://zuul-ci.org/docs/zuul-jobs/js-roles.html#role-install-nodejs. This can install any nodejs version available from nodesource for the current platform. > > Clark > From pierre at stackhpc.com Thu Jan 9 09:27:43 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Thu, 9 Jan 2020 10:27:43 +0100 Subject: [scientific][www] Unable to download OpenStack for Scientific Research book In-Reply-To: <20200108184745.fs6d4p4udr7deffp@yuggoth.org> References: <20200108184745.fs6d4p4udr7deffp@yuggoth.org> Message-ID: On Wed, 8 Jan 2020 at 19:56, Jeremy Stanley wrote: > > On 2020-01-08 18:31:10 +0100 (+0100), Pierre Riteau wrote: > > I tried to download the book at https://www.openstack.org/science/ > > but the link doesn't work. Could this please be fixed? > > I've personally reported it to the webmasters for the > www.openstack.org site. In the meantime, a bit of searching turns up > https://www.openstack.org/assets/science/CrossroadofCloudandHPC.pdf > which will redirect to a working copy. As I wrote this, Wes Wilson > pointed out to me that there's also a 6x9in "printable" version at > https://www.openstack.org/assets/science/CrossroadofCloudandHPC-Print.pdf > and preprinted copies for purchase at > https://www.amazon.com/dp/1978244703/ if that's more your speed. > > > I looked on openstack.org for a contact address, but couldn't find > > one. Please let me know if there is a specific address I should use > > next time. > > Yes, I know they're working on getting something added. 
They've > generally been relying on E-mails to summitapp at openstack.org or bugs > filed at https://bugs.launchpad.net/openstack-org/+filebug but I > gather they're creating a support at openstack.org address or something > along those lines to mention in page footers on the site soon. > -- > Jeremy Stanley Hi Jeremy, Thanks a lot for reporting it. I did not realize there was a Launchpad project for openstack.org, I should try it next time. Pierre Riteau (priteau) From aj at suse.com Thu Jan 9 08:25:18 2020 From: aj at suse.com (Andreas Jaeger) Date: Thu, 9 Jan 2020 09:25:18 +0100 Subject: [infra] Retire openstack/js-openstack-lib repository In-Reply-To: References: <2AA78965-B496-41D9-A41B-DF75694A3EB9@inaugust.com> <20200108211208.mnvspulwlghdyyz5@yuggoth.org> <8679edfb-a26a-4292-9886-3c71cec21f83@www.fastmail.com> Message-ID: <07467b2a-7e5e-6ba1-8481-27c87f58d318@suse.com> On 09/01/2020 08.58, Radosław Piliszek wrote: > Best infra team around, you go to sleep and the problem is solved. :-) > Thanks for the link. > > I was meaning these templates: > https://opendev.org/openstack/openstack-zuul-jobs/src/branch/master/zuul.d/project-templates.yaml > which reference nodejs up to 8. New templates for nodejs 10 or 11 are welcome ;) > I see zuul is already using the same jobs referenced in those > templates but with node 10 so it presumably works which is great > indeed: > https://opendev.org/zuul/zuul/src/branch/master/.zuul.yaml#L212 > > The most nodejs-scary part is included in infra docs: > https://docs.openstack.org/infra/manual/creators.html#central-config-exceptions > which reference nodejs4 (exorcists required immediately). It is meant to reference the publish-to-npm nodejs jobs, Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From sshnaidm at redhat.com Thu Jan 9 10:20:02 2020 From: sshnaidm at redhat.com (Sagi Shnaidman) Date: Thu, 9 Jan 2020 12:20:02 +0200 Subject: [tripleo] tripleo-operator-ansible start and request for input In-Reply-To: References: Message-ID: Thanks for bringing this up, Alex I was thinking if we can the third option - to have small "single responsibility" roles for every action. For example: tripleo-undercloud-install tripleo-undercloud-backup tripleo-undercloud-upgrade And then no one needs to dig into roles to check what actions are supported, but just "ls roles/". Also these roles usually have nothing in common but name, and if they are quite isolated, I think it's better to have them defined separately. >From cons I can count: more roles and might be some level of duplication in variables. For pros it's more readable playbook and clear actions: - hosts: undercloud gather_facts: true collections: - tripleo.operator vars: tripleo_undercloud_debug: true tasks: - name: Install undercloud import_role: name: undercloud-install - name: Upgrade undercloud import_role: name: undercloud-upgrade Thanks On Thu, Jan 9, 2020 at 12:22 AM Alex Schultz wrote: > [Hello folks, > > I've begun the basic start of the tripleo-operator-ansible collection > work[0]. At the start of this work, I've chosen the undercloud > installation[1] as the first role to use to figure out how we the end > user's to consume these roles. I wanted to bring up this initial > implementation so that we can discuss how folks will include these > roles. 
The initial implementation is a wrapper around the > tripleoclient command as run via openstackclient. This means that the > 'tripleo-undercloud' role provides implementations for 'openstack > undercloud backup', 'openstack undercloud install', and 'openstack > undercloud upgrade'. > > In terms of naming conventions, I'm proposing that we would name the > roles "tripleo-" with the last part of the command > action being an "action". Examples: > > "openstack undercloud *" -> > role: tripleo-undercloud > action: (backup|install|upgrade) > > "openstack undercloud minion *" -> > role: tripleo-undercloud-minion > action: (install|upgrade) > > "openstack overcloud *" -> > role: tripleo-overcloud > action: (deploy|delete|export) > > "openstack overcloud node *" -> > role: tripleo-overcloud-node > action: (import|introspect|provision|unprovision) > > In terms of end user interface, I've got two proposals out in terms of > possible implementations. > > Tasks from method: > The initial commit propose that we would require the end user to use > an include_role/tasks_from call to perform the desired action. For > example: > > - hosts: undercloud > gather_facts: true > tasks: > - name: Install undercloud > collections: > - tripleo.operator > import_role: > name: tripleo-undercloud > tasks_from: install > vars: > tripleo_undercloud_debug: true > > Variable switch method: > I've also proposed an alternative implementation[2] that would use > include_role but require the end user to set a specific variable to > change if the role runs 'install', 'backup' or 'upgrade'. With this > patch the playbook would look something like: > > - hosts: undercloud > gather_facts: true > tasks: > - name: Install undercloud > collections: > - tripleo.operator > import_role: > name: tripleo-undercloud > vars: > tripleo_undercloud_action: install > tripleo_undercloud_debug: true > > I would like to solicit feedback on which one of these is the > preferred integration method when calling these roles. I have two > patches up in tripleo-quickstart-extras to show how these calls could > be run. The "Tasks from method" can be viewed here[3]. The "Variable > switch method" can be viewed here[4]. I can see pros and cons for > both methods. > > My take would be: > > Tasks from method: > Pros: > - action is a bit more explicit > - dynamic logic left up to the playbook/consumer. > - May not have a 'default' action (as main.yml is empty, though it > could be implemented). > - tasks_from would be a global implementation across all roles rather > than having a changing variable name. > > Cons: > - internal task file names must be known by the consumer (though IMHO > this is no different than the variable name + values in the other > implementation) > - role/action inclusions is not dynamic in the role (it can be in the > playbook) > > Variable switch method: > Pros: > - inclusion of the role by default runs an install > - action can be dynamically changed from the calling playbook via an > ansible var > - structure of the task files is internal to the role and the user of > the role need not know the filenames/structure. > > Cons: > - calling playbook is not explicit in that the action can be switched > dynamically (e.g. 
intentionally or accidentally because it is dynamic) > - implementer must know to configure a variable called > `tripleo_undercloud_action` to switch between install/backup/upgrade > actions > - variable names are likely different depending on the role > > My personal preference might be to use the "Tasks from method" because > it would lend itself to the same implementation across all roles and > the dynamic logic is left to the playbook rather than internally in > the role. For example, we'd end up with something like: > > - hosts: undercloud > gather_facts: true > collections: > - tripleo.operator > tasks: > - name: Install undercloud > import_role: > name: tripleo-undercloud > tasks_from: install > vars: > tripleo_undercloud_debug: true > - name: Upload images > import_role: > name: tripleo-overcloud-images > tasks_from: upload > vars: > tripleo_overcloud_images_debug: true > - name: Import nodes > import_role: > name: tripleo-overcloud-node > tasks_from: import > vars: > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_import_file: instack.json > - name: Introspect nodes > import_role: > name: tripleo-overcloud-node > tasks_from: introspect > vars: > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_introspect_all_manageable: True > tripleo_overcloud_node_introspect_provide: True > - name: Overcloud deploy > import_role: > name: tripleo-overcloud > tasks_from: deploy > vars: > tripleo_overcloud_debug: true > tripleo_overcloud_deploy_environment_files: > - /home/stack/params.yaml > > The same general tasks performed via the "Variable switch method" > would look something like: > > - hosts: undercloud > gather_facts: true > collections: > - tripleo.operator > tasks: > - name: Install undercloud > import_role: > name: tripleo-undercloud > vars: > tripleo_undercloud_action: install > tripleo_undercloud_debug: true > - name: Upload images > import_role: > name: tripleo-overcloud-images > vars: > tripleo_overcloud_images_action: upload > tripleo_overcloud_images_debug: true > - name: Import nodes > import_role: > name: tripleo-overcloud-node > vars: > tripleo_overcloud_node_action: import > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_import_file: instack.json > - name: Introspect nodes > import_role: > name: tripleo-overcloud-node > vars: > tripleo_overcloud_node_action: introspect > tripleo_overcloud_node_debug: true > tripleo_overcloud_node_introspect_all_manageable: True > tripleo_overcloud_node_introspect_provide: True > - name: Overcloud deploy > import_role: > name: tripleo-overcloud > vars: > tripleo_overcloud_action: deploy > tripleo_overcloud_debug: true > tripleo_overcloud_deploy_environment_files: > - /home/stack/params.yaml > > Thoughts? > > Thanks, > -Alex > > [0] > https://blueprints.launchpad.net/tripleo/+spec/tripleo-operator-ansible > [1] https://review.opendev.org/#/c/699311/ > [2] https://review.opendev.org/#/c/701628/ > [3] https://review.opendev.org/#/c/701034/ > [4] https://review.opendev.org/#/c/701628/ > > > -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thierry at openstack.org Thu Jan 9 10:31:32 2020 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 9 Jan 2020 11:31:32 +0100 Subject: [largescale-sig] Meeting summary and next actions In-Reply-To: References: <3c3a6232-9a3b-d240-ab82-c7ac4997f5c0@openstack.org> <06e5f16f-dfa4-8189-da7b-ad2250df8125@openstack.org> Message-ID: <3bc9f82f-a856-c02a-86a0-0e927397acf8@openstack.org> Belmiro Moreira wrote: > Hi Thierry, all, > I'm OK with both dates. > > If you agree to keep the meeting on January 15 I can chair it. Then I propose we keep the date as planned, will be less confusing. Thanks Belmiro for the offer of chairing it. I'll be preparing and sending the agenda ahead of the meeting, and pick up the summary afterwards. Regards, -- Thierry Carrez (ttx) From bdobreli at redhat.com Thu Jan 9 11:38:04 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Thu, 9 Jan 2020 12:38:04 +0100 Subject: [tripleo] tripleo-operator-ansible start and request for input In-Reply-To: References: Message-ID: <65f971d5-a4ec-893c-65e8-9fddb9c2407f@redhat.com> On 09.01.2020 11:20, Sagi Shnaidman wrote: > Thanks for bringing this up, Alex > > I was thinking if we can the third option - to have small "single > responsibility" roles for every action. For example: > tripleo-undercloud-install > tripleo-undercloud-backup > tripleo-undercloud-upgrade > > And then no one needs to dig into roles to check what actions are > supported, but just "ls roles/". Also these roles usually have nothing > in common but name, and if they are quite isolated, I think it's better > to have them defined separately. +1 A role should do one thing and do it good (c) from somewhere > From cons I can count: more roles and might be some level of > duplication in variables. > For pros it's more readable playbook and clear actions: > > - hosts: undercloud >   gather_facts: true >   collections: >     - tripleo.operator >   vars: >     tripleo_undercloud_debug: true >   tasks: > >     - name: Install undercloud >       import_role: >         name: undercloud-install > >     - name: Upgrade undercloud >       import_role: >         name: undercloud-upgrade > > Thanks > > On Thu, Jan 9, 2020 at 12:22 AM Alex Schultz > wrote: > > [Hello folks, > > I've begun the basic start of the tripleo-operator-ansible collection > work[0].  At the start of this work, I've chosen the undercloud > installation[1] as the first role to use to figure out how we the end > user's to consume these roles.  I wanted to bring up this initial > implementation so that we can discuss how folks will include these > roles.  The initial implementation is a wrapper around the > tripleoclient command as run via openstackclient.  This means that the > 'tripleo-undercloud' role provides implementations for 'openstack > undercloud backup', 'openstack undercloud install', and 'openstack > undercloud upgrade'. > > In terms of naming conventions, I'm proposing that we would name the > roles "tripleo-" with the last part of the command > action being an "action". 
Examples: > > "openstack undercloud *" -> > role: tripleo-undercloud > action: (backup|install|upgrade) > > "openstack undercloud minion *" -> > role: tripleo-undercloud-minion > action: (install|upgrade) > > "openstack overcloud *" -> > role: tripleo-overcloud > action: (deploy|delete|export) > > "openstack overcloud node *" -> > role: tripleo-overcloud-node > action: (import|introspect|provision|unprovision) > > In terms of end user interface, I've got two proposals out in terms of > possible implementations. > > Tasks from method: > The initial commit propose that we would require the end user to use > an include_role/tasks_from call to perform the desired action.  For > example: > >     - hosts: undercloud >       gather_facts: true >       tasks: >         - name: Install undercloud >           collections: >             - tripleo.operator >           import_role: >             name: tripleo-undercloud >             tasks_from: install >           vars: >             tripleo_undercloud_debug: true > > Variable switch method: > I've also proposed an alternative implementation[2] that would use > include_role but require the end user to set a specific variable to > change if the role runs 'install', 'backup' or 'upgrade'. With this > patch the playbook would look something like: > >     - hosts: undercloud >       gather_facts: true >       tasks: >         - name: Install undercloud >           collections: >             - tripleo.operator >           import_role: >             name: tripleo-undercloud >           vars: >             tripleo_undercloud_action: install >             tripleo_undercloud_debug: true > > I would like to solicit feedback on which one of these is the > preferred integration method when calling these roles. I have two > patches up in tripleo-quickstart-extras to show how these calls could > be run. The "Tasks from method" can be viewed here[3]. The "Variable > switch method" can be viewed here[4].  I can see pros and cons for > both methods. > > My take would be: > > Tasks from method: > Pros: >  - action is a bit more explicit >  - dynamic logic left up to the playbook/consumer. >  - May not have a 'default' action (as main.yml is empty, though it > could be implemented). >  - tasks_from would be a global implementation across all roles rather > than having a changing variable name. > > Cons: >  - internal task file names must be known by the consumer (though IMHO > this is no different than the variable name + values in the other > implementation) >  - role/action inclusions is not dynamic in the role (it can be in > the playbook) > > Variable switch method: > Pros: >  - inclusion of the role by default runs an install >  - action can be dynamically changed from the calling playbook via an > ansible var >  - structure of the task files is internal to the role and the user of > the role need not know the filenames/structure. > > Cons: >  - calling playbook is not explicit in that the action can be switched > dynamically (e.g. intentionally or accidentally because it is dynamic) >  - implementer must know to configure a variable called > `tripleo_undercloud_action` to switch between install/backup/upgrade > actions >  - variable names are likely different depending on the role > > My personal preference might be to use the "Tasks from method" because > it would lend itself to the same implementation across all roles and > the dynamic logic is left to the playbook rather than internally in > the role. 
For example, we'd end up with something like: > >     - hosts: undercloud >       gather_facts: true >       collections: >         - tripleo.operator >       tasks: >         - name: Install undercloud >           import_role: >             name: tripleo-undercloud >             tasks_from: install >           vars: >             tripleo_undercloud_debug: true >         - name: Upload images >           import_role: >             name: tripleo-overcloud-images >             tasks_from: upload >           vars: >             tripleo_overcloud_images_debug: true >         - name: Import nodes >           import_role: >             name: tripleo-overcloud-node >             tasks_from: import >           vars: >             tripleo_overcloud_node_debug: true >             tripleo_overcloud_node_import_file: instack.json >         - name: Introspect nodes >           import_role: >             name: tripleo-overcloud-node >             tasks_from: introspect >           vars: >             tripleo_overcloud_node_debug: true >             tripleo_overcloud_node_introspect_all_manageable: True >             tripleo_overcloud_node_introspect_provide: True >         - name: Overcloud deploy >           import_role: >             name: tripleo-overcloud >             tasks_from: deploy >           vars: >             tripleo_overcloud_debug: true >             tripleo_overcloud_deploy_environment_files: >               - /home/stack/params.yaml > > The same general tasks performed via the "Variable switch method" > would look something like: > >     - hosts: undercloud >       gather_facts: true >       collections: >         - tripleo.operator >       tasks: >         - name: Install undercloud >           import_role: >             name: tripleo-undercloud >           vars: >             tripleo_undercloud_action: install >             tripleo_undercloud_debug: true >         - name: Upload images >           import_role: >             name: tripleo-overcloud-images >           vars: >             tripleo_overcloud_images_action: upload >             tripleo_overcloud_images_debug: true >         - name: Import nodes >           import_role: >             name: tripleo-overcloud-node >           vars: >             tripleo_overcloud_node_action: import >             tripleo_overcloud_node_debug: true >             tripleo_overcloud_node_import_file: instack.json >         - name: Introspect nodes >           import_role: >             name: tripleo-overcloud-node >           vars: >             tripleo_overcloud_node_action: introspect >             tripleo_overcloud_node_debug: true >             tripleo_overcloud_node_introspect_all_manageable: True >             tripleo_overcloud_node_introspect_provide: True >         - name: Overcloud deploy >           import_role: >             name: tripleo-overcloud >           vars: >             tripleo_overcloud_action: deploy >             tripleo_overcloud_debug: true >             tripleo_overcloud_deploy_environment_files: >               - /home/stack/params.yaml > > Thoughts? 
> > Thanks, > -Alex > > [0] > https://blueprints.launchpad.net/tripleo/+spec/tripleo-operator-ansible > [1] https://review.opendev.org/#/c/699311/ > [2] https://review.opendev.org/#/c/701628/ > [3] https://review.opendev.org/#/c/701034/ > [4] https://review.opendev.org/#/c/701628/ > > > > > -- > Best regards > Sagi Shnaidman -- Best regards, Bogdan Dobrelya, Irc #bogdando From kotobi at dkrz.de Thu Jan 9 11:57:11 2020 From: kotobi at dkrz.de (Amjad Kotobi) Date: Thu, 9 Jan 2020 12:57:11 +0100 Subject: [neutron][rabbitmq][oslo] Neutron-server service shows deprecated "AMQPDeprecationWarning" In-Reply-To: <294c93b5-0ddc-284b-34a1-ffce654ba047@nemebean.com> References: <4D3B074F-09F2-48BE-BD61-5D34CBFE509E@dkrz.de> <294c93b5-0ddc-284b-34a1-ffce654ba047@nemebean.com> Message-ID: <274FDC2A-837B-45CC-BFBF-8C09A182550A@dkrz.de> Hi Ben, > On 7. Jan 2020, at 22:59, Ben Nemec wrote: > > > > On 1/7/20 9:14 AM, Amjad Kotobi wrote: >> Hi, >> Today we are facing losing connection of neutron especially during instance creation or so as “systemctl status neutron-server” shows below message >> be deprecated in amqp 2.2.0. >> Since amqp 2.0 you have to explicitly call Connection.connect() >> before using the connection. >> W_FORCE_CONNECT.format(attr=attr))) >> /usr/lib/python2.7/site-packages/amqp/connection.py:304: AMQPDeprecationWarning: The .transport attribute on the connection was accessed before >> the connection was established. This is supported for now, but will >> be deprecated in amqp 2.2.0. >> Since amqp 2.0 you have to explicitly call Connection.connect() >> before using the connection. >> W_FORCE_CONNECT.format(attr=attr))) > > It looks like this is a red herring, but it should be fixed in the current oslo.messaging pike release. See [0] and the related bug. > > 0: https://review.opendev.org/#/c/605324/ > >> OpenStack release which we are running is “Pike”. >> Is there any way to remedy this? > > I don't think this should be a fatal problem in and of itself so I suspect it's masking something else. However, I would recommend updating to the latest pike release of oslo.messaging where the deprecated feature is not used. If that doesn't fix the problem, please send us whatever errors remain after this one is eliminated. I checked it out, we are having the latest pike Oslo.messaging and it still showing the same upper messages. Any ideas? > >> Thanks >> Amjad From cjeanner at redhat.com Thu Jan 9 12:02:43 2020 From: cjeanner at redhat.com (=?UTF-8?Q?C=c3=a9dric_Jeanneret?=) Date: Thu, 9 Jan 2020 13:02:43 +0100 Subject: [tripleo] tripleo-operator-ansible start and request for input In-Reply-To: References: Message-ID: <5d0ab9db-0858-54df-1ddd-52b37294c5c5@redhat.com> On 1/9/20 11:20 AM, Sagi Shnaidman wrote: > Thanks for bringing this up, Alex > > I was thinking if we can the third option - to have small "single > responsibility" roles for every action. For example: > tripleo-undercloud-install > tripleo-undercloud-backup > tripleo-undercloud-upgrade I would prefer that solution, since it allows to keep code stupid simple, without block|when|other switches that can make maintenance complicated. Doing so will probably make the unit testing easier as well (thinking of molecule here, mainly). > > And then no one needs to dig into roles to check what actions are > supported, but just "ls roles/". Also these roles usually have nothing > in common but name, and if they are quite isolated, I think it's better > to have them defined separately. 
> From cons I can count: more roles and might be some level of duplication > in variables. We would probably need some common params|variables in order to avoid duplication... The var part might be a source of headache in order to avoid as much as possible duplications. > For pros it's more readable playbook and clear actions: > > - hosts: undercloud >   gather_facts: true >   collections: >     - tripleo.operator >   vars: >     tripleo_undercloud_debug: true >   tasks: > >     - name: Install undercloud >       import_role: >         name: undercloud-install > >     - name: Upgrade undercloud >       import_role: >         name: undercloud-upgrade > > Thanks > > On Thu, Jan 9, 2020 at 12:22 AM Alex Schultz > wrote: > > [Hello folks, > > I've begun the basic start of the tripleo-operator-ansible collection > work[0].  At the start of this work, I've chosen the undercloud > installation[1] as the first role to use to figure out how we the end > user's to consume these roles.  I wanted to bring up this initial > implementation so that we can discuss how folks will include these > roles.  The initial implementation is a wrapper around the > tripleoclient command as run via openstackclient.  This means that the > 'tripleo-undercloud' role provides implementations for 'openstack > undercloud backup', 'openstack undercloud install', and 'openstack > undercloud upgrade'. > > In terms of naming conventions, I'm proposing that we would name the > roles "tripleo-" with the last part of the command > action being an "action". Examples: > > "openstack undercloud *" -> > role: tripleo-undercloud > action: (backup|install|upgrade) > > "openstack undercloud minion *" -> > role: tripleo-undercloud-minion > action: (install|upgrade) > > "openstack overcloud *" -> > role: tripleo-overcloud > action: (deploy|delete|export) > > "openstack overcloud node *" -> > role: tripleo-overcloud-node > action: (import|introspect|provision|unprovision) > > In terms of end user interface, I've got two proposals out in terms of > possible implementations. > > Tasks from method: > The initial commit propose that we would require the end user to use > an include_role/tasks_from call to perform the desired action.  For > example: > >     - hosts: undercloud >       gather_facts: true >       tasks: >         - name: Install undercloud >           collections: >             - tripleo.operator >           import_role: >             name: tripleo-undercloud >             tasks_from: install >           vars: >             tripleo_undercloud_debug: true > > Variable switch method: > I've also proposed an alternative implementation[2] that would use > include_role but require the end user to set a specific variable to > change if the role runs 'install', 'backup' or 'upgrade'. With this > patch the playbook would look something like: > >     - hosts: undercloud >       gather_facts: true >       tasks: >         - name: Install undercloud >           collections: >             - tripleo.operator >           import_role: >             name: tripleo-undercloud >           vars: >             tripleo_undercloud_action: install >             tripleo_undercloud_debug: true > > I would like to solicit feedback on which one of these is the > preferred integration method when calling these roles. I have two > patches up in tripleo-quickstart-extras to show how these calls could > be run. The "Tasks from method" can be viewed here[3]. The "Variable > switch method" can be viewed here[4].  I can see pros and cons for > both methods. 
> > My take would be: > > Tasks from method: > Pros: >  - action is a bit more explicit >  - dynamic logic left up to the playbook/consumer. >  - May not have a 'default' action (as main.yml is empty, though it > could be implemented). >  - tasks_from would be a global implementation across all roles rather > than having a changing variable name. > > Cons: >  - internal task file names must be known by the consumer (though IMHO > this is no different than the variable name + values in the other > implementation) >  - role/action inclusions is not dynamic in the role (it can be in > the playbook) > > Variable switch method: > Pros: >  - inclusion of the role by default runs an install >  - action can be dynamically changed from the calling playbook via an > ansible var >  - structure of the task files is internal to the role and the user of > the role need not know the filenames/structure. > > Cons: >  - calling playbook is not explicit in that the action can be switched > dynamically (e.g. intentionally or accidentally because it is dynamic) >  - implementer must know to configure a variable called > `tripleo_undercloud_action` to switch between install/backup/upgrade > actions >  - variable names are likely different depending on the role > > My personal preference might be to use the "Tasks from method" because > it would lend itself to the same implementation across all roles and > the dynamic logic is left to the playbook rather than internally in > the role. For example, we'd end up with something like: > >     - hosts: undercloud >       gather_facts: true >       collections: >         - tripleo.operator >       tasks: >         - name: Install undercloud >           import_role: >             name: tripleo-undercloud >             tasks_from: install >           vars: >             tripleo_undercloud_debug: true >         - name: Upload images >           import_role: >             name: tripleo-overcloud-images >             tasks_from: upload >           vars: >             tripleo_overcloud_images_debug: true >         - name: Import nodes >           import_role: >             name: tripleo-overcloud-node >             tasks_from: import >           vars: >             tripleo_overcloud_node_debug: true >             tripleo_overcloud_node_import_file: instack.json >         - name: Introspect nodes >           import_role: >             name: tripleo-overcloud-node >             tasks_from: introspect >           vars: >             tripleo_overcloud_node_debug: true >             tripleo_overcloud_node_introspect_all_manageable: True >             tripleo_overcloud_node_introspect_provide: True >         - name: Overcloud deploy >           import_role: >             name: tripleo-overcloud >             tasks_from: deploy >           vars: >             tripleo_overcloud_debug: true >             tripleo_overcloud_deploy_environment_files: >               - /home/stack/params.yaml > > The same general tasks performed via the "Variable switch method" > would look something like: > >     - hosts: undercloud >       gather_facts: true >       collections: >         - tripleo.operator >       tasks: >         - name: Install undercloud >           import_role: >             name: tripleo-undercloud >           vars: >             tripleo_undercloud_action: install >             tripleo_undercloud_debug: true >         - name: Upload images >           import_role: >             name: tripleo-overcloud-images >           vars: >             tripleo_overcloud_images_action: upload 
>             tripleo_overcloud_images_debug: true >         - name: Import nodes >           import_role: >             name: tripleo-overcloud-node >           vars: >             tripleo_overcloud_node_action: import >             tripleo_overcloud_node_debug: true >             tripleo_overcloud_node_import_file: instack.json >         - name: Introspect nodes >           import_role: >             name: tripleo-overcloud-node >           vars: >             tripleo_overcloud_node_action: introspect >             tripleo_overcloud_node_debug: true >             tripleo_overcloud_node_introspect_all_manageable: True >             tripleo_overcloud_node_introspect_provide: True >         - name: Overcloud deploy >           import_role: >             name: tripleo-overcloud >           vars: >             tripleo_overcloud_action: deploy >             tripleo_overcloud_debug: true >             tripleo_overcloud_deploy_environment_files: >               - /home/stack/params.yaml > > Thoughts? > > Thanks, > -Alex > > [0] > https://blueprints.launchpad.net/tripleo/+spec/tripleo-operator-ansible > [1] https://review.opendev.org/#/c/699311/ > [2] https://review.opendev.org/#/c/701628/ > [3] https://review.opendev.org/#/c/701034/ > [4] https://review.opendev.org/#/c/701628/ > > > > > -- > Best regards > Sagi Shnaidman -- Cédric Jeanneret (He/Him/His) Software Engineer - OpenStack Platform Red Hat EMEA https://www.redhat.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From radoslaw.piliszek at gmail.com Thu Jan 9 12:55:32 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 9 Jan 2020 13:55:32 +0100 Subject: [infra] Retire openstack/js-openstack-lib repository In-Reply-To: <07467b2a-7e5e-6ba1-8481-27c87f58d318@suse.com> References: <2AA78965-B496-41D9-A41B-DF75694A3EB9@inaugust.com> <20200108211208.mnvspulwlghdyyz5@yuggoth.org> <8679edfb-a26a-4292-9886-3c71cec21f83@www.fastmail.com> <07467b2a-7e5e-6ba1-8481-27c87f58d318@suse.com> Message-ID: Aye, will do at some point. So the lib looks like user friendliness was not one of its goals. I had a typo in credentials and instead of throwing unauth (or similar) at me, it instead threw the whole response object (which node gladly printed out as "Object" because why not). Error handling to improve. OTOH, it checks code coverage and has both kinds of tests (unit, functional). As for more good news, I managed to run functional tests locally against Stein with only two failures: Failed: Current devstack glance version (2.7) is not supported. Failed: Current devstack keystone version (3.12) is not supported. which are quite expected (no idea why these are tested as functional, these are more like sanity checks for debugging if functionals actually fail IMHO). Real functional tests passed and it really does what it says on the box (which is very little but still). The CI functional tests jobs in Zuul are part of the legacy dsvm (devstack vm?) thingy. nodejs4 fails because of repos being long gone, nodejs6 actually installs nodejs8 but fails on npm being not installed. I see all Zuul config is external now. I would prefer it all in lib's repo. I presume it would work if I added zuul.d there, right? Still, need to drop the failing functional jobs to merge anything new. I did my research as promised, please let me know how we would like to (/should) proceed now. 
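For illustration, a minimal in-repo Zuul config could look something like the sketch below. I have not verified the exact job names - I am assuming the generic nodejs-npm-run-* jobs and their node_version variable from zuul-jobs - so treat it as a starting point only, not the final config:

# zuul.d/project.yaml (sketch only; job names assumed from zuul-jobs)
- project:
    check:
      jobs:
        - nodejs-npm-run-lint:
            vars:
              node_version: 10
        - nodejs-npm-run-test:
            vars:
              node_version: 10
    gate:
      jobs:
        - nodejs-npm-run-lint:
            vars:
              node_version: 10
        - nodejs-npm-run-test:
            vars:
              node_version: 10

Something like that would let us drop the failing legacy dsvm functional jobs from the central config and iterate on node 10 jobs in the lib's own repo.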
-yoctozepto czw., 9 sty 2020 o 10:43 Andreas Jaeger napisał(a): > > On 09/01/2020 08.58, Radosław Piliszek wrote: > > Best infra team around, you go to sleep and the problem is solved. :-) > > Thanks for the link. > > > > I was meaning these templates: > > https://opendev.org/openstack/openstack-zuul-jobs/src/branch/master/zuul.d/project-templates.yaml > > which reference nodejs up to 8. > > New templates for nodejs 10 or 11 are welcome ;) > > > I see zuul is already using the same jobs referenced in those > > templates but with node 10 so it presumably works which is great > > indeed: > > https://opendev.org/zuul/zuul/src/branch/master/.zuul.yaml#L212 > > > > The most nodejs-scary part is included in infra docs: > > https://docs.openstack.org/infra/manual/creators.html#central-config-exceptions > > which reference nodejs4 (exorcists required immediately). > > It is meant to reference the publish-to-npm nodejs jobs, > > Andreas > -- > Andreas Jaeger aj at suse.com Twitter: jaegerandi > SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg > (HRB 36809, AG Nürnberg) GF: Felix Imendörffer > GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From james.slagle at gmail.com Thu Jan 9 12:57:26 2020 From: james.slagle at gmail.com (James Slagle) Date: Thu, 9 Jan 2020 07:57:26 -0500 Subject: [tripleo] tripleo-operator-ansible start and request for input In-Reply-To: References: Message-ID: On Thu, Jan 9, 2020 at 5:25 AM Sagi Shnaidman wrote: > > Thanks for bringing this up, Alex > > I was thinking if we can the third option - to have small "single responsibility" roles for every action. For example: > tripleo-undercloud-install > tripleo-undercloud-backup > tripleo-undercloud-upgrade Good idea, and I tend to agree as well. If we really wanted a single undercloud role at some point, then we could always go back to the original idea, and have a tripleo-undercloud role that just included these other more fine grained roles. But, for now, I like the idea of smaller focused roles. -- -- James Slagle -- From C-Ramakrishna.Bhupathi at charter.com Thu Jan 9 13:45:45 2020 From: C-Ramakrishna.Bhupathi at charter.com (Bhupathi, Ramakrishna) Date: Thu, 9 Jan 2020 13:45:45 +0000 Subject: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken Message-ID: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> Folks, I am building a Kubernetes Cluster( Openstack Train) and using fedora atomic-29 image . The nodes come up fine ( I have a simple 1 master and 1 node) , but the cluster creation times out, and when I access the cloud-init logs I see this error . Wondering what I am missing as this used to work before. I wonder if this is image related . [ERROR]: Unable to render networking. Network config is likely broken: No available network renderers found. Searched through list: ['eni', 'sysconfig', 'netplan'] Essentially the stack creation fails in "kube_cluster_deploy" Can somebody help me debug this ? Any help is appreciated. --RamaK E-MAIL CONFIDENTIALITY NOTICE: The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. 
If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Thu Jan 9 16:00:54 2020 From: aschultz at redhat.com (Alex Schultz) Date: Thu, 9 Jan 2020 09:00:54 -0700 Subject: [tripleo] tripleo-operator-ansible start and request for input In-Reply-To: References: Message-ID: On Thu, Jan 9, 2020 at 3:20 AM Sagi Shnaidman wrote: > > Thanks for bringing this up, Alex > > I was thinking if we can the third option - to have small "single responsibility" roles for every action. For example: > tripleo-undercloud-install > tripleo-undercloud-backup > tripleo-undercloud-upgrade > Ok it seems like this is the generally preferred structure. We'll go with this and I'll update my patches to reflect this. One issue with this is the extra duplication in files but that might be minor. > And then no one needs to dig into roles to check what actions are supported, but just "ls roles/". Also these roles usually have nothing in common but name, and if they are quite isolated, I think it's better to have them defined separately. > From cons I can count: more roles and might be some level of duplication in variables. > For pros it's more readable playbook and clear actions: > > - hosts: undercloud > gather_facts: true > collections: > - tripleo.operator > vars: > tripleo_undercloud_debug: true > tasks: > > - name: Install undercloud > import_role: > name: undercloud-install > > - name: Upgrade undercloud > import_role: > name: undercloud-upgrade > > Thanks > > On Thu, Jan 9, 2020 at 12:22 AM Alex Schultz wrote: >> >> [Hello folks, >> >> I've begun the basic start of the tripleo-operator-ansible collection >> work[0]. At the start of this work, I've chosen the undercloud >> installation[1] as the first role to use to figure out how we the end >> user's to consume these roles. I wanted to bring up this initial >> implementation so that we can discuss how folks will include these >> roles. The initial implementation is a wrapper around the >> tripleoclient command as run via openstackclient. This means that the >> 'tripleo-undercloud' role provides implementations for 'openstack >> undercloud backup', 'openstack undercloud install', and 'openstack >> undercloud upgrade'. >> >> In terms of naming conventions, I'm proposing that we would name the >> roles "tripleo-" with the last part of the command >> action being an "action". Examples: >> >> "openstack undercloud *" -> >> role: tripleo-undercloud >> action: (backup|install|upgrade) >> >> "openstack undercloud minion *" -> >> role: tripleo-undercloud-minion >> action: (install|upgrade) >> >> "openstack overcloud *" -> >> role: tripleo-overcloud >> action: (deploy|delete|export) >> >> "openstack overcloud node *" -> >> role: tripleo-overcloud-node >> action: (import|introspect|provision|unprovision) >> >> In terms of end user interface, I've got two proposals out in terms of >> possible implementations. >> >> Tasks from method: >> The initial commit propose that we would require the end user to use >> an include_role/tasks_from call to perform the desired action. 
For >> example: >> >> - hosts: undercloud >> gather_facts: true >> tasks: >> - name: Install undercloud >> collections: >> - tripleo.operator >> import_role: >> name: tripleo-undercloud >> tasks_from: install >> vars: >> tripleo_undercloud_debug: true >> >> Variable switch method: >> I've also proposed an alternative implementation[2] that would use >> include_role but require the end user to set a specific variable to >> change if the role runs 'install', 'backup' or 'upgrade'. With this >> patch the playbook would look something like: >> >> - hosts: undercloud >> gather_facts: true >> tasks: >> - name: Install undercloud >> collections: >> - tripleo.operator >> import_role: >> name: tripleo-undercloud >> vars: >> tripleo_undercloud_action: install >> tripleo_undercloud_debug: true >> >> I would like to solicit feedback on which one of these is the >> preferred integration method when calling these roles. I have two >> patches up in tripleo-quickstart-extras to show how these calls could >> be run. The "Tasks from method" can be viewed here[3]. The "Variable >> switch method" can be viewed here[4]. I can see pros and cons for >> both methods. >> >> My take would be: >> >> Tasks from method: >> Pros: >> - action is a bit more explicit >> - dynamic logic left up to the playbook/consumer. >> - May not have a 'default' action (as main.yml is empty, though it >> could be implemented). >> - tasks_from would be a global implementation across all roles rather >> than having a changing variable name. >> >> Cons: >> - internal task file names must be known by the consumer (though IMHO >> this is no different than the variable name + values in the other >> implementation) >> - role/action inclusions is not dynamic in the role (it can be in the playbook) >> >> Variable switch method: >> Pros: >> - inclusion of the role by default runs an install >> - action can be dynamically changed from the calling playbook via an >> ansible var >> - structure of the task files is internal to the role and the user of >> the role need not know the filenames/structure. >> >> Cons: >> - calling playbook is not explicit in that the action can be switched >> dynamically (e.g. intentionally or accidentally because it is dynamic) >> - implementer must know to configure a variable called >> `tripleo_undercloud_action` to switch between install/backup/upgrade >> actions >> - variable names are likely different depending on the role >> >> My personal preference might be to use the "Tasks from method" because >> it would lend itself to the same implementation across all roles and >> the dynamic logic is left to the playbook rather than internally in >> the role. 
For example, we'd end up with something like: >> >> - hosts: undercloud >> gather_facts: true >> collections: >> - tripleo.operator >> tasks: >> - name: Install undercloud >> import_role: >> name: tripleo-undercloud >> tasks_from: install >> vars: >> tripleo_undercloud_debug: true >> - name: Upload images >> import_role: >> name: tripleo-overcloud-images >> tasks_from: upload >> vars: >> tripleo_overcloud_images_debug: true >> - name: Import nodes >> import_role: >> name: tripleo-overcloud-node >> tasks_from: import >> vars: >> tripleo_overcloud_node_debug: true >> tripleo_overcloud_node_import_file: instack.json >> - name: Introspect nodes >> import_role: >> name: tripleo-overcloud-node >> tasks_from: introspect >> vars: >> tripleo_overcloud_node_debug: true >> tripleo_overcloud_node_introspect_all_manageable: True >> tripleo_overcloud_node_introspect_provide: True >> - name: Overcloud deploy >> import_role: >> name: tripleo-overcloud >> tasks_from: deploy >> vars: >> tripleo_overcloud_debug: true >> tripleo_overcloud_deploy_environment_files: >> - /home/stack/params.yaml >> >> The same general tasks performed via the "Variable switch method" >> would look something like: >> >> - hosts: undercloud >> gather_facts: true >> collections: >> - tripleo.operator >> tasks: >> - name: Install undercloud >> import_role: >> name: tripleo-undercloud >> vars: >> tripleo_undercloud_action: install >> tripleo_undercloud_debug: true >> - name: Upload images >> import_role: >> name: tripleo-overcloud-images >> vars: >> tripleo_overcloud_images_action: upload >> tripleo_overcloud_images_debug: true >> - name: Import nodes >> import_role: >> name: tripleo-overcloud-node >> vars: >> tripleo_overcloud_node_action: import >> tripleo_overcloud_node_debug: true >> tripleo_overcloud_node_import_file: instack.json >> - name: Introspect nodes >> import_role: >> name: tripleo-overcloud-node >> vars: >> tripleo_overcloud_node_action: introspect >> tripleo_overcloud_node_debug: true >> tripleo_overcloud_node_introspect_all_manageable: True >> tripleo_overcloud_node_introspect_provide: True >> - name: Overcloud deploy >> import_role: >> name: tripleo-overcloud >> vars: >> tripleo_overcloud_action: deploy >> tripleo_overcloud_debug: true >> tripleo_overcloud_deploy_environment_files: >> - /home/stack/params.yaml >> >> Thoughts? >> >> Thanks, >> -Alex >> >> [0] https://blueprints.launchpad.net/tripleo/+spec/tripleo-operator-ansible >> [1] https://review.opendev.org/#/c/699311/ >> [2] https://review.opendev.org/#/c/701628/ >> [3] https://review.opendev.org/#/c/701034/ >> [4] https://review.opendev.org/#/c/701628/ >> >> > > > -- > Best regards > Sagi Shnaidman From juliaashleykreger at gmail.com Thu Jan 9 16:52:15 2020 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Thu, 9 Jan 2020 08:52:15 -0800 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: <2706c21c3f7d4203a8a20342f8f6a68c@AUSX13MPS308.AMER.DELL.COM> References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <7b55e3b28d644492a846fdb10f7b127b@AUSX13MPS308.AMER.DELL.COM> <582d2544d3d74fe7beef50aaaa35d558@AUSX13MPS308.AMER.DELL.COM> <20200107235139.2l5iw2fumgsfoz5u@yuggoth.org> <2706c21c3f7d4203a8a20342f8f6a68c@AUSX13MPS308.AMER.DELL.COM> Message-ID: On Wed, Jan 8, 2020 at 8:38 AM wrote: > > Jeremy, > Correct. > programming devices and "updating firmware" I count as separate activities. > Similar to CPU or GPU. > Which makes me really wonder, where is that line between the activities? 
I guess the worry, from a security standpoint, is persistent bytecode. I guess I just don't have a good enough understanding of all the facets in this area to have a sense for that. :/ > -----Original Message----- > From: Jeremy Stanley > Sent: Tuesday, January 7, 2020 5:52 PM > To: openstack-discuss at lists.openstack.org > Subject: Re: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management > > On 2020-01-07 23:17:25 +0000 (+0000), Arkady.Kanevsky at dell.com wrote: > > It is hard to image that any production env of any customer will allow > > anybody but administrator to update FW on any device at any time. The > > security implication are huge. > [...] > > I thought this was precisely the point of exposing FPGA hardware into server instances. Or do you not count programming those as "updating firmware?" > -- > Jeremy Stanley > From sean.mcginnis at gmx.com Thu Jan 9 17:01:15 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 9 Jan 2020 11:01:15 -0600 Subject: [release] making releases fast again (was: decentralising release approvals) In-Reply-To: References: Message-ID: <20200109170115.GA453843@sm-workstation> On Fri, Dec 20, 2019 at 11:04:36AM +0100, Thierry Carrez wrote: > Mark Goddard wrote: > > [...] > > As kolla PTL and ironic release liaison I've proposed a number of > > release patches recently. Generally the release team is good at churning > > through these, but sometimes patches can hang around for a while. > > Usually a ping on IRC will get things moving again within a day or so > > (thanks in particular to Sean who has been very responsive). > > I agree we've seen an increase in processing delay lately, and I'd like to > correct that. There are generally three things that would cause a > perceptible delay in release processing... > > 1- wait for two release managers +2 > > This is something we put in place some time ago, as we had a lot of new > members and thought that would be a good way to onboard them. Lately it > created delays as a lot of those were not as active. > > 2- stable releases > > Two subcases in there... Eitherthe deliverable is under stable policy and > there are *significant* delays there as we have to pause to give a chance to > stable-maint-core people to voice an opinion. Or the deliverable is not > under stable policy, but we do a manual check on the changes, as a way to > educate the requester on semver. > > 3- waiting for PTL/release liaison to approve > > That can take a long time, but the release management team is not really at > fault there. > Coming back to hopefully wrap this up... We discussed this in today's release team meeting and decided to make some changes to hopefully make things a little smoother. We will now use the following guidelines for reviewing and approving release requests: For releases in the current development (including some time for the previous cycle for the release-trailing deliverables) we will only require a single reviewer. If everything looks good and there are no concerns, we will +2 and approve the release request without waiting for a second. If the reviewer has any doubts or hesitation, they can decide to wait for a second reviewer, but this should be a much less common situation. For stable releases, we will require two +2s. We will not, however, wait for a designated day for stable team review. If we can get one, all the better, but the normal release team should be aware of stable rules and look for them for any stable release request. 
Keeping the requirement for two reviewers should help make sure nothing is overlooked with stable policy. We do still want PTL/liaison +1 to appove, so we will continue to wait for that. Thierry is working on some job automation to make checking for that a little easier, so hopefully that will help make that process as smooth as possible. If there are any other questions or concerns, please do let us know. Sean From radoslaw.piliszek at gmail.com Thu Jan 9 17:08:53 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 9 Jan 2020 18:08:53 +0100 Subject: [release] making releases fast again (was: decentralising release approvals) In-Reply-To: <20200109170115.GA453843@sm-workstation> References: <20200109170115.GA453843@sm-workstation> Message-ID: Hi Sean, just to verify my interpretation. This means e.g. [1] is now good to go? 2 release team members, 1 liaison and 1 PTL extra (for stable release). [1] https://review.opendev.org/701080 -yoctozepto czw., 9 sty 2020 o 18:03 Sean McGinnis napisał(a): > > On Fri, Dec 20, 2019 at 11:04:36AM +0100, Thierry Carrez wrote: > > Mark Goddard wrote: > > > [...] > > > As kolla PTL and ironic release liaison I've proposed a number of > > > release patches recently. Generally the release team is good at churning > > > through these, but sometimes patches can hang around for a while. > > > Usually a ping on IRC will get things moving again within a day or so > > > (thanks in particular to Sean who has been very responsive). > > > > I agree we've seen an increase in processing delay lately, and I'd like to > > correct that. There are generally three things that would cause a > > perceptible delay in release processing... > > > > 1- wait for two release managers +2 > > > > This is something we put in place some time ago, as we had a lot of new > > members and thought that would be a good way to onboard them. Lately it > > created delays as a lot of those were not as active. > > > > 2- stable releases > > > > Two subcases in there... Eitherthe deliverable is under stable policy and > > there are *significant* delays there as we have to pause to give a chance to > > stable-maint-core people to voice an opinion. Or the deliverable is not > > under stable policy, but we do a manual check on the changes, as a way to > > educate the requester on semver. > > > > 3- waiting for PTL/release liaison to approve > > > > That can take a long time, but the release management team is not really at > > fault there. > > > > Coming back to hopefully wrap this up... > > We discussed this in today's release team meeting and decided to make some > changes to hopefully make things a little smoother. We will now use the > following guidelines for reviewing and approving release requests: > > For releases in the current development (including some time for the previous > cycle for the release-trailing deliverables) we will only require a single > reviewer. If everything looks good and there are no concerns, we will +2 and > approve the release request without waiting for a second. If the reviewer has > any doubts or hesitation, they can decide to wait for a second reviewer, but > this should be a much less common situation. > > For stable releases, we will require two +2s. We will not, however, wait for a > designated day for stable team review. If we can get one, all the better, but > the normal release team should be aware of stable rules and look for them for > any stable release request. 
Keeping the requirement for two reviewers should > help make sure nothing is overlooked with stable policy. > > We do still want PTL/liaison +1 to appove, so we will continue to wait for > that. Thierry is working on some job automation to make checking for that a > little easier, so hopefully that will help make that process as smooth as > possible. > > If there are any other questions or concerns, please do let us know. > > Sean > From sean.mcginnis at gmx.com Thu Jan 9 17:17:47 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 9 Jan 2020 11:17:47 -0600 Subject: [release] making releases fast again (was: decentralising release approvals) In-Reply-To: References: <20200109170115.GA453843@sm-workstation> Message-ID: <20200109171747.GB453843@sm-workstation> On Thu, Jan 09, 2020 at 06:08:53PM +0100, Radosław Piliszek wrote: > Hi Sean, > > just to verify my interpretation. > This means e.g. [1] is now good to go? > 2 release team members, 1 liaison and 1 PTL extra (for stable release). > > [1] https://review.opendev.org/701080 > > -yoctozepto > Correct. I will take one quick look again, and if all looks good get that one going. Sean From radoslaw.piliszek at gmail.com Thu Jan 9 18:10:30 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 9 Jan 2020 19:10:30 +0100 Subject: [release] making releases fast again (was: decentralising release approvals) In-Reply-To: <20200109171747.GB453843@sm-workstation> References: <20200109170115.GA453843@sm-workstation> <20200109171747.GB453843@sm-workstation> Message-ID: Thanks, Sean. -yoctozepto czw., 9 sty 2020 o 18:17 Sean McGinnis napisał(a): > > On Thu, Jan 09, 2020 at 06:08:53PM +0100, Radosław Piliszek wrote: > > Hi Sean, > > > > just to verify my interpretation. > > This means e.g. [1] is now good to go? > > 2 release team members, 1 liaison and 1 PTL extra (for stable release). > > > > [1] https://review.opendev.org/701080 > > > > -yoctozepto > > > > Correct. I will take one quick look again, and if all looks good get that one > going. > > Sean From gmann at ghanshyammann.com Thu Jan 9 18:38:26 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 09 Jan 2020 12:38:26 -0600 Subject: [qa][infra][stable] Stable branches gate status: tempest-full-* jobs failing for stable/ocata|pike|queens In-Reply-To: <16f8101d6ea.be1780a3214520.3007727257147254758@ghanshyammann.com> References: <16f8101d6ea.be1780a3214520.3007727257147254758@ghanshyammann.com> Message-ID: <16f8b99bf5c.d67ee841304059.4438464382583793057@ghanshyammann.com> ---- On Tue, 07 Jan 2020 11:16:19 -0600 Ghanshyam Mann wrote ---- > Hello Everyone, > > tempest-full-* jobs are failing on stable/queens, stable/pike, and stable/ocata(legacy-tempest-dsvm-neutron-full-ocata) [1].ld > Please hold any recheck till fix is merged. > > whoami-rajat reported about the tempest-full-queens-py3 job failure and later while debugging we found that same is failing > for pike and ocata(job name there - legacy-tempest-dsvm-neutron-full-ocata). > > Failure is due to "Timeout on connecting the vnc console url" because there is no 'n-cauth' service running which is required > for these stable branches. In Ussuri that service has been removed from nova. > > 'n-cauth' has been removed from ENABLED_SERVICES recently in - https://review.opendev.org/#/c/700217/ which effected only > stable branches till queens. stable/rocky|stein are working because we have moved the services enable things from devstack-gate's > test matrix to devstack base job[2]. 
Patch[2] was not backported to stable/queens and stable/pike which I am not sure why. > > We have two ways to fix the stable branches gate: > 1. re-enable the n-cauth in devstack-gate. Hope all other removes services create no problem. > pros: easy to fix, fix for all three stable branches. > patch- https://review.opendev.org/#/c/701404/ This is merged now, We can recheck. -gmann > > 2. Backport the 546765[2] to stable/queens and stable/pike. > pros: this removes the dependency form test-matrix which is the overall goal to remove d-g dependency. > cons: It cannot be backported to stable/ocata as no zuulv3 base jobs there. This is already EM and anyone still cares about this? > > I think for fixing the gate (Tempest master and stable/queens|pike|ocata), we can go with option 1 and later > we backport the devstack migration. > > [1] > - http://zuul.openstack.org/builds?job_name=tempest-full-queens-py3 > - http://zuul.openstack.org/builds?job_name=tempest-full-pike > - http://zuul.openstack.org/builds?job_name=legacy-tempest-dsvm-neutron-full-ocata > - reported bug - https://bugs.launchpad.net/devstack/+bug/1858666 > > [2] https://review.opendev.org/#/c/546765/ > > > -gmann > > > From openstack at nemebean.com Thu Jan 9 19:20:34 2020 From: openstack at nemebean.com (Ben Nemec) Date: Thu, 9 Jan 2020 13:20:34 -0600 Subject: [neutron][rabbitmq][oslo] Neutron-server service shows deprecated "AMQPDeprecationWarning" In-Reply-To: <274FDC2A-837B-45CC-BFBF-8C09A182550A@dkrz.de> References: <4D3B074F-09F2-48BE-BD61-5D34CBFE509E@dkrz.de> <294c93b5-0ddc-284b-34a1-ffce654ba047@nemebean.com> <274FDC2A-837B-45CC-BFBF-8C09A182550A@dkrz.de> Message-ID: On 1/9/20 5:57 AM, Amjad Kotobi wrote: > Hi Ben, > >> On 7. Jan 2020, at 22:59, Ben Nemec wrote: >> >> >> >> On 1/7/20 9:14 AM, Amjad Kotobi wrote: >>> Hi, >>> Today we are facing losing connection of neutron especially during instance creation or so as “systemctl status neutron-server” shows below message >>> be deprecated in amqp 2.2.0. >>> Since amqp 2.0 you have to explicitly call Connection.connect() >>> before using the connection. >>> W_FORCE_CONNECT.format(attr=attr))) >>> /usr/lib/python2.7/site-packages/amqp/connection.py:304: AMQPDeprecationWarning: The .transport attribute on the connection was accessed before >>> the connection was established. This is supported for now, but will >>> be deprecated in amqp 2.2.0. >>> Since amqp 2.0 you have to explicitly call Connection.connect() >>> before using the connection. >>> W_FORCE_CONNECT.format(attr=attr))) >> >> It looks like this is a red herring, but it should be fixed in the current oslo.messaging pike release. See [0] and the related bug. >> >> 0: https://review.opendev.org/#/c/605324/ >> >>> OpenStack release which we are running is “Pike”. >>> Is there any way to remedy this? >> >> I don't think this should be a fatal problem in and of itself so I suspect it's masking something else. However, I would recommend updating to the latest pike release of oslo.messaging where the deprecated feature is not used. If that doesn't fix the problem, please send us whatever errors remain after this one is eliminated. > > I checked it out, we are having the latest pike Oslo.messaging and it still showing the same upper messages. Any ideas? Hmm, not sure then. Are there any other log messages around that one which might provide more context on where this is happening? I've also copied a couple of our messaging folks in case they have a better idea what might be going on. 
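(For reference, the explicit connect the warning keeps pointing at looks roughly like this with the py-amqp 2.x API -- host and credentials below are placeholders, not anything from your deployment:

    import amqp

    # amqp >= 2.0 expects an explicit connect(); touching .transport
    # before that is what raises AMQPDeprecationWarning.
    conn = amqp.Connection(host='rabbit-host:5672',
                           userid='guest', password='guest',
                           virtual_host='/')
    conn.connect()
    try:
        channel = conn.channel()
        # ... declare queues / publish messages as usual ...
    finally:
        conn.close()

Recent oslo.messaging is supposed to take care of that call itself, which is why the warning still showing up on latest Pike is surprising.)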
>> >>> Thanks >>> Amjad > > From sean.mcginnis at gmx.com Thu Jan 9 21:50:14 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 9 Jan 2020 15:50:14 -0600 Subject: [Release-job-failures] release-post job for openstack/releases for ref refs/heads/master failed In-Reply-To: References: Message-ID: <20200109215014.GA472836@sm-workstation> On Thu, Jan 09, 2020 at 08:32:28PM +0000, zuul at openstack.org wrote: > Build failed. > > - tag-releases https://zuul.opendev.org/t/openstack/build/deb8b8d5504b4689ab6d669eac92f979 : FAILURE in 3m 46s > - publish-tox-docs-static https://zuul.opendev.org/t/openstack/build/None : SKIPPED > This failure can be safely ignored. This was a side effect of marking some older independent repos as 'abandoned' in https://review.opendev.org/#/c/700013/. These are old repos that are no longer under governance and/or retired. The failure itself appears to be from one of the retired repos, before we had specified in the retire procedure that the gitreview file should be kept around. Since these are no longer active, and there wasn't actually anything to tag and release, no need to worry about this failure. Sean From feilong at catalyst.net.nz Thu Jan 9 23:11:50 2020 From: feilong at catalyst.net.nz (Feilong Wang) Date: Fri, 10 Jan 2020 12:11:50 +1300 Subject: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken In-Reply-To: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> References: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> Message-ID: <6c8f45f2-da74-18fd-7909-84c9c6762fe3@catalyst.net.nz> Hi Bhupathi, Could you please share your cluster template? And please make sure your Nova/Neutron works. On 10/01/20 2:45 AM, Bhupathi, Ramakrishna wrote: > > Folks, > > I am building a Kubernetes Cluster( Openstack Train) and using fedora > atomic-29 image . The nodes come up  fine ( I have a simple 1 master > and 1 node) , but the cluster creation times out,  and when I access > the cloud-init logs I see this error .  Wondering what I am missing as > this used to work before.  I wonder if this is image related . > >   > > [ERROR]: Unable to render networking. Network config is likely broken: > No available network renderers found. Searched through list: ['eni', > 'sysconfig', 'netplan'] > >   > > Essentially the stack creation fails in “kube_cluster_deploy” > >   > > Can somebody help me debug this ? Any help is appreciated. > >   > > --RamaK > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amotoki at gmail.com Fri Jan 10 06:55:49 2020 From: amotoki at gmail.com (Akihiro Motoki) Date: Fri, 10 Jan 2020 15:55:49 +0900 Subject: [infra][stable] python3 is used by default in older stable branches Message-ID: Hi, The horizon team recently noticed that python3 is used as a default python interpreter in older stable branches like pike or ocata. For example, horizon pep8 job in stable/pike and stable/ocata fails [1][2]. We also noticed that some jobs which are expected to run with python2 (using the tox default interpreter as of the release) are now run with python3 [3]. What is the recommended way to cope with this situation? Individual projects can cope with the default interpreter change repo by repo, but this potentially affects all projects with older stable branches. This is the reason I am sending this mail. Best Regards, Akihiro Motoki (amotoki) [1] https://zuul.opendev.org/t/openstack/build/daaeaedb0a184e29a03eeaae59157c78 [2] https://zuul.opendev.org/t/openstack/build/525dc7f926684e54be8b565a7bbf7193 [3] https://zuul.opendev.org/t/openstack/build/adbc53b8d1f74dac9cd606f4a796c442/log/tox/py27dj110-0.log#3 From skaplons at redhat.com Fri Jan 10 07:43:03 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Fri, 10 Jan 2020 08:43:03 +0100 Subject: [neutron][vpnaas] New neutron-vpnaas maintainer and core reviewer Message-ID: Hi, After our last call for volunteers to maintain some of the neutron stadium projects, we have new neutron-vpnaas maintainer now \o/ Dongcan Ye just stepped up to take care of this project. After discussion with other neutron-vpnaas core reviewers I added him to neutron-vpnaas core team. He works on neutron-vpnaas since some time already and in our opinion he knows this project good enough to be core reviewer there. Thank You very much Dongcan Ye for help with neutron-vpnaas project :) — Slawek Kaplonski Senior software engineer Red Hat From info at dantalion.nl Fri Jan 10 09:28:19 2020 From: info at dantalion.nl (info at dantalion.nl) Date: Fri, 10 Jan 2020 10:28:19 +0100 Subject: [aodh][keystone] handling of webhook / alarm authentication Message-ID: Hello, I was wondering how a service receiving an aodh webhook could perform authentication? The documentation describes the webhook as a simple post-request so I was wondering if a keystone token context is available when these requests are received? If not, I was wondering if anyone had any recommendation on how to perform authentication upon received post-requests? So far I have come up with limiting the functionality of these webhooks such as rate-limiting and administrators having to explicitly enable these webhooks before they work. Hope anyone else could provide further valuable information. Kind regards, Corne Lukken Watcher core-reviewer From aj at suse.com Fri Jan 10 09:50:51 2020 From: aj at suse.com (Andreas Jaeger) Date: Fri, 10 Jan 2020 10:50:51 +0100 Subject: [infra][stable] python3 is used by default in older stable branches In-Reply-To: References: Message-ID: On 10/01/2020 07.55, Akihiro Motoki wrote: > Hi, > > The horizon team recently noticed that python3 is used as a default > python interpreter in older stable branches like pike or ocata. > For example, horizon pep8 job in stable/pike and stable/ocata fails [1][2]. > We also noticed that some jobs which are expected to run with python2 > (using the tox default interpreter as of the release) are now run with > python3 [3]. Do you know what changed? 
I don't remember any intended change here, so I'm curious why this happens suddenly. https://zuul.opendev.org/t/openstack/build/0615d1df250144e6a137f0615c25ce66/logs from 27th of December already shows this on rocky https://zuul.opendev.org/t/openstack/build/085c0d9ea5d8466099eef3bb0ffb2213 from the 18th of December uses python 2.7. Andreas > What is the recommended way to cope with this situation? > > Individual projects can cope with the default interpreter change repo by repo, > but this potentially affects all projects with older stable branches. > This is the reason I am sending this mail. > > Best Regards, > Akihiro Motoki (amotoki) > > [1] https://zuul.opendev.org/t/openstack/build/daaeaedb0a184e29a03eeaae59157c78 > [2] https://zuul.opendev.org/t/openstack/build/525dc7f926684e54be8b565a7bbf7193 > [3] https://zuul.opendev.org/t/openstack/build/adbc53b8d1f74dac9cd606f4a796c442/log/tox/py27dj110-0.log#3 > -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From ltoscano at redhat.com Fri Jan 10 10:06:52 2020 From: ltoscano at redhat.com (Luigi Toscano) Date: Fri, 10 Jan 2020 11:06:52 +0100 Subject: [infra][stable] python3 is used by default in older stable branches In-Reply-To: References: Message-ID: <15221729.TVbWr6RBN8@whitebase.usersys.redhat.com> On Friday, 10 January 2020 10:50:51 CET Andreas Jaeger wrote: > On 10/01/2020 07.55, Akihiro Motoki wrote: > > Hi, > > > > The horizon team recently noticed that python3 is used as a default > > python interpreter in older stable branches like pike or ocata. > > For example, horizon pep8 job in stable/pike and stable/ocata fails > > [1][2]. > > We also noticed that some jobs which are expected to run with python2 > > (using the tox default interpreter as of the release) are now run with > > python3 [3]. > > Do you know what changed? I don't remember any intended change here, so > I'm curious why this happens suddenly. > > https://zuul.opendev.org/t/openstack/build/0615d1df250144e6a137f0615c25ce66/ > logs from 27th of December already shows this on rocky > > https://zuul.opendev.org/t/openstack/build/085c0d9ea5d8466099eef3bb0ffb2213 > from the 18th of December uses python 2.7. > Maybe this is related to http://lists.openstack.org/pipermail/openstack-discuss/2019-November/ 010957.html http://eavesdrop.openstack.org/meetings/infra/2019/infra. 2019-11-19-19.05.log.html#l-102 ? Ciao -- Luigi From aj at suse.com Fri Jan 10 10:13:09 2020 From: aj at suse.com (Andreas Jaeger) Date: Fri, 10 Jan 2020 11:13:09 +0100 Subject: [infra][stable] python3 is used by default in older stable branches In-Reply-To: <15221729.TVbWr6RBN8@whitebase.usersys.redhat.com> References: <15221729.TVbWr6RBN8@whitebase.usersys.redhat.com> Message-ID: On 10/01/2020 11.06, Luigi Toscano wrote: > On Friday, 10 January 2020 10:50:51 CET Andreas Jaeger wrote: >> On 10/01/2020 07.55, Akihiro Motoki wrote: >>> Hi, >>> >>> The horizon team recently noticed that python3 is used as a default >>> python interpreter in older stable branches like pike or ocata. >>> For example, horizon pep8 job in stable/pike and stable/ocata fails >>> [1][2]. >>> We also noticed that some jobs which are expected to run with python2 >>> (using the tox default interpreter as of the release) are now run with >>> python3 [3]. >> >> Do you know what changed? 
I don't remember any intended change here, so >> I'm curious why this happens suddenly. >> >> https://zuul.opendev.org/t/openstack/build/0615d1df250144e6a137f0615c25ce66/ >> logs from 27th of December already shows this on rocky >> >> https://zuul.opendev.org/t/openstack/build/085c0d9ea5d8466099eef3bb0ffb2213 >> from the 18th of December uses python 2.7. >> > > Maybe this is related to > http://lists.openstack.org/pipermail/openstack-discuss/2019-November/ > 010957.html that one speaks about Bionic, and these failures are on Xenial nodes. The timing seems of - if it worked on the 18th of December and failed sometime later. thanks, Andreas > > http://eavesdrop.openstack.org/meetings/infra/2019/infra. > 2019-11-19-19.05.log.html#l-102 > > ? > > Ciao > -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From anlin.kong at gmail.com Fri Jan 10 10:44:27 2020 From: anlin.kong at gmail.com (Lingxian Kong) Date: Fri, 10 Jan 2020 23:44:27 +1300 Subject: [aodh][keystone] handling of webhook / alarm authentication In-Reply-To: References: Message-ID: Hi Corne, I didn't fully understand your question, could you please provide the doc mentioned and if possible, an example of aodh alarm you want to create would be better. - Best regards, Lingxian Kong Catalyst Cloud On Fri, Jan 10, 2020 at 10:30 PM info at dantalion.nl wrote: > Hello, > > I was wondering how a service receiving an aodh webhook could perform > authentication? > > The documentation describes the webhook as a simple post-request so I > was wondering if a keystone token context is available when these > requests are received? > > If not, I was wondering if anyone had any recommendation on how to > perform authentication upon received post-requests? > > So far I have come up with limiting the functionality of these webhooks > such as rate-limiting and administrators having to explicitly enable > these webhooks before they work. > > Hope anyone else could provide further valuable information. > > Kind regards, > Corne Lukken > Watcher core-reviewer > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.page at canonical.com Fri Jan 10 11:22:46 2020 From: james.page at canonical.com (James Page) Date: Fri, 10 Jan 2020 11:22:46 +0000 Subject: [charms][watcher] OpenStack Watcher Charm In-Reply-To: References: <159661b1-7edf-e55d-c7b9-cf3b97bffffb@admin.grnet.gr> Message-ID: Dropping direct recipients as this causes a reject from openstack-discuss! On Fri, Jan 10, 2020 at 11:12 AM James Page wrote: > Hi Stamatis > > Thankyou for this work! > > I'll take a look at your charm over the next few days. > > On Wed, Jan 8, 2020 at 11:25 AM Stamatis Katsaounis < > skatsaounis at admin.grnet.gr> wrote: > >> Hi all, >> >> Purpose of this email is to let you know that we released an unofficial >> charm of OpenStack Watcher [1]. This charm gave us the opportunity to >> deploy OpenStack Watcher to our charmed OpenStack deployment. >> >> After seeing value in it, we decided to publish it through GRNET GitHub >> Organization account for several reasons. First of all, we would love to >> get feedback on it as it is our first try on creating an OpenStack reactive >> charm. Secondly, we would be glad to see other OpenStack operators deploy >> Watcher and share with us knowledge on the project and possible use cases. 
>> Finally, it would be ideal to come up with an official OpenStack Watcher >> charm repository under charmers umbrella. By doing this, another OpenStack >> project is going to be available not only for Train version but for any >> future version of OpenStack. Most important, the CI tests are going to >> ensure that the code is not broken and persuade other operators to use it. >> >> Before closing my email, I would like to give some insight on the >> architecture of the code base and the deployment process. To begin with, >> charm-watcher is based on other reactive OpenStack charms. During its >> deployment Barbican, Designate, Octavia and other charms' code bases were >> counseled. Furthermore, the structure is the same as any official OpenStack >> charm, of course without functional tests, which is something we cannot >> provide. >> > I'd suggest that we initiate the process to include your watcher charm as > part of the OpenStack Charmers project on opendev.org; once the initial > migration completes adding some functional tests should be fairly easy as > you'll be able to run them on the Canonical 3rd party CI infrastructure. > > This requires that a couple of reviews be raised - here are examples for > the new Manila Ganesha charms: > > https://review.opendev.org/#/c/693463/ > https://review.opendev.org/#/c/693462/ > > One is for the infrastructure setup, the other is to formally include the > repositories as part of the TC approved project. If you would like to > raise them for the watcher charm I'm happy to review with Frode (who is the > current PTL). > >> Speaking about the deployment process, apart from having a basic charmed >> OpenStack deployment, operator has to change two tiny configuration options >> on Nova cloud controller and Cinder. As explained in the Watcher >> configuration guide, special care has to be done with Oslo notifications >> for Nova and Cinder [2]. In order to achieve that in charmed OpenStack some >> issues were met and solved with the following patches [3], [4], [5], [6]. >> With these patches, operator can set the extra Oslo configuration and this >> is the only extra configuration needs to take place. Finally, with [7] >> Keystone charm can accept a relation with Watcher charm instead of ignoring >> it. >> >> To be able to deploy GRNET Watcher charm on Train, patches [3], [4], [5] >> and [7] have to be back-ported to stable/19.10 branch but that will require >> the approval of charmers team. Please let me know if such an option is >> available and in that case I am going to open the relevant patches. >> Furthermore, if you think that it could be a good option to create a spec >> and then introduce an official Watcher charm, I would love to help on that. >> > I'd rather we wait until the 20.02 charm release - dependent changes have > all landed and will be included. > > I wish all a happy new year and I am looking forward to your response and >> possible feedback. >> > > Happy new year to you as well! > > PS. If we could have an Ubuntu package for watcher-dashboard [8] like >> octavia-dashboard [9] we would release a charm for it as well. >> > > I'll chat with coreycb and see if we might be able to package that for > 20.04/Ussuri. > > Cheers > > James > >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thierry at openstack.org Fri Jan 10 11:56:47 2020 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 10 Jan 2020 12:56:47 +0100 Subject: [ops][largescale-sig] Collecting scaling stories Message-ID: <07b7df31-999d-8de9-839f-85830628855b@openstack.org> Hi everyone, As part of its goal of further pushing back scaling limits within a given cluster, the Large Scale SIG would like to collect scaling stories from OpenStack users. There is a size/load limit for single clusters past which things in OpenStack start to break, and we need to start using multiple clusters or cells to scale out. The SIG is interested in hearing: - what broke first for you, is it RabbitMQ or something else - what were the first symptoms - at what size/load did it start to break This will be a great help to document expected limits, and identify where improvements should be focused. You can contribute your experience by replying directly to this thread, or adding to the following etherpad: https://etherpad.openstack.org/p/scaling-stories Thanks in advance for your help ! -- Thierry Carrez (ttx) on behalf of the Large Scale SIG From mark at stackhpc.com Fri Jan 10 12:15:12 2020 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 10 Jan 2020 12:15:12 +0000 Subject: [release] making releases fast again (was: decentralising release approvals) In-Reply-To: <20200109170115.GA453843@sm-workstation> References: <20200109170115.GA453843@sm-workstation> Message-ID: On Thu, 9 Jan 2020 at 17:01, Sean McGinnis wrote: > > On Fri, Dec 20, 2019 at 11:04:36AM +0100, Thierry Carrez wrote: > > Mark Goddard wrote: > > > [...] > > > As kolla PTL and ironic release liaison I've proposed a number of > > > release patches recently. Generally the release team is good at churning > > > through these, but sometimes patches can hang around for a while. > > > Usually a ping on IRC will get things moving again within a day or so > > > (thanks in particular to Sean who has been very responsive). > > > > I agree we've seen an increase in processing delay lately, and I'd like to > > correct that. There are generally three things that would cause a > > perceptible delay in release processing... > > > > 1- wait for two release managers +2 > > > > This is something we put in place some time ago, as we had a lot of new > > members and thought that would be a good way to onboard them. Lately it > > created delays as a lot of those were not as active. > > > > 2- stable releases > > > > Two subcases in there... Eitherthe deliverable is under stable policy and > > there are *significant* delays there as we have to pause to give a chance to > > stable-maint-core people to voice an opinion. Or the deliverable is not > > under stable policy, but we do a manual check on the changes, as a way to > > educate the requester on semver. > > > > 3- waiting for PTL/release liaison to approve > > > > That can take a long time, but the release management team is not really at > > fault there. > > > > Coming back to hopefully wrap this up... > > We discussed this in today's release team meeting and decided to make some > changes to hopefully make things a little smoother. We will now use the > following guidelines for reviewing and approving release requests: > > For releases in the current development (including some time for the previous > cycle for the release-trailing deliverables) we will only require a single > reviewer. If everything looks good and there are no concerns, we will +2 and > approve the release request without waiting for a second. 
If the reviewer has > any doubts or hesitation, they can decide to wait for a second reviewer, but > this should be a much less common situation. > > For stable releases, we will require two +2s. We will not, however, wait for a > designated day for stable team review. If we can get one, all the better, but > the normal release team should be aware of stable rules and look for them for > any stable release request. Keeping the requirement for two reviewers should > help make sure nothing is overlooked with stable policy. > > We do still want PTL/liaison +1 to appove, so we will continue to wait for > that. Thierry is working on some job automation to make checking for that a > little easier, so hopefully that will help make that process as smooth as > possible. Thanks for taking action on this - I expect the above changes will be a big improvement. > > If there are any other questions or concerns, please do let us know. > > Sean > From mark at stackhpc.com Fri Jan 10 12:22:54 2020 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 10 Jan 2020 12:22:54 +0000 Subject: [kolla] Kayobe 7.0.0 released for Train Message-ID: Hi, I'm pleased to announce the release of Kayobe 7.0.0 - the first release in the Train series, and the first release as a deliverable of the Kolla project. Full details are available in the release notes which are available here: https://docs.openstack.org/releasenotes/kayobe/train.html We anticipate significant changes during the Train release series to support a migration to CentOS 8. We will communicate which releases are affected. Thanks to everyone who contributed to this release, and for the Kolla project for accepting us. I look forward to the continued integration of these teams. Cheers, Mark From thierry at openstack.org Fri Jan 10 12:40:22 2020 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 10 Jan 2020 13:40:22 +0100 Subject: [largescale-sig] Next meeting: Jan 15, 9utc Message-ID: <6e31eca3-31ee-b393-7c7d-0def96185b00@openstack.org> Hi everyone, The Large Scale SIG will have a meeting next week on Wednesday, Jan 15 at 9 UTC[1] in #openstack-meeting on IRC. I'll not be around but Belmiro Moreira volunteered to chair the meeting. [1] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200115T09 We had several TODOs out of our December meeting, so I invite you to review the summary of that meeting in preparation of the next: http://lists.openstack.org/pipermail/openstack-discuss/2019-December/011667.html As always, the agenda for the meeting next week is available at: https://etherpad.openstack.org/p/large-scale-sig-meeting Regards, -- Thierry Carrez (ttx) From info at dantalion.nl Fri Jan 10 12:50:10 2020 From: info at dantalion.nl (info at dantalion.nl) Date: Fri, 10 Jan 2020 13:50:10 +0100 Subject: [aodh][keystone] handling of webhook / alarm authentication In-Reply-To: References: Message-ID: <75131451-9b07-0dc8-2ed2-3573434e0e7d@dantalion.nl> Hi Lingxian, The information referenced comes from: https://docs.openstack.org/aodh/latest/admin/telemetry-alarms.html Here it would be an alarm that would use the webhooks action. The endpoint in our use case would be Watcher for which we have just passed a spec: https://review.opendev.org/#/c/695646/ With these alarms that report using a webhook I am wondering how these received alarms can be authenticated and if the keystone token context is available? Hope this makes it clearer. 
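(To make that concrete, the kind of alarm I have in mind would look roughly like the following -- the metric, threshold and endpoint URL are made up for illustration, and the exact flags should be double-checked against the aodhclient docs:

    openstack alarm create \
      --name watcher-instance-cpu-high \
      --type gnocchi_resources_threshold \
      --metric cpu_util --threshold 80 \
      --aggregation-method mean \
      --resource-type instance --resource-id <instance-uuid> \
      --alarm-action 'http://<watcher-api-host>/<webhook-path>'

When such an alarm fires, the endpoint only receives a plain POST with a JSON body describing the state transition, which is exactly where the authentication question comes from.)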
Kind regards, Corne Lukken Watcher core-reviewer On 1/10/20 11:44 AM, Lingxian Kong wrote: > Hi Corne, > > I didn't fully understand your question, could you please provide the doc > mentioned and if possible, an example of aodh alarm you want to create > would be better. > > - > Best regards, > Lingxian Kong > Catalyst Cloud > > > On Fri, Jan 10, 2020 at 10:30 PM info at dantalion.nl > wrote: > >> Hello, >> >> I was wondering how a service receiving an aodh webhook could perform >> authentication? >> >> The documentation describes the webhook as a simple post-request so I >> was wondering if a keystone token context is available when these >> requests are received? >> >> If not, I was wondering if anyone had any recommendation on how to >> perform authentication upon received post-requests? >> >> So far I have come up with limiting the functionality of these webhooks >> such as rate-limiting and administrators having to explicitly enable >> these webhooks before they work. >> >> Hope anyone else could provide further valuable information. >> >> Kind regards, >> Corne Lukken >> Watcher core-reviewer >> >> > From corey.bryant at canonical.com Fri Jan 10 13:11:46 2020 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 10 Jan 2020 08:11:46 -0500 Subject: [charms][watcher] OpenStack Watcher Charm In-Reply-To: References: <159661b1-7edf-e55d-c7b9-cf3b97bffffb@admin.grnet.gr> Message-ID: On Fri, Jan 10, 2020 at 6:12 AM James Page wrote: > Hi Stamatis > > Thankyou for this work! > > I'll take a look at your charm over the next few days. > > On Wed, Jan 8, 2020 at 11:25 AM Stamatis Katsaounis < > skatsaounis at admin.grnet.gr> wrote: > >> Hi all, >> >> Purpose of this email is to let you know that we released an unofficial >> charm of OpenStack Watcher [1]. This charm gave us the opportunity to >> deploy OpenStack Watcher to our charmed OpenStack deployment. >> >> After seeing value in it, we decided to publish it through GRNET GitHub >> Organization account for several reasons. First of all, we would love to >> get feedback on it as it is our first try on creating an OpenStack reactive >> charm. Secondly, we would be glad to see other OpenStack operators deploy >> Watcher and share with us knowledge on the project and possible use cases. >> Finally, it would be ideal to come up with an official OpenStack Watcher >> charm repository under charmers umbrella. By doing this, another OpenStack >> project is going to be available not only for Train version but for any >> future version of OpenStack. Most important, the CI tests are going to >> ensure that the code is not broken and persuade other operators to use it. >> >> Before closing my email, I would like to give some insight on the >> architecture of the code base and the deployment process. To begin with, >> charm-watcher is based on other reactive OpenStack charms. During its >> deployment Barbican, Designate, Octavia and other charms' code bases were >> counseled. Furthermore, the structure is the same as any official OpenStack >> charm, of course without functional tests, which is something we cannot >> provide. >> > I'd suggest that we initiate the process to include your watcher charm as > part of the OpenStack Charmers project on opendev.org; once the initial > migration completes adding some functional tests should be fairly easy as > you'll be able to run them on the Canonical 3rd party CI infrastructure. 
> > This requires that a couple of reviews be raised - here are examples for > the new Manila Ganesha charms: > > https://review.opendev.org/#/c/693463/ > https://review.opendev.org/#/c/693462/ > > One is for the infrastructure setup, the other is to formally include the > repositories as part of the TC approved project. If you would like to > raise them for the watcher charm I'm happy to review with Frode (who is the > current PTL). > >> Speaking about the deployment process, apart from having a basic charmed >> OpenStack deployment, operator has to change two tiny configuration options >> on Nova cloud controller and Cinder. As explained in the Watcher >> configuration guide, special care has to be done with Oslo notifications >> for Nova and Cinder [2]. In order to achieve that in charmed OpenStack some >> issues were met and solved with the following patches [3], [4], [5], [6]. >> With these patches, operator can set the extra Oslo configuration and this >> is the only extra configuration needs to take place. Finally, with [7] >> Keystone charm can accept a relation with Watcher charm instead of ignoring >> it. >> >> To be able to deploy GRNET Watcher charm on Train, patches [3], [4], [5] >> and [7] have to be back-ported to stable/19.10 branch but that will require >> the approval of charmers team. Please let me know if such an option is >> available and in that case I am going to open the relevant patches. >> Furthermore, if you think that it could be a good option to create a spec >> and then introduce an official Watcher charm, I would love to help on that. >> > I'd rather we wait until the 20.02 charm release - dependent changes have > all landed and will be included. > > I wish all a happy new year and I am looking forward to your response and >> possible feedback. >> > > Happy new year to you as well! > > PS. If we could have an Ubuntu package for watcher-dashboard [8] like >> octavia-dashboard [9] we would release a charm for it as well. >> > > I'll chat with coreycb and see if we might be able to package that for > 20.04/Ussuri. > > Hi, I'll take a look at packaging watcher-dashboard. Corey Cheers > > James > >> >> -- Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From marios at redhat.com Fri Jan 10 13:42:30 2020 From: marios at redhat.com (Marios Andreou) Date: Fri, 10 Jan 2020 15:42:30 +0200 Subject: [tripleo][ci] Proposing Sorin Barnea as TripleO-CI core Message-ID: I would like to propose Sorin Barnea (ssbarnea at redhat.com) as core on tripleo-ci repos (tripleo-ci, tripleo-quickstart, tripleo-quickstart-extras). Sorin has been a member of the tripleo-ci team for over one and a half years and has made many contributions across the tripleo-ci repos and beyond - highlights include helping the team to adopt molecule testing, leading linting efforts/changes/fixes and many others. Please vote by replying to this thread with +1 or -1 for any objections thanks marios -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlandy at redhat.com Fri Jan 10 13:52:54 2020 From: rlandy at redhat.com (Ronelle Landy) Date: Fri, 10 Jan 2020 08:52:54 -0500 Subject: [tripleo][ci] Proposing Sorin Barnea as TripleO-CI core In-Reply-To: References: Message-ID: +1 - thanks for your work here, Sorin, On Fri, Jan 10, 2020 at 8:44 AM Marios Andreou wrote: > I would like to propose Sorin Barnea (ssbarnea at redhat.com) as core on > tripleo-ci repos (tripleo-ci, tripleo-quickstart, > tripleo-quickstart-extras). 
> > Sorin has been a member of the tripleo-ci team for over one and a half > years and has made many contributions across the tripleo-ci repos and > beyond - highlights include helping the team to adopt molecule testing, > leading linting efforts/changes/fixes and many others. > > Please vote by replying to this thread with +1 or -1 for any objections > > thanks > marios > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bdobreli at redhat.com Fri Jan 10 14:07:34 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Fri, 10 Jan 2020 15:07:34 +0100 Subject: [tripleo][ci] Proposing Sorin Barnea as TripleO-CI core In-Reply-To: References: Message-ID: <51174bf8-5fdc-28d5-74de-0daa0e1f425a@redhat.com> On 10.01.2020 14:42, Marios Andreou wrote: > I would like to propose Sorin Barnea (ssbarnea at redhat.com > ) as core on tripleo-ci repos (tripleo-ci, > tripleo-quickstart, tripleo-quickstart-extras). > > Sorin has been a  member of the tripleo-ci team for over one and a half > years and has made many contributions across the tripleo-ci repos and > beyond - highlights include helping the team to adopt molecule testing, > leading linting efforts/changes/fixes and many others. > > Please vote by replying to this thread with +1 or -1 for any objections +1 Well deserved! > > thanks > marios > -- Best regards, Bogdan Dobrelya, Irc #bogdando From lshort at redhat.com Fri Jan 10 14:11:10 2020 From: lshort at redhat.com (Luke Short) Date: Fri, 10 Jan 2020 09:11:10 -0500 Subject: [tripleo][ci] Proposing Sorin Barnea as TripleO-CI core In-Reply-To: References: Message-ID: +1 Sorin has been an incredibly resourceful and creative thinker. He definitely deserves a spot as a TripleO CI Core! Keep up the amazing work! Luke Short, RHCE Software Engineer, OpenStack Deployment Framework Red Hat, Inc. On Fri, Jan 10, 2020 at 8:48 AM Marios Andreou wrote: > I would like to propose Sorin Barnea (ssbarnea at redhat.com) as core on > tripleo-ci repos (tripleo-ci, tripleo-quickstart, > tripleo-quickstart-extras). > > Sorin has been a member of the tripleo-ci team for over one and a half > years and has made many contributions across the tripleo-ci repos and > beyond - highlights include helping the team to adopt molecule testing, > leading linting efforts/changes/fixes and many others. > > Please vote by replying to this thread with +1 or -1 for any objections > > thanks > marios > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlandy at redhat.com Fri Jan 10 14:11:17 2020 From: rlandy at redhat.com (Ronelle Landy) Date: Fri, 10 Jan 2020 09:11:17 -0500 Subject: [tripleo][ci] Proposing Chandan Kumar and Arx Cruz as TripleO-CI cores Message-ID: Hello All, I'd like to propose Arx Cruz (arxcruz at redhat.com) and Chandan Kumar ( chkumar at redhat.com) as core on tripleo-ci repos (tripleo-ci, tripleo-quickstart, tripleo-quickstart-extras). In addition to the extensive work that Arx and Chandan have done on the Tempest-related repos ( and Tempest interface/settings within the Tripleo CI repos) , they have become active contributors to the core Tripleo CI repos, in general, in the past two years. Please vote by replying to this thread with +1 or -1 for any objections. We will close the vote 7 days from now. Thank you, Ronelle -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aschultz at redhat.com Fri Jan 10 14:36:59 2020 From: aschultz at redhat.com (Alex Schultz) Date: Fri, 10 Jan 2020 07:36:59 -0700 Subject: [tripleo][ci] Proposing Sorin Barnea as TripleO-CI core In-Reply-To: References: Message-ID: +1 On Fri, Jan 10, 2020 at 6:48 AM Marios Andreou wrote: > > I would like to propose Sorin Barnea (ssbarnea at redhat.com) as core on tripleo-ci repos (tripleo-ci, tripleo-quickstart, tripleo-quickstart-extras). > > Sorin has been a member of the tripleo-ci team for over one and a half years and has made many contributions across the tripleo-ci repos and beyond - highlights include helping the team to adopt molecule testing, leading linting efforts/changes/fixes and many others. > > Please vote by replying to this thread with +1 or -1 for any objections > > thanks > marios > From aschultz at redhat.com Fri Jan 10 14:37:24 2020 From: aschultz at redhat.com (Alex Schultz) Date: Fri, 10 Jan 2020 07:37:24 -0700 Subject: [tripleo][ci] Proposing Chandan Kumar and Arx Cruz as TripleO-CI cores In-Reply-To: References: Message-ID: +1 On Fri, Jan 10, 2020 at 7:16 AM Ronelle Landy wrote: > > Hello All, > > I'd like to propose Arx Cruz (arxcruz at redhat.com) and Chandan Kumar (chkumar at redhat.com) as core on tripleo-ci repos (tripleo-ci, tripleo-quickstart, tripleo-quickstart-extras). > > In addition to the extensive work that Arx and Chandan have done on the Tempest-related repos ( and Tempest interface/settings within the Tripleo CI repos) , they have become active contributors to the core Tripleo CI repos, in general, in the past two years. > > Please vote by replying to this thread with +1 or -1 for any objections. We will close the vote 7 days from now. > > Thank you, > Ronelle From marios at redhat.com Fri Jan 10 14:49:11 2020 From: marios at redhat.com (Marios Andreou) Date: Fri, 10 Jan 2020 16:49:11 +0200 Subject: [tripleo][ci] Proposing Chandan Kumar and Arx Cruz as TripleO-CI cores In-Reply-To: References: Message-ID: +1 On Fri, Jan 10, 2020 at 4:13 PM Ronelle Landy wrote: > Hello All, > > I'd like to propose Arx Cruz (arxcruz at redhat.com) and Chandan Kumar ( > chkumar at redhat.com) as core on tripleo-ci repos (tripleo-ci, > tripleo-quickstart, tripleo-quickstart-extras). > > In addition to the extensive work that Arx and Chandan have done on the > Tempest-related repos ( and Tempest interface/settings within the Tripleo > CI repos) , they have become active contributors to the core Tripleo CI > repos, in general, in the past two years. > > Please vote by replying to this thread with +1 or -1 for any objections. > We will close the vote 7 days from now. > > Thank you, > Ronelle > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Fri Jan 10 15:15:42 2020 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 10 Jan 2020 10:15:42 -0500 Subject: [tripleo][ci] Proposing Sorin Barnea as TripleO-CI core In-Reply-To: References: Message-ID: +1, Sorin has been providing meaningful and careful reviews on the CI projects. I think it's safe to promote him core at this point. Keep the good work going! On Fri, Jan 10, 2020 at 9:44 AM Alex Schultz wrote: > +1 > > On Fri, Jan 10, 2020 at 6:48 AM Marios Andreou wrote: > > > > I would like to propose Sorin Barnea (ssbarnea at redhat.com) as core on > tripleo-ci repos (tripleo-ci, tripleo-quickstart, > tripleo-quickstart-extras). 
> > > > Sorin has been a member of the tripleo-ci team for over one and a half > years and has made many contributions across the tripleo-ci repos and > beyond - highlights include helping the team to adopt molecule testing, > leading linting efforts/changes/fixes and many others. > > > > Please vote by replying to this thread with +1 or -1 for any objections > > > > thanks > > marios > > > > > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From kgiusti at gmail.com Fri Jan 10 15:24:51 2020 From: kgiusti at gmail.com (Ken Giusti) Date: Fri, 10 Jan 2020 10:24:51 -0500 Subject: [neutron][rabbitmq][oslo] Neutron-server service shows deprecated "AMQPDeprecationWarning" In-Reply-To: References: <4D3B074F-09F2-48BE-BD61-5D34CBFE509E@dkrz.de> <294c93b5-0ddc-284b-34a1-ffce654ba047@nemebean.com> <274FDC2A-837B-45CC-BFBF-8C09A182550A@dkrz.de> Message-ID: On Thu, Jan 9, 2020 at 2:27 PM Ben Nemec wrote: > > > On 1/9/20 5:57 AM, Amjad Kotobi wrote: > > Hi Ben, > > > >> On 7. Jan 2020, at 22:59, Ben Nemec wrote: > >> > >> > >> > >> On 1/7/20 9:14 AM, Amjad Kotobi wrote: > >>> Hi, > >>> Today we are facing losing connection of neutron especially during > instance creation or so as “systemctl status neutron-server” shows below > message > >>> be deprecated in amqp 2.2.0. > >>> Since amqp 2.0 you have to explicitly call Connection.connect() > >>> before using the connection. > >>> W_FORCE_CONNECT.format(attr=attr))) > >>> /usr/lib/python2.7/site-packages/amqp/connection.py:304: > AMQPDeprecationWarning: The .transport attribute on the connection was > accessed before > >>> the connection was established. This is supported for now, but will > >>> be deprecated in amqp 2.2.0. > >>> Since amqp 2.0 you have to explicitly call Connection.connect() > >>> before using the connection. > >>> W_FORCE_CONNECT.format(attr=attr))) > >> > >> It looks like this is a red herring, but it should be fixed in the > current oslo.messaging pike release. See [0] and the related bug. > >> > >> 0: https://review.opendev.org/#/c/605324/ > >> > >>> OpenStack release which we are running is “Pike”. > >>> Is there any way to remedy this? > >> > >> I don't think this should be a fatal problem in and of itself so I > suspect it's masking something else. However, I would recommend updating to > the latest pike release of oslo.messaging where the deprecated feature is > not used. If that doesn't fix the problem, please send us whatever errors > remain after this one is eliminated. > > > > I checked it out, we are having the latest pike Oslo.messaging and it > still showing the same upper messages. Any ideas? > > Hmm, not sure then. Are there any other log messages around that one > which might provide more context on where this is happening? > > I've also copied a couple of our messaging folks in case they have a > better idea what might be going on. > > This deprecation warning should not result in a connection failure. After the amqp connection code issues that warning it immediately calls connect() for you, which establishes the connection. What concerns me is that you're hitting that warning in latest Pike. That version of oslo.messaging should no longer trigger that warning. If at all possible can you get a traceback when the warning is issued? When did these failures start to occur? Did something change - upgrade/downgrade, etc? Otherwise if you can reproduce the problem can you get a debug-level log trace? 
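(One way to get that, if it helps: bump the messaging-related loggers in neutron.conf -- a minimal sketch using the standard oslo.log options, worth double-checking against your release:

    [DEFAULT]
    debug = True
    # Note: this replaces oslo.log's built-in list, so any logger not
    # mentioned here falls back to the global level.
    default_log_levels = amqp=DEBUG,amqplib=DEBUG,kombu=DEBUG,oslo.messaging=DEBUG

then restart neutron-server and reproduce the failure while watching the log.)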
thanks, > >> > >>> Thanks > >>> Amjad > > > > > > -- Ken Giusti (kgiusti at gmail.com) -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Fri Jan 10 15:38:41 2020 From: emilien at redhat.com (Emilien Macchi) Date: Fri, 10 Jan 2020 10:38:41 -0500 Subject: [tripleo][ci] Proposing Chandan Kumar and Arx Cruz as TripleO-CI cores In-Reply-To: References: Message-ID: +1 for Chandan; no doubt; he's always available on IRC to help when things go wrong in gate or promotion, and very often he's proposing the fix. Providing thoroughful reviews, and multi-project contributors, I've seen Chandan involved not only in TripleO CI but also in other projects like RDO and TripleO itself. I've seen him contributing to the tripleo-common and tripleoclient projects; which make him someone capable to understand not only how CI works but also how the project in general works. Having him core is to me natural. Number of commits/reviews shows his interests in the CI repos: https://www.stackalytics.com/?user_id=chandankumar-093047&release=train&metric=marks https://www.stackalytics.com/?user_id=chandankumar-093047&release=train&metric=commits ---- I hate playing devil's advocate here but I'll give my honest (and hopefully constructive) opinion. I would like to see more involvement from Arx in the TripleO community. He did a tremendous work on openstack-ansible-os_tempest; however this repo isn't governed by TripleO CI group. I would like to see more reviews; where he can bring his expertise; and not only in Gerrit but also on IRC when things aren't going well (gate issues, promotion blockers, etc). Number of commits/reviews aren't low but IMHO can be better for a core reviewer. https://www.stackalytics.com/?user_id=arxcruz&release=train&metric=commits https://www.stackalytics.com/?user_id=arxcruz&release=train&metric=marks I don't think it'll take time until Arx gets there but to me it's a -1 for now, for what it's worth. Emilien On Fri, Jan 10, 2020 at 9:20 AM Ronelle Landy wrote: > Hello All, > > I'd like to propose Arx Cruz (arxcruz at redhat.com) and Chandan Kumar ( > chkumar at redhat.com) as core on tripleo-ci repos (tripleo-ci, > tripleo-quickstart, tripleo-quickstart-extras). > > In addition to the extensive work that Arx and Chandan have done on the > Tempest-related repos ( and Tempest interface/settings within the Tripleo > CI repos) , they have become active contributors to the core Tripleo CI > repos, in general, in the past two years. > > Please vote by replying to this thread with +1 or -1 for any objections. > We will close the vote 7 days from now. > > Thank you, > Ronelle > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From sgolovat at redhat.com Fri Jan 10 16:45:21 2020 From: sgolovat at redhat.com (Sergii Golovatiuk) Date: Fri, 10 Jan 2020 17:45:21 +0100 Subject: [tripleo][ci] Proposing Sorin Barnea as TripleO-CI core In-Reply-To: References: Message-ID: +1 пт, 10 янв. 2020 г. в 16:17, Emilien Macchi : > +1, Sorin has been providing meaningful and careful reviews on the CI > projects. > I think it's safe to promote him core at this point. > > Keep the good work going! > > On Fri, Jan 10, 2020 at 9:44 AM Alex Schultz wrote: > >> +1 >> >> On Fri, Jan 10, 2020 at 6:48 AM Marios Andreou wrote: >> > >> > I would like to propose Sorin Barnea (ssbarnea at redhat.com) as core on >> tripleo-ci repos (tripleo-ci, tripleo-quickstart, >> tripleo-quickstart-extras). 
>> > >> > Sorin has been a member of the tripleo-ci team for over one and a half >> years and has made many contributions across the tripleo-ci repos and >> beyond - highlights include helping the team to adopt molecule testing, >> leading linting efforts/changes/fixes and many others. >> > >> > Please vote by replying to this thread with +1 or -1 for any objections >> > >> > thanks >> > marios >> > >> >> >> > > -- > Emilien Macchi > -- Sergii Golovatiuk Senior Software Developer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Fri Jan 10 17:12:00 2020 From: whayutin at redhat.com (Wesley Hayutin) Date: Fri, 10 Jan 2020 10:12:00 -0700 Subject: [tripleo] rocky builds Message-ID: Greetings, I've confirmed that builds from the Rocky release will no longer be imported. Looking for input from the upstream folks with regards to maintaining the Rocky release for upstream. Can you please comment if you have any requirement to continue building, patching Rocky as I know there are active reviews [1]. I've added this topic to be discussed at the next meeting [2] Thank you! [1] https://review.opendev.org/#/q/status:open+tripleo+branch:stable/rocky [2] https://etherpad.openstack.org/p/tripleo-meeting-items -------------- next part -------------- An HTML attachment was scrubbed... URL: From duc.openstack at gmail.com Fri Jan 10 17:18:05 2020 From: duc.openstack at gmail.com (Duc Truong) Date: Fri, 10 Jan 2020 09:18:05 -0800 Subject: [aodh][keystone] handling of webhook / alarm authentication In-Reply-To: <75131451-9b07-0dc8-2ed2-3573434e0e7d@dantalion.nl> References: <75131451-9b07-0dc8-2ed2-3573434e0e7d@dantalion.nl> Message-ID: Senlin implements unauthenticated webhooks [1] that can be called by aodh. The webhook id is a uuid that is generated for each webhook. When the webhook is created, Senlin creates a keystone trust with the user to perform actions on their behalf when the webhook is received. That is probably the easiest way to implement webhooks without worrying about passing the keystone token context. [1] https://docs.openstack.org/api-ref/clustering/#trigger-webhook-action On Fri, Jan 10, 2020 at 4:48 AM info at dantalion.nl wrote: > > Hi Lingxian, > > The information referenced comes from: > https://docs.openstack.org/aodh/latest/admin/telemetry-alarms.html > > Here it would be an alarm that would use the webhooks action. The > endpoint in our use case would be Watcher for which we have just passed > a spec: https://review.opendev.org/#/c/695646/ > > With these alarms that report using a webhook I am wondering how these > received alarms can be authenticated and if the keystone token context > is available? > > Hope this makes it clearer. > > Kind regards, > Corne Lukken > Watcher core-reviewer > > On 1/10/20 11:44 AM, Lingxian Kong wrote: > > Hi Corne, > > > > I didn't fully understand your question, could you please provide the doc > > mentioned and if possible, an example of aodh alarm you want to create > > would be better. > > > > - > > Best regards, > > Lingxian Kong > > Catalyst Cloud > > > > > > On Fri, Jan 10, 2020 at 10:30 PM info at dantalion.nl > > wrote: > > > >> Hello, > >> > >> I was wondering how a service receiving an aodh webhook could perform > >> authentication? > >> > >> The documentation describes the webhook as a simple post-request so I > >> was wondering if a keystone token context is available when these > >> requests are received? 
> >> > >> If not, I was wondering if anyone had any recommendation on how to > >> perform authentication upon received post-requests? > >> > >> So far I have come up with limiting the functionality of these webhooks > >> such as rate-limiting and administrators having to explicitly enable > >> these webhooks before they work. > >> > >> Hope anyone else could provide further valuable information. > >> > >> Kind regards, > >> Corne Lukken > >> Watcher core-reviewer > >> > >> > > > From sshnaidm at redhat.com Fri Jan 10 17:25:56 2020 From: sshnaidm at redhat.com (Sagi Shnaidman) Date: Fri, 10 Jan 2020 19:25:56 +0200 Subject: [tripleo][ci] Proposing Sorin Barnea as TripleO-CI core In-Reply-To: References: Message-ID: +1! On Fri, Jan 10, 2020 at 3:44 PM Marios Andreou wrote: > I would like to propose Sorin Barnea (ssbarnea at redhat.com) as core on > tripleo-ci repos (tripleo-ci, tripleo-quickstart, > tripleo-quickstart-extras). > > Sorin has been a member of the tripleo-ci team for over one and a half > years and has made many contributions across the tripleo-ci repos and > beyond - highlights include helping the team to adopt molecule testing, > leading linting efforts/changes/fixes and many others. > > Please vote by replying to this thread with +1 or -1 for any objections > > thanks > marios > > -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Fri Jan 10 21:36:25 2020 From: whayutin at redhat.com (Wesley Hayutin) Date: Fri, 10 Jan 2020 14:36:25 -0700 Subject: [tripleo] introducing zuul-runner Message-ID: Greetings, Hey everyone I want to highlight work that some of our good friends in zuul are working another way to try and help everyone debug their jobs. The tool is called zuul-runner and the spec is here [1], code is here [2] Please have a look through the spec and vote to show support for Tristan's work. Thanks to Arx Cruz for contributing and testing it as well w/ TripleO jobs. This will be another very nice tool in the toolbox for us if we can get it through. Thanks! [1] https://review.opendev.org/#/c/681277/ [2] https://review.opendev.org/#/c/607078/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Fri Jan 10 21:57:58 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 10 Jan 2020 15:57:58 -0600 Subject: [all] Announcing OpenStack Victoria! Message-ID: <20200110215758.GB536693@sm-workstation> Hello everyone, The polling results are in, and the legal vetting process has now completed. We now have an official name for the "V" release. The full results of the poll can be found here: https://civs.cs.cornell.edu/cgi-bin/results.pl?num_winners=1&id=E_13ccd49b66cfd1b4&rkey=4e184724fa32eed6&algorithm=minimax While Victoria and Vancouver were technically a tie, based on the Minimax rankingi, it puts Victoria slightly ahead of Vancouver based on the votes. In addition to that, we chose to have the TC do a tie breaker vote which confirmed Victoria as the winner. Victoria is the capital city of British Columbia: https://en.wikipedia.org/wiki/Victoria,_British_Columbia Thank you all for participating in the release naming! 
Sean From sean.mcginnis at gmx.com Fri Jan 10 22:02:17 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Fri, 10 Jan 2020 16:02:17 -0600 Subject: [release] Release countdown for week R-17, January 13-17 Message-ID: <20200110220217.GC536693@sm-workstation> Development Focus ----------------- The Ussuri-2 milestone will happen in next month, on February 13. Ussuri-related specs should now be finalized so that teams can move to implementation ASAP. Some teams observe specific deadlines on the second milestone (mostly spec freezes): please refer to https://releases.openstack.org/ussuri/schedule.html for details. General Information ------------------- Please remember that libraries need to be released at least once per milestone period. At milestone 2, the release team will propose releases for any library that has not been otherwise released since milestone 1. Other non-library deliverables that follow the cycle-with-intermediary release model should have an intermediary release before milestone-2. Those who haven't will be proposed to switch to the cycle-with-rc model, which is more suited to deliverables that are released only once per cycle. At milestone-2 we also freeze the contents of the final release. If you have a new deliverable that should be included in the final release, you should make sure it has a deliverable file in: https://opendev.org/openstack/releases/src/branch/master/deliverables/ussuri You should request a beta release (or intermediary release) for those new deliverables by milestone-2. We understand some may not be quite ready for a full release yet, but if you have something minimally viable to get released it would be good to do a 0.x release to exercise the release tooling for your deliverables. See the MembershipFreeze description for more details: https://releases.openstack.org/ussuri/schedule.html#u-mf Finally, now may be a good time for teams to check on any stable releases that need to be done for your deliverables. If you have bugfixes that have been backported, but no stable release getting those. If you are unsure what is out there committed but not released, in the openstack/releases repo, running the command "tools/list_stable_unreleased_changes.sh " gives a nice report. Upcoming Deadlines & Dates -------------------------- Ussuri-2 Milestone: February 13 (R-13 week) From anlin.kong at gmail.com Sat Jan 11 01:06:37 2020 From: anlin.kong at gmail.com (Lingxian Kong) Date: Sat, 11 Jan 2020 14:06:37 +1300 Subject: [aodh][keystone] handling of webhook / alarm authentication In-Reply-To: <75131451-9b07-0dc8-2ed2-3573434e0e7d@dantalion.nl> References: <75131451-9b07-0dc8-2ed2-3573434e0e7d@dantalion.nl> Message-ID: On Sat, Jan 11, 2020 at 1:47 AM info at dantalion.nl wrote: > With these alarms that report using a webhook I am wondering how these > received alarms can be authenticated and if the keystone token context > is available? > Aodh supports to create an alarm with actions such as 'trust+http://', once the alarm is triggered, the URL service will receive POST request with 'X-Auth-Token' in the headers and alarm information in the body. - Best regards, Lingxian Kong Catalyst Cloud -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From agarwalvishakha18 at gmail.com Sat Jan 11 10:37:30 2020 From: agarwalvishakha18 at gmail.com (vishakha agarwal) Date: Sat, 11 Jan 2020 16:07:30 +0530 Subject: [keystone] Keystone Team Update - Week of 6 January 2020 Message-ID: # Keystone Team Update - Week of 6 January 2020 ## News ### User Support and Bug Duty The person in-charge for bug duty for current and upcoming week can be seen on the etherpad [1] [1] https://etherpad.openstack.org/p/keystone-l1-duty ## Action Items The one-fourth of the ussuri cycle is almost over. We need to find a new mechanism for the retrospective and to check our progress for this cycle which is more convenient and less time consuming for the members. ## Open Specs Ussuri specs: https://bit.ly/2XDdpkU Ongoing specs: https://bit.ly/2OyDLTh ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 19 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 37 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ### Priority Reviews * Community Goals https://review.opendev.org/#/c/699127/ [ussuri][goal] Drop python 2.7 support and testing keystone-tempest-plugin https://review.opendev.org/#/c/699126/ [ussuri][goal] Drop python 2.7 support and testing ldappool https://review.opendev.org/#/c/699119/ [ussuri][goal] Drop python 2.7 support and testing python-keystoneclient * Special Requests https://review.opendev.org/#/c/662734/ Change the default Identity endpoint to internal https://review.opendev.org/#/c/699013/ Always have username in CADF initiator https://review.opendev.org/#/c/700826/ Fix role_assignments role.id filter ## Bugs This week we opened 3 new bugs and closed 5. Bugs opened (3) Bug #1858410 (keystone:Low): Got error 'NoneType' when executing unittest on stable/rocky - Opened by Eric Xie https://bugs.launchpad.net/keystone/+bug/1858410 Bug #1858186 (keystoneauth:Undecided): http_log_request will print debug info include pki certificate which is unsafety - Opened by kuangpeiling https://bugs.launchpad.net/keystoneauth/+bug/1858186 Bug #1858189 (keystoneauth:Undecided): http_log_request will print debug info include pki certificate which is unsafety - Opened by kuangpeiling https://bugs.launchpad.net/keystoneauth/+bug/1858189 Bugs closed (5) Bug #1858186 (keystoneauth:Undecided) https://bugs.launchpad.net/keystoneauth/+bug/1858186 Bug #1856881 (keystone:Medium): keystone-manage bootstrap fails with ambiguous role names - Fixed by Lance Bragstad https://bugs.launchpad.net/keystone/+bug/1856881 Bug #1856962 (keystone:Undecided): openid method failed when federation_group_ids is empty list - Fixed by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1856962 Bug #1857086 (keystone: Won't Fix) https://bugs.launchpad.net/keystone/+bug/1857086 Bug #1831018 (keystone: Invalid)https://bugs.launchpad.net/keystone/+bug/1831018 ## Milestone Outlook https://releases.openstack.org/ussuri/schedule.html Spec freeze is on the week of 10 February. All the specs targeted for this cycle should be ready for the review soon. 
## Help with this newsletter

Help contribute to this newsletter by editing the etherpad:
https://etherpad.openstack.org/p/keystone-team-newsletter

From arxcruz at redhat.com Sat Jan 11 14:39:09 2020
From: arxcruz at redhat.com (Arx Cruz)
Date: Sat, 11 Jan 2020 15:39:09 +0100
Subject: [tripleo][ci] Proposing Chandan Kumar and Arx Cruz as TripleO-CI cores
In-Reply-To: References: Message-ID:
Hello Emilien,

Thanks for your feedback, I really appreciate it. You are right, there are
areas where I can improve, I will work on them, and I am really looking
forward to having your help.

Regarding the number of reviews and commits, it's true that I haven't been
very active on the tripleo upstream projects, but please remember that
stackalytics only reflects the projects under the tripleo umbrella, and you
know that in tripleo-ci we also work on the RDO side, where I've been more
active; right now I am working on integration with third-party projects like
podman and ceph-ansible, which are not directly related to TripleO but are
key projects for TripleO to work properly.

Also, looking only at the latest release's patches doesn't seem entirely
fair; if you check the previous release I have more than double the reviews
(although, yes, the number of commits remains stable), and I probably won't
have many reviews or commits in the Ussuri release either, since I was on
vacation for most of December.

Also, and please correct me if I am wrong, I don't remember any time that
people pinged me on IRC and I did not reply or was not prompt to help; if
that happened, please accept my sincere apologies. As you know, when things
are on fire (for example a long stretch without promotions, like the last
sprint when I was ruck and rover) our focus is to get things back to normal.

One more time, I am taking your feedback, I'll do my best to improve in the
areas you point out, and hopefully I will change your mind regarding my core
promotion.

Kind regards,
Arx Cruz

On Fri, 10 Jan 2020 at 16:45 Emilien Macchi wrote:

> +1 for Chandan; no doubt; he's always available on IRC to help when things
> go wrong in gate or promotion, and very often he's proposing the fix.
> Providing thoroughful reviews, and multi-project contributors, I've seen
> Chandan involved not only in TripleO CI but also in other projects like RDO
> and TripleO itself. I've seen him contributing to the tripleo-common and
> tripleoclient projects; which make him someone capable to understand not
> only how CI works but also how the project in general works. Having him
> core is to me natural.
>
> Number of commits/reviews shows his interests in the CI repos:
>
> https://www.stackalytics.com/?user_id=chandankumar-093047&release=train&metric=marks
>
> https://www.stackalytics.com/?user_id=chandankumar-093047&release=train&metric=commits
>
> ----
>
> I hate playing devil's advocate here but I'll give my honest (and
> hopefully constructive) opinion.
> I would like to see more involvement from Arx in the TripleO community. He
> did a tremendous work on openstack-ansible-os_tempest; however this repo
> isn't governed by TripleO CI group. I would like to see more reviews; where
> he can bring his expertise; and not only in Gerrit but also on IRC when
> things aren't going well (gate issues, promotion blockers, etc).
>
> Number of commits/reviews aren't low but IMHO can be better for a core
> reviewer.
> https://www.stackalytics.com/?user_id=arxcruz&release=train&metric=commits > https://www.stackalytics.com/?user_id=arxcruz&release=train&metric=marks > > I don't think it'll take time until Arx gets there but to me it's a -1 for > now, for what it's worth. > > Emilien > > On Fri, Jan 10, 2020 at 9:20 AM Ronelle Landy wrote: > >> Hello All, >> >> I'd like to propose Arx Cruz (arxcruz at redhat.com) and Chandan Kumar ( >> chkumar at redhat.com) as core on tripleo-ci repos (tripleo-ci, >> tripleo-quickstart, tripleo-quickstart-extras). >> >> In addition to the extensive work that Arx and Chandan have done on the >> Tempest-related repos ( and Tempest interface/settings within the Tripleo >> CI repos) , they have become active contributors to the core Tripleo CI >> repos, in general, in the past two years. >> >> Please vote by replying to this thread with +1 or -1 for any objections. >> We will close the vote 7 days from now. >> >> Thank you, >> Ronelle >> > > > -- > Emilien Macchi > -- Arx Cruz Software Engineer Red Hat EMEA arxcruz at redhat.com @RedHat Red Hat Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Sat Jan 11 16:06:36 2020 From: emilien at redhat.com (Emilien Macchi) Date: Sat, 11 Jan 2020 11:06:36 -0500 Subject: [tripleo][ci] Proposing Chandan Kumar and Arx Cruz as TripleO-CI cores In-Reply-To: References: Message-ID: Arx, First of all I want to repeat that it has nothing to do with the quality of your work. Again, I'm aware of what you've been working on and I appreciate what you have been doing with the CI team. The major issue that I'm dealing with as a major maintainer of TripleO is that over the past years we have promoted a lot of people to be core reviewers; but if you closely look at numbers: most of the reviews are done by 3 people; this is problematic when one of us is absent; and even more problematic if one of us one day leave. I have the feeling that promoting more core developers hasn't solved that problem; and there are few folks currently core that should not be core anymore IMO; because they don't review much and aren't much involved as "core maintainers". Being a core reviewer means you're an official maintainer. You maintain the code, wherever it is; if it's something that your direct peer wrote or something that $random_contributor wrote. Very often we have promoted cores who only review things from their direct peers and this has been problematic because 1) reviews are done by silos and 2) some parts of the project aren't reviewed at all. It has nothing to do with you but just to give you a bit of context on why I'm being more conservative now. You said that you have spent major time on things not under TripleO umbrella: please know that I'm aware of this, I'm watching it and I appreciate it. However we are talking about TripleO CI core which is under TripleO umbrella. Not Podman, not RDO CI etc. Which is why I went looking to Stackalytics to see numbers (even if I take them with a grain of salt). The core promotion is a decision that is taken as a group. My -1 doesn't mean you won't be core, it just means I had to provide some feedback on why I'm reluctant of you being core as of now. It doesn't mean I don't find your work valuable or that you're not helping on IRC; actually you're doing great. I just think that the bar is a bit higher compared to my taste and I don't think you're far from reaching it. Now, this is only my opinion and what it's worth. 
My hope is that 1) you continue to improve your involvement in TripleO and 2) our core reviewers do more reviews because it can't only be 3 persons who do more than 70% of the reviews. Have a great weekend, Emilien On Sat, Jan 11, 2020 at 9:39 AM Arx Cruz wrote: > Hello Emilien, > > Thanks for your feedback, I really appreciate it. > You are right, there are places that I really can improve, and I will work > to improve it, and I really looking forward to have your help. > > Regarding the amount of reviews and commits, it’s true that I haven’t be > so active on tripleo upstream projects, but please, remember that > stackalytics only reflect the projects under tripleo umbrella, and you know > that in tripleo-ci we also work on rdo side, where I’ve been working more > activelly, right now, working on integration with thirdy party projects > like podman and ceph-ansible, which is not directly related to Tripleo > indeed, but are key projects to Tripleo work properly. > > Also, look only in the latest release patches doesn’t seems to be too > fair, if you check the previous release I have more than double of reviews > (although yes, the number of commits remains stable), and probably if you > get the Ussuri release, I will not have too much reviews or commits, since > I’ve been on vacation mostly of the december. > > Also, and please, correct me if I am wrong, I don’t remember anytime that > people ping me on IRC and I did not reply, or was prompt to help, if that > happens, please accept my sincere apologies, as you know, when things are > on fire (long time without promotions for example, like the last sprint I > was ruck and rover) our focus is to make things get back to normal. > > One more time, I am taking your feedback, and I’ll do my best to improve > in the areas you point, and hopefully change your mind regarding my core > promotion. > > Kind regards, > Arx Cruz > > On Fri, 10 Jan 2020 at 16:45 Emilien Macchi wrote: > >> +1 for Chandan; no doubt; he's always available on IRC to help when >> things go wrong in gate or promotion, and very often he's proposing the fix. >> Providing thoroughful reviews, and multi-project contributors, I've seen >> Chandan involved not only in TripleO CI but also in other projects like RDO >> and TripleO itself. I've seen him contributing to the tripleo-common and >> tripleoclient projects; which make him someone capable to understand not >> only how CI works but also how the project in general works. Having him >> core is to me natural. >> >> Number of commits/reviews shows his interests in the CI repos: >> >> https://www.stackalytics.com/?user_id=chandankumar-093047&release=train&metric=marks >> >> https://www.stackalytics.com/?user_id=chandankumar-093047&release=train&metric=commits >> >> ---- >> >> I hate playing devil's advocate here but I'll give my honest (and >> hopefully constructive) opinion. >> I would like to see more involvement from Arx in the TripleO community. >> He did a tremendous work on openstack-ansible-os_tempest; however this repo >> isn't governed by TripleO CI group. I would like to see more reviews; where >> he can bring his expertise; and not only in Gerrit but also on IRC when >> things aren't going well (gate issues, promotion blockers, etc). >> >> Number of commits/reviews aren't low but IMHO can be better for a core >> reviewer. 
>> https://www.stackalytics.com/?user_id=arxcruz&release=train&metric=commits >> https://www.stackalytics.com/?user_id=arxcruz&release=train&metric=marks >> >> I don't think it'll take time until Arx gets there but to me it's a -1 >> for now, for what it's worth. >> >> Emilien >> >> On Fri, Jan 10, 2020 at 9:20 AM Ronelle Landy wrote: >> >>> Hello All, >>> >>> I'd like to propose Arx Cruz (arxcruz at redhat.com) and Chandan Kumar ( >>> chkumar at redhat.com) as core on tripleo-ci repos (tripleo-ci, >>> tripleo-quickstart, tripleo-quickstart-extras). >>> >>> In addition to the extensive work that Arx and Chandan have done on the >>> Tempest-related repos ( and Tempest interface/settings within the Tripleo >>> CI repos) , they have become active contributors to the core Tripleo CI >>> repos, in general, in the past two years. >>> >>> Please vote by replying to this thread with +1 or -1 for any objections. >>> We will close the vote 7 days from now. >>> >>> Thank you, >>> Ronelle >>> >> >> >> -- >> Emilien Macchi >> > -- > > Arx Cruz > > Software Engineer > > Red Hat EMEA > > arxcruz at redhat.com > @RedHat Red Hat > Red Hat > > > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Sat Jan 11 17:44:44 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Sat, 11 Jan 2020 11:44:44 -0600 Subject: [magnum][kolla] etcd wal sync duration issue Message-ID: <046E9C0290DD9149B106B72FC9156BEA0477170E@gmsxchsvr01.thecreation.com> Hi, We are using the following coe cluster template and cluster create commands on an OpenStack Stein installation that installs Magnum 8.2.0 Kolla containers installed by Kolla-Ansible 8.0.1: openstack coe cluster template create \ --image Fedora-AtomicHost-29-20191126.0.x86_64_raw \ --keypair userkey \ --external-network ext-net \ --dns-nameserver 1.1.1.1 \ --master-flavor c5sd.4xlarge \ --flavor m5sd.4xlarge \ --coe kubernetes \ --network-driver flannel \ --volume-driver cinder \ --docker-storage-driver overlay2 \ --docker-volume-size 100 \ --registry-enabled \ --master-lb-enabled \ --floating-ip-disabled \ --fixed-network KubernetesProjectNetwork001 \ --fixed-subnet KubernetesProjectSubnet001 \ --labels kube_tag=v1.15.7,cloud_provider_tag=v1.15.0,heat_container_agent_tag=ste in-dev,master_lb_floating_ip_enabled=true \ k8s-cluster-template-1.15.7-production-private openstack coe cluster create \ --cluster-template k8s-cluster-template-1.15.7-production-private \ --keypair userkey \ --master-count 3 \ --node-count 3 \ k8s-cluster001 The deploy process works perfectly, however, the cluster health status flips between healthy and unhealthy. The unhealthy status indicates that etcd has an issue. 
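For reference, a quick way to confirm whether the etcd data disk is actually
fast enough is to measure fdatasync latency the way the etcd hardware guides
suggest. This is only a rough sketch: it assumes fio is installed on the
master node and that etcd's data directory is under /var/lib/etcd (adjust
--directory to point at the same disk if it lives elsewhere on a
Magnum-built master):

    # etcd expects the 99th percentile of WAL fsync latency to stay well under 10ms
    sudo fio --rw=write --ioengine=sync --fdatasync=1 \
        --directory=/var/lib/etcd --size=22m --bs=2300 --name=etcd-disk-check

If the reported fdatasync percentiles come back low, the slow WAL sync
warnings shown below are more likely caused by CPU contention or I/O
throttling of the etcd container than by the disk itself.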
When logged into master-0 (out of 3, as configured above), "systemctl status etcd" shows the stdout from etcd, which shows: Jan 11 17:27:36 k8s-cluster001-4effrc2irvjq-master-0.novalocal runc[2725]: 2020-01-11 17:27:36.548453 W | etcdserver: timed out waiting for read index response Jan 11 17:28:02 k8s-cluster001-4effrc2irvjq-master-0.novalocal runc[2725]: 2020-01-11 17:28:02.960977 W | wal: sync duration of 1.696804699s, expected less than 1s Jan 11 17:28:31 k8s-cluster001-4effrc2irvjq-master-0.novalocal runc[2725]: 2020-01-11 17:28:31.292753 W | wal: sync duration of 2.249722223s, expected less than 1s We also see: Jan 11 17:40:39 k8s-cluster001-4effrc2irvjq-master-0.novalocal runc[2725]: 2020-01-11 17:40:39.132459 I | etcdserver/api/v3rpc: grpc: Server.processUnaryRPC failed to write status: stream error: code = DeadlineExceeded desc = "context deadline exceeded" We initially used relatively small flavors, but increased these to something very large to be sure resources were not constrained in any way. "top" reported no CPU nor memory contention on any nodes in either case. Multiple clusters have been deployed, and they all have this issue, including empty clusters that were just deployed. I see a very large number of reports of similar issues with etcd, but discussions lead to disk performance, which can't be the cause here, not only because persistent storage for etcd isn't configured in Magnum, but also the disks are "very" fast in this environment. Looking at "vmstat -D" from within master-0, the number of writes is minimal. Ceilometer logs about 15 to 20 write IOPS for this VM in Gnocchi. Any ideas? We are finalizing procedures to upgrade to Train, so we wanted to be sure that we weren't running into some common issue with Stein that would immediately be solved with Train. If so, we will simply proceed with the upgrade and avoid diagnosing this issue further. Thanks! Eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From ssbarnea at redhat.com Sun Jan 12 10:02:55 2020 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Sun, 12 Jan 2020 10:02:55 +0000 Subject: [tripleo][ci] Proposing Sorin Barnea as TripleO-CI core In-Reply-To: References: Message-ID: <88ED8214-C7A3-44D7-B8C0-431C31DFC4F2@redhat.com> # Thanks marios and everyone for your support! I am looking forward to simplify tripleo, with extra focus on improving testing (with results in minutes, not hours) Cheers Sorin > On 10 Jan 2020, at 13:42, Marios Andreou wrote: > > I would like to propose Sorin Barnea (ssbarnea at redhat.com) as core on tripleo-ci repos (tripleo-ci, tripleo-quickstart, tripleo-quickstart-extras). > > Sorin has been a member of the tripleo-ci team for over one and a half years and has made many contributions across the tripleo-ci repos and beyond - highlights include helping the team to adopt molecule testing, leading linting efforts/changes/fixes and many others. > > Please vote by replying to this thread with +1 or -1 for any objections > > thanks > marios > From hongbin034 at gmail.com Sun Jan 12 16:44:46 2020 From: hongbin034 at gmail.com (Hongbin Lu) Date: Sun, 12 Jan 2020 11:44:46 -0500 Subject: [neutron] Bug deputy report - Jan 06 to 12 Message-ID: Hi, I was on bug deputy last week. Below is my summary of it. 
Critical: https://bugs.launchpad.net/neutron/+bug/1858645 https://bugs.launchpad.net/neutron/+bug/1858421 High: https://bugs.launchpad.net/neutron/+bug/1858661 https://bugs.launchpad.net/neutron/+bug/1858642 Medium: https://bugs.launchpad.net/neutron/+bug/1859163 https://bugs.launchpad.net/neutron/+bug/1858680 https://bugs.launchpad.net/neutron/+bug/1858419 Low: https://bugs.launchpad.net/neutron/+bug/1859258 https://bugs.launchpad.net/neutron/+bug/1859190 https://bugs.launchpad.net/neutron/+bug/1858783 RFE: https://bugs.launchpad.net/neutron/+bug/1858610 -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Sun Jan 12 18:20:27 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Sun, 12 Jan 2020 19:20:27 +0100 Subject: [all] DevStack jobs broken due to setuptools not available for Python 2 Message-ID: Hi all, I noticed DevStack jobs fail all over the place [1] due to: UnsupportedPythonVersion: Package 'setuptools' requires a different Python: 2.7.17 not in '>=3.5' Bug reported in [2]. Notice USE_PYTHON3=True does not help as stack.sh is hardcoded to versionless Python. [1] https://zuul.opendev.org/t/openstack/builds?result=RETRY_LIMIT&result=FAILURE [2] https://bugs.launchpad.net/devstack/+bug/1859350 -yoctozepto From sundar.nadathur at intel.com Sun Jan 12 21:41:16 2020 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Sun, 12 Jan 2020 21:41:16 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> Message-ID: Hi Arkady and all, Good discussions and questions. First, it is good to clarify what we mean by lifecycle management. It includes: * Discovery: We need to get more than just the PCI IDs/addresses of devices. We would need their properties and features as well. This is especially the case for programmable devices, as the properties and changes can change over time, though the PCI ID may not. * Scheduling: We would want to schedule the application that needs offload based on the properties/features discovered above. * Programming/configuration and/or firmware update. More on this later. * Health management: discover the health of a device, esp. if programming/configuration etc. fail. * Inventory management: Track the fleet of accelerators based on their properties/features. * Other aspects that I won't dwell on here. In short, lifecycle management is more than just firmware update. Secondly, regarding the difference between programming and firmware updates, some key questions are: 1. What does the device configuration do? A. Expose properties/features relevant to scheduling: Could be for a specific application or workload (e.g. apply a new AI algorithm) Or expose new/premium device features (e.g. enable more memory banks) B. Update general features not relevant to scheduling. E.g. fix a bug in BMC firmware. 2. When/how is the device configuration done? A. Dynamically: as instances are provisioned/retired, based on time of day, workload demand, etc. This would be part of OpenStack workflow. B. Statically: as part of the physical host configuration. This is typically done 'offline', perhaps in a maintenance window, often using external frameworks like Redfish/Ansible/Puppet/Chef/... The combination 1A+2A is what I'd call programming, while 1B+2B is firmware update. I don't see a motivation for 1B+2A. 
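As a rough illustration of what 1A+2A looks like from the user/operator side
with the Cyborg-Nova integration (a sketch only: the profile name, trait and
flavor are made up here, and the exact CLI and property names come from the
Cyborg and Nova specs rather than from this example):

    # Define a device profile asking Placement for one FPGA with a given function
    openstack accelerator device profile create fpga-gzip-dp \
        '[{"resources:FPGA": "1", "trait:CUSTOM_FPGA_INTEL_PAC_ARRIA10": "required"}]'

    # Tie the profile to a flavor; the device is then programmed/attached as part
    # of normal instance provisioning rather than in a maintenance window
    openstack flavor set accel.m1 --property accel:device_profile=fpga-gzip-dp

That is the sense in which programming belongs to the application/instance
lifecycle rather than to host configuration.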
The case 1A+2B is interesting. It basically means that a programmable device is being treated like a fixed-function accelerator for a period of time before it gets reprogrammed offline. This model is being used in the industry today, esp. telcos. I am fine with calling this a 'firmware update' too. There are some grey areas to consider. For example, many FPGA deployments are structured to have a 'shell', which is hardware logic that exposes some generic features like PCI and DMA, and a separate user/custom logic that is application/workload-specific. Would updating the shell qualify as 'programming' or a 'firmware update'? Today, it often falls under 2B, esp. if it requires server reboots. But it could conceivably come under 1A+2A as products and use cases evolve. IOW, what is called a firmware update today could become a programming update tomorrow. Cyborg is designed for programming, i.e. 1A+2A. It can be used with Nova (to program devices as instances are provisioned/retired) or standalone (based on time of day, traffic patterns, etc.) Other cases (1A/1B + 2B) can be classified as firmware update and outside of Cyborg. TL;DR * Agree with Arkady that firmware updates should follow the server vendors' guidelines, and can/should be done as part of the server configuration. * If the claim is that firmware updates, as defined above (i.e. 1A/1B + 2B), should be done by Ironic, I am fine with it. * To reiterate, it is NOT enough to handle devices based only on their PCI IDs -- we should be able to use their features/properties for scheduling, inventory management, etc. This is extra true for programmable devices where features can change dynamically while PCI IDs potentially stay constant. * Cyborg is designed for these devices and its stated role includes all other aspects of lifecycle management. * I see value in having Cyborg and Ironic work together, esp. for 1A+2B, where Ironic can do the 'firmware update' and Cyborg discovers the schedulable properties of the device. > From: Arkady.Kanevsky at dell.com > Sent: Friday, January 3, 2020 1:19 PM > To: openstack-discuss at lists.openstack.org > Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management > 1. Application user need to program a portion of the device ... Sure. > 2. Administrator need to program the whole device for specific usage. That covers the scenario when device can only support single tenant or single use case. Why does it have to be single-tenant or single use-case? For example, one could program an FPGA with an Open Vswitch implementation, which is shared by VMs from different tenants. > That is done once during OpenStack deployment but may need reprogramming to configure device for different usage. If the change exposes workload-specific or schedulable properties, this would not necessarily be a one-shot thing at deployment time. > 3. Administrator need to setup device for its use, like burning specific FW on it. This is typically done as part of server life-cycle event. With the definition of firmware update as above, I agree. > The first 2 cases cover application life cycle of device usage. Yes. > The last one covers device life cycle independently how it is used. Here's where I beg to disagree. As I said, the term 'device lifecycle' is far broader than just firmware update. > Managing life cycle of devices is Ironic responsibility, Disagree here. To the best of my knowledge, Ironic handles devices based on PCI IDs. 
Cyborg is designed to go deeper for discovering device features/properties and utilize Placement for scheduling based on these. > One cannot and should not manage lifecycle of server components independently. If what you meant to say is: ' do not update device firmware independently of other server components', agreed. > Managing server devices outside server management violates customer service agreements with server vendors and breaks server support agreements. Sure. > Nova and Neutron are getting info about all devices and their capabilities from Ironic; that they use for scheduling Hmm, this seems overly broad to me: not every deployment includes Ironic, and getting PCI IDs is not enough for scheduling and management. > Finally we want Cyborg to be able to be used in standalone capacity, say for Kubernetes +1 > Thus, I propose that Cyborg cover use cases 1 & 2, and Ironic would cover use case 3 Use case 3 says "setup device for its use, like burning specific FW." With the definition of firmware above, I agree. Other aspects of lifecycle management, not covered by use cases 1 - 3, would come under Cyborg. > Thus, move all device Life-cycle code from Cyborg to Ironic To recap, there is more to device lifecycle than firmware update. I'd suggest the other aspects can remain in Cyborg. Regards, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From sundar.nadathur at intel.com Sun Jan 12 21:42:35 2020 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Sun, 12 Jan 2020 21:42:35 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <7b55e3b28d644492a846fdb10f7b127b@AUSX13MPS308.AMER.DELL.COM> Message-ID: > From: Julia Kreger > Sent: Monday, January 6, 2020 1:33 PM > To: Arkady.Kanevsky at dell.com > Cc: Zhipeng Huang ; openstack-discuss discuss at lists.openstack.org> > Subject: Re: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators > management Hi Julia, Lots of good points here. > Greetings Arkady, > > I think your message makes a very good case and raises a point that I've been > trying to type out for the past hour, but with only different words. > > We have multiple USER driven interactions with a similarly desired, if not the > exact same desired end result where different paths can be taken, as we > perceive use cases from "As a user, I would like a VM with a configured > accelerator", "I would like any compute resource (VM or Baremetal), with a > configured accelerator", to "As an administrator, I need to reallocate a > baremetal node for this different use, so my user can leverage its accelerator > once they know how and are ready to use it.", and as suggested "I as a user > want baremetal with k8s and configured accelerators." > And I suspect this diversity of use patterns is where things begin to become > difficult. As such I believe, we in essence, have a question of a support or > compatibility matrix that definitely has gaps depending on "how" the "user" > wants or needs to achieve their goals. Yes, there are a wide variety of deployments and use cases. There may not be a single silver bullet solution for all of them. There may be different solutions, such as Ironic standalone, Ironic with Nova, and potentially some combination with Cyborg. > And, I think where this entire discussion _can_ go sideways is... 
> (from what I understand) some of these devices need to be flashed by the > application user with firmware on demand to meet the user's needs, which is > where lifecycle and support interactions begin to become... > conflicted. We are probably using different definitions of the term 'firmware.' As I said in another response in this thread, if a device configuration exposes application-specific features or schedulable features, then the term 'firmware update' may not be applicable IMHO, since it is going to be done dynamically as workloads spin up and retire. This is especially so given Arkady's stipulation that firmware updates are done as part of server configuration and as per server vendor's guidelines. > Further complicating matters is the "Metal to Tenant" use cases where the > user requesting the machine is not an administrator, but has some level of > inherent administrative access to all Operating System accessible devices once > their OS has booted. Which makes me wonder "What if the cloud > administrators WANT to block the tenant's direct ability to write/flash > firmware into accelerator/smartnic/etc?" Yes, admins may want to do that. This can be done (partly) via RBAC, by having different roles for tenants who can use devices but not reprogram them, and for tenants who can program the device with application/scheduling-relevant features (but not firmware), etc. > I suspect if cloud administrators want to block such hardware access, > vendors will want to support such a capability. Devices can and usually do offer separate mechanisms for reading from registers, writing to them, updating flash etc. each with associated access permissions. A device vendor can go a bit extra by requiring specific Linux capabilities, such as say CAP_IPC_LOCK for mmap access, in their device driver. > Blocking such access inherently forces some actions into hardware > management/maintenance workflows, and may ultimately may cause some of > a support matrix's use cases to be unsupportable, again ultimately depending > on what exactly the user is attempting to achieve. Not sure if you are expressing a concern here. If the admin is using device features or RBAC to restrict access, then she is intentionally blocking some combinations in your support matrix, right? Users in such a deployment need to live with that. > Is there any documentation at present that details the desired support and > use cases? I think this would at least help my understanding, since everything > that requires the power to be on would still need to be integrated with-in > workflows for eventual tighter integration. The Cyborg spec [1] addresses the Nova/VM-based use cases. [1] https://opendev.org/openstack/cyborg-specs/src/branch/master/specs/train/approved/cyborg-nova-placement.rst > Also, has Cyborg drafted any plans or proposals for integration? For Nova integration, we have a spec [2]. [2] https://review.opendev.org/#/c/684151/ > -Julia Regards, Sundar From Arkady.Kanevsky at dell.com Mon Jan 13 00:44:11 2020 From: Arkady.Kanevsky at dell.com (Arkady.Kanevsky at dell.com) Date: Mon, 13 Jan 2020 00:44:11 +0000 Subject: [all] Announcing OpenStack Victoria! In-Reply-To: <20200110215758.GB536693@sm-workstation> References: <20200110215758.GB536693@sm-workstation> Message-ID: <93fc266f5188442d983ce1caf261a18b@AUSX13MPS308.AMER.DELL.COM> Hurray to Victoria. -----Original Message----- From: Sean McGinnis Sent: Friday, January 10, 2020 3:58 PM To: openstack-discuss at lists.openstack.org Subject: [all] Announcing OpenStack Victoria! 
[EXTERNAL EMAIL]
Hello everyone,

The polling results are in, and the legal vetting process has now completed.
We now have an official name for the "V" release. The full results of the
poll can be found here:

https://civs.cs.cornell.edu/cgi-bin/results.pl?num_winners=1&id=E_13ccd49b66cfd1b4&rkey=4e184724fa32eed6&algorithm=minimax

While Victoria and Vancouver were technically a tie, the Minimax ranking puts
Victoria slightly ahead of Vancouver based on the votes. In addition to that,
we chose to have the TC do a tie-breaker vote, which confirmed Victoria as
the winner.

Victoria is the capital city of British Columbia:
https://en.wikipedia.org/wiki/Victoria,_British_Columbia

Thank you all for participating in the release naming!

Sean
From iwienand at redhat.com Mon Jan 13 06:05:54 2020
From: iwienand at redhat.com (Ian Wienand)
Date: Mon, 13 Jan 2020 17:05:54 +1100
Subject: [all] DevStack jobs broken due to setuptools not available for Python 2
In-Reply-To: References: Message-ID: <20200113060554.GA3819219@fedora19.localdomain>
On Sun, Jan 12, 2020 at 07:20:27PM +0100, Radosław Piliszek wrote:
> I noticed DevStack jobs fail all over the place [1] due to:
> UnsupportedPythonVersion: Package 'setuptools' requires a different
> Python: 2.7.17 not in '>=3.5'

I think there's a wide variety of things going on here.

Firstly, I think pip should not be trying to install this ... you clearly
felt the same thing and have filed [1], where it seems that it might be due
to the wheels we create not specifying "data-requires-python" in our links
to the wheel. This is the first I've heard of this ... we will need to look
into this wrt our wheel building and I have filed [2].

The plain "virtualenv" call that sets up the requirements virtualenv should
be using Python 3 I think; proposed in [3]. This would avoid the issue by
using python3 on master.

The other places calling "virtualenv" appear to be related to TRACK_DEPENDS,
which I think we can remove now to avoid further confusion. Proposed in [4].

However, this leaves devstack-gate which is used by grenade. I *think* that
[5] will work if the older branch of devstack also installs with python3.

The short answer is, yes, this is a big mess :/

-i

[1] https://github.com/pypa/pip/issues/7586#issuecomment-573460206
[2] https://storyboard.openstack.org/#!/story/2007084
[3] https://review.opendev.org/702162
[4] https://review.opendev.org/702163
[5] https://review.opendev.org/702126

From prash.ing.pucsd at gmail.com Mon Jan 13 07:30:03 2020
From: prash.ing.pucsd at gmail.com (prashant)
Date: Mon, 13 Jan 2020 13:00:03 +0530
Subject: [horizon][stein]issue:
Message-ID:
When I delete multiple instances in Stein, I am not able to see the
remaining instances until I refresh the instance list again.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From gmann at ghanshyammann.com Mon Jan 13 14:08:20 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 13 Jan 2020 08:08:20 -0600 Subject: [all] DevStack jobs broken due to setuptools not available for Python 2 In-Reply-To: <20200113060554.GA3819219@fedora19.localdomain> References: <20200113060554.GA3819219@fedora19.localdomain> Message-ID: <16f9f3be487.10f2e3b05395088.6540142001370012765@ghanshyammann.com> ---- On Mon, 13 Jan 2020 00:05:54 -0600 Ian Wienand wrote ---- > On Sun, Jan 12, 2020 at 07:20:27PM +0100, Radosław Piliszek wrote: > > I noticed DevStack jobs fail all over the place [1] due to: > > UnsupportedPythonVersion: Package 'setuptools' requires a different > > Python: 2.7.17 not in '>=3.5' > > I think there's a wide variety of things going on here. > > Firstly, I think pip should be not be trying to install this ... you > clearly felt the same thing and have filed [1] where it seems that it > might be due to the wheels we create not specifying > "data-requires-python" in our links to the wheel. This is the first > I've heard of this ... we will need to look into this wrt to our wheel > building and I have filed [2]. > > The plain "virtualenv" call that sets up the requirements virtualenv > should be using Python 3 I think; proposed in [3]. This would avoid > the issue by using python3 on master. > > The other places calling "virtualenv" appear to be related to > TRACK_DEPENDS, which I think we can remove now to avoid further > confusion. Proposed in [4] > > However, this leaves devstack-gate which is used by grenade. I > *think* that [5] will work if the older branch of devstack also > installs with python3. Yes, grenade master jobs use py3 in both (new and older devstack) which is expected testing behaviour. We avoided (or at least not done yet) any mixed (py2->py3) upgrade testing. -gmann > > The short answer is, yes, this is a big mess :/ > > -i > > [1] https://github.com/pypa/pip/issues/7586#issuecomment-573460206 > [2] https://storyboard.openstack.org/#!/story/2007084 > [3] https://review.opendev.org/702162 > [4] https://review.opendev.org/702163 > [5] https://review.opendev.org/702126 > > > From radoslaw.piliszek at gmail.com Mon Jan 13 14:16:58 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Mon, 13 Jan 2020 15:16:58 +0100 Subject: [all] DevStack jobs broken due to setuptools not available for Python 2 In-Reply-To: <16f9f3be487.10f2e3b05395088.6540142001370012765@ghanshyammann.com> References: <20200113060554.GA3819219@fedora19.localdomain> <16f9f3be487.10f2e3b05395088.6540142001370012765@ghanshyammann.com> Message-ID: It turned out to be real mess to fix devstack and devstack-gate and whatever else could be impacted. The current goal is to get rid of setuptools wheel so that it does not interfere in Zuul. As more packages go py3-only we will be forced to generate proper metadata in HTML indices... -yoctozepto pon., 13 sty 2020 o 15:08 Ghanshyam Mann napisał(a): > > ---- On Mon, 13 Jan 2020 00:05:54 -0600 Ian Wienand wrote ---- > > On Sun, Jan 12, 2020 at 07:20:27PM +0100, Radosław Piliszek wrote: > > > I noticed DevStack jobs fail all over the place [1] due to: > > > UnsupportedPythonVersion: Package 'setuptools' requires a different > > > Python: 2.7.17 not in '>=3.5' > > > > I think there's a wide variety of things going on here. > > > > Firstly, I think pip should be not be trying to install this ... 
you > > clearly felt the same thing and have filed [1] where it seems that it > > might be due to the wheels we create not specifying > > "data-requires-python" in our links to the wheel. This is the first > > I've heard of this ... we will need to look into this wrt to our wheel > > building and I have filed [2]. > > > > The plain "virtualenv" call that sets up the requirements virtualenv > > should be using Python 3 I think; proposed in [3]. This would avoid > > the issue by using python3 on master. > > > > The other places calling "virtualenv" appear to be related to > > TRACK_DEPENDS, which I think we can remove now to avoid further > > confusion. Proposed in [4] > > > > However, this leaves devstack-gate which is used by grenade. I > > *think* that [5] will work if the older branch of devstack also > > installs with python3. > > Yes, grenade master jobs use py3 in both (new and older devstack) which > is expected testing behaviour. We avoided (or at least not done yet) any mixed (py2->py3) > upgrade testing. > > > -gmann > > > > > The short answer is, yes, this is a big mess :/ > > > > -i > > > > [1] https://github.com/pypa/pip/issues/7586#issuecomment-573460206 > > [2] https://storyboard.openstack.org/#!/story/2007084 > > [3] https://review.opendev.org/702162 > > [4] https://review.opendev.org/702163 > > [5] https://review.opendev.org/702126 > > > > > > > From fungi at yuggoth.org Mon Jan 13 14:43:56 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 13 Jan 2020 14:43:56 +0000 Subject: [all] DevStack jobs broken due to setuptools not available for Python 2 In-Reply-To: References: <20200113060554.GA3819219@fedora19.localdomain> <16f9f3be487.10f2e3b05395088.6540142001370012765@ghanshyammann.com> Message-ID: <20200113144356.mj3ot2crequlcowc@yuggoth.org> On 2020-01-13 15:16:58 +0100 (+0100), Radosław Piliszek wrote: > It turned out to be real mess to fix devstack and devstack-gate > and whatever else could be impacted. The current goal is to get > rid of setuptools wheel so that it does not interfere in Zuul. As > more packages go py3-only we will be forced to generate proper > metadata in HTML indices... [...] Yes, pardon me as I'm still catching up. Based on what I've read so far, the PyPI simple API provides some additional information with its package indices indicating which Python releases are supported by a given package. Our wheel cache is just being served from Apache with mod_autoindex providing basic indexing of the files, so does not have that information to provide. We've discussed a number of possible solutions to the problem. One option is to (temporarily) stop using the pre-built wheel cache, but that presupposes that it doesn't do much for us in the first place. That cache is there to provide pre-built wheels for packages which otherwise take a *very* long time to build from sdist and for which there is on usable wheel on PyPI (at least for the platforms on which we're running our jobs). Another option is to stop unnecessarily copying wheels already available on PyPI into that cache. It's used as a sieve, so that jobs can pull wheels from it when they exist but will still fall back to PyPI for any the cache doesn't contain. I suspect that the majority of projects dropping compatibility with older Python releases publish usable wheels for our platforms on PyPI already, so their presence in our cache is redundant. 
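To illustrate the sieve behaviour: jobs effectively consult the wheel cache
as an extra package index on top of the normal one, roughly equivalent to
the following (the mirror hostname and path are only illustrative; the real
values come from the per-provider mirror configuration):

    pip install \
        --index-url https://pypi.org/simple \
        --extra-index-url https://mirror.example.opendev.org/wheel/ubuntu-18.04-x86_64/ \
        lxml

so anything already present in the cache can be served from there, while
everything else still resolves from PyPI.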
We could remove the latest Setuptools release from the wheel-mirror volume as a short-term solution, but will need to temporarily stop further updates to the cache since the job which builds it would just put that file right back. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From dms at danplanet.com Mon Jan 13 15:16:30 2020 From: dms at danplanet.com (Dan Smith) Date: Mon, 13 Jan 2020 07:16:30 -0800 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: (Sundar Nadathur's message of "Sun, 12 Jan 2020 21:41:16 +0000") References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> Message-ID: > TL;DR > > * Agree with Arkady that firmware updates should follow the server > vendors' guidelines, and can/should be done as part of the server > configuration. I'm worried there's a little bit of confusion about "which nova" and "which ironic" in this case, especially since Arkady mentioned tripleo. More on that below. However, I agree that if you're using ironic to manage the nodes that form your actual (over)cloud, then having ironic update firmware on your accelerator device in the same way that it might update firmware on a regular NIC, GPU card, or anything else makes sense. However, if you're talking about services all at the same level (i.e. nova working with ironic to provide metal as a tenant as well as VMs) then *that* ironic is not going to be managing firmware on accelerators that you're handing to your VM instances on the compute nodes. >> Managing life cycle of devices is Ironic responsibility, > > Disagree here. Me too, but in a general sense. I would not agree with the assessment that "Managing life cycle of devices is Ironic responsibility." Specifically the wide scope of "devices" being more than just physical machines. It's true that Ironic manages the lifecycle of physical machines, which may be used in a tripleo type of environment to manage the lifecycle of things like compute nodes. I *think* you both agree with that clarification, because of the next point, but I think it's important to avoid such statements that imply "all devices." > To the best of my knowledge, Ironic handles devices based on PCI > IDs. Cyborg is designed to go deeper for discovering device > features/properties and utilize Placement for scheduling based on > these. What does this matter though? If you're talking about firmware for an FPGA card, that's what you need to know in order to apply the correct firmware to it, independent of whatever application-level bitstream is going to go in there right? >> One cannot and should not manage lifecycle of server components independently. > > If what you meant to say is: ' do not update device firmware > independently of other server components', agreed. I'm not really sure what this original point from Arkady really means. Are (either of) you saying that if there's a CVE for the firmware in some card that the firmware patch shouldn't be applied without taking the box through a full lifecycle event or something? AFAIK, Ironic can't just do this in isolation, which means that if you've got a compute node managed by ironic in a tripleo type of environment, you're looking to move workloads away from that node, destroy it, apply updates, and re-create it before you can use it again. 
I guess I'd be surprised if people are doing this every time intel releases another microcode update. Am I wrong about that? Either way, I'm not sure how the firmware for accelerator cards is any different from the firmware for other devices on the system. Maybe the confusion is just that Cyborg does "programming" which seems similar to "updating firmware"? >> Nova and Neutron are getting info about all devices and their >> capabilities from Ironic; that they use for scheduling > > Hmm, this seems overly broad to me: not every deployment includes > Ironic, and getting PCI IDs is not enough for scheduling and > management. I also don't think it's correct. Nova does not get info about devices from Ironic, and I kinda doubt Neutron does either. If Nova is using ironic to provide metal as tenants, then...sure, but in the case where nova is providing VMs with accelerator cards, Ironic is not involved. >> Thus, I propose that Cyborg cover use cases 1 & 2, and Ironic would cover use case 3 > > Use case 3 says "setup device for its use, like burning specific FW." > With the definition of firmware above, I agree. Other aspects of > lifecycle management, not covered by use cases 1 - 3, would come under > Cyborg. > >> Thus, move all device Life-cycle code from Cyborg to Ironic > > To recap, there is more to device lifecycle than firmware update. I'd > suggest the other aspects can remain in Cyborg. Didn't you say that firmware programming (as defined here) is not something that Cyborg currently does? Thus, nothing Cyborg currently does should be moved to Ironic, AFAICT. If that is true, then I agree. I guess my summary is: firmware updates for accelerators can and should be handled the same as for other devices on the system, in whatever way the operator currently does that. Programming an application-level bitstream should not be confused with the former activity, and is fully within the domain of Cyborg's responsibilities. --Dan From cboylan at sapwetik.org Mon Jan 13 15:40:14 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Mon, 13 Jan 2020 07:40:14 -0800 Subject: =?UTF-8?Q?Re:_[all]_DevStack_jobs_broken_due_to_setuptools_not_available?= =?UTF-8?Q?_for_Python_2?= In-Reply-To: <20200113144356.mj3ot2crequlcowc@yuggoth.org> References: <20200113060554.GA3819219@fedora19.localdomain> <16f9f3be487.10f2e3b05395088.6540142001370012765@ghanshyammann.com> <20200113144356.mj3ot2crequlcowc@yuggoth.org> Message-ID: <88fd3935-61c4-4ccc-a3ef-f4e72391dbff@www.fastmail.com> On Mon, Jan 13, 2020, at 6:43 AM, Jeremy Stanley wrote: > On 2020-01-13 15:16:58 +0100 (+0100), Radosław Piliszek wrote: > > It turned out to be real mess to fix devstack and devstack-gate > > and whatever else could be impacted. The current goal is to get > > rid of setuptools wheel so that it does not interfere in Zuul. As > > more packages go py3-only we will be forced to generate proper > > metadata in HTML indices... > [...] > > Yes, pardon me as I'm still catching up. Based on what I've read so > far, the PyPI simple API provides some additional information with > its package indices indicating which Python releases are supported > by a given package. Our wheel cache is just being served from Apache > with mod_autoindex providing basic indexing of the files, so does > not have that information to provide. > PyPI supplies this via the 'data-requires-python' element attributes in the html source of the index. You can see these viewing the source of eg https://pypi.org/simple/setuptools/. 
The problem with relying on this is not all packages supply this data (it is provided at package upload time iirc) and not all versions of pip support it. While we rely on mod_autoindex for the per package indexes (this is where data-requires-python goes) we do generate the top level index manually via https://opendev.org/openstack/project-config/src/branch/master/roles/copy-wheels/files/wheel-index.sh. It is possible this script could be extended to write index files for each package as well. If we can determine what python versions are required we could then include that info. However, as mentioned above this won't fix all cases. I think the simplest option would be to simply stop building and mirroring wheels for packages which already have wheels on pypi. Let the wheel mirror be a true fallback for slow building packages like lxml and libvirt-python. Note that setuptools is a bit of an exception here because it is the bootstrap module. With setuptools in place we can control python specific package version selections using environment markers. This is already something we do a fair bit, https://opendev.org/openstack/requirements/src/branch/master/global-requirements.txt#L47-L48, which means we have the tooling in place to manage it. Clark From fungi at yuggoth.org Mon Jan 13 15:41:44 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 13 Jan 2020 15:41:44 +0000 Subject: [all] DevStack jobs broken due to setuptools not available for Python 2 In-Reply-To: <20200113144356.mj3ot2crequlcowc@yuggoth.org> References: <20200113060554.GA3819219@fedora19.localdomain> <16f9f3be487.10f2e3b05395088.6540142001370012765@ghanshyammann.com> <20200113144356.mj3ot2crequlcowc@yuggoth.org> Message-ID: <20200113154144.os44mrihcotvwt3r@yuggoth.org> On 2020-01-13 14:43:56 +0000 (+0000), Jeremy Stanley wrote: > On 2020-01-13 15:16:58 +0100 (+0100), Radosław Piliszek wrote: > > It turned out to be real mess to fix devstack and devstack-gate > > and whatever else could be impacted. The current goal is to get > > rid of setuptools wheel so that it does not interfere in Zuul. As > > more packages go py3-only we will be forced to generate proper > > metadata in HTML indices... > [...] > > Yes, pardon me as I'm still catching up. Based on what I've read so > far, the PyPI simple API provides some additional information with > its package indices indicating which Python releases are supported > by a given package. Our wheel cache is just being served from Apache > with mod_autoindex providing basic indexing of the files, so does > not have that information to provide. > > We've discussed a number of possible solutions to the problem. One > option is to (temporarily) stop using the pre-built wheel cache, but > that presupposes that it doesn't do much for us in the first place. > That cache is there to provide pre-built wheels for packages which > otherwise take a *very* long time to build from sdist and for which > there is on usable wheel on PyPI (at least for the platforms on > which we're running our jobs). This was proposed via https://review.opendev.org/702166 for the record. > Another option is to stop unnecessarily copying wheels already > available on PyPI into that cache. It's used as a sieve, so that > jobs can pull wheels from it when they exist but will still fall > back to PyPI for any the cache doesn't contain. 
I suspect that the > majority of projects dropping compatibility with older Python > releases publish usable wheels for our platforms on PyPI already, so > their presence in our cache is redundant. We could remove the latest > Setuptools release from the wheel-mirror volume as a short-term > solution, but will need to temporarily stop further updates to the > cache since the job which builds it would just put that file right > back. I've proposed https://review.opendev.org/702244 for a smaller-scale version of this now (explicitly blacklisting wheels for pip, setuptools and virtualenv) but we can expand it to something more thorough once it's put through its paces. If we merge that, then we can manually delete the affected setuptools wheel from our wheel-mirror volume and not have to worry about it coming back. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rosmaita.fossdev at gmail.com Mon Jan 13 15:42:09 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Mon, 13 Jan 2020 10:42:09 -0500 Subject: [cinder] ussuri virtual mid-cycle next week Message-ID: <53eacc20-8762-2ef2-f115-8732b8f1827a@gmail.com> The poll results are in. There was only one time when everyone can meet (apologies to the "if need be" people, but the need be). Session One of the Cinder Ussuri virtual mid-cycle will be held: DATE: 21 JANUARY 2020 TIME: 1300-1500 UTC LOCATION: https://bluejeans.com/3228528973 The meeting will be recorded. Please add topics to the planning etherpad: https://etherpad.openstack.org/p/cinder-ussuri-mid-cycle-planning cheers, brian From fungi at yuggoth.org Mon Jan 13 16:53:41 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 13 Jan 2020 16:53:41 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> Message-ID: <20200113165340.ge3hitlqrdfhj52m@yuggoth.org> On 2020-01-13 07:16:30 -0800 (-0800), Dan Smith wrote: [...] > What does this matter though? If you're talking about firmware for an > FPGA card, that's what you need to know in order to apply the correct > firmware to it, independent of whatever application-level bitstream is > going to go in there right? [...] > Either way, I'm not sure how the firmware for accelerator cards is any > different from the firmware for other devices on the system. Maybe the > confusion is just that Cyborg does "programming" which seems similar to > "updating firmware"? [...] FPGA configuration is a compiled binary blob written into non-volatile memory through a hardware interface. These similarities to firmware also result in many people actually calling it "firmware" even though, you're right, technically it's a mapping of gate interconnections and not really firmware in the conventional sense. In retrospect maybe I shouldn't have brought it up. I wouldn't be surprised, though, if there *are* NFV-related cases where the users of the virtual machines into which some network hardware is mapped need access to alter parts of, say, an interface controller's firmware. The Linux kernel has for years incorporated features to write or rewrite firmware and other microcode for certain devices at boot time for similar reasons, after all. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Mon Jan 13 17:32:20 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 13 Jan 2020 17:32:20 +0000 Subject: [all] DevStack jobs broken due to setuptools not available for Python 2 In-Reply-To: <20200113154144.os44mrihcotvwt3r@yuggoth.org> References: <20200113060554.GA3819219@fedora19.localdomain> <16f9f3be487.10f2e3b05395088.6540142001370012765@ghanshyammann.com> <20200113144356.mj3ot2crequlcowc@yuggoth.org> <20200113154144.os44mrihcotvwt3r@yuggoth.org> Message-ID: <20200113173220.sdryzodzxvhxhvqc@yuggoth.org> On 2020-01-13 15:41:44 +0000 (+0000), Jeremy Stanley wrote: [...] > I've proposed https://review.opendev.org/702244 for a smaller-scale > version of this now (explicitly blacklisting wheels for pip, > setuptools and virtualenv) but we can expand it to something more > thorough once it's put through its paces. If we merge that, then we > can manually delete the affected setuptools wheel from our > wheel-mirror volume and not have to worry about it coming back. This has since merged, and as of 16:30 UTC (roughly an hour ago) I deleted all copies of the setuptools-45.0.0-py2.py3-none-any.whl file from our AFS volumes. We're testing now to see if previously broken jobs are working again, but suspect things should be back to normal. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From colleen at gazlene.net Mon Jan 13 17:38:06 2020 From: colleen at gazlene.net (Colleen Murphy) Date: Mon, 13 Jan 2020 09:38:06 -0800 Subject: [ops] Federated Identity Management survey In-Reply-To: <4a7a0c41-59ce-4aac-839e-0840eeb50348@www.fastmail.com> References: <4a7a0c41-59ce-4aac-839e-0840eeb50348@www.fastmail.com> Message-ID: <2116da33-6d85-4132-94e5-68bcea0c8385@www.fastmail.com> On Mon, Dec 23, 2019, at 09:32, Colleen Murphy wrote: > Hello operators, > > A researcher from the University of Kent who was influential in the > design of keystone's federation implementation has asked the keystone > team to gauge adoption of federated identity management in OpenStack > deployments. This is something we've neglected to track well in the > last few OpenStack user surveys, so I'd like to ask OpenStack operators > to please take a few minutes to complete the following survey about > your usage of identity federation in your OpenStack deployment (even if > you don't use federation): > > https://uok.typeform.com/to/KuRY0q > > The results of this survey will benefit not only university research > but also the keystone team as it will help us understand where to focus > our efforts. Your participation is greatly appreciated. > > Thanks for your time, > > Colleen (cmurphy) > > Thanks to everyone who has completed this survey so far! The survey will be closing in about a week, so if you have not yet completed it, we appreciate you taking the time to do so now. 
Colleen (cmurphy) From dms at danplanet.com Mon Jan 13 17:58:00 2020 From: dms at danplanet.com (Dan Smith) Date: Mon, 13 Jan 2020 09:58:00 -0800 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: <20200113165340.ge3hitlqrdfhj52m@yuggoth.org> (Jeremy Stanley's message of "Mon, 13 Jan 2020 16:53:41 +0000") References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <20200113165340.ge3hitlqrdfhj52m@yuggoth.org> Message-ID: > FPGA configuration is a compiled binary blob written into > non-volatile memory through a hardware interface. These similarities > to firmware also result in many people actually calling it > "firmware" even though, you're right, technically it's a mapping of > gate interconnections and not really firmware in the conventional > sense. In retrospect maybe I shouldn't have brought it up. It's a super easy thing to conflate those two topics I think. Probably calling one the "firmware" and the other the "bitstream" is the most common distinction I've heard. The latter also potentially being the "application" or "function." > I wouldn't be surprised, though, if there *are* NFV-related cases > where the users of the virtual machines into which some network > hardware is mapped need access to alter parts of, say, an interface > controller's firmware. The Linux kernel has for years incorporated > features to write or rewrite firmware and other microcode for > certain devices at boot time for similar reasons, after all. Yeah, I'm not sure because I don't have a lot of experience with these devices. I guess I kinda expected that they have effectively two devices on each card: one being the FPGA itself and the other being just a management device that lets you flash the FPGA. If the FPGA is connected to the bus as well, I'd expect it to be able to define its own interaction (i.e. be like a NIC or be like a compression accelerator), and the actual "firmware" being purely a function of the management device. Either way, I think my point is that ironic's ability to manage the firmware part regardless of how often you need it to change is limited (currently, AFAIK) to the cleaning/prep phase of the lifecycle, and only really applies anyway if a compute node when it is a workload on top of the undercloud. For people that don't use ironic to provision their compute nodes, ironic wouldn't even have the opportunity to manage the firmware of those devices. I'm not saying Cyborg should fill the firmware gap, just not saying we should expect that Ironic will. --Dan From sundar.nadathur at intel.com Mon Jan 13 18:16:20 2020 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Mon, 13 Jan 2020 18:16:20 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> Message-ID: > From: Dan Smith > Sent: Monday, January 13, 2020 7:17 AM > To: Nadathur, Sundar > Cc: Arkady.Kanevsky at dell.com; openstack-discuss at lists.openstack.org > Subject: Re: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators > management > > > TL;DR > > > > * Agree with Arkady that firmware updates should follow the server > > vendors' guidelines, and can/should be done as part of the server > > configuration. > > I'm worried there's a little bit of confusion about "which nova" and "which > ironic" in this case, especially since Arkady mentioned tripleo. More on that > below. 
However, I agree that if you're using ironic to manage the nodes that > form your actual (over)cloud, then having ironic update firmware on your > accelerator device in the same way that it might update firmware on a regular > NIC, GPU card, or anything else makes sense. > > However, if you're talking about services all at the same level (i.e. nova > working with ironic to provide metal as a tenant as well as > VMs) then *that* ironic is not going to be managing firmware on accelerators > that you're handing to your VM instances on the compute nodes. This goes back to the definition of firmware update vs. programming in my earlier post. In a Nova + Ironic + Cyborg env, I'd expect Cyborg to do programming. Firmware updates can be done by Ironic, Ansible/Redfish/... , some combination like Ironic with Redfish driver, or whatever the operator chooses. > > To the best of my knowledge, Ironic handles devices based on PCI IDs. > > Cyborg is designed to go deeper for discovering device > > features/properties and utilize Placement for scheduling based on > > these. > > What does this matter though? If you're talking about firmware for an FPGA > card, that's what you need to know in order to apply the correct firmware to > it, independent of whatever application-level bitstream is going to go in there > right? The device properties are needed for scheduling: users are often interested in getting a VM with an accelerator that has specific properties: e.g. implements a specific version of gzip, has 4 GB or more of device-local memory etc. Device properties are also needed for management of accelerator inventory: admins want to know how many FPGAs have a particular bitstream burnt into them, etc. Re. programming, sometimes we may need to determine what's in a device (beyond PCI ID) before programming it to ensure the image being programmed and the existing device contents are compatible. > >> One cannot and should not manage lifecycle of server components > independently. > > > > If what you meant to say is: ' do not update device firmware > > independently of other server components', agreed. > > I'm not really sure what this original point from Arkady really means. Are > (either of) you saying that if there's a CVE for the firmware in some card that > the firmware patch shouldn't be applied without taking the box through a full > lifecycle event or something? My paraphrase of Arkady's points: a. Updating CPU firmware/microcode should be done as per the server/CPU vendor's rules (use their specific tools, or some specific mechanisms like Redfish, with auditing, ....) b. Updating firmware for devices/accelerators should be done the same way. By a "full lifecycle event", you presumably mean vacating the entire node. For device updates, that is not always needed: one could disconnect just the instances using that device. The server/device vendor rules must specify the 'lifecycle event' involved for a specific update. > AFAIK, Ironic can't just do this in isolation, which > means that if you've got a compute node managed by ironic in a tripleo type > of environment, you're looking to move workloads away from that node, > destroy it, apply updates, and re-create it before you can use it again. I guess > I'd be surprised if people are doing this every time intel releases another > microcode update. Am I wrong about that? Not making any official statements but, generally, if a microcode/firmware update requires a reboot, one would have to do that. 
The admin would declare a maintenance window and combine software/firmware/configuration updates in that window. > Either way, I'm not sure how the firmware for accelerator cards is any > different from the firmware for other devices on the system. Updates of other devices, like CPU or motherboard components, often require server reboots. Accelerator updates may or may not require them, depending on ... all kinds of things. > Maybe the confusion is just that Cyborg does "programming" which seems similar to > "updating firmware"? Yes, indeed. That is why I went at length on the distinction between the two. > >> Nova and Neutron are getting info about all devices and their > >> capabilities from Ironic; that they use for scheduling > > > > Hmm, this seems overly broad to me: not every deployment includes > > Ironic, and getting PCI IDs is not enough for scheduling and > > management. > > I also don't think it's correct. Nova does not get info about devices from > Ironic, and I kinda doubt Neutron does either. If Nova is using ironic to > provide metal as tenants, then...sure, but in the case where nova is providing > VMs with accelerator cards, Ironic is not involved. +1 > >> Thus, move all device Life-cycle code from Cyborg to Ironic > > > > To recap, there is more to device lifecycle than firmware update. I'd > > suggest the other aspects can remain in Cyborg. > > Didn't you say that firmware programming (as defined here) is not something > that Cyborg currently does? Thus, nothing Cyborg currently does should be > moved to Ironic, AFAICT. If that is true, then I agree. Yes ^. > I guess my summary is: firmware updates for accelerators can and should be > handled the same as for other devices on the system, in whatever way the > operator currently does that. Programming an application-level bitstream > should not be confused with the former activity, and is fully within the > domain of Cyborg's responsibilities. Agreed. > --Dan Regards, Sundar From sundar.nadathur at intel.com Mon Jan 13 18:26:23 2020 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Mon, 13 Jan 2020 18:26:23 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: <20200113165340.ge3hitlqrdfhj52m@yuggoth.org> References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <20200113165340.ge3hitlqrdfhj52m@yuggoth.org> Message-ID: > -----Original Message----- > From: Jeremy Stanley > Sent: Monday, January 13, 2020 8:54 AM > To: openstack-discuss at lists.openstack.org > Subject: Re: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators > management > > On 2020-01-13 07:16:30 -0800 (-0800), Dan Smith wrote: > [...] > > What does this matter though? If you're talking about firmware for an > > FPGA card, that's what you need to know in order to apply the correct > > firmware to it, independent of whatever application-level bitstream is > > going to go in there right? > [...] > > Either way, I'm not sure how the firmware for accelerator cards is any > > different from the firmware for other devices on the system. Maybe the > > confusion is just that Cyborg does "programming" which seems similar > > to "updating firmware"? > [...] > > FPGA configuration is a compiled binary blob written into non-volatile > memory through a hardware interface. 
These similarities to firmware also > result in many people actually calling it "firmware" even though, you're right, > technically it's a mapping of gate interconnections and not really firmware in > the conventional sense. +1 > I wouldn't be surprised, though, if there *are* NFV-related cases where the > users of the virtual machines into which some network hardware is mapped > need access to alter parts of, say, an interface controller's firmware. The Linux > kernel has for years incorporated features to write or rewrite firmware and > other microcode for certain devices at boot time for similar reasons, after all. This aspect does come up for discussion a lot. Generally, operators and device vendors get alarmed at the prospect of letting a user/VNF/instance program an image/bitstream into a device directly -- we wouldn't know what image it is, etc. Cyborg doesn't support that. But Cyborg could program an image/bitstream on behalf of the user/VNF. That said, the VNF or VM (in a non-networking context) can configure a device by reading from registers/DDR on the card or writing to them. They can be handled using standard access permissions, Linux capabilities, etc. For example, the VM may memory-map a region of the device's address space using the mmap system call, and that access can be controlled. > -- > Jeremy Stanley Regards, Sundar From smooney at redhat.com Mon Jan 13 18:53:00 2020 From: smooney at redhat.com (Sean Mooney) Date: Mon, 13 Jan 2020 18:53:00 +0000 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <20200113165340.ge3hitlqrdfhj52m@yuggoth.org> Message-ID: On Mon, 2020-01-13 at 18:26 +0000, Nadathur, Sundar wrote: > > -----Original Message----- > > From: Jeremy Stanley > > Sent: Monday, January 13, 2020 8:54 AM > > To: openstack-discuss at lists.openstack.org > > Subject: Re: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators > > management > > > > On 2020-01-13 07:16:30 -0800 (-0800), Dan Smith wrote: > > [...] > > > What does this matter though? If you're talking about firmware for an > > > FPGA card, that's what you need to know in order to apply the correct > > > firmware to it, independent of whatever application-level bitstream is > > > going to go in there right? > > > > [...] > > > Either way, I'm not sure how the firmware for accelerator cards is any > > > different from the firmware for other devices on the system. Maybe the > > > confusion is just that Cyborg does "programming" which seems similar > > > to "updating firmware"? > > > > [...] > > > > FPGA configuration is a compiled binary blob written into non-volatile > > memory through a hardware interface. These similarities to firmware also > > result in many people actually calling it "firmware" even though, you're right, > > technically it's a mapping of gate interconnections and not really firmware in > > the conventional sense. > > +1 > > > I wouldn't be surprised, though, if there *are* NFV-related cases where the > > users of the virtual machines into which some network hardware is mapped > > need access to alter parts of, say, an interface controller's firmware. The Linux > > kernel has for years incorporated features to write or rewrite firmware and > > other microcode for certain devices at boot time for similar reasons, after all. > > This aspect does come up for discussion a lot. 
Generally, operators and device vendors get alarmed at the prospect of > letting a user/VNF/instance program an image/bitstream into a device directly -- we wouldn't know what image it is, > etc. Cyborg doesn't support that. But Cyborg could program an image/bitstream on behalf of the user/VNF. to be fair if you device support reprogramming over pcie then you can enable the guest to reprogram the device using nova's pci passthough feature by passing through the entire pf. cyborgs role is to provide a magaged acclerator not an unmanaged one. if we wanted to use use pre programed fpga or fix function acclerator you have been able to do that with pci passtough for the better part of 4 years. so i would consider unmanaged acclerator out of scope of cyborg at least until the integration of managed accllerator is done. nova already handelds vGPU, vPMEM(persistent memeory), generic pci passthough, sriov for neutron ports and hardware offloaded ovs VF(e.g. smart nic integration). cyborgs add value is in managing things nova cannot provide easily. arguing that ironic shoudl mangage fpga bitstream becasue it can manage firmware from a nova point of view is arguaing the virt driver should manage all devices that are provide to the guest meaning in the libvirt case it and not cyborg shoudl be continuted to be extended to mange fpgas and any other devices directly. we coudl do that but that would leave only one thing for cyborge to manage which woudl be remote acclartor that could be proved to instnace over a network fabric. making it a kind of cinder of acclerators. that is a usecase that nova and ironic both woudl be ill sutied for but it is not the dirction the cyborg project has moved in so unless you are suggesing cyborg should piviot i dont think we should redesign the interaction between nova ironic cyborg and neutron to have ironci manage the devices. i do think there is merrit in some integration between the ironic python agent and cyborg for discovery and perhaps programing of the fpga on an ironic node assuming the actual discovery and programing logic live in cyborg and ironic simply runs/deploys/configures the cyborg agent in the ipa image or invokes the cyborg code directly. > > That said, the VNF or VM (in a non-networking context) can configure a device by reading from registers/DDR on the > card or writing to them. They can be handled using standard access permissions, Linux capabilities, etc. For example, > the VM may memory-map a region of the device's address space using the mmap system call, and that access can be > controlled. > > > -- > > Jeremy Stanley > > Regards, > Sundar > From dms at danplanet.com Mon Jan 13 19:15:23 2020 From: dms at danplanet.com (Dan Smith) Date: Mon, 13 Jan 2020 11:15:23 -0800 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: (Sundar Nadathur's message of "Mon, 13 Jan 2020 18:16:20 +0000") References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> Message-ID: > This goes back to the definition of firmware update vs. programming in > my earlier post. In a Nova + Ironic + Cyborg env, I'd expect Cyborg to > do programming. Firmware updates can be done by Ironic, > Ansible/Redfish/... , some combination like Ironic with Redfish > driver, or whatever the operator chooses. Yes, this is my point. I think we're in agreement here. >> What does this matter though? 
If you're talking about firmware for an FPGA >> card, that's what you need to know in order to apply the correct firmware to >> it, independent of whatever application-level bitstream is going to go in there >> right? > > The device properties are needed for scheduling: users are often > interested in getting a VM with an accelerator that has specific > properties: e.g. implements a specific version of gzip, has 4 GB or > more of device-local memory etc. Right, I'm saying I don't think Ironic needs to know anything other than the PCI ID of a card in order to update its firmware, correct? You and I are definitely in agreement that Ironic should have nothing to do with _programming_ and thus nothing to do with _scheduling_ of workloads (affined-) to accelerators. > By a "full lifecycle event", you presumably mean vacating the entire > node. For device updates, that is not always needed: one could > disconnect just the instances using that device. The server/device > vendor rules must specify the 'lifecycle event' involved for a > specific update. Right, I'm saying that today (AFAIK) Ironic can only do the "vacate, destroy, clean, re-image" sort of lifecycle, which is very heavyweight to just update firmware on a card. > Updates of other devices, like CPU or motherboard components, often > require server reboots. Accelerator updates may or may not require > them, depending on ... all kinds of things. Yep, all of this is lighter-weight than Ironic destroying, cleaning, and re-imaging a node. I'm making the case for "sure, Ironic could do the firmware update if it's cleaning a node, but in most cases you probably want a more lightweight process like ansible and a reboot." So again, I think we're in full agreement on the classification of operation, and the subset of that which is wholly owned by Cyborg, as well as what of that *may* be owned by Ironic or any other hardware management tool. --Dan From feilong at catalyst.net.nz Mon Jan 13 19:03:57 2020 From: feilong at catalyst.net.nz (feilong) Date: Tue, 14 Jan 2020 08:03:57 +1300 Subject: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken In-Reply-To: References: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> <6c8f45f2-da74-18fd-7909-84c9c6762fe3@catalyst.net.nz> Message-ID: <24a8d164-8e38-1512-caf3-9447f070b8fd@catalyst.net.nz> Hi Bhupathi, Firstly, I would suggest setting the use_podman=False when using fedora atomic image. And it would be nice to set the "kube_tag", e.g. v1.15.6 explicitly. Then please trigger a new cluster creation. Then if you still run into error. Here is the debug steps: 1. ssh into the master node, check log /var/log/cloud-init-output.log 2. If there is no error in above log file, then run journalctl -u heat-container-agent to check the heat-container-agent log. If above step is correct, then you must be able to see something useful here. On 11/01/20 12:15 AM, Bhupathi, Ramakrishna wrote: > > Wang, > > Here it is  . I added the labels subsequently. My nova and neutron are > working all right as I installed various systems there working with no > issues.. > >   > >   > > *From:*Feilong Wang [mailto:feilong at catalyst.net.nz] > *Sent:* Thursday, January 9, 2020 6:12 PM > *To:* openstack-discuss at lists.openstack.org > *Subject:* Re: [magnum]: K8s cluster creation times out. OpenStack > Train : [ERROR]: Unable to render networking. 
Network config is likely > broken > >   > > Hi Bhupathi, > > Could you please share your cluster template? And please make sure > your Nova/Neutron works. > >   > > On 10/01/20 2:45 AM, Bhupathi, Ramakrishna wrote: > > Folks, > > I am building a Kubernetes Cluster( Openstack Train) and using > fedora atomic-29 image . The nodes come up  fine ( I have a simple > 1 master and 1 node) , but the cluster creation times out,  and > when I access the cloud-init logs I see this error .  Wondering > what I am missing as this used to work before.  I wonder if this > is image related . > >   > > [ERROR]: Unable to render networking. Network config is likely > broken: No available network renderers found. Searched through > list: ['eni', 'sysconfig', 'netplan'] > >   > > Essentially the stack creation fails in “kube_cluster_deploy” > >   > > Can somebody help me debug this ? Any help is appreciated. > >   > > --RamaK > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. > > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > Head of R&D > Catalyst Cloud - Cloud Native New Zealand > -------------------------------------------------------------------------- > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. -- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 55224 bytes Desc: not available URL: From pierre at stackhpc.com Mon Jan 13 20:20:50 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Mon, 13 Jan 2020 20:20:50 +0000 Subject: [blazar] No IRC meeting tomorrow Message-ID: Hello, As announced in the last meeting, due to travel the weekly Blazar IRC meeting of January 14 is cancelled. The next meeting will be held on Jan 21. 
Thanks, Pierre Riteau (priteau) From feilong at catalyst.net.nz Mon Jan 13 21:10:29 2020 From: feilong at catalyst.net.nz (Feilong Wang) Date: Tue, 14 Jan 2020 10:10:29 +1300 Subject: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken In-Reply-To: References: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> <6c8f45f2-da74-18fd-7909-84c9c6762fe3@catalyst.net.nz> <24a8d164-8e38-1512-caf3-9447f070b8fd@catalyst.net.nz> Message-ID: Hi Donny, Do you mean Fedore CoreOS or just CoreOS? The current CoreOS driver is not actively maintained, I would suggest migrating to Fedora CoreOS and I'm happy to help if you have any question. Thanks. On 14/01/20 9:57 AM, Donny Davis wrote: > FWIW I was only able to get the coreos image working with magnum oob.. > the rest just didn't work.  > > On Mon, Jan 13, 2020 at 2:31 PM feilong > wrote: > > Hi Bhupathi, > > Firstly, I would suggest setting the use_podman=False when using > fedora atomic image. And it would be nice to set the "kube_tag", > e.g. v1.15.6 explicitly. Then please trigger a new cluster > creation. Then if you still run into error. Here is the debug steps: > > 1. ssh into the master node, check log /var/log/cloud-init-output.log > > 2. If there is no error in above log file, then run journalctl -u > heat-container-agent to check the heat-container-agent log. If > above step is correct, then you must be able to see something > useful here. > > > On 11/01/20 12:15 AM, Bhupathi, Ramakrishna wrote: >> >> Wang, >> >> Here it is  . I added the labels subsequently. My nova and >> neutron are working all right as I installed various systems >> there working with no issues.. >> >>   >> >>   >> >> *From:*Feilong Wang [mailto:feilong at catalyst.net.nz] >> *Sent:* Thursday, January 9, 2020 6:12 PM >> *To:* openstack-discuss at lists.openstack.org >> >> *Subject:* Re: [magnum]: K8s cluster creation times out. >> OpenStack Train : [ERROR]: Unable to render networking. Network >> config is likely broken >> >>   >> >> Hi Bhupathi, >> >> Could you please share your cluster template? And please make >> sure your Nova/Neutron works. >> >>   >> >> On 10/01/20 2:45 AM, Bhupathi, Ramakrishna wrote: >> >> Folks, >> >> I am building a Kubernetes Cluster( Openstack Train) and >> using fedora atomic-29 image . The nodes come up  fine ( I >> have a simple 1 master and 1 node) , but the cluster creation >> times out,  and when I access the cloud-init logs I see this >> error .  Wondering what I am missing as this used to work >> before.  I wonder if this is image related . >> >>   >> >> [ERROR]: Unable to render networking. Network config is >> likely broken: No available network renderers found. Searched >> through list: ['eni', 'sysconfig', 'netplan'] >> >>   >> >> Essentially the stack creation fails in “kube_cluster_deploy” >> >>   >> >> Can somebody help me debug this ? Any help is appreciated. >> >>   >> >> --RamaK >> >> The contents of this e-mail message and >> any attachments are intended solely for the >> addressee(s) and may contain confidential >> and/or legally privileged information. If you >> are not the intended recipient of this message >> or if this message has been addressed to you >> in error, please immediately alert the sender >> by reply e-mail and then delete this message >> and any attachments. 
If you are not the >> intended recipient, you are notified that >> any use, dissemination, distribution, copying, >> or storage of this message or any attachment >> is strictly prohibited. >> >> -- >> Cheers & Best regards, >> Feilong Wang (王飞龙) >> Head of R&D >> Catalyst Cloud - Cloud Native New Zealand >> -------------------------------------------------------------------------- >> Tel: +64-48032246 >> Email: flwang at catalyst.net.nz >> Level 6, Catalyst House, 150 Willis Street, Wellington >> -------------------------------------------------------------------------- >> The contents of this e-mail message and >> any attachments are intended solely for the >> addressee(s) and may contain confidential >> and/or legally privileged information. If you >> are not the intended recipient of this message >> or if this message has been addressed to you >> in error, please immediately alert the sender >> by reply e-mail and then delete this message >> and any attachments. If you are not the >> intended recipient, you are notified that >> any use, dissemination, distribution, copying, >> or storage of this message or any attachment >> is strictly prohibited. > > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > ------------------------------------------------------ > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > ------------------------------------------------------ > > > > -- > ~/DonnyD > C: 805 814 6800 > "No mission too difficult. No sacrifice too great. Duty First" -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 55224 bytes Desc: not available URL: From donny at fortnebula.com Mon Jan 13 21:21:33 2020 From: donny at fortnebula.com (Donny Davis) Date: Mon, 13 Jan 2020 16:21:33 -0500 Subject: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken In-Reply-To: References: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> <6c8f45f2-da74-18fd-7909-84c9c6762fe3@catalyst.net.nz> <24a8d164-8e38-1512-caf3-9447f070b8fd@catalyst.net.nz> Message-ID: Just Coreos - I tried them all and it was the only one that worked oob. On Mon, Jan 13, 2020 at 4:10 PM Feilong Wang wrote: > Hi Donny, > > Do you mean Fedore CoreOS or just CoreOS? The current CoreOS driver is not > actively maintained, I would suggest migrating to Fedora CoreOS and I'm > happy to help if you have any question. Thanks. > > > On 14/01/20 9:57 AM, Donny Davis wrote: > > FWIW I was only able to get the coreos image working with magnum oob.. the > rest just didn't work. > > On Mon, Jan 13, 2020 at 2:31 PM feilong wrote: > >> Hi Bhupathi, >> >> Firstly, I would suggest setting the use_podman=False when using fedora >> atomic image. And it would be nice to set the "kube_tag", e.g. v1.15.6 >> explicitly. Then please trigger a new cluster creation. Then if you still >> run into error. Here is the debug steps: >> >> 1. 
ssh into the master node, check log /var/log/cloud-init-output.log >> >> 2. If there is no error in above log file, then run journalctl -u >> heat-container-agent to check the heat-container-agent log. If above step >> is correct, then you must be able to see something useful here. >> >> >> On 11/01/20 12:15 AM, Bhupathi, Ramakrishna wrote: >> >> Wang, >> >> Here it is . I added the labels subsequently. My nova and neutron are >> working all right as I installed various systems there working with no >> issues.. >> >> >> >> >> >> *From:* Feilong Wang [mailto:feilong at catalyst.net.nz >> ] >> *Sent:* Thursday, January 9, 2020 6:12 PM >> *To:* openstack-discuss at lists.openstack.org >> *Subject:* Re: [magnum]: K8s cluster creation times out. OpenStack Train >> : [ERROR]: Unable to render networking. Network config is likely broken >> >> >> >> Hi Bhupathi, >> >> Could you please share your cluster template? And please make sure your >> Nova/Neutron works. >> >> >> >> On 10/01/20 2:45 AM, Bhupathi, Ramakrishna wrote: >> >> Folks, >> >> I am building a Kubernetes Cluster( Openstack Train) and using fedora >> atomic-29 image . The nodes come up fine ( I have a simple 1 master and 1 >> node) , but the cluster creation times out, and when I access the >> cloud-init logs I see this error . Wondering what I am missing as this >> used to work before. I wonder if this is image related . >> >> >> >> [ERROR]: Unable to render networking. Network config is likely broken: No >> available network renderers found. Searched through list: ['eni', >> 'sysconfig', 'netplan'] >> >> >> >> Essentially the stack creation fails in “kube_cluster_deploy” >> >> >> >> Can somebody help me debug this ? Any help is appreciated. >> >> >> >> --RamaK >> >> The contents of this e-mail message and >> any attachments are intended solely for the >> addressee(s) and may contain confidential >> and/or legally privileged information. If you >> are not the intended recipient of this message >> or if this message has been addressed to you >> in error, please immediately alert the sender >> by reply e-mail and then delete this message >> and any attachments. If you are not the >> intended recipient, you are notified that >> any use, dissemination, distribution, copying, >> or storage of this message or any attachment >> is strictly prohibited. >> >> -- >> >> Cheers & Best regards, >> >> Feilong Wang (王飞龙) >> >> Head of R&D >> >> Catalyst Cloud - Cloud Native New Zealand >> >> -------------------------------------------------------------------------- >> >> Tel: +64-48032246 >> >> Email: flwang at catalyst.net.nz >> >> Level 6, Catalyst House, 150 Willis Street, Wellington >> >> -------------------------------------------------------------------------- >> >> The contents of this e-mail message and >> any attachments are intended solely for the >> addressee(s) and may contain confidential >> and/or legally privileged information. If you >> are not the intended recipient of this message >> or if this message has been addressed to you >> in error, please immediately alert the sender >> by reply e-mail and then delete this message >> and any attachments. If you are not the >> intended recipient, you are notified that >> any use, dissemination, distribution, copying, >> or storage of this message or any attachment >> is strictly prohibited. 
>> >> -- >> Cheers & Best regards, >> Feilong Wang (王飞龙) >> ------------------------------------------------------ >> Senior Cloud Software Engineer >> Tel: +64-48032246 >> Email: flwang at catalyst.net.nz >> Catalyst IT Limited >> Level 6, Catalyst House, 150 Willis Street, Wellington >> ------------------------------------------------------ >> >> > > -- > ~/DonnyD > C: 805 814 6800 > "No mission too difficult. No sacrifice too great. Duty First" > > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > Head of R&D > Catalyst Cloud - Cloud Native New Zealand > -------------------------------------------------------------------------- > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > > -- ~/DonnyD C: 805 814 6800 "No mission too difficult. No sacrifice too great. Duty First" -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 55224 bytes Desc: not available URL: From feilong at catalyst.net.nz Mon Jan 13 21:25:08 2020 From: feilong at catalyst.net.nz (Feilong Wang) Date: Tue, 14 Jan 2020 10:25:08 +1300 Subject: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken In-Reply-To: References: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> <6c8f45f2-da74-18fd-7909-84c9c6762fe3@catalyst.net.nz> <24a8d164-8e38-1512-caf3-9447f070b8fd@catalyst.net.nz> Message-ID: <5846dbb7-fbcd-db22-7342-5ba2b6e4a1d3@catalyst.net.nz> OK, if you're happy to stay on CoreOS, all good. If you're interested in migrating to Fedora CoreOS and have questions, then you're welcome to popup in #openstack-containers. Cheers. On 14/01/20 10:21 AM, Donny Davis wrote: > Just Coreos - I tried them all and it was the only one that worked oob.  > > On Mon, Jan 13, 2020 at 4:10 PM Feilong Wang > wrote: > > Hi Donny, > > Do you mean Fedore CoreOS or just CoreOS? The current CoreOS > driver is not actively maintained, I would suggest migrating to > Fedora CoreOS and I'm happy to help if you have any question. Thanks. > > > On 14/01/20 9:57 AM, Donny Davis wrote: >> FWIW I was only able to get the coreos image working with magnum >> oob.. the rest just didn't work.  >> >> On Mon, Jan 13, 2020 at 2:31 PM feilong > > wrote: >> >> Hi Bhupathi, >> >> Firstly, I would suggest setting the use_podman=False when >> using fedora atomic image. And it would be nice to set the >> "kube_tag", e.g. v1.15.6 explicitly. Then please trigger a >> new cluster creation. Then if you still run into error. Here >> is the debug steps: >> >> 1. ssh into the master node, check log >> /var/log/cloud-init-output.log >> >> 2. If there is no error in above log file, then run >> journalctl -u heat-container-agent to check the >> heat-container-agent log. If above step is correct, then you >> must be able to see something useful here. >> >> >> On 11/01/20 12:15 AM, Bhupathi, Ramakrishna wrote: >>> >>> Wang, >>> >>> Here it is  . I added the labels subsequently. My nova and >>> neutron are working all right as I installed various systems >>> there working with no issues.. 
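(As an aside for anyone reproducing this: the label changes suggested earlier in the thread would look roughly like the following on the template create command, with the rest of the options left as they already are. The values are the ones mentioned above; adjust as needed.

openstack coe cluster template create \
  ... \
  --labels use_podman=False,kube_tag=v1.15.6 \
  <template-name>

The important part is that use_podman and kube_tag are ordinary key=value entries in the --labels string rather than separate flags.)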
>>> >>>   >>> >>>   >>> >>> *From:*Feilong Wang [mailto:feilong at catalyst.net.nz] >>> *Sent:* Thursday, January 9, 2020 6:12 PM >>> *To:* openstack-discuss at lists.openstack.org >>> >>> *Subject:* Re: [magnum]: K8s cluster creation times out. >>> OpenStack Train : [ERROR]: Unable to render networking. >>> Network config is likely broken >>> >>>   >>> >>> Hi Bhupathi, >>> >>> Could you please share your cluster template? And please >>> make sure your Nova/Neutron works. >>> >>>   >>> >>> On 10/01/20 2:45 AM, Bhupathi, Ramakrishna wrote: >>> >>> Folks, >>> >>> I am building a Kubernetes Cluster( Openstack Train) and >>> using fedora atomic-29 image . The nodes come up  fine ( >>> I have a simple 1 master and 1 node) , but the cluster >>> creation times out,  and when I access the cloud-init >>> logs I see this error .  Wondering what I am missing as >>> this used to work before.  I wonder if this is image >>> related . >>> >>>   >>> >>> [ERROR]: Unable to render networking. Network config is >>> likely broken: No available network renderers found. >>> Searched through list: ['eni', 'sysconfig', 'netplan'] >>> >>>   >>> >>> Essentially the stack creation fails in >>> “kube_cluster_deploy” >>> >>>   >>> >>> Can somebody help me debug this ? Any help is appreciated. >>> >>>   >>> >>> --RamaK >>> >>> The contents of this e-mail message and >>> any attachments are intended solely for the >>> addressee(s) and may contain confidential >>> and/or legally privileged information. If you >>> are not the intended recipient of this message >>> or if this message has been addressed to you >>> in error, please immediately alert the sender >>> by reply e-mail and then delete this message >>> and any attachments. If you are not the >>> intended recipient, you are notified that >>> any use, dissemination, distribution, copying, >>> or storage of this message or any attachment >>> is strictly prohibited. >>> >>> -- >>> Cheers & Best regards, >>> Feilong Wang (王飞龙) >>> Head of R&D >>> Catalyst Cloud - Cloud Native New Zealand >>> -------------------------------------------------------------------------- >>> Tel: +64-48032246 >>> Email: flwang at catalyst.net.nz >>> Level 6, Catalyst House, 150 Willis Street, Wellington >>> -------------------------------------------------------------------------- >>> The contents of this e-mail message and >>> any attachments are intended solely for the >>> addressee(s) and may contain confidential >>> and/or legally privileged information. If you >>> are not the intended recipient of this message >>> or if this message has been addressed to you >>> in error, please immediately alert the sender >>> by reply e-mail and then delete this message >>> and any attachments. If you are not the >>> intended recipient, you are notified that >>> any use, dissemination, distribution, copying, >>> or storage of this message or any attachment >>> is strictly prohibited. >> >> -- >> Cheers & Best regards, >> Feilong Wang (王飞龙) >> ------------------------------------------------------ >> Senior Cloud Software Engineer >> Tel: +64-48032246 >> Email: flwang at catalyst.net.nz >> Catalyst IT Limited >> Level 6, Catalyst House, 150 Willis Street, Wellington >> ------------------------------------------------------ >> >> >> >> -- >> ~/DonnyD >> C: 805 814 6800 >> "No mission too difficult. No sacrifice too great. 
Duty First" > > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > Head of R&D > Catalyst Cloud - Cloud Native New Zealand > -------------------------------------------------------------------------- > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > > > > -- > ~/DonnyD > C: 805 814 6800 > "No mission too difficult. No sacrifice too great. Duty First" -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 55224 bytes Desc: not available URL: From feilong at catalyst.net.nz Mon Jan 13 21:38:56 2020 From: feilong at catalyst.net.nz (Feilong Wang) Date: Tue, 14 Jan 2020 10:38:56 +1300 Subject: [magnum][kolla] etcd wal sync duration issue In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA0477170E@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA0477170E@gmsxchsvr01.thecreation.com> Message-ID: <3f3fe0d1-7b61-d2f9-da65-d126ea5ed336@catalyst.net.nz> Hi Eric, That issue looks familiar for me. There are some questions I'd like to check before answering if you should upgrade to train. 1. Are using the default v3.2.7 version for etcd? 2. Did you try to reproduce this with devstack, using Fedora CoreOS driver? The etcd version could be 3.2.26 I asked above questions because I saw the same error when I used Fedora Atomic with etcd v3.2.7 and I can't reproduce it with Fedora CoreOS + etcd 3.2.26 On 12/01/20 6:44 AM, Eric K. Miller wrote: > > Hi, > >   > > We are using the following coe cluster template and cluster create > commands on an OpenStack Stein installation that installs Magnum 8.2.0 > Kolla containers installed by Kolla-Ansible 8.0.1: > >   > > openstack coe cluster template create \ > >   --image Fedora-AtomicHost-29-20191126.0.x86_64_raw \ > >   --keypair userkey \ > >   --external-network ext-net \ > >   --dns-nameserver 1.1.1.1 \ > >   --master-flavor c5sd.4xlarge \ > >   --flavor m5sd.4xlarge \ > >   --coe kubernetes \ > >   --network-driver flannel \ > >   --volume-driver cinder \ > >   --docker-storage-driver overlay2 \ > >   --docker-volume-size 100 \ > >   --registry-enabled \ > >  --master-lb-enabled \ > >   --floating-ip-disabled \ > >   --fixed-network KubernetesProjectNetwork001 \ > >   --fixed-subnet KubernetesProjectSubnet001 \ > >   --labels > kube_tag=v1.15.7,cloud_provider_tag=v1.15.0,heat_container_agent_tag=stein-dev,master_lb_floating_ip_enabled=true > \ > >   k8s-cluster-template-1.15.7-production-private > >   > > openstack coe cluster create \ > >   --cluster-template k8s-cluster-template-1.15.7-production-private \ > >   --keypair userkey \ > >   --master-count 3 \ > >   --node-count 3 \ > >   k8s-cluster001 > >   > > The deploy process works perfectly, however, the cluster health status > flips between healthy and unhealthy.  The unhealthy status indicates > that etcd has an issue. 
> >   > > When logged into master-0 (out of 3, as configured above), "systemctl > status etcd" shows the stdout from etcd, which shows: > >   > > Jan 11 17:27:36 k8s-cluster001-4effrc2irvjq-master-0.novalocal > runc[2725]: 2020-01-11 17:27:36.548453 W | etcdserver: timed out > waiting for read index response > > Jan 11 17:28:02 k8s-cluster001-4effrc2irvjq-master-0.novalocal > runc[2725]: 2020-01-11 17:28:02.960977 W | wal: sync duration of > 1.696804699s, expected less than 1s > > Jan 11 17:28:31 k8s-cluster001-4effrc2irvjq-master-0.novalocal > runc[2725]: 2020-01-11 17:28:31.292753 W | wal: sync duration of > 2.249722223s, expected less than 1s > >   > > We also see: > > Jan 11 17:40:39 k8s-cluster001-4effrc2irvjq-master-0.novalocal > runc[2725]: 2020-01-11 17:40:39.132459 I | etcdserver/api/v3rpc: grpc: > Server.processUnaryRPC failed to write status: stream error: code = > DeadlineExceeded desc = "context deadline exceeded" > >   > > We initially used relatively small flavors, but increased these to > something very large to be sure resources were not constrained in any > way.  "top" reported no CPU nor memory contention on any nodes in > either case. > >   > > Multiple clusters have been deployed, and they all have this issue, > including empty clusters that were just deployed. > >   > > I see a very large number of reports of similar issues with etcd, but > discussions lead to disk performance, which can't be the cause here, > not only because persistent storage for etcd isn't configured in > Magnum, but also the disks are "very" fast in this environment.  > Looking at "vmstat -D" from within master-0, the number of writes is > minimal.  Ceilometer logs about 15 to 20 write IOPS for this VM in > Gnocchi. > >   > > Any ideas? > >   > > We are finalizing procedures to upgrade to Train, so we wanted to be > sure that we weren't running into some common issue with Stein that > would immediately be solved with Train.  If so, we will simply proceed > with the upgrade and avoid diagnosing this issue further. > > > Thanks! > >   > > Eric > >   > -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mihalis68 at gmail.com Mon Jan 13 21:46:28 2020 From: mihalis68 at gmail.com (Chris Morgan) Date: Mon, 13 Jan 2020 16:46:28 -0500 Subject: [ops] No ops meetup team meeting tomorrow (2020/1/14) on IRC Message-ID: We had a great OpenStack Ops meetup last week in London. Those involved now need to catch up on other work, so we'll resume regular meetings on Jan 21st. Our thanks to everyone who attended and/or helped make it one of the best meetups in a while. Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From emiller at genesishosting.com Mon Jan 13 21:52:54 2020 From: emiller at genesishosting.com (Eric K. 
Miller) Date: Mon, 13 Jan 2020 15:52:54 -0600 Subject: [magnum][kolla] etcd wal sync duration issue In-Reply-To: <3f3fe0d1-7b61-d2f9-da65-d126ea5ed336@catalyst.net.nz> References: <046E9C0290DD9149B106B72FC9156BEA0477170E@gmsxchsvr01.thecreation.com> <3f3fe0d1-7b61-d2f9-da65-d126ea5ed336@catalyst.net.nz> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04771716@gmsxchsvr01.thecreation.com> Hi Feilong, Thanks for responding! I am, indeed, using the default v3.2.7 version for etcd, which is the only available image. I did not try to reproduce with any other driver (we have never used DevStack, honestly, only Kolla-Ansible deployments). I did see a number of people indicating similar issues with etcd versions in the 3.3.x range, so I didn't think of it being an etcd issue, but then again most issues seem to be a result of people using HDDs and not SSDs, which makes sense. Interesting that you saw the same issue, though. We haven't tried Fedora CoreOS, but I think we would need Train for this. Everything I read about etcd indicates that it is extremely latency sensitive, due to the fact that it replicates all changes to all nodes and sends an fsync to Linux each time, so data is always guaranteed to be stored. I can see this becoming an issue quickly without super-low-latency network and storage. We are using Ceph-based SSD volumes for the Kubernetes Master node disks, which is extremely fast (likely 10x or better than anything people recommend for etcd), but network latency is always going to be higher with VMs on OpenStack with DVR than bare metal with VLANs due to all of the abstractions. Do you know who maintains the etcd images for Magnum here? Is there an easy way to create a newer image? https://hub.docker.com/r/openstackmagnum/etcd/tags/ Eric From: Feilong Wang [mailto:feilong at catalyst.net.nz] Sent: Monday, January 13, 2020 3:39 PM To: openstack-discuss at lists.openstack.org Subject: Re: [magnum][kolla] etcd wal sync duration issue Hi Eric, That issue looks familiar for me. There are some questions I'd like to check before answering if you should upgrade to train. 1. Are using the default v3.2.7 version for etcd? 2. Did you try to reproduce this with devstack, using Fedora CoreOS driver? The etcd version could be 3.2.26 I asked above questions because I saw the same error when I used Fedora Atomic with etcd v3.2.7 and I can't reproduce it with Fedora CoreOS + etcd 3.2.26 From Albert.Braden at synopsys.com Tue Jan 14 01:01:30 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Tue, 14 Jan 2020 01:01:30 +0000 Subject: Designate zone troubleshooting Message-ID: I would like to improve this document so that it can be more useful. https://docs.openstack.org/designate/rocky/admin/troubleshooting.html I'm experiencing "I have a broken zone" in my dev cluster right now, and I would like to update this document with the repair procedure. Can anyone help me figure out what that is? The logs no longer contain the original failure; I want to figure out and then document the procedure that would change my zone statuses from "ERROR" back to "ACTIVE." 
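Until the troubleshooting document grows a real procedure, the triage steps that tend to get used for a zone stuck in ERROR/CREATE look something like the following. Treat this as a sketch rather than an authoritative recipe; command availability and policy vary by release and deployment.

# Is the zone actually present on the pool's nameservers? Ask them directly.
dig @<pool-ns-ip> dg.us01-dev1.synopsys.com. SOA +short

# Are all designate services (central, mdns, worker/producer) reporting in?
openstack dns service list

# Re-push the pool definition in case the targets/nameservers in pools.yaml drifted.
designate-manage pool update

# Last resort for a zone wedged in ERROR/CREATE: make designate forget it, then recreate it.
# (admin-only, and it leaves whatever the backend already has untouched)
openstack zone abandon d9a74e85-22a7-4844-968d-35e0aefd9997

For reference, the zone list below shows the zones currently affected.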
root at us01odc-dev1-ctrl1:/var/log/designate# openstack zone list --all-projects +--------------------------------------+----------------------------------+-----------------------------+---------+------------+--------+--------+ | id | project_id | name | type | serial | status | action | +--------------------------------------+----------------------------------+-----------------------------+---------+------------+--------+--------+ | d9a74e85-22a7-4844-968d-35e0aefd9997 | cb36981f16674c1a8b2a73f30370f88e | dg.us01-dev1.synopsys.com. | PRIMARY | 1578962764 | ERROR | CREATE | | 29484d33-eb26-4a35-aff8-22f84acf16cd | 474ae347d8ad426f8118e55eee47dcfd | it.us01-dev1.synopsys.com. | PRIMARY | 1578962485 | ACTIVE | NONE | | 05356780-26c7-4649-8532-a42e3c2b75a3 | 1cc94ed7c37a4b4d86e1af3c92a8967c | 112.195.10.in-addr.arpa. | PRIMARY | 1578962486 | ACTIVE | NONE | | cc8290ba-12f8-485e-a9bb-6de3324764ef | eb5fa5310ca648d19cc0d35fdf13953a | seg.us01-dev1.synopsys.com. | PRIMARY | 1578962207 | ERROR | CREATE | | e3abb13c-58f6-49da-9aab-0a143c7c4fb8 | 1cc94ed7c37a4b4d86e1af3c92a8967c | 117.195.10.in-addr.arpa. | PRIMARY | 1578962208 | ERROR | CREATE | | 236949dc-ea7e-4ad7-a570-b62fccd05fac | 1cc94ed7c37a4b4d86e1af3c92a8967c | 113.195.10.in-addr.arpa. | PRIMARY | 1578962765 | ERROR | CREATE | +--------------------------------------+----------------------------------+-----------------------------+---------+------------+--------+--------+ -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Tue Jan 14 07:37:16 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 14 Jan 2020 08:37:16 +0100 Subject: [magnum][kolla] etcd wal sync duration issue In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04771716@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA0477170E@gmsxchsvr01.thecreation.com> <3f3fe0d1-7b61-d2f9-da65-d126ea5ed336@catalyst.net.nz> <046E9C0290DD9149B106B72FC9156BEA04771716@gmsxchsvr01.thecreation.com> Message-ID: Just to clarify: this etcd is not provided by Kolla nor installed by Kolla-Ansible. -yoctozepto pon., 13 sty 2020 o 22:54 Eric K. Miller napisał(a): > > Hi Feilong, > > Thanks for responding! I am, indeed, using the default v3.2.7 version for etcd, which is the only available image. > > I did not try to reproduce with any other driver (we have never used DevStack, honestly, only Kolla-Ansible deployments). I did see a number of people indicating similar issues with etcd versions in the 3.3.x range, so I didn't think of it being an etcd issue, but then again most issues seem to be a result of people using HDDs and not SSDs, which makes sense. > > Interesting that you saw the same issue, though. We haven't tried Fedora CoreOS, but I think we would need Train for this. > > Everything I read about etcd indicates that it is extremely latency sensitive, due to the fact that it replicates all changes to all nodes and sends an fsync to Linux each time, so data is always guaranteed to be stored. I can see this becoming an issue quickly without super-low-latency network and storage. We are using Ceph-based SSD volumes for the Kubernetes Master node disks, which is extremely fast (likely 10x or better than anything people recommend for etcd), but network latency is always going to be higher with VMs on OpenStack with DVR than bare metal with VLANs due to all of the abstractions. > > Do you know who maintains the etcd images for Magnum here? 
Is there an easy way to create a newer image? > https://hub.docker.com/r/openstackmagnum/etcd/tags/ > > Eric > > > > From: Feilong Wang [mailto:feilong at catalyst.net.nz] > Sent: Monday, January 13, 2020 3:39 PM > To: openstack-discuss at lists.openstack.org > Subject: Re: [magnum][kolla] etcd wal sync duration issue > > Hi Eric, > That issue looks familiar for me. There are some questions I'd like to check before answering if you should upgrade to train. > 1. Are using the default v3.2.7 version for etcd? > 2. Did you try to reproduce this with devstack, using Fedora CoreOS driver? The etcd version could be 3.2.26 > I asked above questions because I saw the same error when I used Fedora Atomic with etcd v3.2.7 and I can't reproduce it with Fedora CoreOS + etcd 3.2.26 > > From skaplons at redhat.com Tue Jan 14 08:20:54 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 14 Jan 2020 09:20:54 +0100 Subject: [all] DevStack jobs broken due to setuptools not available for Python 2 In-Reply-To: <20200113173220.sdryzodzxvhxhvqc@yuggoth.org> References: <20200113060554.GA3819219@fedora19.localdomain> <16f9f3be487.10f2e3b05395088.6540142001370012765@ghanshyammann.com> <20200113144356.mj3ot2crequlcowc@yuggoth.org> <20200113154144.os44mrihcotvwt3r@yuggoth.org> <20200113173220.sdryzodzxvhxhvqc@yuggoth.org> Message-ID: Hi, > On 13 Jan 2020, at 18:32, Jeremy Stanley wrote: > > On 2020-01-13 15:41:44 +0000 (+0000), Jeremy Stanley wrote: > [...] >> I've proposed https://review.opendev.org/702244 for a smaller-scale >> version of this now (explicitly blacklisting wheels for pip, >> setuptools and virtualenv) but we can expand it to something more >> thorough once it's put through its paces. If we merge that, then we >> can manually delete the affected setuptools wheel from our >> wheel-mirror volume and not have to worry about it coming back. > > This has since merged, and as of 16:30 UTC (roughly an hour ago) I > deleted all copies of the setuptools-45.0.0-py2.py3-none-any.whl > file from our AFS volumes. We're testing now to see if previously > broken jobs are working again, but suspect things should be back to > normal. Thx Jeremy for quick fix for this issue. It seems that, at least for Neutron, all jobs are working again :) > -- > Jeremy Stanley — Slawek Kaplonski Senior software engineer Red Hat From renat.akhmerov at gmail.com Tue Jan 14 09:28:37 2020 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Tue, 14 Jan 2020 16:28:37 +0700 Subject: [mistral][core] Promoting Eyal Bar-Ilan to the Mistral core team In-Reply-To: <11f24b32-4512-4b4d-95a0-71d485850ec3@Spark> References: <11f24b32-4512-4b4d-95a0-71d485850ec3@Spark> Message-ID: <248b0b4c-b8bf-484c-aca6-6c1b6429ec18@Spark> Hi, I’d like to promote Eyal Bar-Ilan to the Mistral core team since he’s shown a great contribution performance in the recent months. Eyal always reacts on various CI issues timely and provides fixes very quickly. He’s also completed a number of useful functional Mistral features in Train and Ussuri. And his overall statistics for Ussuri ([1]) makes him a clear candidate for core membership. Core reviewers, please let me know if you have any objections. [1] https://www.stackalytics.com/?module=mistral-group&release=ussuri&user_id=eyal.bar-ilan at nokia.com&metric=commits Thanks Renat Akhmerov @Nokia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From apetrich at redhat.com Tue Jan 14 10:18:10 2020 From: apetrich at redhat.com (Adriano Petrich) Date: Tue, 14 Jan 2020 11:18:10 +0100 Subject: [mistral][core] Promoting Eyal Bar-Ilan to the Mistral core team In-Reply-To: <248b0b4c-b8bf-484c-aca6-6c1b6429ec18@Spark> References: <11f24b32-4512-4b4d-95a0-71d485850ec3@Spark> <248b0b4c-b8bf-484c-aca6-6c1b6429ec18@Spark> Message-ID: +1 from me On Tue, 14 Jan 2020 at 10:28, Renat Akhmerov wrote: > Hi, > > I’d like to promote Eyal Bar-Ilan to the Mistral core team since he’s > shown a great contribution performance in the recent months. Eyal always > reacts on various CI issues timely and provides fixes very quickly. He’s > also completed a number of useful functional Mistral features in Train and > Ussuri. And his overall statistics for Ussuri ([1]) makes him a clear > candidate for core membership. > > Core reviewers, please let me know if you have any objections. > > [1] > https://www.stackalytics.com/?module=mistral-group&release=ussuri&user_id=eyal.bar-ilan at nokia.com&metric=commits > > > Thanks > > Renat Akhmerov > @Nokia > -- Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Laurie Krebs, Michael O'Neill, Thomas Savage -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhangbailin at inspur.com Tue Jan 14 11:31:18 2020 From: zhangbailin at inspur.com (=?gb2312?B?QnJpbiBaaGFuZyjVxbDZwdYp?=) Date: Tue, 14 Jan 2020 11:31:18 +0000 Subject: [cyborg] enable launchpad or storyboard for Cyborg Message-ID: Hi Sundar and all: I think we should enable launchpad for the Cyborg project to record its reported bugs, submitted blueprints, etc., so that we can keep track of project updates and changes. Now I found there are some specifications in the cyborg-specs, and there has not been management by the Launchpad (https://launchpad.net/cyborg) or storyboard (https://storyboard.openstack.org/#!/project/openstack/cyborg). Personally recommend using Launchpad, it looks very intuitive. brinzhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Tue Jan 14 11:42:59 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 14 Jan 2020 12:42:59 +0100 Subject: [cyborg] enable launchpad or storyboard for Cyborg In-Reply-To: References: Message-ID: Two cents from me: I believe storyboard is the way forward. -yoctozepto wt., 14 sty 2020 o 12:39 Brin Zhang(张百林) napisał(a): > > Hi Sundar and all: > > I think we should enable launchpad for the Cyborg project to record its reported bugs, submitted blueprints, etc., so that we can keep track of project updates and changes. > > Now I found there are some specifications in the cyborg-specs, and there has not been management by the Launchpad (https://launchpad.net/cyborg) or storyboard (https://storyboard.openstack.org/#!/project/openstack/cyborg). > > Personally recommend using Launchpad, it looks very intuitive. > > > > brinzhang > > From andras.1.kovi at nokia.com Tue Jan 14 10:11:50 2020 From: andras.1.kovi at nokia.com (Kovi, Andras 1. 
(Nokia - HU/Budapest)) Date: Tue, 14 Jan 2020 10:11:50 +0000 Subject: [mistral][core] Promoting Eyal Bar-Ilan to the Mistral core team In-Reply-To: <248b0b4c-b8bf-484c-aca6-6c1b6429ec18@Spark> References: <11f24b32-4512-4b4d-95a0-71d485850ec3@Spark>, <248b0b4c-b8bf-484c-aca6-6c1b6429ec18@Spark> Message-ID: Workflow +1 Very welcome in the team! A ________________________________ From: Renat Akhmerov Sent: Tuesday, January 14, 2020 10:28:37 AM To: openstack-discuss at lists.openstack.org Subject: [mistral][core] Promoting Eyal Bar-Ilan to the Mistral core team Hi, I’d like to promote Eyal Bar-Ilan to the Mistral core team since he’s shown a great contribution performance in the recent months. Eyal always reacts on various CI issues timely and provides fixes very quickly. He’s also completed a number of useful functional Mistral features in Train and Ussuri. And his overall statistics for Ussuri ([1]) makes him a clear candidate for core membership. Core reviewers, please let me know if you have any objections. [1] https://www.stackalytics.com/?module=mistral-group&release=ussuri&user_id=eyal.bar-ilan at nokia.com&metric=commits Thanks Renat Akhmerov @Nokia -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Tue Jan 14 13:09:22 2020 From: smooney at redhat.com (Sean Mooney) Date: Tue, 14 Jan 2020 13:09:22 +0000 Subject: [cyborg] enable launchpad or storyboard for Cyborg In-Reply-To: References: Message-ID: <5760300f37d4d4e3cd71037688fb23ee76c5c685.camel@redhat.com> cyborg is already using storyboard. i dont really like it but since its already moved i dont think its worth moving back. i think launchpad is more intuitive but i think that boat has already sailed. On Tue, 2020-01-14 at 12:42 +0100, Radosław Piliszek wrote: > Two cents from me: I believe storyboard is the way forward. > > -yoctozepto > > wt., 14 sty 2020 o 12:39 Brin Zhang(张百林) napisał(a): > > > > Hi Sundar and all: > > > > I think we should enable launchpad for the Cyborg project to record its reported bugs, submitted blueprints, etc., > > so that we can keep track of project updates and changes. > > > > Now I found there are some specifications in the cyborg-specs, and there has not been management by the Launchpad ( > > https://launchpad.net/cyborg) or storyboard (https://storyboard.openstack.org/#!/project/openstack/cyborg). > > > > Personally recommend using Launchpad, it looks very intuitive. > > > > > > > > brinzhang > > > > > > From beagles at redhat.com Tue Jan 14 14:36:39 2020 From: beagles at redhat.com (Brent Eagles) Date: Tue, 14 Jan 2020 11:06:39 -0330 Subject: [tripleo] rocky builds In-Reply-To: References: Message-ID: <7fe24eb0-3a96-18a3-34d7-1a1495506b10@redhat.com> Hi, On 2020-01-10 1:42 p.m., Wesley Hayutin wrote: > Greetings, > > I've confirmed that builds from the Rocky release will no longer be > imported.  Looking for input from the upstream folks with regards to > maintaining the Rocky release for upstream.  Can you please comment if > you have any requirement to continue building, patching Rocky as I know > there are active reviews [1].  I've added this topic to be discussed at > the next meeting [2] > > Thank you! > > > [1] https://review.opendev.org/#/q/status:open+tripleo+branch:stable/rocky > [2] https://etherpad.openstack.org/p/tripleo-meeting-items Am I correct in that this only applies to rocky and we will continue to build and run CI on queens? 
Cheers, Brent From sean.mcginnis at gmx.com Tue Jan 14 14:44:38 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 14 Jan 2020 08:44:38 -0600 Subject: [tripleo] rocky builds In-Reply-To: <7fe24eb0-3a96-18a3-34d7-1a1495506b10@redhat.com> References: <7fe24eb0-3a96-18a3-34d7-1a1495506b10@redhat.com> Message-ID: <63cef2c5-27bc-5e4f-34fa-639dbafd9230@gmx.com> On 1/14/20 8:36 AM, Brent Eagles wrote: > Hi, > > On 2020-01-10 1:42 p.m., Wesley Hayutin wrote: >> Greetings, >> >> I've confirmed that builds from the Rocky release will no longer be >> imported.  Looking for input from the upstream folks with regards to >> maintaining the Rocky release for upstream.  Can you please comment >> if you have any requirement to continue building, patching Rocky as I >> know there are active reviews [1].  I've added this topic to be >> discussed at the next meeting [2] >> >> Thank you! >> >> >> [1] >> https://review.opendev.org/#/q/status:open+tripleo+branch:stable/rocky >> [2] https://etherpad.openstack.org/p/tripleo-meeting-items > > Am I correct in that this only applies to rocky and we will continue > to build and run CI on queens? > > Cheers, > > Brent > Rocky is still in the Maintained phase: https://releases.openstack.org/#release-series Older stable branches are in extended maintenance, so if the plan is to not support them anymore, that needs to be declared, with six months allowed for someone else to have a window to offer to continue providing that extended maintenance: https://docs.openstack.org/project-team-guide/stable-branches.html#maintenance-phases After the six months, if no one offers to maintain the code, it can then be marked as EOL. From radoslaw.piliszek at gmail.com Tue Jan 14 17:19:20 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 14 Jan 2020 18:19:20 +0100 Subject: [deployment][kolla][tripleo][osa][docs] Intro to OpenStack Message-ID: Hiya, Folks! I had some thought after talking with people new to OpenStack - our deployment tools are too well-hidden in the ecosystem. People try deploying devstack for fun (and some still use packstack!) and then give up on manual installation for production due to its complexity and fear of upgrades. I also got voices that Kolla/TripleO/OSA (order random) is an "unofficial" way to deploy OpenStack and the installation guide is the only "official" one (whatever that may mean in this very context). So I decided I go the "newbie" route and inspected the website. https://www.openstack.org invites us to browse: https://www.openstack.org/software/start/ which is nice and dandy, presents options of enterprise-grade solutions etc. but fails to really mention OpenStack has deployment tools (if one does not look at the submenu bar) and instead points the "newbie" to the installation guide: https://docs.openstack.org/install-guide/overview.html which kind-of negates that OpenStack ecosystem has any ready tools of deployment by saying: "After becoming familiar with basic installation, configuration, operation, and troubleshooting of these OpenStack services, you should consider the following steps toward deployment using a production architecture: ... Implement a deployment tool such as Ansible, Chef, Puppet, or Salt to automate deployment and management of the production environment." Just some food for thought. Extra for Kolla and OSA: https://docs.openstack.org/train/deploy/ seems we no longer deploy OpenStack since Train. 
-yoctozepto From noonedeadpunk at ya.ru Tue Jan 14 17:53:54 2020 From: noonedeadpunk at ya.ru (Dmitriy Rabotyagov) Date: Tue, 14 Jan 2020 19:53:54 +0200 Subject: [deployment][kolla][tripleo][osa][docs] Intro to OpenStack In-Reply-To: References: Message-ID: <20673451579024434@myt3-9168aea9495d.qloud-c.yandex.net> This part is really strage, since there are links for stein and ussuri, but not for train... While OSA has train deployment docs [1] [1] https://docs.openstack.org/project-deploy-guide/openstack-ansible/train/ 14.01.2020, 19:24, "Radosław Piliszek" : > > Extra for Kolla and OSA: > https://docs.openstack.org/train/deploy/ > seems we no longer deploy OpenStack since Train. > > -yoctozepto --  Kind Regards, Dmitriy Rabotyagov From radoslaw.piliszek at gmail.com Tue Jan 14 18:05:00 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 14 Jan 2020 19:05:00 +0100 Subject: [deployment][kolla][tripleo][osa][docs] Intro to OpenStack In-Reply-To: <20673451579024434@myt3-9168aea9495d.qloud-c.yandex.net> References: <20673451579024434@myt3-9168aea9495d.qloud-c.yandex.net> Message-ID: As does Kolla-Ansible to be complete: https://docs.openstack.org/project-deploy-guide/kolla-ansible/train/ -yoctozepto wt., 14 sty 2020 o 19:01 Dmitriy Rabotyagov napisał(a): > > This part is really strage, since there are links for stein and ussuri, but not for train... While OSA has train deployment docs [1] > > [1] https://docs.openstack.org/project-deploy-guide/openstack-ansible/train/ > > 14.01.2020, 19:24, "Radosław Piliszek" : > > > > > Extra for Kolla and OSA: > > https://docs.openstack.org/train/deploy/ > > seems we no longer deploy OpenStack since Train. > > > > -yoctozepto > > -- > Kind Regards, > Dmitriy Rabotyagov > > From C-Ramakrishna.Bhupathi at charter.com Tue Jan 14 20:56:42 2020 From: C-Ramakrishna.Bhupathi at charter.com (Bhupathi, Ramakrishna) Date: Tue, 14 Jan 2020 20:56:42 +0000 Subject: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken In-Reply-To: <5846dbb7-fbcd-db22-7342-5ba2b6e4a1d3@catalyst.net.nz> References: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> <6c8f45f2-da74-18fd-7909-84c9c6762fe3@catalyst.net.nz> <24a8d164-8e38-1512-caf3-9447f070b8fd@catalyst.net.nz> <5846dbb7-fbcd-db22-7342-5ba2b6e4a1d3@catalyst.net.nz> Message-ID: I just moved to Fedora core OS image (fedora-coreos-31) to build my K8s Magnum cluster and cluster creation fails with ERROR: The Parameter (octavia_ingress_controller_tag) was not defined in template. I wonder why I need that tag. Any help please? --RamaK From: Feilong Wang [mailto:feilong at catalyst.net.nz] Sent: Monday, January 13, 2020 4:25 PM To: Donny Davis Cc: OpenStack Discuss Subject: Re: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken OK, if you're happy to stay on CoreOS, all good. If you're interested in migrating to Fedora CoreOS and have questions, then you're welcome to popup in #openstack-containers. Cheers. On 14/01/20 10:21 AM, Donny Davis wrote: Just Coreos - I tried them all and it was the only one that worked oob. On Mon, Jan 13, 2020 at 4:10 PM Feilong Wang > wrote: Hi Donny, Do you mean Fedore CoreOS or just CoreOS? The current CoreOS driver is not actively maintained, I would suggest migrating to Fedora CoreOS and I'm happy to help if you have any question. Thanks. 
On 14/01/20 9:57 AM, Donny Davis wrote: FWIW I was only able to get the coreos image working with magnum oob.. the rest just didn't work. On Mon, Jan 13, 2020 at 2:31 PM feilong > wrote: Hi Bhupathi, Firstly, I would suggest setting the use_podman=False when using fedora atomic image. And it would be nice to set the "kube_tag", e.g. v1.15.6 explicitly. Then please trigger a new cluster creation. Then if you still run into error. Here is the debug steps: 1. ssh into the master node, check log /var/log/cloud-init-output.log 2. If there is no error in above log file, then run journalctl -u heat-container-agent to check the heat-container-agent log. If above step is correct, then you must be able to see something useful here. On 11/01/20 12:15 AM, Bhupathi, Ramakrishna wrote: Wang, Here it is . I added the labels subsequently. My nova and neutron are working all right as I installed various systems there working with no issues.. [cid:image001.png at 01D5CAF2.D1A55F30] From: Feilong Wang [mailto:feilong at catalyst.net.nz] Sent: Thursday, January 9, 2020 6:12 PM To: openstack-discuss at lists.openstack.org Subject: Re: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken Hi Bhupathi, Could you please share your cluster template? And please make sure your Nova/Neutron works. On 10/01/20 2:45 AM, Bhupathi, Ramakrishna wrote: Folks, I am building a Kubernetes Cluster( Openstack Train) and using fedora atomic-29 image . The nodes come up fine ( I have a simple 1 master and 1 node) , but the cluster creation times out, and when I access the cloud-init logs I see this error . Wondering what I am missing as this used to work before. I wonder if this is image related . [ERROR]: Unable to render networking. Network config is likely broken: No available network renderers found. Searched through list: ['eni', 'sysconfig', 'netplan'] Essentially the stack creation fails in “kube_cluster_deploy” Can somebody help me debug this ? Any help is appreciated. --RamaK The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. 
-- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ -- ~/DonnyD C: 805 814 6800 "No mission too difficult. No sacrifice too great. Duty First" -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- -- ~/DonnyD C: 805 814 6800 "No mission too difficult. No sacrifice too great. Duty First" -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- E-MAIL CONFIDENTIALITY NOTICE: The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 55224 bytes Desc: image001.png URL: From donny at fortnebula.com Tue Jan 14 21:04:42 2020 From: donny at fortnebula.com (Donny Davis) Date: Tue, 14 Jan 2020 16:04:42 -0500 Subject: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken In-Reply-To: References: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> <6c8f45f2-da74-18fd-7909-84c9c6762fe3@catalyst.net.nz> <24a8d164-8e38-1512-caf3-9447f070b8fd@catalyst.net.nz> <5846dbb7-fbcd-db22-7342-5ba2b6e4a1d3@catalyst.net.nz> Message-ID: Did you update your cluster distro? Can you share your current cluster template? Donny Davis c: 805 814 6800 On Tue, Jan 14, 2020, 3:56 PM Bhupathi, Ramakrishna < C-Ramakrishna.Bhupathi at charter.com> wrote: > I just moved to Fedora core OS image (fedora-coreos-31) to build my K8s > Magnum cluster and cluster creation fails with > > ERROR: The Parameter (octavia_ingress_controller_tag) was not defined in > template. > > > > I wonder why I need that tag. Any help please? > > > > --RamaK > > > > *From:* Feilong Wang [mailto:feilong at catalyst.net.nz] > *Sent:* Monday, January 13, 2020 4:25 PM > *To:* Donny Davis > *Cc:* OpenStack Discuss > *Subject:* Re: [magnum]: K8s cluster creation times out. OpenStack Train > : [ERROR]: Unable to render networking. Network config is likely broken > > > > OK, if you're happy to stay on CoreOS, all good. If you're interested in > migrating to Fedora CoreOS and have questions, then you're welcome to popup > in #openstack-containers. 
Cheers. > > > > On 14/01/20 10:21 AM, Donny Davis wrote: > > Just Coreos - I tried them all and it was the only one that worked oob. > > > > On Mon, Jan 13, 2020 at 4:10 PM Feilong Wang > wrote: > > Hi Donny, > > Do you mean Fedore CoreOS or just CoreOS? The current CoreOS driver is not > actively maintained, I would suggest migrating to Fedora CoreOS and I'm > happy to help if you have any question. Thanks. > > > > On 14/01/20 9:57 AM, Donny Davis wrote: > > FWIW I was only able to get the coreos image working with magnum oob.. the > rest just didn't work. > > > > On Mon, Jan 13, 2020 at 2:31 PM feilong wrote: > > Hi Bhupathi, > > Firstly, I would suggest setting the use_podman=False when using fedora > atomic image. And it would be nice to set the "kube_tag", e.g. v1.15.6 > explicitly. Then please trigger a new cluster creation. Then if you still > run into error. Here is the debug steps: > > 1. ssh into the master node, check log /var/log/cloud-init-output.log > > 2. If there is no error in above log file, then run journalctl -u > heat-container-agent to check the heat-container-agent log. If above step > is correct, then you must be able to see something useful here. > > > > On 11/01/20 12:15 AM, Bhupathi, Ramakrishna wrote: > > Wang, > > Here it is . I added the labels subsequently. My nova and neutron are > working all right as I installed various systems there working with no > issues.. > > > > > > *From:* Feilong Wang [mailto:feilong at catalyst.net.nz > ] > *Sent:* Thursday, January 9, 2020 6:12 PM > *To:* openstack-discuss at lists.openstack.org > *Subject:* Re: [magnum]: K8s cluster creation times out. OpenStack Train > : [ERROR]: Unable to render networking. Network config is likely broken > > > > Hi Bhupathi, > > Could you please share your cluster template? And please make sure your > Nova/Neutron works. > > > > On 10/01/20 2:45 AM, Bhupathi, Ramakrishna wrote: > > Folks, > > I am building a Kubernetes Cluster( Openstack Train) and using fedora > atomic-29 image . The nodes come up fine ( I have a simple 1 master and 1 > node) , but the cluster creation times out, and when I access the > cloud-init logs I see this error . Wondering what I am missing as this > used to work before. I wonder if this is image related . > > > > [ERROR]: Unable to render networking. Network config is likely broken: No > available network renderers found. Searched through list: ['eni', > 'sysconfig', 'netplan'] > > > > Essentially the stack creation fails in “kube_cluster_deploy” > > > > Can somebody help me debug this ? Any help is appreciated. > > > > --RamaK > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. 
> > -- > > Cheers & Best regards, > > Feilong Wang (王飞龙) > > Head of R&D > > Catalyst Cloud - Cloud Native New Zealand > > -------------------------------------------------------------------------- > > Tel: +64-48032246 > > Email: flwang at catalyst.net.nz > > Level 6, Catalyst House, 150 Willis Street, Wellington > > -------------------------------------------------------------------------- > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. > > -- > > Cheers & Best regards, > > Feilong Wang (王飞龙) > > ------------------------------------------------------ > > Senior Cloud Software Engineer > > Tel: +64-48032246 > > Email: flwang at catalyst.net.nz > > Catalyst IT Limited > > Level 6, Catalyst House, 150 Willis Street, Wellington > > ------------------------------------------------------ > > > > > -- > > ~/DonnyD > > C: 805 814 6800 > > "No mission too difficult. No sacrifice too great. Duty First" > > -- > > Cheers & Best regards, > > Feilong Wang (王飞龙) > > Head of R&D > > Catalyst Cloud - Cloud Native New Zealand > > -------------------------------------------------------------------------- > > Tel: +64-48032246 > > Email: flwang at catalyst.net.nz > > Level 6, Catalyst House, 150 Willis Street, Wellington > > -------------------------------------------------------------------------- > > > > > -- > > ~/DonnyD > > C: 805 814 6800 > > "No mission too difficult. No sacrifice too great. Duty First" > > -- > > Cheers & Best regards, > > Feilong Wang (王飞龙) > > Head of R&D > > Catalyst Cloud - Cloud Native New Zealand > > -------------------------------------------------------------------------- > > Tel: +64-48032246 > > Email: flwang at catalyst.net.nz > > Level 6, Catalyst House, 150 Willis Street, Wellington > > -------------------------------------------------------------------------- > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 55224 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.png Type: image/png Size: 55224 bytes Desc: not available URL: From Albert.Braden at synopsys.com Wed Jan 15 00:49:19 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Wed, 15 Jan 2020 00:49:19 +0000 Subject: Designate zone troubleshooting [designate] Message-ID: Trying again: I would like to improve this document so that it can be more useful. https://docs.openstack.org/designate/rocky/admin/troubleshooting.html I'm experiencing "I have a broken zone" in my dev cluster right now, and I would like to update this document with the repair procedure. Can anyone help me figure out what that is? The logs no longer contain the original failure; I want to figure out and then document the procedure that would change my zone statuses from "ERROR" back to "ACTIVE." root at us01odc-dev1-ctrl1:/var/log/designate# openstack zone list --all-projects +--------------------------------------+----------------------------------+-----------------------------+---------+------------+--------+--------+ | id | project_id | name | type | serial | status | action | +--------------------------------------+----------------------------------+-----------------------------+---------+------------+--------+--------+ | d9a74e85-22a7-4844-968d-35e0aefd9997 | cb36981f16674c1a8b2a73f30370f88e | dg.us01-dev1.synopsys.com. | PRIMARY | 1578962764 | ERROR | CREATE | | 29484d33-eb26-4a35-aff8-22f84acf16cd | 474ae347d8ad426f8118e55eee47dcfd | it.us01-dev1.synopsys.com. | PRIMARY | 1578962485 | ACTIVE | NONE | | 05356780-26c7-4649-8532-a42e3c2b75a3 | 1cc94ed7c37a4b4d86e1af3c92a8967c | 112.195.10.in-addr.arpa. | PRIMARY | 1578962486 | ACTIVE | NONE | | cc8290ba-12f8-485e-a9bb-6de3324764ef | eb5fa5310ca648d19cc0d35fdf13953a | seg.us01-dev1.synopsys.com. | PRIMARY | 1578962207 | ERROR | CREATE | | e3abb13c-58f6-49da-9aab-0a143c7c4fb8 | 1cc94ed7c37a4b4d86e1af3c92a8967c | 117.195.10.in-addr.arpa. | PRIMARY | 1578962208 | ERROR | CREATE | | 236949dc-ea7e-4ad7-a570-b62fccd05fac | 1cc94ed7c37a4b4d86e1af3c92a8967c | 113.195.10.in-addr.arpa. | PRIMARY | 1578962765 | ERROR | CREATE | +--------------------------------------+----------------------------------+-----------------------------+---------+------------+--------+--------+ -------------- next part -------------- An HTML attachment was scrubbed... URL: From vgvoleg at gmail.com Wed Jan 15 07:09:53 2020 From: vgvoleg at gmail.com (Oleg Ovcharuk) Date: Wed, 15 Jan 2020 10:09:53 +0300 Subject: [mistral][core] Promoting Eyal Bar-Ilan to the Mistral core team In-Reply-To: References: Message-ID: <5DEAE4DB-3D39-46EB-9F20-D7414A0FE4C1@gmail.com> +1 > 14 янв. 2020 г., в 15:12, Kovi, Andras 1. (Nokia - HU/Budapest) написал(а): > >  > Workflow +1 > > Very welcome in the team! > > A > > From: Renat Akhmerov > Sent: Tuesday, January 14, 2020 10:28:37 AM > To: openstack-discuss at lists.openstack.org > Subject: [mistral][core] Promoting Eyal Bar-Ilan to the Mistral core team > > Hi, > > I’d like to promote Eyal Bar-Ilan to the Mistral core team since he’s shown a great contribution performance in the recent months. Eyal always reacts on various CI issues timely and provides fixes very quickly. He’s also completed a number of useful functional Mistral features in Train and Ussuri. And his overall statistics for Ussuri ([1]) makes him a clear candidate for core membership. > > Core reviewers, please let me know if you have any objections. 
> > [1] https://www.stackalytics.com/?module=mistral-group&release=ussuri&user_id=eyal.bar-ilan at nokia.com&metric=commits > > > Thanks > > Renat Akhmerov > @Nokia -------------- next part -------------- An HTML attachment was scrubbed... URL: From renat.akhmerov at gmail.com Wed Jan 15 07:13:07 2020 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Wed, 15 Jan 2020 14:13:07 +0700 Subject: [mistral][core] Promoting Eyal Bar-Ilan to the Mistral core team In-Reply-To: <5DEAE4DB-3D39-46EB-9F20-D7414A0FE4C1@gmail.com> References: <5DEAE4DB-3D39-46EB-9F20-D7414A0FE4C1@gmail.com> Message-ID: <076d1816-cbdb-47d7-b9b6-8cd5337bdf81@Spark> Eyal, congrats! You just became a core member :) You now can vote with +2 (or -2) and approve patches. Keep up the good work! Thanks Renat Akhmerov @Nokia On 15 Jan 2020, 14:09 +0700, Oleg Ovcharuk , wrote: > +1 > > > 14 янв. 2020 г., в 15:12, Kovi, Andras 1. (Nokia - HU/Budapest) написал(а): > > > > Workflow +1 > > > > Very welcome in the team! > > > > A > > > > From: Renat Akhmerov > > Sent: Tuesday, January 14, 2020 10:28:37 AM > > To: openstack-discuss at lists.openstack.org > > Subject: [mistral][core] Promoting Eyal Bar-Ilan to the Mistral core team > > > > Hi, > > > > I’d like to promote Eyal Bar-Ilan to the Mistral core team since he’s shown a great contribution performance in the recent months. Eyal always reacts on various CI issues timely and provides fixes very quickly. He’s also completed a number of useful functional Mistral features in Train and Ussuri. And his overall statistics for Ussuri ([1]) makes him a clear candidate for core membership. > > > > Core reviewers, please let me know if you have any objections. > > > > [1] https://www.stackalytics.com/?module=mistral-group&release=ussuri&user_id=eyal.bar-ilan at nokia.com&metric=commits > > > > > > Thanks > > > > Renat Akhmerov > > @Nokia -------------- next part -------------- An HTML attachment was scrubbed... URL: From aj at suse.com Wed Jan 15 07:29:13 2020 From: aj at suse.com (Andreas Jaeger) Date: Wed, 15 Jan 2020 08:29:13 +0100 Subject: [deployment][kolla][tripleo][osa][docs] Intro to OpenStack In-Reply-To: References: Message-ID: On 14/01/2020 18.19, Radosław Piliszek wrote: > Hiya, Folks! > > I had some thought after talking with people new to OpenStack - our > deployment tools are too well-hidden in the ecosystem. > People try deploying devstack for fun (and some still use packstack!) > and then give up on manual installation for production due to its > complexity and fear of upgrades. > I also got voices that Kolla/TripleO/OSA (order random) is an > "unofficial" way to deploy OpenStack and the installation guide is the > only "official" one (whatever that may mean in this very context). > > So I decided I go the "newbie" route and inspected the website. > https://www.openstack.org invites us to browse: > https://www.openstack.org/software/start/ > which is nice and dandy, presents options of enterprise-grade solutions etc. > but fails to really mention OpenStack has deployment tools (if one > does not look at the submenu bar) and instead points the "newbie" to > the installation guide: > https://docs.openstack.org/install-guide/overview.html > which kind-of negates that OpenStack ecosystem has any ready tools of > deployment by saying: > "After becoming familiar with basic installation, configuration, > operation, and troubleshooting of these OpenStack services, you should > consider the following steps toward deployment using a production > architecture: ... 
> Implement a deployment tool such as Ansible, Chef, Puppet, or Salt to > automate deployment and management of the production environment." The goal of the install guide is to learn: "This guide covers step-by-step deployment of the major OpenStack services using a functional example architecture suitable for new users of OpenStack with sufficient Linux experience. This guide is not intended to be used for production system installations, but to create a minimum proof-of-concept for the purpose of learning about OpenStack." And then comes the above cited. We can change the text and point to the deployment pages if you want. That's from the page you mention, Andreas > Just some food for thought. > > Extra for Kolla and OSA: > https://docs.openstack.org/train/deploy/ > seems we no longer deploy OpenStack since Train. It was not ready when be branched it - and nobody added it to the index page, see https://docs.openstack.org/doc-contrib-guide/doc-index.html, Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From radoslaw.piliszek at gmail.com Wed Jan 15 09:33:00 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 15 Jan 2020 10:33:00 +0100 Subject: [deployment][kolla][tripleo][osa][docs] Intro to OpenStack In-Reply-To: References: Message-ID: > "This guide covers step-by-step deployment of the major OpenStack > services using a functional example architecture suitable for new users > of OpenStack with sufficient Linux experience. This guide is not > intended to be used for production system installations, but to create a > minimum proof-of-concept for the purpose of learning about OpenStack." > > And then comes the above cited. > > We can change the text and point to the deployment pages if you want. Yup, that's true but somehow the effect is opposite. It would be better if we mentioned deployment tools in this place as well, I agree. This paragraph deserves a little rewrite. Though I would opt also for renovation of the OpenStack foundation page in that respect - it's there were interested parties gather and are guided to deploy OpenStack with the installation guide. Installation guide is in "Deploy OpenStack", devstack is in "Try OpenStack". I think both really belong to "Try OpenStack" (the installation guide should just add "the harder way" :-) ). I did not investigate yet how to propose a change there and, tbh, I don't have a concrete idea how to make it so that deployment tools are really visible w/o the notion they are something 3rd party and not cared about. > > Extra for Kolla and OSA: > > https://docs.openstack.org/train/deploy/ > > seems we no longer deploy OpenStack since Train. > > It was not ready when be branched it - and nobody added it to the index > page, see https://docs.openstack.org/doc-contrib-guide/doc-index.html, Ah, thanks. We will add it to Kolla procedures then. -yoctozepto śr., 15 sty 2020 o 08:29 Andreas Jaeger napisał(a): > > On 14/01/2020 18.19, Radosław Piliszek wrote: > > Hiya, Folks! > > > > I had some thought after talking with people new to OpenStack - our > > deployment tools are too well-hidden in the ecosystem. > > People try deploying devstack for fun (and some still use packstack!) > > and then give up on manual installation for production due to its > > complexity and fear of upgrades. 
> > I also got voices that Kolla/TripleO/OSA (order random) is an > > "unofficial" way to deploy OpenStack and the installation guide is the > > only "official" one (whatever that may mean in this very context). > > > > So I decided I go the "newbie" route and inspected the website. > > https://www.openstack.org invites us to browse: > > https://www.openstack.org/software/start/ > > which is nice and dandy, presents options of enterprise-grade solutions etc. > > but fails to really mention OpenStack has deployment tools (if one > > does not look at the submenu bar) and instead points the "newbie" to > > the installation guide: > > https://docs.openstack.org/install-guide/overview.html > > which kind-of negates that OpenStack ecosystem has any ready tools of > > deployment by saying: > > "After becoming familiar with basic installation, configuration, > > operation, and troubleshooting of these OpenStack services, you should > > consider the following steps toward deployment using a production > > architecture: ... > > Implement a deployment tool such as Ansible, Chef, Puppet, or Salt to > > automate deployment and management of the production environment." > > The goal of the install guide is to learn: > > "This guide covers step-by-step deployment of the major OpenStack > services using a functional example architecture suitable for new users > of OpenStack with sufficient Linux experience. This guide is not > intended to be used for production system installations, but to create a > minimum proof-of-concept for the purpose of learning about OpenStack." > > And then comes the above cited. > > We can change the text and point to the deployment pages if you want. > > That's from the page you mention, > > Andreas > > > Just some food for thought. > > > > Extra for Kolla and OSA: > > https://docs.openstack.org/train/deploy/ > > seems we no longer deploy OpenStack since Train. > > It was not ready when be branched it - and nobody added it to the index > page, see https://docs.openstack.org/doc-contrib-guide/doc-index.html, > > Andreas > -- > Andreas Jaeger aj at suse.com Twitter: jaegerandi > SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg > (HRB 36809, AG Nürnberg) GF: Felix Imendörffer > GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From stig.openstack at telfer.org Wed Jan 15 10:26:18 2020 From: stig.openstack at telfer.org (Stig Telfer) Date: Wed, 15 Jan 2020 10:26:18 +0000 Subject: [scientific-sig] IRC Meeting today 1100UTC - large scale & planning for 2020 Message-ID: <6757CB11-C2B9-4FE5-A26C-E7BC5D318BF4@telfer.org> Hi All - We have a Scientific SIG meeting today at 1100UTC (about 30 minutes time) in channel #openstack-meeting. Everyone is welcome. Today’s agenda is here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_January_15th_2020 We’d like to gather some datapoints for the Large Scale SIG, and talk CFPs and conferences for 2020. Cheers, Stig -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Wed Jan 15 11:22:21 2020 From: sfinucan at redhat.com (Stephen Finucane) Date: Wed, 15 Jan 2020 11:22:21 +0000 Subject: [nova] Removal of 'deallocate_networks_on_reschedule' virt driver API method Message-ID: <7eef48707decf612b619700df13d8f14383e6967.camel@redhat.com> Just FYI, the 'deallocate_networks_on_reschedule' method of the nova virt driver API has been removed in [1]. It was only used for nova- network based flows and is therefore surplus to requirements. 
Third party drivers that implement this function can now remove it. Stephen [1] https://review.opendev.org/#/c/696516/ From aj at suse.com Wed Jan 15 13:42:09 2020 From: aj at suse.com (Andreas Jaeger) Date: Wed, 15 Jan 2020 14:42:09 +0100 Subject: [deployment][kolla][tripleo][osa][docs] Intro to OpenStack In-Reply-To: References: Message-ID: Proposed fix: https://review.opendev.org/702666 Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From emilien at redhat.com Wed Jan 15 14:34:50 2020 From: emilien at redhat.com (Emilien Macchi) Date: Wed, 15 Jan 2020 09:34:50 -0500 Subject: [tripleo] Switch to tripleo-container-manage by default (replacing Paunch) Message-ID: Hi folks, Some work has been done to replace Paunch and use the new "podman_container" Ansible module; It's possible thanks to a role that is now documented here: https://docs.openstack.org/tripleo-ansible/latest/roles/role-tripleo-container-manage.html While some efforts are still ongoing to replace container-puppet.py (which uses Paunch to execute a podman run); the tripleo-container-manage role has reached stability and enough maturity to be the default now. I would like to give it a try and for that I have 2 patches that would need to be landed: https://review.opendev.org/#/c/700737 https://review.opendev.org/#/c/700738 The most popular question that has been asked about $topic so far is: how can I run paunch debug to print the podman commands. Answer: https://docs.openstack.org/tripleo-ansible/latest/roles/role-tripleo-container-manage.html#check-mode Please raise any concern here and we'll address it. Hopefully we can make the default on time before U cycle ends. On a side note: I prepared all the backports to stable/train so the feature will be available in that branch as well. Thanks, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevin at cloudnull.com Wed Jan 15 15:31:24 2020 From: kevin at cloudnull.com (Carter, Kevin) Date: Wed, 15 Jan 2020 09:31:24 -0600 Subject: [tripleo] Switch to tripleo-container-manage by default (replacing Paunch) In-Reply-To: References: Message-ID: Nicely done Emilien! -- Kevin Carter IRC: Cloudnull On Wed, Jan 15, 2020 at 8:38 AM Emilien Macchi wrote: > Hi folks, > > Some work has been done to replace Paunch and use the new > "podman_container" Ansible module; It's possible thanks to a role that is > now documented here: > > https://docs.openstack.org/tripleo-ansible/latest/roles/role-tripleo-container-manage.html > > While some efforts are still ongoing to replace container-puppet.py (which > uses Paunch to execute a podman run); the tripleo-container-manage role has > reached stability and enough maturity to be the default now. > I would like to give it a try and for that I have 2 patches that would > need to be landed: > https://review.opendev.org/#/c/700737 > https://review.opendev.org/#/c/700738 > > The most popular question that has been asked about $topic so far is: how > can I run paunch debug to print the podman commands. > Answer: > https://docs.openstack.org/tripleo-ansible/latest/roles/role-tripleo-container-manage.html#check-mode > > Please raise any concern here and we'll address it. > Hopefully we can make the default on time before U cycle ends. 
> > On a side note: I prepared all the backports to stable/train so the > feature will be available in that branch as well. > > Thanks, > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Wed Jan 15 15:42:57 2020 From: emilien at redhat.com (Emilien Macchi) Date: Wed, 15 Jan 2020 10:42:57 -0500 Subject: [tripleo] Switch to tripleo-container-manage by default (replacing Paunch) In-Reply-To: References: Message-ID: On Wed, Jan 15, 2020 at 10:31 AM Carter, Kevin wrote: > Nicely done Emilien! > > On Wed, Jan 15, 2020 at 8:38 AM Emilien Macchi wrote: > >> [...] >> >> The most popular question that has been asked about $topic so far is: how >> can I run paunch debug to print the podman commands. >> Answer: >> https://docs.openstack.org/tripleo-ansible/latest/roles/role-tripleo-container-manage.html#check-mode >> > I just thought about it but I thought we could have a tripleo command like : $ openstack tripleo container deploy --name keystone --host overcloud-controller1 $ openstack tripleo container deploy --name keystone --host overcloud-controller1 --dry-run It would use the new Ansible-runner to execute something like: https://docs.openstack.org/tripleo-ansible/latest/roles/role-tripleo-container-manage.html#example-with-one-container Dry-run would basically run the same thing with Ansible in check mode. In overall, we would still have a CLI (in tripleoclient instead of Paunch); and most of the container logic resides in podman_container which aims to be shared outside of TripleO which has been the main goal driving this effort. What do you think? -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevin at cloudnull.com Wed Jan 15 15:57:33 2020 From: kevin at cloudnull.com (Carter, Kevin) Date: Wed, 15 Jan 2020 09:57:33 -0600 Subject: [tripleo] Switch to tripleo-container-manage by default (replacing Paunch) In-Reply-To: References: Message-ID: On Wed, Jan 15, 2020 at 9:43 AM Emilien Macchi wrote: > > > On Wed, Jan 15, 2020 at 10:31 AM Carter, Kevin > wrote: > >> Nicely done Emilien! >> >> On Wed, Jan 15, 2020 at 8:38 AM Emilien Macchi >> wrote: >> >>> [...] >>> >>> The most popular question that has been asked about $topic so far is: >>> how can I run paunch debug to print the podman commands. >>> Answer: >>> https://docs.openstack.org/tripleo-ansible/latest/roles/role-tripleo-container-manage.html#check-mode >>> >> > I just thought about it but I thought we could have a tripleo command like > : > > $ openstack tripleo container deploy --name keystone --host > overcloud-controller1 > $ openstack tripleo container deploy --name keystone --host > overcloud-controller1 --dry-run > > +1 > It would use the new Ansible-runner to execute something like: > https://docs.openstack.org/tripleo-ansible/latest/roles/role-tripleo-container-manage.html#example-with-one-container > Dry-run would basically run the same thing with Ansible in check mode. > > In overall, we would still have a CLI (in tripleoclient instead of > Paunch); and most of the container logic resides in podman_container which > aims to be shared outside of TripleO which has been the main goal driving > this effort. > What do you think? > I like it! I think this is a very natural progression, especially now that we're using ansible-runner and have built a solid foundation of roles, filters, modules, etc. > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bdobreli at redhat.com Wed Jan 15 16:44:50 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Wed, 15 Jan 2020 17:44:50 +0100 Subject: [tripleo] Switch to tripleo-container-manage by default (replacing Paunch) In-Reply-To: References: Message-ID: On 15.01.2020 16:42, Emilien Macchi wrote: > > > On Wed, Jan 15, 2020 at 10:31 AM Carter, Kevin > wrote: > > Nicely done Emilien! > > On Wed, Jan 15, 2020 at 8:38 AM Emilien Macchi > wrote: > > [...] > > The most popular question that has been asked about $topic so > far is: how can I run paunch debug to print the podman commands. > Answer: > https://docs.openstack.org/tripleo-ansible/latest/roles/role-tripleo-container-manage.html#check-mode > > > I just thought about it but I thought we could have a tripleo command like : > > $ openstack tripleo container deploy --name keystone --host > overcloud-controller1 > $ openstack tripleo container deploy --name keystone --host > overcloud-controller1 --dry-run > > It would use the new Ansible-runner to execute something like: > https://docs.openstack.org/tripleo-ansible/latest/roles/role-tripleo-container-manage.html#example-with-one-container > Dry-run would basically run the same thing with Ansible in check mode. > > In overall, we would still have a CLI (in tripleoclient instead of > Paunch); and most of the container logic resides in podman_container > which aims to be shared outside of TripleO which has been the main goal > driving this effort. > What do you think? Yes please, "shared outside of TripleO" is great aim to accomplish. I think a simple standalone pomdan+systemd+tripleo-container-manage might lead us much further than only Tripleo, and only OpenStack cases. > -- > Emilien Macchi -- Best regards, Bogdan Dobrelya, Irc #bogdando From C-Ramakrishna.Bhupathi at charter.com Wed Jan 15 19:09:19 2020 From: C-Ramakrishna.Bhupathi at charter.com (Bhupathi, Ramakrishna) Date: Wed, 15 Jan 2020 19:09:19 +0000 Subject: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken In-Reply-To: References: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> <6c8f45f2-da74-18fd-7909-84c9c6762fe3@catalyst.net.nz> <24a8d164-8e38-1512-caf3-9447f070b8fd@catalyst.net.nz> <5846dbb7-fbcd-db22-7342-5ba2b6e4a1d3@catalyst.net.nz> Message-ID: Donny, Yes. Here it is. Cluster-template info as well as the Image info. 
magnum cluster-template-show kt-coreOS +-----------------------+--------------------------------------+ | Property | Value | +-----------------------+--------------------------------------+ | insecure_registry | - | | http_proxy | - | | updated_at | 2020-01-15T19:05:56+00:00 | | floating_ip_enabled | True | | fixed_subnet | - | | master_flavor_id | - | | user_id | 8d22ae284924432ba026e8a6236bc52e | | uuid | 6aea495e-6d8d-420b-8ca3-6e7fed73f3c7 | | no_proxy | - | | https_proxy | - | | tls_disabled | True | | keypair_id | ramak-test | | hidden | False | | project_id | 0c1abff4e920448ba86638bd0d78f7ca | | public | False | | labels | {'use_podman': 'false', ' | | | kube_tag': 'v1.16.2'} | | docker_volume_size | 5 | | server_type | vm | | external_network_id | thunder-public-vlan280 | | cluster_distro | coreos | | image_id | b1354e4e-8281-4330-a4b2-b5fdb022f805 | | volume_driver | - | | registry_enabled | False | | docker_storage_driver | devicemapper | | apiserver_port | - | | name | kt-coreOS | | created_at | 2020-01-14T20:32:04+00:00 | | network_driver | flannel | | fixed_network | - | | coe | kubernetes | | flavor_id | kuber-node | | master_lb_enabled | False | | dns_nameserver | 8.8.8.8 | +-----------------------+--------------------------------------+ ubuntu at kolla-ubuntu:~$ glance image-show 2fa8b3d8-c2e5-4568-9340-a18dd3d3120a +------------------+----------------------------------------------------------------------------------+ | Property | Value | +------------------+----------------------------------------------------------------------------------+ | checksum | cfbdc70bde5cd7df73a05a0fdc8e806c | | container_format | bare | | created_at | 2020-01-14T14:53:01Z | | disk_format | qcow2 | | id | 2fa8b3d8-c2e5-4568-9340-a18dd3d3120a | | locations | [{"url": "rbd://8c7d79a9-1275-4487-8ed0-6ea1fedccbef/images/2fa8b3d8-c2e5-4568-9 | | | 340-a18dd3d3120a/snap", "metadata": {}}] | | min_disk | 0 | | min_ram | 0 | | name | coreOS-latest | | os_distro | coreos | | os_hash_algo | sha512 | | os_hash_value | e6c4ce2e3e9dac4606f0edf689erf8782f99e249cc07887f620db69c9b91631301b480086c0e | | | 8ef5f42f4909b3fc3ef110e0erwff2922c0ca6a665dd11c57a | | os_hidden | False | | owner | 0c1abff4e920448ba86638bd0d78f7ca | | protected | False | | size | 1068171264 | | status | active | | tags | [] | | updated_at | 2020-01-14T14:53:39Z | | virtual_size | None | | visibility | public | +------------------+----------------------------------------------------------------------------------+ --RamaK From: Donny Davis [mailto:donny at fortnebula.com] Sent: Tuesday, January 14, 2020 4:05 PM To: Bhupathi, Ramakrishna Cc: OpenStack Discuss Subject: Re: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken Did you update your cluster distro? Can you share your current cluster template? Donny Davis c: 805 814 6800 On Tue, Jan 14, 2020, 3:56 PM Bhupathi, Ramakrishna > wrote: I just moved to Fedora core OS image (fedora-coreos-31) to build my K8s Magnum cluster and cluster creation fails with ERROR: The Parameter (octavia_ingress_controller_tag) was not defined in template. I wonder why I need that tag. Any help please? --RamaK From: Feilong Wang [mailto:feilong at catalyst.net.nz] Sent: Monday, January 13, 2020 4:25 PM To: Donny Davis > Cc: OpenStack Discuss > Subject: Re: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken OK, if you're happy to stay on CoreOS, all good. 
If you're interested in migrating to Fedora CoreOS and have questions, then you're welcome to popup in #openstack-containers. Cheers. On 14/01/20 10:21 AM, Donny Davis wrote: Just Coreos - I tried them all and it was the only one that worked oob. On Mon, Jan 13, 2020 at 4:10 PM Feilong Wang > wrote: Hi Donny, Do you mean Fedore CoreOS or just CoreOS? The current CoreOS driver is not actively maintained, I would suggest migrating to Fedora CoreOS and I'm happy to help if you have any question. Thanks. On 14/01/20 9:57 AM, Donny Davis wrote: FWIW I was only able to get the coreos image working with magnum oob.. the rest just didn't work. On Mon, Jan 13, 2020 at 2:31 PM feilong > wrote: Hi Bhupathi, Firstly, I would suggest setting the use_podman=False when using fedora atomic image. And it would be nice to set the "kube_tag", e.g. v1.15.6 explicitly. Then please trigger a new cluster creation. Then if you still run into error. Here is the debug steps: 1. ssh into the master node, check log /var/log/cloud-init-output.log 2. If there is no error in above log file, then run journalctl -u heat-container-agent to check the heat-container-agent log. If above step is correct, then you must be able to see something useful here. On 11/01/20 12:15 AM, Bhupathi, Ramakrishna wrote: Wang, Here it is . I added the labels subsequently. My nova and neutron are working all right as I installed various systems there working with no issues.. [cid:image001.png at 01D5CAF2.D1A55F30] From: Feilong Wang [mailto:feilong at catalyst.net.nz] Sent: Thursday, January 9, 2020 6:12 PM To: openstack-discuss at lists.openstack.org Subject: Re: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken Hi Bhupathi, Could you please share your cluster template? And please make sure your Nova/Neutron works. On 10/01/20 2:45 AM, Bhupathi, Ramakrishna wrote: Folks, I am building a Kubernetes Cluster( Openstack Train) and using fedora atomic-29 image . The nodes come up fine ( I have a simple 1 master and 1 node) , but the cluster creation times out, and when I access the cloud-init logs I see this error . Wondering what I am missing as this used to work before. I wonder if this is image related . [ERROR]: Unable to render networking. Network config is likely broken: No available network renderers found. Searched through list: ['eni', 'sysconfig', 'netplan'] Essentially the stack creation fails in “kube_cluster_deploy” Can somebody help me debug this ? Any help is appreciated. --RamaK The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. 
-- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. -- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ -- ~/DonnyD C: 805 814 6800 "No mission too difficult. No sacrifice too great. Duty First" -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- -- ~/DonnyD C: 805 814 6800 "No mission too difficult. No sacrifice too great. Duty First" -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. E-MAIL CONFIDENTIALITY NOTICE: The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bharat at stackhpc.com Wed Jan 15 19:52:23 2020 From: bharat at stackhpc.com (Bharat Kunwar) Date: Wed, 15 Jan 2020 19:52:23 +0000 Subject: [magnum]: K8s cluster creation times out. 
OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken In-Reply-To: References: Message-ID: <294DAE7D-9BDC-41B3-A591-FA8AF0B99E92@stackhpc.com> The os_distro label needs to be fedora-coreos. Sent from my iPhone > On 15 Jan 2020, at 19:49, Bhupathi, Ramakrishna wrote: > >  > Donny, > Yes. Here it is. Cluster-template info as well as the Image info. > > > magnum cluster-template-show kt-coreOS > +-----------------------+--------------------------------------+ > | Property | Value | > +-----------------------+--------------------------------------+ > | insecure_registry | - | > | http_proxy | - | > | updated_at | 2020-01-15T19:05:56+00:00 | > | floating_ip_enabled | True | > | fixed_subnet | - | > | master_flavor_id | - | > | user_id | 8d22ae284924432ba026e8a6236bc52e | > | uuid | 6aea495e-6d8d-420b-8ca3-6e7fed73f3c7 | > | no_proxy | - | > | https_proxy | - | > | tls_disabled | True | > | keypair_id | ramak-test | > | hidden | False | > | project_id | 0c1abff4e920448ba86638bd0d78f7ca | > | public | False | > | labels | {'use_podman': 'false', ' | > | | kube_tag': 'v1.16.2'} | > | docker_volume_size | 5 | > | server_type | vm | > | external_network_id | thunder-public-vlan280 | > | cluster_distro | coreos | > | image_id | b1354e4e-8281-4330-a4b2-b5fdb022f805 | > | volume_driver | - | > | registry_enabled | False | > | docker_storage_driver | devicemapper | > | apiserver_port | - | > | name | kt-coreOS | > | created_at | 2020-01-14T20:32:04+00:00 | > | network_driver | flannel | > | fixed_network | - | > | coe | kubernetes | > | flavor_id | kuber-node | > | master_lb_enabled | False | > | dns_nameserver | 8.8.8.8 | > +-----------------------+--------------------------------------+ > > ubuntu at kolla-ubuntu:~$ glance image-show 2fa8b3d8-c2e5-4568-9340-a18dd3d3120a > +------------------+----------------------------------------------------------------------------------+ > | Property | Value | > +------------------+----------------------------------------------------------------------------------+ > | checksum | cfbdc70bde5cd7df73a05a0fdc8e806c | > | container_format | bare | > | created_at | 2020-01-14T14:53:01Z | > | disk_format | qcow2 | > | id | 2fa8b3d8-c2e5-4568-9340-a18dd3d3120a | > | locations | [{"url": "rbd://8c7d79a9-1275-4487-8ed0-6ea1fedccbef/images/2fa8b3d8-c2e5-4568-9 | > | | 340-a18dd3d3120a/snap", "metadata": {}}] | > | min_disk | 0 | > | min_ram | 0 | > | name | coreOS-latest | > | os_distro | coreos | > | os_hash_algo | sha512 | > | os_hash_value | e6c4ce2e3e9dac4606f0edf689erf8782f99e249cc07887f620db69c9b91631301b480086c0e | > | | 8ef5f42f4909b3fc3ef110e0erwff2922c0ca6a665dd11c57a | > | os_hidden | False | > | owner | 0c1abff4e920448ba86638bd0d78f7ca | > | protected | False | > | size | 1068171264 | > | status | active | > | tags | [] | > | updated_at | 2020-01-14T14:53:39Z | > | virtual_size | None | > | visibility | public | > +------------------+----------------------------------------------------------------------------------+ > > --RamaK > > From: Donny Davis [mailto:donny at fortnebula.com] > Sent: Tuesday, January 14, 2020 4:05 PM > To: Bhupathi, Ramakrishna > Cc: OpenStack Discuss > Subject: Re: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken > > Did you update your cluster distro? Can you share your current cluster template? 
> > Donny Davis > c: 805 814 6800 > > On Tue, Jan 14, 2020, 3:56 PM Bhupathi, Ramakrishna wrote: > I just moved to Fedora core OS image (fedora-coreos-31) to build my K8s Magnum cluster and cluster creation fails with > ERROR: The Parameter (octavia_ingress_controller_tag) was not defined in template. > > I wonder why I need that tag. Any help please? > > --RamaK > > From: Feilong Wang [mailto:feilong at catalyst.net.nz] > Sent: Monday, January 13, 2020 4:25 PM > To: Donny Davis > Cc: OpenStack Discuss > Subject: Re: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken > > OK, if you're happy to stay on CoreOS, all good. If you're interested in migrating to Fedora CoreOS and have questions, then you're welcome to popup in #openstack-containers. Cheers. > > > > On 14/01/20 10:21 AM, Donny Davis wrote: > Just Coreos - I tried them all and it was the only one that worked oob. > > On Mon, Jan 13, 2020 at 4:10 PM Feilong Wang wrote: > Hi Donny, > > Do you mean Fedore CoreOS or just CoreOS? The current CoreOS driver is not actively maintained, I would suggest migrating to Fedora CoreOS and I'm happy to help if you have any question. Thanks. > > > > On 14/01/20 9:57 AM, Donny Davis wrote: > FWIW I was only able to get the coreos image working with magnum oob.. the rest just didn't work. > > On Mon, Jan 13, 2020 at 2:31 PM feilong wrote: > Hi Bhupathi, > > Firstly, I would suggest setting the use_podman=False when using fedora atomic image. And it would be nice to set the "kube_tag", e.g. v1.15.6 explicitly. Then please trigger a new cluster creation. Then if you still run into error. Here is the debug steps: > > 1. ssh into the master node, check log /var/log/cloud-init-output.log > > 2. If there is no error in above log file, then run journalctl -u heat-container-agent to check the heat-container-agent log. If above step is correct, then you must be able to see something useful here. > > > > On 11/01/20 12:15 AM, Bhupathi, Ramakrishna wrote: > Wang, > Here it is . I added the labels subsequently. My nova and neutron are working all right as I installed various systems there working with no issues.. > > > > From: Feilong Wang [mailto:feilong at catalyst.net.nz] > Sent: Thursday, January 9, 2020 6:12 PM > To: openstack-discuss at lists.openstack.org > Subject: Re: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken > > Hi Bhupathi, > > Could you please share your cluster template? And please make sure your Nova/Neutron works. > > > > On 10/01/20 2:45 AM, Bhupathi, Ramakrishna wrote: > Folks, > I am building a Kubernetes Cluster( Openstack Train) and using fedora atomic-29 image . The nodes come up fine ( I have a simple 1 master and 1 node) , but the cluster creation times out, and when I access the cloud-init logs I see this error . Wondering what I am missing as this used to work before. I wonder if this is image related . > > [ERROR]: Unable to render networking. Network config is likely broken: No available network renderers found. Searched through list: ['eni', 'sysconfig', 'netplan'] > > Essentially the stack creation fails in “kube_cluster_deploy” > > Can somebody help me debug this ? Any help is appreciated. > > --RamaK > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. 
If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > Head of R&D > Catalyst Cloud - Cloud Native New Zealand > -------------------------------------------------------------------------- > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > ------------------------------------------------------ > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > ------------------------------------------------------ > > > -- > ~/DonnyD > C: 805 814 6800 > "No mission too difficult. No sacrifice too great. Duty First" > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > Head of R&D > Catalyst Cloud - Cloud Native New Zealand > -------------------------------------------------------------------------- > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > > > -- > ~/DonnyD > C: 805 814 6800 > "No mission too difficult. No sacrifice too great. Duty First" > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > Head of R&D > Catalyst Cloud - Cloud Native New Zealand > -------------------------------------------------------------------------- > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. 
If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aj at suse.com Wed Jan 15 20:31:02 2020 From: aj at suse.com (Andreas Jaeger) Date: Wed, 15 Jan 2020 21:31:02 +0100 Subject: [infra] Retire x/zmq-event-publisher Message-ID: <57e6c329-430f-9d16-31b1-3b7c88a7e9ae@suse.com> This repo is not used anymore, it was forked and is maintained for Jenkins now elsewhere. I'll retire the repo now with topic retire-zmq-event-publisher, Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg (HRB 36809, AG Nürnberg) GF: Felix Imendörffer GPG fingerprint = EF18 1673 38C4 A372 86B1 E699 5294 24A3 FF91 2ACB From feilong at catalyst.net.nz Wed Jan 15 20:36:19 2020 From: feilong at catalyst.net.nz (feilong) Date: Thu, 16 Jan 2020 09:36:19 +1300 Subject: [magnum][kolla] etcd wal sync duration issue In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04771716@gmsxchsvr01.thecreation.com> References: <046E9C0290DD9149B106B72FC9156BEA0477170E@gmsxchsvr01.thecreation.com> <3f3fe0d1-7b61-d2f9-da65-d126ea5ed336@catalyst.net.nz> <046E9C0290DD9149B106B72FC9156BEA04771716@gmsxchsvr01.thecreation.com> Message-ID: <279cedf1-8bf4-fcf1-cfc2-990c97685531@catalyst.net.nz> Hi Eric, If you're using SSD, then I think the IO performance should  be OK. You can use this https://github.com/etcd-io/etcd/tree/master/tools/benchmark to verify and confirm that 's the root cause. Meanwhile, you can review the config of etcd cluster deployed by Magnum. I'm not an export of Etcd, so TBH I can't see anything wrong with the config. Most of them are just default configurations. As for the etcd image, it's built from https://github.com/projectatomic/atomic-system-containers/tree/master/etcd or you can refer CERN's repo https://gitlab.cern.ch/cloud/atomic-system-containers/blob/cern-qa/etcd/ *Spyros*, any comments? On 14/01/20 10:52 AM, Eric K. Miller wrote: > Hi Feilong, > > Thanks for responding! I am, indeed, using the default v3.2.7 version for etcd, which is the only available image. > > I did not try to reproduce with any other driver (we have never used DevStack, honestly, only Kolla-Ansible deployments). I did see a number of people indicating similar issues with etcd versions in the 3.3.x range, so I didn't think of it being an etcd issue, but then again most issues seem to be a result of people using HDDs and not SSDs, which makes sense. > > Interesting that you saw the same issue, though. We haven't tried Fedora CoreOS, but I think we would need Train for this. > > Everything I read about etcd indicates that it is extremely latency sensitive, due to the fact that it replicates all changes to all nodes and sends an fsync to Linux each time, so data is always guaranteed to be stored. I can see this becoming an issue quickly without super-low-latency network and storage. 
We are using Ceph-based SSD volumes for the Kubernetes Master node disks, which is extremely fast (likely 10x or better than anything people recommend for etcd), but network latency is always going to be higher with VMs on OpenStack with DVR than bare metal with VLANs due to all of the abstractions. > > Do you know who maintains the etcd images for Magnum here? Is there an easy way to create a newer image? > https://hub.docker.com/r/openstackmagnum/etcd/tags/ > > Eric > > > > From: Feilong Wang [mailto:feilong at catalyst.net.nz] > Sent: Monday, January 13, 2020 3:39 PM > To: openstack-discuss at lists.openstack.org > Subject: Re: [magnum][kolla] etcd wal sync duration issue > > Hi Eric, > That issue looks familiar for me. There are some questions I'd like to check before answering if you should upgrade to train. > 1. Are using the default v3.2.7 version for etcd? > 2. Did you try to reproduce this with devstack, using Fedora CoreOS driver? The etcd version could be 3.2.26 > I asked above questions because I saw the same error when I used Fedora Atomic with etcd v3.2.7 and I can't reproduce it with Fedora CoreOS + etcd 3.2.26 > > -- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From feilong at catalyst.net.nz Wed Jan 15 20:45:36 2020 From: feilong at catalyst.net.nz (feilong) Date: Thu, 16 Jan 2020 09:45:36 +1300 Subject: [magnum]: K8s cluster creation times out. OpenStack Train : [ERROR]: Unable to render networking. Network config is likely broken In-Reply-To: References: <59ed63745f4e4c42a63692c3ee4eb10d@ncwmexgp031.CORP.CHARTERCOM.com> <6c8f45f2-da74-18fd-7909-84c9c6762fe3@catalyst.net.nz> <24a8d164-8e38-1512-caf3-9447f070b8fd@catalyst.net.nz> <5846dbb7-fbcd-db22-7342-5ba2b6e4a1d3@catalyst.net.nz> Message-ID: <4b997f76-b300-9c96-ea16-5d4c84ea244f@catalyst.net.nz> Hi Bhupathi, Please read https://docs.openstack.org/magnum/latest/user/#use-podman When you're using Fedora CoreOS driver, you have to use the use_podman=True, because in Magnum Fedora CoreOS driver, podman is the only option. Please take my devstack cluster template as a reference. 
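(As a rough CLI sketch of the same setup, before the template dump below: the Fedora CoreOS image needs its os_distro property set so Magnum picks the fedora-coreos driver, and the labels can be passed when the template is created. The image name, flavors and network here are placeholders taken from this devstack environment and from earlier in the thread, so adjust them to your cloud rather than copy-pasting:

# mark the Glance image so Magnum selects the fedora-coreos driver
openstack image set --property os_distro='fedora-coreos' fedora-coreos-31

# create the template with podman enabled and pinned component tags
openstack coe cluster template create k8s-fc31-v1.16.4 \
    --image fedora-coreos-31 \
    --external-network public \
    --dns-nameserver 8.8.8.8 \
    --flavor ds1G --master-flavor ds2G \
    --network-driver calico \
    --coe kubernetes \
    --labels use_podman=true,kube_tag=v1.16.4,etcd_tag=3.2.26,heat_container_agent_tag=train-stable

The full template this corresponds to is shown below.)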
feilong at feilong-pc:~$ occt show c5ed303d-d255-45e8-8efb-bffac97f852c +-----------------------+-------------------------------------------------------------------------------------------------------------------------+ | Field                 | Value                                                                                                                   | +-----------------------+-------------------------------------------------------------------------------------------------------------------------+ | insecure_registry     | -                                                                                                                       | | labels                | {u'use_podman': u'true', u'kube_tag': u'v1.16.4', u'etcd_tag': u'3.2.26', u'heat_container_agent_tag': u'train-stable'} | | updated_at            | 2020-01-06T23:14:12+00:00                                                                                               | | floating_ip_enabled   | True                                                                                                                    | | fixed_subnet          | -                                                                                                                       | | master_flavor_id      | ds2G                                                                                                                    | | uuid                  | c5ed303d-d255-45e8-8efb-bffac97f852c                                                                                    | | no_proxy              | -                                                                                                                       | | https_proxy           | -                                                                                                                       | | tls_disabled          | False                                                                                                                   | | keypair_id            | feilong                                                                                                                 | | public                | False                                                                                                                   | | http_proxy            | -                                                                                                                       | | docker_volume_size    | -                                                                                                                       | | server_type           | vm                                                                                                                      | | external_network_id   | public                                                                                                                  | | cluster_distro        | fedora-coreos                                                                                                           | | image_id              | c089d627-0265-4cbc-8c96-957eb529b024                                                                                    | | volume_driver         | -                                                                                                                       | | registry_enabled      | False                                                                                                                   | | docker_storage_driver | overlay2                                             
                                                                   | | apiserver_port        | -                                                                                                                       | | name                  | k8s-fc31-v1.16.4                                                                                                        | | created_at            | 2020-01-05T22:26:41+00:00                                                                                               | | network_driver        | calico                                                                                                                  | | fixed_network         | -                                                                                                                       | | coe                   | kubernetes                                                                                                              | | flavor_id             | ds1G                                                                                                                    | | master_lb_enabled     | False                                                                                                                   | | dns_nameserver        | 8.8.8.8                                                                                                                 | | hidden                | False                                                                                                                   | +-----------------------+-------------------------------------------------------------------------------------------------------------------------+ On 16/01/20 8:09 AM, Bhupathi, Ramakrishna wrote: > > Donny, > > Yes. Here it is. Cluster-template info as well as the Image info. 
> >   > >   > > magnum cluster-template-show kt-coreOS > > +-----------------------+--------------------------------------+ > > | Property              | Value                                | > > +-----------------------+--------------------------------------+ > > | insecure_registry     | -                                    | > > | http_proxy            | -                                    | > > | updated_at            | 2020-01-15T19:05:56+00:00            | > > | floating_ip_enabled   | True                                 | > > | fixed_subnet          | -                                    | > > | master_flavor_id      | -                                    | > > | user_id               | 8d22ae284924432ba026e8a6236bc52e     | > > | uuid                  | 6aea495e-6d8d-420b-8ca3-6e7fed73f3c7 | > > | no_proxy              | -                                    | > > | https_proxy           | -                                    | > > | tls_disabled          | True                                 | > > | keypair_id            | ramak-test                           | > > | hidden                | False                                | > > | project_id            | 0c1abff4e920448ba86638bd0d78f7ca     | > > | public                | False                                | > > | labels                | {'use_podman': 'false', '            | > > |                       | kube_tag': 'v1.16.2'}                | > > | docker_volume_size    | 5                                    | > > | server_type           | vm                                   | > > | external_network_id   | thunder-public-vlan280         | > > | cluster_distro        | coreos                               | > > | image_id              | b1354e4e-8281-4330-a4b2-b5fdb022f805 | > > | volume_driver         | -                                    | > > | registry_enabled      | False                                | > > | docker_storage_driver | devicemapper                         | > > | apiserver_port        | -                                    | > > | name                  | kt-coreOS                            | > > | created_at            | 2020-01-14T20:32:04+00:00            | > > | network_driver        | flannel                              | > > | fixed_network         | -                                    | > > | coe                   | kubernetes                           | > > | flavor_id             | kuber-node                           | > > | master_lb_enabled     | False                                | > > | dns_nameserver        | 8.8.8.8                              | > > +-----------------------+--------------------------------------+ > >   > > ubuntu at kolla-ubuntu:~$ glance image-show  > 2fa8b3d8-c2e5-4568-9340-a18dd3d3120a > > +------------------+----------------------------------------------------------------------------------+ > > | Property         | > Value                                                                            > | > > +------------------+----------------------------------------------------------------------------------+ > > | checksum         | > cfbdc70bde5cd7df73a05a0fdc8e806c                                                 > | > > | container_format | > bare                                                                             > | > > | created_at       | > 2020-01-14T14:53:01Z                                                             > | > > | disk_format      | > qcow2                                                                            > | > > | id      
         | > 2fa8b3d8-c2e5-4568-9340-a18dd3d3120a                          >                    | > > | locations        | [{"url": > "rbd://8c7d79a9-1275-4487-8ed0-6ea1fedccbef/images/2fa8b3d8-c2e5-4568-9 | > > |                  | 340-a18dd3d3120a/snap", "metadata": > {}}]                                         | > > | min_disk         | 0     >                                                                            | > > | min_ram          | > 0                                                                                > | > > | name             | > coreOS-latest                                         >                            | > > | os_distro        | > coreos                                                                           > | > > | os_hash_algo     | > sha512                                                                           > | > > | os_hash_value    | > e6c4ce2e3e9dac4606f0edf689erf8782f99e249cc07887f620db69c9b91631301b480086c0e > | > > |                  | > 8ef5f42f4909b3fc3ef110e0erwff2922c0ca6a665dd11c57a                                 > | > > | os_hidden        | False                                         >                                    | > > | owner            | > 0c1abff4e920448ba86638bd0d78f7ca                                                 > | > > | protected        | > False                                                                            > | > > | size             | > 1068171264                                                                       > | > > | status           | > active                                                                           > | > > | tags             | > []                                                                               > | > > | updated_at       | > 2020-01-14T14:53:39Z                                                             > | > > | virtual_size     | None                       >                                                       | > > | visibility       | > public                                                                           > | > > +------------------+----------------------------------------------------------------------------------+ > >   > > --RamaK > >   > > *From:*Donny Davis [mailto:donny at fortnebula.com] > *Sent:* Tuesday, January 14, 2020 4:05 PM > *To:* Bhupathi, Ramakrishna > *Cc:* OpenStack Discuss > *Subject:* Re: [magnum]: K8s cluster creation times out. OpenStack > Train : [ERROR]: Unable to render networking. Network config is likely > broken > >   > > Did you update your cluster distro? Can you share your current cluster > template? > > Donny Davis > c: 805 814 6800 > >   > > On Tue, Jan 14, 2020, 3:56 PM Bhupathi, Ramakrishna > > wrote: > > I just moved to Fedora core OS image (fedora-coreos-31)  to build > my K8s Magnum cluster and  cluster creation fails with  > > ERROR: The Parameter (octavia_ingress_controller_tag) was not > defined in template. > >   > > I wonder why I need that tag. Any help please? > >   > > --RamaK > >   > > *From:*Feilong Wang [mailto:feilong at catalyst.net.nz > ] > *Sent:* Monday, January 13, 2020 4:25 PM > *To:* Donny Davis > > *Cc:* OpenStack Discuss > > *Subject:* Re: [magnum]: K8s cluster creation times out. OpenStack > Train : [ERROR]: Unable to render networking. Network config is > likely broken > >   > > OK, if you're happy to stay on CoreOS, all good. If you're > interested in migrating to Fedora CoreOS and have questions, then > you're welcome to popup in #openstack-containers. Cheers. 
> >   > > On 14/01/20 10:21 AM, Donny Davis wrote: > > Just Coreos - I tried them all and it was the only one that > worked oob.  > >   > > On Mon, Jan 13, 2020 at 4:10 PM Feilong Wang > > wrote: > > Hi Donny, > > Do you mean Fedore CoreOS or just CoreOS? The current > CoreOS driver is not actively maintained, I would suggest > migrating to Fedora CoreOS and I'm happy to help if you > have any question. Thanks. > >   > > On 14/01/20 9:57 AM, Donny Davis wrote: > > FWIW I was only able to get the coreos image working > with magnum oob.. the rest just didn't work.  > >   > > On Mon, Jan 13, 2020 at 2:31 PM feilong > > wrote: > > Hi Bhupathi, > > Firstly, I would suggest setting the > use_podman=False when using fedora atomic image. > And it would be nice to set the "kube_tag", e.g. > v1.15.6 explicitly. Then please trigger a new > cluster creation. Then if you still run into > error. Here is the debug steps: > > 1. ssh into the master node, check log > /var/log/cloud-init-output.log > > 2. If there is no error in above log file, then > run journalctl -u heat-container-agent to check > the heat-container-agent log. If above step is > correct, then you must be able to see something > useful here. > >   > > On 11/01/20 12:15 AM, Bhupathi, Ramakrishna wrote: > > Wang, > > Here it is  . I added the labels subsequently. > My nova and neutron are working all right as I > installed various systems there working with > no issues.. > >   > >   > > *From:* Feilong Wang > [mailto:feilong at catalyst.net.nz] > *Sent:* Thursday, January 9, 2020 6:12 PM > *To:* openstack-discuss at lists.openstack.org > > *Subject:* Re: [magnum]: K8s cluster creation > times out. OpenStack Train : [ERROR]: Unable > to render networking. Network config is likely > broken > >   > > Hi Bhupathi, > > Could you please share your cluster template? > And please make sure your Nova/Neutron works. > >   > > On 10/01/20 2:45 AM, Bhupathi, Ramakrishna wrote: > > Folks, > > I am building a Kubernetes Cluster( > Openstack Train) and using fedora > atomic-29 image . The nodes come up  fine > ( I have a simple 1 master and 1 node) , > but the cluster creation times out,  and > when I access the cloud-init logs I see > this error .  Wondering what I am missing > as this used to work before.  I wonder if > this is image related . > >   > > [ERROR]: Unable to render networking. > Network config is likely broken: No > available network renderers found. > Searched through list: ['eni', > 'sysconfig', 'netplan'] > >   > > Essentially the stack creation fails in > “kube_cluster_deploy” > >   > > Can somebody help me debug this ? Any help > is appreciated. > >   > > --RamaK > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. 
> > -- > > Cheers & Best regards, > > Feilong Wang (王飞龙) > > Head of R&D > > Catalyst Cloud - Cloud Native New Zealand > > -------------------------------------------------------------------------- > > Tel: +64-48032246 > > Email: flwang at catalyst.net.nz > > Level 6, Catalyst House, 150 Willis Street, Wellington > > -------------------------------------------------------------------------- > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. > > -- > > Cheers & Best regards, > > Feilong Wang (王飞龙) > > ------------------------------------------------------ > > Senior Cloud Software Engineer > > Tel: +64-48032246 > > Email: flwang at catalyst.net.nz > > Catalyst IT Limited > > Level 6, Catalyst House, 150 Willis Street, Wellington > > ------------------------------------------------------ > > >   > > -- > > ~/DonnyD > > C: 805 814 6800 > > "No mission too difficult. No sacrifice too great. > Duty First" > > -- > > Cheers & Best regards, > > Feilong Wang (王飞龙) > > Head of R&D > > Catalyst Cloud - Cloud Native New Zealand > > -------------------------------------------------------------------------- > > Tel: +64-48032246 > > Email: flwang at catalyst.net.nz > > Level 6, Catalyst House, 150 Willis Street, Wellington > > -------------------------------------------------------------------------- > > >   > > -- > > ~/DonnyD > > C: 805 814 6800 > > "No mission too difficult. No sacrifice too great. Duty First" > > -- > > Cheers & Best regards, > > Feilong Wang (王飞龙) > > Head of R&D > > Catalyst Cloud - Cloud Native New Zealand > > -------------------------------------------------------------------------- > > Tel: +64-48032246 > > Email: flwang at catalyst.net.nz > > Level 6, Catalyst House, 150 Willis Street, Wellington > > -------------------------------------------------------------------------- > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. > > The contents of this e-mail message and > any attachments are intended solely for the > addressee(s) and may contain confidential > and/or legally privileged information. If you > are not the intended recipient of this message > or if this message has been addressed to you > in error, please immediately alert the sender > by reply e-mail and then delete this message > and any attachments. If you are not the > intended recipient, you are notified that > any use, dissemination, distribution, copying, > or storage of this message or any attachment > is strictly prohibited. 
-- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed Jan 15 23:01:01 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 15 Jan 2020 17:01:01 -0600 Subject: [tc][all] Updates on Ussuri cycle community-wide goals In-Reply-To: <16e860b4a50.10ee607ea44010.6012230745285412048@ghanshyammann.com> References: <16e181adbc4.1191b0166302215.2291880664205036921@ghanshyammann.com> <16e860b4a50.10ee607ea44010.6012230745285412048@ghanshyammann.com> Message-ID: <16fab704bc9.11f04c9ec93844.5066595270036750282@ghanshyammann.com> ---- On Tue, 19 Nov 2019 17:41:57 -0600 Ghanshyam Mann wrote ---- > ---- On Tue, 29 Oct 2019 10:20:43 -0500 Ghanshyam Mann wrote ---- > > Hello Everyone, > > > > We have two goals with their champions ready for review. Please review and provide your feedback on Gerrit. > > > > > > 1. Add goal for project specific PTL and contributor guides - Kendall Nelson > > - https://review.opendev.org/#/c/691737/ > > > > 2. Propose a new goal to migrate all legacy zuul jobs - Luigi Toscano > > - https://review.opendev.org/#/c/691278/ > > > > Hello Everyone, > > From the Forum and PTG discussions[1], we agreed to proceed with below two goals for the Ussuri cyle. > > 1. Drop Python 2.7 Support - Already Accepted. > Patches on almost all services are up for review and merge[2]. Merge those fast to avoid your projects gate > break due to cross projects dropping py2. > > 2. Project Specific New Contributor & PTL Docs - Under Review > The goal patch is under review. Feel Free to provide your feedback on https://review.opendev.org/#/c/691737/ > > 'migrate all legacy zuul job' is pre-selected as V cycle goal and under review in > https://review.opendev.org/#/c/691278/ This is the final update of Ussuri cycle community-wide goals selection. 2nd community-wide goal for the Ussuri cycle has been merged today. Below two are the final goals for this cycle[1]: 1. Drop Python 2.7 Support - In-progress - https://governance.openstack.org/tc/goals/selected/ussuri/drop-py27.html 2. Project Specific New Contributor & PTL Docs - Selected for Ussuri. - https://governance.openstack.org/tc/goals/selected/ussuri/project-ptl-and-contrib-docs.htm [1] https://governance.openstack.org/tc/goals/selected/ussuri/index.html -gmann > > > [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-November/010943.html > [2] https://review.opendev.org/#/q/topic:drop-py27-support+(status:open+OR+status:merged) > > -gmann > > > We are still looking for the Champion volunteer for RBAC goal[1]. If you have any new ideas for goal, do not hesitate to add in etherpad[2] > > > > [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010291.html > > [2] https://etherpad.openstack.org/p/PVG-u-series-goals > > > > -gmann & diablo_rojo > > > > > > > > > From tony.pearce at cinglevue.com Thu Jan 16 07:36:45 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Thu, 16 Jan 2020 15:36:45 +0800 Subject: DR options with openstack Message-ID: <5e201295.1c69fb81.a69b.d77d@mx.google.com> An HTML attachment was scrubbed... 
URL: From pierre at stackhpc.com Thu Jan 16 09:23:26 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Thu, 16 Jan 2020 09:23:26 +0000 Subject: [blazar] No IRC meeting today Message-ID: Hello, Similarly to Tuesday, I have to cancel today's Blazar IRC meeting. Sorry for the late notice. Thanks, Pierre Riteau (priteau) From amy at demarco.com Thu Jan 16 12:53:38 2020 From: amy at demarco.com (Amy Marrich) Date: Thu, 16 Jan 2020 06:53:38 -0600 Subject: Rails Girls Summer of Code Message-ID: Hi All, I was contacted about this program to see if OpenStack might be interested in participating and despite the name it is language agnostic. Moe information on the program can be found at Rails Girls Summer of Code, I'm willing to help organize our efforts but would need to know level of interest to participate and mentor. Thanks, Amy (spotz) Chair, Diversity and Inclusion WG Chair, User Committee -------------- next part -------------- An HTML attachment was scrubbed... URL: From francois.scheurer at everyware.ch Thu Jan 16 12:56:49 2020 From: francois.scheurer at everyware.ch (Francois Scheurer) Date: Thu, 16 Jan 2020 13:56:49 +0100 Subject: [cinder] consistency group not working In-Reply-To: <20191111140016.qyftq5iy27ekmdtj@localhost> References: <7adf0a5d-43b3-c606-2ba8-00d97b96cbdc@everyware.ch> <20191111140016.qyftq5iy27ekmdtj@localhost> Message-ID: Dear Gorka Many thanks for your answer. Cheers Francois On 11/11/19 3:00 PM, Gorka Eguileor wrote: > On 27/09, Francois Scheurer wrote: >> Dear Cinder Experts >> >> >> We are running the rocky release. >> >> |We can create a consistency group: openstack consistency group create >> --volume-type b9f67298-cf68-4cb2-bed2-c806c5f83487 fsc-consgroup Bug 1: but >> adding volumes is not working: openstack consistency group add volume >> c3f49ef0-601e-4558-a75a-9b758304ce3b b48752e3-641f-4a49-a892-6cb54ab6b74d >> c0022411-59a4-4c7c-9474-c7ea8ccc7691 0f4c6493-dbe2-4f75-8e37-5541a267e3f2 => >> Invalid volume: Volume is not local to this node. (HTTP 400) (Request-ID: >> req-7f67934a-5835-40ef-b25c-12591fd79f85) Bug 2: deleting consistency group >> is also not working (silently failing): openstack consistency group delete >> c3f49ef0-601e-4558-a75a-9b758304ce3b |||=> AttributeError: 'RBDDriver' >> object has no attribute 'delete_consistencygroup'| See details below. Using >> the --force option makes no difference and the consistency group is not >> deleted. Do you think this is a bug or a configuration issue? Thank you in >> advance. | >> >> Cheers >> >> Francois > Hi, > > It seems you are trying to use consistency groups with the RBD driver, > which doesn't currently support consistency groups. > > Cheers, > Gorka. 
> >> |Details: ==> >> /var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-api-access.log <== >> 10.0.129.17 - - [27/Sep/2019:12:16:24 +0200] "POST /v3/f099965b37ac41489e9cac8c9d208711/consistencygroups/3706bbab-e2df-4507-9168-08ef811e452c/delete >> HTTP/1.1" 202 - 109720 "-" "python-cinderclient" ==> >> /var/lib/docker/volumes/kolla_logs/_data/cinder/cinder-volume.log <== >> 2019-09-27 12:16:24.491 30 ERROR oslo_messaging.rpc.server >> [req-9010336e-d569-47ad-84e2-8dd8b729939c b141574ee71f49a0b53a05ae968576c5 >> f099965b37ac41489e9cac8c9d208711 - default default] Exception during message >> handling: AttributeError: 'RBDDriver' object has no attribute >> 'delete_consistencygroup' 2019-09-27 12:16:24.491 30 ERROR >> oslo_messaging.rpc.server Traceback (most recent call last): 2019-09-27 >> 12:16:24.491 30 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", >> line 163, in _process_incoming 2019-09-27 12:16:24.491 30 ERROR >> oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) 2019-09-27 >> 12:16:24.491 30 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", >> line 265, in dispatch 2019-09-27 12:16:24.491 30 ERROR >> oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, >> args) 2019-09-27 12:16:24.491 30 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", >> line 194, in _do_dispatch 2019-09-27 12:16:24.491 30 ERROR >> oslo_messaging.rpc.server result = func(ctxt, **new_args) 2019-09-27 >> 12:16:24.491 30 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/osprofiler/profiler.py", >> line 159, in wrapper 2019-09-27 12:16:24.491 30 ERROR >> oslo_messaging.rpc.server result = f(*args, **kwargs) 2019-09-27 >> 12:16:24.491 30 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/cinder/volume/manager.py", >> line 3397, in delete_group 2019-09-27 12:16:24.491 30 ERROR >> oslo_messaging.rpc.server vol_obj.save() 2019-09-27 12:16:24.491 30 ERROR >> oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", >> line 220, in __exit__ 2019-09-27 12:16:24.491 30 ERROR >> oslo_messaging.rpc.server self.force_reraise() 2019-09-27 12:16:24.491 30 >> ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py", >> line 196, in force_reraise 2019-09-27 12:16:24.491 30 ERROR >> oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) >> 2019-09-27 12:16:24.491 30 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/cinder/volume/manager.py", >> line 3362, in delete_group 2019-09-27 12:16:24.491 30 ERROR >> oslo_messaging.rpc.server self.driver.delete_consistencygroup(context, cg, >> 2019-09-27 12:16:24.491 30 ERROR oslo_messaging.rpc.server AttributeError: >> 'RBDDriver' object has no attribute 'delete_consistencygroup' 2019-09-27 >> 12:16:24.491 30 ERROR oslo_messaging.rpc.server| >> >> >> >> >> -- >> >> >> EveryWare AG >> François Scheurer >> Senior Systems Engineer >> Zurlindenstrasse 52a >> CH-8003 Zürich >> >> tel: +41 44 466 60 00 >> fax: +41 44 466 60 10 >> mail: francois.scheurer at everyware.ch >> web: http://www.everyware.ch >> > -- EveryWare AG François Scheurer Senior Systems Engineer Zurlindenstrasse 52a 
CH-8003 Zürich tel: +41 44 466 60 00 fax: +41 44 466 60 10 mail: francois.scheurer at everyware.ch web: http://www.everyware.ch -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5978 bytes Desc: not available URL: From radoslaw.piliszek at gmail.com Thu Jan 16 13:30:51 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 16 Jan 2020 14:30:51 +0100 Subject: [all] cirros-cloud.net is down Message-ID: Hi, Folks! If your CI jobs depend on downloaded cirros image, then be aware that cirros links are seemingly permanently down atm. I reported [1] due to lack of a better place (that I know of). [1] https://github.com/cirros-dev/cirros/issues/12 -yoctozepto From Martin.Gehrke at twosigma.com Thu Jan 16 13:37:04 2020 From: Martin.Gehrke at twosigma.com (Martin Gehrke) Date: Thu, 16 Jan 2020 13:37:04 +0000 Subject: [ops] live-migration progress Message-ID: Hi, Last week at the Openstack Operators meetup in London, someone mentioned that there was an issue with the progress updates during a live migration and that by turning it off you could increase your success rate. Does anyone know more TIA Martin Gehrke DevOps Manager & OpenStack Tech Lead Two Sigma Investments, LP New York, NY -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcin.juszkiewicz at linaro.org Thu Jan 16 13:52:28 2020 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Thu, 16 Jan 2020 14:52:28 +0100 Subject: [all] cirros-cloud.net is down In-Reply-To: References: Message-ID: <6764c27d-f7c1-42d1-2f4b-4dcedde2b5d7@linaro.org> W dniu 16.01.2020 o 14:30, Radosław Piliszek pisze: > Hi, Folks! > > If your CI jobs depend on downloaded cirros image, then be aware that > cirros links are seemingly permanently down atm. > > I reported [1] due to lack of a better place (that I know of). > > [1] https://github.com/cirros-dev/cirros/issues/12 This is official place now. We moved Cirros from Launchpad to Github in December. From jean-philippe at evrard.me Thu Jan 16 15:05:29 2020 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Thu, 16 Jan 2020 16:05:29 +0100 Subject: [tc] January meeting agenda In-Reply-To: <1d35cdc723dbd4d50ab6a933b6a6a2c8a8ee4153.camel@evrard.me> References: <1d35cdc723dbd4d50ab6a933b6a6a2c8a8ee4153.camel@evrard.me> Message-ID: <05c9dde499c4ac9577e99f59071c85b0b5029a91.camel@evrard.me> Hello, The meeting logs are available here: http://eavesdrop.openstack.org/meetings/tc/2020/tc.2020-01-16-14.00.html Thank you everyone! Regards, JP From emiller at genesishosting.com Thu Jan 16 17:00:20 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Thu, 16 Jan 2020 11:00:20 -0600 Subject: [magnum][kolla] etcd wal sync duration issue In-Reply-To: <279cedf1-8bf4-fcf1-cfc2-990c97685531@catalyst.net.nz> References: <046E9C0290DD9149B106B72FC9156BEA0477170E@gmsxchsvr01.thecreation.com> <3f3fe0d1-7b61-d2f9-da65-d126ea5ed336@catalyst.net.nz> <046E9C0290DD9149B106B72FC9156BEA04771716@gmsxchsvr01.thecreation.com> <279cedf1-8bf4-fcf1-cfc2-990c97685531@catalyst.net.nz> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04771749@gmsxchsvr01.thecreation.com> Hi Feilong, Before I was able to use the benchmark tool you mentioned, we saw some other slowdowns with Ceph (all flash). 
It appears that something must have crashed somewhere since we had to restart a couple things, after which etcd has been performing fine and no more health issues being reported by Magnum. So, it looks like it wasn't etcd related afterall. However, while researching, I found that etcd's fsync on every write (so it guarantees a write cache flush for each write) apparently creates some havoc with some SSDs, where the SSD performs a full cache flush of multiple caches. This article explains it a LOT better: https://yourcmc.ru/wiki/Ceph_performance (scroll to the "Drive cache is slowing you down" section) It seems that the optimal configuration for etcd would be to use local drives in each node and be sure that the write cache is disabled in the SSDs - as opposed to using Ceph volumes, which already adds network latency, but can create even more latency for synchronizations due to Ceph's replication. Eric From: feilong [mailto:feilong at catalyst.net.nz] Sent: Wednesday, January 15, 2020 2:36 PM To: Eric K. Miller; openstack-discuss at lists.openstack.org Cc: Spyros Trigazis Subject: Re: [magnum][kolla] etcd wal sync duration issue Hi Eric, If you're using SSD, then I think the IO performance should be OK. You can use this https://github.com/etcd-io/etcd/tree/master/tools/benchmark to verify and confirm that 's the root cause. Meanwhile, you can review the config of etcd cluster deployed by Magnum. I'm not an export of Etcd, so TBH I can't see anything wrong with the config. Most of them are just default configurations. As for the etcd image, it's built from https://github.com/projectatomic/atomic-system-containers/tree/master/etcd or you can refer CERN's repo https://gitlab.cern.ch/cloud/atomic-system-containers/blob/cern-qa/etcd/ Spyros, any comments? On 14/01/20 10:52 AM, Eric K. Miller wrote: Hi Feilong, Thanks for responding! I am, indeed, using the default v3.2.7 version for etcd, which is the only available image. I did not try to reproduce with any other driver (we have never used DevStack, honestly, only Kolla-Ansible deployments). I did see a number of people indicating similar issues with etcd versions in the 3.3.x range, so I didn't think of it being an etcd issue, but then again most issues seem to be a result of people using HDDs and not SSDs, which makes sense. Interesting that you saw the same issue, though. We haven't tried Fedora CoreOS, but I think we would need Train for this. Everything I read about etcd indicates that it is extremely latency sensitive, due to the fact that it replicates all changes to all nodes and sends an fsync to Linux each time, so data is always guaranteed to be stored. I can see this becoming an issue quickly without super-low-latency network and storage. We are using Ceph-based SSD volumes for the Kubernetes Master node disks, which is extremely fast (likely 10x or better than anything people recommend for etcd), but network latency is always going to be higher with VMs on OpenStack with DVR than bare metal with VLANs due to all of the abstractions. Do you know who maintains the etcd images for Magnum here? Is there an easy way to create a newer image? https://hub.docker.com/r/openstackmagnum/etcd/tags/ Eric From: Feilong Wang [mailto:feilong at catalyst.net.nz] Sent: Monday, January 13, 2020 3:39 PM To: openstack-discuss at lists.openstack.org Subject: Re: [magnum][kolla] etcd wal sync duration issue Hi Eric, That issue looks familiar for me. There are some questions I'd like to check before answering if you should upgrade to train. 1. 
Are using the default v3.2.7 version for etcd? 2. Did you try to reproduce this with devstack, using Fedora CoreOS driver? The etcd version could be 3.2.26 I asked above questions because I saw the same error when I used Fedora Atomic with etcd v3.2.7 and I can't reproduce it with Fedora CoreOS + etcd 3.2.26 -- Cheers & Best regards, Feilong Wang (王飞龙) ------------------------------------------------------ Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington ------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Thu Jan 16 17:55:06 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 16 Jan 2020 18:55:06 +0100 Subject: [all] cirros-cloud.net is down In-Reply-To: References: Message-ID: We investigated the issue further with Clark (@clarkb). It seems infra provides cached cirros 0.4.0 but some jobs use older cirros versions. Please verify your jobs use cirros 0.4.0 and use the cache in /opt/cache/files to avoid failures due to mirror flakiness. Default DevStack already uses it. -yoctozepto czw., 16 sty 2020 o 14:30 Radosław Piliszek napisał(a): > > Hi, Folks! > > If your CI jobs depend on downloaded cirros image, then be aware that > cirros links are seemingly permanently down atm. > > I reported [1] due to lack of a better place (that I know of). > > [1] https://github.com/cirros-dev/cirros/issues/12 > > -yoctozepto From Albert.Braden at synopsys.com Thu Jan 16 19:49:08 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 16 Jan 2020 19:49:08 +0000 Subject: DR options with openstack In-Reply-To: <5e201295.1c69fb81.a69b.d77d@mx.google.com> References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> Message-ID: Hi Tony, It looks like Cheesecake didn’t survive but apparently some components of it did; details in https://docs.openstack.org/cinder/pike/contributor/replication.html I’m not using Cinder now; we used it at eBay with Ceph and Netapp backends. Netapp makes it easy but is expensive; Ceph is free but you have to figure out how to make it work. You’re right about forking; we did it and then upgrading turned from an incredibly difficult ordeal to an impossible one. It’s better to stay with the “official” code so that upgrading remains an option. I’m just an operator; hopefully someone more expert will reply with more useful info. It’s true that our community lacks participation. It’s very difficult for a new operator to start using openstack and get help with the issues that they encounter. So far this mailing list has been the best resource for me. IRC and Ask Openstack are mostly unattended. I try to help out in #openstack when I can, but I don’t know a lot so I mostly end up telling people to ask on the list. On IRC sometimes I find help by asking in other openstack-* channels. Sometimes people complain that I’m asking in a developer channel, but sometimes I get help. Persistence is the key. If I keep asking long enough in enough places, eventually someone will answer. If all else fails, I open a bug. Good luck and welcome to the Openstack community! From: Tony Pearce Sent: Wednesday, January 15, 2020 11:37 PM To: openstack-discuss at lists.openstack.org Subject: DR options with openstack Hi all My questions are; 1. How are people using iSCSI Cinder storage with Openstack to-date? 
For example a Nimble Storage array backend. I mean to say, are people using backend integration drivers for other hardware (like netapp)? Or are they using backend iscsi for example? 2. How are people managing DR with Openstack in terms of backend storage replication to another array in another location and continuing to use Openstack? The environment which I am currently using; 1 x Nimble Storage array (iSCSI) with nimble.py Cinder driver 1 x virtualised Controller node 2 x physical compute nodes This is Openstack Pike. In addition, I have a 2nd Nimble Storage array in another location. To explain the questions I’d like to put forward my thoughts for question 2 first: For point 2 above, I have been searching for a way to utilise replicated volumes on the 2nd array from Openstack with existing instances. For example, if site 1 goes down how would I bring up openstack in the 2nd location and boot up the instances where their volumes are stored on the 2nd array. I found a proposal for something called “cheesecake” ref: https://specs.openstack.org/openstack/cinder-specs/specs/rocky/cheesecake-promote-backend.html But I could not find if it had been approved or implemented. So I return to square 1. I have some thoughts about failing over the controller VM and compute node but I don’t think there’s any need to go into here because of the above blocker and for brevity anyway. The nimble.py driver which I am using came with Openstack Pike and it appears Nimble / HPE are not maintaining it any longer. I saw a commit to remove nimble.py in Openstack Train release. The driver uses the REST API to perform actions on the array. Such as creating a volume, downloading the image, mounting the volume to the instance, snapshots, clones etc. This is great for me because to date I have around 10TB of openstack storage data allocated and the Nimble array shows the amount of data being consumed is <900GB. This is due to the compression and zero-byte snapshots and clones. So coming back to question 2 – is it possible? Can you drop me some keywords that I can search for such as an Openstack component like Cheesecake? I think basically what I am looking for is a supported way of telling Openstack that the instance volumes are now located at the new / second array. This means a new cinder backend. Example, new iqn, IP address, volume serial number. I think I could probably hack the cinder db but I really want to avoid that. So failing the above, it brings me to the question 1 I asked before. How are people using Cinder volumes? May be I am going about this the wrong way and need to take a few steps backwards to go forwards? I need storage to be able to deploy instances onto. Snapshots and clones are desired. At the moment these operations take less time than the horizon dashboard takes to load because of the waiting API responses. When searching for information about the above as an end-user / consumer I get a bit concerned. Is it right that Openstack usage is dropping? There’s no web forum to post questions. The chatroom on freenode is filled with ~300 ghosts. Ask Openstack questions go without response. Earlier this week (before I found this mail list) I had to use facebook to report that the Openstack.org website had been hacked. Basically it seems that if you’re a developer that can write code then you’re in but that’s it. I have never been a coder and so I am somewhat stuck. Thanks in advance Sent from Mail for Windows 10 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fungi at yuggoth.org Thu Jan 16 21:24:15 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 16 Jan 2020 21:24:15 +0000 Subject: DR options with openstack In-Reply-To: References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> Message-ID: <20200116212414.ugbths4zeilnylxc@yuggoth.org> On 2020-01-16 19:49:08 +0000 (+0000), Albert Braden wrote: [...] > On IRC sometimes I find help by asking in other openstack-* > channels. Sometimes people complain that I’m asking in a developer > channel, but sometimes I get help [...] I hope we don't have "developer[-only] channels" in OpenStack. The way of free/libre open source software is that users often become developers once they gain an increased familiarity with a project, so telling them to go away when they have a question is absolutely the wrong approach if we want this to be a sustainable effort longer term. I'm a developer on a number of projects where I still regularly have questions as a user, so even for selfish reasons I don't think that sort of discussion should be off-topic. If software developers get annoyed by users asking them too many questions or the same questions over and over, they should see that as a clear sign that they need to improve the documentation they maintain. So just to reassure you, you are absolutely doing the right thing by asking folks in project-specific IRC channels (or on this mailing list) when documentation about something is unclear or you encounter an undocumented behavior you'd like help investigating. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From Burak.Hoban at iag.com.au Thu Jan 16 22:41:56 2020 From: Burak.Hoban at iag.com.au (Burak Hoban) Date: Thu, 16 Jan 2020 22:41:56 +0000 Subject: DR options with openstack Message-ID: Hey Tony, Keep in mind that if you're looking to run OpenStack, but you're not feeling comfortable with the community support then there's always the option to go with a vendor backed version. These are usually a good option for those a little more risk adverse, or who don't have the time/or skills to maintain upstream releases - however going down that path usually means you can do less with OpenStack (depending on the vendor), but you have a large pool of resources to help troubleshoot and answer questions. We do both approaches internally for different clusters, so both approaches have their pro and cons. You touched on a few points in your original email... > If you had two OpenStack clusters, one in "site 1" and another in "site 2", then you could look at below for backup/restore of instances cross-cluster: - Freezer -> https://wiki.openstack.org/wiki/Freezer - Trillio (basically just a series of nova snapshots under the cover) -> https://www.trilio.io/ You could then over the top roll out a file level based backup tool on each instance, this would pretty much offer you replication functionality without having to do block-level tinkering. > Failover of OpenStack controller/computes If you have two sites, you can always go for 3x Controller deployment spanning cross site. Depending on latency obviously, however all you really need is a good enough link for RabbitMQ/Galera to talk reliably etc. Failing that, I'd recommend backing up your Controller with ReaR. From there you can also schedule frequent automated jobs to do a OpenStack DB backups. 
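(A minimal sketch of such a job, assuming a MariaDB/Galera control plane with credentials readable from /root/.my.cnf - the schedule, paths and retention here are illustrative only:

  # /etc/cron.d/openstack-db-backup : nightly dump of all OpenStack databases
  0 2 * * * root mysqldump --all-databases --single-transaction | gzip > /var/backups/openstack-db-$(date +\%F).sql.gz

--single-transaction keeps the dump consistent without locking the server; on a Galera cluster you would normally run this on one node only and ship the resulting file off-site together with the ReaR image.)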
Recovering should be a case of ReaR restore, load latest OpenStack DB and start everything up... You'll probably want to ensure your VLANs are spanned cross-site so you can reuse same IP addresses. https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/backup_and_restore/05_rear.html https://superuser.openstack.org/articles/tutorial-rear-openstack-deployment/ In reality, the best solution would be to have two isolated clusters, and your workloads spanned across both sites. Obviously that isn't always possible (from personal experience), but pushing people down the Kubernetes path and then for the rest automation/backup utilities may cater for your needs. Having said that, Albert's link does look promising -> https://docs.openstack.org/cinder/pike/contributor/replication.html Date: Thu, 16 Jan 2020 19:49:08 +0000 From: Albert Braden To: Tony Pearce , "openstack-discuss at lists.openstack.org" Subject: RE: DR options with openstack Message-ID: Content-Type: text/plain; charset="utf-8" Hi Tony, It looks like Cheesecake didn’t survive but apparently some components of it did; details in https://docs.openstack.org/cinder/pike/contributor/replication.html I’m not using Cinder now; we used it at eBay with Ceph and Netapp backends. Netapp makes it easy but is expensive; Ceph is free but you have to figure out how to make it work. You’re right about forking; we did it and then upgrading turned from an incredibly difficult ordeal to an impossible one. It’s better to stay with the “official” code so that upgrading remains an option. I’m just an operator; hopefully someone more expert will reply with more useful info. It’s true that our community lacks participation. It’s very difficult for a new operator to start using openstack and get help with the issues that they encounter. So far this mailing list has been the best resource for me. IRC and Ask Openstack are mostly unattended. I try to help out in #openstack when I can, but I don’t know a lot so I mostly end up telling people to ask on the list. On IRC sometimes I find help by asking in other openstack-* channels. Sometimes people complain that I’m asking in a developer channel, but sometimes I get help. Persistence is the key. If I keep asking long enough in enough places, eventually someone will answer. If all else fails, I open a bug. Good luck and welcome to the Openstack community! From: Tony Pearce Sent: Wednesday, January 15, 2020 11:37 PM To: openstack-discuss at lists.openstack.org Subject: DR options with openstack Hi all My questions are; 1. How are people using iSCSI Cinder storage with Openstack to-date? For example a Nimble Storage array backend. I mean to say, are people using backend integration drivers for other hardware (like netapp)? Or are they using backend iscsi for example? 2. How are people managing DR with Openstack in terms of backend storage replication to another array in another location and continuing to use Openstack? The environment which I am currently using; 1 x Nimble Storage array (iSCSI) with nimble.py Cinder driver 1 x virtualised Controller node 2 x physical compute nodes This is Openstack Pike. In addition, I have a 2nd Nimble Storage array in another location. To explain the questions I’d like to put forward my thoughts for question 2 first: For point 2 above, I have been searching for a way to utilise replicated volumes on the 2nd array from Openstack with existing instances. 
For example, if site 1 goes down how would I bring up openstack in the 2nd location and boot up the instances where their volumes are stored on the 2nd array. I found a proposal for something called “cheesecake” ref: https://specs.openstack.org/openstack/cinder-specs/specs/rocky/cheesecake-promote-backend.html But I could not find if it had been approved or implemented. So I return to square 1. I have some thoughts about failing over the controller VM and compute node but I don’t think there’s any need to go into here because of the above blocker and for brevity anyway. The nimble.py driver which I am using came with Openstack Pike and it appears Nimble / HPE are not maintaining it any longer. I saw a commit to remove nimble.py in Openstack Train release. The driver uses the REST API to perform actions on the array. Such as creating a volume, downloading the image, mounting the volume to the instance, snapshots, clones etc. This is great for me because to date I have around 10TB of openstack storage data allocated and the Nimble array shows the amount of data being consumed is <900GB. This is due to the compression and zero-byte snapshots and clones. So coming back to question 2 – is it possible? Can you drop me some keywords that I can search for such as an Openstack component like Cheesecake? I think basically what I am looking for is a supported way of telling Openstack that the instance volumes are now located at the new / second array. This means a new cinder backend. Example, new iqn, IP address, volume serial number. I think I could probably hack the cinder db but I really want to avoid that. So failing the above, it brings me to the question 1 I asked before. How are people using Cinder volumes? May be I am going about this the wrong way and need to take a few steps backwards to go forwards? I need storage to be able to deploy instances onto. Snapshots and clones are desired. At the moment these operations take less time than the horizon dashboard takes to load because of the waiting API responses. When searching for information about the above as an end-user / consumer I get a bit concerned. Is it right that Openstack usage is dropping? There’s no web forum to post questions. The chatroom on freenode is filled with ~300 ghosts. Ask Openstack questions go without response. Earlier this week (before I found this mail list) I had to use facebook to report that the Openstack.org website had been hacked. Basically it seems that if you’re a developer that can write code then you’re in but that’s it. I have never been a coder and so I am somewhat stuck. Thanks in advance Sent from Mail for Windows 10 _____________________________________________________________________ The information transmitted in this message and its attachments (if any) is intended only for the person or entity to which it is addressed. The message may contain confidential and/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon this information, by persons or entities other than the intended recipient is prohibited. If you have received this in error, please contact the sender and delete this e-mail and associated material from any computer. The intended recipient of this e-mail may only use, reproduce, disclose or distribute the information contained in this e-mail and any attached files, with the permission of the sender. This message has been scanned for viruses. 
_____________________________________________________________________ From ignaziocassano at gmail.com Thu Jan 16 23:00:37 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 17 Jan 2020 00:00:37 +0100 Subject: DR options with openstack In-Reply-To: References: Message-ID: Hello, I suggest hystax for openstack failover and failback between two openstack sites. It works with openstack upstream As well. Ignazio Il Gio 16 Gen 2020, 23:46 Burak Hoban ha scritto: > Hey Tony, > > Keep in mind that if you're looking to run OpenStack, but you're not > feeling comfortable with the community support then there's always the > option to go with a vendor backed version. These are usually a good option > for those a little more risk adverse, or who don't have the time/or skills > to maintain upstream releases - however going down that path usually means > you can do less with OpenStack (depending on the vendor), but you have a > large pool of resources to help troubleshoot and answer questions. We do > both approaches internally for different clusters, so both approaches have > their pro and cons. > > You touched on a few points in your original email... > > > If you had two OpenStack clusters, one in "site 1" and another in "site > 2", then you could look at below for backup/restore of instances > cross-cluster: > - Freezer -> https://wiki.openstack.org/wiki/Freezer > - Trillio (basically just a series of nova snapshots under the cover) -> > https://www.trilio.io/ > > You could then over the top roll out a file level based backup tool on > each instance, this would pretty much offer you replication functionality > without having to do block-level tinkering. > > > Failover of OpenStack controller/computes > If you have two sites, you can always go for 3x Controller deployment > spanning cross site. Depending on latency obviously, however all you really > need is a good enough link for RabbitMQ/Galera to talk reliably etc. > > Failing that, I'd recommend backing up your Controller with ReaR. From > there you can also schedule frequent automated jobs to do a OpenStack DB > backups. Recovering should be a case of ReaR restore, load latest OpenStack > DB and start everything up... You'll probably want to ensure your VLANs are > spanned cross-site so you can reuse same IP addresses. > > > https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/backup_and_restore/05_rear.html > > https://superuser.openstack.org/articles/tutorial-rear-openstack-deployment/ > > > In reality, the best solution would be to have two isolated clusters, and > your workloads spanned across both sites. Obviously that isn't always > possible (from personal experience), but pushing people down the Kubernetes > path and then for the rest automation/backup utilities may cater for your > needs. 
> > Having said that, Albert's link does look promising -> > https://docs.openstack.org/cinder/pike/contributor/replication.html > > > > Date: Thu, 16 Jan 2020 19:49:08 +0000 > From: Albert Braden > To: Tony Pearce , > "openstack-discuss at lists.openstack.org" > > Subject: RE: DR options with openstack > Message-ID: > < > BN8PR12MB3636451FC8E2BC6A50216425D9360 at BN8PR12MB3636.namprd12.prod.outlook.com > > > > Content-Type: text/plain; charset="utf-8" > > Hi Tony, > > It looks like Cheesecake didn’t survive but apparently some components of > it did; details in > https://docs.openstack.org/cinder/pike/contributor/replication.html > > I’m not using Cinder now; we used it at eBay with Ceph and Netapp > backends. Netapp makes it easy but is expensive; Ceph is free but you have > to figure out how to make it work. You’re right about forking; we did it > and then upgrading turned from an incredibly difficult ordeal to an > impossible one. It’s better to stay with the “official” code so that > upgrading remains an option. > > I’m just an operator; hopefully someone more expert will reply with more > useful info. > > It’s true that our community lacks participation. It’s very difficult for > a new operator to start using openstack and get help with the issues that > they encounter. So far this mailing list has been the best resource for me. > IRC and Ask Openstack are mostly unattended. I try to help out in > #openstack when I can, but I don’t know a lot so I mostly end up telling > people to ask on the list. On IRC sometimes I find help by asking in other > openstack-* channels. Sometimes people complain that I’m asking in a > developer channel, but sometimes I get help. Persistence is the key. If I > keep asking long enough in enough places, eventually someone will answer. > If all else fails, I open a bug. > > Good luck and welcome to the Openstack community! > > From: Tony Pearce > Sent: Wednesday, January 15, 2020 11:37 PM > To: openstack-discuss at lists.openstack.org > Subject: DR options with openstack > > Hi all > > My questions are; > > > 1. How are people using iSCSI Cinder storage with Openstack to-date? > For example a Nimble Storage array backend. I mean to say, are people using > backend integration drivers for other hardware (like netapp)? Or are they > using backend iscsi for example? > 2. How are people managing DR with Openstack in terms of backend > storage replication to another array in another location and continuing to > use Openstack? > > The environment which I am currently using; > 1 x Nimble Storage array (iSCSI) with nimble.py Cinder driver > 1 x virtualised Controller node > 2 x physical compute nodes > This is Openstack Pike. > > In addition, I have a 2nd Nimble Storage array in another location. > > To explain the questions I’d like to put forward my thoughts for question > 2 first: > For point 2 above, I have been searching for a way to utilise replicated > volumes on the 2nd array from Openstack with existing instances. For > example, if site 1 goes down how would I bring up openstack in the 2nd > location and boot up the instances where their volumes are stored on the > 2nd array. 
I found a proposal for something called “cheesecake” ref: > https://specs.openstack.org/openstack/cinder-specs/specs/rocky/cheesecake-promote-backend.html > < > https://urldefense.proofpoint.com/v2/url?u=https-3A__specs.openstack.org_openstack_cinder-2Dspecs_specs_rocky_cheesecake-2Dpromote-2Dbackend.html&d=DwMFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=e2jC0sFEUAs6byl7JOv5IAZTKPkABl-Eh6rQwQ55tWk&s=oVEr3DpxprOpbuxZ_4WSfSqAVCaZUlPCFT6g6DsqQHQ&e= > > > But I could not find if it had been approved or implemented. So I return > to square 1. I have some thoughts about failing over the controller VM and > compute node but I don’t think there’s any need to go into here because of > the above blocker and for brevity anyway. > > The nimble.py driver which I am using came with Openstack Pike and it > appears Nimble / HPE are not maintaining it any longer. I saw a commit to > remove nimble.py in Openstack Train release. The driver uses the REST API > to perform actions on the array. Such as creating a volume, downloading the > image, mounting the volume to the instance, snapshots, clones etc. This is > great for me because to date I have around 10TB of openstack storage data > allocated and the Nimble array shows the amount of data being consumed is > <900GB. This is due to the compression and zero-byte snapshots and clones. > > So coming back to question 2 – is it possible? Can you drop me some > keywords that I can search for such as an Openstack component like > Cheesecake? I think basically what I am looking for is a supported way of > telling Openstack that the instance volumes are now located at the new / > second array. This means a new cinder backend. Example, new iqn, IP > address, volume serial number. I think I could probably hack the cinder db > but I really want to avoid that. > > So failing the above, it brings me to the question 1 I asked before. How > are people using Cinder volumes? May be I am going about this the wrong way > and need to take a few steps backwards to go forwards? I need storage to be > able to deploy instances onto. Snapshots and clones are desired. At the > moment these operations take less time than the horizon dashboard takes to > load because of the waiting API responses. > > When searching for information about the above as an end-user / consumer I > get a bit concerned. Is it right that Openstack usage is dropping? There’s > no web forum to post questions. The chatroom on freenode is filled with > ~300 ghosts. Ask Openstack questions go without response. Earlier this week > (before I found this mail list) I had to use facebook to report that the > Openstack.org website had been hacked. Basically it seems that if you’re a > developer that can write code then you’re in but that’s it. I have never > been a coder and so I am somewhat stuck. > > Thanks in advance > > Sent from Mail< > https://urldefense.proofpoint.com/v2/url?u=https-3A__go.microsoft.com_fwlink_-3FLinkId-3D550986&d=DwMFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=e2jC0sFEUAs6byl7JOv5IAZTKPkABl-Eh6rQwQ55tWk&s=Qo1wKkAeo1uTCH83dVO-IVt4MWhQRk7rg3xKmlzPGhI&e=> > for Windows 10 > > > _____________________________________________________________________ > > The information transmitted in this message and its attachments (if any) > is intended > only for the person or entity to which it is addressed. > The message may contain confidential and/or privileged material. 
Any > review, > retransmission, dissemination or other use of, or taking of any action in > reliance > upon this information, by persons or entities other than the intended > recipient is > prohibited. > > If you have received this in error, please contact the sender and delete > this e-mail > and associated material from any computer. > > The intended recipient of this e-mail may only use, reproduce, disclose or > distribute > the information contained in this e-mail and any attached files, with the > permission > of the sender. > > This message has been scanned for viruses. > _____________________________________________________________________ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony.pearce at cinglevue.com Fri Jan 17 03:17:40 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Fri, 17 Jan 2020 11:17:40 +0800 Subject: DR options with openstack In-Reply-To: References: Message-ID: Hi all. Thanks to all that replied to no end. Lots of helpful information there. I apologise for not making this point but I am not looking for a 3rd party tool to achieve this. What I am looking for at this time are components already existing within openstack and open source is desired. I currently run Pike so I expect I may need to upgrade to get components I need. I did come across Freezer but not that wiki page. I'll work on setting up a test for this :) It’s true that our community lacks participation. It’s very difficult for a > new operator to start using openstack and get help with the issues that > they encounter. If I keep asking long enough in enough places, eventually someone will > answer. Yes it was difficult for me to learn. I managed to find a way through which worked for me. I started with Packstack. With regards to IRC - in my experience, once you get passed the authentication problems and often session timeout/kick out, you see the chat room with 300 people but no one chatting or answering. Kind of reduces the worth of the chatroom this way in my opinion. Although, I am in Australia so the timezone I am in could be a contributor. Thanks again - I have enough hints from you guys to go away and do some research. Best regards, *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. On Fri, 17 Jan 2020 at 07:00, Ignazio Cassano wrote: > Hello, I suggest hystax for openstack failover and failback between two > openstack sites. > It works with openstack upstream As well. > Ignazio > > Il Gio 16 Gen 2020, 23:46 Burak Hoban ha scritto: > >> Hey Tony, >> >> Keep in mind that if you're looking to run OpenStack, but you're not >> feeling comfortable with the community support then there's always the >> option to go with a vendor backed version. 
These are usually a good option >> for those a little more risk adverse, or who don't have the time/or skills >> to maintain upstream releases - however going down that path usually means >> you can do less with OpenStack (depending on the vendor), but you have a >> large pool of resources to help troubleshoot and answer questions. We do >> both approaches internally for different clusters, so both approaches have >> their pro and cons. >> >> You touched on a few points in your original email... >> >> > If you had two OpenStack clusters, one in "site 1" and another in "site >> 2", then you could look at below for backup/restore of instances >> cross-cluster: >> - Freezer -> https://wiki.openstack.org/wiki/Freezer >> - Trillio (basically just a series of nova snapshots under the cover) -> >> https://www.trilio.io/ >> >> You could then over the top roll out a file level based backup tool on >> each instance, this would pretty much offer you replication functionality >> without having to do block-level tinkering. >> >> > Failover of OpenStack controller/computes >> If you have two sites, you can always go for 3x Controller deployment >> spanning cross site. Depending on latency obviously, however all you really >> need is a good enough link for RabbitMQ/Galera to talk reliably etc. >> >> Failing that, I'd recommend backing up your Controller with ReaR. From >> there you can also schedule frequent automated jobs to do a OpenStack DB >> backups. Recovering should be a case of ReaR restore, load latest OpenStack >> DB and start everything up... You'll probably want to ensure your VLANs are >> spanned cross-site so you can reuse same IP addresses. >> >> >> https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/backup_and_restore/05_rear.html >> >> https://superuser.openstack.org/articles/tutorial-rear-openstack-deployment/ >> >> >> In reality, the best solution would be to have two isolated clusters, and >> your workloads spanned across both sites. Obviously that isn't always >> possible (from personal experience), but pushing people down the Kubernetes >> path and then for the rest automation/backup utilities may cater for your >> needs. >> >> Having said that, Albert's link does look promising -> >> https://docs.openstack.org/cinder/pike/contributor/replication.html >> >> >> >> Date: Thu, 16 Jan 2020 19:49:08 +0000 >> From: Albert Braden >> To: Tony Pearce , >> "openstack-discuss at lists.openstack.org" >> >> Subject: RE: DR options with openstack >> Message-ID: >> < >> BN8PR12MB3636451FC8E2BC6A50216425D9360 at BN8PR12MB3636.namprd12.prod.outlook.com >> > >> >> Content-Type: text/plain; charset="utf-8" >> >> Hi Tony, >> >> It looks like Cheesecake didn’t survive but apparently some components of >> it did; details in >> https://docs.openstack.org/cinder/pike/contributor/replication.html >> >> I’m not using Cinder now; we used it at eBay with Ceph and Netapp >> backends. Netapp makes it easy but is expensive; Ceph is free but you have >> to figure out how to make it work. You’re right about forking; we did it >> and then upgrading turned from an incredibly difficult ordeal to an >> impossible one. It’s better to stay with the “official” code so that >> upgrading remains an option. >> >> I’m just an operator; hopefully someone more expert will reply with more >> useful info. >> >> It’s true that our community lacks participation. It’s very difficult for >> a new operator to start using openstack and get help with the issues that >> they encounter. 
So far this mailing list has been the best resource for me. >> IRC and Ask Openstack are mostly unattended. I try to help out in >> #openstack when I can, but I don’t know a lot so I mostly end up telling >> people to ask on the list. On IRC sometimes I find help by asking in other >> openstack-* channels. Sometimes people complain that I’m asking in a >> developer channel, but sometimes I get help. Persistence is the key. If I >> keep asking long enough in enough places, eventually someone will answer. >> If all else fails, I open a bug. >> >> Good luck and welcome to the Openstack community! >> >> From: Tony Pearce >> Sent: Wednesday, January 15, 2020 11:37 PM >> To: openstack-discuss at lists.openstack.org >> Subject: DR options with openstack >> >> Hi all >> >> My questions are; >> >> >> 1. How are people using iSCSI Cinder storage with Openstack to-date? >> For example a Nimble Storage array backend. I mean to say, are people using >> backend integration drivers for other hardware (like netapp)? Or are they >> using backend iscsi for example? >> 2. How are people managing DR with Openstack in terms of backend >> storage replication to another array in another location and continuing to >> use Openstack? >> >> The environment which I am currently using; >> 1 x Nimble Storage array (iSCSI) with nimble.py Cinder driver >> 1 x virtualised Controller node >> 2 x physical compute nodes >> This is Openstack Pike. >> >> In addition, I have a 2nd Nimble Storage array in another location. >> >> To explain the questions I’d like to put forward my thoughts for question >> 2 first: >> For point 2 above, I have been searching for a way to utilise replicated >> volumes on the 2nd array from Openstack with existing instances. For >> example, if site 1 goes down how would I bring up openstack in the 2nd >> location and boot up the instances where their volumes are stored on the >> 2nd array. I found a proposal for something called “cheesecake” ref: >> https://specs.openstack.org/openstack/cinder-specs/specs/rocky/cheesecake-promote-backend.html >> < >> https://urldefense.proofpoint.com/v2/url?u=https-3A__specs.openstack.org_openstack_cinder-2Dspecs_specs_rocky_cheesecake-2Dpromote-2Dbackend.html&d=DwMFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=e2jC0sFEUAs6byl7JOv5IAZTKPkABl-Eh6rQwQ55tWk&s=oVEr3DpxprOpbuxZ_4WSfSqAVCaZUlPCFT6g6DsqQHQ&e= >> > >> But I could not find if it had been approved or implemented. So I return >> to square 1. I have some thoughts about failing over the controller VM and >> compute node but I don’t think there’s any need to go into here because of >> the above blocker and for brevity anyway. >> >> The nimble.py driver which I am using came with Openstack Pike and it >> appears Nimble / HPE are not maintaining it any longer. I saw a commit to >> remove nimble.py in Openstack Train release. The driver uses the REST API >> to perform actions on the array. Such as creating a volume, downloading the >> image, mounting the volume to the instance, snapshots, clones etc. This is >> great for me because to date I have around 10TB of openstack storage data >> allocated and the Nimble array shows the amount of data being consumed is >> <900GB. This is due to the compression and zero-byte snapshots and clones. >> >> So coming back to question 2 – is it possible? Can you drop me some >> keywords that I can search for such as an Openstack component like >> Cheesecake? 
I think basically what I am looking for is a supported way of >> telling Openstack that the instance volumes are now located at the new / >> second array. This means a new cinder backend. Example, new iqn, IP >> address, volume serial number. I think I could probably hack the cinder db >> but I really want to avoid that. >> >> So failing the above, it brings me to the question 1 I asked before. How >> are people using Cinder volumes? May be I am going about this the wrong way >> and need to take a few steps backwards to go forwards? I need storage to be >> able to deploy instances onto. Snapshots and clones are desired. At the >> moment these operations take less time than the horizon dashboard takes to >> load because of the waiting API responses. >> >> When searching for information about the above as an end-user / consumer >> I get a bit concerned. Is it right that Openstack usage is dropping? >> There’s no web forum to post questions. The chatroom on freenode is filled >> with ~300 ghosts. Ask Openstack questions go without response. Earlier this >> week (before I found this mail list) I had to use facebook to report that >> the Openstack.org website had been hacked. Basically it seems that if >> you’re a developer that can write code then you’re in but that’s it. I have >> never been a coder and so I am somewhat stuck. >> >> Thanks in advance >> >> Sent from Mail< >> https://urldefense.proofpoint.com/v2/url?u=https-3A__go.microsoft.com_fwlink_-3FLinkId-3D550986&d=DwMFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=e2jC0sFEUAs6byl7JOv5IAZTKPkABl-Eh6rQwQ55tWk&s=Qo1wKkAeo1uTCH83dVO-IVt4MWhQRk7rg3xKmlzPGhI&e=> >> for Windows 10 >> >> >> _____________________________________________________________________ >> >> The information transmitted in this message and its attachments (if any) >> is intended >> only for the person or entity to which it is addressed. >> The message may contain confidential and/or privileged material. Any >> review, >> retransmission, dissemination or other use of, or taking of any action in >> reliance >> upon this information, by persons or entities other than the intended >> recipient is >> prohibited. >> >> If you have received this in error, please contact the sender and delete >> this e-mail >> and associated material from any computer. >> >> The intended recipient of this e-mail may only use, reproduce, disclose >> or distribute >> the information contained in this e-mail and any attached files, with the >> permission >> of the sender. >> >> This message has been scanned for viruses. >> _____________________________________________________________________ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.carden at gmail.com Fri Jan 17 03:35:50 2020 From: mike.carden at gmail.com (Mike Carden) Date: Fri, 17 Jan 2020 14:35:50 +1100 Subject: DR options with openstack In-Reply-To: References: Message-ID: On Fri, Jan 17, 2020 at 2:25 PM Tony Pearce wrote: > With regards to IRC - in my experience, once you get passed the > authentication problems and often session timeout/kick out, you see the > chat room with 300 people but no one chatting or answering. Kind of reduces > the worth of the chatroom this way in my opinion. Although, I am in > Australia so the timezone I am in could be a contributor. > I'm also in Australia, but my IRC experience has been different from yours. 
I find that the individual, project-specific OpenStack IRC channels are a great resource, often attended by really helpful experts. 'openstack-ansible' 'openstack-ironic' 'openstack-qa' etc. Also, I keep a teeny tiny VM running in Google's cloud (AU 20 cents a month) to run quassel core so I have a 24/7 IRC connection to channels I watch so that people can reply to me while I'm asleep and I can catch up the next day. -- MC -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri Jan 17 03:51:15 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 17 Jan 2020 03:51:15 +0000 Subject: DR options with openstack In-Reply-To: References: Message-ID: <20200117035115.l5zahi4apmjogpuf@yuggoth.org> On 2020-01-17 11:17:40 +0800 (+0800), Tony Pearce wrote: [...] > With regards to IRC - in my experience, once you get passed the > authentication problems These are a relatively recent and unfortunate addition to our channels, necessitated by spammers randomly popping in and generally being nuisances for everyone. We keep testing the waters by lifting the identification requirement here and there, but the coast is not yet clear. We'd really rather people were able to freely join and ask questions without setting up accounts, it's just a bit hard to keep our channels usable that way at the moment. > and often session timeout/kick out, you see the chat room with 300 > people but no one chatting or answering. Kind of reduces the worth > of the chatroom this way in my opinion. Although, I am in > Australia so the timezone I am in could be a contributor. Certainly the bulk of discussion for most projects happens when Europe and the Americas are awake, so likely less in the middle of your day and a lot more overnight for you. There may be some increased activity in your mornings or evenings at least. But if this is the #openstack channel, the bigger problem is that it's just not got a lot of people with answers to user questions paying attention in there (I too am guilty of forgetting to keep tabs on it). The fundamental truth is that whenever you balkanize communications into topic areas for "users" and "developers," the end result is that the user forum is all questions nobody's answering because most of the folks with answers are all conversing somewhere else in places where such questions are discouraged. We used to have separate mailing lists for user questions, sharing between operators, and development topics; those suffered precisely the same problem and I'm quite happy we agreed as a community to merge the lists into one where users' questions *are* getting seen by people who already possess the necessary knowledge to provide accurate answers and guidance. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From tony.pearce at cinglevue.com Fri Jan 17 03:54:19 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Fri, 17 Jan 2020 11:54:19 +0800 Subject: DR options with openstack In-Reply-To: References: Message-ID: I had not discovered the channels like openstack-ansible. I just googled for the channel list. "openstack" is described as being meant for general questions. When I need to I'll try and use a more specific topic channel. Thanks for the advice. 
For reference, the channel list is https://wiki.openstack.org/wiki/IRC *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. On Fri, 17 Jan 2020 at 11:41, Mike Carden wrote: > > > On Fri, Jan 17, 2020 at 2:25 PM Tony Pearce > wrote: > >> With regards to IRC - in my experience, once you get passed the >> authentication problems and often session timeout/kick out, you see the >> chat room with 300 people but no one chatting or answering. Kind of reduces >> the worth of the chatroom this way in my opinion. Although, I am in >> Australia so the timezone I am in could be a contributor. >> > > I'm also in Australia, but my IRC experience has been different from > yours. I find that the individual, project-specific OpenStack IRC channels > are a great resource, often attended by really helpful experts. > 'openstack-ansible' 'openstack-ironic' 'openstack-qa' etc. > > Also, I keep a teeny tiny VM running in Google's cloud (AU 20 cents a > month) to run quassel core so I have a 24/7 IRC connection to channels I > watch so that people can reply to me while I'm asleep and I can catch up > the next day. > > -- > MC > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Fri Jan 17 04:02:05 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 16 Jan 2020 22:02:05 -0600 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) Message-ID: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> Hello Everyone, This is regarding bug: https://bugs.launchpad.net/tempest/+bug/1860033. Using Radosław's fancy statement of 'EOLing python2 drama' in subject :). neutron tempest plugin job on stable/rocky started failing as neutron-lib dropped the py2. neutron-lib 2.0.0 is py3 only and so does u-c on the master has been updated to 2.0.0. All tempest and its plugin uses the master u-c for stable branch testing which is the valid way because of master Tempest & plugin is being used to test the stable branches which need u-c from master itself. These failed jobs also used master u-c[1] which is trying to install the latest neutron-lib and failing. This is not just neutron tempest plugin issue but for all Tempest plugins jobs. Any lib used by Tempest or plugins can drop the py2 now and leads to this failure. Its just neutron-lib raised the flag first before I plan to hack on Tempest & plugins jobs for py2 drop from master and kepe testing py2 on stable bracnhes. We have two way to fix this: 1. Separate out the testing of python2 jobs with python2 supported version of Tempest plugins and with respective u-c. For example, test all python2 job with tempest plugin train version (or any latest version if any which support py2) and use u-c from stable/train. 
This will cap the Tempest & plugins with respective u-c for stable branches testing. 2. Second option is to install the tempest and plugins in py3 env on py2 jobs also. This should be an easy and preferred way. I am trying this first[2] and testing[3]. [1] https://zuul.opendev.org/t/openstack/build/fb8a928ed3614e09a9a3cf4637f2f6c2/log/job-output.txt#33040 [2] https://review.opendev.org/#/c/703011/ [3] https://review.opendev.org/#/c/703012/ -gmanne From fungi at yuggoth.org Fri Jan 17 04:10:05 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 17 Jan 2020 04:10:05 +0000 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> Message-ID: <20200117041005.cgxggu5wrv3amheh@yuggoth.org> On 2020-01-16 22:02:05 -0600 (-0600), Ghanshyam Mann wrote: [...] > Second option is to install the tempest and plugins in py3 env on > py2 jobs also. This should be an easy and preferred way. [...] This makes more sense anyway. Tempest and its plug-ins are already segregated from the system with a virtualenv due to conflicts with stable branch requirements, so hopefully switching that virtualenv to Python 3.x for all jobs is trivial (but I won't be surprised to learn there are subtle challenges hidden just beneath the surface). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From agarwalvishakha18 at gmail.com Fri Jan 17 06:21:01 2020 From: agarwalvishakha18 at gmail.com (Vishakha Agarwal) Date: Fri, 17 Jan 2020 11:51:01 +0530 Subject: [keystone] Keystone Team Update - Week of 13 January 2020 Message-ID: # Keystone Team Update - Week of 13 January 2020 ## News ### Roadmap Review The Team has decided to review the roadmap every other week so as to keep up the development momentum of the ussuri cycle [1]. [1] https://tree.taiga.io/project/keystone-ussuri-roadmap/kanban ### User Support and Bug Duty Every week the duty is being rotated between the members. The person-in-charge for bug duty for current and upcoming week can be seen on the etherpad [2] [2] https://etherpad.openstack.org/p/keystone-l1-duty ## Open Specs Ussuri specs: https://bit.ly/2XDdpkU Ongoing specs: https://bit.ly/2OyDLTh ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 7 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 36 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ### Priority Reviews * Community Goals https://review.opendev.org/#/c/699127/ [ussuri][goal] Drop python 2.7 support and testing keystone-tempest-plugin https://review.opendev.org/#/c/699119/ [ussuri][goal] Drop python 2.7 support and testing python-keystoneclient * Special Requests https://review.opendev.org/#/c/662734/ Change the default Identity endpoint to internal https://review.opendev.org/#/c/699013/ Always have username in CADF initiator https://review.opendev.org/#/c/700826/ Fix role_assignments role.id filter https://review.opendev.org/#/c/697444/ Adding options to user cli https://review.opendev.org/#/c/702374/ Cleanup doc/requirements.txt ## Bugs This week we opened 2 new bugs and closed 4. 
Bugs opened (2) Bug #1859759 (keystone:Undecided): Keystone is unable to remove role-assignment for deleted LDAP users - Opened by Eigil Obrestad https://bugs.launchpad.net/keystone/+bug/1859759 Bug #1859844 (keystone:Undecided): Impossible to rename the Default domain id to the string 'default.' - Opened by Marcelo Subtil Marcal https://bugs.launchpad.net/keystone/+bug/1859844 Bugs closed (4) Bug #1833207 (keystoneauth:Undecided) https://bugs.launchpad.net/keystoneauth/+bug/1833207 Bug #1858189 (keystoneauth:Undecided) https://bugs.launchpad.net/keystoneauth/+bug/1858189 Bug #1857086 (keystone:Won't Fix) https://bugs.launchpad.net/keystone/+bug/1857086 Bug #1859844 (keystone:Invalid) https://bugs.launchpad.net/keystone/+bug/1859844 ## Milestone Outlook https://releases.openstack.org/ussuri/schedule.html Reminder for Spec freeze as it is on the week of 10 Feburary. ## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From radoslaw.piliszek at gmail.com Fri Jan 17 07:49:32 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 17 Jan 2020 08:49:32 +0100 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: <20200117041005.cgxggu5wrv3amheh@yuggoth.org> References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> <20200117041005.cgxggu5wrv3amheh@yuggoth.org> Message-ID: +1 for py3 in tempest venv. Makes most sense. Though the test is failing now: 2020-01-17 04:30:06.975801 | controller | ERROR: Could not find a version that satisfies the requirement neutron-lib===2.0.0 (from -c u-c-m.txt (line 79)) (from versions: 0.0.1, 0.0.2, 0.0.3, 0.1.0, 0.2.0, 0.3.0, 0.4.0, 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.9.1, 1.9.2, 1.10.0, 1.11.0, 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, 1.19.0, 1.20.0, 1.21.0, 1.22.0, 1.23.0, 1.24.0, 1.25.0, 1.26.0, 1.27.0, 1.28.0, 1.29.0, 1.29.1, 1.30.0, 1.31.0) 2020-01-17 04:30:06.993738 | controller | ERROR: No matching distribution found for neutron-lib===2.0.0 (from -c u-c-m.txt (line 79)) and the reason is: pypi: data-requires-python=">=3.6" 3.5 < 3.6 Need some newer python in there. -yoctozepto pt., 17 sty 2020 o 05:15 Jeremy Stanley napisał(a): > > On 2020-01-16 22:02:05 -0600 (-0600), Ghanshyam Mann wrote: > [...] > > Second option is to install the tempest and plugins in py3 env on > > py2 jobs also. This should be an easy and preferred way. > [...] > > This makes more sense anyway. Tempest and its plug-ins are already > segregated from the system with a virtualenv due to conflicts with > stable branch requirements, so hopefully switching that virtualenv > to Python 3.x for all jobs is trivial (but I won't be surprised to > learn there are subtle challenges hidden just beneath the surface). > -- > Jeremy Stanley From radoslaw.piliszek at gmail.com Fri Jan 17 08:34:09 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 17 Jan 2020 09:34:09 +0100 Subject: DR options with openstack In-Reply-To: References: Message-ID: On #openstack-kolla we, kolla cores, help users (being or becoming OpenStack operators) with kolla-based deployments. 
I observed a nice trend that users give us positive feedback about our support efforts, stick to the channel and help other users as well. That's the spirit. As for ask, I don't have any positive experience with it. Most support efforts that I saw end like the one below: https://ask.openstack.org/en/question/124531/how-to-install-kolla-ansible-with-5-mon/ so it's a bit discouraging. Personally I would vote +1 for archiving the ask, seemingly doing more bad than good these days. -yoctozepto pt., 17 sty 2020 o 05:00 Tony Pearce napisał(a): > > I had not discovered the channels like openstack-ansible. I just googled for the channel list. "openstack" is described as being meant for general questions. When I need to I'll try and use a more specific topic channel. Thanks for the advice. > > For reference, the channel list is https://wiki.openstack.org/wiki/IRC > > > Tony Pearce | Senior Network Engineer / Infrastructure Lead > Cinglevue International > > Email: tony.pearce at cinglevue.com > Web: http://www.cinglevue.com > > Australia > 1 Walsh Loop, Joondalup, WA 6027 Australia. > > Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 > > Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. > > > > > > On Fri, 17 Jan 2020 at 11:41, Mike Carden wrote: >> >> >> >> On Fri, Jan 17, 2020 at 2:25 PM Tony Pearce wrote: >>> >>> With regards to IRC - in my experience, once you get passed the authentication problems and often session timeout/kick out, you see the chat room with 300 people but no one chatting or answering. Kind of reduces the worth of the chatroom this way in my opinion. Although, I am in Australia so the timezone I am in could be a contributor. >> >> >> I'm also in Australia, but my IRC experience has been different from yours. I find that the individual, project-specific OpenStack IRC channels are a great resource, often attended by really helpful experts. 'openstack-ansible' 'openstack-ironic' 'openstack-qa' etc. >> >> Also, I keep a teeny tiny VM running in Google's cloud (AU 20 cents a month) to run quassel core so I have a 24/7 IRC connection to channels I watch so that people can reply to me while I'm asleep and I can catch up the next day. >> >> -- >> MC >> >> >> From tony.pearce at cinglevue.com Fri Jan 17 09:56:25 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Fri, 17 Jan 2020 17:56:25 +0800 Subject: Cinder snapshot delete successful when expected to fail Message-ID: Could anyone help by pointing me where to go to be able to dig into this issue further? I have installed a test Openstack environment using RDO Packstack. I wanted to install the same version that I have in Production (Pike) but it's not listed in the CentOS repo via yum search. So I installed Queens. I am using nimble.py Cinder driver. Nimble Storage is a storage array accessed via iscsi from the Openstack host, and is controlled from Openstack by the driver and API. *What I expected to happen:* 1. create an instance with volume (the volume is created on the storage array successfully and instance boots from it) 2. take a snapshot (snapshot taken on the volume on the array successfully) 3. 
create a new instance from the snapshot (the api tells the array to clone the snapshot into a new volume on the array and use that volume for the instance) 4. try and delete the snapshot Expected Result - Openstack gives the user a message like "you're not allowed to do that". Note: Step 3 above creates a child volume from the parent snapshot. It's impossible to delete the parent snapshot because IO READ is sent to that part of the original volume (as I understand it). *My production problem is this: * 1. create an instance with volume (the volume is created on the storage array successfully) 2. take a snapshot (snapshot taken on the volume on the array successfully) 3. create a new instance from the snapshot (the api tells the array to clone the snapshot into a new volume on the array and use that volume for the instance) 4. try and delete the snapshot Result - snapshot goes into error state and later, all Cinder operations fail such as new instance/create volume etc. until the correct service is restarted. Then everything works once again. To troubleshoot the above, I installed the RDP Packstack Queens (because I couldnt get Pike). I tested the above and now, the result is the snapshot is successfully deleted from openstack but not deleted on the array. The log is below for reference. But I can see the in the log that the array sends back info to openstack saying the snapshot has a clone and the delete cannot be done because of that. Also response code 409. *Some info about why the problem with Pike started in the first place* 1. Vendor is Nimble Storage which HPE purchased 2. HPE/Nimble have dropped support for openstack. Latest supported version is Queens and Nimble array version v4.x. The current Array version is v5.x. Nimble say there are no guarantees with openstack, the driver and the array version v5.x 3. I was previously advised by Nimble that the array version v5.x will work fine and so left our DR array on v5.x with a pending upgrade that had a blocker due to an issue. This issue was resolved in December and the pending upgrade completed to match the DR array took place around 30 days ago. With regards to the production issue, I assumed that the array API has some changes between v4.x and v5.x and it's causing an issue with Cinder due to the API response. Although I have not been able to find out if or what changes there are that may have occurred after the array upgrade, as the documentation for this is Nimble internal-only. *So with that - some questions if I may:* When Openstack got the 409 error response from the API (as seen in the log below), why would Openstack then proceed to delete the snapshot on the Openstack side? How could I debug this further? I'm not sure what Openstack Cinder is acting on in terns of the response as yet. Maybe Openstack is not specifically looking for the error code in the response? The snapshot that got deleted on the openstack side is a problem. Would this be related to the driver? Could it be possible that the driver did not pass the error response to Cinder? Thanks in advance. Just for reference, the log snippet is below. ==> volume.log <== > 2020-01-17 16:53:23.718 24723 WARNING py.warnings > [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 > 87e34c89e6fb41d2af25085b64011a55 - default default] > /usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:852: > InsecureRequestWarning: Unverified HTTPS request is being made. Adding > certificate verification is strongly advised. 
See: > https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings > InsecureRequestWarning) > : NimbleAPIException: Failed to execute api > snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 > Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume > volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. > ==> api.log <== > 2020-01-17 16:53:23.769 25242 INFO cinder.api.openstack.wsgi > [req-bfcbff34-134b-497e-82b1-082d48f8767f df7548ecad684f26b8bc802ba63a9814 > 87e34c89e6fb41d2af25085b64011a55 - default default] > http://192.168.53.45:8776/v3/87e34c89e6fb41d2af25085b64011a55/volumes/detail > returned with HTTP 200 > 2020-01-17 16:53:23.770 25242 INFO eventlet.wsgi.server > [req-bfcbff34-134b-497e-82b1-082d48f8767f df7548ecad684f26b8bc802ba63a9814 > 87e34c89e6fb41d2af25085b64011a55 - default default] 192.168.53.45 "GET > /v3/87e34c89e6fb41d2af25085b64011a55/volumes/detail HTTP/1.1" status: 200 > len: 4657 time: 0.1152730 > ==> volume.log <== > 2020-01-17 16:53:23.811 24723 WARNING py.warnings > [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 > 87e34c89e6fb41d2af25085b64011a55 - default default] > /usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:852: > InsecureRequestWarning: Unverified HTTPS request is being made. Adding > certificate verification is strongly advised. See: > https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings > InsecureRequestWarning) > : NimbleAPIException: Failed to execute api > snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 > Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume > volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. > 2020-01-17 16:53:23.902 24723 ERROR cinder.volume.drivers.nimble > [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 > 87e34c89e6fb41d2af25085b64011a55 - default default] Re-throwing Exception > Failed to execute api snapshots/0464a5fd65d75fcfe1000000000000011100001a41: > Error Code: 409 Message: Snapshot > snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume > volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone.: > NimbleAPIException: Failed to execute api > snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 > Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume > volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. > 2020-01-17 16:53:23.903 24723 WARNING cinder.volume.drivers.nimble > [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 > 87e34c89e6fb41d2af25085b64011a55 - default default] Snapshot > snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 : has a clone: > NimbleAPIException: Failed to execute api > snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 > Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume > volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. > 2020-01-17 16:53:23.964 24723 WARNING cinder.quota > [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 > 87e34c89e6fb41d2af25085b64011a55 - default default] Deprecated: Default > quota for resource: snapshots_Nimble-DR is set by the default quota flag: > quota_snapshots_Nimble-DR, it is now deprecated. Please use the default > quota class for default quota. 
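(As a possible next debugging step for the question above about digging further: replaying the delete against the array's REST API directly, outside of Cinder, would show the raw 409 response. In the sketch below the management URL, port and auth header name are placeholders and purely assumptions; only the snapshot path is taken from the log.)

```python
# Hypothetical sketch for replaying the failing call outside of Cinder.
# ARRAY_URL, the port and the auth header are placeholders (assumptions,
# check the array's REST API documentation); only the snapshot path below
# comes from the log.
import requests

ARRAY_URL = "https://array-mgmt-address:5392/v1"  # placeholder management endpoint
TOKEN = "replace-with-a-real-session-token"       # placeholder credential
SNAP_PATH = "snapshots/0464a5fd65d75fcfe1000000000000011100001a41"  # from the log

resp = requests.delete("{}/{}".format(ARRAY_URL, SNAP_PATH),
                       headers={"X-Auth-Token": TOKEN},  # placeholder header name
                       verify=False)  # matches the InsecureRequestWarning above
print(resp.status_code)  # expected 409 while the snapshot still has a clone
print(resp.text)
```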
> 2020-01-17 16:53:24.054 24723 INFO cinder.volume.manager > [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 > 87e34c89e6fb41d2af25085b64011a55 - default default] Delete snapshot > completed successfully. Regards, *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bcafarel at redhat.com Fri Jan 17 10:14:49 2020 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Fri, 17 Jan 2020 11:14:49 +0100 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: <20200117041005.cgxggu5wrv3amheh@yuggoth.org> References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> <20200117041005.cgxggu5wrv3amheh@yuggoth.org> Message-ID: On Fri, 17 Jan 2020 at 05:11, Jeremy Stanley wrote: > On 2020-01-16 22:02:05 -0600 (-0600), Ghanshyam Mann wrote: > [...] > > Second option is to install the tempest and plugins in py3 env on > > py2 jobs also. This should be an easy and preferred way. > [...] > > This makes more sense anyway. Tempest and its plug-ins are already > segregated from the system with a virtualenv due to conflicts with > stable branch requirements, so hopefully switching that virtualenv > to Python 3.x for all jobs is trivial (but I won't be surprised to > learn there are subtle challenges hidden just beneath the surface). > That sounds good for supported releases. Once we have them back in working order, I wonder how it will turn out for queens. In neutron, there is a recent failure [1] as this EM branch now uses a pinned version of the plugin. The fix there is most likely to also pin tempest - to queens-em [2] but then will also require some fix for the EOLing python2 drama. As tempest is branchless, it looks like if we want to keep neutron-tempest-plugin tests for queens we will rather need solution 1 for this branch? (but let's focus first on getting the supported branches back in working order) [1] https://bugs.launchpad.net/neutron/+bug/1859988 [2] https://review.opendev.org/702868 -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Fri Jan 17 11:01:37 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Fri, 17 Jan 2020 12:01:37 +0100 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> <20200117041005.cgxggu5wrv3amheh@yuggoth.org> Message-ID: <3DCDAE2D-4368-4A0B-BF8B-7AF4BA729055@redhat.com> Hi, > On 17 Jan 2020, at 11:14, Bernard Cafarelli wrote: > > On Fri, 17 Jan 2020 at 05:11, Jeremy Stanley wrote: > On 2020-01-16 22:02:05 -0600 (-0600), Ghanshyam Mann wrote: > [...] 
> > Second option is to install the tempest and plugins in py3 env on > > py2 jobs also. This should be an easy and preferred way. > [...] > > This makes more sense anyway. Tempest and its plug-ins are already > segregated from the system with a virtualenv due to conflicts with > stable branch requirements, so hopefully switching that virtualenv > to Python 3.x for all jobs is trivial (but I won't be surprised to > learn there are subtle challenges hidden just beneath the surface). > > That sounds good for supported releases. Once we have them back in working order, I wonder how it will turn out for queens. > In neutron, there is a recent failure [1] as this EM branch now uses a pinned version of the plugin. The fix there is most likely to also pin tempest - to queens-em [2] but then will also require some fix for the EOLing python2 drama. But if we will use for queens branch tempest pinned to queens-em tag, we shouldn’t have any such problems there as all requirements will be also used from queens branch, or am I missing something here? > > As tempest is branchless, it looks like if we want to keep neutron-tempest-plugin tests for queens we will rather need solution 1 for this branch? (but let's focus first on getting the supported branches back in working order) > > [1] https://bugs.launchpad.net/neutron/+bug/1859988 > [2] https://review.opendev.org/702868 > > > -- > Bernard Cafarelli — Slawek Kaplonski Senior software engineer Red Hat From victoria at vmartinezdelacruz.com Fri Jan 17 11:43:49 2020 From: victoria at vmartinezdelacruz.com (=?UTF-8?Q?Victoria_Mart=C3=ADnez_de_la_Cruz?=) Date: Fri, 17 Jan 2020 08:43:49 -0300 Subject: Rails Girls Summer of Code In-Reply-To: References: Message-ID: Hi Amy, This is great! How is that agnostic? IIRC it was all related to Ruby on Rails projects? How OpenStack can join this effort? Thanks, V On Thu, Jan 16, 2020 at 9:55 AM Amy Marrich wrote: > Hi All, > > I was contacted about this program to see if OpenStack might be interested > in participating and despite the name it is language agnostic. Moe > information on the program can be found at Rails Girls Summer of Code, > > > I'm willing to help organize our efforts but would need to know level of > interest to participate and mentor. > > Thanks, > > Amy (spotz) > Chair, Diversity and Inclusion WG > Chair, User Committee > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bcafarel at redhat.com Fri Jan 17 11:51:28 2020 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Fri, 17 Jan 2020 12:51:28 +0100 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: <3DCDAE2D-4368-4A0B-BF8B-7AF4BA729055@redhat.com> References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> <20200117041005.cgxggu5wrv3amheh@yuggoth.org> <3DCDAE2D-4368-4A0B-BF8B-7AF4BA729055@redhat.com> Message-ID: On Fri, 17 Jan 2020 at 12:01, Slawek Kaplonski wrote: > Hi, > > > On 17 Jan 2020, at 11:14, Bernard Cafarelli wrote: > > > > On Fri, 17 Jan 2020 at 05:11, Jeremy Stanley wrote: > > On 2020-01-16 22:02:05 -0600 (-0600), Ghanshyam Mann wrote: > > [...] > > > Second option is to install the tempest and plugins in py3 env on > > > py2 jobs also. This should be an easy and preferred way. > > [...] > > > > This makes more sense anyway. 
Tempest and its plug-ins are already > > segregated from the system with a virtualenv due to conflicts with > > stable branch requirements, so hopefully switching that virtualenv > > to Python 3.x for all jobs is trivial (but I won't be surprised to > > learn there are subtle challenges hidden just beneath the surface). > > > > That sounds good for supported releases. Once we have them back in > working order, I wonder how it will turn out for queens. > > In neutron, there is a recent failure [1] as this EM branch now uses a > pinned version of the plugin. The fix there is most likely to also pin > tempest - to queens-em [2] but then will also require some fix for the > EOLing python2 drama. > > But if we will use for queens branch tempest pinned to queens-em tag, we > shouldn’t have any such problems there as all requirements will be also > used from queens branch, or am I missing something here? > Sadly not, from what I read in attempt [1] to limit neutron-lib to "old" version. And I see the same error in a test run with pinned tempest [2]: 2020-01-16 14:44:18.741517 | controller | 2020-01-16 14:44:18.741 | Collecting neutron-lib===2.0.0 (from -c u-c-m.txt (line 79)) 2020-01-16 14:44:19.023699 | controller | 2020-01-16 14:44:19.023 | Could not find a version that satisfies the requirement neutron-lib===2.0.0 (from -c u-c-m.txt (line 79)) (from versions: 0.0.1, 0.0.2, 0.0.3, 0.1.0, 0.2.0, 0.3.0, 0.4.0, 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.9.1, 1.9.2, 1.10.0, 1.11.0, 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, 1.19.0, 1.20.0, 1.21.0, 1.22.0, 1.23.0, 1.24.0, 1.25.0, 1.26.0, 1.27.0, 1.28.0, 1.29.0, 1.29.1, 1.30.0, 1.31.0) 2020-01-16 14:44:19.042505 | controller | 2020-01-16 14:44:19.042 | No matching distribution found for neutron-lib===2.0.0 (from -c u-c-m.txt (line 79)) [1] https://review.opendev.org/702986/ [2] https://review.opendev.org/#/c/701900/ https://zuul.opendev.org/t/openstack/build/ee8021c1470a4fb88f55d64cc16ed15e > > > > > As tempest is branchless, it looks like if we want to keep > neutron-tempest-plugin tests for queens we will rather need solution 1 for > this branch? (but let's focus first on getting the supported branches back > in working order) > > > > [1] https://bugs.launchpad.net/neutron/+bug/1859988 > > [2] https://review.opendev.org/702868 > -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From waboring at hemna.com Fri Jan 17 13:10:34 2020 From: waboring at hemna.com (Walter Boring) Date: Fri, 17 Jan 2020 08:10:34 -0500 Subject: DR options with openstack In-Reply-To: <5e201295.1c69fb81.a69b.d77d@mx.google.com> References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> Message-ID: Hi Tony, Looking at the nimble driver, it has been removed from Cinder due to lack of support and maintenance from the vendor. Also, Looking at the code prior to it's removal, it didn't have any support for replication and failover. Cinder is a community based opensource project that relies on vendors, operators and users to contribute and support the codebase. As a core member of the Cinder team, we do our best to provide support for folks using Cinder and this mailing list and the #openstack-cinder channel is the best mechanism to get in touch with us. The #openstack-cinder irc channel is not a developer only channel. We help when we can, but also remember we have our day jobs as well. 
Unfortunately Nimble stopped providing support for their driver quite a while ago now and part of the Cinder policy to have a driver in tree is to have CI (Continuous Integration) tests in place to ensure that cinder patches don't break a driver. If the CI isn't in place, then the Cinder team marks the driver as unsupported in a release, and the following release the driver gets removed. All that being said, the nimbe driver never supported the cheesecake replication/DR capabilities that were added in Cinder. Walt (hemna in irc) On Thu, Jan 16, 2020 at 2:49 AM Tony Pearce wrote: > Hi all > > > > My questions are; > > > > 1. How are people using iSCSI Cinder storage with Openstack to-date? > For example a Nimble Storage array backend. I mean to say, are people using > backend integration drivers for other hardware (like netapp)? Or are they > using backend iscsi for example? > 2. How are people managing DR with Openstack in terms of backend > storage replication to another array in another location and continuing to > use Openstack? > > > > The environment which I am currently using; > > 1 x Nimble Storage array (iSCSI) with nimble.py Cinder driver > > 1 x virtualised Controller node > > 2 x physical compute nodes > > This is Openstack Pike. > > > > In addition, I have a 2nd Nimble Storage array in another location. > > > > To explain the questions I’d like to put forward my thoughts for question > 2 first: > > For point 2 above, I have been searching for a way to utilise replicated > volumes on the 2nd array from Openstack with existing instances. For > example, if site 1 goes down how would I bring up openstack in the 2nd > location and boot up the instances where their volumes are stored on the 2 > nd array. I found a proposal for something called “cheesecake” ref: > https://specs.openstack.org/openstack/cinder-specs/specs/rocky/cheesecake-promote-backend.html > But I could not find if it had been approved or implemented. So I return > to square 1. I have some thoughts about failing over the controller VM and > compute node but I don’t think there’s any need to go into here because of > the above blocker and for brevity anyway. > > > > The nimble.py driver which I am using came with Openstack Pike and it > appears Nimble / HPE are not maintaining it any longer. I saw a commit to > remove nimble.py in Openstack Train release. The driver uses the REST API > to perform actions on the array. Such as creating a volume, downloading the > image, mounting the volume to the instance, snapshots, clones etc. This is > great for me because to date I have around 10TB of openstack storage data > allocated and the Nimble array shows the amount of data being consumed is > <900GB. This is due to the compression and zero-byte snapshots and clones. > > > > So coming back to question 2 – is it possible? Can you drop me some > keywords that I can search for such as an Openstack component like > Cheesecake? I think basically what I am looking for is a supported way of > telling Openstack that the instance volumes are now located at the new / > second array. This means a new cinder backend. Example, new iqn, IP > address, volume serial number. I think I could probably hack the cinder db > but I really want to avoid that. > > > > So failing the above, it brings me to the question 1 I asked before. How > are people using Cinder volumes? May be I am going about this the wrong way > and need to take a few steps backwards to go forwards? I need storage to be > able to deploy instances onto. 
Snapshots and clones are desired. At the > moment these operations take less time than the horizon dashboard takes to > load because of the waiting API responses. > > > > When searching for information about the above as an end-user / consumer I > get a bit concerned. Is it right that Openstack usage is dropping? There’s > no web forum to post questions. The chatroom on freenode is filled with > ~300 ghosts. Ask Openstack questions go without response. Earlier this week > (before I found this mail list) I had to use facebook to report that the > Openstack.org website had been hacked. Basically it seems that if you’re a > developer that can write code then you’re in but that’s it. I have never > been a coder and so I am somewhat stuck. > > > > Thanks in advance > > > > Sent from Mail for > Windows 10 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Fri Jan 17 13:13:46 2020 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 17 Jan 2020 14:13:46 +0100 Subject: [largescale-sig] Meeting summary and next actions Message-ID: Hi everyone, The Large Scale SIG held a meeting earlier this week. Thanks to belmiro for chairing it! You can access the summary and logs of the meeting at: http://eavesdrop.openstack.org/meetings/large_scale_sig/2020/large_scale_sig.2020-01-15-09.00.html For the "Scaling within one cluster, and instrumentation of the bottlenecks" goal, I created a ML thread and etherpad to collect user stories, so far without much success. masahito is still working on the draft for oslo.metrics, hopefully will be ready by end of January. [1] http://lists.openstack.org/pipermail/openstack-discuss/2020-January/011925.html [2] https://etherpad.openstack.org/p/scaling-stories Standing TODOs: - all post short descriptions of what happens (what breaks first) when scaling up a single cluster to https://etherpad.openstack.org/p/scaling-stories - masahito to produce first draft for the oslo.metric blueprint - all learn more about golden signals concept as described in https://landing.google.com/sre/book.html For the "Document large scale configuration and tips &tricks" goal, amorin started a thread[3] and etherpad[4] on documenting configuration defaults for large scale, to which slaweq contributed for Neutron. [3] http://lists.openstack.org/pipermail/openstack-discuss/2020-January/011820.html [4] https://etherpad.openstack.org/p/large-scale-sig-documentation Standing TODOs: - oneswig to follow up with Scientific community to find articles around large scale openstack The next meeting will happen on January 29, at 9:00 UTC on #openstack-meeting. They will happen from now on every two weeks. Cheers, -- Thierry Carrez (ttx) From gmann at ghanshyammann.com Fri Jan 17 13:19:54 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 17 Jan 2020 07:19:54 -0600 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> <20200117041005.cgxggu5wrv3amheh@yuggoth.org> Message-ID: <16fb3a8fd5e.12771fa0546403.7013538671378206138@ghanshyammann.com> ---- On Fri, 17 Jan 2020 04:14:49 -0600 Bernard Cafarelli wrote ---- > On Fri, 17 Jan 2020 at 05:11, Jeremy Stanley wrote: > On 2020-01-16 22:02:05 -0600 (-0600), Ghanshyam Mann wrote: > [...] > > Second option is to install the tempest and plugins in py3 env on > > py2 jobs also. This should be an easy and preferred way. 
> [...] > > This makes more sense anyway. Tempest and its plug-ins are already > segregated from the system with a virtualenv due to conflicts with > stable branch requirements, so hopefully switching that virtualenv > to Python 3.x for all jobs is trivial (but I won't be surprised to > learn there are subtle challenges hidden just beneath the surface). > > That sounds good for supported releases. Once we have them back in working order, I wonder how it will turn out for queens.In neutron, there is a recent failure [1] as this EM branch now uses a pinned version of the plugin. The fix there is most likely to also pin tempest - to queens-em [2] but then will also require some fix for the EOLing python2 drama. > As tempest is branchless, it looks like if we want to keep neutron-tempest-plugin tests for queens we will rather need solution 1 for this branch? (but let's focus first on getting the supported branches back in working order) Yes, for EM branch we need to apply the options#1. Tempest does not support EM branches and we will keep using Tempest master as long as it keeps passing. If it fails due to test incompatibility or any code behaviour, then we need to pin the Tempest. We did this for Ocata[1] and Pike[2]. But for phyton 2.7 drop case, we will use py3 env if possible to test the stable branch until failing due to other reasons then cap it. Currently, we support the Tempest pin by TEEMPEST_BRANCH but no way to pin Tempest Plugins which need some logic in devstack side to pick up the plugin tag from job. [1] https://review.opendev.org/#/c/681950/ [2] https://review.opendev.org/#/c/684769/ -gmann > [1] https://bugs.launchpad.net/neutron/+bug/1859988 [2] https://review.opendev.org/702868 > > -- > Bernard Cafarelli > From amy at demarco.com Fri Jan 17 13:23:50 2020 From: amy at demarco.com (Amy Marrich) Date: Fri, 17 Jan 2020 07:23:50 -0600 Subject: Rails Girls Summer of Code In-Reply-To: References: Message-ID: Victoria, I thought it was related to Ruby on Rails as well until I found the following on their site: Rails Girls Summer of Code is programming language agnostic, and students have contributed to an overall of 76 unique Open Source projects such as Bundler, Rails, Discourse, Tessel, NextCloud, Processing, Babel, impress.js, Lektor CMS, Hoodie, Speakerinnen, Lotus (now Hanami) and Servo. Maybe they've changed as the name is misleading when compared to that statement. So if OpenStack wanted to get involved we would submit an application and have some mentors/projects lined up similar to Outreachy and Google Summer of Cone. Thanks, Amy (spotz) On Fri, Jan 17, 2020 at 5:44 AM Victoria Martínez de la Cruz < victoria at vmartinezdelacruz.com> wrote: > Hi Amy, > > This is great! > > How is that agnostic? IIRC it was all related to Ruby on Rails projects? > How OpenStack can join this effort? > > Thanks, > > V > > On Thu, Jan 16, 2020 at 9:55 AM Amy Marrich wrote: > >> Hi All, >> >> I was contacted about this program to see if OpenStack might be >> interested in participating and despite the name it is language agnostic. >> Moe information on the program can be found at Rails Girls Summer of Code >> , >> >> I'm willing to help organize our efforts but would need to know level of >> interest to participate and mentor. >> >> Thanks, >> >> Amy (spotz) >> Chair, Diversity and Inclusion WG >> Chair, User Committee >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gmann at ghanshyammann.com Fri Jan 17 13:31:24 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 17 Jan 2020 07:31:24 -0600 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> <20200117041005.cgxggu5wrv3amheh@yuggoth.org> <3DCDAE2D-4368-4A0B-BF8B-7AF4BA729055@redhat.com> Message-ID: <16fb3b385b4.cbec982347026.4712409073911949240@ghanshyammann.com> ---- On Fri, 17 Jan 2020 05:51:28 -0600 Bernard Cafarelli wrote ---- > On Fri, 17 Jan 2020 at 12:01, Slawek Kaplonski wrote: > Hi, > > > On 17 Jan 2020, at 11:14, Bernard Cafarelli wrote: > > > > On Fri, 17 Jan 2020 at 05:11, Jeremy Stanley wrote: > > On 2020-01-16 22:02:05 -0600 (-0600), Ghanshyam Mann wrote: > > [...] > > > Second option is to install the tempest and plugins in py3 env on > > > py2 jobs also. This should be an easy and preferred way. > > [...] > > > > This makes more sense anyway. Tempest and its plug-ins are already > > segregated from the system with a virtualenv due to conflicts with > > stable branch requirements, so hopefully switching that virtualenv > > to Python 3.x for all jobs is trivial (but I won't be surprised to > > learn there are subtle challenges hidden just beneath the surface). > > > > That sounds good for supported releases. Once we have them back in working order, I wonder how it will turn out for queens. > > In neutron, there is a recent failure [1] as this EM branch now uses a pinned version of the plugin. The fix there is most likely to also pin tempest - to queens-em [2] but then will also require some fix for the EOLing python2 drama. > > But if we will use for queens branch tempest pinned to queens-em tag, we shouldn’t have any such problems there as all requirements will be also used from queens branch, or am I missing something here? > Sadly not, from what I read in attempt [1] to limit neutron-lib to "old" version. And I see the same error in a test run with pinned tempest [2]:2020-01-16 14:44:18.741517 | controller | 2020-01-16 14:44:18.741 | Collecting neutron-lib===2.0.0 (from -c u-c-m.txt (line 79)) > 2020-01-16 14:44:19.023699 | controller | 2020-01-16 14:44:19.023 | Could not find a version that satisfies the requirement neutron-lib===2.0.0 (from -c u-c-m.txt (line 79)) (from versions: 0.0.1, 0.0.2, 0.0.3, 0.1.0, 0.2.0, 0.3.0, 0.4.0, 1.0.0, 1.1.0, 1.2.0, 1.3.0, 1.4.0, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.9.1, 1.9.2, 1.10.0, 1.11.0, 1.12.0, 1.13.0, 1.14.0, 1.15.0, 1.16.0, 1.17.0, 1.18.0, 1.19.0, 1.20.0, 1.21.0, 1.22.0, 1.23.0, 1.24.0, 1.25.0, 1.26.0, 1.27.0, 1.28.0, 1.29.0, 1.29.1, 1.30.0, 1.31.0) > 2020-01-16 14:44:19.042505 | controller | 2020-01-16 14:44:19.042 | No matching distribution found for neutron-lib===2.0.0 (from -c u-c-m.txt (line 79)) Yes, Temepst venv uses the upper constraint from master always. We need to cap the u-c also accordingly. I did not do for Ocata/Pike case which I will fix as they can start failing any time. For Tempest, it is straight forward where u-c has to be used of the branch corresponding to pin Tempest. But for plugins, it is complex. All tempest plugins are being installed one by one un single logic in devstack so using different constraint for different plugins might not be possible (there should not be much cases like that where job tests more than one plugins tests but there are few). 
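The failure quoted above comes from the package metadata itself: neutron-lib 2.0.0 publishes Requires-Python >=3.6, so a constraints file that pins neutron-lib===2.0.0 can never be satisfied from a py35 (or py27) virtualenv, however Tempest itself is installed. A quick way to confirm what a given release allows is to ask PyPI for its metadata; a rough sketch, not something the jobs themselves run:

```python
# Rough sketch: look up the Requires-Python marker that pip enforces for a
# pinned release, via PyPI's public JSON API.
import json
import urllib.request


def requires_python(package, version):
    url = "https://pypi.org/pypi/{}/{}/json".format(package, version)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))["info"].get("requires_python")


if __name__ == "__main__":
    # neutron-lib 2.0.0 declares ">=3.6", which is why the py35-based venv
    # in the quoted run is only offered releases up to 1.31.0.
    print(requires_python("neutron-lib", "2.0.0"))
```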
Best possible solution I can think of is to cap all the Tempest plugins together with Tempest and use corresponding stable branch u-c. Or we modify devstack logic with if-else condition for plugins require cap and rest else will be master. Any other thought? -gmann > > [1] https://review.opendev.org/702986/[2] https://review.opendev.org/#/c/701900/ https://zuul.opendev.org/t/openstack/build/ee8021c1470a4fb88f55d64cc16ed15e > > > > As tempest is branchless, it looks like if we want to keep neutron-tempest-plugin tests for queens we will rather need solution 1 for this branch? (but let's focus first on getting the supported branches back in working order) > > > > [1] https://bugs.launchpad.net/neutron/+bug/1859988 > > [2] https://review.opendev.org/702868 > > > -- > Bernard Cafarelli > From tony.pearce at cinglevue.com Fri Jan 17 13:44:37 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Fri, 17 Jan 2020 21:44:37 +0800 Subject: DR options with openstack In-Reply-To: References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> Message-ID: Hi Walter Thank you for the information. It's unfortunate about the lack of support from nimble. With regards to replication, nimble has their own software implementation that I'm currently using. The problem I face is that the replicated volumes have a different iqn, serial number and are accessed via a different array IP. I didn't get time to read up on freezer today but I'm hopeful that I can use something there. 🙂 On Fri, 17 Jan 2020, 21:10 Walter Boring, wrote: > Hi Tony, > Looking at the nimble driver, it has been removed from Cinder due to > lack of support and maintenance from the vendor. Also, > Looking at the code prior to it's removal, it didn't have any support for > replication and failover. Cinder is a community based opensource project > that relies on vendors, operators and users to contribute and support the > codebase. As a core member of the Cinder team, we do our best to provide > support for folks using Cinder and this mailing list and the > #openstack-cinder channel is the best mechanism to get in touch with us. > The #openstack-cinder irc channel is not a developer only channel. We > help when we can, but also remember we have our day jobs as well. > > Unfortunately Nimble stopped providing support for their driver quite a > while ago now and part of the Cinder policy to have a driver in tree is to > have CI (Continuous Integration) tests in place to ensure that cinder > patches don't break a driver. If the CI isn't in place, then the Cinder > team marks the driver as unsupported in a release, and the following > release the driver gets removed. > > All that being said, the nimbe driver never supported the cheesecake > replication/DR capabilities that were added in Cinder. > > Walt (hemna in irc) > > On Thu, Jan 16, 2020 at 2:49 AM Tony Pearce > wrote: > >> Hi all >> >> >> >> My questions are; >> >> >> >> 1. How are people using iSCSI Cinder storage with Openstack to-date? >> For example a Nimble Storage array backend. I mean to say, are people using >> backend integration drivers for other hardware (like netapp)? Or are they >> using backend iscsi for example? >> 2. How are people managing DR with Openstack in terms of backend >> storage replication to another array in another location and continuing to >> use Openstack? 
>> >> >> >> The environment which I am currently using; >> >> 1 x Nimble Storage array (iSCSI) with nimble.py Cinder driver >> >> 1 x virtualised Controller node >> >> 2 x physical compute nodes >> >> This is Openstack Pike. >> >> >> >> In addition, I have a 2nd Nimble Storage array in another location. >> >> >> >> To explain the questions I’d like to put forward my thoughts for question >> 2 first: >> >> For point 2 above, I have been searching for a way to utilise replicated >> volumes on the 2nd array from Openstack with existing instances. For >> example, if site 1 goes down how would I bring up openstack in the 2nd >> location and boot up the instances where their volumes are stored on the 2 >> nd array. I found a proposal for something called “cheesecake” ref: >> https://specs.openstack.org/openstack/cinder-specs/specs/rocky/cheesecake-promote-backend.html >> But I could not find if it had been approved or implemented. So I return >> to square 1. I have some thoughts about failing over the controller VM and >> compute node but I don’t think there’s any need to go into here because of >> the above blocker and for brevity anyway. >> >> >> >> The nimble.py driver which I am using came with Openstack Pike and it >> appears Nimble / HPE are not maintaining it any longer. I saw a commit to >> remove nimble.py in Openstack Train release. The driver uses the REST API >> to perform actions on the array. Such as creating a volume, downloading the >> image, mounting the volume to the instance, snapshots, clones etc. This is >> great for me because to date I have around 10TB of openstack storage data >> allocated and the Nimble array shows the amount of data being consumed is >> <900GB. This is due to the compression and zero-byte snapshots and clones. >> >> >> >> So coming back to question 2 – is it possible? Can you drop me some >> keywords that I can search for such as an Openstack component like >> Cheesecake? I think basically what I am looking for is a supported way of >> telling Openstack that the instance volumes are now located at the new / >> second array. This means a new cinder backend. Example, new iqn, IP >> address, volume serial number. I think I could probably hack the cinder db >> but I really want to avoid that. >> >> >> >> So failing the above, it brings me to the question 1 I asked before. How >> are people using Cinder volumes? May be I am going about this the wrong way >> and need to take a few steps backwards to go forwards? I need storage to be >> able to deploy instances onto. Snapshots and clones are desired. At the >> moment these operations take less time than the horizon dashboard takes to >> load because of the waiting API responses. >> >> >> >> When searching for information about the above as an end-user / consumer >> I get a bit concerned. Is it right that Openstack usage is dropping? >> There’s no web forum to post questions. The chatroom on freenode is filled >> with ~300 ghosts. Ask Openstack questions go without response. Earlier this >> week (before I found this mail list) I had to use facebook to report that >> the Openstack.org website had been hacked. Basically it seems that if >> you’re a developer that can write code then you’re in but that’s it. I have >> never been a coder and so I am somewhat stuck. >> >> >> >> Thanks in advance >> >> >> >> Sent from Mail for >> Windows 10 >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ionut at fleio.com Fri Jan 17 13:54:12 2020 From: ionut at fleio.com (Ionut Biru) Date: Fri, 17 Jan 2020 15:54:12 +0200 Subject: [magnum] subnet created in public network? Message-ID: Hello, I'm using magnum 9.2.0 and while trying to experiment with this version, i was finding out that while deploying the cluster, heat creates the subnet into the public network. In the past, on rocky and stein, magnum/heat was creating a new network, with a router and an port within the public network for connectivity. I was wondering, if this is the expected behavior (subnet in public network). How do I revert to the old way of having new network? -- Ionut Biru - https://fleio.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Fri Jan 17 14:12:27 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Fri, 17 Jan 2020 09:12:27 -0500 Subject: [cinder][ops] new driver/new target driver merge deadline Message-ID: <18aa16da-10bc-a9db-1caf-96be65530f9c@gmail.com> Greetings to anyone developing a new Cinder driver (or target driver), or anyone trying to get someone to develop such a driver, This is a reminder that the deadline for merging a new backend driver or a new target driver to Cinder for the Ussuri release is the Ussuri-2 milestone on 13 February 2020 (23:59 UTC). New drivers must be (a) code complete including unit tests, (b) merged into the code repository, and (c) must have a 3rd Party CI running reliably. (The idea is that new drivers will be included in a release at the second milestone and thus be easily available for downstream testing, documentation feedback, etc.) You can find more information about Cinder drivers here: https://docs.openstack.org/cinder/latest/drivers-all-about.html and you can ask questions in #openstack-cinder on IRC or here on the mailing list. cheers, brian From abishop at redhat.com Fri Jan 17 14:17:30 2020 From: abishop at redhat.com (Alan Bishop) Date: Fri, 17 Jan 2020 06:17:30 -0800 Subject: Cinder snapshot delete successful when expected to fail In-Reply-To: References: Message-ID: On Fri, Jan 17, 2020 at 2:01 AM Tony Pearce wrote: > Could anyone help by pointing me where to go to be able to dig into this > issue further? > > I have installed a test Openstack environment using RDO Packstack. I > wanted to install the same version that I have in Production (Pike) but > it's not listed in the CentOS repo via yum search. So I installed Queens. I > am using nimble.py Cinder driver. Nimble Storage is a storage array > accessed via iscsi from the Openstack host, and is controlled from > Openstack by the driver and API. > > *What I expected to happen:* > 1. create an instance with volume (the volume is created on the storage > array successfully and instance boots from it) > 2. take a snapshot (snapshot taken on the volume on the array > successfully) > 3. create a new instance from the snapshot (the api tells the array to > clone the snapshot into a new volume on the array and use that volume for > the instance) > 4. try and delete the snapshot > Expected Result - Openstack gives the user a message like "you're not > allowed to do that". > > Note: Step 3 above creates a child volume from the parent snapshot. It's > impossible to delete the parent snapshot because IO READ is sent to that > part of the original volume (as I understand it). > > *My production problem is this: * > 1. create an instance with volume (the volume is created on the storage > array successfully) > 2. 
take a snapshot (snapshot taken on the volume on the array > successfully) > 3. create a new instance from the snapshot (the api tells the array to > clone the snapshot into a new volume on the array and use that volume for > the instance) > 4. try and delete the snapshot > Result - snapshot goes into error state and later, all Cinder operations > fail such as new instance/create volume etc. until the correct service is > restarted. Then everything works once again. > > > To troubleshoot the above, I installed the RDP Packstack Queens (because I > couldnt get Pike). I tested the above and now, the result is the snapshot > is successfully deleted from openstack but not deleted on the array. The > log is below for reference. But I can see the in the log that the array > sends back info to openstack saying the snapshot has a clone and the delete > cannot be done because of that. Also response code 409. > > *Some info about why the problem with Pike started in the first place* > 1. Vendor is Nimble Storage which HPE purchased > 2. HPE/Nimble have dropped support for openstack. Latest supported version > is Queens and Nimble array version v4.x. The current Array version is v5.x. > Nimble say there are no guarantees with openstack, the driver and the array > version v5.x > 3. I was previously advised by Nimble that the array version v5.x will > work fine and so left our DR array on v5.x with a pending upgrade that had > a blocker due to an issue. This issue was resolved in December and the > pending upgrade completed to match the DR array took place around 30 days > ago. > > > With regards to the production issue, I assumed that the array API has > some changes between v4.x and v5.x and it's causing an issue with Cinder > due to the API response. Although I have not been able to find out if or > what changes there are that may have occurred after the array upgrade, as > the documentation for this is Nimble internal-only. > > > *So with that - some questions if I may:* > When Openstack got the 409 error response from the API (as seen in the > log below), why would Openstack then proceed to delete the snapshot on the > Openstack side? How could I debug this further? I'm not sure what Openstack > Cinder is acting on in terns of the response as yet. Maybe Openstack is not > specifically looking for the error code in the response? > > The snapshot that got deleted on the openstack side is a problem. Would > this be related to the driver? Could it be possible that the driver did not > pass the error response to Cinder? > Hi Tony, This is exactly what happened, and it appears to be a driver bug introduced in queens by [1]. The code in question [2] logs the error, but fails to propagate the exception. As far as the volume manager is concerned, the snapshot deletion was successful. [1] https://review.opendev.org/601492 [2] https://opendev.org/openstack/cinder/src/branch/stable/queens/cinder/volume/drivers/nimble.py#L1815 Alan Thanks in advance. Just for reference, the log snippet is below. > > > ==> volume.log <== >> 2020-01-17 16:53:23.718 24723 WARNING py.warnings >> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >> 87e34c89e6fb41d2af25085b64011a55 - default default] >> /usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:852: >> InsecureRequestWarning: Unverified HTTPS request is being made. Adding >> certificate verification is strongly advised. 
See: >> https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings >> InsecureRequestWarning) >> : NimbleAPIException: Failed to execute api >> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >> ==> api.log <== >> 2020-01-17 16:53:23.769 25242 INFO cinder.api.openstack.wsgi >> [req-bfcbff34-134b-497e-82b1-082d48f8767f df7548ecad684f26b8bc802ba63a9814 >> 87e34c89e6fb41d2af25085b64011a55 - default default] >> http://192.168.53.45:8776/v3/87e34c89e6fb41d2af25085b64011a55/volumes/detail >> returned with HTTP 200 >> 2020-01-17 16:53:23.770 25242 INFO eventlet.wsgi.server >> [req-bfcbff34-134b-497e-82b1-082d48f8767f df7548ecad684f26b8bc802ba63a9814 >> 87e34c89e6fb41d2af25085b64011a55 - default default] 192.168.53.45 "GET >> /v3/87e34c89e6fb41d2af25085b64011a55/volumes/detail HTTP/1.1" status: 200 >> len: 4657 time: 0.1152730 >> ==> volume.log <== >> 2020-01-17 16:53:23.811 24723 WARNING py.warnings >> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >> 87e34c89e6fb41d2af25085b64011a55 - default default] >> /usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:852: >> InsecureRequestWarning: Unverified HTTPS request is being made. Adding >> certificate verification is strongly advised. See: >> https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings >> InsecureRequestWarning) >> : NimbleAPIException: Failed to execute api >> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >> 2020-01-17 16:53:23.902 24723 ERROR cinder.volume.drivers.nimble >> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >> 87e34c89e6fb41d2af25085b64011a55 - default default] Re-throwing Exception >> Failed to execute api snapshots/0464a5fd65d75fcfe1000000000000011100001a41: >> Error Code: 409 Message: Snapshot >> snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone.: >> NimbleAPIException: Failed to execute api >> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >> 2020-01-17 16:53:23.903 24723 WARNING cinder.volume.drivers.nimble >> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >> 87e34c89e6fb41d2af25085b64011a55 - default default] Snapshot >> snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 : has a clone: >> NimbleAPIException: Failed to execute api >> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >> 2020-01-17 16:53:23.964 24723 WARNING cinder.quota >> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >> 87e34c89e6fb41d2af25085b64011a55 - default default] Deprecated: Default >> quota for resource: snapshots_Nimble-DR is set by the default quota flag: >> quota_snapshots_Nimble-DR, it is now deprecated. Please use the default >> quota class for default quota. 
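Reduced to its essentials, the pattern that produces the driver-side warnings above and the "Delete snapshot completed successfully" message that follows looks roughly like this (an illustrative sketch, not the actual nimble.py code):

```python
# Illustrative sketch only, not the real driver code. It shows how a delete
# that the backend refused can still be reported as successful: the driver
# catches the API error, logs it, and returns as if nothing failed.
import logging

LOG = logging.getLogger(__name__)


class NimbleAPIException(Exception):
    pass


class BuggySnapshotDelete(object):
    def _api_delete(self, snapshot_id):
        # Stand-in for the REST call; the array answers 409 "has a clone".
        raise NimbleAPIException("Error Code: 409 Message: Snapshot has a clone.")

    def delete_snapshot(self, snapshot_id):
        try:
            self._api_delete(snapshot_id)
        except NimbleAPIException as exc:
            # Logged but not re-raised: the volume manager never sees the
            # failure, so it marks the snapshot deleted and logs
            # "Delete snapshot completed successfully."
            LOG.warning("Snapshot %s: has a clone: %s", snapshot_id, exc)


class FixedSnapshotDelete(BuggySnapshotDelete):
    def delete_snapshot(self, snapshot_id):
        try:
            self._api_delete(snapshot_id)
        except NimbleAPIException:
            LOG.exception("Unable to delete snapshot %s", snapshot_id)
            # Propagating lets the manager put the snapshot into
            # "error_deleting" instead of silently dropping it.
            raise


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    BuggySnapshotDelete().delete_snapshot("snap-123")      # returns quietly
    try:
        FixedSnapshotDelete().delete_snapshot("snap-123")  # surfaces the 409
    except NimbleAPIException as exc:
        print("caller saw the failure:", exc)
```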
>> 2020-01-17 16:53:24.054 24723 INFO cinder.volume.manager >> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >> 87e34c89e6fb41d2af25085b64011a55 - default default] Delete snapshot >> completed successfully. > > > > Regards, > > *Tony Pearce* | > *Senior Network Engineer / Infrastructure Lead**Cinglevue International > * > > Email: tony.pearce at cinglevue.com > Web: http://www.cinglevue.com > > *Australia* > 1 Walsh Loop, Joondalup, WA 6027 Australia. > > Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 > > Note: This email and all attachments are the sole property of Cinglevue > International Pty Ltd. (or any of its subsidiary entities), and the > information contained herein must be considered confidential, unless > specified otherwise. If you are not the intended recipient, you must not > use or forward the information contained in these documents. If you have > received this message in error, please delete the email and notify the > sender. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri Jan 17 14:50:32 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 17 Jan 2020 14:50:32 +0000 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> <20200117041005.cgxggu5wrv3amheh@yuggoth.org> Message-ID: <20200117145031.j5tr6avxr3v7hdeg@yuggoth.org> On 2020-01-17 08:49:32 +0100 (+0100), Radosław Piliszek wrote: [...] > ERROR: No matching distribution found for neutron-lib===2.0.0 (from -c > u-c-m.txt (line 79)) > > and the reason is: > pypi: data-requires-python=">=3.6" > > 3.5 < 3.6 > > Need some newer python in there. [...] Or older neutron-lib? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From thierry at openstack.org Fri Jan 17 14:58:08 2020 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 17 Jan 2020 15:58:08 +0100 Subject: [Release-job-failures] release-post job for openstack/releases for ref refs/heads/master failed In-Reply-To: References: Message-ID: zuul at openstack.org wrote: > Build failed. > > - tag-releases https://zuul.opendev.org/t/openstack/build/a414023508294a65abe9715546757e41 : POST_FAILURE in 5m 13s > - publish-tox-docs-static https://zuul.opendev.org/t/openstack/build/None : SKIPPED There was an error running the post-job tasks on the tag job on https://review.opendev.org/702925. While trying to collect log output: ssh: connect to host 38.108.68.119 port 22: No route to host This looks like a transient error, and it can be ignored (the job itself had run and was a NOOP anyway). 
-- Thierry Carrez (ttx) From fungi at yuggoth.org Fri Jan 17 15:10:03 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 17 Jan 2020 15:10:03 +0000 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: <16fb3b385b4.cbec982347026.4712409073911949240@ghanshyammann.com> References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> <20200117041005.cgxggu5wrv3amheh@yuggoth.org> <3DCDAE2D-4368-4A0B-BF8B-7AF4BA729055@redhat.com> <16fb3b385b4.cbec982347026.4712409073911949240@ghanshyammann.com> Message-ID: <20200117151003.hpho3gjmnldk6bdd@yuggoth.org> On 2020-01-17 07:31:24 -0600 (-0600), Ghanshyam Mann wrote: [...] > Best possible solution I can think of is to cap all the Tempest > plugins together with Tempest and use corresponding stable branch > u-c. Or we modify devstack logic with if-else condition for > plugins require cap and rest else will be master. Any other > thought? [...] Constraints is going to be at odds with PEP 503 data-requires-python signaling. If we didn't include neutron-lib in the constraints list for Tempest's virtualenv (maybe filter it out with the edit-constraints tool) then pip should select the highest possible version which matches the versionspec in the requirements list and supports the Python interpreter with which that virtualenv was built. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From i at liuyulong.me Fri Jan 17 15:11:09 2020 From: i at liuyulong.me (=?utf-8?B?TElVIFl1bG9uZw==?=) Date: Fri, 17 Jan 2020 23:11:09 +0800 Subject: [Neutron] cancel neutron L3 meeting Message-ID: Hi all, Hi guys, due to the Chinese Spring Festival I will be offline in next two weeks. So I will not be available to chair the L3 meeting. Let's cancel the next two meetings. Then the L3 meeting will be rescheduled on 5th Feb, 2020. OK, see you guys then.  And happy Chinese New Year! 春节快乐! Regards, LIU Yulong -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Fri Jan 17 15:22:25 2020 From: openstack at fried.cc (Eric Fried) Date: Fri, 17 Jan 2020 09:22:25 -0600 Subject: [nova] Nova CI busted, please hold rechecks Message-ID: <4f03483c-3702-b71f-baca-43585096ca10@fried.cc> The nova-live-migration job is failing 100% since yesterday morning [1]. Your rechecks won't work until that's resolved. I'll send an all-clear message when we're green again. Thanks, efried [1] https://bugs.launchpad.net/nova/+bug/1860021 From madhuri.kumari at intel.com Fri Jan 17 15:31:17 2020 From: madhuri.kumari at intel.com (Kumari, Madhuri) Date: Fri, 17 Jan 2020 15:31:17 +0000 Subject: [ironic][nova][neutron][cloud-init] Infiniband Support in OpenStack Message-ID: <0512CBBECA36994BAA14C7FEDE986CA61A5528B0@BGSMSX102.gar.corp.intel.com> Hi, I am trying to deploy a node with infiniband in Ironic without any success. The node has two interfaces, eth0 and ib0. The deployment is successful, node becomes active but is not reachable. I debugged and checked that the issue is with cloud-init. The cloud-init fails to configure the network interfaces on the node complaining that the MAC address of infiniband port(ib0) is not known to the node. Ironic provides a fake MAC address for infiniband ports and cloud-init is supposed to generate the actual MAC address of infiband ports[1]. 
But it fails[2] before reaching there. I have posted the issue in cloud-init[3] as well. Can someone please help me with this issue? How do we specify "TYPE=InfiniBand" from OpenStack? Currently the type sent is "phy" only. [1] https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/helpers/openstack.py#L686 [2] https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/helpers/openstack.py#L677 [3] https://bugs.launchpad.net/cloud-init/+bug/1857031 Regards, Madhuri -------------- next part -------------- An HTML attachment was scrubbed... URL: From alawson at aqorn.com Fri Jan 17 15:54:36 2020 From: alawson at aqorn.com (Adam Peacock) Date: Fri, 17 Jan 2020 21:24:36 +0530 Subject: DR options with openstack In-Reply-To: References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> Message-ID: I'm traveling in India right now and will reply later. I've architected several large OpenStack clouds from Cisco to Juniper to SAP to AT&T to HPE to Wells Fargo to -- you name it. Will share some things we've done regarding DR and more specifically how we handled replication and dividing the cloud up so it made sense from a design and operational perspective. Also, we need to be clear not everyone leans towards being a developer or even *wants* to go in that direction when using OpenStack. In fact, most don't and if there is that expectation by those entrenched with the OpenStack product, the OpenStack option gets dropped in favor of something else. It's developer-friendly but we need to be mega-mega-careful, as a community, to ensure development isn't the baseline or assumption for adequate support or to get questions answered. Especially since we've converged our communication channels. /soapbox More later. //adam On Fri, Jan 17, 2020, 7:19 PM Tony Pearce wrote: > Hi Walter > > Thank you for the information. > It's unfortunate about the lack of support from nimble. > > With regards to replication, nimble has their own software implementation > that I'm currently using. The problem I face is that the replicated volumes > have a different iqn, serial number and are accessed via a different array > IP. > > I didn't get time to read up on freezer today but I'm hopeful that I can > use something there. 🙂 > > > On Fri, 17 Jan 2020, 21:10 Walter Boring, wrote: > >> Hi Tony, >> Looking at the nimble driver, it has been removed from Cinder due to >> lack of support and maintenance from the vendor. Also, >> Looking at the code prior to it's removal, it didn't have any support for >> replication and failover. Cinder is a community based opensource project >> that relies on vendors, operators and users to contribute and support the >> codebase. As a core member of the Cinder team, we do our best to provide >> support for folks using Cinder and this mailing list and the >> #openstack-cinder channel is the best mechanism to get in touch with us. >> The #openstack-cinder irc channel is not a developer only channel. We >> help when we can, but also remember we have our day jobs as well. >> >> Unfortunately Nimble stopped providing support for their driver quite a >> while ago now and part of the Cinder policy to have a driver in tree is to >> have CI (Continuous Integration) tests in place to ensure that cinder >> patches don't break a driver. If the CI isn't in place, then the Cinder >> team marks the driver as unsupported in a release, and the following >> release the driver gets removed. 
>> >> All that being said, the nimbe driver never supported the cheesecake >> replication/DR capabilities that were added in Cinder. >> >> Walt (hemna in irc) >> >> On Thu, Jan 16, 2020 at 2:49 AM Tony Pearce >> wrote: >> >>> Hi all >>> >>> >>> >>> My questions are; >>> >>> >>> >>> 1. How are people using iSCSI Cinder storage with Openstack >>> to-date? For example a Nimble Storage array backend. I mean to say, are >>> people using backend integration drivers for other hardware (like netapp)? >>> Or are they using backend iscsi for example? >>> 2. How are people managing DR with Openstack in terms of backend >>> storage replication to another array in another location and continuing to >>> use Openstack? >>> >>> >>> >>> The environment which I am currently using; >>> >>> 1 x Nimble Storage array (iSCSI) with nimble.py Cinder driver >>> >>> 1 x virtualised Controller node >>> >>> 2 x physical compute nodes >>> >>> This is Openstack Pike. >>> >>> >>> >>> In addition, I have a 2nd Nimble Storage array in another location. >>> >>> >>> >>> To explain the questions I’d like to put forward my thoughts for >>> question 2 first: >>> >>> For point 2 above, I have been searching for a way to utilise replicated >>> volumes on the 2nd array from Openstack with existing instances. For >>> example, if site 1 goes down how would I bring up openstack in the 2nd >>> location and boot up the instances where their volumes are stored on the 2 >>> nd array. I found a proposal for something called “cheesecake” ref: >>> https://specs.openstack.org/openstack/cinder-specs/specs/rocky/cheesecake-promote-backend.html >>> But I could not find if it had been approved or implemented. So I return >>> to square 1. I have some thoughts about failing over the controller VM and >>> compute node but I don’t think there’s any need to go into here because of >>> the above blocker and for brevity anyway. >>> >>> >>> >>> The nimble.py driver which I am using came with Openstack Pike and it >>> appears Nimble / HPE are not maintaining it any longer. I saw a commit to >>> remove nimble.py in Openstack Train release. The driver uses the REST API >>> to perform actions on the array. Such as creating a volume, downloading the >>> image, mounting the volume to the instance, snapshots, clones etc. This is >>> great for me because to date I have around 10TB of openstack storage data >>> allocated and the Nimble array shows the amount of data being consumed is >>> <900GB. This is due to the compression and zero-byte snapshots and clones. >>> >>> >>> >>> So coming back to question 2 – is it possible? Can you drop me some >>> keywords that I can search for such as an Openstack component like >>> Cheesecake? I think basically what I am looking for is a supported way of >>> telling Openstack that the instance volumes are now located at the new / >>> second array. This means a new cinder backend. Example, new iqn, IP >>> address, volume serial number. I think I could probably hack the cinder db >>> but I really want to avoid that. >>> >>> >>> >>> So failing the above, it brings me to the question 1 I asked before. How >>> are people using Cinder volumes? May be I am going about this the wrong way >>> and need to take a few steps backwards to go forwards? I need storage to be >>> able to deploy instances onto. Snapshots and clones are desired. At the >>> moment these operations take less time than the horizon dashboard takes to >>> load because of the waiting API responses. 
>>> >>> >>> >>> When searching for information about the above as an end-user / consumer >>> I get a bit concerned. Is it right that Openstack usage is dropping? >>> There’s no web forum to post questions. The chatroom on freenode is filled >>> with ~300 ghosts. Ask Openstack questions go without response. Earlier this >>> week (before I found this mail list) I had to use facebook to report that >>> the Openstack.org website had been hacked. Basically it seems that if >>> you’re a developer that can write code then you’re in but that’s it. I have >>> never been a coder and so I am somewhat stuck. >>> >>> >>> >>> Thanks in advance >>> >>> >>> >>> Sent from Mail for >>> Windows 10 >>> >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Fri Jan 17 16:30:32 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Fri, 17 Jan 2020 10:30:32 -0600 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: <20200117151003.hpho3gjmnldk6bdd@yuggoth.org> References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> <20200117041005.cgxggu5wrv3amheh@yuggoth.org> <3DCDAE2D-4368-4A0B-BF8B-7AF4BA729055@redhat.com> <16fb3b385b4.cbec982347026.4712409073911949240@ghanshyammann.com> <20200117151003.hpho3gjmnldk6bdd@yuggoth.org> Message-ID: <16fb45784f9.f37c163a56011.2179039106164150297@ghanshyammann.com> ---- On Fri, 17 Jan 2020 09:10:03 -0600 Jeremy Stanley wrote ---- > On 2020-01-17 07:31:24 -0600 (-0600), Ghanshyam Mann wrote: > [...] > > Best possible solution I can think of is to cap all the Tempest > > plugins together with Tempest and use corresponding stable branch > > u-c. Or we modify devstack logic with if-else condition for > > plugins require cap and rest else will be master. Any other > > thought? > [...] > > Constraints is going to be at odds with PEP 503 data-requires-python > signaling. If we didn't include neutron-lib in the constraints list > for Tempest's virtualenv (maybe filter it out with the > edit-constraints tool) then pip should select the highest possible > version which matches the versionspec in the requirements list and > supports the Python interpreter with which that virtualenv was > built. There will be more lib like neutron-lib, basically all dependency of Tempest or plugins that will become py2 incompatible day by day. -gmann > -- > Jeremy Stanley > From cboylan at sapwetik.org Fri Jan 17 17:35:37 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Fri, 17 Jan 2020 09:35:37 -0800 Subject: [tc][infra] Splitting OpenDev out of OpenStack Governance Message-ID: Hello, About six weeks ago I kicked off a discussion on what the future of OpenDev's governance looks like [0]. I think we had expected part of this process would be to split OpenDev out of OpenStack's governance, but wanted to be sure we had a bit more of a plan before we made that official. After some discussion in this thread [0] I believe we now have enough of a plan to move forward on splitting out. I've pushed a change [1] to the openstack/governance repo to make this official in git. I also wanted to make sure this change had some visibility so am sending this email too. Now for some background. The OpenDev effort intends to make our software development tools and processes available to projects outside of the OpenStack itself. 
We have actually made these resources available since Stackforge, but one of the major concerns we hear over and over is the implication that a project hosted on our platforms is still "OpenStack". The next step to avoiding this confusion and better reflecting our goals is to formally remove OpenDev from OpenStack's governance. As mentioned in the original thread [0], OpenDev would still incorporate input from the OpenStack project as one of its users. We aren't going away and will continue to work closely together to meet OpenStack's needs. But now we'll formalize doing that with other projects as well. Feedback is still welcome, though I ask people to read through the original thread [0] first. Please let me know if there is anything else I can do to help with this process. [0] http://lists.openstack.org/pipermail/openstack-infra/2019-December/006537.html [1] https://review.opendev.org/703134 Clark From fungi at yuggoth.org Fri Jan 17 17:57:13 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 17 Jan 2020 17:57:13 +0000 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: <16fb45784f9.f37c163a56011.2179039106164150297@ghanshyammann.com> References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> <20200117041005.cgxggu5wrv3amheh@yuggoth.org> <3DCDAE2D-4368-4A0B-BF8B-7AF4BA729055@redhat.com> <16fb3b385b4.cbec982347026.4712409073911949240@ghanshyammann.com> <20200117151003.hpho3gjmnldk6bdd@yuggoth.org> <16fb45784f9.f37c163a56011.2179039106164150297@ghanshyammann.com> Message-ID: <20200117175713.bz63guojhxa6raa3@yuggoth.org> On 2020-01-17 10:30:32 -0600 (-0600), Ghanshyam Mann wrote: > ---- On Fri, 17 Jan 2020 09:10:03 -0600 Jeremy Stanley wrote ---- > > On 2020-01-17 07:31:24 -0600 (-0600), Ghanshyam Mann wrote: > > [...] > > > Best possible solution I can think of is to cap all the Tempest > > > plugins together with Tempest and use corresponding stable branch > > > u-c. Or we modify devstack logic with if-else condition for > > > plugins require cap and rest else will be master. Any other > > > thought? > > [...] > > > > Constraints is going to be at odds with PEP 503 data-requires-python > > signaling. If we didn't include neutron-lib in the constraints list > > for Tempest's virtualenv (maybe filter it out with the > > edit-constraints tool) then pip should select the highest possible > > version which matches the versionspec in the requirements list and > > supports the Python interpreter with which that virtualenv was > > built. > > There will be more lib like neutron-lib, basically all dependency of Tempest > or plugins that will become py2 incompatible day by day. Yes, but the problem here isn't Python 2.7 incompatibility; it's Python 3.5 incompatibility. We can't run current Tempest on Ubuntu 16.04 LTS without installing a custom Python 3.6 interpreter. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From ignaziocassano at gmail.com Fri Jan 17 18:30:13 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 17 Jan 2020 19:30:13 +0100 Subject: [queens][nova] iscsi issue Message-ID: Hello all we are testing openstack queens cinder driver for Unity iscsi (driver cinder.volume.drivers.dell_emc.unity.Driver). 
The unity storage is a Unity600 Version 4.5.10.5.001 We are facing an issue when we try to detach volume from a virtual machine with two or more volumes attached (this happens often but not always): The following is reported nova-compute.log: 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server self.connector.disconnect_volume(connection_info['data'], None) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/utils.py", line 137, in trace_logging_wrapper 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return f(*args, **kwargs) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return f(*args, **kwargs) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", line 848, in disconnect_volume 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server device_info=device_info) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", line 892, in _cleanup_connection 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server path_used, was_multipath) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line 271, in remove_connection 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server self.flush_multipath_device(multipath_name) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line 329, in flush_multipath_device 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server root_helper=self._root_helper) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/executor.py", line 52, in _execute 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server result = self.__execute(*args, **kwargs) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py", line 169, in execute 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return execute_root(*cmd, **kwargs) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 207, in _wrap 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return self.channel.remote_call(name, args, kwargs) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 202, in remote_call 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server raise exc_type(*result[2]) 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server ProcessExecutionError: Unexpected error while running command. 
2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Command: multipath -f 36006016006e04400d0c4215e3ec55757 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Exit code: 1 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Stdout: u'Jan 17 16:04:30 | 36006016006e04400d0c4215e3ec55757p1: map in use\nJan 17 16:04:31 | failed to remove multipath map 36006016006e04400d0c4215e3ec55757\n' 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Stderr: u'' 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Best Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri Jan 17 18:39:15 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 17 Jan 2020 18:39:15 +0000 Subject: DR options with openstack In-Reply-To: References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> Message-ID: <20200117183915.gugkawaqx42z6uvs@yuggoth.org> On 2020-01-17 21:24:36 +0530 (+0530), Adam Peacock wrote: [...] > Also, we need to be clear not everyone leans towards being a > developer or even *wants* to go in that direction when using > OpenStack. In fact, most don't and if there is that expectation by > those entrenched with the OpenStack product, the OpenStack option > gets dropped in favor of something else. It's developer-friendly > but we need to be mega-mega-careful, as a community, to ensure > development isn't the baseline or assumption for adequate support > or to get questions answered. Especially since we've converged our > communication channels. [...] Most users probably won't become developers on OpenStack, but some will, and I believe its long-term survival depends on that so we should do everything we can to encourage it. Users may also contribute in a variety of other ways like bug reporting and triage, outreach, revising or translating documentation, and so on. OpenStack isn't a "product," it's a community software collaboration on which many companies have built products (either by running it as a service or selling support for it). Treating the community the way you might treat a paid vendor is where all of this goes to a bad place very quickly. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From radoslaw.piliszek at gmail.com Fri Jan 17 18:42:00 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 17 Jan 2020 19:42:00 +0100 Subject: [api][sdk][dev][oslo] using uWSGI breaks CORS config Message-ID: Fellow Devs, as you might have noticed I started taking care of openstack/js-openstack-lib, now under the openstacksdk umbrella [1]. First goal is to modernize the CI to use Zuul v3, current devstack and nodejs, still WIP [2]. As part of the original suite of tests, the unit and functional tests are run from browsers as well as from node. And, as you may know, browsers care about CORS [3]. js-openstack-lib is connecting to various OpenStack APIs (currently limited to keystone, glance, neutron and nova) to act on behalf of the user (just like openstacksdk/client does). oslo.middleware, as used by those APIs, provides a way to configure CORS by setting params in the [cors] group but uWSGI seemingly ignores that completely [4]. I had to switch to mod_wsgi+apache instead of uwsgi+apache to get past that issue. I could not reproduce locally because kolla (thankfully) uses mostly mod_wsgi atm. 
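For reference, the [cors] options being discussed are set in the API service's configuration file and look roughly like this (a minimal sketch; the origin below is only a placeholder for wherever the browser-based client is served from, and the option names are the standard oslo.middleware ones):

    [cors]
    allowed_origin = https://js-client.example.org:8080
    allow_credentials = true
    allow_methods = GET,PUT,POST,DELETE,PATCH
    allow_headers = Content-Type,X-Auth-Token,X-OpenStack-Request-ID
    expose_headers = X-Subject-Token,X-OpenStack-Request-ID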
The issue I see is that uWSGI is proposed as the future and mod_wsgi is termed deprecated. However, this means the future is broken w.r.t. CORS and so any modern web interface with it if not sitting on the exact same host and port (which is usually different between OpenStack APIs and any UI). [1] https://review.opendev.org/701854 [2] https://review.opendev.org/702132 [3] https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS [4] https://github.com/unbit/uwsgi/issues/1550 -yoctozepto From cboylan at sapwetik.org Fri Jan 17 22:11:23 2020 From: cboylan at sapwetik.org (Clark Boylan) Date: Fri, 17 Jan 2020 14:11:23 -0800 Subject: =?UTF-8?Q?Re:_[ironic][nova][neutron][cloud-init]_Infiniband_Support_in_?= =?UTF-8?Q?OpenStack?= In-Reply-To: <0512CBBECA36994BAA14C7FEDE986CA61A5528B0@BGSMSX102.gar.corp.intel.com> References: <0512CBBECA36994BAA14C7FEDE986CA61A5528B0@BGSMSX102.gar.corp.intel.com> Message-ID: On Fri, Jan 17, 2020, at 7:31 AM, Kumari, Madhuri wrote: > > Hi, > > > I am trying to deploy a node with infiniband in Ironic without any success. > > > The node has two interfaces, eth0 and ib0. The deployment is > successful, node becomes active but is not reachable. I debugged and > checked that the issue is with cloud-init. The cloud-init fails to > configure the network interfaces on the node complaining that the MAC > address of infiniband port(ib0) is not known to the node. Ironic > provides a fake MAC address for infiniband ports and cloud-init is > supposed to generate the actual MAC address of infiband ports[1]. But > it fails[2] before reaching there. Reading the cloud-init code [4][5] it appears that the ethernet format MAC should match bytes 13-15 + 18-20 of the infiniband address. Is the problem here that the fake MAC supplied is unrelated to the actual infiniband address? If so I think you'll either need cloud-init to ignore unknown interfaces (as proposed in the cloud-init bug), or have Ironic supply the mac address as bytes 13-15 + 18-20 of the actual infiniband address. > > I have posted the issue in cloud-init[3] as well. > > > Can someone please help me with this issue? How do we specify > “TYPE=InfiniBand” from OpenStack? Currently the type sent is “phy” only. > > > [1] > https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/helpers/openstack.py#L686 > > [2] > https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/helpers/openstack.py#L677 > > [3] https://bugs.launchpad.net/cloud-init/+bug/1857031 [4] https://github.com/canonical/cloud-init/blob/9bfb2ba7268e2c3c932023fc3d3020cdc6d6cc18/cloudinit/net/__init__.py#L793-L795 [5] https://github.com/canonical/cloud-init/blob/9bfb2ba7268e2c3c932023fc3d3020cdc6d6cc18/cloudinit/net/__init__.py#L844-L846 From Albert.Braden at synopsys.com Fri Jan 17 22:17:25 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Fri, 17 Jan 2020 22:17:25 +0000 Subject: Galera config values Message-ID: I'm experimenting with Galera in my Rocky openstack-ansible dev cluster, and I'm finding that the default haproxy config values don't seem to work. Finding the correct values is a lot of work. For example, I spent this morning experimenting with different values for "timeout client" in /etc/haproxy/haproxy.cfg. 
The default is 1m, and with the default set I see this error in /var/log/nova/nova-scheduler.log on the controllers: 2020-01-17 13:54:26.059 443358 ERROR oslo_db.sqlalchemy.engines DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1'] (Background on this error at: http://sqlalche.me/e/e3q8) There are several timeout values in /etc/haproxy/haproxy.cfg. These are the values we started with: stats timeout 30s timeout http-request 10s timeout queue 1m timeout connect 10s timeout client 1m timeout server 1m timeout check 10s At first I changed them all to 30m. This stopped the "Lost connection" error in nova-scheduler.log. Then, one at a time, I changed them back to the default. When I got to "timeout client" I found that setting it back to 1m caused the errors to start again. I changed it back and forth and found that 4 minutes causes errors, and 6m stops them, so I left it at 6m. These are my active variables: root at us01odc-dev2-ctrl1:/etc/mysql# mysql -e 'show variables;'|grep timeout connect_timeout 20 deadlock_timeout_long 50000000 deadlock_timeout_short 10000 delayed_insert_timeout 300 idle_readonly_transaction_timeout 0 idle_transaction_timeout 0 idle_write_transaction_timeout 0 innodb_flush_log_at_timeout 1 innodb_lock_wait_timeout 50 innodb_rollback_on_timeout OFF interactive_timeout 28800 lock_wait_timeout 86400 net_read_timeout 30 net_write_timeout 60 rpl_semi_sync_master_timeout 10000 rpl_semi_sync_slave_kill_conn_timeout 5 slave_net_timeout 60 thread_pool_idle_timeout 60 wait_timeout 3600 So it looks like the value of "timeout client" in haproxy.cfg needs to match or exceed the value of "wait_timeout" in mysql. Also in nova.conf I see "#connection_recycle_time = 3600" - I need to experiment to see how that value interacts with the timeouts in the other config files. Is this the best way to find the correct config values? It seems like there should be a document that talks about these timeouts and how to set them (or maybe more generally how the different timeout settings in the various config files interact). Does that document exist? If not, maybe I could write one, since I have to figure out the correct values anyway. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Fri Jan 17 22:34:58 2020 From: eblock at nde.ag (Eugen Block) Date: Fri, 17 Jan 2020 22:34:58 +0000 Subject: Galera config values In-Reply-To: Message-ID: <20200117223458.Horde.JLSaQRGPwIoHALX8zGRcgmW@webmail.nde.ag> Hi, I'm pretty sure you'll have to figure it out yourself. I always found the deployment guides quite good, I got my cloud running without major issues. But when it comes to HA configuration the guide lacks many information. I had to fiure out many details on my own, though haproxy is currently not in use here. > So it looks like the value of "timeout client" in haproxy.cfg needs > to match or exceed the value of "wait_timeout" in mysql. Although I'm not entirely sure I tend to agree with you. Dealing with a Ceph RGW deployment I encountered a similar issue and had to increase some timeout values to get it working. I'm convinced that many people would appreciate if you created a doc for haproxy. Regards, Eugen Zitat von Albert Braden : > I'm experimenting with Galera in my Rocky openstack-ansible dev > cluster, and I'm finding that the default haproxy config values > don't seem to work. Finding the correct values is a lot of work. 
For > example, I spent this morning experimenting with different values > for "timeout client" in /etc/haproxy/haproxy.cfg. The default is > 1m, and with the default set I see this error in > /var/log/nova/nova-scheduler.log on the controllers: > > 2020-01-17 13:54:26.059 443358 ERROR oslo_db.sqlalchemy.engines > DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost > connection to MySQL server during query') [SQL: u'SELECT 1'] > (Background on this error at: http://sqlalche.me/e/e3q8) > > There are several timeout values in /etc/haproxy/haproxy.cfg. These > are the values we started with: > > stats timeout 30s > timeout http-request 10s > timeout queue 1m > timeout connect 10s > timeout client 1m > timeout server 1m > timeout check 10s > > At first I changed them all to 30m. This stopped the "Lost > connection" error in nova-scheduler.log. Then, one at a time, I > changed them back to the default. When I got to "timeout client" I > found that setting it back to 1m caused the errors to start again. I > changed it back and forth and found that 4 minutes causes errors, > and 6m stops them, so I left it at 6m. > > These are my active variables: > > root at us01odc-dev2-ctrl1:/etc/mysql# mysql -e 'show variables;'|grep timeout > connect_timeout 20 > deadlock_timeout_long 50000000 > deadlock_timeout_short 10000 > delayed_insert_timeout 300 > idle_readonly_transaction_timeout 0 > idle_transaction_timeout 0 > idle_write_transaction_timeout 0 > innodb_flush_log_at_timeout 1 > innodb_lock_wait_timeout 50 > innodb_rollback_on_timeout OFF > interactive_timeout 28800 > lock_wait_timeout 86400 > net_read_timeout 30 > net_write_timeout 60 > rpl_semi_sync_master_timeout 10000 > rpl_semi_sync_slave_kill_conn_timeout 5 > slave_net_timeout 60 > thread_pool_idle_timeout 60 > wait_timeout 3600 > > So it looks like the value of "timeout client" in haproxy.cfg needs > to match or exceed the value of "wait_timeout" in mysql. Also in > nova.conf I see "#connection_recycle_time = 3600" - I need to > experiment to see how that value interacts with the timeouts in the > other config files. > > Is this the best way to find the correct config values? It seems > like there should be a document that talks about these timeouts and > how to set them (or maybe more generally how the different timeout > settings in the various config files interact). Does that document > exist? If not, maybe I could write one, since I have to figure out > the correct values anyway. From alawson at aqorn.com Fri Jan 17 22:44:28 2020 From: alawson at aqorn.com (Adam Peacock) Date: Sat, 18 Jan 2020 04:14:28 +0530 Subject: DR options with openstack In-Reply-To: <20200117183915.gugkawaqx42z6uvs@yuggoth.org> References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> <20200117183915.gugkawaqx42z6uvs@yuggoth.org> Message-ID: How we view OpenStack within our community here is usually vastly different than the majority of enterprises and how they view it. Side note: My biggest gripe with OpenStack leadership is actually that everything is viewed from the lens of a developer which, I feel, is contributing to the plateau/decline in its adoption. That is but that's a topic for another day. Most organizations ( as I've seen anyway) view OpenStack as a product that is compared to other cloud products like vCloud Director/similar. And after 8 years architecting clouds with it, I see it the same way. So I'm not exactly inclined to split hairs with how it is characterized. 
Bottom line though, ensuring that non-developers are easily able to get their questions answered will, in my personal opinion, either promote OpenStack or promote the perception that it requires a team of developers to understand and run, which kills any serious consideration in the boardroom.

Sorry to the OP, didn't mean to hijack your thread here. :) It just raises an important topic that I see come up over and over.

//adam

On Sat, Jan 18, 2020, 2:43 AM Jeremy Stanley wrote:

> On 2020-01-17 21:24:36 +0530 (+0530), Adam Peacock wrote:
> [...]
> > Also, we need to be clear not everyone leans towards being a
> > developer or even *wants* to go in that direction when using
> > OpenStack. In fact, most don't and if there is that expectation by
> > those entrenched with the OpenStack product, the OpenStack option
> > gets dropped in favor of something else. It's developer-friendly
> > but we need to be mega-mega-careful, as a community, to ensure
> > development isn't the baseline or assumption for adequate support
> > or to get questions answered. Especially since we've converged our
> > communication channels.
> [...]
>
> Most users probably won't become developers on OpenStack, but some
> will, and I believe its long-term survival depends on that so we
> should do everything we can to encourage it. Users may also
> contribute in a variety of other ways like bug reporting and triage,
> outreach, revising or translating documentation, and so on.
>
> OpenStack isn't a "product," it's a community software collaboration
> on which many companies have built products (either by running it as
> a service or selling support for it). Treating the community the way
> you might treat a paid vendor is where all of this goes to a bad
> place very quickly.
> --
> Jeremy Stanley
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From fungi at yuggoth.org Fri Jan 17 23:27:25 2020
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Fri, 17 Jan 2020 23:27:25 +0000
Subject: DR options with openstack
In-Reply-To: 
References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> <20200117183915.gugkawaqx42z6uvs@yuggoth.org>
Message-ID: <20200117232724.iufxvq7owuvhoyoo@yuggoth.org>

On 2020-01-18 04:14:28 +0530 (+0530), Adam Peacock wrote:
> How we view OpenStack within our community here is usually vastly
> different than the majority of enterprises and how they view it.
> Side note: My biggest gripe with OpenStack leadership is actually
> that everything is viewed from the lens of a developer which, I
> feel, is contributing to the plateau/decline in its adoption. That
> is but that's a topic for another day.

I don't know whether you consider me part of OpenStack leadership, but if it helps, my background is ~30 years as a Unix/Linux sysadmin, data center engineer, security analyst and network architect. I don't have any formal education in software development (or even a University degree). This is the lens with which I view OpenStack.

> Most organizations ( as I've seen anyway) view OpenStack as a
> product that is compared to other cloud products like vCloud
> Director/similar. And after 8 years architecting clouds with it, I
> see it the same way. So I'm not exactly inclined to split hairs
> with how it is characterized.

I used vCloud Director for years, and I don't recall getting it for free nor being provided with access to its source outside an NDA.
There also wasn't any way to reach out to the developers for it without a paid service contract (or really even with one most of the time). Sounds like VMware has become a bit more progressive recently? ;) > Bottom line though, ensuring that non-developers are able to > easily able to get their questions answered will, in my personal > opinion, either promote OpenStack or promote the conception that > it requires a team of developers to understand and run which kills > any serious consideration in the boardroom. [...] I wholeheartedly agree with this, and it's basically the point I've been trying to make as well. We need to welcome users and let them ask questions wherever we're all having conversations. Free/libre open source software thrives or withers based on the strength of its user base, not on its technical superiority or novelty. If we don't take every opportunity to accommodate users who engage with us, we're going to have fewer and fewer users... until the day comes when we have none at all. Also as the hype subsides, companies aren't going to throw developer hours at OpenStack just because it looks good in advertisements. We're going to need to learn how to shore up our ranks of developers and maintainers from the only other source available to us: our users. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From tony.pearce at cinglevue.com Sat Jan 18 01:44:16 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Sat, 18 Jan 2020 09:44:16 +0800 Subject: Cinder snapshot delete successful when expected to fail In-Reply-To: References: Message-ID: Thank you. That really helps. I am going to diff the nimble.py files between Pike and Queens and see what's changed. On Fri, 17 Jan 2020, 22:18 Alan Bishop, wrote: > > > On Fri, Jan 17, 2020 at 2:01 AM Tony Pearce > wrote: > >> Could anyone help by pointing me where to go to be able to dig into this >> issue further? >> >> I have installed a test Openstack environment using RDO Packstack. I >> wanted to install the same version that I have in Production (Pike) but >> it's not listed in the CentOS repo via yum search. So I installed Queens. I >> am using nimble.py Cinder driver. Nimble Storage is a storage array >> accessed via iscsi from the Openstack host, and is controlled from >> Openstack by the driver and API. >> >> *What I expected to happen:* >> 1. create an instance with volume (the volume is created on the storage >> array successfully and instance boots from it) >> 2. take a snapshot (snapshot taken on the volume on the array >> successfully) >> 3. create a new instance from the snapshot (the api tells the array to >> clone the snapshot into a new volume on the array and use that volume for >> the instance) >> 4. try and delete the snapshot >> Expected Result - Openstack gives the user a message like "you're not >> allowed to do that". >> >> Note: Step 3 above creates a child volume from the parent snapshot. It's >> impossible to delete the parent snapshot because IO READ is sent to that >> part of the original volume (as I understand it). >> >> *My production problem is this: * >> 1. create an instance with volume (the volume is created on the storage >> array successfully) >> 2. take a snapshot (snapshot taken on the volume on the array >> successfully) >> 3. 
create a new instance from the snapshot (the api tells the array to >> clone the snapshot into a new volume on the array and use that volume for >> the instance) >> 4. try and delete the snapshot >> Result - snapshot goes into error state and later, all Cinder operations >> fail such as new instance/create volume etc. until the correct service is >> restarted. Then everything works once again. >> >> >> To troubleshoot the above, I installed the RDP Packstack Queens (because >> I couldnt get Pike). I tested the above and now, the result is the snapshot >> is successfully deleted from openstack but not deleted on the array. The >> log is below for reference. But I can see the in the log that the array >> sends back info to openstack saying the snapshot has a clone and the delete >> cannot be done because of that. Also response code 409. >> >> *Some info about why the problem with Pike started in the first place* >> 1. Vendor is Nimble Storage which HPE purchased >> 2. HPE/Nimble have dropped support for openstack. Latest supported >> version is Queens and Nimble array version v4.x. The current Array version >> is v5.x. Nimble say there are no guarantees with openstack, the driver and >> the array version v5.x >> 3. I was previously advised by Nimble that the array version v5.x will >> work fine and so left our DR array on v5.x with a pending upgrade that had >> a blocker due to an issue. This issue was resolved in December and the >> pending upgrade completed to match the DR array took place around 30 days >> ago. >> >> >> With regards to the production issue, I assumed that the array API has >> some changes between v4.x and v5.x and it's causing an issue with Cinder >> due to the API response. Although I have not been able to find out if or >> what changes there are that may have occurred after the array upgrade, as >> the documentation for this is Nimble internal-only. >> >> >> *So with that - some questions if I may:* >> When Openstack got the 409 error response from the API (as seen in the >> log below), why would Openstack then proceed to delete the snapshot on the >> Openstack side? How could I debug this further? I'm not sure what Openstack >> Cinder is acting on in terns of the response as yet. Maybe Openstack is not >> specifically looking for the error code in the response? >> >> The snapshot that got deleted on the openstack side is a problem. Would >> this be related to the driver? Could it be possible that the driver did not >> pass the error response to Cinder? >> > > Hi Tony, > > This is exactly what happened, and it appears to be a driver bug > introduced in queens by [1]. The code in question [2] logs the error, but > fails to propagate the exception. As far as the volume manager is > concerned, the snapshot deletion was successful. > > [1] https://review.opendev.org/601492 > [2] > https://opendev.org/openstack/cinder/src/branch/stable/queens/cinder/volume/drivers/nimble.py#L1815 > > Alan > > Thanks in advance. Just for reference, the log snippet is below. >> >> >> ==> volume.log <== >>> 2020-01-17 16:53:23.718 24723 WARNING py.warnings >>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>> 87e34c89e6fb41d2af25085b64011a55 - default default] >>> /usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:852: >>> InsecureRequestWarning: Unverified HTTPS request is being made. Adding >>> certificate verification is strongly advised. 
See: >>> https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings >>> InsecureRequestWarning) >>> : NimbleAPIException: Failed to execute api >>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >>> ==> api.log <== >>> 2020-01-17 16:53:23.769 25242 INFO cinder.api.openstack.wsgi >>> [req-bfcbff34-134b-497e-82b1-082d48f8767f df7548ecad684f26b8bc802ba63a9814 >>> 87e34c89e6fb41d2af25085b64011a55 - default default] >>> http://192.168.53.45:8776/v3/87e34c89e6fb41d2af25085b64011a55/volumes/detail >>> returned with HTTP 200 >>> 2020-01-17 16:53:23.770 25242 INFO eventlet.wsgi.server >>> [req-bfcbff34-134b-497e-82b1-082d48f8767f df7548ecad684f26b8bc802ba63a9814 >>> 87e34c89e6fb41d2af25085b64011a55 - default default] 192.168.53.45 "GET >>> /v3/87e34c89e6fb41d2af25085b64011a55/volumes/detail HTTP/1.1" status: 200 >>> len: 4657 time: 0.1152730 >>> ==> volume.log <== >>> 2020-01-17 16:53:23.811 24723 WARNING py.warnings >>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>> 87e34c89e6fb41d2af25085b64011a55 - default default] >>> /usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:852: >>> InsecureRequestWarning: Unverified HTTPS request is being made. Adding >>> certificate verification is strongly advised. See: >>> https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings >>> InsecureRequestWarning) >>> : NimbleAPIException: Failed to execute api >>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >>> 2020-01-17 16:53:23.902 24723 ERROR cinder.volume.drivers.nimble >>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>> 87e34c89e6fb41d2af25085b64011a55 - default default] Re-throwing Exception >>> Failed to execute api snapshots/0464a5fd65d75fcfe1000000000000011100001a41: >>> Error Code: 409 Message: Snapshot >>> snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone.: >>> NimbleAPIException: Failed to execute api >>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >>> 2020-01-17 16:53:23.903 24723 WARNING cinder.volume.drivers.nimble >>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>> 87e34c89e6fb41d2af25085b64011a55 - default default] Snapshot >>> snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 : has a clone: >>> NimbleAPIException: Failed to execute api >>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >>> 2020-01-17 16:53:23.964 24723 WARNING cinder.quota >>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>> 87e34c89e6fb41d2af25085b64011a55 - default default] Deprecated: Default >>> quota for resource: snapshots_Nimble-DR is set by the default quota flag: >>> quota_snapshots_Nimble-DR, it is now deprecated. Please use the default >>> quota class for default quota. 
>>> 2020-01-17 16:53:24.054 24723 INFO cinder.volume.manager >>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>> 87e34c89e6fb41d2af25085b64011a55 - default default] Delete snapshot >>> completed successfully. >> >> >> >> Regards, >> >> *Tony Pearce* | >> *Senior Network Engineer / Infrastructure Lead**Cinglevue International >> * >> >> Email: tony.pearce at cinglevue.com >> Web: http://www.cinglevue.com >> >> *Australia* >> 1 Walsh Loop, Joondalup, WA 6027 Australia. >> >> Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 >> >> Note: This email and all attachments are the sole property of Cinglevue >> International Pty Ltd. (or any of its subsidiary entities), and the >> information contained herein must be considered confidential, unless >> specified otherwise. If you are not the intended recipient, you must not >> use or forward the information contained in these documents. If you have >> received this message in error, please delete the email and notify the >> sender. >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Sat Jan 18 03:21:37 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 17 Jan 2020 22:21:37 -0500 Subject: DR options with openstack In-Reply-To: <20200117183915.gugkawaqx42z6uvs@yuggoth.org> References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> <20200117183915.gugkawaqx42z6uvs@yuggoth.org> Message-ID: On Fri, Jan 17, 2020 at 1:42 PM Jeremy Stanley wrote: > > On 2020-01-17 21:24:36 +0530 (+0530), Adam Peacock wrote: > [...] > > Also, we need to be clear not everyone leans towards being a > > developer or even *wants* to go in that direction when using > > OpenStack. In fact, most don't and if there is that expectation by > > those entrenched with the OpenStack product, the OpenStack option > > gets dropped in favor of something else. It's developer-friendly > > but we need to be mega-mega-careful, as a community, to ensure > > development isn't the baseline or assumption for adequate support > > or to get questions answered. Especially since we've converged our > > communication channels. > [...] > > Most users probably won't become developers on OpenStack, but some > will, and I believe its long-term survival depends on that so we > should do everything we can to encourage it. Users may also > contribute in a variety of other ways like bug reporting and triage, > outreach, revising or translating documentation, and so on. > > OpenStack isn't a "product," it's a community software collaboration > on which many companies have built products (either by running it as > a service or selling support for it). Treating the community the way > you might treat a paid vendor is where all of this goes to a bad > place very quickly. We've probably strayed a bit far away from the original topic, but I echo this thought very much. OpenStack is a project. $your_favorite_vendor's OpenStack is a product. It's important for us to keep that distinction for the success of both the project and vendors IMHO. > -- > Jeremy Stanley -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. 
https://vexxhost.com From mnaser at vexxhost.com Sat Jan 18 03:22:25 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 17 Jan 2020 22:22:25 -0500 Subject: Galera config values In-Reply-To: References: Message-ID: On Fri, Jan 17, 2020 at 5:20 PM Albert Braden wrote: > > I’m experimenting with Galera in my Rocky openstack-ansible dev cluster, and I’m finding that the default haproxy config values don’t seem to work. Finding the correct values is a lot of work. For example, I spent this morning experimenting with different values for “timeout client” in /etc/haproxy/haproxy.cfg. The default is 1m, and with the default set I see this error in /var/log/nova/nova-scheduler.log on the controllers: > > > > 2020-01-17 13:54:26.059 443358 ERROR oslo_db.sqlalchemy.engines DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1'] (Background on this error at: http://sqlalche.me/e/e3q8) > > > > There are several timeout values in /etc/haproxy/haproxy.cfg. These are the values we started with: > > > > stats timeout 30s > > timeout http-request 10s > > timeout queue 1m > > timeout connect 10s > > timeout client 1m > > timeout server 1m > > timeout check 10s > > > > At first I changed them all to 30m. This stopped the “Lost connection” error in nova-scheduler.log. Then, one at a time, I changed them back to the default. When I got to “timeout client” I found that setting it back to 1m caused the errors to start again. I changed it back and forth and found that 4 minutes causes errors, and 6m stops them, so I left it at 6m. > > > > These are my active variables: > > > > root at us01odc-dev2-ctrl1:/etc/mysql# mysql -e 'show variables;'|grep timeout > > connect_timeout 20 > > deadlock_timeout_long 50000000 > > deadlock_timeout_short 10000 > > delayed_insert_timeout 300 > > idle_readonly_transaction_timeout 0 > > idle_transaction_timeout 0 > > idle_write_transaction_timeout 0 > > innodb_flush_log_at_timeout 1 > > innodb_lock_wait_timeout 50 > > innodb_rollback_on_timeout OFF > > interactive_timeout 28800 > > lock_wait_timeout 86400 > > net_read_timeout 30 > > net_write_timeout 60 > > rpl_semi_sync_master_timeout 10000 > > rpl_semi_sync_slave_kill_conn_timeout 5 > > slave_net_timeout 60 > > thread_pool_idle_timeout 60 > > wait_timeout 3600 > > > > So it looks like the value of “timeout client” in haproxy.cfg needs to match or exceed the value of “wait_timeout” in mysql. Also in nova.conf I see “#connection_recycle_time = 3600” – I need to experiment to see how that value interacts with the timeouts in the other config files. > > > > Is this the best way to find the correct config values? It seems like there should be a document that talks about these timeouts and how to set them (or maybe more generally how the different timeout settings in the various config files interact). Does that document exist? If not, maybe I could write one, since I have to figure out the correct values anyway. Is your cluster pretty idle? I've never seen that happen in any environments before... -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. 
https://vexxhost.com From eandersson at blizzard.com Sat Jan 18 03:36:55 2020 From: eandersson at blizzard.com (Erik Olof Gunnar Andersson) Date: Sat, 18 Jan 2020 03:36:55 +0000 Subject: Galera config values In-Reply-To: References: , Message-ID: <4E49E11B-83FA-4016-9FA1-30CDC377825C@blizzard.com> I can share our haproxt settings on monday, but you need to make sure that haproxy to at least match the Oslo config which I believe is 3600s, but I think in theory something like keepalived is better for galerara. btw pretty sure both client and server needs 3600s. Basically openstack recycles the connection every hour by default. So you need to make sure that haproxy does not close it before that if it’s idle. Sent from my iPhone > On Jan 17, 2020, at 7:24 PM, Mohammed Naser wrote: > > On Fri, Jan 17, 2020 at 5:20 PM Albert Braden > wrote: >> >> I’m experimenting with Galera in my Rocky openstack-ansible dev cluster, and I’m finding that the default haproxy config values don’t seem to work. Finding the correct values is a lot of work. For example, I spent this morning experimenting with different values for “timeout client” in /etc/haproxy/haproxy.cfg. The default is 1m, and with the default set I see this error in /var/log/nova/nova-scheduler.log on the controllers: >> >> >> >> 2020-01-17 13:54:26.059 443358 ERROR oslo_db.sqlalchemy.engines DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1'] (Background on this error at: https://urldefense.com/v3/__http://sqlalche.me/e/e3q8__;!!Ci6f514n9QsL8ck!39gvi32Ldv9W8zhZ_P1JLvkOFM-PelyP_RrU_rT5_EuELR24fLO5P3ShvZ56jfcQ7g$ ) >> >> >> >> There are several timeout values in /etc/haproxy/haproxy.cfg. These are the values we started with: >> >> >> >> stats timeout 30s >> >> timeout http-request 10s >> >> timeout queue 1m >> >> timeout connect 10s >> >> timeout client 1m >> >> timeout server 1m >> >> timeout check 10s >> >> >> >> At first I changed them all to 30m. This stopped the “Lost connection” error in nova-scheduler.log. Then, one at a time, I changed them back to the default. When I got to “timeout client” I found that setting it back to 1m caused the errors to start again. I changed it back and forth and found that 4 minutes causes errors, and 6m stops them, so I left it at 6m. >> >> >> >> These are my active variables: >> >> >> >> root at us01odc-dev2-ctrl1:/etc/mysql# mysql -e 'show variables;'|grep timeout >> >> connect_timeout 20 >> >> deadlock_timeout_long 50000000 >> >> deadlock_timeout_short 10000 >> >> delayed_insert_timeout 300 >> >> idle_readonly_transaction_timeout 0 >> >> idle_transaction_timeout 0 >> >> idle_write_transaction_timeout 0 >> >> innodb_flush_log_at_timeout 1 >> >> innodb_lock_wait_timeout 50 >> >> innodb_rollback_on_timeout OFF >> >> interactive_timeout 28800 >> >> lock_wait_timeout 86400 >> >> net_read_timeout 30 >> >> net_write_timeout 60 >> >> rpl_semi_sync_master_timeout 10000 >> >> rpl_semi_sync_slave_kill_conn_timeout 5 >> >> slave_net_timeout 60 >> >> thread_pool_idle_timeout 60 >> >> wait_timeout 3600 >> >> >> >> So it looks like the value of “timeout client” in haproxy.cfg needs to match or exceed the value of “wait_timeout” in mysql. Also in nova.conf I see “#connection_recycle_time = 3600” – I need to experiment to see how that value interacts with the timeouts in the other config files. >> >> >> >> Is this the best way to find the correct config values? 
It seems like there should be a document that talks about these timeouts and how to set them (or maybe more generally how the different timeout settings in the various config files interact). Does that document exist? If not, maybe I could write one, since I have to figure out the correct values anyway. > > Is your cluster pretty idle? I've never seen that happen in any > environments before... > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. https://urldefense.com/v3/__https://vexxhost.com__;!!Ci6f514n9QsL8ck!39gvi32Ldv9W8zhZ_P1JLvkOFM-PelyP_RrU_rT5_EuELR24fLO5P3ShvZ4PDThJbg$ > From mnaser at vexxhost.com Sat Jan 18 03:40:01 2020 From: mnaser at vexxhost.com (Mohammed Naser) Date: Fri, 17 Jan 2020 22:40:01 -0500 Subject: Galera config values In-Reply-To: <4E49E11B-83FA-4016-9FA1-30CDC377825C@blizzard.com> References: <4E49E11B-83FA-4016-9FA1-30CDC377825C@blizzard.com> Message-ID: On Fri, Jan 17, 2020 at 10:37 PM Erik Olof Gunnar Andersson wrote: > > I can share our haproxt settings on monday, but you need to make sure that haproxy to at least match the Oslo config which I believe is 3600s, but I think in theory something like keepalived is better for galerara. > > btw pretty sure both client and server needs 3600s. Basically openstack recycles the connection every hour by default. So you need to make sure that haproxy does not close it before that if it’s idle. Indeed, this adds up to what we do in OSA https://opendev.org/openstack/openstack-ansible/src/branch/master/inventory/group_vars/haproxy/haproxy.yml#L48-L49 > Sent from my iPhone > > > On Jan 17, 2020, at 7:24 PM, Mohammed Naser wrote: > > > > On Fri, Jan 17, 2020 at 5:20 PM Albert Braden > > wrote: > >> > >> I’m experimenting with Galera in my Rocky openstack-ansible dev cluster, and I’m finding that the default haproxy config values don’t seem to work. Finding the correct values is a lot of work. For example, I spent this morning experimenting with different values for “timeout client” in /etc/haproxy/haproxy.cfg. The default is 1m, and with the default set I see this error in /var/log/nova/nova-scheduler.log on the controllers: > >> > >> > >> > >> 2020-01-17 13:54:26.059 443358 ERROR oslo_db.sqlalchemy.engines DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1'] (Background on this error at: https://urldefense.com/v3/__http://sqlalche.me/e/e3q8__;!!Ci6f514n9QsL8ck!39gvi32Ldv9W8zhZ_P1JLvkOFM-PelyP_RrU_rT5_EuELR24fLO5P3ShvZ56jfcQ7g$ ) > >> > >> > >> > >> There are several timeout values in /etc/haproxy/haproxy.cfg. These are the values we started with: > >> > >> > >> > >> stats timeout 30s > >> > >> timeout http-request 10s > >> > >> timeout queue 1m > >> > >> timeout connect 10s > >> > >> timeout client 1m > >> > >> timeout server 1m > >> > >> timeout check 10s > >> > >> > >> > >> At first I changed them all to 30m. This stopped the “Lost connection” error in nova-scheduler.log. Then, one at a time, I changed them back to the default. When I got to “timeout client” I found that setting it back to 1m caused the errors to start again. I changed it back and forth and found that 4 minutes causes errors, and 6m stops them, so I left it at 6m. 
> >> > >> > >> > >> These are my active variables: > >> > >> > >> > >> root at us01odc-dev2-ctrl1:/etc/mysql# mysql -e 'show variables;'|grep timeout > >> > >> connect_timeout 20 > >> > >> deadlock_timeout_long 50000000 > >> > >> deadlock_timeout_short 10000 > >> > >> delayed_insert_timeout 300 > >> > >> idle_readonly_transaction_timeout 0 > >> > >> idle_transaction_timeout 0 > >> > >> idle_write_transaction_timeout 0 > >> > >> innodb_flush_log_at_timeout 1 > >> > >> innodb_lock_wait_timeout 50 > >> > >> innodb_rollback_on_timeout OFF > >> > >> interactive_timeout 28800 > >> > >> lock_wait_timeout 86400 > >> > >> net_read_timeout 30 > >> > >> net_write_timeout 60 > >> > >> rpl_semi_sync_master_timeout 10000 > >> > >> rpl_semi_sync_slave_kill_conn_timeout 5 > >> > >> slave_net_timeout 60 > >> > >> thread_pool_idle_timeout 60 > >> > >> wait_timeout 3600 > >> > >> > >> > >> So it looks like the value of “timeout client” in haproxy.cfg needs to match or exceed the value of “wait_timeout” in mysql. Also in nova.conf I see “#connection_recycle_time = 3600” – I need to experiment to see how that value interacts with the timeouts in the other config files. > >> > >> > >> > >> Is this the best way to find the correct config values? It seems like there should be a document that talks about these timeouts and how to set them (or maybe more generally how the different timeout settings in the various config files interact). Does that document exist? If not, maybe I could write one, since I have to figure out the correct values anyway. > > > > Is your cluster pretty idle? I've never seen that happen in any > > environments before... > > > > -- > > Mohammed Naser — vexxhost > > ----------------------------------------------------- > > D. 514-316-8872 > > D. 800-910-1726 ext. 200 > > E. mnaser at vexxhost.com > > W. https://urldefense.com/v3/__https://vexxhost.com__;!!Ci6f514n9QsL8ck!39gvi32Ldv9W8zhZ_P1JLvkOFM-PelyP_RrU_rT5_EuELR24fLO5P3ShvZ4PDThJbg$ > > -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. https://vexxhost.com From tony.pearce at cinglevue.com Sat Jan 18 06:48:36 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Sat, 18 Jan 2020 14:48:36 +0800 Subject: DR options with openstack In-Reply-To: References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> <20200117183915.gugkawaqx42z6uvs@yuggoth.org> Message-ID: So if I understand correctly, this says to me that Openstack is never intended to be consumed by end users. Is this correct? Regards On Sat, 18 Jan 2020, 11:28 Mohammed Naser, wrote: > On Fri, Jan 17, 2020 at 1:42 PM Jeremy Stanley wrote: > > > > On 2020-01-17 21:24:36 +0530 (+0530), Adam Peacock wrote: > > [...] > > > Also, we need to be clear not everyone leans towards being a > > > developer or even *wants* to go in that direction when using > > > OpenStack. In fact, most don't and if there is that expectation by > > > those entrenched with the OpenStack product, the OpenStack option > > > gets dropped in favor of something else. It's developer-friendly > > > but we need to be mega-mega-careful, as a community, to ensure > > > development isn't the baseline or assumption for adequate support > > > or to get questions answered. Especially since we've converged our > > > communication channels. > > [...] 
> > > > Most users probably won't become developers on OpenStack, but some > > will, and I believe its long-term survival depends on that so we > > should do everything we can to encourage it. Users may also > > contribute in a variety of other ways like bug reporting and triage, > > outreach, revising or translating documentation, and so on. > > > > OpenStack isn't a "product," it's a community software collaboration > > on which many companies have built products (either by running it as > > a service or selling support for it). Treating the community the way > > you might treat a paid vendor is where all of this goes to a bad > > place very quickly. > > We've probably strayed a bit far away from the original topic, but I > echo this thought very much. > > OpenStack is a project. $your_favorite_vendor's OpenStack is a > product. It's important for us to keep that distinction for the success > of both the project and vendors IMHO. > > > -- > > Jeremy Stanley > > > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. https://vexxhost.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Sat Jan 18 15:02:27 2020 From: smooney at redhat.com (Sean Mooney) Date: Sat, 18 Jan 2020 15:02:27 +0000 Subject: DR options with openstack In-Reply-To: References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> <20200117183915.gugkawaqx42z6uvs@yuggoth.org> Message-ID: <7bb7149f55e643615eb0767b37d33408e14b41df.camel@redhat.com> On Sat, 2020-01-18 at 14:48 +0800, Tony Pearce wrote: > So if I understand correctly, this says to me that Openstack is never > intended to be consumed by end users. > > Is this correct?
No, there are many end users that deploy their OpenStack directly from source or using community installer projects. As an open-source project, developers of the project will try to help our end users fix their own problems by explaining how things work and pointing them in the right direction. If a user reports a legitimate bug we will try to fix it eventually when our day job allows. But in the same way that the MySQL developers won't provide 1:1 support for tuning your DB deployment for your workload, the OpenStack community does not provide deployment planning or day 2 operation support to customers. We obviously try to make day 2 operations easier, and if users report pain points we take that on board, but if you choose to deploy OpenStack directly from source or community distributions without a vendor then you are relying on the good will of individuals. I and others do help end users when they ask genuine questions, providing links to the relevant docs or pointing them to config options or presentations on the topic they ask about. We will also sometimes help diagnose problems people are having, but if that good will is abused then I will go back to my day job. If you show up and demand that we drop everything and fix your problem now, you will likely have a negative experience. Unlike a vendor relationship, we don't have a professional/paid relationship with our end users; on the other hand, if you show up, get involved and help others where you can, then I think you will find most people in the community will try to help you too when you need it. That is one of the big cultural differences between an open-source project and a supported product. A project is a community of people working together to advance a common goal.
A product, on the other hand, has a business's promise of support and with that an expectation that your vendor will go beyond good will if your business is impacted by an issue with their product. If you choose to deploy OpenStack as an end user, understand that while we try to make that easy, the learning curve is high and you need to have the right skills to make it a success, but you can certainly do that if you invest the time and people to do it. kolla-ansible and openstack-ansible provide two of the simplest community installers for managing OpenStack. As installer projects their focus is on day 1 and day 2 operations, and they tend to have more operators involved than the component projects. While you can roll your own, they have already centralised the knowledge of many operators in the solutions they have developed, so I would recommend reaching out to them to learn how you can deploy OpenStack yourself. If you want a product, as others have said, then you should reach out to your vendor of choice and they will help you with both planning your deployment and keeping it running.
> > Regards > > On Sat, 18 Jan 2020, 11:28 Mohammed Naser, wrote: > > > On Fri, Jan 17, 2020 at 1:42 PM Jeremy Stanley wrote: > > > > > > On 2020-01-17 21:24:36 +0530 (+0530), Adam Peacock wrote: > > > [...] > > > > Also, we need to be clear not everyone leans towards being a > > > > developer or even *wants* to go in that direction when using > > > > OpenStack. In fact, most don't and if there is that expectation by > > > > those entrenched with the OpenStack product, the OpenStack option > > > > gets dropped in favor of something else. It's developer-friendly > > > > but we need to be mega-mega-careful, as a community, to ensure > > > > development isn't the baseline or assumption for adequate support > > > > or to get questions answered. Especially since we've converged our > > > > communication channels. > > > > > > [...] > > > > > > Most users probably won't become developers on OpenStack, but some > > > will, and I believe its long-term survival depends on that so we > > > should do everything we can to encourage it. Users may also > > > contribute in a variety of other ways like bug reporting and triage, > > > outreach, revising or translating documentation, and so on. > > > > > > OpenStack isn't a "product," it's a community software collaboration > > > on which many companies have built products (either by running it as > > > a service or selling support for it). Treating the community the way > > > you might treat a paid vendor is where all of this goes to a bad > > > place very quickly. > > > > We've probably strayed a bit far away from the original topic, but I > > echo this thought very much. > > > > OpenStack is a project. $your_favorite_vendor's OpenStack is a > > product. It's important for us to keep that distinction for the success > > of both the project and vendors IMHO. > > > > > -- > > > Jeremy Stanley > > > > > > > > -- > > Mohammed Naser — vexxhost > > ----------------------------------------------------- > > D. 514-316-8872 > > D. 800-910-1726 ext. 200 > > E. mnaser at vexxhost.com > > W. 
https://vexxhost.com > > > > From fungi at yuggoth.org Sat Jan 18 17:09:28 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Sat, 18 Jan 2020 17:09:28 +0000 Subject: DR options with openstack In-Reply-To: References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> <20200117183915.gugkawaqx42z6uvs@yuggoth.org> Message-ID: <20200118170928.efyjrjsqmzsrbigz@yuggoth.org> On 2020-01-18 14:48:36 +0800 (+0800), Tony Pearce wrote: > So if I understand correctly, this says to me that Openstack is > never intended to be consumed by end users. [...] I have no idea how you got that from my message (elided since I can't be bothered to fix your top-posting[*] right now). It's also unclear to me which definition of "end users" you're applying. For these purposes I lump people who install/manage OpenStack deployments and people who interact with OpenStack deployments together, though the latter have an established relationship with the former and that's generally where their recommended support channels lie. End users consume the Linux kernel. Where do they go for support when they have a problem with it? End users consume the bash shell. Where do they go for support with that? You can totally build and install those things yourself from source. When you do that you 1. are assumed to be a somewhat advanced user and 2. might consider reaching out to their developers and other advanced users for them when you run into an issue. They may have time to help you, or they may not. You don't have a paid support contract with them, so while they're likely to try and help you out if they can, they're certainly under no obligation. You can also get those things ready-to-use from various places, and can even buy support for them from someone who *is* obligated to help you. Which path you choose is up to you. [*] https://wiki.openstack.org/wiki/MailingListEtiquette -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From tony.pearce at cinglevue.com Sun Jan 19 04:52:25 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Sun, 19 Jan 2020 12:52:25 +0800 Subject: DR options with openstack In-Reply-To: <20200118170928.efyjrjsqmzsrbigz@yuggoth.org> References: <5e201295.1c69fb81.a69b.d77d@mx.google.com> <20200117183915.gugkawaqx42z6uvs@yuggoth.org> <20200118170928.efyjrjsqmzsrbigz@yuggoth.org> Message-ID: Thanks guys for clarifying 🙂 My question was in reply to Mohammed Naser's email. Sorry for the confusion. On Sun, 19 Jan 2020, 01:17 Jeremy Stanley, wrote: > On 2020-01-18 14:48:36 +0800 (+0800), Tony Pearce wrote: > > So if I understand correctly, this says to me that Openstack is > > never intended to be consumed by end users. > [...] > > I have no idea how you got that from my message (elided since I > can't be bothered to fix your top-posting[*] right now). It's also > unclear to me which definition of "end users" you're applying. For > these purposes I lump people who install/manage OpenStack > deployments and people who interact with OpenStack deployments > together, though the latter have an established relationship with > the former and that's generally where their recommended support > channels lie. > > End users consume the Linux kernel. Where do they go for support > when they have a problem with it? End users consume the bash shell. > Where do they go for support with that? You can totally build and > install those things yourself from source. When you do that you 1. 
> are assumed to be a somewhat advanced user and 2. might consider > reaching out to their developers and other advanced users for them > when you run into an issue. They may have time to help you, or they > may not. You don't have a paid support contract with them, so while > they're likely to try and help you out if they can, they're > certainly under no obligation. > > You can also get those things ready-to-use from various places, and > can even buy support for them from someone who *is* obligated to > help you. Which path you choose is up to you. > > [*] https://wiki.openstack.org/wiki/MailingListEtiquette > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Sun Jan 19 10:48:34 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Sun, 19 Jan 2020 11:48:34 +0100 Subject: [nova] Nova CI busted, please hold rechecks In-Reply-To: <4f03483c-3702-b71f-baca-43585096ca10@fried.cc> References: <4f03483c-3702-b71f-baca-43585096ca10@fried.cc> Message-ID: DevStack unblocked the gate by reverting recent changes. It's green now. Re-proposals of reverted changes will be tested against the could-be-faulty job. Seems we are hitting some odd situation with glance+swift when doing double cirros image upload. All details in bug report mentioned by Eric. -yoctozepto pt., 17 sty 2020 o 16:31 Eric Fried napisał(a): > > The nova-live-migration job is failing 100% since yesterday morning [1]. > Your rechecks won't work until that's resolved. I'll send an all-clear > message when we're green again. > > Thanks, > efried > > [1] https://bugs.launchpad.net/nova/+bug/1860021 > > From gmann at ghanshyammann.com Sun Jan 19 14:34:28 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sun, 19 Jan 2020 08:34:28 -0600 Subject: [nova] Nova CI busted, please hold rechecks In-Reply-To: References: <4f03483c-3702-b71f-baca-43585096ca10@fried.cc> Message-ID: <16fbe39fa9a.12bb88a2472227.6200009198294328118@ghanshyammann.com> now nova-next job is not so happy for multiple network case. Wait for the below patch to merge before recheck. - https://review.opendev.org/#/c/702553/ -gmann ---- On Sun, 19 Jan 2020 04:48:34 -0600 Radosław Piliszek wrote ---- > DevStack unblocked the gate by reverting recent changes. > It's green now. > > Re-proposals of reverted changes will be tested against the could-be-faulty job. > > Seems we are hitting some odd situation with glance+swift when doing > double cirros image upload. > All details in bug report mentioned by Eric. > > -yoctozepto > > pt., 17 sty 2020 o 16:31 Eric Fried napisał(a): > > > > The nova-live-migration job is failing 100% since yesterday morning [1]. > > Your rechecks won't work until that's resolved. I'll send an all-clear > > message when we're green again. 
> > > > Thanks, > > efried > > > > [1] https://bugs.launchpad.net/nova/+bug/1860021 > > > > > > From gmann at ghanshyammann.com Sun Jan 19 17:04:58 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sun, 19 Jan 2020 11:04:58 -0600 Subject: [nova] Nova CI busted, please hold rechecks In-Reply-To: <16fbe39fa9a.12bb88a2472227.6200009198294328118@ghanshyammann.com> References: <4f03483c-3702-b71f-baca-43585096ca10@fried.cc> <16fbe39fa9a.12bb88a2472227.6200009198294328118@ghanshyammann.com> Message-ID: <16fbec3c63d.1084d27af73514.1920681773096468265@ghanshyammann.com> ---- On Sun, 19 Jan 2020 08:34:28 -0600 Ghanshyam Mann wrote ---- > now nova-next job is not so happy for multiple network case. > > Wait for the below patch to merge before recheck. > - https://review.opendev.org/#/c/702553/ This is also merged and you are good to recheck. -gmann > > -gmann > > > ---- On Sun, 19 Jan 2020 04:48:34 -0600 Radosław Piliszek wrote ---- > > DevStack unblocked the gate by reverting recent changes. > > It's green now. > > > > Re-proposals of reverted changes will be tested against the could-be-faulty job. > > > > Seems we are hitting some odd situation with glance+swift when doing > > double cirros image upload. > > All details in bug report mentioned by Eric. > > > > -yoctozepto > > > > pt., 17 sty 2020 o 16:31 Eric Fried napisał(a): > > > > > > The nova-live-migration job is failing 100% since yesterday morning [1]. > > > Your rechecks won't work until that's resolved. I'll send an all-clear > > > message when we're green again. > > > > > > Thanks, > > > efried > > > > > > [1] https://bugs.launchpad.net/nova/+bug/1860021 > > > > > > > > > > > > > From madhuri.kumari at intel.com Mon Jan 20 05:58:36 2020 From: madhuri.kumari at intel.com (Kumari, Madhuri) Date: Mon, 20 Jan 2020 05:58:36 +0000 Subject: [ironic][nova][neutron][cloud-init] Infiniband Support in OpenStack In-Reply-To: References: <0512CBBECA36994BAA14C7FEDE986CA61A5528B0@BGSMSX102.gar.corp.intel.com> Message-ID: <0512CBBECA36994BAA14C7FEDE986CA61A554754@BGSMSX102.gar.corp.intel.com> Hi Clark, Thank you for your response. I think the infiniband MAC address should be mac[36:-14] + mac[51:] as suggested here[1]. I have specified the right MAC address as per this but it still fails. 
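As a quick sanity check of that slicing (a small illustrative snippet, not part of cloud-init or Ironic), applying the expression from [1] to the ib0 hardware address shown in the output below does produce the Ethernet-style address used on the Ironic port:

    # Illustrative only: the mac[36:-14] + mac[51:] slicing referenced above,
    # applied to the 20-byte InfiniBand address of ib0.
    ib_hwaddr = "80:00:00:02:fe:80:00:00:00:00:00:00:00:11:75:01:01:67:0f:b9"
    eth_mac = ib_hwaddr[36:-14] + ib_hwaddr[51:]
    print(eth_mac)  # 00:11:75:67:0f:b9 -- matches the Ironic port address

so the port address itself looks consistent with what cloud-init derives.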
Please see the following output: ib0 interface details: 4: ib0: mtu 4092 qdisc noop state DOWN group default qlen 256 link/infiniband 80:00:00:02:fe:80:00:00:00:00:00:00:00:11:75:01:01:67:0f:b9 brd 00:ff:ff:ff:ff:12:40:1b:80:00:00:00:00:00:00:00:ff:ff:ff:ff Ironic port details: +-----------------------+----------------------------------------------------------------+ | Field | Value | +-----------------------+----------------------------------------------------------------+ | address | 00:11:75:67:0f:b9 | | created_at | 2020-01-16T08:24:15+00:00 | | extra | {'client-id': '0xfe800000000000000011750101670fb9'} | | internal_info | {'tenant_vif_port_id': 'c71b2b31-4231-423c-b512-962623220ddf'} | | is_smartnic | False | | local_link_connection | {} | | node_uuid | 05cce921-931f-4755-ad87-fc41d79a8988 | | physical_network | None | | portgroup_uuid | None | | pxe_enabled | False | | updated_at | 2020-01-16T08:59:47+00:00 | | uuid | 9921139a-63cc-4456-8e85-f7673c5c2b3b | +-----------------------+----------------------------------------------------------------+ [1] https://github.com/canonical/cloud-init/blob/9bfb2ba7268e2c3c932023fc3d3020cdc6d6cc18/cloudinit/net/__init__.py#L795 >>-----Original Message----- >>From: Clark Boylan >>Sent: Saturday, January 18, 2020 3:41 AM >>To: openstack-discuss at lists.openstack.org >>Subject: Re: [ironic][nova][neutron][cloud-init] Infiniband Support in >>OpenStack >> >>On Fri, Jan 17, 2020, at 7:31 AM, Kumari, Madhuri wrote: >>> >>> Hi, >>> >>> >>> I am trying to deploy a node with infiniband in Ironic without any success. >>> >>> >>> The node has two interfaces, eth0 and ib0. The deployment is >>> successful, node becomes active but is not reachable. I debugged and >>> checked that the issue is with cloud-init. The cloud-init fails to >>> configure the network interfaces on the node complaining that the MAC >>> address of infiniband port(ib0) is not known to the node. Ironic >>> provides a fake MAC address for infiniband ports and cloud-init is >>> supposed to generate the actual MAC address of infiband ports[1]. But >>> it fails[2] before reaching there. >> >>Reading the cloud-init code [4][5] it appears that the ethernet format MAC >>should match bytes 13-15 + 18-20 of the infiniband address. Is the problem >>here that the fake MAC supplied is unrelated to the actual infiniband >>address? If so I think you'll either need cloud-init to ignore unknown >>interfaces (as proposed in the cloud-init bug), or have Ironic supply the mac >>address as bytes 13-15 + 18-20 of the actual infiniband address. >> >>> >>> I have posted the issue in cloud-init[3] as well. >>> >>> >>> Can someone please help me with this issue? How do we specify >>> “TYPE=InfiniBand” from OpenStack? Currently the type sent is “phy” only. >>> >>> >>> [1] >>> https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/ >>> helpers/openstack.py#L686 >>> >>> [2] >>> https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/ >>> helpers/openstack.py#L677 >>> >>> [3] https://bugs.launchpad.net/cloud-init/+bug/1857031 >> >>[4] https://github.com/canonical/cloud- >>init/blob/9bfb2ba7268e2c3c932023fc3d3020cdc6d6cc18/cloudinit/net/__init >>__.py#L793-L795 >>[5] https://github.com/canonical/cloud- >>init/blob/9bfb2ba7268e2c3c932023fc3d3020cdc6d6cc18/cloudinit/net/__init >>__.py#L844-L846 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kiseok7 at gmail.com Mon Jan 20 06:47:49 2020 From: kiseok7 at gmail.com (Kim KS) Date: Mon, 20 Jan 2020 15:47:49 +0900 Subject: [nova] I would like to add another option for cross_az_attach Message-ID: Hello all, In nova with setting [cinder]/ cross_az_attach option to false, nova creates instance and volume in same AZ. but some of usecase (in my case), we need to attach new volume in different AZ to the instance. so I need two options. one is for nova block device mapping and attaching volume and another is for attaching volume in specified AZ. [cinder] cross_az_attach = False enable_az_attach_list = AZ1,AZ2 how do you all think of it? Best, Kiseok From mark at stackhpc.com Mon Jan 20 08:39:43 2020 From: mark at stackhpc.com (Mark Goddard) Date: Mon, 20 Jan 2020 08:39:43 +0000 Subject: [ironic][nova][neutron][cloud-init] Infiniband Support in OpenStack In-Reply-To: <0512CBBECA36994BAA14C7FEDE986CA61A5528B0@BGSMSX102.gar.corp.intel.com> References: <0512CBBECA36994BAA14C7FEDE986CA61A5528B0@BGSMSX102.gar.corp.intel.com> Message-ID: On Fri, 17 Jan 2020 at 15:34, Kumari, Madhuri wrote: > > Hi, > > > > I am trying to deploy a node with infiniband in Ironic without any success. > > > > The node has two interfaces, eth0 and ib0. The deployment is successful, node becomes active but is not reachable. I debugged and checked that the issue is with cloud-init. The cloud-init fails to configure the network interfaces on the node complaining that the MAC address of infiniband port(ib0) is not known to the node. Ironic provides a fake MAC address for infiniband ports and cloud-init is supposed to generate the actual MAC address of infiband ports[1]. But it fails[2] before reaching there. > > I have posted the issue in cloud-init[3] as well. > > > > Can someone please help me with this issue? How do we specify “TYPE=InfiniBand” from OpenStack? Currently the type sent is “phy” only. > > > > [1] https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/helpers/openstack.py#L686 > > [2] https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/helpers/openstack.py#L677 > > [3] https://bugs.launchpad.net/cloud-init/+bug/1857031 > Hi Madhuri, Please see my blog post: https://www.stackhpc.com/bare-metal-infiniband.html. One major question to ask is whether you want shared IB network or multi-tenant isolation. The latter is significantly more challenging. It's probably best if you read that article and raise any further questions here or IRC. I'll be out of the office until Wednesday. Mark > > > > > Regards, > > Madhuri > > From tony.pearce at cinglevue.com Mon Jan 20 08:57:19 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Mon, 20 Jan 2020 16:57:19 +0800 Subject: Cinder snapshot delete successful when expected to fail In-Reply-To: References: Message-ID: Hi all, I made some progress on this, but I am unsure how to make it better. 
Long story short: - queens issue = snapshot is deleted from openstack but shouldnt be because the snapshot is unable to be deleted on the storage side - compared pike / queens / stein "nimble.py" and found all are different, with each newer version of openstack having additions in code - done some trial and error tests and decided to use the stein nimble.py and modify it - found the 'delete snapshot' section and found it is set to not halt on error - added 'raise' into the function and re-tested the 'delete snapshot' scenario - RESULT = now the snapshot is NOT deleted but goes "error deleting" instead :) So now after making that change, the snapshot is now in an unavailable status. I am looking as to how I can do something else other than make this snapshot go into an unavailable condition. Such as display a message while keeping the snapshot "available" because it can still be used Short story long: The "nimble.py" driver changes between pike,queens,stein versions (though within the file it has "driver version 4.0.1" on all). Pike has around 1700 lines. Queens has 1900 and Stein has 1910 approx. I confirmed the issue with the driver by copying the nimble.py driver (and the other 2 files named nimble.pyc and nimble.pyo) from Pike into the Queens test env. to test if the snapshot still gets deleted under Queens or shows an error instead. The snapshot was not deleted and it goes error status as expected. note: Initially, I only copied the text data from nimble.py and it appears as though the update to the text file was ignored. It looks to me like, openstack uses one of those .pyc or .pyo files instead. I googled on this and they are binaries that are used in some situations. If I make any changes to the nimble.py file then I need to re-generate those .pyc and .pyo files from the .py. So what is happening here is; I want to try and delete a snapshot that has a clone. The expected outcome is the snapshot is not deleted in Openstack. Current experience is that Openstack deletes the snapshot from the volume snapshots, leaving the snapshot behind on the array storage side. In the volume.log, I see the array sends back an error 409 with "has a clone" response. I managed to find which section is printing the error in the volume.log from the nimble.py driver file and so I edited the text section that gets printed and re-run the test. The volume.log now gets printed with the new text additions I added 'DELETE' and 'Response' words: : NimbleAPIException: DELETE Failed to execute api snapshots/0464a5fd65d75fcfe1000000000000011d00001b1d: Response Error Code: 409 Message: Snapshot snapshot-4ee076ad-2e14-4d0d-bc20-64c17c741f8c for volume volume-7dd64cf1-1d56-4f21-a153-a8137b68c557 has a clone. This is the python code where I made those changes: basically, because the error code is not 200 or 201 then it throws the error from what I understand. 
> def delete(self, api): > url = self.uri + api > r = requests.delete(url, headers=self.headers, verify=self.verify) > if r.status_code != 201 and r.status_code != 200: > base = "DELETE Failed to execute api %(api)s: Response Error > Code: %(code)s" % { > 'api': api, > 'code': r.status_code} > LOG.debug("Base error : %(base)s", {'base': base}) > try: > msg = _("%(base)s Message: %(msg)s") % { > 'base': base, > 'msg': r.json()['messages'][1]['text']} > except IndexError: > msg = _("%(base)s Message: %(msg)s") % { > 'base': base, > 'msg': six.text_type(r.json())} > raise NimbleAPIException(msg) > return r.json() However, slightly before the above within nimble.py I think I have found the function that is causing the initial problem: def delete_snap(self, volume_name, snap_name): > snap_info = self.get_snap_info(snap_name, volume_name) > api = "snapshots/" + six.text_type(snap_info['id']) > try: > self.delete(api) > except NimbleAPIException as ex: > LOG.debug("delete snapshot exception: %s", ex) > if SM_OBJ_HAS_CLONE in six.text_type(ex): > # if snap has a clone log the error and continue ahead > LOG.warning('Snapshot %(snap)s : %(state)s', > {'snap': snap_name, > 'state': SM_OBJ_HAS_CLONE}) > else: > raise SM_OBJ_HAS_CLONE is looking for "has a clone" and it's defined in the beginning of the file: "SM_OBJ_HAS_CLONE = "has a clone"" and I can see this in the log file "has a clone" as a response 409. My problem is literally " # if snap has a clone log the error and continue ahead" - it shouldnt be continuing, because by continuing it is deleting the snapshot on the Openstack side but is unable to do the same on the storage side because of the dependency issue. So what I did next was to look into the different "delete volume" section for some help - because a similar condition can occur there -> to explain; if volume2 is a clone of volume1 then we can't delete volume1 until we first delete volume2. What I notice is that there is a "raise" in that section at the end - I think I understand this to be throwing an exception to openstack. ie to cause an error condition. 
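For reference, the usual way cinder drivers report this situation is to translate the backend's "has a clone" error into one of cinder's "busy" exceptions rather than a bare raise: the volume manager catches those and puts the object back to available instead of error_deleting. A rough, untested sketch of what that could look like for delete_snap (it assumes the module-level "from cinder import exception" import that the delete_volume section shown next already relies on):

    def delete_snap(self, volume_name, snap_name):
        snap_info = self.get_snap_info(snap_name, volume_name)
        api = "snapshots/" + six.text_type(snap_info['id'])
        try:
            self.delete(api)
        except NimbleAPIException as ex:
            LOG.debug("delete snapshot exception: %s", ex)
            if SM_OBJ_HAS_CLONE in six.text_type(ex):
                # The snapshot still has a clone on the array, so report it
                # as busy; the volume manager then keeps the snapshot
                # available in OpenStack instead of deleting its record.
                raise exception.SnapshotIsBusy(snapshot_name=snap_name)
            raise

This mirrors the VolumeIsBusy pattern in the delete_volume section shown next, and matches the suggestion that appears later in this thread.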
Here's the delete volume section from the driver: def delete_volume(self, volume): """Delete the specified volume.""" backup_snap_name, backup_vol_name = self .is_volume_backup_clone(volume) eventlet.sleep(DEFAULT_SLEEP) self.APIExecutor.online_vol(volume['name'], False) LOG.debug("Deleting volume %(vol)s", {'vol': volume['name']}) @utils.retry(NimbleAPIException, retries=3) def _retry_remove_vol(volume): self.APIExecutor.delete_vol(volume['name']) try: _retry_remove_vol(volume) except NimbleAPIException as ex: LOG.debug("delete volume exception: %s", ex) if SM_OBJ_HAS_CLONE in six.text_type(ex): LOG.warning('Volume %(vol)s : %(state)s', {'vol': volume['name'], 'state': SM_OBJ_HAS_CLONE}) # set the volume back to be online and raise busy exception self.APIExecutor.online_vol(volume['name'], True) raise exception.VolumeIsBusy(volume_name=volume['name']) raise So with the above, I modified the delete snapshot section and put in a simple "raise" like this (highlighted in yellow) > > def delete_snap(self, volume_name, snap_name): > snap_info = self.get_snap_info(snap_name, volume_name) > api = "snapshots/" + six.text_type(snap_info['id']) > try: > self.delete(api) > except NimbleAPIException as ex: > LOG.debug("delete snapshot exception: %s", ex) > if SM_OBJ_HAS_CLONE in six.text_type(ex): > # if snap has a clone log the error and continue ahead > LOG.warning('Snapshot %(snap)s : %(state)s', > {'snap': snap_name, > 'state': SM_OBJ_HAS_CLONE}) > raise > else: > raise And now when I test, the snapshot is not deleted but it instead goes into ERROR-DELETING. It's not perfect but at least I can make the snapshot back to "available" from the admin section within Openstack. Would anyone be able to if possible, give me some pointers how to accept this error but not cause the snapshot to go into "error" ? I think that I need to create a class? regards *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. On Sat, 18 Jan 2020 at 09:44, Tony Pearce wrote: > Thank you. That really helps. > > I am going to diff the nimble.py files between Pike and Queens and see > what's changed. > > On Fri, 17 Jan 2020, 22:18 Alan Bishop, wrote: > >> >> >> On Fri, Jan 17, 2020 at 2:01 AM Tony Pearce >> wrote: >> >>> Could anyone help by pointing me where to go to be able to dig into this >>> issue further? >>> >>> I have installed a test Openstack environment using RDO Packstack. I >>> wanted to install the same version that I have in Production (Pike) but >>> it's not listed in the CentOS repo via yum search. So I installed Queens. I >>> am using nimble.py Cinder driver. Nimble Storage is a storage array >>> accessed via iscsi from the Openstack host, and is controlled from >>> Openstack by the driver and API. >>> >>> *What I expected to happen:* >>> 1. create an instance with volume (the volume is created on the storage >>> array successfully and instance boots from it) >>> 2. 
take a snapshot (snapshot taken on the volume on the array >>> successfully) >>> 3. create a new instance from the snapshot (the api tells the array to >>> clone the snapshot into a new volume on the array and use that volume for >>> the instance) >>> 4. try and delete the snapshot >>> Expected Result - Openstack gives the user a message like "you're not >>> allowed to do that". >>> >>> Note: Step 3 above creates a child volume from the parent snapshot. >>> It's impossible to delete the parent snapshot because IO READ is sent to >>> that part of the original volume (as I understand it). >>> >>> *My production problem is this: * >>> 1. create an instance with volume (the volume is created on the storage >>> array successfully) >>> 2. take a snapshot (snapshot taken on the volume on the array >>> successfully) >>> 3. create a new instance from the snapshot (the api tells the array to >>> clone the snapshot into a new volume on the array and use that volume for >>> the instance) >>> 4. try and delete the snapshot >>> Result - snapshot goes into error state and later, all Cinder operations >>> fail such as new instance/create volume etc. until the correct service is >>> restarted. Then everything works once again. >>> >>> >>> To troubleshoot the above, I installed the RDP Packstack Queens (because >>> I couldnt get Pike). I tested the above and now, the result is the snapshot >>> is successfully deleted from openstack but not deleted on the array. The >>> log is below for reference. But I can see the in the log that the array >>> sends back info to openstack saying the snapshot has a clone and the delete >>> cannot be done because of that. Also response code 409. >>> >>> *Some info about why the problem with Pike started in the first place* >>> 1. Vendor is Nimble Storage which HPE purchased >>> 2. HPE/Nimble have dropped support for openstack. Latest supported >>> version is Queens and Nimble array version v4.x. The current Array version >>> is v5.x. Nimble say there are no guarantees with openstack, the driver and >>> the array version v5.x >>> 3. I was previously advised by Nimble that the array version v5.x will >>> work fine and so left our DR array on v5.x with a pending upgrade that had >>> a blocker due to an issue. This issue was resolved in December and the >>> pending upgrade completed to match the DR array took place around 30 days >>> ago. >>> >>> >>> With regards to the production issue, I assumed that the array API has >>> some changes between v4.x and v5.x and it's causing an issue with Cinder >>> due to the API response. Although I have not been able to find out if or >>> what changes there are that may have occurred after the array upgrade, as >>> the documentation for this is Nimble internal-only. >>> >>> >>> *So with that - some questions if I may:* >>> When Openstack got the 409 error response from the API (as seen in the >>> log below), why would Openstack then proceed to delete the snapshot on the >>> Openstack side? How could I debug this further? I'm not sure what Openstack >>> Cinder is acting on in terns of the response as yet. Maybe Openstack is not >>> specifically looking for the error code in the response? >>> >>> The snapshot that got deleted on the openstack side is a problem. Would >>> this be related to the driver? Could it be possible that the driver did not >>> pass the error response to Cinder? >>> >> >> Hi Tony, >> >> This is exactly what happened, and it appears to be a driver bug >> introduced in queens by [1]. 
The code in question [2] logs the error, but >> fails to propagate the exception. As far as the volume manager is >> concerned, the snapshot deletion was successful. >> >> [1] https://review.opendev.org/601492 >> [2] >> https://opendev.org/openstack/cinder/src/branch/stable/queens/cinder/volume/drivers/nimble.py#L1815 >> >> Alan >> >> Thanks in advance. Just for reference, the log snippet is below. >>> >>> >>> ==> volume.log <== >>>> 2020-01-17 16:53:23.718 24723 WARNING py.warnings >>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>> 87e34c89e6fb41d2af25085b64011a55 - default default] >>>> /usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:852: >>>> InsecureRequestWarning: Unverified HTTPS request is being made. Adding >>>> certificate verification is strongly advised. See: >>>> https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings >>>> InsecureRequestWarning) >>>> : NimbleAPIException: Failed to execute api >>>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >>>> ==> api.log <== >>>> 2020-01-17 16:53:23.769 25242 INFO cinder.api.openstack.wsgi >>>> [req-bfcbff34-134b-497e-82b1-082d48f8767f df7548ecad684f26b8bc802ba63a9814 >>>> 87e34c89e6fb41d2af25085b64011a55 - default default] >>>> http://192.168.53.45:8776/v3/87e34c89e6fb41d2af25085b64011a55/volumes/detail >>>> returned with HTTP 200 >>>> 2020-01-17 16:53:23.770 25242 INFO eventlet.wsgi.server >>>> [req-bfcbff34-134b-497e-82b1-082d48f8767f df7548ecad684f26b8bc802ba63a9814 >>>> 87e34c89e6fb41d2af25085b64011a55 - default default] 192.168.53.45 "GET >>>> /v3/87e34c89e6fb41d2af25085b64011a55/volumes/detail HTTP/1.1" status: 200 >>>> len: 4657 time: 0.1152730 >>>> ==> volume.log <== >>>> 2020-01-17 16:53:23.811 24723 WARNING py.warnings >>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>> 87e34c89e6fb41d2af25085b64011a55 - default default] >>>> /usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:852: >>>> InsecureRequestWarning: Unverified HTTPS request is being made. Adding >>>> certificate verification is strongly advised. See: >>>> https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings >>>> InsecureRequestWarning) >>>> : NimbleAPIException: Failed to execute api >>>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >>>> 2020-01-17 16:53:23.902 24723 ERROR cinder.volume.drivers.nimble >>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>> 87e34c89e6fb41d2af25085b64011a55 - default default] Re-throwing Exception >>>> Failed to execute api snapshots/0464a5fd65d75fcfe1000000000000011100001a41: >>>> Error Code: 409 Message: Snapshot >>>> snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone.: >>>> NimbleAPIException: Failed to execute api >>>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. 
>>>> 2020-01-17 16:53:23.903 24723 WARNING cinder.volume.drivers.nimble >>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>> 87e34c89e6fb41d2af25085b64011a55 - default default] Snapshot >>>> snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 : has a clone: >>>> NimbleAPIException: Failed to execute api >>>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >>>> 2020-01-17 16:53:23.964 24723 WARNING cinder.quota >>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>> 87e34c89e6fb41d2af25085b64011a55 - default default] Deprecated: Default >>>> quota for resource: snapshots_Nimble-DR is set by the default quota flag: >>>> quota_snapshots_Nimble-DR, it is now deprecated. Please use the default >>>> quota class for default quota. >>>> 2020-01-17 16:53:24.054 24723 INFO cinder.volume.manager >>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>> 87e34c89e6fb41d2af25085b64011a55 - default default] Delete snapshot >>>> completed successfully. >>> >>> >>> >>> Regards, >>> >>> *Tony Pearce* | >>> *Senior Network Engineer / Infrastructure Lead**Cinglevue International >>> * >>> >>> Email: tony.pearce at cinglevue.com >>> Web: http://www.cinglevue.com >>> >>> *Australia* >>> 1 Walsh Loop, Joondalup, WA 6027 Australia. >>> >>> Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 >>> >>> Note: This email and all attachments are the sole property of Cinglevue >>> International Pty Ltd. (or any of its subsidiary entities), and the >>> information contained herein must be considered confidential, unless >>> specified otherwise. If you are not the intended recipient, you must not >>> use or forward the information contained in these documents. If you have >>> received this message in error, please delete the email and notify the >>> sender. >>> >>> >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balazs.gibizer at est.tech Mon Jan 20 10:00:14 2020 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Mon, 20 Jan 2020 10:00:14 +0000 Subject: [barbican] TPM2.0 backend Message-ID: <1579514411.790283.0@est.tech> Hi, Looking at the Barbican documentation I see that the secrets can be stored on disk (SimpleCrypto backend) or in a HW vendor specific HSM module. Is there a way to use a TPM 2.0 device as the backend of Barbican via something like [1]? Cheers, gibi [1] https://github.com/tpm2-software/tpm2-pkcs11 From rdhasman at redhat.com Mon Jan 20 09:24:48 2020 From: rdhasman at redhat.com (Rajat Dhasmana) Date: Mon, 20 Jan 2020 14:54:48 +0530 Subject: Cinder snapshot delete successful when expected to fail In-Reply-To: References: Message-ID: Hi Tony, The 'raise' you used raises 'NimbleAPIException' which isn't handled anywhere and is caught by the generic exception block here[1] which sets your snapshot to error_deleting state. My suggestion to try raise exception.SnapshotIsBusy exception which will be caught here[2] and will set your snapshot to available state. Let me know if it works. [1] https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L1242 [2] https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L1228 Regards Rajat Dhasmana On Mon, Jan 20, 2020 at 2:35 PM Tony Pearce wrote: > Hi all, > I made some progress on this, but I am unsure how to make it better. 
Long > story short: > > - queens issue = snapshot is deleted from openstack but shouldnt be > because the snapshot is unable to be deleted on the storage side > - compared pike / queens / stein "nimble.py" and found all are > different, with each newer version of openstack having additions in code > - done some trial and error tests and decided to use the stein > nimble.py and modify it > - found the 'delete snapshot' section and found it is set to not halt > on error > - added 'raise' into the function and re-tested the 'delete snapshot' > scenario > - RESULT = now the snapshot is NOT deleted but goes "error deleting" > instead :) > > So now after making that change, the snapshot is now in an unavailable > status. I am looking as to how I can do something else other than make this > snapshot go into an unavailable condition. Such as display a message while > keeping the snapshot "available" because it can still be used > > Short story long: > > The "nimble.py" driver changes between pike,queens,stein versions (though > within the file it has "driver version 4.0.1" on all). Pike has around 1700 > lines. Queens has 1900 and Stein has 1910 approx. > > I confirmed the issue with the driver by copying the nimble.py driver (and > the other 2 files named nimble.pyc and nimble.pyo) from Pike into the > Queens test env. to test if the snapshot still gets deleted under Queens or > shows an error instead. The snapshot was not deleted and it goes error > status as expected. > note: Initially, I only copied the text data from nimble.py and it appears > as though the update to the text file was ignored. It looks to me like, > openstack uses one of those .pyc or .pyo files instead. I googled on this > and they are binaries that are used in some situations. If I make any > changes to the nimble.py file then I need to re-generate those .pyc and > .pyo files from the .py. > > So what is happening here is; I want to try and delete a snapshot that has > a clone. The expected outcome is the snapshot is not deleted in Openstack. > Current experience is that Openstack deletes the snapshot from the volume > snapshots, leaving the snapshot behind on the array storage side. > > In the volume.log, I see the array sends back an error 409 with "has a > clone" response. I managed to find which section is printing the error in > the volume.log from the nimble.py driver file and so I edited the text > section that gets printed and re-run the test. The volume.log now gets > printed with the new text additions I added 'DELETE' and 'Response' words: > > : NimbleAPIException: DELETE Failed to execute api > snapshots/0464a5fd65d75fcfe1000000000000011d00001b1d: Response Error Code: > 409 Message: Snapshot snapshot-4ee076ad-2e14-4d0d-bc20-64c17c741f8c for > volume volume-7dd64cf1-1d56-4f21-a153-a8137b68c557 has a clone. > > This is the python code where I made those changes: basically, because the > error code is not 200 or 201 then it throws the error from what I > understand. 
> >> def delete(self, api): >> url = self.uri + api >> r = requests.delete(url, headers=self.headers, verify=self.verify) >> if r.status_code != 201 and r.status_code != 200: >> base = "DELETE Failed to execute api %(api)s: Response Error >> Code: %(code)s" % { >> 'api': api, >> 'code': r.status_code} >> LOG.debug("Base error : %(base)s", {'base': base}) >> try: >> msg = _("%(base)s Message: %(msg)s") % { >> 'base': base, >> 'msg': r.json()['messages'][1]['text']} >> except IndexError: >> msg = _("%(base)s Message: %(msg)s") % { >> 'base': base, >> 'msg': six.text_type(r.json())} >> raise NimbleAPIException(msg) >> return r.json() > > > > However, slightly before the above within nimble.py I think I have found > the function that is causing the initial problem: > > def delete_snap(self, volume_name, snap_name): >> snap_info = self.get_snap_info(snap_name, volume_name) >> api = "snapshots/" + six.text_type(snap_info['id']) >> try: >> self.delete(api) >> except NimbleAPIException as ex: >> LOG.debug("delete snapshot exception: %s", ex) >> if SM_OBJ_HAS_CLONE in six.text_type(ex): >> # if snap has a clone log the error and continue ahead >> LOG.warning('Snapshot %(snap)s : %(state)s', >> {'snap': snap_name, >> 'state': SM_OBJ_HAS_CLONE}) >> else: >> raise > > > SM_OBJ_HAS_CLONE is looking for "has a clone" and it's defined in the > beginning of the file: "SM_OBJ_HAS_CLONE = "has a clone"" and I can see > this in the log file "has a clone" as a response 409. > > My problem is literally " # if snap has a clone log the error and continue > ahead" - it shouldnt be continuing, because by continuing it is deleting > the snapshot on the Openstack side but is unable to do the same on the > storage side because of the dependency issue. > > So what I did next was to look into the different "delete volume" section > for some help - because a similar condition can occur there -> to explain; > if volume2 is a clone of volume1 then we can't delete volume1 until we > first delete volume2. > What I notice is that there is a "raise" in that section at the end - I > think I understand this to be throwing an exception to openstack. ie to > cause an error condition. 
> > Here's the delete volume section from the driver: > > def delete_volume(self, volume): > """Delete the specified volume.""" > backup_snap_name, backup_vol_name = self > .is_volume_backup_clone(volume) > eventlet.sleep(DEFAULT_SLEEP) > self.APIExecutor.online_vol(volume['name'], False) > LOG.debug("Deleting volume %(vol)s", {'vol': volume['name']}) > > @utils.retry(NimbleAPIException, retries=3) > def _retry_remove_vol(volume): > self.APIExecutor.delete_vol(volume['name']) > try: > _retry_remove_vol(volume) > except NimbleAPIException as ex: > LOG.debug("delete volume exception: %s", ex) > if SM_OBJ_HAS_CLONE in six.text_type(ex): > LOG.warning('Volume %(vol)s : %(state)s', > {'vol': volume['name'], > 'state': SM_OBJ_HAS_CLONE}) > > # set the volume back to be online and raise busy exception > self.APIExecutor.online_vol(volume['name'], True) > raise exception.VolumeIsBusy(volume_name=volume['name']) > raise > > > So with the above, I modified the delete snapshot section and put in a > simple "raise" like this (highlighted in yellow) > >> >> def delete_snap(self, volume_name, snap_name): >> snap_info = self.get_snap_info(snap_name, volume_name) >> api = "snapshots/" + six.text_type(snap_info['id']) >> try: >> self.delete(api) >> except NimbleAPIException as ex: >> LOG.debug("delete snapshot exception: %s", ex) >> if SM_OBJ_HAS_CLONE in six.text_type(ex): >> # if snap has a clone log the error and continue ahead >> LOG.warning('Snapshot %(snap)s : %(state)s', >> {'snap': snap_name, >> 'state': SM_OBJ_HAS_CLONE}) >> raise >> else: >> raise > > > > And now when I test, the snapshot is not deleted but it instead goes into > ERROR-DELETING. It's not perfect but at least I can make the snapshot back > to "available" from the admin section within Openstack. > > Would anyone be able to if possible, give me some pointers how to accept > this error but not cause the snapshot to go into "error" ? I think that I > need to create a class? > > regards > > *Tony Pearce* | > *Senior Network Engineer / Infrastructure Lead**Cinglevue International > * > > Email: tony.pearce at cinglevue.com > Web: http://www.cinglevue.com > > *Australia* > 1 Walsh Loop, Joondalup, WA 6027 Australia. > > Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 > > Note: This email and all attachments are the sole property of Cinglevue > International Pty Ltd. (or any of its subsidiary entities), and the > information contained herein must be considered confidential, unless > specified otherwise. If you are not the intended recipient, you must not > use or forward the information contained in these documents. If you have > received this message in error, please delete the email and notify the > sender. > > > > > On Sat, 18 Jan 2020 at 09:44, Tony Pearce > wrote: > >> Thank you. That really helps. >> >> I am going to diff the nimble.py files between Pike and Queens and see >> what's changed. >> >> On Fri, 17 Jan 2020, 22:18 Alan Bishop, wrote: >> >>> >>> >>> On Fri, Jan 17, 2020 at 2:01 AM Tony Pearce >>> wrote: >>> >>>> Could anyone help by pointing me where to go to be able to dig into >>>> this issue further? >>>> >>>> I have installed a test Openstack environment using RDO Packstack. I >>>> wanted to install the same version that I have in Production (Pike) but >>>> it's not listed in the CentOS repo via yum search. So I installed Queens. I >>>> am using nimble.py Cinder driver. Nimble Storage is a storage array >>>> accessed via iscsi from the Openstack host, and is controlled from >>>> Openstack by the driver and API. 
>>>> >>>> *What I expected to happen:* >>>> 1. create an instance with volume (the volume is created on the storage >>>> array successfully and instance boots from it) >>>> 2. take a snapshot (snapshot taken on the volume on the array >>>> successfully) >>>> 3. create a new instance from the snapshot (the api tells the array to >>>> clone the snapshot into a new volume on the array and use that volume for >>>> the instance) >>>> 4. try and delete the snapshot >>>> Expected Result - Openstack gives the user a message like "you're not >>>> allowed to do that". >>>> >>>> Note: Step 3 above creates a child volume from the parent snapshot. >>>> It's impossible to delete the parent snapshot because IO READ is sent to >>>> that part of the original volume (as I understand it). >>>> >>>> *My production problem is this: * >>>> 1. create an instance with volume (the volume is created on the storage >>>> array successfully) >>>> 2. take a snapshot (snapshot taken on the volume on the array >>>> successfully) >>>> 3. create a new instance from the snapshot (the api tells the array to >>>> clone the snapshot into a new volume on the array and use that volume for >>>> the instance) >>>> 4. try and delete the snapshot >>>> Result - snapshot goes into error state and later, all Cinder >>>> operations fail such as new instance/create volume etc. until the correct >>>> service is restarted. Then everything works once again. >>>> >>>> >>>> To troubleshoot the above, I installed the RDP Packstack Queens >>>> (because I couldnt get Pike). I tested the above and now, the result is the >>>> snapshot is successfully deleted from openstack but not deleted on the >>>> array. The log is below for reference. But I can see the in the log that >>>> the array sends back info to openstack saying the snapshot has a clone and >>>> the delete cannot be done because of that. Also response code 409. >>>> >>>> *Some info about why the problem with Pike started in the first place* >>>> 1. Vendor is Nimble Storage which HPE purchased >>>> 2. HPE/Nimble have dropped support for openstack. Latest supported >>>> version is Queens and Nimble array version v4.x. The current Array version >>>> is v5.x. Nimble say there are no guarantees with openstack, the driver and >>>> the array version v5.x >>>> 3. I was previously advised by Nimble that the array version v5.x will >>>> work fine and so left our DR array on v5.x with a pending upgrade that had >>>> a blocker due to an issue. This issue was resolved in December and the >>>> pending upgrade completed to match the DR array took place around 30 days >>>> ago. >>>> >>>> >>>> With regards to the production issue, I assumed that the array API has >>>> some changes between v4.x and v5.x and it's causing an issue with Cinder >>>> due to the API response. Although I have not been able to find out if or >>>> what changes there are that may have occurred after the array upgrade, as >>>> the documentation for this is Nimble internal-only. >>>> >>>> >>>> *So with that - some questions if I may:* >>>> When Openstack got the 409 error response from the API (as seen in the >>>> log below), why would Openstack then proceed to delete the snapshot on the >>>> Openstack side? How could I debug this further? I'm not sure what Openstack >>>> Cinder is acting on in terns of the response as yet. Maybe Openstack is not >>>> specifically looking for the error code in the response? >>>> >>>> The snapshot that got deleted on the openstack side is a problem. Would >>>> this be related to the driver? 
Could it be possible that the driver did not >>>> pass the error response to Cinder? >>>> >>> >>> Hi Tony, >>> >>> This is exactly what happened, and it appears to be a driver bug >>> introduced in queens by [1]. The code in question [2] logs the error, but >>> fails to propagate the exception. As far as the volume manager is >>> concerned, the snapshot deletion was successful. >>> >>> [1] https://review.opendev.org/601492 >>> [2] >>> https://opendev.org/openstack/cinder/src/branch/stable/queens/cinder/volume/drivers/nimble.py#L1815 >>> >>> Alan >>> >>> Thanks in advance. Just for reference, the log snippet is below. >>>> >>>> >>>> ==> volume.log <== >>>>> 2020-01-17 16:53:23.718 24723 WARNING py.warnings >>>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>>> 87e34c89e6fb41d2af25085b64011a55 - default default] >>>>> /usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:852: >>>>> InsecureRequestWarning: Unverified HTTPS request is being made. Adding >>>>> certificate verification is strongly advised. See: >>>>> https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings >>>>> InsecureRequestWarning) >>>>> : NimbleAPIException: Failed to execute api >>>>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>>>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>>>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >>>>> ==> api.log <== >>>>> 2020-01-17 16:53:23.769 25242 INFO cinder.api.openstack.wsgi >>>>> [req-bfcbff34-134b-497e-82b1-082d48f8767f df7548ecad684f26b8bc802ba63a9814 >>>>> 87e34c89e6fb41d2af25085b64011a55 - default default] >>>>> http://192.168.53.45:8776/v3/87e34c89e6fb41d2af25085b64011a55/volumes/detail >>>>> returned with HTTP 200 >>>>> 2020-01-17 16:53:23.770 25242 INFO eventlet.wsgi.server >>>>> [req-bfcbff34-134b-497e-82b1-082d48f8767f df7548ecad684f26b8bc802ba63a9814 >>>>> 87e34c89e6fb41d2af25085b64011a55 - default default] 192.168.53.45 "GET >>>>> /v3/87e34c89e6fb41d2af25085b64011a55/volumes/detail HTTP/1.1" status: 200 >>>>> len: 4657 time: 0.1152730 >>>>> ==> volume.log <== >>>>> 2020-01-17 16:53:23.811 24723 WARNING py.warnings >>>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>>> 87e34c89e6fb41d2af25085b64011a55 - default default] >>>>> /usr/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py:852: >>>>> InsecureRequestWarning: Unverified HTTPS request is being made. Adding >>>>> certificate verification is strongly advised. See: >>>>> https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings >>>>> InsecureRequestWarning) >>>>> : NimbleAPIException: Failed to execute api >>>>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>>>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>>>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. 
>>>>> 2020-01-17 16:53:23.902 24723 ERROR cinder.volume.drivers.nimble >>>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>>> 87e34c89e6fb41d2af25085b64011a55 - default default] Re-throwing Exception >>>>> Failed to execute api snapshots/0464a5fd65d75fcfe1000000000000011100001a41: >>>>> Error Code: 409 Message: Snapshot >>>>> snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>>>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone.: >>>>> NimbleAPIException: Failed to execute api >>>>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>>>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>>>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >>>>> 2020-01-17 16:53:23.903 24723 WARNING cinder.volume.drivers.nimble >>>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>>> 87e34c89e6fb41d2af25085b64011a55 - default default] Snapshot >>>>> snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 : has a clone: >>>>> NimbleAPIException: Failed to execute api >>>>> snapshots/0464a5fd65d75fcfe1000000000000011100001a41: Error Code: 409 >>>>> Message: Snapshot snapshot-3806efc5-65ca-495a-a87a-baaddc9607d9 for volume >>>>> volume-5b02db35-8d5c-4ef6-b0e7-2f9b62cac57e has a clone. >>>>> 2020-01-17 16:53:23.964 24723 WARNING cinder.quota >>>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>>> 87e34c89e6fb41d2af25085b64011a55 - default default] Deprecated: Default >>>>> quota for resource: snapshots_Nimble-DR is set by the default quota flag: >>>>> quota_snapshots_Nimble-DR, it is now deprecated. Please use the default >>>>> quota class for default quota. >>>>> 2020-01-17 16:53:24.054 24723 INFO cinder.volume.manager >>>>> [req-60fe4335-af66-4c46-9bbd-2408bf4d6f07 df7548ecad684f26b8bc802ba63a9814 >>>>> 87e34c89e6fb41d2af25085b64011a55 - default default] Delete snapshot >>>>> completed successfully. >>>> >>>> >>>> >>>> Regards, >>>> >>>> *Tony Pearce* | >>>> *Senior Network Engineer / Infrastructure Lead**Cinglevue >>>> International * >>>> >>>> Email: tony.pearce at cinglevue.com >>>> Web: http://www.cinglevue.com >>>> >>>> *Australia* >>>> 1 Walsh Loop, Joondalup, WA 6027 Australia. >>>> >>>> Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 >>>> >>>> Note: This email and all attachments are the sole property of Cinglevue >>>> International Pty Ltd. (or any of its subsidiary entities), and the >>>> information contained herein must be considered confidential, unless >>>> specified otherwise. If you are not the intended recipient, you must not >>>> use or forward the information contained in these documents. If you have >>>> received this message in error, please delete the email and notify the >>>> sender. >>>> >>>> >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From lyarwood at redhat.com Mon Jan 20 15:41:07 2020 From: lyarwood at redhat.com (Lee Yarwood) Date: Mon, 20 Jan 2020 15:41:07 +0000 Subject: [queens][nova] iscsi issue In-Reply-To: References: Message-ID: <20200120154107.czjws3n3p5rl64nu@lyarwood.usersys.redhat.com> On 17-01-20 19:30:13, Ignazio Cassano wrote: > Hello all we are testing openstack queens cinder driver for Unity iscsi > (driver cinder.volume.drivers.dell_emc.unity.Driver). 
> > The unity storage is a Unity600 Version 4.5.10.5.001 > > We are facing an issue when we try to detach volume from a virtual machine > with two or more volumes attached (this happens often but not always): Could you write this up as an os-brick bug and attach the nova-compute log in DEBUG showing the initial volume attachments prior to this detach error? https://bugs.launchpad.net/os-brick/+filebug > The following is reported nova-compute.log: > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > self.connector.disconnect_volume(connection_info['data'], None) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/os_brick/utils.py", line 137, in > trace_logging_wrapper > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return > f(*args, **kwargs) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, > in inner > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return > f(*args, **kwargs) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", > line 848, in disconnect_volume > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > device_info=device_info) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", > line 892, in _cleanup_connection > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server path_used, > was_multipath) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line > 271, in remove_connection > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > self.flush_multipath_device(multipath_name) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line > 329, in flush_multipath_device > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > root_helper=self._root_helper) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/os_brick/executor.py", line 52, in > _execute > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server result = > self.__execute(*args, **kwargs) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py", line > 169, in execute > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return > execute_root(*cmd, **kwargs) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line 207, > in _wrap > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return > self.channel.remote_call(name, args, kwargs) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 202, in > remote_call > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server raise > exc_type(*result[2]) > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > ProcessExecutionError: Unexpected error while running command. 
> 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Command: > multipath -f 36006016006e04400d0c4215e3ec55757 > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Exit code: 1 > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Stdout: u'Jan > 17 16:04:30 | 36006016006e04400d0c4215e3ec55757p1: map in use\nJan 17 > 16:04:31 | failed to remove multipath map > 36006016006e04400d0c4215e3ec55757\n' > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Stderr: u'' > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > > > Best Regards > > Ignazio -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From haleyb.dev at gmail.com Mon Jan 20 16:07:31 2020 From: haleyb.dev at gmail.com (Brian Haley) Date: Mon, 20 Jan 2020 11:07:31 -0500 Subject: [neutron] Bug deputy report for week of January 13th Message-ID: Hi, I was Neutron bug deputy last week. Below is a short summary about reported bugs. -Brian Critical bugs ------------- * https://bugs.launchpad.net/neutron/+bug/1859988 - neutron-tempest-plugin tests fail for stable/queens - bcafarel picked up, related to below bug * https://bugs.launchpad.net/neutron/+bug/1860033 - neutron tempest jobs broken on rocky due to requirements neutron-lib upgrade - many proposed changes, discussion on ML High bugs --------- * https://bugs.launchpad.net/neutron/+bug/1859977 - [OVN] Update of floatingip creates new row in NBDB NAT table - maciej picked up * https://bugs.launchpad.net/neutron/+bug/1860140 - [OVN] Provider driver sends malformed update to Octavia - https://review.opendev.org/#/c/703097 * https://bugs.launchpad.net/neutron/+bug/1860141 - [OVN] Provider driver fails while LB VIP port already created - https://review.opendev.org/#/c/703110 Medium bugs ----------- * https://bugs.launchpad.net/neutron/+bug/1859832 - L3 HA connectivity to GW port can be broken after reboot of backup node - Issue with MLDv2 packet causing issues with connections - Slawek picked up ownership * https://bugs.launchpad.net/neutron/+bug/1860273 - https://review.opendev.org/#/c/703292/ Low bugs -------- * https://bugs.launchpad.net/neutron/+bug/1859765 - Sgs of device_owner_network port can be update without specifing "device_owner" attr - https://review.opendev.org/#/c/702632/ - discussion in review * https://bugs.launchpad.net/neutron/+bug/1859962 - Sanity checks comparing versions should not use decimal comparison - https://review.opendev.org/#/c/702847/ Wishlist bugs ------------- Invalid bugs ------------ * https://bugs.launchpad.net/neutron/+bug/1859638 - VIP between dvr east-west networks does not work at all - Duplicate of https://bugs.launchpad.net/neutron/+bug/1774459 * https://bugs.launchpad.net/neutron/+bug/1859976 - Removing smart_nic port in openstack will try to delete representor port - os-vif issue Further triage required ----------------------- * https://bugs.launchpad.net/neutron/+bug/1859362 - Neutron accepts arbitrary MTU values for networks - Rodolfo has some questions on the bug * https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1859649 - networking disruption on upgrade from 14.0.0 to 14.0.3 - Rodolfo had question about restart order of services * https://bugs.launchpad.net/neutron/+bug/1859887 - External connectivity broken because of stale FIP rule - Asked for release information, etc * 
https://bugs.launchpad.net/networking-ovn/+bug/1859965 - networking-ovn's octavia api,how get octavia-api DB use python code? - networking-ovn octavia driver bug? asked for more information From rico.lin.guanyu at gmail.com Mon Jan 20 16:47:19 2020 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Tue, 21 Jan 2020 00:47:19 +0800 Subject: [Multi-Arch-sig] Meeting this week Message-ID: Hi all. As reminder, we will host our meeting this week. Tuesday at 0800 UTC in #openstack-meeting-alt and 1500 UTC in #openstack-meeting . Feel free to propose agenda here https://etherpad.openstack.org/p/Multi-Arch-agenda See you all in meeting:) -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdulko at redhat.com Mon Jan 20 16:51:47 2020 From: mdulko at redhat.com (=?UTF-8?Q?Micha=C5=82?= Dulko) Date: Mon, 20 Jan 2020 17:51:47 +0100 Subject: [kuryr] Deprecation of Ingress support and namespace isolation Message-ID: <97e0d4450c2777bcc4f8f2aff39dfdb0150fbe09.camel@redhat.com> Hi, I've decided to put up a patch [1] deprecating the aforementioned features. It's motivated by the fact that there are better ways to do both: * Ingress can be done by another controller or through cloud provider. * Namespace isolation can be achieved through network policies. Both alternative ways are way better tested and there's nobody maintaining the deprecated features. I'm open to keep them if someone using them steps up. With the fact that Kuryr seems to be tied more with Kubernetes releases than with OpenStack ones and given there will be no objections, we might start removing the code in the Ussuri timeframe. [1] https://review.opendev.org/#/c/703420/ Thanks, Michał From sean.mcginnis at gmail.com Mon Jan 20 18:01:19 2020 From: sean.mcginnis at gmail.com (Sean McGinnis) Date: Mon, 20 Jan 2020 12:01:19 -0600 Subject: [release] Python universal wheel support Message-ID: <64cc9fee-de60-7584-3f9e-7cb3b3be28aa@gmail.com> Greetings, We have just merged a change to the release-openstack-python job that changes the wheels we produce to not be universal. For some background - we added an explicit flag of "--universal" to the creation of wheels a while back. For projects that have both Python 2 and 3 support, you want an universal wheel. Not all (probably most) projects did not add this flag to their setup.cfg, so overriding at the release job level was considered a good way to make sure our output was what we wanted at the time. We now have the majority of projects dropping py2 support, so we actually no longer want to create these universal wheels if py2 support has been dropped. That has actually been seen to cause some issues. The downside with this change is that the job is for *all* deliverables we release, including stable releases. So with this change, any new stable branches will no longer get universal wheels if the flag has not been set locally. This was deemed a good tradeoff with the current needs though. The lack of a univeral wheel may just make installation of py2 stable deliverables just slightly slower, but should not cause any real issues. Actions ------ Most likely this won't require any actions on the project team's part. If you have a project that still supports both py2 and py3 and do not have the flag set in setup.cfg, that can be added to still get the universal wheels built. 
That is done by adding the following: ``` [bdist_wheel] universal = 1 ``` Again, the performance impact is probably very minimal during installation, so this shouldn't be a major concern. If there are any oddities noticed after this change, please bring them up and we can help investigate what is happening. Thanks! Sean From gmann at ghanshyammann.com Mon Jan 20 18:46:34 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 20 Jan 2020 12:46:34 -0600 Subject: [neutron] Bug deputy report for week of January 13th In-Reply-To: References: Message-ID: <16fc4472431.edd8d6c6118183.415190410856446702@ghanshyammann.com> ---- On Mon, 20 Jan 2020 10:07:31 -0600 Brian Haley wrote ---- > Hi, > > I was Neutron bug deputy last week. Below is a short summary about > reported bugs. > > -Brian > > > Critical bugs > ------------- > > * https://bugs.launchpad.net/neutron/+bug/1859988 > - neutron-tempest-plugin tests fail for stable/queens > - bcafarel picked up, related to below bug This needs to be done at devstack level not only on job side. We have to cap the Tempest and use the corresponding upper-constraint for tempest venv. I have started the work on End of Queens support in Tempest[1] and based on what change exactly made Temepst master to fail on queens I will proceed to pin the Tempest on devstack queens like done for ocata and pike. [1] https://review.opendev.org/#/c/703255/ -gmann > > * https://bugs.launchpad.net/neutron/+bug/1860033 - neutron tempest jobs > broken on rocky due to requirements neutron-lib upgrade > - many proposed changes, discussion on ML > > High bugs > --------- > > * https://bugs.launchpad.net/neutron/+bug/1859977 - [OVN] Update of > floatingip creates new row in NBDB NAT table > - maciej picked up > > * https://bugs.launchpad.net/neutron/+bug/1860140 - [OVN] Provider > driver sends malformed update to Octavia > - https://review.opendev.org/#/c/703097 > > * https://bugs.launchpad.net/neutron/+bug/1860141 - [OVN] Provider > driver fails while LB VIP port already created > - https://review.opendev.org/#/c/703110 > > Medium bugs > ----------- > > * https://bugs.launchpad.net/neutron/+bug/1859832 - L3 HA connectivity > to GW port can be broken after reboot of backup node > - Issue with MLDv2 packet causing issues with connections > - Slawek picked up ownership > > * https://bugs.launchpad.net/neutron/+bug/1860273 > - https://review.opendev.org/#/c/703292/ > > Low bugs > -------- > > * https://bugs.launchpad.net/neutron/+bug/1859765 - Sgs of > device_owner_network port can be update without specifing "device_owner" > attr > - https://review.opendev.org/#/c/702632/ - discussion in review > > * https://bugs.launchpad.net/neutron/+bug/1859962 - Sanity checks > comparing versions should not use decimal comparison > - https://review.opendev.org/#/c/702847/ > > Wishlist bugs > ------------- > > Invalid bugs > ------------ > > * https://bugs.launchpad.net/neutron/+bug/1859638 - VIP between dvr > east-west networks does not work at all > - Duplicate of https://bugs.launchpad.net/neutron/+bug/1774459 > > * https://bugs.launchpad.net/neutron/+bug/1859976 - Removing smart_nic > port in openstack will try to delete representor port > - os-vif issue > > Further triage required > ----------------------- > > * https://bugs.launchpad.net/neutron/+bug/1859362 - Neutron accepts > arbitrary MTU values for networks > - Rodolfo has some questions on the bug > > * https://bugs.launchpad.net/ubuntu/+source/neutron/+bug/1859649 - > networking disruption on upgrade from 14.0.0 to 14.0.3 > - 
Rodolfo had question about restart order of services > > * https://bugs.launchpad.net/neutron/+bug/1859887 - > External connectivity broken because of stale FIP rule > - Asked for release information, etc > > * https://bugs.launchpad.net/networking-ovn/+bug/1859965 - > networking-ovn's octavia api,how get octavia-api DB use python code? > - networking-ovn octavia driver bug? asked for more information > > From skaplons at redhat.com Mon Jan 20 20:26:27 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 20 Jan 2020 21:26:27 +0100 Subject: [all][neutron][neutron-fwaas] FINAL CALL Maintainers needed In-Reply-To: References: <20191119102615.oq46xojyhoybulna@skaplons-mac> Message-ID: Hi, We are getting closer and closer to Ussuri-2 milestone which is our deadline to deprecate neutron-fwaas project if there will be no any volunteers to maintain this project. So if You are interested in this project, please raise Your hand here or ping me on IRC about that. > On 6 Jan 2020, at 21:05, Slawek Kaplonski wrote: > > Hi, > > Just as a reminder, we are still looking for maintainers who want to keep neutron-fwaas project alive. As it was written in my previous email, we will mark this project as deprecated. > So please reply to this email or contact me directly if You are interested in maintaining this project. > >> On 19 Nov 2019, at 11:26, Slawek Kaplonski wrote: >> >> Hi, >> >> Over the past couple of cycles we have noticed that new contributions and >> maintenance efforts for neutron-fwaas project were almost non existent. >> This impacts patches for bug fixes, new features and reviews. The Neutron >> core team is trying to at least keep the CI of this project healthy, but we >> don’t have enough knowledge about the details of the neutron-fwaas >> code base to review more complex patches. >> >> During the PTG in Shanghai we discussed that with operators and TC members >> during the forum session [1] and later within the Neutron team during the >> PTG session [2]. >> >> During these discussions, with the help of operators and TC members, we reached >> the conclusion that we need to have someone responsible for maintaining project. >> This doesn’t mean that the maintainer needs to spend full time working on this >> project. Rather, we need someone to be the contact person for the project, who >> takes care of the project’s CI and review patches. Of course that’s only a >> minimal requirement. If the new maintainer works on new features for the >> project, it’s even better :) >> >> If we don’t have any new maintainer(s) before milestone Ussuri-2, which is >> Feb 10 - Feb 14 according to [3], we will need to mark neutron-fwaas >> as deprecated and in “V” cycle we will propose to move the project >> from the Neutron stadium, hosted in the “openstack/“ namespace, to the >> unofficial projects hosted in the “x/“ namespace. >> >> So if You are using this project now, or if You have customers who are >> using it, please consider the possibility of maintaining it. Otherwise, >> please be aware that it is highly possible that the project will be >> deprecated and moved out from the official OpenStack projects. 
>> >> [1] >> https://etherpad.openstack.org/p/PVG-Neutron-stadium-projects-the-path-forward >> [2] https://etherpad.openstack.org/p/Shanghai-Neutron-Planning-restored - >> Lines 379-421 >> [3] https://releases.openstack.org/ussuri/schedule.html >> >> -- >> Slawek Kaplonski >> Senior software engineer >> Red Hat > > — > Slawek Kaplonski > Senior software engineer > Red Hat > — Slawek Kaplonski Senior software engineer Red Hat From emiller at genesishosting.com Mon Jan 20 20:54:02 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Mon, 20 Jan 2020 14:54:02 -0600 Subject: [all][neutron][neutron-fwaas] FINAL CALL Maintainers needed In-Reply-To: References: <20191119102615.oq46xojyhoybulna@skaplons-mac> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04771777@gmsxchsvr01.thecreation.com> > -----Original Message----- > From: Slawek Kaplonski [mailto:skaplons at redhat.com] > Sent: Monday, January 20, 2020 2:26 PM > To: OpenStack Discuss ML > Subject: Re: [all][neutron][neutron-fwaas] FINAL CALL Maintainers needed > > Hi, > > We are getting closer and closer to Ussuri-2 milestone which is our deadline > to deprecate neutron-fwaas project if there will be no any volunteers to > maintain this project. > So if You are interested in this project, please raise Your hand here or ping > me on IRC about that. > > > On 6 Jan 2020, at 21:05, Slawek Kaplonski wrote: > > > > Hi, > > > > Just as a reminder, we are still looking for maintainers who want to keep > neutron-fwaas project alive. As it was written in my previous email, we will > mark this project as deprecated. > > So please reply to this email or contact me directly if You are interested in > maintaining this project. > > > >> On 19 Nov 2019, at 11:26, Slawek Kaplonski > wrote: > >> > >> Hi, > >> > >> Over the past couple of cycles we have noticed that new contributions > and > >> maintenance efforts for neutron-fwaas project were almost non existent. > >> This impacts patches for bug fixes, new features and reviews. The > Neutron > >> core team is trying to at least keep the CI of this project healthy, but we > >> don’t have enough knowledge about the details of the neutron-fwaas > >> code base to review more complex patches. > >> > >> During the PTG in Shanghai we discussed that with operators and TC > members > >> during the forum session [1] and later within the Neutron team during the > >> PTG session [2]. > >> > >> During these discussions, with the help of operators and TC members, we > reached > >> the conclusion that we need to have someone responsible for > maintaining project. > >> This doesn’t mean that the maintainer needs to spend full time working > on this > >> project. Rather, we need someone to be the contact person for the > project, who > >> takes care of the project’s CI and review patches. Of course that’s only a > >> minimal requirement. If the new maintainer works on new features for > the > >> project, it’s even better :) > >> > >> If we don’t have any new maintainer(s) before milestone Ussuri-2, which > is > >> Feb 10 - Feb 14 according to [3], we will need to mark neutron-fwaas > >> as deprecated and in “V” cycle we will propose to move the project > >> from the Neutron stadium, hosted in the “openstack/“ namespace, to the > >> unofficial projects hosted in the “x/“ namespace. > >> > >> So if You are using this project now, or if You have customers who are > >> using it, please consider the possibility of maintaining it. 
Otherwise, > >> please be aware that it is highly possible that the project will be > >> deprecated and moved out from the official OpenStack projects. > >> > >> [1] > >> https://etherpad.openstack.org/p/PVG-Neutron-stadium-projects-the- > path-forward > >> [2] https://etherpad.openstack.org/p/Shanghai-Neutron-Planning- > restored - > >> Lines 379-421 > >> [3] https://releases.openstack.org/ussuri/schedule.html > >> > >> -- > >> Slawek Kaplonski > >> Senior software engineer > >> Red Hat > > > > — > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > I'm not a developer, rather an operator. I just thought I'd ask if the OpenStack community had thought about creating a fund for developers that may want to contribute, but can only do it for a fee. Essentially a donation bucket. I don't know if the OpenStack Foundation does this or not already. There may be adequate need for fwaas, for example, but not enough volunteer resources to do it. However there may be money in multiple operators' budgets that can be used to donate to the support of a project. Eric From fungi at yuggoth.org Mon Jan 20 21:25:05 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 20 Jan 2020 21:25:05 +0000 Subject: [all][neutron][neutron-fwaas] FINAL CALL Maintainers needed In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04771777@gmsxchsvr01.thecreation.com> References: <20191119102615.oq46xojyhoybulna@skaplons-mac> <046E9C0290DD9149B106B72FC9156BEA04771777@gmsxchsvr01.thecreation.com> Message-ID: <20200120212505.tdqv5br5vwfib63z@yuggoth.org> On 2020-01-20 14:54:02 -0600 (-0600), Eric K. Miller wrote: [...] > I'm not a developer, rather an operator. I just thought I'd ask > if the OpenStack community had thought about creating a fund for > developers that may want to contribute, but can only do it for a > fee. Essentially a donation bucket. I don't know if the > OpenStack Foundation does this or not already. There may be > adequate need for fwaas, for example, but not enough volunteer > resources to do it. However there may be money in multiple > operators' budgets that can be used to donate to the support of a > project. Our community has mostly relied on commercial distribution vendors and service providers to fill that role in the past. If enough of their customers say it's a feature they're relying on, then employing developers who can help keep it maintained is their raison d'être. This is how pretty much all free/libre open source community software projects work. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From emiller at genesishosting.com Mon Jan 20 21:43:48 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Mon, 20 Jan 2020 15:43:48 -0600 Subject: [all][neutron][neutron-fwaas] FINAL CALL Maintainers needed In-Reply-To: <20200120212505.tdqv5br5vwfib63z@yuggoth.org> References: <20191119102615.oq46xojyhoybulna@skaplons-mac> <046E9C0290DD9149B106B72FC9156BEA04771777@gmsxchsvr01.thecreation.com> <20200120212505.tdqv5br5vwfib63z@yuggoth.org> Message-ID: <046E9C0290DD9149B106B72FC9156BEA0477177D@gmsxchsvr01.thecreation.com> > Our community has mostly relied on commercial distribution vendors and > service providers to fill that role in the past. 
If enough of their customers say > it's a feature they're relying on, then employing developers who can help > keep it maintained is their raison d'être. This is how pretty much all free/libre > open source community software projects work. > -- > Jeremy Stanley Understood. I thought there may be a way to crowd-fund something in case smaller operators had small budgets, needed some feature supported, but couldn't afford an entire employee with the small budget. Maybe there are developer consultants interested in a gig to maintain something for a while. Not sure where the best place to go for matching operators with consultants for this type of thing. fwaas seems like such a necessity. We would love to offer it, but it is unusable with DVR. We just don't have the budget to hire someone specifically for development/support of this. Eric From raubvogel at gmail.com Mon Jan 20 21:49:51 2020 From: raubvogel at gmail.com (Mauricio Tavares) Date: Mon, 20 Jan 2020 16:49:51 -0500 Subject: [all][neutron][neutron-fwaas] FINAL CALL Maintainers needed In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA0477177D@gmsxchsvr01.thecreation.com> References: <20191119102615.oq46xojyhoybulna@skaplons-mac> <046E9C0290DD9149B106B72FC9156BEA04771777@gmsxchsvr01.thecreation.com> <20200120212505.tdqv5br5vwfib63z@yuggoth.org> <046E9C0290DD9149B106B72FC9156BEA0477177D@gmsxchsvr01.thecreation.com> Message-ID: On Mon, Jan 20, 2020 at 4:46 PM Eric K. Miller wrote: > > > Our community has mostly relied on commercial distribution vendors and > > service providers to fill that role in the past. If enough of their customers say > > it's a feature they're relying on, then employing developers who can help > > keep it maintained is their raison d'être. This is how pretty much all free/libre > > open source community software projects work. > > -- > > Jeremy Stanley > > Understood. I thought there may be a way to crowd-fund something in case smaller operators had small budgets, needed some feature supported, but couldn't afford an entire employee with the small budget. > > Maybe there are developer consultants interested in a gig to maintain something for a while. Not sure where the best place to go for matching operators with consultants for this type of thing. > > fwaas seems like such a necessity. We would love to offer it, but it is unusable with DVR. We just don't have the budget to hire someone specifically for development/support of this. > Smells like a case for gofundme time. Or, is there still time for google summer of code submission? > Eric > From emiller at genesishosting.com Mon Jan 20 21:54:27 2020 From: emiller at genesishosting.com (Eric K. Miller) Date: Mon, 20 Jan 2020 15:54:27 -0600 Subject: [all][neutron][neutron-fwaas] FINAL CALL Maintainers needed In-Reply-To: References: <20191119102615.oq46xojyhoybulna@skaplons-mac> <046E9C0290DD9149B106B72FC9156BEA04771777@gmsxchsvr01.thecreation.com> <20200120212505.tdqv5br5vwfib63z@yuggoth.org> <046E9C0290DD9149B106B72FC9156BEA0477177D@gmsxchsvr01.thecreation.com> Message-ID: <046E9C0290DD9149B106B72FC9156BEA04771782@gmsxchsvr01.thecreation.com> > Smells like a case for gofundme time. Or, is there still time > for google summer of code submission? I'll check out gofundme. I honestly haven't looked at it much. I wasn't sure where OpenStack developer consultants would look for this type of gig. If it is gofundme, that works! 
Eric From petebirley+openstack-dev at gmail.com Tue Jan 21 01:09:08 2020 From: petebirley+openstack-dev at gmail.com (Pete Birley) Date: Mon, 20 Jan 2020 19:09:08 -0600 Subject: [openstack-helm] Core Reviewer Nominations Message-ID: OpenStack-Helm team, Based on their record of quality code review and substantial/meaningful code contributions to the openstack-helm project, at last weeks meeting we proposed the following individuals as core reviewers for openstack-helm: - Gage Hugo - Steven Fitzpatrick All OpenStack-Helm Core Reviewers are invited to reply with a +1/-1 by EOD next Monday (27/1/2020). A lone +1/-1 will apply to both candidates, otherwise please spell out votes individually for the candidates. Cheers, Pete -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilkers.steve at gmail.com Tue Jan 21 02:12:07 2020 From: wilkers.steve at gmail.com (Steve Wilkerson) Date: Mon, 20 Jan 2020 20:12:07 -0600 Subject: [openstack-helm] Core Reviewer Nominations In-Reply-To: References: Message-ID: A resounding +1 from me for both Steven and Gage. Both have done really great work and have provided meaningful reviews along the way. On Mon, Jan 20, 2020 at 7:14 PM Pete Birley < petebirley+openstack-dev at gmail.com> wrote: > OpenStack-Helm team, > > > > Based on their record of quality code review and substantial/meaningful > code contributions to the openstack-helm project, at last weeks meeting we > proposed the following individuals as core reviewers for openstack-helm: > > > > - Gage Hugo > - Steven Fitzpatrick > > > > All OpenStack-Helm Core Reviewers are invited to reply with a +1/-1 by > EOD next Monday (27/1/2020). A lone +1/-1 will apply to both candidates, > otherwise please spell out votes individually for the candidates. > > > > Cheers, > > > Pete > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Tue Jan 21 02:19:27 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Tue, 21 Jan 2020 02:19:27 +0000 Subject: Galera config values In-Reply-To: <4E49E11B-83FA-4016-9FA1-30CDC377825C@blizzard.com> References: , <4E49E11B-83FA-4016-9FA1-30CDC377825C@blizzard.com> Message-ID: That would be fantastic; thanks! -----Original Message----- From: Erik Olof Gunnar Andersson Sent: Friday, January 17, 2020 7:37 PM To: Mohammed Naser Cc: Albert Braden ; openstack-discuss at lists.openstack.org Subject: Re: Galera config values I can share our haproxt settings on monday, but you need to make sure that haproxy to at least match the Oslo config which I believe is 3600s, but I think in theory something like keepalived is better for galerara. btw pretty sure both client and server needs 3600s. Basically openstack recycles the connection every hour by default. So you need to make sure that haproxy does not close it before that if it’s idle. Sent from my iPhone > On Jan 17, 2020, at 7:24 PM, Mohammed Naser wrote: > > On Fri, Jan 17, 2020 at 5:20 PM Albert Braden > wrote: >> >> I’m experimenting with Galera in my Rocky openstack-ansible dev cluster, and I’m finding that the default haproxy config values don’t seem to work. Finding the correct values is a lot of work. For example, I spent this morning experimenting with different values for “timeout client” in /etc/haproxy/haproxy.cfg. 
The default is 1m, and with the default set I see this error in /var/log/nova/nova-scheduler.log on the controllers: >> >> >> >> 2020-01-17 13:54:26.059 443358 ERROR oslo_db.sqlalchemy.engines DBConnectionError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') [SQL: u'SELECT 1'] (Background on this error at: https://urldefense.com/v3/__http://sqlalche.me/e/e3q8__;!!Ci6f514n9QsL8ck!39gvi32Ldv9W8zhZ_P1JLvkOFM-PelyP_RrU_rT5_EuELR24fLO5P3ShvZ56jfcQ7g$ ) >> >> >> >> There are several timeout values in /etc/haproxy/haproxy.cfg. These are the values we started with: >> >> >> >> stats timeout 30s >> >> timeout http-request 10s >> >> timeout queue 1m >> >> timeout connect 10s >> >> timeout client 1m >> >> timeout server 1m >> >> timeout check 10s >> >> >> >> At first I changed them all to 30m. This stopped the “Lost connection” error in nova-scheduler.log. Then, one at a time, I changed them back to the default. When I got to “timeout client” I found that setting it back to 1m caused the errors to start again. I changed it back and forth and found that 4 minutes causes errors, and 6m stops them, so I left it at 6m. >> >> >> >> These are my active variables: >> >> >> >> root at us01odc-dev2-ctrl1:/etc/mysql# mysql -e 'show variables;'|grep timeout >> >> connect_timeout 20 >> >> deadlock_timeout_long 50000000 >> >> deadlock_timeout_short 10000 >> >> delayed_insert_timeout 300 >> >> idle_readonly_transaction_timeout 0 >> >> idle_transaction_timeout 0 >> >> idle_write_transaction_timeout 0 >> >> innodb_flush_log_at_timeout 1 >> >> innodb_lock_wait_timeout 50 >> >> innodb_rollback_on_timeout OFF >> >> interactive_timeout 28800 >> >> lock_wait_timeout 86400 >> >> net_read_timeout 30 >> >> net_write_timeout 60 >> >> rpl_semi_sync_master_timeout 10000 >> >> rpl_semi_sync_slave_kill_conn_timeout 5 >> >> slave_net_timeout 60 >> >> thread_pool_idle_timeout 60 >> >> wait_timeout 3600 >> >> >> >> So it looks like the value of “timeout client” in haproxy.cfg needs to match or exceed the value of “wait_timeout” in mysql. Also in nova.conf I see “#connection_recycle_time = 3600” – I need to experiment to see how that value interacts with the timeouts in the other config files. >> >> >> >> Is this the best way to find the correct config values? It seems like there should be a document that talks about these timeouts and how to set them (or maybe more generally how the different timeout settings in the various config files interact). Does that document exist? If not, maybe I could write one, since I have to figure out the correct values anyway. > > Is your cluster pretty idle? I've never seen that happen in any > environments before... > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. 
https://urldefense.com/v3/__https://vexxhost.com__;!!Ci6f514n9QsL8ck!39gvi32Ldv9W8zhZ_P1JLvkOFM-PelyP_RrU_rT5_EuELR24fLO5P3ShvZ4PDThJbg$
>

From smooney at redhat.com  Tue Jan 21 02:26:07 2020
From: smooney at redhat.com (Sean Mooney)
Date: Tue, 21 Jan 2020 02:26:07 +0000
Subject: [all][neutron][neutron-fwaas] FINAL CALL Maintainers needed
In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA0477177D@gmsxchsvr01.thecreation.com>
References: <20191119102615.oq46xojyhoybulna@skaplons-mac>
 <046E9C0290DD9149B106B72FC9156BEA04771777@gmsxchsvr01.thecreation.com>
 <20200120212505.tdqv5br5vwfib63z@yuggoth.org>
 <046E9C0290DD9149B106B72FC9156BEA0477177D@gmsxchsvr01.thecreation.com>
Message-ID: 

On Mon, 2020-01-20 at 15:43 -0600, Eric K. Miller wrote:
> > Our community has mostly relied on commercial distribution vendors and
> > service providers to fill that role in the past. If enough of their customers say
> > it's a feature they're relying on, then employing developers who can help
> > keep it maintained is their raison d'être. This is how pretty much all free/libre
> > open source community software projects work.
> > --
> > Jeremy Stanley
>
> Understood. I thought there may be a way to crowd-fund something in case smaller operators had small budgets, needed
> some feature supported, but couldn't afford an entire employee with the small budget.
>
> Maybe there are developer consultants interested in a gig to maintain something for a while. Not sure where the best
> place to go for matching operators with consultants for this type of thing.
>
> fwaas seems like such a necessity. We would love to offer it, but it is unusable with DVR. We just don't have the
> budget to hire someone specifically for development/support of this.

For a lot of use cases security groups are sufficient, and people do not need to
enforce firewalls between networks at neutron routers, which is effectively how FWaaS
worked. Enforcement via security groups on the ports attached to the instance was
sufficient. Similarly, operators that have invested in an SDN solution can provide an
out-of-band policy enforcement point. As a result, in a normal openstack deployment
FWaaS became redundant. There are still use cases for FWaaS, but fewer than you would
think.

Much of what you might typically do in an east-west direction, or between layers in
your application, can be done using remote security groups instead of CIDRs with
security groups (see the sketch below). The gap that security groups did not fill
easily was ironic and SR-IOV, however I believe some of the hierarchical switch binding
drivers did support security groups implemented at the top-of-rack switch, which could
close that gap. As a result FWaaS has become less deployed and less maintained over
time.

The other issue, as you noted, is compatibility: the fact that VPNaaS, FWaaS and OVS
DVR with ml2/ovs did not just work together means it is a hard sell. That is compounded
by the fact that none of the main SDN solutions supported it either.
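As a concrete illustration of the remote-security-group approach mentioned above, the
following is a minimal sketch using the openstacksdk client. The cloud name "mycloud",
the group names and the port number are hypothetical, and the snippet only shows the
idea of referencing a group instead of a CIDR; it is not taken from any of the setups
discussed in this thread.

```
# Minimal sketch: allow the "app-tier" group to reach the "db-tier" group
# on one port by referencing the remote group instead of CIDR-based rules.
import openstack

# Assumes a clouds.yaml entry named "mycloud" (hypothetical).
conn = openstack.connect(cloud="mycloud")

app_sg = conn.network.create_security_group(name="app-tier")
db_sg = conn.network.create_security_group(name="db-tier")

# Ingress rule on the db tier that matches traffic from any port that is a
# member of the app tier group, wherever its addresses happen to be.
conn.network.create_security_group_rule(
    security_group_id=db_sg.id,
    direction="ingress",
    protocol="tcp",
    port_range_min=5432,
    port_range_max=5432,
    remote_group_id=app_sg.id,
)
```

The same effect can usually be had from the CLI with "openstack security group rule
create --remote-group app-tier ...", which avoids maintaining per-network CIDR lists as
instances move or scale.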
Any implementation of FWaaS would have provided entry points for
neutron.agent.l2.firewall_drivers
https://github.com/openstack/neutron-fwaas/blob/master/setup.cfg#L45-L47
and neutron.agent.l3.firewall_drivers
https://github.com/openstack/neutron-fwaas/blob/master/setup.cfg#L51-L53
but as you can see ODL, OVN, MidoNet, ONOS, Contrail, Dragonflow and Calico do not
implement support:
https://github.com/openstack/networking-odl/blob/master/setup.cfg#L47-L66
https://github.com/openstack/networking-ovn/blob/master/setup.cfg#L51-L62
https://github.com/openstack/networking-midonet/blob/02e25cc65601add1d96b7150ed70403c3de4243b/setup.cfg#L58-L81
https://github.com/openstack/networking-onos/blob/master/setup.cfg#L40-L49
https://opendev.org/x/networking-opencontrail/src/branch/master/setup.cfg#L29-L32
https://github.com/openstack/dragonflow/blob/master/setup.cfg#L45-L121
https://github.com/openstack/networking-calico/blob/master/setup.cfg#L24-L30

networking-cisco provides an alternative firewall API
https://opendev.org/x/networking-cisco/src/branch/master/setup.cfg#L97-L100
and Arista supports a security group API at the top-of-rack switch
https://opendev.org/x/networking-arista/src/branch/master/setup.cfg#L41

Unless customers are asking vendors to provide FWaaS (in my experience it was never a
telco priority, at least), vendors won't have a business case to justify the
investment. That does not help those that do use FWaaS today, but it is a sad fact that
individuals and companies need to choose where they spend their resources carefully,
and FWaaS just never caught on enough to remain relevant.

>
> Eric
>

From feilong at catalyst.net.nz  Tue Jan 21 02:34:26 2020
From: feilong at catalyst.net.nz (Feilong Wang)
Date: Tue, 21 Jan 2020 15:34:26 +1300
Subject: [magnum][kolla] etcd wal sync duration issue
In-Reply-To: <046E9C0290DD9149B106B72FC9156BEA04771749@gmsxchsvr01.thecreation.com>
References: <046E9C0290DD9149B106B72FC9156BEA0477170E@gmsxchsvr01.thecreation.com>
 <3f3fe0d1-7b61-d2f9-da65-d126ea5ed336@catalyst.net.nz>
 <046E9C0290DD9149B106B72FC9156BEA04771716@gmsxchsvr01.thecreation.com>
 <279cedf1-8bf4-fcf1-cfc2-990c97685531@catalyst.net.nz>
 <046E9C0290DD9149B106B72FC9156BEA04771749@gmsxchsvr01.thecreation.com>
Message-ID: 

Hi Eric,

Thanks for sharing the article. As for the etcd volumes, you can disable them by not
setting the etcd_volume_size label. Just FYI.

On 17/01/20 6:00 AM, Eric K. Miller wrote:
>
> Hi Feilong,
>
>
> Before I was able to use the benchmark tool you mentioned, we saw some
> other slowdowns with Ceph (all flash).
> >   > > Eric > >   > >   > > *From:*feilong [mailto:feilong at catalyst.net.nz] > *Sent:* Wednesday, January 15, 2020 2:36 PM > *To:* Eric K. Miller; openstack-discuss at lists.openstack.org > *Cc:* Spyros Trigazis > *Subject:* Re: [magnum][kolla] etcd wal sync duration issue > >   > > Hi Eric, > > If you're using SSD, then I think the IO performance should  be OK. > You can use this > https://github.com/etcd-io/etcd/tree/master/tools/benchmark to verify > and confirm that 's the root cause. Meanwhile, you can review the > config of etcd cluster deployed by Magnum. I'm not an export of Etcd, > so TBH I can't see anything wrong with the config. Most of them are > just default configurations. > > As for the etcd image, it's built from > https://github.com/projectatomic/atomic-system-containers/tree/master/etcd > or you can refer CERN's repo > https://gitlab.cern.ch/cloud/atomic-system-containers/blob/cern-qa/etcd/ > > *Spyros*, any comments? > >   > > On 14/01/20 10:52 AM, Eric K. Miller wrote: > > Hi Feilong, > >   > > Thanks for responding!  I am, indeed, using the default v3.2.7 version for etcd, which is the only available image. > >   > > I did not try to reproduce with any other driver (we have never used DevStack, honestly, only Kolla-Ansible deployments).  I did see a number of people indicating similar issues with etcd versions in the 3.3.x range, so I didn't think of it being an etcd issue, but then again most issues seem to be a result of people using HDDs and not SSDs, which makes sense. > >   > > Interesting that you saw the same issue, though.  We haven't tried Fedora CoreOS, but I think we would need Train for this. > >   > > Everything I read about etcd indicates that it is extremely latency sensitive, due to the fact that it replicates all changes to all nodes and sends an fsync to Linux each time, so data is always guaranteed to be stored.  I can see this becoming an issue quickly without super-low-latency network and storage.  We are using Ceph-based SSD volumes for the Kubernetes Master node disks, which is extremely fast (likely 10x or better than anything people recommend for etcd), but network latency is always going to be higher with VMs on OpenStack with DVR than bare metal with VLANs due to all of the abstractions. > >   > > Do you know who maintains the etcd images for Magnum here?  Is there an easy way to create a newer image? > > https://hub.docker.com/r/openstackmagnum/etcd/tags/ > >   > > Eric > >   > >   > >   > > From: Feilong Wang [mailto:feilong at catalyst.net.nz] > > Sent: Monday, January 13, 2020 3:39 PM > > To: openstack-discuss at lists.openstack.org > > Subject: Re: [magnum][kolla] etcd wal sync duration issue > >   > > Hi Eric, > > That issue looks familiar for me. There are some questions I'd like to check before answering if you should upgrade to train. > > 1. Are using the default v3.2.7 version for etcd? > > 2. Did you try to reproduce this with devstack, using Fedora CoreOS driver? 
The etcd version could be 3.2.26 > > I asked above questions because I saw the same error when I used Fedora Atomic with etcd v3.2.7 and I can't reproduce it with Fedora CoreOS + etcd 3.2.26 > >   > >   > > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > ------------------------------------------------------ > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > ------------------------------------------------------ -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhangbailin at inspur.com Tue Jan 21 02:35:23 2020 From: zhangbailin at inspur.com (=?gb2312?B?QnJpbiBaaGFuZyjVxbDZwdYp?=) Date: Tue, 21 Jan 2020 02:35:23 +0000 Subject: =?gb2312?B?tPC4tDogW2xpc3RzLm9wZW5zdGFjay5vcme0+reiXVtub3ZhXSBJIHdvdWxk?= =?gb2312?B?IGxpa2UgdG8gYWRkIGFub3RoZXIgb3B0aW9uIGZvciBjcm9zc19hel9hdHRh?= =?gb2312?Q?ch?= In-Reply-To: References: Message-ID: Hi, Kim KS: "cross_az_attach"'s default value is True, that means a llow attach between instance and volume in different availability zones. If False, volumes attached to an instance must be in the same availability zone in Cinder as the instance availability zone in Nova. Another thing is, you should care booting an BFV instance from "image", and this should interact the " allow_availability_zone_fallback" in Cinder, if " allow_availability_zone_fallback=False" and *that* request AZ does not in Cinder, the request will be fail. About specify AZ to unshelve a shelved_offloaded server, about the cross_az_attach something you can know https://github.com/openstack/nova/blob/master/releasenotes/notes/bp-specifying-az-to-unshelve-server-aa355fef1eab2c02.yaml Availability Zones docs, that contains some description with cinder.cross_az_attach https://docs.openstack.org/nova/latest/admin/availability-zones.html#implications-for-moving-servers cross_az_attach configuration: https://docs.openstack.org/nova/train/configuration/config.html#cinder.cross_az_attach And cross_az_attach with the server is in https://github.com/openstack/nova/blob/master/nova/volume/cinder.py#L523-L545 I am not sure why you are need " enable_az_attach_list = AZ1,AZ2" configuration? brinzhang > cross_az_attach > > Hello all, > > In nova with setting [cinder]/ cross_az_attach option to false, nova creates > instance and volume in same AZ. > > but some of usecase (in my case), we need to attach new volume in different > AZ to the instance. > > so I need two options. > > one is for nova block device mapping and attaching volume and another is for > attaching volume in specified AZ. > > [cinder] > cross_az_attach = False > enable_az_attach_list = AZ1,AZ2 > > how do you all think of it? 
> > Best, > Kiseok > From ignaziocassano at gmail.com Tue Jan 21 05:59:06 2020 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 21 Jan 2020 06:59:06 +0100 Subject: [queens][nova] iscsi issue In-Reply-To: <20200120154107.czjws3n3p5rl64nu@lyarwood.usersys.redhat.com> References: <20200120154107.czjws3n3p5rl64nu@lyarwood.usersys.redhat.com> Message-ID: Hello, I increased cinder and nova rpc resoonse time out and now it works better. I am going to test it again and if the error come back, I 'll send the log file as you suggested. Thanks Ignazio Il Lun 20 Gen 2020, 16:41 Lee Yarwood ha scritto: > On 17-01-20 19:30:13, Ignazio Cassano wrote: > > Hello all we are testing openstack queens cinder driver for Unity iscsi > > (driver cinder.volume.drivers.dell_emc.unity.Driver). > > > > The unity storage is a Unity600 Version 4.5.10.5.001 > > > > We are facing an issue when we try to detach volume from a virtual > machine > > with two or more volumes attached (this happens often but not always): > > Could you write this up as an os-brick bug and attach the nova-compute > log in DEBUG showing the initial volume attachments prior to this detach > error? > > https://bugs.launchpad.net/os-brick/+filebug > > > The following is reported nova-compute.log: > > > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > > self.connector.disconnect_volume(connection_info['data'], None) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > > "/usr/lib/python2.7/site-packages/os_brick/utils.py", line 137, in > > trace_logging_wrapper > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return > > f(*args, **kwargs) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > > "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line > 274, > > in inner > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return > > f(*args, **kwargs) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > > > "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", > > line 848, in disconnect_volume > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > > device_info=device_info) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > > > "/usr/lib/python2.7/site-packages/os_brick/initiator/connectors/iscsi.py", > > line 892, in _cleanup_connection > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server path_used, > > was_multipath) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > > "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line > > 271, in remove_connection > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > > self.flush_multipath_device(multipath_name) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > > "/usr/lib/python2.7/site-packages/os_brick/initiator/linuxscsi.py", line > > 329, in flush_multipath_device > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > > root_helper=self._root_helper) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > > "/usr/lib/python2.7/site-packages/os_brick/executor.py", line 52, in > > _execute > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server result = > > self.__execute(*args, **kwargs) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > > "/usr/lib/python2.7/site-packages/os_brick/privileged/rootwrap.py", line > > 169, in execute > > 2020-01-17 16:05:11.132 6643 ERROR 
oslo_messaging.rpc.server return > > execute_root(*cmd, **kwargs) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > > "/usr/lib/python2.7/site-packages/oslo_privsep/priv_context.py", line > 207, > > in _wrap > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server return > > self.channel.remote_call(name, args, kwargs) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server File > > "/usr/lib/python2.7/site-packages/oslo_privsep/daemon.py", line 202, in > > remote_call > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server raise > > exc_type(*result[2]) > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > > ProcessExecutionError: Unexpected error while running command. > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Command: > > multipath -f 36006016006e04400d0c4215e3ec55757 > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Exit code: 1 > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Stdout: > u'Jan > > 17 16:04:30 | 36006016006e04400d0c4215e3ec55757p1: map in use\nJan 17 > > 16:04:31 | failed to remove multipath map > > 36006016006e04400d0c4215e3ec55757\n' > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server Stderr: u'' > > 2020-01-17 16:05:11.132 6643 ERROR oslo_messaging.rpc.server > > > > > > Best Regards > > > > Ignazio > > -- > Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 > 2D76 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From elmiko at redhat.com Tue Jan 21 09:00:39 2020 From: elmiko at redhat.com (Michael McCune) Date: Tue, 21 Jan 2020 04:00:39 -0500 Subject: [api][sdk][dev][oslo] using uWSGI breaks CORS config In-Reply-To: References: Message-ID: hi Radoslaw, i am also curious about this because i had thought we had CORS issued solved for uWSGI in the past, i will need to look around to find the conversations i was having. thanks for sharing your investigation, i think this is interesting. peace o/ On Fri, Jan 17, 2020 at 1:45 PM Radosław Piliszek < radoslaw.piliszek at gmail.com> wrote: > Fellow Devs, > > as you might have noticed I started taking care of > openstack/js-openstack-lib, > now under the openstacksdk umbrella [1]. > First goal is to modernize the CI to use Zuul v3, current devstack and > nodejs, still WIP [2]. > > As part of the original suite of tests, the unit and functional tests > are run from browsers as well as from node. > And, as you may know, browsers care about CORS [3]. > js-openstack-lib is connecting to various OpenStack APIs (currently > limited to keystone, glance, neutron and nova) to act on behalf of the > user (just like openstacksdk/client does). > oslo.middleware, as used by those APIs, provides a way to configure > CORS by setting params in the [cors] group but uWSGI seemingly ignores > that completely [4]. > I had to switch to mod_wsgi+apache instead of uwsgi+apache to get past > that issue. > I could not reproduce locally because kolla (thankfully) uses mostly > mod_wsgi atm. > > The issue I see is that uWSGI is proposed as the future and mod_wsgi > is termed deprecated. > However, this means the future is broken w.r.t. CORS and so any modern > web interface with it if not sitting on the exact same host and port > (which is usually different between OpenStack APIs and any UI). 
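For reference, and to make the quoted report easier to follow, the oslo.middleware CORS
options being referred to live in the [cors] group of each service's configuration
file. The values below are purely illustrative (the origin is a made-up example) and
are not taken from the environment described above:

```
[cors]
allowed_origin = https://dashboard.example.com
allow_credentials = true
allow_methods = GET,PUT,POST,DELETE,PATCH
allow_headers = Content-Type,X-Auth-Token
expose_headers = X-Subject-Token,X-OpenStack-Request-ID
max_age = 3600
```

The observation in the quoted message is that these settings take effect when the API
runs under mod_wsgi+apache but appear to be ignored when the same API runs under
uwsgi+apache.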
> > [1] https://review.opendev.org/701854 > [2] https://review.opendev.org/702132 > [3] https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS > [4] https://github.com/unbit/uwsgi/issues/1550 > > -yoctozepto > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony.pearce at cinglevue.com Tue Jan 21 09:16:54 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Tue, 21 Jan 2020 17:16:54 +0800 Subject: Updating Openstack from Pike non-containerised to something current Message-ID: I am planning to upgrade my Openstack Pike. Please could I ask some questions about this so I understand better how to proceed? *Background*: - I have Openstack Pike running on 3 nodes (2 x compute and 1 x controller). Storage is provided by a Cinder driver and separate vendor storage array (iSCSI). - I have a bug which is impacting Cinder in Pike which is not present in Queens/Stein. - Since I installed Openstack some years ago, I was not able to update/patch it for a number of reasons. - The Openstack which I installed was from the tripleo.org website and I followed the instructions there. - I have configured LBaaS that was available in Pike by making config changes to .conf files. - I recall reading a few months ago when I started looking into this that I need to update to the latest within the current version (Pike) first, before updating beyond Pike so I am planning for that. *Questions*: 1. do I first need to deconfigure / remove the LBaaS before attempting an upgrade beyond Pike? 2. The tripleo website is a little confusing. I am running Pike and it is not containerised. The upgrade info on tripleo.org mentions upgrades to Pike in a containerised environment. So should I follow the detail for the "Ocata and earlier" release in this case? ref: http://tripleo.org/upgrade/minor_update.html#updating-your-overcloud-ocata-and-earlier 3. Do I need to upgrade to each release? ie from Pike > Queens then Queens > Stein etc ? Or can I do this: a) update Pike to the latest in the current Pike release b) Update Pike to Train *What I've gathered so far:* - If I can execute the 'openstack overcloud update' and limit it to the controller node then I can apply this to a cloned environment consisting of 1 x undercloud "director" and 1 x controller but I am unable to stage this for the compute nodes because they are physical bare-metal - the tripleo website guide lists the latest as 'Rocky' but I could see Stein in the yum search. I think I can follow the tripleo guide with a pinch of salt and replace *rocky* with *stein* Thank you *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From amotoki at gmail.com Tue Jan 21 09:50:14 2020 From: amotoki at gmail.com (Akihiro Motoki) Date: Tue, 21 Jan 2020 18:50:14 +0900 Subject: [horizon][cloudkitty][tacker][masakari][congress][karbar][trove] Drop python2.7 and Django1.11 support in horizon plugins Message-ID: Hi, I am sending this mail to encourage reviews in horizon plugins. - Drop python 2.7 and Django 1.11 - Remove six usage Horizon has dropped python 2.7 support and Django 1.11 support (which is the last version with python 2.7 support). We proposed changes to horizon plugins to catch up with these changes. - Most projects have merged them. Thanks for your cooperation. - There are several pending reviews. https://review.opendev.org/#/q/topic:drop-django111-support+status:open - Django 1.11 drop depends on python 2.7 drop. In case of karbar-dashboard, python 2.7 drop needs to land before it. https://review.opendev.org/#/c/694442/ horizon also has dropped six usage. There are plugins which depends six on horizon requirements. We proposed a series of changes to avoid UT breakages in plugins. Reviews on them are also appreciated. https://review.opendev.org/#/q/topic:remove-six+status:open Thanks, Akihiro Motoki (amotoki) From sshnaidm at redhat.com Tue Jan 21 10:21:46 2020 From: sshnaidm at redhat.com (Sagi Shnaidman) Date: Tue, 21 Jan 2020 12:21:46 +0200 Subject: [all][tripleo][openstack-ansible] Openstack Ansible modules - the transition is in progress In-Reply-To: References: Message-ID: Hi, all I'm happy to inform you that the transition has started. 1. Transition steps are tracked in the etherpad: https://etherpad.openstack.org/p/openstack-ansible-modules 2. Modules have been moved with ansible tool migrate.py according to Ansible guidelines. 3. The job that does linting with ansible-test was created and added to collections repo. Ansible-test sanity is used in Ansible CI for all modules, I think it's worth to have it enabled for us too. As well as pep8 checks. 4. I've ported job that was running on Ansible Github PRs as CI job to be check job in collections repo. It will give us at least the same coverage we had before. Of course, it's a temporary solution that should be revisited and most likely replaced by different kinds of jobs later. And with much more test coverage that it has now. So please review and merge: https://review.opendev.org/#/c/703383/ Any noncritical issues can be fixed in following up patches, let's unblock work on collections, please. 5. The current CI job from SDK repo that was running on Ansible Github patches is changed to print a message about moving modules and fail, not to allow to merge new patches easily. The patch is https://review.opendev.org/#/c/703342/ Instead of it, the new ported job from the previous point will run on SDK repo to ensure we don't break modules with SDK patches. 6. As was agreed before we'll need to rename all modules, link to new names in BOTMETA in Ansible and remove modules from Ansible repo. Renaming is a big chunk of work, so I'd appreciate any help here. Of course, any work can be merged only after CI job will be merged and start run on patches: https://review.opendev.org/#/c/703383/ 7. I posted the warning message on all PRs in Ansible Github that has a relation to Openstack modules, using "openstack" modules. Hopefully, it didn't miss anything. 8. Warning mail will be sent today to Ansible dev Google Group about the transition. 9. All discussions are in channel: #openstack-ansible-sig on Freenode. 10. 
Please help with reviews and merges as we want to do it asap to prevent big diverges between modules in Ansible and Openstack, also not to block people for a long time. Thanks On Tue, Jan 7, 2020 at 1:20 PM Sagi Shnaidman wrote: > Hi, > last meeting was pretty short and only 2 participants due to holidays, so > I think we can discuss this week the same agenda. I remind that we agreed > to move ansible modules after 13 January. > - what is the best strategy for freezing current modules in Ansible? > Because a few patches were merged just recently [1] Seems like "freezing" > doesn't really work. > - python versions support in modules > - keeping history when moving modules > and other topics [2] > Please add your questions to "Open discussion" section if there are some. > Thanks > > [1] > https://github.com/ansible/ansible/commits/devel/lib/ansible/modules/cloud/openstack > [2] https://etherpad.openstack.org/p/openstack-ansible-modules > > > On Fri, Dec 13, 2019 at 12:00 AM Sagi Shnaidman > wrote: > >> Hi, all >> short minutes from the meeting today about moving of Openstack Ansible >> modules to Openstack. >> >> 1. Because of some level of uncertainty and different opinions, the >> details of treatment of old modules will be under discussion in ML. I'll >> send a mail about this topic. >> 2. We agreed to have modules under "openstack." namespace and named >> "cloud". So regular modules will be named like "openstack.cloud.os_server" >> for example. >> 3. We agreed to keep Ansible modules as thin as possible, putting the >> logic into SDK. >> 4. Also we will keep compatibility with as much Ansible versions as >> possible. >> 5. We agreed to have manual releases of Ansible modules as much as we >> need. Similarly as it's done with SDK. >> >> Logs: >> http://eavesdrop.openstack.org/meetings/api_sig/2019/api_sig.2019-12-12-16.00.log.html >> Minutes: >> http://eavesdrop.openstack.org/meetings/api_sig/2019/api_sig.2019-12-12-16.00.html >> Etherpad: https://etherpad.openstack.org/p/openstack-ansible-modules >> >> Next time: Thursday 19 Dec 2019 4.00 PM UTC. >> >> Thanks >> >> On Fri, Dec 6, 2019 at 12:03 AM Sagi Shnaidman >> wrote: >> >>> Hi, all >>> short minutes from the meeting today about Openstack Ansible modules. >>> >>> 1. Ansible 2.10 is going to move all modules to collections, so >>> Openstack modules should find a new home in Openstack repos. >>> 2. Namespace for openstack modules will be named "openstack.". What is >>> coming after the dot is still under discussion. >>> 3. Current modules will be migrated to collections in "openstack." as is >>> with their names and will be still available for playbooks (via >>> symlinking). It will avoid breaking people that use in their playbooks os_* >>> modules now. >>> 4. Old modules will be frozen after migrations and all development work >>> will go in the new modules which will live aside. >>> 5. Critical bugfixes to 2.9 versions will be done via Ansible GitHub >>> repo as usual and synced manually to "openstack." collection. It must be a >>> very exceptional case. >>> 6. Migrations are set for mid of January 2020 approximately. >>> 7. Modules should stay compatible with last Ansible and collections API >>> changed. >>> 8. Because current old modules are licensed with GPL and license of >>> Openstack is Apache2, we need to figure out if we can either relicense them >>> or develop new ones with different license or to continue to work on new >>> ones with GPL in SIG repo. Agreed to ask on legal-discuss ML. 
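For illustration, once the collection is published to Galaxy, consuming the migrated modules under the agreed namespace could look roughly like the minimal playbook sketched below (the cloud name and server parameters are placeholders; the module naming follows the "openstack.cloud.os_server" convention mentioned in the minutes):

---
- hosts: localhost
  gather_facts: false
  collections:
    - openstack.cloud
  tasks:
    - name: Boot a server through the migrated module
      os_server:
        cloud: mycloud      # clouds.yaml entry, placeholder
        name: demo-vm
        image: cirros
        flavor: m1.tiny
        state: present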
>>> >>> Long minutes: >>> http://eavesdrop.openstack.org/meetings/api_sig/2019/api_sig.2019-12-05-16.00.html >>> Logs: >>> http://eavesdrop.openstack.org/meetings/api_sig/2019/api_sig.2019-12-05-16.00.log.html >>> >>> Etherpad: https://etherpad.openstack.org/p/openstack-ansible-modules >>> Next time Thursday 12 Dec 2019 4.00 PM UTC. >>> >>> Thanks >>> >>> On Tue, Dec 3, 2019 at 8:18 PM Sagi Shnaidman >>> wrote: >>> >>>> Hi, all >>>> In the meeting today we agreed to meet every Thursday starting *this >>>> week* at 4.00 PM UTC on #openstack-sdks channel on Freenode. We'll >>>> discuss everything related to Openstack Ansible modules. >>>> Agenda and topics are in the etherpad: >>>> https://etherpad.openstack.org/p/openstack-ansible-modules >>>> (I've created a new one, because we don't limit to Ironic modules only, >>>> it's about all of them in general) >>>> >>>> Short minutes from meeting today: >>>> Organizational: >>>> 1. We meet every Thursday from this week at 4.00 PM UTC on >>>> #openstack-sdks >>>> 2. Interested parties for now are: Ironic, Tripleo, Openstack-Ansible, >>>> Kolla-ansible, OpenstackSDK teams. Feel free to join and add yourself in >>>> the etherpad. [1] >>>> 3. We'll track our work in Storyboard for ansible-collections-openstack >>>> (in progress) >>>> 4. Openstack Ansible modules will live as collections under Ansible SIG >>>> in repo openstack/ansible-collections-openstack [2] because there are >>>> issues with different licensing: GPLv3 for Ansible in upstream and >>>> Openstack license (Apache2). >>>> 5. Ansible upstream Openstack modules will be merge-frozen when we'll >>>> have our collections fully working and will be deprecated from Ansible at >>>> some point in the future. >>>> 6. Openstack Ansible collections will be published to Galaxy. >>>> 7. There is a list of people that can be pinged for reviews in >>>> ansible-collections-openstack project, feel free to join there [1] >>>> >>>> Technical: >>>> 1. We use openstacksdk instead of [project]client modules. >>>> 2. We will rename modules to be more like os_[service_type] named, >>>> examples are in Ironic modules etherpad [3] >>>> >>>> Logs from meeting today you can find here: >>>> http://eavesdrop.openstack.org/meetings/ansible_sig/2019/ansible_sig.2019-12-03-15.01.log.html >>>> Please feel free to participate and add topics to agenda. [1] >>>> >>>> [1] https://etherpad.openstack.org/p/openstack-ansible-modules >>>> [2] https://review.opendev.org/#/c/684740/ >>>> [3] https://etherpad.openstack.org/p/ironic-ansible-modules >>>> >>>> Thanks >>>> >>>> On Wed, Nov 27, 2019 at 7:57 PM Sagi Shnaidman >>>> wrote: >>>> >>>>> Hi, all >>>>> >>>>> in the light of finding the new home place for openstack related >>>>> ansible modules [1] I'd like to discuss the best strategy to create Ironic >>>>> ansible modules. Existing Ironic modules in Ansible repo don't cover even >>>>> half of Ironic functionality, don't fit current needs and definitely >>>>> require an additional work. There are a few topics that require attention >>>>> and better be solved before modules are written to save additional work. We >>>>> prepared an etherpad [2] with all these questions and if you have ideas or >>>>> suggestions on how it should look you're welcome to update it. >>>>> We'd like to decide the final place for them, name conventions (the >>>>> most complex one!), what they should look like and how better to implement. 
>>>>> Anybody interested in Ansible and baremetal management in Openstack, >>>>> you're more than welcome to contribute. >>>>> >>>>> Thanks >>>>> >>>>> [1] https://review.opendev.org/#/c/684740/ >>>>> [2] https://etherpad.openstack.org/p/ironic-ansible-modules >>>>> >>>>> -- >>>>> Best regards >>>>> Sagi Shnaidman >>>>> >>>> >>>> >>>> -- >>>> Best regards >>>> Sagi Shnaidman >>>> >>> >>> >>> -- >>> Best regards >>> Sagi Shnaidman >>> >> >> >> -- >> Best regards >> Sagi Shnaidman >> > > > -- > Best regards > Sagi Shnaidman > -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Tue Jan 21 10:37:22 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 21 Jan 2020 11:37:22 +0100 Subject: [api][sdk][dev][oslo] using uWSGI breaks CORS config In-Reply-To: References: Message-ID: Hi, Michael! Thanks for your interest. I was also surprised that it does not work. It looks simple enough to not break... Since I wrote the email, I did more research on how to tackle this problem. For now I posted a bug to devstack that CORS is out of reach with pure devstack [1]. Only keystone and placement can be configured to use mod_wsgi, others default to uwsgi (or eventlet, to be precise, but these are irrelevant). It's really either hacking browsers or hacking apache config to include relevant CORS headers. [1] https://bugs.launchpad.net/devstack/+bug/1860287 -yoctozepto wt., 21 sty 2020 o 10:00 Michael McCune napisał(a): > > hi Radoslaw, > > i am also curious about this because i had thought we had CORS issued solved for uWSGI in the past, i will need to look around to find the conversations i was having. > > thanks for sharing your investigation, i think this is interesting. > > peace o/ > > On Fri, Jan 17, 2020 at 1:45 PM Radosław Piliszek wrote: >> >> Fellow Devs, >> >> as you might have noticed I started taking care of openstack/js-openstack-lib, >> now under the openstacksdk umbrella [1]. >> First goal is to modernize the CI to use Zuul v3, current devstack and >> nodejs, still WIP [2]. >> >> As part of the original suite of tests, the unit and functional tests >> are run from browsers as well as from node. >> And, as you may know, browsers care about CORS [3]. >> js-openstack-lib is connecting to various OpenStack APIs (currently >> limited to keystone, glance, neutron and nova) to act on behalf of the >> user (just like openstacksdk/client does). >> oslo.middleware, as used by those APIs, provides a way to configure >> CORS by setting params in the [cors] group but uWSGI seemingly ignores >> that completely [4]. >> I had to switch to mod_wsgi+apache instead of uwsgi+apache to get past >> that issue. >> I could not reproduce locally because kolla (thankfully) uses mostly >> mod_wsgi atm. >> >> The issue I see is that uWSGI is proposed as the future and mod_wsgi >> is termed deprecated. >> However, this means the future is broken w.r.t. CORS and so any modern >> web interface with it if not sitting on the exact same host and port >> (which is usually different between OpenStack APIs and any UI). 
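For context, the oslo.middleware knob being referred to here is the [cors] section of each service's configuration file; a minimal sketch (the origin is a placeholder for wherever the JS test suite or UI is served from) would be roughly:

[cors]
allowed_origin = http://localhost:8080
allow_credentials = true
allow_headers = Content-Type,X-Auth-Token

The point of the thread is that this takes effect when the API runs under mod_wsgi but seemingly not when it runs under uWSGI.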
>> >> [1] https://review.opendev.org/701854 >> [2] https://review.opendev.org/702132 >> [3] https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS >> [4] https://github.com/unbit/uwsgi/issues/1550 >> >> -yoctozepto >> From tkajinam at redhat.com Tue Jan 21 13:00:59 2020 From: tkajinam at redhat.com (Takashi Kajinami) Date: Tue, 21 Jan 2020 22:00:59 +0900 Subject: [all][ci] Gate jobs failures because of broken pip Message-ID: Hi All, I now observe some of the gate jobs are failing[1] because of broken pip[2], and it guess that we need to pin pip to avoid the error atm. [1] https://81b633e2c5fe858f8400-d324a81a71d524d51ede3dc5aee27774.ssl.cf5.rackcdn.com/702831/4/check/networking-ovn-tempest-dsvm-ovs-release/c062094/ [2] https://github.com/pypa/pip/issues/7217 May I ask for some help to set that pinning ? I couldn't find the correct way to pin pip after several investigation ... Thank you, Takashi -- ---------- Takashi Kajinami Software Maintenance Engineer Customer Experience and Engagement Red Hat e-mail: tkajinam at redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From eyalb1 at gmail.com Tue Jan 21 13:07:48 2020 From: eyalb1 at gmail.com (Eyal B) Date: Tue, 21 Jan 2020 15:07:48 +0200 Subject: [all][ci] Gate jobs failures because of broken pip In-Reply-To: References: Message-ID: Hi, It seems they fixed it https://github.com/pypa/pip/issues/7217 Eyal On Tue, 21 Jan 2020 at 15:03, Takashi Kajinami wrote: > Hi All, > > I now observe some of the gate jobs are failing[1] because of broken > pip[2], and it guess that we need to pin pip to avoid the error atm. > > [1] > https://81b633e2c5fe858f8400-d324a81a71d524d51ede3dc5aee27774.ssl.cf5.rackcdn.com/702831/4/check/networking-ovn-tempest-dsvm-ovs-release/c062094/ > [2] https://github.com/pypa/pip/issues/7217 > > May I ask for some help to set that pinning ? > I couldn't find the correct way to pin pip after several investigation ... > > Thank you, > Takashi > > -- > ---------- > Takashi Kajinami > Software Maintenance Engineer > Customer Experience and Engagement > Red Hat > e-mail: tkajinam at redhat.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Tue Jan 21 13:10:12 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 21 Jan 2020 14:10:12 +0100 Subject: [all][ci] Gate jobs failures because of broken pip In-Reply-To: References: Message-ID: Looks like pip fixed itself in 20.0.1 https://pypi.org/project/pip/#history -yoctozepto wt., 21 sty 2020 o 14:09 Takashi Kajinami napisał(a): > > Hi All, > > I now observe some of the gate jobs are failing[1] because of broken pip[2], and it guess that we need to pin pip to avoid the error atm. > > [1] https://81b633e2c5fe858f8400-d324a81a71d524d51ede3dc5aee27774.ssl.cf5.rackcdn.com/702831/4/check/networking-ovn-tempest-dsvm-ovs-release/c062094/ > [2] https://github.com/pypa/pip/issues/7217 > > May I ask for some help to set that pinning ? > I couldn't find the correct way to pin pip after several investigation ... 
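For reference, until the fixed release is picked up, one way to pin pip in a job or virtualenv is simply to (re)install a capped version before anything else runs, e.g.:

pip install -U 'pip<20.0'

(or 'pip!=20.0' now that 20.0.1 is out). Where exactly to hook that in depends on the job definition, so treat it as a sketch rather than the canonical fix.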
> > Thank you, > Takashi > > -- > ---------- > Takashi Kajinami > Software Maintenance Engineer > Customer Experience and Engagement > Red Hat > e-mail: tkajinam at redhat.com > From bdobreli at redhat.com Tue Jan 21 14:06:59 2020 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Tue, 21 Jan 2020 15:06:59 +0100 Subject: Updating Openstack from Pike non-containerised to something current In-Reply-To: References: Message-ID: On 21.01.2020 10:16, Tony Pearce wrote: > I am planning to upgrade my Openstack Pike. Please could I ask some > questions about this so I understand better how to proceed? > *Background*: > - I have Openstack Pike running on 3 nodes (2 x compute and 1 x > controller). Storage is provided by a Cinder driver and separate vendor > storage array (iSCSI). > > - I have a bug which is impacting Cinder in Pike which is not present in > Queens/Stein. > > - Since I installed Openstack some years ago, I was not able to > update/patch it for a number of reasons. > > - The Openstack which I installed was from the tripleo.org > website and I followed the instructions there. > > - I have configured LBaaS that was available in Pike by making config > changes to .conf files. > > - I recall reading a few months ago when I started looking into this > that I need to update to the latest within the current version (Pike) > first, before updating beyond Pike so I am planning for that. > > > *Questions*: > 1. do I first need to deconfigure / remove the LBaaS before attempting > an upgrade beyond Pike? > 2. The tripleo website is a little confusing. I am running Pike and it > is not containerised. The upgrade info on tripleo.org > mentions upgrades to Pike in a containerised You should use official TripleO documentation [0] [0] https://docs.openstack.org/tripleo-docs/latest/upgrade/developer/upgrades/major_upgrade.html > environment. So should I follow the detail for the "Ocata and earlier" > release in this case? ref: > http://tripleo.org/upgrade/minor_update.html#updating-your-overcloud-ocata-and-earlier > 3. Do I need to upgrade to each release? ie from Pike > Queens then > Queens > Stein etc ? Or can I do this: > a) update Pike to the latest in the current Pike release > b) Update Pike to Train You have to do it one by one, interleaving the two steps: upgrading undercloud [1] firstly, then the overcloud. There is FFU (fast-forward upgrade) [2] but AFAIK it is only expected to work when going from Newton to Queens. Upgrading by N+3 from Pike to Stein would involve upgrading OS major version on hosts, so that guide wouldn't fit it. FFU from Queens to Train is under developed. [1] https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/upgrade/undercloud.html [2] https://docs.openstack.org/tripleo-docs/latest/upgrade/developer/upgrades/fast_fw_upgrade.html > > > *What I've gathered so far:* > - If I can execute the 'openstack overcloud update' and limit it to the > controller node then I can apply this to a cloned environment consisting > of 1 x undercloud "director" and 1 x controller but I am unable to stage > this for the compute nodes because they are physical bare-metal > > - the tripleo website guide lists the latest as 'Rocky' but I could see > Stein in the yum search. 
I think I can follow the tripleo guide with a > pinch of salt and replace *rocky* with *stein* > > > > Thank you > > > *Tony Pearce*| *Senior Network Engineer / Infrastructure Lead > **Cinglevue International * > > Email: tony.pearce at cinglevue.com > Web: http://www.cinglevue.com ** > > *Australia* > 1 Walsh Loop, Joondalup, WA 6027 Australia. > > Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 > > Note: This email and all attachments are the sole property of Cinglevue > International Pty Ltd. (or any of its subsidiary entities), and the > information contained herein must be considered confidential, unless > specified otherwise.   If you are not the intended recipient, you must > not use or forward the information contained in these documents.   If > you have received this message in error, please delete the email and > notify the sender. > -- Best regards, Bogdan Dobrelya, Irc #bogdando From jean-philippe at evrard.me Tue Jan 21 14:44:19 2020 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 21 Jan 2020 15:44:19 +0100 Subject: [all][tc] What happened in OpenStack Governance recently Message-ID: <513ef6a804473e72e6726b851e7604480fc840fa.camel@evrard.me> Hello everyone, It's official, the V release will be codenamed Victoria! The TC is now working on declaring its runtimes [A]. The TC has merged a change in its charter, establishing new rules for the future terms [B]. This will give us leeway for elections and term lengths. The cycle goals selection schedule was made clearer, so that we don't add goals to the teams too late in the future [C]. We have selected the 'project specific PTL and contributor guides' as an official goal for Ussuri. I invite you to read the update of Ghanshyam Mann (gmann) to follow what's going on for our Ussuri goals [D]. You should also have a look at the zuul legacy migration [G], as it could eventually be selected as a V goal :) In terms of release naming, we'll start the W naming process soon, because we know that naming things are hard [E]. If you're interested by security policy, you can have a look at Jeremy's patch [F]. In terms of projects: - There is a new official charm, for watcher [H] - js-openstack-lib is now part of the openstacksdk team [I] - there is a new neutron project, ovn-octavia-provider [J] - the ansible-sig got a new repository, ansible-plugin-container- connection [K] - Justin Ferrieu is the PTL for Cloudkitty for the rest of the cycle [L] - sushi-cli is now part of the ironic team [M] Happy new year to our Chinese community! Regards, Jean-Philippe & Rico [A]: https://review.opendev.org/#/c/693743/ [B]: https://review.opendev.org/#/c/699277/ [C]: https://review.opendev.org/#/c/698299/ [D]: http://lists.openstack.org/pipermail/openstack-discuss/2020-January/012019.html [E]: https://review.opendev.org/#/c/702414/ [F]: https://review.opendev.org/#/c/678426/ [G]: https://review.opendev.org/#/c/691278/ [H]: https://review.opendev.org/#/c/702072/ [I]: https://review.opendev.org/#/c/701854/ [J]: https://review.opendev.org/#/c/697095/ [K]: https://review.opendev.org/#/c/696737/ [L]: https://review.opendev.org/#/c/701214/ [M]: https://review.opendev.org/#/c/700914/ From jean-philippe at evrard.me Tue Jan 21 14:49:03 2020 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Tue, 21 Jan 2020 15:49:03 +0100 Subject: [tc] Follow-up on action tems Message-ID: Hello TC members, Here are a few things that should keep your attention, on top of the usual reviews... - We should start the business opportunities for 2020. 
Any takers?
- We have action items from the last meeting:
  - ttx to post a summary of the large scale SIG's meeting on the ML this week.
  - mnaser to follow up on the static hosting
  - mnaser to push an update to distill what happened on the ML in terms of stable branch policy
  - jungleboyj to update the survey report patch
  - everyone to let the multi-arch SIG ( https://etherpad.openstack.org/p/Multi-arch) know if you have CI resources in different architectures
- We also have new action items:
  - We should update the governance dashboard to include the "ideas" repo. Anyone?
  - We need to keep tracking the Telemetry status ( http://lists.openstack.org/pipermail/openstack-discuss/2019-December/011586.html ).
  - Our next meeting will be held on 6 February. You can suggest agenda items at https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee
  - ttx should report on his investigation into a possible bylaws change for an eventual leadership (tc/uc) merge; the UC was more or less okay with the idea, which allows the investigation to move forward. Note: There is no official stance from the TC on this yet. It's just investigation.
  - We have updated the SIG status tags with https://review.opendev.org/#/c/695625/. This will make sure people are aware of the actual status. As mentioned, ttx and ricolin will work on exposing that status tag on the governance SIGs page.

And finally, some things to keep in mind:
- We need to promote the First Contact SIG to organisations that might need help getting things achieved in OpenStack
- We need to start collecting ideas for the ideas repo.

Regards,
JP & Rico

From sean.mcginnis at gmx.com Tue Jan 21 15:48:09 2020
From: sean.mcginnis at gmx.com (Sean McGinnis)
Date: Tue, 21 Jan 2020 09:48:09 -0600
Subject: [all] Nominations for the "W" release name
Message-ID:

Hello all,

We get to be a little proactive this time around and get the release name chosen for the "W" release. Time to start thinking of good names again!

Process Changes
---------------

There are a couple of changes to be aware of with our naming process. In the past, we had always based our naming criteria on something geographically local to the Summit location. With the event changes to no longer have two large Summit-type events per year, we have tweaked our process to open things up and hopefully make it a little easier to pick a good name that the community likes.

There are a couple of significant changes. First, names can now be proposed for anything that starts with the appropriate letter. It is no longer tied to a specific geographic region. Second, in order to simplify the process, the electorate for the poll will be the OpenStack Technical Committee.

Full details of the release naming process can be found here: https://governance.openstack.org/tc/reference/release-naming.html

Name Selection
--------------

With that, the nomination period for the "W" release name is now open. Please add suitable names to: https://wiki.openstack.org/wiki/Release_Naming/W_Proposals

We will accept nominations until February 7, 2020 at 23:59:59 UTC. We will then have a brief period for any necessary discussions and to get the poll set up, with the TC electorate voting starting by February 17, 2020 and going no longer than February 23, 2020.

Based on past timing with trademark and copyright reviews, we will likely have an official release name by mid to late March.

Happy naming!
Sean From elmiko at redhat.com Tue Jan 21 15:55:22 2020 From: elmiko at redhat.com (Michael McCune) Date: Tue, 21 Jan 2020 10:55:22 -0500 Subject: [api][sdk][dev][oslo] using uWSGI breaks CORS config In-Reply-To: References: Message-ID: On Tue, Jan 21, 2020 at 5:37 AM Radosław Piliszek < radoslaw.piliszek at gmail.com> wrote: > Hi, Michael! > > Thanks for your interest. I was also surprised that it does not work. > It looks simple enough to not break... > > Since I wrote the email, I did more research on how to tackle this problem. > For now I posted a bug to devstack that CORS is out of reach with pure > devstack [1]. > that sounds like a good first step. Only keystone and placement can be configured to use mod_wsgi, others > default to uwsgi (or eventlet, to be precise, but these are > irrelevant). > It's really either hacking browsers or hacking apache config to > include relevant CORS headers. > > this is something that i think we need to know though, because if uWSGI has broken CORS support then we need to fix it or start advising against its usage. peace o/ [1] https://bugs.launchpad.net/devstack/+bug/1860287 > > -yoctozepto > > wt., 21 sty 2020 o 10:00 Michael McCune napisał(a): > > > > hi Radoslaw, > > > > i am also curious about this because i had thought we had CORS issued > solved for uWSGI in the past, i will need to look around to find the > conversations i was having. > > > > thanks for sharing your investigation, i think this is interesting. > > > > peace o/ > > > > On Fri, Jan 17, 2020 at 1:45 PM Radosław Piliszek < > radoslaw.piliszek at gmail.com> wrote: > >> > >> Fellow Devs, > >> > >> as you might have noticed I started taking care of > openstack/js-openstack-lib, > >> now under the openstacksdk umbrella [1]. > >> First goal is to modernize the CI to use Zuul v3, current devstack and > >> nodejs, still WIP [2]. > >> > >> As part of the original suite of tests, the unit and functional tests > >> are run from browsers as well as from node. > >> And, as you may know, browsers care about CORS [3]. > >> js-openstack-lib is connecting to various OpenStack APIs (currently > >> limited to keystone, glance, neutron and nova) to act on behalf of the > >> user (just like openstacksdk/client does). > >> oslo.middleware, as used by those APIs, provides a way to configure > >> CORS by setting params in the [cors] group but uWSGI seemingly ignores > >> that completely [4]. > >> I had to switch to mod_wsgi+apache instead of uwsgi+apache to get past > >> that issue. > >> I could not reproduce locally because kolla (thankfully) uses mostly > >> mod_wsgi atm. > >> > >> The issue I see is that uWSGI is proposed as the future and mod_wsgi > >> is termed deprecated. > >> However, this means the future is broken w.r.t. CORS and so any modern > >> web interface with it if not sitting on the exact same host and port > >> (which is usually different between OpenStack APIs and any UI). > >> > >> [1] https://review.opendev.org/701854 > >> [2] https://review.opendev.org/702132 > >> [3] https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS > >> [4] https://github.com/unbit/uwsgi/issues/1550 > >> > >> -yoctozepto > >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at nemebean.com Tue Jan 21 16:30:46 2020 From: openstack at nemebean.com (Ben Nemec) Date: Tue, 21 Jan 2020 17:30:46 +0100 Subject: [api][sdk][dev][oslo] using uWSGI breaks CORS config In-Reply-To: References: Message-ID: On 1/21/20 4:55 PM, Michael McCune wrote: > > > On Tue, Jan 21, 2020 at 5:37 AM Radosław Piliszek > > wrote: > > Hi, Michael! > > Thanks for your interest. I was also surprised that it does not work. > It looks simple enough to not break... > > Since I wrote the email, I did more research on how to tackle this > problem. > For now I posted a bug to devstack that CORS is out of reach with pure > devstack [1]. > > > that sounds like a good first step. > > Only keystone and placement can be configured to use mod_wsgi, others > default to uwsgi (or eventlet, to be precise, but these are > irrelevant). > It's really either hacking browsers or hacking apache config to > include relevant CORS headers. > > > this is something that i think we need to know though, because if uWSGI > has broken CORS support then we need to fix it or start advising against > its usage. Just to clarify, based on [0] it looks like this is a uwsgi bug and not something we're doing wrong (other than recommending uwsgi ;-). Is that correct? 0: https://github.com/unbit/uwsgi/issues/1550 > > peace o/ > > [1] https://bugs.launchpad.net/devstack/+bug/1860287 > > -yoctozepto > > wt., 21 sty 2020 o 10:00 Michael McCune > napisał(a): > > > > hi Radoslaw, > > > > i am also curious about this because i had thought we had CORS > issued solved for uWSGI in the past, i will need to look around to > find the conversations i was having. > > > > thanks for sharing your investigation, i think this is interesting. > > > > peace o/ > > > > On Fri, Jan 17, 2020 at 1:45 PM Radosław Piliszek > > > wrote: > >> > >> Fellow Devs, > >> > >> as you might have noticed I started taking care of > openstack/js-openstack-lib, > >> now under the openstacksdk umbrella [1]. > >> First goal is to modernize the CI to use Zuul v3, current > devstack and > >> nodejs, still WIP [2]. > >> > >> As part of the original suite of tests, the unit and functional > tests > >> are run from browsers as well as from node. > >> And, as you may know, browsers care about CORS [3]. > >> js-openstack-lib is connecting to various OpenStack APIs (currently > >> limited to keystone, glance, neutron and nova) to act on behalf > of the > >> user (just like openstacksdk/client does). > >> oslo.middleware, as used by those APIs, provides a way to configure > >> CORS by setting params in the [cors] group but uWSGI seemingly > ignores > >> that completely [4]. > >> I had to switch to mod_wsgi+apache instead of uwsgi+apache to > get past > >> that issue. > >> I could not reproduce locally because kolla (thankfully) uses mostly > >> mod_wsgi atm. > >> > >> The issue I see is that uWSGI is proposed as the future and mod_wsgi > >> is termed deprecated. > >> However, this means the future is broken w.r.t. CORS and so any > modern > >> web interface with it if not sitting on the exact same host and port > >> (which is usually different between OpenStack APIs and any UI). 
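As a quick check of whether the CORS headers are being emitted at all (i.e. whether the middleware runs and the WSGI server passes its headers through), one can fire a preflight request by hand against a deployed endpoint and look for Access-Control-Allow-Origin in the response; the URL and origin below are placeholders:

curl -i -X OPTIONS \
  -H "Origin: http://localhost:8080" \
  -H "Access-Control-Request-Method: GET" \
  https://api.example.com/identity/v3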
> >> > >> [1] https://review.opendev.org/701854 > >> [2] https://review.opendev.org/702132 > >> [3] https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS > >> [4] https://github.com/unbit/uwsgi/issues/1550 > >> > >> -yoctozepto > >> > From smooney at redhat.com Tue Jan 21 17:15:01 2020 From: smooney at redhat.com (Sean Mooney) Date: Tue, 21 Jan 2020 17:15:01 +0000 Subject: [api][sdk][dev][oslo] using uWSGI breaks CORS config In-Reply-To: References: Message-ID: <7e1f612534bb572af4c645d3d58510bd7a1e4f4d.camel@redhat.com> On Tue, 2020-01-21 at 17:30 +0100, Ben Nemec wrote: > > On 1/21/20 4:55 PM, Michael McCune wrote: > > > > > > On Tue, Jan 21, 2020 at 5:37 AM Radosław Piliszek > > > wrote: > > > > Hi, Michael! > > > > Thanks for your interest. I was also surprised that it does not work. > > It looks simple enough to not break... > > > > Since I wrote the email, I did more research on how to tackle this > > problem. > > For now I posted a bug to devstack that CORS is out of reach with pure > > devstack [1]. > > > > > > that sounds like a good first step. > > > > Only keystone and placement can be configured to use mod_wsgi, others > > default to uwsgi (or eventlet, to be precise, but these are > > irrelevant). > > It's really either hacking browsers or hacking apache config to > > include relevant CORS headers. > > > > > > this is something that i think we need to know though, because if uWSGI > > has broken CORS support then we need to fix it or start advising against > > its usage. > > Just to clarify, based on [0] it looks like this is a uwsgi bug and not > something we're doing wrong (other than recommending uwsgi ;-). Is that > correct? > > 0: https://github.com/unbit/uwsgi/issues/1550 is it somehting that we expect the wsgi server to add or the applciation? e.g. should the CORS headders be added by the apache/nginx/uwsgi or openstack. but my only interaction in the past was setting it via the apache .htaccess file. https://enable-cors.org/server_apache.html you can also do it in nginx https://enable-cors.org/server_nginx.html it seam like we use oslo midelware to do this dynmaically form the api code. https://docs.openstack.org/oslo.middleware/latest/admin/cross-project-cors.html so perhaps that is not executing or uswsgi si stripping the headers. they may be filtering them but i cant really find any documentaiton on uswgi and CORS configuration > > > > > peace o/ > > > > [1] https://bugs.launchpad.net/devstack/+bug/1860287 > > > > -yoctozepto > > > > wt., 21 sty 2020 o 10:00 Michael McCune > > napisał(a): > > > > > > hi Radoslaw, > > > > > > i am also curious about this because i had thought we had CORS > > issued solved for uWSGI in the past, i will need to look around to > > find the conversations i was having. > > > > > > thanks for sharing your investigation, i think this is interesting. > > > > > > peace o/ > > > > > > On Fri, Jan 17, 2020 at 1:45 PM Radosław Piliszek > > > > > wrote: > > >> > > >> Fellow Devs, > > >> > > >> as you might have noticed I started taking care of > > openstack/js-openstack-lib, > > >> now under the openstacksdk umbrella [1]. > > >> First goal is to modernize the CI to use Zuul v3, current > > devstack and > > >> nodejs, still WIP [2]. > > >> > > >> As part of the original suite of tests, the unit and functional > > tests > > >> are run from browsers as well as from node. > > >> And, as you may know, browsers care about CORS [3]. 
> > >> js-openstack-lib is connecting to various OpenStack APIs (currently > > >> limited to keystone, glance, neutron and nova) to act on behalf > > of the > > >> user (just like openstacksdk/client does). > > >> oslo.middleware, as used by those APIs, provides a way to configure > > >> CORS by setting params in the [cors] group but uWSGI seemingly > > ignores > > >> that completely [4]. > > >> I had to switch to mod_wsgi+apache instead of uwsgi+apache to > > get past > > >> that issue. > > >> I could not reproduce locally because kolla (thankfully) uses mostly > > >> mod_wsgi atm. > > >> > > >> The issue I see is that uWSGI is proposed as the future and mod_wsgi > > >> is termed deprecated. > > >> However, this means the future is broken w.r.t. CORS and so any > > modern > > >> web interface with it if not sitting on the exact same host and port > > >> (which is usually different between OpenStack APIs and any UI). > > >> > > >> [1] https://review.opendev.org/701854 > > >> [2] https://review.opendev.org/702132 > > >> [3] https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS > > >> [4] https://github.com/unbit/uwsgi/issues/1550 > > >> > > >> -yoctozepto > > >> > > > > From radoslaw.piliszek at gmail.com Tue Jan 21 17:31:28 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 21 Jan 2020 18:31:28 +0100 Subject: [api][sdk][dev][oslo] using uWSGI breaks CORS config In-Reply-To: <7e1f612534bb572af4c645d3d58510bd7a1e4f4d.camel@redhat.com> References: <7e1f612534bb572af4c645d3d58510bd7a1e4f4d.camel@redhat.com> Message-ID: Ben wrote: > it looks like this is a uwsgi bug and not > something we're doing wrong (other than recommending uwsgi ;-). Precisely. Well, there is always a chance either side does something wrong until one acknowledges doing it wrong but if django and oslo.middleware can send CORS headers via mod_wsgi and both cannot via uWSGI, then it's uWSGI that's more suspicious. ;-) Sean wrote: > is it somehting that we expect the wsgi server to add or the applciation? > it seam like we use oslo midelware to do this dynmaically form the api code. Wearing my web developer hat, I would say it's up to the application to do, in general, since different resources may need different CORS headers. In the contrived example of testing in CI, allowing anything for the sake of easy access, it does not really matter. :-) Sean wrote: > but my only interaction in the past was setting it via the apache .htaccess file. Yeah, any decent web server allows to add/remove/override headers. Sean wrote: > they may be filtering them but i cant really find any documentaiton on uswgi and CORS configuration Indeed that's what the linked bug report suggests as well as random googlable info. I wonder why nobody from uWSGI answered the bug call. :/ -yoctozepto wt., 21 sty 2020 o 18:15 Sean Mooney napisał(a): > > On Tue, 2020-01-21 at 17:30 +0100, Ben Nemec wrote: > > > > On 1/21/20 4:55 PM, Michael McCune wrote: > > > > > > > > > On Tue, Jan 21, 2020 at 5:37 AM Radosław Piliszek > > > > wrote: > > > > > > Hi, Michael! > > > > > > Thanks for your interest. I was also surprised that it does not work. > > > It looks simple enough to not break... > > > > > > Since I wrote the email, I did more research on how to tackle this > > > problem. > > > For now I posted a bug to devstack that CORS is out of reach with pure > > > devstack [1]. > > > > > > > > > that sounds like a good first step. 
> > > > > > Only keystone and placement can be configured to use mod_wsgi, others > > > default to uwsgi (or eventlet, to be precise, but these are > > > irrelevant). > > > It's really either hacking browsers or hacking apache config to > > > include relevant CORS headers. > > > > > > > > > this is something that i think we need to know though, because if uWSGI > > > has broken CORS support then we need to fix it or start advising against > > > its usage. > > > > Just to clarify, based on [0] it looks like this is a uwsgi bug and not > > something we're doing wrong (other than recommending uwsgi ;-). Is that > > correct? > > > > 0: https://github.com/unbit/uwsgi/issues/1550 > > is it somehting that we expect the wsgi server to add or the applciation? > e.g. should the CORS headders be added by the apache/nginx/uwsgi or openstack. > > but my only interaction in the past was setting it via the apache .htaccess file. > https://enable-cors.org/server_apache.html you can also do it in nginx https://enable-cors.org/server_nginx.html > > it seam like we use oslo midelware to do this dynmaically form the api code. > > https://docs.openstack.org/oslo.middleware/latest/admin/cross-project-cors.html > so perhaps that is not executing or uswsgi si stripping the headers. > > they may be filtering them but i cant really find any documentaiton on uswgi and CORS configuration > > > > > > > > > peace o/ > > > > > > [1] https://bugs.launchpad.net/devstack/+bug/1860287 > > > > > > -yoctozepto > > > > > > wt., 21 sty 2020 o 10:00 Michael McCune > > > napisał(a): > > > > > > > > hi Radoslaw, > > > > > > > > i am also curious about this because i had thought we had CORS > > > issued solved for uWSGI in the past, i will need to look around to > > > find the conversations i was having. > > > > > > > > thanks for sharing your investigation, i think this is interesting. > > > > > > > > peace o/ > > > > > > > > On Fri, Jan 17, 2020 at 1:45 PM Radosław Piliszek > > > > > > > wrote: > > > >> > > > >> Fellow Devs, > > > >> > > > >> as you might have noticed I started taking care of > > > openstack/js-openstack-lib, > > > >> now under the openstacksdk umbrella [1]. > > > >> First goal is to modernize the CI to use Zuul v3, current > > > devstack and > > > >> nodejs, still WIP [2]. > > > >> > > > >> As part of the original suite of tests, the unit and functional > > > tests > > > >> are run from browsers as well as from node. > > > >> And, as you may know, browsers care about CORS [3]. > > > >> js-openstack-lib is connecting to various OpenStack APIs (currently > > > >> limited to keystone, glance, neutron and nova) to act on behalf > > > of the > > > >> user (just like openstacksdk/client does). > > > >> oslo.middleware, as used by those APIs, provides a way to configure > > > >> CORS by setting params in the [cors] group but uWSGI seemingly > > > ignores > > > >> that completely [4]. > > > >> I had to switch to mod_wsgi+apache instead of uwsgi+apache to > > > get past > > > >> that issue. > > > >> I could not reproduce locally because kolla (thankfully) uses mostly > > > >> mod_wsgi atm. > > > >> > > > >> The issue I see is that uWSGI is proposed as the future and mod_wsgi > > > >> is termed deprecated. > > > >> However, this means the future is broken w.r.t. CORS and so any > > > modern > > > >> web interface with it if not sitting on the exact same host and port > > > >> (which is usually different between OpenStack APIs and any UI). 
> > > >> > > > >> [1] https://review.opendev.org/701854 > > > >> [2] https://review.opendev.org/702132 > > > >> [3] https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS > > > >> [4] https://github.com/unbit/uwsgi/issues/1550 > > > >> > > > >> -yoctozepto > > > >> > > > > > > > > From tinlam at gmail.com Tue Jan 21 17:37:29 2020 From: tinlam at gmail.com (Tin Lam) Date: Tue, 21 Jan 2020 11:37:29 -0600 Subject: [openstack-helm] Core Reviewer Nominations In-Reply-To: References: Message-ID: +1. Thanks for your reviews and contributions to the OSH projects, Gage and Steven. Tin On Mon, Jan 20, 2020 at 7:14 PM Pete Birley < > petebirley+openstack-dev at gmail.com> wrote: > >> OpenStack-Helm team, >> >> >> >> Based on their record of quality code review and substantial/meaningful >> code contributions to the openstack-helm project, at last weeks meeting we >> proposed the following individuals as core reviewers for openstack-helm: >> >> >> >> - Gage Hugo >> - Steven Fitzpatrick >> >> >> >> All OpenStack-Helm Core Reviewers are invited to reply with a +1/-1 by >> EOD next Monday (27/1/2020). A lone +1/-1 will apply to both candidates, >> otherwise please spell out votes individually for the candidates. >> >> >> >> Cheers, >> >> >> Pete >> > -- Regards, Tin Lam -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Tue Jan 21 18:42:03 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 21 Jan 2020 19:42:03 +0100 Subject: [vmware][hyperv][kolla] Is there any interest? (or should we deprecate and remove?) Message-ID: Hello Fellow Stackers, In Kolla and Kolla-Ansible we have some support for VMware (both hypervisor and networking controller stuff) and Hyper-V. The issue is the relevant code is in pretty bad shape and we had no recent reports about these being used nor working at all for that matter and we are looking into dropping support for these. Please respond if you are interested in these. Long term we would require access to some CI running these to really keep things in shape. -yoctozepto From radoslaw.piliszek at gmail.com Tue Jan 21 19:43:06 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Tue, 21 Jan 2020 20:43:06 +0100 Subject: [api][sdk][dev][oslo] using uWSGI breaks CORS config In-Reply-To: References: <7e1f612534bb572af4c645d3d58510bd7a1e4f4d.camel@redhat.com> Message-ID: I got response on https://github.com/unbit/uwsgi/issues/1550 Any oslo.middleware and/or quick testing platform setup experts are welcome to join the thread there. :-) -yoctozepto wt., 21 sty 2020 o 18:31 Radosław Piliszek napisał(a): > > Ben wrote: > > it looks like this is a uwsgi bug and not > > something we're doing wrong (other than recommending uwsgi ;-). > > Precisely. Well, there is always a chance either side does something wrong > until one acknowledges doing it wrong but if django and oslo.middleware > can send CORS headers via mod_wsgi and both cannot via uWSGI, then > it's uWSGI that's more suspicious. ;-) > > Sean wrote: > > is it somehting that we expect the wsgi server to add or the applciation? > > it seam like we use oslo midelware to do this dynmaically form the api code. > > Wearing my web developer hat, I would say it's up to the application to do, > in general, since different resources may need different CORS headers. > In the contrived example of testing in CI, allowing anything for the > sake of easy > access, it does not really matter. 
:-) > > Sean wrote: > > but my only interaction in the past was setting it via the apache .htaccess file. > > Yeah, any decent web server allows to add/remove/override headers. > > Sean wrote: > > they may be filtering them but i cant really find any documentaiton on uswgi and CORS configuration > > Indeed that's what the linked bug report suggests as well as random > googlable info. > I wonder why nobody from uWSGI answered the bug call. :/ > > -yoctozepto > > wt., 21 sty 2020 o 18:15 Sean Mooney napisał(a): > > > > On Tue, 2020-01-21 at 17:30 +0100, Ben Nemec wrote: > > > > > > On 1/21/20 4:55 PM, Michael McCune wrote: > > > > > > > > > > > > On Tue, Jan 21, 2020 at 5:37 AM Radosław Piliszek > > > > > wrote: > > > > > > > > Hi, Michael! > > > > > > > > Thanks for your interest. I was also surprised that it does not work. > > > > It looks simple enough to not break... > > > > > > > > Since I wrote the email, I did more research on how to tackle this > > > > problem. > > > > For now I posted a bug to devstack that CORS is out of reach with pure > > > > devstack [1]. > > > > > > > > > > > > that sounds like a good first step. > > > > > > > > Only keystone and placement can be configured to use mod_wsgi, others > > > > default to uwsgi (or eventlet, to be precise, but these are > > > > irrelevant). > > > > It's really either hacking browsers or hacking apache config to > > > > include relevant CORS headers. > > > > > > > > > > > > this is something that i think we need to know though, because if uWSGI > > > > has broken CORS support then we need to fix it or start advising against > > > > its usage. > > > > > > Just to clarify, based on [0] it looks like this is a uwsgi bug and not > > > something we're doing wrong (other than recommending uwsgi ;-). Is that > > > correct? > > > > > > 0: https://github.com/unbit/uwsgi/issues/1550 > > > > is it somehting that we expect the wsgi server to add or the applciation? > > e.g. should the CORS headders be added by the apache/nginx/uwsgi or openstack. > > > > but my only interaction in the past was setting it via the apache .htaccess file. > > https://enable-cors.org/server_apache.html you can also do it in nginx https://enable-cors.org/server_nginx.html > > > > it seam like we use oslo midelware to do this dynmaically form the api code. > > > > https://docs.openstack.org/oslo.middleware/latest/admin/cross-project-cors.html > > so perhaps that is not executing or uswsgi si stripping the headers. > > > > they may be filtering them but i cant really find any documentaiton on uswgi and CORS configuration > > > > > > > > > > > > > peace o/ > > > > > > > > [1] https://bugs.launchpad.net/devstack/+bug/1860287 > > > > > > > > -yoctozepto > > > > > > > > wt., 21 sty 2020 o 10:00 Michael McCune > > > > napisał(a): > > > > > > > > > > hi Radoslaw, > > > > > > > > > > i am also curious about this because i had thought we had CORS > > > > issued solved for uWSGI in the past, i will need to look around to > > > > find the conversations i was having. > > > > > > > > > > thanks for sharing your investigation, i think this is interesting. > > > > > > > > > > peace o/ > > > > > > > > > > On Fri, Jan 17, 2020 at 1:45 PM Radosław Piliszek > > > > > > > > > wrote: > > > > >> > > > > >> Fellow Devs, > > > > >> > > > > >> as you might have noticed I started taking care of > > > > openstack/js-openstack-lib, > > > > >> now under the openstacksdk umbrella [1]. 
> > > > >> First goal is to modernize the CI to use Zuul v3, current > > > > devstack and > > > > >> nodejs, still WIP [2]. > > > > >> > > > > >> As part of the original suite of tests, the unit and functional > > > > tests > > > > >> are run from browsers as well as from node. > > > > >> And, as you may know, browsers care about CORS [3]. > > > > >> js-openstack-lib is connecting to various OpenStack APIs (currently > > > > >> limited to keystone, glance, neutron and nova) to act on behalf > > > > of the > > > > >> user (just like openstacksdk/client does). > > > > >> oslo.middleware, as used by those APIs, provides a way to configure > > > > >> CORS by setting params in the [cors] group but uWSGI seemingly > > > > ignores > > > > >> that completely [4]. > > > > >> I had to switch to mod_wsgi+apache instead of uwsgi+apache to > > > > get past > > > > >> that issue. > > > > >> I could not reproduce locally because kolla (thankfully) uses mostly > > > > >> mod_wsgi atm. > > > > >> > > > > >> The issue I see is that uWSGI is proposed as the future and mod_wsgi > > > > >> is termed deprecated. > > > > >> However, this means the future is broken w.r.t. CORS and so any > > > > modern > > > > >> web interface with it if not sitting on the exact same host and port > > > > >> (which is usually different between OpenStack APIs and any UI). > > > > >> > > > > >> [1] https://review.opendev.org/701854 > > > > >> [2] https://review.opendev.org/702132 > > > > >> [3] https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS > > > > >> [4] https://github.com/unbit/uwsgi/issues/1550 > > > > >> > > > > >> -yoctozepto > > > > >> > > > > > > > > > > > > From ionut at fleio.com Tue Jan 21 22:01:38 2020 From: ionut at fleio.com (Ionut Biru) Date: Wed, 22 Jan 2020 00:01:38 +0200 Subject: [magnum] podman fedora-coreos authorization failed: SSL exception connecting on keystone Message-ID: Hello guys, I'm trying to deploy a kubernetes cluster using magnum 9.2 with fedora-coreos-31.20200113.3.1-openstack. Master vm is deployed correctly but the cluster is never deployed since podman returns the following error: Jan 21 21:55:14 k8s-cluster002-mn5qgp6qlmw6-master-0 podman[2433]: Authorization failed: SSL exception connecting to https://api.mydomain.cloud:5000/v3/auth/tokens: HTTPSConnectionPool(host='api.mydomain.cloud', port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by SSLError(SSLError(185090184, u'[X509] no certificate or crl found (_ssl.c:3063)'),)) I do have a valid letsencrypt certification on that particular domain. curl https://api.mydomain.cloud:5000/v3/auth/tokens {"error": {"message": "The request you have made requires authentication.", "code": 401, "title": "Unauthorized"}} I was wondering, do you guys seen this issue before? Below is the template. https://paste.xinu.at/OC0Ic/ -- Ionut Biru - https://fleio.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Tue Jan 21 22:24:00 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Tue, 21 Jan 2020 17:24:00 -0500 Subject: [cinder] virtual mid-cycle summary available Message-ID: Hello Cinderinos, Thanks for a productive discussion today. 
I've posted a summary on the wiki so we can remember what happened and for the edification of those who didn't attend: https://wiki.openstack.org/wiki/CinderUssuriMidCycleSummary Also, I've posted a mid-cycle survey to get some feedback about the virtual connectivity and the format of today's meeting: https://forms.gle/muR2vMArZ7Eu1iV46 Please take the survey even if you didn't attend. (What I'm hoping to learn from the non-attendees is whether I could've communicated the event better, or whether the content was un-interesting, or what.) We'll be holding Session Two of the mid-cycle in a few weeks, so your feedback will be helpful in planning. cheers, brian PS: If the survey linked above is inaccessible in your country, please let me know and I'll figure something out so I can collect your feedback. The alternative to google forms I used last time was pretty limited in its free version, plus the only people who used it were located in countries where I know google forms are accessible (Europe & USA), so I succumbed to inertia and stuck with google forms. From feilong at catalyst.net.nz Wed Jan 22 00:56:17 2020 From: feilong at catalyst.net.nz (Feilong Wang) Date: Wed, 22 Jan 2020 13:56:17 +1300 Subject: [magnum] podman fedora-coreos authorization failed: SSL exception connecting on keystone In-Reply-To: References: Message-ID: <1dd3e495-c749-947f-4417-2008f7006cd1@catalyst.net.nz> Hi Ionut, Would you mind sharing your magnum.conf? I think you may need the *cafile* config option for both *keystone_authtoken* and *keystone_auth.* On 22/01/20 11:01 AM, Ionut Biru wrote: > Hello guys, > > I'm trying to deploy a kubernetes cluster using magnum 9.2 > with fedora-coreos-31.20200113.3.1-openstack. > > Master vm is deployed correctly but the cluster is never deployed > since podman returns the following error: > > > Jan 21 21:55:14 k8s-cluster002-mn5qgp6qlmw6-master-0 podman[2433]: > Authorization failed: SSL exception connecting to > https://api.mydomain.cloud:5000/v3/auth/tokens: HTTPSConnectionPool(host='api.mydomain.cloud', > port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by > SSLError(SSLError(185090184, u'[X509] no certificate or crl found > (_ssl.c:3063)'),)) > > I do have a valid letsencrypt certification on that particular domain. > >  curl https://api.mydomain.cloud:5000/v3/auth/tokens >  {"error": {"message": "The request you have made requires > authentication.", "code": 401, "title": "Unauthorized"}} > > I was wondering, do you guys seen this issue before? Below is the > template. > > https://paste.xinu.at/OC0Ic/ > -- > Ionut Biru - https://fleio.com -- Cheers & Best regards, Feilong Wang (王飞龙) Head of R&D Catalyst Cloud - Cloud Native New Zealand -------------------------------------------------------------------------- Tel: +64-48032246 Email: flwang at catalyst.net.nz Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed Jan 22 02:15:24 2020 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 21 Jan 2020 20:15:24 -0600 Subject: [nova] I would like to add another option for cross_az_attach In-Reply-To: References: Message-ID: <64d7db9d-255d-508e-3043-b30b0d87a56e@gmail.com> On 1/20/20 12:47 AM, Kim KS wrote: > In nova with setting [cinder]/ cross_az_attach option to false, nova creates instance and volume in same AZ. 
> > but some of usecase (in my case), we need to attach new volume in different AZ to the instance. > > so I need two options. > > one is for nova block device mapping and attaching volume > and another is for attaching volume in specified AZ. > > [cinder] > cross_az_attach = False > enable_az_attach_list = AZ1,AZ2 > > how do you all think of it? As Brin mentioned there are already config hacks in Cinder to workaround cross_az_attach issues between nova and cinder. cross_az_attach in nova is config-driven API behavior which is something to avoid if possible since it's not discoverable by the end user, so piling on more config complexity is something I'd try to avoid if possible. That option wasn't even very usable until recently [1]. Can you explain your use case a bit more? It sounds like you're trying to provide essentially default zones for nova/cinder if the server and volume are not created in a specific zone? Sort of similar to how the default_schedule_zone option in nova works if a server is created without a specific zone. [1] https://github.com/openstack/nova/commit/07a24dcef7ce6767b4b5bab0c8d3166cbe5b39c0 -- Thanks, Matt Riedemann From tony.pearce at cinglevue.com Wed Jan 22 05:10:37 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Wed, 22 Jan 2020 13:10:37 +0800 Subject: Updating Openstack from Pike non-containerised to something current In-Reply-To: References: Message-ID: Thank you for the extra info, Going through those links today and the docs there. I found CentOS have removed "pike" so I cannot apply any update with the same release of Pike at the moment. Can I follow this and go now to start upgrading to Queens? https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/upgrade/major_upgrade.html That guide does not mention anything like "you must first update to the latest in the current release before performing a major upgrade". thanks in advance *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. On Tue, 21 Jan 2020 at 22:13, Bogdan Dobrelya wrote: > On 21.01.2020 10:16, Tony Pearce wrote: > > I am planning to upgrade my Openstack Pike. Please could I ask some > > questions about this so I understand better how to proceed? > > *Background*: > > - I have Openstack Pike running on 3 nodes (2 x compute and 1 x > > controller). Storage is provided by a Cinder driver and separate vendor > > storage array (iSCSI). > > > > - I have a bug which is impacting Cinder in Pike which is not present in > > Queens/Stein. > > > > - Since I installed Openstack some years ago, I was not able to > > update/patch it for a number of reasons. > > > > - The Openstack which I installed was from the tripleo.org > > website and I followed the instructions there. > > > > - I have configured LBaaS that was available in Pike by making config > > changes to .conf files. 
> > > > - I recall reading a few months ago when I started looking into this > > that I need to update to the latest within the current version (Pike) > > first, before updating beyond Pike so I am planning for that. > > > > > > *Questions*: > > 1. do I first need to deconfigure / remove the LBaaS before attempting > > an upgrade beyond Pike? > > 2. The tripleo website is a little confusing. I am running Pike and it > > is not containerised. The upgrade info on tripleo.org > > mentions upgrades to Pike in a containerised > > You should use official TripleO documentation [0] > > [0] > > https://docs.openstack.org/tripleo-docs/latest/upgrade/developer/upgrades/major_upgrade.html > > > environment. So should I follow the detail for the "Ocata and earlier" > > release in this case? ref: > > > http://tripleo.org/upgrade/minor_update.html#updating-your-overcloud-ocata-and-earlier > > 3. Do I need to upgrade to each release? ie from Pike > Queens then > > Queens > Stein etc ? Or can I do this: > > a) update Pike to the latest in the current Pike release > > b) Update Pike to Train > > You have to do it one by one, interleaving the two steps: upgrading > undercloud [1] firstly, then the overcloud. There is FFU (fast-forward > upgrade) [2] but AFAIK it is only expected to work when going from > Newton to Queens. > Upgrading by N+3 from Pike to Stein would involve upgrading OS major > version on hosts, so that guide wouldn't fit it. > > FFU from Queens to Train is under developed. > > [1] > > https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/upgrade/undercloud.html > [2] > > https://docs.openstack.org/tripleo-docs/latest/upgrade/developer/upgrades/fast_fw_upgrade.html > > > > > > > *What I've gathered so far:* > > - If I can execute the 'openstack overcloud update' and limit it to the > > controller node then I can apply this to a cloned environment consisting > > of 1 x undercloud "director" and 1 x controller but I am unable to stage > > this for the compute nodes because they are physical bare-metal > > > > - the tripleo website guide lists the latest as 'Rocky' but I could see > > Stein in the yum search. I think I can follow the tripleo guide with a > > pinch of salt and replace *rocky* with *stein* > > > > > > > > Thank you > > > > > > *Tony Pearce*| *Senior Network Engineer / Infrastructure Lead > > **Cinglevue International * > > > > Email: tony.pearce at cinglevue.com > > Web: http://www.cinglevue.com ** > > > > *Australia* > > 1 Walsh Loop, Joondalup, WA 6027 Australia. > > > > Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 > > > > Note: This email and all attachments are the sole property of Cinglevue > > International Pty Ltd. (or any of its subsidiary entities), and the > > information contained herein must be considered confidential, unless > > specified otherwise. If you are not the intended recipient, you must > > not use or forward the information contained in these documents. If > > you have received this message in error, please delete the email and > > notify the sender. > > > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kiseok7 at gmail.com Wed Jan 22 06:09:35 2020 From: kiseok7 at gmail.com (Kim KS) Date: Wed, 22 Jan 2020 15:09:35 +0900 Subject: [nova] I would like to add another option for cross_az_attach In-Reply-To: References: Message-ID: Hello, Brin and Matt. and Thank you. 
I'll tell you more about my use case: * First, I create an instance(I'll call it NODE01) and a volume in same AZ. (so I use 'cross_az_attach = False' option) * and I create a cinder volume(I'll call it PV01) in different Volume Zone(I'll call it KubePVZone) * and then I would like to attach PV01 volume to NODE01 instance. KubePVZone is volume zone for kubernetes's persistent volume and NODE01 is a kubernetes' node. KubePVZone's volumes can be attached to the other kubernetes's nodes. So I would like to use options like: [cinder] cross_az_attach = False enable_az_attach_list = KubePVZone Let me know if there is a lack of explanation. I currently use the code by adding in to check_availability_zone method: https://github.com/openstack/nova/blob/058e77e26c1b52ab7d3a79a2b2991ca772318105/nova/volume/cinder.py#L534 + if volume['availability_zone'] in CONF.cinder.enable_az_attach_list: + LOG.info("allowed AZ for attaching in different availability zone: %s", + volume['availability_zone']) + return Best, Kiseok Kim > 2020. 1. 21. 오전 11:35, Brin Zhang(张百林) 작성: > > Hi, Kim KS: > "cross_az_attach"'s default value is True, that means a llow attach between instance and volume in different availability zones. > If False, volumes attached to an instance must be in the same availability zone in Cinder as the instance availability zone in Nova. Another thing is, you should care booting an BFV instance from "image", and this should interact the " allow_availability_zone_fallback" in Cinder, if " allow_availability_zone_fallback=False" and *that* request AZ does not in Cinder, the request will be fail. > > > About specify AZ to unshelve a shelved_offloaded server, about the cross_az_attach something you can know > https://github.com/openstack/nova/blob/master/releasenotes/notes/bp-specifying-az-to-unshelve-server-aa355fef1eab2c02.yaml > > Availability Zones docs, that contains some description with cinder.cross_az_attach > https://docs.openstack.org/nova/latest/admin/availability-zones.html#implications-for-moving-servers > > cross_az_attach configuration: https://docs.openstack.org/nova/train/configuration/config.html#cinder.cross_az_attach > > And cross_az_attach with the server is in > https://github.com/openstack/nova/blob/master/nova/volume/cinder.py#L523-L545 > > I am not sure why you are need " enable_az_attach_list = AZ1,AZ2" configuration? > > brinzhang > > >> cross_az_attach >> >> Hello all, >> >> In nova with setting [cinder]/ cross_az_attach option to false, nova creates >> instance and volume in same AZ. >> >> but some of usecase (in my case), we need to attach new volume in different >> AZ to the instance. >> >> so I need two options. >> >> one is for nova block device mapping and attaching volume and another is for >> attaching volume in specified AZ. >> >> [cinder] >> cross_az_attach = False >> enable_az_attach_list = AZ1,AZ2 >> >> how do you all think of it? >> >> Best, >> Kiseok >> > From ionut at fleio.com Wed Jan 22 07:53:31 2020 From: ionut at fleio.com (Ionut Biru) Date: Wed, 22 Jan 2020 09:53:31 +0200 Subject: [magnum] podman fedora-coreos authorization failed: SSL exception connecting on keystone In-Reply-To: <1dd3e495-c749-947f-4417-2008f7006cd1@catalyst.net.nz> References: <1dd3e495-c749-947f-4417-2008f7006cd1@catalyst.net.nz> Message-ID: Hello, I don't have cafile configured in keystone_authtoken and keystone_auth. I did copied letsencrypt cafile and configured it but now magnum cannot communicate with keystone even at simple as coe cluster list. 
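For context, the change Feilong suggested amounts to pointing both sections at a CA bundle that can verify the certificate served on https://api.mydomain.cloud:5000. A minimal sketch of what that looks like in magnum.conf, assuming the Let's Encrypt chain has been copied to an illustrative path (not necessarily the path used here):

[keystone_authtoken]
# illustrative path; the file must contain the full issuing chain
cafile = /etc/magnum/letsencrypt-chain.pem

[keystone_auth]
cafile = /etc/magnum/letsencrypt-chain.pem

A "certificate verify failed" error like the one below typically means the file given as cafile does not contain the issuing chain for that endpoint, or is empty.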
CRITICAL keystonemiddleware.auth_token [-] Unable to validate token: Could not find versioned identity endpoints when attempting to authenticate. (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify ies exceeded with url: / (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),) On Wed, Jan 22, 2020 at 3:02 AM Feilong Wang wrote: > Hi Ionut, > > Would you mind sharing your magnum.conf? I think you may need the *cafile* > config option for both *keystone_authtoken* and *keystone_auth.* > > > On 22/01/20 11:01 AM, Ionut Biru wrote: > > Hello guys, > > I'm trying to deploy a kubernetes cluster using magnum 9.2 > with fedora-coreos-31.20200113.3.1-openstack. > > Master vm is deployed correctly but the cluster is never deployed since > podman returns the following error: > > > Jan 21 21:55:14 k8s-cluster002-mn5qgp6qlmw6-master-0 podman[2433]: > Authorization failed: SSL exception connecting to > https://api.mydomain.cloud:5000/v3/auth/tokens: HTTPSConnectionPool(host='api.mydomain.cloud', > port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by > SSLError(SSLError(185090184, u'[X509] no certificate or crl found > (_ssl.c:3063)'),)) > > I do have a valid letsencrypt certification on that particular domain. > > curl https://api.mydomain.cloud:5000/v3/auth/tokens > {"error": {"message": "The request you have made requires > authentication.", "code": 401, "title": "Unauthorized"}} > > I was wondering, do you guys seen this issue before? Below is the template. > > https://paste.xinu.at/OC0Ic/ > -- > Ionut Biru - https://fleio.com > > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > Head of R&D > Catalyst Cloud - Cloud Native New Zealand > -------------------------------------------------------------------------- > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > > -- Ionut Biru - https://fleio.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From info at dantalion.nl Wed Jan 22 08:07:37 2020 From: info at dantalion.nl (info at dantalion.nl) Date: Wed, 22 Jan 2020 09:07:37 +0100 Subject: [Watcher] confused about meeting schedule Message-ID: <522d1ea8-c333-1027-1c6e-b21e8838e07e@dantalion.nl> Hello everyone, The documentation for Watcher states that meetings will be held on a bi-weekly basis on odd-weeks, however, last meeting was held on the 8th of January which is not an odd-week. Today I was expecting a meeting as the meetings are held bi-weekly and the last one was held on the 8th of January, however, there was none. Can someone clarify when the next meeting will be held and the subsequent one after that? If these are on even weeks we should also update Watcher's documentation. Kind regards, Corne lukken From tony.pearce at cinglevue.com Wed Jan 22 08:16:37 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Wed, 22 Jan 2020 16:16:37 +0800 Subject: Updating Openstack from Pike non-containerised to something current In-Reply-To: References: Message-ID: I decided to plan to upgrade to Queens using this guide to first upgrade the undercloud to Queens: https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/upgrade/major_upgrade.html The last step was "openstack undercloud upgrade". 
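For reference, the undercloud portion of that guide reduces to roughly the following on the undercloud node; this is only a sketch, and the repo-switch step for the target release (which the guide covers) is deliberately left out:

# point the RDO/TripleO repos at the target release first (see the guide)
sudo yum -y update python-tripleoclient
openstack undercloud upgrade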
It seemed to go successfully as I had success messages shown in the console. But when it was running validation, my openstack overcloud controller dropped offline. Looking at the console of the controller, I saw logs about network devices leaving promiscuous mode, and was unable to get a login prompt to display. The public subnet was not up, either. A reboot of the controller didnt resolve the issue (thinking a service had an issue). So I restored the system to an earlier state. On bootup, it took a long time to come up and load networking. Now I am seeing a ton of console output on the controller related to "net_ratelimit" and "IPv4 martian source" that I have not seen before along with "33 callbacks suppressed". I am concerned this is dropping traffic between instances. My assumption is that the undercloud upgrade has somehow touched the overcloud controller, but I can't understand how this could have occurred just yet. I understood that the undercloud update wouldnt affect any overcloud. Could anyone confirm if I am mistaken here? *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. On Tue, 21 Jan 2020 at 22:13, Bogdan Dobrelya wrote: > On 21.01.2020 10:16, Tony Pearce wrote: > > I am planning to upgrade my Openstack Pike. Please could I ask some > > questions about this so I understand better how to proceed? > > *Background*: > > - I have Openstack Pike running on 3 nodes (2 x compute and 1 x > > controller). Storage is provided by a Cinder driver and separate vendor > > storage array (iSCSI). > > > > - I have a bug which is impacting Cinder in Pike which is not present in > > Queens/Stein. > > > > - Since I installed Openstack some years ago, I was not able to > > update/patch it for a number of reasons. > > > > - The Openstack which I installed was from the tripleo.org > > website and I followed the instructions there. > > > > - I have configured LBaaS that was available in Pike by making config > > changes to .conf files. > > > > - I recall reading a few months ago when I started looking into this > > that I need to update to the latest within the current version (Pike) > > first, before updating beyond Pike so I am planning for that. > > > > > > *Questions*: > > 1. do I first need to deconfigure / remove the LBaaS before attempting > > an upgrade beyond Pike? > > 2. The tripleo website is a little confusing. I am running Pike and it > > is not containerised. The upgrade info on tripleo.org > > mentions upgrades to Pike in a containerised > > You should use official TripleO documentation [0] > > [0] > > https://docs.openstack.org/tripleo-docs/latest/upgrade/developer/upgrades/major_upgrade.html > > > environment. So should I follow the detail for the "Ocata and earlier" > > release in this case? ref: > > > http://tripleo.org/upgrade/minor_update.html#updating-your-overcloud-ocata-and-earlier > > 3. Do I need to upgrade to each release? 
ie from Pike > Queens then > > Queens > Stein etc ? Or can I do this: > > a) update Pike to the latest in the current Pike release > > b) Update Pike to Train > > You have to do it one by one, interleaving the two steps: upgrading > undercloud [1] firstly, then the overcloud. There is FFU (fast-forward > upgrade) [2] but AFAIK it is only expected to work when going from > Newton to Queens. > Upgrading by N+3 from Pike to Stein would involve upgrading OS major > version on hosts, so that guide wouldn't fit it. > > FFU from Queens to Train is under developed. > > [1] > > https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/upgrade/undercloud.html > [2] > > https://docs.openstack.org/tripleo-docs/latest/upgrade/developer/upgrades/fast_fw_upgrade.html > > > > > > > *What I've gathered so far:* > > - If I can execute the 'openstack overcloud update' and limit it to the > > controller node then I can apply this to a cloned environment consisting > > of 1 x undercloud "director" and 1 x controller but I am unable to stage > > this for the compute nodes because they are physical bare-metal > > > > - the tripleo website guide lists the latest as 'Rocky' but I could see > > Stein in the yum search. I think I can follow the tripleo guide with a > > pinch of salt and replace *rocky* with *stein* > > > > > > > > Thank you > > > > > > *Tony Pearce*| *Senior Network Engineer / Infrastructure Lead > > **Cinglevue International * > > > > Email: tony.pearce at cinglevue.com > > Web: http://www.cinglevue.com ** > > > > *Australia* > > 1 Walsh Loop, Joondalup, WA 6027 Australia. > > > > Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 > > > > Note: This email and all attachments are the sole property of Cinglevue > > International Pty Ltd. (or any of its subsidiary entities), and the > > information contained herein must be considered confidential, unless > > specified otherwise. If you are not the intended recipient, you must > > not use or forward the information contained in these documents. If > > you have received this message in error, please delete the email and > > notify the sender. > > > > > -- > Best regards, > Bogdan Dobrelya, > Irc #bogdando > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From madhuri.kumari at intel.com Wed Jan 22 08:37:19 2020 From: madhuri.kumari at intel.com (Kumari, Madhuri) Date: Wed, 22 Jan 2020 08:37:19 +0000 Subject: [ironic][nova][neutron][cloud-init] Infiniband Support in OpenStack In-Reply-To: References: <0512CBBECA36994BAA14C7FEDE986CA61A5528B0@BGSMSX102.gar.corp.intel.com> Message-ID: <0512CBBECA36994BAA14C7FEDE986CA61A5554DF@BGSMSX102.gar.corp.intel.com> Hi Mark, Thanks for your response. I read the blog and my requirement is shared IB network as of now. We have developed a Neutron ML2 driver that manages the IB subnet[1]. Regardless of that, I think my current issue is because the cloud-init is not able to recognize the port as IB port. How did it work for you? You can see my ironic port details. [1] https://opendev.org/x/networking-omnipath/ >>-----Original Message----- >>From: Mark Goddard >>Sent: Monday, January 20, 2020 2:10 PM >>To: Kumari, Madhuri >>Cc: openstack-discuss at lists.openstack.org >>Subject: Re: [ironic][nova][neutron][cloud-init] Infiniband Support in >>OpenStack >> >>On Fri, 17 Jan 2020 at 15:34, Kumari, Madhuri > >>wrote: >>> >>> Hi, >>> >>> >>> >>> I am trying to deploy a node with infiniband in Ironic without any success. 
>>> >>> >>> >>> The node has two interfaces, eth0 and ib0. The deployment is successful, >>node becomes active but is not reachable. I debugged and checked that the >>issue is with cloud-init. The cloud-init fails to configure the network interfaces >>on the node complaining that the MAC address of infiniband port(ib0) is not >>known to the node. Ironic provides a fake MAC address for infiniband ports >>and cloud-init is supposed to generate the actual MAC address of infiband >>ports[1]. But it fails[2] before reaching there. >>> >>> I have posted the issue in cloud-init[3] as well. >>> >>> >>> >>> Can someone please help me with this issue? How do we specify >>“TYPE=InfiniBand” from OpenStack? Currently the type sent is “phy” only. >>> >>> >>> >>> [1] >>> https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/ >>> helpers/openstack.py#L686 >>> >>> [2] >>> https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/ >>> helpers/openstack.py#L677 >>> >>> [3] https://bugs.launchpad.net/cloud-init/+bug/1857031 >>> >>Hi Madhuri, >> >>Please see my blog post: >>https://www.stackhpc.com/bare-metal-infiniband.html. One major question >>to ask is whether you want shared IB network or multi-tenant isolation. The >>latter is significantly more challenging. It's probably best if you read that >>article and raise any further questions here or IRC. I'll be out of the office >>until Wednesday. >> >>Mark >>> >>> >>> >>> >>> Regards, >>> >>> Madhuri >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From liang.a.fang at intel.com Wed Jan 22 01:34:29 2020 From: liang.a.fang at intel.com (Fang, Liang A) Date: Wed, 22 Jan 2020 01:34:29 +0000 Subject: volume local cache Message-ID: Hi Gibi, Dan, Alex and dear Nova cores Nova Spec: https://review.opendev.org/#/c/689070/ Cinder Spec: https://review.opendev.org/#/c/684556/ Regarding the volume local cache spec, after several rounds of discussion in cinder weekly meeting, we continued discussed in cinder virtual mid-cycle ptg yesterday, and now cinder team is close to get agreement, and major ideas are: * Cinder sets “cacheable” property in volume type * Cinder should guarantee “cacheable” is set correctly, that means: should not set with “multiattach”, should not set for a backend that cannot be cached (NFS, RBD) * Os-brick prevents to attach a volume with cache mode that not safe in cloud environment * e.g. prevent write-back cache mode. So that all the operations, like live migration/snapshot/consistent group/volume backup can work as usual. * Nova schedules the VM with “cacheable” volume to servers with cache capability. * If no such available server, just go ahead without cache, but don’t fail it. * Schedule based on flavor (need more design here with Nova expert) Could you please comment on cinder spec if any before cinder team getting this hammered out? Thank you so much. Below is the performance test I did personally, using Optane SSD p4800x 750G as the cache. Setting fio with block size = 4k, iodepth=1 which is typical for latency measurement. The data may be slightly different on different environment, just FYI: In rand read test: [cid:image004.png at 01D5D107.23D05610] In rand write test: [cid:image006.png at 01D5D107.23D05610] In 70% rand read + 30% rand write mix test: [cid:image009.png at 01D5D107.23D05610] Ceph in above chart means the baseline test, as shown in path ① below. Write-Through and Write-Back means open-cas cached, as shown in path②. 
[cid:image010.png at 01D5D0FD.83839680] So even write-through mode, we still have lots of performance gains. Some concerns maybe: * In real world, the cache hit rate may cannot be so high (95% and more). * It depends on how big the cache device is. But normally a fast SSD with x TB size or a persistent memory with 512G size is big enough for hot data cache. * What will happen when backend storage is RDMA * Pure RDMA network link latency would be as low as ~10us. If plus the disk io to the storage system, the final latency would be dozens of microseconds. So it’s not suitable for an fast ssd(itself ~10us) to cache for RDMA volume, but we can use persistent memory (with latency about hundreds of nanosecond) to do the cache. I will do the measurement after Chinese new year. * It’s a pain that ceph cannot be supported * The major concern to mount ceph volume to host OS is the security considerations, right? It is a performance / security tradeoff for operators. In private cloud which OpenStack maybe mostly used, trust host OS would may be not a big problem, anyway, other backends rather than ceph are still doing in this way. I’m not going to change ceph to mount to host OS in this spec, but it’s not difficult for customer to switch to this way, like they customized other things. Regards LiangFang -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image010.png Type: image/png Size: 58106 bytes Desc: image010.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.png Type: image/png Size: 121269 bytes Desc: image004.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image006.png Type: image/png Size: 109759 bytes Desc: image006.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image009.png Type: image/png Size: 87269 bytes Desc: image009.png URL: From lyarwood at redhat.com Wed Jan 22 09:41:22 2020 From: lyarwood at redhat.com (Lee Yarwood) Date: Wed, 22 Jan 2020 09:41:22 +0000 Subject: [nova] block_device_info being introduced to nova.virt.driver.ComputeDriver.rescue Message-ID: <20200122094122.m55vax3ruvutta46@lyarwood.usersys.redhat.com> Hello all, Just a heads up that as part of my work introducing stable device rescue to Nova the following change will be introducing an additional block_device_info argument to the signature of nova.virt.driver.ComputeDriver.rescue. virt: Provide block_device_info during rescue https://review.opendev.org/#/c/700811/ Any third party drivers implementing instance rescue will need to be updated. Cheers, -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From mark at stackhpc.com Wed Jan 22 11:00:14 2020 From: mark at stackhpc.com (Mark Goddard) Date: Wed, 22 Jan 2020 11:00:14 +0000 Subject: [ironic][nova][neutron][cloud-init] Infiniband Support in OpenStack In-Reply-To: <0512CBBECA36994BAA14C7FEDE986CA61A5554DF@BGSMSX102.gar.corp.intel.com> References: <0512CBBECA36994BAA14C7FEDE986CA61A5528B0@BGSMSX102.gar.corp.intel.com> <0512CBBECA36994BAA14C7FEDE986CA61A5554DF@BGSMSX102.gar.corp.intel.com> Message-ID: On Wed, 22 Jan 2020 at 08:37, Kumari, Madhuri wrote: > > Hi Mark, > > > > Thanks for your response. 
I read the blog and my requirement is shared IB network as of now. We have developed a Neutron ML2 driver that manages the IB subnet[1]. > > Regardless of that, I think my current issue is because the cloud-init is not able to recognize the port as IB port. How did it work for you? You can see my ironic port details. > > > > [1] https://opendev.org/x/networking-omnipath/ I see, you didn't mention that you are using Omnipath which is based on IB but has differences. Here's the relevant code in cloudinit: https://github.com/canonical/cloud-init/blob/9bfb2ba7268e2c3c932023fc3d3020cdc6d6cc18/cloudinit/net/__init__.py#L785. In order for an interface to be picked up, it needs to have a type of 32 and the ethernet format MAC address of the port must match bytes 13-15 and 18-20 of the IB hardware address. > > > > > > >>-----Original Message----- > > >>From: Mark Goddard > > >>Sent: Monday, January 20, 2020 2:10 PM > > >>To: Kumari, Madhuri > > >>Cc: openstack-discuss at lists.openstack.org > > >>Subject: Re: [ironic][nova][neutron][cloud-init] Infiniband Support in > > >>OpenStack > > >> > > >>On Fri, 17 Jan 2020 at 15:34, Kumari, Madhuri > > >>wrote: > > >>> > > >>> Hi, > > >>> > > >>> > > >>> > > >>> I am trying to deploy a node with infiniband in Ironic without any success. > > >>> > > >>> > > >>> > > >>> The node has two interfaces, eth0 and ib0. The deployment is successful, > > >>node becomes active but is not reachable. I debugged and checked that the > > >>issue is with cloud-init. The cloud-init fails to configure the network interfaces > > >>on the node complaining that the MAC address of infiniband port(ib0) is not > > >>known to the node. Ironic provides a fake MAC address for infiniband ports > > >>and cloud-init is supposed to generate the actual MAC address of infiband > > >>ports[1]. But it fails[2] before reaching there. > > >>> > > >>> I have posted the issue in cloud-init[3] as well. > > >>> > > >>> > > >>> > > >>> Can someone please help me with this issue? How do we specify > > >>“TYPE=InfiniBand” from OpenStack? Currently the type sent is “phy” only. > > >>> > > >>> > > >>> > > >>> [1] > > >>> https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/ > > >>> helpers/openstack.py#L686 > > >>> > > >>> [2] > > >>> https://github.com/canonical/cloud-init/blob/master/cloudinit/sources/ > > >>> helpers/openstack.py#L677 > > >>> > > >>> [3] https://bugs.launchpad.net/cloud-init/+bug/1857031 > > >>> > > >>Hi Madhuri, > > >> > > >>Please see my blog post: > > >>https://www.stackhpc.com/bare-metal-infiniband.html. One major question > > >>to ask is whether you want shared IB network or multi-tenant isolation. The > > >>latter is significantly more challenging. It's probably best if you read that > > >>article and raise any further questions here or IRC. I'll be out of the office > > >>until Wednesday. > > >> > > >>Mark > > >>> > > >>> > > >>> > > >>> > > >>> Regards, > > >>> > > >>> Madhuri > > >>> > > >>> From victoria at vmartinezdelacruz.com Wed Jan 22 13:13:22 2020 From: victoria at vmartinezdelacruz.com (=?UTF-8?Q?Victoria_Mart=C3=ADnez_de_la_Cruz?=) Date: Wed, 22 Jan 2020 10:13:22 -0300 Subject: Rails Girls Summer of Code In-Reply-To: References: Message-ID: Thanks for clarifying. This is great! 
On Fri, Jan 17, 2020 at 10:24 AM Amy Marrich wrote: > Victoria, > > I thought it was related to Ruby on Rails as well until I found the > following on their site: > > Rails Girls Summer of Code is programming language agnostic, and students > have contributed to an overall of 76 unique Open Source projects such as > Bundler, Rails, Discourse, Tessel, NextCloud, Processing, Babel, > impress.js, Lektor CMS, Hoodie, Speakerinnen, Lotus (now Hanami) and Servo. > > Maybe they've changed as the name is misleading when compared to that > statement. So if OpenStack wanted to get involved we would submit an > application and have some mentors/projects lined up similar > to Outreachy and Google Summer of Cone. > > Thanks, > > Amy (spotz) > > On Fri, Jan 17, 2020 at 5:44 AM Victoria Martínez de la Cruz < > victoria at vmartinezdelacruz.com> wrote: > >> Hi Amy, >> >> This is great! >> >> How is that agnostic? IIRC it was all related to Ruby on Rails projects? >> How OpenStack can join this effort? >> >> Thanks, >> >> V >> >> On Thu, Jan 16, 2020 at 9:55 AM Amy Marrich wrote: >> >>> Hi All, >>> >>> I was contacted about this program to see if OpenStack might be >>> interested in participating and despite the name it is language agnostic. >>> Moe information on the program can be found at Rails Girls Summer of >>> Code, >>> >>> I'm willing to help organize our efforts but would need to know level of >>> interest to participate and mentor. >>> >>> Thanks, >>> >>> Amy (spotz) >>> Chair, Diversity and Inclusion WG >>> Chair, User Committee >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ionut at fleio.com Wed Jan 22 13:14:48 2020 From: ionut at fleio.com (Ionut Biru) Date: Wed, 22 Jan 2020 15:14:48 +0200 Subject: [magnum] podman fedora-coreos authorization failed: SSL exception connecting on keystone In-Reply-To: References: <1dd3e495-c749-947f-4417-2008f7006cd1@catalyst.net.nz> Message-ID: Hello, I've deployed the same kubernetes version on fedora-atomic but with use_podman=true and worked flawless. Maybe is an issue with fedora-coreos? On Wed, Jan 22, 2020 at 9:53 AM Ionut Biru wrote: > Hello, > > I don't have cafile configured in keystone_authtoken and keystone_auth. I > did copied letsencrypt cafile and configured it but now magnum cannot > communicate with keystone even at simple as coe cluster list. > > CRITICAL keystonemiddleware.auth_token [-] Unable to validate token: > Could not find versioned identity endpoints when attempting to > authenticate. > (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', > 'tls_process_server_certificate', 'certificate verify ies exceeded with > url: / (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', > 'tls_process_server_certificate', 'certificate verify failed')],)",),) > > On Wed, Jan 22, 2020 at 3:02 AM Feilong Wang > wrote: > >> Hi Ionut, >> >> Would you mind sharing your magnum.conf? I think you may need the >> *cafile* config option for both *keystone_authtoken* and *keystone_auth.* >> >> >> On 22/01/20 11:01 AM, Ionut Biru wrote: >> >> Hello guys, >> >> I'm trying to deploy a kubernetes cluster using magnum 9.2 >> with fedora-coreos-31.20200113.3.1-openstack. 
>> >> Master vm is deployed correctly but the cluster is never deployed since >> podman returns the following error: >> >> >> Jan 21 21:55:14 k8s-cluster002-mn5qgp6qlmw6-master-0 podman[2433]: >> Authorization failed: SSL exception connecting to >> https://api.mydomain.cloud:5000/v3/auth/tokens: HTTPSConnectionPool(host='api.mydomain.cloud', >> port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by >> SSLError(SSLError(185090184, u'[X509] no certificate or crl found >> (_ssl.c:3063)'),)) >> >> I do have a valid letsencrypt certification on that particular domain. >> >> curl https://api.mydomain.cloud:5000/v3/auth/tokens >> {"error": {"message": "The request you have made requires >> authentication.", "code": 401, "title": "Unauthorized"}} >> >> I was wondering, do you guys seen this issue before? Below is the >> template. >> >> https://paste.xinu.at/OC0Ic/ >> -- >> Ionut Biru - https://fleio.com >> >> -- >> Cheers & Best regards, >> Feilong Wang (王飞龙) >> Head of R&D >> Catalyst Cloud - Cloud Native New Zealand >> -------------------------------------------------------------------------- >> Tel: +64-48032246 >> Email: flwang at catalyst.net.nz >> Level 6, Catalyst House, 150 Willis Street, Wellington >> -------------------------------------------------------------------------- >> >> > > -- > Ionut Biru - https://fleio.com > -- Ionut Biru - https://fleio.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ionut at fleio.com Wed Jan 22 15:05:36 2020 From: ionut at fleio.com (Ionut Biru) Date: Wed, 22 Jan 2020 17:05:36 +0200 Subject: [magnum] podman fedora-coreos authorization failed: SSL exception connecting on keystone In-Reply-To: References: <1dd3e495-c749-947f-4417-2008f7006cd1@catalyst.net.nz> Message-ID: Hi, I found the difference between the two. On fedora-coreos inside the heat container that is ran by podman REQUESTS_CA_BUNDLE has the value /etc/pki/ca-trust/source/anchors/openstack-ca.pem which is empty. On fedora-atomic the var has the value /etc/pki/tls/certs/ca-bundle.crt On Wed, Jan 22, 2020 at 3:14 PM Ionut Biru wrote: > Hello, > > I've deployed the same kubernetes version on fedora-atomic but with > use_podman=true and worked flawless. > Maybe is an issue with fedora-coreos? > > On Wed, Jan 22, 2020 at 9:53 AM Ionut Biru wrote: > >> Hello, >> >> I don't have cafile configured in keystone_authtoken and keystone_auth. I >> did copied letsencrypt cafile and configured it but now magnum cannot >> communicate with keystone even at simple as coe cluster list. >> >> CRITICAL keystonemiddleware.auth_token [-] Unable to validate token: >> Could not find versioned identity endpoints when attempting to >> authenticate. >> (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', >> 'tls_process_server_certificate', 'certificate verify ies exceeded with >> url: / (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', >> 'tls_process_server_certificate', 'certificate verify failed')],)",),) >> >> On Wed, Jan 22, 2020 at 3:02 AM Feilong Wang >> wrote: >> >>> Hi Ionut, >>> >>> Would you mind sharing your magnum.conf? I think you may need the >>> *cafile* config option for both *keystone_authtoken* and >>> *keystone_auth.* >>> >>> >>> On 22/01/20 11:01 AM, Ionut Biru wrote: >>> >>> Hello guys, >>> >>> I'm trying to deploy a kubernetes cluster using magnum 9.2 >>> with fedora-coreos-31.20200113.3.1-openstack. 
>>> >>> Master vm is deployed correctly but the cluster is never deployed since >>> podman returns the following error: >>> >>> >>> Jan 21 21:55:14 k8s-cluster002-mn5qgp6qlmw6-master-0 podman[2433]: >>> Authorization failed: SSL exception connecting to >>> https://api.mydomain.cloud:5000/v3/auth/tokens: HTTPSConnectionPool(host='api.mydomain.cloud', >>> port=5000): Max retries exceeded with url: /v3/auth/tokens (Caused by >>> SSLError(SSLError(185090184, u'[X509] no certificate or crl found >>> (_ssl.c:3063)'),)) >>> >>> I do have a valid letsencrypt certification on that particular domain. >>> >>> curl https://api.mydomain.cloud:5000/v3/auth/tokens >>> {"error": {"message": "The request you have made requires >>> authentication.", "code": 401, "title": "Unauthorized"}} >>> >>> I was wondering, do you guys seen this issue before? Below is the >>> template. >>> >>> https://paste.xinu.at/OC0Ic/ >>> -- >>> Ionut Biru - https://fleio.com >>> >>> -- >>> Cheers & Best regards, >>> Feilong Wang (王飞龙) >>> Head of R&D >>> Catalyst Cloud - Cloud Native New Zealand >>> -------------------------------------------------------------------------- >>> Tel: +64-48032246 >>> Email: flwang at catalyst.net.nz >>> Level 6, Catalyst House, 150 Willis Street, Wellington >>> -------------------------------------------------------------------------- >>> >>> >> >> -- >> Ionut Biru - https://fleio.com >> > > > -- > Ionut Biru - https://fleio.com > -- Ionut Biru - https://fleio.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Wed Jan 22 16:08:43 2020 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 22 Jan 2020 08:08:43 -0800 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <7b55e3b28d644492a846fdb10f7b127b@AUSX13MPS308.AMER.DELL.COM> Message-ID: Circling back to this now since I'm not in meetings and can actually think about this topic. :) On Sun, Jan 12, 2020 at 1:42 PM Nadathur, Sundar wrote: > [trim] > > Further complicating matters is the "Metal to Tenant" use cases where the > > user requesting the machine is not an administrator, but has some level of > > inherent administrative access to all Operating System accessible devices once > > their OS has booted. Which makes me wonder "What if the cloud > > administrators WANT to block the tenant's direct ability to write/flash > > firmware into accelerator/smartnic/etc?" > > Yes, admins may want to do that. This can be done (partly) via RBAC, by having different roles for tenants who can use devices but not reprogram them, and for tenants who can program the device with application/scheduling-relevant features (but not firmware), etc. I concur that it might be able to do by RBAC for hypervisor hosts where access is abstracted and controlled, however the concern in the baremetal integration use case is the tenant ultimately has full superuser access to the machine. > > > I suspect if cloud administrators want to block such hardware access, > > vendors will want to support such a capability. > > Devices can and usually do offer separate mechanisms for reading from registers, writing to them, updating flash etc. each with associated access permissions. A device vendor can go a bit extra by requiring specific Linux capabilities, such as say CAP_IPC_LOCK for mmap access, in their device driver. 
> Going back to the prior point for a Metal to Tenant case, these may be true for pure users of a shared system, but with the operating model of bare metal as a service, the user has full machine access. The user could also deploy an OS where capabilities checking is disabled entirely. > > Blocking such access inherently forces some actions into hardware > > management/maintenance workflows, and may ultimately may cause some of > > a support matrix's use cases to be unsupportable, again ultimately depending > > on what exactly the user is attempting to achieve. > > Not sure if you are expressing a concern here. If the admin is using device features or RBAC to restrict access, then she is intentionally blocking some combinations in your support matrix, right? Users in such a deployment need to live with that. I was trying to further stress the prior concern and convey that I perceive the end result being a matrix of use cases where some are unsupportable. I completely agree that, in the end, the users would need to live with that situation. I just think that clarity will need to exist for users on what is possible, and what ultimately is not possible in various scenarios. -Julia From juliaashleykreger at gmail.com Wed Jan 22 16:36:28 2020 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 22 Jan 2020 08:36:28 -0800 Subject: [Cyborg][Ironic][Nova][Neutron][TripleO][Cinder] accelerators management In-Reply-To: References: <87344b65740d47fc9777ae710dcf4af9@AUSX13MPS308.AMER.DELL.COM> <20200113165340.ge3hitlqrdfhj52m@yuggoth.org> Message-ID: On Mon, Jan 13, 2020 at 10:58 AM Sean Mooney wrote: > > On Mon, 2020-01-13 at 18:26 +0000, Nadathur, Sundar wrote: [trim] > > > > > I wouldn't be surprised, though, if there *are* NFV-related cases where the > > > users of the virtual machines into which some network hardware is mapped > > > need access to alter parts of, say, an interface controller's firmware. The Linux > > > kernel has for years incorporated features to write or rewrite firmware and > > > other microcode for certain devices at boot time for similar reasons, after all. > > > > This aspect does come up for discussion a lot. Generally, operators and device vendors get alarmed at the prospect of > > letting a user/VNF/instance program an image/bitstream into a device directly -- we wouldn't know what image it is, > > etc. Cyborg doesn't support that. But Cyborg could program an image/bitstream on behalf of the user/VNF. > to be fair if you device support reprogramming over pcie then you can enable the guest to reprogram the device using > nova's pci passthough feature by passing through the entire pf. cyborgs role is to provide a magaged acclerator not an > unmanaged one. if we wanted to use use pre programed fpga or fix function acclerator you have been able to do that with > pci passtough for the better part of 4 years. so i would consider unmanaged acclerator out of scope of cyborg at least > until the integration of managed accllerator is done. > > nova already handelds vGPU, vPMEM(persistent memeory), generic pci passthough, sriov for neutron ports and hardware > offloaded ovs VF(e.g. smart nic integration). > > cyborgs add value is in managing things nova cannot provide easily. 
> > arguing that ironic shoudl mangage fpga bitstream becasue it can manage firmware from a nova point of view is arguaing > the virt driver should manage all devices that are provide to the guest meaning in the libvirt case it and not cyborg > shoudl be continuted to be extended to mange fpgas and any other devices directly. I _feel_ like there would eventually be edge cases where it may be desired or required, but without a practical bare metal as a service integration to start with, it seems kind of crazy to think about it too much. > > we coudl do that but that would leave only one thing for cyborge to manage which woudl be remote acclartor that could be > proved to instnace over a network fabric. making it a kind of cinder of acclerators. that is a usecase that nova and > ironic both woudl be ill sutied for but it is not the dirction the cyborg project has moved in so unless you are > suggesing cyborg should piviot i dont think we should redesign the interaction between nova ironic cyborg and neutron to > have ironci manage the devices. I concur, I think the overall concern that started the discussion was still how as a vendor are these things supported and warranties are not inadvertently voided. From some discussions, I feel like the "As a cloud user I want a managed accelerator" is distinctly different from "As a cloud user I want baremetal" and still different from "As a cloud installer, I want to install my infrastructure". No one configuration, software, or use pattern will solve all of the cases, at least until AIs are writing our code for us and the installation AI can read/understand the OEM's build sheet to understand what was done at the factory. > i do think there is merrit in some integration between the ironic python agent and cyborg for discovery and perhaps > programing of the fpga on an ironic node assuming the actual discovery and programing logic live in cyborg and ironic > simply runs/deploys/configures the cyborg agent in the ipa image or invokes the cyborg code directly. > I absolutely agree, and I suspect from a practical operational standpoint, it would be good to at least offer a flag of "Hey, delete any bitstreams" between tenant deployments. The one conundrum is the mechanics of triggering and running a cyborg agent because these actions are typically performed on an isolated, restricted access network without actual access much less credentials to the message bus. Of course, likely solvable. > > > > That said, the VNF or VM (in a non-networking context) can configure a device by reading from registers/DDR on the > > card or writing to them. They can be handled using standard access permissions, Linux capabilities, etc. For example, > > the VM may memory-map a region of the device's address space using the mmap system call, and that access can be > > controlled. > > > > > -- > > > Jeremy Stanley > > > > Regards, > > Sundar > > > > From ds6901 at att.com Wed Jan 22 17:29:16 2020 From: ds6901 at att.com (SCHVENINGER, DOUGLAS P) Date: Wed, 22 Jan 2020 17:29:16 +0000 Subject: [stestr][tempest] how to conf test suite to get file_name to be pythonlogging instead of stdout Message-ID: <9CA8DCB499314346897F899E81CE5FFF61FC5055@MOSTLS1MSGUSRFB.ITServices.sbc.com> When using stestr we are seeing 1. different file_name for logging statement 2. mime type on one test suite run and not the other test suite run 3. 
test_id on all log statement in one test suite run but missing on some of the log statement from the other test suite run I was wondering if anyone has any input of how to config our internal test suite like tempest to get correct data to show up When we run tempest with stestr the subunit stream is placing all logging data with a test_id and the file_name being pythonlogging. Example sec record 2 below: Record 1 worker-00Mùϳ/àPÈ^'K‡ôoW�@{tempest.api.compute.flavors.test_flavors.FlavorsV2TestJSON.test_list_flavors[id-e36c0eaa-dff5-4082-ad1f-3f9a80aa3f59,smoke] Record 2 worker-0text/plain; charset="utf8"pythonlogging:''P�2020-01-21 19:05:43,639 142 INFO [tempest.lib.common.rest_client] Request (FlavorsV2TestJSON:test_list_flavors): 200 GET https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors 0.468s 2020-01-21 19:05:43,639 142 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-OpenStack-Nova-API-Version': '2.1', 'X-Auth-Token': ''} Body: None Response - Headers: {'date': 'Tue, 21 Jan 2020 19:05:43 GMT', 'content-type': 'application/json', 'content-length': '4244', 'connection': 'close', 'vary': 'X-OpenStack-Nova-API-Version', 'x-content-type-options': 'nosniff', 'strict-transport-security': 'max-age=15724800; includeSubDomains', 'content-security-policy': "script-src 'self'; object-src 'self'", 'x-frame-options': 'DENY', 'x-permitted-cross-domain-policies': 'none', 'x-xss-protection': '1; mode=block', 'openstack-api-version': 'compute 2.1', 'x-openstack-nova-api-version': '2.1', 'x-openstack-request-id': 'req-160e3fc6-d568-457d-9cbd-f6670b652565', 'x-compute-request-id': 'req-160e3fc6-d568-457d-9cbd-f6670b652565', 'status': '200', 'content-location': 'https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors'} Body: b'{"flavors": [{"id": "5f112417-533d-4e33-b9d7-ca8237a29a83", "name": "m1.xlarge", "links": [{"rel": "self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/5f112417-533d-4e33-b9d7-ca8237a29a83"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/5f112417-533d-4e33-b9d7-ca8237a29a83"}]}, {"id": "877ef018-6134-4d61-ba6b-d40ee58eb6be", "name": "aqa.alt20200121T190526948874", "links": [{"rel": "self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/877ef018-6134-4d61-ba6b-d40ee58eb6be"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/877ef018-6134-4d61-ba6b-d40ee58eb6be"}]}, {"id": "910d1836-119b-4ecf-9d47-3ef7e5990fad", "name": "aqa.primary20200121T185150095917", "links": [{"rel": "self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/910d1836-119b-4ecf-9d47-3ef7e5990fad"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/910d1836-119b-4ecf-9d47-3ef7e5990fad"}]}, {"id": "bd850da7-a2d8-48e8-bd42-3378abae556a", "name": "aqa.primary20200121T190526948874", "links": [{"rel": "self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/bd850da7-a2d8-48e8-bd42-3378abae556a"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/bd850da7-a2d8-48e8-bd42-3378abae556a"}]}, {"id": "c6653cba-3cbd-48bd-aa3e-54366bded2b4", "name": "m1.small", "links": [{"rel": 
"self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/c6653cba-3cbd-48bd-aa3e-54366bded2b4"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/c6653cba-3cbd-48bd-aa3e-54366bded2b4"}]}, {"id": "cdb15fe3-66a6-46cb-ab97-b26cc37b6d7f", "name": "m1.medium", "links": [{"rel": "self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/cdb15fe3-66a6-46cb-ab97-b26cc37b6d7f"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/cdb15fe3-66a6-46cb-ab97-b26cc37b6d7f"}]}, {"id": "dcc5d1ea-d15a-476e-8587-29bd34e36051", "name": "m1.large", "links": [{"rel": "self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/dcc5d1ea-d15a-476e-8587-29bd34e36051"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/dcc5d1ea-d15a-476e-8587-29bd34e36051"}]}, {"id": "e43ea269-f785-48f1-a05e-edffae5d5a85", "name": "taas-p1.c2r4d50", "links": [{"rel": "0ËUùƳ/ðMg^'K‡ôoW�@{tempest.api.compute.flavors.test_flavors.FlavorsV2TestJSON.test_list_flavors[id-e36c0eaa-dff5-4082-ad1f-3f9a80aa3f59,smoke] Record 3 worker-0text/plain; charset="utf8"pythonlogging:''LŸself", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/e43ea269-f785-48f1-a05e-edffae5d5a85"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/e43ea269-f785-48f1-a05e-edffae5d5a85"}]}, {"id": "eef05014-3e5f-4ea5-bbfa-8171c2e2707b", "name": "aqa.alt20200121T185150095917", "links": [{"rel": "self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/eef05014-3e5f-4ea5-bbfa-8171c2e2707b"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/eef05014-3e5f-4ea5-bbfa-8171c2e2707b"}]}, {"id": "f1a3141db37c48e49bd84e90d1c1369e", "name": "p1.c5r1.0d10s1.0e10", "links": [{"rel": "self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/f1a3141db37c48e49bd84e90d1c1369e"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/f1a3141db37c48e49bd84e90d1c1369e"}]}, {"id": "f3da4ca8-f8df-44f8-8a06-832bca480234", "name": "m1.tiny", "links": [{"rel": "self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/f3da4ca8-f8df-44f8-8a06-832bca480234"}, 2020-01-21 19:05:43,878 142 INFO [tempest.lib.common.rest_client] Request (FlavorsV2TestJSON:test_list_flavors): 200 GET https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/bd850da7-a2d8-48e8-bd42-3378abae556a 0.236s 2020-01-21 19:05:43,878 142 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-OpenStack-Nova-API-Version': '2.1', 'X-Auth-Token': ''} Body: None Response - Headers: {'date': 'Tue, 21 Jan 2020 19:05:43 GMT', 'content-type': 'application/json', 'content-length': '581', 'connection': 'close', 'vary': 'X-OpenStack-Nova-API-Version', 'x-content-type-options': 'nosniff', 'strict-transport-security': 'max-age=15724800; includeSubDomains', 'content-security-policy': "script-src 'self'; object-src 'self'", 'x-frame-options': 'DENY', 'x-permitted-cross-domain-policies': 'none', 'x-xss-protection': '1; 
mode=block', 'openstack-api-version': 'compute 2.1', 'x-openstack-nova-api-version': '2.1', 'x-openstack-request-id': 'req-38380cca-ecaf-4fb5-a768-cc91172ccbff', 'x-compute-request-id': 'req-38380cca-ecaf-4fb5-a768-cc91172ccbff', 'status': '200', 'content-location': 'https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/bd850da7-a2d8-48e8-bd42-3378abae556a'} Body: b'{"flavor": {"id": "bd850da7-a2d8-48e8-bd42-3378abae556a", "name": "aqa.primary20200121T190526948874", "ram": 1024, "disk": 5, "swap": "", "OS-FLV-EXT-DATA:ephemeral": 0, "OS-FLV-DISABLED:disabled": false, "vcpus": 1, "os-flavor-access:is_public": true, "rxtx_factor": 1.0, "links": [{"rel": "self", "href": "https://compute-nc.auk51a.cci.att.com/v2.1/8695795f8370474d9d628b3cdd9f97cb/flavors/bd850da7-a2d8-48e8-bd42-3378abae556a"}, {"rel": "bookmark", "href": "https://compute-nc.auk51a.cci.att.com/8695795f8370474d9d628b3cdd9f97cb/flavors/bd850da7-a2d8-48e8-bd42-3378abae556a"}]}}' 0£Õ8d³/ƒ@š^'K‡ôoW�@{tempest.api.compute.flavors.test_flavors.FlavorsV2TestJSON.test_list_flavors[id-e36c0eaa-dff5-4082-ad1f-3f9a80aa3f59,smoke] When we run our internal test suite with stestr the subunit stream is placing all logging data sometime with a test_id and the file_name being stdout. Example sec record 2 below: Record 1 process config logging to file_name stdout �'�@�^(�R����worker-0stdout@�2020-01-22 11:24:34.681 24600 INFO aic_aqa_infrastructure_sonobuoy_plugins.host_info.config [-] Config and Logging setup by aic_aqa_infrastructure_sonobuoy_plugins.host_info.config Config data following in DEBUG 0��..�'�B^^(�R�z��worker-0stdoutB82020-01-22 11:24:34.742 24600 INFO aic_aqa_infrastructure_sonobuoy_plugins.common.expected.base [-] Loaded ExpectedData source file: requirements from: /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/host_info/expected/data/requirements.yaml 2020-01-22 11:24:34.744 24600 INFO aic_aqa_infrastructure_sonobuoy_plugins.common.expected.base [-] Loaded ExpectedData source file: site from: /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/host_info/expected/data/site.yaml 0q���'�A@^(�R�'��worker-0stdoutA2020-01-22 11:24:34.819 24600 INFO aic_aqa_infrastructure_sonobuoy_plugins.common.expected.base [-] Loaded ExpectedData source file: default from: /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/host_info/expected/data/default.yaml 0��'�BP^(�S^�worker-0stdoutB*2020-01-22 11:24:35.260 24600 DEBUG aic_aqa_infrastructure_sonobuoy_plugins.common.utils.host_info_utils [-] file mount point /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/tests/functional/data/host with path /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/tests/functional/data/host/etc/rsyslog.d/40-sensage.conf get_host_mount /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/common/utils/host_info_utils.py:18 Record 2 test case exists event 0#l'��/�@�^(�S�#)H@|aic_aqa_infrastructure_sonobuoy_plugins.host_info.tests.network.test_calico_logs.TestCalicoLogs.test_flow_log_file_retentionworker-00��� �'�E^(�S�� ________________________________ `worker-0stdoutD�2020-01-22 11:24:35.290 24600 DEBUG aic_aqa_infrastructure_sonobuoy_plugins.common.utils.host_info_utils [-] file mount point 
/home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/tests/functional/data/host with path /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/tests/functional/data/host/etc/hostname get_host_mount /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/common/utils/host_info_utils.py:18 2020-01-22 11:24:35.291 24600 DEBUG aic_aqa_infrastructure_sonobuoy_plugins.common.actuals.commands.base_command [-] Executing command: cat /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/tests/functional/data/host/etc/hostname __init__ /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/common/actuals/commands/base_command.py:31 2020-01-22 11:24:35.298 24600 INFO aic_aqa_infrastructure_sonobuoy_plugins.host_info.tests.test_base [-] Use environment file point /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/tests/functional/data/controller-describe-node.json to describe node. 0�j}��'�D�^(�S��worker-0stdoutD�2020-01-22 11:24:35.342 24600 DEBUG aic_aqa_infrastructure_sonobuoy_plugins.common.utils.host_info_utils [-] file mount point /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/tests/functional/data/host with path /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/tests/functional/data/host/var/log/calico/flowlogs get_host_mount /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/common/utils/host_info_utils.py:18 2020-01-22 11:24:35.343 24600 DEBUG aic_aqa_infrastructure_sonobuoy_plugins.common.utils.host_info_utils [-] file mount point /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/tests/functional/data/host with path /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/tests/functional/data/host/var/log/calico/audit get_host_mount /home/ds6901/aic-aqa-infrastructure-sonobuoy-plugins/aic_aqa_infrastructure_sonobuoy_plugins/common/utils/host_info_utils.py:18 2020-01-22 11:24:35.345 24600 INFO aic_aqa_infrastructure_sonobuoy_plugins.host_info.tests.network.test_calico_logs [-] Class setup logging Record 3 test case in progress with test_id 0ŋ �/�@�^(�Sԗ&@|aic_aqa_infrastructure_sonobuoy_plugins.host_info.tests.network.test_calico_logs.TestCalicoLogs.test_flow_log_file_retentionworker-0 Record 4 log statements from the test case method but no test_id and file_name=stdout 0���Q�'�A�^(�S�a ________________________________ worker-0stdoutA�2020-01-22 11:24:35.350 24600 INFO aic_aqa_infrastructure_sonobuoy_plugins.host_info.tests.network.test_calico_logs [-] Sample log message 1 2020-01-22 11:24:35.358 24600 INFO aic_aqa_infrastructure_sonobuoy_plugins.host_info.tests.network.test_calico_logs [-] Sample log message 2 2020-01-22 11:24:35.359 24600 INFO aic_aqa_infrastructure_sonobuoy_plugins.host_info.tests.network.test_calico_logs [-] Sample log message 3 0�K��/�@�^(�S�n�0@|aic_aqa_infrastructure_sonobuoy_plugins.host_info.tests.network.test_calico_logs.TestCalicoLogs.test_flow_log_file_retentionworker-00N}� Thanks in advance for any help on this. Doug Schveninger Lead Software Engineer douglas.schveninger at att.com Office: (314) 450-3311 -------------- next part -------------- An HTML attachment was scrubbed... 
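One way to get the tempest-style result above (log lines attached per test under a pythonlogging detail, carrying the owning test_id, instead of being swept up as stdout) is to capture logging with a fixture in the suite's base test class, which is essentially what tempest and oslotest do. A minimal sketch, assuming the internal suite's tests can derive from a common testtools base class:

import logging

import fixtures
import testtools


class BaseTestCase(testtools.TestCase):

    def setUp(self):
        super(BaseTestCase, self).setUp()
        # FakeLogger captures python logging for the duration of each test and
        # attaches it to the result as a detail named "pythonlogging:''", which
        # subunit serializes with that file_name and the test's test_id.
        self.useFixture(fixtures.FakeLogger(
            name='',
            level=logging.DEBUG,
            format='%(asctime)s %(process)d %(levelname)-8s [%(name)s] %(message)s',
            nuke_handlers=True))

nuke_handlers=True also removes any stdout/stderr handlers attached to the root logger before setUp runs, which is usually what causes log lines to show up as plain stdout records without a test_id.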
URL: From sean.mcginnis at gmx.com Wed Jan 22 19:54:20 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Wed, 22 Jan 2020 13:54:20 -0600 Subject: [Release-job-failures] Release of openstack/tempest for ref refs/tags/23.0.0 failed In-Reply-To: References: Message-ID: <9d0cdf46-c97b-c1ae-4898-443a8fd80837@gmx.com> On 1/22/20 11:01 AM, zuul at openstack.org wrote: > Build failed. > > - release-openstack-python https://zuul.opendev.org/t/openstack/build/f56e6adb34c7425e8bda31ab29db4642 : SUCCESS in 3m 55s > - announce-release https://zuul.opendev.org/t/openstack/build/864c651598e74abca8c15a1aedee01c6 : SUCCESS in 4m 39s > - propose-update-constraints https://zuul.opendev.org/t/openstack/build/38926e79572e48c6909abb878df6ce88 : FAILURE in 15m 55s > > _______________________________________________ > Release-job-failures mailing list > Release-job-failures at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/release-job-failures We found there was a memory issue on one of the gitea servers that caused errors while trying to perform a clone. Everything was fine with the release other than proposing raising the upper constraints in openstack/requirements. I have manually proposed the update here: https://review.opendev.org/#/c/703863/ Sean From jungleboyj at gmail.com Wed Jan 22 19:57:51 2020 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 22 Jan 2020 13:57:51 -0600 Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... In-Reply-To: References: Message-ID: All, We once again are at the point in the release where we are talking about 3rd Party CI and what is going on for Cinder.  At the moment I have analyzed drivers that have not successfully reported results on a Cinder patch in 30 or more days and have put together the following list of drivers to be unsupported in the Ussuri release: * Inspur Drivers * Infortrend * Kaminario * NEC * Quobyte * Zadara * HPE Drivers If your name is in the list above you are receiving this e-mail directly, not just through the mailing list. If you are working on resolving CI issues please let me know so we can discuss how to proceed. In addition to the fact that we will be pushing up unsupported patches for the drivers above, we have already unsupported and removed a number of drivers during this release.  They are as follows: * Unsupported: o MacroSAN Driver * Removed: o ProphetStor Driver o Nimble Storage Driver o Veritas Access Driver o Veritas CNFS Driver o Virtuozzo Storage Driver o Huawei FusionStorage Driver o Sheepdog Storage Driver Obviously we are reaching the point that the number of drivers leaving the community is concerning and it has sparked discussions around the fact that maybe our 3rd Party CI approach isn't working as intended.  So what do we do?  Just mark drivers unsupported and no longer remove drivers?  Do we restore drivers that have recently been removed? We are planning to have further discussion around these questions at our next Cinder meeting in #openstack-meeting-4 on Wednesday, 1/29/20 at 14:00 UTC.  If you have thoughts or strong opinions around this topic please join us. Thank you! Jay Bryant jsbryant at electronicjungle.net IRC:  jungleboyj -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From radoslaw.piliszek at gmail.com Wed Jan 22 20:01:37 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 22 Jan 2020 21:01:37 +0100 Subject: [api][sdk][dev][oslo] using uWSGI breaks CORS config In-Reply-To: References: <7e1f612534bb572af4c645d3d58510bd7a1e4f4d.camel@redhat.com> Message-ID: Those following github may have already discovered uWSGI is not the issue here. The issue is devstack. Despite docs saying: > post-config - After OpenStack services have been initialized but still before > they have been started. The config happens after WSGI stuff is already started. In mod_wsgi case there is a difference in behavior because placement finally restarts apache2 which reloads keystone's config... So anything set in post-config at the moment will not affect at least APIs (if uWSGI). Worth checking what else is affected and fixing... -yoctozepto wt., 21 sty 2020 o 20:43 Radosław Piliszek napisał(a): > > I got response on https://github.com/unbit/uwsgi/issues/1550 > > Any oslo.middleware and/or quick testing platform setup experts are > welcome to join the thread there. :-) > > -yoctozepto > From jungleboyj at gmail.com Wed Jan 22 19:51:03 2020 From: jungleboyj at gmail.com (Jay Bryant) Date: Wed, 22 Jan 2020 13:51:03 -0600 Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... Message-ID: All, We once again are at the point in the release where we are talking about 3rd Party CI and what is going on for Cinder.  At the moment I have analyzed drivers that have not successfully reported results on a Cinder patch in 30 or more days and have put together the following list of drivers to be unsupported in the Ussuri release: * Inspur Drivers * Infortrend * Kaminario * NEC * Quobyte * Zadara * HPE Drivers If your name is in the list above you are receiving this e-mail directly, not just through the mailing list. If you are working on resolving CI issues please let me know so we can discuss how to proceed. In addition to the fact that we will be pushing up unsupported patches for the drivers above, we have already unsupported and removed a number of drivers during this release.  They are as follows: * Unsupported: o MacroSAN Driver * Removed: o ProphetStor Driver o Nimble Storage Driver o Veritas Access Driver o Veritas CNFS Driver o Virtuozzo Storage Driver o Huawei FusionStorage Driver o Sheepdog Storage Driver Obviously we are reaching the point that the number of drivers leaving the community is concerning and it has sparked discussions around the fact that maybe our 3rd Party CI approach isn't working as intended.  So what do we do?  Just mark drivers unsupported and no longer remove drivers?  Do we restore drivers that have recently been removed? We are planning to have further discussion around these questions at our next Cinder meeting in #openstack-meeting-4 on Wednesday, 1/29/20 at 14:00 UTC.  If you have thoughts or strong opinions around this topic please join us. Thank you! Jay Bryant jsbryant at electronicjungle.net IRC:  jungleboyj -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mtreinish at kortar.org Wed Jan 22 20:42:40 2020 From: mtreinish at kortar.org (Matthew Treinish) Date: Wed, 22 Jan 2020 15:42:40 -0500 Subject: [stestr][tempest] how to conf test suite to get file_name to be pythonlogging instead of stdout In-Reply-To: <9CA8DCB499314346897F899E81CE5FFF61FC5055@MOSTLS1MSGUSRFB.ITServices.sbc.com> References: <9CA8DCB499314346897F899E81CE5FFF61FC5055@MOSTLS1MSGUSRFB.ITServices.sbc.com> Message-ID: <20200122204240.GD103678@sinanju> On Wed, Jan 22, 2020 at 05:29:16PM +0000, SCHVENINGER, DOUGLAS P wrote: > When using stestr we are seeing > > 1. different file_name for logging statement > 2. mime type on one test suite run and not the other test suite run > 3. test_id on all log statement in one test suite run but missing on some of the log statement from the other test suite run > > I was wondering if anyone has any input of how to config our internal test suite like tempest to get correct data to show up It's hard to know exactly what's going on without seeing your internal test suite. But, tempest isn't doing anything special with regards to the attachments. Tempest just uses fixtures (https://github.com/testing-cabal/fixtures) to setup streams from stdout, stderr, and python logging, see: https://opendev.org/openstack/tempest/src/branch/master/tempest/test.py#L619-L631 Testtool's TestCase useFixture() method: https://github.com/testing-cabal/testtools/blob/f51ce5f934153e80d3e8a95b52e1464daeb30c14/testtools/testcase.py#L725-L765 handles attaching the stream from the fixture to the attachment as an addCleanup() call (instead of using addDetails() it directly appends it to the TestCase.__details dictionary). If your test suite is missing metadata from these attachments I'd look at how you're generating them and the method you're using them to attach to the output stream. > > When we run tempest with stestr the subunit stream is placing all logging data with a test_id and the file_name being pythonlogging. From the stestr perspective it's not doing anything beside passing the subunit stream emitted by the test runner (either https://github.com/mtreinish/stestr/tree/master/stestr/subunit_runner on py>=3.5 or https://github.com/testing-cabal/subunit/blob/master/python/subunit/run.py on py<3.5). So the answer to this question really relies on how you're generating the attachments in the test suite itself. Also, the subunit v2 format is binary and is very hard to read in an email. I couldn't parse what you pasted (which is why I snipped it). If you want to send the file I'd recommend using an attachment then I can parse the output via tools that can handle the subunit v2 input. Thanks, Matthew Treinish -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From skaplons at redhat.com Wed Jan 22 21:58:48 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 22 Jan 2020 22:58:48 +0100 Subject: [neutron][drivers team] Proposing Nate Johnston as drivers team member Message-ID: <20200122215847.zsfgihjmmjkwlrdl@skaplons-mac> Hi Neutrinos, I would like to propose Nate Johnston to be part of Neutron drivers team. Since long time Nate is very active Neutron's core reviewer. He is also actively participating in our Neutron drivers team meetings and he shown there that he has big experience and knowledge about Neutron, Neutron stadium projects as well as whole OpenStack. 
I think that he really deservers to be part of this team and that he will be great addition it. I will wait for Your feedback for 1 week and if there will be no any votes agains, I will add Nate to drivers team in next week. -- Slawek Kaplonski Senior software engineer Red Hat From kennelson11 at gmail.com Wed Jan 22 22:06:04 2020 From: kennelson11 at gmail.com (Kendall Nelson) Date: Wed, 22 Jan 2020 14:06:04 -0800 Subject: [all] Nominations for the "W" release name In-Reply-To: References: Message-ID: Thanks for getting this started Sean! -Kendall (diablo_rojo) On Tue, Jan 21, 2020 at 7:49 AM Sean McGinnis wrote: > Hello all, > > We get to be a little proactive this time around and get the release > name chosen for the "W" release. Time to start thinking of good names > again! > > Process Changes > --------------- > > There are a couple of changes to be aware of with our naming process. In > the past, we had always based our naming criteria on something > geographically local to the Summit location. With the event changes to > no longer have two large Summit-type events per year, we have tweaked > our process to open things up and make it hopefully a little easier to > pick a good name that the community likes. > > There are a couple of significant changes. First, names can now be > proposed for anything that starts with the appropriate letter. It is no > longer tied to a specific geographic region. Second, in order to > simplify the process, the electorate for the poll will be the OpenStack > Technical Committee. Full details of the release naming process can be > found here: > > https://governance.openstack.org/tc/reference/release-naming.html > > Name Selection > -------------- > > With that, the nomination period for the "W" release name is now open. > Please add suitable names to: > > https://wiki.openstack.org/wiki/Release_Naming/W_Proposals > > We will accept nominations until February 7, 202 at 23:59:59 UTC. We > will then have a brief period for any necessary discussions and to get > the poll set up, with the TC electorate voting starting by February 17, > 2020 and going no longer than February 23, 2020. > > Based on past timing with trademark and copyright reviews, we will > likely have an official release name by mid to late March. > > Happy naming! > > Sean > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From haleyb.dev at gmail.com Wed Jan 22 23:02:55 2020 From: haleyb.dev at gmail.com (Brian Haley) Date: Wed, 22 Jan 2020 18:02:55 -0500 Subject: [neutron][drivers team] Proposing Nate Johnston as drivers team member In-Reply-To: <20200122215847.zsfgihjmmjkwlrdl@skaplons-mac> References: <20200122215847.zsfgihjmmjkwlrdl@skaplons-mac> Message-ID: Nate would be a great addition to the drivers team, big +1 from me :) On 1/22/20 4:58 PM, Slawek Kaplonski wrote: > Hi Neutrinos, > > I would like to propose Nate Johnston to be part of Neutron drivers team. > Since long time Nate is very active Neutron's core reviewer. He is also actively > participating in our Neutron drivers team meetings and he shown there that he > has big experience and knowledge about Neutron, Neutron stadium projects as well > as whole OpenStack. > I think that he really deservers to be part of this team and that he will be > great addition it. > I will wait for Your feedback for 1 week and if there will be no any votes > agains, I will add Nate to drivers team in next week. 
> From mark.kirkwood at catalyst.net.nz Wed Jan 22 23:34:30 2020 From: mark.kirkwood at catalyst.net.nz (Mark Kirkwood) Date: Thu, 23 Jan 2020 12:34:30 +1300 Subject: [swift] Adding disks - one by one or all lightly weighted? Message-ID: <30bf2fd4-bedb-e73c-874f-d4a3efc68b13@catalyst.net.nz> Hi, We are wanting to increase the number of disks in each of our storage nodes - from 4 to 12. I'm wondering whether it is better to: 1/ Add 1st new disk (with a reduced weight)...increase the weight until full, then repeat for next disk etc 2/ Add 'em all with a (much i.e 1/8 of that in 1/ ) reduced weight...increase the weights until done Thoughts? regards Mark From sundar.nadathur at intel.com Thu Jan 23 02:23:27 2020 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Thu, 23 Jan 2020 02:23:27 +0000 Subject: [cyborg] No IRC meeting today Message-ID: Since many people are traveling on account of the Chinese New Year, there will be no IRC meeting today. Regards, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From Ido.Benda at kaminario.com Thu Jan 23 12:41:08 2020 From: Ido.Benda at kaminario.com (Ido Benda) Date: Thu, 23 Jan 2020 12:41:08 +0000 Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... In-Reply-To: References: Message-ID: Hi Jay, Kaminario’s CI is broken since the drop of Xenial support. We are working to resolve the these issues. Ido Benda www.kaminario.com Mobile: +(972)-52-4799393 E-Mail: ido.benda at kaminario.com From: Jay Bryant Sent: Wednesday, January 22, 2020 21:51 To: openstack-discuss at lists.openstack.org; inspur.ci at inspur.com; wangyong2017 at inspur.com; Chengwei.Chou at infortrend.com; Bill.Sung at infortrend.com; Kuirong.Chen(陳奎融) ; Ido Benda ; Srinivas Dasthagiri ; nec-cinder-ci at istorage.jp.nec.com; silvan at quobyte.com; robert at quobyte.com; felix at quobyte.com; bjoern at quobyte.com; OpenStack Development ; Shlomi Avihou | Zadara ; msdu-openstack at groups.ext.hpe.com Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... All, We once again are at the point in the release where we are talking about 3rd Party CI and what is going on for Cinder. At the moment I have analyzed drivers that have not successfully reported results on a Cinder patch in 30 or more days and have put together the following list of drivers to be unsupported in the Ussuri release: * Inspur Drivers * Infortrend * Kaminario * NEC * Quobyte * Zadara * HPE Drivers If your name is in the list above you are receiving this e-mail directly, not just through the mailing list. If you are working on resolving CI issues please let me know so we can discuss how to proceed. In addition to the fact that we will be pushing up unsupported patches for the drivers above, we have already unsupported and removed a number of drivers during this release. They are as follows: * Unsupported: * MacroSAN Driver * Removed: * ProphetStor Driver * Nimble Storage Driver * Veritas Access Driver * Veritas CNFS Driver * Virtuozzo Storage Driver * Huawei FusionStorage Driver * Sheepdog Storage Driver Obviously we are reaching the point that the number of drivers leaving the community is concerning and it has sparked discussions around the fact that maybe our 3rd Party CI approach isn't working as intended. So what do we do? Just mark drivers unsupported and no longer remove drivers? Do we restore drivers that have recently been removed? 
We are planning to have further discussion around these questions at our next Cinder meeting in #openstack-meeting-4 on Wednesday, 1/29/20 at 14:00 UTC. If you have thoughts or strong opinions around this topic please join us. Thank you! Jay Bryant jsbryant at electronicjungle.net IRC: jungleboyj -------------- next part -------------- An HTML attachment was scrubbed... URL: From johfulto at redhat.com Thu Jan 23 14:57:02 2020 From: johfulto at redhat.com (John Fulton) Date: Thu, 23 Jan 2020 09:57:02 -0500 Subject: [all] pep8 results for maintenance branches over time Message-ID: What are projects doing about new linters working only with newer deps or even python3 only? For example [1] [2]. Might it be OK to ignore pep8 results on maintenance branches? Thanks, John [1] https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_82b/703943/1/check/openstack-tox-pep8/82b01fb/tox/pep8-1.log Ignoring ruamel.yaml: markers 'python_version == "3.4"' don't match your environment ERROR: ansible-lint 4.2.0 has requirement ruamel.yaml<1,>=0.15.34; python_version < "3.7", but you'll have ruamel-yaml 0.13.14 which is incompatible. [2] https://zuul.opendev.org/t/openstack/build/82b01fb8a8c04068a906bf0cb888de58 AttributeError: module 'ruamel.yaml' has no attribute 'YAML' Examining undercloud-debug.yaml of type playbook Traceback (most recent call last): File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/bin/ansible-lint", line 8, in sys.exit(main()) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__main__.py", line 187, in main matches.extend(runner.run()) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 287, in run skip_list=self.skip_list)) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 177, in run matches.extend(rule.matchtasks(playbookfile, text)) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 87, in matchtasks yaml = ansiblelint.utils.append_skipped_rules(yaml, text, file['type']) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", line 596, in append_skipped_rules yaml_skip = _append_skipped_rules(pyyaml_data, file_text, file_type) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", line 606, in _append_skipped_rules yaml = ruamel.yaml.YAML() AttributeError: module 'ruamel.yaml' has no attribute 'YAML' Examining undercloud-service-status.yaml of type playbook Traceback (most recent call last): File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/bin/ansible-lint", line 8, in sys.exit(main()) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__main__.py", line 187, in main matches.extend(runner.run()) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 287, in run skip_list=self.skip_list)) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 177, in run matches.extend(rule.matchtasks(playbookfile, text)) File 
"/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 87, in matchtasks yaml = ansiblelint.utils.append_skipped_rules(yaml, text, file['type']) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", line 596, in append_skipped_rules yaml_skip = _append_skipped_rules(pyyaml_data, file_text, file_type) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", line 606, in _append_skipped_rules yaml = ruamel.yaml.YAML() AttributeError: module 'ruamel.yaml' has no attribute 'YAML' /home/zuul/src/opendev.org/openstack/tripleo-validations ERROR: InvocationError for command /bin/bash tools/ansible-lint.sh (exited with code 1) pep8 finish: run-test after 45.18 seconds pep8 start: run-test-post pep8 finish: run-test-post after 0.00 seconds ___________________________________ summary ____________________________________ ERROR: pep8: commands failed From sean.mcginnis at gmx.com Thu Jan 23 15:16:36 2020 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 23 Jan 2020 09:16:36 -0600 Subject: [all] pep8 results for maintenance branches over time In-Reply-To: References: Message-ID: On 1/23/20 8:57 AM, John Fulton wrote: > What are projects doing about new linters working only with newer deps > or even python3 only? For example [1] [2]. Might it be OK to ignore > pep8 results on maintenance branches? > > Thanks, > John > > [1] > https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_82b/703943/1/check/openstack-tox-pep8/82b01fb/tox/pep8-1.log > > Ignoring ruamel.yaml: markers 'python_version == "3.4"' don't match > your environment > ERROR: ansible-lint 4.2.0 has requirement ruamel.yaml<1,>=0.15.34; > python_version < "3.7", but you'll have ruamel-yaml 0.13.14 which is > incompatible. > This actually isn't a pep8 issue, it's a pip install issue. It appears this job is installing a version of raumel that is not compatible with the Python version used. So it is rightly giving an error. It appears ansible-lint has a requirement for a version higher than the upper-constraint for stable/queens: https://opendev.org/openstack/requirements/src/branch/stable/queens/upper-constraints.txt#L204 We don't cap linters in the overall upper-constraints.txt file. It looks like the project may need to cap it at a version that is compatible with this stable branch version of things. Sean From gchamoul at redhat.com Thu Jan 23 16:33:30 2020 From: gchamoul at redhat.com (=?utf-8?B?R2HDq2w=?= Chamoulaud) Date: Thu, 23 Jan 2020 17:33:30 +0100 Subject: [all] pep8 results for maintenance branches over time In-Reply-To: References: Message-ID: <20200123163330.vd4m2wid2wmolz33@olivia.strider.local> On 23/Jan/2020 09:16, Sean McGinnis wrote: > On 1/23/20 8:57 AM, John Fulton wrote: > > What are projects doing about new linters working only with newer deps > > or even python3 only? For example [1] [2]. Might it be OK to ignore > > pep8 results on maintenance branches? 
> > > > Thanks, > > John > > > > [1] > > https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_82b/703943/1/check/openstack-tox-pep8/82b01fb/tox/pep8-1.log > > > > Ignoring ruamel.yaml: markers 'python_version == "3.4"' don't match > > your environment > > ERROR: ansible-lint 4.2.0 has requirement ruamel.yaml<1,>=0.15.34; > > python_version < "3.7", but you'll have ruamel-yaml 0.13.14 which is > > incompatible. > > > This actually isn't a pep8 issue, it's a pip install issue. > > It appears this job is installing a version of raumel that is not > compatible with the Python version used. So it is rightly giving an error. > > It appears ansible-lint has a requirement for a version higher than the > upper-constraint for stable/queens: > > https://opendev.org/openstack/requirements/src/branch/stable/queens/upper-constraints.txt#L204 > > We don't cap linters in the overall upper-constraints.txt file. It looks > like the project may need to cap it at a version that is compatible with > this stable branch version of things. Yes, that's the decision we've made. [1] Thanks for your analysis, Sean! [1] - https://review.opendev.org/#/c/703999/ Gaël, -- Gaël Chamoulaud (He/Him/His) .::. Red Hat .::. OpenStack .::. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From ssbarnea at redhat.com Thu Jan 23 16:36:26 2020 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Thu, 23 Jan 2020 16:36:26 +0000 Subject: [all] pep8 results for maintenance branches over time In-Reply-To: References: Message-ID: <54D0C24B-3B31-4441-BD30-4AE8DB8F0B68@redhat.com> While more or less complex fixes may be implemented to keep linting working on older branches we will clearly face an interesting challenge: - project adopts new syntax/linter, backporting a fix will need rewriting - outdated linters in maintenance branches will start to misbehave: false positives, failure to install, .... - unable to upgrade linters on maintenance branches due dependency conflicts, dropped py27 support, pip or other time-bombs My gut feeling about it is that sooner or later we may be forced to silence `tox -e linters` on old branches. This is easily achievable via tox.ini even without changing any jobs, I will give an example: ``` [linters:venv] commands = flake8 ``` If you add a minus before flake8, tox will ignore the exit code of flake8, but it will still run it. This is not an invitation to use this as a general practice, just an idea about how to unblock backports in some unfortunate cases. Cheers Sorin Sbarnea > On 23 Jan 2020, at 14:57, John Fulton wrote: > > What are projects doing about new linters working only with newer deps > or even python3 only? For example [1] [2]. Might it be OK to ignore > pep8 results on maintenance branches? > > Thanks, > John > > [1] > https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_82b/703943/1/check/openstack-tox-pep8/82b01fb/tox/pep8-1.log > > Ignoring ruamel.yaml: markers 'python_version == "3.4"' don't match > your environment > ERROR: ansible-lint 4.2.0 has requirement ruamel.yaml<1,>=0.15.34; > python_version < "3.7", but you'll have ruamel-yaml 0.13.14 which is > incompatible. 
> > > [2] https://zuul.opendev.org/t/openstack/build/82b01fb8a8c04068a906bf0cb888de58 > AttributeError: module 'ruamel.yaml' has no attribute 'YAML' > Examining undercloud-debug.yaml of type playbook > Traceback (most recent call last): > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/bin/ansible-lint", > line 8, in > sys.exit(main()) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__main__.py", > line 187, in main > matches.extend(runner.run()) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 287, in run > skip_list=self.skip_list)) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 177, in run > matches.extend(rule.matchtasks(playbookfile, text)) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 87, in matchtasks > yaml = ansiblelint.utils.append_skipped_rules(yaml, text, file['type']) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", > line 596, in append_skipped_rules > yaml_skip = _append_skipped_rules(pyyaml_data, file_text, file_type) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", > line 606, in _append_skipped_rules > yaml = ruamel.yaml.YAML() > AttributeError: module 'ruamel.yaml' has no attribute 'YAML' > Examining undercloud-service-status.yaml of type playbook > Traceback (most recent call last): > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/bin/ansible-lint", > line 8, in > sys.exit(main()) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__main__.py", > line 187, in main > matches.extend(runner.run()) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 287, in run > skip_list=self.skip_list)) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 177, in run > matches.extend(rule.matchtasks(playbookfile, text)) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 87, in matchtasks > yaml = ansiblelint.utils.append_skipped_rules(yaml, text, file['type']) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", > line 596, in append_skipped_rules > yaml_skip = _append_skipped_rules(pyyaml_data, file_text, file_type) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", > line 606, in _append_skipped_rules > yaml = ruamel.yaml.YAML() > AttributeError: module 'ruamel.yaml' has no attribute 'YAML' > /home/zuul/src/opendev.org/openstack/tripleo-validations > ERROR: InvocationError for command /bin/bash tools/ansible-lint.sh > (exited with code 1) > pep8 finish: run-test after 45.18 seconds > pep8 start: run-test-post > pep8 finish: run-test-post after 0.00 seconds > ___________________________________ summary ____________________________________ > ERROR: pep8: commands failed > > From sfinucan at redhat.com Thu 
Jan 23 17:34:23 2020 From: sfinucan at redhat.com (Stephen Finucane) Date: Thu, 23 Jan 2020 17:34:23 +0000 Subject: [all] pep8 results for maintenance branches over time In-Reply-To: <54D0C24B-3B31-4441-BD30-4AE8DB8F0B68@redhat.com> References: <54D0C24B-3B31-4441-BD30-4AE8DB8F0B68@redhat.com> Message-ID: <09dfb05b7a262758de8900fee47a48276b170002.camel@redhat.com> On Thu, 2020-01-23 at 16:36 +0000, Sorin Sbarnea wrote: > While more or less complex fixes may be implemented to keep linting working on older branches we will clearly face an interesting challenge: > > - project adopts new syntax/linter, backporting a fix will need rewriting > - outdated linters in maintenance branches will start to misbehave: false positives, failure to install, .... > - unable to upgrade linters on maintenance branches due dependency conflicts, dropped py27 support, pip or other time-bombs > > My gut feeling about it is that sooner or later we may be forced to silence `tox -e linters` on old branches. Before anyone does this, it's worth noting smcginnis' point that linters generally aren't versioned by upper-constraints so you can't rely on that to cap your versions for a given stable release. Instead, you should probably cap it at a given MINOR release and periodically update this range. Not only does this prevent a change in one of the linters breaking master, but it will ensure stable branches keep working as they did over the long term. We've done this in nova for years [1][2][3] and I'd just assumed everyone else was doing the same, to be honest :) Stephen [1] https://review.opendev.org/#/c/703405/ [2] https://github.com/openstack/nova/blob/19.0.0/test-requirements.txt#L5 [2] https://github.com/openstack/nova/blob/20.0.0/test-requirements.txt#L5 This is easily achievable via tox.ini even without changing any jobs, I will give an example: ``` [linters:venv] commands = flake8 ``` If you add a minus before flake8, tox will ignore the exit code of flake8, but it will still run it. This is not an invitation to use this as a general practice, just an idea about how to unblock backports in some unfortunate cases. Cheers Sorin Sbarnea On 23 Jan 2020, at 14:57, John Fulton wrote: What are projects doing about new linters working only with newer deps or even python3 only? For example [1] [2]. Might it be OK to ignore pep8 results on maintenance branches? Thanks, John [1] https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_82b/703943/1/check/openstack-tox-pep8/82b01fb/tox/pep8-1.log Ignoring ruamel.yaml: markers 'python_version == "3.4"' don't match your environment ERROR: ansible-lint 4.2.0 has requirement ruamel.yaml<1,>=0.15.34; python_version < "3.7", but you'll have ruamel-yaml 0.13.14 which is incompatible. 
[2] https://zuul.opendev.org/t/openstack/build/82b01fb8a8c04068a906bf0cb888de58 AttributeError: module 'ruamel.yaml' has no attribute 'YAML' Examining undercloud-debug.yaml of type playbook Traceback (most recent call last): File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/bin/ansible-lint", line 8, in sys.exit(main()) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__main__.py", line 187, in main matches.extend(runner.run()) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 287, in run skip_list=self.skip_list)) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 177, in run matches.extend(rule.matchtasks(playbookfile, text)) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 87, in matchtasks yaml = ansiblelint.utils.append_skipped_rules(yaml, text, file['type']) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", line 596, in append_skipped_rules yaml_skip = _append_skipped_rules(pyyaml_data, file_text, file_type) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", line 606, in _append_skipped_rules yaml = ruamel.yaml.YAML() AttributeError: module 'ruamel.yaml' has no attribute 'YAML' Examining undercloud-service-status.yaml of type playbook Traceback (most recent call last): File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/bin/ansible-lint", line 8, in sys.exit(main()) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__main__.py", line 187, in main matches.extend(runner.run()) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 287, in run skip_list=self.skip_list)) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 177, in run matches.extend(rule.matchtasks(playbookfile, text)) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", line 87, in matchtasks yaml = ansiblelint.utils.append_skipped_rules(yaml, text, file['type']) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", line 596, in append_skipped_rules yaml_skip = _append_skipped_rules(pyyaml_data, file_text, file_type) File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", line 606, in _append_skipped_rules yaml = ruamel.yaml.YAML() AttributeError: module 'ruamel.yaml' has no attribute 'YAML' /home/zuul/src/opendev.org/openstack/tripleo-validations ERROR: InvocationError for command /bin/bash tools/ansible-lint.sh (exited with code 1) pep8 finish: run-test after 45.18 seconds pep8 start: run-test-post pep8 finish: run-test-post after 0.00 seconds ___________________________________ summary ____________________________________ ERROR: pep8: commands failed From smooney at redhat.com Thu Jan 23 17:46:26 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 23 Jan 2020 17:46:26 +0000 Subject: [all] pep8 
results for maintenance branches over time In-Reply-To: <54D0C24B-3B31-4441-BD30-4AE8DB8F0B68@redhat.com> References: <54D0C24B-3B31-4441-BD30-4AE8DB8F0B68@redhat.com> Message-ID: <4a9e565b0425cf491068ab788e292b1b327b546c.camel@redhat.com> On Thu, 2020-01-23 at 16:36 +0000, Sorin Sbarnea wrote: > While more or less complex fixes may be implemented to keep linting working on older branches we will clearly face an > interesting challenge: > > - project adopts new syntax/linter, backporting a fix will need rewriting that should not really be the case. some linter might get stricter over time but when you backport you generally should not need to update the styple unless you are going form pyton3 only to python 2 compatible sysntax but that would be a require change regardless of the linter. > - outdated linters in maintenance branches will start to misbehave: false positives, failure to install, .... we have been pinning flake8 in nova for years. we have never had this happen and i dont think it will hapen unless you intentionall or unintentionlly make a backward impatable linter rule change. > - unable to upgrade linters on maintenance branches due dependency conflicts, dropped py27 support, pip or other time- > bombs we dont update linter on stable branches in general and typeically expect them to be frozen at the version they relased with. > > My gut feeling about it is that sooner or later we may be forced to silence `tox -e linters` on old branches. > > This is easily achievable via tox.ini even without changing any jobs, > > I will give an example: > ``` > [linters:venv] > commands = > flake8 > ``` > If you add a minus before flake8, tox will ignore the exit code of flake8, but it will still run it. well the correct way to do that would really to be mark the jobs as non voting in the zuul.yaml but i dont think this will be required unless project start removing old version from pypi. when project do that it breaks everyone and i think when that happens we need to explain the impact to the maintainer of the project and if they dont care consider should we replace that depency with an alternivite where the maintainer does care about legacy support. > > > This is not an invitation to use this as a general practice, just an idea about how to unblock backports in some > unfortunate cases. > > > Cheers > Sorin Sbarnea > > > On 23 Jan 2020, at 14:57, John Fulton wrote: > > > > What are projects doing about new linters working only with newer deps > > or even python3 only? For example [1] [2]. Might it be OK to ignore > > pep8 results on maintenance branches? > > > > Thanks, > > John > > > > [1] > > https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_82b/703943/1/check/openstack-tox-pep8/82b01fb/tox/pep8-1.log > > > > Ignoring ruamel.yaml: markers 'python_version == "3.4"' don't match > > your environment > > ERROR: ansible-lint 4.2.0 has requirement ruamel.yaml<1,>=0.15.34; > > python_version < "3.7", but you'll have ruamel-yaml 0.13.14 which is > > incompatible. 
> > > > > > [2] https://zuul.opendev.org/t/openstack/build/82b01fb8a8c04068a906bf0cb888de58 > > AttributeError: module 'ruamel.yaml' has no attribute 'YAML' > > Examining undercloud-debug.yaml of type playbook > > Traceback (most recent call last): > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/bin/ansible-lint", > > line 8, in > > sys.exit(main()) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/__main__.py", > > line 187, in main > > matches.extend(runner.run()) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/__init__.py", > > line 287, in run > > skip_list=self.skip_list)) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/__init__.py", > > line 177, in run > > matches.extend(rule.matchtasks(playbookfile, text)) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/__init__.py", > > line 87, in matchtasks > > yaml = ansiblelint.utils.append_skipped_rules(yaml, text, file['type']) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/utils.py", > > line 596, in append_skipped_rules > > yaml_skip = _append_skipped_rules(pyyaml_data, file_text, file_type) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/utils.py", > > line 606, in _append_skipped_rules > > yaml = ruamel.yaml.YAML() > > AttributeError: module 'ruamel.yaml' has no attribute 'YAML' > > Examining undercloud-service-status.yaml of type playbook > > Traceback (most recent call last): > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/bin/ansible-lint", > > line 8, in > > sys.exit(main()) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/__main__.py", > > line 187, in main > > matches.extend(runner.run()) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/__init__.py", > > line 287, in run > > skip_list=self.skip_list)) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/__init__.py", > > line 177, in run > > matches.extend(rule.matchtasks(playbookfile, text)) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/__init__.py", > > line 87, in matchtasks > > yaml = ansiblelint.utils.append_skipped_rules(yaml, text, file['type']) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/utils.py", > > line 596, in append_skipped_rules > > yaml_skip = _append_skipped_rules(pyyaml_data, file_text, file_type) > > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site- > > packages/ansiblelint/utils.py", > > line 606, in _append_skipped_rules > > yaml = ruamel.yaml.YAML() > > AttributeError: module 'ruamel.yaml' has no attribute 'YAML' > > /home/zuul/src/opendev.org/openstack/tripleo-validations > > ERROR: InvocationError for command /bin/bash tools/ansible-lint.sh > > (exited with code 1) > > pep8 finish: run-test after 45.18 seconds > > pep8 start: run-test-post > > pep8 finish: 
run-test-post after 0.00 seconds > > ___________________________________ summary ____________________________________ > > ERROR: pep8: commands failed > > > > > > From ashlee at openstack.org Thu Jan 23 18:13:24 2020 From: ashlee at openstack.org (Ashlee Ferguson) Date: Thu, 23 Jan 2020 12:13:24 -0600 Subject: Upcoming OSF Event: OpenDev + PTG, June 8-11 in Vancouver Message-ID: <6D3EE100-C598-442E-B9C5-DDA18C035C12@openstack.org> Join us this June in Vancouver for OpenDev + PTG, and grab your early bird tickets now ! OpenDev + PTG is a new collaborative event organized by the OpenStack Foundation gathering developers, system architects, and operators to address common open source infrastructure challenges. June 8-11, 2020 Vancouver Convention Centre - East Building OpenDev will include discussion oriented sessions around a particular topic to explore a problem within a topic area, share common architectures, and collaborate around potential solutions. This OpenDev will focus specifically on the following Tracks, spanning open source projects, including Airship, Ansible, Ceph, Kata Containers, Kubernetes, OpenStack, StarlingX, and Zuul: Hardware Automation Large-scale Usage of Open Source Infrastructure Software Containers in Production Key Challenges for Open Source in 2020 The conversations will continue into the afternoon with the Project Teams Gathering (PTG) . This is when the morning’s practices will be explored by project teams, SIGs and other workgroups who will have dedicated space to get work done in a productive setting, maximizing the ability of contributors to work through their project objectives in an environment that is focused towards work and productivity. Interested in Sponsoring? Sponsoring these events will help ensure the continued growth and success of open source infrastructure projects. A sponsorship prospectus will be available soon. If your organization is interested in participating and supporting the open infrastructure community, please contact events at openstack.org . P.S. - Stay tuned for more info on OSF’s Q4 event, the Open Infrastructure Summit you know and love! See you in Vancouver! The OpenStack Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Thu Jan 23 18:25:56 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 23 Jan 2020 12:25:56 -0600 Subject: [all] pep8 results for maintenance branches over time In-Reply-To: <09dfb05b7a262758de8900fee47a48276b170002.camel@redhat.com> References: <54D0C24B-3B31-4441-BD30-4AE8DB8F0B68@redhat.com> <09dfb05b7a262758de8900fee47a48276b170002.camel@redhat.com> Message-ID: <16fd3a753ac.118ae488d35676.6609063543387119719@ghanshyammann.com> ---- On Thu, 23 Jan 2020 11:34:23 -0600 Stephen Finucane wrote ---- > On Thu, 2020-01-23 at 16:36 +0000, Sorin Sbarnea wrote: > > While more or less complex fixes may be implemented to keep linting working on older branches we will clearly face an interesting challenge: > > > > - project adopts new syntax/linter, backporting a fix will need rewriting > > - outdated linters in maintenance branches will start to misbehave: false positives, failure to install, .... > > - unable to upgrade linters on maintenance branches due dependency conflicts, dropped py27 support, pip or other time-bombs > > > > My gut feeling about it is that sooner or later we may be forced to silence `tox -e linters` on old branches. 
> > Before anyone does this, it's worth noting smcginnis' point that > linters generally aren't versioned by upper-constraints so you can't > rely on that to cap your versions for a given stable release. Instead, > you should probably cap it at a given MINOR release and periodically > update this range. Not only does this prevent a change in one of the > linters breaking master, but it will ensure stable branches keep > working as they did over the long term. We've done this in nova for > years [1][2][3] and I'd just assumed everyone else was doing the same, > to be honest :) I hope so until they have not removed the cap from when I did for many projects gate year ago. - https://review.opendev.org/#/q/topic:cap-hacking+(status:open+OR+status:merged) -gmann > > Stephen > > [1] https://review.opendev.org/#/c/703405/ > [2] https://github.com/openstack/nova/blob/19.0.0/test-requirements.txt#L5 > [2] https://github.com/openstack/nova/blob/20.0.0/test-requirements.txt#L5 > > This is easily achievable via tox.ini even without changing any jobs, > > I will give an example: > ``` > [linters:venv] > commands = > flake8 > ``` > If you add a minus before flake8, tox will ignore the exit code of flake8, but it will still run it. > > > This is not an invitation to use this as a general practice, just an idea about how to unblock backports in some unfortunate cases. > > > Cheers > Sorin Sbarnea > > On 23 Jan 2020, at 14:57, John Fulton wrote: > > What are projects doing about new linters working only with newer deps > or even python3 only? For example [1] [2]. Might it be OK to ignore > pep8 results on maintenance branches? > > Thanks, > John > > [1] > https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_82b/703943/1/check/openstack-tox-pep8/82b01fb/tox/pep8-1.log > > Ignoring ruamel.yaml: markers 'python_version == "3.4"' don't match > your environment > ERROR: ansible-lint 4.2.0 has requirement ruamel.yaml<1,>=0.15.34; > python_version < "3.7", but you'll have ruamel-yaml 0.13.14 which is > incompatible. 
> > > [2] https://zuul.opendev.org/t/openstack/build/82b01fb8a8c04068a906bf0cb888de58 > AttributeError: module 'ruamel.yaml' has no attribute 'YAML' > Examining undercloud-debug.yaml of type playbook > Traceback (most recent call last): > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/bin/ansible-lint", > line 8, in > sys.exit(main()) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__main__.py", > line 187, in main > matches.extend(runner.run()) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 287, in run > skip_list=self.skip_list)) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 177, in run > matches.extend(rule.matchtasks(playbookfile, text)) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 87, in matchtasks > yaml = ansiblelint.utils.append_skipped_rules(yaml, text, file['type']) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", > line 596, in append_skipped_rules > yaml_skip = _append_skipped_rules(pyyaml_data, file_text, file_type) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", > line 606, in _append_skipped_rules > yaml = ruamel.yaml.YAML() > AttributeError: module 'ruamel.yaml' has no attribute 'YAML' > Examining undercloud-service-status.yaml of type playbook > Traceback (most recent call last): > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/bin/ansible-lint", > line 8, in > sys.exit(main()) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__main__.py", > line 187, in main > matches.extend(runner.run()) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 287, in run > skip_list=self.skip_list)) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 177, in run > matches.extend(rule.matchtasks(playbookfile, text)) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/__init__.py", > line 87, in matchtasks > yaml = ansiblelint.utils.append_skipped_rules(yaml, text, file['type']) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", > line 596, in append_skipped_rules > yaml_skip = _append_skipped_rules(pyyaml_data, file_text, file_type) > File "/home/zuul/src/opendev.org/openstack/tripleo-validations/.tox/pep8/lib/python3.5/site-packages/ansiblelint/utils.py", > line 606, in _append_skipped_rules > yaml = ruamel.yaml.YAML() > AttributeError: module 'ruamel.yaml' has no attribute 'YAML' > /home/zuul/src/opendev.org/openstack/tripleo-validations > ERROR: InvocationError for command /bin/bash tools/ansible-lint.sh > (exited with code 1) > pep8 finish: run-test after 45.18 seconds > pep8 start: run-test-post > pep8 finish: run-test-post after 0.00 seconds > ___________________________________ summary ____________________________________ > ERROR: pep8: commands failed > > > > > > > From MM9745 at 
att.com Thu Jan 23 20:01:29 2020 From: MM9745 at att.com (MCEUEN, MATT) Date: Thu, 23 Jan 2020 20:01:29 +0000 Subject: [openstack-helm] Core Reviewer Nominations In-Reply-To: References: Message-ID: <7C64A75C21BB8D43BD75BB18635E4D8970ACFFDD@MOSTLS1MSGUSRFF.ITServices.sbc.com> +1 & +1 Welcome! From: Tin Lam Sent: Tuesday, January 21, 2020 11:37 AM To: openstack-discuss at lists.openstack.org Subject: Re: [openstack-helm] Core Reviewer Nominations +1. Thanks for your reviews and contributions to the OSH projects, Gage and Steven. Tin On Mon, Jan 20, 2020 at 7:14 PM Pete Birley > wrote: OpenStack-Helm team, Based on their record of quality code review and substantial/meaningful code contributions to the openstack-helm project, at last week's meeting we proposed the following individuals as core reviewers for openstack-helm: * Gage Hugo * Steven Fitzpatrick All OpenStack-Helm Core Reviewers are invited to reply with a +1/-1 by EOD next Monday (27/1/2020). A lone +1/-1 will apply to both candidates, otherwise please spell out votes individually for the candidates. Cheers, Pete -- Regards, Tin Lam -------------- next part -------------- An HTML attachment was scrubbed... URL:
From RRajasekhar at misoenergy.org Thu Jan 23 20:48:27 2020 From: RRajasekhar at misoenergy.org (Ruchi Rajasekhar) Date: Thu, 23 Jan 2020 20:48:27 +0000 Subject: quick question Message-ID: Would anyone happen to know of any data science platforms that can run on OpenStack? I was looking at Pivotal, Pachyderm but they don't run on OpenStack ☹ -------------- next part -------------- An HTML attachment was scrubbed... URL:
From smooney at redhat.com Thu Jan 23 21:17:12 2020 From: smooney at redhat.com (Sean Mooney) Date: Thu, 23 Jan 2020 21:17:12 +0000 Subject: quick question In-Reply-To: References: Message-ID: On Thu, 2020-01-23 at 20:48 +0000, Ruchi Rajasekhar wrote: > Would anyone happen to know of any data science platforms that can run on OpenStack? I was looking at Pivotal, > Pachyderm but they don't run on OpenStack ☹ Assuming you mean https://pivotal.io/, checking their docs it seems to be supported on OpenStack: https://docs.pivotal.io/platform/2-8/plan/openstack/openstack_ref_arch.html Their install guide is here: https://docs.pivotal.io/platform/2-8/customizing/openstack.html Pachyderm seems to mainly target deployment on Kubernetes. Even the local on-prem guide https://docs.pachyderm.com/latest/deploy-manage/deploy/on_premises/ assumes that you will deploy Kubernetes, so you can always just deploy Kubernetes on OpenStack and then deploy Pachyderm on that, but it does not look like they tried to make it easy to install without Kubernetes. In fact, that is more or less the first line in their deployment overview: "Pachyderm runs on Kubernetes and is backed by an object store of your choice." https://docs.pachyderm.com/latest/deploy-manage/deploy/ Since it was never intended to run on anything other than Kubernetes, your best bet is to deploy Kubernetes first on OpenStack or bare metal and then deploy Pachyderm as normal.
From skaplons at redhat.com Thu Jan 23 22:46:15 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 23 Jan 2020 23:46:15 +0100 Subject: [neutron] Drivers meeting on 24.01 cancelled Message-ID: <20200123224615.mxpzma7fhglhs5l3@skaplons-mac> Hi, I am on PTO and I will not have time to chair tomorrow's drivers meeting. Let's cancel it this week. See you all next week at the meeting. Sorry for the late notice.
-- Slawek Kaplonski Senior software engineer Red Hat From johnsomor at gmail.com Fri Jan 24 00:36:30 2020 From: johnsomor at gmail.com (Michael Johnson) Date: Thu, 23 Jan 2020 16:36:30 -0800 Subject: quick question In-Reply-To: References: Message-ID: I am guessing you are asking about Pivotal Greenplum based on the data science question. If so, yes, they mention OpenStack support in the datasheet. https://content.pivotal.io/datasheets/pivotal-greenplum Michael On Thu, Jan 23, 2020 at 1:20 PM Sean Mooney wrote: > > On Thu, 2020-01-23 at 20:48 +0000, Ruchi Rajasekhar wrote: > > Would anyone happen to know of any data science platforms that can run on OpenStack? I was looking at Pivotal, > > Pachyderm but they don't run on OpenStack ☹ > > assuming you mean https://pivotal.io/ checking there docs it seams to be supported on openstack > https://docs.pivotal.io/platform/2-8/plan/openstack/openstack_ref_arch.html > > there install guide is here https://docs.pivotal.io/platform/2-8/customizing/openstack.html > > pachyderm seams to mainly target deployment on kubernetes. even the local on perm guide > https://docs.pachyderm.com/latest/deploy-manage/deploy/on_premises/ > assumes that you will deploy kubernetes so you can always just deploy kubernets on openstack and then > deploy pachyderm on that but it does not look like they tried to make it easy to install without kubernetes. > infact that is more or less the first line in there deployment overview > "Pachyderm runs on Kubernetes and is backed by an object store of your choice." > https://docs.pachyderm.com/latest/deploy-manage/deploy/ > > since it was never intended to run on anything othe than kuberntes you best bet to deploy it is to deploy that first on > openstack or bare mentally then deploy pachyderm as normal. > > From agarwalvishakha18 at gmail.com Fri Jan 24 06:20:10 2020 From: agarwalvishakha18 at gmail.com (Vishakha Agarwal) Date: Fri, 24 Jan 2020 11:50:10 +0530 Subject: [keystone] Keystone Team Update - Week of 20 January 2020 Message-ID: # Keystone Team Update - Week of 20 January 2020 ## News ### Roadmap Review The team isn't organizing a midcycle or milestone-ly meetings in this cycle. Instead we decided to utilize the meeting time and go through the ussuri roadmap [1]. We discussed the progress with the card owners and if they need any help with them. Please find the updated status of the cards in the roadmap [1]. [1] https://tree.taiga.io/project/keystone-ussuri-roadmap/kanban ### User Support and Bug Duty Every week the duty is being rotated between the members. The person-in-charge for bug duty for current and upcoming week can be seen on the etherpad [2] [2] https://etherpad.openstack.org/p/keystone-l1-duty ## Open Specs Ussuri specs: https://bit.ly/2XDdpkU Ongoing specs: https://bit.ly/2OyDLTh ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 2 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 40 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. 
### Priority Reviews * Community Goals https://review.opendev.org/#/c/699127/ [ussuri][goal] Drop python 2.7 support and testing keystone-tempest-plugin https://review.opendev.org/#/c/699119/ [ussuri][goal] Drop python 2.7 support and testing python-keystoneclient * Special Requests https://review.opendev.org/#/c/662734/ Change the default Identity endpoint to internal https://review.opendev.org/#/c/699013/ Always have username in CADF initiator https://review.opendev.org/#/c/700826/ Fix role_assignments role.id filter https://review.opendev.org/#/c/697444/ Adding options to user cli https://review.opendev.org/#/c/702374/ Cleanup doc/requirements.txt https://review.opendev.org/#/c/588211/ Add openstack_groups to assertion https://review.opendev.org/#/c/703578/ Updating tox -e all-plugin command ## Bugs This week we opened 2 new bugs and closed 1. Bugs opened (2) Bug #1860478 (keystone:Low): fetching role assignments should handle domain IDs in addition to project IDs - Opened by Harry Rybacki https://bugs.launchpad.net/keystone/+bug/1860478 Bug #1860252 (keystone:Undecided): security problem,one user can change other user's password without admin - Opened by kuangpeiling https://bugs.launchpad.net/keystone/+bug/1860252 Bugs closed (1) Bug #1860252 (keystone:Invalid): https://bugs.launchpad.net/keystone/+bug/1860252 ## Milestone Outlook https://releases.openstack.org/ussuri/schedule.html Spec freeze is on the week of 10 Feburary. We have 3 specs proposed for usuri cycle that needs reviews. ## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From tony.pearce at cinglevue.com Fri Jan 24 07:41:42 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Fri, 24 Jan 2020 15:41:42 +0800 Subject: Backup and restore questions Message-ID: Is it possible to restore "openstack" to a different cloud? For example, either of the following; 1. Backup Openstack with version "X" and restore to different fresh deployment of the same version or 2. Backup Openstack with version "X" and restore to Openstack Version Y or even later, Z And are there any tools associated with Openstack that achieves this? I think this would come under the term "migration". I've been reading the Openstack docs with regards to backup/restore and doing some trial and error on my side. It appears to me that the "backup/restore" is only applicable for the same system - can anyone confirm it? Are there any options for transferring data to a different Openstack deployment? Thanks in advance *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nasaito at nec.com Fri Jan 24 02:59:11 2020 From: nasaito at nec.com (=?utf-8?B?U0FJVE8gTkFPS0ko6b2K6Jek44CA55u05qi5KQ==?=) Date: Fri, 24 Jan 2020 02:59:11 +0000 Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... Message-ID: Hi Jay, NEC Cinder CI was configured with zuul v2.5 and Jenkins, and has not been working since "Drop Xenial support". We are fixing our CI with zuul v3 using Software Factory mentioned in the last Cinder Ussuri Virtual Mid-Cycle meeting [1]. We installed Software Factory referring to guide [2], and are investigating how we can monitor openstack/cinder events and create cinder test jobs. It will be helpful if cinder-specific examples are available. [1] https://etherpad.openstack.org/p/cinder-ussuri-mid-cycle-planning [2] https://softwarefactory-project.io/r/#/c/17097/ Thanks, Naoki Saito E-Mail: nasaito at nec.com > From: Jay Bryant > Sent: Thursday, January 23, 2020 4:51 AM > To: openstack-discuss at lists.openstack.org; inspur.ci at inspur.com; wangyong2017 at inspur.com; Chengwei.Chou at infortrend.com; Bill.Sung at infortrend.com; Kuirong.Chen(陳奎融) ; ido.benda at kaminario.com; srinivasd.ctr at kaminario.com; nec-cinder-ci at istorage.jp.nec.com; silvan at quobyte.com; robert at quobyte.com; felix at quobyte.com; bjoern at quobyte.com; OpenStack Development ; Shlomi Avihou | Zadara ; msdu-openstack at groups.ext.hpe.com > Subject: [nec-cinder-ci:76204] [cinder][ci] Cinder drivers being Unsupported and General CI Status ... > > All, > We once again are at the point in the release where we are talking about 3rd Party CI and what is going on for Cinder. At the moment I have analyzed drivers that have not successfully reported results on a Cinder patch in 30 or more days and have put together the following list of drivers to be unsupported in the Ussuri release: > • Inspur Drivers > • Infortrend > • Kaminario > • NEC > • Quobyte > • Zadara > • HPE Drivers > If your name is in the list above you are receiving this e-mail directly, not just through the mailing list. > If you are working on resolving CI issues please let me know so we can discuss how to proceed. > In addition to the fact that we will be pushing up unsupported patches for the drivers above, we have already unsupported and removed a number of drivers during this release. They are as follows: > • Unsupported: > o MacroSAN Driver > • Removed: > o ProphetStor Driver > o Nimble Storage Driver > o Veritas Access Driver > o Veritas CNFS Driver > o Virtuozzo Storage Driver > o Huawei FusionStorage Driver > o Sheepdog Storage Driver > Obviously we are reaching the point that the number of drivers leaving the community is concerning and it has sparked discussions around the fact that maybe our 3rd Party CI approach isn't working as intended. So what do we do? Just mark drivers unsupported and no longer remove drivers? Do we restore drivers that have recently been removed? > We are planning to have further discussion around these questions at our next Cinder meeting in #openstack-meeting-4 on Wednesday, 1/29/20 at 14:00 UTC. If you have thoughts or strong opinions around this topic please join us. > Thank you! > Jay Bryant > mailto:jsbryant at electronicjungle.net > IRC: jungleboyj From radoslaw.piliszek at gmail.com Fri Jan 24 08:07:48 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 24 Jan 2020 09:07:48 +0100 Subject: Backup and restore questions In-Reply-To: References: Message-ID: Tony wrote: > Is it possible to ... 
backup Openstack with version "X" and restore > to different fresh deployment of the same version Yes, this is definitely possible. I know some users of Kolla-Ansible migrated their manual and other-deployment-tool-deployed OpenStack-based clouds to use Kolla-Ansible. This is certainly possible using the same version of services. All the data is captured in the sql database (usually mysql). Apart from custom config, this is the most important part to keep backed up (and obviously your users' data in store :-) ). Please read the following great (opinion mine) blog post by Pierre Riteau from StackHPC, that nicely summarizes steps required for such migration: https://www.stackhpc.com/migrating-to-kolla.html -yoctozepto From ralonsoh at redhat.com Fri Jan 24 08:53:03 2020 From: ralonsoh at redhat.com (Rodolfo Alonso) Date: Fri, 24 Jan 2020 08:53:03 +0000 Subject: [neutron][drivers team] Proposing Nate Johnston as drivers team member In-Reply-To: References: <20200122215847.zsfgihjmmjkwlrdl@skaplons-mac> Message-ID: <2dcc7f379289305dcd457ea219f52d3b85d45488.camel@redhat.com> +1 for Nate, a great addition. On Wed, 2020-01-22 at 18:02 -0500, Brian Haley wrote: > Nate would be a great addition to the drivers team, big +1 from me :) > > On 1/22/20 4:58 PM, Slawek Kaplonski wrote: > > Hi Neutrinos, > > > > I would like to propose Nate Johnston to be part of Neutron drivers team. > > Since long time Nate is very active Neutron's core reviewer. He is also actively > > participating in our Neutron drivers team meetings and he shown there that he > > has big experience and knowledge about Neutron, Neutron stadium projects as well > > as whole OpenStack. > > I think that he really deservers to be part of this team and that he will be > > great addition it. > > I will wait for Your feedback for 1 week and if there will be no any votes > > agains, I will add Nate to drivers team in next week. > > From tony.pearce at cinglevue.com Fri Jan 24 08:55:57 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Fri, 24 Jan 2020 16:55:57 +0800 Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... In-Reply-To: References: Message-ID: Hi all, a suggestion for the drivers which are not being supported from the vendor and as such are being removed from Openstack. Could the removed drivers be maintained in Openstack's git? And make them available on the marketplace as an optional download for users that require it? Regards *Tony Pearce* On Fri, 24 Jan 2020 at 16:05, SAITO NAOKI(齊藤 直樹) wrote: > Hi Jay, > > NEC Cinder CI was configured with zuul v2.5 and Jenkins, and has not > been working since "Drop Xenial support". > We are fixing our CI with zuul v3 using Software Factory mentioned in > the last Cinder Ussuri Virtual Mid-Cycle meeting [1]. > > We installed Software Factory referring to guide [2], > and are investigating how we can monitor openstack/cinder events and > create cinder test jobs. > It will be helpful if cinder-specific examples are available. 
> > [1] https://etherpad.openstack.org/p/cinder-ussuri-mid-cycle-planning > [2] https://softwarefactory-project.io/r/#/c/17097/ > > Thanks, > > Naoki Saito > E-Mail: nasaito at nec.com > > > From: Jay Bryant > > Sent: Thursday, January 23, 2020 4:51 AM > > To: openstack-discuss at lists.openstack.org; inspur.ci at inspur.com; > wangyong2017 at inspur.com; Chengwei.Chou at infortrend.com; Bill.Sung at > infortrend.com; Kuirong.Chen(陳奎融) ; > ido.benda at kaminario.com; srinivasd.ctr at kaminario.com; nec-cinder-ci > at istorage.jp.nec.com; silvan at quobyte.com; robert at quobyte.com; > felix at quobyte.com; bjoern at quobyte.com; OpenStack Development > ; Shlomi Avihou | Zadara zadarastorage.com>; msdu-openstack at groups.ext.hpe.com > > Subject: [nec-cinder-ci:76204] [cinder][ci] Cinder drivers being > Unsupported and General CI Status ... > > > > All, > > We once again are at the point in the release where we are talking about > 3rd Party CI and what is going on for Cinder. At the moment I have > analyzed drivers that have not successfully reported results on a Cinder > patch in 30 or more days and have put together the following list of > drivers to be unsupported in the Ussuri release: > > • Inspur Drivers > > • Infortrend > > • Kaminario > > • NEC > > • Quobyte > > • Zadara > > • HPE Drivers > > If your name is in the list above you are receiving this e-mail > directly, not just through the mailing list. > > If you are working on resolving CI issues please let me know so we can > discuss how to proceed. > > In addition to the fact that we will be pushing up unsupported patches > for the drivers above, we have already unsupported and removed a number of > drivers during this release. They are as follows: > > • Unsupported: > > o MacroSAN Driver > > • Removed: > > o ProphetStor Driver > > o Nimble Storage Driver > > o Veritas Access Driver > > o Veritas CNFS Driver > > o Virtuozzo Storage Driver > > o Huawei FusionStorage Driver > > o Sheepdog Storage Driver > > Obviously we are reaching the point that the number of drivers leaving > the community is concerning and it has sparked discussions around the fact > that maybe our 3rd Party CI approach isn't working as intended. So what do > we do? Just mark drivers unsupported and no longer remove drivers? Do we > restore drivers that have recently been removed? > > We are planning to have further discussion around these questions at our > next Cinder meeting in #openstack-meeting-4 on Wednesday, 1/29/20 at 14:00 > UTC. If you have thoughts or strong opinions around this topic please join > us. > > Thank you! > > Jay Bryant > > mailto:jsbryant at electronicjungle.net > > IRC: jungleboyj > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri Jan 24 09:21:11 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 24 Jan 2020 09:21:11 +0000 Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... In-Reply-To: References: Message-ID: <20200124092110.fqio3gvxzuiq7lxk@yuggoth.org> On 2020-01-24 16:55:57 +0800 (+0800), Tony Pearce wrote: > Hi all, a suggestion for the drivers which are not being supported > from the vendor and as such are being removed from Openstack. > Could the removed drivers be maintained in Openstack's git? Looks like Huawei already does this for their Fusioncompute driver: https://opendev.org/x/cinder-fusioncompute > And make them available on the marketplace as an optional download > for users that require it? [...] 
Yeah, I don't see it in the marketplace's Cinder drivers list. We'd also need some way to indicate that the OpenStack project can't vouch for the quality or even viability of such drivers. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From thierry at openstack.org Fri Jan 24 09:28:21 2020 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 24 Jan 2020 10:28:21 +0100 Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... In-Reply-To: References: Message-ID: <99e9171c-46e6-23d3-04aa-794b7bf39f0d@openstack.org> Tony Pearce wrote: > Hi all, a suggestion for the drivers which are not being supported from > the vendor and as such are being removed from Openstack. Could the > removed drivers be maintained in Openstack's git? And make them > available on the marketplace as an optional download for users that > require it? I still hope those vendors will step up and fix the continuous integration testing in their drivers so that they can stay in the mainline cinder repository. That would be the best outcome. But if they don't, they could totally continue to be maintained on our infrastructure, using a separate organization/namespace to make it clear they are not maintained/supported by the "OpenStack" project. -- Thierry Carrez (ttx) From chacon.piza at gmail.com Fri Jan 24 10:19:15 2020 From: chacon.piza at gmail.com (Martin Chacon Piza) Date: Fri, 24 Jan 2020 11:19:15 +0100 Subject: [Devstack] Unable to find pip2.7 ? Message-ID: Dear all, It seems that there is a problem stacking Devstack using the master branch. The problem happens during the step: Installing package prerequisites [ERROR] /home/vagrant/devstack/inc/python:41 Unable to find pip2.7; cannot continue This is the minimum local.conf I used: [[local|localrc]] SERVICE_HOST=192.168.10.6 HOST_IP=192.168.10.6 HOST_IP_IFACE=eth1 DATABASE_PASSWORD=secretdatabase RABBIT_PASSWORD=secretrabbit ADMIN_PASSWORD=secretadmin SERVICE_PASSWORD=secretservice LOGFILE=$DEST/logs/stack.sh.log LOGDIR=$DEST/logs LOG_COLOR=False DEST=/opt/stack USE_PYTHON3=True As a workaround I need to revert the following change, then it ends up properly: commit 279a7589b03db69fd1b85d947cd0171dacef94ee Author: Jens Harbott (frickler) Date: Mon Apr 16 12:08:30 2018 +0000 Revert "Do not use pip 10 or higher" This reverts commit f99d1771ba1882dfbb69186212a197edae3ef02c. Added workarounds that might want to get split into their own patch before merging: - Don't install python-psutil - Don't run peakmem_tracker Change-Id: If4fb16555e15082a4d97cffdf3cfa608a682997d Any hints? -- Best regards, *Martín Chacón Pizá* *chacon.piza at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Fri Jan 24 10:45:24 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 24 Jan 2020 11:45:24 +0100 Subject: [Devstack] Unable to find pip2.7 ? In-Reply-To: References: Message-ID: This part of that change broke it: https://review.opendev.org/#/c/561597/21/tools/install_pip.sh since pip2 is no longer being installed when py3 is used. CI obviously hid the issue by having pip preinstalled. We should not really need pip2 now. -yoctozepto pt., 24 sty 2020 o 11:27 Martin Chacon Piza napisał(a): > > Dear all, > > It seems that there is a problem stacking Devstack using the master branch. 
> The problem happens during the step: Installing package prerequisites > [ERROR] /home/vagrant/devstack/inc/python:41 Unable to find pip2.7; cannot continue > > > This is the minimum local.conf I used: > > [[local|localrc]] > SERVICE_HOST=192.168.10.6 > HOST_IP=192.168.10.6 > HOST_IP_IFACE=eth1 > DATABASE_PASSWORD=secretdatabase > RABBIT_PASSWORD=secretrabbit > ADMIN_PASSWORD=secretadmin > SERVICE_PASSWORD=secretservice > LOGFILE=$DEST/logs/stack.sh.log > LOGDIR=$DEST/logs > LOG_COLOR=False > DEST=/opt/stack > USE_PYTHON3=True > > As a workaround I need to revert the following change, then it ends up properly: > > commit 279a7589b03db69fd1b85d947cd0171dacef94ee > Author: Jens Harbott (frickler) > Date: Mon Apr 16 12:08:30 2018 +0000 > > Revert "Do not use pip 10 or higher" > > This reverts commit f99d1771ba1882dfbb69186212a197edae3ef02c. > > Added workarounds that might want to get split into their own patch > before merging: > > - Don't install python-psutil > - Don't run peakmem_tracker > > Change-Id: If4fb16555e15082a4d97cffdf3cfa608a682997d > > > Any hints? > > -- > Best regards, > Martín Chacón Pizá > chacon.piza at gmail.com From radoslaw.piliszek at gmail.com Fri Jan 24 11:03:31 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 24 Jan 2020 12:03:31 +0100 Subject: [Devstack] Unable to find pip2.7 ? In-Reply-To: References: Message-ID: Proposed: https://review.opendev.org/704136 Please let us know if it helps there. -yoctozepto pt., 24 sty 2020 o 11:45 Radosław Piliszek napisał(a): > > This part of that change broke it: > https://review.opendev.org/#/c/561597/21/tools/install_pip.sh > since pip2 is no longer being installed when py3 is used. > CI obviously hid the issue by having pip preinstalled. > We should not really need pip2 now. > > -yoctozepto > > pt., 24 sty 2020 o 11:27 Martin Chacon Piza napisał(a): > > > > Dear all, > > > > It seems that there is a problem stacking Devstack using the master branch. > > The problem happens during the step: Installing package prerequisites > > [ERROR] /home/vagrant/devstack/inc/python:41 Unable to find pip2.7; cannot continue > > > > > > This is the minimum local.conf I used: > > > > [[local|localrc]] > > SERVICE_HOST=192.168.10.6 > > HOST_IP=192.168.10.6 > > HOST_IP_IFACE=eth1 > > DATABASE_PASSWORD=secretdatabase > > RABBIT_PASSWORD=secretrabbit > > ADMIN_PASSWORD=secretadmin > > SERVICE_PASSWORD=secretservice > > LOGFILE=$DEST/logs/stack.sh.log > > LOGDIR=$DEST/logs > > LOG_COLOR=False > > DEST=/opt/stack > > USE_PYTHON3=True > > > > As a workaround I need to revert the following change, then it ends up properly: > > > > commit 279a7589b03db69fd1b85d947cd0171dacef94ee > > Author: Jens Harbott (frickler) > > Date: Mon Apr 16 12:08:30 2018 +0000 > > > > Revert "Do not use pip 10 or higher" > > > > This reverts commit f99d1771ba1882dfbb69186212a197edae3ef02c. > > > > Added workarounds that might want to get split into their own patch > > before merging: > > > > - Don't install python-psutil > > - Don't run peakmem_tracker > > > > Change-Id: If4fb16555e15082a4d97cffdf3cfa608a682997d > > > > > > Any hints? 
> > > > -- > > Best regards, > > Martín Chacón Pizá > > chacon.piza at gmail.com From pierre at stackhpc.com Fri Jan 24 11:20:28 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Fri, 24 Jan 2020 12:20:28 +0100 Subject: [Telemetry][TC]Telemetry status In-Reply-To: References: <35341fa4-3db9-0807-1136-9f6dff41b2c2@openstack.org> Message-ID: On Wed, 18 Dec 2019 at 21:33, Lingxian Kong wrote: > > Gnocchi doc website is broken and no one is available to fix. No one > maintains Gnocchi CI. I have just noticed that the Gnocchi website is published again at a new address: http://gnocchi.osci.io See https://github.com/gnocchixyz/gnocchi/pull/1053 for the change. From elmiko at redhat.com Fri Jan 24 14:48:18 2020 From: elmiko at redhat.com (Michael McCune) Date: Fri, 24 Jan 2020 09:48:18 -0500 Subject: quick question In-Reply-To: References: Message-ID: On Thu, Jan 23, 2020 at 3:54 PM Ruchi Rajasekhar wrote: > Would anyone happen to know of any data science platforms that can run on > OpenStack? I was looking at Pivotal, Pachyderm but they don't run on > OpenStack ☹ > if you don't mind about adding another layer, i know that several data science platforms are creating kubernetes tooling. it might be worth investigating using one of the kubernetes on openstack deployment options (magnum, maybe others?) and then layering a data science platform on top of that. good luck! peace o/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at stackhpc.com Fri Jan 24 17:22:36 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Fri, 24 Jan 2020 18:22:36 +0100 Subject: [ceilometer] Stable releases Message-ID: Hello, Is it planned to release new Ceilometer tarballs for rocky and stein? There are several bug fixes (for example [1]) in stable branches that would be useful to include in a release. If t can help, I can submit patches to the releases repository. Thanks, Pierre Riteau (priteau) [1] https://bugs.launchpad.net/ceilometer/+bug/1801348 From Tim.Bell at cern.ch Fri Jan 24 18:07:54 2020 From: Tim.Bell at cern.ch (Tim Bell) Date: Fri, 24 Jan 2020 18:07:54 +0000 Subject: quick question In-Reply-To: References: Message-ID: Which data science platforms are you considering ? We may run some of them at CERN, we generally use Kubernetes (via Magnum) are the underlying provisioning engine with autoscaling up/down now available in Train. Our SPARK environments are provisioned likewise. Tim On 24 Jan 2020, at 15:48, Michael McCune > wrote: On Thu, Jan 23, 2020 at 3:54 PM Ruchi Rajasekhar > wrote: Would anyone happen to know of any data science platforms that can run on OpenStack? I was looking at Pivotal, Pachyderm but they don't run on OpenStack ☹ if you don't mind about adding another layer, i know that several data science platforms are creating kubernetes tooling. it might be worth investigating using one of the kubernetes on openstack deployment options (magnum, maybe others?) and then layering a data science platform on top of that. good luck! peace o/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Fri Jan 24 18:20:01 2020 From: whayutin at redhat.com (Wesley Hayutin) Date: Fri, 24 Jan 2020 11:20:01 -0700 Subject: [tripleo] missing centos-8 rpms for kolla builds Message-ID: Greetings, I know the ceph repo is in progress. TripleO / RDO is not releasing opendaylight Can the RDO team comment on the rest of the missing packages here please? Thank you!! 
https://review.opendev.org/#/c/699414/9/kolla/image/build.py NOTE(mgoddard): Mark images with missing dependencies as unbuildable for # CentOS 8. 'centos8': { "barbican-api", # Missing uwsgi-plugin-python3 "ceph-base", # Missing Ceph repo "cinder-base", # Missing Ceph repo "collectd", # Missing collectd-ping and # collectd-sensubility packages "elasticsearch", # Missing elasticsearch repo "etcd", # Missing etcd package "fluentd", # Missing td-agent repo "glance-base", # Missing Ceph repo "gnocchi-base", # Missing Ceph repo "hacluster-base", # Missing hacluster repo "ironic-conductor", # Missing shellinabox package "kibana", # Missing elasticsearch repo "manila-share", # Missing Ceph repo "mongodb", # Missing mongodb and mongodb-server packages "monasca-grafana", # Using python2 "nova-compute", # Missing Ceph repo "nova-libvirt", # Missing Ceph repo "nova-spicehtml5proxy", # Missing spicehtml5 package "opendaylight", # Missing opendaylight repo "ovsdpdk", # Not supported on CentOS "sensu-base", # Missing sensu package "tgtd", # Not supported on CentOS 8 }, 'centos8+source': { "barbican-base", # Missing uwsgi-plugin-python3 "bifrost-base", # Bifrost does not support CentOS 8 "cyborg-agent", # opae-sdk does not support CentOS 8 "freezer-base", # Missing package trickle "masakari-monitors", # Missing hacluster repo "zun-compute", # Missing Ceph repo -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrunge at matthias-runge.de Fri Jan 24 18:53:19 2020 From: mrunge at matthias-runge.de (Matthias Runge) Date: Fri, 24 Jan 2020 19:53:19 +0100 Subject: [tripleo] missing centos-8 rpms for kolla builds In-Reply-To: References: Message-ID: <46556190-e3f5-f452-9af8-12b06f070901@matthias-runge.de> On 24/01/2020 19:20, Wesley Hayutin wrote: > Greetings, > > I know the ceph repo is in progress. > TripleO / RDO is not releasing opendaylight  > > Can the RDO team comment on the rest of the missing packages here please? > > Thank you!! > > https://review.opendev.org/#/c/699414/9/kolla/image/build.py > >  NOTE(mgoddard): Mark images with missing dependencies as unbuildable for >     # CentOS 8. >     'centos8': { >         "barbican-api",          # Missing uwsgi-plugin-python3 >         "ceph-base",             # Missing Ceph repo >         "cinder-base",           # Missing Ceph repo >         "collectd",              # Missing collectd-ping and >                                  # collectd-sensubility packages I can comment on collectd here. It should be possible to build packages on the CentOS build system for CentOS 8 right now or the near future. I am planning especially to do the collectd packages in CentOS opstools again, which would also bring back the amqp1 plugin (and a few others). We used to build fluentd in the past for CentOS 7, but I do have no plans for this on CentOS 8. Matthias From amoralej at redhat.com Fri Jan 24 19:20:37 2020 From: amoralej at redhat.com (Alfredo Moralejo Alonso) Date: Fri, 24 Jan 2020 20:20:37 +0100 Subject: [rdo-dev] [tripleo] missing centos-8 rpms for kolla builds In-Reply-To: References: Message-ID: Hi, We were given access to CBS to build centos8 dependencies a couple of days ago and we are still in the process of re-bootstraping it. I hope we'll have all that is missing in the next days. See my comments below. Best regards, Alfredo On Fri, Jan 24, 2020 at 7:21 PM Wesley Hayutin wrote: > Greetings, > > I know the ceph repo is in progress. 
> TripleO / RDO is not releasing opendaylight > > Can the RDO team comment on the rest of the missing packages here please? > > Thank you!! > > https://review.opendev.org/#/c/699414/9/kolla/image/build.py > > NOTE(mgoddard): Mark images with missing dependencies as unbuildable for > # CentOS 8. > 'centos8': { > "barbican-api", # Missing uwsgi-plugin-python3 > We'll take care of uwsgi. > "ceph-base", # Missing Ceph repo > "cinder-base", # Missing Ceph repo > "collectd", # Missing collectd-ping and > # collectd-sensubility packages > About collectd and sensu, Matthias already replied from OpsTools side > "elasticsearch", # Missing elasticsearch repo > "etcd", # Missing etcd package > Given that etcd is not longer in CentOS base (it was in 7), I guess we'll take care of etcd unless some other sig is building it as part of k8s family. > "fluentd", # Missing td-agent repo > See Matthias reply. > "glance-base", # Missing Ceph repo > "gnocchi-base", # Missing Ceph repo > "hacluster-base", # Missing hacluster repo > That's an alternative repo for HA related packages for CentOS: http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/ Which still does not provide packages for centos8. Note that centos8.1 includes pacemaker, corosync and pcs in HighAvailability repo. Maybe it could be used instead of the current one. > "ironic-conductor", # Missing shellinabox package > shellinabox is epel. It was never used in tripleo containers, it's really required? > "kibana", # Missing elasticsearch repo > We never provided elasticsearch in the past, is consumed from elasticsearch repo iirc > "manila-share", # Missing Ceph repo > "mongodb", # Missing mongodb and mongodb-server > packages > Mongodb was retired from RDO time ago as it was not longer the recommended backend for any service. In CentOS7 is pulled from EPEL. > "monasca-grafana", # Using python2 > "nova-compute", # Missing Ceph repo > "nova-libvirt", # Missing Ceph repo > "nova-spicehtml5proxy", # Missing spicehtml5 package > spice-html5 is pulled from epel7 was never part of RDO. Not used in TripleO. > "opendaylight", # Missing opendaylight repo > "ovsdpdk", # Not supported on CentOS > "sensu-base", # Missing sensu package > See Matthias reply. > "tgtd", # Not supported on CentOS 8 > tgtd was replace by scsi-target-utils. It's was never provided in RDO, in kolla was pulled from epel for 7 > }, > > 'centos8+source': { > "barbican-base", # Missing uwsgi-plugin-python3 > "bifrost-base", # Bifrost does not support CentOS 8 > "cyborg-agent", # opae-sdk does not support CentOS 8 > "freezer-base", # Missing package trickle > "masakari-monitors", # Missing hacluster repo > "zun-compute", # Missing Ceph repo > _______________________________________________ > dev mailing list > dev at lists.rdoproject.org > http://lists.rdoproject.org/mailman/listinfo/dev > > To unsubscribe: dev-unsubscribe at lists.rdoproject.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim at swiftstack.com Fri Jan 24 21:23:56 2020 From: tim at swiftstack.com (Tim Burke) Date: Fri, 24 Jan 2020 13:23:56 -0800 Subject: [swift] Adding disks - one by one or all lightly weighted? 
In-Reply-To: <30bf2fd4-bedb-e73c-874f-d4a3efc68b13@catalyst.net.nz> References: <30bf2fd4-bedb-e73c-874f-d4a3efc68b13@catalyst.net.nz> Message-ID: <0dd0dedde4ba0017c5fb3ee0b07308046ec397a6.camel@swiftstack.com> On Thu, 2020-01-23 at 12:34 +1300, Mark Kirkwood wrote: > Hi, > > We are wanting to increase the number of disks in each of our > storage > nodes - from 4 to 12. > > I'm wondering whether it is better to: > > 1/ Add 1st new disk (with a reduced weight)...increase the weight > until > full, then repeat for next disk etc > > 2/ Add 'em all with a (much i.e 1/8 of that in 1/ ) reduced > weight...increase the weights until done > > Thoughts? > > regards > > Mark > > Hi Mark, I'd go with option 2 -- the quicker you can get all of the new disks helping with load, the better. Gradual weight adjustments seem like a good idea; they should help keep your replication traffic reasonable. Note that as long as you're waiting a full replication cycle between rebalances, though, swift should only be moving a single replica at a time, even if you added the new devices at full weight. Of course, tripling capacity like this (assuming that the new disks are the same size as the existing ones) tends to take a while. You should probably familiarize yourself with the emergency replication options and consider enabling some of them until your rings reflect the new topology; see * https://github.com/openstack/swift/blob/2.23.0/etc/object-server.conf-sample#L290-L298 * https://github.com/openstack/swift/blob/2.23.0/etc/object-server.conf-sample#L300-L307 and * https://github.com/openstack/swift/blob/2.23.0/etc/object-server.conf-sample#L353-L364 These can be really useful to speed up rebalances, though swift's durability guarantees take a bit of a hit -- so turn them back off once you've had a cycle or two with the drives at full weight! If the existing drives are full or nearly so (which IME tends to be the case when there's a large capacity increase), those may be necessary to get the system back to a state where it can make good progress. Good luck! Tim From bharat at stackhpc.com Fri Jan 24 21:31:45 2020 From: bharat at stackhpc.com (Bharat Kunwar) Date: Fri, 24 Jan 2020 21:31:45 +0000 Subject: quick question In-Reply-To: References: Message-ID: <31328A6D-077E-40F5-BC81-551A32246CA4@stackhpc.com> At StackHPC, we have deployed Pangeo, JupyterHub and Kubeflow on K8s cluster deployed using Magnum. Kubeflow comprises of a lot of things so it is likely there is something in there for those doing data science stuff. Best Bharat > On 24 Jan 2020, at 18:09, Tim Bell wrote: > >  Which data science platforms are you considering ? > > We may run some of them at CERN, we generally use Kubernetes (via Magnum) are the underlying provisioning engine with autoscaling up/down now available in Train. Our SPARK environments are provisioned likewise. > > Tim > >> On 24 Jan 2020, at 15:48, Michael McCune wrote: >> >> >> >>> On Thu, Jan 23, 2020 at 3:54 PM Ruchi Rajasekhar wrote: >>> Would anyone happen to know of any data science platforms that can run on OpenStack? I was looking at Pivotal, Pachyderm but they don't run on OpenStack ☹ >>> >> >> if you don't mind about adding another layer, i know that several data science platforms are creating kubernetes tooling. it might be worth investigating using one of the kubernetes on openstack deployment options (magnum, maybe others?) and then layering a data science platform on top of that. >> >> good luck! 
>> >> peace o/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcin.juszkiewicz at linaro.org Fri Jan 24 21:48:05 2020 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Fri, 24 Jan 2020 22:48:05 +0100 Subject: [rdo-dev] [tripleo] missing centos-8 rpms for kolla builds In-Reply-To: References: Message-ID: <86b5b5b7-8f0c-9bc7-6275-cce1c353cd48@linaro.org> W dniu 24.01.2020 o 20:20, Alfredo Moralejo Alonso pisze: > We were given access to CBS to build centos8 dependencies a > couple of days ago and we are still in the process of > re-bootstraping it. I hope we'll have all that is missing in > the next days. Good to hear that. >> https://review.opendev.org/#/c/699414/9/kolla/image/build.py >> >> NOTE(mgoddard): Mark images with missing dependencies as >> unbuildable for CentOS 8. >> 'centos8': { >> "elasticsearch", # Missing elasticsearch Out of CentOS repo. >> "hacluster-base", # Missing hacluster repo > Note that centos8.1 includes pacemaker, corosync and pcs in > HighAvailability repo. Maybe it could be used instead of the > current one. https://review.opendev.org/702706 enables this repo. One image is still disabled due to lack of 'crmsh'. >> "mongodb", # Missing mongodb and >> mongodb-server packages > Mongodb was retired from RDO time ago as it was not longer > the recommended backend for any service. In CentOS7 is pulled > from EPEL. I think we need to deprecate it in Ussuri and remove in V cycle. >> "monasca-grafana", # Using python2 This is nightmare image. >> "tgtd", # Not supported on CentOS 8 >> > > tgtd was replace by scsi-target-utils. It's was never > provided in RDO, in kolla was pulled from epel for 7 https://review.opendev.org/#/c/613815/15 (merged) took care of it. From anlin.kong at gmail.com Sat Jan 25 03:21:30 2020 From: anlin.kong at gmail.com (Lingxian Kong) Date: Sat, 25 Jan 2020 16:21:30 +1300 Subject: [ceilometer] Stable releases In-Reply-To: References: Message-ID: Hi Pierre, please feel free to propose a patch in openstack release repo - Best regards, Lingxian Kong Catalyst Cloud On Sat, Jan 25, 2020 at 6:28 AM Pierre Riteau wrote: > Hello, > > Is it planned to release new Ceilometer tarballs for rocky and stein? > There are several bug fixes (for example [1]) in stable branches that > would be useful to include in a release. > > If t can help, I can submit patches to the releases repository. > > Thanks, > Pierre Riteau (priteau) > > [1] https://bugs.launchpad.net/ceilometer/+bug/1801348 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Sun Jan 26 20:17:58 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sun, 26 Jan 2020 14:17:58 -0600 Subject: [goals][Drop Python 2.7 Support] Week R-16 Update (# 3 weeks left to complete) Message-ID: <16fe380f991.ada7872440986.5601891717715463907@ghanshyammann.com> Hello Everyone, Below is the progress on "Drop Python 2.7 Support" for R-16 week. Schedule: https://governance.openstack.org/tc/goals/selected/ussuri/drop-py27.html#schedule Highlights: ======== * Schedule is to finish the py2-drop from every deliverable (except requirement repo) by m-2 on R13. Only 3 weeks left to finish the work so I am going to send the status twice a week from now. * Every project has to merge their py2 drop patches on priority. We are/will face more gate issue on py2 things if the transition period is longer. 
* I have pushed the patches on tempest-plugins and wherever applicable jobs defined in plugin repo have been modified to run on py2 for stable branches and on py3 on the master gate. * Fixing neutron-tempest-plugin failure on stable/rocky by running Tempest on >py3.6 venv[1]. Project wise status and need reviews: ============================ Phase-1 status: The OpenStack services have not merged the py2 drop patches: NOTE: This was supposed to be completed by milestone-1 (Dec 13th, 19). * Adjutant * ec2-api * Karbor * Sahara (its plugins) * Masakari * Tacker * Qinling * Tricircle Phase-2 status: This is ongoing work and I think most of the repo have patches up to review. I will start tracking the progress for unmerged patches form next week. Each project is requested to merge the open patches asap. * Open review: https://review.opendev.org/#/q/topic:drop-py27-support+status:open How you can help: ============== - Review the patches. Push the patches if I missed any repo. [1] https://bugs.launchpad.net/neutron/+bug/1860033 -gmann From openstack at nemebean.com Sun Jan 26 23:59:38 2020 From: openstack at nemebean.com (Ben Nemec) Date: Sun, 26 Jan 2020 17:59:38 -0600 Subject: [oslo] AFK this week Message-ID: <5f96427b-1655-c847-cb84-6b4a1aaafd06@nemebean.com> I am going to be out most or all of this week. If someone wants to run the meeting on Monday then feel free to have it without me again. I may not be back right away next week either. A lot of stuff is still to be determined at this point. -Ben From amoralej at redhat.com Mon Jan 27 08:48:57 2020 From: amoralej at redhat.com (Alfredo Moralejo Alonso) Date: Mon, 27 Jan 2020 09:48:57 +0100 Subject: [rdo-dev] [tripleo] missing centos-8 rpms for kolla builds In-Reply-To: <86b5b5b7-8f0c-9bc7-6275-cce1c353cd48@linaro.org> References: <86b5b5b7-8f0c-9bc7-6275-cce1c353cd48@linaro.org> Message-ID: On Fri, Jan 24, 2020 at 10:48 PM Marcin Juszkiewicz < marcin.juszkiewicz at linaro.org> wrote: > W dniu 24.01.2020 o 20:20, Alfredo Moralejo Alonso pisze: > > We were given access to CBS to build centos8 dependencies a > > couple of days ago and we are still in the process of > > re-bootstraping it. I hope we'll have all that is missing in > > the next days. > > Good to hear that. > > >> https://review.opendev.org/#/c/699414/9/kolla/image/build.py > >> > >> NOTE(mgoddard): Mark images with missing dependencies as > >> unbuildable for CentOS 8. > >> 'centos8': { > >> "elasticsearch", # Missing elasticsearch > > Out of CentOS repo. > > >> "hacluster-base", # Missing hacluster repo > > > Note that centos8.1 includes pacemaker, corosync and pcs in > > HighAvailability repo. Maybe it could be used instead of the > current > one. > > https://review.opendev.org/702706 enables this repo. One image is still > disabled due to lack of 'crmsh'. > How is crmsh used in these images?, ha packages included in HighAvailability repo in CentOS includes pcs and some crm_* commands in pcs and pacemaker-cli packages. IMO, tt'd be good to switch to those commands to manage the cluster. > > >> "mongodb", # Missing mongodb and > >> mongodb-server packages > > > Mongodb was retired from RDO time ago as it was not longer > > the recommended backend for any service. In CentOS7 is pulled > > from EPEL. > > I think we need to deprecate it in Ussuri and remove in V cycle. > > >> "monasca-grafana", # Using python2 > > This is nightmare image. > > >> "tgtd", # Not supported on CentOS 8 > >> > > > > tgtd was replace by scsi-target-utils. 
It's was never > > provided in RDO, in kolla was pulled from epel for 7 > > https://review.opendev.org/#/c/613815/15 (merged) took care of it. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcin.juszkiewicz at linaro.org Mon Jan 27 09:04:35 2020 From: marcin.juszkiewicz at linaro.org (Marcin Juszkiewicz) Date: Mon, 27 Jan 2020 10:04:35 +0100 Subject: [rdo-dev] [tripleo] missing centos-8 rpms for kolla builds In-Reply-To: References: <86b5b5b7-8f0c-9bc7-6275-cce1c353cd48@linaro.org> Message-ID: <449b1a03-2066-bea1-0a53-91dc59a3d58c@linaro.org> W dniu 27.01.2020 o 09:48, Alfredo Moralejo Alonso pisze: > How is crmsh used in these images?, ha packages included in > HighAvailability repo in CentOS includes pcs and some crm_* commands in pcs > and pacemaker-cli packages. IMO, tt'd be good to switch to those commands > to manage the cluster. No idea. Gaëtan Trellu may know - he created those images. From radoslaw.piliszek at gmail.com Mon Jan 27 09:17:38 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Mon, 27 Jan 2020 10:17:38 +0100 Subject: [rdo-dev] [tripleo] missing centos-8 rpms for kolla builds In-Reply-To: <449b1a03-2066-bea1-0a53-91dc59a3d58c@linaro.org> References: <86b5b5b7-8f0c-9bc7-6275-cce1c353cd48@linaro.org> <449b1a03-2066-bea1-0a53-91dc59a3d58c@linaro.org> Message-ID: I know it was for masakari. Gaëtan had to grab crmsh from opensuse: http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/ -yoctozepto pon., 27 sty 2020 o 10:13 Marcin Juszkiewicz napisał(a): > > W dniu 27.01.2020 o 09:48, Alfredo Moralejo Alonso pisze: > > How is crmsh used in these images?, ha packages included in > > HighAvailability repo in CentOS includes pcs and some crm_* commands in pcs > > and pacemaker-cli packages. IMO, tt'd be good to switch to those commands > > to manage the cluster. > > No idea. Gaëtan Trellu may know - he created those images. > From katonalala at gmail.com Mon Jan 27 10:02:18 2020 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 27 Jan 2020 11:02:18 +0100 Subject: [neutron] Bug deputy report for week of January 20th Message-ID: Hi, I was Neutron bug deputy last week, see my short summary of last week's bugs. Summary: A lot of not serious CI related issue, most of them in progress or even already fixed. As a consequence of networking-ovn code moving old bugs from https://bugs.launchpad.net/networking-ovn were moved to neutron bugs repository in launchpad. 
- High - https://bugs.launchpad.net/neutron/+bug/1860612 (OVN agent devstack script doesn't support IPv6) - In Progress - Medium - https://bugs.launchpad.net/neutron/+bug/1860209 (Local variable referenced before assignment in "_router_fip_qos_after_admin_state_down_up") - Fix released - https://bugs.launchpad.net/neutron/+bug/1860332 ([CI] gate py3{6,7} jobs timeout ) - Fix Released - https://bugs.launchpad.net/neutron/+bug/1860326 (Kill neutron-keepalived-state-change-monitor fails) - Fix Released - https://bugs.launchpad.net/neutron/+bug/1860488 (Encapsulated DHCP options are not supported in neutron) - In Progress - https://bugs.launchpad.net/neutron/+bug/1860560 ([ovn] lsp_set_address Exception possible when passed empty list of addresses) - In Progress - https://bugs.launchpad.net/neutron/+bug/1860586 ([Tempest] SSH exception "No existing session") - In Progress - https://bugs.launchpad.net/neutron/+bug/1860774([CI] gate functional and fullstack timeouts without reports of the causative test case) - In Progress - https://bugs.launchpad.net/neutron/+bug/1860662 ([OVN] FIP on OVN Load balancer doesn't work if member has FIP assigned on DVR setup) - Low - https://bugs.launchpad.net/neutron/+bug/1860436 ([ovn] Agent liveness checks are flaky and report false positives) - Fix Released - RFE - https://bugs.launchpad.net/neutron/+bug/1860521 (L2 pop notifications are not reliable) - Wishlist - https://bugs.launchpad.net/neutron/+bug/1860338 ([OVS] OVS agent should not plug/unplug smartNIC ports) - This one is related to Smartnic support ( https://bugs.launchpad.net/neutron/+bug/1785608) and os-vif, I suppose drivers team should discuss it. - Moved from OVN: - https://bugs.launchpad.net/neutron/+bug/1689880 ([OVN] The "neutron_sync_mode = repair" option breaks the whole cloud!) - https://bugs.launchpad.net/neutron/+bug/1860273 (Delete static route exception OVN) - https://bugs.launchpad.net/neutron/+bug/1841154 ([OVN][RFE] Make use of external ports) - https://bugs.launchpad.net/neutron/+bug/1832003 ([OVN] Left-over namespaces left in the environment by ovn-metadata-agent) - https://bugs.launchpad.net/neutron/+bug/1813551 ([OVN]Missing ingress QoS in OVN) - https://bugs.launchpad.net/neutron/+bug/1670666 ([OVN] Support native DHCP service for subnet without gateway IP) - https://bugs.launchpad.net/neutron/+bug/1542503 ([OVN] Lacking mechanism to provide proper MTU to instances) - https://bugs.launchpad.net/neutron/+bug/1664475 ([OVN] Can not retrieve where a OVN gateway router lives) - https://bugs.launchpad.net/neutron/+bug/1648525 ([OVN] QoS doesn't work with DVR, vlan tenant networks or provider networks.) - https://bugs.launchpad.net/neutron/+bug/1639835 ([OVN] Add functional testing to kuryr CI job) - https://bugs.launchpad.net/neutron/+bug/1764408 ([OVN] DHCP ports don't recreated after deleting them) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chacon.piza at gmail.com Mon Jan 27 10:11:14 2020 From: chacon.piza at gmail.com (Martin Chacon Piza) Date: Mon, 27 Jan 2020 11:11:14 +0100 Subject: [Devstack] Unable to find pip2.7 ? [Thanks for the fix] In-Reply-To: References: Message-ID: Hi Radek, Thanks for the fast answer and the fix. We can stack it again. Best regards, Martin El vie., 24 de ene. de 2020 a la(s) 12:03, Radosław Piliszek ( radoslaw.piliszek at gmail.com) escribió: > Proposed: https://review.opendev.org/704136 > > Please let us know if it helps there. 
> > -yoctozepto > > pt., 24 sty 2020 o 11:45 Radosław Piliszek > napisał(a): > > > > This part of that change broke it: > > https://review.opendev.org/#/c/561597/21/tools/install_pip.sh > > since pip2 is no longer being installed when py3 is used. > > CI obviously hid the issue by having pip preinstalled. > > We should not really need pip2 now. > > > > -yoctozepto > > > > pt., 24 sty 2020 o 11:27 Martin Chacon Piza > napisał(a): > > > > > > Dear all, > > > > > > It seems that there is a problem stacking Devstack using the master > branch. > > > The problem happens during the step: Installing package prerequisites > > > [ERROR] /home/vagrant/devstack/inc/python:41 Unable to find pip2.7; > cannot continue > > > > > > > > > This is the minimum local.conf I used: > > > > > > [[local|localrc]] > > > SERVICE_HOST=192.168.10.6 > > > HOST_IP=192.168.10.6 > > > HOST_IP_IFACE=eth1 > > > DATABASE_PASSWORD=secretdatabase > > > RABBIT_PASSWORD=secretrabbit > > > ADMIN_PASSWORD=secretadmin > > > SERVICE_PASSWORD=secretservice > > > LOGFILE=$DEST/logs/stack.sh.log > > > LOGDIR=$DEST/logs > > > LOG_COLOR=False > > > DEST=/opt/stack > > > USE_PYTHON3=True > > > > > > As a workaround I need to revert the following change, then it ends up > properly: > > > > > > commit 279a7589b03db69fd1b85d947cd0171dacef94ee > > > Author: Jens Harbott (frickler) > > > Date: Mon Apr 16 12:08:30 2018 +0000 > > > > > > Revert "Do not use pip 10 or higher" > > > > > > This reverts commit f99d1771ba1882dfbb69186212a197edae3ef02c. > > > > > > Added workarounds that might want to get split into their own patch > > > before merging: > > > > > > - Don't install python-psutil > > > - Don't run peakmem_tracker > > > > > > Change-Id: If4fb16555e15082a4d97cffdf3cfa608a682997d > > > > > > > > > Any hints? > > > > > > -- > > > Best regards, > > > Martín Chacón Pizá > > > chacon.piza at gmail.com > -- *Martín Chacón Pizá* *chacon.piza at gmail.com * -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Mon Jan 27 10:28:24 2020 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 27 Jan 2020 11:28:24 +0100 Subject: [largescale-sig] Next meeting: Jan 29, 9utc Message-ID: <776db5ca-9a4c-698f-e10a-b3c36b04e83b@openstack.org> Hi everyone, The Large Scale SIG will have a meeting this week on Wednesday, Jan 29 at 9 UTC[1] in #openstack-meeting on IRC: [1] https://www.timeanddate.com/worldclock/fixedtime.html?iso=20200129T09 As always, the agenda for our meeting is available at: https://etherpad.openstack.org/p/large-scale-sig-meeting Feel free to add topics to it. We had several standing TODOs out of our January 15 meeting, so I also invite you to review the summary of that meeting in preparation of the next: http://lists.openstack.org/pipermail/openstack-discuss/2020-January/012049.html Regards, -- Thierry Carrez From rfolco at redhat.com Mon Jan 27 12:41:09 2020 From: rfolco at redhat.com (Rafael Folco) Date: Mon, 27 Jan 2020 10:41:09 -0200 Subject: [tripleo] TripleO CI Summary: Sprint 41 Message-ID: Greetings, The TripleO CI team has just completed Sprint 41 / Unified Sprint 20 (Jan 03 thru Jan 22). The following is a summary of completed work during this sprint cycle: - Fixed the new promoter code and added testing to cover identified bugs. - Manually built CentOS8 containers and documented missing requirements [4]. Started building RHEL8 containers for OSP. - Documented the design for the new component promotion pipeline [3]. 
Continue to implement the downstream version of the component pipeline. - Addressed necessary changes to promote-hash role to support the new multi-hash component provided by delorean [5]. - Turned the ansible-collect-logs role into an infra-red plugin as part of the shared goals for the combined CI team. The planned work for the next sprint [1] extends the work started in the previous sprint and focuses on CentOS8 container builds as a base requirement to leverage other fronts of work. The Ruck and Rover for this sprint are Wesley Hayutin (weshay) and Chandan Kumar (chkumar). Please direct questions or queries to them regarding CI status or issues in #tripleo, ideally to whomever has the ‘|ruck’ suffix on their nick. Ruck/rover notes to be tracked in etherpad [2]. Thanks, rfolco [1] https://tree.taiga.io/project/tripleo-ci-board/taskboard/unified-sprint-21 [2] https://etherpad.openstack.org/p/ruckroversprint21 [3] https://hackmd.io/5uYRmLaOTI2raTbHWsaiSQ [4] https://hackmd.io/dSagCbocQ4KSVEZR1uf8Tw [5] https://trunk.rdoproject.org/centos8-master/tripleo-ci-testing/delorean.repo -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Mon Jan 27 14:08:52 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Mon, 27 Jan 2020 08:08:52 -0600 Subject: =?UTF-8?Q?[qa]_Proposing_Rados=C5=82aw_Piliszek__to_devstack_core?= Message-ID: <16fe7556c9c.c1957cd473318.237960248883865388@ghanshyammann.com> Hello Everyone, Radosław Piliszek (yoctozepto) has been doing nice work in devstack from code as well as review perspective. He has been helping for many bugs fixes nowadays and having him as Core will help us to speed up the things. I would like to propose him for Devstack Core. You can vote/feedback on this email. If no objection by end of this week, I will add him to the list. -gmann From openstack at fried.cc Mon Jan 27 14:49:06 2020 From: openstack at fried.cc (Eric Fried) Date: Mon, 27 Jan 2020 08:49:06 -0600 Subject: =?UTF-8?Q?Re=3a_=5bqa=5d_Proposing_Rados=c5=82aw_Piliszek_to_devsta?= =?UTF-8?Q?ck_core?= In-Reply-To: <16fe7556c9c.c1957cd473318.237960248883865388@ghanshyammann.com> References: <16fe7556c9c.c1957cd473318.237960248883865388@ghanshyammann.com> Message-ID: <1b7b3a78-2398-5d7e-5d01-b57a73e63790@fried.cc> +1 from a non-core FWIW. I've seen Radosław being heavily engaged and doing solid reviews. On 1/27/20 8:08 AM, Ghanshyam Mann wrote: > Hello Everyone, > > Radosław Piliszek (yoctozepto) has been doing nice work in devstack from code as well as review perspective. > He has been helping for many bugs fixes nowadays and having him as Core will help us to speed up the things. > > I would like to propose him for Devstack Core. You can vote/feedback on this email. If no objection by end of this week, I will add him to the list. > > -gmann > > From elod.illes at est.tech Mon Jan 27 16:34:12 2020 From: elod.illes at est.tech (=?utf-8?B?RWzDtWQgSWxsw6lz?=) Date: Mon, 27 Jan 2020 16:34:12 +0000 Subject: [ptl][release][stable][EM] Extended Maintenance - Rocky Message-ID: <8c51036b-050a-3aa7-ef97-4f89c1ba7fe0@est.tech> Hi, In less than one month Rocky is planned to enter into Extended Maintenance phase [1] (estimated date: 2020-02-24). I have generated the list of *open* and *unreleased* changes in *stable/rocky* for the follows-policy tagged repositories [2]. These lists could help the teams, who are planning to do a final release on Rocky before moving stable/rocky branches to Extended Maintenance. Feel free to edit them! 
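For teams that have not proposed a stable release before: a final Rocky release is a small patch to the openstack/releases repository that adds one more version stanza to the deliverable file for the series. A rough sketch of the shape of such a stanza (the deliverable name, version number and hash below are placeholders only, substitute your own):

  # deliverables/rocky/<your-deliverable>.yaml
  releases:
    - version: 13.0.3
      projects:
        - repo: openstack/cinder
          hash: <commit on stable/rocky that you want to release>

The release team reviews and processes these patches as usual, so proposing them well before the transition date is safer than waiting until the last week.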
* At the transition date the Release Team will tag the latest (Rocky) releases of repositories with *rocky-em* tag. * After the transition stable/rocky will be still open for bugfixes, but there won't be any official releases. NOTE: teams, please focus on wrapping up your libraries first if there is any concern about changes, in order to avoid any broken releases! Thanks, Előd [1] https://releases.openstack.org/ [2] https://etherpad.openstack.org/p/rocky-final-release-before-em From pierre at stackhpc.com Mon Jan 27 17:03:19 2020 From: pierre at stackhpc.com (Pierre Riteau) Date: Mon, 27 Jan 2020 18:03:19 +0100 Subject: [ceilometer] Stable releases In-Reply-To: References: Message-ID: Thank you. Release patch submitted at https://review.opendev.org/#/c/704358/ On Sat, 25 Jan 2020 at 04:21, Lingxian Kong wrote: > > Hi Pierre, please feel free to propose a patch in openstack release repo > > - > Best regards, > Lingxian Kong > Catalyst Cloud > > > On Sat, Jan 25, 2020 at 6:28 AM Pierre Riteau wrote: >> >> Hello, >> >> Is it planned to release new Ceilometer tarballs for rocky and stein? >> There are several bug fixes (for example [1]) in stable branches that >> would be useful to include in a release. >> >> If t can help, I can submit patches to the releases repository. >> >> Thanks, >> Pierre Riteau (priteau) >> >> [1] https://bugs.launchpad.net/ceilometer/+bug/1801348 >> From skaplons at redhat.com Mon Jan 27 17:48:50 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 27 Jan 2020 18:48:50 +0100 Subject: [neutron] CI meeting date changed Message-ID: <20200127174850.uqikso42i3sbhiyx@skaplons-mac> Hi, Just short info, according to what we agreed on our last CI meeting, it's now changed and will be on Wednesday at 1500 UTC. Please also note that IRC channel is also changed and now it will be on "openstack-meeting-3" You can check [1] for details. [1] https://review.opendev.org/#/c/704200/ -- Slawek Kaplonski Senior software engineer Red Hat From rosmaita.fossdev at gmail.com Mon Jan 27 22:47:42 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Mon, 27 Jan 2020 17:47:42 -0500 Subject: [cinder] changes for sqlalchemy 1.3.13 Message-ID: Updating to the latest release of sqlalchemy breaks a cinder unit test. This patch shows the breakage and makes some suggestions about how it could be fixed. I could use some feedback: https://review.opendev.org/704425 thanks! brian From kozhukalov at gmail.com Tue Jan 28 08:06:57 2020 From: kozhukalov at gmail.com (Vladimir Kozhukalov) Date: Tue, 28 Jan 2020 11:06:57 +0300 Subject: [trio2o] Current state, community interest, alternatives Message-ID: Hi, I am currently looking for a open source solution that is able to expose a single API endpoint for a user to multiple Openstack instances. Partly the idea behind is to stop investing efforts into tuning of Openstack to make it scalable and instead to deploy a bunch of small Openstack regions (say ~100 nodes) and work with them via API frontend. I am aware of Nova Cells, but it is complicated and Nova is not the only Openstack component that needs scaling. All these ad hoc tuning like dedicated RabbitMQ instances just make the system barely maintainable and it is complicated to reproduce. 
Requirements for a possible solution are: - Single authentication point (I know this is simple and can be for example SAML based federation) - Single point to manage keys and quotas - Possibly L2 connectivity between VMs in different Openstack instances (I know this can be achieved using VPNs, but this is kinda too complicated for a user) - Ability to work with different Openstack versions (need for gradual upgrade) I know Huawei and others implemented PoC for Openstack cascading and then invested some efforts in Trio2o and Tricircle. Tricircle is about cascading Neutron and IMO it is better to use a scalable SDN (like TangstenFabric or OVN). Trio2o looks like what I actually need, but the community interest around Trio2o is very limited (frankly, it looks almost dead). Last few weeks I was experimenting with Trio2o and Devstack with Stein release. It looks working, but the functionality is limited. Is there anyone who is interested in reviving Trio2o and investing into it? Maybe there are other solutions that I am not aware of? -- Best regards, Kozhukalov Vladimir -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Tue Jan 28 10:05:48 2020 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 28 Jan 2020 11:05:48 +0100 Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... In-Reply-To: References: Message-ID: Jay Bryant wrote: > We once again are at the point in the release where we are talking about > 3rd Party CI and what is going on for Cinder.  At the moment I have > analyzed drivers that have not successfully reported results on a Cinder > patch in 30 or more days and have put together the following list of > drivers to be unsupported in the Ussuri release: > > * Inspur Drivers > * Infortrend > * Kaminario > * NEC > * Quobyte > * Zadara > * HPE Drivers We (OSF) reached out to our contacts at those companies (except Kaminario who already replied here), to increase the chances that they get the message. Please note that Inspur is in the middle of (extended) holidays in China and may be slow to respond :) -- Thierry Carrez (ttx) From balazs.gibizer at est.tech Tue Jan 28 13:30:26 2020 From: balazs.gibizer at est.tech (=?iso-8859-1?Q?Bal=E1zs_Gibizer?=) Date: Tue, 28 Jan 2020 13:30:26 +0000 Subject: [barbican] TPM2.0 backend In-Reply-To: <1579514411.790283.0@est.tech> References: <1579514411.790283.0@est.tech> Message-ID: <1580218223.279185.0@est.tech> On Mon, Jan 20, 2020 at 10:00, Balázs Gibizer wrote: > Hi, > > Looking at the Barbican documentation I see that the secrets can be > stored on disk (SimpleCrypto backend) or in a HW vendor specific HSM > module. Is there a way to use a TPM 2.0 device as the backend of > Barbican via something like [1]? On the today's barbican IRC meeting I got my question answered. In short it is feasible but at the moment no barbican in-tree implementation exists. Also barbican would accept such contribution. Cheers, gibi [2] http://eavesdrop.openstack.org/meetings/barbican/2020/barbican.2020-01-28-13.05.log.html#l-47 > > Cheers, > gibi > > [1] https://github.com/tpm2-software/tpm2-pkcs11 > > > From dougal at redhat.com Tue Jan 28 14:54:10 2020 From: dougal at redhat.com (Dougal Matthews) Date: Tue, 28 Jan 2020 14:54:10 +0000 Subject: [tripleo] Removing the `openstack overcloud plan *` commands Message-ID: Hey all, While doing work on the Mistral to Ansible port I was looking at the openstack overcloud plan commands. 
These are; - openstack overcloud plan create - openstack overcloud plan delete - openstack overcloud plan deploy - openstack overcloud plan list - openstack overcloud plan export I believe none of these commands make sense in a post-TripleO UI world. There is no other way to interact and update a plan. Deploys are always done via "openstack overcloud deploy" and this deletes the contents of the plan container and repopulates it with the local files[1]. I am therefore proposing that we remove these commands and skip the normal deprecation process. https://review.opendev.org/#/c/704581/1 What do you think? Thanks, Dougal [1]: https://github.com/openstack/python-tripleoclient/blob/3c589979ceb05d732b3c924eb919e145e22efaa8/tripleoclient/workflows/plan_management.py#L211 -------------- next part -------------- An HTML attachment was scrubbed... URL: From dougal at redhat.com Tue Jan 28 15:01:13 2020 From: dougal at redhat.com (Dougal Matthews) Date: Tue, 28 Jan 2020 15:01:13 +0000 Subject: [tripleo] Removing the `openstack overcloud plan *` commands In-Reply-To: References: Message-ID: On Tue, 28 Jan 2020 at 14:54, Dougal Matthews wrote: > Hey all, > > While doing work on the Mistral to Ansible port I was looking at the > openstack overcloud plan commands. These are; > > - openstack overcloud plan create > - openstack overcloud plan delete > - openstack overcloud plan deploy > - openstack overcloud plan list > - openstack overcloud plan export > There has been a useful comment on the review that export is actually useful still. Harald said; "I find the possibility to download the plan on a failed deployment quite valuable. It allows me to read the heat templates in a more human friendly fully rendered version compared to the j2, run yaml validation tools etc. There is *magic* adding stuff to plan's that I can't see by simply running process templates tools." This seems like a good reason, although it could be argued that Swift is a better tool to download a container. > > I believe none of these commands make sense in a post-TripleO UI world. > There is no other way to interact and update a plan. Deploys are always > done via "openstack overcloud deploy" and this deletes the contents of the > plan container and repopulates it with the local files[1]. > > I am therefore proposing that we remove these commands and skip the normal > deprecation process. https://review.opendev.org/#/c/704581/1 > > What do you think? > > Thanks, > Dougal > > > [1]: > https://github.com/openstack/python-tripleoclient/blob/3c589979ceb05d732b3c924eb919e145e22efaa8/tripleoclient/workflows/plan_management.py#L211 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Tue Jan 28 15:15:14 2020 From: aschultz at redhat.com (Alex Schultz) Date: Tue, 28 Jan 2020 08:15:14 -0700 Subject: [tripleo] Removing the `openstack overcloud plan *` commands In-Reply-To: References: Message-ID: On Tue, Jan 28, 2020 at 8:10 AM Dougal Matthews wrote: > > > > On Tue, 28 Jan 2020 at 14:54, Dougal Matthews wrote: >> >> Hey all, >> >> While doing work on the Mistral to Ansible port I was looking at the openstack overcloud plan commands. These are; >> >> - openstack overcloud plan create >> - openstack overcloud plan delete >> - openstack overcloud plan deploy >> - openstack overcloud plan list >> - openstack overcloud plan export > > > There has been a useful comment on the review that export is actually useful still. 
Harald said; > > "I find the possibility to download the plan on a failed deployment quite valuable. It allows me to read the heat templates in a more human friendly fully rendered version compared to the j2, run yaml validation tools etc. There is *magic* adding stuff to plan's that I can't see by simply running process templates tools." > > This seems like a good reason, although it could be argued that Swift is a better tool to download a container. > I've commented but I don't think we're ready to remove these yet until we've gotten off of swift. TBH trying to download a swift container is a painful via openstackcli so I'd rather that we leave these basic commands. Additionally it doesn't require that an end user understand that a plan is stored in swift. End users should be using 'openstack overcloud *' commands to perform actions and not going and doing direct nova/neutron/ironic/swift related actions. We've seen folks do some dangerous stuff when they start toying with the underlying implementations. > >> >> >> I believe none of these commands make sense in a post-TripleO UI world. There is no other way to interact and update a plan. Deploys are always done via "openstack overcloud deploy" and this deletes the contents of the plan container and repopulates it with the local files[1]. >> >> I am therefore proposing that we remove these commands and skip the normal deprecation process. https://review.opendev.org/#/c/704581/1 >> >> What do you think? >> >> Thanks, >> Dougal >> >> >> [1]: https://github.com/openstack/python-tripleoclient/blob/3c589979ceb05d732b3c924eb919e145e22efaa8/tripleoclient/workflows/plan_management.py#L211 From jungleboyj at gmail.com Tue Jan 28 15:16:20 2020 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 28 Jan 2020 09:16:20 -0600 Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... In-Reply-To: References: Message-ID: On 1/28/2020 4:05 AM, Thierry Carrez wrote: > Jay Bryant wrote: >> We once again are at the point in the release where we are talking >> about 3rd Party CI and what is going on for Cinder.  At the moment I >> have analyzed drivers that have not successfully reported results on >> a Cinder patch in 30 or more days and have put together the following >> list of drivers to be unsupported in the Ussuri release: >> >>   * Inspur Drivers >>   * Infortrend >>   * Kaminario >>   * NEC >>   * Quobyte >>   * Zadara >>   * HPE Drivers > > We (OSF) reached out to our contacts at those companies (except > Kaminario who already replied here), to increase the chances that they > get the message. > > Please note that Inspur is in the middle of (extended) holidays in > China  and may be slow to respond :) > Thank you!  We have started seeing some more responses as a result.  It is appreciated. Jay From mark at stackhpc.com Tue Jan 28 15:18:01 2020 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 28 Jan 2020 15:18:01 +0000 Subject: [rdo-dev] [tripleo] missing centos-8 rpms for kolla builds In-Reply-To: References: <86b5b5b7-8f0c-9bc7-6275-cce1c353cd48@linaro.org> <449b1a03-2066-bea1-0a53-91dc59a3d58c@linaro.org> Message-ID: On Mon, 27 Jan 2020 at 09:18, Radosław Piliszek wrote: > > I know it was for masakari. > Gaëtan had to grab crmsh from opensuse: > http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/ > > -yoctozepto Thanks Wes for getting this discussion going. I've been looking at CentOS 8 today and trying to assess where we are. 
I created an Etherpad to track status: https://etherpad.openstack.org/p/kolla-centos8 > > pon., 27 sty 2020 o 10:13 Marcin Juszkiewicz > napisał(a): > > > > W dniu 27.01.2020 o 09:48, Alfredo Moralejo Alonso pisze: > > > How is crmsh used in these images?, ha packages included in > > > HighAvailability repo in CentOS includes pcs and some crm_* commands in pcs > > > and pacemaker-cli packages. IMO, tt'd be good to switch to those commands > > > to manage the cluster. > > > > No idea. Gaëtan Trellu may know - he created those images. > > > From mihalis68 at gmail.com Tue Jan 28 16:00:39 2020 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 28 Jan 2020 11:00:39 -0500 Subject: [ops] OpenStack Ops Meetups team meeting 2020-1-20 Message-ID: Meeting minutes for today's meeting are linked below. This was our first meeting since the very successful (or so I am told) Ops Meetup in London earlier this month, for which the agenda is here https://etherpad.openstack.org/p/LON-2020-OPS-AGENDA and the feedback was collected here https://etherpad.openstack.org/p/LON-2020-OPS-FEEDBACK the meetups team is now turning to looking into what we can arrange for operators at the Vancouver opendev+ptg event and then a second full ops meetup in south korea in september. Watch this space! Chris Morgan on behalf of the openstack ops meetups team <•openstack> Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2020/ops_meetup_team.2020-01-28-15.11.html 10:53 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2020/ops_meetup_team.2020-01-28-15.11.txt 10:53 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2020/ops_meetup_team.2020-01-28-15.11.log.html -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Tue Jan 28 16:52:50 2020 From: mark at stackhpc.com (Mark Goddard) Date: Tue, 28 Jan 2020 16:52:50 +0000 Subject: [rdo-dev] [tripleo] missing centos-8 rpms for kolla builds In-Reply-To: References: <86b5b5b7-8f0c-9bc7-6275-cce1c353cd48@linaro.org> <449b1a03-2066-bea1-0a53-91dc59a3d58c@linaro.org> Message-ID: On Tue, 28 Jan 2020 at 15:18, Mark Goddard wrote: > > On Mon, 27 Jan 2020 at 09:18, Radosław Piliszek > wrote: > > > > I know it was for masakari. > > Gaëtan had to grab crmsh from opensuse: > > http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/ > > > > -yoctozepto > > Thanks Wes for getting this discussion going. I've been looking at > CentOS 8 today and trying to assess where we are. I created an > Etherpad to track status: > https://etherpad.openstack.org/p/kolla-centos8 We are seeing an odd DNF error sometimes. DNF exits 141 with no error code when installing packages. It often happens on the rabbitmq and grafana images. There is a prompt about importing GPG keys prior to the error. Example: https://4eff4bb69c321960be39-770d619687de1bce0976465c40e4e9ca.ssl.cf2.rackcdn.com/693544/33/check/kolla-ansible-centos8-source-mariadb/93a8351/primary/logs/build/000_FAILED_kolla-toolbox.log Related bug report? https://github.com/containers/libpod/issues/4431 Anyone familiar with it? > > > > > pon., 27 sty 2020 o 10:13 Marcin Juszkiewicz > > napisał(a): > > > > > > W dniu 27.01.2020 o 09:48, Alfredo Moralejo Alonso pisze: > > > > How is crmsh used in these images?, ha packages included in > > > > HighAvailability repo in CentOS includes pcs and some crm_* commands in pcs > > > > and pacemaker-cli packages. 
IMO, tt'd be good to switch to those commands > > > > to manage the cluster. > > > > > > No idea. Gaëtan Trellu may know - he created those images. > > > > > From shlomi at zadara.com Tue Jan 28 17:08:36 2020 From: shlomi at zadara.com (Shlomi Avihou) Date: Tue, 28 Jan 2020 19:08:36 +0200 Subject: [cinder][ci] Cinder drivers being Unsupported and General CI Status ... In-Reply-To: References: Message-ID: Greetings, The Zadara CI system is up and reporting again Thanks, Shlomi. On Wed, Jan 22, 2020 at 9:51 PM Jay Bryant wrote: > All, > > We once again are at the point in the release where we are talking about > 3rd Party CI and what is going on for Cinder. At the moment I have > analyzed drivers that have not successfully reported results on a Cinder > patch in 30 or more days and have put together the following list of > drivers to be unsupported in the Ussuri release: > > - Inspur Drivers > - Infortrend > - Kaminario > - NEC > - Quobyte > - Zadara > - HPE Drivers > > If your name is in the list above you are receiving this e-mail directly, > not just through the mailing list. > > If you are working on resolving CI issues please let me know so we can > discuss how to proceed. > > In addition to the fact that we will be pushing up unsupported patches for > the drivers above, we have already unsupported and removed a number of > drivers during this release. They are as follows: > > - Unsupported: > - MacroSAN Driver > - Removed: > - ProphetStor Driver > - Nimble Storage Driver > - Veritas Access Driver > - Veritas CNFS Driver > - Virtuozzo Storage Driver > - Huawei FusionStorage Driver > - Sheepdog Storage Driver > > Obviously we are reaching the point that the number of drivers leaving the > community is concerning and it has sparked discussions around the fact that > maybe our 3rd Party CI approach isn't working as intended. So what do we > do? Just mark drivers unsupported and no longer remove drivers? Do we > restore drivers that have recently been removed? > > We are planning to have further discussion around these questions at our > next Cinder meeting in #openstack-meeting-4 on Wednesday, 1/29/20 at 14:00 > UTC. If you have thoughts or strong opinions around this topic please join > us. > > Thank you! > > Jay Bryant > > jsbryant at electronicjungle.net > > IRC: jungleboyj > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From olealrd1981 at gmail.com Tue Jan 28 18:24:02 2020 From: olealrd1981 at gmail.com (=?UTF-8?Q?Orestes_Leal_Rodr=C3=ADguez?=) Date: Tue, 28 Jan 2020 13:24:02 -0500 Subject: horizon: Trailing spaces removed on passwords Message-ID: >From the dashboard openstack is removing the trailing spaces from our user's passwords. We have a modified sql.py backend, that does an ldap bind to an active directory data store. And that works almost always. I say almost because for some users it doesn't work at all. We figure out (and a co-worker also confirmed this) that openstack is removing trailing (also leading?) spaces from the password entered in the dashboard. Also, inside the dashboard trailing spaces are not accepted even when they are equal byte by byte (including the space, I get an error). So this is going on. Do anybody knows where is this removal performed? (python script location, line) So I can remove that since I have users (me included, I have the issue since the very beginning of this deployment) that cannot login. And they can use their Active Directrory passwords from other apps without problem. 
We are running 'stein' with the latest update for ubuntu 18.04-AMD64. From olealrd1981 at gmail.com Tue Jan 28 21:24:28 2020 From: olealrd1981 at gmail.com (=?UTF-8?Q?Orestes_Leal_Rodr=C3=ADguez?=) Date: Tue, 28 Jan 2020 16:24:28 -0500 Subject: horizon: Trailing spaces removed on passwords In-Reply-To: References: Message-ID: I have found a way to solve it and give access to users that have passwords with spaces at the beginning/end. The issue (not an issue per se, but it affects horizon [stein]) lies in django. Specifically on 'django/forms/fields.py' Horizon uses the fields and those by default remove spaces as stated, what I did is the following: On that file, the class CharField's constructor was removing leading/trailing spaces: Below is the diff between the original and the modified python script (one line modified, strip=False) --- fields.py.orig 2020-01-28 15:16:22.696047918 -0500 +++ fields.py 2020-01-28 15:16:45.520084974 -0500 @@ -220,7 +220,7 @@ class CharField(Field): - def __init__(self, max_length=None, min_length=None, strip=True, empty_value='', *args, **kwargs): + def __init__(self, max_length=None, min_length=None, strip=False, empty_value='', *args, **kwargs): self.max_length = max_length self.min_length = min_length self.strip = strip Now passwords are not altered by the underlying framework. Not sure the effect of not removing trailing/leading spaces from the textfields will have on the Horizon operations, though. Maybe horizon should redefine that django class to avoid this behavior. I'm also open to other solutions from the community. Have a great evening, Thanks. Orestes On 1/28/20, Orestes Leal Rodríguez wrote: > From the dashboard openstack is removing the trailing spaces from our > user's passwords. > We have a modified sql.py backend, that does an ldap bind to an active > directory data store. And that works almost always. I say almost > because for some users it doesn't work at all. We figure out (and a > co-worker also confirmed this) that openstack is removing trailing > (also leading?) spaces from the password entered in the dashboard. > Also, inside the dashboard trailing spaces are not accepted even when > they are equal byte by byte (including the space, I get an error). So > this is going on. > > Do anybody knows where is this removal performed? (python script > location, line) So I can remove that since I have users (me included, > I have the issue since the very beginning of this deployment) that > cannot login. And they can use their Active Directrory passwords from > other apps without problem. > > We are running 'stein' with the latest update for ubuntu 18.04-AMD64. > From Albert.Braden at synopsys.com Tue Jan 28 22:09:44 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Tue, 28 Jan 2020 22:09:44 +0000 Subject: horizon: Trailing spaces removed on passwords In-Reply-To: References: Message-ID: Stripping leading/trailing spaces from passwords is the correct behavior. Passwords should not contain leading/trailing spaces, and when they do it is usually because of a paste error. -----Original Message----- From: Orestes Leal Rodríguez Sent: Tuesday, January 28, 2020 1:24 PM To: openstack-discuss at lists.openstack.org Subject: Re: horizon: Trailing spaces removed on passwords I have found a way to solve it and give access to users that have passwords with spaces at the beginning/end. The issue (not an issue per se, but it affects horizon [stein]) lies in django. 
Specifically on 'django/forms/fields.py' Horizon uses the fields and those by default remove spaces as stated, what I did is the following: On that file, the class CharField's constructor was removing leading/trailing spaces: Below is the diff between the original and the modified python script (one line modified, strip=False) --- fields.py.orig 2020-01-28 15:16:22.696047918 -0500 +++ fields.py 2020-01-28 15:16:45.520084974 -0500 @@ -220,7 +220,7 @@ class CharField(Field): - def __init__(self, max_length=None, min_length=None, strip=True, empty_value='', *args, **kwargs): + def __init__(self, max_length=None, min_length=None, strip=False, empty_value='', *args, **kwargs): self.max_length = max_length self.min_length = min_length self.strip = strip Now passwords are not altered by the underlying framework. Not sure the effect of not removing trailing/leading spaces from the textfields will have on the Horizon operations, though. Maybe horizon should redefine that django class to avoid this behavior. I'm also open to other solutions from the community. Have a great evening, Thanks. Orestes On 1/28/20, Orestes Leal Rodríguez wrote: > From the dashboard openstack is removing the trailing spaces from our > user's passwords. > We have a modified sql.py backend, that does an ldap bind to an active > directory data store. And that works almost always. I say almost > because for some users it doesn't work at all. We figure out (and a > co-worker also confirmed this) that openstack is removing trailing > (also leading?) spaces from the password entered in the dashboard. > Also, inside the dashboard trailing spaces are not accepted even when > they are equal byte by byte (including the space, I get an error). So > this is going on. > > Do anybody knows where is this removal performed? (python script > location, line) So I can remove that since I have users (me included, > I have the issue since the very beginning of this deployment) that > cannot login. And they can use their Active Directrory passwords from > other apps without problem. > > We are running 'stein' with the latest update for ubuntu 18.04-AMD64. > From amotoki at gmail.com Wed Jan 29 00:45:28 2020 From: amotoki at gmail.com (Akihiro Motoki) Date: Wed, 29 Jan 2020 09:45:28 +0900 Subject: horizon: Trailing spaces removed on passwords In-Reply-To: References: Message-ID: I have two comments. The first one is whether we should strip leading/trailing spaces in password forms. As Albert commented, someone thinks they should be stripped. On the other hand, Django AuthenticationForm does not strip an input password [0]. I am not sure which is better. IIRC, horizon login form strips leading/trailing spaces in an input password since long ago but I see no discussion on this. The second one is about the original question. Django CharField strips leading/trailing spaces in an input string by default. strip=True is the default [1]. The horizon login form just uses CharField [2], so leading/trailing spaces in an input password are stripped. To avoid this, you need to add strip=False to CharField [2]. It is not a good idea to change the default value of CharField in Django. Other usages of CharField may assume the default behavior. If you would like horizon to change the current behavior, there are several password related fields and we need to change all of them consistently, so it is worth filing a bug. 
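To illustrate the strip=False suggestion above with a concrete example, here is a minimal, hypothetical Django form sketch. It is not horizon's actual code: the form and field names are invented for illustration and only stock Django is assumed. Passing strip=False to CharField keeps leading/trailing whitespace in the submitted password, so it reaches the authentication backend exactly as the user typed it:

from django import forms


class ExampleLoginForm(forms.Form):
    username = forms.CharField(label="User Name")
    # strip=False disables CharField's default whitespace trimming,
    # so " secret " is passed through unchanged instead of "secret".
    password = forms.CharField(
        label="Password",
        widget=forms.PasswordInput(render_value=False),
        strip=False,
    )

As noted above, a real fix would apply this consistently to every password-related field in horizon rather than changing the CharField default in Django itself.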
[0] https://github.com/django/django/blob/master/django/contrib/auth/forms.py#L180 [1] https://docs.djangoproject.com/en/3.0/ref/forms/fields/#charfield [2] https://opendev.org/openstack/horizon/src/branch/master/openstack_auth/forms.py#L73-L74 Thanks, Akihiro On Wed, Jan 29, 2020 at 6:27 AM Orestes Leal Rodríguez wrote: > > I have found a way to solve it and give access to users that have > passwords with spaces at the beginning/end. The issue (not an issue > per se, but it affects horizon [stein]) lies in django. Specifically > on 'django/forms/fields.py' > > Horizon uses the fields and those by default remove spaces as stated, > what I did is the following: > > On that file, the class CharField's constructor was removing > leading/trailing spaces: > Below is the diff between the original and the modified python script > (one line modified, strip=False) > > --- fields.py.orig 2020-01-28 15:16:22.696047918 -0500 > +++ fields.py 2020-01-28 15:16:45.520084974 -0500 > @@ -220,7 +220,7 @@ > > > class CharField(Field): > - def __init__(self, max_length=None, min_length=None, strip=True, > empty_value='', *args, **kwargs): > + def __init__(self, max_length=None, min_length=None, strip=False, > empty_value='', *args, **kwargs): > self.max_length = max_length > self.min_length = min_length > self.strip = strip > > Now passwords are not altered by the underlying framework. Not sure > the effect of not removing trailing/leading spaces from the textfields > will have on the Horizon operations, though. Maybe horizon should > redefine that django class to avoid this behavior. I'm also open to > other solutions from the community. Have a great evening, > > > Thanks. > Orestes > > On 1/28/20, Orestes Leal Rodríguez wrote: > > From the dashboard openstack is removing the trailing spaces from our > > user's passwords. > > We have a modified sql.py backend, that does an ldap bind to an active > > directory data store. And that works almost always. I say almost > > because for some users it doesn't work at all. We figure out (and a > > co-worker also confirmed this) that openstack is removing trailing > > (also leading?) spaces from the password entered in the dashboard. > > Also, inside the dashboard trailing spaces are not accepted even when > > they are equal byte by byte (including the space, I get an error). So > > this is going on. > > > > Do anybody knows where is this removal performed? (python script > > location, line) So I can remove that since I have users (me included, > > I have the issue since the very beginning of this deployment) that > > cannot login. And they can use their Active Directrory passwords from > > other apps without problem. > > > > We are running 'stein' with the latest update for ubuntu 18.04-AMD64. > > > From amotoki at gmail.com Wed Jan 29 00:46:42 2020 From: amotoki at gmail.com (Akihiro Motoki) Date: Wed, 29 Jan 2020 09:46:42 +0900 Subject: [neutron][drivers team] Proposing Nate Johnston as drivers team member In-Reply-To: <20200122215847.zsfgihjmmjkwlrdl@skaplons-mac> References: <20200122215847.zsfgihjmmjkwlrdl@skaplons-mac> Message-ID: Great addition. +1 from me. On Thu, Jan 23, 2020 at 7:02 AM Slawek Kaplonski wrote: > > Hi Neutrinos, > > I would like to propose Nate Johnston to be part of Neutron drivers team. > Since long time Nate is very active Neutron's core reviewer. 
He is also actively > participating in our Neutron drivers team meetings and he shown there that he > has big experience and knowledge about Neutron, Neutron stadium projects as well > as whole OpenStack. > I think that he really deservers to be part of this team and that he will be > great addition it. > I will wait for Your feedback for 1 week and if there will be no any votes > agains, I will add Nate to drivers team in next week. > > -- > Slawek Kaplonski > Senior software engineer > Red Hat > > From ssbarnea at redhat.com Wed Jan 29 06:44:50 2020 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Wed, 29 Jan 2020 06:44:50 +0000 Subject: horizon: Trailing spaces removed on passwords In-Reply-To: References: Message-ID: Indeed, a well known web UX improving feature, very useful one. I hope nobody tries to remove it. This kind of feature must always be implemented in the client (browser). no server side API should ever try to “sanitize” a password string. On Tue, 28 Jan 2020 at 22:14, Albert Braden wrote: > Stripping leading/trailing spaces from passwords is the correct behavior. > Passwords should not contain leading/trailing spaces, and when they do it > is usually because of a paste error. > > -----Original Message----- > From: Orestes Leal Rodríguez > Sent: Tuesday, January 28, 2020 1:24 PM > To: openstack-discuss at lists.openstack.org > Subject: Re: horizon: Trailing spaces removed on passwords > > I have found a way to solve it and give access to users that have > passwords with spaces at the beginning/end. The issue (not an issue > per se, but it affects horizon [stein]) lies in django. Specifically > on 'django/forms/fields.py' > > Horizon uses the fields and those by default remove spaces as stated, > what I did is the following: > > On that file, the class CharField's constructor was removing > leading/trailing spaces: > Below is the diff between the original and the modified python script > (one line modified, strip=False) > > --- fields.py.orig 2020-01-28 15:16:22.696047918 -0500 > +++ fields.py 2020-01-28 15:16:45.520084974 -0500 > @@ -220,7 +220,7 @@ > > > class CharField(Field): > - def __init__(self, max_length=None, min_length=None, strip=True, > empty_value='', *args, **kwargs): > + def __init__(self, max_length=None, min_length=None, strip=False, > empty_value='', *args, **kwargs): > self.max_length = max_length > self.min_length = min_length > self.strip = strip > > Now passwords are not altered by the underlying framework. Not sure > the effect of not removing trailing/leading spaces from the textfields > will have on the Horizon operations, though. Maybe horizon should > redefine that django class to avoid this behavior. I'm also open to > other solutions from the community. Have a great evening, > > > Thanks. > Orestes > > On 1/28/20, Orestes Leal Rodríguez wrote: > > From the dashboard openstack is removing the trailing spaces from our > > user's passwords. > > We have a modified sql.py backend, that does an ldap bind to an active > > directory data store. And that works almost always. I say almost > > because for some users it doesn't work at all. We figure out (and a > > co-worker also confirmed this) that openstack is removing trailing > > (also leading?) spaces from the password entered in the dashboard. > > Also, inside the dashboard trailing spaces are not accepted even when > > they are equal byte by byte (including the space, I get an error). So > > this is going on. > > > > Do anybody knows where is this removal performed? 
(python script > > location, line) So I can remove that since I have users (me included, > > I have the issue since the very beginning of this deployment) that > > cannot login. And they can use their Active Directrory passwords from > > other apps without problem. > > > > We are running 'stein' with the latest update for ubuntu 18.04-AMD64. > > > > -- -- /sorin -------------- next part -------------- An HTML attachment was scrubbed... URL: From sundar.nadathur at intel.com Wed Jan 29 06:45:28 2020 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Wed, 29 Jan 2020 06:45:28 +0000 Subject: [cyborg] No IRC meeting today In-Reply-To: References: Message-ID: With the travel issues and the virus outbreak, we don't expect many attendees this week either. So, no Cyborg meeting this week either. Regards, Sundar From: Nadathur, Sundar Sent: Wednesday, January 22, 2020 6:23 PM To: openstack-discuss at lists.openstack.org Subject: [cyborg] No IRC meeting today Since many people are traveling on account of the Chinese New Year, there will be no IRC meeting today. Regards, Sundar -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Wed Jan 29 07:33:06 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Wed, 29 Jan 2020 08:33:06 +0100 Subject: horizon: Trailing spaces removed on passwords In-Reply-To: References: Message-ID: Folks, I believe the password value should never ever be modified, that includes space stripping. Albert wrote: > Passwords should not contain leading/trailing spaces Strong claim. I think it's clumsy if they do, but still a password is a password :-) Albert wrote: > it is usually because of a paste error I agree here, I rarely see people willing to have trailing spaces in their passwords. UI/UX-wise people should be allowed to peek at their password as they are entering it (to validate its correctness). Also, it's the very reason why password change form has you to repeat the new password (and sometimes even blocks any copy-pasting which is actually bad UI/UX because it cripples password managers). Akihiro wrote: > Django AuthenticationForm does not strip an input password Which is how it should be. Akihiro wrote: > Other usages of CharField may assume the default behavior. Indeed, one should modify horizon, not django, here. Sorin wrote: > This kind of feature must always be implemented in the client (browser) Well, it can (and is in this case) also be implemented on the server side (by horizon/django here). Sorin wrote: > no server side API should ever try to “sanitize” a password string. Sanitization is always performed to avoid SQL injection and alike. -yoctozepto From amoralej at redhat.com Wed Jan 29 11:30:44 2020 From: amoralej at redhat.com (Alfredo Moralejo Alonso) Date: Wed, 29 Jan 2020 12:30:44 +0100 Subject: [rdo-dev] [tripleo] missing centos-8 rpms for kolla builds In-Reply-To: References: <86b5b5b7-8f0c-9bc7-6275-cce1c353cd48@linaro.org> <449b1a03-2066-bea1-0a53-91dc59a3d58c@linaro.org> Message-ID: On Tue, Jan 28, 2020 at 5:53 PM Mark Goddard wrote: > On Tue, 28 Jan 2020 at 15:18, Mark Goddard wrote: > > > > On Mon, 27 Jan 2020 at 09:18, Radosław Piliszek > > wrote: > > > > > > I know it was for masakari. > > > Gaëtan had to grab crmsh from opensuse: > > > > http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/ > > > > > > -yoctozepto > > > > Thanks Wes for getting this discussion going. 
I've been looking at > > CentOS 8 today and trying to assess where we are. I created an > > Etherpad to track status: > > https://etherpad.openstack.org/p/kolla-centos8 > > uwsgi and etcd are now available in rdo dependencies repo. Let me know if you find some issue with it. > We are seeing an odd DNF error sometimes. DNF exits 141 with no error > code when installing packages. It often happens on the rabbitmq and > grafana images. There is a prompt about importing GPG keys prior to > the error. > > Example: > https://4eff4bb69c321960be39-770d619687de1bce0976465c40e4e9ca.ssl.cf2.rackcdn.com/693544/33/check/kolla-ansible-centos8-source-mariadb/93a8351/primary/logs/build/000_FAILED_kolla-toolbox.log > > Related bug report? https://github.com/containers/libpod/issues/4431 > > Anyone familiar with it? > > Didn't know about this issue. BTW, there is rabbitmq-server in RDO dependencies repo if you are interested in using it from there instead of rabbit repo. > > > > > > > pon., 27 sty 2020 o 10:13 Marcin Juszkiewicz > > > napisał(a): > > > > > > > > W dniu 27.01.2020 o 09:48, Alfredo Moralejo Alonso pisze: > > > > > How is crmsh used in these images?, ha packages included in > > > > > HighAvailability repo in CentOS includes pcs and some crm_* > commands in pcs > > > > > and pacemaker-cli packages. IMO, tt'd be good to switch to those > commands > > > > > to manage the cluster. > > > > > > > > No idea. Gaëtan Trellu may know - he created those images. > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Wed Jan 29 11:45:03 2020 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 29 Jan 2020 12:45:03 +0100 Subject: [largescale-sig] Meeting summary and next actions Message-ID: <5b0aa1b1-2832-01d4-9f37-628b421f2378@openstack.org> Hi everyone, The Large Scale SIG held a meeting today. You can access the summary and logs of the meeting at: http://eavesdrop.openstack.org/meetings/large_scale_sig/2020/large_scale_sig.2020-01-29-09.00.html Regarding progress on our "Documenting large scale operations" goal, we turned our collection of relevant articles more as a group background activity than an immediate TODO. The idea is more to remember to add links to the etherpad[1] when we come across relevant content. oneswig mentioned the list to the Scientific SIG, in hopes that it would trigger more entries. [1] https://etherpad.openstack.org/p/large-scale-sig-documentation On the "Scaling within one cluster, and instrumentation of the bottlenecks" goal, nobody filed a scaling story yet on the etherpad[2] set up to collect them. masahito produced a first draft of the oslo.metrics blueprint[3], up for the group review. Finally, oneswig signed up to prepare a show-and-tell about some investigation they have been doing in this area, for presentation at next meeting. 
[2] https://etherpad.openstack.org/p/scaling-stories [3] https://review.opendev.org/#/c/704733/ Between now and next meeting, group members should prioritize: - Reviewing oslo.metrics draft at https://review.opendev.org/#/c/704733/ and comment so that we can iterate on it - Reading page on golden signals at https://landing.google.com/sre/sre-book/chapters/monitoring-distributed-systems/#xref_monitoring_golden-signals Other action items: - oneswig to prepare show-and-tell for presentation at next meeting - ttx to add meeting to eavesdrop on a every-two-week cadence - all post short descriptions of what happens (what breaks first) when scaling up a single cluster to https://etherpad.openstack.org/p/scaling-stories The next meeting will happen on February 12, at 9:00 UTC on #openstack-meeting. Cheers, -- Thierry Carrez (ttx) From dougal at redhat.com Wed Jan 29 13:26:19 2020 From: dougal at redhat.com (Dougal Matthews) Date: Wed, 29 Jan 2020 13:26:19 +0000 Subject: [tripleo] Removing the `openstack overcloud plan *` commands In-Reply-To: References: Message-ID: On Tue, 28 Jan 2020 at 15:15, Alex Schultz wrote: > On Tue, Jan 28, 2020 at 8:10 AM Dougal Matthews wrote: > > > > > > > > On Tue, 28 Jan 2020 at 14:54, Dougal Matthews wrote: > >> > >> Hey all, > >> > >> While doing work on the Mistral to Ansible port I was looking at the > openstack overcloud plan commands. These are; > >> > >> - openstack overcloud plan create > >> - openstack overcloud plan delete > >> - openstack overcloud plan deploy > >> - openstack overcloud plan list > >> - openstack overcloud plan export > > > > > > There has been a useful comment on the review that export is actually > useful still. Harald said; > > > > "I find the possibility to download the plan on a failed deployment > quite valuable. It allows me to read the heat templates in a more human > friendly fully rendered version compared to the j2, run yaml validation > tools etc. There is *magic* adding stuff to plan's that I can't see by > simply running process templates tools." > > > > This seems like a good reason, although it could be argued that Swift is > a better tool to download a container. > > > > I've commented but I don't think we're ready to remove these yet until > we've gotten off of swift. TBH trying to download a swift container is > a painful via openstackcli so I'd rather that we leave these basic > commands. Additionally it doesn't require that an end user understand > that a plan is stored in swift. End users should be using 'openstack > overcloud *' commands to perform actions and not going and doing > direct nova/neutron/ironic/swift related actions. We've seen folks do > some dangerous stuff when they start toying with the underlying > implementations. > That is fair. Thanks. I guess there are some uses that have emerged even if this work was never fully completed. I'll start porting them to Ansible instead. Thanks! > > > > >> > >> > >> I believe none of these commands make sense in a post-TripleO UI world. > There is no other way to interact and update a plan. Deploys are always > done via "openstack overcloud deploy" and this deletes the contents of the > plan container and repopulates it with the local files[1]. > >> > >> I am therefore proposing that we remove these commands and skip the > normal deprecation process. https://review.opendev.org/#/c/704581/1 > >> > >> What do you think? 
> >> > >> Thanks, > >> Dougal > >> > >> > >> [1]: > https://github.com/openstack/python-tripleoclient/blob/3c589979ceb05d732b3c924eb919e145e22efaa8/tripleoclient/workflows/plan_management.py#L211 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbultel at redhat.com Wed Jan 29 13:53:33 2020 From: mbultel at redhat.com (Mathieu Bultel) Date: Wed, 29 Jan 2020 14:53:33 +0100 Subject: [tripleo] Removing the `openstack overcloud plan *` commands In-Reply-To: References: Message-ID: Hi all, In any case, there is another way to get the plan content for debugging purposes: openstack container save overcloud So not sure if the openstack overcloud commands helps a lot, and if it's not redundant with openstack container commands. Mathieu On Wed, Jan 29, 2020 at 2:32 PM Dougal Matthews wrote: > > > On Tue, 28 Jan 2020 at 15:15, Alex Schultz wrote: > >> On Tue, Jan 28, 2020 at 8:10 AM Dougal Matthews >> wrote: >> > >> > >> > >> > On Tue, 28 Jan 2020 at 14:54, Dougal Matthews >> wrote: >> >> >> >> Hey all, >> >> >> >> While doing work on the Mistral to Ansible port I was looking at the >> openstack overcloud plan commands. These are; >> >> >> >> - openstack overcloud plan create >> >> - openstack overcloud plan delete >> >> - openstack overcloud plan deploy >> >> - openstack overcloud plan list >> >> - openstack overcloud plan export >> > >> > >> > There has been a useful comment on the review that export is actually >> useful still. Harald said; >> > >> > "I find the possibility to download the plan on a failed deployment >> quite valuable. It allows me to read the heat templates in a more human >> friendly fully rendered version compared to the j2, run yaml validation >> tools etc. There is *magic* adding stuff to plan's that I can't see by >> simply running process templates tools." >> > >> > This seems like a good reason, although it could be argued that Swift >> is a better tool to download a container. >> > >> >> I've commented but I don't think we're ready to remove these yet until >> we've gotten off of swift. TBH trying to download a swift container is >> a painful via openstackcli so I'd rather that we leave these basic >> commands. Additionally it doesn't require that an end user understand >> that a plan is stored in swift. End users should be using 'openstack >> overcloud *' commands to perform actions and not going and doing >> direct nova/neutron/ironic/swift related actions. We've seen folks do >> some dangerous stuff when they start toying with the underlying >> implementations. >> > > That is fair. Thanks. I guess there are some uses that have emerged even > if this work was never fully completed. > > I'll start porting them to Ansible instead. > > Thanks! > > >> >> > >> >> >> >> >> >> I believe none of these commands make sense in a post-TripleO UI >> world. There is no other way to interact and update a plan. Deploys are >> always done via "openstack overcloud deploy" and this deletes the contents >> of the plan container and repopulates it with the local files[1]. >> >> >> >> I am therefore proposing that we remove these commands and skip the >> normal deprecation process. https://review.opendev.org/#/c/704581/1 >> >> >> >> What do you think? >> >> >> >> Thanks, >> >> Dougal >> >> >> >> >> >> [1]: >> https://github.com/openstack/python-tripleoclient/blob/3c589979ceb05d732b3c924eb919e145e22efaa8/tripleoclient/workflows/plan_management.py#L211 >> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mark at stackhpc.com Wed Jan 29 14:36:06 2020 From: mark at stackhpc.com (Mark Goddard) Date: Wed, 29 Jan 2020 14:36:06 +0000 Subject: [rdo-dev] [tripleo] missing centos-8 rpms for kolla builds In-Reply-To: References: <86b5b5b7-8f0c-9bc7-6275-cce1c353cd48@linaro.org> <449b1a03-2066-bea1-0a53-91dc59a3d58c@linaro.org> Message-ID: On Wed, 29 Jan 2020 at 11:31, Alfredo Moralejo Alonso wrote: > > > > On Tue, Jan 28, 2020 at 5:53 PM Mark Goddard wrote: >> >> On Tue, 28 Jan 2020 at 15:18, Mark Goddard wrote: >> > >> > On Mon, 27 Jan 2020 at 09:18, Radosław Piliszek >> > wrote: >> > > >> > > I know it was for masakari. >> > > Gaëtan had to grab crmsh from opensuse: >> > > http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/ >> > > >> > > -yoctozepto >> > >> > Thanks Wes for getting this discussion going. I've been looking at >> > CentOS 8 today and trying to assess where we are. I created an >> > Etherpad to track status: >> > https://etherpad.openstack.org/p/kolla-centos8 >> > > uwsgi and etcd are now available in rdo dependencies repo. Let me know if you find some issue with it. I found them, thanks. > >> >> We are seeing an odd DNF error sometimes. DNF exits 141 with no error >> code when installing packages. It often happens on the rabbitmq and >> grafana images. There is a prompt about importing GPG keys prior to >> the error. >> >> Example: https://4eff4bb69c321960be39-770d619687de1bce0976465c40e4e9ca.ssl.cf2.rackcdn.com/693544/33/check/kolla-ansible-centos8-source-mariadb/93a8351/primary/logs/build/000_FAILED_kolla-toolbox.log >> >> Related bug report? https://github.com/containers/libpod/issues/4431 >> >> Anyone familiar with it? >> > > Didn't know about this issue. > > BTW, there is rabbitmq-server in RDO dependencies repo if you are interested in using it from there instead of rabbit repo. It seems to be due to the use of a GPG check on the repo (as opposed to packages). DNF doesn't use keys imported via rpm --import for this (I'm not sure what it uses), and prompts to add the key. This breaks without a terminal. More explanation here: https://review.opendev.org/#/c/704782. > >> > >> > > >> > > pon., 27 sty 2020 o 10:13 Marcin Juszkiewicz >> > > napisał(a): >> > > > >> > > > W dniu 27.01.2020 o 09:48, Alfredo Moralejo Alonso pisze: >> > > > > How is crmsh used in these images?, ha packages included in >> > > > > HighAvailability repo in CentOS includes pcs and some crm_* commands in pcs >> > > > > and pacemaker-cli packages. IMO, tt'd be good to switch to those commands >> > > > > to manage the cluster. >> > > > >> > > > No idea. Gaëtan Trellu may know - he created those images. >> > > > >> > > >> From doka.ua at gmx.com Wed Jan 29 14:44:02 2020 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Wed, 29 Jan 2020 16:44:02 +0200 Subject: [NOVA] instance hostname vs display_name vs dns_name Message-ID: Dear colleagues, I'm using DNS Integration and just faced an issue - after renaming instance, I can't bind port to the instance using new name: 1) I've created intances with name 'devel' 2) then I renamed it to devel (openstack server set --name packager devel) 3) when binding port with dns_name using new name ('packager' in my case), the following error appear: ERROR nova.api.openstack.wsgi PortNotUsableDNS: Port c3a92cf6-b49b-4570-b69b-0c23af1d1f94 not usable for instance 6aa78bd5-099e-4878-a5ac-90262505a924. 
Value packager assigned to dns_name attribute does not match instance's hostname devel and yes, record in DB still uses an old hostname: mysql> select display_name from instances where hostname='devel'; +--------------+ | display_name | +--------------+ | packager | +--------------+ 1 row in set (0.00 sec) and hostname remains the same (initial) regardless of any changes to display_name. I'm on Rocky. Is it bug or feature and are there ways to work around this? Thank you. -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison -------------- next part -------------- An HTML attachment was scrubbed... URL: From frickler at offenerstapel.de Wed Jan 29 14:57:43 2020 From: frickler at offenerstapel.de (Jens Harbott) Date: Wed, 29 Jan 2020 14:57:43 +0000 Subject: [qa] Proposing =?UTF-8?Q?Rados=C5=82aw?= Piliszek to devstack core In-Reply-To: <16fe7556c9c.c1957cd473318.237960248883865388@ghanshyammann.com> References: <16fe7556c9c.c1957cd473318.237960248883865388@ghanshyammann.com> Message-ID: <41ca2fc4729f25b5f1073f45ecbfe502f4426e24.camel@offenerstapel.de> On Mon, 2020-01-27 at 08:08 -0600, Ghanshyam Mann wrote: > Hello Everyone, > > Radosław Piliszek (yoctozepto) has been doing nice work in devstack > from code as well as review perspective. > He has been helping for many bugs fixes nowadays and having him as > Core will help us to speed up the things. > > I would like to propose him for Devstack Core. You can vote/feedback > on this email. If no objection by end of this week, I will add him to > the list. Big +2 from me. Jens (frickler) From skaplons at redhat.com Wed Jan 29 20:30:34 2020 From: skaplons at redhat.com (Slawek Kaplonski) Date: Wed, 29 Jan 2020 21:30:34 +0100 Subject: [neutron][drivers team] Proposing Nate Johnston as drivers team member In-Reply-To: References: <20200122215847.zsfgihjmmjkwlrdl@skaplons-mac> Message-ID: <20200129203034.7ztacox53bjh2qkp@skaplons-mac> Hi, It's been a week since I sent this nomination and there was only very positive feedback about that. So welcome in the drivers team Nate :) On Wed, Jan 29, 2020 at 09:46:42AM +0900, Akihiro Motoki wrote: > Great addition. +1 from me. > > On Thu, Jan 23, 2020 at 7:02 AM Slawek Kaplonski wrote: > > > > Hi Neutrinos, > > > > I would like to propose Nate Johnston to be part of Neutron drivers team. > > Since long time Nate is very active Neutron's core reviewer. He is also actively > > participating in our Neutron drivers team meetings and he shown there that he > > has big experience and knowledge about Neutron, Neutron stadium projects as well > > as whole OpenStack. > > I think that he really deservers to be part of this team and that he will be > > great addition it. > > I will wait for Your feedback for 1 week and if there will be no any votes > > agains, I will add Nate to drivers team in next week. > > > > -- > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > > > > -- Slawek Kaplonski Senior software engineer Red Hat From jp.methot at planethoster.info Wed Jan 29 20:50:13 2020 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Wed, 29 Jan 2020 15:50:13 -0500 Subject: [cinder][nova] Migrating servers' root block devices from a cinder backend to another Message-ID: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> Hi, We have a several hundred VMs which were built on cinder block devices as root drives which use a SAN backend. Now we want to change their backend from the SAN to Ceph. 
We can shutdown the VMs but we will not destroy them. I am aware that there is a cinder migrate volume command to change a volume’s backend, but it requires that the volume be completely detached. Forcing a detached state on that volume does let the volume migration take place, but the volume’s path in Nova block_device_mapping doesn’t update, for obvious reasons. So, I am considering forcing the volumes to a detached status in Cinder and then manually updating the nova db block_device_mapping entry for each volume so that the VM can boot back up afterwards. However, before I start toying with the database and accidentally break stuff, has anyone else ever done something similar? Got any tips or hints on how best to proceed? Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashlee at openstack.org Wed Jan 29 22:04:57 2020 From: ashlee at openstack.org (Ashlee Ferguson) Date: Wed, 29 Jan 2020 16:04:57 -0600 Subject: Help Shape the Track Content for OpenDev + PTG, June 8-11 in Vancouver Message-ID: Hi everyone, Hopefully by now you’ve heard about our upcoming event, OpenDev + PTG Vancouver, June 8-11, 2020 . We need your help shaping this event! Our vision is for the content to be programmed by you-- the community. OSF is looking to kick things off by selecting members for OpenDev Programming Committees for each Track. That Program Committee will then select Moderators who will lead interactive discussions on a particular topic within the track. Below you'll have the opportunity to nominate yourself for a position on the Programming Committee, as a Moderator, or both, as well as suggesting specific Topics within each Track. PTG programming will kick off in the coming weeks. If you’re interested in volunteering as an OpenDev Programming Committee member, discussion Moderator, or would like to suggest topics for moderated discussions within a particular Track, please read the details below, and then fill out this form . 
We’re looking for subject matter experts on the following OpenDev Tracks: - Hardware Automation (accelerators, provisioning hardware, networking) - Large-scale Usage of Open Source Infrastructure Software (scale pain points, multi-location, CI/CD) - Containers in Production (isolation, virtualization, telecom containers) - Key Challenges for Open Source in 2020 (beyond licensing, public clouds, ethics) OpenDev Programming Committee members will: Work with other Committee members, which will include OSF representatives, to curate OpenDev content based on subject expertise, community input, and relevance to open source infrastructure Promote the individual Tracks within your networks Review community input and suggestions for Track discussions Solicit moderators from your network if you know someone who is a subject matter expert Ensure diversity of speakers and companies represented in your Track Focus topics around on real-world user stories and technical, in-the-trenches experiences Programming Committee members need to be available during the following dates/time commitments: 8 - 10 hours from February - May for bi-weekly calls with your Track's Programming Committee (plus a couple of OSF representatives to facilitate the call) OpenDev, June 8 - 10, 2020 (not required, but preferred) Programming Committee members will receive a complimentary pass to the event Programming Committees will be comprised of a few people per Track who will work to select a handful of topics and moderators for each Track. The exact topic counts will be determined before Committees begin deciding. OpenDev Discussion Moderators will Be appointed by the Programming Committees Facilitate discussions within a particular Track Have adequate knowledge and experience to lead and moderate discussion around certain topics during the event Work with Programming Committee to decide focal point of discussion Moderators need to be available to attend OpenDev, June 8 - 10, 2020, and will receive a complimentary pass. Programming Committee nominations are open until February 11. Deadlines to volunteer to be a moderator and suggest topics will be in late February. Nominate yourself or suggest discussion topics here: https://openstackfoundation.formstack.com/forms/opendev_vancouver2020_volunteer Cheers, The OpenStack Foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Wed Jan 29 22:17:31 2020 From: smooney at redhat.com (Sean Mooney) Date: Wed, 29 Jan 2020 22:17:31 +0000 Subject: [NOVA] instance hostname vs display_name vs dns_name In-Reply-To: References: Message-ID: <89bb79644030812f7846c51896a79ae6aa32e595.camel@redhat.com> On Wed, 2020-01-29 at 16:44 +0200, Volodymyr Litovka wrote: > Dear colleagues, > > I'm using DNS Integration and just faced an issue - after renaming > instance, I can't bind port to the instance using new name: > > 1) I've created intances with name 'devel' > 2) then I renamed it to devel ((openstack server set --name packager devel) > 3) when binding port with dns_name using new name ('packager' in my > case), the following error appear: > ERROR nova.api.openstack.wsgi PortNotUsableDNS: Port > c3a92cf6-b49b-4570-b69b-0c23af1d1f94 not usable for instance > 6aa78bd5-099e-4878-a5ac-90262505a924. 
Value packager assigned to > dns_name attribute does not match instance's hostname devel > > and yes, record in DB still uses an old hostname: > > mysql> select display_name from instances where hostname='devel'; > +--------------+ > > display_name | > > +--------------+ > > packager | > > +--------------+ > 1 row in set (0.00 sec) > > > and hostname remains the same (initial) regardless of any changes to > display_name. > > I'm on Rocky. Is it bug or feature and are there ways to work around this? i think you should be able to change the display name but i would not expect that display name to change the host name used by the guest. it is likely a bug that the portbinding appears to be using the displayname for designate integration. i know we use the display name as the hostname for the vm by default but i would not expect openstack server set --name packager devel to alter the hostname served to the vm over dhcp once the vm is intially created. i would expect to be able to alther that via designate and for the display name to be independt of the host name after the inital boot. so yes there is likely a bug in that we are using the disply name somewhere we shoudl not be. > > Thank you. > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." -- Thomas Edison > From gmann at ghanshyammann.com Wed Jan 29 22:57:25 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 29 Jan 2020 16:57:25 -0600 Subject: [qa][stable][tempest-plugins]: Tempest & plugins py2 jobs failure for stable branches (1860033: the EOLing python2 drama) In-Reply-To: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> References: <16fb1aa4aae.10e957b6324515.5822370422740200537@ghanshyammann.com> Message-ID: <16ff38609c1.c06eebfb73294.1918371195388980302@ghanshyammann.com> ---- On Thu, 16 Jan 2020 22:02:05 -0600 Ghanshyam Mann wrote ---- > Hello Everyone, > > This is regarding bug: https://bugs.launchpad.net/tempest/+bug/1860033. Using Radosław's fancy statement > of 'EOLing python2 drama' in subject :). > > neutron tempest plugin job on stable/rocky started failing as neutron-lib dropped the py2. neutron-lib 2.0.0 > is py3 only and so does u-c on the master has been updated to 2.0.0. > > All tempest and its plugin uses the master u-c for stable branch testing which is the valid way because of master Tempest & plugin > is being used to test the stable branches which need u-c from master itself. These failed jobs also used master u-c[1] which is trying > to install the latest neutron-lib and failing. > > This is not just neutron tempest plugin issue but for all Tempest plugins jobs. Any lib used by Tempest or plugins can drop the > py2 now and leads to this failure. Its just neutron-lib raised the flag first before I plan to hack on Tempest & plugins jobs for py2 drop > from master and kepe testing py2 on stable bracnhes. > > We have two way to fix this: > > 1. Separate out the testing of python2 jobs with python2 supported version of Tempest plugins and with respective u-c. > For example, test all python2 job with tempest plugin train version (or any latest version if any which support py2) and > use u-c from stable/train. This will cap the Tempest & plugins with respective u-c for stable branches testing. > > 2. Second option is to install the tempest and plugins in py3 env on py2 jobs also. This should be an easy and preferred way. > I am trying this first[2] and testing[3]. > I am summarizing what Tempest and its plugins should be doing/done for these incompatible issues. 
Tried option#2: We tried to install the py3.6 (from ppa which is not the best solution) in Tempest venv on ubuntu Xenail to fix the bug like 1860033 [1]. This needs Tempest to bump the py version for tox env t 3.6[2]. But that broke the distro job where py > 3.6 was available like fedora (Bug 1861308). This can be fixed by making basepython as pythion3 and more hack for example to set the python alias on such distro. It can be stable common jobs running on Xenial or distro-specific job like centos7 etc where we have < py3.6. Overall this option did not work well as this need lot of hacks depends on the distro. I am dropping this option for our CI/CD. But you can try this on your production cloud testing where you do not need to handle multiple distro cases. Testing your cloud with the latest Tempest is the best possible way. Going with option#1: IMO, this is a workable option with the current situation. Below is plan to make Tempest and its plugins working for all possible distro/py version. 1. Drop py3.5 from Tempest (also from its plugins if anyone officially supports). * Tempest and its plugin's dependencies are becoming python-requires >=3.6 so Tempest and plugins itself cannot support py3.5. * 'Tempest cannot support py3.5' means cannot run Tempest/plugins on py3.5 env. But still, you can test py3.5 cloud from Tempest on >py3.6 env(venv or separate node). * Patch is up - https://review.opendev.org/#/c/704840/ 2.Modify Tempest tox env basepython to py3 * Let's not pin Tempest for py3.6. Any python version >=py3.6 should be working fine for distro does not have py3.6 like fedora or future distro *Patch is up- https://review.opendev.org/#/c/704688/2 3. Use compatible Tempest & its plugin tag for distro having [1] https://zuul.opendev.org/t/openstack/build/fb8a928ed3614e09a9a3cf4637f2f6c2/log/job-output.txt#33040 > [2] https://review.opendev.org/#/c/703011/ > [3] https://review.opendev.org/#/c/703012/ > > > -gmanne > > > > > From rosmaita.fossdev at gmail.com Wed Jan 29 23:06:06 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Wed, 29 Jan 2020 18:06:06 -0500 Subject: [cinder] driver removal policy update (draft) Message-ID: I've posted a patch updating our third-party driver removal policy along the lines discussed at today's cinder meeting: https://review.opendev.org/704906 Let's continue the discussion in gerrit. I'll update the wiki after we've come to a consensus on the patch. cheers, brian From Albert.Braden at synopsys.com Wed Jan 29 23:24:49 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Wed, 29 Jan 2020 23:24:49 +0000 Subject: [largescale-sig] Meeting summary and next actions In-Reply-To: <5b0aa1b1-2832-01d4-9f37-628b421f2378@openstack.org> References: <5b0aa1b1-2832-01d4-9f37-628b421f2378@openstack.org> Message-ID: Sorry I've been busy; I meant to do that last week. I added my recent issues. -----Original Message----- From: Thierry Carrez Sent: Wednesday, January 29, 2020 3:45 AM To: openstack-discuss at lists.openstack.org Subject: [largescale-sig] Meeting summary and next actions Hi everyone, The Large Scale SIG held a meeting today. 
You can access the summary and logs of the meeting at: https://urldefense.proofpoint.com/v2/url?u=http-3A__eavesdrop.openstack.org_meetings_large-5Fscale-5Fsig_2020_large-5Fscale-5Fsig.2020-2D01-2D29-2D09.00.html&d=DwICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=Ij-wE2KMiedmFUUmy5d-hWgxPAz0M-DU_dGBs8K5mVI&s=OUwbthdppBOYuAf0urAiITeq5ycSNRUTt6T2PUdZFMw&e= Regarding progress on our "Documenting large scale operations" goal, we turned our collection of relevant articles more as a group background activity than an immediate TODO. The idea is more to remember to add links to the etherpad[1] when we come across relevant content. oneswig mentioned the list to the Scientific SIG, in hopes that it would trigger more entries. [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__etherpad.openstack.org_p_large-2Dscale-2Dsig-2Ddocumentation&d=DwICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=Ij-wE2KMiedmFUUmy5d-hWgxPAz0M-DU_dGBs8K5mVI&s=81Wc_HOvQdGKN57WpEkZNhMX3lSbPJX43WAKqujgYE8&e= On the "Scaling within one cluster, and instrumentation of the bottlenecks" goal, nobody filed a scaling story yet on the etherpad[2] set up to collect them. masahito produced a first draft of the oslo.metrics blueprint[3], up for the group review. Finally, oneswig signed up to prepare a show-and-tell about some investigation they have been doing in this area, for presentation at next meeting. [2] https://urldefense.proofpoint.com/v2/url?u=https-3A__etherpad.openstack.org_p_scaling-2Dstories&d=DwICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=Ij-wE2KMiedmFUUmy5d-hWgxPAz0M-DU_dGBs8K5mVI&s=gDTfSN910x8_ELszNWg8KfqYEXARjwVIA6Ebf3w-jUQ&e= [3] https://urldefense.proofpoint.com/v2/url?u=https-3A__review.opendev.org_-23_c_704733_&d=DwICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=Ij-wE2KMiedmFUUmy5d-hWgxPAz0M-DU_dGBs8K5mVI&s=b2FZ4aN17ftxROcpxdJGQI6gmMnuIlsn62QDvQppxaw&e= Between now and next meeting, group members should prioritize: - Reviewing oslo.metrics draft at https://urldefense.proofpoint.com/v2/url?u=https-3A__review.opendev.org_-23_c_704733_&d=DwICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=Ij-wE2KMiedmFUUmy5d-hWgxPAz0M-DU_dGBs8K5mVI&s=b2FZ4aN17ftxROcpxdJGQI6gmMnuIlsn62QDvQppxaw&e= and comment so that we can iterate on it - Reading page on golden signals at https://urldefense.proofpoint.com/v2/url?u=https-3A__landing.google.com_sre_sre-2Dbook_chapters_monitoring-2Ddistributed-2Dsystems_-23xref-5Fmonitoring-5Fgolden-2Dsignals&d=DwICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=Ij-wE2KMiedmFUUmy5d-hWgxPAz0M-DU_dGBs8K5mVI&s=ou2rk22nF--NoZDmpTicSiu9MC2MbDL6ZmyfhUKO1VQ&e= Other action items: - oneswig to prepare show-and-tell for presentation at next meeting - ttx to add meeting to eavesdrop on a every-two-week cadence - all post short descriptions of what happens (what breaks first) when scaling up a single cluster to https://urldefense.proofpoint.com/v2/url?u=https-3A__etherpad.openstack.org_p_scaling-2Dstories&d=DwICaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=Ij-wE2KMiedmFUUmy5d-hWgxPAz0M-DU_dGBs8K5mVI&s=gDTfSN910x8_ELszNWg8KfqYEXARjwVIA6Ebf3w-jUQ&e= The next meeting will happen on February 12, at 9:00 UTC on #openstack-meeting. 
Cheers, -- Thierry Carrez (ttx) From sbaker at redhat.com Thu Jan 30 00:45:40 2020 From: sbaker at redhat.com (Steve Baker) Date: Thu, 30 Jan 2020 13:45:40 +1300 Subject: [ironic] An approach to removing WSME Message-ID: I've put together a set of changes for removing WSME which involves copying just enough of it into ironic, and I think we need to have the conversation about whether this approach is desirable :) This git branch[1] finishes with WSME removed and existing tests passing. Here are some stats about lines of code (not including unit tests, calculated with cloc): 4500 wsme/wsme 6000 ironic/ironic/api master 7000 ironic/ironic/api story/1651346 In words, we need 1000 out of 4500 lines of WSME source in order to support 6000 lines of ironic specific API code. Switching to a replacement for WSME would likely touch a good proportion of that 6000 lines of ironic specific API code. If we eventually replace it with a new library I think it would be easier if the thing being replaced is inside the ironic source tree to allow for a gradual transition. So this approach could be an end in itself, or it could be a step towards the final goal. My strategy for copying in code was to pull in chunks of required logic while removing some unused features (like request pass-through and date & time type serialization). I also replaced py2/3 patterns with pure py3. There is likely further things which can be removed or refactored for simplicity, but what exists currently works. If there is enough positive feedback for this approach I'll start on unit test coverage for the new code. [1] https://review.opendev.org/#/q/topic:story/1651346 From gmann at ghanshyammann.com Thu Jan 30 01:50:23 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 29 Jan 2020 19:50:23 -0600 Subject: [goals][Drop Python 2.7 Support] Week R-15 Update (# 2.5 weeks left to complete) Message-ID: <16ff4246468.b2c2996274248.600585738113017106@ghanshyammann.com> Hello Everyone, Below is the progress on "Drop Python 2.7 Support" for R-15 week. Schedule: https://governance.openstack.org/tc/goals/selected/ussuri/drop-py27.html#schedule Highlights: ======== * Schedule is to finish the py2-drop from every deliverable (except requirement repo) by m-2 on R13. Only 2.5 weeks left to finish the work. * There are still cases where Tempest and plugins does not work due to py2 drops. New plan & option has been sent in separate ML thread [1] and work in progress. * Few services merged the patches but not all. If your project is in the below list, I request PTLs to review it on priority. Project wise status and need reviews: ============================ Phase-1 status: The OpenStack services have not merged the py2 drop patches: NOTE: This was supposed to be completed by milestone-1 (Dec 13th, 19). * Adjutant * Karbor * Masakari * Tacker * Qinling * Tricircle Phase-2 status: This is ongoing work and I think most of the repo have patches up to review. Try to review them on priority. If any query and I missed to respond on review, feel free to ping me in irc. * Open review: https://review.opendev.org/#/q/topic:drop-py27-support+status:open How you can help: ============== - Review the patches. Push the patches if I missed any repo. [1] http://lists.openstack.org/pipermail/openstack-discuss/2020-January/012241.html -gmann From missile0407 at gmail.com Thu Jan 30 02:23:35 2020 From: missile0407 at gmail.com (Eddie Yen) Date: Thu, 30 Jan 2020 10:23:35 +0800 Subject: [kolla] ujson issue affected to few containers. 
Message-ID: Hi everyone, I'm not sure it should be bug report or not. So I email out about this issue. In these days, I found the Rocky source deployment always failed at Ceilometer bootstrapping. Then I found it failed at ceilometer-upgrade. So I tried to looking at ceilometer-upgrade.log and the error shows it failed to import ujson. https://pastebin.com/nGqsM0uf Then I googled it and found this issue is already happened and released fixes. https://github.com/esnme/ultrajson/issues/346 But it seems like the container still using the questionable one, even today (Jan 30 UTC+8). And this not only affected to Ceilometer, but may also Gnocchi. I think we have to patch it, but not sure about the workaround. Does anyone have good idea? Many thanks, Eddie. -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Thu Jan 30 07:47:53 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 30 Jan 2020 08:47:53 +0100 Subject: [kolla] ujson issue affected to few containers. In-Reply-To: References: Message-ID: Hi Eddie, the issue is that the project did *not* do a release. The latest is still 1.35 from Jan 20, *2016*... [1] You said only Rocky source - but is this ubuntu or centos? Also, by the looks of [2] master ceilometer is no longer affected, but monasca and mistral might still be if they call affected paths. The project looks dead so we are fried unless we override and start using its sources from git (hacky hacky). [1] https://pypi.org/project/ujson/#history [2] http://codesearch.openstack.org/?q=ujson&i=nope&files=&repos= -yoctozepto czw., 30 sty 2020 o 03:31 Eddie Yen napisał(a): > > Hi everyone, > > I'm not sure it should be bug report or not. So I email out about this issue. > > In these days, I found the Rocky source deployment always failed at Ceilometer bootstrapping. Then I found it failed at ceilometer-upgrade. > So I tried to looking at ceilometer-upgrade.log and the error shows it failed to import ujson. > > https://pastebin.com/nGqsM0uf > > Then I googled it and found this issue is already happened and released fixes. > https://github.com/esnme/ultrajson/issues/346 > > But it seems like the container still using the questionable one, even today (Jan 30 UTC+8). > And this not only affected to Ceilometer, but may also Gnocchi. > > I think we have to patch it, but not sure about the workaround. > Does anyone have good idea? > > Many thanks, > Eddie. From missile0407 at gmail.com Thu Jan 30 08:06:10 2020 From: missile0407 at gmail.com (Eddie Yen) Date: Thu, 30 Jan 2020 16:06:10 +0800 Subject: [kolla] ujson issue affected to few containers. In-Reply-To: References: Message-ID: Hi Radosław, Sorry about lost distro information, the distro we're using is Ubuntu. We have an old copy of ceilometer container image, the ujson.so version between old and latest are both 1.35 But only latest one affected this issue. BTW, I read the last reply on issue page. Since he said the python 3 with newer GCC is OK, I think it may caused by python version issue or GCC compiler versioning. It may become a huge architect if it really caused by compiling issue, if Ubuntu updated GCC or python. Radosław Piliszek 於 2020年1月30日 週四 下午3:48寫道: > Hi Eddie, > > the issue is that the project did *not* do a release. > The latest is still 1.35 from Jan 20, *2016*... [1] > > You said only Rocky source - but is this ubuntu or centos? 
> > Also, by the looks of [2] master ceilometer is no longer affected, but > monasca and mistral might still be if they call affected paths. > > The project looks dead so we are fried unless we override and start > using its sources from git (hacky hacky). > > [1] https://pypi.org/project/ujson/#history > [2] http://codesearch.openstack.org/?q=ujson&i=nope&files=&repos= > > -yoctozepto > > > czw., 30 sty 2020 o 03:31 Eddie Yen napisał(a): > > > > Hi everyone, > > > > I'm not sure it should be bug report or not. So I email out about this > issue. > > > > In these days, I found the Rocky source deployment always failed at > Ceilometer bootstrapping. Then I found it failed at ceilometer-upgrade. > > So I tried to looking at ceilometer-upgrade.log and the error shows it > failed to import ujson. > > > > https://pastebin.com/nGqsM0uf > > > > Then I googled it and found this issue is already happened and released > fixes. > > https://github.com/esnme/ultrajson/issues/346 > > > > But it seems like the container still using the questionable one, even > today (Jan 30 UTC+8). > > And this not only affected to Ceilometer, but may also Gnocchi. > > > > I think we have to patch it, but not sure about the workaround. > > Does anyone have good idea? > > > > Many thanks, > > Eddie. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Thu Jan 30 08:20:03 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 30 Jan 2020 09:20:03 +0100 Subject: [kolla] ujson issue affected to few containers. In-Reply-To: References: Message-ID: Hi Eddie, ujson is one of the few packages that has no wheel, so it is being recompiled each time. It might be the case that some change to gcc and/or python-dev parts caused that. The issue looks like an edge case of c99 spec non-conformity so valid compiles may break on it (or work - just that there is no guarantee). -yoctozepto czw., 30 sty 2020 o 09:06 Eddie Yen napisał(a): > > Hi Radosław, > > Sorry about lost distro information, the distro we're using is Ubuntu. > > We have an old copy of ceilometer container image, the ujson.so version between old and latest are both 1.35 > But only latest one affected this issue. > > BTW, I read the last reply on issue page. Since he said the python 3 with newer GCC is OK, I think it may caused by python version issue or GCC compiler versioning. > It may become a huge architect if it really caused by compiling issue, if Ubuntu updated GCC or python. > > Radosław Piliszek 於 2020年1月30日 週四 下午3:48寫道: >> >> Hi Eddie, >> >> the issue is that the project did *not* do a release. >> The latest is still 1.35 from Jan 20, *2016*... [1] >> >> You said only Rocky source - but is this ubuntu or centos? >> >> Also, by the looks of [2] master ceilometer is no longer affected, but >> monasca and mistral might still be if they call affected paths. >> >> The project looks dead so we are fried unless we override and start >> using its sources from git (hacky hacky). >> >> [1] https://pypi.org/project/ujson/#history >> [2] http://codesearch.openstack.org/?q=ujson&i=nope&files=&repos= >> >> -yoctozepto >> >> >> czw., 30 sty 2020 o 03:31 Eddie Yen napisał(a): >> > >> > Hi everyone, >> > >> > I'm not sure it should be bug report or not. So I email out about this issue. >> > >> > In these days, I found the Rocky source deployment always failed at Ceilometer bootstrapping. Then I found it failed at ceilometer-upgrade. 
>> > So I tried to looking at ceilometer-upgrade.log and the error shows it failed to import ujson. >> > >> > https://pastebin.com/nGqsM0uf >> > >> > Then I googled it and found this issue is already happened and released fixes. >> > https://github.com/esnme/ultrajson/issues/346 >> > >> > But it seems like the container still using the questionable one, even today (Jan 30 UTC+8). >> > And this not only affected to Ceilometer, but may also Gnocchi. >> > >> > I think we have to patch it, but not sure about the workaround. >> > Does anyone have good idea? >> > >> > Many thanks, >> > Eddie. From radoslaw.piliszek at gmail.com Thu Jan 30 08:20:39 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 30 Jan 2020 09:20:39 +0100 Subject: [kolla] ujson issue affected to few containers. In-Reply-To: References: Message-ID: valid compilers* czw., 30 sty 2020 o 09:20 Radosław Piliszek napisał(a): > > Hi Eddie, > > ujson is one of the few packages that has no wheel, so it is being > recompiled each time. > It might be the case that some change to gcc and/or python-dev parts > caused that. > The issue looks like an edge case of c99 spec non-conformity so valid > compiles may break on it (or work - just that there is no guarantee). > > -yoctozepto > > czw., 30 sty 2020 o 09:06 Eddie Yen napisał(a): > > > > Hi Radosław, > > > > Sorry about lost distro information, the distro we're using is Ubuntu. > > > > We have an old copy of ceilometer container image, the ujson.so version between old and latest are both 1.35 > > But only latest one affected this issue. > > > > BTW, I read the last reply on issue page. Since he said the python 3 with newer GCC is OK, I think it may caused by python version issue or GCC compiler versioning. > > It may become a huge architect if it really caused by compiling issue, if Ubuntu updated GCC or python. > > > > Radosław Piliszek 於 2020年1月30日 週四 下午3:48寫道: > >> > >> Hi Eddie, > >> > >> the issue is that the project did *not* do a release. > >> The latest is still 1.35 from Jan 20, *2016*... [1] > >> > >> You said only Rocky source - but is this ubuntu or centos? > >> > >> Also, by the looks of [2] master ceilometer is no longer affected, but > >> monasca and mistral might still be if they call affected paths. > >> > >> The project looks dead so we are fried unless we override and start > >> using its sources from git (hacky hacky). > >> > >> [1] https://pypi.org/project/ujson/#history > >> [2] http://codesearch.openstack.org/?q=ujson&i=nope&files=&repos= > >> > >> -yoctozepto > >> > >> > >> czw., 30 sty 2020 o 03:31 Eddie Yen napisał(a): > >> > > >> > Hi everyone, > >> > > >> > I'm not sure it should be bug report or not. So I email out about this issue. > >> > > >> > In these days, I found the Rocky source deployment always failed at Ceilometer bootstrapping. Then I found it failed at ceilometer-upgrade. > >> > So I tried to looking at ceilometer-upgrade.log and the error shows it failed to import ujson. > >> > > >> > https://pastebin.com/nGqsM0uf > >> > > >> > Then I googled it and found this issue is already happened and released fixes. > >> > https://github.com/esnme/ultrajson/issues/346 > >> > > >> > But it seems like the container still using the questionable one, even today (Jan 30 UTC+8). > >> > And this not only affected to Ceilometer, but may also Gnocchi. > >> > > >> > I think we have to patch it, but not sure about the workaround. > >> > Does anyone have good idea? > >> > > >> > Many thanks, > >> > Eddie. 
From tony.pearce at cinglevue.com Thu Jan 30 08:20:29 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Thu, 30 Jan 2020 16:20:29 +0800 Subject: Kayobe Openstack deployment Message-ID: Hi all - I wanted to ask if there was such a reference architecture / step-by-step deployment guide for Openstack / Kayobe that I could follow to get a better understanding of the components and how to go about deploying it? The documentation is not that great so I'm hitting various issues when trying to follow what is there on the Openstack site. There's a lot of technical things like information on variables - which is fantastic, but there's no context about them. For example, the architecture page is pretty small, when you get further on in the guide it's difficult to contextually link detail back to the architecture. I tried to do the all-in-one deployment as well as the "universe from nothing approach" but hit some issues there as well. Plus it's kind of like trying to learn how to drive a bus by riding a micro-scooter :) Also, the "report bug" bug link on the top of all the pages is going to an error "page does not exist" - not sure that had been realised yet. Regards, *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at binero.se Thu Jan 30 08:27:04 2020 From: tobias.urdin at binero.se (Tobias Urdin) Date: Thu, 30 Jan 2020 09:27:04 +0100 Subject: [kolla] ujson issue affected to few containers. In-Reply-To: References: Message-ID: <801b30a3-62a1-a1e9-c0ef-973baa19b4a0@binero.se> Seeing this issue when messing around with Gnocchi on Ubuntu 18.04 as well. Temp solved it by installing ujson from master as suggested in [1] instead of pypi. [1] https://github.com/esnme/ultrajson/issues/346 On 1/30/20 9:10 AM, Eddie Yen wrote: > Hi Radosław, > > Sorry about lost distro information, the distro we're using is Ubuntu. > > We have an old copy of ceilometer container image, the ujson.so > version between old and latest are both 1.35 > But only latest one affected this issue. > > BTW, I read the last reply on issue page. Since he said the python 3 > with newer GCC is OK, I think it may caused by python version issue or > GCC compiler versioning. > It may become a huge architect if it really caused by compiling issue, > if Ubuntu updated GCC or python. > > Radosław Piliszek > 於 2020年1月30日 週四 下午3:48寫道: > > Hi Eddie, > > the issue is that the project did *not* do a release. > The latest is still 1.35 from Jan 20, *2016*... [1] > > You said only Rocky source - but is this ubuntu or centos? > > Also, by the looks of [2] master ceilometer is no longer affected, but > monasca and mistral might still be if they call affected paths. > > The project looks dead so we are fried unless we override and start > using its sources from git (hacky hacky). 
> > [1] https://pypi.org/project/ujson/#history > > [2] http://codesearch.openstack.org/?q=ujson&i=nope&files=&repos= > > -yoctozepto > > > czw., 30 sty 2020 o 03:31 Eddie Yen > napisał(a): > > > > Hi everyone, > > > > I'm not sure it should be bug report or not. So I email out > about this issue. > > > > In these days, I found the Rocky source deployment always failed > at Ceilometer bootstrapping. Then I found it failed at > ceilometer-upgrade. > > So I tried to looking at ceilometer-upgrade.log and the error > shows it failed to import ujson. > > > > https://pastebin.com/nGqsM0uf > > > > Then I googled it and found this issue is already happened and > released fixes. > > https://github.com/esnme/ultrajson/issues/346 > > > > > But it seems like the container still using the questionable > one, even today (Jan 30 UTC+8). > > And this not only affected to Ceilometer, but may also Gnocchi. > > > > I think we have to patch it, but not sure about the workaround. > > Does anyone have good idea? > > > > Many thanks, > > Eddie. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at binero.se Thu Jan 30 08:34:15 2020 From: tobias.urdin at binero.se (Tobias Urdin) Date: Thu, 30 Jan 2020 09:34:15 +0100 Subject: [cinder][nova] Migrating servers' root block devices from a cinder backend to another In-Reply-To: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> References: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> Message-ID: We did this something similar recently, we booted all instances from Cinder volume (with "Delete on terminate" set) in an old platform. So we added our new Ceph storage to the old platform, removed instances (updated delete_on_terminate to 0 in Nova DB). Then we issued a retype so cinder-volume performed a `dd` of the volume from the old to the new storage. We then synced network/subnet/sg and started instances with same fixed IP and moved floating IPs to the new platform. Since you only have to swap storage you should experiment with powering off the instances and try doing a migrate of the volume but I suspect you need to either remove the instance or do some really nasty database operations. I would suggest always going through the API and recreate the instance from the migrated volume instead of changing in the DB. We had to update delete_on_terminate in DB but that was pretty trivial (and I even think there is a spec that is not implemented yet that will allow that from API). On 1/29/20 9:54 PM, Jean-Philippe Méthot wrote: > Hi, > > We have a several hundred VMs which were built on cinder block devices > as root drives which use a SAN backend. Now we want to change their > backend from the SAN to Ceph. > We can shutdown the VMs but we will not destroy them. I am aware that > there is a cinder migrate volume command to change a volume’s backend, > but it requires that the volume be completely detached. Forcing a > detached state on > that volume does let the volume migration take place, but the volume’s > path in Nova block_device_mapping doesn’t update, for obvious reasons. > > So, I am considering forcing the volumes to a detached status in > Cinder and then manually updating the nova db block_device_mapping > entry for each volume so that the VM can boot back up afterwards. > However, before I start toying with the database and accidentally > break stuff, has anyone else ever done something similar? Got any tips > or hints on how best to proceed? 
> > Jean-Philippe Méthot > Openstack system administrator > Administrateur système Openstack > PlanetHoster inc. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony.pearce at cinglevue.com Thu Jan 30 08:43:18 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Thu, 30 Jan 2020 16:43:18 +0800 Subject: [cinder][nova] Migrating servers' root block devices from a cinder backend to another In-Reply-To: References: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> Message-ID: I want to do something similar soon and don't want to touch the db (I experimented with cloning the "controller" and it did not achieve any desired outcome). Is there a way to export an instance from Openstack in terms of something like a script that could re-create it on another openstack as a like-for-like? I guess this is assuming that the instance is linux-based and has cloud-init enabled. *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. On Thu, 30 Jan 2020 at 16:39, Tobias Urdin wrote: > We did this something similar recently, we booted all instances from > Cinder volume (with "Delete on terminate" set) in an old platform. > > So we added our new Ceph storage to the old platform, removed instances > (updated delete_on_terminate to 0 in Nova DB). > Then we issued a retype so cinder-volume performed a `dd` of the volume > from the old to the new storage. > > We then synced network/subnet/sg and started instances with same fixed IP > and moved floating IPs to the new platform. > > Since you only have to swap storage you should experiment with powering > off the instances and try doing a migrate of the volume > but I suspect you need to either remove the instance or do some really > nasty database operations. > > I would suggest always going through the API and recreate the instance > from the migrated volume instead of changing in the DB. > We had to update delete_on_terminate in DB but that was pretty trivial > (and I even think there is a spec that is not implemented yet that will > allow that from API). > > On 1/29/20 9:54 PM, Jean-Philippe Méthot wrote: > > Hi, > > We have a several hundred VMs which were built on cinder block devices as > root drives which use a SAN backend. Now we want to change their backend > from the SAN to Ceph. > We can shutdown the VMs but we will not destroy them. I am aware that > there is a cinder migrate volume command to change a volume’s backend, but > it requires that the volume be completely detached. Forcing a detached > state on > that volume does let the volume migration take place, but the volume’s > path in Nova block_device_mapping doesn’t update, for obvious reasons. > > So, I am considering forcing the volumes to a detached status in Cinder > and then manually updating the nova db block_device_mapping entry for each > volume so that the VM can boot back up afterwards. 
> However, before I start toying with the database and accidentally break > stuff, has anyone else ever done something similar? Got any tips or hints > on how best to proceed? > > Jean-Philippe Méthot > Openstack system administrator > Administrateur système Openstack > PlanetHoster inc. > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lyarwood at redhat.com Thu Jan 30 08:57:02 2020 From: lyarwood at redhat.com (Lee Yarwood) Date: Thu, 30 Jan 2020 08:57:02 +0000 Subject: [cinder][nova] Migrating servers' root block devices from a cinder backend to another In-Reply-To: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> References: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> Message-ID: <20200130085702.d7ifopspmp5ayean@lyarwood.usersys.redhat.com> On 29-01-20 15:50:13, Jean-Philippe Méthot wrote: > Hi, > > We have a several hundred VMs which were built on cinder block devices > as root drives which use a SAN backend. Now we want to change their > backend from the SAN to Ceph. We can shutdown the VMs but we will not > destroy them. I am aware that there is a cinder migrate volume command > to change a volume’s backend, but it requires that the volume be > completely detached. Forcing a detached state on that volume does let > the volume migration take place, but the volume’s path in Nova > block_device_mapping doesn’t update, for obvious reasons. > > So, I am considering forcing the volumes to a detached status in > Cinder and then manually updating the nova db block_device_mapping > entry for each volume so that the VM can boot back up afterwards. > However, before I start toying with the database and accidentally > break stuff, has anyone else ever done something similar? Got any tips > or hints on how best to proceed? Assuming you're using the Libvirt driver, a hackaround here is to cold migrate the instance to another host, this refreshes the connection_info from c-api and should allow the instance to boot correctly. FWIW https://review.opendev.org/#/c/696834/ will hopefully support live attached volume migrations to and from Ceph volumes thanks to recently -blockdev changes in Libvirt and QEMU. I also want to look into offline attached volume migration in the V cycle, IIRC swap_volume fails when the instance isn't running but in that case it's essentially a noop for n-cpu and assuming c-vol rebases the data on the backend it should succeed. Anyway, hope this helps! -- Lee Yarwood A5D1 9385 88CB 7E5F BE64 6618 BCA6 6E33 F672 2D76 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From jerry.a.wang at intel.com Thu Jan 30 09:25:30 2020 From: jerry.a.wang at intel.com (Wang, Jerry A) Date: Thu, 30 Jan 2020 09:25:30 +0000 Subject: [ironic] An approach to removing WSME In-Reply-To: References: Message-ID: <53794A7A-0B20-455F-9D49-4ED2ABD61B1F@intel.com> Hi Steve: Thanks for your great efforts for WSME replacement. 
I had a quick review for your code changes, see my comments below:

Review   Feedback                    Reason
704490   Has concern                 move wsme code to ironic, types.py
704487   Has concern                 move wsme code to ironic, node.py
704489   Has concern                 move wsme code to ironic, args.py
704486   Has concern                 move wsme code to ironic, expose.py
703898   Both positive and concern   Positive: better code structure than original code; concern: more difficult to locate WSME code
704488   Has concern                 move wsme code to ironic
704485   Has concern                 Add pecan code, pecan would be replaced by flask
703897   Positive                    Removed some WSME code with this change
703723   Positive                    Removed some WSME code with this change
703695   Positive                    Removed some WSME code with this change

Firstly, I appreciated the 3 changes which definitely removed some WSME code, especially change 703897. But I have some concerns about changes 704490, 704487, 704489, ...: these changes seem to move WSME code into ironic, which would make the ironic code base a bit larger. From my personal view, using Python built-in features or another third-party lib to replace the WSME functionality would be better. I like the way that code change 703897 did it.

Thanks
Jerry

Sent from my iPhone

On Jan 30, 2020, at 8:53 AM, Steve Baker wrote:

I've put together a set of changes for removing WSME which involves copying just enough of it into ironic, and I think we need to have the conversation about whether this approach is desirable :) This git branch[1] finishes with WSME removed and existing tests passing. Here are some stats about lines of code (not including unit tests, calculated with cloc):

4500 wsme/wsme
6000 ironic/ironic/api master
7000 ironic/ironic/api story/1651346

In words, we need 1000 out of 4500 lines of WSME source in order to support 6000 lines of ironic specific API code. Switching to a replacement for WSME would likely touch a good proportion of that 6000 lines of ironic specific API code. If we eventually replace it with a new library I think it would be easier if the thing being replaced is inside the ironic source tree to allow for a gradual transition. So this approach could be an end in itself, or it could be a step towards the final goal. My strategy for copying in code was to pull in chunks of required logic while removing some unused features (like request pass-through and date & time type serialization). I also replaced py2/3 patterns with pure py3. There is likely further things which can be removed or refactored for simplicity, but what exists currently works. If there is enough positive feedback for this approach I'll start on unit test coverage for the new code.

[1] https://review.opendev.org/#/q/topic:story/1651346

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tobias.urdin at binero.se Thu Jan 30 09:49:23 2020
From: tobias.urdin at binero.se (Tobias Urdin)
Date: Thu, 30 Jan 2020 10:49:23 +0100
Subject: [cinder][nova] Migrating servers' root block devices from a cinder backend to another
In-Reply-To: 
References: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> <18d096c7-d3c7-7c70-f28d-0e77d6f41814@binero.se>
Message-ID: 

Another approach would be to export the data to Glance and download then you can upload it somewhere.

There is no ready thing that I know about. We used the openstacksdk to simply recreate the steps we did on CLI. Create all the necessary resources on the other side, create new instances from the migrated volume and set a fixed IP on the neutron port to get the same IP address.
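For anyone who wants to try this by hand before scripting it, the rough CLI equivalent of what we automated with the openstacksdk looks like the sketch below. The network, subnet, address, flavor and volume ID are placeholders for illustration only, not values from our environment:

  # Re-create the port on the new cloud with the original fixed IP.
  openstack port create --network NEW_NET_ID \
      --fixed-ip subnet=NEW_SUBNET_ID,ip-address=10.0.0.15 restored-port

  # Boot the replacement instance from the already-migrated volume,
  # attaching the pre-created port (use the port ID returned above).
  openstack server create --flavor m1.small \
      --volume MIGRATED_VOLUME_ID \
      --nic port-id=PORT_ID_FROM_ABOVE \
      restored-instance

After that, floating IPs can be re-associated as usual.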
On 1/30/20 9:43 AM, Tony Pearce wrote: > I want to do something similar soon and don't want to touch the db (I > experimented with cloning the "controller" and it did not achieve any > desired outcome). > > Is there a way to export an instance from Openstack in terms of > something like a script that could re-create it on another openstack > as a like-for-like? I guess this is assuming that the instance is > linux-based and has cloud-init enabled. > > > *Tony Pearce*| *Senior Network Engineer / Infrastructure Lead > **Cinglevue International * > > Email: tony.pearce at cinglevue.com > Web: http://www.cinglevue.com ** > > *Australia* > 1 Walsh Loop, Joondalup, WA 6027 Australia. > > Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 > > Note: This email and all attachments are the sole property of > Cinglevue International Pty Ltd. (or any of its subsidiary entities), > and the information contained herein must be considered confidential, > unless specified otherwise.   If you are not the intended recipient, > you must not use or forward the information contained in these > documents.   If you have received this message in error, please delete > the email and notify the sender. > > > > On Thu, 30 Jan 2020 at 16:39, Tobias Urdin > wrote: > > We did this something similar recently, we booted all instances > from Cinder volume (with "Delete on terminate" set) in an old > platform. > > So we added our new Ceph storage to the old platform, removed > instances (updated delete_on_terminate to 0 in Nova DB). > Then we issued a retype so cinder-volume performed a `dd` of the > volume from the old to the new storage. > > We then synced network/subnet/sg and started instances with same > fixed IP and moved floating IPs to the new platform. > > Since you only have to swap storage you should experiment with > powering off the instances and try doing a migrate of the volume > but I suspect you need to either remove the instance or do some > really nasty database operations. > > I would suggest always going through the API and recreate the > instance from the migrated volume instead of changing in the DB. > We had to update delete_on_terminate in DB but that was pretty > trivial (and I even think there is a spec that is not implemented > yet that will allow that from API). > > On 1/29/20 9:54 PM, Jean-Philippe Méthot wrote: >> Hi, >> >> We have a several hundred VMs which were built on cinder block >> devices as root drives which use a SAN backend. Now we want to >> change their backend from the SAN to Ceph. >> We can shutdown the VMs but we will not destroy them. I am aware >> that there is a cinder migrate volume command to change a >> volume’s backend, but it requires that the volume be completely >> detached. Forcing a detached state on >> that volume does let the volume migration take place, but the >> volume’s path in Nova block_device_mapping doesn’t update, for >> obvious reasons. >> >> So, I am considering forcing the volumes to a detached status in >> Cinder and then manually updating the nova db >> block_device_mapping entry for each volume so that the VM can >> boot back up afterwards. >> However, before I start toying with the database and accidentally >> break stuff, has anyone else ever done something similar? Got any >> tips or hints on how best to proceed? >> >> Jean-Philippe Méthot >> Openstack system administrator >> Administrateur système Openstack >> PlanetHoster inc. >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tony.pearce at cinglevue.com Thu Jan 30 09:58:21 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Thu, 30 Jan 2020 17:58:21 +0800 Subject: [cinder][nova] Migrating servers' root block devices from a cinder backend to another In-Reply-To: <18d096c7-d3c7-7c70-f28d-0e77d6f41814@binero.se> References: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> <18d096c7-d3c7-7c70-f28d-0e77d6f41814@binero.se> Message-ID: Thanks Tobias - thats my last resort. I'd still need to upload that image to the new openstack and then build an instance from the image. I'd also need to use metadata to make sure the instance was built with the same components (IP address etc). *Tony Pearce* | *Senior Network Engineer / Infrastructure Lead**Cinglevue International * Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com *Australia* 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. On Thu, 30 Jan 2020 at 17:55, Tobias Urdin wrote: > Another approach would be to export the data to Glance and download then > you can upload it somewhere. > > There is no ready thing that I know about. We used the openstacksdk to > simply recreate the steps we did on CLI. > Create all the neccesary resources on the other side, create new instances > from the migrated volume and set a fixedIP on the neutron port to get same > IP address. > > On 1/30/20 9:43 AM, Tony Pearce wrote: > > I want to do something similar soon and don't want to touch the db (I > experimented with cloning the "controller" and it did not achieve any > desired outcome). > > Is there a way to export an instance from Openstack in terms of something > like a script that could re-create it on another openstack as a > like-for-like? I guess this is assuming that the instance is linux-based > and has cloud-init enabled. > > > *Tony Pearce* | > *Senior Network Engineer / Infrastructure Lead **Cinglevue International > * > > Email: tony.pearce at cinglevue.com > Web: http://www.cinglevue.com > > *Australia* > 1 Walsh Loop, Joondalup, WA 6027 Australia. > > Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 > > Note: This email and all attachments are the sole property of Cinglevue > International Pty Ltd. (or any of its subsidiary entities), and the > information contained herein must be considered confidential, unless > specified otherwise. If you are not the intended recipient, you must not > use or forward the information contained in these documents. If you have > received this message in error, please delete the email and notify the > sender. > > > > > On Thu, 30 Jan 2020 at 16:39, Tobias Urdin wrote: > >> We did this something similar recently, we booted all instances from >> Cinder volume (with "Delete on terminate" set) in an old platform. >> >> So we added our new Ceph storage to the old platform, removed instances >> (updated delete_on_terminate to 0 in Nova DB). >> Then we issued a retype so cinder-volume performed a `dd` of the volume >> from the old to the new storage. 
>> >> We then synced network/subnet/sg and started instances with same fixed IP >> and moved floating IPs to the new platform. >> >> Since you only have to swap storage you should experiment with powering >> off the instances and try doing a migrate of the volume >> but I suspect you need to either remove the instance or do some really >> nasty database operations. >> >> I would suggest always going through the API and recreate the instance >> from the migrated volume instead of changing in the DB. >> We had to update delete_on_terminate in DB but that was pretty trivial >> (and I even think there is a spec that is not implemented yet that will >> allow that from API). >> >> On 1/29/20 9:54 PM, Jean-Philippe Méthot wrote: >> >> Hi, >> >> We have a several hundred VMs which were built on cinder block devices as >> root drives which use a SAN backend. Now we want to change their backend >> from the SAN to Ceph. >> We can shutdown the VMs but we will not destroy them. I am aware that >> there is a cinder migrate volume command to change a volume’s backend, but >> it requires that the volume be completely detached. Forcing a detached >> state on >> that volume does let the volume migration take place, but the volume’s >> path in Nova block_device_mapping doesn’t update, for obvious reasons. >> >> So, I am considering forcing the volumes to a detached status in Cinder >> and then manually updating the nova db block_device_mapping entry for each >> volume so that the VM can boot back up afterwards. >> However, before I start toying with the database and accidentally break >> stuff, has anyone else ever done something similar? Got any tips or hints >> on how best to proceed? >> >> Jean-Philippe Méthot >> Openstack system administrator >> Administrateur système Openstack >> PlanetHoster inc. >> >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doka.ua at gmx.com Thu Jan 30 12:40:02 2020 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Thu, 30 Jan 2020 14:40:02 +0200 Subject: [NOVA] instance hostname vs display_name vs dns_name In-Reply-To: <89bb79644030812f7846c51896a79ae6aa32e595.camel@redhat.com> References: <89bb79644030812f7846c51896a79ae6aa32e595.camel@redhat.com> Message-ID: <766ea365-33ca-6c25-ac91-3a9370ac4106@gmx.com> Hi Sean, please, see below 30.01.20 00:17, Sean Mooney пише: > On Wed, 2020-01-29 at 16:44 +0200, Volodymyr Litovka wrote: >> Dear colleagues, >> >> I'm using DNS Integration and just faced an issue - after renaming >> instance, I can't bind port to the instance using new name: >> >> 1) I've created intances with name 'devel' >> 2) then I renamed it to devel ((openstack server set --name packager devel) >> 3) when binding port with dns_name using new name ('packager' in my >> case), the following error appear: >> ERROR nova.api.openstack.wsgi PortNotUsableDNS: Port >> c3a92cf6-b49b-4570-b69b-0c23af1d1f94 not usable for instance >> 6aa78bd5-099e-4878-a5ac-90262505a924. Value packager assigned to >> dns_name attribute does not match instance's hostname devel >> >> and yes, record in DB still uses an old hostname: >> >> mysql> select display_name from instances where hostname='devel'; >> +--------------+ >>> display_name | >> +--------------+ >>> packager | >> +--------------+ >> 1 row in set (0.00 sec) >> >> >> and hostname remains the same (initial) regardless of any changes to >> display_name. >> >> I'm on Rocky. Is it bug or feature and are there ways to work around this? 
> i think you should be able to change the display name but i would not expect that display name to change > the host name used by the guest. it is likely a bug that the portbinding appears to be using the displayname > for designate integration. The issue is that portbinding compares NOT the "display_port", but "hostname" column to port's dns_name,  and this lead to the described above issue, i.e. mysql> select hostname, display_name from instances where uuid='2d49b781-cef5-4cdd-a310-e74eb98aa514'; +----------+--------------+ | hostname | display_name | +----------+--------------+ | web01 | web02 | +----------+--------------+ $ openstack port create --network e-net --dns-name web02 --fixed-ip subnet=e-bed testport +-----------------------+------- | Field | Value +-----------------------+------- | dns_assignment | fqdn='web02.loc.', hostname='web02', ip_address='x.x.x.x' | dns_domain | | dns_name | web02 | id | 0425701f-d958-4c81-931a-9594fba7d7d2 +-----------------------+------- $ nova interface-attach --port-id 0425701f-d958-4c81-931a-9594fba7d7d2 web02 ERROR (ClientException): Unexpected API Error. the core issue behind this is that if user renamed server and still want to access it using name, he can't > i know we use the display name as the hostname for the vm by default but > i would not expect openstack server set --name packager devel to alter the hostname served to the vm over dhcp > once the vm is intially created. i would expect to be able to alther that via designate and for the display name to be > independt of the host name after the inital boot. > > so yes there is likely a bug in that we are using the disply name somewhere we shoudl not be. https://bugs.launchpad.net/nova/+bug/1861401 Either renaming instance need to change both "hostname" and "display_name" columns or DNS integration need compare port's dns_name with "display_name". Thank you. -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Thu Jan 30 13:32:12 2020 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 30 Jan 2020 07:32:12 -0600 Subject: [nova] API updates week 20-5 Message-ID: <16ff6a6ee31.10f9e4eec97950.4319060339868883650@ghanshyammann.com> Hello Everyone, Please find the Nova API updates of this week. Please add if I missed any BPs/API related work. API Related BP : ============ COMPLETED: 1. Add image-precache-support spec: - https://blueprints.launchpad.net/nova/+spec/image-precache-support Code Ready for Review: ------------------------------ 1. Nova API policy improvement - Topic: https://review.opendev.org/#/q/topic:bp/policy-defaults-refresh+(status:open+OR+status:merged) - Weekly Progress: First set is merged and 4 APIs policy are ready to review. I will push more changes in this week. - Review guide over ML - http://lists.openstack.org/pipermail/openstack-discuss/2019-August/008504.html Specs are merged and code in-progress: ------------------------------ ------------------ 1. Non-Admin user can filter their instance by availability zone: - Topic: https://review.opendev.org/#/q/topic:bp/non-admin-filter-instance-by-az+(status:open+OR+status:merged) - Weekly Progress: Spec is just merged. 2. Boot from volume instance rescue - Topic: https://review.opendev.org/#/q/topic:bp/virt-bfv-instance-rescue+(status:open+OR+status:merged) - Weekly Progress: Code is in progress. WIP as of now. 
Spec Ready for Review or Action from Author: --------------------------------------------------------- 1. Support specifying vnic_type to boot a server -Spec: https://review.opendev.org/#/c/672400/ - Weekly Progress: As discussed in spec, this is something can be done via heat or other automation instead of doing it in nova. 2. Add action event fault details -Spec: https://review.opendev.org/#/c/699669/ - Weekly Progress: This seems good to go, it has two +2. 3. Allow specify user to reset password -Spec: https://review.opendev.org/#/c/682302/5 - Weekly Progress: One +2 on this but other are disagree on this idea. More discussion on review. 4. Support re-configure deleted_on_termination in server -Spec: https://review.openstack.org/#/c/580336/ - Weekly Progress: This is old spec and there is no consensus on this till now. Others: 1. None Bugs: ==== I will track the bug data from next report. NOTE- There might be some bug which is not tagged as 'api' or 'api-ref', those are not in the above list. Tag such bugs so that we can keep our eyes. -gmann From mark at stackhpc.com Thu Jan 30 13:40:03 2020 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 30 Jan 2020 13:40:03 +0000 Subject: Kayobe Openstack deployment In-Reply-To: References: Message-ID: On Thu, 30 Jan 2020 at 08:22, Tony Pearce wrote: > > Hi all - I wanted to ask if there was such a reference architecture / step-by-step deployment guide for Openstack / Kayobe that I could follow to get a better understanding of the components and how to go about deploying it? Hi Tony, we spoke in the #openstack-kolla IRC channel [1], but I thought I'd reply here for the benefit of anyone reading this. > > The documentation is not that great so I'm hitting various issues when trying to follow what is there on the Openstack site. There's a lot of technical things like information on variables - which is fantastic, but there's no context about them. For example, the architecture page is pretty small, when you get further on in the guide it's difficult to contextually link detail back to the architecture. As discussed in IRC, we are missing some architecture and from scratch walkthrough documentation in Kayobe. I've been focusing on the configuration reference mostly, but I think it is time to move onto these other areas to help new starters. > > I tried to do the all-in-one deployment as well as the "universe from nothing approach" but hit some issues there as well. Plus it's kind of like trying to learn how to drive a bus by riding a micro-scooter :) I would definitely recommend persevering with the universe from nothing demo [2], as it offers the quickest way to get a system up and running that you can poke at. It's also a fairly good example of a 'bare minimum' configuration. Could you share the issues you had with it? For an even simpler setup, you could try [3], which gets you an all-in-one control plane/compute host quite quickly. I'd suggest using the stable/train branch for a more stable environment. > > Also, the "report bug" bug link on the top of all the pages is going to an error "page does not exist" - not sure that had been realised yet. Andreas Jaeger kindly proposed a fix for this. Here's the storyboard link: https://storyboard.openstack.org/#!/project/openstack/kayobe. 
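In case it helps to see the overall shape of a deployment before diving into the configuration reference, the usual flow condenses to roughly the outline below. Treat it only as an illustrative sketch (the virtualenv path and kayobe-config checkout are placeholders, and some steps are omitted); the universe-from-nothing script in [2] remains the authoritative sequence:

  # Install kayobe into a virtualenv; pick the release matching your target branch.
  python3 -m venv ~/kayobe-venv && source ~/kayobe-venv/bin/activate
  pip install kayobe

  # Point kayobe at your configuration checkout (placeholder path).
  source ~/src/kayobe-config/kayobe-env

  # Bootstrap the Ansible control host, then configure hosts and deploy services.
  kayobe control host bootstrap
  kayobe overcloud host configure
  kayobe overcloud container image pull
  kayobe overcloud service deploy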
[1] http://eavesdrop.openstack.org/irclogs/%23openstack-kolla/%23openstack-kolla.2020-01-30.log.html#t2020-01-30T04:07:14 [2] https://github.com/stackhpc/a-universe-from-nothing [3] https://docs.openstack.org/kayobe/latest/development/automated.html#overcloud > > Regards, > > > Tony Pearce | Senior Network Engineer / Infrastructure Lead > Cinglevue International > > Email: tony.pearce at cinglevue.com > Web: http://www.cinglevue.com > > Australia > 1 Walsh Loop, Joondalup, WA 6027 Australia. > > Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 > > Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. > > From jp.methot at planethoster.info Thu Jan 30 14:12:07 2020 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Thu, 30 Jan 2020 09:12:07 -0500 Subject: [cinder][nova] Migrating servers' root block devices from a cinder backend to another In-Reply-To: <20200130085702.d7ifopspmp5ayean@lyarwood.usersys.redhat.com> References: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> <20200130085702.d7ifopspmp5ayean@lyarwood.usersys.redhat.com> Message-ID: > > Assuming you're using the Libvirt driver, a hackaround here is to cold > migrate the instance to another host, this refreshes the connection_info > from c-api and should allow the instance to boot correctly. > Indeed I hadn’t thought of this. If migrations updates the block_device_mapping from the cinder DB entry, I will not need to make any database modifications. Thank you for your help, it’s strongly appreciated. Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Thu Jan 30 14:44:37 2020 From: openstack at fried.cc (Eric Fried) Date: Thu, 30 Jan 2020 08:44:37 -0600 Subject: [cinder][nova] Migrating servers' root block devices from a cinder backend to another In-Reply-To: References: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> Message-ID: > We had to update delete_on_terminate in DB but that was pretty trivial > (and I even think there is a spec that is not implemented yet that will > allow that from API). Not sure if this is still helpful, but that spec is here: https://review.opendev.org/#/c/580336/ efried . From nate.johnston at redhat.com Thu Jan 30 14:56:08 2020 From: nate.johnston at redhat.com (Nate Johnston) Date: Thu, 30 Jan 2020 09:56:08 -0500 Subject: [neutron][drivers team] Proposing Nate Johnston as drivers team member In-Reply-To: <20200129203034.7ztacox53bjh2qkp@skaplons-mac> References: <20200122215847.zsfgihjmmjkwlrdl@skaplons-mac> <20200129203034.7ztacox53bjh2qkp@skaplons-mac> Message-ID: <20200130145608.b64m3fincxjfmkay@firewall> On Wed, Jan 29, 2020 at 09:30:34PM +0100, Slawek Kaplonski wrote: > Hi, > > It's been a week since I sent this nomination and there was only very positive > feedback about that. > So welcome in the drivers team Nate :) Thanks everyone very much! I am honored to be in the company of such amazing people. Nate > On Wed, Jan 29, 2020 at 09:46:42AM +0900, Akihiro Motoki wrote: > > Great addition. +1 from me. 
> > > > On Thu, Jan 23, 2020 at 7:02 AM Slawek Kaplonski wrote: > > > > > > Hi Neutrinos, > > > > > > I would like to propose Nate Johnston to be part of Neutron drivers team. > > > Since long time Nate is very active Neutron's core reviewer. He is also actively > > > participating in our Neutron drivers team meetings and he shown there that he > > > has big experience and knowledge about Neutron, Neutron stadium projects as well > > > as whole OpenStack. > > > I think that he really deservers to be part of this team and that he will be > > > great addition it. > > > I will wait for Your feedback for 1 week and if there will be no any votes > > > agains, I will add Nate to drivers team in next week. > > > > > > -- > > > Slawek Kaplonski > > > Senior software engineer > > > Red Hat > > > > > > > > > > -- > Slawek Kaplonski > Senior software engineer > Red Hat > > From jp.methot at planethoster.info Thu Jan 30 15:23:40 2020 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Thu, 30 Jan 2020 10:23:40 -0500 Subject: [cinder][nova] Migrating servers' root block devices from a cinder backend to another In-Reply-To: References: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> <20200130085702.d7ifopspmp5ayean@lyarwood.usersys.redhat.com> Message-ID: <4193A674-A75C-470F-9722-AC6EB228652C@planethoster.info> To follow up on my last email, I have tested the following hackaround in my testing environment: >> >> Assuming you're using the Libvirt driver, a hackaround here is to cold >> migrate the instance to another host, this refreshes the connection_info >> from c-api and should allow the instance to boot correctly. >> I can confirm that on OpenStack Queens, this doesn’t work. Openstack can’t find the attachment ID as it was marked as deleted in the cinder DB. I’m guessing it put itself as deleted when I did cinder reset-state --attach-status detached b70b254f-58cd-4940-b976-6f4dc0209a8c. Even though I did cinder reset-state --attach-status attached b70b254f-58cd-4940-b976-6f4dc0209a8c, the original attachment ID did not undelete itself. It also didn’t update itself to the new path. As a result, nova seems to be looking for the attachment ID when trying to migrate and errors out when it can’t find it. That said, I do remember using this workaround in the past on older versions of Openstack, so this definitely used to work. I guess a change in cinder probably broke this. Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Greg.Waines at windriver.com Thu Jan 30 15:44:58 2020 From: Greg.Waines at windriver.com (Waines, Greg) Date: Thu, 30 Jan 2020 15:44:58 +0000 Subject: [magnum] Help / Pointers on mirroring https://opendev.org/starlingx in github.com for CNCF certification of starlingx Message-ID: <0C3D22C0-A751-4E56-B47F-B879225854D7@windriver.com> Hello, I am working in the OpenStack StarlingX team. We are working on getting StarlingX certified through the CNCF conformance program, https://www.cncf.io/certification/software-conformance/ . ( in the same way that you guys, OpenStack Magnum project, got certified with CNCF ) As you know, in order for the logo to be shown as based on open-source, CNCF requires that the code be mirrored on github.com . e.g. 
https://github.com/openstack/magnum The openstack foundation guys did provide some info on how to do this: The further steps for the project owner to take: * create a dedicated account for zuul * create the individual empty repos * add a job to each repo to do the mirroring, like: * https://opendev.org/airship/deckhand/src/commit/51dcea4fa12b0bcce65c381c286e61378a0826e2/.zuul.yaml#L406-L463 Also, you can find documentation for the parent job here: https://zuul-ci.org/docs/zuul-jobs/general-jobs.html#job-upload-git-mirror ... maybe it’s cause I don’t know anything about zuul jobs, but these instructions are not super clear to me. Is the person who did this for magnum available to provide some more detailed instructions or help on doing this ? Let me know ... any help is much appreciated, Greg. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosmaita.fossdev at gmail.com Thu Jan 30 16:27:34 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 30 Jan 2020 11:27:34 -0500 Subject: [cinder] last call for ussuri spec comments Message-ID: The following specs have two +2s. I believe that all expressed concerns have been addressed. I intend to merge them at 22:00 UTC today unless a serious issue is raised before then. https://review.opendev.org/#/c/684556/ - support volume-local-cache https://review.opendev.org/#/c/700977 - add backup id to volume metadata cheers, brian From paye600 at gmail.com Thu Jan 30 16:34:51 2020 From: paye600 at gmail.com (Roman Gorshunov) Date: Thu, 30 Jan 2020 17:34:51 +0100 Subject: [magnum] Help / Pointers on mirroring https://opendev.org/starlingx in github.com for CNCF certification of starlingx In-Reply-To: <0C3D22C0-A751-4E56-B47F-B879225854D7@windriver.com> References: <0C3D22C0-A751-4E56-B47F-B879225854D7@windriver.com> Message-ID: Hello Greg, - Create a GitHub account for the starlingx, you will have URL on GitHub like [0]. You may try to contact GitHub admins and ask to release starlingx name, as it seems to be unused. - Then create SSH key and upload onto GitHub for that account in GitHub interface - Create empty repositories under [1] to match your existing repositories names here [2] - Encrypt SSH private key as described here [3] or here [4] using this tool [5] - Create patch to all your projects under [2] to '/.zuul.yaml' file, similar to what is listed here [6] - Change job name shown on line 407 via URL above, description (line 409), git_mirror_repository variable (line 411), secret name (line 414 and 418), and SSH key starting from line 424 to 463, to match your project's name, repo path on GitHub, and SSH key - Submit changes to Gerrit with patches for all your project and get them merged. If all goes good, the next change merged would trigger your repositories to be synced to GitHub. Status could be seen here [7] - search for your newly created job manes, they should be nested under "upload-git-mirrorMirrors a tested project repository to a remote git server." Hope it helps. [0] https://github.com/starlingxxxx [1] https://github.com/starlingxxxx/... [2] https://opendev.org/starlingx/... 
[3] https://docs.openstack.org/infra/manual/zuulv3.html#secret-variables
[4] https://docs.openstack.org/infra/manual/creators.html#mirroring-projects-to-git-mirrors
[5] https://opendev.org/zuul/zuul/src/branch/master/tools/encrypt_secret.py
[6] https://opendev.org/airship/deckhand/src/commit/51dcea4fa12b0bcce65c381c286e61378a0826e2/.zuul.yaml#L406-L463
[7] https://zuul.openstack.org/jobs

Sample content of your addition (patch) to your '/.zuul.yaml' files:
===================================================
- job:
    name: starlingx-compile-upload-git-mirror
    parent: upload-git-mirror
    description: Mirrors starlingx/compile to starlingxxxx/compile
    vars:
      git_mirror_repository: starlingxxxx/compile
    secrets:
      - name: git_mirror_credentials
        secret: starlingx-compile-github-secret
        pass-to-parent: true

- secret:
    name: starlingx-compile-github-secret
    data:
      user: git
      host: github.com
      host_key: github.com ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAq2A7hRGmdnm9tUDbO9IDSwBK6TbQa+PXYPCPy6rbTrTtw7PHkccKrpp0yVhp5HdEIcKr6pLlVDBfOLX9QUsyCOV0wzfjIJNlGEYsdlLJizHhbn2mUjvSAHQqZETYP81eFzLQNnPHt4EVVUh7VfDESU84KezmD5QlWpXLmvU31/yMf+Se8xhHTvKSCZIFImWwoG6mbUoWf9nzpIoaSjB+weqqUUmpaaasXVal72J+UX2B+2RPW3RcT0eOzQgqlJL3RKrTJvdsjE3JEAvGq3lGHSZXy28G3skua2SmVi/w4yCE6gbODqnTWlg7+wC604ydGXA8VJiS5ap43JXiUFFAaQ==
      ssh_key: !encrypted/pkcs1-oaep
        -
===================================================

Best regards,
--
Roman Gorshunov

From jon at csail.mit.edu Thu Jan 30 16:36:23 2020
From: jon at csail.mit.edu (Jonathan Proulx)
Date: Thu, 30 Jan 2020 11:36:23 -0500
Subject: [ops][cinder] Moving volume to new type
Message-ID: <20200130163623.alxwbl3jt5w2bldw@csail.mit.edu>

Hi All,

I'm currently languishing on Mitaka, so perhaps further back than help can reach, but... if anyone can tell me whether this is something dumb I'm doing or a known bug in Mitaka that's preventing me from moving volumes from one type to another, it'd be a big help.

In the further past I did a cinder backend migration by creating a new volume type and then changing all the existing volumes to the new type. This is how we got from iSCSI to RBD (probably in Grizzly or Havana). Currently I'm starting to move from one RBD pool to another, and it seems like this should work in the same way.
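For context, the new type was set up along these lines (only a sketch - the backend name is inferred from the scheduler logs further down, and the exact extra-spec key is an assumption rather than a transcript of what was run):

$ cinder type-create ssd
$ cinder type-key ssd set volume_backend_name=ssdrbd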
Both pools and types exist and I can create volumes in either but when I run: openstack volume set --type ssd test-vol it rather silently fails to do anything (CLI returns 0), looking into schedulerlogs I see: # yup 2 "hosts" to check DEBUG cinder.scheduler.base_filter Starting with 2 host(s) get_filtered_objects DEBUG cinder.scheduler.base_filter Filter AvailabilityZoneFilter returned 2 host(s) get_filtered_objects DEBUG cinder.scheduler.filters.capacity_filter Space information for volume creation on host nimbus-1 at ssdrbd#ssdrbd (requested / avail): 8/47527.78 host_passes DEBUG cinder.scheduler.base_filter Filter CapacityFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/dist-packages/cinder/scheduler/base DEBUG cinder.scheduler.filters.capabilities_filter extra_spec requirement 'ssdrbd' does not match 'rbd' _satisfies_extra_specs /usr/lib/python2.7/dist- DEBUG cinder.scheduler.filters.capabilities_filter host 'nimbus-1 at rbd#rbd': free_capacity_gb: 71127.03, pools: None fails resource_type extra_specs req DEBUG cinder.scheduler.base_filter Filter CapabilitiesFilter returned 1 host(s) get_filtered_objects /usr/lib/python2.7/dist-packages/cinder/scheduler/ # after filtering we have one DEBUG cinder.scheduler.filter_scheduler Filtered [host 'nimbus-1 at ssdrbd#ssdrbd': free_capacity_gb: 47527.78, pools: None] _get_weighted_candidates # but it fails? ERROR cinder.scheduler.manager Could not find a host for volume 49299c0b-8bcf-4cdb-a0e1-dec055b0e78c with type bc2bc9ad-b0ad-43d2-93db-456d750f194d. successfully creating a volume in ssdrbd is identical to that point, except rather than ERROR on the last line it goes to: # Actually chooses 'nimbus-1 at ssdrbd#ssdrbd' as top host DEBUG cinder.scheduler.filter_scheduler Filtered [host 'nimbus-1 at ssdrbd#ssdrbd': free_capacity_gb: 47527.8, pools: None] _get_weighted_candidates DEBUG cinder.scheduler.filter_scheduler Choosing nimbus-1 at ssdrbd#ssdrbd _choose_top_host # then goes and makes volume DEBUG oslo_messaging._drivers.amqpdriver CAST unique_id: 1b7a9d88402a41f8b889b88a2e2a198d exchange 'openstack' topic 'cinder-volume' _send DEBUG cinder.scheduler.manager Task 'cinder.scheduler.flows.create_volume.ScheduleCreateVolumeTask;volume:create' (e70dcc3f-7d88-4542-abff-f1a1293e90fb) transitioned into state 'SUCCESS' from state 'RUNNING' with result 'None' _task_receiver Anyone recognize this situation? Since I'm retiring the old spinning disks I can also "solve" this on the Ceph side by changing the crush map such that the old rbd pool just picks all ssds. So this isn't critical, but in the transitional period until I have enough SSD capacity to really throw *everything* over, there are some hot spot volumes it would be really nice to move this way. 
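For completeness, the retype form that explicitly allows a volume to move to a different backend would be something like the sketch below - whether the silent failure above really comes down to the migration policy is only a guess:

$ cinder retype --migration-policy on-demand test-vol ssd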
Thanks, -Jon From colleen at gazlene.net Thu Jan 30 17:25:54 2020 From: colleen at gazlene.net (Colleen Murphy) Date: Thu, 30 Jan 2020 09:25:54 -0800 Subject: [ops] Federated Identity Management survey In-Reply-To: <61ec6b02-f77b-4503-8a16-a549952bf8c0@www.fastmail.com> References: <4a7a0c41-59ce-4aac-839e-0840eeb50348@www.fastmail.com> <2116da33-6d85-4132-94e5-68bcea0c8385@www.fastmail.com> <61ec6b02-f77b-4503-8a16-a549952bf8c0@www.fastmail.com> Message-ID: On Thu, Jan 30, 2020, at 09:19, Colleen Murphy wrote: > On Mon, Jan 13, 2020, at 09:38, Colleen Murphy wrote: > > On Mon, Dec 23, 2019, at 09:32, Colleen Murphy wrote: > > > Hello operators, > > > > > > A researcher from the University of Kent who was influential in the > > > design of keystone's federation implementation has asked the keystone > > > team to gauge adoption of federated identity management in OpenStack > > > deployments. This is something we've neglected to track well in the > > > last few OpenStack user surveys, so I'd like to ask OpenStack operators > > > to please take a few minutes to complete the following survey about > > > your usage of identity federation in your OpenStack deployment (even if > > > you don't use federation): > > > > > > https://uok.typeform.com/to/KuRY0q > > > > > > The results of this survey will benefit not only university research > > > but also the keystone team as it will help us understand where to focus > > > our efforts. Your participation is greatly appreciated. > > > > > > Thanks for your time, > > > > > > Colleen (cmurphy) > > > > > > > > > > Thanks to everyone who has completed this survey so far! The survey > > will be closing in about a week, so if you have not yet completed it, > > we appreciate you taking the time to do so now. > > > > Colleen (cmurphy) > > > > > > Thanks to everyone who responded to the survey. If you're interested, > attached is an analysis of the results of the survey. > > Colleen (cmurphy) > Attachments: > * FIM Survey.pdf My email with a PDF attachment of the survey analysis may not make it through the list moderation, so here is a link to it instead: https://www.dropbox.com/s/kriojmm60hg3cxg/FIM%20Survey.pdf?dl=0 Colleen From colleen at gazlene.net Thu Jan 30 17:19:46 2020 From: colleen at gazlene.net (Colleen Murphy) Date: Thu, 30 Jan 2020 09:19:46 -0800 Subject: [ops] Federated Identity Management survey In-Reply-To: <2116da33-6d85-4132-94e5-68bcea0c8385@www.fastmail.com> References: <4a7a0c41-59ce-4aac-839e-0840eeb50348@www.fastmail.com> <2116da33-6d85-4132-94e5-68bcea0c8385@www.fastmail.com> Message-ID: <61ec6b02-f77b-4503-8a16-a549952bf8c0@www.fastmail.com> On Mon, Jan 13, 2020, at 09:38, Colleen Murphy wrote: > On Mon, Dec 23, 2019, at 09:32, Colleen Murphy wrote: > > Hello operators, > > > > A researcher from the University of Kent who was influential in the > > design of keystone's federation implementation has asked the keystone > > team to gauge adoption of federated identity management in OpenStack > > deployments. This is something we've neglected to track well in the > > last few OpenStack user surveys, so I'd like to ask OpenStack operators > > to please take a few minutes to complete the following survey about > > your usage of identity federation in your OpenStack deployment (even if > > you don't use federation): > > > > https://uok.typeform.com/to/KuRY0q > > > > The results of this survey will benefit not only university research > > but also the keystone team as it will help us understand where to focus > > our efforts. 
Your participation is greatly appreciated. > > > > Thanks for your time, > > > > Colleen (cmurphy) > > > > > > Thanks to everyone who has completed this survey so far! The survey > will be closing in about a week, so if you have not yet completed it, > we appreciate you taking the time to do so now. > > Colleen (cmurphy) > > Thanks to everyone who responded to the survey. If you're interested, attached is an analysis of the results of the survey. Colleen (cmurphy) -------------- next part -------------- A non-text attachment was scrubbed... Name: FIM Survey.pdf Type: application/pdf Size: 80324 bytes Desc: not available URL: From sshnaidm at redhat.com Thu Jan 30 18:12:59 2020 From: sshnaidm at redhat.com (Sagi Shnaidman) Date: Thu, 30 Jan 2020 20:12:59 +0200 Subject: [openstack-ansible][ansible-sig][tripleo] Openstack Ansible modules - update on progress, new meeting time Message-ID: Hi, all because no people have been available in the meeting, I'll write a quick summary in ML. Agenda and progress are tracked here as usual: https://etherpad.openstack.org/p/openstack-ansible-modules 1. Because people had overlapping meetings I've created a poll for better meeting time. I think we can switch now to the bi-weekly meeting if no objections to it. The poll is here: https://xoyondo.com/dp/ITMGRZSvaZaONcz 2. We have the same coverage as we had in Ansible repository, so now we can start to improve our testing. It should include a matrix with various ansible versions and openstacksdk versions. We had already some issues with versions mismatch [1] Also we should start to do a better module design. 3. In case of critical patches to 2.9 and when openstack modules will be removed from Ansible repo, please submit patches directly to the 2.9-stable branch of Ansible. Confirmed that with Ansible folks. 4. The CI job in Ansible repository now is failing with a message about moving modules. I hope it won't block from 2.9 patches to land but will do more checking. But at least we've got attention to the fact modules are moving. 5. Jobs for pushing collections to ansible-galaxy are in progress and please review patches in the repo in your time: [1] https://review.opendev.org/705054 [2] https://review.opendev.org/#/q/project:openstack/ansible-collections-openstack -- Best regards Sagi Shnaidman -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Thu Jan 30 18:39:24 2020 From: whayutin at redhat.com (Wesley Hayutin) Date: Thu, 30 Jan 2020 11:39:24 -0700 Subject: [tripleo] no rechecks Message-ID: Greetings, Hey folks we need a few patches to land before it's safe to blindly recheck. A few minor patches need to land before you can expect to get +1 from zuul in a lot of cases. We appreciate your patience while we resolve the issues. Thanks * TRIPLEO GATES ARE BROKEN* * need two patches to fix - * https://review.opendev.org/#/c/705051/ - * https://review.opendev.org/#/c/704885/ - * Also need - * https://review.rdoproject.org/r/#/c/24754/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Thu Jan 30 18:47:24 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 30 Jan 2020 18:47:24 +0000 Subject: [cinder][nova] Migrating servers' root block devices from a cinder backend to another In-Reply-To: References: <13E45A51-9013-4F13-AE83-6DF08F2D6052@planethoster.info> <18d096c7-d3c7-7c70-f28d-0e77d6f41814@binero.se> Message-ID: Hi Tony, Have you considered doing a “Tech Refresh” process? 
The companies I’ve worked at consider VMs ephemeral. When we need to replace a cluster, we build the new one, notify the users to create new VMs there, and then delete the old ones and take down the old cluster. We give them tools like Forklift to help migrate, but we make it their responsibility to create the new VM and move their data over. From: Tony Pearce Sent: Thursday, January 30, 2020 1:58 AM To: Tobias Urdin Cc: OpenStack Discuss ML Subject: Re: [cinder][nova] Migrating servers' root block devices from a cinder backend to another Thanks Tobias - thats my last resort. I'd still need to upload that image to the new openstack and then build an instance from the image. I'd also need to use metadata to make sure the instance was built with the same components (IP address etc). Tony Pearce | Senior Network Engineer / Infrastructure Lead Cinglevue International Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com Australia 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. On Thu, 30 Jan 2020 at 17:55, Tobias Urdin > wrote: Another approach would be to export the data to Glance and download then you can upload it somewhere. There is no ready thing that I know about. We used the openstacksdk to simply recreate the steps we did on CLI. Create all the neccesary resources on the other side, create new instances from the migrated volume and set a fixedIP on the neutron port to get same IP address. On 1/30/20 9:43 AM, Tony Pearce wrote: I want to do something similar soon and don't want to touch the db (I experimented with cloning the "controller" and it did not achieve any desired outcome). Is there a way to export an instance from Openstack in terms of something like a script that could re-create it on another openstack as a like-for-like? I guess this is assuming that the instance is linux-based and has cloud-init enabled. Tony Pearce | Senior Network Engineer / Infrastructure Lead Cinglevue International Email: tony.pearce at cinglevue.com Web: http://www.cinglevue.com Australia 1 Walsh Loop, Joondalup, WA 6027 Australia. Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 Note: This email and all attachments are the sole property of Cinglevue International Pty Ltd. (or any of its subsidiary entities), and the information contained herein must be considered confidential, unless specified otherwise. If you are not the intended recipient, you must not use or forward the information contained in these documents. If you have received this message in error, please delete the email and notify the sender. On Thu, 30 Jan 2020 at 16:39, Tobias Urdin > wrote: We did this something similar recently, we booted all instances from Cinder volume (with "Delete on terminate" set) in an old platform. So we added our new Ceph storage to the old platform, removed instances (updated delete_on_terminate to 0 in Nova DB). Then we issued a retype so cinder-volume performed a `dd` of the volume from the old to the new storage. 
We then synced network/subnet/sg and started instances with same fixed IP and moved floating IPs to the new platform. Since you only have to swap storage you should experiment with powering off the instances and try doing a migrate of the volume but I suspect you need to either remove the instance or do some really nasty database operations. I would suggest always going through the API and recreate the instance from the migrated volume instead of changing in the DB. We had to update delete_on_terminate in DB but that was pretty trivial (and I even think there is a spec that is not implemented yet that will allow that from API). On 1/29/20 9:54 PM, Jean-Philippe Méthot wrote: Hi, We have a several hundred VMs which were built on cinder block devices as root drives which use a SAN backend. Now we want to change their backend from the SAN to Ceph. We can shutdown the VMs but we will not destroy them. I am aware that there is a cinder migrate volume command to change a volume’s backend, but it requires that the volume be completely detached. Forcing a detached state on that volume does let the volume migration take place, but the volume’s path in Nova block_device_mapping doesn’t update, for obvious reasons. So, I am considering forcing the volumes to a detached status in Cinder and then manually updating the nova db block_device_mapping entry for each volume so that the VM can boot back up afterwards. However, before I start toying with the database and accidentally break stuff, has anyone else ever done something similar? Got any tips or hints on how best to proceed? Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Thu Jan 30 19:48:55 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 30 Jan 2020 19:48:55 +0000 Subject: "encoding error when doing console log show" in Rocky Message-ID: We're running openstack-ansible Rocky and seeing the "encoding error when doing console log show." I see here that there is a patch: https://bugs.launchpad.net/python-openstackclient/+bug/1747862 How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way? -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Thu Jan 30 20:08:54 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Thu, 30 Jan 2020 21:08:54 +0100 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: The bug you mention affects clients. Hence you just need to patch your clients. I guess just using a newer version of client would work fine. -yoctozepto czw., 30 sty 2020 o 20:56 Albert Braden napisał(a): > > We’re running openstack-ansible Rocky and seeing the “encoding error when doing console log show.” I see here that there is a patch: > > > > https://bugs.launchpad.net/python-openstackclient/+bug/1747862 > > > > How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way? 
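To make the "newer version of client" suggestion above concrete, the check-and-upgrade would look roughly like this (a sketch only; I have not verified in which client release the fix actually landed):

$ pip show python-openstackclient
$ pip install -U python-openstackclient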
From Albert.Braden at synopsys.com Thu Jan 30 21:18:35 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Thu, 30 Jan 2020 21:18:35 +0000 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: Sorry if I am slow. This happens when I go to the controller, run my source file and then type: os console log show How can I install a new client to use on the controller command line? Do I need to replace the files mentioned below, or is there a better way? -----Original Message----- From: Radosław Piliszek Sent: Thursday, January 30, 2020 12:09 PM To: Albert Braden Cc: OpenStack Discuss ML Subject: Re: "encoding error when doing console log show" in Rocky The bug you mention affects clients. Hence you just need to patch your clients. I guess just using a newer version of client would work fine. -yoctozepto czw., 30 sty 2020 o 20:56 Albert Braden napisał(a): > > We’re running openstack-ansible Rocky and seeing the “encoding error when doing console log show.” I see here that there is a patch: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_python-2Dopenstackclient_-2Bbug_1747862&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=-HCLwwRjUJ1GrihF5PjSNOYqDFH9Y7Qg0Bckxae56UY&s=vpftbl1w23aba5hyvrLPC-bG3IemMkFqbNBDTxrtHFM&e= > > > > How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way? From rosmaita.fossdev at gmail.com Thu Jan 30 22:18:08 2020 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Thu, 30 Jan 2020 17:18:08 -0500 Subject: [cinder] last call for ussuri spec comments In-Reply-To: References: Message-ID: <92eaa3f2-3bfe-3c94-292b-3d91c8256753@gmail.com> On 1/30/20 11:27 AM, Brian Rosmaita wrote: > The following specs have two +2s.  I believe that all expressed concerns > have been addressed.  I intend to merge them at 22:00 UTC today unless a > serious issue is raised before then. > > https://review.opendev.org/#/c/684556/ - support volume-local-cache Some concerns were raised with the above patch. Liang, please address them. Don't worry if you can't get them done before the Friday deadline, I'm willing to give you a spec freeze exception. I think the concerns raised will be useful in making clarifications to the spec, but also in pointing out things that reviewers should keep in mind when reviewing the implementation. They also point out some testing directions that will be useful in validating the feature. With respect to the other spec: > https://review.opendev.org/#/c/700977 - add backup id to volume metadata Rajat had a few vocabulary clarifications that can be addressed in a follow-up patch. Conceptually, this spec is fine, so I went ahead and merged it. > > cheers, > brian From smooney at redhat.com Fri Jan 31 00:18:54 2020 From: smooney at redhat.com (Sean Mooney) Date: Fri, 31 Jan 2020 00:18:54 +0000 Subject: [cinder] last call for ussuri spec comments In-Reply-To: <92eaa3f2-3bfe-3c94-292b-3d91c8256753@gmail.com> References: <92eaa3f2-3bfe-3c94-292b-3d91c8256753@gmail.com> Message-ID: On Thu, 2020-01-30 at 17:18 -0500, Brian Rosmaita wrote: > On 1/30/20 11:27 AM, Brian Rosmaita wrote: > > The following specs have two +2s. I believe that all expressed concerns > > have been addressed. I intend to merge them at 22:00 UTC today unless a > > serious issue is raised before then. 
> >
> > https://review.opendev.org/#/c/684556/ - support volume-local-cache
>
> Some concerns were raised with the above patch. Liang, please address them. Don't worry if you can't get them done before the Friday deadline, I'm willing to give you a spec freeze exception. I think the concerns raised will be useful in making clarifications to the spec, but also in pointing out things that reviewers should keep in mind when reviewing the implementation. They also point out some testing directions that will be useful in validating the feature.

The one thing I want to raise related to this spec is that the design direction from the nova side is problematic. When reviewing https://review.opendev.org/#/c/689070/ it was noted that the nova libvirt driver has been moving away from mounting cinder volumes on the host and then passing that block device to qemu, in favor of using qemu's native ability to connect directly to remote storage.

Looking at the latest version of the nova spec, https://review.opendev.org/#/c/689070/8/specs/ussuri/approved/support-volume-local-cache.rst at 49, I note that this feature will only be capable of caching volumes that have already been mounted on the host. While keeping the management of the volumes in os-brick means that the overall impact on nova is minimal, considering that this feature would no longer work if we moved to using qemu's native iSCSI support, and that it will not work with NVMeoF volumes or Ceph, I'm not sure that the nova side will be approved.

When I first reviewed the nova spec I mentioned that I believed local caching could be a useful feature, but this really feels like a capability that should be developed in qemu, specifically the ability to provide a second device as a cache for any disk device assigned to an instance. That would allow local caching to be done regardless of the storage backend used. Qemu cannot do that today, so I understand that this approach is, in the short to medium term, likely the only workable solution, but I am concerned that the cinder side will be completed in Ussuri and the nova side will not.

> With respect to the other spec:
>
> > https://review.opendev.org/#/c/700977 - add backup id to volume metadata
>
> Rajat had a few vocabulary clarifications that can be addressed in a follow-up patch. Conceptually, this spec is fine, so I went ahead and merged it.
>
> > cheers,
> > brian
>

From missile0407 at gmail.com Fri Jan 31 02:19:20 2020
From: missile0407 at gmail.com (Eddie Yen)
Date: Fri, 31 Jan 2020 10:19:20 +0800
Subject: [kolla] ujson issue affected to few containers.
In-Reply-To: <801b30a3-62a1-a1e9-c0ef-973baa19b4a0@binero.se>
References: <801b30a3-62a1-a1e9-c0ef-973baa19b4a0@binero.se>
Message-ID:

In summary, it looks like we have to wait for the project to release the fixed code on PyPI, or compile the source code from its git project. Otherwise these containers may still be affected by this issue and fail to deploy or work.

We may try the Ubuntu binary deployment to see whether it is also affected or not. Perhaps users may want to deploy with binary on Ubuntu before the fix is released to PyPI.

- Eddie

On Thu, Jan 30, 2020 at 4:29 PM, Tobias Urdin wrote:
> Seeing this issue when messing around with Gnocchi on Ubuntu 18.04 as well.
> Temp solved it by installing ujson from master as suggested in [1] instead of pypi.
>
> [1] https://github.com/esnme/ultrajson/issues/346
>
> On 1/30/20 9:10 AM, Eddie Yen wrote:
> > Hi Radosław,
> > Sorry about lost distro information, the distro we're using is Ubuntu.
> > We have an old copy of ceilometer container image, the ujson.so version > between old and latest are both 1.35 > But only latest one affected this issue. > > BTW, I read the last reply on issue page. Since he said the python 3 with > newer GCC is OK, I think it may caused by python version issue or GCC > compiler versioning. > It may become a huge architect if it really caused by compiling issue, if > Ubuntu updated GCC or python. > > Radosław Piliszek 於 2020年1月30日 週四 下午3:48寫道: > >> Hi Eddie, >> >> the issue is that the project did *not* do a release. >> The latest is still 1.35 from Jan 20, *2016*... [1] >> >> You said only Rocky source - but is this ubuntu or centos? >> >> Also, by the looks of [2] master ceilometer is no longer affected, but >> monasca and mistral might still be if they call affected paths. >> >> The project looks dead so we are fried unless we override and start >> using its sources from git (hacky hacky). >> >> [1] https://pypi.org/project/ujson/#history >> [2] http://codesearch.openstack.org/?q=ujson&i=nope&files=&repos= >> >> -yoctozepto >> >> >> czw., 30 sty 2020 o 03:31 Eddie Yen napisał(a): >> > >> > Hi everyone, >> > >> > I'm not sure it should be bug report or not. So I email out about this >> issue. >> > >> > In these days, I found the Rocky source deployment always failed at >> Ceilometer bootstrapping. Then I found it failed at ceilometer-upgrade. >> > So I tried to looking at ceilometer-upgrade.log and the error shows it >> failed to import ujson. >> > >> > https://pastebin.com/nGqsM0uf >> > >> > Then I googled it and found this issue is already happened and released >> fixes. >> > https://github.com/esnme/ultrajson/issues/346 >> > >> > But it seems like the container still using the questionable one, even >> today (Jan 30 UTC+8). >> > And this not only affected to Ceilometer, but may also Gnocchi. >> > >> > I think we have to patch it, but not sure about the workaround. >> > Does anyone have good idea? >> > >> > Many thanks, >> > Eddie. >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhangbailin at inspur.com Fri Jan 31 05:23:43 2020 From: zhangbailin at inspur.com (=?utf-8?B?QnJpbiBaaGFuZyjlvKDnmb7mnpcp?=) Date: Fri, 31 Jan 2020 05:23:43 +0000 Subject: =?utf-8?B?562U5aSNOiBbbGlzdHMub3BlbnN0YWNrLm9yZ+S7o+WPkV1SZTogW2NpbmRl?= =?utf-8?B?cl1bbm92YV0gTWlncmF0aW5nIHNlcnZlcnMnIHJvb3QgYmxvY2sgZGV2aWNl?= =?utf-8?Q?s_from_a_cinder_backend_to_another?= In-Reply-To: <4193A674-A75C-470F-9722-AC6EB228652C@planethoster.info> References: <8acec68002f12277a243a416950aa1c6@sslemail.net> <4193A674-A75C-470F-9722-AC6EB228652C@planethoster.info> Message-ID: <1250fb8c337a49d19dbb48638eea3797@inspur.com> >To follow up on my last email, I have tested the following hackaround in my testing environment: > Assuming you're using the Libvirt driver, a hackaround here is to cold migrate the instance to another host, this refreshes the connection_info from c-api and should allow the instance to boot correctly. > I can confirm that on OpenStack Queens, this doesn’t work. Openstack can’t find the attachment ID as it was marked as deleted in the cinder DB. I’m guessing it put itself as deleted when I did cinder reset-state --attach-status detached b70b254f-58cd-4940-b976-6f4dc0209a8c. The “delete_on_termination” was recored in instance’s BDM table https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L629, and it was deal with by nova delete the server (clean up bdms, and call cinderclinet to delete the target volume). 
In Cinder, delete_on_termination cannot be recored, so in Cinder DB you cannot find this field. Clean up bdms: https://github.com/openstack/nova/blob/master/nova/compute/api.py#L2334-L2372 > Even though I did cinder reset-state --attach-status attached b70b254f-58cd-4940-b976-6f4dc0209a8c, the original attachment ID did not undelete itself. It also didn’t update itself to the new path. As a result, nova seems to be looking for the attachment ID when trying to migrate and errors out when it can’t find it. >That said, I do remember using this workaround in the past on older versions of Openstack, so this definitely used to work. I guess a change in cinder probably broke this. >Jean-Philippe Méthot >Openstack system administrator >Administrateur système Openstack >PlanetHoster inc. brinzhang -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony.pearce at cinglevue.com Fri Jan 31 06:05:39 2020 From: tony.pearce at cinglevue.com (Tony Pearce) Date: Fri, 31 Jan 2020 14:05:39 +0800 Subject: Kayobe Openstack deployment In-Reply-To: References: Message-ID: Thanks again Mark for your support and patience yesterday. I dont think I would have been able to go beyond that hurdle alone. I have gone back to the universe from nothing this morning. The issue I had there was actually the same issue that you had helped me with; so I have now moved past that point. I am running this in a VM and I did not have nested virtualisation enabled on the hosts so I've had to side step to get that implemented. I am sticking with the stable/train. I was not sure about this, but I had figured that as I want to deploy Openstack Train, I'd need the Kayobe stable/train. In terms of the docs - I may be in a good position to help here. I'm not a coder by any means, so I may be in a position to contribute back in this doc sense. Teething issues aside, I really like what I am seeing from Kayobe etc. compared to my previous experience with different deployment tool this seems much more user-friendly. Thanks again Regards *Tony* On Thu, 30 Jan 2020 at 21:40, Mark Goddard wrote: > On Thu, 30 Jan 2020 at 08:22, Tony Pearce > wrote: > > > > Hi all - I wanted to ask if there was such a reference architecture / > step-by-step deployment guide for Openstack / Kayobe that I could follow to > get a better understanding of the components and how to go about deploying > it? > > Hi Tony, we spoke in the #openstack-kolla IRC channel [1], but I > thought I'd reply here for the benefit of anyone reading this. > > > > > The documentation is not that great so I'm hitting various issues when > trying to follow what is there on the Openstack site. There's a lot of > technical things like information on variables - which is fantastic, but > there's no context about them. For example, the architecture page is pretty > small, when you get further on in the guide it's difficult to contextually > link detail back to the architecture. > > As discussed in IRC, we are missing some architecture and from scratch > walkthrough documentation in Kayobe. I've been focusing on the > configuration reference mostly, but I think it is time to move onto > these other areas to help new starters. > > > > > I tried to do the all-in-one deployment as well as the "universe from > nothing approach" but hit some issues there as well. 
Plus it's kind of like > trying to learn how to drive a bus by riding a micro-scooter :) > > I would definitely recommend persevering with the universe from > nothing demo [2], as it offers the quickest way to get a system up and > running that you can poke at. It's also a fairly good example of a > 'bare minimum' configuration. Could you share the issues you had with > it? For an even simpler setup, you could try [3], which gets you an > all-in-one control plane/compute host quite quickly. I'd suggest using > the stable/train branch for a more stable environment. > > > > > Also, the "report bug" bug link on the top of all the pages is going to > an error "page does not exist" - not sure that had been realised yet. > > Andreas Jaeger kindly proposed a fix for this. Here's the storyboard > link: https://storyboard.openstack.org/#!/project/openstack/kayobe. > > [1] > http://eavesdrop.openstack.org/irclogs/%23openstack-kolla/%23openstack-kolla.2020-01-30.log.html#t2020-01-30T04:07:14 > [2] https://github.com/stackhpc/a-universe-from-nothing > [3] > https://docs.openstack.org/kayobe/latest/development/automated.html#overcloud > > > > > Regards, > > > > > > Tony Pearce | Senior Network Engineer / Infrastructure Lead > > Cinglevue International > > > > Email: tony.pearce at cinglevue.com > > Web: http://www.cinglevue.com > > > > Australia > > 1 Walsh Loop, Joondalup, WA 6027 Australia. > > > > Direct: +61 8 6202 0036 | Main: +61 8 6202 0024 > > > > Note: This email and all attachments are the sole property of Cinglevue > International Pty Ltd. (or any of its subsidiary entities), and the > information contained herein must be considered confidential, unless > specified otherwise. If you are not the intended recipient, you must not > use or forward the information contained in these documents. If you have > received this message in error, please delete the email and notify the > sender. > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Fri Jan 31 08:01:31 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 31 Jan 2020 09:01:31 +0100 Subject: [kolla] ujson issue affected to few containers. In-Reply-To: References: <801b30a3-62a1-a1e9-c0ef-973baa19b4a0@binero.se> Message-ID: Well, the release does not look it's going to happen ever. Ubuntu binary rocky probably froze in time so it has a higher chance of working, though a potential rebuild will probably kill it as well. Let's start a general thread about ujson. -yoctozepto pt., 31 sty 2020 o 03:26 Eddie Yen napisał(a): > > In summary, looks like we have to wait the project release the fixed code on PyPI or compile the source code from its git project. > Otherwise these containers may still affected this issue and can't deploy or working. > > We may going to try the Ubuntu binary deployment to see it also affect of not. Perhaps the user may going to deploy with binary on Ubuntu before the fix release to PyPI. > > - Eddie > > Tobias Urdin 於 2020年1月30日 週四 下午4:29寫道: >> >> Seeing this issue when messing around with Gnocchi on Ubuntu 18.04 as well. >> Temp solved it by installing ujson from master as suggested in [1] instead of pypi. >> >> [1] https://github.com/esnme/ultrajson/issues/346 >> >> On 1/30/20 9:10 AM, Eddie Yen wrote: >> >> Hi Radosław, >> >> Sorry about lost distro information, the distro we're using is Ubuntu. 
>> >> We have an old copy of ceilometer container image, the ujson.so version between old and latest are both 1.35 >> But only latest one affected this issue. >> >> BTW, I read the last reply on issue page. Since he said the python 3 with newer GCC is OK, I think it may caused by python version issue or GCC compiler versioning. >> It may become a huge architect if it really caused by compiling issue, if Ubuntu updated GCC or python. >> >> Radosław Piliszek 於 2020年1月30日 週四 下午3:48寫道: >>> >>> Hi Eddie, >>> >>> the issue is that the project did *not* do a release. >>> The latest is still 1.35 from Jan 20, *2016*... [1] >>> >>> You said only Rocky source - but is this ubuntu or centos? >>> >>> Also, by the looks of [2] master ceilometer is no longer affected, but >>> monasca and mistral might still be if they call affected paths. >>> >>> The project looks dead so we are fried unless we override and start >>> using its sources from git (hacky hacky). >>> >>> [1] https://pypi.org/project/ujson/#history >>> [2] http://codesearch.openstack.org/?q=ujson&i=nope&files=&repos= >>> >>> -yoctozepto >>> >>> >>> czw., 30 sty 2020 o 03:31 Eddie Yen napisał(a): >>> > >>> > Hi everyone, >>> > >>> > I'm not sure it should be bug report or not. So I email out about this issue. >>> > >>> > In these days, I found the Rocky source deployment always failed at Ceilometer bootstrapping. Then I found it failed at ceilometer-upgrade. >>> > So I tried to looking at ceilometer-upgrade.log and the error shows it failed to import ujson. >>> > >>> > https://pastebin.com/nGqsM0uf >>> > >>> > Then I googled it and found this issue is already happened and released fixes. >>> > https://github.com/esnme/ultrajson/issues/346 >>> > >>> > But it seems like the container still using the questionable one, even today (Jan 30 UTC+8). >>> > And this not only affected to Ceilometer, but may also Gnocchi. >>> > >>> > I think we have to patch it, but not sure about the workaround. >>> > Does anyone have good idea? >>> > >>> > Many thanks, >>> > Eddie. >> >> From radoslaw.piliszek at gmail.com Fri Jan 31 08:18:28 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 31 Jan 2020 09:18:28 +0100 Subject: [kolla] ujson issue affected to few containers. In-Reply-To: References: <801b30a3-62a1-a1e9-c0ef-973baa19b4a0@binero.se> Message-ID: I checked ceilometer and it seems they dropped ujson in queens, it's only the gnocchi client that still uses it, unfortunately. -yoctozepto pt., 31 sty 2020 o 09:01 Radosław Piliszek napisał(a): > > Well, the release does not look it's going to happen ever. > Ubuntu binary rocky probably froze in time so it has a higher chance > of working, though a potential rebuild will probably kill it as well. > > Let's start a general thread about ujson. > > -yoctozepto > > pt., 31 sty 2020 o 03:26 Eddie Yen napisał(a): > > > > In summary, looks like we have to wait the project release the fixed code on PyPI or compile the source code from its git project. > > Otherwise these containers may still affected this issue and can't deploy or working. > > > > We may going to try the Ubuntu binary deployment to see it also affect of not. Perhaps the user may going to deploy with binary on Ubuntu before the fix release to PyPI. > > > > - Eddie > > > > Tobias Urdin 於 2020年1月30日 週四 下午4:29寫道: > >> > >> Seeing this issue when messing around with Gnocchi on Ubuntu 18.04 as well. > >> Temp solved it by installing ujson from master as suggested in [1] instead of pypi. 
> >> > >> [1] https://github.com/esnme/ultrajson/issues/346 > >> > >> On 1/30/20 9:10 AM, Eddie Yen wrote: > >> > >> Hi Radosław, > >> > >> Sorry about lost distro information, the distro we're using is Ubuntu. > >> > >> We have an old copy of ceilometer container image, the ujson.so version between old and latest are both 1.35 > >> But only latest one affected this issue. > >> > >> BTW, I read the last reply on issue page. Since he said the python 3 with newer GCC is OK, I think it may caused by python version issue or GCC compiler versioning. > >> It may become a huge architect if it really caused by compiling issue, if Ubuntu updated GCC or python. > >> > >> Radosław Piliszek 於 2020年1月30日 週四 下午3:48寫道: > >>> > >>> Hi Eddie, > >>> > >>> the issue is that the project did *not* do a release. > >>> The latest is still 1.35 from Jan 20, *2016*... [1] > >>> > >>> You said only Rocky source - but is this ubuntu or centos? > >>> > >>> Also, by the looks of [2] master ceilometer is no longer affected, but > >>> monasca and mistral might still be if they call affected paths. > >>> > >>> The project looks dead so we are fried unless we override and start > >>> using its sources from git (hacky hacky). > >>> > >>> [1] https://pypi.org/project/ujson/#history > >>> [2] http://codesearch.openstack.org/?q=ujson&i=nope&files=&repos= > >>> > >>> -yoctozepto > >>> > >>> > >>> czw., 30 sty 2020 o 03:31 Eddie Yen napisał(a): > >>> > > >>> > Hi everyone, > >>> > > >>> > I'm not sure it should be bug report or not. So I email out about this issue. > >>> > > >>> > In these days, I found the Rocky source deployment always failed at Ceilometer bootstrapping. Then I found it failed at ceilometer-upgrade. > >>> > So I tried to looking at ceilometer-upgrade.log and the error shows it failed to import ujson. > >>> > > >>> > https://pastebin.com/nGqsM0uf > >>> > > >>> > Then I googled it and found this issue is already happened and released fixes. > >>> > https://github.com/esnme/ultrajson/issues/346 > >>> > > >>> > But it seems like the container still using the questionable one, even today (Jan 30 UTC+8). > >>> > And this not only affected to Ceilometer, but may also Gnocchi. > >>> > > >>> > I think we have to patch it, but not sure about the workaround. > >>> > Does anyone have good idea? > >>> > > >>> > Many thanks, > >>> > Eddie. > >> > >> From radoslaw.piliszek at gmail.com Fri Jan 31 08:34:23 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 31 Jan 2020 09:34:23 +0100 Subject: [all][requirements][monasca][gnocchi] ujson, not maintained for over 4 years, has compiler issues Message-ID: This is a spinoff discussion of [1] to attract more people. As the subject goes, the situation of ujson is bad. Still, monasca and gnocchi (both server and client) seem to be using it which may break depending on compiler. The original issue is that the released version of ujson is in non-spec-conforming C which may break randomly based on used compiler and linker. There has been no release of ujson for more than 4 years. Based on general project activity, Monasca is probably able to fix it but Gnocchi not so surely... 
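For anyone blocked right now, the stopgap mentioned in the kolla thread was installing ujson straight from its git master rather than the 1.35 release on PyPI - a workaround rather than a fix, and pinning to a known-good commit would be wise:

$ pip install git+https://github.com/esnme/ultrajson.git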
[1] http://lists.openstack.org/pipermail/openstack-discuss/2020-January/thread.html

-yoctozepto

From radoslaw.piliszek at gmail.com Fri Jan 31 08:38:41 2020
From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=)
Date: Fri, 31 Jan 2020 09:38:41 +0100
Subject: "encoding error when doing console log show" in Rocky
In-Reply-To:
References:
Message-ID:

Well, I might be slow here. :-)
What source file do you need to run on the controller?

If you run the openstack client from the controller then it becomes your client, so just update the client there to the latest rocky release: https://docs.openstack.org/releasenotes/python-openstackclient/rocky.html

Something along the lines of:
sudo pip install -U python-openstackclient==3.16.3

-yoctozepto

On Thu, 30 Jan 2020 at 22:18, Albert Braden wrote:
>
> Sorry if I am slow. This happens when I go to the controller, run my source file and then type: os console log show
>
> How can I install a new client to use on the controller command line? Do I need to replace the files mentioned below, or is there a better way?
>
> -----Original Message-----
> From: Radosław Piliszek
> Sent: Thursday, January 30, 2020 12:09 PM
> To: Albert Braden
> Cc: OpenStack Discuss ML
> Subject: Re: "encoding error when doing console log show" in Rocky
>
> The bug you mention affects clients.
> Hence you just need to patch your clients.
> I guess just using a newer version of client would work fine.
>
> -yoctozepto
>
> On Thu, 30 Jan 2020 at 20:56, Albert Braden wrote:
> >
> > We're running openstack-ansible Rocky and seeing the "encoding error when doing console log show." I see here that there is a patch:
> >
> > https://bugs.launchpad.net/python-openstackclient/+bug/1747862
> >
> > How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way?

From tobias.urdin at binero.se Fri Jan 31 08:48:55 2020
From: tobias.urdin at binero.se (Tobias Urdin)
Date: Fri, 31 Jan 2020 09:48:55 +0100
Subject: [all][requirements][monasca][gnocchi] ujson, not maintained for over 4 years, has compiler issues
In-Reply-To:
References:
Message-ID:

I looked into that some days ago and the ujson usage is pretty minimal in Gnocchi. Digging up the commits that changed to ujson (it wasn't that long ago I think, for performance reasons) and reverting those shouldn't be too much work.

Best regards

On 1/31/20 9:38 AM, Radosław Piliszek wrote:
> This is a spinoff discussion of [1] to attract more people.
>
> As the subject goes, the situation of ujson is bad. Still, monasca and gnocchi (both server and client) seem to be using it, which may break depending on compiler.
> The original issue is that the released version of ujson is in non-spec-conforming C which may break randomly based on used compiler and linker.
> There has been no release of ujson for more than 4 years.
>
> Based on general project activity, Monasca is probably able to fix it but Gnocchi not so surely...
> > [1] http://lists.openstack.org/pipermail/openstack-discuss/2020-January/thread.html > > -yoctozepto > > From agarwalvishakha18 at gmail.com Fri Jan 31 09:31:24 2020 From: agarwalvishakha18 at gmail.com (Vishakha Agarwal) Date: Fri, 31 Jan 2020 15:01:24 +0530 Subject: [keystone] Keystone Team Update - Week of 27 January 2020 Message-ID: # Keystone Team Update - Week of 27 January 2020 ## News ### User Support and Bug Duty Every week the duty is being rotated between the members. The person-in-charge for bug duty for current and upcoming week can be seen on the etherpad [1] [1] https://etherpad.openstack.org/p/keystone-l1-duty ## Open Specs Ussuri specs: https://bit.ly/2XDdpkU Ongoing specs: https://bit.ly/2OyDLTh ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 19 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 26 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ### Priority Reviews * Ussuri Roadmap Stories - Groups in keystone SAML assertion https://tree.taiga.io/project/keystone-ussuri-roadmap/us/33 https://review.opendev.org/#/c/588211/ Add openstack_groups to assertion - Add support for modifying resource options to CLI tool https://tree.taiga.io/project/keystone-ussuri-roadmap/us/53 https://review.opendev.org/#/c/697444/ Adding options to user cli * Community Goals https://review.opendev.org/#/c/699119/ [ussuri][goal] Drop python 2.7 support and testing python-keystoneclient * Special Requests https://review.opendev.org/#/c/703578/ Updating tox -e all-plugin command https://review.opendev.org/#/c/704736/ Remove six usage ## Bugs This week we opened 4 new bugs and closed 4. Bugs opened (4) Bug #1860478 (keystone:Low): fetching role assignments should handle domain IDs in addition to project IDs - Opened by Harry Rybacki https://bugs.launchpad.net/keystone/+bug/1860478 Bug #1860252 (keystone:Undecided): security problem,one user can change other user's password without admin - Opened by kuangpeiling https://bugs.launchpad.net/keystone/+bug/1860252 Bug #1861264 (keystone:Undecided): Help information should be returned when executing 'placement-status' alone - Opened by Eric Xie https://bugs.launchpad.net/keystone/+bug/1861264 Bug #1861279 (keystone:Undecided): incomplete instruction in "Install and configure" in keystone in https://docs.openstack.org/keystone/train/install/keystone-install-rdo.html - Opened by Arie Maron https://bugs.launchpad.net/keystone/+bug/1861279 Bugs closed (4) Bug #1860252 (keystone:Undecided) https://bugs.launchpad.net/keystone/+bug/1860252 Bug #1861264 (keystone:Undecided) https://bugs.launchpad.net/keystone/+bug/1861264 Bug #1856904 (keystone:Undecided): CADF Notifications are missing user name in initiator object - Fixed by Gage Hugo https://bugs.launchpad.net/keystone/+bug/1856904 Bug #1858012 (keystone:Undecided): List role assignments by role ID may leak extra system assignments outside of filter - Fixed by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1858012 ## Milestone Outlook https://releases.openstack.org/ussuri/schedule.html Milestone 2 is in two weeks, which means spec freeze is coming up followed closely by feature proposal freeze. 
## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From radoslaw.piliszek at gmail.com Fri Jan 31 10:17:10 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 31 Jan 2020 11:17:10 +0100 Subject: [all][requirements][qa] zipp failing py3.5 again Message-ID: Folks, we are running circles here. [1] merged breaking zipp again for py3.5 The script which generates these proposals needs to learn python version constraints. Reproposed [2]. [1] https://review.opendev.org/705128 [2] https://review.opendev.org/705184 -yoctozepto From fungi at yuggoth.org Fri Jan 31 10:38:58 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 31 Jan 2020 10:38:58 +0000 Subject: [all][requirements][qa] zipp failing py3.5 again In-Reply-To: References: Message-ID: <20200131103857.sntkieulk4t33ysy@yuggoth.org> On 2020-01-31 11:17:10 +0100 (+0100), Radosław Piliszek wrote: [...] > The script which generates these proposals needs to learn python > version constraints. [...] It might be possible to crawl the PyPI API and try to apply matching environment markers, but I'm starting to think we just need separate constraints lists per interpreter version instead (and then we can generate them by running the current script from each interpreter). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From mthode at mthode.org Fri Jan 31 16:43:13 2020 From: mthode at mthode.org (Matthew Thode) Date: Fri, 31 Jan 2020 10:43:13 -0600 Subject: [all][requirements][qa] zipp failing py3.5 again In-Reply-To: <20200131103857.sntkieulk4t33ysy@yuggoth.org> References: <20200131103857.sntkieulk4t33ysy@yuggoth.org> Message-ID: <20200131164313.bufwfpt5nstsi6ok@mthode.org> On 20-01-31 10:38:58, Jeremy Stanley wrote: > On 2020-01-31 11:17:10 +0100 (+0100), Radosław Piliszek wrote: > [...] > > The script which generates these proposals needs to learn python > > version constraints. > [...] > > It might be possible to crawl the PyPI API and try to apply matching > environment markers, but I'm starting to think we just need separate > constraints lists per interpreter version instead (and then we can > generate them by running the current script from each interpreter). > -- > Jeremy Stanley I think that would require all those pythons being installed. Personally I only have access to 2.7(for now) and 3.6-3.9, so no 3.4 or 3.5 locally at least. Not sure what the availability is for all those pythons for gate. Maybe pyenv can help? https://github.com/pyenv/pyenv -- Matthew Thode -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From radoslaw.piliszek at gmail.com Fri Jan 31 16:56:50 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 31 Jan 2020 17:56:50 +0100 Subject: [all][requirements][qa] zipp failing py3.5 again In-Reply-To: <20200131164313.bufwfpt5nstsi6ok@mthode.org> References: <20200131103857.sntkieulk4t33ysy@yuggoth.org> <20200131164313.bufwfpt5nstsi6ok@mthode.org> Message-ID: Yeah, we had pyenv conversations with qa and infra folks. 
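To illustrate the environment-marker idea mentioned above: a single upper-constraints.txt can carry one entry per interpreter, along these lines (the pins below are purely illustrative, not a proposal):

zipp===1.0.0;python_version=='3.5'
zipp===2.1.0;python_version>='3.6'

pip then resolves the matching pin for whichever interpreter runs it, so one file could keep serving all the pythons we still test.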
OTOH, I don't think it would be that hard to get that PyPI info. Anyways, I believe we want to follow one u-c approach as multiple add the burden on user to use the proper one. -yoctozepto pt., 31 sty 2020 o 17:46 Matthew Thode napisał(a): > > On 20-01-31 10:38:58, Jeremy Stanley wrote: > > On 2020-01-31 11:17:10 +0100 (+0100), Radosław Piliszek wrote: > > [...] > > > The script which generates these proposals needs to learn python > > > version constraints. > > [...] > > > > It might be possible to crawl the PyPI API and try to apply matching > > environment markers, but I'm starting to think we just need separate > > constraints lists per interpreter version instead (and then we can > > generate them by running the current script from each interpreter). > > -- > > Jeremy Stanley > > I think that would require all those pythons being installed. > Personally I only have access to 2.7(for now) and 3.6-3.9, so no 3.4 or > 3.5 locally at least. Not sure what the availability is for all those > pythons for gate. Maybe pyenv can help? https://github.com/pyenv/pyenv > -- > Matthew Thode From openstack at nemebean.com Fri Jan 31 17:14:57 2020 From: openstack at nemebean.com (Ben Nemec) Date: Fri, 31 Jan 2020 11:14:57 -0600 Subject: [oslo] AFK this week In-Reply-To: <5f96427b-1655-c847-cb84-6b4a1aaafd06@nemebean.com> References: <5f96427b-1655-c847-cb84-6b4a1aaafd06@nemebean.com> Message-ID: <1acba3b1-ee33-ad16-fe50-995b95c8276a@nemebean.com> I'm still going to be out this coming Monday. I'll _probably_ be back sometime next week, but I make no promises. On 1/27/20 12:59 AM, Ben Nemec wrote: > I am going to be out most or all of this week. If someone wants to run > the meeting on Monday then feel free to have it without me again. I may > not be back right away next week either. A lot of stuff is still to be > determined at this point. > > -Ben > From Albert.Braden at synopsys.com Fri Jan 31 18:25:35 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Fri, 31 Jan 2020 18:25:35 +0000 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: This is the source file: https://f.perl.bot/p/7omp8p I'm not very good at PIP but it looks like I already have the latest Rocky client installed: https://f.perl.bot/p/xd6hi2 https://f.perl.bot/p/zxjh3t Requirement already satisfied, skipping upgrade: python-novaclient>=9.1.0 in /usr/local/lib/python2.7/dist-packages (from python-openstackclient==3.16.3) (16.0.0) root at us01odc-dev2-ctrl1:~# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) -----Original Message----- From: Radosław Piliszek Sent: Friday, January 31, 2020 12:39 AM To: Albert Braden Cc: Radosław Piliszek ; OpenStack Discuss ML Subject: Re: "encoding error when doing console log show" in Rocky Well, I might be slow here. :-) What source file do you need to run on the controller? 
If you run openstack client from controller then it becomes your client so just update the client there to the latest rocky release: https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_releasenotes_python-2Dopenstackclient_rocky.html&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=avA5Liwj2cVesY05Dk6kFm0YanYp7N7Q4PHGMbzeD6Y&s=dQaaBGR2A-dmJXGDr_TiJ0XVyghaMONnZasgJ52CnrU&e= something along this: sudo pip install -U python-openstackclient==3.16.3 -yoctozepto czw., 30 sty 2020 o 22:18 Albert Braden napisał(a): > > Sorry if I am slow. This happens when I go to the controller, run my source file and then type: os console log show > > How can I install a new client to use on the controller command line? Do I need to replace the files mentioned below, or is there a better way? > > -----Original Message----- > From: Radosław Piliszek > Sent: Thursday, January 30, 2020 12:09 PM > To: Albert Braden > Cc: OpenStack Discuss ML > Subject: Re: "encoding error when doing console log show" in Rocky > > The bug you mention affects clients. > Hence you just need to patch your clients. > I guess just using a newer version of client would work fine. > > -yoctozepto > > czw., 30 sty 2020 o 20:56 Albert Braden napisał(a): > > > > We’re running openstack-ansible Rocky and seeing the “encoding error when doing console log show.” I see here that there is a patch: > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_python-2Dopenstackclient_-2Bbug_1747862&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=-HCLwwRjUJ1GrihF5PjSNOYqDFH9Y7Qg0Bckxae56UY&s=vpftbl1w23aba5hyvrLPC-bG3IemMkFqbNBDTxrtHFM&e= > > > > > > > > How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way? From radoslaw.piliszek at gmail.com Fri Jan 31 18:49:19 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 31 Jan 2020 19:49:19 +0100 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: It looks like these are package-provided and hence pip refused to touch them. Are there no updates to packages of clients? You could try installing the client in venv if you insist on using this machine. That file you call "source file" has a definition of environment variables for osc, you surely "source" it but it's far from being a "source file" in the specific sense around here. Hence my previous confusion. :-) -yoctozepto pt., 31 sty 2020 o 19:25 Albert Braden napisał(a): > > This is the source file: > > https://f.perl.bot/p/7omp8p > > I'm not very good at PIP but it looks like I already have the latest Rocky client installed: > > https://f.perl.bot/p/xd6hi2 > https://f.perl.bot/p/zxjh3t > > Requirement already satisfied, skipping upgrade: python-novaclient>=9.1.0 in /usr/local/lib/python2.7/dist-packages (from python-openstackclient==3.16.3) (16.0.0) > > root at us01odc-dev2-ctrl1:~# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c > 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) > > -----Original Message----- > From: Radosław Piliszek > Sent: Friday, January 31, 2020 12:39 AM > To: Albert Braden > Cc: Radosław Piliszek ; OpenStack Discuss ML > Subject: Re: "encoding error when doing console log show" in Rocky > > Well, I might be slow here. 
:-) > What source file do you need to run on the controller? > If you run openstack client from controller then it becomes your > client so just update the client there to the latest rocky release: > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_releasenotes_python-2Dopenstackclient_rocky.html&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=avA5Liwj2cVesY05Dk6kFm0YanYp7N7Q4PHGMbzeD6Y&s=dQaaBGR2A-dmJXGDr_TiJ0XVyghaMONnZasgJ52CnrU&e= > something along this: > sudo pip install -U python-openstackclient==3.16.3 > > -yoctozepto > > czw., 30 sty 2020 o 22:18 Albert Braden napisał(a): > > > > Sorry if I am slow. This happens when I go to the controller, run my source file and then type: os console log show > > > > How can I install a new client to use on the controller command line? Do I need to replace the files mentioned below, or is there a better way? > > > > -----Original Message----- > > From: Radosław Piliszek > > Sent: Thursday, January 30, 2020 12:09 PM > > To: Albert Braden > > Cc: OpenStack Discuss ML > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > The bug you mention affects clients. > > Hence you just need to patch your clients. > > I guess just using a newer version of client would work fine. > > > > -yoctozepto > > > > czw., 30 sty 2020 o 20:56 Albert Braden napisał(a): > > > > > > We’re running openstack-ansible Rocky and seeing the “encoding error when doing console log show.” I see here that there is a patch: > > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_python-2Dopenstackclient_-2Bbug_1747862&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=-HCLwwRjUJ1GrihF5PjSNOYqDFH9Y7Qg0Bckxae56UY&s=vpftbl1w23aba5hyvrLPC-bG3IemMkFqbNBDTxrtHFM&e= > > > > > > > > > > > > How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way? From juliaashleykreger at gmail.com Fri Jan 31 19:27:46 2020 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Fri, 31 Jan 2020 11:27:46 -0800 Subject: [ironic] An approach to removing WSME In-Reply-To: <53794A7A-0B20-455F-9D49-4ED2ABD61B1F@intel.com> References: <53794A7A-0B20-455F-9D49-4ED2ABD61B1F@intel.com> Message-ID: Reply in-line. Thanks for raising this discussion Steve! On Thu, Jan 30, 2020 at 1:27 AM Wang, Jerry A wrote: > > Hi Steve: > > Thanks for your great efforts for WSME replacement. > > I had a quick review for your code changes, see my comments below: > > Review > Feedback > Reason > 704490 > Has concern > move wsme code to ironic, types.py > 704487 > Has concern > move wsme code to ironic, node.py > 704489 > has concern > move wsme code to ironic, args.py > 704486 > has concern > move wsme code to ironic expose.py > 703898 > Both positive and concern > Positive, better code structure than original code, concern, more > difficult to locate WSME code > 704488 > has concern > move wsme code to ironic > 704485 > has concern > Add pecan code, pecan would be replaced by flask > 703897 > Positive > Removed some WSME code with this change > 703723 > Positive > Removed some WSME code with this change > 703695 > Positive > Removed some WSME code with this change > > Firstly, I appreciated 3 changes which definately removed some wsme code, > especially the change 703897. 
> > But I have some concerns for changes 704490, 704487, 704489, ..., these > changes seemed move WSME code into ironic, that would make ironic code base > become a bit large, from my personal view, to use python built-in feature > or other 3-party lib to replace WSME function would be better. I like the > way that code change 703897 did. > > I guess I have two concerns. 1) Attribution and detailing the original source and licensing since the import of code does seem to meet the word "substantial", least to my perception. 2) I share the concern of importing code into ironic, and even adding substantial amount of code that we would need to potentially maintain. If my memory is recalling correctly, we wanted to move away from WSME in order to reduce maintenance overhead since it is not being actively developed. In the grand scheme of the universe, I am for us achieving our goals, and if built-in tooling or another third party library does not meet our needs, then I feel like it makes sense. > Thanks > Jerry > > 发自我的iPhone > > 在 2020年1月30日,上午8:53,Steve Baker 写道: > > I've put together a set of changes for removing WSME which involves > copying just enough of it into ironic, and I think we need to have the > conversation about whether this approach is desirable :) This git branch[1] > finishes with WSME removed and existing tests passing. > > Here are some stats about lines of code (not including unit tests, > calculated with cloc): > > 4500 wsme/wsme > > 6000 ironic/ironic/api master > > 7000 ironic/ironic/api story/1651346 > > In words, we need 1000 out of 4500 lines of WSME source in order to > support 6000 lines of ironic specific API code. > > Switching to a replacement for WSME would likely touch a good proportion > of that 6000 lines of ironic specific API code. If we eventually replace it > with a new library I think it would be easier if the thing being replaced > is inside the ironic source tree to allow for a gradual transition. So this > approach could be an end in itself, or it could be a step towards the final > goal. > > My strategy for copying in code was to pull in chunks of required logic > while removing some unused features (like request pass-through and date & > time type serialization). I also replaced py2/3 patterns with pure py3. > There is likely further things which can be removed or refactored for > simplicity, but what exists currently works. > > If there is enough positive feedback for this approach I'll start on unit > test coverage for the new code. > > [1] https://review.opendev.org/#/q/topic:story/1651346 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Fri Jan 31 20:31:25 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Fri, 31 Jan 2020 20:31:25 +0000 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: The client claims that it is version 3.16.3. What am I missing? root at us01odc-dev2-ctrl1:/var/log/nova# openstack --version openstack 3.16.3 root at us01odc-dev2-ctrl1:/var/log/nova# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) -----Original Message----- From: Radosław Piliszek Sent: Friday, January 31, 2020 10:49 AM To: Albert Braden Cc: Radosław Piliszek ; OpenStack Discuss ML Subject: Re: "encoding error when doing console log show" in Rocky It looks like these are package-provided and hence pip refused to touch them. 
Are there no updates to packages of clients? You could try installing the client in venv if you insist on using this machine. That file you call "source file" has a definition of environment variables for osc, you surely "source" it but it's far from being a "source file" in the specific sense around here. Hence my previous confusion. :-) -yoctozepto pt., 31 sty 2020 o 19:25 Albert Braden napisał(a): > > This is the source file: > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_7omp8p&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=-eOnfHHn64irzKfiMAMB9Hp4bpvYgldkQHRCuCOvE60&e= > > I'm not very good at PIP but it looks like I already have the latest Rocky client installed: > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_xd6hi2&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=3Skkokc_zvLhR_0v_WNYAp7rzYbPA0pDg_IvMmSPtHc&e= > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_zxjh3t&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=V7aVdb9JbVeogcCPGu2wAtCvM1aw5EgBGKLuhEIkMQw&e= > > Requirement already satisfied, skipping upgrade: python-novaclient>=9.1.0 in /usr/local/lib/python2.7/dist-packages (from python-openstackclient==3.16.3) (16.0.0) > > root at us01odc-dev2-ctrl1:~# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c > 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) > > -----Original Message----- > From: Radosław Piliszek > Sent: Friday, January 31, 2020 12:39 AM > To: Albert Braden > Cc: Radosław Piliszek ; OpenStack Discuss ML > Subject: Re: "encoding error when doing console log show" in Rocky > > Well, I might be slow here. :-) > What source file do you need to run on the controller? > If you run openstack client from controller then it becomes your > client so just update the client there to the latest rocky release: > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_releasenotes_python-2Dopenstackclient_rocky.html&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=avA5Liwj2cVesY05Dk6kFm0YanYp7N7Q4PHGMbzeD6Y&s=dQaaBGR2A-dmJXGDr_TiJ0XVyghaMONnZasgJ52CnrU&e= > something along this: > sudo pip install -U python-openstackclient==3.16.3 > > -yoctozepto > > czw., 30 sty 2020 o 22:18 Albert Braden napisał(a): > > > > Sorry if I am slow. This happens when I go to the controller, run my source file and then type: os console log show > > > > How can I install a new client to use on the controller command line? Do I need to replace the files mentioned below, or is there a better way? > > > > -----Original Message----- > > From: Radosław Piliszek > > Sent: Thursday, January 30, 2020 12:09 PM > > To: Albert Braden > > Cc: OpenStack Discuss ML > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > The bug you mention affects clients. > > Hence you just need to patch your clients. > > I guess just using a newer version of client would work fine. 
> > > > -yoctozepto > > > > czw., 30 sty 2020 o 20:56 Albert Braden napisał(a): > > > > > > We’re running openstack-ansible Rocky and seeing the “encoding error when doing console log show.” I see here that there is a patch: > > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_python-2Dopenstackclient_-2Bbug_1747862&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=-HCLwwRjUJ1GrihF5PjSNOYqDFH9Y7Qg0Bckxae56UY&s=vpftbl1w23aba5hyvrLPC-bG3IemMkFqbNBDTxrtHFM&e= > > > > > > > > > > > > How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way? From radoslaw.piliszek at gmail.com Fri Jan 31 20:40:35 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 31 Jan 2020 21:40:35 +0100 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: Then also: pip install -U python-novaclient==11.0.1 -yoctozepto pt., 31 sty 2020 o 21:31 Albert Braden napisał(a): > > The client claims that it is version 3.16.3. What am I missing? > > root at us01odc-dev2-ctrl1:/var/log/nova# openstack --version > openstack 3.16.3 > root at us01odc-dev2-ctrl1:/var/log/nova# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c > 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) > > -----Original Message----- > From: Radosław Piliszek > Sent: Friday, January 31, 2020 10:49 AM > To: Albert Braden > Cc: Radosław Piliszek ; OpenStack Discuss ML > Subject: Re: "encoding error when doing console log show" in Rocky > > It looks like these are package-provided and hence pip refused to touch them. > Are there no updates to packages of clients? > You could try installing the client in venv if you insist on using this machine. > > That file you call "source file" has a definition of environment > variables for osc, you surely "source" it but it's far from being a > "source file" in the specific sense around here. Hence my previous > confusion. 
:-) > > -yoctozepto > > pt., 31 sty 2020 o 19:25 Albert Braden napisał(a): > > > > This is the source file: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_7omp8p&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=-eOnfHHn64irzKfiMAMB9Hp4bpvYgldkQHRCuCOvE60&e= > > > > I'm not very good at PIP but it looks like I already have the latest Rocky client installed: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_xd6hi2&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=3Skkokc_zvLhR_0v_WNYAp7rzYbPA0pDg_IvMmSPtHc&e= > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_zxjh3t&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=V7aVdb9JbVeogcCPGu2wAtCvM1aw5EgBGKLuhEIkMQw&e= > > > > Requirement already satisfied, skipping upgrade: python-novaclient>=9.1.0 in /usr/local/lib/python2.7/dist-packages (from python-openstackclient==3.16.3) (16.0.0) > > > > root at us01odc-dev2-ctrl1:~# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c > > 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) > > > > -----Original Message----- > > From: Radosław Piliszek > > Sent: Friday, January 31, 2020 12:39 AM > > To: Albert Braden > > Cc: Radosław Piliszek ; OpenStack Discuss ML > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > Well, I might be slow here. :-) > > What source file do you need to run on the controller? > > If you run openstack client from controller then it becomes your > > client so just update the client there to the latest rocky release: > > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_releasenotes_python-2Dopenstackclient_rocky.html&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=avA5Liwj2cVesY05Dk6kFm0YanYp7N7Q4PHGMbzeD6Y&s=dQaaBGR2A-dmJXGDr_TiJ0XVyghaMONnZasgJ52CnrU&e= > > something along this: > > sudo pip install -U python-openstackclient==3.16.3 > > > > -yoctozepto > > > > czw., 30 sty 2020 o 22:18 Albert Braden napisał(a): > > > > > > Sorry if I am slow. This happens when I go to the controller, run my source file and then type: os console log show > > > > > > How can I install a new client to use on the controller command line? Do I need to replace the files mentioned below, or is there a better way? > > > > > > -----Original Message----- > > > From: Radosław Piliszek > > > Sent: Thursday, January 30, 2020 12:09 PM > > > To: Albert Braden > > > Cc: OpenStack Discuss ML > > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > > > The bug you mention affects clients. > > > Hence you just need to patch your clients. > > > I guess just using a newer version of client would work fine. 
> > > > > > -yoctozepto > > > > > > czw., 30 sty 2020 o 20:56 Albert Braden napisał(a): > > > > > > > > We’re running openstack-ansible Rocky and seeing the “encoding error when doing console log show.” I see here that there is a patch: > > > > > > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_python-2Dopenstackclient_-2Bbug_1747862&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=-HCLwwRjUJ1GrihF5PjSNOYqDFH9Y7Qg0Bckxae56UY&s=vpftbl1w23aba5hyvrLPC-bG3IemMkFqbNBDTxrtHFM&e= > > > > > > > > > > > > > > > > How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way? From Albert.Braden at synopsys.com Fri Jan 31 20:48:51 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Fri, 31 Jan 2020 20:48:51 +0000 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: That ran successfully, but it didn't fix the problem. Installing collected packages: python-novaclient Attempting uninstall: python-novaclient Found existing installation: python-novaclient 16.0.0 Uninstalling python-novaclient-16.0.0: Successfully uninstalled python-novaclient-16.0.0 Successfully installed python-novaclient-11.0.1 root at us01odc-dev2-ctrl1:/var/log/nova# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) -----Original Message----- From: Radosław Piliszek Sent: Friday, January 31, 2020 12:41 PM To: Albert Braden Cc: Radosław Piliszek ; OpenStack Discuss ML Subject: Re: "encoding error when doing console log show" in Rocky Then also: pip install -U python-novaclient==11.0.1 -yoctozepto pt., 31 sty 2020 o 21:31 Albert Braden napisał(a): > > The client claims that it is version 3.16.3. What am I missing? > > root at us01odc-dev2-ctrl1:/var/log/nova# openstack --version > openstack 3.16.3 > root at us01odc-dev2-ctrl1:/var/log/nova# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c > 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) > > -----Original Message----- > From: Radosław Piliszek > Sent: Friday, January 31, 2020 10:49 AM > To: Albert Braden > Cc: Radosław Piliszek ; OpenStack Discuss ML > Subject: Re: "encoding error when doing console log show" in Rocky > > It looks like these are package-provided and hence pip refused to touch them. > Are there no updates to packages of clients? > You could try installing the client in venv if you insist on using this machine. > > That file you call "source file" has a definition of environment > variables for osc, you surely "source" it but it's far from being a > "source file" in the specific sense around here. Hence my previous > confusion. 
:-) > > -yoctozepto > > pt., 31 sty 2020 o 19:25 Albert Braden napisał(a): > > > > This is the source file: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_7omp8p&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=-eOnfHHn64irzKfiMAMB9Hp4bpvYgldkQHRCuCOvE60&e= > > > > I'm not very good at PIP but it looks like I already have the latest Rocky client installed: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_xd6hi2&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=3Skkokc_zvLhR_0v_WNYAp7rzYbPA0pDg_IvMmSPtHc&e= > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_zxjh3t&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=V7aVdb9JbVeogcCPGu2wAtCvM1aw5EgBGKLuhEIkMQw&e= > > > > Requirement already satisfied, skipping upgrade: python-novaclient>=9.1.0 in /usr/local/lib/python2.7/dist-packages (from python-openstackclient==3.16.3) (16.0.0) > > > > root at us01odc-dev2-ctrl1:~# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c > > 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) > > > > -----Original Message----- > > From: Radosław Piliszek > > Sent: Friday, January 31, 2020 12:39 AM > > To: Albert Braden > > Cc: Radosław Piliszek ; OpenStack Discuss ML > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > Well, I might be slow here. :-) > > What source file do you need to run on the controller? > > If you run openstack client from controller then it becomes your > > client so just update the client there to the latest rocky release: > > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_releasenotes_python-2Dopenstackclient_rocky.html&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=avA5Liwj2cVesY05Dk6kFm0YanYp7N7Q4PHGMbzeD6Y&s=dQaaBGR2A-dmJXGDr_TiJ0XVyghaMONnZasgJ52CnrU&e= > > something along this: > > sudo pip install -U python-openstackclient==3.16.3 > > > > -yoctozepto > > > > czw., 30 sty 2020 o 22:18 Albert Braden napisał(a): > > > > > > Sorry if I am slow. This happens when I go to the controller, run my source file and then type: os console log show > > > > > > How can I install a new client to use on the controller command line? Do I need to replace the files mentioned below, or is there a better way? > > > > > > -----Original Message----- > > > From: Radosław Piliszek > > > Sent: Thursday, January 30, 2020 12:09 PM > > > To: Albert Braden > > > Cc: OpenStack Discuss ML > > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > > > The bug you mention affects clients. > > > Hence you just need to patch your clients. > > > I guess just using a newer version of client would work fine. 
> > > > > > -yoctozepto > > > > > > czw., 30 sty 2020 o 20:56 Albert Braden napisał(a): > > > > > > > > We’re running openstack-ansible Rocky and seeing the “encoding error when doing console log show.” I see here that there is a patch: > > > > > > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_python-2Dopenstackclient_-2Bbug_1747862&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=-HCLwwRjUJ1GrihF5PjSNOYqDFH9Y7Qg0Bckxae56UY&s=vpftbl1w23aba5hyvrLPC-bG3IemMkFqbNBDTxrtHFM&e= > > > > > > > > > > > > > > > > How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way? From Albert.Braden at synopsys.com Fri Jan 31 21:12:16 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Fri, 31 Jan 2020 21:12:16 +0000 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: In my other cluster where I haven't upgraded anything since the Rocky install, the old nova version 11 client works, but the openstack client fails: root at us01odc-dev1-ctrl1:~# openstack --version openstack 3.16.1 root at us01odc-dev1-ctrl1:~# openstack console log show 5a923a92-8fd1-48fd-8b76-768d1fb5f0c6 'latin-1' codec can't encode characters in position 45794-45796: ordinal not in range(256) root at us01odc-dev1-ctrl1:~# nova --version 11.0.0 root at us01odc-dev1-ctrl1:~# nova console-log 5a923a92-8fd1-48fd-8b76-768d1fb5f0c6 [ 0.000000] Initializing cgroup subsys cpuset [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Initializing cgroup subsys cpuacct [ 0.000000] Linux version 3.10.0-957.5.1.el7.x86_64 (mockbuild at kbuilder.bsys.centos.org) (gcc version 4.8.5 20150 ... -----Original Message----- From: Albert Braden Sent: Friday, January 31, 2020 12:49 PM To: Radosław Piliszek Cc: OpenStack Discuss ML Subject: RE: "encoding error when doing console log show" in Rocky That ran successfully, but it didn't fix the problem. Installing collected packages: python-novaclient Attempting uninstall: python-novaclient Found existing installation: python-novaclient 16.0.0 Uninstalling python-novaclient-16.0.0: Successfully uninstalled python-novaclient-16.0.0 Successfully installed python-novaclient-11.0.1 root at us01odc-dev2-ctrl1:/var/log/nova# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) -----Original Message----- From: Radosław Piliszek Sent: Friday, January 31, 2020 12:41 PM To: Albert Braden Cc: Radosław Piliszek ; OpenStack Discuss ML Subject: Re: "encoding error when doing console log show" in Rocky Then also: pip install -U python-novaclient==11.0.1 -yoctozepto pt., 31 sty 2020 o 21:31 Albert Braden napisał(a): > > The client claims that it is version 3.16.3. What am I missing? 
> > root at us01odc-dev2-ctrl1:/var/log/nova# openstack --version > openstack 3.16.3 > root at us01odc-dev2-ctrl1:/var/log/nova# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c > 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) > > -----Original Message----- > From: Radosław Piliszek > Sent: Friday, January 31, 2020 10:49 AM > To: Albert Braden > Cc: Radosław Piliszek ; OpenStack Discuss ML > Subject: Re: "encoding error when doing console log show" in Rocky > > It looks like these are package-provided and hence pip refused to touch them. > Are there no updates to packages of clients? > You could try installing the client in venv if you insist on using this machine. > > That file you call "source file" has a definition of environment > variables for osc, you surely "source" it but it's far from being a > "source file" in the specific sense around here. Hence my previous > confusion. :-) > > -yoctozepto > > pt., 31 sty 2020 o 19:25 Albert Braden napisał(a): > > > > This is the source file: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_7omp8p&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=-eOnfHHn64irzKfiMAMB9Hp4bpvYgldkQHRCuCOvE60&e= > > > > I'm not very good at PIP but it looks like I already have the latest Rocky client installed: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_xd6hi2&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=3Skkokc_zvLhR_0v_WNYAp7rzYbPA0pDg_IvMmSPtHc&e= > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_zxjh3t&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=V7aVdb9JbVeogcCPGu2wAtCvM1aw5EgBGKLuhEIkMQw&e= > > > > Requirement already satisfied, skipping upgrade: python-novaclient>=9.1.0 in /usr/local/lib/python2.7/dist-packages (from python-openstackclient==3.16.3) (16.0.0) > > > > root at us01odc-dev2-ctrl1:~# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c > > 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) > > > > -----Original Message----- > > From: Radosław Piliszek > > Sent: Friday, January 31, 2020 12:39 AM > > To: Albert Braden > > Cc: Radosław Piliszek ; OpenStack Discuss ML > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > Well, I might be slow here. :-) > > What source file do you need to run on the controller? > > If you run openstack client from controller then it becomes your > > client so just update the client there to the latest rocky release: > > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_releasenotes_python-2Dopenstackclient_rocky.html&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=avA5Liwj2cVesY05Dk6kFm0YanYp7N7Q4PHGMbzeD6Y&s=dQaaBGR2A-dmJXGDr_TiJ0XVyghaMONnZasgJ52CnrU&e= > > something along this: > > sudo pip install -U python-openstackclient==3.16.3 > > > > -yoctozepto > > > > czw., 30 sty 2020 o 22:18 Albert Braden napisał(a): > > > > > > Sorry if I am slow. This happens when I go to the controller, run my source file and then type: os console log show > > > > > > How can I install a new client to use on the controller command line? Do I need to replace the files mentioned below, or is there a better way? 
> > > > > > -----Original Message----- > > > From: Radosław Piliszek > > > Sent: Thursday, January 30, 2020 12:09 PM > > > To: Albert Braden > > > Cc: OpenStack Discuss ML > > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > > > The bug you mention affects clients. > > > Hence you just need to patch your clients. > > > I guess just using a newer version of client would work fine. > > > > > > -yoctozepto > > > > > > czw., 30 sty 2020 o 20:56 Albert Braden napisał(a): > > > > > > > > We’re running openstack-ansible Rocky and seeing the “encoding error when doing console log show.” I see here that there is a patch: > > > > > > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_python-2Dopenstackclient_-2Bbug_1747862&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=-HCLwwRjUJ1GrihF5PjSNOYqDFH9Y7Qg0Bckxae56UY&s=vpftbl1w23aba5hyvrLPC-bG3IemMkFqbNBDTxrtHFM&e= > > > > > > > > > > > > > > > > How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way? From radoslaw.piliszek at gmail.com Fri Jan 31 21:20:35 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 31 Jan 2020 22:20:35 +0100 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: Ok, try upgrading both to the latest (no pin). Latest clients should still work on Rocky and we don't have to guess versions. -yoctozepto On Fri, Jan 31, 2020, 22:12 Albert Braden wrote: > In my other cluster where I haven't upgraded anything since the Rocky > install, the old nova version 11 client works, but the openstack client > fails: > > root at us01odc-dev1-ctrl1:~# openstack --version > openstack 3.16.1 > root at us01odc-dev1-ctrl1:~# openstack console log show > 5a923a92-8fd1-48fd-8b76-768d1fb5f0c6 > 'latin-1' codec can't encode characters in position 45794-45796: ordinal > not in range(256) > root at us01odc-dev1-ctrl1:~# nova --version > 11.0.0 > root at us01odc-dev1-ctrl1:~# nova console-log > 5a923a92-8fd1-48fd-8b76-768d1fb5f0c6 > [ 0.000000] Initializing cgroup subsys cpuset > [ 0.000000] Initializing cgroup subsys cpu > [ 0.000000] Initializing cgroup subsys cpuacct > [ 0.000000] Linux version 3.10.0-957.5.1.el7.x86_64 ( > mockbuild at kbuilder.bsys.centos.org) (gcc version 4.8.5 20150 > ... > > -----Original Message----- > From: Albert Braden > Sent: Friday, January 31, 2020 12:49 PM > To: Radosław Piliszek > Cc: OpenStack Discuss ML > Subject: RE: "encoding error when doing console log show" in Rocky > > That ran successfully, but it didn't fix the problem. 
> > Installing collected packages: python-novaclient > Attempting uninstall: python-novaclient > Found existing installation: python-novaclient 16.0.0 > Uninstalling python-novaclient-16.0.0: > Successfully uninstalled python-novaclient-16.0.0 > Successfully installed python-novaclient-11.0.1 > root at us01odc-dev2-ctrl1:/var/log/nova# os console log show > 3febd3b2-df87-4f06-884b-378116c6fe4c > 'latin-1' codec can't encode characters in position 46615-46617: ordinal > not in range(256) > > -----Original Message----- > From: Radosław Piliszek > Sent: Friday, January 31, 2020 12:41 PM > To: Albert Braden > Cc: Radosław Piliszek ; OpenStack Discuss ML > > Subject: Re: "encoding error when doing console log show" in Rocky > > Then also: > pip install -U python-novaclient==11.0.1 > > -yoctozepto > > pt., 31 sty 2020 o 21:31 Albert Braden > napisał(a): > > > > The client claims that it is version 3.16.3. What am I missing? > > > > root at us01odc-dev2-ctrl1:/var/log/nova# openstack --version > > openstack 3.16.3 > > root at us01odc-dev2-ctrl1:/var/log/nova# os console log show > 3febd3b2-df87-4f06-884b-378116c6fe4c > > 'latin-1' codec can't encode characters in position 46615-46617: ordinal > not in range(256) > > > > -----Original Message----- > > From: Radosław Piliszek > > Sent: Friday, January 31, 2020 10:49 AM > > To: Albert Braden > > Cc: Radosław Piliszek ; OpenStack Discuss > ML > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > It looks like these are package-provided and hence pip refused to touch > them. > > Are there no updates to packages of clients? > > You could try installing the client in venv if you insist on using this > machine. > > > > That file you call "source file" has a definition of environment > > variables for osc, you surely "source" it but it's far from being a > > "source file" in the specific sense around here. Hence my previous > > confusion. 
:-) > > > > -yoctozepto > > > > pt., 31 sty 2020 o 19:25 Albert Braden > napisał(a): > > > > > > This is the source file: > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_7omp8p&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=-eOnfHHn64irzKfiMAMB9Hp4bpvYgldkQHRCuCOvE60&e= > > > > > > I'm not very good at PIP but it looks like I already have the latest > Rocky client installed: > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_xd6hi2&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=3Skkokc_zvLhR_0v_WNYAp7rzYbPA0pDg_IvMmSPtHc&e= > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_zxjh3t&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=V7aVdb9JbVeogcCPGu2wAtCvM1aw5EgBGKLuhEIkMQw&e= > > > > > > Requirement already satisfied, skipping upgrade: > python-novaclient>=9.1.0 in /usr/local/lib/python2.7/dist-packages (from > python-openstackclient==3.16.3) (16.0.0) > > > > > > root at us01odc-dev2-ctrl1:~# os console log show > 3febd3b2-df87-4f06-884b-378116c6fe4c > > > 'latin-1' codec can't encode characters in position 46615-46617: > ordinal not in range(256) > > > > > > -----Original Message----- > > > From: Radosław Piliszek > > > Sent: Friday, January 31, 2020 12:39 AM > > > To: Albert Braden > > > Cc: Radosław Piliszek ; OpenStack > Discuss ML > > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > > > Well, I might be slow here. :-) > > > What source file do you need to run on the controller? > > > If you run openstack client from controller then it becomes your > > > client so just update the client there to the latest rocky release: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_releasenotes_python-2Dopenstackclient_rocky.html&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=avA5Liwj2cVesY05Dk6kFm0YanYp7N7Q4PHGMbzeD6Y&s=dQaaBGR2A-dmJXGDr_TiJ0XVyghaMONnZasgJ52CnrU&e= > > > something along this: > > > sudo pip install -U python-openstackclient==3.16.3 > > > > > > -yoctozepto > > > > > > czw., 30 sty 2020 o 22:18 Albert Braden > napisał(a): > > > > > > > > Sorry if I am slow. This happens when I go to the controller, run my > source file and then type: os console log show > > > > > > > > How can I install a new client to use on the controller command > line? Do I need to replace the files mentioned below, or is there a better > way? > > > > > > > > -----Original Message----- > > > > From: Radosław Piliszek > > > > Sent: Thursday, January 30, 2020 12:09 PM > > > > To: Albert Braden > > > > Cc: OpenStack Discuss ML > > > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > > > > > The bug you mention affects clients. > > > > Hence you just need to patch your clients. > > > > I guess just using a newer version of client would work fine. 
> > > > > > > > -yoctozepto > > > > > > > > czw., 30 sty 2020 o 20:56 Albert Braden > napisał(a): > > > > > > > > > > We’re running openstack-ansible Rocky and seeing the “encoding > error when doing console log show.” I see here that there is a patch: > > > > > > > > > > > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_python-2Dopenstackclient_-2Bbug_1747862&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=-HCLwwRjUJ1GrihF5PjSNOYqDFH9Y7Qg0Bckxae56UY&s=vpftbl1w23aba5hyvrLPC-bG3IemMkFqbNBDTxrtHFM&e= > > > > > > > > > > > > > > > > > > > > How can I apply this patch to our Rocky install? Do I need to just > copy the file over the existing > /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on > controllers and hypervisors? Or is there a better way? > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Fri Jan 31 21:36:35 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Fri, 31 Jan 2020 21:36:35 +0000 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: They both upgraded but still fail: root at us01odc-dev2-ctrl1:~# nova --version 16.0.0 root at us01odc-dev2-ctrl1:~# nova console-log 3febd3b2-df87-4f06-884b-378116c6fe4c ERROR (UnicodeEncodeError): 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) root at us01odc-dev2-ctrl1:~# os --version openstack 4.0.0 root at us01odc-dev2-ctrl1:~# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) If I look at /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on the controller, I see the new code from https://review.opendev.org/#/c/541609/3/openstackclient/compute/v2/console.py: if data and data[-1] != '\n': data += '\n' self.app.stdout.write(data) Also I see this in the review at https://review.opendev.org/#/c/541609 If you are sure this works, please ignore my comment. I tried to verify this on a local python console (python-2.7.13). To do this I used sys.stdout = codecs.getwriter('ascii')(sys.stdout) and sys.stdout = codecs.getwriter('utf-8')(sys.stdout) Before running the first command, I can write any unicode-character, after the first command I get the same error as in the bug report. After running the second command, the error persists. That makes me believe, your patch won't fix the issue completely. Is my cluster one of the cases where the patch doesn’t fix the issue? From: Radosław Piliszek Sent: Friday, January 31, 2020 1:21 PM To: Albert Braden Cc: Radosław Piliszek ; OpenStack Discuss ML Subject: Re: "encoding error when doing console log show" in Rocky Ok, try upgrading both to the latest (no pin). Latest clients should still work on Rocky and we don't have to guess versions. 
-yoctozepto On Fri, Jan 31, 2020, 22:12 Albert Braden > wrote: In my other cluster where I haven't upgraded anything since the Rocky install, the old nova version 11 client works, but the openstack client fails: root at us01odc-dev1-ctrl1:~# openstack --version openstack 3.16.1 root at us01odc-dev1-ctrl1:~# openstack console log show 5a923a92-8fd1-48fd-8b76-768d1fb5f0c6 'latin-1' codec can't encode characters in position 45794-45796: ordinal not in range(256) root at us01odc-dev1-ctrl1:~# nova --version 11.0.0 root at us01odc-dev1-ctrl1:~# nova console-log 5a923a92-8fd1-48fd-8b76-768d1fb5f0c6 [ 0.000000] Initializing cgroup subsys cpuset [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Initializing cgroup subsys cpuacct [ 0.000000] Linux version 3.10.0-957.5.1.el7.x86_64 (mockbuild at kbuilder.bsys.centos.org) (gcc version 4.8.5 20150 ... -----Original Message----- From: Albert Braden > Sent: Friday, January 31, 2020 12:49 PM To: Radosław Piliszek > Cc: OpenStack Discuss ML > Subject: RE: "encoding error when doing console log show" in Rocky That ran successfully, but it didn't fix the problem. Installing collected packages: python-novaclient Attempting uninstall: python-novaclient Found existing installation: python-novaclient 16.0.0 Uninstalling python-novaclient-16.0.0: Successfully uninstalled python-novaclient-16.0.0 Successfully installed python-novaclient-11.0.1 root at us01odc-dev2-ctrl1:/var/log/nova# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) -----Original Message----- From: Radosław Piliszek > Sent: Friday, January 31, 2020 12:41 PM To: Albert Braden > Cc: Radosław Piliszek >; OpenStack Discuss ML > Subject: Re: "encoding error when doing console log show" in Rocky Then also: pip install -U python-novaclient==11.0.1 -yoctozepto pt., 31 sty 2020 o 21:31 Albert Braden > napisał(a): > > The client claims that it is version 3.16.3. What am I missing? > > root at us01odc-dev2-ctrl1:/var/log/nova# openstack --version > openstack 3.16.3 > root at us01odc-dev2-ctrl1:/var/log/nova# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c > 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) > > -----Original Message----- > From: Radosław Piliszek > > Sent: Friday, January 31, 2020 10:49 AM > To: Albert Braden > > Cc: Radosław Piliszek >; OpenStack Discuss ML > > Subject: Re: "encoding error when doing console log show" in Rocky > > It looks like these are package-provided and hence pip refused to touch them. > Are there no updates to packages of clients? > You could try installing the client in venv if you insist on using this machine. > > That file you call "source file" has a definition of environment > variables for osc, you surely "source" it but it's far from being a > "source file" in the specific sense around here. Hence my previous > confusion. 
:-) > > -yoctozepto > > pt., 31 sty 2020 o 19:25 Albert Braden > napisał(a): > > > > This is the source file: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_7omp8p&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=-eOnfHHn64irzKfiMAMB9Hp4bpvYgldkQHRCuCOvE60&e= > > > > I'm not very good at PIP but it looks like I already have the latest Rocky client installed: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_xd6hi2&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=3Skkokc_zvLhR_0v_WNYAp7rzYbPA0pDg_IvMmSPtHc&e= > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_zxjh3t&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=V7aVdb9JbVeogcCPGu2wAtCvM1aw5EgBGKLuhEIkMQw&e= > > > > Requirement already satisfied, skipping upgrade: python-novaclient>=9.1.0 in /usr/local/lib/python2.7/dist-packages (from python-openstackclient==3.16.3) (16.0.0) > > > > root at us01odc-dev2-ctrl1:~# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c > > 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) > > > > -----Original Message----- > > From: Radosław Piliszek > > > Sent: Friday, January 31, 2020 12:39 AM > > To: Albert Braden > > > Cc: Radosław Piliszek >; OpenStack Discuss ML > > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > Well, I might be slow here. :-) > > What source file do you need to run on the controller? > > If you run openstack client from controller then it becomes your > > client so just update the client there to the latest rocky release: > > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_releasenotes_python-2Dopenstackclient_rocky.html&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=avA5Liwj2cVesY05Dk6kFm0YanYp7N7Q4PHGMbzeD6Y&s=dQaaBGR2A-dmJXGDr_TiJ0XVyghaMONnZasgJ52CnrU&e= > > something along this: > > sudo pip install -U python-openstackclient==3.16.3 > > > > -yoctozepto > > > > czw., 30 sty 2020 o 22:18 Albert Braden > napisał(a): > > > > > > Sorry if I am slow. This happens when I go to the controller, run my source file and then type: os console log show > > > > > > How can I install a new client to use on the controller command line? Do I need to replace the files mentioned below, or is there a better way? > > > > > > -----Original Message----- > > > From: Radosław Piliszek > > > > Sent: Thursday, January 30, 2020 12:09 PM > > > To: Albert Braden > > > > Cc: OpenStack Discuss ML > > > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > > > The bug you mention affects clients. > > > Hence you just need to patch your clients. > > > I guess just using a newer version of client would work fine. 
> > > > > > -yoctozepto > > > > > > czw., 30 sty 2020 o 20:56 Albert Braden > napisał(a): > > > > > > > > We’re running openstack-ansible Rocky and seeing the “encoding error when doing console log show.” I see here that there is a patch: > > > > > > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_python-2Dopenstackclient_-2Bbug_1747862&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=-HCLwwRjUJ1GrihF5PjSNOYqDFH9Y7Qg0Bckxae56UY&s=vpftbl1w23aba5hyvrLPC-bG3IemMkFqbNBDTxrtHFM&e= > > > > > > > > > > > > > > > > How can I apply this patch to our Rocky install? Do I need to just copy the file over the existing /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on controllers and hypervisors? Or is there a better way? -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri Jan 31 21:36:46 2020 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 31 Jan 2020 21:36:46 +0000 Subject: [all][requirements][qa] zipp failing py3.5 again In-Reply-To: References: <20200131103857.sntkieulk4t33ysy@yuggoth.org> <20200131164313.bufwfpt5nstsi6ok@mthode.org> Message-ID: <20200131213646.kerileidw4262ypl@yuggoth.org> On 2020-01-31 17:56:50 +0100 (+0100), Radosław Piliszek wrote: > Yeah, we had pyenv conversations with qa and infra folks. For the auto-generated constraints updates we could just generate them on representative platforms in separate jobs (or a multi-platform multinode job, though that's just additional complexity for no real gain over multiple jobs). > OTOH, I don't think it would be that hard to get that PyPI info. The main challenge I foresee with that is you'll need to reimplement pip's logic around identifying what the highest supported version of a package is for each interpreter you care about. It's more than just knowing what interpreters a particular package version supports. > Anyways, I believe we want to follow one u-c approach as multiple > add the burden on user to use the proper one. [...] There's also a hybrid option, where you generate per-interpreter constraints files but then merge them, and automatically add an environment marker anywhere there's a difference. That allows you to continue to rely on pip's existing ability to identify the latest supported version of a package for each interpreter. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From radoslaw.piliszek at gmail.com Fri Jan 31 21:40:33 2020 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Fri, 31 Jan 2020 22:40:33 +0100 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: No idea... Last thing to try: play with locale settings (locale command). Try setting LANG to en_US.UTF-8 (or en_US if that fails). 
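For example, something along these lines (only a sketch, locale names can differ between distros):

locale                       # see what the shell currently advertises
export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8
openstack console log show <server-id>

Setting PYTHONIOENCODING=utf-8 in the environment may also be worth a try, as it overrides the encoding Python picks for stdout regardless of locale.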
-yoctozepto On Fri, Jan 31, 2020, 22:36 Albert Braden wrote: > They both upgraded but still fail: > > > > root at us01odc-dev2-ctrl1:~# nova --version > > 16.0.0 > > root at us01odc-dev2-ctrl1:~# nova console-log > 3febd3b2-df87-4f06-884b-378116c6fe4c > > ERROR (UnicodeEncodeError): 'latin-1' codec can't encode characters in > position 46615-46617: ordinal not in range(256) > > root at us01odc-dev2-ctrl1:~# os --version > > openstack 4.0.0 > > root at us01odc-dev2-ctrl1:~# os console log show > 3febd3b2-df87-4f06-884b-378116c6fe4c > > 'latin-1' codec can't encode characters in position 46615-46617: ordinal > not in range(256) > > > > If I look at > /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on > the controller, I see the new code from > https://review.opendev.org/#/c/541609/3/openstackclient/compute/v2/console.py > : > > > > if data and data[-1] != '\n': > > data += '\n' > > self.app.stdout.write(data) > > > > Also I see this in the review at https://review.opendev.org/#/c/541609 > > > > If you are sure this works, please ignore my comment. > > I tried to verify this on a local python console (python-2.7.13). > > To do this I used > > sys.stdout = codecs.getwriter('ascii')(sys.stdout) > > and > > sys.stdout = codecs.getwriter('utf-8')(sys.stdout) > > Before running the first command, I can write any unicode-character, after > the first command I get the same error as in the bug report. After running > the second command, the error persists. > > That makes me believe, your patch won't fix the issue completely. > > > > Is my cluster one of the cases where the patch doesn’t fix the issue? > > > > *From:* Radosław Piliszek > *Sent:* Friday, January 31, 2020 1:21 PM > *To:* Albert Braden > *Cc:* Radosław Piliszek ; OpenStack Discuss > ML > *Subject:* Re: "encoding error when doing console log show" in Rocky > > > > Ok, try upgrading both to the latest (no pin). Latest clients should still > work on Rocky and we don't have to guess versions. > > > > -yoctozepto > > > > On Fri, Jan 31, 2020, 22:12 Albert Braden > wrote: > > In my other cluster where I haven't upgraded anything since the Rocky > install, the old nova version 11 client works, but the openstack client > fails: > > root at us01odc-dev1-ctrl1:~# openstack --version > openstack 3.16.1 > root at us01odc-dev1-ctrl1:~# openstack console log show > 5a923a92-8fd1-48fd-8b76-768d1fb5f0c6 > 'latin-1' codec can't encode characters in position 45794-45796: ordinal > not in range(256) > root at us01odc-dev1-ctrl1:~# nova --version > 11.0.0 > root at us01odc-dev1-ctrl1:~# nova console-log > 5a923a92-8fd1-48fd-8b76-768d1fb5f0c6 > [ 0.000000] Initializing cgroup subsys cpuset > [ 0.000000] Initializing cgroup subsys cpu > [ 0.000000] Initializing cgroup subsys cpuacct > [ 0.000000] Linux version 3.10.0-957.5.1.el7.x86_64 ( > mockbuild at kbuilder.bsys.centos.org) (gcc version 4.8.5 20150 > ... > > -----Original Message----- > From: Albert Braden > Sent: Friday, January 31, 2020 12:49 PM > To: Radosław Piliszek > Cc: OpenStack Discuss ML > Subject: RE: "encoding error when doing console log show" in Rocky > > That ran successfully, but it didn't fix the problem. 
> > Installing collected packages: python-novaclient > Attempting uninstall: python-novaclient > Found existing installation: python-novaclient 16.0.0 > Uninstalling python-novaclient-16.0.0: > Successfully uninstalled python-novaclient-16.0.0 > Successfully installed python-novaclient-11.0.1 > root at us01odc-dev2-ctrl1:/var/log/nova# os console log show > 3febd3b2-df87-4f06-884b-378116c6fe4c > 'latin-1' codec can't encode characters in position 46615-46617: ordinal > not in range(256) > > -----Original Message----- > From: Radosław Piliszek > Sent: Friday, January 31, 2020 12:41 PM > To: Albert Braden > Cc: Radosław Piliszek ; OpenStack Discuss ML > > Subject: Re: "encoding error when doing console log show" in Rocky > > Then also: > pip install -U python-novaclient==11.0.1 > > -yoctozepto > > pt., 31 sty 2020 o 21:31 Albert Braden > napisał(a): > > > > The client claims that it is version 3.16.3. What am I missing? > > > > root at us01odc-dev2-ctrl1:/var/log/nova# openstack --version > > openstack 3.16.3 > > root at us01odc-dev2-ctrl1:/var/log/nova# os console log show > 3febd3b2-df87-4f06-884b-378116c6fe4c > > 'latin-1' codec can't encode characters in position 46615-46617: ordinal > not in range(256) > > > > -----Original Message----- > > From: Radosław Piliszek > > Sent: Friday, January 31, 2020 10:49 AM > > To: Albert Braden > > Cc: Radosław Piliszek ; OpenStack Discuss > ML > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > It looks like these are package-provided and hence pip refused to touch > them. > > Are there no updates to packages of clients? > > You could try installing the client in venv if you insist on using this > machine. > > > > That file you call "source file" has a definition of environment > > variables for osc, you surely "source" it but it's far from being a > > "source file" in the specific sense around here. Hence my previous > > confusion. 
:-) > > > > -yoctozepto > > > > pt., 31 sty 2020 o 19:25 Albert Braden > napisał(a): > > > > > > This is the source file: > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_7omp8p&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=-eOnfHHn64irzKfiMAMB9Hp4bpvYgldkQHRCuCOvE60&e= > > > > > > I'm not very good at PIP but it looks like I already have the latest > Rocky client installed: > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_xd6hi2&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=3Skkokc_zvLhR_0v_WNYAp7rzYbPA0pDg_IvMmSPtHc&e= > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__f.perl.bot_p_zxjh3t&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=9hpBVaDxs6DnEno4f15EYzgDq-WzDXknz_GfNTUqDGI&s=V7aVdb9JbVeogcCPGu2wAtCvM1aw5EgBGKLuhEIkMQw&e= > > > > > > Requirement already satisfied, skipping upgrade: > python-novaclient>=9.1.0 in /usr/local/lib/python2.7/dist-packages (from > python-openstackclient==3.16.3) (16.0.0) > > > > > > root at us01odc-dev2-ctrl1:~# os console log show > 3febd3b2-df87-4f06-884b-378116c6fe4c > > > 'latin-1' codec can't encode characters in position 46615-46617: > ordinal not in range(256) > > > > > > -----Original Message----- > > > From: Radosław Piliszek > > > Sent: Friday, January 31, 2020 12:39 AM > > > To: Albert Braden > > > Cc: Radosław Piliszek ; OpenStack > Discuss ML > > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > > > Well, I might be slow here. :-) > > > What source file do you need to run on the controller? > > > If you run openstack client from controller then it becomes your > > > client so just update the client there to the latest rocky release: > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_releasenotes_python-2Dopenstackclient_rocky.html&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=avA5Liwj2cVesY05Dk6kFm0YanYp7N7Q4PHGMbzeD6Y&s=dQaaBGR2A-dmJXGDr_TiJ0XVyghaMONnZasgJ52CnrU&e= > > > something along this: > > > sudo pip install -U python-openstackclient==3.16.3 > > > > > > -yoctozepto > > > > > > czw., 30 sty 2020 o 22:18 Albert Braden > napisał(a): > > > > > > > > Sorry if I am slow. This happens when I go to the controller, run my > source file and then type: os console log show > > > > > > > > How can I install a new client to use on the controller command > line? Do I need to replace the files mentioned below, or is there a better > way? > > > > > > > > -----Original Message----- > > > > From: Radosław Piliszek > > > > Sent: Thursday, January 30, 2020 12:09 PM > > > > To: Albert Braden > > > > Cc: OpenStack Discuss ML > > > > Subject: Re: "encoding error when doing console log show" in Rocky > > > > > > > > The bug you mention affects clients. > > > > Hence you just need to patch your clients. > > > > I guess just using a newer version of client would work fine. 
> > > > > > > > -yoctozepto > > > > > > > > czw., 30 sty 2020 o 20:56 Albert Braden > napisał(a): > > > > > > > > > > We’re running openstack-ansible Rocky and seeing the “encoding > error when doing console log show.” I see here that there is a patch: > > > > > > > > > > > > > > > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__bugs.launchpad.net_python-2Dopenstackclient_-2Bbug_1747862&d=DwIFaQ&c=DPL6_X_6JkXFx7AXWqB0tg&r=XrJBXYlVPpvOXkMqGPz6KucRW_ils95ZMrEmlTflPm8&m=-HCLwwRjUJ1GrihF5PjSNOYqDFH9Y7Qg0Bckxae56UY&s=vpftbl1w23aba5hyvrLPC-bG3IemMkFqbNBDTxrtHFM&e= > > > > > > > > > > > > > > > > > > > > How can I apply this patch to our Rocky install? Do I need to just > copy the file over the existing > /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on > controllers and hypervisors? Or is there a better way? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From whayutin at redhat.com Fri Jan 31 21:53:52 2020 From: whayutin at redhat.com (Wesley Hayutin) Date: Fri, 31 Jan 2020 14:53:52 -0700 Subject: [tripleo] no rechecks In-Reply-To: References: Message-ID: Greetings, See the inline On Thu, Jan 30, 2020 at 11:39 AM Wesley Hayutin wrote: > Greetings, > > Hey folks we need a few patches to land before it's safe to blindly > recheck. A few minor patches need to land before you can expect to get +1 > from zuul in a lot of cases. We appreciate your patience while we resolve > the issues. > > Thanks > > > * TRIPLEO GATES ARE BROKEN* > * need two patches to fix > > - * https://review.opendev.org/#/c/705051/ > > > - * https://review.opendev.org/#/c/704885/ > > > - * Also need > > > - * https://review.rdoproject.org/r/#/c/24754/ > > > Everything has merged that needs to.. Thanks for your patience.. moving the gate back to green status. Thank you Alex, Chandan, Yatin for your assistance in fixing up the mess :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Albert.Braden at synopsys.com Fri Jan 31 22:18:24 2020 From: Albert.Braden at synopsys.com (Albert Braden) Date: Fri, 31 Jan 2020 22:18:24 +0000 Subject: "encoding error when doing console log show" in Rocky In-Reply-To: References: Message-ID: Changing LANG from en_US to en_US.UTF-8 fixed both openstack and nova clients. Thanks for the advice! From: Radosław Piliszek Sent: Friday, January 31, 2020 1:41 PM To: Albert Braden Cc: Radosław Piliszek ; OpenStack Discuss ML Subject: Re: "encoding error when doing console log show" in Rocky No idea... Last thing to try: play with locale settings (locale command). Try setting LANG to en_US.UTF-8 (or en_US if that fails). 
-yoctozepto On Fri, Jan 31, 2020, 22:36 Albert Braden > wrote: They both upgraded but still fail: root at us01odc-dev2-ctrl1:~# nova --version 16.0.0 root at us01odc-dev2-ctrl1:~# nova console-log 3febd3b2-df87-4f06-884b-378116c6fe4c ERROR (UnicodeEncodeError): 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) root at us01odc-dev2-ctrl1:~# os --version openstack 4.0.0 root at us01odc-dev2-ctrl1:~# os console log show 3febd3b2-df87-4f06-884b-378116c6fe4c 'latin-1' codec can't encode characters in position 46615-46617: ordinal not in range(256) If I look at /usr/lib/python2.7/dist-packages/openstackclient/compute/v2/console.py on the controller, I see the new code from https://review.opendev.org/#/c/541609/3/openstackclient/compute/v2/console.py: if data and data[-1] != '\n': data += '\n' self.app.stdout.write(data) Also I see this in the review at https://review.opendev.org/#/c/541609 If you are sure this works, please ignore my comment. I tried to verify this on a local python console (python-2.7.13). To do this I used sys.stdout = codecs.getwriter('ascii')(sys.stdout) and sys.stdout = codecs.getwriter('utf-8')(sys.stdout) Before running the first command, I can write any unicode-character, after the first command I get the same error as in the bug report. After running the second command, the error persists. That makes me believe, your patch won't fix the issue completely. Is my cluster one of the cases where the patch doesn’t fix the issue? From: Radosław Piliszek > Sent: Friday, January 31, 2020 1:21 PM To: Albert Braden > Cc: Radosław Piliszek >; OpenStack Discuss ML > Subject: Re: "encoding error when doing console log show" in Rocky Ok, try upgrading both to the latest (no pin). Latest clients should still work on Rocky and we don't have to guess versions. -yoctozepto On Fri, Jan 31, 2020, 22:12 Albert Braden > wrote: In my other cluster where I haven't upgraded anything since the Rocky install, the old nova version 11 client works, but the openstack client fails: root at us01odc-dev1-ctrl1:~# openstack --version openstack 3.16.1 root at us01odc-dev1-ctrl1:~# openstack console log show 5a923a92-8fd1-48fd-8b76-768d1fb5f0c6 'latin-1' codec can't encode characters in position 45794-45796: ordinal not in range(256) root at us01odc-dev1-ctrl1:~# nova --version 11.0.0 root at us01odc-dev1-ctrl1:~# nova console-log 5a923a92-8fd1-48fd-8b76-768d1fb5f0c6 [ 0.000000] Initializing cgroup subsys cpuset [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Initializing cgroup subsys cpuacct [ 0.000000] Linux version 3.10.0-957.5.1.el7.x86_64 (mockbuild at kbuilder.bsys.centos.org) (gcc version 4.8.5 20150 ... -------------- next part -------------- An HTML attachment was scrubbed... URL:
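Why the LANG change resolves the error (a hypothetical Python 2.7 check, not taken from the thread; the exact codec names vary by distro):

  python2.7 -c "import sys; print(sys.stdout.encoding)"
      # typically ISO-8859-1 (latin-1) under LANG=en_US and UTF-8 under LANG=en_US.UTF-8
  python2.7 -c "print(u'\u2018quoted\u2019')"
      # U+2018/U+2019 fall outside latin-1, so this raises UnicodeEncodeError under the
      # first locale and prints the curly quotes under the second

When stdout is a terminal, Python 2 derives the output codec from the locale, so a guest console log containing smart quotes or other non-Latin-1 characters only prints cleanly under a UTF-8 locale.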