From eblock at nde.ag Mon Jul 1 07:31:14 2019
From: eblock at nde.ag (Eugen Block)
Date: Mon, 01 Jul 2019 07:31:14 +0000
Subject: glance image upload error [Errno 32] Broken pipe
In-Reply-To: 
Message-ID: <20190701073114.Horde.OFwjjcHd__j9TawGaNnt3iK@webmail.nde.ag>

Hi,

have you checked whether your glance services are still up and running?
The error message indicates that it's not ceph but the glance endpoint
that's not there.

Regards,
Eugen

Zitat von Satish Patel :

> I have installed openstack-ansible and integrated glance with ceph
> storage. The first day I uploaded an image it worked, but today when I
> try to upload an image I get this error:
>
> [root at ostack-infra-2-1-utility-container-c166f549 ~]# openstack image
> create --file cirros-0.3.4-x86_64-disk.raw --container-format bare
> --disk-format raw --public cirros-0.3.4-tmp
> Error finding address for
> http://172.28.8.9:9292/v2/images/8f3456aa-52fc-4b4a-8b11-dfbadb8e88ca/file:
> [Errno 32] Broken pipe
>
> I do have enough space on the ceph storage, and I am not seeing any
> errors in the glance logs that would help me.
>
> [root at ostack-infra-01-ceph-mon-container-692bea95 root]# ceph df detail
> GLOBAL:
>     SIZE      AVAIL     RAW USED     %RAW USED     OBJECTS
>     9314G     3526G     5787G        62.14         245k
> POOLS:
>     NAME      ID   QUOTA OBJECTS   QUOTA BYTES   USED     %USED   MAX AVAIL   OBJECTS   DIRTY   READ    WRITE   RAW USED
>     images    6    95367M          N/A           22522M   3.62    584G        2839      2839    2594k   36245   67568M
>     vms       7    N/A             N/A           1912G    76.58   584G        248494    242k    6567M   3363M   5738G
>     volumes   8    N/A             N/A           0        0       584G        0         0       0       0
>     backups   9    N/A             N/A           0        0       584G        0         0       0       0
>     metrics   10   N/A             N/A           0        0       584G        0         0       0       0

From mark at stackhpc.com Mon Jul 1 08:10:34 2019
From: mark at stackhpc.com (Mark Goddard)
Date: Mon, 1 Jul 2019 09:10:34 +0100
Subject: [kolla-ansible] migration
In-Reply-To: 
References: 
Message-ID: 

It sounds like you got quite close to having this working. I'd suggest
debugging this instance build failure. One difference with kolla is that
we run libvirt inside a container. Have you stopped libvirt from running
on the host?
Mark

On Sun, 30 Jun 2019 at 09:55, Ignazio Cassano wrote:
>
> Hi Mark,
> let me explain what I am trying to do.
> I have a queens installation based on centos and pacemaker with some
> instances and heat stacks.
> I would like to have another installation with the same instances,
> projects, stacks ... I'd like to have the same uuid for all objects
> (users, projects, instances and so on), because it is controlled by a
> cloud management platform we wrote.
>
> I stopped the controllers on the old queens installation, backing up the
> openstack database.
> I installed the new kolla openstack queens on three new controllers with
> the same addresses as the old installation, VIP as well.
> One of the three controllers is also a kvm node on queens.
> I stopped all containers except rabbitmq, keepalived, haproxy and mariadb.
> I deleted all openstack databases on the mariadb container and imported
> the old tables, changing the rabbit address to point to the new rabbit
> cluster.
> I restarted the containers.
> After changing the rabbit address on the old kvm nodes, I can see the old
> virtual machines and I can open a console on them.
> I can see all networks (tenant and provider) of the old installation, but
> when I try to create a new instance on the new kvm, it remains in
> building state.
> It seems it cannot acquire an address.
> Storage between the old and new installations is shared on NETAPP NFS, so
> I can see the cinder volumes.
> I suppose the db structure is different between a kolla installation and
> a manual installation!?
> What is wrong?
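For the "stuck in BUILD, no address" symptom after moving the database and
RabbitMQ, two quick checks that come up later in this thread are the Neutron
agents and the Nova cell mappings. A rough sketch, assuming the standard
clients and the default kolla container names (which may differ in your
deployment):

  openstack network agent list
  # all DHCP/L2/L3 agents should report as alive and up

  docker exec -it nova_api nova-manage cell_v2 list_cells
  # the transport and database URLs must point at the new cluster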
> Thanks > Ignazio > > > > > Il giorno gio 27 giu 2019 alle ore 16:44 Mark Goddard ha scritto: >> >> On Thu, 27 Jun 2019 at 14:46, Ignazio Cassano wrote: >> > >> > Sorry, for my question. >> > It does not need to change anything because endpoints refer to haproxy vips. >> > So if your new glance works fine you change haproxy backends for glance. >> > Regards >> > Ignazio >> >> That's correct - only the haproxy backend needs to be updated. >> >> > >> > >> > Il giorno gio 27 giu 2019 alle ore 15:21 Ignazio Cassano ha scritto: >> >> >> >> Hello Mark, >> >> let me to verify if I understood your method. >> >> >> >> You have old controllers,haproxy,mariadb and nova computes. >> >> You installed three new controllers but kolla.ansible inventory contains old mariadb and old rabbit servers. >> >> You are deployng single service on new controllers staring with glance. >> >> When you deploy glance on new controllers, it changes the glance endpoint on old mariadb db ? >> >> Regards >> >> Ignazio >> >> >> >> Il giorno gio 27 giu 2019 alle ore 10:52 Mark Goddard ha scritto: >> >>> >> >>> On Wed, 26 Jun 2019 at 19:34, Ignazio Cassano wrote: >> >>> > >> >>> > Hello, >> >>> > Anyone have tried to migrate an existing openstack installation to kolla containers? >> >>> >> >>> Hi, >> >>> >> >>> I'm aware of two people currently working on that. Gregory Orange and >> >>> one of my colleagues, Pierre Riteau. Pierre is away currently, so I >> >>> hope he doesn't mind me quoting him from an email to Gregory. >> >>> >> >>> Mark >> >>> >> >>> "I am indeed working on a similar migration using Kolla Ansible with >> >>> Kayobe, starting from a non-containerised OpenStack deployment based >> >>> on CentOS RPMs. >> >>> Existing OpenStack services are deployed across several controller >> >>> nodes and all sit behind HAProxy, including for internal endpoints. >> >>> We have additional controller nodes that we use to deploy >> >>> containerised services. If you don't have the luxury of additional >> >>> nodes, it will be more difficult as you will need to avoid processes >> >>> clashing when listening on the same port. >> >>> >> >>> The method I am using resembles your second suggestion, however I am >> >>> deploying only one containerised service at a time, in order to >> >>> validate each of them independently. >> >>> I use the --tags option of kolla-ansible to restrict Ansible to >> >>> specific roles, and when I am happy with the resulting configuration I >> >>> update HAProxy to point to the new controllers. >> >>> >> >>> As long as the configuration matches, this should be completely >> >>> transparent for purely HTTP-based services like Glance. You need to be >> >>> more careful with services that include components listening for RPC, >> >>> such as Nova: if the new nova.conf is incorrect and you've deployed a >> >>> nova-conductor that uses it, you could get failed instances launches. >> >>> Some roles depend on others: if you are deploying the >> >>> neutron-openvswitch-agent, you need to run the openvswitch role as >> >>> well. >> >>> >> >>> I suggest starting with migrating Glance as it doesn't have any >> >>> internal services and is easy to validate. Note that properly >> >>> migrating Keystone requires keeping existing Fernet keys around, so >> >>> any token stays valid until the time it is expected to stop working >> >>> (which is fairly complex, see >> >>> https://bugs.launchpad.net/kolla-ansible/+bug/1809469). 
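As a concrete illustration of the per-service, --tags based approach
described above, the Glance step could look roughly like this (the
inventory path is an example and the exact tag names depend on the
kolla-ansible version in use):

  kolla-ansible -i /etc/kolla/multinode deploy --tags glance
  # once the rendered configuration looks right, switch the HAProxy
  # backend for glance to the new controllers, then sanity check:
  openstack image list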
>> >>> >> >>> While initially I was using an approach similar to your first >> >>> suggestion, it can have side effects since Kolla Ansible uses these >> >>> variables when templating configuration. As an example, most services >> >>> will only have notifications enabled if enable_ceilometer is true. >> >>> >> >>> I've added existing control plane nodes to the Kolla Ansible inventory >> >>> as separate groups, which allows me to use the existing database and >> >>> RabbitMQ for the containerised services. >> >>> For example, instead of: >> >>> >> >>> [mariadb:children] >> >>> control >> >>> >> >>> you may have: >> >>> >> >>> [mariadb:children] >> >>> oldcontrol_db >> >>> >> >>> I still have to perform the migration of these underlying services to >> >>> the new control plane, I will let you know if there is any hurdle. >> >>> >> >>> A few random things to note: >> >>> >> >>> - if run on existing control plane hosts, the baremetal role removes >> >>> some packages listed in `redhat_pkg_removals` which can trigger the >> >>> removal of OpenStack dependencies using them! I've changed this >> >>> variable to an empty list. >> >>> - compare your existing deployment with a Kolla Ansible one to check >> >>> for differences in endpoints, configuration files, database users, >> >>> service users, etc. For Heat, Kolla uses the domain heat_user_domain, >> >>> while your existing deployment may use another one (and this is >> >>> hardcoded in the Kolla Heat image). Kolla Ansible uses the "service" >> >>> project while a couple of deployments I worked with were using >> >>> "services". This shouldn't matter, except there was a bug in Kolla >> >>> which prevented it from setting the roles correctly: >> >>> https://bugs.launchpad.net/kolla/+bug/1791896 (now fixed in latest >> >>> Rocky and Queens images) >> >>> - the ml2_conf.ini generated for Neutron generates physical network >> >>> names like physnet1, physnet2… you may want to override >> >>> bridge_mappings completely. >> >>> - although sometimes it could be easier to change your existing >> >>> deployment to match Kolla Ansible settings, rather than configure >> >>> Kolla Ansible to match your deployment." >> >>> >> >>> > Thanks >> >>> > Ignazio >> >>> > From bence.romsics at gmail.com Mon Jul 1 09:26:21 2019 From: bence.romsics at gmail.com (Bence Romsics) Date: Mon, 1 Jul 2019 11:26:21 +0200 Subject: [neutron] bug deputy report for week of 2019-06-24 Message-ID: Hi Everybody, These are the new bugs of last week: Critical: * https://bugs.launchpad.net/neutron/+bug/1834298 Neutron-fwaas python 3.7 job is broken Fix tested, needs +2 from neutron-fwaas cores: https://review.opendev.org/667736 * https://bugs.launchpad.net/neutron/+bug/1833902 Revert resize tests are failing in jobs with iptables_hybrid fw driver A nova bug breaking neutron gate. 
Fix merged: https://review.opendev.org/667035 High: * https://bugs.launchpad.net/neutron/+bug/1834257 dhcp-agent can overwhelm neutron server with dhcp_ready_on_ports RPC calls Fix in progress: https://review.opendev.org/659274 * https://bugs.launchpad.net/neutron/+bug/1834484 [QoS] qos_plugin._extend_port_resource_request is killing port retrieval performance Fix in progress: https://review.opendev.org/667981, https://review.opendev.org/667998 Medium: * https://bugs.launchpad.net/neutron/+bug/1834308 [DVR][DB] too many slow query during agent restart Needs more information to reproduce * https://bugs.launchpad.net/neutron/+bug/1834753 TC filter priority parameter in Pyroute is "prio" Fix in progress: https://review.opendev.org/668308 * https://bugs.launchpad.net/neutron/+bug/1834825 Rule to prevent SNAT for router's internal traffic is wrong Fix in progress: https://review.opendev.org/668378 RFE: * https://bugs.launchpad.net/neutron/+bug/1834174 [RFE] Add support for IPoIB interface driver * https://bugs.launchpad.net/neutron/+bug/1834176 [RFE] Add support for per-physnet interface driver Cheers, Bence (rubasov) From ignaziocassano at gmail.com Mon Jul 1 10:10:58 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 1 Jul 2019 12:10:58 +0200 Subject: [kolla-ansible] migration In-Reply-To: References: Message-ID: Hi Mark, the kolla environment has a new kvm host hosted on controller node. The kolla openstack can see instances on old kvm nodes and it can access them using vnc console porvided by dashboard, but cannot run new instances on new kvm host :-( Regards Ignazio Il giorno lun 1 lug 2019 alle ore 10:10 Mark Goddard ha scritto: > It sounds like you got quite close to having this working. I'd suggest > debugging this instance build failure. One difference with kolla is > that we run libvirt inside a container. Have you stopped libvirt from > running on the host? > Mark > > On Sun, 30 Jun 2019 at 09:55, Ignazio Cassano > wrote: > > > > Hi Mark, > > let me to explain what I am trying. > > I have a queens installation based on centos and pacemaker with some > instances and heat stacks. > > I would like to have another installation with same instances, projects, > stacks ....I'd like to have same uuid for all objects (users,projects > instances and so on, because it is controlled by a cloud management > platform we wrote. > > > > I stopped controllers on old queens installation backupping the > openstack database. > > I installed the new kolla openstack queens on new three controllers with > same addresses of the old intallation , vip as well. > > One of the three controllers is also a kvm node on queens. > > I stopped all containeres except rabbit,keepalive,rabbit,haproxy and > mariadb. > > I deleted al openstack db on mariadb container and I imported the old > tables, changing the address of rabbit for pointing to the new rabbit > cluster. > > I restarded containers. > > Changing the rabbit address on old kvm nodes, I can see the old virtual > machines and I can open console on them. > > I can see all networks (tenant and provider) of al installation, but > when I try to create a new instance on the new kvm, it remains in buiding > state. > > Seems it cannot aquire an address. > > Storage between old and new installation are shred on nfs NETAPP, so I > can see cinder volumes. > > I suppose db structure is different between a kolla installation and a > manual instaltion !? > > What is wrong ? 
> > Thanks > > Ignazio > > > > > > > > > > Il giorno gio 27 giu 2019 alle ore 16:44 Mark Goddard > ha scritto: > >> > >> On Thu, 27 Jun 2019 at 14:46, Ignazio Cassano > wrote: > >> > > >> > Sorry, for my question. > >> > It does not need to change anything because endpoints refer to > haproxy vips. > >> > So if your new glance works fine you change haproxy backends for > glance. > >> > Regards > >> > Ignazio > >> > >> That's correct - only the haproxy backend needs to be updated. > >> > >> > > >> > > >> > Il giorno gio 27 giu 2019 alle ore 15:21 Ignazio Cassano < > ignaziocassano at gmail.com> ha scritto: > >> >> > >> >> Hello Mark, > >> >> let me to verify if I understood your method. > >> >> > >> >> You have old controllers,haproxy,mariadb and nova computes. > >> >> You installed three new controllers but kolla.ansible inventory > contains old mariadb and old rabbit servers. > >> >> You are deployng single service on new controllers staring with > glance. > >> >> When you deploy glance on new controllers, it changes the glance > endpoint on old mariadb db ? > >> >> Regards > >> >> Ignazio > >> >> > >> >> Il giorno gio 27 giu 2019 alle ore 10:52 Mark Goddard < > mark at stackhpc.com> ha scritto: > >> >>> > >> >>> On Wed, 26 Jun 2019 at 19:34, Ignazio Cassano < > ignaziocassano at gmail.com> wrote: > >> >>> > > >> >>> > Hello, > >> >>> > Anyone have tried to migrate an existing openstack installation > to kolla containers? > >> >>> > >> >>> Hi, > >> >>> > >> >>> I'm aware of two people currently working on that. Gregory Orange > and > >> >>> one of my colleagues, Pierre Riteau. Pierre is away currently, so I > >> >>> hope he doesn't mind me quoting him from an email to Gregory. > >> >>> > >> >>> Mark > >> >>> > >> >>> "I am indeed working on a similar migration using Kolla Ansible with > >> >>> Kayobe, starting from a non-containerised OpenStack deployment based > >> >>> on CentOS RPMs. > >> >>> Existing OpenStack services are deployed across several controller > >> >>> nodes and all sit behind HAProxy, including for internal endpoints. > >> >>> We have additional controller nodes that we use to deploy > >> >>> containerised services. If you don't have the luxury of additional > >> >>> nodes, it will be more difficult as you will need to avoid processes > >> >>> clashing when listening on the same port. > >> >>> > >> >>> The method I am using resembles your second suggestion, however I am > >> >>> deploying only one containerised service at a time, in order to > >> >>> validate each of them independently. > >> >>> I use the --tags option of kolla-ansible to restrict Ansible to > >> >>> specific roles, and when I am happy with the resulting > configuration I > >> >>> update HAProxy to point to the new controllers. > >> >>> > >> >>> As long as the configuration matches, this should be completely > >> >>> transparent for purely HTTP-based services like Glance. You need to > be > >> >>> more careful with services that include components listening for > RPC, > >> >>> such as Nova: if the new nova.conf is incorrect and you've deployed > a > >> >>> nova-conductor that uses it, you could get failed instances > launches. > >> >>> Some roles depend on others: if you are deploying the > >> >>> neutron-openvswitch-agent, you need to run the openvswitch role as > >> >>> well. > >> >>> > >> >>> I suggest starting with migrating Glance as it doesn't have any > >> >>> internal services and is easy to validate. 
Note that properly > >> >>> migrating Keystone requires keeping existing Fernet keys around, so > >> >>> any token stays valid until the time it is expected to stop working > >> >>> (which is fairly complex, see > >> >>> https://bugs.launchpad.net/kolla-ansible/+bug/1809469). > >> >>> > >> >>> While initially I was using an approach similar to your first > >> >>> suggestion, it can have side effects since Kolla Ansible uses these > >> >>> variables when templating configuration. As an example, most > services > >> >>> will only have notifications enabled if enable_ceilometer is true. > >> >>> > >> >>> I've added existing control plane nodes to the Kolla Ansible > inventory > >> >>> as separate groups, which allows me to use the existing database and > >> >>> RabbitMQ for the containerised services. > >> >>> For example, instead of: > >> >>> > >> >>> [mariadb:children] > >> >>> control > >> >>> > >> >>> you may have: > >> >>> > >> >>> [mariadb:children] > >> >>> oldcontrol_db > >> >>> > >> >>> I still have to perform the migration of these underlying services > to > >> >>> the new control plane, I will let you know if there is any hurdle. > >> >>> > >> >>> A few random things to note: > >> >>> > >> >>> - if run on existing control plane hosts, the baremetal role removes > >> >>> some packages listed in `redhat_pkg_removals` which can trigger the > >> >>> removal of OpenStack dependencies using them! I've changed this > >> >>> variable to an empty list. > >> >>> - compare your existing deployment with a Kolla Ansible one to check > >> >>> for differences in endpoints, configuration files, database users, > >> >>> service users, etc. For Heat, Kolla uses the domain > heat_user_domain, > >> >>> while your existing deployment may use another one (and this is > >> >>> hardcoded in the Kolla Heat image). Kolla Ansible uses the "service" > >> >>> project while a couple of deployments I worked with were using > >> >>> "services". This shouldn't matter, except there was a bug in Kolla > >> >>> which prevented it from setting the roles correctly: > >> >>> https://bugs.launchpad.net/kolla/+bug/1791896 (now fixed in latest > >> >>> Rocky and Queens images) > >> >>> - the ml2_conf.ini generated for Neutron generates physical network > >> >>> names like physnet1, physnet2… you may want to override > >> >>> bridge_mappings completely. > >> >>> - although sometimes it could be easier to change your existing > >> >>> deployment to match Kolla Ansible settings, rather than configure > >> >>> Kolla Ansible to match your deployment." > >> >>> > >> >>> > Thanks > >> >>> > Ignazio > >> >>> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Jul 1 10:38:13 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 1 Jul 2019 12:38:13 +0200 Subject: [kolla-ansible] migration In-Reply-To: References: Message-ID: PS I presume the problem is neutron, because instances on new kvm nodes remain in building state e do not aquire address. Probably the netron db imported from old openstack installation has some difrrences ....probably I must check defferences from old and new neutron services configuration files. Ignazio Il giorno lun 1 lug 2019 alle ore 10:10 Mark Goddard ha scritto: > It sounds like you got quite close to having this working. I'd suggest > debugging this instance build failure. One difference with kolla is > that we run libvirt inside a container. Have you stopped libvirt from > running on the host? 
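A quick way to check for the clash Mark describes, assuming a Docker-based
kolla deployment where the containerised libvirt is usually named
nova_libvirt:

  systemctl status libvirtd              # should be stopped/disabled on the host
  docker ps --filter name=nova_libvirt   # the container should be the only libvirt running
  systemctl stop libvirtd && systemctl disable libvirtd   # if the host daemon is still active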
> Mark > > On Sun, 30 Jun 2019 at 09:55, Ignazio Cassano > wrote: > > > > Hi Mark, > > let me to explain what I am trying. > > I have a queens installation based on centos and pacemaker with some > instances and heat stacks. > > I would like to have another installation with same instances, projects, > stacks ....I'd like to have same uuid for all objects (users,projects > instances and so on, because it is controlled by a cloud management > platform we wrote. > > > > I stopped controllers on old queens installation backupping the > openstack database. > > I installed the new kolla openstack queens on new three controllers with > same addresses of the old intallation , vip as well. > > One of the three controllers is also a kvm node on queens. > > I stopped all containeres except rabbit,keepalive,rabbit,haproxy and > mariadb. > > I deleted al openstack db on mariadb container and I imported the old > tables, changing the address of rabbit for pointing to the new rabbit > cluster. > > I restarded containers. > > Changing the rabbit address on old kvm nodes, I can see the old virtual > machines and I can open console on them. > > I can see all networks (tenant and provider) of al installation, but > when I try to create a new instance on the new kvm, it remains in buiding > state. > > Seems it cannot aquire an address. > > Storage between old and new installation are shred on nfs NETAPP, so I > can see cinder volumes. > > I suppose db structure is different between a kolla installation and a > manual instaltion !? > > What is wrong ? > > Thanks > > Ignazio > > > > > > > > > > Il giorno gio 27 giu 2019 alle ore 16:44 Mark Goddard > ha scritto: > >> > >> On Thu, 27 Jun 2019 at 14:46, Ignazio Cassano > wrote: > >> > > >> > Sorry, for my question. > >> > It does not need to change anything because endpoints refer to > haproxy vips. > >> > So if your new glance works fine you change haproxy backends for > glance. > >> > Regards > >> > Ignazio > >> > >> That's correct - only the haproxy backend needs to be updated. > >> > >> > > >> > > >> > Il giorno gio 27 giu 2019 alle ore 15:21 Ignazio Cassano < > ignaziocassano at gmail.com> ha scritto: > >> >> > >> >> Hello Mark, > >> >> let me to verify if I understood your method. > >> >> > >> >> You have old controllers,haproxy,mariadb and nova computes. > >> >> You installed three new controllers but kolla.ansible inventory > contains old mariadb and old rabbit servers. > >> >> You are deployng single service on new controllers staring with > glance. > >> >> When you deploy glance on new controllers, it changes the glance > endpoint on old mariadb db ? > >> >> Regards > >> >> Ignazio > >> >> > >> >> Il giorno gio 27 giu 2019 alle ore 10:52 Mark Goddard < > mark at stackhpc.com> ha scritto: > >> >>> > >> >>> On Wed, 26 Jun 2019 at 19:34, Ignazio Cassano < > ignaziocassano at gmail.com> wrote: > >> >>> > > >> >>> > Hello, > >> >>> > Anyone have tried to migrate an existing openstack installation > to kolla containers? > >> >>> > >> >>> Hi, > >> >>> > >> >>> I'm aware of two people currently working on that. Gregory Orange > and > >> >>> one of my colleagues, Pierre Riteau. Pierre is away currently, so I > >> >>> hope he doesn't mind me quoting him from an email to Gregory. > >> >>> > >> >>> Mark > >> >>> > >> >>> "I am indeed working on a similar migration using Kolla Ansible with > >> >>> Kayobe, starting from a non-containerised OpenStack deployment based > >> >>> on CentOS RPMs. 
> >> >>> Existing OpenStack services are deployed across several controller > >> >>> nodes and all sit behind HAProxy, including for internal endpoints. > >> >>> We have additional controller nodes that we use to deploy > >> >>> containerised services. If you don't have the luxury of additional > >> >>> nodes, it will be more difficult as you will need to avoid processes > >> >>> clashing when listening on the same port. > >> >>> > >> >>> The method I am using resembles your second suggestion, however I am > >> >>> deploying only one containerised service at a time, in order to > >> >>> validate each of them independently. > >> >>> I use the --tags option of kolla-ansible to restrict Ansible to > >> >>> specific roles, and when I am happy with the resulting > configuration I > >> >>> update HAProxy to point to the new controllers. > >> >>> > >> >>> As long as the configuration matches, this should be completely > >> >>> transparent for purely HTTP-based services like Glance. You need to > be > >> >>> more careful with services that include components listening for > RPC, > >> >>> such as Nova: if the new nova.conf is incorrect and you've deployed > a > >> >>> nova-conductor that uses it, you could get failed instances > launches. > >> >>> Some roles depend on others: if you are deploying the > >> >>> neutron-openvswitch-agent, you need to run the openvswitch role as > >> >>> well. > >> >>> > >> >>> I suggest starting with migrating Glance as it doesn't have any > >> >>> internal services and is easy to validate. Note that properly > >> >>> migrating Keystone requires keeping existing Fernet keys around, so > >> >>> any token stays valid until the time it is expected to stop working > >> >>> (which is fairly complex, see > >> >>> https://bugs.launchpad.net/kolla-ansible/+bug/1809469). > >> >>> > >> >>> While initially I was using an approach similar to your first > >> >>> suggestion, it can have side effects since Kolla Ansible uses these > >> >>> variables when templating configuration. As an example, most > services > >> >>> will only have notifications enabled if enable_ceilometer is true. > >> >>> > >> >>> I've added existing control plane nodes to the Kolla Ansible > inventory > >> >>> as separate groups, which allows me to use the existing database and > >> >>> RabbitMQ for the containerised services. > >> >>> For example, instead of: > >> >>> > >> >>> [mariadb:children] > >> >>> control > >> >>> > >> >>> you may have: > >> >>> > >> >>> [mariadb:children] > >> >>> oldcontrol_db > >> >>> > >> >>> I still have to perform the migration of these underlying services > to > >> >>> the new control plane, I will let you know if there is any hurdle. > >> >>> > >> >>> A few random things to note: > >> >>> > >> >>> - if run on existing control plane hosts, the baremetal role removes > >> >>> some packages listed in `redhat_pkg_removals` which can trigger the > >> >>> removal of OpenStack dependencies using them! I've changed this > >> >>> variable to an empty list. > >> >>> - compare your existing deployment with a Kolla Ansible one to check > >> >>> for differences in endpoints, configuration files, database users, > >> >>> service users, etc. For Heat, Kolla uses the domain > heat_user_domain, > >> >>> while your existing deployment may use another one (and this is > >> >>> hardcoded in the Kolla Heat image). Kolla Ansible uses the "service" > >> >>> project while a couple of deployments I worked with were using > >> >>> "services". 
This shouldn't matter, except there was a bug in Kolla > >> >>> which prevented it from setting the roles correctly: > >> >>> https://bugs.launchpad.net/kolla/+bug/1791896 (now fixed in latest > >> >>> Rocky and Queens images) > >> >>> - the ml2_conf.ini generated for Neutron generates physical network > >> >>> names like physnet1, physnet2… you may want to override > >> >>> bridge_mappings completely. > >> >>> - although sometimes it could be easier to change your existing > >> >>> deployment to match Kolla Ansible settings, rather than configure > >> >>> Kolla Ansible to match your deployment." > >> >>> > >> >>> > Thanks > >> >>> > Ignazio > >> >>> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From missile0407 at gmail.com Mon Jul 1 11:23:19 2019 From: missile0407 at gmail.com (Eddie Yen) Date: Mon, 1 Jul 2019 19:23:19 +0800 Subject: [kolla][ceph] Cache OSDs didn't stay in the root=cache after ceph deployment. Message-ID: Hi, I'm using stable/rocky to try ceph cache tiering. Now I'm facing a one issue. I chose one SSD to become cache tier disk. And set below options in globals.yml. ceph_enable_cache = "yes" ceph_target_max_byte= "" ceph_target_max_objects = "" ceph_cache_mode = "writeback" And the default OSD type is bluestore. It will bootstrap the cache disk and create another OSD container. And also create the root bucket called "cache". then set the cache rule to every cache pools. The problem is, that OSD didn't stay at "cache" bucket, it still stay at "default" bucket. That caused the services can't access to the Ceph normally. Especially deploying Gnocchi. When error occurred, I manually set that OSD to the cache bucket then re-deploy, and everything is normal now. But still a strange issue that it stay in the wrong bucket. Did I miss something during deployment? Or what can I do? Many thanks, Eddie. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Mon Jul 1 11:33:08 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 1 Jul 2019 13:33:08 +0200 Subject: [Release-job-failures] Release of openstack/ansible-role-redhat-subscription failed In-Reply-To: References: Message-ID: <1b761937-ed22-9a5a-ccd8-823846cc3150@openstack.org> zuul at openstack.org wrote: > Build failed. > > - release-openstack-python http://logs.openstack.org/59/59f40f154a5302df1ee4230f35f87d261e0bf9eb/release/release-openstack-python/8b78b28/ : POST_FAILURE in 2m 09s > - announce-release announce-release : SKIPPED > - propose-update-constraints propose-update-constraints : SKIPPED Upload to PyPI at the end of the release was rejected due to: "The description failed to render in the default format of reStructuredText. See https://pypi.org/help/#description-content-type for more information." Impact: ansible-role-redhat-subscription 1.0.3 was produced and published, but not uploaded to PyPI. Remediation: once the description is fixed, we'll have to request a 1.0.4 to fix the situation. -- Thierry Carrez (ttx) From mnaser at vexxhost.com Mon Jul 1 11:36:43 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 1 Jul 2019 06:36:43 -0500 Subject: [kolla-ansible] migration In-Reply-To: References: Message-ID: You should check your cell mapping records inside Nova. They're probably not right of you moved your database and rabbit Sorry for top posting this is from a phone. On Mon., Jul. 1, 2019, 5:46 a.m. 
Ignazio Cassano, wrote: > PS > I presume the problem is neutron, because instances on new kvm nodes > remain in building state e do not aquire address. > Probably the netron db imported from old openstack installation has some > difrrences ....probably I must check defferences from old and new neutron > services configuration files. > Ignazio > > Il giorno lun 1 lug 2019 alle ore 10:10 Mark Goddard > ha scritto: > >> It sounds like you got quite close to having this working. I'd suggest >> debugging this instance build failure. One difference with kolla is >> that we run libvirt inside a container. Have you stopped libvirt from >> running on the host? >> Mark >> >> On Sun, 30 Jun 2019 at 09:55, Ignazio Cassano >> wrote: >> > >> > Hi Mark, >> > let me to explain what I am trying. >> > I have a queens installation based on centos and pacemaker with some >> instances and heat stacks. >> > I would like to have another installation with same instances, >> projects, stacks ....I'd like to have same uuid for all objects >> (users,projects instances and so on, because it is controlled by a cloud >> management platform we wrote. >> > >> > I stopped controllers on old queens installation backupping the >> openstack database. >> > I installed the new kolla openstack queens on new three controllers >> with same addresses of the old intallation , vip as well. >> > One of the three controllers is also a kvm node on queens. >> > I stopped all containeres except rabbit,keepalive,rabbit,haproxy and >> mariadb. >> > I deleted al openstack db on mariadb container and I imported the old >> tables, changing the address of rabbit for pointing to the new rabbit >> cluster. >> > I restarded containers. >> > Changing the rabbit address on old kvm nodes, I can see the old virtual >> machines and I can open console on them. >> > I can see all networks (tenant and provider) of al installation, but >> when I try to create a new instance on the new kvm, it remains in buiding >> state. >> > Seems it cannot aquire an address. >> > Storage between old and new installation are shred on nfs NETAPP, so I >> can see cinder volumes. >> > I suppose db structure is different between a kolla installation and a >> manual instaltion !? >> > What is wrong ? >> > Thanks >> > Ignazio >> > >> > >> > >> > >> > Il giorno gio 27 giu 2019 alle ore 16:44 Mark Goddard < >> mark at stackhpc.com> ha scritto: >> >> >> >> On Thu, 27 Jun 2019 at 14:46, Ignazio Cassano < >> ignaziocassano at gmail.com> wrote: >> >> > >> >> > Sorry, for my question. >> >> > It does not need to change anything because endpoints refer to >> haproxy vips. >> >> > So if your new glance works fine you change haproxy backends for >> glance. >> >> > Regards >> >> > Ignazio >> >> >> >> That's correct - only the haproxy backend needs to be updated. >> >> >> >> > >> >> > >> >> > Il giorno gio 27 giu 2019 alle ore 15:21 Ignazio Cassano < >> ignaziocassano at gmail.com> ha scritto: >> >> >> >> >> >> Hello Mark, >> >> >> let me to verify if I understood your method. >> >> >> >> >> >> You have old controllers,haproxy,mariadb and nova computes. >> >> >> You installed three new controllers but kolla.ansible inventory >> contains old mariadb and old rabbit servers. >> >> >> You are deployng single service on new controllers staring with >> glance. >> >> >> When you deploy glance on new controllers, it changes the glance >> endpoint on old mariadb db ? 
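This can be checked from the keystone catalog itself: because the endpoints
are registered against the HAProxy VIP rather than against individual
controllers, nothing in the database needs to change when the backends are
swapped. For example:

  openstack endpoint list --service glance

should show public/internal/admin URLs containing the VIP address, not a
controller's own IP.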
>> >> >> Regards >> >> >> Ignazio >> >> >> >> >> >> Il giorno gio 27 giu 2019 alle ore 10:52 Mark Goddard < >> mark at stackhpc.com> ha scritto: >> >> >>> >> >> >>> On Wed, 26 Jun 2019 at 19:34, Ignazio Cassano < >> ignaziocassano at gmail.com> wrote: >> >> >>> > >> >> >>> > Hello, >> >> >>> > Anyone have tried to migrate an existing openstack installation >> to kolla containers? >> >> >>> >> >> >>> Hi, >> >> >>> >> >> >>> I'm aware of two people currently working on that. Gregory Orange >> and >> >> >>> one of my colleagues, Pierre Riteau. Pierre is away currently, so I >> >> >>> hope he doesn't mind me quoting him from an email to Gregory. >> >> >>> >> >> >>> Mark >> >> >>> >> >> >>> "I am indeed working on a similar migration using Kolla Ansible >> with >> >> >>> Kayobe, starting from a non-containerised OpenStack deployment >> based >> >> >>> on CentOS RPMs. >> >> >>> Existing OpenStack services are deployed across several controller >> >> >>> nodes and all sit behind HAProxy, including for internal endpoints. >> >> >>> We have additional controller nodes that we use to deploy >> >> >>> containerised services. If you don't have the luxury of additional >> >> >>> nodes, it will be more difficult as you will need to avoid >> processes >> >> >>> clashing when listening on the same port. >> >> >>> >> >> >>> The method I am using resembles your second suggestion, however I >> am >> >> >>> deploying only one containerised service at a time, in order to >> >> >>> validate each of them independently. >> >> >>> I use the --tags option of kolla-ansible to restrict Ansible to >> >> >>> specific roles, and when I am happy with the resulting >> configuration I >> >> >>> update HAProxy to point to the new controllers. >> >> >>> >> >> >>> As long as the configuration matches, this should be completely >> >> >>> transparent for purely HTTP-based services like Glance. You need >> to be >> >> >>> more careful with services that include components listening for >> RPC, >> >> >>> such as Nova: if the new nova.conf is incorrect and you've >> deployed a >> >> >>> nova-conductor that uses it, you could get failed instances >> launches. >> >> >>> Some roles depend on others: if you are deploying the >> >> >>> neutron-openvswitch-agent, you need to run the openvswitch role as >> >> >>> well. >> >> >>> >> >> >>> I suggest starting with migrating Glance as it doesn't have any >> >> >>> internal services and is easy to validate. Note that properly >> >> >>> migrating Keystone requires keeping existing Fernet keys around, so >> >> >>> any token stays valid until the time it is expected to stop working >> >> >>> (which is fairly complex, see >> >> >>> https://bugs.launchpad.net/kolla-ansible/+bug/1809469). >> >> >>> >> >> >>> While initially I was using an approach similar to your first >> >> >>> suggestion, it can have side effects since Kolla Ansible uses these >> >> >>> variables when templating configuration. As an example, most >> services >> >> >>> will only have notifications enabled if enable_ceilometer is true. >> >> >>> >> >> >>> I've added existing control plane nodes to the Kolla Ansible >> inventory >> >> >>> as separate groups, which allows me to use the existing database >> and >> >> >>> RabbitMQ for the containerised services. 
>> >> >>> For example, instead of: >> >> >>> >> >> >>> [mariadb:children] >> >> >>> control >> >> >>> >> >> >>> you may have: >> >> >>> >> >> >>> [mariadb:children] >> >> >>> oldcontrol_db >> >> >>> >> >> >>> I still have to perform the migration of these underlying services >> to >> >> >>> the new control plane, I will let you know if there is any hurdle. >> >> >>> >> >> >>> A few random things to note: >> >> >>> >> >> >>> - if run on existing control plane hosts, the baremetal role >> removes >> >> >>> some packages listed in `redhat_pkg_removals` which can trigger the >> >> >>> removal of OpenStack dependencies using them! I've changed this >> >> >>> variable to an empty list. >> >> >>> - compare your existing deployment with a Kolla Ansible one to >> check >> >> >>> for differences in endpoints, configuration files, database users, >> >> >>> service users, etc. For Heat, Kolla uses the domain >> heat_user_domain, >> >> >>> while your existing deployment may use another one (and this is >> >> >>> hardcoded in the Kolla Heat image). Kolla Ansible uses the >> "service" >> >> >>> project while a couple of deployments I worked with were using >> >> >>> "services". This shouldn't matter, except there was a bug in Kolla >> >> >>> which prevented it from setting the roles correctly: >> >> >>> https://bugs.launchpad.net/kolla/+bug/1791896 (now fixed in latest >> >> >>> Rocky and Queens images) >> >> >>> - the ml2_conf.ini generated for Neutron generates physical network >> >> >>> names like physnet1, physnet2… you may want to override >> >> >>> bridge_mappings completely. >> >> >>> - although sometimes it could be easier to change your existing >> >> >>> deployment to match Kolla Ansible settings, rather than configure >> >> >>> Kolla Ansible to match your deployment." >> >> >>> >> >> >>> > Thanks >> >> >>> > Ignazio >> >> >>> > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sombrafam at gmail.com Mon Jul 1 11:44:30 2019 From: sombrafam at gmail.com (Erlon Cruz) Date: Mon, 1 Jul 2019 08:44:30 -0300 Subject: [cinder] Deprecating driver versions In-Reply-To: <59de8659-6a0b-62b5-8bee-ed0fdf622cf9@gmail.com> References: <20190628075012.ndwk52gabg2akqvx@localhost> <59de8659-6a0b-62b5-8bee-ed0fdf622cf9@gmail.com> Message-ID: Hi Jay, No problem. I just wanted to know if other vendors/maintainers shared the same problems and concerns we have and if we could have an uniform solutions across all drivers which is not the case. Erlon Em sáb, 29 de jun de 2019 às 15:19, Jay S. Bryant escreveu: > Erlon, > > I appreciate the goal here but I agree with Gorka here. > > The drivers are the vendor's responsibilities and they version them as > they wish. I think updating the devref some best practices > recommendations would be good and maybe come to agreement between the > cores on what the best practices are so that we can try to enforce it to > some extent through reviews. That is probably the best way forward. > > Jay > > On 6/28/2019 2:50 AM, Gorka Eguileor wrote: > > On 27/06, Erlon Cruz wrote: > >> Hey folks, > >> > >> Driver versions has being a source of a lot of confusions with > costumers. > >> Most of our drivers > >> have a version number and history that are updated as the developers > adds > >> new fixes and > >> features. Drivers also have a VERSION variable in the version class that > >> should be bumped by > >> developers. 
The problem with that is: > >> > >> - sometimes folks from the community just push patches on drivers, > and > >> its hard to bump > >> every vendor version correctly; > >> - that relies in the human factor to remember adding it, and usually > >> that fails; > >> - if we create a bugfix and bump the version, the backport to older > >> branches will carry the > >> version, which will not reflect the correct driver code; > >> > >> So, the solution I'm proposing for this is that we use the Cinder > >> versions[1] and remove all > >> version strings for drivers. Every new release we get a version. For > stable > >> versions, from time to > >> time the PTL bumps the stable version and we have an accurate ways to > >> describe the code. > >> If we need to backport and send something to the costumer, we can do the > >> backport, poke > >> the PTL, and he will generate another version which can be downloaded on > >> github or via PIP, > >> and present the version to our costumers. > >> > >> So, what are your thought around this? Anyone else has had problems with > >> that? What would > >> be the implications of removing the driver version strings? > >> > >> Erlon > >> > > Hi Erlon, > > > > I am personally against removing the drivers versions, as I find them > > convenient and think they are good practice. > > > > A possible solution for the driver versioning is for a driver to > > designate a minor version per OpenStack release and use the patch > > version to track changes. This way one can always backport a patch and > > will just need to increase the patch version in the backport patch. > > > > Maybe we can have this formally described in our devref. We tell > > driver developers they can do whatever they want with the versioning in > > master, but backports must not backport the version as it is and instead > > increase the patch version. > > > > What do you think? > > > > If I remember correctly there are some drivers that only increase the > > version once per release. > > > > Cheers, > > Gorka. > > > >> [1] https://releases.openstack.org/teams/cinder.html > >> [2] > >> > https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/solidfire.py#L237 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Mon Jul 1 11:45:49 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 1 Jul 2019 20:45:49 +0900 Subject: [searchlight] Cannot hold meeting today Message-ID: Daer team, Unfortunately, I'm in the middle of job transition and cannot hold the meeting today. I will propose a new meeting time this week. Please do not hesitate to contact me if you have any questions. Bests, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Jul 1 11:50:42 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 1 Jul 2019 13:50:42 +0200 Subject: [kolla-ansible] migration In-Reply-To: References: Message-ID: I checked them and I modified for fitting to new installation thanks Ignazio Il giorno lun 1 lug 2019 alle ore 13:36 Mohammed Naser ha scritto: > You should check your cell mapping records inside Nova. They're probably > not right of you moved your database and rabbit > > Sorry for top posting this is from a phone. > > On Mon., Jul. 1, 2019, 5:46 a.m. 
Ignazio Cassano, < > ignaziocassano at gmail.com> wrote: > >> PS >> I presume the problem is neutron, because instances on new kvm nodes >> remain in building state e do not aquire address. >> Probably the netron db imported from old openstack installation has some >> difrrences ....probably I must check defferences from old and new neutron >> services configuration files. >> Ignazio >> >> Il giorno lun 1 lug 2019 alle ore 10:10 Mark Goddard >> ha scritto: >> >>> It sounds like you got quite close to having this working. I'd suggest >>> debugging this instance build failure. One difference with kolla is >>> that we run libvirt inside a container. Have you stopped libvirt from >>> running on the host? >>> Mark >>> >>> On Sun, 30 Jun 2019 at 09:55, Ignazio Cassano >>> wrote: >>> > >>> > Hi Mark, >>> > let me to explain what I am trying. >>> > I have a queens installation based on centos and pacemaker with some >>> instances and heat stacks. >>> > I would like to have another installation with same instances, >>> projects, stacks ....I'd like to have same uuid for all objects >>> (users,projects instances and so on, because it is controlled by a cloud >>> management platform we wrote. >>> > >>> > I stopped controllers on old queens installation backupping the >>> openstack database. >>> > I installed the new kolla openstack queens on new three controllers >>> with same addresses of the old intallation , vip as well. >>> > One of the three controllers is also a kvm node on queens. >>> > I stopped all containeres except rabbit,keepalive,rabbit,haproxy and >>> mariadb. >>> > I deleted al openstack db on mariadb container and I imported the old >>> tables, changing the address of rabbit for pointing to the new rabbit >>> cluster. >>> > I restarded containers. >>> > Changing the rabbit address on old kvm nodes, I can see the old >>> virtual machines and I can open console on them. >>> > I can see all networks (tenant and provider) of al installation, but >>> when I try to create a new instance on the new kvm, it remains in buiding >>> state. >>> > Seems it cannot aquire an address. >>> > Storage between old and new installation are shred on nfs NETAPP, so I >>> can see cinder volumes. >>> > I suppose db structure is different between a kolla installation and a >>> manual instaltion !? >>> > What is wrong ? >>> > Thanks >>> > Ignazio >>> > >>> > >>> > >>> > >>> > Il giorno gio 27 giu 2019 alle ore 16:44 Mark Goddard < >>> mark at stackhpc.com> ha scritto: >>> >> >>> >> On Thu, 27 Jun 2019 at 14:46, Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >> > >>> >> > Sorry, for my question. >>> >> > It does not need to change anything because endpoints refer to >>> haproxy vips. >>> >> > So if your new glance works fine you change haproxy backends for >>> glance. >>> >> > Regards >>> >> > Ignazio >>> >> >>> >> That's correct - only the haproxy backend needs to be updated. >>> >> >>> >> > >>> >> > >>> >> > Il giorno gio 27 giu 2019 alle ore 15:21 Ignazio Cassano < >>> ignaziocassano at gmail.com> ha scritto: >>> >> >> >>> >> >> Hello Mark, >>> >> >> let me to verify if I understood your method. >>> >> >> >>> >> >> You have old controllers,haproxy,mariadb and nova computes. >>> >> >> You installed three new controllers but kolla.ansible inventory >>> contains old mariadb and old rabbit servers. >>> >> >> You are deployng single service on new controllers staring with >>> glance. >>> >> >> When you deploy glance on new controllers, it changes the glance >>> endpoint on old mariadb db ? 
>>> >> >> Regards >>> >> >> Ignazio >>> >> >> >>> >> >> Il giorno gio 27 giu 2019 alle ore 10:52 Mark Goddard < >>> mark at stackhpc.com> ha scritto: >>> >> >>> >>> >> >>> On Wed, 26 Jun 2019 at 19:34, Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >> >>> > >>> >> >>> > Hello, >>> >> >>> > Anyone have tried to migrate an existing openstack installation >>> to kolla containers? >>> >> >>> >>> >> >>> Hi, >>> >> >>> >>> >> >>> I'm aware of two people currently working on that. Gregory Orange >>> and >>> >> >>> one of my colleagues, Pierre Riteau. Pierre is away currently, so >>> I >>> >> >>> hope he doesn't mind me quoting him from an email to Gregory. >>> >> >>> >>> >> >>> Mark >>> >> >>> >>> >> >>> "I am indeed working on a similar migration using Kolla Ansible >>> with >>> >> >>> Kayobe, starting from a non-containerised OpenStack deployment >>> based >>> >> >>> on CentOS RPMs. >>> >> >>> Existing OpenStack services are deployed across several controller >>> >> >>> nodes and all sit behind HAProxy, including for internal >>> endpoints. >>> >> >>> We have additional controller nodes that we use to deploy >>> >> >>> containerised services. If you don't have the luxury of additional >>> >> >>> nodes, it will be more difficult as you will need to avoid >>> processes >>> >> >>> clashing when listening on the same port. >>> >> >>> >>> >> >>> The method I am using resembles your second suggestion, however I >>> am >>> >> >>> deploying only one containerised service at a time, in order to >>> >> >>> validate each of them independently. >>> >> >>> I use the --tags option of kolla-ansible to restrict Ansible to >>> >> >>> specific roles, and when I am happy with the resulting >>> configuration I >>> >> >>> update HAProxy to point to the new controllers. >>> >> >>> >>> >> >>> As long as the configuration matches, this should be completely >>> >> >>> transparent for purely HTTP-based services like Glance. You need >>> to be >>> >> >>> more careful with services that include components listening for >>> RPC, >>> >> >>> such as Nova: if the new nova.conf is incorrect and you've >>> deployed a >>> >> >>> nova-conductor that uses it, you could get failed instances >>> launches. >>> >> >>> Some roles depend on others: if you are deploying the >>> >> >>> neutron-openvswitch-agent, you need to run the openvswitch role as >>> >> >>> well. >>> >> >>> >>> >> >>> I suggest starting with migrating Glance as it doesn't have any >>> >> >>> internal services and is easy to validate. Note that properly >>> >> >>> migrating Keystone requires keeping existing Fernet keys around, >>> so >>> >> >>> any token stays valid until the time it is expected to stop >>> working >>> >> >>> (which is fairly complex, see >>> >> >>> https://bugs.launchpad.net/kolla-ansible/+bug/1809469). >>> >> >>> >>> >> >>> While initially I was using an approach similar to your first >>> >> >>> suggestion, it can have side effects since Kolla Ansible uses >>> these >>> >> >>> variables when templating configuration. As an example, most >>> services >>> >> >>> will only have notifications enabled if enable_ceilometer is true. >>> >> >>> >>> >> >>> I've added existing control plane nodes to the Kolla Ansible >>> inventory >>> >> >>> as separate groups, which allows me to use the existing database >>> and >>> >> >>> RabbitMQ for the containerised services. 
>>> >> >>> For example, instead of: >>> >> >>> >>> >> >>> [mariadb:children] >>> >> >>> control >>> >> >>> >>> >> >>> you may have: >>> >> >>> >>> >> >>> [mariadb:children] >>> >> >>> oldcontrol_db >>> >> >>> >>> >> >>> I still have to perform the migration of these underlying >>> services to >>> >> >>> the new control plane, I will let you know if there is any hurdle. >>> >> >>> >>> >> >>> A few random things to note: >>> >> >>> >>> >> >>> - if run on existing control plane hosts, the baremetal role >>> removes >>> >> >>> some packages listed in `redhat_pkg_removals` which can trigger >>> the >>> >> >>> removal of OpenStack dependencies using them! I've changed this >>> >> >>> variable to an empty list. >>> >> >>> - compare your existing deployment with a Kolla Ansible one to >>> check >>> >> >>> for differences in endpoints, configuration files, database users, >>> >> >>> service users, etc. For Heat, Kolla uses the domain >>> heat_user_domain, >>> >> >>> while your existing deployment may use another one (and this is >>> >> >>> hardcoded in the Kolla Heat image). Kolla Ansible uses the >>> "service" >>> >> >>> project while a couple of deployments I worked with were using >>> >> >>> "services". This shouldn't matter, except there was a bug in Kolla >>> >> >>> which prevented it from setting the roles correctly: >>> >> >>> https://bugs.launchpad.net/kolla/+bug/1791896 (now fixed in >>> latest >>> >> >>> Rocky and Queens images) >>> >> >>> - the ml2_conf.ini generated for Neutron generates physical >>> network >>> >> >>> names like physnet1, physnet2… you may want to override >>> >> >>> bridge_mappings completely. >>> >> >>> - although sometimes it could be easier to change your existing >>> >> >>> deployment to match Kolla Ansible settings, rather than configure >>> >> >>> Kolla Ansible to match your deployment." >>> >> >>> >>> >> >>> > Thanks >>> >> >>> > Ignazio >>> >> >>> > >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Mon Jul 1 12:05:33 2019 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 1 Jul 2019 14:05:33 +0200 Subject: [Release-job-failures] Release of openstack/ansible-role-redhat-subscription failed In-Reply-To: <1b761937-ed22-9a5a-ccd8-823846cc3150@openstack.org> References: <1b761937-ed22-9a5a-ccd8-823846cc3150@openstack.org> Message-ID: Hello, We have the same issue on podman. https://travis-ci.org/containers/python-podman/jobs/551905955 I don't know why this is happening there, we used ` long_description_content_type="text/markdown"` [1] so normally is not an issue here... maybe pypi team have changed things on their side and we need to take care of it. I will take a look. We also use pbr on podman. Maybe these changes (Read description file as utf-8) [2] have introduced some unexpected behaviours. When I try to build a dist locally everything seems to be good... I don't see any issues. [1] https://github.com/containers/python-podman/commit/d1cc16a69e858faf1c81cde67cb198af22740ba4 [2] https://github.com/openstack/pbr/commit/17f9439e9f9026dd3e8cae1e917a78e80195152c Le lun. 1 juil. 2019 à 13:39, Thierry Carrez a écrit : > zuul at openstack.org wrote: > > Build failed. 
> > > > - release-openstack-python > http://logs.openstack.org/59/59f40f154a5302df1ee4230f35f87d261e0bf9eb/release/release-openstack-python/8b78b28/ > : POST_FAILURE in 2m 09s > > - announce-release announce-release : SKIPPED > > - propose-update-constraints propose-update-constraints : SKIPPED > > Upload to PyPI at the end of the release was rejected due to: > > "The description failed to render in the default format of > reStructuredText. See https://pypi.org/help/#description-content-type > for more information." > > Impact: ansible-role-redhat-subscription 1.0.3 was produced and > published, but not uploaded to PyPI. > > Remediation: once the description is fixed, we'll have to request a > 1.0.4 to fix the situation. > > -- > Thierry Carrez (ttx) > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Mon Jul 1 12:31:16 2019 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 1 Jul 2019 14:31:16 +0200 Subject: [Release-job-failures] Release of openstack/ansible-role-redhat-subscription failed In-Reply-To: References: <1b761937-ed22-9a5a-ccd8-823846cc3150@openstack.org> Message-ID: I think it's related to: - https://github.com/pypa/warehouse/pull/5835 - https://github.com/pypa/warehouse/issues/5890 Le lun. 1 juil. 2019 à 14:05, Herve Beraud a écrit : > Hello, > > We have the same issue on podman. > > https://travis-ci.org/containers/python-podman/jobs/551905955 > > I don't know why this is happening there, we used ` > long_description_content_type="text/markdown"` [1] so normally is not an > issue here... maybe pypi team have changed things on their side and we need > to take care of it. I will take a look. > > We also use pbr on podman. > > Maybe these changes (Read description file as utf-8) [2] have introduced > some unexpected behaviours. > > When I try to build a dist locally everything seems to be good... I don't > see any issues. > > [1] > https://github.com/containers/python-podman/commit/d1cc16a69e858faf1c81cde67cb198af22740ba4 > [2] > https://github.com/openstack/pbr/commit/17f9439e9f9026dd3e8cae1e917a78e80195152c > > Le lun. 1 juil. 2019 à 13:39, Thierry Carrez a > écrit : > >> zuul at openstack.org wrote: >> > Build failed. 
>> > >> > - release-openstack-python >> http://logs.openstack.org/59/59f40f154a5302df1ee4230f35f87d261e0bf9eb/release/release-openstack-python/8b78b28/ >> : POST_FAILURE in 2m 09s >> > - announce-release announce-release : SKIPPED >> > - propose-update-constraints propose-update-constraints : SKIPPED >> >> Upload to PyPI at the end of the release was rejected due to: >> >> "The description failed to render in the default format of >> reStructuredText. See https://pypi.org/help/#description-content-type >> for more information." >> >> Impact: ansible-role-redhat-subscription 1.0.3 was produced and >> published, but not uploaded to PyPI. >> >> Remediation: once the description is fixed, we'll have to request a >> 1.0.4 to fix the situation. >> >> -- >> Thierry Carrez (ttx) >> >> > > -- > Hervé Beraud > Senior Software Engineer > Red Hat - Openstack Oslo > irc: hberaud > -----BEGIN PGP SIGNATURE----- > > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O > v6rDpkeNksZ9fFSyoY2o > =ECSj > -----END PGP SIGNATURE----- > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Mon Jul 1 12:32:07 2019 From: eblock at nde.ag (Eugen Block) Date: Mon, 01 Jul 2019 12:32:07 +0000 Subject: [kolla][ceph] Cache OSDs didn't stay in the root=cache after ceph deployment. In-Reply-To: Message-ID: <20190701123207.Horde.Y7WzBt0rnChxKZBSc3ktF0Z@webmail.nde.ag> Hi, although I'm not familiar with kolla I can comment on the ceph part. > The problem is, that OSD didn't stay at "cache" bucket, it still stay at > "default" bucket. I'm not sure how the deployment process with kolla works and what exactly is done here, but this might be caused by this option [1]: osd crush update on start Its default is "true". We ran into this some time ago and were wondering why the OSDs were in the wrong bucket everytime we restarted services. 
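A minimal sketch of what that override would look like (plain ceph.conf syntax; with kolla I assume it would go into an override file such as /etc/kolla/config/ceph.conf, but I can't confirm the exact path):

  [osd]
  # keep OSDs in whatever CRUSH bucket the operator placed them in,
  # instead of moving them back under the default location on restart
  osd crush update on start = false
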
As I said, I don't know how exactly this would affect you, but you could set that config option to "false" and see if that still happens. Regards, Eugen [1] http://docs.ceph.com/docs/master/rados/operations/crush-map/ Zitat von Eddie Yen : > Hi, > > I'm using stable/rocky to try ceph cache tiering. > Now I'm facing a one issue. > > I chose one SSD to become cache tier disk. And set below options in > globals.yml. > ceph_enable_cache = "yes" > ceph_target_max_byte= "" > ceph_target_max_objects = "" > ceph_cache_mode = "writeback" > > And the default OSD type is bluestore. > > > It will bootstrap the cache disk and create another OSD container. > And also create the root bucket called "cache". then set the cache rule to > every cache pools. > The problem is, that OSD didn't stay at "cache" bucket, it still stay at > "default" bucket. > That caused the services can't access to the Ceph normally. Especially > deploying Gnocchi. > > When error occurred, I manually set that OSD to the cache bucket then > re-deploy, and everything is normal now. > But still a strange issue that it stay in the wrong bucket. > > Did I miss something during deployment? Or what can I do? > > > Many thanks, > Eddie. From skaplons at redhat.com Mon Jul 1 12:44:29 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 1 Jul 2019 14:44:29 +0200 Subject: [neutron][ci] Gate failure Message-ID: <7C407162-556C-4124-BAE2-F1FDBFC2D4F4@redhat.com> Hi, It looks that we (again) have some broken gate. This time it’s problem with neutron-tempest-plugin-designate-scenario job which is failing 100% times. See [1] for details. Patch [2] is proposed so now lets wait until it will be merged and You can than rebase Your neutron patches to make Zuul happy again :) [1] https://bugs.launchpad.net/devstack/+bug/1834849 [2] https://review.opendev.org/#/c/668447/ — Slawek Kaplonski Senior software engineer Red Hat From fungi at yuggoth.org Mon Jul 1 13:08:10 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 1 Jul 2019 13:08:10 +0000 Subject: [neutron][ci] Gate failure In-Reply-To: <7C407162-556C-4124-BAE2-F1FDBFC2D4F4@redhat.com> References: <7C407162-556C-4124-BAE2-F1FDBFC2D4F4@redhat.com> Message-ID: <20190701130809.4uhsjiojbbxms3ce@yuggoth.org> On 2019-07-01 14:44:29 +0200 (+0200), Slawek Kaplonski wrote: [...] > Patch [2] is proposed so now lets wait until it will be merged and > You can than rebase Your neutron patches to make Zuul happy again [...] There should be no need to rebase changes for Zuul to make use of a fix which merges to the branch. Just a "recheck" comment is all you need. Zuul always merges your change onto the latest branch state when starting new builds, so the *only* reason to rebase should be if you have a legitimate merge conflict preventing it from being tested at all. This is not new, this is the way Zuul has always worked. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From aschultz at redhat.com Mon Jul 1 13:52:44 2019 From: aschultz at redhat.com (Alex Schultz) Date: Mon, 1 Jul 2019 07:52:44 -0600 Subject: [Release-job-failures] Release of openstack/ansible-role-redhat-subscription failed In-Reply-To: <1b761937-ed22-9a5a-ccd8-823846cc3150@openstack.org> References: <1b761937-ed22-9a5a-ccd8-823846cc3150@openstack.org> Message-ID: On Mon, Jul 1, 2019 at 5:44 AM Thierry Carrez wrote: > zuul at openstack.org wrote: > > Build failed. 
> > > > - release-openstack-python > http://logs.openstack.org/59/59f40f154a5302df1ee4230f35f87d261e0bf9eb/release/release-openstack-python/8b78b28/ > : POST_FAILURE in 2m 09s > > - announce-release announce-release : SKIPPED > > - propose-update-constraints propose-update-constraints : SKIPPED > > Upload to PyPI at the end of the release was rejected due to: > > "The description failed to render in the default format of > reStructuredText. See https://pypi.org/help/#description-content-type > for more information." > > Impact: ansible-role-redhat-subscription 1.0.3 was produced and > published, but not uploaded to PyPI. > > Remediation: once the description is fixed, we'll have to request a > 1.0.4 to fix the situation. > > https://review.opendev.org/668471 > -- > Thierry Carrez (ttx) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon Jul 1 14:55:25 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 1 Jul 2019 16:55:25 +0200 Subject: [neutron][ci] Gate failure In-Reply-To: <20190701130809.4uhsjiojbbxms3ce@yuggoth.org> References: <7C407162-556C-4124-BAE2-F1FDBFC2D4F4@redhat.com> <20190701130809.4uhsjiojbbxms3ce@yuggoth.org> Message-ID: Hi, > On 1 Jul 2019, at 15:08, Jeremy Stanley wrote: > > On 2019-07-01 14:44:29 +0200 (+0200), Slawek Kaplonski wrote: > [...] >> Patch [2] is proposed so now lets wait until it will be merged and >> You can than rebase Your neutron patches to make Zuul happy again > [...] > > There should be no need to rebase changes for Zuul to make use of a > fix which merges to the branch. Just a "recheck" comment is all you > need. Zuul always merges your change onto the latest branch state > when starting new builds, so the *only* reason to rebase should be > if you have a legitimate merge conflict preventing it from being > tested at all. This is not new, this is the way Zuul has always > worked. So if I have fix for some issue in Neutron repo (lets say patch A), and other patch to neutron repo (patch B) which is failing because of this issue, I don’t need to rebase my failing patch B to include fix from patch A and to get +1 from Zuul? Am I understanding correct what You wrote? I know that it is like that in gate queue but I though that in check queue it works differently. > -- > Jeremy Stanley — Slawek Kaplonski Senior software engineer Red Hat From cboylan at sapwetik.org Mon Jul 1 15:03:01 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Mon, 01 Jul 2019 08:03:01 -0700 Subject: [neutron][ci] Gate failure In-Reply-To: References: <7C407162-556C-4124-BAE2-F1FDBFC2D4F4@redhat.com> <20190701130809.4uhsjiojbbxms3ce@yuggoth.org> Message-ID: On Mon, Jul 1, 2019, at 7:55 AM, Slawek Kaplonski wrote: > Hi, > > > On 1 Jul 2019, at 15:08, Jeremy Stanley wrote: > > > > On 2019-07-01 14:44:29 +0200 (+0200), Slawek Kaplonski wrote: > > [...] > >> Patch [2] is proposed so now lets wait until it will be merged and > >> You can than rebase Your neutron patches to make Zuul happy again > > [...] > > > > There should be no need to rebase changes for Zuul to make use of a > > fix which merges to the branch. Just a "recheck" comment is all you > > need. Zuul always merges your change onto the latest branch state > > when starting new builds, so the *only* reason to rebase should be > > if you have a legitimate merge conflict preventing it from being > > tested at all. This is not new, this is the way Zuul has always > > worked. 
> > So if I have fix for some issue in Neutron repo (lets say patch A), and > other patch to neutron repo (patch B) which is failing because of this > issue, I don’t need to rebase my failing patch B to include fix from > patch A and to get +1 from Zuul? Am I understanding correct what You > wrote? > I know that it is like that in gate queue but I though that in check > queue it works differently. You don't need to rebase or use depends on once change A has merged because Zuul will always merge your proposed changes (change B) to the target branch (which includes change A). If you want things to go green prior to change A merging you will need to rebase or use depends on. Clark From doka.ua at gmx.com Mon Jul 1 15:40:54 2019 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Mon, 1 Jul 2019 18:40:54 +0300 Subject: [octavia] HTTP/2 support Message-ID: <2c5d7d87-ae8b-1ef1-4c00-bf542d5cb15e@gmx.com> Dear colleagues, - since haproxy supports HTTP/2 starting 1.9 (released on Dec 2018) for both frontends and backends - due to wide adoption of HTTP/2 (34% of all sites are now support HTTP/2 according to W3techs) - support for HTTP/2 is in all browsers and web-servers whether Octavia team has plans to add support for HTTP/2 as well? Thank you. -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison From skaplons at redhat.com Mon Jul 1 16:26:47 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 1 Jul 2019 18:26:47 +0200 Subject: [neutron][ci] Gate failure In-Reply-To: References: <7C407162-556C-4124-BAE2-F1FDBFC2D4F4@redhat.com> <20190701130809.4uhsjiojbbxms3ce@yuggoth.org> Message-ID: Hi, > On 1 Jul 2019, at 17:03, Clark Boylan wrote: > > On Mon, Jul 1, 2019, at 7:55 AM, Slawek Kaplonski wrote: >> Hi, >> >>> On 1 Jul 2019, at 15:08, Jeremy Stanley wrote: >>> >>> On 2019-07-01 14:44:29 +0200 (+0200), Slawek Kaplonski wrote: >>> [...] >>>> Patch [2] is proposed so now lets wait until it will be merged and >>>> You can than rebase Your neutron patches to make Zuul happy again >>> [...] >>> >>> There should be no need to rebase changes for Zuul to make use of a >>> fix which merges to the branch. Just a "recheck" comment is all you >>> need. Zuul always merges your change onto the latest branch state >>> when starting new builds, so the *only* reason to rebase should be >>> if you have a legitimate merge conflict preventing it from being >>> tested at all. This is not new, this is the way Zuul has always >>> worked. >> >> So if I have fix for some issue in Neutron repo (lets say patch A), and >> other patch to neutron repo (patch B) which is failing because of this >> issue, I don’t need to rebase my failing patch B to include fix from >> patch A and to get +1 from Zuul? Am I understanding correct what You >> wrote? >> I know that it is like that in gate queue but I though that in check >> queue it works differently. > > You don't need to rebase or use depends on once change A has merged because Zuul will always merge your proposed changes (change B) to the target branch (which includes change A). Thx a lot for explanation. I didn’t know that it is like that in check queue :) > > If you want things to go green prior to change A merging you will need to rebase or use depends on. 
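For reference, "depends on" here means Zuul's cross-repo dependency footer in the commit message. A minimal sketch, pointing at the fix proposed earlier in this thread (any Gerrit change URL can be used the same way; the subject and body lines are only illustrative):

  Fix neutron-tempest-plugin-designate-scenario job

  This cannot pass until the devstack fix merges, so ask Zuul
  to test the two changes together.

  Depends-On: https://review.opendev.org/668447

With that footer, the failing patch is tested with the unmerged fix applied, instead of waiting for it to land and then rechecking.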
> > Clark — Slawek Kaplonski Senior software engineer Red Hat From tpb at dyncloud.net Mon Jul 1 19:44:37 2019 From: tpb at dyncloud.net (Tom Barron) Date: Mon, 1 Jul 2019 15:44:37 -0400 Subject: [manila] no meeting July 4 Message-ID: <20190701194437.wvm2y6ggc7givbai@barron.net> We just agreed in freenode #openstack-manila to skip this week's Manila community meeting since several regular attendees including the chair are on holiday. We'll plan to meet as usual the following Thursday, 11 July, at 1500 UTC in #openstack-meeting-alt. Feel free to add to the agenda here: https://wiki.openstack.org/wiki/Manila/Meetings -- Tom Barron From rfolco at redhat.com Mon Jul 1 20:33:36 2019 From: rfolco at redhat.com (Rafael Folco) Date: Mon, 1 Jul 2019 17:33:36 -0300 Subject: [tripleo] TripleO CI Summary: Sprint 32 Message-ID: Greetings, The TripleO CI team has just completed Sprint 32 / Unified Sprint 11 (June 06 thru Jun 26). The following is a summary of completed work during this sprint cycle: - Created image and container build jobs on RHEL 7 in the internal instance of Software Factory as a prep work for RHEL8 jobs on RDO Software Factory. - Started creating RHEL8 jobs to build a periodic pipeline in the RDO Software Factory and provide feedback for CentOS8 coverage. - Promotion status: green on all branches at most of the sprint. The planned work for the next sprint [1] are: - Prepare the grounds for moving RDO on RHEL8 jobs from the internal instance of Software Factory to the RDO instance. This includes building a nodepool image w/ RHUI for base repos and enabling RHEL8 on tripleo-repos for delorean repos. - Get RDO on RHEL8 build jobs working and address dependency issues. - Continue the design document for a staging environment to test changes in the promoter server. This will benefit CI team with less breakages in the promoter server and also prepare the grounds for the multi-arch builds. The Ruck and Rover for this sprint are Arx Cruz (arxcruz) and Sagi Shnaidman (sshnaidm). Please direct questions or queries to them regarding CI status or issues in #tripleo, ideally to whomever has the ‘|ruck’ suffix on their nick. Ruck/rover notes are being tracked in etherpad [2]. Thanks, rfolco [1] https://tree.taiga.io/project/tripleo-ci-board/taskboard/unified-sprint-12 [2] https://etherpad.openstack.org/p/ruckroversprint12 -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongbin034 at gmail.com Mon Jul 1 21:08:17 2019 From: hongbin034 at gmail.com (Hongbin Lu) Date: Mon, 1 Jul 2019 17:08:17 -0400 Subject: [Zun][Kata] Devstack plugin for installing Kata container Message-ID: Hi all, I have a patch [1] that adds support for installing kata Container in devstack. Right now, it configures kata container to run with Docker. In the future, I plan to add support for containerd's cri plugin, which basically allows running pods with kata container. Initially, OpenStack Zun will use this plugin to install Kata container, but I believe it would be beneficial for other projects. Appreciate if anyone interest to cast your feedback on the patch. [1] https://review.opendev.org/#/c/668490/ Best regards, Hongbin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From johnsomor at gmail.com Mon Jul 1 21:33:24 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Mon, 1 Jul 2019 14:33:24 -0700 Subject: [octavia] HTTP/2 support In-Reply-To: <2c5d7d87-ae8b-1ef1-4c00-bf542d5cb15e@gmx.com> References: <2c5d7d87-ae8b-1ef1-4c00-bf542d5cb15e@gmx.com> Message-ID: Hi Volodymyr, Yes, this is definitely something we are planning to do. There are a couple of reasons we have held off doing this: 1. You really want HAProxy 2.0 (also recently released) to get the full capabilities of HTTP/2 (See https://www.haproxy.com/blog/haproxy-2-0-and-beyond/#end-to-end-http-2). Prior to this, HTTP/2 connections from users were translated to HTTP/1.1 connections on the servers and/or had some open issues. 2. Another challenge is that none of the distributions (LTS versions) are shipping with HAProxy 1.9 or 2.0. Currently they all ship with 1.8. This means that operators would have to build images with a custom package or build of HAProxy in it. 3. This feature would bring in capabilities that only HAProxy 1.9/2.0 based amphora images would support. We have been talking in the weekly IRC and at the PTG about how we could handle "discovering" that an image has been build with a specific version of HAProxy prior to it being booted. I think the current leading idea is to tag the images in glance with a special tag/metadata that would allow us to query if there is a compatible image available for a feature a user is requesting. 4. No vendor provider drivers have requested or implemented HTTP/2 support in their drivers. 5. No one has come forward indicating that they can do the development work. (yet) As you can see we have a few issues to figure out and work on before HTTP/2 will be supported. So, definitely on the roadmap (updated), but I can't name which release will get it added. Michael On Mon, Jul 1, 2019 at 8:44 AM Volodymyr Litovka wrote: > > Dear colleagues, > > - since haproxy supports HTTP/2 starting 1.9 (released on Dec 2018) for > both frontends and backends > - due to wide adoption of HTTP/2 (34% of all sites are now support > HTTP/2 according to W3techs) > - support for HTTP/2 is in all browsers and web-servers > > whether Octavia team has plans to add support for HTTP/2 as well? > > Thank you. > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." -- Thomas Edison > > From missile0407 at gmail.com Tue Jul 2 00:44:13 2019 From: missile0407 at gmail.com (Eddie Yen) Date: Tue, 2 Jul 2019 08:44:13 +0800 Subject: [kolla][ceph] Cache OSDs didn't stay in the root=cache after ceph deployment. In-Reply-To: <20190701123207.Horde.Y7WzBt0rnChxKZBSc3ktF0Z@webmail.nde.ag> References: <20190701123207.Horde.Y7WzBt0rnChxKZBSc3ktF0Z@webmail.nde.ag> Message-ID: Hi Eugen, thanks for your reply first. I tested what you said, addeed "osd crush update on start = False" in the pre-deploy config file (/etc/kolla/config/ceph.conf) Then destroy & re-deploy again. Now the cache OSDs has stayed in the right bucket after ceph deployment. Really thanks for your advise, now everything works now. Appreciate, Eddie. Eugen Block 於 2019年7月1日 週一 下午8:40寫道: > Hi, > > although I'm not familiar with kolla I can comment on the ceph part. > > > The problem is, that OSD didn't stay at "cache" bucket, it still stay at > > "default" bucket. > > I'm not sure how the deployment process with kolla works and what > exactly is done here, but this might be caused by this option [1]: > > osd crush update on start > > Its default is "true". 
We ran into this some time ago and were > wondering why the OSDs were in the wrong bucket everytime we restarted > services. As I said, I don't know how exactly this would affect you, > but you could set that config option to "false" and see if that still > happens. > > > Regards, > Eugen > > [1] http://docs.ceph.com/docs/master/rados/operations/crush-map/ > > Zitat von Eddie Yen : > > > Hi, > > > > I'm using stable/rocky to try ceph cache tiering. > > Now I'm facing a one issue. > > > > I chose one SSD to become cache tier disk. And set below options in > > globals.yml. > > ceph_enable_cache = "yes" > > ceph_target_max_byte= "" > > ceph_target_max_objects = "" > > ceph_cache_mode = "writeback" > > > > And the default OSD type is bluestore. > > > > > > It will bootstrap the cache disk and create another OSD container. > > And also create the root bucket called "cache". then set the cache rule > to > > every cache pools. > > The problem is, that OSD didn't stay at "cache" bucket, it still stay at > > "default" bucket. > > That caused the services can't access to the Ceph normally. Especially > > deploying Gnocchi. > > > > When error occurred, I manually set that OSD to the cache bucket then > > re-deploy, and everything is normal now. > > But still a strange issue that it stay in the wrong bucket. > > > > Did I miss something during deployment? Or what can I do? > > > > > > Many thanks, > > Eddie. > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhang.lei.fly+os-discuss at gmail.com Tue Jul 2 04:47:59 2019 From: zhang.lei.fly+os-discuss at gmail.com (Jeffrey Zhang) Date: Tue, 2 Jul 2019 12:47:59 +0800 Subject: [kolla] Proposing yoctozepto as core In-Reply-To: References: <950bf345-da0f-dabc-31f1-fcb1711e36df@linaro.org> Message-ID: +1 On Fri, Jun 28, 2019 at 8:11 PM Gaëtan Trellu wrote: > +1, congrats yoctozeoto! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Tue Jul 2 07:19:29 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Tue, 2 Jul 2019 09:19:29 +0200 Subject: [neutron][ci] Gate failure In-Reply-To: <7C407162-556C-4124-BAE2-F1FDBFC2D4F4@redhat.com> References: <7C407162-556C-4124-BAE2-F1FDBFC2D4F4@redhat.com> Message-ID: <68593D1D-07E0-451D-8867-9FE0F92F1F73@redhat.com> Hi, Fix [1] is merged so You can now recheck Your patches if it failed on neutron-tempest-plugin-designate-scenario job earlier. > On 1 Jul 2019, at 14:44, Slawek Kaplonski wrote: > > Hi, > > It looks that we (again) have some broken gate. This time it’s problem with neutron-tempest-plugin-designate-scenario job which is failing 100% times. > See [1] for details. > Patch [2] is proposed so now lets wait until it will be merged and You can than rebase Your neutron patches to make Zuul happy again :) > > [1] https://bugs.launchpad.net/devstack/+bug/1834849 > [2] https://review.opendev.org/#/c/668447/ > > — > Slawek Kaplonski > Senior software engineer > Red Hat > [1] https://review.opendev.org/#/c/668447/ — Slawek Kaplonski Senior software engineer Red Hat From sneha.rai at hpe.com Tue Jul 2 04:14:10 2019 From: sneha.rai at hpe.com (RAI, SNEHA) Date: Tue, 2 Jul 2019 04:14:10 +0000 Subject: HPE 3PAR Cinder driver-Multiattach: Fails to detach second instance from volume Message-ID: Hi All, Is there a way to find in cinder how many instances are attached to a volume? 
Thanks & Regards, Sneha Rai From: RAI, SNEHA Sent: Monday, July 1, 2019 5:26 PM To: openstack-dev at lists.openstack.org Subject: HPE 3PAR Cinder driver-Multiattach: Fails to detach second instance from volume Hi Team, There is a bug on 3PAR Cinder driver https://bugs.launchpad.net/cinder/+bug/1834660. I am able to attach multiple instances to 3PAR volume but only the first instance gets detached successfully. For the second instance, volume goes into detaching status due to "Host does not exist" error. What is happening here is, the first detach call invokes _delete_3par_host() which removes the compute host entry from 3PAR which ideally should be done only when the last instance is to be detached. It would be great if someone could help me understand if this needs to be handled in driver code or nova would internally take care of it. Code changes done to support multiattach-https://review.opendev.org/#/c/659443 Thanks & Regards, Sneha Rai -------------- next part -------------- An HTML attachment was scrubbed... URL: From wangpeihuixyz at 126.com Tue Jul 2 04:20:55 2019 From: wangpeihuixyz at 126.com (Frank Wang) Date: Tue, 2 Jul 2019 12:20:55 +0800 (CST) Subject: Does networking-ovn support fwaas ? Message-ID: <79a5567a.38aa.16bb0ea04ef.Coremail.wangpeihuixyz@126.com> Hi all, I'm investigating ovn as the neutron backend recently, networking-ovn is a great project, but one thing confuses me, I'd like to know if networking-ovn support fwaas? Security protection for network level, not port leve like security group, I looked up lots of materials, didn't find a clue. Any comments would be appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aj at suse.com Tue Jul 2 11:06:09 2019 From: aj at suse.com (Andreas Jaeger) Date: Tue, 2 Jul 2019 13:06:09 +0200 Subject: [uc][tc][ops] reviving osops- repos In-Reply-To: References: <20190530205552.falsvxcegehtyuge@yuggoth.org> <20190531123501.tawgvqgsw6yle2nu@csail.mit.edu> <20190531164102.5lwt2jyxk24u3vdz@yuggoth.org> Message-ID: On 11/06/2019 13.55, Jean-Philippe Evrard wrote: >> Alternatively, I feel like a SIG (be it the Ops Docs SIG or a new >> "Operational tooling" SIG) would totally be a good idea to revive this. >> In that case we'd define the repository in [4]. >> >> My personal preference would be for a new SIG, but whoever is signing up >> to work on this should definitely have the final say. > > Agreed on having it inside OpenStack namespace, and code handled by a team/SIG/WG (with my preference being a SIG -- existing or not). When this team/SIG/WG retires, the repo would with it. It provides clean ownership, and clear cleanup when disbanding. Mohammed, is that consensus and actionable? Could you then update https://review.opendev.org/#/c/662300/ to reflect this, please? Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From mriedemos at gmail.com Tue Jul 2 13:26:33 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 2 Jul 2019 08:26:33 -0500 Subject: HPE 3PAR Cinder driver-Multiattach: Fails to detach second instance from volume In-Reply-To: References: Message-ID: <7cc537a6-0112-baeb-3645-961b6d1e8d73@gmail.com> On 7/1/2019 11:14 PM, RAI, SNEHA wrote: > Is there a way to find in cinder how many instances are attached to a > volume? 
Yes, the response to GET /v3/{project_id}/volumes/{volume_id} has an "attachments" parameter which is a list of dicts of information about the servers that are attached to the volume. Note that a single server can have multiple attachments to the volume while the server is being migrated across compute hosts, so to determine unique server attachments you'd have to distinguish by server ID in the attachments list. -- Thanks, Matt From mihalis68 at gmail.com Tue Jul 2 15:04:07 2019 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 2 Jul 2019 11:04:07 -0400 Subject: [ops] Ops Meetups Team meeting 2019-7-2 Message-ID: Meeting minutes for todays meeting are linked below. Two key decisions from today that you might like to know about: 1. The team will attempt to run an ops day at the Shanghai summit if there is a) sufficient interest and b) space to hold it. To gauge the interest level we will run a twitter poll. We have found that the engagement we get via twitter is actually better than via mailing lists. 2. The current consensus for the next meetups is as follows: September, NYC (USA) - this is under preparation 2020 meetup #1 EU region 2020 meetup #2 APAC region Calls for hosting the 2020 meetups will be issued by the meetups team soon. Thanks Chris Morgan (on behalf of the openstack ops meetups team) Meeting ended Tue Jul 2 14:48:24 2019 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) 10:48 AM Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-07-02-14.17.html 10:48 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-07-02-14.17.txt 10:48 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2019/ops_meetup_team.2019-07-02-14.17.log.html -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashlee at openstack.org Tue Jul 2 19:17:22 2019 From: ashlee at openstack.org (Ashlee Ferguson) Date: Tue, 2 Jul 2019 14:17:22 -0500 Subject: [Shanghai Summit] CFP Deadline TONIGHT Message-ID: <25C8C459-0A89-4955-8F62-97D31C5A3362@openstack.org> Hi everyone, The Shanghai Summit Call for Presentations [1] deadline is TONIGHT, July 2 at 11:59 pm PT (July 3, 2019 at 15:00 China Standard Time)! Submit your presentations, panels, and Hands-on Workshops for the Open Infrastructure Summit [2] by the end of today, and join the global community in Shanghai, November 4-6, 2019. Sessions will be presented in both Mandarin and English, so you may submit your presentation in either language. Tracks [4]: 5G, NFV & Edge AI, Machine Learning & HPC CI/CD Container Infrastructure Getting Started Hands-on Workshops Open Development Private & Hybrid Cloud Public Cloud Security Other Helpful Shanghai Summit & PTG Information-- * Register now [5] before the early bird registration deadline in early August (USD or RMB options available) * Apply for Travel Support [6] before August 8. More information here [7]. * Interested in sponsoring the Summit? [8]. * The content submission process for the Forum and Project Teams Gathering will be managed separately in the upcoming months. We look forward to your submissions! 
Cheers, Ashlee [1] https://cfp.openstack.org/ [2] https://www.openstack.org/summit/shanghai-2019/ [3] https://cfp.openstack.org/ [4] https://www.openstack.org/summit/shanghai-2019/summit-categories/ [5] https://www.openstack.org/summit/shanghai-2019/ [6] https://openstackfoundation.formstack.com/forms/travelsupportshanghai [7] https://www.openstack.org/summit/shanghai-2019/travel/ [8] https://www.openstack.org/summit/shanghai-2019/sponsors/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From lbragstad at gmail.com Tue Jul 2 19:34:50 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Tue, 2 Jul 2019 14:34:50 -0500 Subject: [keystone][oslo] oslo.limt implementation update Message-ID: Today in keystone's office hours, we went through a group code review of what's currently proposed for the oslo.limit library [0]. This is a summary of the action items that came out of that meeting. * We should implement a basic functional testing framework that exercises keystoneauth connections (used for pulling limit information for keystone). Otherwise, we'll be mocking things left and right in unit tests to get decent test coverage with the current keystoneauth code. * Investigate alternatives to globals for keystoneauth connections [1]. * Investigate adopting a keystoneauth-like way of loading enforcement models (similar to how ksa loads authentication plugins) [2]. * Figure out if we want to use endpoint_id or service name + region name for service configuration [3]. * Build out functional testing for flat enforcement * Implement strict-two-level enforcement model This existing rewrite was mostly stolen from John's patches to his fork oslo.limit [4]. Hopefully the current series moves things in that direction. Feel free to chime in if you have additional notes or comments. Lance [0] https://review.opendev.org/#/q/topic:rewrite+(status:open+OR+status:merged)+project:openstack/oslo.limit [1] https://bugs.launchpad.net/oslo.limit/+bug/1835103 [2] https://bugs.launchpad.net/oslo.limit/+bug/1835104 [3] https://bugs.launchpad.net/oslo.limit/+bug/1835106 [4] https://github.com/JohnGarbutt/oslo.limit/commit/a5b908046fd904c25b6cd15c65266c747774b5ab -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From lbragstad at gmail.com Tue Jul 2 19:37:35 2019 From: lbragstad at gmail.com (Lance Bragstad) Date: Tue, 2 Jul 2019 14:37:35 -0500 Subject: [keystone][oslo] oslo.limt implementation update In-Reply-To: References: Message-ID: <23745175-52fa-88e4-d0de-eaa9ddc37dbb@gmail.com> I forgot to add that the meeting was held in IRC. Logs are available if you're interested in following along [0]. [0] http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2019-07-02.log.html#t2019-07-02T16:32:50 On 7/2/19 2:34 PM, Lance Bragstad wrote: > Today in keystone's office hours, we went through a group code review > of what's currently proposed for the oslo.limit library [0]. This is a > summary of the action items that came out of that meeting. > > * We should implement a basic functional testing framework that > exercises keystoneauth connections (used for pulling limit information > for keystone). Otherwise, we'll be mocking things left and right in > unit tests to get decent test coverage with the current keystoneauth code. 
> * Investigate alternatives to globals for keystoneauth connections [1]. > * Investigate adopting a keystoneauth-like way of loading enforcement > models (similar to how ksa loads authentication plugins) [2]. > * Figure out if we want to use endpoint_id or service name + region > name for service configuration [3]. > * Build out functional testing for flat enforcement > * Implement strict-two-level enforcement model > > This existing rewrite was mostly stolen from John's patches to his > fork oslo.limit [4]. Hopefully the current series moves things in that > direction. > > Feel free to chime in if you have additional notes or comments. > > Lance > > [0] > https://review.opendev.org/#/q/topic:rewrite+(status:open+OR+status:merged)+project:openstack/oslo.limit > [1] https://bugs.launchpad.net/oslo.limit/+bug/1835103 > [2] https://bugs.launchpad.net/oslo.limit/+bug/1835104 > [3] https://bugs.launchpad.net/oslo.limit/+bug/1835106 > [4] > https://github.com/JohnGarbutt/oslo.limit/commit/a5b908046fd904c25b6cd15c65266c747774b5ab -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From raubvogel at gmail.com Tue Jul 2 19:51:55 2019 From: raubvogel at gmail.com (Mauricio Tavares) Date: Tue, 2 Jul 2019 15:51:55 -0400 Subject: Instance (vm guest) not getting PCI card Message-ID: Newbie and easy questions: I have two cards, one in each stein (centos) compute node setup for kvm, which I want to be able to handle to a vm guest (instance). Following https://docs.openstack.org/nova/latest/admin/pci-passthrough.html, I 1. Setup both computer nodes to vt-t and iommu. 2. On the controller 2.1. Create a PCI alias based on the vendor and product ID alias = { "vendor_id":"19fg", "product_id":"4000", "device_type":"type-PF", "name":"testnic" } - The PCI address for the card is different on each compute node 2.2. Create a flavor, say, n1.large openstack flavor create n1.large --id auto --ram 8192 --disk 80 --vcpus 4 --property "pci_passthrough:alias"="testnic:1" 2.3. Restart openstack-nova-api 3. On each compute node 3.1. Create a PCI alias based on the vendor and product ID alias = { "vendor_id":"19fg", "product_id":"4000", "device_type":"type-PF", "name":"testnic" } 3.2. Create passthrough_whitelist entry passthrough_whitelist = { "vendor_id":"19fg", "product_id":"4000" } 3.3. Restart openstack-nova-compute 4. Create instance (vm guest) using the n1.large flavor. 5. Login to instance and discover dmesg and lspci does not list card 6. Do a "virsh dumpxml" for the instance on its compute node and discover there is no entry for the card listed in the xml file. I take nova would automagically do what I would if this was a kvm install, namely ensure card cannot be accessed/used by the host and then edit the guest xml file so it can see said card. Questions: Q1: If a device is sr-iov capable, do I have to use that or can I just pass the entire card to the vm guest? Q2: Is there anywhere I can look for clues to why is the libvirt xml file for the instance not being populated with the pci card info? So far I only looked in the controller node's nova_scheduler.log file. 
From smooney at redhat.com Tue Jul 2 20:36:32 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 02 Jul 2019 21:36:32 +0100 Subject: Instance (vm guest) not getting PCI card In-Reply-To: References: Message-ID: On Tue, 2019-07-02 at 15:51 -0400, Mauricio Tavares wrote: > Newbie and easy questions: I have two cards, one in each stein > (centos) compute node setup for kvm, which I want to be able to handle > to a vm guest (instance). Following > https://docs.openstack.org/nova/latest/admin/pci-passthrough.html, I > > 1. Setup both computer nodes to vt-t and iommu. > 2. On the controller > 2.1. Create a PCI alias based on the vendor and product ID > alias = { "vendor_id":"19fg", "product_id":"4000", > "device_type":"type-PF", "name":"testnic" } the alias looks correcect. assuming you have it set in teh pci section https://docs.openstack.org/nova/latest/configuration/config.html#pci.alias then i should generate teh request of a pci device. the alias needs to be defiend in the nova.conf used by the api node and the compute node for it to work correctly but i assume that when you say its set on the contoler it set on the nova.conf the nova api is useing. > > - The PCI address for the card is different on each compute node > > 2.2. Create a flavor, say, n1.large > openstack flavor create n1.large --id auto --ram 8192 --disk 80 > --vcpus 4 --property "pci_passthrough:alias"="testnic:1" this is also correct > > 2.3. Restart openstack-nova-api > > 3. On each compute node > 3.1. Create a PCI alias based on the vendor and product ID > alias = { "vendor_id":"19fg", "product_id":"4000", > "device_type":"type-PF", "name":"testnic" } > > 3.2. Create passthrough_whitelist entry > passthrough_whitelist = { "vendor_id":"19fg", "product_id":"4000" } assuming this is set in the pci section it also looks correct https://docs.openstack.org/nova/latest/configuration/config.html#pci.passthrough_whitelist > > 3.3. Restart openstack-nova-compute > > 4. Create instance (vm guest) using the n1.large flavor. > > 5. Login to instance and discover dmesg and lspci does not list card > > 6. Do a "virsh dumpxml" for the instance on its compute node and > discover there is no entry for the card listed in the xml file. I take > nova would automagically do what I would if this was a kvm install, > namely ensure card cannot be accessed/used by the host and then edit > the guest xml file so it can see said card. yes you should have seen it in the xml and the card should have been passed through to the guest. > > Questions: > Q1: If a device is sr-iov capable, do I have to use that or can I just > pass the entire card to the vm guest? you can passthorugh the entire card to the guest yes. > Q2: Is there anywhere I can look for clues to why is the libvirt xml > file for the instance not being populated with the pci card info? So > far I only looked in the controller node's nova_scheduler.log file. there are several things to check. first i would check the nova compute agenet log and see if there are any tracebacks or errors second in the nova cell db, often called just nova or nova_cell1 (not nova_cell0) check the pci_devices table and see if the devices are listed. 
> From openstack at fried.cc Tue Jul 2 21:01:19 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 2 Jul 2019 16:01:19 -0500 Subject: [keystone][oslo] oslo.limt implementation update In-Reply-To: <23745175-52fa-88e4-d0de-eaa9ddc37dbb@gmail.com> References: <23745175-52fa-88e4-d0de-eaa9ddc37dbb@gmail.com> Message-ID: <2ad2cb15-b7dc-cf12-1179-8612a9801699@fried.cc> >> * We should implement a basic functional testing framework that >> exercises keystoneauth connections (used for pulling limit information >> for keystone). Otherwise, we'll be mocking things left and right in >> unit tests to get decent test coverage with the current keystoneauth code. I would be not at all offended if functional test fixtures for ksa artifacts were made generally available. efried . From kecarter at redhat.com Tue Jul 2 21:28:55 2019 From: kecarter at redhat.com (Kevin Carter) Date: Tue, 2 Jul 2019 16:28:55 -0500 Subject: [TripleO] The Transformation Squad meeting canceled - 4 July, 2019 Message-ID: Hello all, I wanted to let everyone know that the Transformation Squad team meeting will be canceled this coming Thursday; 4 July, 2019 @ 13:00UTC. We will continue with regular meetings on 11 July, 2019 @ 13:00UTC. If there are questions about this cancellation, or anything else, please let us know. Thank you and have a great week. -- Kevin Carter IRC: cloudnull -------------- next part -------------- An HTML attachment was scrubbed... URL: From cboylan at sapwetik.org Tue Jul 2 22:37:34 2019 From: cboylan at sapwetik.org (Clark Boylan) Date: Tue, 02 Jul 2019 15:37:34 -0700 Subject: [all][qa] Tempest jobs are swapping Message-ID: I've been working to bring up a new cloud as part of our nodepool resource set and one of the things we do to sanity check that is run a default tempest full job. The first time I ran tempest it failed because I hadn't configured swap on the test node and we ran out of memory. I added swap, reran things and tempest passed just fine. Our base jobs configure swap as a last ditch effort to avoid failing jobs unnecessarily but the ideal is to avoid swap entirely. In the past 8GB of memory has been plenty to run the tempest testsuite so I think something has changed here and I think we should be able to get us running back under 8GB of memory again. I bring this up because in recent weeks we've seen different groups attempt to reduce their resource footprint (which is good), but many of the approaches seem to ignore that making our jobs as quick and reliable as possible (eg don't use swap) will have a major impact. This is due to the way gating works where a failure requires we discard all results for subsequent changes in the gate, remove the change that failed, then re enqueue jobs for the changes after the failed change. On top of that the quicker our jobs run the quicker we return resources to the pool. How do we debug this? Devstack jobs actually do capture dstat data as well as memory specific information that can be used to identify resource hogs. Taking a recent tempest-full job's dstat log we can see that cinder-backup is using 785MB of memory all on its own [0] (scroll to the bottom). Devstack also captures memory usage of a larger set of processes in its peakmem_tracker log [1]. This includes RSS specifically which doesn't match up with dstat's number making me think dstat's number may be virtual memory and not resident memory. This peakmem_tracker log identifies other processes which we might look at for improving this situation. 
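As a quick local sanity check of resident vs virtual memory (rough sketch using plain procps ps, nothing job specific):

  # biggest resident-set consumers, with virtual size alongside for comparison
  ps -eo rss,vsz,comm --sort=-rss | head -n 15

That makes it easier to tell whether a large dstat number is actual memory pressure or just address space.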
It would be great if the QA team and various projects could take a look at this to help improve the reliability and throughput of our testing. Thank you. [0] http://logs.openstack.org/81/665281/3/check/tempest-full/cf5e17e/controller/logs/screen-dstat.txt.gz [1] http://logs.openstack.org/81/665281/3/check/tempest-full/cf5e17e/controller/logs/screen-peakmem_tracker.txt.gz From mriedemos at gmail.com Tue Jul 2 23:18:50 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 2 Jul 2019 18:18:50 -0500 Subject: [all][qa] Tempest jobs are swapping In-Reply-To: References: Message-ID: On 7/2/2019 5:37 PM, Clark Boylan wrote: > Taking a recent tempest-full job's dstat log we can see that cinder-backup is using 785MB of memory all on its own [0] (scroll to the bottom). Oh hello.... https://review.opendev.org/#/c/651865/ -- Thanks, Matt From juliaashleykreger at gmail.com Tue Jul 2 23:35:51 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 2 Jul 2019 16:35:51 -0700 Subject: [ironic] Moving to office hours as opposed to weekly meetings for the next month Message-ID: Greetings Everyone! This week, during the weekly meeting, we seemed to reach consensus that we would try taking a break from meetings[0] and moving to orienting around using the mailing list[1] and our etherpad "whiteboard" [2]. With this, we're going to want to re-evaluate in about a month. I suspect it would be a good time for us to have a "mid-cycle" style set of topical calls. I've gone ahead and created a poll to try and identify a couple days that might be ideal for contributors[3]. But in the mean time, we want to ensure that we have some times for office hours. The suggestion was also made during this week's meeting that we may want to make the office hours window a little larger to enable more discussion. So when will we have office hours? ---------------------------------- Ideally we'll start with two time windows. One to provide coverage to US and Europe friendly time zones, and another for APAC contributors. * I think 2-4 PM UTC on Mondays would be ideal. This translates to 7-9 AM US-Pacific or 10 AM to 12 PM US-Eastern. * We need to determine a time window that would be ideal for APAC contributors. I've created a poll to help facilitate discussion[4]. So what is Office Hours? ------------------------ Office hours are a time window when we expect some contributors to be on IRC and able to partake in higher bandwidth discussions. These times are not absolute. They can change and evolve, and that is the most important thing for us to keep in mind. -- If there are any questions, Please let me know! Otherwise I'll send a summary email out on next Monday. -Julia [0]: http://eavesdrop.openstack.org/meetings/ironic/2019/ironic.2019-07-01-15.00.log.html#l-123 [1]: http://lists.openstack.org/pipermail/openstack-discuss/2019-June/007038.html [2]: https://etherpad.openstack.org/p/IronicWhiteBoard [3]: https://doodle.com/poll/652gzta6svsda343 [4]: https://doodle.com/poll/2ta5vbskytpntmgv From matt at oliver.net.au Wed Jul 3 00:16:48 2019 From: matt at oliver.net.au (Matthew Oliver) Date: Wed, 3 Jul 2019 10:16:48 +1000 Subject: [PTL] update FC_SIG liaison list reminder Message-ID: Hey PTLs, I'm one of your friendly First Contact SIG (FC_SIG) representatives, and previously emailed you fine folk asking if you could take a look at the FC_SIG liaison list[0] and update it as needed. As mentioned in the PTL guide[1]. For those of you who have already done that thing, awesome, thanks so much. 
If you haven't had a chance yet it would be fantastic if you could! What is a FC_SIG liaison, they will be the person we contact or add to a review when a new contributor contacts us and wants to work on your project or we find a new patch from a new contributor and we want to make sure they get all the love they need to get them engaged. So what's important here is having the liaison's details, including timezone, up to date. Also note more then 1 liaison in more then 1 timezone would be preferred (if it can be manged), so we can connect new contributors with someone closer to them. If a project doesn't have a liaison then it'll default to the PTL, but we'd still rather have the PTL in the list with timezones as that makes it easier for us. So talk about it in your teams. Thanks, Matt [0] - https://wiki.openstack.org/wiki/CrossProjectLiaisons#First_Contact_SIG [1] - https://docs.openstack.org/project-team-guide/ptl.html#at-the-beginning-of-a-new-cycle -------------- next part -------------- An HTML attachment was scrubbed... URL: From li.canwei2 at zte.com.cn Wed Jul 3 02:34:52 2019 From: li.canwei2 at zte.com.cn (li.canwei2 at zte.com.cn) Date: Wed, 3 Jul 2019 10:34:52 +0800 (CST) Subject: =?UTF-8?B?W1dhdGNoZXJdIHRlYW0gbWVldGluZyDCoGF0IDA4OjAwIFVUQyB0b2RheQ==?= Message-ID: <201907031034525007996@zte.com.cn> Hi all, Watcher team will have a meeting at 08:00 UTC today in the #openstack-meeting-alt channel. The agenda is available on https://wiki.openstack.org/wiki/Watcher_Meeting_Agenda feel free to add any additional items. Thanks! Canwei Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From berndbausch at gmail.com Wed Jul 3 03:22:23 2019 From: berndbausch at gmail.com (Bernd Bausch) Date: Wed, 3 Jul 2019 12:22:23 +0900 Subject: HPE 3PAR Cinder driver-Multiattach: Fails to detach second instance from volume In-Reply-To: References: Message-ID: <45419a88-ca76-0e16-a857-4d7f33717bcb@gmail.com> Yes, by determining the size of the attachments list. CLI: /openstack volume show /displays attachment details. API: The response for the volume details API contains an /attachments /parameter. On 7/2/2019 1:14 PM, RAI, SNEHA wrote: > Hi All, > > Is there a way to find in cinder how many instances are attached to a > volume? > > Thanks & Regards, > > Sneha Rai > > *From:* RAI, SNEHA > *Sent:* Monday, July 1, 2019 5:26 PM > *To:* openstack-dev at lists.openstack.org > *Subject:* HPE 3PAR Cinder driver-Multiattach: Fails to detach second > instance from volume > > Hi Team, > > There is a bug on 3PAR Cinder driver > https://bugs.launchpad.net/cinder/+bug/1834660 > . > > I am able to attach multiple instances to 3PAR volume but only the > first instance gets detached successfully. > > For the second instance, volume goes into detaching status due to > “Host does not exist” error. > > What is happening here is, the first detach call invokes > _delete_3par_host() which removes the compute host entry from 3PAR > which ideally should be done only when the last instance is to be > detached. > > It would be great if someone could help me understand if this needs to > be handled in driver code or nova would internally take care of it. > > Code changes done to support > multiattach-https://review.opendev.org/#/c/659443 > > Thanks & Regards, > > Sneha Rai > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rico.lin.guanyu at gmail.com Wed Jul 3 04:07:40 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 3 Jul 2019 12:07:40 +0800 Subject: [Shanghai Summit] CFP Deadline TONIGHT In-Reply-To: <25C8C459-0A89-4955-8F62-97D31C5A3362@openstack.org> References: <25C8C459-0A89-4955-8F62-97D31C5A3362@openstack.org> Message-ID: Hi Ashlee, I notice the deadline has postponed to the end of July 7th. If that is correct, people who follow this mail should know about that. Ashlee Ferguson 於 2019年7月3日 週三,上午3:23寫道: > Hi everyone, > > The Shanghai Summit Call for Presentations [1] deadline is TONIGHT, July 2 > at 11:59 pm PT (July 3, 2019 at 15:00 China Standard Time)! > > Submit your presentations, panels, and Hands-on Workshops for the Open > Infrastructure Summit [2] by the end of today, and join the global > community in Shanghai, November 4-6, 2019. Sessions will be presented in > both Mandarin and English, so you may submit your presentation in either > language. > > Tracks [4]: > 5G, NFV & Edge > AI, Machine Learning & HPC > CI/CD > Container Infrastructure > Getting Started > Hands-on Workshops > Open Development > Private & Hybrid Cloud > Public Cloud > Security > > Other Helpful Shanghai Summit & PTG Information-- > > * Register now [5] before the early bird registration deadline in > early August (USD or RMB options available) > * Apply for Travel Support [6] before August 8. More information here [7]. > * Interested in sponsoring the Summit? [8]. > * The content submission process for the Forum and Project Teams Gathering > will be managed separately in the upcoming months. > > We look forward to your submissions! > > Cheers, > Ashlee > > [1] https://cfp.openstack.org/ > [2] https://www.openstack.org/summit/shanghai-2019/ > [3] https://cfp.openstack.org/ > [4] https://www.openstack.org/summit/shanghai-2019/summit-categories/ > [5] https://www.openstack.org/summit/shanghai-2019/ > [6] https://openstackfoundation.formstack.com/forms/travelsupportshanghai > [7] https://www.openstack.org/summit/shanghai-2019/travel/ > [8] https://www.openstack.org/summit/shanghai-2019/sponsors/ > > > > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Wed Jul 3 04:15:43 2019 From: jimmy at openstack.org (Jimmy McArthur) Date: Tue, 02 Jul 2019 23:15:43 -0500 Subject: [OpenStack Foundation] [Shanghai Summit] CFP Deadline TONIGHT In-Reply-To: References: <25C8C459-0A89-4955-8F62-97D31C5A3362@openstack.org> Message-ID: <5D1C2BEF.1050204@openstack.org> The deadline is tonight, as far as comms are concerned. We tend to keep the CFP open for a few extra days for stragglers and such :) It's absolutely critical for us to get a final count of submissions, so the sooner the better. Cheers, Jimmy > Rico Lin > July 2, 2019 at 11:07 PM > Hi Ashlee, I notice the deadline has postponed to the end of July 7th. > If that is correct, people who follow this mail should know about that. > > -- > May The Force of OpenStack Be With You, > */Rico Lin > /*irc: ricolin > > > > _______________________________________________ > Foundation mailing list > Foundation at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/foundation > Ashlee Ferguson > July 2, 2019 at 2:17 PM > Hi everyone, > > The Shanghai Summit Call for Presentations [1] deadline is TONIGHT, > July 2 at 11:59 pm PT (July 3, 2019 at 15:00 China Standard Time)! 
> > Submit your presentations, panels, and Hands-on Workshops for the Open > Infrastructure Summit [2] by the end of today, and join the global > community in Shanghai, November 4-6, 2019. Sessions will be presented > in both Mandarin and English, so you may submit your presentation in > either language. > > Tracks [4]: > 5G, NFV & Edge > AI, Machine Learning & HPC > CI/CD > Container Infrastructure > Getting Started > Hands-on Workshops > Open Development > Private & Hybrid Cloud > Public Cloud > Security > > Other Helpful Shanghai Summit & PTG Information-- > > * Register now [5] before the early bird registration deadline in > early August (USD or RMB options available) > * Apply for Travel Support [6] before August 8. More information here [7]. > * Interested in sponsoring the Summit? [8]. > * The content submission process for the Forum and Project > Teams Gathering will be managed separately in the upcoming months. > > We look forward to your submissions! > > Cheers, > Ashlee > > [1] https://cfp.openstack.org/ > [2] https://www.openstack.org/summit/shanghai-2019/ > [3] https://cfp.openstack.org/ > [4] https://www.openstack.org/summit/shanghai-2019/summit-categories/ > [5] https://www.openstack.org/summit/shanghai-2019/ > [6] https://openstackfoundation.formstack.com/forms/travelsupportshanghai > [7] https://www.openstack.org/summit/shanghai-2019/travel/ > [8] https://www.openstack.org/summit/shanghai-2019/sponsors/ > > > > > _______________________________________________ > Foundation mailing list > Foundation at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/foundation -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Wed Jul 3 09:32:45 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Wed, 03 Jul 2019 18:32:45 +0900 Subject: [all][qa] Tempest jobs are swapping In-Reply-To: References: Message-ID: <16bb72de130.e132b245172524.593102227457274520@ghanshyammann.com> ---- On Wed, 03 Jul 2019 07:37:34 +0900 Clark Boylan wrote ---- > I've been working to bring up a new cloud as part of our nodepool resource set and one of the things we do to sanity check that is run a default tempest full job. The first time I ran tempest it failed because I hadn't configured swap on the test node and we ran out of memory. I added swap, reran things and tempest passed just fine. > > Our base jobs configure swap as a last ditch effort to avoid failing jobs unnecessarily but the ideal is to avoid swap entirely. In the past 8GB of memory has been plenty to run the tempest testsuite so I think something has changed here and I think we should be able to get us running back under 8GB of memory again. > > I bring this up because in recent weeks we've seen different groups attempt to reduce their resource footprint (which is good), but many of the approaches seem to ignore that making our jobs as quick and reliable as possible (eg don't use swap) will have a major impact. This is due to the way gating works where a failure requires we discard all results for subsequent changes in the gate, remove the change that failed, then re enqueue jobs for the changes after the failed change. On top of that the quicker our jobs run the quicker we return resources to the pool. > > How do we debug this? Devstack jobs actually do capture dstat data as well as memory specific information that can be used to identify resource hogs. 
Taking a recent tempest-full job's dstat log we can see that cinder-backup is using 785MB of memory all on its own [0] (scroll to the bottom). Devstack also captures memory usage of a larger set of processes in its peakmem_tracker log [1]. This includes RSS specifically which doesn't match up with dstat's number making me think dstat's number may be virtual memory and not resident memory. This peakmem_tracker log identifies other processes which we might look at for improving this situation. > > It would be great if the QA team and various projects could take a look at this to help improve the reliability and throughput of our testing. Thank you. Thanks, Clark for pointing this. We have faced the memory issue in fast also where some of the swift services were disabled. cinder-backup service is no doubt taking a lot of memory. As matt mentioned the patch of disabling the c-bak service in tempest-full, we need some job which can run c-bak tests on cinder as well on tempest side but not on other projects. There will be some improvement I except by splitting the integrated tests as per actual dependent services [3]. I need some time to prepare those template and propose if the situation improves. > > [0] http://logs.openstack.org/81/665281/3/check/tempest-full/cf5e17e/controller/logs/screen-dstat.txt.gz > [1] http://logs.openstack.org/81/665281/3/check/tempest-full/cf5e17e/controller/logs/screen-peakmem_tracker.txt.gz > [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-May/005871.html -gmann > From ignaziocassano at gmail.com Wed Jul 3 09:34:42 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 3 Jul 2019 11:34:42 +0200 Subject: [magnum][queens] Floating ip error Message-ID: Hello All, I've just install queens kolla openstack qith magnum but when I try to create a docker swarm cluster the magnum conductor reports: : : The Resource Type (Magnum::Optional::Neutron::LBaaS::FloatingIP) could not be found. I tried lbaas and floating ip works of magnum context. Does it requires octavia ? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From raubvogel at gmail.com Wed Jul 3 14:51:49 2019 From: raubvogel at gmail.com (Mauricio Tavares) Date: Wed, 3 Jul 2019 10:51:49 -0400 Subject: Instance (vm guest) not getting PCI card In-Reply-To: References: Message-ID: On Tue, Jul 2, 2019 at 4:36 PM Sean Mooney wrote: > > On Tue, 2019-07-02 at 15:51 -0400, Mauricio Tavares wrote: > > Newbie and easy questions: I have two cards, one in each stein > > (centos) compute node setup for kvm, which I want to be able to handle > > to a vm guest (instance). Following > > https://docs.openstack.org/nova/latest/admin/pci-passthrough.html, I > > > > 1. Setup both computer nodes to vt-t and iommu. > > 2. On the controller > > 2.1. Create a PCI alias based on the vendor and product ID > > alias = { "vendor_id":"19fg", "product_id":"4000", > > "device_type":"type-PF", "name":"testnic" } > the alias looks correcect. assuming you have it set in teh pci section > https://docs.openstack.org/nova/latest/configuration/config.html#pci.alias > then i should generate teh request of a pci device. > In fact my initial sed script is 'find the line beginning with "[pci]" and then append this underneath it'. I could probably do something more clever, or use ansible, but I was in a hurry. 
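For what it's worth, a minimal sketch of that kind of one-liner (assuming the stock /etc/nova/nova.conf path and reusing the IDs quoted above) could be:

$ sudo sed -i '/^\[pci\]/a alias = { "vendor_id":"19fg", "product_id":"4000", "device_type":"type-PF", "name":"testnic" }' /etc/nova/nova.conf

Ansible's ini_file module would be the tidier way to manage the same lines, but the sed form is what a quick manual edit looks like.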
:) > the alias needs to be defiend in the nova.conf used by the api node and the compute node > for it to work correctly but i assume that when you say its set on the contoler it set on > the nova.conf the nova api is useing. Exactly; that would be step 3.1 further down. > > > > - The PCI address for the card is different on each compute node > > > > 2.2. Create a flavor, say, n1.large > > openstack flavor create n1.large --id auto --ram 8192 --disk 80 > > --vcpus 4 --property "pci_passthrough:alias"="testnic:1" > this is also correct > > > > > 2.3. Restart openstack-nova-api > > > > 3. On each compute node > > 3.1. Create a PCI alias based on the vendor and product ID > > alias = { "vendor_id":"19fg", "product_id":"4000", > > "device_type":"type-PF", "name":"testnic" } > > > > 3.2. Create passthrough_whitelist entry > > passthrough_whitelist = { "vendor_id":"19fg", "product_id":"4000" } > assuming this is set in the pci section it also looks correct > https://docs.openstack.org/nova/latest/configuration/config.html#pci.passthrough_whitelist > > I actually put the passthrough_whitelist entry just below the alias one, which is below the [pci] label in the nova.conf file. Make it easier for me to find them later on. > > 3.3. Restart openstack-nova-compute > > > > 4. Create instance (vm guest) using the n1.large flavor. > > > > 5. Login to instance and discover dmesg and lspci does not list card > > > > 6. Do a "virsh dumpxml" for the instance on its compute node and > > discover there is no entry for the card listed in the xml file. I take > > nova would automagically do what I would if this was a kvm install, > > namely ensure card cannot be accessed/used by the host and then edit > > the guest xml file so it can see said card. > yes you should have seen it in the xml and the card should have been passed through to the guest. > > > > Questions: > > Q1: If a device is sr-iov capable, do I have to use that or can I just > > pass the entire card to the vm guest? > you can passthorugh the entire card to the guest yes. > Now, when I ask for the list of pci devices available in the compute nodes, why are they listed as type-PF? I am a bit concerned because it feels like it will be anxiously trying to virtualize it instead of just leaving said card alone, which I would expect with type-PCI. > > > Q2: Is there anywhere I can look for clues to why is the libvirt xml > > file for the instance not being populated with the pci card info? So > > far I only looked in the controller node's nova_scheduler.log file. > there are several things to check. > first i would check the nova compute agenet log and see if there are any tracebacks or errors > second in the nova cell db, often called just nova or nova_cell1 (not nova_cell0) check the pci_devices > table and see if the devices are listed. > > > Well, logs I can understand (we are talking about the nova-compute.log, right?) but I guess this is where my completely cluelessness shows up in grand style: I do not know where to look for that in the database. Nor could figure out how to talk to the REST interface using curl other than getting a token. So, I did a kludgy workaround and got the pci pool associated with each node, say [{'count': 1, 'product_id': u'4000', u'dev_type': u'type-PF', 'numa_node': 0, 'vendor_id': u'19fg'}] My ASSumption (yes, I know what they say about them) here is that when the object defining the compute node is updated, the database entry associated with it gets fed the pci pool I am seeing. 
In other words, that is the list of pci devices openstack thinks the node has. I guess this is the time I have to sheepishly admit that while one of the nodes has a single card, the other one has two; they are identified by being 'numa_node': 0 and 'numa_node': 1. Hopefully that will not cause issues. Then, I compared it to the pci request made before the instance is created:

(alias_name='testnic',count=1,is_new=,numa_policy='legacy',request_id=None,requester_id=,spec=[{dev_type='type-PF',product_id='4000',vendor_id='19fg'}])

Since they both match, they satisfied the pci passthrough filter test and the instance was allowed to be spawned. That is as far as I went.

From ignaziocassano at gmail.com  Wed Jul  3 14:58:49 2019
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Wed, 3 Jul 2019 16:58:49 +0200
Subject: [queens][magnuma] kubernetes cluster
Message-ID: 

Hi All,
I just installed openstack kolla queens with magnum, but when trying to create a kubernetes cluster the master node does not finish the installation: it loops with the following message:

curl --silent http://127.0.0.1:8080/healthz
+ '[' ok = '' ']'
+ sleep 5

Can anyone help?
Best Regards
Ignazio
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ignaziocassano at gmail.com  Wed Jul  3 19:03:12 2019
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Wed, 3 Jul 2019 21:03:12 +0200
Subject: [magnum][queens] kume master error in clout-init-output.log
Message-ID: 

Hi All,
I've just installed openstack kolla queens and when I try to run a kubernetes cluster, the master node reports the following in the cloud-init log file:

Cloud-init v. 0.7.9 running 'modules:config' at Wed, 03 Jul 2019 18:57:19 +0000. Up 20.40 seconds.
+ CA_FILE=/etc/pki/ca-trust/source/anchors/openstack-ca.pem
+ '[' -n '' ']'

/var/lib/cloud/instance/scripts/part-007: line 57: /etc/etcd/etcd.conf: No such file or directory
/var/lib/cloud/instance/scripts/part-007: line 70: /etc/etcd/etcd.conf: No such file or directory
/var/lib/cloud/instance/scripts/part-007: line 86: /etc/etcd/etcd.conf: No such file or directory
Cloud-init v. 0.7.9 running 'modules:final' at Wed, 03 Jul 2019 18:57:21 +0000. Up 21.93 seconds.
2019-07-03 18:59:23,121 - util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-007 [1]

The image I am using is the one suggested in the openstack documentation:
$ wget https://download.fedoraproject.org/pub/alt/atomic/stable/Fedora-Atomic-27-20180212.2/CloudImages/x86_64/images/Fedora-Atomic-27-20180212.2.x86_64.qcow2

The volume driver is cinder and the docker storage driver is overlay.
The docker volume size is 5 GB and the volume size of the flavor is 20 GB.

It seems the image is missing some packages :-(
Could anyone help me, please?
Ignazio
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From eric.adams at intel.com  Tue Jul  2 23:24:39 2019
From: eric.adams at intel.com (Adams, Eric)
Date: Tue, 2 Jul 2019 23:24:39 +0000
Subject: [kata-dev] [Zun][Kata] Devstack plugin for installing Kata container
In-Reply-To: 
References: 
Message-ID: 

Hongbin,

That is great to hear. When it is fully integrated we can replace our somewhat hacky Zun / Kata instructions at https://github.com/kata-containers/documentation/blob/master/use-cases/zun_kata.md which just takes the previous Clear Containers enablement work on OpenStack Zun and replaces it with the Kata runtime.
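For anyone trying this in the meantime, the core of the enablement is just registering Kata as an extra Docker runtime; a rough sketch, assuming the default /usr/bin/kata-runtime path and no existing daemon.json that would need merging, is:

$ cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "runtimes": {
    "kata-runtime": { "path": "/usr/bin/kata-runtime" }
  }
}
EOF
$ sudo systemctl restart docker
$ docker run --rm --runtime=kata-runtime busybox uname -r

The last command should report a guest kernel different from the host's if Kata is actually being used.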
Thanks,
Eric

From: Hongbin Lu [mailto:hongbin034 at gmail.com]
Sent: Monday, July 01, 2019 2:08 PM
To: kata-dev at lists.katacontainers.io; OpenStack Discuss
Subject: [kata-dev] [Zun][Kata] Devstack plugin for installing Kata container

Hi all,

I have a patch [1] that adds support for installing kata Container in devstack. Right now, it configures kata container to run with Docker. In the future, I plan to add support for containerd's cri plugin, which basically allows running pods with kata container. Initially, OpenStack Zun will use this plugin to install Kata container, but I believe it would be beneficial for other projects. Appreciate if anyone interest to cast your feedback on the patch.

[1] https://review.opendev.org/#/c/668490/

Best regards,
Hongbin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From martialmichel at datamachines.io  Wed Jul  3 04:36:41 2019
From: martialmichel at datamachines.io (Martial Michel)
Date: Wed, 3 Jul 2019 00:36:41 -0400
Subject: [Scientific] No Scientific SIG meeting Jul 3rd
Message-ID: 

Sending this email to inform you that we will not be having a Scientific SIG meeting Today. Hopefully next week -- Martial
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dp at servinga.com  Wed Jul  3 15:31:16 2019
From: dp at servinga.com (Denis Pascheka)
Date: Wed, 3 Jul 2019 17:31:16 +0200 (CEST)
Subject: [queens][magnuma] kubernetes cluster
In-Reply-To: 
References: 
Message-ID: <765458860.27505.1562167876903@ox.servinga.com>

An HTML attachment was scrubbed...
URL: 

From ignaziocassano at gmail.com  Wed Jul  3 15:49:25 2019
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Wed, 3 Jul 2019 17:49:25 +0200
Subject: [queens][magnuma] kubernetes cluster
In-Reply-To: <765458860.27505.1562167876903@ox.servinga.com>
References: <765458860.27505.1562167876903@ox.servinga.com>
Message-ID: 

Thanks Denis, but on kube master the port 8080 is not listening ....probably some services are not active :-(

Il Mer 3 Lug 2019 17:31 Denis Pascheka ha scritto:

> Hi Ignazio,
>
> in Queens there is an issue within Magnum which has been resolved in the
> Rocky release.
> Take a look at this file: > https://github.com/openstack/magnum/blob/stable/rocky/magnum/drivers/common/templates/kubernetes/fragments/wc-notify-master.sh. > > The execution of the curl command in row 16 needs to be escaped with an > backslash. You can achieve this by building your own magnum containers > and > adding an template override > to > it where you add your fixed/own wc-notify-master.sh script from the plugin > directory > . > > > Best Regards, > > *Denis Pascheka* > Cloud Architect > > t: +49 (69) 348 75 11 12 > m: +49 (170) 495 6364 > e: dp at servinga.com > servinga GmbH > Mainzer Landstr. 351-353 > 60326 Frankfurt > > > > * > www.servinga.com * > > Amtsgericht Frankfurt am Main - HRB 91418 - Geschäftsführer Adam Lakota, > Christian Lertes > > Ignazio Cassano hat am 3. Juli 2019 um 16:58 > geschrieben: > > > Hi All, > I just installed openstack kolla queens with magnum but trying to create a > kubernetes cluster the master nodes does not terminate installation: it > loops with the following message: > > curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '' ']' > + sleep 5 > > Anyone can help ? > Best Regards > Ignazio > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: From ignaziocassano at gmail.com Wed Jul 3 16:37:54 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 3 Jul 2019 18:37:54 +0200 Subject: [queens][magnuma] kubernetes cluster In-Reply-To: <765458860.27505.1562167876903@ox.servinga.com> References: <765458860.27505.1562167876903@ox.servinga.com> Message-ID: Thanks Denis, but I think there is another problem: on kube muster port 8080 is not listening, probably some services are note started Regards Ignazio Il giorno mer 3 lug 2019 alle ore 17:31 Denis Pascheka ha scritto: > Hi Ignazio, > > in Queens there is an issue within Magnum which has been resolved in the > Rocky release. > Take a look at this file: > https://github.com/openstack/magnum/blob/stable/rocky/magnum/drivers/common/templates/kubernetes/fragments/wc-notify-master.sh. > > The execution of the curl command in row 16 needs to be escaped with an > backslash. You can achieve this by building your own magnum containers > and > adding an template override > to > it where you add your fixed/own wc-notify-master.sh script from the plugin > directory > . > > > Best Regards, > > *Denis Pascheka* > Cloud Architect > > t: +49 (69) 348 75 11 12 > m: +49 (170) 495 6364 > e: dp at servinga.com > servinga GmbH > Mainzer Landstr. 351-353 > 60326 Frankfurt > > > > * > www.servinga.com * > > Amtsgericht Frankfurt am Main - HRB 91418 - Geschäftsführer Adam Lakota, > Christian Lertes > > Ignazio Cassano hat am 3. Juli 2019 um 16:58 > geschrieben: > > > Hi All, > I just installed openstack kolla queens with magnum but trying to create a > kubernetes cluster the master nodes does not terminate installation: it > loops with the following message: > > curl --silent http://127.0.0.1:8080/healthz > + '[' ok = '' ']' > + sleep 5 > > Anyone can help ? > Best Regards > Ignazio > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: noname
Type: image/png
Size: 65536 bytes
Desc: not available
URL: 

From feilong at catalyst.net.nz  Thu Jul  4 02:30:20 2019
From: feilong at catalyst.net.nz (Feilong Wang)
Date: Thu, 4 Jul 2019 14:30:20 +1200
Subject: [magnum][queens] kume master error in clout-init-output.log
In-Reply-To: 
References: 
Message-ID: 

Hi Ignazio,

Based on the error message you provided, the etcd cluster is not started successfully. You can run the script part-007 manually to debug what's the root cause.

On 4/07/19 7:03 AM, Ignazio Cassano wrote:
> Hi All,
> I 've just installed openstack kolla queens and wen I try to run a
> kubernetes cluster,
> the master node reports the following in the cloud init log file:
> Cloud-init v. 0.7.9 running 'modules:config' at Wed, 03 Jul 2019
> 18:57:19 +0000. Up 20.40 seconds.
> + CA_FILE=/etc/pki/ca-trust/source/anchors/openstack-ca.pem
> + '[' -n '' ']'
>
> /var/lib/cloud/instance/scripts/part-007: line 57:
> /etc/etcd/etcd.conf: No such file or directory
> /var/lib/cloud/instance/scripts/part-007: line 70:
> /etc/etcd/etcd.conf: No such file or directory
> /var/lib/cloud/instance/scripts/part-007: line 86:
> /etc/etcd/etcd.conf: No such file or directory
> Cloud-init v. 0.7.9 running 'modules:final' at Wed, 03 Jul 2019
> 18:57:21 +0000. Up 21.93 seconds.
> 2019-07-03 18:59:23,121 - util.py[WARNING]: Failed running
> /var/lib/cloud/instance/scripts/part-007 [1]
>
> The image I am using is suggested in the openstack documentation: $
> wget https://download.fedoraproject.org/pub/alt/atomic/stable/Fedora-Atomic-27-20180212.2/CloudImages/x86_64/images/Fedora-Atomic-27-20180212.2.x86_64.qcow2
>
> The volume driver is cinder and the docker storage driver is overlay.
> The docker volume size is 5 GB and THE VOLUME SIZE OF THE FLAVOR IS
> 20gb.
> Seems image is missing some packeages :-(
> Anyone could help me, please ?
> Ignazio

-- 
Cheers & Best regards,
Feilong Wang (王飞龙)
--------------------------------------------------------------------------
Senior Cloud Software Engineer
Tel: +64-48032246
Email: flwang at catalyst.net.nz
Catalyst IT Limited
Level 6, Catalyst House, 150 Willis Street, Wellington
--------------------------------------------------------------------------
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ignaziocassano at gmail.com  Thu Jul  4 03:46:18 2019
From: ignaziocassano at gmail.com (Ignazio Cassano)
Date: Thu, 4 Jul 2019 05:46:18 +0200
Subject: [magnum][queens] kume master error in clout-init-output.log
In-Reply-To: 
References: 
Message-ID: 

Thanks. I'll try
Ignazio

Il Gio 4 Lug 2019 04:39 Feilong Wang ha scritto:

> Hi Ignazio,
>
> Based on the error message you provided, the etcd cluster is not started
> successfully. You can run the script part-007 manually to debug what's the
> root cause.
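In practice that debugging boils down to something like the following on the failing master node (paths taken from the log above; the etcd unit name is only a guess):

$ sudo tail -n 100 /var/log/cloud-init-output.log
$ sudo bash -x /var/lib/cloud/instance/scripts/part-007
$ sudo systemctl status etcd

Re-running the part script with tracing usually shows which download or variable is missing.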
> > > On 4/07/19 7:03 AM, Ignazio Cassano wrote: > > Hi All, > I 've just installed openstack kolla queens and wen I try to run a > kubernetes cluster, > the master node reports the following in the cloud init log file: > Cloud-init v. 0.7.9 running 'modules:config' at Wed, 03 Jul 2019 18:57:19 > +0000. Up 20.40 seconds. > + CA_FILE=/etc/pki/ca-trust/source/anchors/openstack-ca.pem > + '[' -n '' ']' > > /var/lib/cloud/instance/scripts/part-007: line 57: /etc/etcd/etcd.conf: No > such file or directory > /var/lib/cloud/instance/scripts/part-007: line 70: /etc/etcd/etcd.conf: No > such file or directory > /var/lib/cloud/instance/scripts/part-007: line 86: /etc/etcd/etcd.conf: No > such file or directory > Cloud-init v. 0.7.9 running 'modules:final' at Wed, 03 Jul 2019 18:57:21 > +0000. Up 21.93 seconds. > 2019-07-03 18:59:23,121 - util.py[WARNING]: Failed running > /var/lib/cloud/instance/scripts/part-007 [1] > > The image I am using is suggested in the openstack documentation: > $ wget https://download.fedoraproject.org/pub/alt/atomic/stable/Fedora-Atomic-27-20180212.2/CloudImages/x86_64/images/Fedora-Atomic-27-20180212.2.x86_64.qcow2 > > The volume driver is cinder and the docker storage driver is overlay. > > The docker volume size is 5 GB and THE VOLUME SIZE OF THE FLAVOR IS 20gb. > > > Seems image is missing some packeages :-( > > Anyone could help me, please ? > > Ignazio > > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > -------------------------------------------------------------------------- > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Jul 4 04:49:42 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 4 Jul 2019 06:49:42 +0200 Subject: [magnum][queens] kume master error in clout-init-output.log In-Reply-To: References: Message-ID: Hello, the problem is that my externet network uses http proxy for downloading package from internet The file /etc/sysconfig/heat-params containes the HTTP_PROXY variable I passed to the cluster template but it does not export it for part007 script and etcd is not downloaded. If I change /etc/sysconfig/heat-params and insert at the head set -a and then I run part007 script, it works because reads the HTTP_PROXY variable and can download etcd. This seems a bug. Regards Ignazio Il giorno gio 4 lug 2019 alle ore 04:39 Feilong Wang < feilong at catalyst.net.nz> ha scritto: > Hi Ignazio, > > Based on the error message you provided, the etcd cluster is not started > successfully. You can run the script part-007 manually to debug what's the > root cause. > > > On 4/07/19 7:03 AM, Ignazio Cassano wrote: > > Hi All, > I 've just installed openstack kolla queens and wen I try to run a > kubernetes cluster, > the master node reports the following in the cloud init log file: > Cloud-init v. 0.7.9 running 'modules:config' at Wed, 03 Jul 2019 18:57:19 > +0000. Up 20.40 seconds. 
> + CA_FILE=/etc/pki/ca-trust/source/anchors/openstack-ca.pem > + '[' -n '' ']' > > /var/lib/cloud/instance/scripts/part-007: line 57: /etc/etcd/etcd.conf: No > such file or directory > /var/lib/cloud/instance/scripts/part-007: line 70: /etc/etcd/etcd.conf: No > such file or directory > /var/lib/cloud/instance/scripts/part-007: line 86: /etc/etcd/etcd.conf: No > such file or directory > Cloud-init v. 0.7.9 running 'modules:final' at Wed, 03 Jul 2019 18:57:21 > +0000. Up 21.93 seconds. > 2019-07-03 18:59:23,121 - util.py[WARNING]: Failed running > /var/lib/cloud/instance/scripts/part-007 [1] > > The image I am using is suggested in the openstack documentation: > $ wget https://download.fedoraproject.org/pub/alt/atomic/stable/Fedora-Atomic-27-20180212.2/CloudImages/x86_64/images/Fedora-Atomic-27-20180212.2.x86_64.qcow2 > > The volume driver is cinder and the docker storage driver is overlay. > > The docker volume size is 5 GB and THE VOLUME SIZE OF THE FLAVOR IS 20gb. > > > Seems image is missing some packeages :-( > > Anyone could help me, please ? > > Ignazio > > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > -------------------------------------------------------------------------- > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Jul 4 06:13:01 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 4 Jul 2019 08:13:01 +0200 Subject: [magnum][queens] kume master error in clout-init-output.log In-Reply-To: References: Message-ID: Hello Feilong, do you work actively in magnum project ? If yes, how we could solve this problem ? Must we open a bug ? Alternatives ? Regards Ignazio Il giorno gio 4 lug 2019 alle ore 04:39 Feilong Wang < feilong at catalyst.net.nz> ha scritto: > Hi Ignazio, > > Based on the error message you provided, the etcd cluster is not started > successfully. You can run the script part-007 manually to debug what's the > root cause. > > > On 4/07/19 7:03 AM, Ignazio Cassano wrote: > > Hi All, > I 've just installed openstack kolla queens and wen I try to run a > kubernetes cluster, > the master node reports the following in the cloud init log file: > Cloud-init v. 0.7.9 running 'modules:config' at Wed, 03 Jul 2019 18:57:19 > +0000. Up 20.40 seconds. > + CA_FILE=/etc/pki/ca-trust/source/anchors/openstack-ca.pem > + '[' -n '' ']' > > /var/lib/cloud/instance/scripts/part-007: line 57: /etc/etcd/etcd.conf: No > such file or directory > /var/lib/cloud/instance/scripts/part-007: line 70: /etc/etcd/etcd.conf: No > such file or directory > /var/lib/cloud/instance/scripts/part-007: line 86: /etc/etcd/etcd.conf: No > such file or directory > Cloud-init v. 0.7.9 running 'modules:final' at Wed, 03 Jul 2019 18:57:21 > +0000. Up 21.93 seconds. > 2019-07-03 18:59:23,121 - util.py[WARNING]: Failed running > /var/lib/cloud/instance/scripts/part-007 [1] > > The image I am using is suggested in the openstack documentation: > $ wget https://download.fedoraproject.org/pub/alt/atomic/stable/Fedora-Atomic-27-20180212.2/CloudImages/x86_64/images/Fedora-Atomic-27-20180212.2.x86_64.qcow2 > > The volume driver is cinder and the docker storage driver is overlay. > > The docker volume size is 5 GB and THE VOLUME SIZE OF THE FLAVOR IS 20gb. 
> > > Seems image is missing some packeages :-( > > Anyone could help me, please ? > > Ignazio > > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > -------------------------------------------------------------------------- > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Jul 4 07:24:57 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 4 Jul 2019 09:24:57 +0200 Subject: [neutron][ci] Gate broken Message-ID: <545F4CFE-2DAE-4B73-A47A-23FEC08C3AEC@redhat.com> Hi, Currently we have broken our functional and fullstack jobs due to patch [1] merged in Devstack. So functional and fullstack jobs are finishing with RETRY_LIMIT now. Fix is already proposed in [2]. So if Your patch failed on those jobs now, please don’t recheck it until [2] will be merged. [1] https://review.opendev.org/#/c/619562/ [2] https://review.opendev.org/#/c/669067/ — Slawek Kaplonski Senior software engineer Red Hat From moreira.belmiro.email.lists at gmail.com Thu Jul 4 08:14:40 2019 From: moreira.belmiro.email.lists at gmail.com (Belmiro Moreira) Date: Thu, 4 Jul 2019 10:14:40 +0200 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job In-Reply-To: References: Message-ID: Hi, we do have the issue of ironic instances taking a lot of time to start being created (The same Jason described). This is because the resource tracker takes >30 minutes to cycle (~2500 ironic nodes in one nova-compute). Meanwhile operations are "queue" until it finish. To speed up the resource tracker we use: https://review.opendev.org/#/c/637225/ We are working in shard the nova-compute for ironic. I think that is the right way to go. Considering the experience described by Jason we now increased the "update_resources_interval" to 24h. Yes, the "queue" issue disappeared. We will report back if you find some weird unexpected consequence. Belmiro CERN On Tue, Jun 11, 2019 at 5:56 PM Jason Anderson wrote: > Hi Surya, > > On 5/13/19 3:15 PM, Surya Seetharaman wrote: > > We faced the same problem at CERN when we upgraded to rocky (we have ~2300 > nodes on a single compute) like Eric said, and we set the > [compute]resource_provider_association_refresh to a large value (this > definitely helps by stopping the syncing of traits/aggregates and provider > tree cache info stuff in terms of chattiness with placement) and inspite of > that it doesn't scale that well for us. We still find the periodic task > taking too much of time which causes the locking to hold up the claim for > instances in BUILD state (the exact same problem you described). While one > way to tackle this like you said is to set the "update_resources_interval" > to a higher value - we were not sure how much out of sync things would get > with placement, so it will be interesting to see how this spans out for you > - another way out would be to use multiple computes and spread the nodes > around (though this is also a pain to maintain IMHO) which is what we are > looking into presently. > > I wanted to let you know that we've been running this way in production > for a few weeks now and it's had a noticeable improvement: instances are no > longer sticking in the "Build" stage, pre-networking, for ages. 
We were > able to track the improvement by comparing the Nova conductor logs ("Took > {seconds} to build the instance" vs "Took {seconds} to spawn the instance > on the hypervisor"; the delta should be as small as possible and in our > case went from ~30 minutes to ~1 minute.) There have been a few cases where > a resource provider claim got "stuck", but in practice it has been so > infrequent that it potentially has other causes. As such, I can recommend > increasing the interval time significantly. Currently we have it set to 6 > hours. > > I have not yet looked in to bringing in the other Nova patches used at > CERN (and available in Stein). I did take a look at updating the locking > mechanism, but do not have work to show for this yet. > > Cheers, > > /Jason > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ruslanas at lpic.lt Thu Jul 4 08:30:09 2019 From: ruslanas at lpic.lt (=?UTF-8?Q?Ruslanas_G=C5=BEibovskis?=) Date: Thu, 4 Jul 2019 10:30:09 +0200 Subject: tripleO undercloud import failing in stein Message-ID: Hi team, I am using CentOS7 with repos installed from stein package (centos-openstack-stein) in CentOS7. I just enable rdo-trunk-stein-tested because undercloud install do not work from centos-openstack-stein due to unable to mount some path, found that it is known and was fixed in stein tested. so I succeeded to boot and build everything. but when I use "time openstack --debug overcloud node import setup/hosts.yaml" step, it stops at: HTTP POST https://undercloud_public_host:13989/v2/executions 201 Started Mistral Workflow tripleo.baremetal.v1.register_or_update. Execution ID: 55f14621-cabd-4783-a611-d5b822ad0833 Waiting for messages on queue 'tripleo' with no timeout. I can reach iDRAC over ssh, and I can boot Physical HW to PXE. Physical HW state do not change, it do not change power state. current state is poweroff of physical hosts I am adding. any ideas? more info can be found here: https://ask.openstack.org/en/question/122870/valueerror-no-json-object-could-be-decoded-in-tripleo-stein/ -- Ruslanas Gžibovskis +370 6030 7030 -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Thu Jul 4 10:16:00 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 4 Jul 2019 05:16:00 -0500 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job In-Reply-To: References: Message-ID: <37D021C4-ED1B-4942-9C90-0A26FDE3DD76@fried.cc> > https://review.opendev.org/#/c/637225/ Ah heck, I had totally forgotten about that patch. If it's working for you, let me get it polished up and merged. We could probably justify backporting it too. Matt? efried > On Jul 4, 2019, at 03:14, Belmiro Moreira wrote: > > Hi, > we do have the issue of ironic instances taking a lot of time to start being created (The same Jason described). > This is because the resource tracker takes >30 minutes to cycle (~2500 ironic nodes in one nova-compute). Meanwhile operations are "queue" until it finish. > To speed up the resource tracker we use: https://review.opendev.org/#/c/637225/ > > We are working in shard the nova-compute for ironic. I think that is the right way to go. > > Considering the experience described by Jason we now increased the "update_resources_interval" to 24h. > Yes, the "queue" issue disappeared. > > We will report back if you find some weird unexpected consequence. 
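For reference, the tuning being discussed maps onto nova.conf on the nova-compute service that manages the ironic nodes roughly like this (option groups as commonly documented; 86400 is simply the 24-hour figure quoted above, not a general recommendation):

[DEFAULT]
update_resources_interval = 86400

[compute]
resource_provider_association_refresh = 86400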
> > Belmiro > CERN > >> On Tue, Jun 11, 2019 at 5:56 PM Jason Anderson wrote: >> Hi Surya, >> >>> On 5/13/19 3:15 PM, Surya Seetharaman wrote: >>> We faced the same problem at CERN when we upgraded to rocky (we have ~2300 nodes on a single compute) like Eric said, and we set the [compute]resource_provider_association_refresh to a large value (this definitely helps by stopping the syncing of traits/aggregates and provider tree cache info stuff in terms of chattiness with placement) and inspite of that it doesn't scale that well for us. We still find the periodic task taking too much of time which causes the locking to hold up the claim for instances in BUILD state (the exact same problem you described). While one way to tackle this like you said is to set the "update_resources_interval" to a higher value - we were not sure how much out of sync things would get with placement, so it will be interesting to see how this spans out for you - another way out would be to use multiple computes and spread the nodes around (though this is also a pain to maintain IMHO) which is what we are looking into presently. >> I wanted to let you know that we've been running this way in production for a few weeks now and it's had a noticeable improvement: instances are no longer sticking in the "Build" stage, pre-networking, for ages. We were able to track the improvement by comparing the Nova conductor logs ("Took {seconds} to build the instance" vs "Took {seconds} to spawn the instance on the hypervisor"; the delta should be as small as possible and in our case went from ~30 minutes to ~1 minute.) There have been a few cases where a resource provider claim got "stuck", but in practice it has been so infrequent that it potentially has other causes. As such, I can recommend increasing the interval time significantly. Currently we have it set to 6 hours. >> >> I have not yet looked in to bringing in the other Nova patches used at CERN (and available in Stein). I did take a look at updating the locking mechanism, but do not have work to show for this yet. >> >> Cheers, >> >> /Jason -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.rosser at rd.bbc.co.uk Thu Jul 4 10:30:45 2019 From: jonathan.rosser at rd.bbc.co.uk (Jonathan Rosser) Date: Thu, 4 Jul 2019 11:30:45 +0100 Subject: [magnum][queens] kume master error in clout-init-output.log In-Reply-To: References: Message-ID: <6fa4b20a-5023-8d97-0b5f-fc148b409ca7@rd.bbc.co.uk> Ignazio, You will something like this applied to Queens magnum https://review.opendev.org/#/c/637390/ which corrects the issue you have identified with the proxy environment variables not being exported. I would also suggest that you apply another patch https://review.opendev.org/#/c/667284 otherwise the proxy settings in the cluster template will interfere with the operation of magnum-conductor. Depending on your network environment you will have further issues without reverting that code. I would recommend using config in kolla to setup any http proxy you need for the magnum-conductor service - completely separate from any end user proxy settings that come from the cluster template. Hope this helps, Jon. On 04/07/2019 05:49, Ignazio Cassano wrote: > Hello, > the problem is that my externet network uses http proxy for downloading > package from internet > The file /etc/sysconfig/heat-params containes the HTTP_PROXY variable I > passed to the cluster template but it does not export it for part007 script > and etcd is not downloaded. 
> If I change /etc/sysconfig/heat-params and insert at the head > set -a > > and then I run part007 script, it works because reads the HTTP_PROXY > variable and can download etcd. > > This seems a bug. > Regards > Ignazio > > > Il giorno gio 4 lug 2019 alle ore 04:39 Feilong Wang < > feilong at catalyst.net.nz> ha scritto: > >> Hi Ignazio, >> >> Based on the error message you provided, the etcd cluster is not started >> successfully. You can run the script part-007 manually to debug what's the >> root cause. >> >> >> On 4/07/19 7:03 AM, Ignazio Cassano wrote: >> >> Hi All, >> I 've just installed openstack kolla queens and wen I try to run a >> kubernetes cluster, >> the master node reports the following in the cloud init log file: >> Cloud-init v. 0.7.9 running 'modules:config' at Wed, 03 Jul 2019 18:57:19 >> +0000. Up 20.40 seconds. >> + CA_FILE=/etc/pki/ca-trust/source/anchors/openstack-ca.pem >> + '[' -n '' ']' >> >> /var/lib/cloud/instance/scripts/part-007: line 57: /etc/etcd/etcd.conf: No >> such file or directory >> /var/lib/cloud/instance/scripts/part-007: line 70: /etc/etcd/etcd.conf: No >> such file or directory >> /var/lib/cloud/instance/scripts/part-007: line 86: /etc/etcd/etcd.conf: No >> such file or directory >> Cloud-init v. 0.7.9 running 'modules:final' at Wed, 03 Jul 2019 18:57:21 >> +0000. Up 21.93 seconds. >> 2019-07-03 18:59:23,121 - util.py[WARNING]: Failed running >> /var/lib/cloud/instance/scripts/part-007 [1] >> >> The image I am using is suggested in the openstack documentation: >> $ wget https://download.fedoraproject.org/pub/alt/atomic/stable/Fedora-Atomic-27-20180212.2/CloudImages/x86_64/images/Fedora-Atomic-27-20180212.2.x86_64.qcow2 >> >> The volume driver is cinder and the docker storage driver is overlay. >> >> The docker volume size is 5 GB and THE VOLUME SIZE OF THE FLAVOR IS 20gb. >> >> >> Seems image is missing some packeages :-( >> >> Anyone could help me, please ? >> >> Ignazio >> >> -- >> Cheers & Best regards, >> Feilong Wang (王飞龙) >> -------------------------------------------------------------------------- >> Senior Cloud Software Engineer >> Tel: +64-48032246 >> Email: flwang at catalyst.net.nz >> Catalyst IT Limited >> Level 6, Catalyst House, 150 Willis Street, Wellington >> -------------------------------------------------------------------------- >> >> > From kchamart at redhat.com Thu Jul 4 10:31:48 2019 From: kchamart at redhat.com (Kashyap Chamarthy) Date: Thu, 4 Jul 2019 12:31:48 +0200 Subject: [ops][nova] Quick show of hands: any use Intel (non-CMT) `perf` events? Message-ID: <20190704103148.GF19519@paraplu> Heya folks, While removing some dead code I was wondering if anyone here uses "non-CMT" (Cache Monitoring Technology) performance events events? I'm referring to the events here[0], besides the first three, which are CMT-related. Background ---------- The Intel CMT events (there are three of them) were deprecated during the Rocky release, in this[1] commit, and with this rationale: Upstream Linux kernel has deleted[*] the `perf` framework integration with Intel CMT (Cache Monitoring Technology; or "CQM" in Linux kernel parlance), because the feature was broken by design -- an incompatibility between Linux's `perf` infrastructure and Intel CMT hardware support. It was removed in upstream kernel version v4.14; but bear in mind that downstream Linux distributions with lower kernel versions than 4.14 have backported the said change. 
Nova supports monitoring of the above mentioned Intel CMT events (namely: 'cmt', 'mbm_local', and 'mbm_total') via the configuration attribute `[libvirt]/enabled_perf_events`. Given that the underlying Linux kernel infrastructure for Intel CMT is removed, we should remove support for it in Nova too. Otherwise enabling them in Nova, and updating to a Linux kernel 4.14 (or above) will result in instances failing to boot. To that end, deprecate support for the three Intel CMT events in "Rocky" release, with the intention to remove support for it in the upcoming "Stein" release. Note that we cannot deprecate / remove `enabled_perf_events` config attribute altogether -- since there are other[+] `perf` events besides Intel CMT. Whether anyone is using those other events with Nova is a good question to which we don't have an equally good answer for, if at all. Now we're removing[2] support for CMT events altogether. Question -------- What I'm wondering now is the answer to the last sentence in the above quoted commit: "Whether anyone is using those other events with Nova is a good question to which we don't have an equally good answer for, if at all". If we know that "no one" (as if we can tell for sure) is using them, we can get rid of more dead code. So, any operators using the non-CMT events from here[0]? [0] https://libvirt.org/formatdomain.html#elementsPerf [1] https://opendev.org/openstack/nova/commit/fc4794acc6 —libvirt: Deprecate support for monitoring Intel CMT `perf` events [2] https://review.opendev.org/669129 — libvirt: Remove support for Intel CMT `perf` event -- /kashyap From zhangbailin at inspur.com Thu Jul 4 10:45:29 2019 From: zhangbailin at inspur.com (=?gb2312?B?QnJpbiBaaGFuZyjVxbDZwdYp?=) Date: Thu, 4 Jul 2019 10:45:29 +0000 Subject: reply: [lists.openstack.org][nova] API updates week 19-26 Message-ID: <29c14b1113644f559b56a52c2db2dfb1@inspur.com> > Spec Ready for Review: 5. Add flavor group https://review.opendev.org/#/c/663563/ Brin Zhang Hello Everyone, Please find the Nova API updates of this week. API Related BP : ============ COMPLETED: 1. Support adding description while locking an instance: - https://blueprints.launchpad.net/nova/+spec/add-locked-reason Code Ready for Review: ------------------------------ 1. Add host and hypervisor_hostname flag to create server - Topic: https://review.opendev.org/#/q/topic:bp/add-host-and-hypervisor-hostname-fla g-to-create-server+(status:open+OR+status:merged) - Weekly Progress: patch is updated with review comment. ready for re-review. I will re-review it tomorrow. 2. Specifying az when restore shelved server - Topic: https://review.opendev.org/#/q/topic:bp/support-specifying-az-when-restore-s helved-server+(status:open+OR+status:merged) - Weekly Progress: Review comments is fixed and ready to re-review. 3. Nova API cleanup - Topic: https://review.opendev.org/#/c/666889/ - Weekly Progress: Code is up for review. A lot of files changed but should be ok to review. I have pushed a couple of patches for missing tests of previous microversions. 4. Detach and attach boot volumes: - Topic: https://review.openstack.org/#/q/topic:bp/detach-boot-volume+(status:open+OR +status:merged) - Weekly Progress: No Progress Spec Ready for Review: ----------------------------- 1. Nova API policy improvement - Spec: https://review.openstack.org/#/c/547850/ - PoC: https://review.openstack.org/#/q/topic:bp/policy-default-refresh+(status:ope n+OR+status:merged) - Weekly Progress: Under review and updates. 2. 
Support for changing deleted_on_termination after boot -Spec: https://review.openstack.org/#/c/580336/ - Weekly Progress: No update this week. 3. Support delete_on_termination in volume attach api -Spec: https://review.openstack.org/#/c/612949/ - Weekly Progress: No updates this week. 4. Add API ref guideline for body text - ~8 api-ref are left to fix. Previously approved Spec needs to be re-proposed for Train: --------------------------------------------------------------------------- 1. Servers Ips non-unique network names : - https://blueprints.launchpad.net/nova/+spec/servers-ips-non-unique-network-n ames - https://review.openstack.org/#/q/topic:bp/servers-ips-non-unique-network-nam es+(status:open+OR+status:merged) 2. Volume multiattach enhancements: - https://blueprints.launchpad.net/nova/+spec/volume-multiattach-enhancements - https://review.openstack.org/#/q/topic:bp/volume-multiattach-enhancements+(s tatus:open+OR+status:merged) Bugs: ==== No progress report in this week too. I will start the bug triage next week. NOTE- There might be some bug which is not tagged as 'api' or 'api-ref', those are not in the above list. Tag such bugs so that we can keep our eyes. -gmann From renat.akhmerov at gmail.com Thu Jul 4 10:52:16 2019 From: renat.akhmerov at gmail.com (Renat Akhmerov) Date: Thu, 4 Jul 2019 17:52:16 +0700 Subject: [Mistral] PTL is on vacation till July 17 In-Reply-To: <698d16c2-c7aa-4ffa-a6a9-d454a0cb8d55@Spark> References: <698d16c2-c7aa-4ffa-a6a9-d454a0cb8d55@Spark> Message-ID: Hi, Just letting you know that I’ll be on vacation till July 17. With all the urgent requests/issue please contact Dougal Matthews (d0gal), Oleg Ovcharuk (vgoleg) or Adriano Petrich (apetrich). Thanks Renat Akhmerov @Nokia -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Jul 4 10:58:10 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 4 Jul 2019 12:58:10 +0200 Subject: [neutron][ci] Gate broken In-Reply-To: <545F4CFE-2DAE-4B73-A47A-23FEC08C3AEC@redhat.com> References: <545F4CFE-2DAE-4B73-A47A-23FEC08C3AEC@redhat.com> Message-ID: <390CD4A3-78D1-4C42-8F4F-DE6A2E2CFFCA@redhat.com> Hi, Fix is merged now. You can now recheck Your patches :) > On 4 Jul 2019, at 09:24, Slawek Kaplonski wrote: > > Hi, > > Currently we have broken our functional and fullstack jobs due to patch [1] merged in Devstack. > So functional and fullstack jobs are finishing with RETRY_LIMIT now. > Fix is already proposed in [2]. > So if Your patch failed on those jobs now, please don’t recheck it until [2] will be merged. > > [1] https://review.opendev.org/#/c/619562/ > [2] https://review.opendev.org/#/c/669067/ > > — > Slawek Kaplonski > Senior software engineer > Red Hat > — Slawek Kaplonski Senior software engineer Red Hat From ignaziocassano at gmail.com Thu Jul 4 11:18:14 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 4 Jul 2019 13:18:14 +0200 Subject: [magnum][queens] kume master error in clout-init-output.log In-Reply-To: <6fa4b20a-5023-8d97-0b5f-fc148b409ca7@rd.bbc.co.uk> References: <6fa4b20a-5023-8d97-0b5f-fc148b409ca7@rd.bbc.co.uk> Message-ID: Many thanks Ignazio Il giorno gio 4 lug 2019 alle ore 12:38 Jonathan Rosser < jonathan.rosser at rd.bbc.co.uk> ha scritto: > Ignazio, > > You will something like this applied to Queens magnum > https://review.opendev.org/#/c/637390/ which corrects the issue you have > identified with the proxy environment variables not being exported. 
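For anyone hitting this before those patches are applied, the stop-gap described earlier in the thread amounts to something like this on an affected master node (a sketch of the manual workaround only, not the upstream fix in the review above):

$ sudo sed -i '1i set -a' /etc/sysconfig/heat-params
$ sudo bash -x /var/lib/cloud/instance/scripts/part-007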
> > I would also suggest that you apply another patch > https://review.opendev.org/#/c/667284 otherwise the proxy settings in > the cluster template will interfere with the operation of > magnum-conductor. Depending on your network environment you will have > further issues without reverting that code. > > I would recommend using config in kolla to setup any http proxy you need > for the magnum-conductor service - completely separate from any end user > proxy settings that come from the cluster template. > > Hope this helps, > Jon. > > On 04/07/2019 05:49, Ignazio Cassano wrote: > > Hello, > > the problem is that my externet network uses http proxy for downloading > > package from internet > > The file /etc/sysconfig/heat-params containes the HTTP_PROXY variable I > > passed to the cluster template but it does not export it for part007 > script > > and etcd is not downloaded. > > If I change /etc/sysconfig/heat-params and insert at the head > > set -a > > > > and then I run part007 script, it works because reads the HTTP_PROXY > > variable and can download etcd. > > > > This seems a bug. > > Regards > > Ignazio > > > > > > Il giorno gio 4 lug 2019 alle ore 04:39 Feilong Wang < > > feilong at catalyst.net.nz> ha scritto: > > > >> Hi Ignazio, > >> > >> Based on the error message you provided, the etcd cluster is not started > >> successfully. You can run the script part-007 manually to debug what's > the > >> root cause. > >> > >> > >> On 4/07/19 7:03 AM, Ignazio Cassano wrote: > >> > >> Hi All, > >> I 've just installed openstack kolla queens and wen I try to run a > >> kubernetes cluster, > >> the master node reports the following in the cloud init log file: > >> Cloud-init v. 0.7.9 running 'modules:config' at Wed, 03 Jul 2019 > 18:57:19 > >> +0000. Up 20.40 seconds. > >> + CA_FILE=/etc/pki/ca-trust/source/anchors/openstack-ca.pem > >> + '[' -n '' ']' > >> > >> /var/lib/cloud/instance/scripts/part-007: line 57: /etc/etcd/etcd.conf: > No > >> such file or directory > >> /var/lib/cloud/instance/scripts/part-007: line 70: /etc/etcd/etcd.conf: > No > >> such file or directory > >> /var/lib/cloud/instance/scripts/part-007: line 86: /etc/etcd/etcd.conf: > No > >> such file or directory > >> Cloud-init v. 0.7.9 running 'modules:final' at Wed, 03 Jul 2019 18:57:21 > >> +0000. Up 21.93 seconds. > >> 2019-07-03 18:59:23,121 - util.py[WARNING]: Failed running > >> /var/lib/cloud/instance/scripts/part-007 [1] > >> > >> The image I am using is suggested in the openstack documentation: > >> $ wget > https://download.fedoraproject.org/pub/alt/atomic/stable/Fedora-Atomic-27-20180212.2/CloudImages/x86_64/images/Fedora-Atomic-27-20180212.2.x86_64.qcow2 > >> > >> The volume driver is cinder and the docker storage driver is overlay. > >> > >> The docker volume size is 5 GB and THE VOLUME SIZE OF THE FLAVOR IS > 20gb. > >> > >> > >> Seems image is missing some packeages :-( > >> > >> Anyone could help me, please ? > >> > >> Ignazio > >> > >> -- > >> Cheers & Best regards, > >> Feilong Wang (王飞龙) > >> > -------------------------------------------------------------------------- > >> Senior Cloud Software Engineer > >> Tel: +64-48032246 > >> Email: flwang at catalyst.net.nz > >> Catalyst IT Limited > >> Level 6, Catalyst House, 150 Willis Street, Wellington > >> > -------------------------------------------------------------------------- > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ekuvaja at redhat.com Thu Jul 4 11:55:33 2019 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Thu, 4 Jul 2019 12:55:33 +0100 Subject: [Glance] No meeting today 4th of July Message-ID: Hi all, We had no proposed agenda for today so Happy Independence Day for all our American fellows and we will have meeting again next Thu the 11th. Best, Erno jokke Kuvaja -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Thu Jul 4 15:04:46 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 4 Jul 2019 16:04:46 +0100 (BST) Subject: [placement] [tc] PTL not going to summit? Message-ID: I've decided [1] that I'm going to resist going to tech conferences and summits in particular by air travel, unless it becomes an existential issue. For a variety of reasons described in [1] and the two other posts it links to. I know for some people that this will present some concerns about my efficacy as PTL of placement, so I thought I better mention it sooner than later so we can make some decisions on how we want to proceed: * Should I take myself out of the running for U PTL? In which case we need to get a shadow warmed up. * If I'm not there, but still PTL, should we still hold the usual regularly placement things (project update and onboarding, PTG things)? In which case we need to prepare chairs/moderators for those sessions. Given my positions on the exclusive properties of conferences, I'd prefer that we turn update and onboarding activities into asynchronous, document-oriented affairs that anyone can utilize at any time, not just those wanting and able to go to summit. Similarly, given how useful the placement pre-PTG was for Train, the very small amount of time spent discussing placement issues in person in Denver, and the team's strategy of focusing on a relatively small number of changes, I'm not certain we need "time" in Shanghai. We can spread that work out. Thoughts or questions from anyone? [1] https://anticdent.org/remote-maintainership.html -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From mark at stackhpc.com Thu Jul 4 15:16:41 2019 From: mark at stackhpc.com (Mark Goddard) Date: Thu, 4 Jul 2019 16:16:41 +0100 Subject: [placement] [tc] PTL not going to summit? In-Reply-To: References: Message-ID: On Thu, 4 Jul 2019 at 16:06, Chris Dent wrote: > > I've decided [1] that I'm going to resist going to tech conferences > and summits in particular by air travel, unless it becomes an > existential issue. For a variety of reasons described in [1] and the > two other posts it links to. > > I know for some people that this will present some concerns about my > efficacy as PTL of placement, so I thought I better mention it > sooner than later so we can make some decisions on how we want to > proceed: > > * Should I take myself out of the running for U PTL? In which case > we need to get a shadow warmed up. > > * If I'm not there, but still PTL, should we still hold the usual > regularly placement things (project update and onboarding, PTG > things)? In which case we need to prepare chairs/moderators for > those sessions. > > Given my positions on the exclusive properties of conferences, I'd > prefer that we turn update and onboarding activities into > asynchronous, document-oriented affairs that anyone can utilize at > any time, not just those wanting and able to go to summit. 
> > Similarly, given how useful the placement pre-PTG was for Train, the > very small amount of time spent discussing placement issues in > person in Denver, and the team's strategy of focusing on a > relatively small number of changes, I'm not certain we need "time" > in Shanghai. We can spread that work out. > > Thoughts or questions from anyone? > Thanks for speaking up about your feelings on this. I'm not going so far as to say I won't fly to conferences, but one intercontinental flight in a year for conferences seems like more than enough to me. I won't be attending the Shanghai summit, and environmental factors play a role in this decision (as will an imminent newborn), so I'll watch how this plays out with interest. Mark > > [1] https://anticdent.org/remote-maintainership.html > > -- > Chris Dent ٩◔̯◔۶ https://anticdent.org/ > freenode: cdent -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Thu Jul 4 15:45:09 2019 From: emilien at redhat.com (Emilien Macchi) Date: Thu, 4 Jul 2019 11:45:09 -0400 Subject: [placement] [tc] PTL not going to summit? In-Reply-To: References: Message-ID: On Thu, Jul 4, 2019 at 11:19 AM Chris Dent wrote: > > I've decided [1] that I'm going to resist going to tech conferences > and summits in particular by air travel, unless it becomes an > existential issue. For a variety of reasons described in [1] and the > two other posts it links to. > > I know for some people that this will present some concerns about my > efficacy as PTL of placement, so I thought I better mention it > sooner than later so we can make some decisions on how we want to > proceed: > > * Should I take myself out of the running for U PTL? In which case > we need to get a shadow warmed up. > > * If I'm not there, but still PTL, should we still hold the usual > regularly placement things (project update and onboarding, PTG > things)? In which case we need to prepare chairs/moderators for > those sessions. > > Given my positions on the exclusive properties of conferences, I'd > prefer that we turn update and onboarding activities into > asynchronous, document-oriented affairs that anyone can utilize at > any time, not just those wanting and able to go to summit. > > Similarly, given how useful the placement pre-PTG was for Train, the > very small amount of time spent discussing placement issues in > person in Denver, and the team's strategy of focusing on a > relatively small number of changes, I'm not certain we need "time" > in Shanghai. We can spread that work out. > > Thoughts or questions from anyone? > > [1] https://anticdent.org/remote-maintainership.html > Thanks a ton Chris for speaking up on that topic. I also share your opinions in this blog. I've talked with a few TripleO contributors and a bunch of us won't go to China either (for different reasons). Instead, I think we'll try to make progress in our asynchronous collaboration and eventually organize a virtual meetup if needed. Also, in my humble opinion you shouldn't step out of PTL role just because you won't go to the next conference. I think it's part of the PTL role to find out how to make the collaboration happen without barriers, no matter where you are in the world. Thanks for all your hard work on Placement and I hope you'll make the right decision for you and the project. -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From smooney at redhat.com Thu Jul 4 16:27:43 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 04 Jul 2019 17:27:43 +0100 Subject: [placement] [tc] PTL not going to summit? In-Reply-To: References: Message-ID: <2841fd06a30a9920a4cd6af5c5eee19863e6f27a.camel@redhat.com> On Thu, 2019-07-04 at 11:45 -0400, Emilien Macchi wrote: > On Thu, Jul 4, 2019 at 11:19 AM Chris Dent wrote: > > > > > I've decided [1] that I'm going to resist going to tech conferences > > and summits in particular by air travel, unless it becomes an > > existential issue. For a variety of reasons described in [1] and the > > two other posts it links to. > > > > I know for some people that this will present some concerns about my > > efficacy as PTL of placement, so I thought I better mention it > > sooner than later so we can make some decisions on how we want to > > proceed: > > > > * Should I take myself out of the running for U PTL? In which case > > we need to get a shadow warmed up. > > > > * If I'm not there, but still PTL, should we still hold the usual > > regularly placement things (project update and onboarding, PTG > > things)? In which case we need to prepare chairs/moderators for > > those sessions. > > > > Given my positions on the exclusive properties of conferences, I'd > > prefer that we turn update and onboarding activities into > > asynchronous, document-oriented affairs that anyone can utilize at > > any time, not just those wanting and able to go to summit. > > > > Similarly, given how useful the placement pre-PTG was for Train, the > > very small amount of time spent discussing placement issues in > > person in Denver, and the team's strategy of focusing on a > > relatively small number of changes, I'm not certain we need "time" > > in Shanghai. We can spread that work out. > > > > Thoughts or questions from anyone? > > > > [1] https://anticdent.org/remote-maintainership.html > > > > Thanks a ton Chris for speaking up on that topic. I also share your > opinions in this blog. > I've talked with a few TripleO contributors and a bunch of us won't go to > China either (for different reasons). > > Instead, I think we'll try to make progress in our asynchronous > collaboration and eventually organize a virtual meetup if needed. > Also, in my humble opinion you shouldn't step out of PTL role just because > you won't go to the next conference. I think it's part of the PTL role to > find out how to make the collaboration happen without barriers, no matter > where you are in the world. Thanks for all your hard work on Placement and > I hope you'll make the right decision for you and the project. on this point i also dont think you would be the firsts ptl to not by able to attend ptg in person. i rememebr one of the past cyborge PTLs was not able to attend in person i belive in the first denver ptg and instead joined over voice chat to listen and replied mainly over etherpad to the discussion. i have remote attened 2 kolla midcycle in the past too were similarly by have a listen only audio stream and replying on etherpad as things were bing disucssed i was able to follow then conversation so i personally would not see an issue with a PTL delegating project updates or onbodaing to a core memeber that is present or chossing to hold a virtual ptg meetup. i belive there is still value in having in face discussion but including remotees in those discussion is something that i think we should try for too. 
So I don't think it should be a requirement to be there in person to be PTL, as long as your project can effectively discuss and organise its work within itself and with other projects. From fungi at yuggoth.org Thu Jul 4 17:19:59 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 4 Jul 2019 17:19:59 +0000 Subject: [placement] [tc] PTL not going to summit? In-Reply-To: References: Message-ID: <20190704171959.kbdchzlj7c72exkj@yuggoth.org> On 2019-07-04 16:04:46 +0100 (+0100), Chris Dent wrote: > I've decided [1] that I'm going to resist going to tech conferences > and summits in particular by air travel, unless it becomes an > existential issue. For a variety of reasons described in [1] and the > two other posts it links to. > > I know for some people that this will present some concerns about my > efficacy as PTL of placement [...] As Sean notes in his reply, you'd be far from the first PTL (or TC member or other community leader for that matter) to not attend a summit/forum/PTG. I've never heard anyone suggest this was a hard requirement for OpenStack contributors, whether or not they hold leadership roles. There are plenty of good reasons not to attend, up to and including simply not wanting to be there. I see no problem with that whatsoever. > Given my positions on the exclusive properties of conferences, I'd > prefer that we turn update and onboarding activities into > asynchronous, document-oriented affairs that anyone can utilize at > any time, not just those wanting and able to go to summit. [...] The project updates have served as an opportunity to get video content about projects distributed, since these are typically recorded and post-processed by professional videographers. There have been suggestions of doing the same for onboarding sessions, but this has not happened in the past due to the additional cost required. Producing your own recordings seems like a viable alternative to me, and has also been suggested by a number of folks in the past. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From feilong at catalyst.net.nz Thu Jul 4 02:28:35 2019 From: feilong at catalyst.net.nz (Feilong Wang) Date: Thu, 4 Jul 2019 14:28:35 +1200 Subject: [queens][magnuma] kubernetes cluster In-Reply-To: References: <765458860.27505.1562167876903@ox.servinga.com> Message-ID: Hi Ignazio, We fixed a lot of issues in Rocky and Stein, some of which can't be easily backported to Queens. Magnum has a very loose dependency on other services, so I would suggest using Rocky or Stein if that's possible for you. As for your issue, the error means your kube-apiserver didn't start successfully. You can take a look at the cloud-init log for more information. On 4/07/19 4:37 AM, Ignazio Cassano wrote: > Thanks Denis, but I think there is another problem: on the kube master > port 8080 is not listening, probably some services are not started > Regards > Ignazio > > Il giorno mer 3 lug 2019 alle ore 17:31 Denis Pascheka > > ha scritto: > > Hi Ignazio,  > > in Queens there is an issue within Magnum which has been resolved > in the Rocky release. > Take a look at this file:  > https://github.com/openstack/magnum/blob/stable/rocky/magnum/drivers/common/templates/kubernetes/fragments/wc-notify-master.sh. > > The execution of the curl command in row 16 needs to be escaped > with a backslash.
You can achieve this by building your own > magnum containers >  and > adding an template override >  to > it where you add your fixed/own wc-notify-master.sh script from > the plugin directory > . > > > Best Regards, > > *Denis Pascheka* > Cloud Architect > > t: +49 (69) 348 75 11 12 > m: +49 (170) 495 6364 > e: dp at servinga.com > servinga GmbH > Mainzer Landstr. 351-353 > 60326 Frankfurt > > > > > > *www.servinga.com * > Amtsgericht Frankfurt am Main - HRB 91418 - Geschäftsführer Adam > Lakota, Christian Lertes > >> Ignazio Cassano > > hat am 3. Juli 2019 um 16:58 >> geschrieben: >> >> >> Hi All, >> I just installed openstack kolla queens with magnum but trying to >> create a kubernetes cluster the master nodes does not terminate >> installation: it loops with the following message: >> >> curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> >> Anyone can help ? >> Best Regards >> Ignazio > >   > -- Cheers & Best regards, Feilong Wang (王飞龙) -------------------------------------------------------------------------- Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang at catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington -------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: From kklimonda at syntaxhighlighted.com Thu Jul 4 23:12:49 2019 From: kklimonda at syntaxhighlighted.com (Krzysztof Klimonda) Date: Fri, 5 Jul 2019 01:12:49 +0200 Subject: [horizon] Unable to boot instance from volume if image size doesn't fit in flavor root disk size Message-ID: <35A8A85A-AA78-49DF-A1AB-2AB4046438B9@syntaxhighlighted.com> Hi, In order to avoid creating spurious flavors just to match image size requirements, I’d like to be able to spawn instances from volume even if the flavor disk is too small - this is possible via CLI/API but Horizon doesn’t let me launch the instance unless either flavor root disk is sized correctly, or the root disk size is 0(?). Is that a bug in Horizon, or is there something that I’m missing? 
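For reference, the CLI path that works looks roughly like this (a sketch with placeholder names and sizes): the volume is created from the image first and the server boots from that volume, so the flavor's root disk size never comes into play:

  openstack volume create --image cirros-0.3.4 --size 20 boot-vol
  openstack server create --flavor m1.tiny --volume boot-vol --network private vm-from-vol

The same thing should be possible from Horizon's "Launch Instance" dialog by choosing "Create New Volume" as the boot source, which is where the size check gets in the way.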
Regards, -Chris From massimo.sgaravatto at gmail.com Fri Jul 5 06:45:02 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Fri, 5 Jul 2019 08:45:02 +0200 Subject: [ops] [nova] [placement] Mismatch between allocations and instances Message-ID: I tried to check the allocations on each compute node of a Ocata cloud, using the command: curl -s ${PLACEMENT_ENDPOINT}/resource_providers/${UUID}/allocations -H "x-auth-token: $TOKEN" | python -m json.tool I found that, on a few compute nodes, there are some instances for which there is not a corresponding allocation. On another Rocky cloud, we had the opposite problem: there were allocations also for some instances that didn't exist anymore. And this caused problems since we were not able to use all the resources of the relevant compute nodes: we had to manually remove the fwrong" allocations to fix the problem ... I wonder why/how this problem can happen ... And how can we fix the issue ? Should we manually add the missing allocations / manually remove the wrong ones ? Thanks, Massimo -------------- next part -------------- An HTML attachment was scrubbed... URL: From e0ne at e0ne.info Fri Jul 5 06:57:59 2019 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Fri, 5 Jul 2019 09:57:59 +0300 Subject: [horizon] Unable to boot instance from volume if image size doesn't fit in flavor root disk size In-Reply-To: <35A8A85A-AA78-49DF-A1AB-2AB4046438B9@syntaxhighlighted.com> References: <35A8A85A-AA78-49DF-A1AB-2AB4046438B9@syntaxhighlighted.com> Message-ID: Hi Chris, It sounds like a horizon bug (TBH, I didn't try to reproduce it yet), so feel free to file a bug in the Launchpad. Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ On Fri, Jul 5, 2019 at 2:15 AM Krzysztof Klimonda < kklimonda at syntaxhighlighted.com> wrote: > Hi, > > In order to avoid creating spurious flavors just to match image size > requirements, I’d like to be able to spawn instances from volume even if > the flavor disk is too small - this is possible via CLI/API but Horizon > doesn’t let me launch the instance unless either flavor root disk is sized > correctly, or the root disk size is 0(?). > > Is that a bug in Horizon, or is there something that I’m missing? > > Regards, > -Chris > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Fri Jul 5 07:33:03 2019 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 5 Jul 2019 08:33:03 +0100 Subject: [kolla][kayobe] vote: kayobe as a kolla deliverable In-Reply-To: References: Message-ID: On Thu, 20 Jun 2019 at 14:40, Mark Goddard wrote: > Hi, > > In the most recent kolla meeting [1] we discussed the possibility of > kayobe becoming a deliverable of the kolla project. This follows on > from discussion at the PTG and then on here [3]. > > The two options discussed are: > > 1. become a deliverable of the Kolla project > 2. become an official top level OpenStack project > > There has been some positive feedback about option 1 and no negative > feedback that I am aware of. I would therefore like to ask the kolla > community to vote on whether to include kayobe as a deliverable of the > kolla project. The electorate is the kolla-core and kolla-ansible core > teams, excluding me. The opinion of others in the community is also > welcome. > > If you have questions or feedback, please respond to this email. > > Once you have made a decision, please respond with your answer to the > following question: > > "Should kayobe become a deliverable of the kolla project?" 
(yes/no) > > This vote has been open for over two weeks, and has had a number of positive responses and no negative responses. I will therefore make the necessary changes to add kayobe as a deliverable of the kolla project. Thank you for your consideration. Thanks, > Mark > > [1] > http://eavesdrop.openstack.org/meetings/kolla/2019/kolla.2019-06-19-15.00.log.html#l-120 > [2] https://etherpad.openstack.org/p/kolla-train-ptg > [3] > http://lists.openstack.org/pipermail/openstack-discuss/2019-June/006901.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at stackhpc.com Fri Jul 5 07:39:18 2019 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 5 Jul 2019 08:39:18 +0100 Subject: [kolla] Proposing yoctozepto as core In-Reply-To: References: Message-ID: On Fri, 28 Jun 2019 at 12:50, Mark Goddard wrote: > Hi, > > I would like to propose adding Radosław Piliszek (yoctozepto) to > kolla-core and kolla-ansible-core. While he has only recently started > working upstream in the project, I feel that he has made some valuable > contributions already, particularly around improving and maintaining > CI. His reviews generally provide useful feedback, sometimes in > advance of Zuul! > > Core team - please vote, and consider this my +1. I will keep this > vote open for a week or until all cores have responded. > This has been open for a week, and has had only positive responses. Welcome to the team yoctozepto! In less happy news, I will be removing zhubingbing and Duong Ha-Quang from the kolla-core and kolla-ansible-core groups. Thanks to both for their contributions to the project over the years. Mark > Cheers, > Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From e0ne at e0ne.info Fri Jul 5 07:51:41 2019 From: e0ne at e0ne.info (Ivan Kolodyazhny) Date: Fri, 5 Jul 2019 10:51:41 +0300 Subject: [horizon] PTL on vacation Message-ID: Hi team, I'll be on a vacation until July, 19th with a very limited internet connection. During the last weekly meeting [1] we agreed to skip the next two. I'll try to check my mail during the PTO but for any issues, I recommend to ask our great core team [2] via emails or IRC. [1] http://eavesdrop.openstack.org/meetings/horizon/2019/horizon.2019-07-03-15.08.log.html [2] https://review.opendev.org/#/admin/groups/43,members Regards, Ivan Kolodyazhny, http://blog.e0ne.info/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From raubvogel at gmail.com Fri Jul 5 11:58:09 2019 From: raubvogel at gmail.com (Mauricio Tavares) Date: Fri, 5 Jul 2019 07:58:09 -0400 Subject: [nova] type-PCI vs type-PF vs type-VF Message-ID: Quick questions: 1. Is there some official documentation comparing them? Only thing I found so far was https://mohamede.com/2019/02/07/pci-passthrough-type-pf-type-vf-and-type-pci/ 2. Am I correct to assume (I seem to do a lot of that; be afraid) that when nova starts populating the database entry for a given physical node (say a computer node or a ironic baremetal node) it looks through the pci list and decides what type of device (type-PCI, type-PF, or type-VF) it is? What is the criteria? 3. https://mohamede.com/2019/02/07/pci-passthrough-type-pf-type-vf-and-type-pci/ implies that nova sees the 3 different types, well, differently where it will try hard to virtualize type-PF and type-VF and really really wants me to setup sr-iov on them ( https://docs.openstack.org/nova/latest/admin/pci-passthrough.html seem to also want sr-iov) really . Is that correct? 
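For context, the capabilities that nova would be looking at can be inspected directly through libvirt; a rough sketch (the PCI address below is a placeholder):

  # list PCI devices known to libvirt, then dump one of them
  virsh nodedev-list --cap pci
  virsh nodedev-dumpxml pci_0000_3b_00_0
  # an SR-IOV PF shows a nested <capability type='virt_functions'> element,
  # a VF shows <capability type='phys_function'>, a plain device shows neither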
-------------- next part -------------- An HTML attachment was scrubbed... URL: From william at ic.unicamp.br Thu Jul 4 21:58:17 2019 From: william at ic.unicamp.br (William Lima Reiznautt) Date: Thu, 04 Jul 2019 18:58:17 -0300 Subject: [nova] consoleauth token ttl Message-ID: <20190704185817.Horde.kZVzaR7qWwNvYreX3yw-Bae@webmail2.ic.unicamp.br> Hello folks, Some known about this configuration token_ttl on [consoleauth] section: Is it on second or minutes ? At documentation is not information. Sorry the question here. -- -- William Lima Reiznautt IC/UNICAMP - Telefone: +55 (19) 3521-5915 --- chavepgp: http://www.ic.unicamp.br/~william/william.pgp From arteom.lap at yandex.ru Fri Jul 5 06:19:53 2019 From: arteom.lap at yandex.ru (=?utf-8?B?0JvQsNC/0LjQvSDQkNGA0YLQtdC8?=) Date: Fri, 05 Jul 2019 09:19:53 +0300 Subject: [Mistral] Guaranteed notification delivery Message-ID: <13379051562307593@iva3-4a4d8f90d111.qloud-c.yandex.net> Good day, please look at https://blueprints.launchpad.net/mistral/+spec/guaranteed-notifies. I would like to know your opinion about this feature. From arteom.lap at yandex.ru Fri Jul 5 09:06:10 2019 From: arteom.lap at yandex.ru (=?utf-8?B?0JvQsNC/0LjQvSDQkNGA0YLQtdC8?=) Date: Fri, 05 Jul 2019 12:06:10 +0300 Subject: [Mistral] Guaranteed notification delivery Message-ID: <25961801562317570@sas2-a1efad875d04.qloud-c.yandex.net> Good day, please look at https://blueprints.launchpad.net/mistral/+spec/guaranteed-notifies. I would like to know your opinion about this feature. From cdent+os at anticdent.org Fri Jul 5 13:01:40 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 5 Jul 2019 14:01:40 +0100 (BST) Subject: [placement] update 19-26 Message-ID: HTML: https://anticdent.org/placement-update-19-26.html Pupdate 19-26. Next week is R-15, two weeks until Train milestone 2. # Most Important The [spec for nested magic](https://docs.openstack.org/placement/latest/specs/train/approved/2005575-nested-magic-1.html) merged and significant progress has been made in the implementation. That work is nearly ready to merge (see below), after a few more reviews. Once that happens one of our most important tasks will be experimenting with that code to make sure it fully addresses the uses cases, has proper documentation (including "how do I use this?"), and is properly evaluated for performance and maintainability. # What's Changed * The implementation for mappings in allocation candidates had a bug which Eric [found](https://review.opendev.org/668302) and fixed and then I realized there was a [tidier way to do it](https://review.opendev.org/668724). This then led to the `same_subtree` work needing to manage less information, because it was already there. * The spec for [Consumer Types](https://docs.openstack.org/placement/latest/specs/train/approved/2005473-support-consumer-types.html) merged and work [has started](https://review.opendev.org/669170). * We're using os-traits 0.15.0 now. * There's a [framework in place](https://review.opendev.org/665695) for nested resource provider peformance testing. We need to update the provider topology to reflect real world situations (more on that below). * The `root_required` query parameter on `GET /allocation_candidates` has been merged as [microversion 1.35](https://docs.openstack.org/placement/latest/placement-api-microversion-history.html#support-root-required-queryparam-on-get-allocation-candidates). 
* I've sent [an email](http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007527.html) announcing my intent to not go to the Shangai (or any other) summit, and what changes that could imply for how Placement does the PTG. # Specs/Features All placement specs have merged. Thanks to everyone for the frequent reviews and quick followups. We've been maintaining some good velocity. Some non-placement specs are listed in the Other section below. # Stories/Bugs (Numbers in () are the change since the last pupdate.) There are 23 (3) stories in [the placement group](https://storyboard.openstack.org/#!/project_group/placement). 0 (0) are [untagged](https://storyboard.openstack.org/#!/worklist/580). 2 (-2) are [bugs](https://storyboard.openstack.org/#!/worklist/574). 5 (0) are [cleanups](https://storyboard.openstack.org/#!/worklist/575). 11 (0) are [rfes](https://storyboard.openstack.org/#!/worklist/594). 4 (1) are [docs](https://storyboard.openstack.org/#!/worklist/637). If you're interested in helping out with placement, those stories are good places to look. * Placement related nova [bugs not yet in progress](https://goo.gl/TgiPXb) on launchpad: 16 (0). * Placement related nova [in progress bugs](https://goo.gl/vzGGDQ) on launchpad: 4 (-1). # osc-placement osc-placement is currently behind by 11 microversions. * Add support for multiple member_of. # Main Themes ## Nested Magic These are the features required by modern nested resource provider use cases. We've merged mappings in allocation candidates and `root_required`. `same_subtree` and resourceless request groups are what's left and they are in: * Support `same_subtree` queryparam ## Consumer Types Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. * ## Cleanup Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things. As mentioned above, one of the important cleanup tasks that is not yet in progress is updating the [gabbit](https://opendev.org/openstack/placement/src/branch/master/gate/gabbits/nested-perfload.yaml) that creates the nested topology that's used in nested performance testing. The topology there is simple, unrealistic, and doesn't sufficiently exercise the several features that may be used during a query that desires a nested response. Recently I've been seeing that the `placement-perfload` job is giving results that vary between `N` and `N*2` (usually .5 and 1 seconds) and the difference that I can discern is the type of CPUs being presented by the host (same number of CPUs (8) but different type). This supports something we've been theorizing for a while: when dealing with large result sets we are CPU bound processing the several large result sets returned by the database. Further profiling required… Another cleanup that needs to start is satisfying the community wide goal of [PDF doc generation](https://storyboard.openstack.org/#!/story/2006110). # Other Placement Miscellaneous changes can be found in [the usual place](https://review.opendev.org/#/q/project:openstack/placement+status:open). There are three [os-traits changes](https://review.opendev.org/#/q/project:openstack/os-traits+status:open) being discussed. 
And one [os-resource-classes change](https://review.opendev.org/#/q/project:openstack/os-resource-classes+status:open). # Other Service Users New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed. * Nova: nova-manage: heal port allocations * nova-spec: Allow compute nodes to use DISK_GB from shared storage RP * Cyborg: Placement report * helm: WIP: add placement chart * Nova: Use OpenStack SDK for placement * Nova: Spec: Provider config YAML file * libvirt: report pmem namespaces resources by provider tree * Nova: Remove PlacementAPIConnectFailure handling from AggregateAPI * Nova: support move ops with qos ports * Nova: get_ksa_adapter: nix by-service-type confgrp hack * OSA: Add nova placement to placement migration * Nova: Defaults missing group_policy to 'none' * Blazar: Create placement client for each request * tempest: Define the Integrated-gate-networking gate template * tempest: Define the Integrated-gate-placement gate template * Nova: Restore RT.old_resources if ComputeNode.save() fails * Remove assumption of http error if consumer not exists * TripleO: Add new parameter NovaSchedulerLimitTenantsToPlacementAggregate * puppet-nova: Expose limit_tenants_to_placement_aggregate parameter * nova: Support filtering of hosts by forbidden aggregates * blazar: Send global_request_id for tracing calls * nova: Implement update_provider_tree for hyperv * watcher: Improve Compute Data Model * Nova: pre filter disable computes * Nova: Update HostState.\*\_allocation_ratio earlier # End This space left intentionally blank. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From cdent+os at anticdent.org Fri Jul 5 13:53:24 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 5 Jul 2019 14:53:24 +0100 (BST) Subject: [placement] db query analysis In-Reply-To: <07c6e111-d6a9-d175-7014-aff832c6e9c7@gmail.com> References: <07c6e111-d6a9-d175-7014-aff832c6e9c7@gmail.com> Message-ID: On Mon, 24 Jun 2019, Jay Pipes wrote: >> Related to that, I've started working on a nested-perfload at >> https://review.opendev.org/665695 > > Please note that there used to be fewer queries performed in the allocation > candidate and get resource provider functions. We replaced the giant SQL > statements with multiple smaller SQL statements to assist in debuggability > and tracing. Yeah, I recall that and think it was the right thing to do, but now we've reached a point where we've got limited choices for how to eke out some more performance, which we're going to need to do to be reasonable in > 10,000 node environments. Being me I'm inclined towards things like infinite resource classes, vector math, and "PUT IT ALL IN RAM !!!!!" but luckily its not up to just me. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From ralonsoh at redhat.com Fri Jul 5 16:29:55 2019 From: ralonsoh at redhat.com (Rodolfo Alonso) Date: Fri, 05 Jul 2019 17:29:55 +0100 Subject: [neutron][requirements] Pyroute2 stable/queens upper version (0.4.21) has a memory leak Message-ID: Hello folks: As reported in [1], we have found a memory leak in Pyroute stable/queens upper version (0.4.21). This memory leak is reproducible both with Python 2.7 and Python 3.6 (I didn't test 3.5 or 3.7). The script used is [2]. Using "pmap" to read the process memory map (specifically the "total" value), we can see an increase of several MB per minute. 
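For illustration, the check loop is essentially the following (a sketch; $AGENT_PID stands for the process ID of the monitored agent):

  while true; do
      echo "$(date +%T) $(pmap $AGENT_PID | tail -n 1)"
      sleep 60
  done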
This problem is not present in version 0.5.2 (stable/rocky upper-requirements). I know that in stable releases, the policy established [3] is only to modify those external libraries in case of security related issues. This is not exactly a security breach, but it can tear down a server over time. I submitted a patch to bump the version in stable/queens [4] and another one to test this change in the Neutron CI [5]. Is it possible to merge [4]? Regards. [1] https://bugs.launchpad.net/neutron/+bug/1835044 [2] http://paste.openstack.org/show/753759/ [3] https://docs.openstack.org/project-team-guide/stable-branches.html [4] https://review.opendev.org/#/c/668676/ [5] https://review.opendev.org/#/c/668677/ From smooney at redhat.com Fri Jul 5 16:36:35 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 05 Jul 2019 17:36:35 +0100 Subject: [nova] type-PCI vs type-PF vs type-VF In-Reply-To: References: Message-ID: On Fri, 2019-07-05 at 07:58 -0400, Mauricio Tavares wrote: > Quick questions: > This is not really documented that well, but https://bugs.launchpad.net/nova/+bug/1832169 has some context where I explain why the 3 types exist. > 1. Is there some official documentation comparing them? Only thing I found > so far was > https://mohamede.com/2019/02/07/pci-passthrough-type-pf-type-vf-and-type-pci/ > > 2. Am I correct to assume (I seem to do a lot of that; be afraid) that when > nova starts populating the database entry for a given physical node (say a > computer node or a ironic baremetal node) it looks through the pci list and > decides what type of device (type-PCI, type-PF, or type-VF) it is? What is > the criteria? We determine the type in the _get_device_type function in the libvirt virt driver, here: https://github.com/openstack/nova/blob/e75598be31d849bbe09c905081be224f68210d32/nova/virt/libvirt/driver.py#L6093-L6130 The criteria we use is interrogating the PCIe config space capabilities reported via libvirt. If the PCI device has the "virt_functions" capability, meaning it can create virtual functions, it is reported as type-PF, as it can be used as an SR-IOV PF. If it has the phys_function capability and an address set, it means it has a parent PF and is therefore a VF. All other devices are standard PCI devices that do not support SR-IOV and are mapped to type-PCI. > > 3. > https://mohamede.com/2019/02/07/pci-passthrough-type-pf-type-vf-and-type-pci/ > implies that nova sees the 3 different types, well, differently where it > will try hard to virtualize type-PF and type-VF and really really wants me > to setup sr-iov on them ( > https://docs.openstack.org/nova/latest/admin/pci-passthrough.html seem to > also want sr-iov) really . Is that correct? You don't have to use the device for VF passthrough, but if it is capable of creating VFs it will be listed as a PF regardless of whether VFs have been created via sysfs or kernel args. If you modify the firmware mode on your card, for example putting it in datacenter bridging mode which disables SR-IOV, the PCI config space entries will change and it will change from type-PF to type-PCI. The example alias above was using an Intel NIC that supported SR-IOV, so it used type-PF to do a PCI passthrough of the entire NIC, but if it was a GPU or another device that did not support SR-IOV it would have used type-PCI.
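For illustration, this is where the device_type ends up mattering in configuration; a minimal sketch of a nova.conf [pci] section (the vendor/product IDs below are placeholders, not recommendations):

  [pci]
  # whitelist the devices that may be exposed to guests
  passthrough_whitelist = { "vendor_id": "8086", "product_id": "154d" }
  # alias for whole-NIC (PF) passthrough of an SR-IOV capable card
  alias = { "name": "nic-pf", "vendor_id": "8086", "product_id": "154d", "device_type": "type-PF" }
  # alias for a device with no SR-IOV support, e.g. a GPU
  alias = { "name": "gpu", "vendor_id": "10de", "product_id": "1eb8", "device_type": "type-PCI" }

A flavor can then request such a device with the pci_passthrough:alias='gpu:1' extra spec.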
From iain.macdonnell at oracle.com Fri Jul 5 18:15:14 2019 From: iain.macdonnell at oracle.com (iain.macdonnell at oracle.com) Date: Fri, 5 Jul 2019 11:15:14 -0700 Subject: [nova] consoleauth token ttl In-Reply-To: <20190704185817.Horde.kZVzaR7qWwNvYreX3yw-Bae@webmail2.ic.unicamp.br> References: <20190704185817.Horde.kZVzaR7qWwNvYreX3yw-Bae@webmail2.ic.unicamp.br> Message-ID: <13a9fe53-078a-c6f7-a8db-f93bade9c27b@oracle.com> On 7/4/19 2:58 PM, William Lima Reiznautt wrote: > Some known about this configuration token_ttl on [consoleauth] section: > Is it on second or minutes ? > > At documentation is not information. Per https://docs.openstack.org/nova/stein/configuration/config.html#consoleauth "The lifetime of a console auth token (in seconds)." ~iain From mriedemos at gmail.com Fri Jul 5 20:14:34 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 5 Jul 2019 15:14:34 -0500 Subject: [ops] [nova] [placement] Mismatch between allocations and instances In-Reply-To: References: Message-ID: On 7/5/2019 1:45 AM, Massimo Sgaravatto wrote: > I tried to check the allocations on each compute node of a Ocata cloud, > using the command: > > curl -s ${PLACEMENT_ENDPOINT}/resource_providers/${UUID}/allocations -H > "x-auth-token: $TOKEN"  | python -m json.tool > Just FYI you can use osc-placement (openstack client plugin) for command line: https://docs.openstack.org/osc-placement/latest/index.html > I found that, on a few compute nodes, there are some instances for which > there is not a corresponding allocation. The heal_allocations command [1] might be able to find and fix these up for you. The bad news for you is that heal_allocations wasn't added until Rocky and you're on Ocata. The good news is you should be able to take the current version of the code from master (or stein) and run that in a container or virtual environment against your Ocata cloud (this would be particularly useful if you want to use the --dry-run or --instance options added in Train). You could also potentially backport those changes to your internal branch, or we could start a discussion upstream about backporting that tooling to stable branches - though going to Ocata might be a bit much at this point given Ocata and Pike are in extended maintenance mode [2]. As for *why* the instances on those nodes are missing allocations, it's hard to say without debugging things. The allocation and resource tracking code has changed quite a bit since Ocata (in Pike the scheduler started creating the allocations but the resource tracker in the compute service could still overwrite those allocations if you had older nodes during a rolling upgrade). My guess would be a migration failed or there was just a bug in Ocata where we didn't cleanup or allocate properly. Again, heal_allocations should add the missing allocation for you if you can setup the environment to run that command. > > On another Rocky cloud, we had the opposite problem: there were > allocations also for some instances that didn't exist anymore. > And this caused problems since we were not able to use all the resources > of the relevant compute nodes: we had to manually remove the fwrong" > allocations to fix the problem ... Yup, this could happen for different reasons, usually all due to known bugs for which you don't have the fix yet, e.g. [3][4], or something is failing during a migration and we aren't cleaning up properly (an unreported/not-yet-fixed bug). > > > I wonder why/how this problem can happen ... 
I mentioned some possibilities above - but I'm sure there are other bugs that have been fixed which I've omitted here, or things that aren't fixed yet, especially in failure scenarios (rollback/cleanup handling is hard). Note that your Ocata and Rocky cases could be different because since Queens (once all compute nodes are >=Queens) during resize, cold and live migration the migration record in nova holds the source node allocations during the migration so the actual *consumer* of the allocations for a provider in placement might not be an instance (server) record but actually a migration, so if you were looking for an allocation consumer by ID in nova using something like "openstack server show $consumer_id" it might return NotFound because the consumer is actually not an instance but a migration record and the allocation was leaked. > > And how can we fix the issue ? Should we manually add the missing > allocations / manually remove the wrong ones ? Coincidentally a thread related to this [5] re-surfaced a couple of weeks ago. I am not sure what Sylvain's progress is on that audit tool, but the linked bug in that email has some other operator scripts you could try for the case that there are leaked/orphaned allocations on compute nodes that no longer have instances. > > Thanks, Massimo > > [1] https://docs.openstack.org/nova/latest/cli/nova-manage.html#placement [2] https://docs.openstack.org/project-team-guide/stable-branches.html [3] https://bugs.launchpad.net/nova/+bug/1825537 [4] https://bugs.launchpad.net/nova/+bug/1821594 [5] http://lists.openstack.org/pipermail/openstack-discuss/2019-June/007241.html -- Thanks, Matt From corey.bryant at canonical.com Fri Jul 5 22:00:50 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 5 Jul 2019 18:00:50 -0400 Subject: [goal][python3] Train unit tests weekly update (goal-10) Message-ID: This is the goal-10 weekly update for the "Update Python 3 test runtimes for Train" goal [1]. There are 10 weeks remaining for completion of Train community goals [2]. == What's the Goal? == To ensure (in the Train cycle) that all official OpenStack repositories with Python 3 unit tests are exclusively using the 'openstack-python3-train-jobs' Zuul template or one of its variants (e.g. 'openstack-python3-train-jobs-neutron') to run unit tests, and that tests are passing. This will ensure that all official projects are running py36 and py37 unit tests in Train. For complete details please see [1]. == Ongoing Work == Patches have been submitted for all applicable projects except for: OpenStack Charms, Puppet OpenStack, Quality Assurance, Release Management, tripleo. Open patches needing reviews: https://review.openstack.org/#/q/topic:python3-train+is:open Failing patches: https://review.openstack.org/#/q/topic:python3-train+status:open+(+label:Verified-1+OR+label:Verified-2+) Patch automation scripts needing review: https://review.opendev.org/#/c/666934 == Completed Work == Merged patches: https://review.openstack.org/#/q/topic:python3-train+is:merged == How can you help? == Please take a look at the failing patches and help fix any failing unit tests for your projects. Python 3.7 unit tests will be self-testing in Zuul. If you're interested in helping submit patches, please let me know. 
== Reference Material == [1] Goal description: https://governance.openstack.org/tc/goals /train/python3-updates.html [2] Train release schedule: https://releases.openstack.org/train/schedule.html (see R-5 for "Train Community Goals Completed") Storyboard: https://storyboard.openstack.org/#!/story/2005924 Porting to Python 3.7: https://docs.python.org/3/whatsnew/3.7.html#porting-to-python-3-7 Python Update Process: https://opendev.org/openstack/governance/src/branch/master/resolutions/20181024-python-update-process.rst Train runtimes: https://opendev.org/openstack/governance/src/branch/master/reference/runtimes/train.rst Thanks, Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Fri Jul 5 22:46:06 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Fri, 5 Jul 2019 17:46:06 -0500 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job In-Reply-To: <37D021C4-ED1B-4942-9C90-0A26FDE3DD76@fried.cc> References: <37D021C4-ED1B-4942-9C90-0A26FDE3DD76@fried.cc> Message-ID: On 7/4/2019 5:16 AM, Eric Fried wrote: > > https://review.opendev.org/#/c/637225/ > > Ah heck, I had totally forgotten about that patch. If it's working for > you, let me get it polished up and merged. > > We could probably justify backporting it too. Matt? > > efried Sure - get a bug opened for it, extra points if CERN can provide some before/after numbers with the patch applied to help justify it. From skimming the commit message, if the only side effect would be for sharing providers, which we don't really support yet, then backports seem OK. -- Thanks, Matt From openstack at fried.cc Fri Jul 5 22:57:51 2019 From: openstack at fried.cc (Eric Fried) Date: Fri, 5 Jul 2019 17:57:51 -0500 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job In-Reply-To: References: <37D021C4-ED1B-4942-9C90-0A26FDE3DD76@fried.cc> Message-ID: <07C5F51A-DCAB-4432-B556-49E1E15801AC@fried.cc> Bug already associated with patch. I'll work on this next week. efried > On Jul 5, 2019, at 17:46, Matt Riedemann wrote: > >> On 7/4/2019 5:16 AM, Eric Fried wrote: >> > https://review.opendev.org/#/c/637225/ >> Ah heck, I had totally forgotten about that patch. If it's working for you, let me get it polished up and merged. >> We could probably justify backporting it too. Matt? >> efried > > Sure - get a bug opened for it, extra points if CERN can provide some before/after numbers with the patch applied to help justify it. > > From skimming the commit message, if the only side effect would be for sharing providers, which we don't really support yet, then backports seem OK. > > -- > > Thanks, > > Matt > From tony at bakeyournoodle.com Fri Jul 5 23:04:20 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Sat, 6 Jul 2019 09:04:20 +1000 Subject: [release] Release countdown for week R-14, July 8-12 Message-ID: <20190705230420.GA19305@thor.bakeyournoodle.com> Hi folks, Development Focus ----------------- The Train-2 milestone will happen in three weeks, on July 25. Train-related specs should now be finalized so that teams can move to implementation ASAP. Some teams observe specific deadlines on the second milestone (mostly spec freezes): please refer to https://releases.openstack.org/train/schedule.html for details. General Information ------------------- Please remember that libraries need to be released at least once per milestone period. 
At milestone 2, the release team will propose releases for any library that has not been otherwise released since milestone 1. Other non-library deliverables that follow the cycle-with-intermediary release model should have an intermediary release before milestone-2. Those who haven't will be proposed to switch to the cycle-with-rc model, which is more suited to deliverables that are released only once per cycle. At milestone-2 we also freeze the contents of the final release. If you have a new deliverable that should be included in the final release, you should make sure it has a deliverable file in https://opendev.org/openstack/releases/src/branch/master/deliverables/train . You should request a beta release (or intermediary release) for those new deliverables by milestone-2. We understand some may not be quite ready for a full release yet, but if you have something minimally viable to get released it would be good to do a 0.x release to exercise the release tooling for your deliverables. See the MembershipFreeze description for more details: https://releases.openstack.org/train/schedule.html#t-mf Upcoming Deadlines & Dates -------------------------- Train-2 Milestone: July 25 (R-12 week) Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From tony at bakeyournoodle.com Fri Jul 5 23:12:29 2019 From: tony at bakeyournoodle.com (Tony Breeds) Date: Sat, 6 Jul 2019 09:12:29 +1000 Subject: [all][release] PTL on Vacation July 6-14 Message-ID: <20190705231229.GB19305@thor.bakeyournoodle.com> Hi All, Just a very quick note to say I'll be AFK for the next week. The only visible change could be that stable releases are delayed a little. As always if it's urgent nothing stops the release team from approving things in my absence Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From colleen at gazlene.net Fri Jul 5 23:14:44 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 05 Jul 2019 16:14:44 -0700 Subject: [keystone] Keystone Team Update - Week of 1 July 2019 Message-ID: <10503ca2-647b-45f6-ad43-a1bab0471e49@www.fastmail.com> # Keystone Team Update - Week of 1 July 2019 ## News ### Midcycle Planning If you have not already done so, please participate in the scheduling poll[1] and contribute topic ideas[2] for the virtual midcycle. We will select a date next week and start preparing the agenda. [1] https://doodle.com/poll/wr7ct4uhpw82sysg [2] https://etherpad.openstack.org/p/keystone-train-midcycle-topics ### Oslo.limit Walkthrough We used this week's office hours session to walk through the new proposals for the oslo.limit interface, which Lance nicely summarized[3]. [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007487.html ## Open Specs Train specs: https://bit.ly/2uZ2tRl Ongoing specs: https://bit.ly/2OyDLTh We still have three Train specs that needs to be updated or reviewed prior to the Milestone 2 deadline. ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 14 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 43 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ## Bugs This week we opened 5 new bugs and closed none. 
Bugs opened (5) Bug #1835299 (keystone:Low) opened by Colleen Murphy https://bugs.launchpad.net/keystone/+bug/1835299 Bug #1835303 (keystoneauth:Undecided) opened by Ben Nemec https://bugs.launchpad.net/keystoneauth/+bug/1835303 Bug #1835103 (oslo.limit:Low) opened by Lance Bragstad https://bugs.launchpad.net/oslo.limit/+bug/1835103 Bug #1835104 (oslo.limit:Low) opened by Lance Bragstad https://bugs.launchpad.net/oslo.limit/+bug/1835104 Bug #1835106 (oslo.limit:Low) opened by Lance Bragstad https://bugs.launchpad.net/oslo.limit/+bug/1835106 ## Milestone Outlook https://releases.openstack.org/train/schedule.html Spec freeze is in 3 weeks, closely followed by feature proposal freeze. If you are working on a feature, even if the spec is not approved yet, don't hesitate to propose some code ahead of the deadline. ## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter From fungi at yuggoth.org Sat Jul 6 00:28:11 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Sat, 6 Jul 2019 00:28:11 +0000 Subject: [neutron][requirements] Pyroute2 stable/queens upper version (0.4.21) has a memory leak In-Reply-To: References: Message-ID: <20190706002811.oigm7z3iov6g3s2s@yuggoth.org> On 2019-07-05 17:29:55 +0100 (+0100), Rodolfo Alonso wrote: [...] > I know that in stable releases, the policy established [3] is only > to modify those external libraries in case of security related > issues. This is not exactly a security breach but can tear down a > server along the time. [...] > [3] https://docs.openstack.org/project-team-guide/stable-branches.html [...] You're referring to policy about backporting fixes for bugs in OpenStack software, and so necessitates patch-level version increases for the affected OpenStack components in upper-constraints.txt to make sure we test other software against that newer version. The policy so far regarding stable branch upper-constraints.txt entries for external dependencies of OpenStack has been to not change them even if they include known security vulnerabilities or other critical bugs, unless those bugs impact our ability to reliably test proposed changes to stable branches of OpenStack software for possible regressions. It's a common misconception, but that upper-constraints.txt file is purely a reflection of the (basically frozen in the case of stable branches) set of dependency versions from PyPI against which changes to our software are tested. It is not a good idea to deploy production environments from the PyPI packages corresponding to the versions listed there, for a variety of reasons (most important of which is that they aren't a security-supported distribution, nor can they ever even remotely become one). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From skaplons at redhat.com Sat Jul 6 08:58:20 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Sat, 6 Jul 2019 10:58:20 +0200 Subject: [TripleO] Using TripleO standalone with ML2/OVS Message-ID: <8F8016EB-7898-4F66-BD30-998ABDB094FB@redhat.com> Hi, I was trying to use TripleO standalone for development work in the way how Emilien described it in [1] and indeed it works quite well. Thx Emilien. But now, I’m trying to find out the way how to deploy it with Neutron using ML2/OVS instead of default in TripleO ML2/OVN. 
And I still don't know how to do it :/ I know it's my fault, but maybe someone can help me with this and tell me exactly which options I should change to deploy it with the other Neutron backend? Thx in advance for any help. [1] https://my1.fr/blog/developer-workflow-with-tripleo/ — Slawek Kaplonski Senior software engineer Red Hat From sam47priya at gmail.com Sat Jul 6 14:54:14 2019 From: sam47priya at gmail.com (Sam P) Date: Sat, 6 Jul 2019 23:54:14 +0900 Subject: [masakari] Run masakari-hostmonitor into Docker container In-Reply-To: References: <7666a4eae3522bcb14741108bf8a5994@incloudus.com> Message-ID: Hi Gaëtan, I have never tried it, but it sounds interesting. Current masakari is not designed to run in containers. However, except for the masakari monitors, most masakari components can run in containers. For masakari-hostmonitor, you don't need to run the systemd daemon inside the container. However, the code needs to be changed slightly to use remote systemd on the host OS. ex: systemctl --host user_name at host_name command Or we can also use the method Mark shared in his previous email. > I would not recommend running the systemd daemon in a container, but > you could potentially use the client to access a daemon running on the > host. E.g., for debian: > https://stackoverflow.com/questions/54079586/make-systemctl-work-from-inside-a-container-in-a-debian-stretch-image. > No doubt there will be various gotchas with this. > Are you planning to run pacemaker and corosync on the host? Since masakari-hostmonitor needs to detect any failures of the host, pacemaker and corosync need to run on the host OS. You could do otherwise, but this is the simplest solution. --- Regards, Sampath On Fri, Jun 28, 2019 at 5:15 PM Mark Goddard wrote: > > On Thu, 27 Jun 2019 at 21:52, wrote: > > > > Hi, > > > > I'm integrating Masakari into Kolla and Kolla Ansible projects but I'm > > facing > > an issue related to masakari-hostmonitor. > > > > Based on masakari-monitors code[1], "systemctl status" command is used > > to check > > if pacemaker, pacemaker-remote and corosync are running. > > > > Having systemd running into Docker container is not the best solution. > > Does any > > of you has been able to run masakari-monitor into Docker container ? > > > > I would not recommend running the systemd daemon in a container, but > you could potentially use the client to access a daemon running on the > host. E.g., for debian: > https://stackoverflow.com/questions/54079586/make-systemctl-work-from-inside-a-container-in-a-debian-stretch-image. > No doubt there will be various gotchas with this. > > Are you planning to run pacemaker and corosync on the host? > > Mark > > > Thanks for your help. > > > > Gaëtan > > > > - [1] > > https://github.com/openstack/masakari-monitors/blob/26d558333d9731ca06da09b26fe6592c49c0ac8a/masakarimonitors/hostmonitor/host_handler/handle_host.py#L48 > > > From mnaser at vexxhost.com Sat Jul 6 17:38:32 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Sat, 6 Jul 2019 13:38:32 -0400 Subject: [docs][fuel][infra][ironic][neutron][charms][openstack-ansible][solum][tc] proper retirement of repos Message-ID: Hi everyone, One of the issues that we recently ran into was the fact that there was some inconsistency about merging retirement of repositories inside governance without the code being fully removed. In order to avoid this, I've made a change to our governance repository which will enforce that no code exists in those retired repositories; however, this has surfaced that some repositories were retired with stale files left behind: some are small leftover files, some are still entire projects.
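Roughly, the end state for a properly retired repository is a tree containing only a README.rst that points at the retirement (see the infra manual for the exact steps); a sketch of the usual change (branch name and wording are just examples):

  git checkout -b retire-repo
  git rm -r .
  cat > README.rst <<EOF
  This project is no longer maintained.
  The contents of this repository are still available in Git; to see
  them, check out a commit prior to this one.
  EOF
  git add README.rst
  git commit -m "Retire repository"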
In order to avoid this, I've made a change to our governance repository which will enforce that no code exists in those retired repositories, however, this has surfaced that some repositories were retired with some stale files, some are smaller littler files, some are entire projects still. I have compiled a list for every team, with the repos that are not properly retired that have extra files (using this change which should eventually +1 once we fix it all: https://review.opendev.org/669549) [Documentation] openstack/api-site has extra files, please remove: .gitignore, .zuul.yaml, LICENSE, api-quick-start, api-ref, bindep.txt, common, doc-tools-check-languages.conf, firstapp, test-requirements.txt, tools, tox.ini, www [Documentation] openstack/faafo has extra files, please remove: .gitignore, CONTRIBUTING.rst, LICENSE, Vagrantfile, bin, contrib, doc, etc, faafo, requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini [fuel] openstack/fuel-agent has extra files, please remove: .gitignore, LICENSE, MAINTAINERS, cloud-init-templates, contrib, debian, etc, fuel_agent, requirements.txt, run_tests.sh, setup.cfg, setup.py, specs, test-requirements.txt, tools, tox.ini [fuel] openstack/fuel-astute has extra files, please remove: .gitignore, .rspec, .ruby-version, Gemfile, LICENSE, MAINTAINERS, Rakefile, astute.gemspec, astute.service, astute.sysconfig, bin, bindep.txt, debian, examples, lib, mcagents, run_tests.sh, spec, specs, tests [fuel] openstack/fuel-library has extra files, please remove: .gitignore, CHANGELOG, Gemfile, LICENSE, MAINTAINERS, Rakefile, debian, deployment, files, graphs, logs, specs, tests, utils [fuel] openstack/fuel-main has extra files, please remove: .gitignore, 00-debmirror.patch, LICENSE, MAINTAINERS, Makefile, config.mk, fuel-release, iso, mirror, packages, prepare-build-env.sh, report-changelog.sh, repos.mk, requirements-fuel-rpm.txt, requirements-rpm.txt, rules.mk, sandbox.mk, specs [fuel] openstack/fuel-menu has extra files, please remove: .gitignore, MAINTAINERS, MANIFEST.in, fuelmenu, run_tests.sh, setup.py, specs, test-requirements.txt, tox.ini [fuel] openstack/fuel-mirror has extra files, please remove: .gitignore, .mailmap, MAINTAINERS, perestroika, tox.ini [fuel] openstack/fuel-nailgun-agent has extra files, please remove: .gitignore, Gemfile, LICENSE, MAINTAINERS, Rakefile, agent, debian, nailgun-agent.cron, nailgun-agent.gemspec, run_tests.sh, specs [fuel] openstack/fuel-ostf has extra files, please remove: .gitignore, LICENSE, MAINTAINERS, MANIFEST.in, etc, fuel_health, fuel_plugin, ostf.service, pylintrc, requirements.txt, run_tests.sh, setup.cfg, setup.py, specs, test-requirements.txt, tools, tox.ini [fuel] openstack/fuel-qa has extra files, please remove: .coveragerc, .gitignore, .pylintrc, .pylintrc_gerrit, MAINTAINERS, core, doc, fuel_tests, fuelweb_test, gates_tests, packages_tests, pytest.ini, run_system_test.py, run_tests.sh, system_test, tox.ini, utils [fuel] openstack/fuel-ui has extra files, please remove: .eslintignore, .eslintrc.yaml, .gitignore, LICENSE, MAINTAINERS, fixtures, gulp, gulpfile.js, karma.config.js, npm-shrinkwrap.json, package.json, run_real_plugin_tests.sh, run_real_plugin_tests_on_real_nailgun.sh, run_ui_func_tests.sh, specs, static, webpack.config.js [fuel] openstack/fuel-virtualbox has extra files, please remove: .gitignore, MAINTAINERS, actions, clean.sh, config.sh, contrib, drivers, dumpkeys.cache, functions, iso, launch.sh, launch_16GB.sh, launch_8GB.sh [fuel] openstack/fuel-web has extra files, please 
remove: .gitignore, LICENSE, MAINTAINERS, bin, build_docs.sh, debian, docs, nailgun, run_tests.sh, specs, systemd, tools, tox.ini [fuel] openstack/shotgun has extra files, please remove: .coveragerc, .gitignore, .testr.conf, CONTRIBUTING.rst, HACKING.rst, LICENSE, MAINTAINERS, MANIFEST.in, bin, etc, requirements.txt, setup.cfg, setup.py, shotgun, specs, test-requirements.txt, tox.ini [fuel] openstack/fuel-dev-tools has extra files, please remove: .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, HACKING.rst, LICENSE, MAINTAINERS, babel.cfg, contrib, doc, fuel_dev_tools, openstack-common.conf, requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini, vagrant [fuel] openstack/fuel-devops has extra files, please remove: .coveragerc, .gitignore, .pylintrc, .pylintrc_gerrit, LICENSE, MAINTAINERS, bin, devops, doc, run_tests.sh, samples, setup.cfg, setup.py, test-requirements.txt, tox.ini [fuel] openstack/fuel-docs has extra files, please remove: .gitignore, Makefile, _images, _templates, common_conf.py, conf.py, devdocs, examples, glossary, index.rst, make.bat, plugindocs, requirements.txt, setup.cfg, setup.py, tox.ini, userdocs [fuel] openstack/fuel-nailgun-extension-cluster-upgrade has extra files, please remove: .coveragerc, .gitignore, AUTHORS, LICENSE, MANIFEST.in, bindep.txt, cluster_upgrade, conftest.py, nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, tools, tox.ini [fuel] openstack/fuel-nailgun-extension-iac has extra files, please remove: .gitignore, LICENSE, MANIFEST.in, doc, fuel_external_git, requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, tools, tox.ini [fuel] openstack/fuel-nailgun-extension-converted-serializers has extra files, please remove: .coveragerc, .gitignore, LICENSE, MANIFEST.in, bindep.txt, conftest.py, converted_serializers, nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, tools, tox.ini [fuel] openstack/fuel-octane has extra files, please remove: .coveragerc, .gitignore, .mailmap, Gemfile, Gemfile.lock, HACKING.rst, LICENSE, MAINTAINERS, MANIFEST.in, Rakefile, bindep.txt, deploy, deployment, docs, misc, octane, requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, tox.ini [fuel] openstack/fuel-upgrade has extra files, please remove: .gitignore [fuel] openstack/tuning-box has extra files, please remove: .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, HACKING.rst, LICENSE, MAINTAINERS, MANIFEST.in, TODO, alembic.ini, babel.cfg, bindep.txt, doc, examples, openstack-common.conf, requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, tools, tox.ini, tuning_box [fuel] openstack/fuel-plugins has extra files, please remove: .gitignore, CHANGELOG.md, CONTRIBUTING.rst, HACKING.rst, LICENSE, MAINTAINERS, examples, fuel_plugin_builder, requirements.txt, run_tests.sh, setup.cfg, setup.py, test-requirements.txt, tox.ini [fuel] openstack/fuel-plugin-murano has extra files, please remove: .gitignore, LICENSE, components.yaml, deployment_scripts, deployment_tasks.yaml, docs, environment_config.yaml, functions.sh, metadata.yaml, node_roles.yaml, pre_build_hook, releasenotes, repositories, test-requirements.txt, tox.ini, volumes.yaml [fuel] openstack/fuel-plugin-murano-tests has extra files, please remove: .gitignore, murano_plugin_tests, openrc.default, requirements.txt, tox.ini, utils [fuel] openstack/fuel-specs has extra files, please remove: .gitignore, .testr.conf, LICENSE, doc, images, 
policy, requirements.txt, setup.cfg, setup.py, specs, tests, tools, tox.ini [fuel] openstack/fuel-stats has extra files, please remove: .gitignore, LICENSE, MAINTAINERS, MANIFEST.in, analytics, collector, migration, requirements.txt, setup.py, test-requirements.txt, tools, tox.ini [fuel] openstack/python-fuelclient has extra files, please remove: .gitignore, .testr.conf, MAINTAINERS, MANIFEST.in, fuelclient, requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, tools, tox.ini [Infrastructure] opendev/puppet-releasestatus has extra files, please remove: .gitignore [ironic] openstack/python-dracclient has extra files, please remove: .gitignore, CONTRIBUTING.rst, HACKING.rst, LICENSE, doc, dracclient, requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini [neutron] openstack/networking-calico has extra files, please remove: .coveragerc, .gitignore, .mailmap, .testr.conf, .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, RELEASING.md, babel.cfg, debian, devstack, doc, networking_calico, playbooks, requirements.txt, rpm, setup.cfg, setup.py, test-requirements.txt, tox.ini [neutron] openstack/networking-l2gw has extra files, please remove: .coveragerc, .gitignore, .testr.conf, .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, bindep.txt, contrib, debian, devstack, doc, etc, lower-constraints.txt, networking_l2gw, openstack-common.conf, requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, tools, tox.ini [neutron] openstack/networking-l2gw-tempest-plugin has extra files, please remove: .gitignore, .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, LICENSE, babel.cfg, contrib, doc, networking_l2gw_tempest_plugin, requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini [neutron] openstack/networking-onos has extra files, please remove: .coveragerc, .gitignore, .mailmap, .pylintrc, .testr.conf, CONTRIBUTING.rst, HACKING.rst, LICENSE, PKG-INFO, TESTING.rst, babel.cfg, devstack, doc, etc, lower-constraints.txt, networking_onos, package, rally-jobs, releasenotes, requirements.txt, setup.cfg, setup.py, test-requirements.txt, tools, tox.ini [neutron] openstack/neutron-vpnaas has extra files, please remove: .coveragerc, .gitignore, .mailmap, .pylintrc, .stestr.conf, .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, LICENSE, TESTING.rst, babel.cfg, devstack, doc, etc, lower-constraints.txt, neutron_vpnaas, playbooks, rally-jobs, releasenotes, requirements.txt, setup.cfg, setup.py, test-requirements.txt, tools, tox.ini [OpenStack Charms] openstack/charm-ceph has extra files, please remove: .gitignore [OpenStackAnsible] openstack/openstack-ansible-os_monasca has extra files, please remove: tests, tox.ini [solum] openstack/solum-infra-guestagent has extra files, please remove: .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, config-generator, doc, etc, requirements.txt, setup.cfg, setup.py, solum_guestagent, test-requirements.txt, tox.ini I'd like to kindly ask the affected teams to help out with this, or any member of our community is more than welcome to push a change to those repos and work with the appropriate teams to help land it. Mohammed -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. 
http://vexxhost.com From dtantsur at redhat.com Sat Jul 6 19:29:13 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Sat, 6 Jul 2019 21:29:13 +0200 Subject: [docs][fuel][infra][ironic][neutron][charms][openstack-ansible][solum][tc] proper retirement of repos In-Reply-To: References: Message-ID: <9b4d6acc-350f-0dbd-fb27-1c2f5023c6b4@redhat.com> On 7/6/19 7:38 PM, Mohammed Naser wrote: > Hi everyone, > > One of the issue that we recently ran into was the fact that there was > some inconsistency about merging retirement of repositories inside > governance without the code being fully removed. > > In order to avoid this, I've made a change to our governance > repository which will enforce that no code exists in those retired > repositories, however, this has surfaced that some repositories were > retired with some stale files, some are smaller littler files, some > are entire projects still. > > I have compiled a list for every team, with the repos that are not > properly retired that have extra files (using this change which should > eventually +1 once we fix it all: https://review.opendev.org/669549) > > > [ironic] openstack/python-dracclient has extra files, please remove: > .gitignore, CONTRIBUTING.rst, HACKING.rst, LICENSE, doc, dracclient, > requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini This project is not retired, why is it listed here? I also used to think it was not under any governance.. > > > I'd like to kindly ask the affected teams to help out with this, or > any member of our community is more than welcome to push a change to > those repos and work with the appropriate teams to help land it. > > Mohammed > From mnaser at vexxhost.com Sat Jul 6 19:39:04 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Sat, 6 Jul 2019 15:39:04 -0400 Subject: [docs][fuel][infra][ironic][neutron][charms][openstack-ansible][solum][tc] proper retirement of repos In-Reply-To: <9b4d6acc-350f-0dbd-fb27-1c2f5023c6b4@redhat.com> References: <9b4d6acc-350f-0dbd-fb27-1c2f5023c6b4@redhat.com> Message-ID: On Sat, Jul 6, 2019 at 3:33 PM Dmitry Tantsur wrote: > > On 7/6/19 7:38 PM, Mohammed Naser wrote: > > Hi everyone, > > > > One of the issue that we recently ran into was the fact that there was > > some inconsistency about merging retirement of repositories inside > > governance without the code being fully removed. > > > > In order to avoid this, I've made a change to our governance > > repository which will enforce that no code exists in those retired > > repositories, however, this has surfaced that some repositories were > > retired with some stale files, some are smaller littler files, some > > are entire projects still. > > > > I have compiled a list for every team, with the repos that are not > > properly retired that have extra files (using this change which should > > eventually +1 once we fix it all: https://review.opendev.org/669549) > > > > > > > [ironic] openstack/python-dracclient has extra files, please remove: > > .gitignore, CONTRIBUTING.rst, HACKING.rst, LICENSE, doc, dracclient, > > requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini > > This project is not retired, why is it listed here? > > I also used to think it was not under any governance.. https://opendev.org/openstack/governance/commit/003d9d8247f2deea7a344c47d87b54f7457fe19d So, I'm not sure if in this case it has to be moved to x/python-dracclient or... 
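As an aside on the lists above: many of the repos have only dotfiles such as .gitignore left over, which is what typically happens when the retirement commit is made with a shell glob ('git rm -r *' does not match hidden files). A minimal sketch of a retirement change that avoids this, assuming the usual project-team-guide flow of first removing the Zuul jobs in project-config and then leaving only a README.rst (and, where applicable, .gitreview) behind:

    git checkout -b retire-repo
    # remove every tracked file, dotfiles included ('git rm -r *' would miss them)
    git ls-files | xargs git rm
    cat > README.rst <<'EOF'
    This project is no longer maintained.

    For any further questions, please email
    openstack-discuss@lists.openstack.org or join #openstack-dev on IRC.
    EOF
    git add README.rst
    git commit -m "Retire repository"
    git review

The exact README wording and the set of files the governance check tolerates may differ; the point is only that hidden files have to be removed explicitly.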
> > > > > > > I'd like to kindly ask the affected teams to help out with this, or > > any member of our community is more than welcome to push a change to > > those repos and work with the appropriate teams to help land it. > > > > Mohammed > > > > -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. http://vexxhost.com From dtantsur at redhat.com Sat Jul 6 19:42:59 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Sat, 6 Jul 2019 21:42:59 +0200 Subject: [docs][fuel][infra][ironic][neutron][charms][openstack-ansible][solum][tc] proper retirement of repos In-Reply-To: References: <9b4d6acc-350f-0dbd-fb27-1c2f5023c6b4@redhat.com> Message-ID: <43e69a72-9c37-6ce9-6a87-178a0ea2edb5@redhat.com> On 7/6/19 9:39 PM, Mohammed Naser wrote: > On Sat, Jul 6, 2019 at 3:33 PM Dmitry Tantsur wrote: >> >> On 7/6/19 7:38 PM, Mohammed Naser wrote: >>> Hi everyone, >>> >>> One of the issue that we recently ran into was the fact that there was >>> some inconsistency about merging retirement of repositories inside >>> governance without the code being fully removed. >>> >>> In order to avoid this, I've made a change to our governance >>> repository which will enforce that no code exists in those retired >>> repositories, however, this has surfaced that some repositories were >>> retired with some stale files, some are smaller littler files, some >>> are entire projects still. >>> >>> I have compiled a list for every team, with the repos that are not >>> properly retired that have extra files (using this change which should >>> eventually +1 once we fix it all: https://review.opendev.org/669549) >> >>> >>> >>> [ironic] openstack/python-dracclient has extra files, please remove: >>> .gitignore, CONTRIBUTING.rst, HACKING.rst, LICENSE, doc, dracclient, >>> requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini >> >> This project is not retired, why is it listed here? >> >> I also used to think it was not under any governance.. > > https://opendev.org/openstack/governance/commit/003d9d8247f2deea7a344c47d87b54f7457fe19d > > So, I'm not sure if in this case it has to be moved to x/python-dracclient or... Yep, I guess it should have been moved. This library is maintained by Dell, not by the ironic team. > >>> >> >>> >>> I'd like to kindly ask the affected teams to help out with this, or >>> any member of our community is more than welcome to push a change to >>> those repos and work with the appropriate teams to help land it. >>> >>> Mohammed >>> >> >> > > From skaplons at redhat.com Sat Jul 6 19:59:38 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Sat, 6 Jul 2019 21:59:38 +0200 Subject: [docs][fuel][infra][ironic][neutron][charms][openstack-ansible][solum][tc] proper retirement of repos In-Reply-To: References: Message-ID: <04C9A55A-1ABA-4015-AECC-63E44E760E29@redhat.com> Hi, > On 6 Jul 2019, at 19:38, Mohammed Naser wrote: > > Hi everyone, > > One of the issue that we recently ran into was the fact that there was > some inconsistency about merging retirement of repositories inside > governance without the code being fully removed. > > In order to avoid this, I've made a change to our governance > repository which will enforce that no code exists in those retired > repositories, however, this has surfaced that some repositories were > retired with some stale files, some are smaller littler files, some > are entire projects still. 
> > I have compiled a list for every team, with the repos that are not > properly retired that have extra files (using this change which should > eventually +1 once we fix it all: https://review.opendev.org/669549) > > [Documentation] openstack/api-site has extra files, please remove: > .gitignore, .zuul.yaml, LICENSE, api-quick-start, api-ref, bindep.txt, > common, doc-tools-check-languages.conf, firstapp, > test-requirements.txt, tools, tox.ini, www > [Documentation] openstack/faafo has extra files, please remove: > .gitignore, CONTRIBUTING.rst, LICENSE, Vagrantfile, bin, contrib, doc, > etc, faafo, requirements.txt, setup.cfg, setup.py, > test-requirements.txt, tox.ini > > [fuel] openstack/fuel-agent has extra files, please remove: > .gitignore, LICENSE, MAINTAINERS, cloud-init-templates, contrib, > debian, etc, fuel_agent, requirements.txt, run_tests.sh, setup.cfg, > setup.py, specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-astute has extra files, please remove: > .gitignore, .rspec, .ruby-version, Gemfile, LICENSE, MAINTAINERS, > Rakefile, astute.gemspec, astute.service, astute.sysconfig, bin, > bindep.txt, debian, examples, lib, mcagents, run_tests.sh, spec, > specs, tests > [fuel] openstack/fuel-library has extra files, please remove: > .gitignore, CHANGELOG, Gemfile, LICENSE, MAINTAINERS, Rakefile, > debian, deployment, files, graphs, logs, specs, tests, utils > [fuel] openstack/fuel-main has extra files, please remove: .gitignore, > 00-debmirror.patch, LICENSE, MAINTAINERS, Makefile, config.mk, > fuel-release, iso, mirror, packages, prepare-build-env.sh, > report-changelog.sh, repos.mk, requirements-fuel-rpm.txt, > requirements-rpm.txt, rules.mk, sandbox.mk, specs > [fuel] openstack/fuel-menu has extra files, please remove: .gitignore, > MAINTAINERS, MANIFEST.in, fuelmenu, run_tests.sh, setup.py, specs, > test-requirements.txt, tox.ini > [fuel] openstack/fuel-mirror has extra files, please remove: > .gitignore, .mailmap, MAINTAINERS, perestroika, tox.ini > [fuel] openstack/fuel-nailgun-agent has extra files, please remove: > .gitignore, Gemfile, LICENSE, MAINTAINERS, Rakefile, agent, debian, > nailgun-agent.cron, nailgun-agent.gemspec, run_tests.sh, specs > [fuel] openstack/fuel-ostf has extra files, please remove: .gitignore, > LICENSE, MAINTAINERS, MANIFEST.in, etc, fuel_health, fuel_plugin, > ostf.service, pylintrc, requirements.txt, run_tests.sh, setup.cfg, > setup.py, specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-qa has extra files, please remove: .coveragerc, > .gitignore, .pylintrc, .pylintrc_gerrit, MAINTAINERS, core, doc, > fuel_tests, fuelweb_test, gates_tests, packages_tests, pytest.ini, > run_system_test.py, run_tests.sh, system_test, tox.ini, utils > [fuel] openstack/fuel-ui has extra files, please remove: > .eslintignore, .eslintrc.yaml, .gitignore, LICENSE, MAINTAINERS, > fixtures, gulp, gulpfile.js, karma.config.js, npm-shrinkwrap.json, > package.json, run_real_plugin_tests.sh, > run_real_plugin_tests_on_real_nailgun.sh, run_ui_func_tests.sh, specs, > static, webpack.config.js > [fuel] openstack/fuel-virtualbox has extra files, please remove: > .gitignore, MAINTAINERS, actions, clean.sh, config.sh, contrib, > drivers, dumpkeys.cache, functions, iso, launch.sh, launch_16GB.sh, > launch_8GB.sh > [fuel] openstack/fuel-web has extra files, please remove: .gitignore, > LICENSE, MAINTAINERS, bin, build_docs.sh, debian, docs, nailgun, > run_tests.sh, specs, systemd, tools, tox.ini > [fuel] openstack/shotgun has extra files, 
please remove: .coveragerc, > .gitignore, .testr.conf, CONTRIBUTING.rst, HACKING.rst, LICENSE, > MAINTAINERS, MANIFEST.in, bin, etc, requirements.txt, setup.cfg, > setup.py, shotgun, specs, test-requirements.txt, tox.ini > [fuel] openstack/fuel-dev-tools has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, > HACKING.rst, LICENSE, MAINTAINERS, babel.cfg, contrib, doc, > fuel_dev_tools, openstack-common.conf, requirements.txt, setup.cfg, > setup.py, test-requirements.txt, tox.ini, vagrant > [fuel] openstack/fuel-devops has extra files, please remove: > .coveragerc, .gitignore, .pylintrc, .pylintrc_gerrit, LICENSE, > MAINTAINERS, bin, devops, doc, run_tests.sh, samples, setup.cfg, > setup.py, test-requirements.txt, tox.ini > [fuel] openstack/fuel-docs has extra files, please remove: .gitignore, > Makefile, _images, _templates, common_conf.py, conf.py, devdocs, > examples, glossary, index.rst, make.bat, plugindocs, requirements.txt, > setup.cfg, setup.py, tox.ini, userdocs > [fuel] openstack/fuel-nailgun-extension-cluster-upgrade has extra > files, please remove: .coveragerc, .gitignore, AUTHORS, LICENSE, > MANIFEST.in, bindep.txt, cluster_upgrade, conftest.py, > nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, > specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-nailgun-extension-iac has extra files, please > remove: .gitignore, LICENSE, MANIFEST.in, doc, fuel_external_git, > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > tools, tox.ini > [fuel] openstack/fuel-nailgun-extension-converted-serializers has > extra files, please remove: .coveragerc, .gitignore, LICENSE, > MANIFEST.in, bindep.txt, conftest.py, converted_serializers, > nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, > specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-octane has extra files, please remove: > .coveragerc, .gitignore, .mailmap, Gemfile, Gemfile.lock, HACKING.rst, > LICENSE, MAINTAINERS, MANIFEST.in, Rakefile, bindep.txt, deploy, > deployment, docs, misc, octane, requirements.txt, setup.cfg, setup.py, > specs, test-requirements.txt, tox.ini > [fuel] openstack/fuel-upgrade has extra files, please remove: .gitignore > [fuel] openstack/tuning-box has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, > HACKING.rst, LICENSE, MAINTAINERS, MANIFEST.in, TODO, alembic.ini, > babel.cfg, bindep.txt, doc, examples, openstack-common.conf, > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > tools, tox.ini, tuning_box > [fuel] openstack/fuel-plugins has extra files, please remove: > .gitignore, CHANGELOG.md, CONTRIBUTING.rst, HACKING.rst, LICENSE, > MAINTAINERS, examples, fuel_plugin_builder, requirements.txt, > run_tests.sh, setup.cfg, setup.py, test-requirements.txt, tox.ini > [fuel] openstack/fuel-plugin-murano has extra files, please remove: > .gitignore, LICENSE, components.yaml, deployment_scripts, > deployment_tasks.yaml, docs, environment_config.yaml, functions.sh, > metadata.yaml, node_roles.yaml, pre_build_hook, releasenotes, > repositories, test-requirements.txt, tox.ini, volumes.yaml > [fuel] openstack/fuel-plugin-murano-tests has extra files, please > remove: .gitignore, murano_plugin_tests, openrc.default, > requirements.txt, tox.ini, utils > [fuel] openstack/fuel-specs has extra files, please remove: > .gitignore, .testr.conf, LICENSE, doc, images, policy, > requirements.txt, setup.cfg, setup.py, specs, tests, 
tools, tox.ini > [fuel] openstack/fuel-stats has extra files, please remove: > .gitignore, LICENSE, MAINTAINERS, MANIFEST.in, analytics, collector, > migration, requirements.txt, setup.py, test-requirements.txt, tools, > tox.ini > [fuel] openstack/python-fuelclient has extra files, please remove: > .gitignore, .testr.conf, MAINTAINERS, MANIFEST.in, fuelclient, > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > tools, tox.ini > > [Infrastructure] opendev/puppet-releasestatus has extra files, please > remove: .gitignore > > [ironic] openstack/python-dracclient has extra files, please remove: > .gitignore, CONTRIBUTING.rst, HACKING.rst, LICENSE, doc, dracclient, > requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini > > [neutron] openstack/networking-calico has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .testr.conf, .zuul.yaml, > CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, RELEASING.md, > babel.cfg, debian, devstack, doc, networking_calico, playbooks, > requirements.txt, rpm, setup.cfg, setup.py, test-requirements.txt, > tox.ini IIRC this isn’t neutron stadium project but some 3rd party project. Should it be under “neutron” tag here? > [neutron] openstack/networking-l2gw has extra files, please remove: > .coveragerc, .gitignore, .testr.conf, .zuul.yaml, CONTRIBUTING.rst, > HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, bindep.txt, contrib, > debian, devstack, doc, etc, lower-constraints.txt, networking_l2gw, > openstack-common.conf, requirements.txt, setup.cfg, setup.py, specs, > test-requirements.txt, tools, tox.ini This also isn’t stadium project (at least not listed in https://governance.openstack.org/tc/reference/projects/neutron.html). I also don’t think that networking-l2gw is retired project. I know that there are still some (even new) maintainers for it. > [neutron] openstack/networking-l2gw-tempest-plugin has extra files, > please remove: .gitignore, .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, > LICENSE, babel.cfg, contrib, doc, networking_l2gw_tempest_plugin, > requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini > [neutron] openstack/networking-onos has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .pylintrc, .testr.conf, > CONTRIBUTING.rst, HACKING.rst, LICENSE, PKG-INFO, TESTING.rst, > babel.cfg, devstack, doc, etc, lower-constraints.txt, networking_onos, > package, rally-jobs, releasenotes, requirements.txt, setup.cfg, > setup Also 3rd party project. > .py, test-requirements.txt, tools, tox.ini > [neutron] openstack/neutron-vpnaas has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .pylintrc, .stestr.conf, > .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, LICENSE, TESTING.rst, > babel.cfg, devstack, doc, etc, lower-constraints.txt, neutron_vpnaas, > playbooks, rally-jobs, releasenotes, requirements.txt, setup.cfg, > setup.py, test-requirements.txt, tools, tox.ini Neutron-vpnaas isn’t retired IIRC. 
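For cases like these, a quick way to tell "moved out of a team's governance" apart from "actually retired" is to look the deliverable up in the governance repository itself; a rough sketch only, using the reference/ layout openstack/governance has at the time of this thread:

    git clone https://opendev.org/openstack/governance
    cd governance
    # still listed under an official team?
    grep -n "networking-l2gw" reference/projects.yaml
    # mentioned anywhere else (legacy/retired data, SIG repos, ...)?
    grep -rn "networking-l2gw" reference/

If the repository only turns up in retirement/legacy data rather than under a team in projects.yaml, the check discussed in this thread presumably treats it as retired; if it appears nowhere, it has simply left official governance without being retired.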
> > [OpenStack Charms] openstack/charm-ceph has extra files, please > remove: .gitignore > > [OpenStackAnsible] openstack/openstack-ansible-os_monasca has extra > files, please remove: tests, tox.ini > > [solum] openstack/solum-infra-guestagent has extra files, please > remove: .coveragerc, .gitignore, .mailmap, .testr.conf, > CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, > config-generator, doc, etc, requirements.txt, setup.cfg, setup.py, > solum_guestagent, test-requirements.txt, tox.ini > > I'd like to kindly ask the affected teams to help out with this, or > any member of our community is more than welcome to push a change to > those repos and work with the appropriate teams to help land it. > > Mohammed > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. http://vexxhost.com > — Slawek Kaplonski Senior software engineer Red Hat From bcafarel at redhat.com Sat Jul 6 20:00:03 2019 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Sat, 6 Jul 2019 22:00:03 +0200 Subject: [docs][fuel][infra][ironic][neutron][charms][openstack-ansible][solum][tc] proper retirement of repos In-Reply-To: References: Message-ID: On Sat, 6 Jul 2019 at 19:44, Mohammed Naser wrote: > Hi everyone, > > One of the issue that we recently ran into was the fact that there was > some inconsistency about merging retirement of repositories inside > governance without the code being fully removed. > > In order to avoid this, I've made a change to our governance > repository which will enforce that no code exists in those retired > repositories, however, this has surfaced that some repositories were > retired with some stale files, some are smaller littler files, some > are entire projects still. 
> > I have compiled a list for every team, with the repos that are not > properly retired that have extra files (using this change which should > eventually +1 once we fix it all: https://review.opendev.org/669549) > > [Documentation] openstack/api-site has extra files, please remove: > .gitignore, .zuul.yaml, LICENSE, api-quick-start, api-ref, bindep.txt, > common, doc-tools-check-languages.conf, firstapp, > test-requirements.txt, tools, tox.ini, www > [Documentation] openstack/faafo has extra files, please remove: > .gitignore, CONTRIBUTING.rst, LICENSE, Vagrantfile, bin, contrib, doc, > etc, faafo, requirements.txt, setup.cfg, setup.py, > test-requirements.txt, tox.ini > > [fuel] openstack/fuel-agent has extra files, please remove: > .gitignore, LICENSE, MAINTAINERS, cloud-init-templates, contrib, > debian, etc, fuel_agent, requirements.txt, run_tests.sh, setup.cfg, > setup.py, specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-astute has extra files, please remove: > .gitignore, .rspec, .ruby-version, Gemfile, LICENSE, MAINTAINERS, > Rakefile, astute.gemspec, astute.service, astute.sysconfig, bin, > bindep.txt, debian, examples, lib, mcagents, run_tests.sh, spec, > specs, tests > [fuel] openstack/fuel-library has extra files, please remove: > .gitignore, CHANGELOG, Gemfile, LICENSE, MAINTAINERS, Rakefile, > debian, deployment, files, graphs, logs, specs, tests, utils > [fuel] openstack/fuel-main has extra files, please remove: .gitignore, > 00-debmirror.patch, LICENSE, MAINTAINERS, Makefile, config.mk, > fuel-release, iso, mirror, packages, prepare-build-env.sh, > report-changelog.sh, repos.mk, requirements-fuel-rpm.txt, > requirements-rpm.txt, rules.mk, sandbox.mk, specs > [fuel] openstack/fuel-menu has extra files, please remove: .gitignore, > MAINTAINERS, MANIFEST.in, fuelmenu, run_tests.sh, setup.py, specs, > test-requirements.txt, tox.ini > [fuel] openstack/fuel-mirror has extra files, please remove: > .gitignore, .mailmap, MAINTAINERS, perestroika, tox.ini > [fuel] openstack/fuel-nailgun-agent has extra files, please remove: > .gitignore, Gemfile, LICENSE, MAINTAINERS, Rakefile, agent, debian, > nailgun-agent.cron, nailgun-agent.gemspec, run_tests.sh, specs > [fuel] openstack/fuel-ostf has extra files, please remove: .gitignore, > LICENSE, MAINTAINERS, MANIFEST.in, etc, fuel_health, fuel_plugin, > ostf.service, pylintrc, requirements.txt, run_tests.sh, setup.cfg, > setup.py, specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-qa has extra files, please remove: .coveragerc, > .gitignore, .pylintrc, .pylintrc_gerrit, MAINTAINERS, core, doc, > fuel_tests, fuelweb_test, gates_tests, packages_tests, pytest.ini, > run_system_test.py, run_tests.sh, system_test, tox.ini, utils > [fuel] openstack/fuel-ui has extra files, please remove: > .eslintignore, .eslintrc.yaml, .gitignore, LICENSE, MAINTAINERS, > fixtures, gulp, gulpfile.js, karma.config.js, npm-shrinkwrap.json, > package.json, run_real_plugin_tests.sh, > run_real_plugin_tests_on_real_nailgun.sh, run_ui_func_tests.sh, specs, > static, webpack.config.js > [fuel] openstack/fuel-virtualbox has extra files, please remove: > .gitignore, MAINTAINERS, actions, clean.sh, config.sh, contrib, > drivers, dumpkeys.cache, functions, iso, launch.sh, launch_16GB.sh, > launch_8GB.sh > [fuel] openstack/fuel-web has extra files, please remove: .gitignore, > LICENSE, MAINTAINERS, bin, build_docs.sh, debian, docs, nailgun, > run_tests.sh, specs, systemd, tools, tox.ini > [fuel] openstack/shotgun has extra files, 
please remove: .coveragerc, > .gitignore, .testr.conf, CONTRIBUTING.rst, HACKING.rst, LICENSE, > MAINTAINERS, MANIFEST.in, bin, etc, requirements.txt, setup.cfg, > setup.py, shotgun, specs, test-requirements.txt, tox.ini > [fuel] openstack/fuel-dev-tools has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, > HACKING.rst, LICENSE, MAINTAINERS, babel.cfg, contrib, doc, > fuel_dev_tools, openstack-common.conf, requirements.txt, setup.cfg, > setup.py, test-requirements.txt, tox.ini, vagrant > [fuel] openstack/fuel-devops has extra files, please remove: > .coveragerc, .gitignore, .pylintrc, .pylintrc_gerrit, LICENSE, > MAINTAINERS, bin, devops, doc, run_tests.sh, samples, setup.cfg, > setup.py, test-requirements.txt, tox.ini > [fuel] openstack/fuel-docs has extra files, please remove: .gitignore, > Makefile, _images, _templates, common_conf.py, conf.py, devdocs, > examples, glossary, index.rst, make.bat, plugindocs, requirements.txt, > setup.cfg, setup.py, tox.ini, userdocs > [fuel] openstack/fuel-nailgun-extension-cluster-upgrade has extra > files, please remove: .coveragerc, .gitignore, AUTHORS, LICENSE, > MANIFEST.in, bindep.txt, cluster_upgrade, conftest.py, > nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, > specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-nailgun-extension-iac has extra files, please > remove: .gitignore, LICENSE, MANIFEST.in, doc, fuel_external_git, > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > tools, tox.ini > [fuel] openstack/fuel-nailgun-extension-converted-serializers has > extra files, please remove: .coveragerc, .gitignore, LICENSE, > MANIFEST.in, bindep.txt, conftest.py, converted_serializers, > nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, > specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-octane has extra files, please remove: > .coveragerc, .gitignore, .mailmap, Gemfile, Gemfile.lock, HACKING.rst, > LICENSE, MAINTAINERS, MANIFEST.in, Rakefile, bindep.txt, deploy, > deployment, docs, misc, octane, requirements.txt, setup.cfg, setup.py, > specs, test-requirements.txt, tox.ini > [fuel] openstack/fuel-upgrade has extra files, please remove: .gitignore > [fuel] openstack/tuning-box has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, > HACKING.rst, LICENSE, MAINTAINERS, MANIFEST.in, TODO, alembic.ini, > babel.cfg, bindep.txt, doc, examples, openstack-common.conf, > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > tools, tox.ini, tuning_box > [fuel] openstack/fuel-plugins has extra files, please remove: > .gitignore, CHANGELOG.md, CONTRIBUTING.rst, HACKING.rst, LICENSE, > MAINTAINERS, examples, fuel_plugin_builder, requirements.txt, > run_tests.sh, setup.cfg, setup.py, test-requirements.txt, tox.ini > [fuel] openstack/fuel-plugin-murano has extra files, please remove: > .gitignore, LICENSE, components.yaml, deployment_scripts, > deployment_tasks.yaml, docs, environment_config.yaml, functions.sh, > metadata.yaml, node_roles.yaml, pre_build_hook, releasenotes, > repositories, test-requirements.txt, tox.ini, volumes.yaml > [fuel] openstack/fuel-plugin-murano-tests has extra files, please > remove: .gitignore, murano_plugin_tests, openrc.default, > requirements.txt, tox.ini, utils > [fuel] openstack/fuel-specs has extra files, please remove: > .gitignore, .testr.conf, LICENSE, doc, images, policy, > requirements.txt, setup.cfg, setup.py, specs, tests, 
tools, tox.ini > [fuel] openstack/fuel-stats has extra files, please remove: > .gitignore, LICENSE, MAINTAINERS, MANIFEST.in, analytics, collector, > migration, requirements.txt, setup.py, test-requirements.txt, tools, > tox.ini > [fuel] openstack/python-fuelclient has extra files, please remove: > .gitignore, .testr.conf, MAINTAINERS, MANIFEST.in, fuelclient, > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > tools, tox.ini > > [Infrastructure] opendev/puppet-releasestatus has extra files, please > remove: .gitignore > > [ironic] openstack/python-dracclient has extra files, please remove: > .gitignore, CONTRIBUTING.rst, HACKING.rst, LICENSE, doc, dracclient, > requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini > > [neutron] openstack/networking-calico has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .testr.conf, .zuul.yaml, > CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, RELEASING.md, > babel.cfg, debian, devstack, doc, networking_calico, playbooks, > requirements.txt, rpm, setup.cfg, setup.py, test-requirements.txt, > tox.ini > [neutron] openstack/networking-l2gw has extra files, please remove: > .coveragerc, .gitignore, .testr.conf, .zuul.yaml, CONTRIBUTING.rst, > HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, bindep.txt, contrib, > debian, devstack, doc, etc, lower-constraints.txt, networking_l2gw, > openstack-common.conf, requirements.txt, setup.cfg, setup.py, specs, > test-requirements.txt, tools, tox.ini > [neutron] openstack/networking-l2gw-tempest-plugin has extra files, > please remove: .gitignore, .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, > LICENSE, babel.cfg, contrib, doc, networking_l2gw_tempest_plugin, > requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini > [neutron] openstack/networking-onos has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .pylintrc, .testr.conf, > CONTRIBUTING.rst, HACKING.rst, LICENSE, PKG-INFO, TESTING.rst, > babel.cfg, devstack, doc, etc, lower-constraints.txt, networking_onos, > package, rally-jobs, releasenotes, requirements.txt, setup.cfg, > setup.py, test-requirements.txt, tools, tox.ini > [neutron] openstack/neutron-vpnaas has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .pylintrc, .stestr.conf, > .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, LICENSE, TESTING.rst, > babel.cfg, devstack, doc, etc, lower-constraints.txt, neutron_vpnaas, > playbooks, rally-jobs, releasenotes, requirements.txt, setup.cfg, > setup.py, test-requirements.txt, tools, tox.ini > At least for networking-l2gw* and neutron-vpnaas, I suppose this was caused by: https://opendev.org/openstack/governance/commit/20f95dd947d2f87519b4bb50fb188e6f71deae7c What it meant is that they are not anymore under neutron governance, but they were not retired (at least as far as I know). There were still some recent commits even if minimal activity, and discussion on team status for neutron-vpnaas. 
Not sure about networking-calico status though > > [OpenStack Charms] openstack/charm-ceph has extra files, please > remove: .gitignore > > [OpenStackAnsible] openstack/openstack-ansible-os_monasca has extra > files, please remove: tests, tox.ini > > [solum] openstack/solum-infra-guestagent has extra files, please > remove: .coveragerc, .gitignore, .mailmap, .testr.conf, > CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, > config-generator, doc, etc, requirements.txt, setup.cfg, setup.py, > solum_guestagent, test-requirements.txt, tox.ini > > I'd like to kindly ask the affected teams to help out with this, or > any member of our community is more than welcome to push a change to > those repos and work with the appropriate teams to help land it. > > Mohammed > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. http://vexxhost.com > > -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From vungoctan252 at gmail.com Sun Jul 7 03:30:39 2019 From: vungoctan252 at gmail.com (Vu Tan) Date: Sun, 7 Jul 2019 10:30:39 +0700 Subject: how to install masakari on centos 7 Message-ID: Hi, I would like to use Masakari and I'm having trouble finding a step by step or other documentation to get started with. Which part should be installed on controller, which is should be on compute, and what is the prerequisite to install masakari, I have installed corosync and pacemaker on compute and controller nodes, , what else do I need to do ? step I have done so far: - installed corosync/pacemaker - install masakari on compute node on this github repo: https://github.com/openstack/masakari - add masakari in to mariadb here is my configuration file of masakari.conf, do you mind to take a look at it, if I have misconfigured anything? [DEFAULT] enabled_apis = masakari_api # Enable to specify listening IP other than default masakari_api_listen = controller # Enable to specify port other than default masakari_api_listen_port = 15868 debug = False auth_strategy=keystone [wsgi] # The paste configuration file path api_paste_config = /etc/masakari/api-paste.ini [keystone_authtoken] www_authenticate_uri = http://controller:5000 auth_url = http://controller:5000 auth_type = password project_domain_id = default user_domain_id = default project_name = service username = masakari password = P at ssword [database] connection = mysql+pymysql://masakari:P at ssword@controller/masakari -------------- next part -------------- An HTML attachment was scrubbed... URL: From vungoctan252 at gmail.com Sun Jul 7 04:08:07 2019 From: vungoctan252 at gmail.com (Vu Tan) Date: Sun, 7 Jul 2019 11:08:07 +0700 Subject: [masakari] how to install masakari on centos 7 Message-ID: Vu Tan 10:30 AM (35 minutes ago) to openstack-discuss Sorry, I resend this email because I realized that I lacked of prefix on this email's subject Hi, I would like to use Masakari and I'm having trouble finding a step by step or other documentation to get started with. Which part should be installed on controller, which is should be on compute, and what is the prerequisite to install masakari, I have installed corosync and pacemaker on compute and controller nodes, , what else do I need to do ? 
step I have done so far: - installed corosync/pacemaker - install masakari on compute node on this github repo: https://github.com/openstack/masakari - add masakari in to mariadb here is my configuration file of masakari.conf, do you mind to take a look at it, if I have misconfigured anything? [DEFAULT] enabled_apis = masakari_api # Enable to specify listening IP other than default masakari_api_listen = controller # Enable to specify port other than default masakari_api_listen_port = 15868 debug = False auth_strategy=keystone [wsgi] # The paste configuration file path api_paste_config = /etc/masakari/api-paste.ini [keystone_authtoken] www_authenticate_uri = http://controller:5000 auth_url = http://controller:5000 auth_type = password project_domain_id = default user_domain_id = default project_name = service username = masakari password = P at ssword [database] connection = mysql+pymysql://masakari:P at ssword@controller/masakari -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaetan.trellu at incloudus.com Sun Jul 7 12:37:05 2019 From: gaetan.trellu at incloudus.com (=?ISO-8859-1?Q?Ga=EBtan_Trellu?=) Date: Sun, 07 Jul 2019 08:37:05 -0400 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: Message-ID: <09a3849b-786e-49ed-a197-5e13af0428bf@email.android.com> An HTML attachment was scrubbed... URL: From moguimar at redhat.com Sun Jul 7 13:09:10 2019 From: moguimar at redhat.com (Moises Guimaraes de Medeiros) Date: Sun, 7 Jul 2019 15:09:10 +0200 Subject: [oslo] oslo.config /castellan poster review Message-ID: Hi, This week I'll be presenting a poster about oslo.config's castellan driver at EuroPython. I'd like to ask y'all interested in the subject to take a look at my poster. I'm planning to print it this Tuesday and I still have some spare space to fit a bit more. The latest version is available at: https://ep2019.europython.eu/media/conference/slides/m7RV4BB-protecting-secrets-with-osloconfig-and-hashicorp-vault.pdf Thanks a lot! -- Moisés Guimarães Software Engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Sun Jul 7 13:41:50 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sun, 07 Jul 2019 22:41:50 +0900 Subject: [qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast In-Reply-To: <16afe9a95fa.ff3031a3146625.6650617153857325463@ghanshyammann.com> References: <16a86db6ccd.d787148123989.2198391414179782565@ghanshyammann.com> <16af8ac7fc7.fa901245105341.2925519493395080868@ghanshyammann.com> <16afe9a95fa.ff3031a3146625.6650617153857325463@ghanshyammann.com> Message-ID: <16bccab5abc.efb5a50b274261.309866536645601231@ghanshyammann.com> ---- On Tue, 28 May 2019 22:21:44 +0900 Ghanshyam Mann wrote ---- > ---- On Mon, 27 May 2019 18:43:35 +0900 Ghanshyam Mann wrote ---- > > ---- On Tue, 07 May 2019 07:06:23 +0900 Morgan Fainberg wrote ---- > > > > > > > > > On Sun, May 5, 2019 at 12:19 AM Ghanshyam Mann wrote: > > > > > > For the "Integrated-gate-identity", I have a slight worry that we might lose some coverage with this change. I am unsure of how varied the use of Keystone is outside of KeystoneMiddleware (i.e. token validation) consumption that all services perform, Heat (not part of the integrated gate) and it's usage of Trusts, and some newer emerging uses such as "look up limit data" (potentially in Train, would be covered by Nova). 
Worst case, we could run all the integrated tests for Keystone changes (at least initially) until we have higher confidence and minimize the tests once we have a clearer audit of how the services use Keystone. The changes would speed up/minimize the usage for the other services directly and Keystone can follow down the line. > > > I want to be as close to 100% sure we're not going to suddenly break everyone because of some change we land. Keystone fortunately and unfortunately sits below most other services in an OpenStack deployment and is heavily relied throughout almost every single request. > > > --Morgan > > > > > > Thanks Morgan. That was what we were worried during PTG discussion. I agree with your point about not to lose coverage and first get to know how Keystone is being used by each service. Let's keep running the all service tests for keystone gate as of now and later we can shorten the tests run based on the clarity of usage. > > We can disable the ssh validation for "Integrated-gate-identity" which keystone does not need to care about. This can save the keystone gate for ssh timeout failure. > > -gmann > > > > > -gmann > > > > > > > Current integrated-gate jobs (tempest-full) is not so stable for various bugs specially timeout. We tried > > > to improve it via filtering the slow tests in the separate tempest-slow job but the situation has not been improved much. > > > > > > We talked about the Ideas to make it more stable and fast for projects especially when failure is not > > > related to each project. We are planning to split the integrated-gate template (only tempest-full job as > > > first step) per related services. > > > > > > Idea: > > > - Run only dependent service tests on project gate. > > > - Tempest gate will keep running all the services tests as the integrated gate at a centeralized place without any change in the current job. > > > - Each project can run the below mentioned template. > > > - All below template will be defined and maintained by QA team. > > > > > > I would like to know each 6 services which run integrated-gate jobs > > > > > > 1."Integrated-gate-networking" (job to run on neutron gate) > > > Tests to run in this template: neutron APIs , nova APIs, keystone APIs ? All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > > > Improvement for neutron gate: exlcude the cinder API tests, glance API tests, swift API tests, > > > > > > 2."Integrated-gate-storage" (job to run on cinder gate, glance gate) > > > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs, Nova APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > > > Improvement for cinder, glance gate: excluded the neutron APIs tests, Keystone APIs tests > > > > > > 3. "Integrated-gate-object-storage" (job to run on swift gate) > > > Tests to run in this template: Cinder APIs , Glance APIs, Swift APIs and All scenario currently running in tempest-full in the same way ( means non-slow and in serial) > > > Improvement for swift gate: excluded the neutron APIs tests, - Keystone APIs tests, - Nova APIs tests. > > > Note: swift does not run integrated-gate as of now. > > > > > > 4. 
"Integrated-gate-compute" (job to run on Nova gate) > > > tests to run is : Nova APIs, Cinder APIs , Glance APIs ?, neutron APIs and All scenario currently running in tempest-full in same way ( means non-slow and in serial) > > > Improvement for Nova gate: excluded the swift APIs tests(not running in current job but in future, it might), Keystone API tests. > > > > > > 5. "Integrated-gate-identity" (job to run on keystone gate) > > > Tests to run is : all as all project use keystone, we might need to run all tests as it is running in integrated-gate. > > > But does keystone is being unsed differently by all services? if no then, is it enough to run only single service tests say Nova or neutron ? > > > > > > 6. "Integrated-gate-placement" (job to run on placement gate) > > > Tests to run in this template: Nova APIs tests, Neutron APIs tests + scenario tests + any new service depends on placement APIs > > > Improvement for placement gate: excluded the glance APIs tests, cinder APIs tests, swift APIs tests, keystone APIs tests > > > I have prepared the new template for integrated gate testing[1] and tested in DNM patch [2]. You can observe ~20 min less time on new jobs(except compute one). But the main thing is will improve the stability of gate. Once they are merged, I will propose the patch to replace those template on the projects gate. NOTE: Along with APIs tests, I have back listed the non-dependent scenario tests also. [1] https://review.opendev.org/#/q/topic:refactor-integrated-gate-testing+(status:open+OR+status:merged) [2] https://review.opendev.org/#/c/669313/ -gmann > > > Thoughts on this approach? > > > > > > The important point is we must not lose the coverage of integrated testing per project. So I would like to > > > get each project view if we are missing any dependency (proposed tests removal) in above proposed templates. > > > > > > - https://etherpad.openstack.org/p/qa-train-ptg > > > > > > -gmann > > > > > > > > > > > > From doug at doughellmann.com Sun Jul 7 14:21:07 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Sun, 07 Jul 2019 10:21:07 -0400 Subject: [oslo] oslo.config /castellan poster review In-Reply-To: References: Message-ID: Moises Guimaraes de Medeiros writes: > Hi, > > This week I'll be presenting a poster about oslo.config's castellan driver > at EuroPython. I'd like to ask y'all interested in the subject to take a > look at my poster. I'm planning to print it this Tuesday and I still have > some spare space to fit a bit more. > > The latest version is available at: > > https://ep2019.europython.eu/media/conference/slides/m7RV4BB-protecting-secrets-with-osloconfig-and-hashicorp-vault.pdf > > Thanks a lot! > > -- > > Moisés Guimarães > > Software Engineer > > Red Hat > > That looks great, Moises, nice work! -- Doug From ruijing.guo at intel.com Mon Jul 8 00:47:28 2019 From: ruijing.guo at intel.com (Guo, Ruijing) Date: Mon, 8 Jul 2019 00:47:28 +0000 Subject: [Neutron] NUMA aware VxLAN Message-ID: <2EE296D083DF2940BF4EBB91D39BB89F40CC0832@SHSMSX104.ccr.corp.intel.com> Hi, Existing neutron ML2 support one VxLAN for tenant network. In NUMA case, VM 0 can be created in node 0 and VM 1 can be created in node 1 and VxLAN is in node 0. VM1 need to cross node, which cause some performance downgrade. Does someone have this performance issue? Does Neutron community have plan to enhance it? Thanks, -Ruijing -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aj at suse.com Mon Jul 8 05:24:54 2019 From: aj at suse.com (Andreas Jaeger) Date: Mon, 8 Jul 2019 07:24:54 +0200 Subject: [docs][tc] proper retirement of repos In-Reply-To: References: Message-ID: <442f7f8e-6ba0-393e-e99d-7d59dc1e4911@suse.com> On 06/07/2019 19.38, Mohammed Naser wrote: > [Documentation] openstack/api-site has extra files, please remove: > .gitignore, .zuul.yaml, LICENSE, api-quick-start, api-ref, bindep.txt, > common, doc-tools-check-languages.conf, firstapp, > test-requirements.txt, tools, tox.ini, www > [Documentation] openstack/faafo has extra files, please remove: > .gitignore, CONTRIBUTING.rst, LICENSE, Vagrantfile, bin, contrib, doc, > etc, faafo, requirements.txt, setup.cfg, setup.py, > test-requirements.txt, tox.ini These repos were not really retired, they were just removed out of docs ownership but need a new owner, it still contains files for developer.openstack.org. api-site is still active. I would propose to remove content we cannot maintain and move it back into docs if no other owner is found. Is there a better way to flag this in governance repo? Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg GF: Nils Brauckmann, Felix Imendörffer, Enrica Angelone, HRB 247165 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From aj at suse.com Mon Jul 8 05:28:05 2019 From: aj at suse.com (Andreas Jaeger) Date: Mon, 8 Jul 2019 07:28:05 +0200 Subject: [docs][fuel][infra][ironic][neutron][charms][openstack-ansible][solum][tc] proper retirement of repos In-Reply-To: References: Message-ID: On 06/07/2019 19.38, Mohammed Naser wrote: > Hi everyone, > > One of the issue that we recently ran into was the fact that there was > some inconsistency about merging retirement of repositories inside > governance without the code being fully removed. > > In order to avoid this, I've made a change to our governance > repository which will enforce that no code exists in those retired > repositories, however, this has surfaced that some repositories were > retired with some stale files, some are smaller littler files, some > are entire projects still. > [...] Fuel and some networking repos were not retired (see comments in this thread), instead they were moved out of governance. So, your check is too aggressive, it catches not only the RETIRED case but also the "still active but not under governance" case. AFAIK there's no differentiation in governance repo for these, Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg GF: Nils Brauckmann, Felix Imendörffer, Enrica Angelone, HRB 247165 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From ykarel at redhat.com Mon Jul 8 06:07:56 2019 From: ykarel at redhat.com (Yatin Karel) Date: Mon, 8 Jul 2019 11:37:56 +0530 Subject: [TripleO] Using TripleO standalone with ML2/OVS In-Reply-To: <8F8016EB-7898-4F66-BD30-998ABDB094FB@redhat.com> References: <8F8016EB-7898-4F66-BD30-998ABDB094FB@redhat.com> Message-ID: Hi Slawek, So from workflow perspective you just need to pass an environment file with correct set of parameters for ovs(resource_registry and parameter_defaults) to "openstack tripleo deploy" command in the end(to override the defaults) with -e . In CI multinode ovs jobs are running with environment files containing ovs specific parameters. For Multinode ovs scenario the environment file [1]. 
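The ovs-specific part of such an environment file mostly comes down to disabling the default OVN services and enabling the agent-based ML2/OVS ones, plus the ML2 parameters. The following is a hedged sketch only, since service template paths and defaults differ per release; copy the real resource_registry entries from the scenario007 environment referenced in this thread rather than from here:

    # custom-ovs.yaml (illustrative; paths in angle brackets are placeholders)
    resource_registry:
      # turn the default OVN services off
      OS::TripleO::Services::OVNController: OS::Heat::None
      OS::TripleO::Services::OVNDBs: OS::Heat::None
      OS::TripleO::Services::OVNMetadataAgent: OS::Heat::None
      # enable the ML2/OVS agents instead
      OS::TripleO::Services::NeutronOvsAgent: <path to neutron-ovs-agent service template>
      OS::TripleO::Services::NeutronL3Agent: <path to neutron-l3 service template>
      OS::TripleO::Services::NeutronDhcpAgent: <path to neutron-dhcp service template>
      OS::TripleO::Services::NeutronMetadataAgent: <path to neutron-metadata service template>

    parameter_defaults:
      NeutronMechanismDrivers: 'openvswitch'
      NeutronTypeDrivers: 'vxlan,vlan,flat'
      NeutronNetworkType: 'vxlan'

It is then appended to the standalone deployment as one more "-e custom-ovs.yaml" at the end of the "openstack tripleo deploy" command, as described above.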
Standalone ovs work(creation of environment file and CI job) is WIP by Marios[1]. I just rechecked the job to see where it's stuck as last failures were unrelated to ovs. @Marios Andreou can share exact status and @Slawomir Kaplonski you might also help in clearing it. From job logs of WIP patch you can check command for reference[3]. You can use standalone environment files directly or use them as a reference and create custom environment files as per your use case. I have not deployed myself standalone with ovs or any other backend so can't tell exact parameters but if i try i will start with these references until someone guides for exact set of parameters someone uses for standlone ovs deployment. May be someone who have tried it will post on this mailing list. [1] https://github.com/openstack/tripleo-heat-templates/blob/master/ci/environments/scenario007-multinode-containers.yaml [2] https://review.opendev.org/#/q/topic:scenario007+(status:open+OR+status:merged) [3] http://logs.openstack.org/97/631497/18/check/tripleo-ci-centos-7-scenario007-standalone/a5ee2d3/logs/undercloud/home/zuul/standalone.sh.txt.gz Regards Yatin Karel On Sat, Jul 6, 2019 at 2:33 PM Slawek Kaplonski wrote: > > Hi, > > I was trying to use TripleO standalone for development work in the way how Emilien described it in [1] and indeed it works quite well. Thx Emilien. > But now, I’m trying to find out the way how to deploy it with Neutron using ML2/OVS instead of default in TripleO ML2/OVN. > And I still don’t know how to do it :/ > I know it’s my fault but maybe someone can help me with this and tell me what exactly options I should change there to deploy it with other Neutron backend? > Thx in advance for any help. > > [1] https://my1.fr/blog/developer-workflow-with-tripleo/ > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > From sbauza at redhat.com Mon Jul 8 07:45:17 2019 From: sbauza at redhat.com (Sylvain Bauza) Date: Mon, 8 Jul 2019 09:45:17 +0200 Subject: [ops] [nova] [placement] Mismatch between allocations and instances In-Reply-To: References: Message-ID: On Fri, Jul 5, 2019 at 10:21 PM Matt Riedemann wrote: > On 7/5/2019 1:45 AM, Massimo Sgaravatto wrote: > > I tried to check the allocations on each compute node of a Ocata cloud, > > using the command: > > > > curl -s ${PLACEMENT_ENDPOINT}/resource_providers/${UUID}/allocations -H > > "x-auth-token: $TOKEN" | python -m json.tool > > > > Just FYI you can use osc-placement (openstack client plugin) for command > line: > > https://docs.openstack.org/osc-placement/latest/index.html > > > I found that, on a few compute nodes, there are some instances for which > > there is not a corresponding allocation. > > The heal_allocations command [1] might be able to find and fix these up > for you. The bad news for you is that heal_allocations wasn't added > until Rocky and you're on Ocata. The good news is you should be able to > take the current version of the code from master (or stein) and run that > in a container or virtual environment against your Ocata cloud (this > would be particularly useful if you want to use the --dry-run or > --instance options added in Train). You could also potentially backport > those changes to your internal branch, or we could start a discussion > upstream about backporting that tooling to stable branches - though > going to Ocata might be a bit much at this point given Ocata and Pike > are in extended maintenance mode [2]. 
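For reference, a minimal sketch of what that looks like in practice once osc-placement and a recent enough nova-manage are available (the --dry-run/--instance options are the ones mentioned above, so check "nova-manage placement heal_allocations --help" on the code you actually run):

    # confirm the instance really has no allocations
    # (the consumer uuid is normally the instance uuid)
    openstack resource provider allocation show <instance_uuid>

    # preview what would be healed, then heal a single instance
    nova-manage placement heal_allocations --dry-run
    nova-manage placement heal_allocations --instance <instance_uuid>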
> > As for *why* the instances on those nodes are missing allocations, it's > hard to say without debugging things. The allocation and resource > tracking code has changed quite a bit since Ocata (in Pike the scheduler > started creating the allocations but the resource tracker in the compute > service could still overwrite those allocations if you had older nodes > during a rolling upgrade). My guess would be a migration failed or there > was just a bug in Ocata where we didn't cleanup or allocate properly. > Again, heal_allocations should add the missing allocation for you if you > can setup the environment to run that command. > > > > > On another Rocky cloud, we had the opposite problem: there were > > allocations also for some instances that didn't exist anymore. > > And this caused problems since we were not able to use all the resources > > of the relevant compute nodes: we had to manually remove the fwrong" > > allocations to fix the problem ... > > Yup, this could happen for different reasons, usually all due to known > bugs for which you don't have the fix yet, e.g. [3][4], or something is > failing during a migration and we aren't cleaning up properly (an > unreported/not-yet-fixed bug). > > > > > > > I wonder why/how this problem can happen ... > > I mentioned some possibilities above - but I'm sure there are other bugs > that have been fixed which I've omitted here, or things that aren't > fixed yet, especially in failure scenarios (rollback/cleanup handling is > hard). > > Note that your Ocata and Rocky cases could be different because since > Queens (once all compute nodes are >=Queens) during resize, cold and > live migration the migration record in nova holds the source node > allocations during the migration so the actual *consumer* of the > allocations for a provider in placement might not be an instance > (server) record but actually a migration, so if you were looking for an > allocation consumer by ID in nova using something like "openstack server > show $consumer_id" it might return NotFound because the consumer is > actually not an instance but a migration record and the allocation was > leaked. > > > > > And how can we fix the issue ? Should we manually add the missing > > allocations / manually remove the wrong ones ? > > Coincidentally a thread related to this [5] re-surfaced a couple of > weeks ago. I am not sure what Sylvain's progress is on that audit tool, > but the linked bug in that email has some other operator scripts you > could try for the case that there are leaked/orphaned allocations on > compute nodes that no longer have instances. > > Yeah, I'm fighting off with the change due to some issues, but I'll hopefully upload the change by the next days. -Sylvain > > > Thanks, Massimo > > > > > > [1] https://docs.openstack.org/nova/latest/cli/nova-manage.html#placement > [2] https://docs.openstack.org/project-team-guide/stable-branches.html > [3] https://bugs.launchpad.net/nova/+bug/1825537 > [4] https://bugs.launchpad.net/nova/+bug/1821594 > [5] > > http://lists.openstack.org/pipermail/openstack-discuss/2019-June/007241.html > > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... 
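For the opposite, Rocky-style case (allocations left behind by instances that no longer exist), the manual cleanup can also be done with osc-placement instead of raw curl; a cautious sketch, where $RP_UUID is the compute node's resource provider and a consumer is only deleted after confirming it is neither an existing instance nor an in-progress migration:

    # per-resource-class usage on the compute node provider
    openstack resource provider usage show $RP_UUID

    # list the consumers still holding allocations on that provider
    # (same data as the curl to /resource_providers/$RP_UUID/allocations shown
    # earlier; the --allocations option needs a reasonably recent osc-placement)
    openstack resource provider show $RP_UUID --allocations

    # inspect a suspicious consumer; "openstack server show <consumer_uuid>"
    # returning NotFound can mean a deleted instance or a migration record
    openstack resource provider allocation show <consumer_uuid>

    # only once confirmed orphaned, drop its allocations
    openstack resource provider allocation delete <consumer_uuid>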
URL: From massimo.sgaravatto at gmail.com Mon Jul 8 07:46:01 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Mon, 8 Jul 2019 09:46:01 +0200 Subject: [ops] [nova] [placement] Mismatch between allocations and instances In-Reply-To: References: Message-ID: On Fri, Jul 5, 2019 at 10:18 PM Matt Riedemann wrote: > On 7/5/2019 1:45 AM, Massimo Sgaravatto wrote: > > I tried to check the allocations on each compute node of a Ocata cloud, > > using the command: > > > > curl -s ${PLACEMENT_ENDPOINT}/resource_providers/${UUID}/allocations -H > > "x-auth-token: $TOKEN" | python -m json.tool > > > > Just FYI you can use osc-placement (openstack client plugin) for command > line: > > https://docs.openstack.org/osc-placement/latest/index.html > > Ok, thanks In the Rocky cloud I had to manually install the python2-osc-placement package. At least for centos7 it is not required by python2-openstackclient > > I found that, on a few compute nodes, there are some instances for which > > there is not a corresponding allocation. > > The heal_allocations command [1] might be able to find and fix these up > for you. The bad news for you is that heal_allocations wasn't added > until Rocky and you're on Ocata. Since in 1 week the Ocata cloud we'll be migrated to Rocky, I can wait ... :-) Thanks a lot for your help ! Cheers, Massimo -------------- next part -------------- An HTML attachment was scrubbed... URL: From katonalala at gmail.com Mon Jul 8 08:46:37 2019 From: katonalala at gmail.com (Lajos Katona) Date: Mon, 8 Jul 2019 10:46:37 +0200 Subject: [docs][fuel][infra][ironic][neutron][charms][openstack-ansible][solum][tc] proper retirement of repos In-Reply-To: References: Message-ID: Hi, For networking-l2gw (and networking-l2gw-tempest-plugin) I can say that it is maintained (with not that great activity, but anyway far from rerirement). Is there anything I have to do to make this "activeness" visible from governance perspective? Lajos Bernard Cafarelli ezt írta (időpont: 2019. júl. 6., Szo, 22:05): > > > On Sat, 6 Jul 2019 at 19:44, Mohammed Naser wrote: > >> Hi everyone, >> >> One of the issue that we recently ran into was the fact that there was >> some inconsistency about merging retirement of repositories inside >> governance without the code being fully removed. >> >> In order to avoid this, I've made a change to our governance >> repository which will enforce that no code exists in those retired >> repositories, however, this has surfaced that some repositories were >> retired with some stale files, some are smaller littler files, some >> are entire projects still. 
>> >> I have compiled a list for every team, with the repos that are not >> properly retired that have extra files (using this change which should >> eventually +1 once we fix it all: https://review.opendev.org/669549) >> >> [Documentation] openstack/api-site has extra files, please remove: >> .gitignore, .zuul.yaml, LICENSE, api-quick-start, api-ref, bindep.txt, >> common, doc-tools-check-languages.conf, firstapp, >> test-requirements.txt, tools, tox.ini, www >> [Documentation] openstack/faafo has extra files, please remove: >> .gitignore, CONTRIBUTING.rst, LICENSE, Vagrantfile, bin, contrib, doc, >> etc, faafo, requirements.txt, setup.cfg, setup.py, >> test-requirements.txt, tox.ini >> >> [fuel] openstack/fuel-agent has extra files, please remove: >> .gitignore, LICENSE, MAINTAINERS, cloud-init-templates, contrib, >> debian, etc, fuel_agent, requirements.txt, run_tests.sh, setup.cfg, >> setup.py, specs, test-requirements.txt, tools, tox.ini >> [fuel] openstack/fuel-astute has extra files, please remove: >> .gitignore, .rspec, .ruby-version, Gemfile, LICENSE, MAINTAINERS, >> Rakefile, astute.gemspec, astute.service, astute.sysconfig, bin, >> bindep.txt, debian, examples, lib, mcagents, run_tests.sh, spec, >> specs, tests >> [fuel] openstack/fuel-library has extra files, please remove: >> .gitignore, CHANGELOG, Gemfile, LICENSE, MAINTAINERS, Rakefile, >> debian, deployment, files, graphs, logs, specs, tests, utils >> [fuel] openstack/fuel-main has extra files, please remove: .gitignore, >> 00-debmirror.patch, LICENSE, MAINTAINERS, Makefile, config.mk, >> fuel-release, iso, mirror, packages, prepare-build-env.sh, >> report-changelog.sh, repos.mk, requirements-fuel-rpm.txt, >> requirements-rpm.txt, rules.mk, sandbox.mk, specs >> [fuel] openstack/fuel-menu has extra files, please remove: .gitignore, >> MAINTAINERS, MANIFEST.in, fuelmenu, run_tests.sh, setup.py, specs, >> test-requirements.txt, tox.ini >> [fuel] openstack/fuel-mirror has extra files, please remove: >> .gitignore, .mailmap, MAINTAINERS, perestroika, tox.ini >> [fuel] openstack/fuel-nailgun-agent has extra files, please remove: >> .gitignore, Gemfile, LICENSE, MAINTAINERS, Rakefile, agent, debian, >> nailgun-agent.cron, nailgun-agent.gemspec, run_tests.sh, specs >> [fuel] openstack/fuel-ostf has extra files, please remove: .gitignore, >> LICENSE, MAINTAINERS, MANIFEST.in, etc, fuel_health, fuel_plugin, >> ostf.service, pylintrc, requirements.txt, run_tests.sh, setup.cfg, >> setup.py, specs, test-requirements.txt, tools, tox.ini >> [fuel] openstack/fuel-qa has extra files, please remove: .coveragerc, >> .gitignore, .pylintrc, .pylintrc_gerrit, MAINTAINERS, core, doc, >> fuel_tests, fuelweb_test, gates_tests, packages_tests, pytest.ini, >> run_system_test.py, run_tests.sh, system_test, tox.ini, utils >> [fuel] openstack/fuel-ui has extra files, please remove: >> .eslintignore, .eslintrc.yaml, .gitignore, LICENSE, MAINTAINERS, >> fixtures, gulp, gulpfile.js, karma.config.js, npm-shrinkwrap.json, >> package.json, run_real_plugin_tests.sh, >> run_real_plugin_tests_on_real_nailgun.sh, run_ui_func_tests.sh, specs, >> static, webpack.config.js >> [fuel] openstack/fuel-virtualbox has extra files, please remove: >> .gitignore, MAINTAINERS, actions, clean.sh, config.sh, contrib, >> drivers, dumpkeys.cache, functions, iso, launch.sh, launch_16GB.sh, >> launch_8GB.sh >> [fuel] openstack/fuel-web has extra files, please remove: .gitignore, >> LICENSE, MAINTAINERS, bin, build_docs.sh, debian, docs, nailgun, >> run_tests.sh, specs, systemd, 
tools, tox.ini >> [fuel] openstack/shotgun has extra files, please remove: .coveragerc, >> .gitignore, .testr.conf, CONTRIBUTING.rst, HACKING.rst, LICENSE, >> MAINTAINERS, MANIFEST.in, bin, etc, requirements.txt, setup.cfg, >> setup.py, shotgun, specs, test-requirements.txt, tox.ini >> [fuel] openstack/fuel-dev-tools has extra files, please remove: >> .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, >> HACKING.rst, LICENSE, MAINTAINERS, babel.cfg, contrib, doc, >> fuel_dev_tools, openstack-common.conf, requirements.txt, setup.cfg, >> setup.py, test-requirements.txt, tox.ini, vagrant >> [fuel] openstack/fuel-devops has extra files, please remove: >> .coveragerc, .gitignore, .pylintrc, .pylintrc_gerrit, LICENSE, >> MAINTAINERS, bin, devops, doc, run_tests.sh, samples, setup.cfg, >> setup.py, test-requirements.txt, tox.ini >> [fuel] openstack/fuel-docs has extra files, please remove: .gitignore, >> Makefile, _images, _templates, common_conf.py, conf.py, devdocs, >> examples, glossary, index.rst, make.bat, plugindocs, requirements.txt, >> setup.cfg, setup.py, tox.ini, userdocs >> [fuel] openstack/fuel-nailgun-extension-cluster-upgrade has extra >> files, please remove: .coveragerc, .gitignore, AUTHORS, LICENSE, >> MANIFEST.in, bindep.txt, cluster_upgrade, conftest.py, >> nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, >> specs, test-requirements.txt, tools, tox.ini >> [fuel] openstack/fuel-nailgun-extension-iac has extra files, please >> remove: .gitignore, LICENSE, MANIFEST.in, doc, fuel_external_git, >> requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, >> tools, tox.ini >> [fuel] openstack/fuel-nailgun-extension-converted-serializers has >> extra files, please remove: .coveragerc, .gitignore, LICENSE, >> MANIFEST.in, bindep.txt, conftest.py, converted_serializers, >> nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, >> specs, test-requirements.txt, tools, tox.ini >> [fuel] openstack/fuel-octane has extra files, please remove: >> .coveragerc, .gitignore, .mailmap, Gemfile, Gemfile.lock, HACKING.rst, >> LICENSE, MAINTAINERS, MANIFEST.in, Rakefile, bindep.txt, deploy, >> deployment, docs, misc, octane, requirements.txt, setup.cfg, setup.py, >> specs, test-requirements.txt, tox.ini >> [fuel] openstack/fuel-upgrade has extra files, please remove: .gitignore >> [fuel] openstack/tuning-box has extra files, please remove: >> .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, >> HACKING.rst, LICENSE, MAINTAINERS, MANIFEST.in, TODO, alembic.ini, >> babel.cfg, bindep.txt, doc, examples, openstack-common.conf, >> requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, >> tools, tox.ini, tuning_box >> [fuel] openstack/fuel-plugins has extra files, please remove: >> .gitignore, CHANGELOG.md, CONTRIBUTING.rst, HACKING.rst, LICENSE, >> MAINTAINERS, examples, fuel_plugin_builder, requirements.txt, >> run_tests.sh, setup.cfg, setup.py, test-requirements.txt, tox.ini >> [fuel] openstack/fuel-plugin-murano has extra files, please remove: >> .gitignore, LICENSE, components.yaml, deployment_scripts, >> deployment_tasks.yaml, docs, environment_config.yaml, functions.sh, >> metadata.yaml, node_roles.yaml, pre_build_hook, releasenotes, >> repositories, test-requirements.txt, tox.ini, volumes.yaml >> [fuel] openstack/fuel-plugin-murano-tests has extra files, please >> remove: .gitignore, murano_plugin_tests, openrc.default, >> requirements.txt, tox.ini, utils >> [fuel] openstack/fuel-specs has extra files, please 
remove: >> .gitignore, .testr.conf, LICENSE, doc, images, policy, >> requirements.txt, setup.cfg, setup.py, specs, tests, tools, tox.ini >> [fuel] openstack/fuel-stats has extra files, please remove: >> .gitignore, LICENSE, MAINTAINERS, MANIFEST.in, analytics, collector, >> migration, requirements.txt, setup.py, test-requirements.txt, tools, >> tox.ini >> [fuel] openstack/python-fuelclient has extra files, please remove: >> .gitignore, .testr.conf, MAINTAINERS, MANIFEST.in, fuelclient, >> requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, >> tools, tox.ini >> >> [Infrastructure] opendev/puppet-releasestatus has extra files, please >> remove: .gitignore >> >> [ironic] openstack/python-dracclient has extra files, please remove: >> .gitignore, CONTRIBUTING.rst, HACKING.rst, LICENSE, doc, dracclient, >> requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini >> >> [neutron] openstack/networking-calico has extra files, please remove: >> .coveragerc, .gitignore, .mailmap, .testr.conf, .zuul.yaml, >> CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, RELEASING.md, >> babel.cfg, debian, devstack, doc, networking_calico, playbooks, >> requirements.txt, rpm, setup.cfg, setup.py, test-requirements.txt, >> tox.ini >> [neutron] openstack/networking-l2gw has extra files, please remove: >> .coveragerc, .gitignore, .testr.conf, .zuul.yaml, CONTRIBUTING.rst, >> HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, bindep.txt, contrib, >> debian, devstack, doc, etc, lower-constraints.txt, networking_l2gw, >> openstack-common.conf, requirements.txt, setup.cfg, setup.py, specs, >> test-requirements.txt, tools, tox.ini >> [neutron] openstack/networking-l2gw-tempest-plugin has extra files, >> please remove: .gitignore, .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, >> LICENSE, babel.cfg, contrib, doc, networking_l2gw_tempest_plugin, >> requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini >> [neutron] openstack/networking-onos has extra files, please remove: >> .coveragerc, .gitignore, .mailmap, .pylintrc, .testr.conf, >> CONTRIBUTING.rst, HACKING.rst, LICENSE, PKG-INFO, TESTING.rst, >> babel.cfg, devstack, doc, etc, lower-constraints.txt, networking_onos, >> package, rally-jobs, releasenotes, requirements.txt, setup.cfg, >> setup.py, test-requirements.txt, tools, tox.ini >> [neutron] openstack/neutron-vpnaas has extra files, please remove: >> .coveragerc, .gitignore, .mailmap, .pylintrc, .stestr.conf, >> .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, LICENSE, TESTING.rst, >> babel.cfg, devstack, doc, etc, lower-constraints.txt, neutron_vpnaas, >> playbooks, rally-jobs, releasenotes, requirements.txt, setup.cfg, >> setup.py, test-requirements.txt, tools, tox.ini >> > At least for networking-l2gw* and neutron-vpnaas, I suppose this was > caused by: > > https://opendev.org/openstack/governance/commit/20f95dd947d2f87519b4bb50fb188e6f71deae7c > What it meant is that they are not anymore under neutron governance, but > they were not retired (at least as far as I know). > There were still some recent commits even if minimal activity, and > discussion on team status for neutron-vpnaas. 
> > Not sure about networking-calico status though > > >> >> [OpenStack Charms] openstack/charm-ceph has extra files, please >> remove: .gitignore >> >> [OpenStackAnsible] openstack/openstack-ansible-os_monasca has extra >> files, please remove: tests, tox.ini >> >> [solum] openstack/solum-infra-guestagent has extra files, please >> remove: .coveragerc, .gitignore, .mailmap, .testr.conf, >> CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, >> config-generator, doc, etc, requirements.txt, setup.cfg, setup.py, >> solum_guestagent, test-requirements.txt, tox.ini >> >> I'd like to kindly ask the affected teams to help out with this, or >> any member of our community is more than welcome to push a change to >> those repos and work with the appropriate teams to help land it. >> >> Mohammed >> >> -- >> Mohammed Naser — vexxhost >> ----------------------------------------------------- >> D. 514-316-8872 >> D. 800-910-1726 ext. 200 >> E. mnaser at vexxhost.com >> W. http://vexxhost.com >> >> > > -- > Bernard Cafarelli > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bcafarel at redhat.com Mon Jul 8 09:16:22 2019 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Mon, 8 Jul 2019 11:16:22 +0200 Subject: [neutron] Bug deputy report (week starting on 2019-07-01) Message-ID: Hi Neutrinos, time for a new cycle of bug deputy rotation, which means I was on duty last week, checking bugs up to 1835663 included Quite a few bugs in that list worth a read and further discussion: * port status changing to UP when changing the network name (and possible fallout on DHCP issues) * API currently allowing to set gateway outside of the subnet * How to handle external dependencies recommended version bumps (pyroute bump in queens fixing a memory leak) * possible issue in IPv6 address renewal since we started cleaning the dnsmasq leases file Critical: * Wrong endpoints config with configure_auth_token_middleware - https://bugs.launchpad.net/bugs/1834849 Removal of devstack deprecated option broke designate scenario job Fix merged: https://review.opendev.org/668447 High: * [Queens] Memory leak in pyroute2 0.4.21 - https://bugs.launchpad.net/neutron/+bug/1835044 we need a newer pyroute version Patches on requirements with DNM neutron one for testing in progress: https://review.opendev.org/668676 and https://review.opendev.org/668677 openstack-discuss thread: http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007545.html * Bulk-created ports ignore binding_host_id property - https://bugs.launchpad.net/neutron/+bug/1835209 Issue on bulk port creation work Fix in progress: https://review.opendev.org/#/c/665516/ * flood flow in br-tun table22 incorrect - https://bugs.launchpad.net/neutron/+bug/1835163 Linked to https://bugs.launchpad.net/neutron/+bug/1834979 (see below), this can cause hard to trace l2pop and dhcp issues No owner Medium: * Add ipam.utils.check_gateway_invalid_in_subnet unit tests - https://bugs.launchpad.net/neutron/+bug/1835448 Shows proper usage after bug https://bugs.launchpad.net/neutron/+bug/1835344 (see in Opinion section) Review in progress: https://review.opendev.org/669210 * Port status becomes active after updating network/subnet - https://bugs.launchpad.net/neutron/+bug/1834979 The port of a turned off VM will show as UP after a network modification (reproducer with network name) No owner * Some L3 RPCs are time-consuming especially get_routers - https://bugs.launchpad.net/neutron/+bug/1835663 Probably worth discussing in 
performance meeting? Incomplete: * Restart dhcp-agent cause IPv6 vm can't renew IP address, will lost minutes or hours - https://bugs.launchpad.net/neutron/+bug/1835484 Cleaning IPv6 addresses from dnsmasq leases file ( https://bugs.launchpad.net/neutron/+bug/1722126) apparently causes VM to loose its address on renewal Waiting for detailed logs, but if ipv6 experts can chime in Opinion: * neutron doesn't check the validity of gateway_ip as a subnet had been created - https://bugs.launchpad.net/neutron/+bug/1835344 How to handle gateway IP outside of the subnet? Discussion in the bug and related review: https://review.opendev.org/669030 Duplicate: * QoS plugin slows down get_ports operation - https://bugs.launchpad.net/neutron/+bug/1835369 Was filled close to https://bugs.launchpad.net/bugs/1834484 Regards, -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From bcafarel at redhat.com Mon Jul 8 09:36:57 2019 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Mon, 8 Jul 2019 11:36:57 +0200 Subject: [TripleO] Using TripleO standalone with ML2/OVS In-Reply-To: References: <8F8016EB-7898-4F66-BD30-998ABDB094FB@redhat.com> Message-ID: On Mon, 8 Jul 2019 at 08:11, Yatin Karel wrote: > Hi Slawek, > > So from workflow perspective you just need to pass an environment file > with correct set of parameters for ovs(resource_registry and > parameter_defaults) to "openstack tripleo deploy" command in the > end(to override the defaults) with -e . > > In CI multinode ovs jobs are running with environment files containing > ovs specific parameters. For Multinode ovs scenario the environment > file [1]. Standalone ovs work(creation of environment file and CI job) > is WIP by Marios[1]. I just rechecked the job to see where it's stuck > as last failures were unrelated to ovs. @Marios Andreou can share > exact status and @Slawomir Kaplonski you might also help in clearing > it. From job logs of WIP patch you can check command for reference[3]. > > You can use standalone environment files directly or use them as a > reference and create custom environment files as per your use case. > > I have not deployed myself standalone with ovs or any other backend so > can't tell exact parameters but if i try i will start with these > references until someone guides for exact set of parameters someone > uses for standlone ovs deployment. May be someone who have tried it > will post on this mailing list. > Not tested yet, but when the default was switched to OVN, ML2/OVS "default" environment files were updated and should work: https://github.com/openstack/tripleo-heat-templates/commit/6053eb196488a086449f5f2e4fe807825a16bd51#diff-6cac0d1de221a29d330377086cec8599 (before that switch, I know "/usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-standalone.yaml" was working fine to do an OVN standalone deployment) And if the neutron-ovs.yaml/ neutron-ovs-dvr.yaml files do not work out of the box, they sound worth fixing :) So you should be able to just add "-e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovs.yaml" to your openstack tripleo deploy command, and get a standalone ML2/OVS env. 
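To make that concrete, a rough sketch of a standalone deploy command with the ML2/OVS environment file appended at the end (the IP, roles file and extra parameter files are the usual examples from the standalone docs of that era, not values taken from this thread):

sudo openstack tripleo deploy \
  --templates \
  --local-ip=192.168.24.2/24 \
  -r /usr/share/openstack-tripleo-heat-templates/roles/Standalone.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/standalone/standalone-tripleo.yaml \
  -e $HOME/containers-prepare-parameters.yaml \
  -e $HOME/standalone_parameters.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovs.yaml \
  --output-dir $HOME \
  --standalone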
> [1] > https://github.com/openstack/tripleo-heat-templates/blob/master/ci/environments/scenario007-multinode-containers.yaml > [2] > https://review.opendev.org/#/q/topic:scenario007+(status:open+OR+status:merged) > [3] > http://logs.openstack.org/97/631497/18/check/tripleo-ci-centos-7-scenario007-standalone/a5ee2d3/logs/undercloud/home/zuul/standalone.sh.txt.gz > > Regards > Yatin Karel > > On Sat, Jul 6, 2019 at 2:33 PM Slawek Kaplonski > wrote: > > > > Hi, > > > > I was trying to use TripleO standalone for development work in the way > how Emilien described it in [1] and indeed it works quite well. Thx Emilien. > > But now, I’m trying to find out the way how to deploy it with Neutron > using ML2/OVS instead of default in TripleO ML2/OVN. > > And I still don’t know how to do it :/ > > I know it’s my fault but maybe someone can help me with this and tell me > what exactly options I should change there to deploy it with other Neutron > backend? > > Thx in advance for any help. > > > > [1] https://my1.fr/blog/developer-workflow-with-tripleo/ > > > > — > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > > > > > -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From hberaud at redhat.com Mon Jul 8 09:52:41 2019 From: hberaud at redhat.com (Herve Beraud) Date: Mon, 8 Jul 2019 11:52:41 +0200 Subject: [oslo] oslo.config /castellan poster review In-Reply-To: References: Message-ID: Hey Moises, Really interesting thanks! I'll take a look to your github POC. Le dim. 7 juil. 2019 à 16:22, Doug Hellmann a écrit : > Moises Guimaraes de Medeiros writes: > > > Hi, > > > > This week I'll be presenting a poster about oslo.config's castellan > driver > > at EuroPython. I'd like to ask y'all interested in the subject to take a > > look at my poster. I'm planning to print it this Tuesday and I still have > > some spare space to fit a bit more. > > > > The latest version is available at: > > > > > https://ep2019.europython.eu/media/conference/slides/m7RV4BB-protecting-secrets-with-osloconfig-and-hashicorp-vault.pdf > > > > Thanks a lot! > > > > -- > > > > Moisés Guimarães > > > > Software Engineer > > > > Red Hat > > > > > > That looks great, Moises, nice work! > > -- > Doug > > -- Hervé Beraud Senior Software Engineer Red Hat - Openstack Oslo irc: hberaud -----BEGIN PGP SIGNATURE----- wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+ Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+ RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0 qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3 B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O v6rDpkeNksZ9fFSyoY2o =ECSj -----END PGP SIGNATURE----- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bcafarel at redhat.com Mon Jul 8 10:20:26 2019 From: bcafarel at redhat.com (Bernard Cafarelli) Date: Mon, 8 Jul 2019 12:20:26 +0200 Subject: [neutron] Bug deputy report (week starting on 2019-07-01) In-Reply-To: References: Message-ID: Small late-morning update On Mon, 8 Jul 2019 at 11:16, Bernard Cafarelli wrote: > Hi Neutrinos, > > time for a new cycle of bug deputy rotation, which means I was on duty > last week, checking bugs up to 1835663 included > > Quite a few bugs in that list worth a read and further discussion: > * port status changing to UP when changing the network name (and possible > fallout on DHCP issues) > * API currently allowing to set gateway outside of the subnet > * How to handle external dependencies recommended version bumps (pyroute > bump in queens fixing a memory leak) > * possible issue in IPv6 address renewal since we started cleaning the > dnsmasq leases file > > Critical: > * Wrong endpoints config with configure_auth_token_middleware - > https://bugs.launchpad.net/bugs/1834849 > Removal of devstack deprecated option broke designate scenario job > Fix merged: https://review.opendev.org/668447 > > High: > * [Queens] Memory leak in pyroute2 0.4.21 - > https://bugs.launchpad.net/neutron/+bug/1835044 > we need a newer pyroute version > Patches on requirements with DNM neutron one for testing in progress: > https://review.opendev.org/668676 and https://review.opendev.org/668677 > openstack-discuss thread: > http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007545.html > Closed as Won't fix, requirements listed versions are for CI, not for packagers. On the packaging side, I know we have RDO fixes in progress, other packagers on Queens should find this bug relevant > * Bulk-created ports ignore binding_host_id property - > https://bugs.launchpad.net/neutron/+bug/1835209 > Issue on bulk port creation work > Fix in progress: https://review.opendev.org/#/c/665516/ > * flood flow in br-tun table22 incorrect - > https://bugs.launchpad.net/neutron/+bug/1835163 > Linked to https://bugs.launchpad.net/neutron/+bug/1834979 (see below), > this can cause hard to trace l2pop and dhcp issues > No owner > > Medium: > * Add ipam.utils.check_gateway_invalid_in_subnet unit tests - > https://bugs.launchpad.net/neutron/+bug/1835448 > Shows proper usage after bug > https://bugs.launchpad.net/neutron/+bug/1835344 (see in Opinion section) > Review in progress: https://review.opendev.org/669210 > * Port status becomes active after updating network/subnet - > https://bugs.launchpad.net/neutron/+bug/1834979 > The port of a turned off VM will show as UP after a network modification > (reproducer with network name) > No owner > * Some L3 RPCs are time-consuming especially get_routers - > https://bugs.launchpad.net/neutron/+bug/1835663 > Probably worth discussing in performance meeting? > > Incomplete: > * Restart dhcp-agent cause IPv6 vm can't renew IP address, will lost > minutes or hours - https://bugs.launchpad.net/neutron/+bug/1835484 > Cleaning IPv6 addresses from dnsmasq leases file ( > https://bugs.launchpad.net/neutron/+bug/1722126) apparently causes VM to > loose its address on renewal > Waiting for detailed logs, but if ipv6 experts can chime in > > Opinion: > * neutron doesn't check the validity of gateway_ip as a subnet had been > created - https://bugs.launchpad.net/neutron/+bug/1835344 > How to handle gateway IP outside of the subnet? 
> Discussion in the bug and related review: > https://review.opendev.org/669030 > > Duplicate: > * QoS plugin slows down get_ports operation - > https://bugs.launchpad.net/neutron/+bug/1835369 > Was filled close to https://bugs.launchpad.net/bugs/1834484 > > Regards, > -- > Bernard Cafarelli > -- Bernard Cafarelli -------------- next part -------------- An HTML attachment was scrubbed... URL: From jyotishri403 at gmail.com Mon Jul 8 10:46:17 2019 From: jyotishri403 at gmail.com (Jyoti Dahiwele) Date: Mon, 8 Jul 2019 16:16:17 +0530 Subject: Glusterfs support in stein Message-ID: Dear team, Do stein support glusterfs backend in cinder configuration? If yes plz share configuration document. -------------- next part -------------- An HTML attachment was scrubbed... URL: From aj at suse.com Mon Jul 8 10:50:32 2019 From: aj at suse.com (Andreas Jaeger) Date: Mon, 8 Jul 2019 12:50:32 +0200 Subject: [docs][fuel][infra][ironic][neutron][charms][openstack-ansible][solum][tc] proper retirement of repos In-Reply-To: References: Message-ID: <8a2ec881-29d7-7d42-0400-5371e528fa8b@suse.com> On 06/07/2019 19.38, Mohammed Naser wrote: > [...] > [OpenStack Charms] openstack/charm-ceph has extra files, please > remove: .gitignore I would just whitelist this and be fine with it. The cost of adding this (2 changes to project-config, manual infra-root involvement to enable ACLs again plus one change to the repo) is not worth this single file. The README looks fine, this file is just cosmetics. Note: For repos that have full content, I would do those steps but not for just an extra .gitignore, Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From massimo.sgaravatto at gmail.com Mon Jul 8 12:13:13 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Mon, 8 Jul 2019 14:13:13 +0200 Subject: Glusterfs support in stein In-Reply-To: References: Message-ID: We also used it in the past but as far as I remember the gluster driver was removed in Ocata (we had therefore to migrate to the NFSdriver) Cheers, Massimo On Mon, Jul 8, 2019 at 12:50 PM Jyoti Dahiwele wrote: > Dear team, > > Do stein support glusterfs backend in cinder configuration? > If yes plz share configuration document. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Mon Jul 8 12:51:19 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Mon, 8 Jul 2019 07:51:19 -0500 Subject: [cinder] Glusterfs support in stein In-Reply-To: References: Message-ID: <20190708125119.GA15668@sm-workstation> On Mon, Jul 08, 2019 at 02:13:13PM +0200, Massimo Sgaravatto wrote: > We also used it in the past but as far as I remember the gluster driver was > removed in Ocata (we had therefore to migrate to the NFSdriver) > > Cheers, Massimo > This is correct. The GlusterFS driver was marked as deprecated by its maintainers in the Newton release, then officially removed in Ocata. The driver can still be found in those stable branches, but would probably take a not insignificant amount of work to use in later releases. 
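For anyone switching away from the removed GlusterFS driver, a minimal sketch of an NFS backend definition for cinder.conf (the backend name, export and file paths here are examples, not values from this thread):

# append a backend section for the in-tree NFS driver
cat >> /etc/cinder/cinder.conf <<'EOF'
[nfs-1]
volume_backend_name = nfs-1
volume_driver = cinder.volume.drivers.nfs.NfsDriver
nfs_shares_config = /etc/cinder/nfs_shares
nfs_mount_point_base = /var/lib/cinder/mnt
EOF

# one "server:/export" entry per line
echo "nfs-server:/cinder_volumes" > /etc/cinder/nfs_shares

# then add nfs-1 to enabled_backends under [DEFAULT] and restart cinder-volume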
Sean From thierry at openstack.org Mon Jul 8 12:54:53 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 8 Jul 2019 14:54:53 +0200 Subject: [tripleo][charms][helm][kolla][ansible][puppet][chef] Deployment tools capabilities v0.1.0 Message-ID: Hi, deployment tools teams, As mentioned here last month[1], we are working to improve the information present on the deployment tools pages on the OpenStack website, and we need your help! After the Forum session on this topic in Denver[2], a workgroup worked on producing a set of base capabilities that can be asserted by the various deployment tools we have. You can find version 0.1.0 of those capabilities here: https://opendev.org/osf/openstack-map/src/branch/master/deployment_tools_capabilities.yaml As an example, I pushed a change that makes every deployment tool assert the capability to deploy keystone ("components:keystone" tag) at: https://review.opendev.org/#/c/669648/ Now it's your turn. Please have a look at the list of the capabilities above, and propose a change to add those that are relevant to your deployment tool in the following file: https://opendev.org/osf/openstack-map/src/branch/master/deployment_tools.yaml Capabilities are all of the form "category:tag" (components:keystone, starts-from:os-installed, technology:puppet...). Once all deployment projects have completed that task, we'll add the capabilities to the rendered page on the website and allow for basic searching for tools with matching capability. Now, capabilities go only so far in describing your deployment tool. I also encourage you to improve in the same file the "desc" field: that one is directly displayed on the site.Uuse it to describe in more details how your deployment tool actually works and what makes it unique, beyond basic capabilities tags. Please feel free to use this thread (or personal email) if you have questions on this. And thanks in advance for your help! [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-June/006964.html [2] https://etherpad.openstack.org/p/DEN-deployment-tools-capabilities -- Thierry Carrez (ttx) From vungoctan252 at gmail.com Mon Jul 8 08:24:16 2019 From: vungoctan252 at gmail.com (Vu Tan) Date: Mon, 8 Jul 2019 15:24:16 +0700 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: <09a3849b-786e-49ed-a197-5e13af0428bf@email.android.com> References: <09a3849b-786e-49ed-a197-5e13af0428bf@email.android.com> Message-ID: Hi, Thanks a lot for your reply, I install pacemaker/corosync, masakari-api, maskari-engine on controller node, and I run masakari-api with this command: masakari-api, but I dont know whether the process is running like that or is it just hang there, here is what it shows when I run the command, I leave it there for a while but it does not change anything : [root at controller masakari]# masakari-api 2019-07-08 15:21:09.946 30250 INFO masakari.api.openstack [-] Loaded extensions: ['extensions', 'notifications', 'os-hosts', 'segments', 'versions'] 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config [-] The option "__file__" in conf is not known to auth_token 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config [-] The option "here" in conf is not known to auth_token 2019-07-08 15:21:09.960 30250 WARNING keystonemiddleware.auth_token [-] AuthToken middleware is set with keystone_authtoken.service_token_roles_required set to False. This is backwards compatible but deprecated behaviour. Please set this to True. 
2019-07-08 15:21:09.974 30250 INFO masakari.wsgi [-] masakari_api listening on 127.0.0.1:15868 2019-07-08 15:21:09.975 30250 INFO oslo_service.service [-] Starting 4 workers 2019-07-08 15:21:09.984 30274 INFO masakari.masakari_api.wsgi.server [-] (30274) wsgi starting up on http://127.0.0.1:15868 2019-07-08 15:21:09.985 30275 INFO masakari.masakari_api.wsgi.server [-] (30275) wsgi starting up on http://127.0.0.1:15868 2019-07-08 15:21:09.992 30277 INFO masakari.masakari_api.wsgi.server [-] (30277) wsgi starting up on http://127.0.0.1:15868 2019-07-08 15:21:09.994 30276 INFO masakari.masakari_api.wsgi.server [-] (30276) wsgi starting up on http://127.0.0.1:15868 On Sun, Jul 7, 2019 at 7:37 PM Gaëtan Trellu wrote: > Hi Vu Tan, > > Masakari documentation doesn't really exist... I had to figured some stuff > by myself to make it works into Kolla project. > > On controller nodes you need: > > - pacemaker > - corosync > - masakari-api (openstack/masakari repository) > - masakari- engine (openstack/masakari repository) > > On compute nodes you need: > > - pacemaker-remote (integrated to pacemaker cluster as a resource) > - masakari- hostmonitor (openstack/masakari-monitor repository) > - masakari-instancemonitor (openstack/masakari-monitor repository) > - masakari-processmonitor (openstack/masakari-monitor repository) > > For masakari-hostmonitor, the service needs to have access to systemctl > command (make sure you are not using sysvinit). > > For masakari-monitor, the masakari-monitor.conf is a bit different, you > will have to configure the [api] section properly. > > RabbitMQ needs to be configured (as transport_url) on masakari-api and > masakari-engine too. > > Please check this review[1], you will have masakari.conf and > masakari-monitor.conf configuration examples. > > [1] https://review.opendev.org/#/c/615715 > > Gaëtan > > On Jul 7, 2019 12:08 AM, Vu Tan wrote: > > > Vu Tan > 10:30 AM (35 minutes ago) > to openstack-discuss > Sorry, I resend this email because I realized that I lacked of prefix on > this email's subject > > > Hi, > > I would like to use Masakari and I'm having trouble finding a step by step > or other documentation to get started with. Which part should be installed > on controller, which is should be on compute, and what is the prerequisite > to install masakari, I have installed corosync and pacemaker on compute and > controller nodes, , what else do I need to do ? step I have done so far: > - installed corosync/pacemaker > - install masakari on compute node on this github repo: > https://github.com/openstack/masakari > - add masakari in to mariadb > here is my configuration file of masakari.conf, do you mind to take a look > at it, if I have misconfigured anything? > > [DEFAULT] > enabled_apis = masakari_api > > # Enable to specify listening IP other than default > masakari_api_listen = controller > # Enable to specify port other than default > masakari_api_listen_port = 15868 > debug = False > auth_strategy=keystone > > [wsgi] > # The paste configuration file path > api_paste_config = /etc/masakari/api-paste.ini > > [keystone_authtoken] > www_authenticate_uri = http://controller:5000 > auth_url = http://controller:5000 > auth_type = password > project_domain_id = default > user_domain_id = default > project_name = service > username = masakari > password = P at ssword > > [database] > connection = mysql+pymysql://masakari:P at ssword@controller/masakari > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vungoctan252 at gmail.com Mon Jul 8 10:08:46 2019 From: vungoctan252 at gmail.com (Vu Tan) Date: Mon, 8 Jul 2019 17:08:46 +0700 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: References: <09a3849b-786e-49ed-a197-5e13af0428bf@email.android.com> Message-ID: Hi Gaetan, I try to generate config file by using this command tox -egenconfig on top level of masakari but the output is error, is this masakari still in beta version ? [root at compute1 masakari-monitors]# tox -egenconfig genconfig create: /root/masakari-monitors/.tox/genconfig ERROR: InterpreterNotFound: python3 _____________________________________________________________ summary ______________________________________________________________ ERROR: genconfig: InterpreterNotFound: python3 On Mon, Jul 8, 2019 at 3:24 PM Vu Tan wrote: > Hi, > Thanks a lot for your reply, I install pacemaker/corosync, masakari-api, > maskari-engine on controller node, and I run masakari-api with this > command: masakari-api, but I dont know whether the process is running like > that or is it just hang there, here is what it shows when I run the > command, I leave it there for a while but it does not change anything : > [root at controller masakari]# masakari-api > 2019-07-08 15:21:09.946 30250 INFO masakari.api.openstack [-] Loaded > extensions: ['extensions', 'notifications', 'os-hosts', 'segments', > 'versions'] > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > [-] The option "__file__" in conf is not known to auth_token > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > [-] The option "here" in conf is not known to auth_token > 2019-07-08 15:21:09.960 30250 WARNING keystonemiddleware.auth_token [-] > AuthToken middleware is set with > keystone_authtoken.service_token_roles_required set to False. This is > backwards compatible but deprecated behaviour. Please set this to True. > 2019-07-08 15:21:09.974 30250 INFO masakari.wsgi [-] masakari_api > listening on 127.0.0.1:15868 > 2019-07-08 15:21:09.975 30250 INFO oslo_service.service [-] Starting 4 > workers > 2019-07-08 15:21:09.984 30274 INFO masakari.masakari_api.wsgi.server [-] > (30274) wsgi starting up on http://127.0.0.1:15868 > 2019-07-08 15:21:09.985 30275 INFO masakari.masakari_api.wsgi.server [-] > (30275) wsgi starting up on http://127.0.0.1:15868 > 2019-07-08 15:21:09.992 30277 INFO masakari.masakari_api.wsgi.server [-] > (30277) wsgi starting up on http://127.0.0.1:15868 > 2019-07-08 15:21:09.994 30276 INFO masakari.masakari_api.wsgi.server [-] > (30276) wsgi starting up on http://127.0.0.1:15868 > > On Sun, Jul 7, 2019 at 7:37 PM Gaëtan Trellu > wrote: > >> Hi Vu Tan, >> >> Masakari documentation doesn't really exist... I had to figured some >> stuff by myself to make it works into Kolla project. >> >> On controller nodes you need: >> >> - pacemaker >> - corosync >> - masakari-api (openstack/masakari repository) >> - masakari- engine (openstack/masakari repository) >> >> On compute nodes you need: >> >> - pacemaker-remote (integrated to pacemaker cluster as a resource) >> - masakari- hostmonitor (openstack/masakari-monitor repository) >> - masakari-instancemonitor (openstack/masakari-monitor repository) >> - masakari-processmonitor (openstack/masakari-monitor repository) >> >> For masakari-hostmonitor, the service needs to have access to systemctl >> command (make sure you are not using sysvinit). 
>> >> For masakari-monitor, the masakari-monitor.conf is a bit different, you >> will have to configure the [api] section properly. >> >> RabbitMQ needs to be configured (as transport_url) on masakari-api and >> masakari-engine too. >> >> Please check this review[1], you will have masakari.conf and >> masakari-monitor.conf configuration examples. >> >> [1] https://review.opendev.org/#/c/615715 >> >> Gaëtan >> >> On Jul 7, 2019 12:08 AM, Vu Tan wrote: >> >> >> Vu Tan >> 10:30 AM (35 minutes ago) >> to openstack-discuss >> Sorry, I resend this email because I realized that I lacked of prefix on >> this email's subject >> >> >> Hi, >> >> I would like to use Masakari and I'm having trouble finding a step by >> step or other documentation to get started with. Which part should be >> installed on controller, which is should be on compute, and what is the >> prerequisite to install masakari, I have installed corosync and pacemaker >> on compute and controller nodes, , what else do I need to do ? step I >> have done so far: >> - installed corosync/pacemaker >> - install masakari on compute node on this github repo: >> https://github.com/openstack/masakari >> - add masakari in to mariadb >> here is my configuration file of masakari.conf, do you mind to take a >> look at it, if I have misconfigured anything? >> >> [DEFAULT] >> enabled_apis = masakari_api >> >> # Enable to specify listening IP other than default >> masakari_api_listen = controller >> # Enable to specify port other than default >> masakari_api_listen_port = 15868 >> debug = False >> auth_strategy=keystone >> >> [wsgi] >> # The paste configuration file path >> api_paste_config = /etc/masakari/api-paste.ini >> >> [keystone_authtoken] >> www_authenticate_uri = http://controller:5000 >> auth_url = http://controller:5000 >> auth_type = password >> project_domain_id = default >> user_domain_id = default >> project_name = service >> username = masakari >> password = P at ssword >> >> [database] >> connection = mysql+pymysql://masakari:P at ssword@controller/masakari >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaetan.trellu at incloudus.com Mon Jul 8 14:14:55 2019 From: gaetan.trellu at incloudus.com (gaetan.trellu at incloudus.com) Date: Mon, 08 Jul 2019 10:14:55 -0400 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: References: <09a3849b-786e-49ed-a197-5e13af0428bf@email.android.com> Message-ID: <144114a2e83d8e8e30579ddb0ae39e59@incloudus.com> Vu Tan, About "auth_token" error, you need "os_privileged_user_*" options into your masakari.conf for the API. As mentioned previously please have a look here to have an example of configuration working (for me at least): - masakari.conf: https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari.conf.j2 - masakari-monitor.conf: https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari-monitors.conf.j2 About your tox issue make sure you have Python3 installed. Gaëtan On 2019-07-08 06:08, Vu Tan wrote: > Hi Gaetan, > I try to generate config file by using this command tox -egenconfig on > top level of masakari but the output is error, is this masakari still > in beta version ? 
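On the InterpreterNotFound error above: the genconfig tox environment needs a python3 interpreter, which CentOS 7 does not ship by default. A rough sketch of the fix (the package name assumes CentOS 7.7+ or EPEL, adjust to your repositories):

# install a python3 interpreter so tox can build the venv
yum install -y python3

# then re-run from the repository checkout
cd ~/masakari-monitors
tox -e genconfig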
> [root at compute1 masakari-monitors]# tox -egenconfig > genconfig create: /root/masakari-monitors/.tox/genconfig > ERROR: InterpreterNotFound: python3 > _____________________________________________________________ summary > ______________________________________________________________ > ERROR: genconfig: InterpreterNotFound: python3 > > On Mon, Jul 8, 2019 at 3:24 PM Vu Tan wrote: > Hi, > Thanks a lot for your reply, I install pacemaker/corosync, > masakari-api, maskari-engine on controller node, and I run masakari-api > with this command: masakari-api, but I dont know whether the process is > running like that or is it just hang there, here is what it shows when > I run the command, I leave it there for a while but it does not change > anything : > [root at controller masakari]# masakari-api > 2019-07-08 15:21:09.946 30250 INFO masakari.api.openstack [-] Loaded > extensions: ['extensions', 'notifications', 'os-hosts', 'segments', > 'versions'] > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > [-] The option "__file__" in conf is not known to auth_token > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > [-] The option "here" in conf is not known to auth_token > 2019-07-08 15:21:09.960 30250 WARNING keystonemiddleware.auth_token [-] > AuthToken middleware is set with > keystone_authtoken.service_token_roles_required set to False. This is > backwards compatible but deprecated behaviour. Please set this to True. > 2019-07-08 15:21:09.974 30250 INFO masakari.wsgi [-] masakari_api > listening on 127.0.0.1:15868 > 2019-07-08 15:21:09.975 30250 INFO oslo_service.service [-] Starting 4 > workers > 2019-07-08 15:21:09.984 30274 INFO masakari.masakari_api.wsgi.server > [-] (30274) wsgi starting up on http://127.0.0.1:15868 > 2019-07-08 15:21:09.985 30275 INFO masakari.masakari_api.wsgi.server > [-] (30275) wsgi starting up on http://127.0.0.1:15868 > 2019-07-08 15:21:09.992 30277 INFO masakari.masakari_api.wsgi.server > [-] (30277) wsgi starting up on http://127.0.0.1:15868 > 2019-07-08 15:21:09.994 30276 INFO masakari.masakari_api.wsgi.server > [-] (30276) wsgi starting up on http://127.0.0.1:15868 > > On Sun, Jul 7, 2019 at 7:37 PM Gaëtan Trellu > wrote: > > Hi Vu Tan, > > Masakari documentation doesn't really exist... I had to figured some > stuff by myself to make it works into Kolla project. > > On controller nodes you need: > > - pacemaker > - corosync > - masakari-api (openstack/masakari repository) > - masakari- engine (openstack/masakari repository) > > On compute nodes you need: > > - pacemaker-remote (integrated to pacemaker cluster as a resource) > - masakari- hostmonitor (openstack/masakari-monitor repository) > - masakari-instancemonitor (openstack/masakari-monitor repository) > - masakari-processmonitor (openstack/masakari-monitor repository) > > For masakari-hostmonitor, the service needs to have access to systemctl > command (make sure you are not using sysvinit). > > For masakari-monitor, the masakari-monitor.conf is a bit different, you > will have to configure the [api] section properly. > > RabbitMQ needs to be configured (as transport_url) on masakari-api and > masakari-engine too. > > Please check this review[1], you will have masakari.conf and > masakari-monitor.conf configuration examples. 
> > [1] https://review.opendev.org/#/c/615715 > > Gaëtan > > On Jul 7, 2019 12:08 AM, Vu Tan wrote: > > VU TAN > > 10:30 AM (35 minutes ago) > > to openstack-discuss > > Sorry, I resend this email because I realized that I lacked of prefix > on this email's subject > > Hi, > > I would like to use Masakari and I'm having trouble finding a step by > step or other documentation to get started with. Which part should be > installed on controller, which is should be on compute, and what is the > prerequisite to install masakari, I have installed corosync and > pacemaker on compute and controller nodes, , what else do I need to do > ? step I have done so far: > - installed corosync/pacemaker > - install masakari on compute node on this github repo: > https://github.com/openstack/masakari > - add masakari in to mariadb > here is my configuration file of masakari.conf, do you mind to take a > look at it, if I have misconfigured anything? > > [DEFAULT] > enabled_apis = masakari_api > > # Enable to specify listening IP other than default > masakari_api_listen = controller > # Enable to specify port other than default > masakari_api_listen_port = 15868 > debug = False > auth_strategy=keystone > > [wsgi] > # The paste configuration file path > api_paste_config = /etc/masakari/api-paste.ini > > [keystone_authtoken] > www_authenticate_uri = http://controller:5000 > auth_url = http://controller:5000 > auth_type = password > project_domain_id = default > user_domain_id = default > project_name = service > username = masakari > password = P at ssword > > [database] > connection = mysql+pymysql://masakari:P at ssword@controller/masakari -------------- next part -------------- A non-text attachment was scrubbed... Name: blocked.gif Type: image/gif Size: 118 bytes Desc: not available URL: From vungoctan252 at gmail.com Mon Jul 8 14:21:16 2019 From: vungoctan252 at gmail.com (Vu Tan) Date: Mon, 8 Jul 2019 21:21:16 +0700 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: <144114a2e83d8e8e30579ddb0ae39e59@incloudus.com> References: <09a3849b-786e-49ed-a197-5e13af0428bf@email.android.com> <144114a2e83d8e8e30579ddb0ae39e59@incloudus.com> Message-ID: Hi Gaetan, Thanks for pinpoint this out, silly me that did not notice the simple "error InterpreterNotFound: python3". Thanks a lot, I appreciate it On Mon, Jul 8, 2019 at 9:15 PM wrote: > Vu Tan, > > About "auth_token" error, you need "os_privileged_user_*" options into > your masakari.conf for the API. > As mentioned previously please have a look here to have an example of > configuration working (for me at least): > > - masakari.conf: > > https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari.conf.j2 > - masakari-monitor.conf: > > https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari-monitors.conf.j2 > > About your tox issue make sure you have Python3 installed. > > Gaëtan > > On 2019-07-08 06:08, Vu Tan wrote: > > > Hi Gaetan, > > I try to generate config file by using this command tox -egenconfig on > > top level of masakari but the output is error, is this masakari still > > in beta version ? 
> > [root at compute1 masakari-monitors]# tox -egenconfig > > genconfig create: /root/masakari-monitors/.tox/genconfig > > ERROR: InterpreterNotFound: python3 > > _____________________________________________________________ summary > > ______________________________________________________________ > > ERROR: genconfig: InterpreterNotFound: python3 > > > > On Mon, Jul 8, 2019 at 3:24 PM Vu Tan wrote: > > Hi, > > Thanks a lot for your reply, I install pacemaker/corosync, > > masakari-api, maskari-engine on controller node, and I run masakari-api > > with this command: masakari-api, but I dont know whether the process is > > running like that or is it just hang there, here is what it shows when > > I run the command, I leave it there for a while but it does not change > > anything : > > [root at controller masakari]# masakari-api > > 2019-07-08 15:21:09.946 30250 INFO masakari.api.openstack [-] Loaded > > extensions: ['extensions', 'notifications', 'os-hosts', 'segments', > > 'versions'] > > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > > [-] The option "__file__" in conf is not known to auth_token > > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > > [-] The option "here" in conf is not known to auth_token > > 2019-07-08 15:21:09.960 30250 WARNING keystonemiddleware.auth_token [-] > > AuthToken middleware is set with > > keystone_authtoken.service_token_roles_required set to False. This is > > backwards compatible but deprecated behaviour. Please set this to True. > > 2019-07-08 15:21:09.974 30250 INFO masakari.wsgi [-] masakari_api > > listening on 127.0.0.1:15868 > > 2019-07-08 15:21:09.975 30250 INFO oslo_service.service [-] Starting 4 > > workers > > 2019-07-08 15:21:09.984 30274 INFO masakari.masakari_api.wsgi.server > > [-] (30274) wsgi starting up on http://127.0.0.1:15868 > > 2019-07-08 15:21:09.985 30275 INFO masakari.masakari_api.wsgi.server > > [-] (30275) wsgi starting up on http://127.0.0.1:15868 > > 2019-07-08 15:21:09.992 30277 INFO masakari.masakari_api.wsgi.server > > [-] (30277) wsgi starting up on http://127.0.0.1:15868 > > 2019-07-08 15:21:09.994 30276 INFO masakari.masakari_api.wsgi.server > > [-] (30276) wsgi starting up on http://127.0.0.1:15868 > > > > On Sun, Jul 7, 2019 at 7:37 PM Gaëtan Trellu > > wrote: > > > > Hi Vu Tan, > > > > Masakari documentation doesn't really exist... I had to figured some > > stuff by myself to make it works into Kolla project. > > > > On controller nodes you need: > > > > - pacemaker > > - corosync > > - masakari-api (openstack/masakari repository) > > - masakari- engine (openstack/masakari repository) > > > > On compute nodes you need: > > > > - pacemaker-remote (integrated to pacemaker cluster as a resource) > > - masakari- hostmonitor (openstack/masakari-monitor repository) > > - masakari-instancemonitor (openstack/masakari-monitor repository) > > - masakari-processmonitor (openstack/masakari-monitor repository) > > > > For masakari-hostmonitor, the service needs to have access to systemctl > > command (make sure you are not using sysvinit). > > > > For masakari-monitor, the masakari-monitor.conf is a bit different, you > > will have to configure the [api] section properly. > > > > RabbitMQ needs to be configured (as transport_url) on masakari-api and > > masakari-engine too. > > > > Please check this review[1], you will have masakari.conf and > > masakari-monitor.conf configuration examples. 
> > > > [1] https://review.opendev.org/#/c/615715 > > > > Gaëtan > > > > On Jul 7, 2019 12:08 AM, Vu Tan wrote: > > > > VU TAN > > > > 10:30 AM (35 minutes ago) > > > > to openstack-discuss > > > > Sorry, I resend this email because I realized that I lacked of prefix > > on this email's subject > > > > Hi, > > > > I would like to use Masakari and I'm having trouble finding a step by > > step or other documentation to get started with. Which part should be > > installed on controller, which is should be on compute, and what is the > > prerequisite to install masakari, I have installed corosync and > > pacemaker on compute and controller nodes, , what else do I need to do > > ? step I have done so far: > > - installed corosync/pacemaker > > - install masakari on compute node on this github repo: > > https://github.com/openstack/masakari > > - add masakari in to mariadb > > here is my configuration file of masakari.conf, do you mind to take a > > look at it, if I have misconfigured anything? > > > > [DEFAULT] > > enabled_apis = masakari_api > > > > # Enable to specify listening IP other than default > > masakari_api_listen = controller > > # Enable to specify port other than default > > masakari_api_listen_port = 15868 > > debug = False > > auth_strategy=keystone > > > > [wsgi] > > # The paste configuration file path > > api_paste_config = /etc/masakari/api-paste.ini > > > > [keystone_authtoken] > > www_authenticate_uri = http://controller:5000 > > auth_url = http://controller:5000 > > auth_type = password > > project_domain_id = default > > user_domain_id = default > > project_name = service > > username = masakari > > password = P at ssword > > > > [database] > > connection = mysql+pymysql://masakari:P at ssword@controller/masakari -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Mon Jul 8 16:04:15 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Mon, 08 Jul 2019 17:04:15 +0100 Subject: [docs] Retire openstack/docs-specs Message-ID: <79c35e0e14cca630361b0e8caa434618cfce9d24.camel@redhat.com> Hey, I'm proposing retiring docs-specs [1][2]. This hasn't been used in some time since most of our documentation now lives in individual project repositories and any work that would span these multiple projects would likely exist as a community goal, documented in those relevant repositories. I do not intend for the existing content to be removed - it simply won't be possible to add new specs. Let me know if anyone has a reason not to do this. If not, we'll get a move on with this next week. Stephen [1] https://opendev.org/openstack/docs-specs [2] https://review.opendev.org/#/c/668853/ From aj at suse.com Mon Jul 8 17:18:20 2019 From: aj at suse.com (Andreas Jaeger) Date: Mon, 8 Jul 2019 19:18:20 +0200 Subject: [docs] Retire openstack/docs-specs In-Reply-To: <79c35e0e14cca630361b0e8caa434618cfce9d24.camel@redhat.com> References: <79c35e0e14cca630361b0e8caa434618cfce9d24.camel@redhat.com> Message-ID: On 08/07/2019 18.04, Stephen Finucane wrote: > Hey, > > I'm proposing retiring docs-specs [1][2]. This hasn't been used in some > time since most of our documentation now lives in individual project > repositories and any work that would span these multiple projects would > likely exist as a community goal, documented in those relevant > repositories. I do not intend for the existing content to be removed - > it simply won't be possible to add new specs. Retirement means removing the content from the repo, see the Infra manual. 
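For reference, the content-removal step described in the infra manual looks roughly like this (branch name and README wording follow the usual convention; the project-config and governance changes are separate steps):

git checkout -b retire-repo
git rm -r '*'
cat > README.rst <<'EOF'
This project is no longer maintained.

The contents of this repository are still available in the Git
source code management system. To see the contents of this
repository before it reached its end of life, please check out
the previous commit with "git checkout HEAD^1".
EOF
git add README.rst
git commit -m "Retire repository"
git review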
We can keep [1] online for sure... Andreas > > Let me know if anyone has a reason not to do this. If not, we'll get a > move on with this next week. > > Stephen > > [1] https://opendev.org/openstack/docs-specs > [2] https://review.opendev.org/#/c/668853/ > > > -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg GF: Nils Brauckmann, Felix Imendörffer, Enrica Angelone, HRB 247165 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, D 90409 Nürnberg GF: Nils Brauckmann, Felix Imendörffer, Enrica Angelone, HRB 247165 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From openstack at nemebean.com Mon Jul 8 20:29:22 2019 From: openstack at nemebean.com (Ben Nemec) Date: Mon, 8 Jul 2019 15:29:22 -0500 Subject: [oslo] oslo.config /castellan poster review In-Reply-To: References: Message-ID: <0a525f18-d457-5d06-8e22-ebae840e37b6@nemebean.com> On 7/7/19 9:21 AM, Doug Hellmann wrote: > Moises Guimaraes de Medeiros writes: > >> Hi, >> >> This week I'll be presenting a poster about oslo.config's castellan driver >> at EuroPython. I'd like to ask y'all interested in the subject to take a >> look at my poster. I'm planning to print it this Tuesday and I still have >> some spare space to fit a bit more. >> >> The latest version is available at: >> >> https://ep2019.europython.eu/media/conference/slides/m7RV4BB-protecting-secrets-with-osloconfig-and-hashicorp-vault.pdf >> >> Thanks a lot! >> >> -- >> >> Moisés Guimarães >> >> Software Engineer >> >> Red Hat >> >> > > That looks great, Moises, nice work! > +1. Thanks for putting it together! From colleen at gazlene.net Mon Jul 8 20:31:38 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Mon, 08 Jul 2019 13:31:38 -0700 Subject: [keystone] Virtual Midcycle Planning In-Reply-To: <5480a911-8beb-46de-a326-fe5eea6802e5@www.fastmail.com> References: <5480a911-8beb-46de-a326-fe5eea6802e5@www.fastmail.com> Message-ID: On Tue, Jun 25, 2019, at 13:30, Colleen Murphy wrote: > Hi team, > > As discussed in today's meeting, we will be having a virtual midcycle > some time around milestone 2. We'll do two days with one three-hour > session (with breaks) each day. We will do this over a video conference > session, details of how to join will follow closer to the event. > > I've started a brainstorming etherpad: > > https://etherpad.openstack.org/p/keystone-train-midcycle-topics > > Please add discussion topics or hacking ideas to the etherpad and I > will try to sort them. > > We need to decide on when exactly to hold the midcycle. I've created a > doodle poll: > > https://doodle.com/poll/wr7ct4uhpw82sysg > > Please select times and days that you're available and then we'll try > to schedule two back-to-back days (or at least two days in the same > week) for the midcycle. > > Let me know if you have any questions or concerns. > > Colleen > > There were a few top contenders in the poll, we'll go with the following two days: Monday, July 22, 14:00-17:00 UTC Tuesday, July 23, 14:00-17:00 UTC (we will skip the weekly team meeting that would have been at 16:00) Agenda still TBD, will be posted on the topics etherpad[1]. 
[1] https://etherpad.openstack.org/p/keystone-train-midcycle-topics Colleen From rosmaita.fossdev at gmail.com Mon Jul 8 22:44:39 2019 From: rosmaita.fossdev at gmail.com (Brian Rosmaita) Date: Mon, 8 Jul 2019 18:44:39 -0400 Subject: [cinder][stable-maint] July releases from the stable branches Message-ID: I've posted release patches for the stable branches: https://review.openstack.org/#/q/topic:cinderproject-july-2019 Notes on what is/isn't being released are on the etherpad: https://etherpad.openstack.org/p/cinder-releases-tracking I have a question about the semver for the cinder stable/stein release; it's noted on the patch: https://review.opendev.org/669771 cheers, brian From Kevin.Fox at pnnl.gov Tue Jul 9 00:17:47 2019 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Tue, 9 Jul 2019 00:17:47 +0000 Subject: [Neutron] NUMA aware VxLAN In-Reply-To: <2EE296D083DF2940BF4EBB91D39BB89F40CC0832@SHSMSX104.ccr.corp.intel.com> References: <2EE296D083DF2940BF4EBB91D39BB89F40CC0832@SHSMSX104.ccr.corp.intel.com> Message-ID: <1A3C52DFCD06494D8528644858247BF01C3A2651@EX10MBOX03.pnnl.gov> I'm curious. A lot of network cards support offloaded vxlan traffic these days so the processor isn't doing much work. Is this issue really a problem? Thanks, Kevin ________________________________ From: Guo, Ruijing [ruijing.guo at intel.com] Sent: Sunday, July 07, 2019 5:47 PM To: openstack-dev at lists.openstack.org; openstack at lists.openstack.org Subject: [Neutron] NUMA aware VxLAN Hi, Existing neutron ML2 support one VxLAN for tenant network. In NUMA case, VM 0 can be created in node 0 and VM 1 can be created in node 1 and VxLAN is in node 0. VM1 need to cross node, which cause some performance downgrade. Does someone have this performance issue? Does Neutron community have plan to enhance it? Thanks, -Ruijing -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Tue Jul 9 01:14:29 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 09 Jul 2019 10:14:29 +0900 Subject: [docs][fuel][infra][ironic][neutron][charms][openstack-ansible][solum][tc] proper retirement of repos In-Reply-To: References: Message-ID: <16bd44bd9de.c4394238317337.7822051836860954924@ghanshyammann.com> ---- On Mon, 08 Jul 2019 17:46:37 +0900 Lajos Katona wrote ---- > Hi, > For networking-l2gw (and networking-l2gw-tempest-plugin) I can say that it is maintained (with not that great activity, but anyway far from rerirement).Is there anything I have to do to make this "activeness" visible from governance perspective? Lajos, networking-l2gw* had been retired around ~3 years back[1]. If networking-l2gw situation is active and fulfils the neutron stadium criteria, you can propose it back after discussing with neutron team. [1] https://review.opendev.org/#/c/392010/ -gmann > Lajos > Bernard Cafarelli ezt írta (időpont: 2019. júl. 6., Szo, 22:05): > > > On Sat, 6 Jul 2019 at 19:44, Mohammed Naser wrote: > Hi everyone, > > One of the issue that we recently ran into was the fact that there was > some inconsistency about merging retirement of repositories inside > governance without the code being fully removed. > > In order to avoid this, I've made a change to our governance > repository which will enforce that no code exists in those retired > repositories, however, this has surfaced that some repositories were > retired with some stale files, some are smaller littler files, some > are entire projects still. 
> > I have compiled a list for every team, with the repos that are not > properly retired that have extra files (using this change which should > eventually +1 once we fix it all: https://review.opendev.org/669549) > > [Documentation] openstack/api-site has extra files, please remove: > .gitignore, .zuul.yaml, LICENSE, api-quick-start, api-ref, bindep.txt, > common, doc-tools-check-languages.conf, firstapp, > test-requirements.txt, tools, tox.ini, www > [Documentation] openstack/faafo has extra files, please remove: > .gitignore, CONTRIBUTING.rst, LICENSE, Vagrantfile, bin, contrib, doc, > etc, faafo, requirements.txt, setup.cfg, setup.py, > test-requirements.txt, tox.ini > > [fuel] openstack/fuel-agent has extra files, please remove: > .gitignore, LICENSE, MAINTAINERS, cloud-init-templates, contrib, > debian, etc, fuel_agent, requirements.txt, run_tests.sh, setup.cfg, > setup.py, specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-astute has extra files, please remove: > .gitignore, .rspec, .ruby-version, Gemfile, LICENSE, MAINTAINERS, > Rakefile, astute.gemspec, astute.service, astute.sysconfig, bin, > bindep.txt, debian, examples, lib, mcagents, run_tests.sh, spec, > specs, tests > [fuel] openstack/fuel-library has extra files, please remove: > .gitignore, CHANGELOG, Gemfile, LICENSE, MAINTAINERS, Rakefile, > debian, deployment, files, graphs, logs, specs, tests, utils > [fuel] openstack/fuel-main has extra files, please remove: .gitignore, > 00-debmirror.patch, LICENSE, MAINTAINERS, Makefile, config.mk, > fuel-release, iso, mirror, packages, prepare-build-env.sh, > report-changelog.sh, repos.mk, requirements-fuel-rpm.txt, > requirements-rpm.txt, rules.mk, sandbox.mk, specs > [fuel] openstack/fuel-menu has extra files, please remove: .gitignore, > MAINTAINERS, MANIFEST.in, fuelmenu, run_tests.sh, setup.py, specs, > test-requirements.txt, tox.ini > [fuel] openstack/fuel-mirror has extra files, please remove: > .gitignore, .mailmap, MAINTAINERS, perestroika, tox.ini > [fuel] openstack/fuel-nailgun-agent has extra files, please remove: > .gitignore, Gemfile, LICENSE, MAINTAINERS, Rakefile, agent, debian, > nailgun-agent.cron, nailgun-agent.gemspec, run_tests.sh, specs > [fuel] openstack/fuel-ostf has extra files, please remove: .gitignore, > LICENSE, MAINTAINERS, MANIFEST.in, etc, fuel_health, fuel_plugin, > ostf.service, pylintrc, requirements.txt, run_tests.sh, setup.cfg, > setup.py, specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-qa has extra files, please remove: .coveragerc, > .gitignore, .pylintrc, .pylintrc_gerrit, MAINTAINERS, core, doc, > fuel_tests, fuelweb_test, gates_tests, packages_tests, pytest.ini, > run_system_test.py, run_tests.sh, system_test, tox.ini, utils > [fuel] openstack/fuel-ui has extra files, please remove: > .eslintignore, .eslintrc.yaml, .gitignore, LICENSE, MAINTAINERS, > fixtures, gulp, gulpfile.js, karma.config.js, npm-shrinkwrap.json, > package.json, run_real_plugin_tests.sh, > run_real_plugin_tests_on_real_nailgun.sh, run_ui_func_tests.sh, specs, > static, webpack.config.js > [fuel] openstack/fuel-virtualbox has extra files, please remove: > .gitignore, MAINTAINERS, actions, clean.sh, config.sh, contrib, > drivers, dumpkeys.cache, functions, iso, launch.sh, launch_16GB.sh, > launch_8GB.sh > [fuel] openstack/fuel-web has extra files, please remove: .gitignore, > LICENSE, MAINTAINERS, bin, build_docs.sh, debian, docs, nailgun, > run_tests.sh, specs, systemd, tools, tox.ini > [fuel] openstack/shotgun has extra files, 
please remove: .coveragerc, > .gitignore, .testr.conf, CONTRIBUTING.rst, HACKING.rst, LICENSE, > MAINTAINERS, MANIFEST.in, bin, etc, requirements.txt, setup.cfg, > setup.py, shotgun, specs, test-requirements.txt, tox.ini > [fuel] openstack/fuel-dev-tools has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, > HACKING.rst, LICENSE, MAINTAINERS, babel.cfg, contrib, doc, > fuel_dev_tools, openstack-common.conf, requirements.txt, setup.cfg, > setup.py, test-requirements.txt, tox.ini, vagrant > [fuel] openstack/fuel-devops has extra files, please remove: > .coveragerc, .gitignore, .pylintrc, .pylintrc_gerrit, LICENSE, > MAINTAINERS, bin, devops, doc, run_tests.sh, samples, setup.cfg, > setup.py, test-requirements.txt, tox.ini > [fuel] openstack/fuel-docs has extra files, please remove: .gitignore, > Makefile, _images, _templates, common_conf.py, conf.py, devdocs, > examples, glossary, index.rst, make.bat, plugindocs, requirements.txt, > setup.cfg, setup.py, tox.ini, userdocs > [fuel] openstack/fuel-nailgun-extension-cluster-upgrade has extra > files, please remove: .coveragerc, .gitignore, AUTHORS, LICENSE, > MANIFEST.in, bindep.txt, cluster_upgrade, conftest.py, > nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, > specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-nailgun-extension-iac has extra files, please > remove: .gitignore, LICENSE, MANIFEST.in, doc, fuel_external_git, > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > tools, tox.ini > [fuel] openstack/fuel-nailgun-extension-converted-serializers has > extra files, please remove: .coveragerc, .gitignore, LICENSE, > MANIFEST.in, bindep.txt, conftest.py, converted_serializers, > nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, > specs, test-requirements.txt, tools, tox.ini > [fuel] openstack/fuel-octane has extra files, please remove: > .coveragerc, .gitignore, .mailmap, Gemfile, Gemfile.lock, HACKING.rst, > LICENSE, MAINTAINERS, MANIFEST.in, Rakefile, bindep.txt, deploy, > deployment, docs, misc, octane, requirements.txt, setup.cfg, setup.py, > specs, test-requirements.txt, tox.ini > [fuel] openstack/fuel-upgrade has extra files, please remove: .gitignore > [fuel] openstack/tuning-box has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, > HACKING.rst, LICENSE, MAINTAINERS, MANIFEST.in, TODO, alembic.ini, > babel.cfg, bindep.txt, doc, examples, openstack-common.conf, > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > tools, tox.ini, tuning_box > [fuel] openstack/fuel-plugins has extra files, please remove: > .gitignore, CHANGELOG.md, CONTRIBUTING.rst, HACKING.rst, LICENSE, > MAINTAINERS, examples, fuel_plugin_builder, requirements.txt, > run_tests.sh, setup.cfg, setup.py, test-requirements.txt, tox.ini > [fuel] openstack/fuel-plugin-murano has extra files, please remove: > .gitignore, LICENSE, components.yaml, deployment_scripts, > deployment_tasks.yaml, docs, environment_config.yaml, functions.sh, > metadata.yaml, node_roles.yaml, pre_build_hook, releasenotes, > repositories, test-requirements.txt, tox.ini, volumes.yaml > [fuel] openstack/fuel-plugin-murano-tests has extra files, please > remove: .gitignore, murano_plugin_tests, openrc.default, > requirements.txt, tox.ini, utils > [fuel] openstack/fuel-specs has extra files, please remove: > .gitignore, .testr.conf, LICENSE, doc, images, policy, > requirements.txt, setup.cfg, setup.py, specs, tests, 
tools, tox.ini > [fuel] openstack/fuel-stats has extra files, please remove: > .gitignore, LICENSE, MAINTAINERS, MANIFEST.in, analytics, collector, > migration, requirements.txt, setup.py, test-requirements.txt, tools, > tox.ini > [fuel] openstack/python-fuelclient has extra files, please remove: > .gitignore, .testr.conf, MAINTAINERS, MANIFEST.in, fuelclient, > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > tools, tox.ini > > [Infrastructure] opendev/puppet-releasestatus has extra files, please > remove: .gitignore > > [ironic] openstack/python-dracclient has extra files, please remove: > .gitignore, CONTRIBUTING.rst, HACKING.rst, LICENSE, doc, dracclient, > requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini > > [neutron] openstack/networking-calico has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .testr.conf, .zuul.yaml, > CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, RELEASING.md, > babel.cfg, debian, devstack, doc, networking_calico, playbooks, > requirements.txt, rpm, setup.cfg, setup.py, test-requirements.txt, > tox.ini > [neutron] openstack/networking-l2gw has extra files, please remove: > .coveragerc, .gitignore, .testr.conf, .zuul.yaml, CONTRIBUTING.rst, > HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, bindep.txt, contrib, > debian, devstack, doc, etc, lower-constraints.txt, networking_l2gw, > openstack-common.conf, requirements.txt, setup.cfg, setup.py, specs, > test-requirements.txt, tools, tox.ini > [neutron] openstack/networking-l2gw-tempest-plugin has extra files, > please remove: .gitignore, .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, > LICENSE, babel.cfg, contrib, doc, networking_l2gw_tempest_plugin, > requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini > [neutron] openstack/networking-onos has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .pylintrc, .testr.conf, > CONTRIBUTING.rst, HACKING.rst, LICENSE, PKG-INFO, TESTING.rst, > babel.cfg, devstack, doc, etc, lower-constraints.txt, networking_onos, > package, rally-jobs, releasenotes, requirements.txt, setup.cfg, > setup.py, test-requirements.txt, tools, tox.ini > [neutron] openstack/neutron-vpnaas has extra files, please remove: > .coveragerc, .gitignore, .mailmap, .pylintrc, .stestr.conf, > .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, LICENSE, TESTING.rst, > babel.cfg, devstack, doc, etc, lower-constraints.txt, neutron_vpnaas, > playbooks, rally-jobs, releasenotes, requirements.txt, setup.cfg, > setup.py, test-requirements.txt, tools, tox.ini > At least for networking-l2gw* and neutron-vpnaas, I suppose this was caused by:https://opendev.org/openstack/governance/commit/20f95dd947d2f87519b4bb50fb188e6f71deae7c > What it meant is that they are not anymore under neutron governance, but they were not retired (at least as far as I know).There were still some recent commits even if minimal activity, and discussion on team status for neutron-vpnaas. 
> Not sure about networking-calico status though > [OpenStack Charms] openstack/charm-ceph has extra files, please > remove: .gitignore > > [OpenStackAnsible] openstack/openstack-ansible-os_monasca has extra > files, please remove: tests, tox.ini > > [solum] openstack/solum-infra-guestagent has extra files, please > remove: .coveragerc, .gitignore, .mailmap, .testr.conf, > CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, > config-generator, doc, etc, requirements.txt, setup.cfg, setup.py, > solum_guestagent, test-requirements.txt, tox.ini > > I'd like to kindly ask the affected teams to help out with this, or > any member of our community is more than welcome to push a change to > those repos and work with the appropriate teams to help land it. > > Mohammed > > -- > Mohammed Naser — vexxhost > ----------------------------------------------------- > D. 514-316-8872 > D. 800-910-1726 ext. 200 > E. mnaser at vexxhost.com > W. http://vexxhost.com > > > > -- > Bernard Cafarelli > From Tushar.Patil at nttdata.com Tue Jul 9 01:18:03 2019 From: Tushar.Patil at nttdata.com (Patil, Tushar) Date: Tue, 9 Jul 2019 01:18:03 +0000 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: References: <09a3849b-786e-49ed-a197-5e13af0428bf@email.android.com> <144114a2e83d8e8e30579ddb0ae39e59@incloudus.com>, Message-ID: Hi Vu and Gaetan, Gaetan, thank you for helping out Vu in setting up masakari-monitors service. As a masakari team ,we have noticed there is a need to add proper documentation to help the community run Masakari services in their environment. We are working on adding proper documentation in this 'Train' cycle. Will send an email on this mailing list once the patches are uploaded on the gerrit so that you can give your feedback on the same. If you have any trouble in setting up Masakari, please let us know on this mailing list or join the bi-weekly IRC Masakari meeting on the #openstack-meeting IRC channel. The next meeting will be held on 16th July 2019 @0400 UTC. Regards, Tushar Patil ________________________________________ From: Vu Tan Sent: Monday, July 8, 2019 11:21:16 PM To: Gaëtan Trellu Cc: openstack-discuss at lists.openstack.org Subject: Re: [masakari] how to install masakari on centos 7 Hi Gaetan, Thanks for pinpoint this out, silly me that did not notice the simple "error InterpreterNotFound: python3". Thanks a lot, I appreciate it On Mon, Jul 8, 2019 at 9:15 PM > wrote: Vu Tan, About "auth_token" error, you need "os_privileged_user_*" options into your masakari.conf for the API. As mentioned previously please have a look here to have an example of configuration working (for me at least): - masakari.conf: https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari.conf.j2 - masakari-monitor.conf: https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari-monitors.conf.j2 About your tox issue make sure you have Python3 installed. Gaëtan On 2019-07-08 06:08, Vu Tan wrote: > Hi Gaetan, > I try to generate config file by using this command tox -egenconfig on > top level of masakari but the output is error, is this masakari still > in beta version ? 
> [root at compute1 masakari-monitors]# tox -egenconfig > genconfig create: /root/masakari-monitors/.tox/genconfig > ERROR: InterpreterNotFound: python3 > _____________________________________________________________ summary > ______________________________________________________________ > ERROR: genconfig: InterpreterNotFound: python3 > > On Mon, Jul 8, 2019 at 3:24 PM Vu Tan > wrote: > Hi, > Thanks a lot for your reply, I install pacemaker/corosync, > masakari-api, maskari-engine on controller node, and I run masakari-api > with this command: masakari-api, but I dont know whether the process is > running like that or is it just hang there, here is what it shows when > I run the command, I leave it there for a while but it does not change > anything : > [root at controller masakari]# masakari-api > 2019-07-08 15:21:09.946 30250 INFO masakari.api.openstack [-] Loaded > extensions: ['extensions', 'notifications', 'os-hosts', 'segments', > 'versions'] > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > [-] The option "__file__" in conf is not known to auth_token > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > [-] The option "here" in conf is not known to auth_token > 2019-07-08 15:21:09.960 30250 WARNING keystonemiddleware.auth_token [-] > AuthToken middleware is set with > keystone_authtoken.service_token_roles_required set to False. This is > backwards compatible but deprecated behaviour. Please set this to True. > 2019-07-08 15:21:09.974 30250 INFO masakari.wsgi [-] masakari_api > listening on 127.0.0.1:15868 > 2019-07-08 15:21:09.975 30250 INFO oslo_service.service [-] Starting 4 > workers > 2019-07-08 15:21:09.984 30274 INFO masakari.masakari_api.wsgi.server > [-] (30274) wsgi starting up on http://127.0.0.1:15868 > 2019-07-08 15:21:09.985 30275 INFO masakari.masakari_api.wsgi.server > [-] (30275) wsgi starting up on http://127.0.0.1:15868 > 2019-07-08 15:21:09.992 30277 INFO masakari.masakari_api.wsgi.server > [-] (30277) wsgi starting up on http://127.0.0.1:15868 > 2019-07-08 15:21:09.994 30276 INFO masakari.masakari_api.wsgi.server > [-] (30276) wsgi starting up on http://127.0.0.1:15868 > > On Sun, Jul 7, 2019 at 7:37 PM Gaëtan Trellu > > wrote: > > Hi Vu Tan, > > Masakari documentation doesn't really exist... I had to figured some > stuff by myself to make it works into Kolla project. > > On controller nodes you need: > > - pacemaker > - corosync > - masakari-api (openstack/masakari repository) > - masakari- engine (openstack/masakari repository) > > On compute nodes you need: > > - pacemaker-remote (integrated to pacemaker cluster as a resource) > - masakari- hostmonitor (openstack/masakari-monitor repository) > - masakari-instancemonitor (openstack/masakari-monitor repository) > - masakari-processmonitor (openstack/masakari-monitor repository) > > For masakari-hostmonitor, the service needs to have access to systemctl > command (make sure you are not using sysvinit). > > For masakari-monitor, the masakari-monitor.conf is a bit different, you > will have to configure the [api] section properly. > > RabbitMQ needs to be configured (as transport_url) on masakari-api and > masakari-engine too. > > Please check this review[1], you will have masakari.conf and > masakari-monitor.conf configuration examples. 
> > [1] https://review.opendev.org/#/c/615715 > > Gaëtan > > On Jul 7, 2019 12:08 AM, Vu Tan > wrote: > > VU TAN > > > 10:30 AM (35 minutes ago) > > to openstack-discuss > > Sorry, I resend this email because I realized that I lacked of prefix > on this email's subject > > Hi, > > I would like to use Masakari and I'm having trouble finding a step by > step or other documentation to get started with. Which part should be > installed on controller, which is should be on compute, and what is the > prerequisite to install masakari, I have installed corosync and > pacemaker on compute and controller nodes, , what else do I need to do > ? step I have done so far: > - installed corosync/pacemaker > - install masakari on compute node on this github repo: > https://github.com/openstack/masakari > - add masakari in to mariadb > here is my configuration file of masakari.conf, do you mind to take a > look at it, if I have misconfigured anything? > > [DEFAULT] > enabled_apis = masakari_api > > # Enable to specify listening IP other than default > masakari_api_listen = controller > # Enable to specify port other than default > masakari_api_listen_port = 15868 > debug = False > auth_strategy=keystone > > [wsgi] > # The paste configuration file path > api_paste_config = /etc/masakari/api-paste.ini > > [keystone_authtoken] > www_authenticate_uri = http://controller:5000 > auth_url = http://controller:5000 > auth_type = password > project_domain_id = default > user_domain_id = default > project_name = service > username = masakari > password = P at ssword > > [database] > connection = mysql+pymysql://masakari:P at ssword@controller/masakari Disclaimer: This email and any attachments are sent in strictest confidence for the sole use of the addressee and may contain legally privileged, confidential, and proprietary data. If you are not the intended recipient, please advise the sender by replying promptly to this email and then delete and destroy this email and any attachments without any further use, copying or forwarding. From smooney at redhat.com Tue Jul 9 01:43:35 2019 From: smooney at redhat.com (Sean Mooney) Date: Tue, 09 Jul 2019 02:43:35 +0100 Subject: [Neutron] NUMA aware VxLAN In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C3A2651@EX10MBOX03.pnnl.gov> References: <2EE296D083DF2940BF4EBB91D39BB89F40CC0832@SHSMSX104.ccr.corp.intel.com> <1A3C52DFCD06494D8528644858247BF01C3A2651@EX10MBOX03.pnnl.gov> Message-ID: <3e50593dbc47186d482429affeec497c89fdfc0f.camel@redhat.com> On Tue, 2019-07-09 at 00:17 +0000, Fox, Kevin M wrote: > I'm curious. A lot of network cards support offloaded vxlan traffic these days so the processor isn't doing much work. > Is this issue really a problem? the issue is not really with the cpu overhead of tunnel decapsulation it is with the cross numa latency incurred when crossing the qpi bus so even with hardware accleration on the nic there is a perfomace degradation if you go across a numa nodes. the optimal solution is to have multiple nics, one per numa node and bond them then affinities the vms mac to the numa local bond peer such that no traffic has to travers the numa node but that is non trivaial to do and is not supported by openstack natively. you would have to have an agent of or cron job that actuly a did the tuning after a vm is spawned but it could be an interesting experiment if someone wanted to code it up. 
> > Thanks, > Kevin > ________________________________ > From: Guo, Ruijing [ruijing.guo at intel.com] > Sent: Sunday, July 07, 2019 5:47 PM > To: openstack-dev at lists.openstack.org; openstack at lists.openstack.org > Subject: [Neutron] NUMA aware VxLAN > > Hi, > > Existing neutron ML2 support one VxLAN for tenant network. In NUMA case, VM 0 can be created in node 0 and VM 1 can be > created in node 1 and VxLAN is in node 0. > > VM1 need to cross node, which cause some performance downgrade. Does someone have this performance issue? Does Neutron > community have plan to enhance it? nova has a spec called numa aware vswitchs https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/numa-aware-vswitches.html that allow you to declare teh numa affinity of tunnels and physnets on a host. this will not allow you to have multiple tunnel enpoint ips but it will allow you force instnace with a numa toploty to be colocated on the same numa node as the network backend.
AFAIK, in case of networking-l2gw, the development is still active but it does not satisfy the criteria of the neutron stadium like integrated tests or community goal satisfaction due to the lack of development resources. This is the reason that networking-l2gw *repository* is not retired as TC expects although it is not under the TC governance. IMHO it looks better to move networking-l2-gw repository from "openstack" namespace to "x" namespace. (Of course the other option is the neutron team consider the inclusion of networking-l2gw again but it would take time even if we go to this route.) Thanks, Akihiro Motoki (irc: amotoki) On Tue, Jul 9, 2019 at 10:19 AM Ghanshyam Mann wrote: > > ---- On Mon, 08 Jul 2019 17:46:37 +0900 Lajos Katona wrote ---- > > Hi, > > For networking-l2gw (and networking-l2gw-tempest-plugin) I can say that it is maintained (with not that great activity, but anyway far from rerirement).Is there anything I have to do to make this "activeness" visible from governance perspective? > > Lajos, > networking-l2gw* had been retired around ~3 years back[1]. If networking-l2gw situation is active and fulfils the neutron stadium criteria, you can propose it back after discussing with neutron team. > > [1] https://review.opendev.org/#/c/392010/ > > -gmann > > > Lajos > > Bernard Cafarelli ezt írta (időpont: 2019. júl. 6., Szo, 22:05): > > > > > > On Sat, 6 Jul 2019 at 19:44, Mohammed Naser wrote: > > Hi everyone, > > > > One of the issue that we recently ran into was the fact that there was > > some inconsistency about merging retirement of repositories inside > > governance without the code being fully removed. > > > > In order to avoid this, I've made a change to our governance > > repository which will enforce that no code exists in those retired > > repositories, however, this has surfaced that some repositories were > > retired with some stale files, some are smaller littler files, some > > are entire projects still. 
> > > > I have compiled a list for every team, with the repos that are not > > properly retired that have extra files (using this change which should > > eventually +1 once we fix it all: https://review.opendev.org/669549) > > > > [Documentation] openstack/api-site has extra files, please remove: > > .gitignore, .zuul.yaml, LICENSE, api-quick-start, api-ref, bindep.txt, > > common, doc-tools-check-languages.conf, firstapp, > > test-requirements.txt, tools, tox.ini, www > > [Documentation] openstack/faafo has extra files, please remove: > > .gitignore, CONTRIBUTING.rst, LICENSE, Vagrantfile, bin, contrib, doc, > > etc, faafo, requirements.txt, setup.cfg, setup.py, > > test-requirements.txt, tox.ini > > > > [fuel] openstack/fuel-agent has extra files, please remove: > > .gitignore, LICENSE, MAINTAINERS, cloud-init-templates, contrib, > > debian, etc, fuel_agent, requirements.txt, run_tests.sh, setup.cfg, > > setup.py, specs, test-requirements.txt, tools, tox.ini > > [fuel] openstack/fuel-astute has extra files, please remove: > > .gitignore, .rspec, .ruby-version, Gemfile, LICENSE, MAINTAINERS, > > Rakefile, astute.gemspec, astute.service, astute.sysconfig, bin, > > bindep.txt, debian, examples, lib, mcagents, run_tests.sh, spec, > > specs, tests > > [fuel] openstack/fuel-library has extra files, please remove: > > .gitignore, CHANGELOG, Gemfile, LICENSE, MAINTAINERS, Rakefile, > > debian, deployment, files, graphs, logs, specs, tests, utils > > [fuel] openstack/fuel-main has extra files, please remove: .gitignore, > > 00-debmirror.patch, LICENSE, MAINTAINERS, Makefile, config.mk, > > fuel-release, iso, mirror, packages, prepare-build-env.sh, > > report-changelog.sh, repos.mk, requirements-fuel-rpm.txt, > > requirements-rpm.txt, rules.mk, sandbox.mk, specs > > [fuel] openstack/fuel-menu has extra files, please remove: .gitignore, > > MAINTAINERS, MANIFEST.in, fuelmenu, run_tests.sh, setup.py, specs, > > test-requirements.txt, tox.ini > > [fuel] openstack/fuel-mirror has extra files, please remove: > > .gitignore, .mailmap, MAINTAINERS, perestroika, tox.ini > > [fuel] openstack/fuel-nailgun-agent has extra files, please remove: > > .gitignore, Gemfile, LICENSE, MAINTAINERS, Rakefile, agent, debian, > > nailgun-agent.cron, nailgun-agent.gemspec, run_tests.sh, specs > > [fuel] openstack/fuel-ostf has extra files, please remove: .gitignore, > > LICENSE, MAINTAINERS, MANIFEST.in, etc, fuel_health, fuel_plugin, > > ostf.service, pylintrc, requirements.txt, run_tests.sh, setup.cfg, > > setup.py, specs, test-requirements.txt, tools, tox.ini > > [fuel] openstack/fuel-qa has extra files, please remove: .coveragerc, > > .gitignore, .pylintrc, .pylintrc_gerrit, MAINTAINERS, core, doc, > > fuel_tests, fuelweb_test, gates_tests, packages_tests, pytest.ini, > > run_system_test.py, run_tests.sh, system_test, tox.ini, utils > > [fuel] openstack/fuel-ui has extra files, please remove: > > .eslintignore, .eslintrc.yaml, .gitignore, LICENSE, MAINTAINERS, > > fixtures, gulp, gulpfile.js, karma.config.js, npm-shrinkwrap.json, > > package.json, run_real_plugin_tests.sh, > > run_real_plugin_tests_on_real_nailgun.sh, run_ui_func_tests.sh, specs, > > static, webpack.config.js > > [fuel] openstack/fuel-virtualbox has extra files, please remove: > > .gitignore, MAINTAINERS, actions, clean.sh, config.sh, contrib, > > drivers, dumpkeys.cache, functions, iso, launch.sh, launch_16GB.sh, > > launch_8GB.sh > > [fuel] openstack/fuel-web has extra files, please remove: .gitignore, > > LICENSE, MAINTAINERS, bin, build_docs.sh, 
debian, docs, nailgun, > > run_tests.sh, specs, systemd, tools, tox.ini > > [fuel] openstack/shotgun has extra files, please remove: .coveragerc, > > .gitignore, .testr.conf, CONTRIBUTING.rst, HACKING.rst, LICENSE, > > MAINTAINERS, MANIFEST.in, bin, etc, requirements.txt, setup.cfg, > > setup.py, shotgun, specs, test-requirements.txt, tox.ini > > [fuel] openstack/fuel-dev-tools has extra files, please remove: > > .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, > > HACKING.rst, LICENSE, MAINTAINERS, babel.cfg, contrib, doc, > > fuel_dev_tools, openstack-common.conf, requirements.txt, setup.cfg, > > setup.py, test-requirements.txt, tox.ini, vagrant > > [fuel] openstack/fuel-devops has extra files, please remove: > > .coveragerc, .gitignore, .pylintrc, .pylintrc_gerrit, LICENSE, > > MAINTAINERS, bin, devops, doc, run_tests.sh, samples, setup.cfg, > > setup.py, test-requirements.txt, tox.ini > > [fuel] openstack/fuel-docs has extra files, please remove: .gitignore, > > Makefile, _images, _templates, common_conf.py, conf.py, devdocs, > > examples, glossary, index.rst, make.bat, plugindocs, requirements.txt, > > setup.cfg, setup.py, tox.ini, userdocs > > [fuel] openstack/fuel-nailgun-extension-cluster-upgrade has extra > > files, please remove: .coveragerc, .gitignore, AUTHORS, LICENSE, > > MANIFEST.in, bindep.txt, cluster_upgrade, conftest.py, > > nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, > > specs, test-requirements.txt, tools, tox.ini > > [fuel] openstack/fuel-nailgun-extension-iac has extra files, please > > remove: .gitignore, LICENSE, MANIFEST.in, doc, fuel_external_git, > > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > > tools, tox.ini > > [fuel] openstack/fuel-nailgun-extension-converted-serializers has > > extra files, please remove: .coveragerc, .gitignore, LICENSE, > > MANIFEST.in, bindep.txt, conftest.py, converted_serializers, > > nailgun-test-settings.yaml, requirements.txt, setup.cfg, setup.py, > > specs, test-requirements.txt, tools, tox.ini > > [fuel] openstack/fuel-octane has extra files, please remove: > > .coveragerc, .gitignore, .mailmap, Gemfile, Gemfile.lock, HACKING.rst, > > LICENSE, MAINTAINERS, MANIFEST.in, Rakefile, bindep.txt, deploy, > > deployment, docs, misc, octane, requirements.txt, setup.cfg, setup.py, > > specs, test-requirements.txt, tox.ini > > [fuel] openstack/fuel-upgrade has extra files, please remove: .gitignore > > [fuel] openstack/tuning-box has extra files, please remove: > > .coveragerc, .gitignore, .mailmap, .testr.conf, CONTRIBUTING.rst, > > HACKING.rst, LICENSE, MAINTAINERS, MANIFEST.in, TODO, alembic.ini, > > babel.cfg, bindep.txt, doc, examples, openstack-common.conf, > > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > > tools, tox.ini, tuning_box > > [fuel] openstack/fuel-plugins has extra files, please remove: > > .gitignore, CHANGELOG.md, CONTRIBUTING.rst, HACKING.rst, LICENSE, > > MAINTAINERS, examples, fuel_plugin_builder, requirements.txt, > > run_tests.sh, setup.cfg, setup.py, test-requirements.txt, tox.ini > > [fuel] openstack/fuel-plugin-murano has extra files, please remove: > > .gitignore, LICENSE, components.yaml, deployment_scripts, > > deployment_tasks.yaml, docs, environment_config.yaml, functions.sh, > > metadata.yaml, node_roles.yaml, pre_build_hook, releasenotes, > > repositories, test-requirements.txt, tox.ini, volumes.yaml > > [fuel] openstack/fuel-plugin-murano-tests has extra files, please > > remove: .gitignore, murano_plugin_tests, 
openrc.default, > > requirements.txt, tox.ini, utils > > [fuel] openstack/fuel-specs has extra files, please remove: > > .gitignore, .testr.conf, LICENSE, doc, images, policy, > > requirements.txt, setup.cfg, setup.py, specs, tests, tools, tox.ini > > [fuel] openstack/fuel-stats has extra files, please remove: > > .gitignore, LICENSE, MAINTAINERS, MANIFEST.in, analytics, collector, > > migration, requirements.txt, setup.py, test-requirements.txt, tools, > > tox.ini > > [fuel] openstack/python-fuelclient has extra files, please remove: > > .gitignore, .testr.conf, MAINTAINERS, MANIFEST.in, fuelclient, > > requirements.txt, setup.cfg, setup.py, specs, test-requirements.txt, > > tools, tox.ini > > > > [Infrastructure] opendev/puppet-releasestatus has extra files, please > > remove: .gitignore > > > > [ironic] openstack/python-dracclient has extra files, please remove: > > .gitignore, CONTRIBUTING.rst, HACKING.rst, LICENSE, doc, dracclient, > > requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini > > > > [neutron] openstack/networking-calico has extra files, please remove: > > .coveragerc, .gitignore, .mailmap, .testr.conf, .zuul.yaml, > > CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, RELEASING.md, > > babel.cfg, debian, devstack, doc, networking_calico, playbooks, > > requirements.txt, rpm, setup.cfg, setup.py, test-requirements.txt, > > tox.ini > > [neutron] openstack/networking-l2gw has extra files, please remove: > > .coveragerc, .gitignore, .testr.conf, .zuul.yaml, CONTRIBUTING.rst, > > HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, bindep.txt, contrib, > > debian, devstack, doc, etc, lower-constraints.txt, networking_l2gw, > > openstack-common.conf, requirements.txt, setup.cfg, setup.py, specs, > > test-requirements.txt, tools, tox.ini > > [neutron] openstack/networking-l2gw-tempest-plugin has extra files, > > please remove: .gitignore, .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, > > LICENSE, babel.cfg, contrib, doc, networking_l2gw_tempest_plugin, > > requirements.txt, setup.cfg, setup.py, test-requirements.txt, tox.ini > > [neutron] openstack/networking-onos has extra files, please remove: > > .coveragerc, .gitignore, .mailmap, .pylintrc, .testr.conf, > > CONTRIBUTING.rst, HACKING.rst, LICENSE, PKG-INFO, TESTING.rst, > > babel.cfg, devstack, doc, etc, lower-constraints.txt, networking_onos, > > package, rally-jobs, releasenotes, requirements.txt, setup.cfg, > > setup.py, test-requirements.txt, tools, tox.ini > > [neutron] openstack/neutron-vpnaas has extra files, please remove: > > .coveragerc, .gitignore, .mailmap, .pylintrc, .stestr.conf, > > .zuul.yaml, CONTRIBUTING.rst, HACKING.rst, LICENSE, TESTING.rst, > > babel.cfg, devstack, doc, etc, lower-constraints.txt, neutron_vpnaas, > > playbooks, rally-jobs, releasenotes, requirements.txt, setup.cfg, > > setup.py, test-requirements.txt, tools, tox.ini > > At least for networking-l2gw* and neutron-vpnaas, I suppose this was caused by:https://opendev.org/openstack/governance/commit/20f95dd947d2f87519b4bb50fb188e6f71deae7c > > What it meant is that they are not anymore under neutron governance, but they were not retired (at least as far as I know).There were still some recent commits even if minimal activity, and discussion on team status for neutron-vpnaas. 
> > Not sure about networking-calico status though > > [OpenStack Charms] openstack/charm-ceph has extra files, please > > remove: .gitignore > > > > [OpenStackAnsible] openstack/openstack-ansible-os_monasca has extra > > files, please remove: tests, tox.ini > > > > [solum] openstack/solum-infra-guestagent has extra files, please > > remove: .coveragerc, .gitignore, .mailmap, .testr.conf, > > CONTRIBUTING.rst, HACKING.rst, LICENSE, MANIFEST.in, babel.cfg, > > config-generator, doc, etc, requirements.txt, setup.cfg, setup.py, > > solum_guestagent, test-requirements.txt, tox.ini > > > > I'd like to kindly ask the affected teams to help out with this, or > > any member of our community is more than welcome to push a change to > > those repos and work with the appropriate teams to help land it. > > > > Mohammed > > > > -- > > Mohammed Naser — vexxhost > > ----------------------------------------------------- > > D. 514-316-8872 > > D. 800-910-1726 ext. 200 > > E. mnaser at vexxhost.com > > W. http://vexxhost.com > > > > > > > > -- > > Bernard Cafarelli > > > > From dharmendra.kushwaha at india.nec.com Tue Jul 9 06:34:26 2019 From: dharmendra.kushwaha at india.nec.com (Dharmendra Kushwaha) Date: Tue, 9 Jul 2019 06:34:26 +0000 Subject: [Tacker] Proposing changes in Tacker core team Message-ID: Hello Team, I am proposing below changes in Team: I would like to propose Hiroyuki Jo to join Tacker core team. Hiroyuki Jo have lead multiple valuable features level activities like affinity policy, VDU-healing, and VNF reservation [1] in Rocky & Stein cycle, and made it sure to be completed timely. And currently working on VNF packages [2] and ETSI NFV-SOL specification support [3]. Hiroyuki has a good understanding of NFV and Tacker project, and helping team by providing sensible reviews. I believe it is a good addition in Tacker core team, and Tacker project will benefit from this nomination. On the other hand, I wanted to thank to Bharath Thiruveedula for his great & valuable contribution in the project. He helped a lot to make Tacker better in early days. But now he doesn't seem to be active in project and he decided to step-down from core team. Whenever you will decide to come back to the project, I will be happy to add you in core-team. Core-Team, Please respond with your +1/-1. If no objection, I will do these changes in next week. [1] https://review.opendev.org/#/q/project:openstack/tacker-specs+owner:%22Hiroyuki+Jo+%253Chiroyuki.jo.mt%2540hco.ntt.co.jp%253E%22 [2] https://blueprints.launchpad.net/tacker/+spec/tosca-csar-mgmt-driver [3] https://blueprints.launchpad.net/tacker/+spec/support-etsi-nfv-specs Thanks & Regards Dharmendra Kushwaha From ekuvaja at redhat.com Tue Jul 9 11:41:10 2019 From: ekuvaja at redhat.com (Erno Kuvaja) Date: Tue, 9 Jul 2019 12:41:10 +0100 Subject: [Glance][PTL] Vacation time Message-ID: Hi all, I'll be away for couple of weeks. I'll be back Wed 24th of July. - Erno "jokke" Kuvaja -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylightcoder at gmail.com Tue Jul 9 12:46:02 2019 From: skylightcoder at gmail.com (=?UTF-8?B?R8O2a2hhbiBJxZ5JSw==?=) Date: Tue, 9 Jul 2019 15:46:02 +0300 Subject: [Nova] Instances can't be started after compute nodes unexpectedly shut down because of power outage Message-ID: Hi folks, Because of power outage, Most of our compute nodes unexpectedly shut down and now I can not start our instances. Error message is "Failed to get "write" lock another process using the image?". 
Instances Power status is No State. Full error log is http://paste.openstack.org/show/754107/. My environment is OpenStack Pike on Ubuntu 16.04 LTS servers and Instances are on a nfs shared storage. Nova version is 16.1.6.dev2. qemu version is 2.10.1. libvirt version is 3.6.0. I saw a commit [1], but it doesn't solve this problem. There are important instances on my environment. How can I rescue my instances? What would you suggest ? Thanks, Gökhan [1] https://review.opendev.org/#/c/509774/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbedyk at suse.de Tue Jul 9 11:45:14 2019 From: wbedyk at suse.de (Witek Bedyk) Date: Tue, 9 Jul 2019 13:45:14 +0200 Subject: [monasca] Virtual Midcycle Meeting scheduling Message-ID: <40121ffb-3306-ef6b-b2ba-9d75e2a1e130@suse.de> Hello, as discussed in the last team meeting, we will hold a virtual Midcycle Meeting. The goal is to sync on the progress of the development and update the stories if needed. I plan with 2 hours meeting. Please select the times which work best for you: https://doodle.com/poll/zszfxakcbfm6sdha Please fill in the topics you would like to discuss or update on in the etherpad: https://etherpad.openstack.org/p/monasca-train-midcycle Thanks Witek From jungleboyj at gmail.com Tue Jul 9 14:31:45 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Tue, 9 Jul 2019 09:31:45 -0500 Subject: [cinder][stable-maint] July releases from the stable branches In-Reply-To: References: Message-ID: On 7/8/2019 5:44 PM, Brian Rosmaita wrote: > I've posted release patches for the stable branches: > https://review.openstack.org/#/q/topic:cinderproject-july-2019 > > Notes on what is/isn't being released are on the etherpad: > https://etherpad.openstack.org/p/cinder-releases-tracking > > I have a question about the semver for the cinder stable/stein release; > it's noted on the patch: https://review.opendev.org/669771 > > cheers, > brian Brian, Thanks for putting this together.  Changes look good to me! Jay From jim at jimrollenhagen.com Tue Jul 9 14:36:15 2019 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Tue, 9 Jul 2019 10:36:15 -0400 Subject: [tc] Assuming control of GitHub organizations In-Reply-To: <9a4fdd56-7af8-c432-6ade-d52c163e8b8c@openstack.org> References: <10270aaf-9f4e-80b0-8e40-760d7c52dc0d@ham.ie> <9a4fdd56-7af8-c432-6ade-d52c163e8b8c@openstack.org> Message-ID: On Thu, Jun 27, 2019 at 8:55 AM Thierry Carrez wrote: > Graham Hayes wrote: > > On 27/06/2019 09:55, Thierry Carrez wrote: > >> I have been considering our GitHub presence as a downstream "code > >> marketing" property, a sort of front-end or entry point into the > >> OpenStack universe for outsiders. As such, I'd consider it much closer > >> to openstack.org/software than to opendev.org/openstack. > >> > >> So one way to do this would be to ask Foundation staff to maintain this > >> code marketing property, taking care of aligning message with the > >> content at openstack.org/software (which is driven from the > >> osf/openstack-map repository). > >> > >> If we handle it at TC-level my fear is that we would duplicate work > >> around things like project descriptions and what is pinned, and end up > >> with slightly different messages. > > > > I am not as concerned about this, the TC should be setting out our > > viewpoint for the project, and if this is in conflict with the message > > from the foundation, we have plenty of avenues to raise it. 
> > How about the TC controls which repo is replicated where (and which ones > are pinned etc), but we import the descriptions from the openstack-map > repo? > > That would keep control on the TC side but avoid duplication of effort. > In my experience it's already difficult to get projects to update > descriptions in one place, so two... > This seems reasonable to me. > > Also, who is volunteering for setting up the replication, and then > keeping track of things as they evolve ? > I'm up for it, can we get a couple more? I'd like to get this going soon. // jim > > -- > Thierry Carrez (ttx) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Tue Jul 9 15:20:45 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Tue, 9 Jul 2019 08:20:45 -0700 Subject: [ironic][edge] L3/DHCP-Less deployments Message-ID: Greetings everyone! Over the last few weeks, I've had numerous discussions with contributors who have expressed interest in the DHCP-less deployment method specification document[0]. It seems many parties are interested! I think the path forward at this time is to get everyone talking about where and how they can help push this feature forward. If those interested could propose time windows when they are available, I'll try to find a mutual time window where we can try to get everyone on IRC or on a call and try to map out the next steps together. Sound good? -Julia [0]: https://specs.openstack.org/openstack/ironic-specs/specs/not-implemented/L3-based-deployment.html From mriedemos at gmail.com Tue Jul 9 18:13:14 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 9 Jul 2019 13:13:14 -0500 Subject: [watcher] nova cdm builder performance optimizations - summary Message-ID: <2d3cea37-d4ba-d435-9b46-ce711fac55cc@gmail.com> I wanted to summarize a series of changes which have improved the performance of the NovaClusterDataModel builder for audits across single and multiple cells (in the CERN case) by a factor of 20-30%. There were initially three changes involved (in order): 1. https://review.opendev.org/#/c/659688/ - Optimize NovaClusterDataModelCollector.add_instance_node Reports on that patch alone said it fixed a regression introduced in Stein with scoped audits: "I checked this patch on the my test environment on the stable/stein branch. I have more than 1000 virtual servers (some real, some dummy). Previously, in the stable/rocky branch, the time to build a cluster was about 15-20 minutes, in the Stein branch there was a regression and the time increased to 90 minutes. After this patch, the build time is only 2 minutes." That change was backported to stable/stein. 2. - https://review.opendev.org/#/c/661121/ - Optimize hypervisor API calls (which requires https://review.opendev.org/#/c/659886/) As noted that change requires a patch to python-novaclient if you are looking to backport the change. We can't backport that upstream because of the python-novaclient dependency since it would require bumping the minimum required version of the library on a stable branch which is against stable branch policy (minimum version of library dependencies are more or less frozen on stable branches). That change also requires configuring watcher with: [nova_client] api_version = 2.53 # or greater; train now requires at least 2.56 3. 
- https://review.opendev.org/#/c/662089/ - Optimize NovaHelper.get_compute_node_by_hostname This optimizes code used to build/update the nova CDM during notification processing and also fixes a bug about looking up the compute service properly. After those three changes were merged, Corne Lukken (Dantali0n) started doing scale and performance testing with and without the changes in a CERN 5-cell test cluster. Corne identified a regression for which Canwei Li determined the root cause and chenker fixed: 4. https://review.opendev.org/#/c/668100/ - Reduce the query time of the instances when call get_instance_list() With that fix applied Corne reported the overall improvement of 20-30% when building the nova CDM during an audit in various scenarios. The actual performance numbers will be sent later as part of a thesis Corne is working on. I want to thank Dantali0n, licanwei and chenker for all of their help with this series of improvements. -- Thanks, Matt From emilien at redhat.com Tue Jul 9 19:37:41 2019 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 9 Jul 2019 15:37:41 -0400 Subject: [tripleo] [cisco] [bigswitch] [midonet] Composable Services for Neutron plugins Message-ID: With that thread I'm hoping to reach out some of the folks who were involved in TripleO services for some of the Neutron plugins like Cisco, Bigswitch and Midonet. Some of them still use ExtraConfig interface which isn't super ideal. I'm willing to help convert them into composable services but I'll need help on the reviews and more importantly the actual testing. If you're in CC and can help, good please let me know. If you're not involved anymore, please give me a name if possible. If you were involved and know the plugin doesn't need maintenance, let me know as well, it'll save me time. I started with the Cisco UCSM: https://review.opendev.org/669931 -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From stig.openstack at telfer.org Tue Jul 9 20:45:56 2019 From: stig.openstack at telfer.org (Stig Telfer) Date: Tue, 9 Jul 2019 21:45:56 +0100 Subject: [scientific-sig] IRC today Message-ID: <552CEE47-1355-4E01-8A5D-50BB6D133A6C@telfer.org> Hi All - We have a Scientific SIG IRC meeting coming up at 2100 UTC (about 15 minutes time) in channel #openstack-meeting. Everyone is welcome. Today’s agenda is here: https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_July_9th_2019 If you’d like to add anything to the agenda, come along to the meeting and bring it up. Cheers, Stig From openstack at fried.cc Tue Jul 9 21:31:12 2019 From: openstack at fried.cc (Eric Fried) Date: Tue, 9 Jul 2019 16:31:12 -0500 Subject: [nova][ironic] Lock-related performance issue with update_resources periodic job In-Reply-To: <07C5F51A-DCAB-4432-B556-49E1E15801AC@fried.cc> References: <37D021C4-ED1B-4942-9C90-0A26FDE3DD76@fried.cc> <07C5F51A-DCAB-4432-B556-49E1E15801AC@fried.cc> Message-ID: <7883d905-69df-117c-c78d-8a5667e6c941@fried.cc> >>>> https://review.opendev.org/#/c/637225/ >>> Ah heck, I had totally forgotten about that patch. If it's working for you, let me get it polished up and merged. This is polished and ready for review, merge, backport. efried . 
From zhengzhenyulixi at gmail.com Wed Jul 10 01:49:04 2019 From: zhengzhenyulixi at gmail.com (Zhenyu Zheng) Date: Wed, 10 Jul 2019 09:49:04 +0800 Subject: [watcher] nova cdm builder performance optimizations - summary In-Reply-To: <2d3cea37-d4ba-d435-9b46-ce711fac55cc@gmail.com> References: <2d3cea37-d4ba-d435-9b46-ce711fac55cc@gmail.com> Message-ID: Thanks alot for the summary, this could be very helpful, we will have a test on these :) On Wed, Jul 10, 2019 at 2:29 AM Matt Riedemann wrote: > I wanted to summarize a series of changes which have improved the > performance of the NovaClusterDataModel builder for audits across single > and multiple cells (in the CERN case) by a factor of 20-30%. > > There were initially three changes involved (in order): > > 1. https://review.opendev.org/#/c/659688/ - Optimize > NovaClusterDataModelCollector.add_instance_node > > Reports on that patch alone said it fixed a regression introduced in > Stein with scoped audits: > > "I checked this patch on the my test environment on the stable/stein > branch. I have more than 1000 virtual servers (some real, some dummy). > Previously, in the stable/rocky branch, the time to build a cluster was > about 15-20 minutes, in the Stein branch there was a regression and the > time increased to 90 minutes. After this patch, the build time is only 2 > minutes." > > That change was backported to stable/stein. > > 2. - https://review.opendev.org/#/c/661121/ - Optimize hypervisor API > calls (which requires https://review.opendev.org/#/c/659886/) > > As noted that change requires a patch to python-novaclient if you are > looking to backport the change. We can't backport that upstream because > of the python-novaclient dependency since it would require bumping the > minimum required version of the library on a stable branch which is > against stable branch policy (minimum version of library dependencies > are more or less frozen on stable branches). > > That change also requires configuring watcher with: > > [nova_client] > api_version = 2.53 # or greater; train now requires at least 2.56 > > 3. - https://review.opendev.org/#/c/662089/ - Optimize > NovaHelper.get_compute_node_by_hostname > > This optimizes code used to build/update the nova CDM during > notification processing and also fixes a bug about looking up the > compute service properly. > > After those three changes were merged, Corne Lukken (Dantali0n) started > doing scale and performance testing with and without the changes in a > CERN 5-cell test cluster. Corne identified a regression for which Canwei > Li determined the root cause and chenker fixed: > > 4. https://review.opendev.org/#/c/668100/ - Reduce the query time of the > instances when call get_instance_list() > > With that fix applied Corne reported the overall improvement of 20-30% > when building the nova CDM during an audit in various scenarios. The > actual performance numbers will be sent later as part of a thesis Corne > is working on. > > I want to thank Dantali0n, licanwei and chenker for all of their help > with this series of improvements. > > -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sungn2 at lenovo.com Wed Jul 10 02:20:22 2019 From: sungn2 at lenovo.com (Guannan GN2 Sun) Date: Wed, 10 Jul 2019 02:20:22 +0000 Subject: [devstack] Deploy issue on Ubuntu 16.04. Message-ID: Hi team, I meet some problem when deploying devstack(from https://github.com/openstack/devstack master branch) on Ubuntu 16.04. 
It seems something is wrong with placement-api error message as following: curl -g -k --noproxy '*' -s -o /dev/null -w '%{http_code}%' http://10.240.24.138/placement [[503 == 503]] [ERROR] /opt/stack/devstack/lib/placement:156 placement-api did not start However when I check its status using "systemctl status devstack at placement-api", it is active and running. I also change to "stein" branch and try to deploy again, but still meet the same problem. Does someone meet similar issue before or could someone help me to debug this issue? Below is my local.conf file. Thank you! local.conf: #################################################### [[local|localrc]] # Credentials ADMIN_PASSWORD=password DATABASE_PASSWORD=password RABBIT_PASSWORD=password SERVICE_PASSWORD=password SERVICE_TOKEN=password SWIFT_HASH=password SWIFT_TEMPURL_KEY=password GIT_BASE=${GIT_BASE:-https://git.openstack.org} # A clean install every time RECLONE=yes # Enable Ironic plugin IRONIC_USING_PLUGIN=true enable_plugin ironic git://github.com/openstack/ironic # Enable Tempest enable_service tempest # Disable nova novnc service, ironic does not support it anyway. disable_service n-novnc # Enable Swift for the direct deploy interface. enable_service s-proxy enable_service s-object enable_service s-container enable_service s-account enable_service placement-api enable_service placement-client # Disable Horizon disable_service horizon # Disable Cinder disable_service cinder c-sch c-api c-vol # Swift temp URL's are required for the direct deploy interface SWIFT_ENABLE_TEMPURLS=True # Tempest related options BUILD_TIMEOUT=3000 IRONIC_CALLBACK_TIMEOUT=3000 POWER_TIMEOUT=600 SERVICE_TIMEOUT=600 TEMPEST_PLUGINS+=' /opt/stack/ironic-tempest-plugin' # Ironic related options IRONIC_IS_HARDWARE=true VIRT_DRIVER=ironic IRONIC_HW_NODE_CPU=1 IRONIC_HW_NODE_RAM=4096 IRONIC_HW_NODE_DISK=20 IRONIC_BAREMETAL_BASIC_OPS=True DEFAULT_INSTANCE_TYPE=baremetal # Enable additional hardware types, if needed. IRONIC_ENABLED_HARDWARE_TYPES=ipmi,fake-hardware,xclarity # Don't forget that many hardware types require enabling of additional # interfaces, most often power and management: IRONIC_ENABLED_MANAGEMENT_INTERFACES=ipmitool,fake,xclarity IRONIC_ENABLED_POWER_INTERFACES=ipmitool,fake,xclarity # The 'ipmi' hardware type's default deploy interface is 'iscsi'. # This would change the default to 'direct': #IRONIC_DEFAULT_DEPLOY_INTERFACE=direct # Change this to alter the default driver for nodes created by devstack. # This driver should be in the enabled list above. IRONIC_DEPLOY_DRIVER=ipmi # The parameters below represent the minimum possible values to create # functional nodes. #IRONIC_VM_SPECS_RAM=1280 #IRONIC_VM_SPECS_DISK=10 # Size of the ephemeral partition in GB. Use 0 for no ephemeral partition. IRONIC_VM_EPHEMERAL_DISK=0 # By default, DevStack creates a 10.0.0.0/24 network for instances. # If this overlaps with the hosts network, you may adjust with the # following. 
PUBLIC_NETWORK_GATEWAY=10.240.24.1
FLOATING_RANGE=10.240.24.0/24
FIXED_RANGE=10.0.0.0/24
HOST_IP=10.240.24.138
SERVICE_HOST=$HOST_IP
FIXED_NETWORK_SIZE=256

# Neutron options
Q_USE_PROVIDERNET_FOR_PUBLIC=True
PUBLIC_INTERFACE=ens9
OVS_PHYSICAL_BRIDGE=br-ens9
PUBLIC_BRIDGE=br-ens9
OVS_BRIDGE_MAPPINGS=public:br-ens9
ALLOCATION_POOL=start=10.0.0.30,end=10.0.0.100

# Log all output to files
LOGFILE=/opt/stack/devstack.log
LOGDIR=/opt/stack/logs
IRONIC_VM_LOG_DIR=/opt/stack/ironic-bm-logs
####################################################

Best Regards,
Guannan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sekharvajjula at gmail.com  Wed Jul 10 06:38:59 2019
From: sekharvajjula at gmail.com (chandra sekhar)
Date: Wed, 10 Jul 2019 09:38:59 +0300
Subject: [ironic][edge] L3/DHCP-Less deployments
In-Reply-To:
References:
Message-ID:

Hi Julia,

I am based in Finland. Anytime from 8:00 to 19:00 (EEST) is good for me.

Regards
Chandra R.

On Tue, 9 Jul 2019, 18:33 Julia Kreger, wrote:

> Greetings everyone!
>
> Over the last few weeks, I've had numerous discussions with
> contributors who have expressed interest in the DHCP-less deployment
> method specification document[0].
>
> It seems many parties are interested! I think the path forward at this
> time is to get everyone talking about where and how they can help push
> this feature forward.
>
> If those interested could propose time windows when they are
> available, I'll try to find a mutual time window where we can try to
> get everyone on IRC or on a call and try to map out the next steps
> together.
>
> Sound good?
>
> -Julia
>
> [0]:
> https://specs.openstack.org/openstack/ironic-specs/specs/not-implemented/L3-based-deployment.html
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cdent+os at anticdent.org  Wed Jul 10 06:50:37 2019
From: cdent+os at anticdent.org (Chris Dent)
Date: Wed, 10 Jul 2019 07:50:37 +0100 (BST)
Subject: [devstack] Deploy issue on Ubuntu 16.04.
In-Reply-To:
References:
Message-ID:

On Wed, 10 Jul 2019, Guannan GN2 Sun wrote:

> I meet some problem when deploying devstack(from
> https://github.com/openstack/devstack master branch) on Ubuntu
> 16.04. It seems something is wrong with placement-api error
> message as following:
>
>
> curl -g -k --noproxy '*' -s -o /dev/null -w '%{http_code}%' http://10.240.24.138/placement
>
> [[503 == 503]]
>
> [ERROR] /opt/stack/devstack/lib/placement:156 placement-api did not start
>
>
> However when I check its status using "systemctl status
> devstack at placement-api", it is active and running. I also change
> to "stein" branch and try to deploy again, but still meet the same
> problem. Does someone meet similar issue before or could someone
> help me to debug this issue? Below is my local.conf file. Thank
> you!

Look in /etc/apache2/sites-enabled/placement-api.conf (and any other
files in the same directory) and make sure you only have one proxy
configuration for the connection between apache2 and the uwsgi
process that is running under systemd.

What could be happening is that though you have placement running,
apache is trying to talk to the wrong thing.

You can either:

* clean up the placement-api.conf file so that it only has one entry
and restart apache2

* unstack.sh, remove the files in /etc/apache2/sites-enabled and
/etc/apache2/sites-available, and rerun stack.sh

This happens when there are repeated runs of stack.sh on the same
host with insufficient cleanup between.
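For what it's worth, a quick way to make duplicated proxy stanzas
stand out is a small script along these lines. It is only a rough,
untested sketch: it assumes the devstack-generated vhost files live
under /etc/apache2/sites-enabled and that the proxying is done with
ProxyPass lines, so adjust the path and the pattern if your layout
differs.

  #!/usr/bin/env python3
  # Rough sketch: print any proxy-looking lines per vhost file so that
  # duplicate placement entries are easy to spot by eye.
  import glob

  for path in sorted(glob.glob('/etc/apache2/sites-enabled/*.conf')):
      with open(path) as handle:
          hits = [line.strip() for line in handle
                  if 'ProxyPass' in line or 'placement' in line.lower()]
      if hits:
          print(path)
          for hit in hits:
              print('    ' + hit)

If the same /placement proxy entry shows up more than once, or in more
than one file, that is the leftover to clean up.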
This probably means there's a bit of a bug in lib/placement that we could fix so that it cleans up after itself better. I hope this is helpful. If this wasn't it and you figure out what was causing it please post the solution. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From josephine.seifert at secustack.com Wed Jul 10 07:45:51 2019 From: josephine.seifert at secustack.com (Josephine Seifert) Date: Wed, 10 Jul 2019 09:45:51 +0200 Subject: [image-encryption][nova][cinder][glance][barbican]Image Encryption popup team meeting Message-ID: <9033c7e9-fc98-c6b3-b8b3-7a6d4a7df4d0@secustack.com> Hello, starting with next week, there will be a weekly meeting on *Monday, 13 UTC *[1]**for the Image Encryption popup team. Please feel free to join us :) Greetings Josephine [1] http://eavesdrop.openstack.org/#Image_Encryption_Popup-Team_Meeting -------------- next part -------------- An HTML attachment was scrubbed... URL: From jean-philippe at evrard.me Wed Jul 10 08:15:10 2019 From: jean-philippe at evrard.me (Jean-Philippe Evrard) Date: Wed, 10 Jul 2019 10:15:10 +0200 Subject: [tripleo][openstack-ansible] Integrating ansible-role-collect-logs in OSA In-Reply-To: References: Message-ID: <71d568ee47ce516a5a4cab1422290da2be1baff6.camel@evrard.me> On Fri, 2019-06-28 at 16:30 +0530, Chandan kumar wrote: > With os_tempest project, TripleO and Openstack Ansible projects > started collaborating together to reuse the tools > developed by each other to avoid duplicates and enable more > collaboration. ... And that's amazing! > In TripleO, we have ansible-role-collect-logs[1.] role for the same > and in OSA we have logs_collect.sh[2.] script for > the same. But once the logs gets collected at each other projects, It > is very hard to navigate and find out where > is the respective files. Agreed. > By Keeping the same structure at all places > It would be easier. Agreed again > So moving ahead what we are going to do: > * Refactor collect-logs role to pass defaults list of files at one > place It seems the role is doing a lot of things, but things can be conditionally triggered. Wondering if this role shouldn't be split into multiple roles... But that's a different story. > * Pass the list of different logs files based on deployment tools I think this doesn't need to be in the role. Make the role the simplest as possible, and flexible enough to get passed the list of things to log/not log, and the destination. OSA can pass a list of files it wants to gather. But isn't that what the role already does? Or did I misunderstood? > * Put system/containers related commands at one place Can we simply rely on ansible inventory, and running a play with the role (targetting all) would gather logs for all systems (aio, multinodes, containers), each system could go into their own directory (log folder would be /{{ inventory_hostname }}/...) for example: aio1/ aio1-nova/ machine2/ It simple enough. But I am happy to see a different approach. > * Replace the collect_logs.sh script with playbook in OSA and replace > it. :thumbsup: > Thanks for reading, We are looking for the comments on the above > suggestion. Thanks for tackling that up! 
I am looking forward a simple common file gathering :) If you need to do changes in the role (to implement new features), maybe it can help you if I give you a prio list :) What I am _generally_ looking for in the logs: - The ara reports - The tempest report - The /etc/openstack_deploy/ - The /var/log/ for containers/hosts What I am _sometimes_ looking for in the logs (so less important/more contextual for me): - ram/disk usage per host - NICs details - cpu features (but I am not sure we really need this anymore) - host details (generally zuul does that for me, so no need to implement something there) Regards, Jean-Philippe Evrard (evrardjp) From zigo at debian.org Wed Jul 10 09:08:01 2019 From: zigo at debian.org (Thomas Goirand) Date: Wed, 10 Jul 2019 11:08:01 +0200 Subject: [horizon] django-pyscss unmaintained? Message-ID: <97088153-2339-4719-c08b-a78f8fd8090d@debian.org> Hi, As I'm maintaining python-django-pyscss in Debian, I've noticed that there's no commit since 2015 on the Github repository, and no pypi release either. I've sent a new pull request here: https://github.com/fusionbox/django-pyscss/pull/45 so that pyscss continues to pass unit tests with Django 2.2, but I have very little hope for it to be merged, as the previous pull request for Django 1.11 was never merged. Is it time to take over the project? Can someone from the Horizon team get in touch with the previous maintainer? Cheers, Thomas Goirand (zigo) From rico.lin.guanyu at gmail.com Wed Jul 10 09:18:37 2019 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 10 Jul 2019 17:18:37 +0800 Subject: [auto-scaling] Auto-scaling SIG update Message-ID: It has been two months after Summit and PTG [3], and we now have collected some use cases [1], and general documents for auto-scaling [2] (Great thanks to Joseph Davis and Adam Spiers). *Use Cases needed* We need more use cases from all, for autoscaling on/with OpenStack. For example, autoscaling k8s on OpenStack is already a thing, so IMO we need to put some words in use cases too. *Help to improve documents* Please check our new document and help to improve it if you think of any. *Join us* We have IRC channel and meeting so please join us in irc: #openstack-auto-scaling so we can hear you and you can help us. *Add request to our StoryBoard* If you have any feature requests or WIP plan, you can use our storyboard[4] as root trace for them all like request for `Integrate Monasca and Senlin`. *Open for bug?* One question I would like to bring to the team is should we open our storyboard[4] for bug collection? We have some good hands from multiple teams, who can help with bugs. And like feature requests, a top-level story for a bug should be very helpful, but what we need from the reporter in order to get better information about that bug, and also what part should we help? IMO we can make sure bug are well documented, trace bug progress (from top-level view), and help to raise attention(in ML and in events). Would like to hear more opinions on this. *PTG* I already sign up for half-day PTG schedule(as the team still under developing, I would not expect that we need more for now), so I see you all there if you will be there too! 
[1] https://docs.openstack.org/auto-scaling-sig/latest/use-cases.html [2] https://docs.openstack.org/auto-scaling-sig/latest/theory-of-auto-scaling.html [3] https://etherpad.openstack.org/p/DEN-auto-scaling-SIG [4] https://storyboard.openstack.org/#!/project/openstack/auto-scaling-sig [5] https://storyboard.openstack.org/#!/story/2005627 -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From thierry at openstack.org Wed Jul 10 10:01:05 2019 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 10 Jul 2019 12:01:05 +0200 Subject: [tc] Assuming control of GitHub organizations In-Reply-To: References: <10270aaf-9f4e-80b0-8e40-760d7c52dc0d@ham.ie> <9a4fdd56-7af8-c432-6ade-d52c163e8b8c@openstack.org> Message-ID: Jim Rollenhagen wrote: > On Thu, Jun 27, 2019 at 8:55 AM Thierry Carrez > wrote: > >> How about the TC controls which repo is replicated where (and which ones >> are pinned etc), but we import the descriptions from the openstack-map >> repo? >> >> That would keep control on the TC side but avoid duplication of effort. >> In my experience it's already difficult to get projects to update >> descriptions in one place, so two... > > This seems reasonable to me. > > >> Also, who is volunteering for setting up the replication, and then >> keeping track of things as they evolve ? > > I'm up for it, can we get a couple more? I'd like to get this going soon. Count me in. -- Thierry Carrez (ttx) From smooney at redhat.com Wed Jul 10 11:00:46 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 10 Jul 2019 12:00:46 +0100 Subject: [devstack] Deploy issue on Ubuntu 16.04. In-Reply-To: References: Message-ID: On Wed, 2019-07-10 at 07:50 +0100, Chris Dent wrote: > On Wed, 10 Jul 2019, Guannan GN2 Sun wrote: > > > I meet some problem when deploying devstack(from > > https://github.com/openstack/devstack master branch) on Ubuntu > > 16.04. It seems something is wrong with placement-api error > > message as following: > > > > > > curl -g -k --noproxy '*' -s -o /dev/null -w '%{http_code}%' http://10.240.24.138/placement > > > > [[503 == 503]] > > > > [ERROR] /opt/stack/devstack/lib/placement:156 placement-api did not start > > > > > > However when I check its status using "systemctl status > > devstack at placement-api", it is active and running. I also change > > to "stein" branch and try to deploy again, but still meet the same > > problem. Does someone meet similar issue before or could someone > > help me to debug this issue? Below is my local.conf file. Thank > > you! > what cdent said below is also true but just wanted to highlight that master of devstack is not intendeted to be run with ubuntu 16.04 we move to 18.04 some thime ago upstream so if there have been any deviations between 16.04 and 18.04 dont expect master of devestack to support both. > Look in /etc/apache2/sites-enabled/placement-api.conf (and any other > files in the same directory) and make sure you only have one proxy > configuration for the connection between apache2 and the uwsgi > process that is running under systemd. > > What's could be happening is that though you have placement running, > apache is trying to talk to the wrong thing. 
> > You can either: > > * clean up the placement-api.conf file so that it only has one entry > and restart apache2 > > * unstack.sh, remove the files in /etc/apache2/sites-enabled and > /etc/apaceh2/sites-available, and rerun stack.sh > > This happens when there are repeated runs of stack.sh on the same > host with insufficient cleanup between. This probably means there's > a bit of a bug in lib/placement that we could fix so that it cleans > up after itself better. > > I hope this is helpful. If this wasn't it and you figure out what > was causing it please post the solution. > From cdent+os at anticdent.org Wed Jul 10 12:51:19 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Wed, 10 Jul 2019 13:51:19 +0100 (BST) Subject: [placement] mascot/logo live Message-ID: https://www.openstack.org/project-mascots has been updated to have the Placement mascot/logo. Here's a sample: https://object-storage-ca-ymq-1.vexxhost.net/swift/v1/6e4619c416ff4bd19e1c087f27a43eea/www-images-prod/project-mascots/Placement/OpenStack_Project_Placement_vertical.jpg Thanks very much to the Foundation and their designers for making this happen. In case you missed the discussion [1] about this a while back, we know it looks a bit cranky. [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-June/thread.html#7170 -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From james.page at canonical.com Wed Jul 10 13:02:30 2019 From: james.page at canonical.com (James Page) Date: Wed, 10 Jul 2019 14:02:30 +0100 Subject: [nova-lxd] retiring nova-lxd Message-ID: Hi All I’m slightly sad to announce that we’re retiring the LXD driver for Nova aka “nova-lxd”. Developing a driver for Nova for container based machines has been a fun and technically challenging ride over the last four years but we’ve never really seen any level of serious production deployment; as a result we’ve decided that it’s time to call it a day for nova-lxd. I’d like to thank all of the key contributors for their efforts over the years - specifically Chuck Short, Paul Hummer, Chris MacNaughton, Sahid Orentino and Alex Kavanaugh who have led or contributed to the development of the driver over its lifetime. I’ll be raising a review to leave a note for future followers as to the fate of nova-lxd. If anyone else would like to continue development of the driver they are more than welcome to revert my commit and become a part of the development team! We’ll continue to support our current set of stable branches for another ~12 months. Note that development of LXD and the pylxd Python module continues; its just the integration of OpenStack with LXD that we’re ceasing development of. Regards James -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaetan.trellu at incloudus.com Wed Jul 10 13:06:58 2019 From: gaetan.trellu at incloudus.com (=?ISO-8859-1?Q?Ga=EBtan_Trellu?=) Date: Wed, 10 Jul 2019 09:06:58 -0400 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: Message-ID: An HTML attachment was scrubbed... URL: From juliaashleykreger at gmail.com Wed Jul 10 14:17:39 2019 From: juliaashleykreger at gmail.com (Julia Kreger) Date: Wed, 10 Jul 2019 07:17:39 -0700 Subject: [ironic] Shanghai PTG/Forum Message-ID: Greetings fellow ironic contributors! I've created an initial planning/discussion etherpad for the upcoming Shanghai summit[0]. I need a count of contributors that intend to attend with-in the next few weeks. If you plan on or intend to attend, please indicate so on the etherpad. 
I look forward to ideas and items for discussion! -Julia [0]: https://etherpad.openstack.org/p/PVG-Ironic-Planning From johnsomor at gmail.com Wed Jul 10 15:21:48 2019 From: johnsomor at gmail.com (Michael Johnson) Date: Wed, 10 Jul 2019 08:21:48 -0700 Subject: [nova-lxd] retiring nova-lxd In-Reply-To: References: Message-ID: James, I am sorry to hear this. This last year I implemented a Proof-of-Concept using nova-lxd for the Octavia Amphora instances. There are some rough edges, but we were successful with our tempest tests passing[1][2]. LXD was the most straight forward path for the Octavia team to take for a container implementation. Thank you for the work and effort that went into nova-lxd, Michael [1] https://review.opendev.org/636066 [2] https://review.opendev.org/636069 On Wed, Jul 10, 2019 at 6:04 AM James Page wrote: > > Hi All > > > I’m slightly sad to announce that we’re retiring the LXD driver for Nova aka “nova-lxd”. > > > Developing a driver for Nova for container based machines has been a fun and technically challenging ride over the last four years but we’ve never really seen any level of serious production deployment; as a result we’ve decided that it’s time to call it a day for nova-lxd. > > > I’d like to thank all of the key contributors for their efforts over the years - specifically Chuck Short, Paul Hummer, Chris MacNaughton, Sahid Orentino and Alex Kavanaugh who have led or contributed to the development of the driver over its lifetime. > > > I’ll be raising a review to leave a note for future followers as to the fate of nova-lxd. If anyone else would like to continue development of the driver they are more than welcome to revert my commit and become a part of the development team! > > > We’ll continue to support our current set of stable branches for another ~12 months. > > > Note that development of LXD and the pylxd Python module continues; its just the integration of OpenStack with LXD that we’re ceasing development of. > > > Regards > > > James > > From fungi at yuggoth.org Wed Jul 10 15:24:40 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Wed, 10 Jul 2019 15:24:40 +0000 Subject: [horizon] django-pyscss unmaintained? In-Reply-To: <97088153-2339-4719-c08b-a78f8fd8090d@debian.org> References: <97088153-2339-4719-c08b-a78f8fd8090d@debian.org> Message-ID: <20190710152440.sxxentyyjwh6bvky@yuggoth.org> On 2019-07-10 11:08:01 +0200 (+0200), Thomas Goirand wrote: > As I'm maintaining python-django-pyscss in Debian, I've noticed that > there's no commit since 2015 on the Github repository, and no pypi > release either. [...] For that matter, the pyScss upstream maintainer seems to have not merged anything new or commented on new issues and pull requests for roughly a year at this point. It's worth considering that both might be abandoned now. How intrinsic is Sass to Horizon vs just straight CSS3? Enough that it would be less work for the Horizon team to take over the Python ecosystem tools around it than to replace the bits of Sass in use? Or are there newer tools which accomplish the same thing now and that's why the old ones are falling into disuse? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From miguel at mlavalle.com Wed Jul 10 16:02:57 2019 From: miguel at mlavalle.com (Miguel Lavalle) Date: Wed, 10 Jul 2019 11:02:57 -0500 Subject: [openstack-dev] [neutron] [ptg] Start planning for the PTG in Shanghai Message-ID: Dear Neutrinos, We need to help the Foundation to plan for the the PTG in Shanghai. To that end, I have created an etherpad to start collecting the names of the team members who plan to attend and also topics to be discussed during our meetings: https://etherpad.openstack.org/p/Shanghai-Neutron-Planning I need to respond back to the Foundation's survey by August 11th. So it will be very helpful if all of you who plan to be in Shanghai add your names and irc nicks to the "Attendees" section no later than August 4th. This deadline doesn't apply to the proposed topics section. For that, we have more time. Regards Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack at fried.cc Wed Jul 10 16:05:24 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 10 Jul 2019 11:05:24 -0500 Subject: [nova-lxd] retiring nova-lxd In-Reply-To: References: Message-ID: <88feef3d-f86e-7a0e-78e7-298c6ad6fb40@fried.cc> > I’m slightly sad to announce that we’re retiring the LXD driver for Nova > aka “nova-lxd”. How does this impact the fate of the following series? https://review.opendev.org/667975 Add support for 'initenv' elements https://review.opendev.org/667976 Add support for cloud-init on LXC instances efried . From smooney at redhat.com Wed Jul 10 16:15:15 2019 From: smooney at redhat.com (Sean Mooney) Date: Wed, 10 Jul 2019 17:15:15 +0100 Subject: [nova-lxd] retiring nova-lxd In-Reply-To: <88feef3d-f86e-7a0e-78e7-298c6ad6fb40@fried.cc> References: <88feef3d-f86e-7a0e-78e7-298c6ad6fb40@fried.cc> Message-ID: On Wed, 2019-07-10 at 11:05 -0500, Eric Fried wrote: > > I’m slightly sad to announce that we’re retiring the LXD driver for Nova > > aka “nova-lxd”. > > How does this impact the fate of the following series? > > https://review.opendev.org/667975 Add support for 'initenv' elements > https://review.opendev.org/667976 Add support for cloud-init on LXC > instances it should have no impact that is for lxc container supprot via libvirt nova-lxd is an out of tree driver managing lxc container via lxd. so those patchse should be able to proceed unaffected by nova-lxd's retirement > > efried > . > > From thierry at openstack.org Wed Jul 10 16:37:10 2019 From: thierry at openstack.org (Thierry Carrez) Date: Wed, 10 Jul 2019 18:37:10 +0200 Subject: [tc] agenda for Technical Committee Meeting 11 July 2019 @ 1400 UTC Message-ID: TC Members, Our next meeting will be this Thursday, 11 July at 1400 UTC in #openstack-tc. I'll be your chair and we'll rely on mugsie to keep the GIF game level high. This email contains the agenda for the meeting, based on the content of the wiki [0]. If you will not be able to attend, please include your name in the "Apologies for Absence" section of the wiki page [0]. 
[0] https://wiki.openstack.org/wiki/Meetings/TechnicalCommittee * Follow up on past action items ** Health check changes: asettle to update community (done), fungi to update wiki (done), mugise to update yaml file with liasons and mnaser to update the tooling ** Help-most-needed list: AlanClark and zaneb to update investment opportunities document [0] https://etherpad.openstack.org/p/2019-upstream-investment-opportunities-refactor [1] https://review.opendev.org/#/c/657447/ ** Goal selection: lbragstad to prune the community-goals etherpad [2] https://etherpad.openstack.org/p/community-goals ** Pop-up teams: ttx to define pop-up teams [3] https://review.opendev.org/#/c/661356/ [4] https://review.opendev.org/#/c/661983/5 ** SIG governance: tc-members to review goal, popup, and SIG project etherpad [5] https://etherpad.openstack.org/p/explain-team-formate-differentiate ** Review PTL Guide: asettle ttx ricolin to sync up and review the PTL section of the project teams guide to improve the PTL experience [6] https://review.opendev.org/#/c/665699/ * Active initiatives ** Python 3: mnaser to sync up with swift team on python3 migration and mugsie to sync with dhellmann or release-team to find the code for the proposal bot ** Forum follow-up: ttx to organise Milestone 2 forum meeting with tc-members ** Make goal selection a two-step process (needs reviews at https://review.opendev.org/#/c/667932/) * Update on U release naming process * What are retired repos ? Regards, -- Thierry Carrez (ttx) From whayutin at redhat.com Wed Jul 10 18:12:26 2019 From: whayutin at redhat.com (Wesley Hayutin) Date: Wed, 10 Jul 2019 12:12:26 -0600 Subject: [tripleo][openstack-ansible] Integrating ansible-role-collect-logs in OSA In-Reply-To: <71d568ee47ce516a5a4cab1422290da2be1baff6.camel@evrard.me> References: <71d568ee47ce516a5a4cab1422290da2be1baff6.camel@evrard.me> Message-ID: On Wed, Jul 10, 2019 at 2:23 AM Jean-Philippe Evrard < jean-philippe at evrard.me> wrote: > On Fri, 2019-06-28 at 16:30 +0530, Chandan kumar wrote: > > With os_tempest project, TripleO and Openstack Ansible projects > > started collaborating together to reuse the tools > > developed by each other to avoid duplicates and enable more > > collaboration. > > ... And that's amazing! > Good stuff :) Agreed. > > > In TripleO, we have ansible-role-collect-logs[1.] role for the same > > and in OSA we have logs_collect.sh[2.] script for > > the same. But once the logs gets collected at each other projects, It > > is very hard to navigate and find out where > > is the respective files. > > Agreed. > > > By Keeping the same structure at all places > > It would be easier. > > Agreed again > > > So moving ahead what we are going to do: > > * Refactor collect-logs role to pass defaults list of files at one > > place > > It seems the role is doing a lot of things, but things can be > conditionally triggered. Wondering if this role shouldn't be split into > multiple roles... But that's a different story. > > > * Pass the list of different logs files based on deployment tools > > I think this doesn't need to be in the role. Make the role the simplest > as possible, and flexible enough to get passed the list of things to > log/not log, and the destination. OSA can pass a list of files it wants > to gather. But isn't that what the role already does? Or did I > misunderstood? > The TripleO team passes various config files to the collect roles depending on what the needs and requirements are. Some of these config files are public some are not. 
upstream config https://github.com/openstack/tripleo-ci/blob/master/toci-quickstart/config/collect-logs.yml default config https://github.com/openstack/ansible-role-collect-logs/blob/master/defaults/main.yml These are of course just passed in as extra-config. I think each project would want to define their own list of files and maintain it in their own project. WDYT? > > > * Put system/containers related commands at one place > > Can we simply rely on ansible inventory, and running a play with the > role (targetting all) would gather logs for all systems (aio, > multinodes, containers), each system > could go into their own directory (log folder would be /{{ > inventory_hostname }}/...) for example: > > aio1/ > aio1-nova/ > machine2/ > > It simple enough. But I am happy to see a different approach. > Yes, this is exactly how it works today. We don't break out which files should be collect for each host, but that is just our preference. > > > * Replace the collect_logs.sh script with playbook in OSA and replace > > it. > > :thumbsup: > > > Thanks for reading, We are looking for the comments on the above > > suggestion. > > Thanks for tackling that up! > I am looking forward a simple common file gathering :) > > If you need to do changes in the role (to implement new features), > maybe it can help you if I give you a prio list :) > > What I am _generally_ looking for in the logs: > - The ara reports > - The tempest report > - The /etc/openstack_deploy/ > - The /var/log/ for containers/hosts > Agree, having a table of contents in the footer is nice for users as well. https://github.com/openstack/tripleo-ci/blob/master/docs/tripleo-quickstart-logs.html Which is added by infra via https://opendev.org/opendev/system-config/src/branch/master/modules/openstack_project/manifests/static.pp > > What I am _sometimes_ looking for in the logs (so less important/more > contextual for me): > - ram/disk usage per host > - NICs details > - cpu features (but I am not sure we really need this anymore) > - host details (generally zuul does that for me, so no need to > implement something there) > > AFAICT, if we were to organize the role more aggressively via the tasks we can easily enable or disable features as needed per project. The majority of the work would around the reorganization to better suit various projects. Any thoughts on additional work that I am not seeing? Thanks for responding! I know our team is very excited about the continued collaboration with other upstream projects, so thanks!! > Regards, > Jean-Philippe Evrard (evrardjp) > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vungoctan252 at gmail.com Tue Jul 9 15:25:59 2019 From: vungoctan252 at gmail.com (Vu Tan) Date: Tue, 9 Jul 2019 22:25:59 +0700 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: References: <09a3849b-786e-49ed-a197-5e13af0428bf@email.android.com> <144114a2e83d8e8e30579ddb0ae39e59@incloudus.com> Message-ID: Thank Patil Tushar, I hope it will be available soon On Tue, Jul 9, 2019 at 8:18 AM Patil, Tushar wrote: > Hi Vu and Gaetan, > > Gaetan, thank you for helping out Vu in setting up masakari-monitors > service. > > As a masakari team ,we have noticed there is a need to add proper > documentation to help the community run Masakari services in their > environment. We are working on adding proper documentation in this 'Train' > cycle. 
> > Will send an email on this mailing list once the patches are uploaded on > the gerrit so that you can give your feedback on the same. > > If you have any trouble in setting up Masakari, please let us know on this > mailing list or join the bi-weekly IRC Masakari meeting on the > #openstack-meeting IRC channel. The next meeting will be held on 16th July > 2019 @0400 UTC. > > Regards, > Tushar Patil > > ________________________________________ > From: Vu Tan > Sent: Monday, July 8, 2019 11:21:16 PM > To: Gaëtan Trellu > Cc: openstack-discuss at lists.openstack.org > Subject: Re: [masakari] how to install masakari on centos 7 > > Hi Gaetan, > Thanks for pinpoint this out, silly me that did not notice the simple > "error InterpreterNotFound: python3". Thanks a lot, I appreciate it > > On Mon, Jul 8, 2019 at 9:15 PM gaetan.trellu at incloudus.com>> wrote: > Vu Tan, > > About "auth_token" error, you need "os_privileged_user_*" options into > your masakari.conf for the API. > As mentioned previously please have a look here to have an example of > configuration working (for me at least): > > - masakari.conf: > > https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari.conf.j2 > - masakari-monitor.conf: > > https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari-monitors.conf.j2 > > About your tox issue make sure you have Python3 installed. > > Gaëtan > > On 2019-07-08 06:08, Vu Tan wrote: > > > Hi Gaetan, > > I try to generate config file by using this command tox -egenconfig on > > top level of masakari but the output is error, is this masakari still > > in beta version ? > > [root at compute1 masakari-monitors]# tox -egenconfig > > genconfig create: /root/masakari-monitors/.tox/genconfig > > ERROR: InterpreterNotFound: python3 > > _____________________________________________________________ summary > > ______________________________________________________________ > > ERROR: genconfig: InterpreterNotFound: python3 > > > > On Mon, Jul 8, 2019 at 3:24 PM Vu Tan vungoctan252 at gmail.com>> wrote: > > Hi, > > Thanks a lot for your reply, I install pacemaker/corosync, > > masakari-api, maskari-engine on controller node, and I run masakari-api > > with this command: masakari-api, but I dont know whether the process is > > running like that or is it just hang there, here is what it shows when > > I run the command, I leave it there for a while but it does not change > > anything : > > [root at controller masakari]# masakari-api > > 2019-07-08 15:21:09.946 30250 INFO masakari.api.openstack [-] Loaded > > extensions: ['extensions', 'notifications', 'os-hosts', 'segments', > > 'versions'] > > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > > [-] The option "__file__" in conf is not known to auth_token > > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > > [-] The option "here" in conf is not known to auth_token > > 2019-07-08 15:21:09.960 30250 WARNING keystonemiddleware.auth_token [-] > > AuthToken middleware is set with > > keystone_authtoken.service_token_roles_required set to False. This is > > backwards compatible but deprecated behaviour. Please set this to True. 
> > 2019-07-08 15:21:09.974 30250 INFO masakari.wsgi [-] masakari_api > > listening on 127.0.0.1:15868 > > 2019-07-08 15:21:09.975 30250 INFO oslo_service.service [-] Starting 4 > > workers > > 2019-07-08 15:21:09.984 30274 INFO masakari.masakari_api.wsgi.server > > [-] (30274) wsgi starting up on http://127.0.0.1:15868 > > 2019-07-08 15:21:09.985 30275 INFO masakari.masakari_api.wsgi.server > > [-] (30275) wsgi starting up on http://127.0.0.1:15868 > > 2019-07-08 15:21:09.992 30277 INFO masakari.masakari_api.wsgi.server > > [-] (30277) wsgi starting up on http://127.0.0.1:15868 > > 2019-07-08 15:21:09.994 30276 INFO masakari.masakari_api.wsgi.server > > [-] (30276) wsgi starting up on http://127.0.0.1:15868 > > > > On Sun, Jul 7, 2019 at 7:37 PM Gaëtan Trellu > > > wrote: > > > > Hi Vu Tan, > > > > Masakari documentation doesn't really exist... I had to figured some > > stuff by myself to make it works into Kolla project. > > > > On controller nodes you need: > > > > - pacemaker > > - corosync > > - masakari-api (openstack/masakari repository) > > - masakari- engine (openstack/masakari repository) > > > > On compute nodes you need: > > > > - pacemaker-remote (integrated to pacemaker cluster as a resource) > > - masakari- hostmonitor (openstack/masakari-monitor repository) > > - masakari-instancemonitor (openstack/masakari-monitor repository) > > - masakari-processmonitor (openstack/masakari-monitor repository) > > > > For masakari-hostmonitor, the service needs to have access to systemctl > > command (make sure you are not using sysvinit). > > > > For masakari-monitor, the masakari-monitor.conf is a bit different, you > > will have to configure the [api] section properly. > > > > RabbitMQ needs to be configured (as transport_url) on masakari-api and > > masakari-engine too. > > > > Please check this review[1], you will have masakari.conf and > > masakari-monitor.conf configuration examples. > > > > [1] https://review.opendev.org/#/c/615715 > > > > Gaëtan > > > > On Jul 7, 2019 12:08 AM, Vu Tan vungoctan252 at gmail.com>> wrote: > > > > VU TAN > > > > > 10:30 AM (35 minutes ago) > > > > to openstack-discuss > > > > Sorry, I resend this email because I realized that I lacked of prefix > > on this email's subject > > > > Hi, > > > > I would like to use Masakari and I'm having trouble finding a step by > > step or other documentation to get started with. Which part should be > > installed on controller, which is should be on compute, and what is the > > prerequisite to install masakari, I have installed corosync and > > pacemaker on compute and controller nodes, , what else do I need to do > > ? step I have done so far: > > - installed corosync/pacemaker > > - install masakari on compute node on this github repo: > > https://github.com/openstack/masakari > > - add masakari in to mariadb > > here is my configuration file of masakari.conf, do you mind to take a > > look at it, if I have misconfigured anything? 
> > > > [DEFAULT] > > enabled_apis = masakari_api > > > > # Enable to specify listening IP other than default > > masakari_api_listen = controller > > # Enable to specify port other than default > > masakari_api_listen_port = 15868 > > debug = False > > auth_strategy=keystone > > > > [wsgi] > > # The paste configuration file path > > api_paste_config = /etc/masakari/api-paste.ini > > > > [keystone_authtoken] > > www_authenticate_uri = http://controller:5000 > > auth_url = http://controller:5000 > > auth_type = password > > project_domain_id = default > > user_domain_id = default > > project_name = service > > username = masakari > > password = P at ssword > > > > [database] > > connection = mysql+pymysql://masakari:P at ssword@controller/masakari > Disclaimer: This email and any attachments are sent in strictest > confidence for the sole use of the addressee and may contain legally > privileged, confidential, and proprietary data. If you are not the intended > recipient, please advise the sender by replying promptly to this email and > then delete and destroy this email and any attachments without any further > use, copying or forwarding. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vungoctan252 at gmail.com Wed Jul 10 07:12:49 2019 From: vungoctan252 at gmail.com (Vu Tan) Date: Wed, 10 Jul 2019 14:12:49 +0700 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: References: <09a3849b-786e-49ed-a197-5e13af0428bf@email.android.com> <144114a2e83d8e8e30579ddb0ae39e59@incloudus.com> Message-ID: Hi Gaetan, I follow you the guide you gave me, but the problem still persist, can you please take a look at my configuration to see what is wrong or what is missing in my config ? the error: 2019-07-10 14:08:46.876 17292 WARNING keystonemiddleware._common.config [-] The option "__file__" in conf is not known to auth_token 2019-07-10 14:08:46.876 17292 WARNING keystonemiddleware._common.config [-] The option "here" in conf is not known to auth_token 2019-07-10 14:08:46.882 17292 WARNING keystonemiddleware.auth_token [-] AuthToken middleware is set with keystone_authtoken.service_ the config: [DEFAULT] enabled_apis = masakari_api log_dir = /var/log/kolla/masakari state_path = /var/lib/masakari os_user_domain_name = default os_project_domain_name = default os_privileged_user_tenant = service os_privileged_user_auth_url = http://controller:5000/v3 os_privileged_user_name = nova os_privileged_user_password = P at ssword masakari_api_listen = controller masakari_api_listen_port = 15868 debug = False auth_strategy=keystone [wsgi] # The paste configuration file path api_paste_config = /etc/masakari/api-paste.ini [keystone_authtoken] www_authenticate_uri = http://controller:5000 auth_url = http://controller:5000 auth_type = password project_domain_id = default project_domain_name = default user_domain_name = default user_domain_id = default project_name = service username = masakari password = P at ssword region_name = RegionOne [oslo_middleware] enable_proxy_headers_parsing = True [database] connection = mysql+pymysql://masakari:P at ssword@controller/masakari On Tue, Jul 9, 2019 at 10:25 PM Vu Tan wrote: > Thank Patil Tushar, I hope it will be available soon > > On Tue, Jul 9, 2019 at 8:18 AM Patil, Tushar > wrote: > >> Hi Vu and Gaetan, >> >> Gaetan, thank you for helping out Vu in setting up masakari-monitors >> service. 
>> >> As a masakari team ,we have noticed there is a need to add proper >> documentation to help the community run Masakari services in their >> environment. We are working on adding proper documentation in this 'Train' >> cycle. >> >> Will send an email on this mailing list once the patches are uploaded on >> the gerrit so that you can give your feedback on the same. >> >> If you have any trouble in setting up Masakari, please let us know on >> this mailing list or join the bi-weekly IRC Masakari meeting on the >> #openstack-meeting IRC channel. The next meeting will be held on 16th July >> 2019 @0400 UTC. >> >> Regards, >> Tushar Patil >> >> ________________________________________ >> From: Vu Tan >> Sent: Monday, July 8, 2019 11:21:16 PM >> To: Gaëtan Trellu >> Cc: openstack-discuss at lists.openstack.org >> Subject: Re: [masakari] how to install masakari on centos 7 >> >> Hi Gaetan, >> Thanks for pinpoint this out, silly me that did not notice the simple >> "error InterpreterNotFound: python3". Thanks a lot, I appreciate it >> >> On Mon, Jul 8, 2019 at 9:15 PM > gaetan.trellu at incloudus.com>> wrote: >> Vu Tan, >> >> About "auth_token" error, you need "os_privileged_user_*" options into >> your masakari.conf for the API. >> As mentioned previously please have a look here to have an example of >> configuration working (for me at least): >> >> - masakari.conf: >> >> https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari.conf.j2 >> - masakari-monitor.conf: >> >> https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari-monitors.conf.j2 >> >> About your tox issue make sure you have Python3 installed. >> >> Gaëtan >> >> On 2019-07-08 06:08, Vu Tan wrote: >> >> > Hi Gaetan, >> > I try to generate config file by using this command tox -egenconfig on >> > top level of masakari but the output is error, is this masakari still >> > in beta version ? >> > [root at compute1 masakari-monitors]# tox -egenconfig >> > genconfig create: /root/masakari-monitors/.tox/genconfig >> > ERROR: InterpreterNotFound: python3 >> > _____________________________________________________________ summary >> > ______________________________________________________________ >> > ERROR: genconfig: InterpreterNotFound: python3 >> > >> > On Mon, Jul 8, 2019 at 3:24 PM Vu Tan > vungoctan252 at gmail.com>> wrote: >> > Hi, >> > Thanks a lot for your reply, I install pacemaker/corosync, >> > masakari-api, maskari-engine on controller node, and I run masakari-api >> > with this command: masakari-api, but I dont know whether the process is >> > running like that or is it just hang there, here is what it shows when >> > I run the command, I leave it there for a while but it does not change >> > anything : >> > [root at controller masakari]# masakari-api >> > 2019-07-08 15:21:09.946 30250 INFO masakari.api.openstack [-] Loaded >> > extensions: ['extensions', 'notifications', 'os-hosts', 'segments', >> > 'versions'] >> > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config >> > [-] The option "__file__" in conf is not known to auth_token >> > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config >> > [-] The option "here" in conf is not known to auth_token >> > 2019-07-08 15:21:09.960 30250 WARNING keystonemiddleware.auth_token [-] >> > AuthToken middleware is set with >> > keystone_authtoken.service_token_roles_required set to False. This is >> > backwards compatible but deprecated behaviour. Please set this to True. 
>> > 2019-07-08 15:21:09.974 30250 INFO masakari.wsgi [-] masakari_api >> > listening on 127.0.0.1:15868 >> > 2019-07-08 15:21:09.975 30250 INFO oslo_service.service [-] Starting 4 >> > workers >> > 2019-07-08 15:21:09.984 30274 INFO masakari.masakari_api.wsgi.server >> > [-] (30274) wsgi starting up on http://127.0.0.1:15868 >> > 2019-07-08 15:21:09.985 30275 INFO masakari.masakari_api.wsgi.server >> > [-] (30275) wsgi starting up on http://127.0.0.1:15868 >> > 2019-07-08 15:21:09.992 30277 INFO masakari.masakari_api.wsgi.server >> > [-] (30277) wsgi starting up on http://127.0.0.1:15868 >> > 2019-07-08 15:21:09.994 30276 INFO masakari.masakari_api.wsgi.server >> > [-] (30276) wsgi starting up on http://127.0.0.1:15868 >> > >> > On Sun, Jul 7, 2019 at 7:37 PM Gaëtan Trellu >> > > >> wrote: >> > >> > Hi Vu Tan, >> > >> > Masakari documentation doesn't really exist... I had to figured some >> > stuff by myself to make it works into Kolla project. >> > >> > On controller nodes you need: >> > >> > - pacemaker >> > - corosync >> > - masakari-api (openstack/masakari repository) >> > - masakari- engine (openstack/masakari repository) >> > >> > On compute nodes you need: >> > >> > - pacemaker-remote (integrated to pacemaker cluster as a resource) >> > - masakari- hostmonitor (openstack/masakari-monitor repository) >> > - masakari-instancemonitor (openstack/masakari-monitor repository) >> > - masakari-processmonitor (openstack/masakari-monitor repository) >> > >> > For masakari-hostmonitor, the service needs to have access to systemctl >> > command (make sure you are not using sysvinit). >> > >> > For masakari-monitor, the masakari-monitor.conf is a bit different, you >> > will have to configure the [api] section properly. >> > >> > RabbitMQ needs to be configured (as transport_url) on masakari-api and >> > masakari-engine too. >> > >> > Please check this review[1], you will have masakari.conf and >> > masakari-monitor.conf configuration examples. >> > >> > [1] https://review.opendev.org/#/c/615715 >> > >> > Gaëtan >> > >> > On Jul 7, 2019 12:08 AM, Vu Tan > vungoctan252 at gmail.com>> wrote: >> > >> > VU TAN > >> > >> > 10:30 AM (35 minutes ago) >> > >> > to openstack-discuss >> > >> > Sorry, I resend this email because I realized that I lacked of prefix >> > on this email's subject >> > >> > Hi, >> > >> > I would like to use Masakari and I'm having trouble finding a step by >> > step or other documentation to get started with. Which part should be >> > installed on controller, which is should be on compute, and what is the >> > prerequisite to install masakari, I have installed corosync and >> > pacemaker on compute and controller nodes, , what else do I need to do >> > ? step I have done so far: >> > - installed corosync/pacemaker >> > - install masakari on compute node on this github repo: >> > https://github.com/openstack/masakari >> > - add masakari in to mariadb >> > here is my configuration file of masakari.conf, do you mind to take a >> > look at it, if I have misconfigured anything? 
>> > >> > [DEFAULT] >> > enabled_apis = masakari_api >> > >> > # Enable to specify listening IP other than default >> > masakari_api_listen = controller >> > # Enable to specify port other than default >> > masakari_api_listen_port = 15868 >> > debug = False >> > auth_strategy=keystone >> > >> > [wsgi] >> > # The paste configuration file path >> > api_paste_config = /etc/masakari/api-paste.ini >> > >> > [keystone_authtoken] >> > www_authenticate_uri = http://controller:5000 >> > auth_url = http://controller:5000 >> > auth_type = password >> > project_domain_id = default >> > user_domain_id = default >> > project_name = service >> > username = masakari >> > password = P at ssword >> > >> > [database] >> > connection = mysql+pymysql://masakari:P at ssword@controller/masakari >> Disclaimer: This email and any attachments are sent in strictest >> confidence for the sole use of the addressee and may contain legally >> privileged, confidential, and proprietary data. If you are not the intended >> recipient, please advise the sender by replying promptly to this email and >> then delete and destroy this email and any attachments without any further >> use, copying or forwarding. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wbedyk at suse.de Wed Jul 10 10:26:27 2019 From: wbedyk at suse.de (Witek Bedyk) Date: Wed, 10 Jul 2019 12:26:27 +0200 Subject: [auto-scaling] Auto-scaling SIG update In-Reply-To: References: Message-ID: Hi Rico, thanks for putting this together. > *Open for bug?* > One question I would like to bring to the team is should we open our > storyboard[4] for bug collection? We have some good hands from multiple > teams, who can help with bugs. And like feature requests, a top-level > story for a bug should be very helpful, but what we need from the > reporter in order to get better information about that bug, and also > what part should we help? > IMO we can make sure bug are well documented, trace bug progress (from > top-level view), and help to raise attention(in ML and in events). Would > like to hear more opinions on this. I think it's a good idea. Most bugs will refer to individual projects and code changes there. But StoryBoard has a nice ability to collect tasks for several projects in one story. Auto-scaling related bugs in individual projects could add an additional task in auto-scaling project. That way the internal project work will gain more general context and better visibility. Cheers Witek From laurentfdumont at gmail.com Wed Jul 10 19:08:34 2019 From: laurentfdumont at gmail.com (Laurent Dumont) Date: Wed, 10 Jul 2019 15:08:34 -0400 Subject: openstack-sdk API - Get all flavors for all projects. Message-ID: Hi everyone, I'm trying to map the relationship between projects, host-aggregate and computes. I'm trying to find a way to use the Openstack Python SDK in order to list all existing flavors and the "access_project_ids" seems to be the best way to see which project is associated with the flavor. That said, it seems that the python SDK does not expose the value when querying using "get_flavor". Anyone know of way of getting the information from the SDK? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... 
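Regarding the flavor question above: one possible route is the
shade-derived cloud layer of openstacksdk, which exposes
list_flavors() and list_flavor_access(). The snippet below is only a
rough, untested sketch; the cloud name "mycloud" is a placeholder, the
access list can only be fetched for non-public flavors, and the key
returned for the project ('project_id' versus 'tenant_id') differs
between SDK releases, so treat those names as assumptions to verify.

  # Rough sketch: map each private flavor to the projects allowed to use it.
  import openstack

  conn = openstack.connect(cloud='mycloud')  # placeholder clouds.yaml entry

  for flavor in conn.list_flavors():
      if flavor.get('is_public', True):
          print(flavor['name'], 'is public (no access list)')
          continue
      access = conn.list_flavor_access(flavor['id'])
      projects = [a.get('project_id') or a.get('tenant_id') for a in access]
      print(flavor['name'], 'is limited to', projects)

python-novaclient's flavor_access.list(flavor=...) call is another
option if the SDK method is not available in the release being used.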
URL: From james.slagle at gmail.com Wed Jul 10 20:17:38 2019 From: james.slagle at gmail.com (James Slagle) Date: Wed, 10 Jul 2019 16:17:38 -0400 Subject: [TripleO] Scaling node counts with only Ansible (N=1) Message-ID: There's been a fair amount of recent work around simplifying our Heat templates and migrating the software configuration part of our deployment entirely to Ansible. As part of this effort, it became apparent that we could render much of the data that we need out of Heat in a way that is generic per node, and then have Ansible render the node specific data during config-download runtime. To illustrate the point, consider when we specify ComputeCount:10 in our templates, that much of the work that Heat is doing across those 10 sets of resources for each Compute node is duplication. However, it's been necessary so that Heat can render data structures such as list of IP's, lists of hostnames, contents of /etc/hosts files, etc etc etc. If all that was driven by Ansible using host facts, then Heat doesn't need to do those 10 sets of resources to begin with. The goal is to get to a point where we can deploy the Heat stack with a count of 1 for each role, and then deploy any number of nodes per role using Ansible. To that end, I've been referring to this effort as N=1. The value in this work is that it directly addresses our scaling issues with Heat (by just deploying a much smaller stack). Obviously we'd still be relying heavily on Ansible to scale to the required levels, but I feel that is much better understood challenge at this point in the evolution of configuration tools. With the patches that we've been working on recently, I've got a POC running where I can deploy additional compute nodes with just Ansible. This is done by just adding the additional nodes to the Ansible inventory with a small set of facts to include IP addresses on each enabled network and a hostname. These patches are at https://review.opendev.org/#/q/topic:bp/reduce-deployment-resources and reviews/feedback are welcome. Other points: - Baremetal provisioning and port creation are presently handled by Heat. With the ongoing efforts to migrate baremetal provisioning out of Heat (nova-less deploy), I think these efforts are very complimentary. Eventually, we get to a point where Heat is not actually creating any other OpenStack API resources. For now, the patches only work when using pre-provisioned nodes. - We need to consider how we'd manage the Ansible inventory going forward if we open up an interface for operators to manipulate it directly. That's something we'd want to manage and preserve (version control) as it's critical data for the deployment. Given the progress that we've made with the POC, my sense is that we'll keep pushing in this overall direction. I'd like to get some feedback on the approach. We have an etherpad we are using to track some of the work at a high level: https://etherpad.openstack.org/p/tripleo-reduce-deployment-resources I'll be adding some notes on how I setup the POC to that etherpad if others would like to try it out. -- -- James Slagle -- From colleen at gazlene.net Wed Jul 10 22:15:45 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Wed, 10 Jul 2019 15:15:45 -0700 Subject: [keystone] Shanghai PTG planning Message-ID: <60277b1f-9453-4abd-a3c4-a9117885e56f@www.fastmail.com> Hi team, The foundation has asked us to let them know whether we'll be needing PTG space in Shanghai. 
The keystone team usually has good attendance at PTGs and makes good use of the time, but I know this next one may be hard for some people to attend so I don't want to automatically assume we'll use it this time. If you wish to attend the next PTG and have a semi-reasonable amount of confidence that you may be able to, please add your name to the etherpad: https://etherpad.openstack.org/p/keystone-shanghai-ptg Please do this by Monday, August 5 so that I have time to respond to the foundation by August 11. Feel free to also start brainstorming topics for the PTG (no deadline on this). Colleen From openstack at fried.cc Wed Jul 10 22:44:57 2019 From: openstack at fried.cc (Eric Fried) Date: Wed, 10 Jul 2019 17:44:57 -0500 Subject: [nova][ptg] Shanghai attendance Message-ID: <1b6fcd6a-b3c6-d99b-fb20-466a8733115d@fried.cc> We (nova) need to let the foundation know about our planned presence at the Shanghai PTG for the U release. To that end, I've seeded an etherpad [1]. If you are a contributor or operator with an interest in nova, please add your name to the Attendance section, even (especially) if you know you are not attending. I need this information by the beginning of August, even if it's tentative. Thanks! efried [1] https://etherpad.openstack.org/p/nova-shanghai-ptg From mriedemos at gmail.com Thu Jul 11 00:06:58 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 10 Jul 2019 19:06:58 -0500 Subject: [nova][ptg] Shanghai attendance In-Reply-To: <1b6fcd6a-b3c6-d99b-fb20-466a8733115d@fried.cc> References: <1b6fcd6a-b3c6-d99b-fb20-466a8733115d@fried.cc> Message-ID: On 7/10/2019 5:44 PM, Eric Fried wrote: > We (nova) need to let the foundation know about our planned presence at > the Shanghai PTG for the U release. To that end, I've seeded an etherpad > [1]. If you are a contributor or operator with an interest in nova, > please add your name to the Attendance section, even (especially) if you > know you are not attending. I need this information by the beginning of > August, even if it's tentative. > > Thanks! > > efried > > [1]https://etherpad.openstack.org/p/nova-shanghai-ptg Any idea when the early bird pricing ends? Also, will there be a separate discount for the PTG like there has been in the past or is it all a flat combined rate now? This is more a question for Foundation-y people reading this. -- Thanks, Matt From allison at openstack.org Thu Jul 11 00:11:00 2019 From: allison at openstack.org (Allison Price) Date: Wed, 10 Jul 2019 19:11:00 -0500 Subject: [nova][ptg] Shanghai attendance In-Reply-To: References: <1b6fcd6a-b3c6-d99b-fb20-466a8733115d@fried.cc> Message-ID: Hey Matt, Early bird pricing is going to end in early August - we are publishing the exact date soon. There will be a contributor discount that will include OpenStack ATCs and AUCs, but there will not be a separate PTG discount. We will have more information in the upcoming days, but let us know if you have questions in the meantime. Thanks! Allison > On Jul 10, 2019, at 7:06 PM, Matt Riedemann wrote: > > On 7/10/2019 5:44 PM, Eric Fried wrote: >> We (nova) need to let the foundation know about our planned presence at >> the Shanghai PTG for the U release. To that end, I've seeded an etherpad >> [1]. If you are a contributor or operator with an interest in nova, >> please add your name to the Attendance section, even (especially) if you >> know you are not attending. I need this information by the beginning of >> August, even if it's tentative. >> Thanks! 
>> efried >> [1]https://etherpad.openstack.org/p/nova-shanghai-ptg > > Any idea when the early bird pricing ends? > > Also, will there be a separate discount for the PTG like there has been in the past or is it all a flat combined rate now? > > This is more a question for Foundation-y people reading this. > > -- > > Thanks, > > Matt > From mriedemos at gmail.com Thu Jul 11 00:15:00 2019 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 10 Jul 2019 19:15:00 -0500 Subject: [nova][ptg] Shanghai attendance In-Reply-To: References: <1b6fcd6a-b3c6-d99b-fb20-466a8733115d@fried.cc> Message-ID: On 7/10/2019 7:11 PM, Allison Price wrote: > Early bird pricing is going to end in early August - we are publishing the exact date soon. There will be a contributor discount that will include OpenStack ATCs and AUCs, but there will not be a separate PTG discount. > > We will have more information in the upcoming days, but let us know if you have questions in the meantime. Thanks. I just compared the PTG and Summit registrations and it looks like they are the same - $231 for standard access and that gets you into both events. If it's a combined ticket I'm assuming any ATC discount would just apply to that (both summit and PTG). I'm also assuming "but there will not be a separate PTG discount" just means there is no discount for attending the last PTG in Denver. -- Thanks, Matt From emccormick at cirrusseven.com Thu Jul 11 00:17:47 2019 From: emccormick at cirrusseven.com (Erik McCormick) Date: Wed, 10 Jul 2019 20:17:47 -0400 Subject: [nova][ptg] Shanghai attendance In-Reply-To: References: <1b6fcd6a-b3c6-d99b-fb20-466a8733115d@fried.cc> Message-ID: On Wed, Jul 10, 2019, 8:08 PM Matt Riedemann wrote: > On 7/10/2019 5:44 PM, Eric Fried wrote: > > We (nova) need to let the foundation know about our planned presence at > > the Shanghai PTG for the U release. To that end, I've seeded an etherpad > > [1]. If you are a contributor or operator with an interest in nova, > > please add your name to the Attendance section, even (especially) if you > > know you are not attending. I need this information by the beginning of > > August, even if it's tentative. > > > > Thanks! > > > > efried > > > > [1]https://etherpad.openstack.org/p/nova-shanghai-ptg > > Any idea when the early bird pricing ends? > > Also, will there be a separate discount for the PTG like there has been > in the past or is it all a flat combined rate now? > > This is more a question for Foundation-y people reading this. > Attendance in Denver PTG (last Train Hotel) should get you into this summit gratis, no? I've been assuming codes will appear at some point ;) -- > > Thanks, > > Matt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kendall at openstack.org Thu Jul 11 00:38:56 2019 From: kendall at openstack.org (Kendall Waters) Date: Wed, 10 Jul 2019 19:38:56 -0500 Subject: [nova][ptg] Shanghai attendance In-Reply-To: References: <1b6fcd6a-b3c6-d99b-fb20-466a8733115d@fried.cc> Message-ID: Yes, that is correct, the standard access will grant you access to both the Summit and PTG and the contributor discount can be applied to that ticket. To clarify, there is no discount for attending the last PTG. Since we have co-located the Summit and PTG, we are no longer offering a PTG attendee discount to the Summit like we have in the past. 
Cheers, Kendall > On Jul 10, 2019, at 7:17 PM, Erik McCormick wrote: > > > > On Wed, Jul 10, 2019, 8:08 PM Matt Riedemann > wrote: > On 7/10/2019 5:44 PM, Eric Fried wrote: > > We (nova) need to let the foundation know about our planned presence at > > the Shanghai PTG for the U release. To that end, I've seeded an etherpad > > [1]. If you are a contributor or operator with an interest in nova, > > please add your name to the Attendance section, even (especially) if you > > know you are not attending. I need this information by the beginning of > > August, even if it's tentative. > > > > Thanks! > > > > efried > > > > [1]https://etherpad.openstack.org/p/nova-shanghai-ptg > > Any idea when the early bird pricing ends? > > Also, will there be a separate discount for the PTG like there has been > in the past or is it all a flat combined rate now? > > This is more a question for Foundation-y people reading this. > > Attendance in Denver PTG (last Train Hotel) should get you into this summit gratis, no? I've been assuming codes will appear at some point ;) > > -- > > Thanks, > > Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From emccormick at cirrusseven.com Thu Jul 11 01:24:31 2019 From: emccormick at cirrusseven.com (Erik McCormick) Date: Wed, 10 Jul 2019 21:24:31 -0400 Subject: [nova][ptg] Shanghai attendance In-Reply-To: References: <1b6fcd6a-b3c6-d99b-fb20-466a8733115d@fried.cc> Message-ID: On Wed, Jul 10, 2019, 8:38 PM Kendall Waters wrote: > Yes, that is correct, the standard access will grant you access to both > the Summit and PTG and the contributor discount can be applied to that > ticket. > > To clarify, there is no discount for attending the last PTG. Since we have > co-located the Summit and PTG, we are no longer offering a PTG attendee > discount to the Summit like we have in the past. > The PTG I was referring to was the last one at the Renaissance. That was supposed to come with free admission to the next 2 summits (Denver and Shanghai). Hopefully that is still being honored? > Cheers, > Kendall > > > On Jul 10, 2019, at 7:17 PM, Erik McCormick > wrote: > > > > On Wed, Jul 10, 2019, 8:08 PM Matt Riedemann wrote: > >> On 7/10/2019 5:44 PM, Eric Fried wrote: >> > We (nova) need to let the foundation know about our planned presence at >> > the Shanghai PTG for the U release. To that end, I've seeded an etherpad >> > [1]. If you are a contributor or operator with an interest in nova, >> > please add your name to the Attendance section, even (especially) if you >> > know you are not attending. I need this information by the beginning of >> > August, even if it's tentative. >> > >> > Thanks! >> > >> > efried >> > >> > [1]https://etherpad.openstack.org/p/nova-shanghai-ptg >> >> Any idea when the early bird pricing ends? >> >> Also, will there be a separate discount for the PTG like there has been >> in the past or is it all a flat combined rate now? >> >> This is more a question for Foundation-y people reading this. >> > > Attendance in Denver PTG (last Train Hotel) should get you into this > summit gratis, no? I've been assuming codes will appear at some point ;) > > -- >> >> Thanks, >> >> Matt > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 
From fungi at yuggoth.org Thu Jul 11 03:03:48 2019
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Thu, 11 Jul 2019 03:03:48 +0000
Subject: [nova][ptg] Shanghai attendance
In-Reply-To: 
References: <1b6fcd6a-b3c6-d99b-fb20-466a8733115d@fried.cc>
Message-ID: <20190711030347.33gkdi2ncuxzi2ok@yuggoth.org>

On 2019-07-10 21:24:31 -0400 (-0400), Erik McCormick wrote:
[...]
> The PTG I was referring to was the last one at the Renaissance. That was
> supposed to come with free admission to the next 2 summits (Denver and
> Shanghai). Hopefully that is still being honored?
[...]

Your list has an off-by-one error. ;)

That second "Denver" PTG at the Renaissance Stapleton Hotel was September 2018. The next two summits following it were November 2018 in Berlin and April 2019 in Denver.
-- 
Jeremy Stanley
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 963 bytes
Desc: not available
URL: 

From emccormick at cirrusseven.com Thu Jul 11 03:21:21 2019
From: emccormick at cirrusseven.com (Erik McCormick)
Date: Wed, 10 Jul 2019 23:21:21 -0400
Subject: [nova][ptg] Shanghai attendance
In-Reply-To: <20190711030347.33gkdi2ncuxzi2ok@yuggoth.org>
References: <1b6fcd6a-b3c6-d99b-fb20-466a8733115d@fried.cc> <20190711030347.33gkdi2ncuxzi2ok@yuggoth.org>
Message-ID: 

On Wed, Jul 10, 2019, 11:05 PM Jeremy Stanley wrote:
> On 2019-07-10 21:24:31 -0400 (-0400), Erik McCormick wrote:
> [...]
> > The PTG I was referring to was the last one at the Renaissance. That was
> > supposed to come with free admission to the next 2 summits (Denver and
> > Shanghai). Hopefully that is still being honored?
> [...]
>
> Your list has an off-by-one error. ;)
>
> That second "Denver" PTG at the Renaissance Stapleton Hotel was
> September 2018. The next two summits following it were November 2018
> in Berlin and April 2019 in Denver.
> --
> Jeremy Stanley
>

Wow. These cold meds have truly addled my brain. Thanks for the correction. I sleep now.

-Erik
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sungn2 at lenovo.com Thu Jul 11 06:03:54 2019
From: sungn2 at lenovo.com (Guannan GN2 Sun)
Date: Thu, 11 Jul 2019 06:03:54 +0000
Subject: [devstack] Deploy issue on Ubuntu 16.04. (Guannan)
Message-ID: <594856dca0114bfaa4596dedf13c68a7 at lenovo.com>

Thanks Chris and Sean,

I ran unstack.sh, removed the files in /etc/apache2/sites-enabled and /etc/apache2/sites-available, and reran stack.sh; that works for me! Although I'm still running the master branch of devstack on Ubuntu 16.04, as Sean says, I may move to Ubuntu 18.04 later.

Thank you!

Best Regards,
Guannan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From manuel.sb at garvan.org.au Thu Jul 11 06:15:46 2019
From: manuel.sb at garvan.org.au (Manuel Sopena Ballesteros)
Date: Thu, 11 Jul 2019 06:15:46 +0000
Subject: trying to understand steal time with cpu pinning
Message-ID: <9D8A2486E35F0941A60430473E29F15B017EB2889D at MXDB1.ad.garvan.unsw.edu.au>

Dear Openstack community,

Please correct me if I am wrong.

As far as I understand `steal time > 0` means that the hypervisor has replaced a vcpu with a different one on the physical cpu. Also, cpu pinning allocates a vcpu to a physical cpu permanently.

I have a vm setup with cpu pinning and numa affinity and realized that cpu steal time is between 1% and 0%.

Why is that?

Thank you very much
NOTICE
Please consider the environment before printing this email. 
This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylightcoder at gmail.com Thu Jul 11 06:58:23 2019 From: skylightcoder at gmail.com (=?UTF-8?B?R8O2a2hhbiBJxZ5JSw==?=) Date: Thu, 11 Jul 2019 09:58:23 +0300 Subject: [Nova] Instances can't be started after compute nodes unexpectedly shut down because of power outage In-Reply-To: References: Message-ID: Can anyone help me please ? I can no't rescue my instances yet :( Thanks, Gökhan Gökhan IŞIK , 9 Tem 2019 Sal, 15:46 tarihinde şunu yazdı: > Hi folks, > Because of power outage, Most of our compute nodes unexpectedly > shut down and now I can not start our instances. Error message is "Failed > to get "write" lock another process using the image?". Instances Power > status is No State. Full error log is > http://paste.openstack.org/show/754107/. My environment is OpenStack Pike > on Ubuntu 16.04 LTS servers and Instances are on a nfs shared storage. Nova > version is 16.1.6.dev2. qemu version is 2.10.1. libvirt version is 3.6.0. I > saw a commit [1], but it doesn't solve this problem. > There are important instances on my environment. How can I rescue my > instances? What would you suggest ? > > Thanks, > Gökhan > > [1] https://review.opendev.org/#/c/509774/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jyotishri403 at gmail.com Thu Jul 11 07:22:20 2019 From: jyotishri403 at gmail.com (Jyoti Dahiwele) Date: Thu, 11 Jul 2019 12:52:20 +0530 Subject: [cinder] Glusterfs support in stein In-Reply-To: <20190708125119.GA15668@sm-workstation> References: <20190708125119.GA15668@sm-workstation> Message-ID: Thank you for the clarification. On Mon, 8 Jul 2019, 18:21 Sean McGinnis, wrote: > On Mon, Jul 08, 2019 at 02:13:13PM +0200, Massimo Sgaravatto wrote: > > We also used it in the past but as far as I remember the gluster driver > was > > removed in Ocata (we had therefore to migrate to the NFSdriver) > > > > Cheers, Massimo > > > > This is correct. The GlusterFS driver was marked as deprecated by its > maintainers in the Newton release, then officially removed in Ocata. > > The driver can still be found in those stable branches, but would probably > take > a not insignificant amount of work to use in later releases. > > Sean > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.teckelmann at bertelsmann.de Thu Jul 11 07:28:26 2019 From: ralf.teckelmann at bertelsmann.de (Teckelmann, Ralf, NMU-OIP) Date: Thu, 11 Jul 2019 07:28:26 +0000 Subject: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' Message-ID: Good Morning everyone, We like to have FWaaS enabled for a Stein-based OpenStack installation. Using linuxbridges we are not able to use FWaaS_v2, because it only seems to work with ovs. We thus tried FWaaS (v1) following https://docs.openstack.org/openstack-ansible-os_neutron/latest/configure-network-services.html#firewall-service-optional . However, all we get from it is (1). 
Are we missing a point or is FWaaS_V1 just not supported in Stein anymore? If so, this would mean for a setup Stein+Linuxbridges no FWaaS is actually available, right? (1) grep firewall /var/log/neutron/neutron-server.log 2019-07-05 10:10:55.693 29793 ERROR neutron_lib.utils.runtime NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' 2019-07-05 10:10:55.694 29793 ERROR neutron.manager [req-394624b6-e638-45ec-be7c-ce86793fdbc4 - - - - -] Plugin 'firewall' not found. 2019-07-05 10:11:00.046 29979 INFO neutron.manager [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Loading Plugin: firewall 2019-07-05 10:11:00.046 29979 ERROR neutron_lib.utils.runtime [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Error loading class by alias: NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' Best regards, Ralf T. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Thu Jul 11 07:59:27 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 11 Jul 2019 08:59:27 +0100 (BST) Subject: [placement][ptg] Shanghai attendance Message-ID: As with nova [1] and keystone [2], placement needs to decide what kind of presence it would like to have at the PTG portion of the event in Shanghai in November, and let the Foundation know. I've already expressed that I don't intend to go [3] and in that message asked for feedback from the rest of the placement team on whether we need to do any placement work there, given the value we got out of the virtual pre-PTG in April. Let me ask again: Do we need a presence at the PTG in Shanghai? I've created an etherpad where we can track who will be there for sure and who will not, and start making note of relevant topics. https://etherpad.openstack.org/p/placement-shanghai-ptg [1] http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007640.html [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007639.html [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007527.html -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From skaplons at redhat.com Thu Jul 11 08:04:02 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 11 Jul 2019 10:04:02 +0200 Subject: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' In-Reply-To: References: Message-ID: <28A1866C-F2C6-4BAC-BE73-467AD22B4805@redhat.com> Hi, FWaaS v1 was deprecated since some time and was removed completely in Stein release. > On 11 Jul 2019, at 09:28, Teckelmann, Ralf, NMU-OIP wrote: > > Good Morning everyone, > > We like to have FWaaS enabled for a Stein-based OpenStack installation. > Using linuxbridges we are not able to use FWaaS_v2, because it only seems to work with ovs. > > We thus tried FWaaS (v1) following https://docs.openstack.org/openstack-ansible-os_neutron/latest/configure-network-services.html#firewall-service-optional . > However, all we get from it is (1). > > Are we missing a point or is FWaaS_V1 just not supported in Stein anymore? > If so, this would mean for a setup Stein+Linuxbridges no FWaaS is actually available, right? > > (1) > grep firewall /var/log/neutron/neutron-server.log > 2019-07-05 10:10:55.693 29793 ERROR neutron_lib.utils.runtime NoMatches: No'neutron.service_plugins' driver found, looking for 'firewall' > 2019-07-05 10:10:55.694 29793 ERROR neutron.manager [req-394624b6-e638-45ec-be7c-ce86793fdbc4 - - - - -] Plugin 'firewall' not found. 
> 2019-07-05 10:11:00.046 29979 INFO neutron.manager [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Loading Plugin: firewall > 2019-07-05 10:11:00.046 29979 ERROR neutron_lib.utils.runtime [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Error loading class by alias: NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' > > Best regards, > > Ralf T. — Slawek Kaplonski Senior software engineer Red Hat From ignaziocassano at gmail.com Thu Jul 11 09:01:09 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 11 Jul 2019 11:01:09 +0200 Subject: [queens][nova] nova host-evacuate errot Message-ID: Hello All, on ocata when I poweroff a node with active instance , doing a nova host-evacuate works fine and instances are restartd on an active node. On queens it does non evacuate instances but nova-api reports for each instance the following: 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is in task_state powering-off So it poweroff all instance on the failed node but does not start them on active nodes What is changed ? Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.teckelmann at bertelsmann.de Thu Jul 11 09:23:54 2019 From: ralf.teckelmann at bertelsmann.de (Teckelmann, Ralf, NMU-OIP) Date: Thu, 11 Jul 2019 09:23:54 +0000 Subject: AW: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' In-Reply-To: <28A1866C-F2C6-4BAC-BE73-467AD22B4805@redhat.com> References: , <28A1866C-F2C6-4BAC-BE73-467AD22B4805@redhat.com> Message-ID: Hello Slawek, Thank your for your fast response. This means in regard of a Stein-Deployment with Linuxbridges no Perimeter-Firewall is offered anymore. Are there plans to remedy this deficiency in the next releases? Cheers, Ralf T. ________________________________ Von: Slawek Kaplonski Gesendet: Donnerstag, 11. Juli 2019 10:04:02 An: Teckelmann, Ralf, NMU-OIP Cc: openstack-discuss at lists.openstack.org Betreff: Re: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' Hi, FWaaS v1 was deprecated since some time and was removed completely in Stein release. > On 11 Jul 2019, at 09:28, Teckelmann, Ralf, NMU-OIP wrote: > > Good Morning everyone, > > We like to have FWaaS enabled for a Stein-based OpenStack installation. > Using linuxbridges we are not able to use FWaaS_v2, because it only seems to work with ovs. > > We thus tried FWaaS (v1) following https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_openstack-2Dansible-2Dos-5Fneutron_latest_configure-2Dnetwork-2Dservices.html-23firewall-2Dservice-2Doptional&d=DwIFaQ&c=vo2ie5TPcLdcgWuLVH4y8lsbGPqIayH3XbK3gK82Oco&r=WXex93lsaiQ-z7CeZkHv93lzt4fdCRIPXloSPQEU7CM&m=mRJxK4Dne35uMLvIxZWOXNeMxXzMcUTsQQd1yrgQ7kM&s=9KmdvZINwdij6mV-kMqE6S94CMiK4z8yO1b7cfXNhv8&e= . > However, all we get from it is (1). > > Are we missing a point or is FWaaS_V1 just not supported in Stein anymore? > If so, this would mean for a setup Stein+Linuxbridges no FWaaS is actually available, right? 
> > (1) > grep firewall /var/log/neutron/neutron-server.log > 2019-07-05 10:10:55.693 29793 ERROR neutron_lib.utils.runtime NoMatches: No'neutron.service_plugins' driver found, looking for 'firewall' > 2019-07-05 10:10:55.694 29793 ERROR neutron.manager [req-394624b6-e638-45ec-be7c-ce86793fdbc4 - - - - -] Plugin 'firewall' not found. > 2019-07-05 10:11:00.046 29979 INFO neutron.manager [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Loading Plugin: firewall > 2019-07-05 10:11:00.046 29979 ERROR neutron_lib.utils.runtime [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Error loading class by alias: NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' > > Best regards, > > Ralf T. — Slawek Kaplonski Senior software engineer Red Hat -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Jul 11 09:27:49 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 11 Jul 2019 11:27:49 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: I am sorry. For simulating an host crash I used a wrong procedure. Using "echo 'c' > /proc/sysrq-trigger" all work fine Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < ignaziocassano at gmail.com> ha scritto: > Hello All, > on ocata when I poweroff a node with active instance , doing a nova > host-evacuate works fine > and instances are restartd on an active node. > On queens it does non evacuate instances but nova-api reports for each > instance the following: > > 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi > [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 > c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: > Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is > in task_state powering-off > > So it poweroff all instance on the failed node but does not start them on > active nodes > > What is changed ? > Ignazio > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jyotishri403 at gmail.com Thu Jul 11 09:37:19 2019 From: jyotishri403 at gmail.com (Jyoti Dahiwele) Date: Thu, 11 Jul 2019 15:07:19 +0530 Subject: root disk for instance Message-ID: Dear Team, Please clear me my following doubts. When I use image from source option and mini flavor to create an instace, from which storage pool instance will get root disk ? From cinder or glance? -------------- next part -------------- An HTML attachment was scrubbed... URL: From manulachathurika at gmail.com Thu Jul 11 09:41:03 2019 From: manulachathurika at gmail.com (Manula Thantriwatte) Date: Thu, 11 Jul 2019 15:11:03 +0530 Subject: g-placement / g-api didn't start in devstack Message-ID: I'm installing DevStack in Ubuntu 18.04. I'm using stable/stein branch for the installation. When I running the stach.sh I'm getting following error. I have updated the timeout to 300 seconds as well. But still I'm not able to run the DevStack. Could you please help me on this issue? Is there specific python version do I need to install. In my PC I have python version 3.6.8. 
++:: curl -g -k --noproxy '*' -s -o /dev/null -w '%{http_code}' http://192.168.9.10/image +:: [[ 503 == 503 ]] +:: sleep 1 +functions:wait_for_service:431 rval=124 +functions:wait_for_service:436 time_stop wait_for_service +functions-common:time_stop:2317 local name +functions-common:time_stop:2318 local end_time +functions-common:time_stop:2319 local elapsed_time +functions-common:time_stop:2320 local total +functions-common:time_stop:2321 local start_time +functions-common:time_stop:2323 name=wait_for_service +functions-common:time_stop:2324 start_time=1562654135525 +functions-common:time_stop:2326 [[ -z 1562654135525 ]] ++functions-common:time_stop:2329 date +%s%3N +functions-common:time_stop:2329 end_time=1562654195603 +functions-common:time_stop:2330 elapsed_time=60078 +functions-common:time_stop:2331 total=93 +functions-common:time_stop:2333 _TIME_START[$name]= +functions-common:time_stop:2334 _TIME_TOTAL[$name]=60171 +functions:wait_for_service:437 return 124 +lib/glance:start_glance:353 die 353 'g-api did not start' +functions-common:die:195 local exitcode=0 [Call Trace] ./stack.sh:1261:start_glance /opt/stack/devstack/lib/glance:353:die [ERROR] /opt/stack/devstack/lib/glance:353 g-api did not start Error on exit Traceback (most recent call last): File "/opt/stack/devstack/tools/worlddump.py", line 255, in sys.exit(main()) File "/opt/stack/devstack/tools/worlddump.py", line 239, in main Thanks ! -- Regards, Manula Chathurika Thantriwatte phone : (+94) 772492511 email : manulachathurika at gmail.com Linkedin : *http://lk.linkedin.com/in/manulachathurika * blog : http://manulachathurika.blogspot.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jayachander.it at gmail.com Thu Jul 11 09:48:21 2019 From: jayachander.it at gmail.com (Jay See) Date: Thu, 11 Jul 2019 11:48:21 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: Hi Ignazio, I am trying to evacuate the compute host on older version (mitaka). Could please share the process you followed. I am not able to succeed with openstack live-migration fails with error message (this is known issue in older versions) and nova live-ligration - nothing happens even after initiating VM migration. It is almost 4 days. ~Jay. On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano wrote: > I am sorry. > For simulating an host crash I used a wrong procedure. > Using "echo 'c' > /proc/sysrq-trigger" all work fine > > Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < > ignaziocassano at gmail.com> ha scritto: > >> Hello All, >> on ocata when I poweroff a node with active instance , doing a nova >> host-evacuate works fine >> and instances are restartd on an active node. >> On queens it does non evacuate instances but nova-api reports for each >> instance the following: >> >> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi >> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 >> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: >> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is >> in task_state powering-off >> >> So it poweroff all instance on the failed node but does not start them on >> active nodes >> >> What is changed ? >> Ignazio >> >> >> -- ​ P *SAVE PAPER – Please do not print this e-mail unless absolutely necessary.* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ignaziocassano at gmail.com Thu Jul 11 09:52:27 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 11 Jul 2019 11:52:27 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: Hi Jay, would you like to evacuate a failed compute node or evacuate a running compute node ? Ignazio Il giorno gio 11 lug 2019 alle ore 11:48 Jay See ha scritto: > Hi Ignazio, > > I am trying to evacuate the compute host on older version (mitaka). > Could please share the process you followed. I am not able to succeed with > openstack live-migration fails with error message (this is known issue in > older versions) and nova live-ligration - nothing happens even after > initiating VM migration. It is almost 4 days. > > ~Jay. > > On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano > wrote: > >> I am sorry. >> For simulating an host crash I used a wrong procedure. >> Using "echo 'c' > /proc/sysrq-trigger" all work fine >> >> Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < >> ignaziocassano at gmail.com> ha scritto: >> >>> Hello All, >>> on ocata when I poweroff a node with active instance , doing a nova >>> host-evacuate works fine >>> and instances are restartd on an active node. >>> On queens it does non evacuate instances but nova-api reports for each >>> instance the following: >>> >>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi >>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 >>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: >>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is >>> in task_state powering-off >>> >>> So it poweroff all instance on the failed node but does not start them >>> on active nodes >>> >>> What is changed ? >>> Ignazio >>> >>> >>> > > -- > ​ > P *SAVE PAPER – Please do not print this e-mail unless absolutely > necessary.* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Thu Jul 11 09:53:26 2019 From: eblock at nde.ag (Eugen Block) Date: Thu, 11 Jul 2019 09:53:26 +0000 Subject: root disk for instance In-Reply-To: Message-ID: <20190711095326.Horde._eRtJ1r8ufNJjxxy9JnnVe-@webmail.nde.ag> Hi, it's always glance that serves the images, it just depends on how you decide to create the instance, ephemeral or persistent disks. You can find more information about storage concepts in [1]. If I'm not completely wrong, since Newton release the default in the Horizon settings is to create an instance from volume, so it would be a persistent disk managed by cinder (the volume persists after the instance has been deleted, this is also configurable). The image is downloaded from glance into a volume on your volume server. If you change the Horizon behavior or if you launch an instance from the cli you'd get an ephemeral disk by nova, depending on your storage backend this would be a local copy of the image on the compute node(s) or something related in your storage backend, e.g. an rbd object in ceph. Does this clear it up a bit? Regards, Eugen [1] https://docs.openstack.org/arch-design/design-storage/design-storage-concepts.html Zitat von Jyoti Dahiwele : > Dear Team, > > Please clear me my following doubts. > When I use image from source option and mini flavor to create an instace, > from which storage pool instance will get root disk ? From cinder or glance? 
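To illustrate the two paths Eugen describes above, here is a rough CLI sketch; the image, flavor, network and volume names are only placeholders, and the exact client options can vary by release:

  # Ephemeral root disk: nova copies/clones the glance image onto the compute backend
  openstack server create --image cirros --flavor m1.small --network private vm-ephemeral

  # Persistent root disk: the glance image is first written into a cinder volume
  openstack volume create --image cirros --size 10 boot-vol
  openstack server create --volume boot-vol --flavor m1.small --network private vm-persistent

In both cases glance is the image source; the difference is whether the root disk ends up as a nova-managed ephemeral disk or as a cinder volume that outlives the instance unless you ask for it to be deleted.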
From jayachander.it at gmail.com Thu Jul 11 09:57:34 2019 From: jayachander.it at gmail.com (Jay See) Date: Thu, 11 Jul 2019 11:57:34 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: Hi , I have tried on a failed compute node which is in power off state now. I have tried on a running compute node, no errors. But nothing happens. On running compute node - Disabled the compute service and tried migration also. May be I might have not followed proper steps. Just wanted to know the steps you have followed. Otherwise, I was planning to manual migration also if possible. ~Jay. On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano wrote: > Hi Jay, > would you like to evacuate a failed compute node or evacuate a running > compute node ? > > Ignazio > > Il giorno gio 11 lug 2019 alle ore 11:48 Jay See > ha scritto: > >> Hi Ignazio, >> >> I am trying to evacuate the compute host on older version (mitaka). >> Could please share the process you followed. I am not able to succeed >> with openstack live-migration fails with error message (this is known issue >> in older versions) and nova live-ligration - nothing happens even after >> initiating VM migration. It is almost 4 days. >> >> ~Jay. >> >> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano < >> ignaziocassano at gmail.com> wrote: >> >>> I am sorry. >>> For simulating an host crash I used a wrong procedure. >>> Using "echo 'c' > /proc/sysrq-trigger" all work fine >>> >>> Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < >>> ignaziocassano at gmail.com> ha scritto: >>> >>>> Hello All, >>>> on ocata when I poweroff a node with active instance , doing a nova >>>> host-evacuate works fine >>>> and instances are restartd on an active node. >>>> On queens it does non evacuate instances but nova-api reports for each >>>> instance the following: >>>> >>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi >>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 >>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: >>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is >>>> in task_state powering-off >>>> >>>> So it poweroff all instance on the failed node but does not start them >>>> on active nodes >>>> >>>> What is changed ? >>>> Ignazio >>>> >>>> >>>> >> >> -- >> ​ >> P *SAVE PAPER – Please do not print this e-mail unless absolutely >> necessary.* >> > -- ​ P *SAVE PAPER – Please do not print this e-mail unless absolutely necessary.* -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Thu Jul 11 09:59:24 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 11 Jul 2019 10:59:24 +0100 (BST) Subject: g-placement / g-api didn't start in devstack In-Reply-To: References: Message-ID: On Thu, 11 Jul 2019, Manula Thantriwatte wrote: > I'm installing DevStack in Ubuntu 18.04. I'm using stable/stein branch for > the installation. When I running the stach.sh I'm getting following error. > I have updated the timeout to 300 seconds as well. But still I'm not able > to run the DevStack. Could you please help me on this issue? Is there > specific python version do I need to install. In my PC I have python > version 3.6.8. You might be seeing the same issue as described in http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007616.html so you might try the suggestions in there. (It might also be something else). 
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From skaplons at redhat.com Thu Jul 11 10:30:49 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 11 Jul 2019 12:30:49 +0200 Subject: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' In-Reply-To: References: <28A1866C-F2C6-4BAC-BE73-467AD22B4805@redhat.com> Message-ID: Hi, AFAICT there is no many still active developers of neutron-fwaas project and I don’t know about such plans currently. > On 11 Jul 2019, at 11:23, Teckelmann, Ralf, NMU-OIP wrote: > > Hello Slawek, > > Thank your for your fast response. > This means in regard of a Stein-Deployment with Linuxbridges no Perimeter-Firewall is offered anymore. > Are there plans to remedy this deficiency in the next releases? > > Cheers, > > Ralf T. > Von: Slawek Kaplonski > Gesendet: Donnerstag, 11. Juli 2019 10:04:02 > An: Teckelmann, Ralf, NMU-OIP > Cc: openstack-discuss at lists.openstack.org > Betreff: Re: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' > > Hi, > > FWaaS v1 was deprecated since some time and was removed completely in Stein release. > > > On 11 Jul 2019, at 09:28, Teckelmann, Ralf, NMU-OIP wrote: > > > > Good Morning everyone, > > > > We like to have FWaaS enabled for a Stein-based OpenStack installation. > > Using linuxbridges we are not able to use FWaaS_v2, because it only seems to work with ovs. > > > > We thus tried FWaaS (v1) following https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_openstack-2Dansible-2Dos-5Fneutron_latest_configure-2Dnetwork-2Dservices.html-23firewall-2Dservice-2Doptional&d=DwIFaQ&c=vo2ie5TPcLdcgWuLVH4y8lsbGPqIayH3XbK3gK82Oco&r=WXex93lsaiQ-z7CeZkHv93lzt4fdCRIPXloSPQEU7CM&m=mRJxK4Dne35uMLvIxZWOXNeMxXzMcUTsQQd1yrgQ7kM&s=9KmdvZINwdij6mV-kMqE6S94CMiK4z8yO1b7cfXNhv8&e= . > > However, all we get from it is (1). > > > > Are we missing a point or is FWaaS_V1 just not supported in Stein anymore? > > If so, this would mean for a setup Stein+Linuxbridges no FWaaS is actually available, right? > > > > (1) > > grep firewall /var/log/neutron/neutron-server.log > > 2019-07-05 10:10:55.693 29793 ERROR neutron_lib.utils.runtime NoMatches: No'neutron.service_plugins' driver found, looking for 'firewall' > > 2019-07-05 10:10:55.694 29793 ERROR neutron.manager [req-394624b6-e638-45ec-be7c-ce86793fdbc4 - - - - -] Plugin 'firewall' not found. > > 2019-07-05 10:11:00.046 29979 INFO neutron.manager [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Loading Plugin: firewall > > 2019-07-05 10:11:00.046 29979 ERROR neutron_lib.utils.runtime [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Error loading class by alias: NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' > > > > Best regards, > > > > Ralf T. > > — > Slawek Kaplonski > Senior software engineer > Red Hat > — Slawek Kaplonski Senior software engineer Red Hat From gmann at ghanshyammann.com Thu Jul 11 10:41:49 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Thu, 11 Jul 2019 19:41:49 +0900 Subject: [nova] API updates week 19-28 Message-ID: <16be09ffa52.ccb8c25789487.3681219850412168320@ghanshyammann.com> Hello Everyone, Please find the Nova API updates of this week. API Related BP : ============ COMPLETED: 1. Support adding description while locking an instance: - https://blueprints.launchpad.net/nova/+spec/add-locked-reason Code Ready for Review: ------------------------------ 1. 
Add host and hypervisor_hostname flag to create server - Topic: https://review.opendev.org/#/q/topic:bp/add-host-and-hypervisor-hostname-flag-to-create-server+(status:open+OR+status:merged) - Weekly Progress: matt is +2 on this. Need another core reviewer. 2. Specifying az when restore shelved server - Topic: https://review.opendev.org/#/q/topic:bp/support-specifying-az-when-restore-shelved-server+(status:open+OR+status:merged) - Weekly Progress: Brin updated the patch after review comments. Ready for another round of review. 3. Nova API cleanup - Topic: https://review.opendev.org/#/c/666889/ - Weekly Progress: Code is up for review. Specs are merged and code in-progress: ------------------------------ ------------------ 4. Nova API policy improvement - Topic: https://review.openstack.org/#/q/topic:bp/policy-default-refresh+(status:open+OR+status:merged) - Weekly Progress: Spec is merged. Started the testing coverage of the existing policies with little refactoring on John PoC. 5. Detach and attach boot volumes: - Topic: https://review.openstack.org/#/q/topic:bp/detach-boot-volume+(status:open+OR+status:merged) - Weekly Progress: No Progress. Patches are in merge conflict. Spec Ready for Review: ----------------------------- 1. Support for changing deleted_on_termination after boot -Spec: https://review.openstack.org/#/c/580336/ - Weekly Progress: No update this week. Pending on Lee Yarwood proposal after PTG discussion. 3. Support delete_on_termination in volume attach api -Spec: https://review.openstack.org/#/c/612949/ - Weekly Progress: No updates this week. matt recommend to merging this with 580336 which is pending on Lee Yarwood proposal. Previously approved Spec needs to be re-proposed for Train: --------------------------------------------------------------------------- 1. Servers Ips non-unique network names : - https://blueprints.launchpad.net/nova/+spec/servers-ips-non-unique-network-names - https://review.openstack.org/#/q/topic:bp/servers-ips-non-unique-network-names+(status:open+OR+status:merged) - I remember I planned this to re-propose but could not get time. If anyone would like to help on this please repropose. otherwise I will start this in U cycle. 2. Volume multiattach enhancements: - https://blueprints.launchpad.net/nova/+spec/volume-multiattach-enhancements - https://review.openstack.org/#/q/topic:bp/volume-multiattach-enhancements+(status:open+OR+status:merged) - This also need volutneer - http://lists.openstack.org/pipermail/openstack-discuss/2019-June/007411.html Others: 1. Add API ref guideline for body text - Need another +2 on this - https://review.opendev.org/#/c/668234/ - after that, 2 api-ref are left to fix. Bugs: ==== No progress report in this week. NOTE- There might be some bug which is not tagged as 'api' or 'api-ref', those are not in the above list. Tag such bugs so that we can keep our eyes. -gmann From aheczko at mirantis.com Thu Jul 11 10:55:25 2019 From: aheczko at mirantis.com (Adam Heczko) Date: Thu, 11 Jul 2019 12:55:25 +0200 Subject: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' In-Reply-To: References: <28A1866C-F2C6-4BAC-BE73-467AD22B4805@redhat.com> Message-ID: Hi Ralf, WDYM saying 'no Perimeter-Firewall is offered anymore'? OpenStack with OVS ML2 provides a security groups, which is considered a 'perimeter firewall'. 
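For illustration, a minimal security-group sketch with made-up names (the rules attach to the Neutron ports of the instances, so this works with both the Linuxbridge and OVS agents):

  openstack security group create web-sg --description "allow ssh and http"
  openstack security group rule create --protocol tcp --dst-port 22 --remote-ip 203.0.113.0/24 web-sg
  openstack security group rule create --protocol tcp --dst-port 80 --remote-ip 0.0.0.0/0 web-sg
  openstack server add security group my-vm web-sg

Note that this filters traffic per instance port rather than at the router, which is the gap FWaaS was meant to cover.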
On Thu, Jul 11, 2019 at 12:35 PM Slawek Kaplonski wrote: > Hi, > > AFAICT there is no many still active developers of neutron-fwaas project > and I don’t know about such plans currently. > > > On 11 Jul 2019, at 11:23, Teckelmann, Ralf, NMU-OIP < > ralf.teckelmann at bertelsmann.de> wrote: > > > > Hello Slawek, > > > > Thank your for your fast response. > > This means in regard of a Stein-Deployment with Linuxbridges no > Perimeter-Firewall is offered anymore. > > Are there plans to remedy this deficiency in the next releases? > > > > Cheers, > > > > Ralf T. > > Von: Slawek Kaplonski > > Gesendet: Donnerstag, 11. Juli 2019 10:04:02 > > An: Teckelmann, Ralf, NMU-OIP > > Cc: openstack-discuss at lists.openstack.org > > Betreff: Re: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' > driver found, looking for 'firewall' > > > > Hi, > > > > FWaaS v1 was deprecated since some time and was removed completely in > Stein release. > > > > > On 11 Jul 2019, at 09:28, Teckelmann, Ralf, NMU-OIP < > ralf.teckelmann at bertelsmann.de> wrote: > > > > > > Good Morning everyone, > > > > > > We like to have FWaaS enabled for a Stein-based OpenStack installation. > > > Using linuxbridges we are not able to use FWaaS_v2, because it only > seems to work with ovs. > > > > > > We thus tried FWaaS (v1) following > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_openstack-2Dansible-2Dos-5Fneutron_latest_configure-2Dnetwork-2Dservices.html-23firewall-2Dservice-2Doptional&d=DwIFaQ&c=vo2ie5TPcLdcgWuLVH4y8lsbGPqIayH3XbK3gK82Oco&r=WXex93lsaiQ-z7CeZkHv93lzt4fdCRIPXloSPQEU7CM&m=mRJxK4Dne35uMLvIxZWOXNeMxXzMcUTsQQd1yrgQ7kM&s=9KmdvZINwdij6mV-kMqE6S94CMiK4z8yO1b7cfXNhv8&e= > . > > > However, all we get from it is (1). > > > > > > Are we missing a point or is FWaaS_V1 just not supported in Stein > anymore? > > > If so, this would mean for a setup Stein+Linuxbridges no FWaaS is > actually available, right? > > > > > > (1) > > > grep firewall /var/log/neutron/neutron-server.log > > > 2019-07-05 10:10:55.693 29793 ERROR neutron_lib.utils.runtime > NoMatches: No'neutron.service_plugins' driver found, looking for 'firewall' > > > 2019-07-05 10:10:55.694 29793 ERROR neutron.manager > [req-394624b6-e638-45ec-be7c-ce86793fdbc4 - - - - -] Plugin 'firewall' not > found. > > > 2019-07-05 10:11:00.046 29979 INFO neutron.manager > [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Loading Plugin: > firewall > > > 2019-07-05 10:11:00.046 29979 ERROR neutron_lib.utils.runtime > [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Error loading class by > alias: NoMatches: No 'neutron.service_plugins' driver found, looking for > 'firewall' > > > > > > Best regards, > > > > > > Ralf T. > > > > — > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > > -- Adam Heczko Principal Security Architect @ Mirantis Inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ssbarnea at redhat.com Thu Jul 11 11:01:25 2019 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Thu, 11 Jul 2019 12:01:25 +0100 Subject: [doc8] future development and maintenance Message-ID: <9D091EF4-60F2-490C-BF76-AAD7A68FB8A1@redhat.com> It seems that the doc8 project is lacking some love, regardless the fact that is used by >90 projects from opendev.org. https://review.opendev.org/#/q/project:x/doc8 Last merge and release was more than 2 years ago and no reviews were performed either. 
I think it would be in our interest to assure that doc8 maintenance continues and that we can keep it usable. I would like to propose extenting the list of cores from the current 4 ones that I already listed in CC with 3 more, so we can effectively make a change that gets merged and later released (anyone willing to help?) If current cores agree, I would be happy to help with maintenance. I estimate that the effort needed would likely be less than 1h/month in longer term. If there is a desire to move it to github/travis, I would not mind either. Thanks Sorin Sbarnea Red Hat TripleO CI From ralf.teckelmann at bertelsmann.de Thu Jul 11 11:13:14 2019 From: ralf.teckelmann at bertelsmann.de (Teckelmann, Ralf, NMU-OIP) Date: Thu, 11 Jul 2019 11:13:14 +0000 Subject: AW: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' In-Reply-To: References: <28A1866C-F2C6-4BAC-BE73-467AD22B4805@redhat.com> Message-ID: Hello Adam, You may missed the part „in regard of a Stein-Deployment with Linuxbridges” of my question. So OVS is not relevant, as I understand the mutual exclusion of linux bridges and ovs. Cheers, Ralf T. Von: Adam Heczko Gesendet: Donnerstag, 11. Juli 2019 12:55 An: Slawek Kaplonski Cc: Teckelmann, Ralf, NMU-OIP ; openstack-discuss at lists.openstack.org Betreff: Re: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' Hi Ralf, WDYM saying 'no Perimeter-Firewall is offered anymore'? OpenStack with OVS ML2 provides a security groups, which is considered a 'perimeter firewall'. On Thu, Jul 11, 2019 at 12:35 PM Slawek Kaplonski > wrote: Hi, AFAICT there is no many still active developers of neutron-fwaas project and I don’t know about such plans currently. > On 11 Jul 2019, at 11:23, Teckelmann, Ralf, NMU-OIP > wrote: > > Hello Slawek, > > Thank your for your fast response. > This means in regard of a Stein-Deployment with Linuxbridges no Perimeter-Firewall is offered anymore. > Are there plans to remedy this deficiency in the next releases? > > Cheers, > > Ralf T. > Von: Slawek Kaplonski > > Gesendet: Donnerstag, 11. Juli 2019 10:04:02 > An: Teckelmann, Ralf, NMU-OIP > Cc: openstack-discuss at lists.openstack.org > Betreff: Re: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' > > Hi, > > FWaaS v1 was deprecated since some time and was removed completely in Stein release. > > > On 11 Jul 2019, at 09:28, Teckelmann, Ralf, NMU-OIP > wrote: > > > > Good Morning everyone, > > > > We like to have FWaaS enabled for a Stein-based OpenStack installation. > > Using linuxbridges we are not able to use FWaaS_v2, because it only seems to work with ovs. > > > > We thus tried FWaaS (v1) following https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_openstack-2Dansible-2Dos-5Fneutron_latest_configure-2Dnetwork-2Dservices.html-23firewall-2Dservice-2Doptional&d=DwIFaQ&c=vo2ie5TPcLdcgWuLVH4y8lsbGPqIayH3XbK3gK82Oco&r=WXex93lsaiQ-z7CeZkHv93lzt4fdCRIPXloSPQEU7CM&m=mRJxK4Dne35uMLvIxZWOXNeMxXzMcUTsQQd1yrgQ7kM&s=9KmdvZINwdij6mV-kMqE6S94CMiK4z8yO1b7cfXNhv8&e= . > > However, all we get from it is (1). > > > > Are we missing a point or is FWaaS_V1 just not supported in Stein anymore? > > If so, this would mean for a setup Stein+Linuxbridges no FWaaS is actually available, right? 
> > > > (1) > > grep firewall /var/log/neutron/neutron-server.log > > 2019-07-05 10:10:55.693 29793 ERROR neutron_lib.utils.runtime NoMatches: No'neutron.service_plugins' driver found, looking for 'firewall' > > 2019-07-05 10:10:55.694 29793 ERROR neutron.manager [req-394624b6-e638-45ec-be7c-ce86793fdbc4 - - - - -] Plugin 'firewall' not found. > > 2019-07-05 10:11:00.046 29979 INFO neutron.manager [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Loading Plugin: firewall > > 2019-07-05 10:11:00.046 29979 ERROR neutron_lib.utils.runtime [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Error loading class by alias: NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' > > > > Best regards, > > > > Ralf T. > > — > Slawek Kaplonski > Senior software engineer > Red Hat > — Slawek Kaplonski Senior software engineer Red Hat -- Adam Heczko Principal Security Architect @ Mirantis Inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaetan.trellu at incloudus.com Thu Jul 11 11:14:47 2019 From: gaetan.trellu at incloudus.com (=?ISO-8859-1?Q?Ga=EBtan_Trellu?=) Date: Thu, 11 Jul 2019 07:14:47 -0400 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: Message-ID: <35400f83-c29d-475a-8d36-d56b3cf16d30@email.android.com> An HTML attachment was scrubbed... URL: From skaplons at redhat.com Thu Jul 11 11:22:12 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Thu, 11 Jul 2019 13:22:12 +0200 Subject: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' In-Reply-To: References: <28A1866C-F2C6-4BAC-BE73-467AD22B4805@redhat.com> Message-ID: Hi, Security groups are supported by both Linuxbridge and OVS agents. But this is different solution than FWaaS. Security groups are applied on port’s level, not on router. > On 11 Jul 2019, at 13:13, Teckelmann, Ralf, NMU-OIP wrote: > > Hello Adam, > > You may missed the part „in regard of a Stein-Deployment with Linuxbridges” of my question. > So OVS is not relevant, as I understand the mutual exclusion of linux bridges and ovs. > > Cheers, > > Ralf T. > > Von: Adam Heczko > Gesendet: Donnerstag, 11. Juli 2019 12:55 > An: Slawek Kaplonski > Cc: Teckelmann, Ralf, NMU-OIP ; openstack-discuss at lists.openstack.org > Betreff: Re: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' > > Hi Ralf, WDYM saying 'no Perimeter-Firewall is offered anymore'? > OpenStack with OVS ML2 provides a security groups, which is considered a 'perimeter firewall'. > > On Thu, Jul 11, 2019 at 12:35 PM Slawek Kaplonski wrote: > Hi, > > AFAICT there is no many still active developers of neutron-fwaas project and I don’t know about such plans currently. > > > On 11 Jul 2019, at 11:23, Teckelmann, Ralf, NMU-OIP wrote: > > > > Hello Slawek, > > > > Thank your for your fast response. > > This means in regard of a Stein-Deployment with Linuxbridges no Perimeter-Firewall is offered anymore. > > Are there plans to remedy this deficiency in the next releases? > > > > Cheers, > > > > Ralf T. > > Von: Slawek Kaplonski > > Gesendet: Donnerstag, 11. Juli 2019 10:04:02 > > An: Teckelmann, Ralf, NMU-OIP > > Cc: openstack-discuss at lists.openstack.org > > Betreff: Re: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' > > > > Hi, > > > > FWaaS v1 was deprecated since some time and was removed completely in Stein release. 
> > > > > On 11 Jul 2019, at 09:28, Teckelmann, Ralf, NMU-OIP wrote: > > > > > > Good Morning everyone, > > > > > > We like to have FWaaS enabled for a Stein-based OpenStack installation. > > > Using linuxbridges we are not able to use FWaaS_v2, because it only seems to work with ovs. > > > > > > We thus tried FWaaS (v1) following https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_openstack-2Dansible-2Dos-5Fneutron_latest_configure-2Dnetwork-2Dservices.html-23firewall-2Dservice-2Doptional&d=DwIFaQ&c=vo2ie5TPcLdcgWuLVH4y8lsbGPqIayH3XbK3gK82Oco&r=WXex93lsaiQ-z7CeZkHv93lzt4fdCRIPXloSPQEU7CM&m=mRJxK4Dne35uMLvIxZWOXNeMxXzMcUTsQQd1yrgQ7kM&s=9KmdvZINwdij6mV-kMqE6S94CMiK4z8yO1b7cfXNhv8&e= . > > > However, all we get from it is (1). > > > > > > Are we missing a point or is FWaaS_V1 just not supported in Stein anymore? > > > If so, this would mean for a setup Stein+Linuxbridges no FWaaS is actually available, right? > > > > > > (1) > > > grep firewall /var/log/neutron/neutron-server.log > > > 2019-07-05 10:10:55.693 29793 ERROR neutron_lib.utils.runtime NoMatches: No'neutron.service_plugins' driver found, looking for 'firewall' > > > 2019-07-05 10:10:55.694 29793 ERROR neutron.manager [req-394624b6-e638-45ec-be7c-ce86793fdbc4 - - - - -] Plugin 'firewall' not found. > > > 2019-07-05 10:11:00.046 29979 INFO neutron.manager [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Loading Plugin: firewall > > > 2019-07-05 10:11:00.046 29979 ERROR neutron_lib.utils.runtime [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Error loading class by alias: NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' > > > > > > Best regards, > > > > > > Ralf T. > > > > — > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > > > > -- > Adam Heczko > Principal Security Architect @ Mirantis Inc. — Slawek Kaplonski Senior software engineer Red Hat From tomas.bredar at gmail.com Thu Jul 11 11:26:25 2019 From: tomas.bredar at gmail.com (=?UTF-8?B?VG9tw6HFoSBCcmVkw6Fy?=) Date: Thu, 11 Jul 2019 13:26:25 +0200 Subject: [tripleo][cinder][netapp] Message-ID: Hi community, I'm trying to define multiple NetApp storage backends via Tripleo installer. According to [1] the puppet manifest supports multiple backends. The current templates [2] [3] support only single backend. Does anyone know how to define multiple netapp backends in the tripleo-heat environment files / templates? You help is appreciated. Tomas [1] https://opendev.org/openstack/puppet-cinder/src/branch/stable/queens/manifests/backend/netapp.pp [2] https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/queens/environments/cinder-netapp-config.yaml [3] https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/queens/puppet/services/cinder-backend-netapp.yaml -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu Jul 11 11:33:13 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 11 Jul 2019 13:33:13 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: Ok Jay, let me to describe my environment. I have an openstack made up of 3 controllers nodes ad several compute nodes. The controller nodes services are controlled by pacemaker and the compute nodes services are controlled by remote pacemaker. My hardware is Dell so I am using ipmi fencing device . 
I wrote a service controlled by pacemaker: this service controls if a compude node fails and for avoiding split brains if a compute node does nod respond on the management network and on storage network the stonith poweroff the node and then execute a nova host-evacuate. Anycase to have a simulation before writing the service I described above you can do as follows: connect on one compute node where some virtual machines are running run the command: echo 'c' > /proc/sysrq-trigger (it stops immediately the node like in case of failure) On a controller node run: nova host-evacuate "name of failed compute node" Instances running on the failed compute node should be restarted on another compute node Ignazio Il giorno gio 11 lug 2019 alle ore 11:57 Jay See ha scritto: > Hi , > > I have tried on a failed compute node which is in power off state now. > I have tried on a running compute node, no errors. But nothing happens. > On running compute node - Disabled the compute service and tried migration > also. > > May be I might have not followed proper steps. Just wanted to know the > steps you have followed. Otherwise, I was planning to manual migration also > if possible. > ~Jay. > > On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano > wrote: > >> Hi Jay, >> would you like to evacuate a failed compute node or evacuate a running >> compute node ? >> >> Ignazio >> >> Il giorno gio 11 lug 2019 alle ore 11:48 Jay See < >> jayachander.it at gmail.com> ha scritto: >> >>> Hi Ignazio, >>> >>> I am trying to evacuate the compute host on older version (mitaka). >>> Could please share the process you followed. I am not able to succeed >>> with openstack live-migration fails with error message (this is known issue >>> in older versions) and nova live-ligration - nothing happens even after >>> initiating VM migration. It is almost 4 days. >>> >>> ~Jay. >>> >>> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> I am sorry. >>>> For simulating an host crash I used a wrong procedure. >>>> Using "echo 'c' > /proc/sysrq-trigger" all work fine >>>> >>>> Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < >>>> ignaziocassano at gmail.com> ha scritto: >>>> >>>>> Hello All, >>>>> on ocata when I poweroff a node with active instance , doing a nova >>>>> host-evacuate works fine >>>>> and instances are restartd on an active node. >>>>> On queens it does non evacuate instances but nova-api reports for each >>>>> instance the following: >>>>> >>>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi >>>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 >>>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: >>>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is >>>>> in task_state powering-off >>>>> >>>>> So it poweroff all instance on the failed node but does not start them >>>>> on active nodes >>>>> >>>>> What is changed ? >>>>> Ignazio >>>>> >>>>> >>>>> >>> >>> -- >>> ​ >>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>> necessary.* >>> >> > > -- > ​ > P *SAVE PAPER – Please do not print this e-mail unless absolutely > necessary.* > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aheczko at mirantis.com Thu Jul 11 11:44:49 2019 From: aheczko at mirantis.com (Adam Heczko) Date: Thu, 11 Jul 2019 13:44:49 +0200 Subject: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' driver found, looking for 'firewall' In-Reply-To: References: <28A1866C-F2C6-4BAC-BE73-467AD22B4805@redhat.com> Message-ID: Exacly Slawek. Ralph I was referring to the sentence 'Perimeter-Firewall' OpenStack provides a Perimeter-Firewall and that is a Security Groups. https://docs.openstack.org/nova/queens/admin/security-groups.html SG (Security Groups) is something different than FWaaS. Though FWaaS to some degree could also provide a SG functionality, as it can bind to AFAIK and Neutron port. On Thu, Jul 11, 2019 at 1:22 PM Slawek Kaplonski wrote: > Hi, > > Security groups are supported by both Linuxbridge and OVS agents. But this > is different solution than FWaaS. Security groups are applied on port’s > level, not on router. > > > On 11 Jul 2019, at 13:13, Teckelmann, Ralf, NMU-OIP < > ralf.teckelmann at bertelsmann.de> wrote: > > > > Hello Adam, > > > > You may missed the part „in regard of a Stein-Deployment with > Linuxbridges” of my question. > > So OVS is not relevant, as I understand the mutual exclusion of linux > bridges and ovs. > > > > Cheers, > > > > Ralf T. > > > > Von: Adam Heczko > > Gesendet: Donnerstag, 11. Juli 2019 12:55 > > An: Slawek Kaplonski > > Cc: Teckelmann, Ralf, NMU-OIP ; > openstack-discuss at lists.openstack.org > > Betreff: Re: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' > driver found, looking for 'firewall' > > > > Hi Ralf, WDYM saying 'no Perimeter-Firewall is offered anymore'? > > OpenStack with OVS ML2 provides a security groups, which is considered a > 'perimeter firewall'. > > > > On Thu, Jul 11, 2019 at 12:35 PM Slawek Kaplonski > wrote: > > Hi, > > > > AFAICT there is no many still active developers of neutron-fwaas project > and I don’t know about such plans currently. > > > > > On 11 Jul 2019, at 11:23, Teckelmann, Ralf, NMU-OIP < > ralf.teckelmann at bertelsmann.de> wrote: > > > > > > Hello Slawek, > > > > > > Thank your for your fast response. > > > This means in regard of a Stein-Deployment with Linuxbridges no > Perimeter-Firewall is offered anymore. > > > Are there plans to remedy this deficiency in the next releases? > > > > > > Cheers, > > > > > > Ralf T. > > > Von: Slawek Kaplonski > > > Gesendet: Donnerstag, 11. Juli 2019 10:04:02 > > > An: Teckelmann, Ralf, NMU-OIP > > > Cc: openstack-discuss at lists.openstack.org > > > Betreff: Re: FWaaS in Stein - NoMatches: No 'neutron.service_plugins' > driver found, looking for 'firewall' > > > > > > Hi, > > > > > > FWaaS v1 was deprecated since some time and was removed completely in > Stein release. > > > > > > > On 11 Jul 2019, at 09:28, Teckelmann, Ralf, NMU-OIP < > ralf.teckelmann at bertelsmann.de> wrote: > > > > > > > > Good Morning everyone, > > > > > > > > We like to have FWaaS enabled for a Stein-based OpenStack > installation. > > > > Using linuxbridges we are not able to use FWaaS_v2, because it only > seems to work with ovs. > > > > > > > > We thus tried FWaaS (v1) following > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.openstack.org_openstack-2Dansible-2Dos-5Fneutron_latest_configure-2Dnetwork-2Dservices.html-23firewall-2Dservice-2Doptional&d=DwIFaQ&c=vo2ie5TPcLdcgWuLVH4y8lsbGPqIayH3XbK3gK82Oco&r=WXex93lsaiQ-z7CeZkHv93lzt4fdCRIPXloSPQEU7CM&m=mRJxK4Dne35uMLvIxZWOXNeMxXzMcUTsQQd1yrgQ7kM&s=9KmdvZINwdij6mV-kMqE6S94CMiK4z8yO1b7cfXNhv8&e= > . 
> > > > However, all we get from it is (1). > > > > > > > > Are we missing a point or is FWaaS_V1 just not supported in Stein > anymore? > > > > If so, this would mean for a setup Stein+Linuxbridges no FWaaS is > actually available, right? > > > > > > > > (1) > > > > grep firewall /var/log/neutron/neutron-server.log > > > > 2019-07-05 10:10:55.693 29793 ERROR neutron_lib.utils.runtime > NoMatches: No'neutron.service_plugins' driver found, looking for 'firewall' > > > > 2019-07-05 10:10:55.694 29793 ERROR neutron.manager > [req-394624b6-e638-45ec-be7c-ce86793fdbc4 - - - - -] Plugin 'firewall' not > found. > > > > 2019-07-05 10:11:00.046 29979 INFO neutron.manager > [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Loading Plugin: > firewall > > > > 2019-07-05 10:11:00.046 29979 ERROR neutron_lib.utils.runtime > [req-e86af4f4-afae-46d7-ac5e-51585a12083b - - - - -] Error loading class by > alias: NoMatches: No 'neutron.service_plugins' driver found, looking for > 'firewall' > > > > > > > > Best regards, > > > > > > > > Ralf T. > > > > > > — > > > Slawek Kaplonski > > > Senior software engineer > > > Red Hat > > > > > > > — > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > > > > > > > > > -- > > Adam Heczko > > Principal Security Architect @ Mirantis Inc. > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > -- Adam Heczko Principal Security Architect @ Mirantis Inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Thu Jul 11 12:03:28 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 11 Jul 2019 13:03:28 +0100 Subject: trying to understand steal time with cpu pinning In-Reply-To: <9D8A2486E35F0941A60430473E29F15B017EB2889D@MXDB1.ad.garvan.unsw.edu.au> References: <9D8A2486E35F0941A60430473E29F15B017EB2889D@MXDB1.ad.garvan.unsw.edu.au> Message-ID: <9d112059e4772bccb037907fa984044be069640f.camel@redhat.com> On Thu, 2019-07-11 at 06:15 +0000, Manuel Sopena Ballesteros wrote: > Dear Openstack community, > > Please correct me if I am wrong. > > As far as I understand `steal time > 0` means that the hypervisor has replaced a vcpu with a different one on the > physical cpu. > Also, cpu pinning allocates a vcpu to a physical cpu permanently. > > I have a vm setup with cpu pinning and numa affinity and realized, that cpu steal time is between 1% and 0%. > > Why is that? there are 2 ways that this can happen. 1.) you are not setting hw:emulator_thread_policy to move the qemu emulator threads to a different core. 2.) you have host system process or kernel threads that are stealing guest cpu time like vhost threads. you can prevent 2 using systemd or take teh blunt hammer approch and use the kernel isolcpus parmater but that generally should only be used for realtime systems. openstack does not prevent other host process form running on the vcpu_pin_set, that is left to the operator/installer/os to do. > > Thank you very much > NOTICE > Please consider the environment before printing this email. This message and any attachments are intended for the > addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended > recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this > message in error please notify us at once by return email and then delete both messages. We accept no liability for > the distribution of viruses or similar in electronic communications. This notice should not be removed. 
Technically disclaimer like ^ are considered poor mailing list etiquette and are not legally enforceable. if you can disable it for the openstack mailing list that would be for the best. the discloser,copy and distribution parts in partcalar can never be honoured by a publicly archived mailing list so this just adds noise to the list. From emilien at redhat.com Thu Jul 11 12:35:20 2019 From: emilien at redhat.com (Emilien Macchi) Date: Thu, 11 Jul 2019 08:35:20 -0400 Subject: [tripleo][cinder][netapp] In-Reply-To: References: Message-ID: On Thu, Jul 11, 2019 at 7:32 AM Tomáš Bredár wrote: > Hi community, > > I'm trying to define multiple NetApp storage backends via Tripleo > installer. > According to [1] the puppet manifest supports multiple backends. > The current templates [2] [3] support only single backend. > Does anyone know how to define multiple netapp backends in the > tripleo-heat environment files / templates? > We don't support that via the templates that you linked, however if you follow this manual you should be able to configure multiple NetApp backends: https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/cinder_custom_backend.html Let us know how it worked! -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jyotishri403 at gmail.com Thu Jul 11 14:21:04 2019 From: jyotishri403 at gmail.com (Jyoti Dahiwele) Date: Thu, 11 Jul 2019 19:51:04 +0530 Subject: root disk for instance In-Reply-To: <20190711095326.Horde._eRtJ1r8ufNJjxxy9JnnVe-@webmail.nde.ag> References: <20190711095326.Horde._eRtJ1r8ufNJjxxy9JnnVe-@webmail.nde.ag> Message-ID: Thanks for your reply. I'm referring this link https://docs.openstack.org/mitaka/admin-guide/compute-images-instances.html to understand about root disk. In this it is saying root disk will come from compute node. What is the location of root disk on compute node? If I want to keep all my vms on shared storage . How to configure it ? Or If I want to keep all my vms on cinder volume. What will be the configuration for it on nova and cinder? On Thu, 11 Jul 2019, 15:24 Eugen Block, wrote: > Hi, > > it's always glance that serves the images, it just depends on how you > decide to create the instance, ephemeral or persistent disks. You can > find more information about storage concepts in [1]. > > If I'm not completely wrong, since Newton release the default in the > Horizon settings is to create an instance from volume, so it would be > a persistent disk managed by cinder (the volume persists after the > instance has been deleted, this is also configurable). The image is > downloaded from glance into a volume on your volume server. > > If you change the Horizon behavior or if you launch an instance from > the cli you'd get an ephemeral disk by nova, depending on your storage > backend this would be a local copy of the image on the compute node(s) > or something related in your storage backend, e.g. an rbd object in > ceph. > > Does this clear it up a bit? > > Regards, > Eugen > > [1] > > https://docs.openstack.org/arch-design/design-storage/design-storage-concepts.html > > > Zitat von Jyoti Dahiwele : > > > Dear Team, > > > > Please clear me my following doubts. > > When I use image from source option and mini flavor to create an > instace, > > from which storage pool instance will get root disk ? From cinder or > glance? > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
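(To make the ephemeral-versus-volume distinction in the thread above concrete, a minimal CLI sketch; the image, flavor and network names are placeholders:)

    # root disk as an ephemeral disk handled by nova; it lives in nova's
    # instance store (local disk under the instances path, or e.g. ceph)
    openstack server create --image cirros --flavor m1.tiny --network private vm-ephemeral

    # root disk as a persistent cinder volume; it survives deletion of the
    # instance unless the volume is deleted as well
    openstack volume create --image cirros --size 10 root-vol
    openstack server create --volume root-vol --flavor m1.tiny --network private vm-from-volume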
URL: From tyler.bishop at beyondhosting.net Thu Jul 11 14:23:42 2019 From: tyler.bishop at beyondhosting.net (Tyler Bishop) Date: Thu, 11 Jul 2019 10:23:42 -0400 Subject: root disk for instance In-Reply-To: References: <20190711095326.Horde._eRtJ1r8ufNJjxxy9JnnVe-@webmail.nde.ag> Message-ID: /var/lib/qemu/instances/instance-hash/blah blah > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Thu Jul 11 14:48:27 2019 From: eblock at nde.ag (Eugen Block) Date: Thu, 11 Jul 2019 14:48:27 +0000 Subject: root disk for instance In-Reply-To: References: <20190711095326.Horde._eRtJ1r8ufNJjxxy9JnnVe-@webmail.nde.ag> Message-ID: <20190711144827.Horde.tglCLyI8aGqVcHo23OPMuxK@webmail.nde.ag> > I'm referring this link > https://docs.openstack.org/mitaka/admin-guide/compute-images-instances.html > to understand about root disk. In this it is saying root disk will come > from compute node. What is the location of root disk on compute node? First, use a later release, Mitaka is quite old, Stein is the current release. If you use a default configuration without any specific backend for your instances they will be located on the compute nodes in /var/lib/nova/instances/. The respective base images would reside in /var/lib/nova/instances/_base, so your compute nodes should have sufficient disk space. > If I want to keep all my vms on shared storage . How to configure it ? That's up to you, there are several backends supported, so you'll have to choose one. Many people including me use Ceph as storage backend for Glance, Cinder and Nova. > Or If I want to keep all my vms on cinder volume. What will be the > configuration for it on nova and cinder? I recommend to set up a lab environment where you can learn to set up OpenStack, then play around and test different backends if required. The general configuration requirements are covered in the docs [1]. If you don't want to configure every single service you can follow a deployment guide [2], but that will require skills in ansible, juju or tripleO. I'd recommend the manual way, that way you learn the basics and how the different components interact. [1] https://docs.openstack.org/stein/install/ [2] https://docs.openstack.org/stein/deploy/ Zitat von Jyoti Dahiwele : > Thanks for your reply. > I'm referring this link > https://docs.openstack.org/mitaka/admin-guide/compute-images-instances.html > to understand about root disk. In this it is saying root disk will come > from compute node. What is the location of root disk on compute node? > > If I want to keep all my vms on shared storage . How to configure it ? > > Or If I want to keep all my vms on cinder volume. What will be the > configuration for it on nova and cinder? > > > On Thu, 11 Jul 2019, 15:24 Eugen Block, wrote: > >> Hi, >> >> it's always glance that serves the images, it just depends on how you >> decide to create the instance, ephemeral or persistent disks. You can >> find more information about storage concepts in [1]. >> >> If I'm not completely wrong, since Newton release the default in the >> Horizon settings is to create an instance from volume, so it would be >> a persistent disk managed by cinder (the volume persists after the >> instance has been deleted, this is also configurable). The image is >> downloaded from glance into a volume on your volume server. 
>> >> If you change the Horizon behavior or if you launch an instance from >> the cli you'd get an ephemeral disk by nova, depending on your storage >> backend this would be a local copy of the image on the compute node(s) >> or something related in your storage backend, e.g. an rbd object in >> ceph. >> >> Does this clear it up a bit? >> >> Regards, >> Eugen >> >> [1] >> >> https://docs.openstack.org/arch-design/design-storage/design-storage-concepts.html >> >> >> Zitat von Jyoti Dahiwele : >> >> > Dear Team, >> > >> > Please clear me my following doubts. >> > When I use image from source option and mini flavor to create an >> instace, >> > from which storage pool instance will get root disk ? From cinder or >> glance? >> >> >> >> >> From akekane at redhat.com Thu Jul 11 15:04:18 2019 From: akekane at redhat.com (Abhishek Kekane) Date: Thu, 11 Jul 2019 20:34:18 +0530 Subject: [glance] Train: Milestone 2 review priorities Message-ID: Dear Reviewers/Developers, Train milestone two is just two weeks away from now. I have created one etherpad [1] which lists the priority patches for glance, glance_store, python-glanceclient and glance-specs which will be good to have merged before Train 2 milestone. Feel free to add if you think any patch which I have missed is good to have during this milestone 2. Request all reviewers to review these patches so that we can have smooth milestone 2 release. Highlight of this milestone 2 release is that we are planning to roll out glance-store version 1.0 which will officially mark the multiple-stores feature of glance as a stable feature, [1] https://etherpad.openstack.org/p/Glance-Train-MileStone-2-Release-Plan Thanks & Best Regards, Abhishek Kekane -------------- next part -------------- An HTML attachment was scrubbed... URL: From grant at civo.com Thu Jul 11 15:07:49 2019 From: grant at civo.com (Grant Morley) Date: Thu, 11 Jul 2019 16:07:49 +0100 Subject: Cinder issue with DataCore Message-ID: Hi All, We are trying to test DataCore storage backend for cinder ( running on Queens ). We have everything installed and have the cinder config all setup. However whenever we try and start the "cinder-volume" service, we get the following error: 2019-07-11 12:30:06.977 1909533 ERROR cinder.volume.manager File "/openstack/venvs/cinder-17.1.2/lib/python2.7/site-packages/cinder/volume/manager.py", line 456, in init_host 2019-07-11 12:30:06.977 1909533 ERROR cinder.volume.manager self.driver.do_setup(ctxt) 2019-07-11 12:30:06.977 1909533 ERROR cinder.volume.manager   File "/openstack/venvs/cinder-17.1.2/lib/python2.7/site-packages/cinder/volume/drivers/datacore/iscsi.py", line 83, in do_setup 2019-07-11 12:30:06.977 1909533 ERROR cinder.volume.manager super(ISCSIVolumeDriver, self).do_setup(context) 2019-07-11 12:30:06.977 1909533 ERROR cinder.volume.manager   File "/openstack/venvs/cinder-17.1.2/lib/python2.7/site-packages/cinder/volume/drivers/datacore/driver.py", line 116, in do_setup 2019-07-11 12:30:06.977 1909533 ERROR cinder.volume.manager self.configuration.datacore_api_timeout) 2019-07-11 12:30:06.977 1909533 ERROR cinder.volume.manager   File "/openstack/venvs/cinder-17.1.2/lib/python2.7/site-packages/cinder/volume/drivers/datacore/api.py", line 176, in __init__ 2019-07-11 12:30:06.977 1909533 ERROR cinder.volume.manager raise datacore_exceptions.DataCoreException(msg) 2019-07-11 12:30:06.977 1909533 ERROR cinder.volume.manager DataCoreException: Failed to import websocket-client python module. Please, ensure the module is installed. 
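(One thing worth checking here: the traceback shows cinder-volume running out of the openstack-ansible virtualenv, so the import has to work for that interpreter and not just for the system Python. A quick check, with the venv path taken from the traceback above, would be something like:)

    # look inside the cinder venv rather than the system python
    /openstack/venvs/cinder-17.1.2/bin/pip freeze | grep -i websocket
    /openstack/venvs/cinder-17.1.2/bin/python -c "import websocket; print(websocket.__file__)"

    # if it is missing there, install it into that venv and restart cinder-volume
    /openstack/venvs/cinder-17.1.2/bin/pip install websocket-client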
We have the "websocket-client" installed also: pip freeze | grep websocket websocket-client==0.44.0 The datacore libraries also appear to be available in our venvs dirs: ls /openstack/venvs/cinder-17.1.2/lib/python2.7/site-packages/cinder/volume/drivers/datacore api.py  driver.py  exception.py  fc.py  __init__.py  iscsi.py passwd.py  utils.py We are a bit stumped at the moment and wondered if anyone knew what might be causing the error? We have managed to get Ceph and SolidFire working fine. Regards, -- Grant Morley Cloud Lead, Civo Ltd www.civo.com | Signup for an account! -------------- next part -------------- An HTML attachment was scrubbed... URL: From jyotishri403 at gmail.com Thu Jul 11 15:22:45 2019 From: jyotishri403 at gmail.com (Jyoti Dahiwele) Date: Thu, 11 Jul 2019 20:52:45 +0530 Subject: root disk for instance In-Reply-To: <20190711144827.Horde.tglCLyI8aGqVcHo23OPMuxK@webmail.nde.ag> References: <20190711095326.Horde._eRtJ1r8ufNJjxxy9JnnVe-@webmail.nde.ag> <20190711144827.Horde.tglCLyI8aGqVcHo23OPMuxK@webmail.nde.ag> Message-ID: Thanks, I'll check it out. On Thu, 11 Jul 2019, 20:18 Eugen Block, wrote: > > I'm referring this link > > > https://docs.openstack.org/mitaka/admin-guide/compute-images-instances.html > > to understand about root disk. In this it is saying root disk will come > > from compute node. What is the location of root disk on compute node? > > First, use a later release, Mitaka is quite old, Stein is the current > release. > > If you use a default configuration without any specific backend for > your instances they will be located on the compute nodes in > /var/lib/nova/instances/. The respective base images would reside in > /var/lib/nova/instances/_base, so your compute nodes should have > sufficient disk space. > > > If I want to keep all my vms on shared storage . How to configure it ? > > That's up to you, there are several backends supported, so you'll have > to choose one. Many people including me use Ceph as storage backend > for Glance, Cinder and Nova. > > > Or If I want to keep all my vms on cinder volume. What will be the > > configuration for it on nova and cinder? > > I recommend to set up a lab environment where you can learn to set up > OpenStack, then play around and test different backends if required. > The general configuration requirements are covered in the docs [1]. If > you don't want to configure every single service you can follow a > deployment guide [2], but that will require skills in ansible, juju or > tripleO. I'd recommend the manual way, that way you learn the basics > and how the different components interact. > > [1] https://docs.openstack.org/stein/install/ > [2] https://docs.openstack.org/stein/deploy/ > > > Zitat von Jyoti Dahiwele : > > > Thanks for your reply. > > I'm referring this link > > > https://docs.openstack.org/mitaka/admin-guide/compute-images-instances.html > > to understand about root disk. In this it is saying root disk will come > > from compute node. What is the location of root disk on compute node? > > > > If I want to keep all my vms on shared storage . How to configure it ? > > > > Or If I want to keep all my vms on cinder volume. What will be the > > configuration for it on nova and cinder? > > > > > > On Thu, 11 Jul 2019, 15:24 Eugen Block, wrote: > > > >> Hi, > >> > >> it's always glance that serves the images, it just depends on how you > >> decide to create the instance, ephemeral or persistent disks. You can > >> find more information about storage concepts in [1]. 
> >> > >> If I'm not completely wrong, since Newton release the default in the > >> Horizon settings is to create an instance from volume, so it would be > >> a persistent disk managed by cinder (the volume persists after the > >> instance has been deleted, this is also configurable). The image is > >> downloaded from glance into a volume on your volume server. > >> > >> If you change the Horizon behavior or if you launch an instance from > >> the cli you'd get an ephemeral disk by nova, depending on your storage > >> backend this would be a local copy of the image on the compute node(s) > >> or something related in your storage backend, e.g. an rbd object in > >> ceph. > >> > >> Does this clear it up a bit? > >> > >> Regards, > >> Eugen > >> > >> [1] > >> > >> > https://docs.openstack.org/arch-design/design-storage/design-storage-concepts.html > >> > >> > >> Zitat von Jyoti Dahiwele : > >> > >> > Dear Team, > >> > > >> > Please clear me my following doubts. > >> > When I use image from source option and mini flavor to create an > >> instace, > >> > from which storage pool instance will get root disk ? From cinder or > >> glance? > >> > >> > >> > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Thu Jul 11 15:42:35 2019 From: jungleboyj at gmail.com (Jay Bryant) Date: Thu, 11 Jul 2019 10:42:35 -0500 Subject: [cinder] Shanghai PTG/Forum Attendance ... Message-ID: <49416a91-b7d4-6b58-a7d9-33c6b6fc2f5a@gmail.com> All, We have been asked to get an idea of how many people are planning to attend the PTG and/or Forum for Cinder.  I have created an etherpad to start collecting topics as well as a list of people who are planning to attend [1].  If you think you may be in Shanghai or are in a timezone that can participate remotely, please update the etherpad. For fun, I have also created a Twitter Poll [2].  If you want to vote there to indicate attendance/non-attendance, please do so! Thanks! Jay (irc: jungleboyj) [1] https://etherpad.openstack.org/p/cinder-shanghai-ptg-planning [2] https://twitter.com/jungleboyj/status/1149045308271333378 From sfinucan at redhat.com Thu Jul 11 16:10:29 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Thu, 11 Jul 2019 17:10:29 +0100 Subject: [doc8] future development and maintenance In-Reply-To: <9D091EF4-60F2-490C-BF76-AAD7A68FB8A1@redhat.com> References: <9D091EF4-60F2-490C-BF76-AAD7A68FB8A1@redhat.com> Message-ID: <273715c4f2d4933232fc6b26cb982609daa1a2c7.camel@redhat.com> On Thu, 2019-07-11 at 12:01 +0100, Sorin Sbarnea wrote: > It seems that the doc8 project is lacking some love, regardless the > fact that is used by >90 projects from opendev.org. > > https://review.opendev.org/#/q/project:x/doc8 > > Last merge and release was more than 2 years ago and no reviews were > performed either. > > I think it would be in our interest to assure that doc8 maintenance > continues and that we can keep it usable. > > I would like to propose extenting the list of cores from the current > 4 ones that I already listed in CC with 3 more, so we can effectively > make a change that gets merged and later released (anyone willing to > help?) > > If current cores agree, I would be happy to help with maintenance. I > estimate that the effort needed would likely be less than 1h/month in > longer term. If there is a desire to move it to github/travis, I > would not mind either. 
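(Following on from the shared-storage point above, a minimal sketch of what pointing nova's ephemeral disks at a ceph backend usually involves on each compute node; the pool and user names are assumptions, crudini is just one convenient way to edit the file, and the service name differs between distributions:)

    crudini --set /etc/nova/nova.conf libvirt images_type rbd
    crudini --set /etc/nova/nova.conf libvirt images_rbd_pool vms
    crudini --set /etc/nova/nova.conf libvirt images_rbd_ceph_conf /etc/ceph/ceph.conf
    crudini --set /etc/nova/nova.conf libvirt rbd_user cinder
    crudini --set /etc/nova/nova.conf libvirt rbd_secret_uuid <your-libvirt-secret-uuid>
    systemctl restart nova-compute    # or openstack-nova-compute on RHEL/CentOS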
I'd be tentatively interested in helping out here, though it's not as if I don't already have a lot on my plate :) I wonder if this might have better success outside of OpenStack, perhaps in the sphinx-doc or sphinx-contrib GitHub repo? Stephen > Thanks > Sorin Sbarnea > Red Hat TripleO CI From emilien at redhat.com Thu Jul 11 16:11:23 2019 From: emilien at redhat.com (Emilien Macchi) Date: Thu, 11 Jul 2019 12:11:23 -0400 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: References: Message-ID: On Wed, Jul 10, 2019 at 4:24 PM James Slagle wrote: > There's been a fair amount of recent work around simplifying our Heat > templates and migrating the software configuration part of our > deployment entirely to Ansible. > > As part of this effort, it became apparent that we could render much > of the data that we need out of Heat in a way that is generic per > node, and then have Ansible render the node specific data during > config-download runtime. > > To illustrate the point, consider when we specify ComputeCount:10 in > our templates, that much of the work that Heat is doing across those > 10 sets of resources for each Compute node is duplication. However, > it's been necessary so that Heat can render data structures such as > list of IP's, lists of hostnames, contents of /etc/hosts files, etc > etc etc. If all that was driven by Ansible using host facts, then Heat > doesn't need to do those 10 sets of resources to begin with. > > The goal is to get to a point where we can deploy the Heat stack with > a count of 1 for each role, and then deploy any number of nodes per > role using Ansible. To that end, I've been referring to this effort as > N=1. > > The value in this work is that it directly addresses our scaling > issues with Heat (by just deploying a much smaller stack). Obviously > we'd still be relying heavily on Ansible to scale to the required > levels, but I feel that is much better understood challenge at this > point in the evolution of configuration tools. > > With the patches that we've been working on recently, I've got a POC > running where I can deploy additional compute nodes with just Ansible. > This is done by just adding the additional nodes to the Ansible > inventory with a small set of facts to include IP addresses on each > enabled network and a hostname. > > These patches are at > https://review.opendev.org/#/q/topic:bp/reduce-deployment-resources > and reviews/feedback are welcome. > This is a fabulous proposal in my opinion. I've added (and will continue to add) TODO ideas in the etherpad. Anyone willing to help, please ping us if needed. Another point, somewhat related: I took the opportunity of this work to reduce the complexity around the number of hieradata files. I would like to investigate if we can generate one data file which would be loaded by both Puppet and Ansible for doing the configuration management. I'll create a separated thread on that effort very soon. > Other points: > > - Baremetal provisioning and port creation are presently handled by > Heat. With the ongoing efforts to migrate baremetal provisioning out > of Heat (nova-less deploy), I think these efforts are very > complimentary. Eventually, we get to a point where Heat is not > actually creating any other OpenStack API resources. For now, the > patches only work when using pre-provisioned nodes. > > - We need to consider how we'd manage the Ansible inventory going > forward if we open up an interface for operators to manipulate it > directly. 
That's something we'd want to manage and preserve (version > control) as it's critical data for the deployment. > > Given the progress that we've made with the POC, my sense is that > we'll keep pushing in this overall direction. I'd like to get some > feedback on the approach. We have an etherpad we are using to track > some of the work at a high level: > > https://etherpad.openstack.org/p/tripleo-reduce-deployment-resources > > I'll be adding some notes on how I setup the POC to that etherpad if > others would like to try it out. > > -- > -- James Slagle > -- > > -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From ianyrchoi at gmail.com Thu Jul 11 16:41:05 2019 From: ianyrchoi at gmail.com (Ian Y. Choi) Date: Fri, 12 Jul 2019 01:41:05 +0900 Subject: [I18n][ptg] Shanghai PTG/Forum Attendance Message-ID: <64124295-1a95-598c-f929-fe546cbecad8@gmail.com> Hello, Like other official teams, I18n team has been also asked to get an idea of how many people are planning to attend the PTG and/or Forum. I have created an Etherpad to start collecting topics as well as a list of people who are planning to attend [1], and if you think you may be in Shanghai or are in a timezone that can participate remotely, please update the Etherpad. Note that this upcoming PTG/Forum is very Asia-timezone friendly - I highly encourage people in Asia countries (including China, surely) to participate offline or remotely. Thanks! With many thanks, /Ian [1] https://etherpad.openstack.org/p/i18n-shanghai-ptg-planning From emilien at redhat.com Thu Jul 11 17:52:29 2019 From: emilien at redhat.com (Emilien Macchi) Date: Thu, 11 Jul 2019 13:52:29 -0400 Subject: [tripleo] Roadmap to simplify TripleO Message-ID: Even though TripleO is well known for its maturity, it also has a reputation of being complex when it comes to the number of tools that it uses. Somewhat related to the efforts that James is leading with "Scaling TripleO" [1] [2], I would like to formalize our joint efforts to make TripleO simpler in the future. Some work has been done over the last releases already and yet we have seen net benefits; however we still have challenges ahead of us. - With no UI anymore, do we really need an API? - How can we reduce the number of languages in TripleO? ... and make Python + Ansible the main ones. - How can we reduce our dependencies? I created a document which explains the problems and propose some solutions: https://docs.google.com/document/d/1vY9rsccgp7NHFXpLtCFTHQm14F15Tij7lhn5X_P14Ys For those who can't or don't want Google Doc, I've put together the notes into etherpad [3] and I'll take care of making sure it's updated at last at the beginning until we sort things out. Feel free to be involved: - comment or suggest edits if you have feedback / ideas - sign-up if you're interested to contribute For now I expect this document to be a clear place what our plan is but I wouldn't be surprised if in the future we break it down into specs / blueprints / etc for better tracking. Thanks, [1] https://docs.google.com/document/d/12tPc4NC5fo8ytGuFZ4DSZXXyzes1x3U7oYz9neaPP_o/edit [2] http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007638.html [3] https://etherpad.openstack.org/p/tripleo-simplification -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at fried.cc Thu Jul 11 17:58:15 2019 From: openstack at fried.cc (Eric Fried) Date: Thu, 11 Jul 2019 12:58:15 -0500 Subject: [placement][ptg] Shanghai attendance In-Reply-To: References: Message-ID: <11798408-977b-ecdd-2730-4e7dfb16f38d@fried.cc> Chris- > asked for feedback from the rest of the placement team on whether we > need to do any placement work there, given the value we got out of > the virtual pre-PTG in April. > > Let me ask again: Do we need a presence at the PTG in Shanghai? The email-based virtual pre-PTG was extremely productive. We should definitely do it again. However, I feel like that last hour on Saturday saw more design progress than weeks worth of emails or spec reviews could have accomplished. To me, this is the value of the in-person meetups. It sucks that we can't involve everybody in the discussions - but we can rapidly crystallize an idea enough to produce a coherent spec that *can* then involve everybody. Having said that, I'm really not coming up with any major design topics that would benefit from such a meetup this time around. I feel like what we've accomplished in Train sets us up for a cycle or two of refinement (perf/docs/refactor/tech-debt) rather than feature work. I suppose we'll see what shakes out on the > https://etherpad.openstack.org/p/placement-shanghai-ptg efried . From donny at fortnebula.com Thu Jul 11 18:06:00 2019 From: donny at fortnebula.com (Donny Davis) Date: Thu, 11 Jul 2019 14:06:00 -0400 Subject: [Nova] Instances can't be started after compute nodes unexpectedly shut down because of power outage In-Reply-To: References: Message-ID: Can you ssh to the hypervisor and run virsh list to make sure your instances are in fact down? On Thu, Jul 11, 2019 at 3:02 AM Gökhan IŞIK wrote: > Can anyone help me please ? I can no't rescue my instances yet :( > > Thanks, > Gökhan > > Gökhan IŞIK , 9 Tem 2019 Sal, 15:46 tarihinde > şunu yazdı: > >> Hi folks, >> Because of power outage, Most of our compute nodes unexpectedly >> shut down and now I can not start our instances. Error message is "Failed >> to get "write" lock another process using the image?". Instances Power >> status is No State. Full error log is >> http://paste.openstack.org/show/754107/. My environment is OpenStack >> Pike on Ubuntu 16.04 LTS servers and Instances are on a nfs shared storage. >> Nova version is 16.1.6.dev2. qemu version is 2.10.1. libvirt version is >> 3.6.0. I saw a commit [1], but it doesn't solve this problem. >> There are important instances on my environment. How can I rescue my >> instances? What would you suggest ? >> >> Thanks, >> Gökhan >> >> [1] https://review.opendev.org/#/c/509774/ >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Thu Jul 11 18:15:55 2019 From: smooney at redhat.com (Sean Mooney) Date: Thu, 11 Jul 2019 19:15:55 +0100 Subject: [placement][ptg] Shanghai attendance In-Reply-To: <11798408-977b-ecdd-2730-4e7dfb16f38d@fried.cc> References: <11798408-977b-ecdd-2730-4e7dfb16f38d@fried.cc> Message-ID: <98d51bb7ad7d34268927e0fdb47fb8f31c6af6cb.camel@redhat.com> On Thu, 2019-07-11 at 12:58 -0500, Eric Fried wrote: > Chris- > > > asked for feedback from the rest of the placement team on whether we > > need to do any placement work there, given the value we got out of > > the virtual pre-PTG in April. > > > > Let me ask again: Do we need a presence at the PTG in Shanghai? > > The email-based virtual pre-PTG was extremely productive. We should > definitely do it again. 
> > However, I feel like that last hour on Saturday saw more design progress > than weeks worth of emails or spec reviews could have accomplished. To > me, this is the value of the in-person meetups. It sucks that we can't > involve everybody in the discussions - but we can rapidly crystallize an > idea enough to produce a coherent spec that *can* then involve everybody. well if ye do a virtual email based pre-ptg again why not also continue that virtual concept and consider a google hangout or somehting to allow realtime discussion with video/etherpad. i personally found the email discussion hard to follow vs an etherpad or gerrit review per topic but it did have pluses too. > > Having said that, I'm really not coming up with any major design topics > that would benefit from such a meetup this time around. I feel like what > we've accomplished in Train sets us up for a cycle or two of refinement > (perf/docs/refactor/tech-debt) rather than feature work. I suppose we'll > see what shakes out on the > > > https://etherpad.openstack.org/p/placement-shanghai-ptg > > efried > . > From skylightcoder at gmail.com Thu Jul 11 18:39:11 2019 From: skylightcoder at gmail.com (=?UTF-8?B?R8O2a2hhbiBJxZ5JSw==?=) Date: Thu, 11 Jul 2019 21:39:11 +0300 Subject: [Nova] Instances can't be started after compute nodes unexpectedly shut down because of power outage In-Reply-To: References: Message-ID: I run virsh list --all command and output is below: root at compute06:~# virsh list --all Id Name State ---------------------------------------------------- - instance-000012f9 shut off - instance-000013b6 shut off - instance-000016fb shut off - instance-0000190a shut off - instance-00001a8a shut off - instance-00001e05 shut off - instance-0000202a shut off - instance-00002135 shut off - instance-00002141 shut off - instance-000021b6 shut off - instance-000021ec shut off - instance-000023db shut off - instance-00002ad7 shut off And also when I try start instances with virsh , output is below: root at compute06:~# virsh start instance-0000219b error: Failed to start domain instance-000012f9 error: internal error: process exited while connecting to monitor: 2019-07-11T18:36:34.229534Z qemu-system-x86_64: -chardev pty,id=charserial0,logfile=/dev/fdset/2,logappend=on: char device redirected to /dev/pts/3 (label charserial0) 2019-07-11T18:36:34.243395Z qemu-system-x86_64: -drive file=/var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,discard=ignore: Failed to get "write" lock Is another process using the image? Thanks, Gökhan Donny Davis , 11 Tem 2019 Per, 21:06 tarihinde şunu yazdı: > Can you ssh to the hypervisor and run virsh list to make sure your > instances are in fact down? > > On Thu, Jul 11, 2019 at 3:02 AM Gökhan IŞIK > wrote: > >> Can anyone help me please ? I can no't rescue my instances yet :( >> >> Thanks, >> Gökhan >> >> Gökhan IŞIK , 9 Tem 2019 Sal, 15:46 tarihinde >> şunu yazdı: >> >>> Hi folks, >>> Because of power outage, Most of our compute nodes unexpectedly >>> shut down and now I can not start our instances. Error message is "Failed >>> to get "write" lock another process using the image?". Instances Power >>> status is No State. Full error log is >>> http://paste.openstack.org/show/754107/. My environment is OpenStack >>> Pike on Ubuntu 16.04 LTS servers and Instances are on a nfs shared storage. >>> Nova version is 16.1.6.dev2. qemu version is 2.10.1. libvirt version is >>> 3.6.0. 
I saw a commit [1], but it doesn't solve this problem. >>> There are important instances on my environment. How can I rescue my >>> instances? What would you suggest ? >>> >>> Thanks, >>> Gökhan >>> >>> [1] https://review.opendev.org/#/c/509774/ >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From jayachander.it at gmail.com Thu Jul 11 18:51:30 2019 From: jayachander.it at gmail.com (Jay See) Date: Thu, 11 Jul 2019 20:51:30 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: Thanks for explanation Ignazio. I have tried same same by trying to put the compute node on a failure (echo 'c' > /proc/sysrq-trigger ). Compute node was stuck and I was not able connect to it. All the VMs are now in Error state. Running the host-evacaute was successful on controller node, but now I am not able to use the VMs. Because they are all in error state now. root at h004:~$ nova host-evacuate h017 +--------------------------------------+-------------------+---------------+ | Server UUID | Evacuate Accepted | Error Message | +--------------------------------------+-------------------+---------------+ | f3545f7d-b85e-49ee-b407-333a4c5b5ab9 | True | | | 9094494b-cfa3-459b-8d51-d9aae0ea9636 | True | | | abe7075b-ac22-4168-bf3d-d302ba37d80e | True | | | c9919371-5f2e-4155-a01a-5f41d9c8b0e7 | True | | | ffd983bb-851e-4314-9d1d-375303c278f3 | True | | +--------------------------------------+-------------------+---------------+ Now I have restarted the compute node manually , now I am able to connect to the compute node but VMs are still in Error state. 1. Any ideas, how to recover the VMs? 2. Are there any other methods to evacuate, as this method seems to be not working in mitaka version. ~Jay. On Thu, Jul 11, 2019 at 1:33 PM Ignazio Cassano wrote: > Ok Jay, > let me to describe my environment. > I have an openstack made up of 3 controllers nodes ad several compute > nodes. > The controller nodes services are controlled by pacemaker and the compute > nodes services are controlled by remote pacemaker. > My hardware is Dell so I am using ipmi fencing device . > I wrote a service controlled by pacemaker: > this service controls if a compude node fails and for avoiding split > brains if a compute node does nod respond on the management network and on > storage network the stonith poweroff the node and then execute a nova > host-evacuate. > > Anycase to have a simulation before writing the service I described above > you can do as follows: > > connect on one compute node where some virtual machines are running > run the command: echo 'c' > /proc/sysrq-trigger (it stops immediately the > node like in case of failure) > On a controller node run: nova host-evacuate "name of failed compute node" > Instances running on the failed compute node should be restarted on > another compute node > > > Ignazio > > Il giorno gio 11 lug 2019 alle ore 11:57 Jay See > ha scritto: > >> Hi , >> >> I have tried on a failed compute node which is in power off state now. >> I have tried on a running compute node, no errors. But nothing happens. >> On running compute node - Disabled the compute service and tried >> migration also. >> >> May be I might have not followed proper steps. Just wanted to know the >> steps you have followed. Otherwise, I was planning to manual migration also >> if possible. >> ~Jay. 
>> >> On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano < >> ignaziocassano at gmail.com> wrote: >> >>> Hi Jay, >>> would you like to evacuate a failed compute node or evacuate a running >>> compute node ? >>> >>> Ignazio >>> >>> Il giorno gio 11 lug 2019 alle ore 11:48 Jay See < >>> jayachander.it at gmail.com> ha scritto: >>> >>>> Hi Ignazio, >>>> >>>> I am trying to evacuate the compute host on older version (mitaka). >>>> Could please share the process you followed. I am not able to succeed >>>> with openstack live-migration fails with error message (this is known issue >>>> in older versions) and nova live-ligration - nothing happens even after >>>> initiating VM migration. It is almost 4 days. >>>> >>>> ~Jay. >>>> >>>> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> I am sorry. >>>>> For simulating an host crash I used a wrong procedure. >>>>> Using "echo 'c' > /proc/sysrq-trigger" all work fine >>>>> >>>>> Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < >>>>> ignaziocassano at gmail.com> ha scritto: >>>>> >>>>>> Hello All, >>>>>> on ocata when I poweroff a node with active instance , doing a nova >>>>>> host-evacuate works fine >>>>>> and instances are restartd on an active node. >>>>>> On queens it does non evacuate instances but nova-api reports for >>>>>> each instance the following: >>>>>> >>>>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi >>>>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 >>>>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: >>>>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is >>>>>> in task_state powering-off >>>>>> >>>>>> So it poweroff all instance on the failed node but does not start >>>>>> them on active nodes >>>>>> >>>>>> What is changed ? >>>>>> Ignazio >>>>>> >>>>>> >>>>>> >>>> >>>> -- >>>> ​ >>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>> necessary.* >>>> >>> >> >> -- >> ​ >> P *SAVE PAPER – Please do not print this e-mail unless absolutely >> necessary.* >> > -- ​ P *SAVE PAPER – Please do not print this e-mail unless absolutely necessary.* -------------- next part -------------- An HTML attachment was scrubbed... URL: From donny at fortnebula.com Thu Jul 11 19:11:02 2019 From: donny at fortnebula.com (Donny Davis) Date: Thu, 11 Jul 2019 15:11:02 -0400 Subject: [Nova] Instances can't be started after compute nodes unexpectedly shut down because of power outage In-Reply-To: References: Message-ID: Well that is interesting. If you look in your libvirt config directory (/etc/libvirt on Centos) you can get a little more info on what is being used for locking. Maybe strace can shed some light on it. 
Try something like strace -ttt -f qemu-img info /var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk On Thu, Jul 11, 2019 at 2:39 PM Gökhan IŞIK wrote: > I run virsh list --all command and output is below: > > root at compute06:~# virsh list --all > Id Name State > ---------------------------------------------------- > - instance-000012f9 shut off > - instance-000013b6 shut off > - instance-000016fb shut off > - instance-0000190a shut off > - instance-00001a8a shut off > - instance-00001e05 shut off > - instance-0000202a shut off > - instance-00002135 shut off > - instance-00002141 shut off > - instance-000021b6 shut off > - instance-000021ec shut off > - instance-000023db shut off > - instance-00002ad7 shut off > > And also when I try start instances with virsh , output is below: > > root at compute06:~# virsh start instance-0000219b > error: Failed to start domain instance-000012f9 > error: internal error: process exited while connecting to monitor: > 2019-07-11T18:36:34.229534Z qemu-system-x86_64: -chardev > pty,id=charserial0,logfile=/dev/fdset/2,logappend=on: char device > redirected to /dev/pts/3 (label charserial0) > 2019-07-11T18:36:34.243395Z qemu-system-x86_64: -drive > file=/var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,discard=ignore: > Failed to get "write" lock > Is another process using the image? > > Thanks, > Gökhan > > Donny Davis , 11 Tem 2019 Per, 21:06 tarihinde şunu > yazdı: > >> Can you ssh to the hypervisor and run virsh list to make sure your >> instances are in fact down? >> >> On Thu, Jul 11, 2019 at 3:02 AM Gökhan IŞIK >> wrote: >> >>> Can anyone help me please ? I can no't rescue my instances yet :( >>> >>> Thanks, >>> Gökhan >>> >>> Gökhan IŞIK , 9 Tem 2019 Sal, 15:46 tarihinde >>> şunu yazdı: >>> >>>> Hi folks, >>>> Because of power outage, Most of our compute nodes unexpectedly >>>> shut down and now I can not start our instances. Error message is "Failed >>>> to get "write" lock another process using the image?". Instances Power >>>> status is No State. Full error log is >>>> http://paste.openstack.org/show/754107/. My environment is OpenStack >>>> Pike on Ubuntu 16.04 LTS servers and Instances are on a nfs shared storage. >>>> Nova version is 16.1.6.dev2. qemu version is 2.10.1. libvirt version is >>>> 3.6.0. I saw a commit [1], but it doesn't solve this problem. >>>> There are important instances on my environment. How can I rescue my >>>> instances? What would you suggest ? >>>> >>>> Thanks, >>>> Gökhan >>>> >>>> [1] https://review.opendev.org/#/c/509774/ >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylightcoder at gmail.com Thu Jul 11 19:23:37 2019 From: skylightcoder at gmail.com (=?UTF-8?B?R8O2a2hhbiBJxZ5JSw==?=) Date: Thu, 11 Jul 2019 22:23:37 +0300 Subject: [Nova] Instances can't be started after compute nodes unexpectedly shut down because of power outage In-Reply-To: References: Message-ID: In [1] it says "Image locking is added and enabled by default. Multiple QEMU processes cannot write to the same image as long as the host supports OFD or posix locking, unless options are specified otherwise." May be need to do something on nova side. I run this command and get same error. Output is in http://paste.openstack.org/show/754311/ İf I run qemu-img info instance-0000219b with -U , it doesn't give any errors. 
[1] https://wiki.qemu.org/ChangeLog/2.10 Donny Davis , 11 Tem 2019 Per, 22:11 tarihinde şunu yazdı: > Well that is interesting. If you look in your libvirt config directory > (/etc/libvirt on Centos) you can get a little more info on what is being > used for locking. > > Maybe strace can shed some light on it. Try something like > > strace -ttt -f qemu-img info > /var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk > > > > > > On Thu, Jul 11, 2019 at 2:39 PM Gökhan IŞIK > wrote: > >> I run virsh list --all command and output is below: >> >> root at compute06:~# virsh list --all >> Id Name State >> ---------------------------------------------------- >> - instance-000012f9 shut off >> - instance-000013b6 shut off >> - instance-000016fb shut off >> - instance-0000190a shut off >> - instance-00001a8a shut off >> - instance-00001e05 shut off >> - instance-0000202a shut off >> - instance-00002135 shut off >> - instance-00002141 shut off >> - instance-000021b6 shut off >> - instance-000021ec shut off >> - instance-000023db shut off >> - instance-00002ad7 shut off >> >> And also when I try start instances with virsh , output is below: >> >> root at compute06:~# virsh start instance-0000219b >> error: Failed to start domain instance-000012f9 >> error: internal error: process exited while connecting to monitor: >> 2019-07-11T18:36:34.229534Z qemu-system-x86_64: -chardev >> pty,id=charserial0,logfile=/dev/fdset/2,logappend=on: char device >> redirected to /dev/pts/3 (label charserial0) >> 2019-07-11T18:36:34.243395Z qemu-system-x86_64: -drive >> file=/var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,discard=ignore: >> Failed to get "write" lock >> Is another process using the image? >> >> Thanks, >> Gökhan >> >> Donny Davis , 11 Tem 2019 Per, 21:06 tarihinde >> şunu yazdı: >> >>> Can you ssh to the hypervisor and run virsh list to make sure your >>> instances are in fact down? >>> >>> On Thu, Jul 11, 2019 at 3:02 AM Gökhan IŞIK >>> wrote: >>> >>>> Can anyone help me please ? I can no't rescue my instances yet :( >>>> >>>> Thanks, >>>> Gökhan >>>> >>>> Gökhan IŞIK , 9 Tem 2019 Sal, 15:46 tarihinde >>>> şunu yazdı: >>>> >>>>> Hi folks, >>>>> Because of power outage, Most of our compute nodes unexpectedly >>>>> shut down and now I can not start our instances. Error message is "Failed >>>>> to get "write" lock another process using the image?". Instances Power >>>>> status is No State. Full error log is >>>>> http://paste.openstack.org/show/754107/. My environment is OpenStack >>>>> Pike on Ubuntu 16.04 LTS servers and Instances are on a nfs shared storage. >>>>> Nova version is 16.1.6.dev2. qemu version is 2.10.1. libvirt version is >>>>> 3.6.0. I saw a commit [1], but it doesn't solve this problem. >>>>> There are important instances on my environment. How can I rescue my >>>>> instances? What would you suggest ? >>>>> >>>>> Thanks, >>>>> Gökhan >>>>> >>>>> [1] https://review.opendev.org/#/c/509774/ >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From vungoctan252 at gmail.com Thu Jul 11 04:32:06 2019 From: vungoctan252 at gmail.com (Vu Tan) Date: Thu, 11 Jul 2019 11:32:06 +0700 Subject: [masakari] how to install masakari on centos 7 In-Reply-To: References: Message-ID: I know it's just a warning, just take a look at this image: [image: image.png] it's just hang there forever, and in the log show what I have shown to you On Wed, Jul 10, 2019 at 8:07 PM Gaëtan Trellu wrote: > This is just a warning, not an error. > > On Jul 10, 2019 3:12 AM, Vu Tan wrote: > > Hi Gaetan, > I follow you the guide you gave me, but the problem still persist, can you > please take a look at my configuration to see what is wrong or what is > missing in my config ? > the error: > 2019-07-10 14:08:46.876 17292 WARNING keystonemiddleware._common.config > [-] The option "__file__" in conf is not known to auth_token > 2019-07-10 14:08:46.876 17292 WARNING keystonemiddleware._common.config > [-] The option "here" in conf is not known to auth_token > 2019-07-10 14:08:46.882 17292 WARNING keystonemiddleware.auth_token [-] > AuthToken middleware is set with keystone_authtoken.service_ > > the config: > > [DEFAULT] > enabled_apis = masakari_api > log_dir = /var/log/kolla/masakari > state_path = /var/lib/masakari > os_user_domain_name = default > os_project_domain_name = default > os_privileged_user_tenant = service > os_privileged_user_auth_url = http://controller:5000/v3 > os_privileged_user_name = nova > os_privileged_user_password = P at ssword > masakari_api_listen = controller > masakari_api_listen_port = 15868 > debug = False > auth_strategy=keystone > > [wsgi] > # The paste configuration file path > api_paste_config = /etc/masakari/api-paste.ini > > [keystone_authtoken] > www_authenticate_uri = http://controller:5000 > auth_url = http://controller:5000 > auth_type = password > project_domain_id = default > project_domain_name = default > user_domain_name = default > user_domain_id = default > project_name = service > username = masakari > password = P at ssword > region_name = RegionOne > > [oslo_middleware] > enable_proxy_headers_parsing = True > > [database] > connection = mysql+pymysql://masakari:P at ssword@controller/masakari > > > > On Tue, Jul 9, 2019 at 10:25 PM Vu Tan wrote: > > Thank Patil Tushar, I hope it will be available soon > > On Tue, Jul 9, 2019 at 8:18 AM Patil, Tushar > wrote: > > Hi Vu and Gaetan, > > Gaetan, thank you for helping out Vu in setting up masakari-monitors > service. > > As a masakari team ,we have noticed there is a need to add proper > documentation to help the community run Masakari services in their > environment. We are working on adding proper documentation in this 'Train' > cycle. > > Will send an email on this mailing list once the patches are uploaded on > the gerrit so that you can give your feedback on the same. > > If you have any trouble in setting up Masakari, please let us know on this > mailing list or join the bi-weekly IRC Masakari meeting on the > #openstack-meeting IRC channel. The next meeting will be held on 16th July > 2019 @0400 UTC. > > Regards, > Tushar Patil > > ________________________________________ > From: Vu Tan > Sent: Monday, July 8, 2019 11:21:16 PM > To: Gaëtan Trellu > Cc: openstack-discuss at lists.openstack.org > Subject: Re: [masakari] how to install masakari on centos 7 > > Hi Gaetan, > Thanks for pinpoint this out, silly me that did not notice the simple > "error InterpreterNotFound: python3". 
Thanks a lot, I appreciate it > > On Mon, Jul 8, 2019 at 9:15 PM gaetan.trellu at incloudus.com>> wrote: > Vu Tan, > > About "auth_token" error, you need "os_privileged_user_*" options into > your masakari.conf for the API. > As mentioned previously please have a look here to have an example of > configuration working (for me at least): > > - masakari.conf: > > https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari.conf.j2 > - masakari-monitor.conf: > > https://review.opendev.org/#/c/615715/42/ansible/roles/masakari/templates/masakari-monitors.conf.j2 > > About your tox issue make sure you have Python3 installed. > > Gaëtan > > On 2019-07-08 06:08, Vu Tan wrote: > > > Hi Gaetan, > > I try to generate config file by using this command tox -egenconfig on > > top level of masakari but the output is error, is this masakari still > > in beta version ? > > [root at compute1 masakari-monitors]# tox -egenconfig > > genconfig create: /root/masakari-monitors/.tox/genconfig > > ERROR: InterpreterNotFound: python3 > > _____________________________________________________________ summary > > ______________________________________________________________ > > ERROR: genconfig: InterpreterNotFound: python3 > > > > On Mon, Jul 8, 2019 at 3:24 PM Vu Tan vungoctan252 at gmail.com>> wrote: > > Hi, > > Thanks a lot for your reply, I install pacemaker/corosync, > > masakari-api, maskari-engine on controller node, and I run masakari-api > > with this command: masakari-api, but I dont know whether the process is > > running like that or is it just hang there, here is what it shows when > > I run the command, I leave it there for a while but it does not change > > anything : > > [root at controller masakari]# masakari-api > > 2019-07-08 15:21:09.946 30250 INFO masakari.api.openstack [-] Loaded > > extensions: ['extensions', 'notifications', 'os-hosts', 'segments', > > 'versions'] > > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > > [-] The option "__file__" in conf is not known to auth_token > > 2019-07-08 15:21:09.955 30250 WARNING keystonemiddleware._common.config > > [-] The option "here" in conf is not known to auth_token > > 2019-07-08 15:21:09.960 30250 WARNING keystonemiddleware.auth_token [-] > > AuthToken middleware is set with > > keystone_authtoken.service_token_roles_required set to False. This is > > backwards compatible but deprecated behaviour. Please set this to True. > > 2019-07-08 15:21:09.974 30250 INFO masakari.wsgi [-] masakari_api > > listening on 127.0.0.1:15868 > > 2019-07-08 15:21:09.975 30250 INFO oslo_service.service [-] Starting 4 > > workers > > 2019-07-08 15:21:09.984 30274 INFO masakari.masakari_api.wsgi.server > > [-] (30274) wsgi starting up on http://127.0.0.1:15868 > > 2019-07-08 15:21:09.985 30275 INFO masakari.masakari_api.wsgi.server > > [-] (30275) wsgi starting up on http://127.0.0.1:15868 > > 2019-07-08 15:21:09.992 30277 INFO masakari.masakari_api.wsgi.server > > [-] (30277) wsgi starting up on http://127.0.0.1:15868 > > 2019-07-08 15:21:09.994 30276 INFO masakari.masakari_api.wsgi.server > > [-] (30276) wsgi starting up on http://127.0.0.1:15868 > > > > On Sun, Jul 7, 2019 at 7:37 PM Gaëtan Trellu > > > wrote: > > > > Hi Vu Tan, > > > > Masakari documentation doesn't really exist... I had to figured some > > stuff by myself to make it works into Kolla project. 
> > > > On controller nodes you need: > > > > - pacemaker > > - corosync > > - masakari-api (openstack/masakari repository) > > - masakari- engine (openstack/masakari repository) > > > > On compute nodes you need: > > > > - pacemaker-remote (integrated to pacemaker cluster as a resource) > > - masakari- hostmonitor (openstack/masakari-monitor repository) > > - masakari-instancemonitor (openstack/masakari-monitor repository) > > - masakari-processmonitor (openstack/masakari-monitor repository) > > > > For masakari-hostmonitor, the service needs to have access to systemctl > > command (make sure you are not using sysvinit). > > > > For masakari-monitor, the masakari-monitor.conf is a bit different, you > > will have to configure the [api] section properly. > > > > RabbitMQ needs to be configured (as transport_url) on masakari-api and > > masakari-engine too. > > > > Please check this review[1], you will have masakari.conf and > > masakari-monitor.conf configuration examples. > > > > [1] https://review.opendev.org/#/c/615715 > > > > Gaëtan > > > > On Jul 7, 2019 12:08 AM, Vu Tan vungoctan252 at gmail.com>> wrote: > > > > VU TAN > > > > > 10:30 AM (35 minutes ago) > > > > to openstack-discuss > > > > Sorry, I resend this email because I realized that I lacked of prefix > > on this email's subject > > > > Hi, > > > > I would like to use Masakari and I'm having trouble finding a step by > > step or other documentation to get started with. Which part should be > > installed on controller, which is should be on compute, and what is the > > prerequisite to install masakari, I have installed corosync and > > pacemaker on compute and controller nodes, , what else do I need to do > > ? step I have done so far: > > - installed corosync/pacemaker > > - install masakari on compute node on this github repo: > > https://github.com/openstack/masakari > > - add masakari in to mariadb > > here is my configuration file of masakari.conf, do you mind to take a > > look at it, if I have misconfigured anything? > > > > [DEFAULT] > > enabled_apis = masakari_api > > > > # Enable to specify listening IP other than default > > masakari_api_listen = controller > > # Enable to specify port other than default > > masakari_api_listen_port = 15868 > > debug = False > > auth_strategy=keystone > > > > [wsgi] > > # The paste configuration file path > > api_paste_config = /etc/masakari/api-paste.ini > > > > [keystone_authtoken] > > www_authenticate_uri = http://controller:5000 > > auth_url = http://controller:5000 > > auth_type = password > > project_domain_id = default > > user_domain_id = default > > project_name = service > > username = masakari > > password = P at ssword > > > > [database] > > connection = mysql+pymysql://masakari:P at ssword@controller/masakari > Disclaimer: This email and any attachments are sent in strictest > confidence for the sole use of the addressee and may contain legally > privileged, confidential, and proprietary data. If you are not the intended > recipient, please advise the sender by replying promptly to this email and > then delete and destroy this email and any attachments without any further > use, copying or forwarding. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 2369 bytes Desc: not available URL: From manuel.sb at garvan.org.au Thu Jul 11 06:15:00 2019 From: manuel.sb at garvan.org.au (Manuel Sopena Ballesteros) Date: Thu, 11 Jul 2019 06:15:00 +0000 Subject: trying to understand steal time with cpu pinning Message-ID: <9D8A2486E35F0941A60430473E29F15B017EB28866@MXDB1.ad.garvan.unsw.edu.au> Dear Openstack community, Please correct me if I am wrong. As far as I understand `steal time > 0` means that the hypervisor has replaced a vcpu with a different one on the physical cpu. Also, cpu pinning allocates a vcpu to a physical cpu permanently. I have a vm setup with cpu pinning and numa affinity and realized, that cpu steal time is between 1% and 0%. [cid:image002.png at 01D53803.C9C9F190] Why is that? Thank you very much NOTICE Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.png Type: image/png Size: 108451 bytes Desc: image002.png URL: From ignaziocassano at gmail.com Thu Jul 11 08:10:39 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 11 Jul 2019 10:10:39 +0200 Subject: [queens][magnuma] kubernetes cluster In-Reply-To: References: <765458860.27505.1562167876903@ox.servinga.com> Message-ID: Hi Feilong, we are going to upgrade our openstack installations asap. Thanks Ignazio Il giorno gio 4 lug 2019 alle ore 19:45 Feilong Wang < feilong at catalyst.net.nz> ha scritto: > Hi Ignazio, > > We fixed a lot of issues in Rocky and Stein, which some of them can't be > easily backported to Queens. Magnum has a very loose dependency for other > services, so I would suggest to use rocky or stein if it's possible for you. > > As for your issue, the error means your kube-apiserver didn't start > successfully. You can take a look the cloud init log for more information. > > > On 4/07/19 4:37 AM, Ignazio Cassano wrote: > > Thanks Denis, but I think there is another problem: on kube muster port > 8080 is not listening, probably some services are note started > Regards > Ignazio > > Il giorno mer 3 lug 2019 alle ore 17:31 Denis Pascheka > ha scritto: > >> Hi Ignazio, >> >> in Queens there is an issue within Magnum which has been resolved in the >> Rocky release. >> Take a look at this file: >> https://github.com/openstack/magnum/blob/stable/rocky/magnum/drivers/common/templates/kubernetes/fragments/wc-notify-master.sh. >> >> The execution of the curl command in row 16 needs to be escaped with an >> backslash. You can achieve this by building your own magnum containers >> and >> adding an template override >> to >> it where you add your fixed/own wc-notify-master.sh script from the plugin >> directory >> . >> >> >> Best Regards, >> >> *Denis Pascheka* >> Cloud Architect >> >> t: +49 (69) 348 75 11 12 >> m: +49 (170) 495 6364 >> e: dp at servinga.com >> servinga GmbH >> Mainzer Landstr. 
351-353 >> 60326 Frankfurt >> >> >> >> * >> www.servinga.com * >> >> Amtsgericht Frankfurt am Main - HRB 91418 - Geschäftsführer Adam Lakota, >> Christian Lertes >> >> Ignazio Cassano hat am 3. Juli 2019 um 16:58 >> geschrieben: >> >> >> Hi All, >> I just installed openstack kolla queens with magnum but trying to create >> a kubernetes cluster the master nodes does not terminate installation: it >> loops with the following message: >> >> curl --silent http://127.0.0.1:8080/healthz >> + '[' ok = '' ']' >> + sleep 5 >> >> Anyone can help ? >> Best Regards >> Ignazio >> >> >> >> > -- > Cheers & Best regards, > Feilong Wang (王飞龙) > -------------------------------------------------------------------------- > Senior Cloud Software Engineer > Tel: +64-48032246 > Email: flwang at catalyst.net.nz > Catalyst IT Limited > Level 6, Catalyst House, 150 Willis Street, Wellington > -------------------------------------------------------------------------- > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: noname Type: image/png Size: 65536 bytes Desc: not available URL: From donny at fortnebula.com Thu Jul 11 20:10:43 2019 From: donny at fortnebula.com (Donny Davis) Date: Thu, 11 Jul 2019 16:10:43 -0400 Subject: [Nova] Instances can't be started after compute nodes unexpectedly shut down because of power outage In-Reply-To: References: Message-ID: You surely want to leave locking turned on. You may want to ask qemu-devel about the locking of a image file and how it works. This isn't really an Openstack issue, seems to be a layer below. Depending on how mission critical your VM's are, you could probably work around it by just passing in --force-share into the command openstack is trying to run. I cannot recommend this path, the best way is to find out how you remove the lock. On Thu, Jul 11, 2019 at 3:23 PM Gökhan IŞIK wrote: > In [1] it says "Image locking is added and enabled by default. Multiple > QEMU processes cannot write to the same image as long as the host supports > OFD or posix locking, unless options are specified otherwise." May be need > to do something on nova side. > > I run this command and get same error. Output is in > http://paste.openstack.org/show/754311/ > > İf I run qemu-img info instance-0000219b with -U , it doesn't give any > errors. > > [1] https://wiki.qemu.org/ChangeLog/2.10 > > Donny Davis , 11 Tem 2019 Per, 22:11 tarihinde şunu > yazdı: > >> Well that is interesting. If you look in your libvirt config directory >> (/etc/libvirt on Centos) you can get a little more info on what is being >> used for locking. >> >> Maybe strace can shed some light on it. 
Try something like >> >> strace -ttt -f qemu-img info >> /var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk >> >> >> >> >> >> On Thu, Jul 11, 2019 at 2:39 PM Gökhan IŞIK >> wrote: >> >>> I run virsh list --all command and output is below: >>> >>> root at compute06:~# virsh list --all >>> Id Name State >>> ---------------------------------------------------- >>> - instance-000012f9 shut off >>> - instance-000013b6 shut off >>> - instance-000016fb shut off >>> - instance-0000190a shut off >>> - instance-00001a8a shut off >>> - instance-00001e05 shut off >>> - instance-0000202a shut off >>> - instance-00002135 shut off >>> - instance-00002141 shut off >>> - instance-000021b6 shut off >>> - instance-000021ec shut off >>> - instance-000023db shut off >>> - instance-00002ad7 shut off >>> >>> And also when I try start instances with virsh , output is below: >>> >>> root at compute06:~# virsh start instance-0000219b >>> error: Failed to start domain instance-000012f9 >>> error: internal error: process exited while connecting to monitor: >>> 2019-07-11T18:36:34.229534Z qemu-system-x86_64: -chardev >>> pty,id=charserial0,logfile=/dev/fdset/2,logappend=on: char device >>> redirected to /dev/pts/3 (label charserial0) >>> 2019-07-11T18:36:34.243395Z qemu-system-x86_64: -drive >>> file=/var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,discard=ignore: >>> Failed to get "write" lock >>> Is another process using the image? >>> >>> Thanks, >>> Gökhan >>> >>> Donny Davis , 11 Tem 2019 Per, 21:06 tarihinde >>> şunu yazdı: >>> >>>> Can you ssh to the hypervisor and run virsh list to make sure your >>>> instances are in fact down? >>>> >>>> On Thu, Jul 11, 2019 at 3:02 AM Gökhan IŞIK >>>> wrote: >>>> >>>>> Can anyone help me please ? I can no't rescue my instances yet :( >>>>> >>>>> Thanks, >>>>> Gökhan >>>>> >>>>> Gökhan IŞIK , 9 Tem 2019 Sal, 15:46 >>>>> tarihinde şunu yazdı: >>>>> >>>>>> Hi folks, >>>>>> Because of power outage, Most of our compute nodes unexpectedly >>>>>> shut down and now I can not start our instances. Error message is "Failed >>>>>> to get "write" lock another process using the image?". Instances Power >>>>>> status is No State. Full error log is >>>>>> http://paste.openstack.org/show/754107/. My environment is OpenStack >>>>>> Pike on Ubuntu 16.04 LTS servers and Instances are on a nfs shared storage. >>>>>> Nova version is 16.1.6.dev2. qemu version is 2.10.1. libvirt version is >>>>>> 3.6.0. I saw a commit [1], but it doesn't solve this problem. >>>>>> There are important instances on my environment. How can I rescue my >>>>>> instances? What would you suggest ? >>>>>> >>>>>> Thanks, >>>>>> Gökhan >>>>>> >>>>>> [1] https://review.opendev.org/#/c/509774/ >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Thu Jul 11 20:16:11 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Thu, 11 Jul 2019 15:16:11 -0500 Subject: Cinder issue with DataCore In-Reply-To: References: Message-ID: <20190711201611.GA26823@sm-workstation> On Thu, Jul 11, 2019 at 04:07:49PM +0100, Grant Morley wrote: > Hi All, > > We are trying to test DataCore storage backend for cinder ( running on > Queens ). We have everything installed and have the cinder config all setup. 
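A quick sanity check that often narrows this kind of cinder-volume startup failure down is confirming that the DataCore driver modules and their websocket dependency import cleanly from inside the cinder venv rather than the system Python. The interpreter path below is only an assumption based on the openstack-ansible venv layout quoted elsewhere in this thread -- adjust it to wherever the cinder venv's interpreter actually lives:

  # hedged sketch: the venv path is an assumption; the driver module path matches
  # the directory listing quoted in this thread
  /openstack/venvs/cinder-17.1.2/bin/python -c "import websocket; print(websocket.__version__)"
  /openstack/venvs/cinder-17.1.2/bin/python -c "from cinder.volume.drivers.datacore import iscsi, fc"
  # an ImportError from either command usually means the package was installed
  # outside the venv, which would stop the backend from loading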
> However whenever we try and start the "cinder-volume" service, we get the > following error: > Just a word of warning so you don't have any bad surprises later - the DataCore driver was no longer being maintained so it was deprecated in the Rocky release and removed in Stein. If I remember correctly, they actually stopped running third party CI to test their driver in Queens, but we didn't catch it in time to mark it deprecated in that release. It could very well be fine for Queens, I just don't have any data showing that for sure. > > We have the "websocket-client" installed also: > > pip freeze | grep websocket > websocket-client==0.44.0 Make sure you have it installed in your venv. Try this with: /openstack/venvs/cinder-17.1.2/lib/python2.7/bin/pip freeze | grep websocket > > The datacore libraries also appear to be available in our venvs dirs: > > ls /openstack/venvs/cinder-17.1.2/lib/python2.7/site-packages/cinder/volume/drivers/datacore > api.py  driver.py  exception.py  fc.py  __init__.py  iscsi.py passwd.py  > utils.py > > We are a bit stumped at the moment and wondered if anyone knew what might be > causing the error? We have managed to get Ceph and SolidFire working fine. > > Regards, > > -- > > > Grant Morley > Cloud Lead, Civo Ltd > www.civo.com | Signup for an account! > From cdent+os at anticdent.org Thu Jul 11 21:25:50 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 11 Jul 2019 22:25:50 +0100 (BST) Subject: [placement][ptg] Shanghai attendance In-Reply-To: <11798408-977b-ecdd-2730-4e7dfb16f38d@fried.cc> References: <11798408-977b-ecdd-2730-4e7dfb16f38d@fried.cc> Message-ID: On Thu, 11 Jul 2019, Eric Fried wrote: >> asked for feedback from the rest of the placement team on whether we >> need to do any placement work there, given the value we got out of >> the virtual pre-PTG in April. >> >> Let me ask again: Do we need a presence at the PTG in Shanghai? > > The email-based virtual pre-PTG was extremely productive. We should > definitely do it again. > > However, I feel like that last hour on Saturday saw more design progress > than weeks worth of emails or spec reviews could have accomplished. To > me, this is the value of the in-person meetups. It sucks that we can't > involve everybody in the discussions - but we can rapidly crystallize an > idea enough to produce a coherent spec that *can* then involve everybody. I don't dispute that, and if sufficient people are there, then I hope that such conversations can happen. However, if it's just that hour that is useful, then we don't need to have placement attend in a formal fashion. Spare rooms and a little forethought plus serendipity will do the trick. This is especially the case for Shanghai if the same thing that was true in Denver remains so: most people will be engaged with other meetings and projects [1]. > Having said that, I'm really not coming up with any major design topics > that would benefit from such a meetup this time around. I feel like what > we've accomplished in Train sets us up for a cycle or two of refinement > (perf/docs/refactor/tech-debt) rather than feature work. Yes. It might even be possible to call it mature and close to done. That's what it is if nobody is clamoring for features. [1] BTW: I think this is a good thing. If placement requires three days of 30 people who only work on that talking at one another, we're doing placement completely wrong. 
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From cdent+os at anticdent.org Thu Jul 11 21:32:30 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Thu, 11 Jul 2019 22:32:30 +0100 (BST) Subject: [placement][ptg] Shanghai attendance In-Reply-To: <98d51bb7ad7d34268927e0fdb47fb8f31c6af6cb.camel@redhat.com> References: <11798408-977b-ecdd-2730-4e7dfb16f38d@fried.cc> <98d51bb7ad7d34268927e0fdb47fb8f31c6af6cb.camel@redhat.com> Message-ID: On Thu, 11 Jul 2019, Sean Mooney wrote: > well if ye do a virtual email based pre-ptg again why not also continue that virtual > concept and consider a google hangout or somehting to allow realtime discussion with video/etherpad. > i personally found the email discussion hard to follow vs an etherpad or gerrit review per topic > but it did have pluses too. Let's work this out closer to the time. If it's what people want we can certainly do it. I'd vote against it, but definitely not block it. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From fungi at yuggoth.org Thu Jul 11 21:55:48 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 11 Jul 2019 21:55:48 +0000 Subject: [placement][ptg] Shanghai attendance In-Reply-To: References: <11798408-977b-ecdd-2730-4e7dfb16f38d@fried.cc> <98d51bb7ad7d34268927e0fdb47fb8f31c6af6cb.camel@redhat.com> Message-ID: <20190711215548.n2idwnmtzdzov6pf@yuggoth.org> On 2019-07-11 22:32:30 +0100 (+0100), Chris Dent wrote: > On Thu, 11 Jul 2019, Sean Mooney wrote: > > > well if ye do a virtual email based pre-ptg again why not also > > continue that virtual concept and consider a google hangout or > > somehting to allow realtime discussion with video/etherpad. i > > personally found the email discussion hard to follow vs an > > etherpad or gerrit review per topic but it did have pluses too. > > Let's work this out closer to the time. If it's what people want we > can certainly do it. I'd vote against it, but definitely not block > it. Or pop some popcorn and sit back... they're going to have a devil of a time getting to Google Hangouts from behind the GFWoC. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From donny at fortnebula.com Thu Jul 11 21:56:00 2019 From: donny at fortnebula.com (Donny Davis) Date: Thu, 11 Jul 2019 17:56:00 -0400 Subject: [Nova] Instances can't be started after compute nodes unexpectedly shut down because of power outage In-Reply-To: References: Message-ID: Of course you can also always just pull the disk images from the vm folders, merge them back with the base file, upload to glance and then relaunch the instances. You can give this method a spin with the lowest risk to your instances https://medium.com/@kumar_pravin/qemu-merge-snapshot-and-backing-file-into-standalone-disk-c8d3a2b17c0e On Thu, Jul 11, 2019 at 4:10 PM Donny Davis wrote: > You surely want to leave locking turned on. > > You may want to ask qemu-devel about the locking of a image file and how > it works. This isn't really an Openstack issue, seems to be a layer below. > > Depending on how mission critical your VM's are, you could probably work > around it by just passing in --force-share into the command openstack is > trying to run. > > I cannot recommend this path, the best way is to find out how you remove > the lock. > > > > > > > On Thu, Jul 11, 2019 at 3:23 PM Gökhan IŞIK > wrote: > >> In [1] it says "Image locking is added and enabled by default. 
Multiple >> QEMU processes cannot write to the same image as long as the host supports >> OFD or posix locking, unless options are specified otherwise." May be need >> to do something on nova side. >> >> I run this command and get same error. Output is in >> http://paste.openstack.org/show/754311/ >> >> İf I run qemu-img info instance-0000219b with -U , it doesn't give any >> errors. >> >> [1] https://wiki.qemu.org/ChangeLog/2.10 >> >> Donny Davis , 11 Tem 2019 Per, 22:11 tarihinde >> şunu yazdı: >> >>> Well that is interesting. If you look in your libvirt config directory >>> (/etc/libvirt on Centos) you can get a little more info on what is being >>> used for locking. >>> >>> Maybe strace can shed some light on it. Try something like >>> >>> strace -ttt -f qemu-img info >>> /var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk >>> >>> >>> >>> >>> >>> On Thu, Jul 11, 2019 at 2:39 PM Gökhan IŞIK >>> wrote: >>> >>>> I run virsh list --all command and output is below: >>>> >>>> root at compute06:~# virsh list --all >>>> Id Name State >>>> ---------------------------------------------------- >>>> - instance-000012f9 shut off >>>> - instance-000013b6 shut off >>>> - instance-000016fb shut off >>>> - instance-0000190a shut off >>>> - instance-00001a8a shut off >>>> - instance-00001e05 shut off >>>> - instance-0000202a shut off >>>> - instance-00002135 shut off >>>> - instance-00002141 shut off >>>> - instance-000021b6 shut off >>>> - instance-000021ec shut off >>>> - instance-000023db shut off >>>> - instance-00002ad7 shut off >>>> >>>> And also when I try start instances with virsh , output is below: >>>> >>>> root at compute06:~# virsh start instance-0000219b >>>> error: Failed to start domain instance-000012f9 >>>> error: internal error: process exited while connecting to monitor: >>>> 2019-07-11T18:36:34.229534Z qemu-system-x86_64: -chardev >>>> pty,id=charserial0,logfile=/dev/fdset/2,logappend=on: char device >>>> redirected to /dev/pts/3 (label charserial0) >>>> 2019-07-11T18:36:34.243395Z qemu-system-x86_64: -drive >>>> file=/var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,discard=ignore: >>>> Failed to get "write" lock >>>> Is another process using the image? >>>> >>>> Thanks, >>>> Gökhan >>>> >>>> Donny Davis , 11 Tem 2019 Per, 21:06 tarihinde >>>> şunu yazdı: >>>> >>>>> Can you ssh to the hypervisor and run virsh list to make sure your >>>>> instances are in fact down? >>>>> >>>>> On Thu, Jul 11, 2019 at 3:02 AM Gökhan IŞIK >>>>> wrote: >>>>> >>>>>> Can anyone help me please ? I can no't rescue my instances yet :( >>>>>> >>>>>> Thanks, >>>>>> Gökhan >>>>>> >>>>>> Gökhan IŞIK , 9 Tem 2019 Sal, 15:46 >>>>>> tarihinde şunu yazdı: >>>>>> >>>>>>> Hi folks, >>>>>>> Because of power outage, Most of our compute nodes unexpectedly >>>>>>> shut down and now I can not start our instances. Error message is "Failed >>>>>>> to get "write" lock another process using the image?". Instances Power >>>>>>> status is No State. Full error log is >>>>>>> http://paste.openstack.org/show/754107/. My environment is >>>>>>> OpenStack Pike on Ubuntu 16.04 LTS servers and Instances are on a nfs >>>>>>> shared storage. Nova version is 16.1.6.dev2. qemu version is 2.10.1. >>>>>>> libvirt version is 3.6.0. I saw a commit [1], but it doesn't solve this >>>>>>> problem. >>>>>>> There are important instances on my environment. How can I rescue my >>>>>>> instances? What would you suggest ? 
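For readers landing on this thread with the same symptom, a hedged diagnostic sketch -- the instance path is a placeholder, and whether a stale lock survives a crashed client at all depends on the NFS server's lock and lease recovery behaviour:

  # see whether any local process really still holds a lock on the disk file
  lslocks | grep /var/lib/nova/instances
  lsof /var/lib/nova/instances/<instance-uuid>/disk
  # read the image metadata without taking the write lock (-U / --force-share)
  qemu-img info -U /var/lib/nova/instances/<instance-uuid>/disk
  # if nothing local holds the lock, the leftover lock is most likely being
  # held on the NFS server side for the crashed client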
>>>>>>> >>>>>>> Thanks, >>>>>>> Gökhan >>>>>>> >>>>>>> [1] https://review.opendev.org/#/c/509774/ >>>>>>> >>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Thu Jul 11 23:33:09 2019 From: smooney at redhat.com (Sean Mooney) Date: Fri, 12 Jul 2019 00:33:09 +0100 Subject: [placement][ptg] Shanghai attendance In-Reply-To: <20190711215548.n2idwnmtzdzov6pf@yuggoth.org> References: <11798408-977b-ecdd-2730-4e7dfb16f38d@fried.cc> <98d51bb7ad7d34268927e0fdb47fb8f31c6af6cb.camel@redhat.com> <20190711215548.n2idwnmtzdzov6pf@yuggoth.org> Message-ID: On Thu, 2019-07-11 at 21:55 +0000, Jeremy Stanley wrote: > On 2019-07-11 22:32:30 +0100 (+0100), Chris Dent wrote: > > On Thu, 11 Jul 2019, Sean Mooney wrote: > > > > > well if ye do a virtual email based pre-ptg again why not also > > > continue that virtual concept and consider a google hangout or > > > somehting to allow realtime discussion with video/etherpad. i > > > personally found the email discussion hard to follow vs an > > > etherpad or gerrit review per topic but it did have pluses too. > > > > Let's work this out closer to the time. If it's what people want we > > can certainly do it. I'd vote against it, but definitely not block > > it. well it was more of a suggestion that if a.) hallway chats/ spare rooms was not enough and b.) realtime "virtual" face 2 face time was needed for some reason an online meeting could proably be tried a a fall back. if its not needed great. > > Or pop some popcorn and sit back... they're going to have a devil of > a time getting to Google Hangouts from behind the GFWoC. hehe ya i was actully thinking of still before the ptg. hangout during the ptg praobly wont be much of an option. From gagehugo at gmail.com Thu Jul 11 23:42:59 2019 From: gagehugo at gmail.com (Gage Hugo) Date: Thu, 11 Jul 2019 18:42:59 -0500 Subject: [security sig] Weekly Newsletter July 11th 2019 Message-ID: #Week of: 11 July 2019 - Security SIG Meeting Info: http://eavesdrop.openstack.org/#Security_SIG_meeting - Weekly on Thursday at 1500 UTC in #openstack-meeting - Agenda: https://etherpad.openstack.org/p/security-agenda - https://security.openstack.org/ - https://wiki.openstack.org/wiki/Security-SIG #Meeting Notes - Summary: http://eavesdrop.openstack.org/meetings/security/2019/security.2019-07-11-15.00.html - Image Encryption Pop-Up Team Meeting - The image encryption pop-up team has an official meeting! See http://eavesdrop.openstack.org/#Image_Encryption_Popup-Team_Meeting - Syntribos still in use - It appears that there is an occasional update to the Syntribos project (outside of maintenance patches). There will be another email send out to see if this project is still being used. - Security Guide Cleanup - nickthetait has been looking into cleaning up & updating the OpenStack Security Guide, yay! - Shanghai PTG Attendance - We have been asked to check how many people are planning on attending the PTG/Forum for the Security SIG. There will be another email sent out for this. # VMT Reports - A full list of publicly marked security issues can be found here: https://bugs.launchpad.net/ossa/ - No new public security bugs this week -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gagehugo at gmail.com Thu Jul 11 23:47:06 2019 From: gagehugo at gmail.com (Gage Hugo) Date: Thu, 11 Jul 2019 18:47:06 -0500 Subject: [Security SIG] Shanghai PTG Attendence Message-ID: Hello everyone, We have been asked to look into how many are planning on attending the PTG/Forum for the Security SIG. I have created an etherpad to collect a list of people who are planning on attending[0]. If you are interested in Security or the SIG, please update the etherpad. Thanks! [0] https://etherpad.openstack.org/p/security-shanghai-ptg-planning -------------- next part -------------- An HTML attachment was scrubbed... URL: From gagehugo at gmail.com Thu Jul 11 23:53:43 2019 From: gagehugo at gmail.com (Gage Hugo) Date: Thu, 11 Jul 2019 18:53:43 -0500 Subject: [Syntribos] Does anyone still use this? Message-ID: The Security SIG has recently been looking into updating several sites/documentation and one part we discussed the last couple meetings was the security tools listings[0]. One of the projects listed there, Syntribos, doesn't appear to have much activity[1] outside of zuul/python/docs maintenance and the idea of retiring the project was mentioned. However, there does seem to be an occasional bug-fix submitted, so we are sending this email out to see if anyone is still utilizing Syntribos. If so, please either respond here, reach out to us in the #openstack-security irc channel, or fill out the section in the Security SIG Agenda etherpad[2]. Thanks! [0] https://security.openstack.org/#security-tool-development [1] https://review.opendev.org/#/q/project:openstack/syntribos [2] https://etherpad.openstack.org/p/security-agenda -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Fri Jul 12 04:12:35 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 12 Jul 2019 06:12:35 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: Jay, for recovering vm state use the command nova reset-state.... nova help reset-state to check the command requested parameters. Ad far as evacuation la concerned, how many compute nodes do gli have ? Instance live migration works? Are gli using shared cinder storage? Ignazio Il Gio 11 Lug 2019 20:51 Jay See ha scritto: > Thanks for explanation Ignazio. > > I have tried same same by trying to put the compute node on a failure > (echo 'c' > /proc/sysrq-trigger ). Compute node was stuck and I was not > able connect to it. > All the VMs are now in Error state. > > Running the host-evacaute was successful on controller node, but now I am > not able to use the VMs. Because they are all in error state now. > > root at h004:~$ nova host-evacuate h017 > > +--------------------------------------+-------------------+---------------+ > | Server UUID | Evacuate Accepted | Error Message > | > > +--------------------------------------+-------------------+---------------+ > | f3545f7d-b85e-49ee-b407-333a4c5b5ab9 | True | > | > | 9094494b-cfa3-459b-8d51-d9aae0ea9636 | True | > | > | abe7075b-ac22-4168-bf3d-d302ba37d80e | True | > | > | c9919371-5f2e-4155-a01a-5f41d9c8b0e7 | True | > | > | ffd983bb-851e-4314-9d1d-375303c278f3 | True | > | > > +--------------------------------------+-------------------+---------------+ > > Now I have restarted the compute node manually , now I am able to connect > to the compute node but VMs are still in Error state. > 1. Any ideas, how to recover the VMs? > 2. 
Are there any other methods to evacuate, as this method seems to be not > working in mitaka version. > > ~Jay. > > On Thu, Jul 11, 2019 at 1:33 PM Ignazio Cassano > wrote: > >> Ok Jay, >> let me to describe my environment. >> I have an openstack made up of 3 controllers nodes ad several compute >> nodes. >> The controller nodes services are controlled by pacemaker and the compute >> nodes services are controlled by remote pacemaker. >> My hardware is Dell so I am using ipmi fencing device . >> I wrote a service controlled by pacemaker: >> this service controls if a compude node fails and for avoiding split >> brains if a compute node does nod respond on the management network and on >> storage network the stonith poweroff the node and then execute a nova >> host-evacuate. >> >> Anycase to have a simulation before writing the service I described above >> you can do as follows: >> >> connect on one compute node where some virtual machines are running >> run the command: echo 'c' > /proc/sysrq-trigger (it stops immediately the >> node like in case of failure) >> On a controller node run: nova host-evacuate "name of failed compute >> node" >> Instances running on the failed compute node should be restarted on >> another compute node >> >> >> Ignazio >> >> Il giorno gio 11 lug 2019 alle ore 11:57 Jay See < >> jayachander.it at gmail.com> ha scritto: >> >>> Hi , >>> >>> I have tried on a failed compute node which is in power off state now. >>> I have tried on a running compute node, no errors. But nothing happens. >>> On running compute node - Disabled the compute service and tried >>> migration also. >>> >>> May be I might have not followed proper steps. Just wanted to know the >>> steps you have followed. Otherwise, I was planning to manual migration also >>> if possible. >>> ~Jay. >>> >>> On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Hi Jay, >>>> would you like to evacuate a failed compute node or evacuate a running >>>> compute node ? >>>> >>>> Ignazio >>>> >>>> Il giorno gio 11 lug 2019 alle ore 11:48 Jay See < >>>> jayachander.it at gmail.com> ha scritto: >>>> >>>>> Hi Ignazio, >>>>> >>>>> I am trying to evacuate the compute host on older version (mitaka). >>>>> Could please share the process you followed. I am not able to succeed >>>>> with openstack live-migration fails with error message (this is known issue >>>>> in older versions) and nova live-ligration - nothing happens even after >>>>> initiating VM migration. It is almost 4 days. >>>>> >>>>> ~Jay. >>>>> >>>>> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> I am sorry. >>>>>> For simulating an host crash I used a wrong procedure. >>>>>> Using "echo 'c' > /proc/sysrq-trigger" all work fine >>>>>> >>>>>> Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> ha scritto: >>>>>> >>>>>>> Hello All, >>>>>>> on ocata when I poweroff a node with active instance , doing a nova >>>>>>> host-evacuate works fine >>>>>>> and instances are restartd on an active node. 
>>>>>>> On queens it does non evacuate instances but nova-api reports for >>>>>>> each instance the following: >>>>>>> >>>>>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi >>>>>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 >>>>>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: >>>>>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is >>>>>>> in task_state powering-off >>>>>>> >>>>>>> So it poweroff all instance on the failed node but does not start >>>>>>> them on active nodes >>>>>>> >>>>>>> What is changed ? >>>>>>> Ignazio >>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> -- >>>>> ​ >>>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>>> necessary.* >>>>> >>>> >>> >>> -- >>> ​ >>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>> necessary.* >>> >> > > -- > ​ > P *SAVE PAPER – Please do not print this e-mail unless absolutely > necessary.* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jyotishri403 at gmail.com Fri Jul 12 04:16:01 2019 From: jyotishri403 at gmail.com (Jyoti Dahiwele) Date: Fri, 12 Jul 2019 09:46:01 +0530 Subject: Flow of instance creation Message-ID: Dear Everyone, Can you explain me the flow of instance creation in stein ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylightcoder at gmail.com Fri Jul 12 07:46:31 2019 From: skylightcoder at gmail.com (=?UTF-8?B?R8O2a2hhbiBJxZ5JSw==?=) Date: Fri, 12 Jul 2019 10:46:31 +0300 Subject: [Nova] Instances can't be started after compute nodes unexpectedly shut down because of power outage In-Reply-To: References: Message-ID: Awesome, thanks! Donny, I followed below steps and rescue my instance. 1. Find instance id and compute host root at infra1-utility-container-50bcf920:~# openstack server show 1d2e8a39-97ee-4ce7-a612-1b50f90cc51e -c id -c OS-EXT-SRV-ATTR:hypervisor_hostname +-------------------------------------+--------------------------------------+ | Field | Value | +-------------------------------------+--------------------------------------+ | OS-EXT-SRV-ATTR:hypervisor_hostname | compute06 | | id | 1d2e8a39-97ee-4ce7-a612-1b50f90cc51e | +-------------------------------------+--------------------------------------+ 2. Find image and backing image file on compute host root at compute06:~# qemu-img info -U --backing-chain /var/lib/nova/instances/1d2e8a39-97ee-4ce7-a612-1b50f90cc51e/disk image: /var/lib/nova/instances/1d2e8a39-97ee-4ce7-a612-1b50f90cc51e/disk file format: qcow2 virtual size: 160G (171798691840 bytes) disk size: 32G cluster_size: 65536 backing file: /var/lib/nova/instances/_base/a1960f539532979a591c5f837ad604eedd9c7323 Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 corrupt: false image: /var/lib/nova/instances/_base/a1960f539532979a591c5f837ad604eedd9c7323 file format: raw virtual size: 160G (171798691840 bytes) disk size: 18G 3. Copy image and backing image file root at compute06:~# cp /var/lib/nova/instances/1d2e8a39-97ee-4ce7-a612-1b50f90cc51e/disk master root at compute06:~# cp /var/lib/nova/instances/_base/a1960f539532979a591c5f837ad604eedd9c7323 new-master 4. 
Rebase the image file that was backed off the original file so that it uses the new file i.e new-master then commit those changes back to original file master back into the new base new-master root at compute06:~# qemu-img rebase -b new-master -U master root at compute06:~# qemu-img commit master root at compute06:~# qemu-img info new-master 5. Convert raw image to qcow2 root at compute06:~# qemu-img convert -f raw -O qcow2 new-master new-master.qcow2 6. Time to upload glance and then launch instance from this image :) Thanks, Gökhan. Donny Davis , 12 Tem 2019 Cum, 00:56 tarihinde şunu yazdı: > Of course you can also always just pull the disk images from the vm > folders, merge them back with the base file, upload to glance and then > relaunch the instances. > > You can give this method a spin with the lowest risk to your instances > > > https://medium.com/@kumar_pravin/qemu-merge-snapshot-and-backing-file-into-standalone-disk-c8d3a2b17c0e > > > > > > On Thu, Jul 11, 2019 at 4:10 PM Donny Davis wrote: > >> You surely want to leave locking turned on. >> >> You may want to ask qemu-devel about the locking of a image file and how >> it works. This isn't really an Openstack issue, seems to be a layer below. >> >> Depending on how mission critical your VM's are, you could probably work >> around it by just passing in --force-share into the command openstack is >> trying to run. >> >> I cannot recommend this path, the best way is to find out how you remove >> the lock. >> >> >> >> >> >> >> On Thu, Jul 11, 2019 at 3:23 PM Gökhan IŞIK >> wrote: >> >>> In [1] it says "Image locking is added and enabled by default. Multiple >>> QEMU processes cannot write to the same image as long as the host supports >>> OFD or posix locking, unless options are specified otherwise." May be need >>> to do something on nova side. >>> >>> I run this command and get same error. Output is in >>> http://paste.openstack.org/show/754311/ >>> >>> İf I run qemu-img info instance-0000219b with -U , it doesn't give any >>> errors. >>> >>> [1] https://wiki.qemu.org/ChangeLog/2.10 >>> >>> Donny Davis , 11 Tem 2019 Per, 22:11 tarihinde >>> şunu yazdı: >>> >>>> Well that is interesting. If you look in your libvirt config directory >>>> (/etc/libvirt on Centos) you can get a little more info on what is being >>>> used for locking. >>>> >>>> Maybe strace can shed some light on it. 
Try something like >>>> >>>> strace -ttt -f qemu-img info >>>> /var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk >>>> >>>> >>>> >>>> >>>> >>>> On Thu, Jul 11, 2019 at 2:39 PM Gökhan IŞIK >>>> wrote: >>>> >>>>> I run virsh list --all command and output is below: >>>>> >>>>> root at compute06:~# virsh list --all >>>>> Id Name State >>>>> ---------------------------------------------------- >>>>> - instance-000012f9 shut off >>>>> - instance-000013b6 shut off >>>>> - instance-000016fb shut off >>>>> - instance-0000190a shut off >>>>> - instance-00001a8a shut off >>>>> - instance-00001e05 shut off >>>>> - instance-0000202a shut off >>>>> - instance-00002135 shut off >>>>> - instance-00002141 shut off >>>>> - instance-000021b6 shut off >>>>> - instance-000021ec shut off >>>>> - instance-000023db shut off >>>>> - instance-00002ad7 shut off >>>>> >>>>> And also when I try start instances with virsh , output is below: >>>>> >>>>> root at compute06:~# virsh start instance-0000219b >>>>> error: Failed to start domain instance-000012f9 >>>>> error: internal error: process exited while connecting to monitor: >>>>> 2019-07-11T18:36:34.229534Z qemu-system-x86_64: -chardev >>>>> pty,id=charserial0,logfile=/dev/fdset/2,logappend=on: char device >>>>> redirected to /dev/pts/3 (label charserial0) >>>>> 2019-07-11T18:36:34.243395Z qemu-system-x86_64: -drive >>>>> file=/var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,discard=ignore: >>>>> Failed to get "write" lock >>>>> Is another process using the image? >>>>> >>>>> Thanks, >>>>> Gökhan >>>>> >>>>> Donny Davis , 11 Tem 2019 Per, 21:06 tarihinde >>>>> şunu yazdı: >>>>> >>>>>> Can you ssh to the hypervisor and run virsh list to make sure your >>>>>> instances are in fact down? >>>>>> >>>>>> On Thu, Jul 11, 2019 at 3:02 AM Gökhan IŞIK >>>>>> wrote: >>>>>> >>>>>>> Can anyone help me please ? I can no't rescue my instances yet :( >>>>>>> >>>>>>> Thanks, >>>>>>> Gökhan >>>>>>> >>>>>>> Gökhan IŞIK , 9 Tem 2019 Sal, 15:46 >>>>>>> tarihinde şunu yazdı: >>>>>>> >>>>>>>> Hi folks, >>>>>>>> Because of power outage, Most of our compute nodes unexpectedly >>>>>>>> shut down and now I can not start our instances. Error message is "Failed >>>>>>>> to get "write" lock another process using the image?". Instances Power >>>>>>>> status is No State. Full error log is >>>>>>>> http://paste.openstack.org/show/754107/. My environment is >>>>>>>> OpenStack Pike on Ubuntu 16.04 LTS servers and Instances are on a nfs >>>>>>>> shared storage. Nova version is 16.1.6.dev2. qemu version is 2.10.1. >>>>>>>> libvirt version is 3.6.0. I saw a commit [1], but it doesn't solve this >>>>>>>> problem. >>>>>>>> There are important instances on my environment. How can I rescue >>>>>>>> my instances? What would you suggest ? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Gökhan >>>>>>>> >>>>>>>> [1] https://review.opendev.org/#/c/509774/ >>>>>>>> >>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From grant at civo.com Fri Jul 12 08:28:06 2019 From: grant at civo.com (Grant Morley) Date: Fri, 12 Jul 2019 09:28:06 +0100 Subject: Cinder issue with DataCore In-Reply-To: <20190711201611.GA26823@sm-workstation> References: <20190711201611.GA26823@sm-workstation> Message-ID: Hi Sean, Thanks for that. Luckily we can give the device back and we haven't paid anything for it either. 
I'll stop trying to get it working now then :) Regards, On 11/07/2019 21:16, Sean McGinnis wrote: > On Thu, Jul 11, 2019 at 04:07:49PM +0100, Grant Morley wrote: >> Hi All, >> >> We are trying to test DataCore storage backend for cinder ( running on >> Queens ). We have everything installed and have the cinder config all setup. >> However whenever we try and start the "cinder-volume" service, we get the >> following error: >> > Just a word of warning so you don't have any bad surprises later - the DataCore > driver was no longer being maintained so it was deprecated in the Rocky release > and removed in Stein. > > If I remember correctly, they actually stopped running third party CI to test > their driver in Queens, but we didn't catch it in time to mark it deprecated in > that release. It could very well be fine for Queens, I just don't have any data > showing that for sure. > >> We have the "websocket-client" installed also: >> >> pip freeze | grep websocket >> websocket-client==0.44.0 > Make sure you have it installed in your venv. Try this with: > > /openstack/venvs/cinder-17.1.2/lib/python2.7/bin/pip freeze | grep websocket > >> The datacore libraries also appear to be available in our venvs dirs: >> >> ls /openstack/venvs/cinder-17.1.2/lib/python2.7/site-packages/cinder/volume/drivers/datacore >> api.py  driver.py  exception.py  fc.py  __init__.py  iscsi.py passwd.py >> utils.py >> >> We are a bit stumped at the moment and wondered if anyone knew what might be >> causing the error? We have managed to get Ceph and SolidFire working fine. >> >> Regards, >> >> -- >> >> >> Grant Morley >> Cloud Lead, Civo Ltd >> www.civo.com | Signup for an account! >> -- Grant Morley Cloud Lead, Civo Ltd www.civo.com | Signup for an account! -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfinucan at redhat.com Fri Jul 12 09:12:37 2019 From: sfinucan at redhat.com (Stephen Finucane) Date: Fri, 12 Jul 2019 10:12:37 +0100 Subject: [nova][ec2] Removal of unused utility methods from 'nova.api.ec2' Message-ID: <4f01bfc7bd76a37e7bdb1630e6a1ee8fc767cec5.camel@redhat.com> I have a couple of patches up to remove what looks like a large chunk of unused ec2 API code from nova. The patches start at [1] and notes are provided in earlier revisions of those reviews ([2], [3], [4]) about why I'm able to remove some things and not others. I have a DNM patch proposed against the ec2-api [5] to ensure that things really don't break there but in case someone is maintaining something outside of opendev.org (I'd have found it on codesearch.openstack.org) that really needs these, you need to speak up on the review asap. If not, we'll get going with this. Stephen PS: To be clear, ec2-api *will still work*. None of the things we're removing are used by that project or we wouldn't be removing them :) [1] https://review.opendev.org/#/c/662501/ [2] https://review.opendev.org/#/c/662501/1/nova/api/ec2/ec2utils.py [3] https://review.opendev.org/#/c/662502/1/nova/objects/ec2.py [4] https://review.opendev.org/#/c/662503/1/nova/api/ec2/cloud.py [5] https://review.opendev.org/#/c/663386/ From jayachander.it at gmail.com Fri Jul 12 09:26:00 2019 From: jayachander.it at gmail.com (Jay See) Date: Fri, 12 Jul 2019 11:26:00 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: Ignazio, One instance is stuck in error state not able to recover it. All other instances are running now. 
root at h004:~$ nova reset-state --all-tenants my-instance-1-2 Reset state for server my-instance-1-2 succeeded; new state is error I have several compute nodes (14). I am not sure what is gli? Live migration is not working, i have tried it was not throwing any errors. But nothing seems to happen. I am not completely sure, I haven't heard about gli before. (This setup is deployed by someone else). ~Jay. On Fri, Jul 12, 2019 at 6:12 AM Ignazio Cassano wrote: > Jay, for recovering vm state use the command nova reset-state.... > > nova help reset-state to check the command requested parameters. > > Ad far as evacuation la concerned, how many compute nodes do gli have ? > Instance live migration works? > Are gli using shared cinder storage? > Ignazio > > Il Gio 11 Lug 2019 20:51 Jay See ha scritto: > >> Thanks for explanation Ignazio. >> >> I have tried same same by trying to put the compute node on a failure >> (echo 'c' > /proc/sysrq-trigger ). Compute node was stuck and I was not >> able connect to it. >> All the VMs are now in Error state. >> >> Running the host-evacaute was successful on controller node, but now I am >> not able to use the VMs. Because they are all in error state now. >> >> root at h004:~$ nova host-evacuate h017 >> >> +--------------------------------------+-------------------+---------------+ >> | Server UUID | Evacuate Accepted | Error >> Message | >> >> +--------------------------------------+-------------------+---------------+ >> | f3545f7d-b85e-49ee-b407-333a4c5b5ab9 | True | >> | >> | 9094494b-cfa3-459b-8d51-d9aae0ea9636 | True | >> | >> | abe7075b-ac22-4168-bf3d-d302ba37d80e | True | >> | >> | c9919371-5f2e-4155-a01a-5f41d9c8b0e7 | True | >> | >> | ffd983bb-851e-4314-9d1d-375303c278f3 | True | >> | >> >> +--------------------------------------+-------------------+---------------+ >> >> Now I have restarted the compute node manually , now I am able to connect >> to the compute node but VMs are still in Error state. >> 1. Any ideas, how to recover the VMs? >> 2. Are there any other methods to evacuate, as this method seems to be >> not working in mitaka version. >> >> ~Jay. >> >> On Thu, Jul 11, 2019 at 1:33 PM Ignazio Cassano >> wrote: >> >>> Ok Jay, >>> let me to describe my environment. >>> I have an openstack made up of 3 controllers nodes ad several compute >>> nodes. >>> The controller nodes services are controlled by pacemaker and the >>> compute nodes services are controlled by remote pacemaker. >>> My hardware is Dell so I am using ipmi fencing device . >>> I wrote a service controlled by pacemaker: >>> this service controls if a compude node fails and for avoiding split >>> brains if a compute node does nod respond on the management network and on >>> storage network the stonith poweroff the node and then execute a nova >>> host-evacuate. >>> >>> Anycase to have a simulation before writing the service I described >>> above you can do as follows: >>> >>> connect on one compute node where some virtual machines are running >>> run the command: echo 'c' > /proc/sysrq-trigger (it stops immediately >>> the node like in case of failure) >>> On a controller node run: nova host-evacuate "name of failed compute >>> node" >>> Instances running on the failed compute node should be restarted on >>> another compute node >>> >>> >>> Ignazio >>> >>> Il giorno gio 11 lug 2019 alle ore 11:57 Jay See < >>> jayachander.it at gmail.com> ha scritto: >>> >>>> Hi , >>>> >>>> I have tried on a failed compute node which is in power off state now. 
>>>> I have tried on a running compute node, no errors. But nothing happens. >>>> On running compute node - Disabled the compute service and tried >>>> migration also. >>>> >>>> May be I might have not followed proper steps. Just wanted to know the >>>> steps you have followed. Otherwise, I was planning to manual migration also >>>> if possible. >>>> ~Jay. >>>> >>>> On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Hi Jay, >>>>> would you like to evacuate a failed compute node or evacuate a running >>>>> compute node ? >>>>> >>>>> Ignazio >>>>> >>>>> Il giorno gio 11 lug 2019 alle ore 11:48 Jay See < >>>>> jayachander.it at gmail.com> ha scritto: >>>>> >>>>>> Hi Ignazio, >>>>>> >>>>>> I am trying to evacuate the compute host on older version (mitaka). >>>>>> Could please share the process you followed. I am not able to succeed >>>>>> with openstack live-migration fails with error message (this is known issue >>>>>> in older versions) and nova live-ligration - nothing happens even after >>>>>> initiating VM migration. It is almost 4 days. >>>>>> >>>>>> ~Jay. >>>>>> >>>>>> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> I am sorry. >>>>>>> For simulating an host crash I used a wrong procedure. >>>>>>> Using "echo 'c' > /proc/sysrq-trigger" all work fine >>>>>>> >>>>>>> Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> ha scritto: >>>>>>> >>>>>>>> Hello All, >>>>>>>> on ocata when I poweroff a node with active instance , doing a >>>>>>>> nova host-evacuate works fine >>>>>>>> and instances are restartd on an active node. >>>>>>>> On queens it does non evacuate instances but nova-api reports for >>>>>>>> each instance the following: >>>>>>>> >>>>>>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi >>>>>>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 >>>>>>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: >>>>>>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is >>>>>>>> in task_state powering-off >>>>>>>> >>>>>>>> So it poweroff all instance on the failed node but does not start >>>>>>>> them on active nodes >>>>>>>> >>>>>>>> What is changed ? >>>>>>>> Ignazio >>>>>>>> >>>>>>>> >>>>>>>> >>>>>> >>>>>> -- >>>>>> ​ >>>>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>>>> necessary.* >>>>>> >>>>> >>>> >>>> -- >>>> ​ >>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>> necessary.* >>>> >>> >> >> -- >> ​ >> P *SAVE PAPER – Please do not print this e-mail unless absolutely >> necessary.* >> > -- ​ P *SAVE PAPER – Please do not print this e-mail unless absolutely necessary.* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Fri Jul 12 09:52:49 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 12 Jul 2019 11:52:49 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: Sorry ...the question was : how many compute nodes do you have ? instead of how many compute nodes do gli have... Anycase; Did you configured cinder ? Il giorno ven 12 lug 2019 alle ore 11:26 Jay See ha scritto: > Ignazio, > > One instance is stuck in error state not able to recover it. All other > instances are running now. 
> > root at h004:~$ nova reset-state --all-tenants my-instance-1-2 > Reset state for server my-instance-1-2 succeeded; new state is error > > I have several compute nodes (14). I am not sure what is gli? > Live migration is not working, i have tried it was not throwing any > errors. But nothing seems to happen. > I am not completely sure, I haven't heard about gli before. (This setup is > deployed by someone else). > > ~Jay. > > On Fri, Jul 12, 2019 at 6:12 AM Ignazio Cassano > wrote: > >> Jay, for recovering vm state use the command nova reset-state.... >> >> nova help reset-state to check the command requested parameters. >> >> Ad far as evacuation la concerned, how many compute nodes do gli have ? >> Instance live migration works? >> Are gli using shared cinder storage? >> Ignazio >> >> Il Gio 11 Lug 2019 20:51 Jay See ha scritto: >> >>> Thanks for explanation Ignazio. >>> >>> I have tried same same by trying to put the compute node on a failure >>> (echo 'c' > /proc/sysrq-trigger ). Compute node was stuck and I was not >>> able connect to it. >>> All the VMs are now in Error state. >>> >>> Running the host-evacaute was successful on controller node, but now I >>> am not able to use the VMs. Because they are all in error state now. >>> >>> root at h004:~$ nova host-evacuate h017 >>> >>> +--------------------------------------+-------------------+---------------+ >>> | Server UUID | Evacuate Accepted | Error >>> Message | >>> >>> +--------------------------------------+-------------------+---------------+ >>> | f3545f7d-b85e-49ee-b407-333a4c5b5ab9 | True | >>> | >>> | 9094494b-cfa3-459b-8d51-d9aae0ea9636 | True | >>> | >>> | abe7075b-ac22-4168-bf3d-d302ba37d80e | True | >>> | >>> | c9919371-5f2e-4155-a01a-5f41d9c8b0e7 | True | >>> | >>> | ffd983bb-851e-4314-9d1d-375303c278f3 | True | >>> | >>> >>> +--------------------------------------+-------------------+---------------+ >>> >>> Now I have restarted the compute node manually , now I am able to >>> connect to the compute node but VMs are still in Error state. >>> 1. Any ideas, how to recover the VMs? >>> 2. Are there any other methods to evacuate, as this method seems to be >>> not working in mitaka version. >>> >>> ~Jay. >>> >>> On Thu, Jul 11, 2019 at 1:33 PM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Ok Jay, >>>> let me to describe my environment. >>>> I have an openstack made up of 3 controllers nodes ad several compute >>>> nodes. >>>> The controller nodes services are controlled by pacemaker and the >>>> compute nodes services are controlled by remote pacemaker. >>>> My hardware is Dell so I am using ipmi fencing device . >>>> I wrote a service controlled by pacemaker: >>>> this service controls if a compude node fails and for avoiding split >>>> brains if a compute node does nod respond on the management network and on >>>> storage network the stonith poweroff the node and then execute a nova >>>> host-evacuate. 
>>>> >>>> Anycase to have a simulation before writing the service I described >>>> above you can do as follows: >>>> >>>> connect on one compute node where some virtual machines are running >>>> run the command: echo 'c' > /proc/sysrq-trigger (it stops immediately >>>> the node like in case of failure) >>>> On a controller node run: nova host-evacuate "name of failed compute >>>> node" >>>> Instances running on the failed compute node should be restarted on >>>> another compute node >>>> >>>> >>>> Ignazio >>>> >>>> Il giorno gio 11 lug 2019 alle ore 11:57 Jay See < >>>> jayachander.it at gmail.com> ha scritto: >>>> >>>>> Hi , >>>>> >>>>> I have tried on a failed compute node which is in power off state now. >>>>> I have tried on a running compute node, no errors. But nothing happens. >>>>> On running compute node - Disabled the compute service and tried >>>>> migration also. >>>>> >>>>> May be I might have not followed proper steps. Just wanted to know the >>>>> steps you have followed. Otherwise, I was planning to manual migration also >>>>> if possible. >>>>> ~Jay. >>>>> >>>>> On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Hi Jay, >>>>>> would you like to evacuate a failed compute node or evacuate a >>>>>> running compute node ? >>>>>> >>>>>> Ignazio >>>>>> >>>>>> Il giorno gio 11 lug 2019 alle ore 11:48 Jay See < >>>>>> jayachander.it at gmail.com> ha scritto: >>>>>> >>>>>>> Hi Ignazio, >>>>>>> >>>>>>> I am trying to evacuate the compute host on older version (mitaka). >>>>>>> Could please share the process you followed. I am not able to >>>>>>> succeed with openstack live-migration fails with error message (this is >>>>>>> known issue in older versions) and nova live-ligration - nothing happens >>>>>>> even after initiating VM migration. It is almost 4 days. >>>>>>> >>>>>>> ~Jay. >>>>>>> >>>>>>> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> I am sorry. >>>>>>>> For simulating an host crash I used a wrong procedure. >>>>>>>> Using "echo 'c' > /proc/sysrq-trigger" all work fine >>>>>>>> >>>>>>>> Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com> ha scritto: >>>>>>>> >>>>>>>>> Hello All, >>>>>>>>> on ocata when I poweroff a node with active instance , doing a >>>>>>>>> nova host-evacuate works fine >>>>>>>>> and instances are restartd on an active node. >>>>>>>>> On queens it does non evacuate instances but nova-api reports for >>>>>>>>> each instance the following: >>>>>>>>> >>>>>>>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi >>>>>>>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 >>>>>>>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: >>>>>>>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is >>>>>>>>> in task_state powering-off >>>>>>>>> >>>>>>>>> So it poweroff all instance on the failed node but does not start >>>>>>>>> them on active nodes >>>>>>>>> >>>>>>>>> What is changed ? 
>>>>>>>>> Ignazio >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> ​ >>>>>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>>>>> necessary.* >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> ​ >>>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>>> necessary.* >>>>> >>>> >>> >>> -- >>> ​ >>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>> necessary.* >>> >> > > -- > ​ > P *SAVE PAPER – Please do not print this e-mail unless absolutely > necessary.* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Fri Jul 12 10:00:01 2019 From: cdent+os at anticdent.org (Chris Dent) Date: Fri, 12 Jul 2019 11:00:01 +0100 (BST) Subject: [placement] update 19-27 Message-ID: HTML: https://anticdent.org/placement-update-19-27.html Pupdate 19-27 is here and now. # Most Important Of the features we planned to do this cycle, all are done save one: consumer types (in progress, see below). This means we have a good opportunity to focus on documentation, performance, and improving the codebase for maintainability. You do not need permission to work on these things. If you find a problem and know how to fix it, fix it. If you are not sure about the solution, please discuss it on this email list or in the `#openstack-placement` IRC channel. This also means we're in a good position to help review changes that use placement in other projects. The Foundation needs to know how much, if any, Placement time will be needed in Shanghai. I started a [thread](http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007654.html) and an [etherpad](https://etherpad.openstack.org/p/placement-shanghai-ptg). # What's Changed * The `same_subtree` query parameter has merged as [microversion 1.36](https://docs.openstack.org/placement/latest/placement-api-microversion-history.html#support-same-subtree-queryparam-on-get-allocation-candidates). This enables a form of nested provider affinity: "these providers must all share the same ancestor". * The placement projects have been updated (pending merge) to match the [Python 3 test runtimes for Train](https://governance.openstack.org/tc/goals/train/python3-updates.html) community wide goal. Since the projects have been Python3-enabled from the start, this was mostly a matter of aligning configurations with community-wide norms. When Python 3.8 becomes available we should get on that sooner than later to catch issues early. # Specs/Features All placement specs have merged. Thanks to everyone for the frequent reviews and quick followups. Some non-placement specs are listed in the Other section below. # Stories/Bugs (Numbers in () are the change since the last pupdate.) There are 23 (0) stories in [the placement group](https://storyboard.openstack.org/#!/project_group/placement). 0 (0) are [untagged](https://storyboard.openstack.org/#!/worklist/580). 3 (1) are [bugs](https://storyboard.openstack.org/#!/worklist/574). 5 (0) are [cleanups](https://storyboard.openstack.org/#!/worklist/575). 11 (0) are [rfes](https://storyboard.openstack.org/#!/worklist/594). 4 (0) are [docs](https://storyboard.openstack.org/#!/worklist/637). If you're interested in helping out with placement, those stories are good places to look. * Placement related nova [bugs not yet in progress](https://goo.gl/TgiPXb) on launchpad: 16 (0). * Placement related nova [in progress bugs](https://goo.gl/vzGGDQ) on launchpad: 4 (0). # osc-placement osc-placement is currently behind by 11 microversions. 
* Add support for multiple member_of. This is stuck and needs input from users of the tool. # Main Themes ## Consumer Types Adding a type to consumers will allow them to be grouped for various purposes, including quota accounting. * This is currently the migration to add the necessary table and column. ## Cleanup Cleanup is an overarching theme related to improving documentation, performance and the maintainability of the code. The changes we are making this cycle are fairly complex to use and are fairly complex to write, so it is good that we're going to have plenty of time to clean and clarify all these things. As mentioned last week, one of the important cleanup tasks that is not yet in progress is updating the [gabbit](https://opendev.org/openstack/placement/src/branch/master/gate/gabbits/nested-perfload.yaml) that creates the nested topology that's used in nested performance testing. The topology there is simple, unrealistic, and doesn't sufficiently exercise the several features that may be used during a query that desires a nested response. This needs to be someone who is more closely related to real world use of nested than me. efried? gibi? Another cleanup that needs to start is satisfying the community wide goal of [PDF doc generation](https://storyboard.openstack.org/#!/story/2006110). Anyone know if there is a cookbook for this? # Other Placement Miscellaneous changes can be found in [the usual place](https://review.opendev.org/#/q/project:openstack/placement+status:open). There are three [os-traits changes](https://review.opendev.org/#/q/project:openstack/os-traits+status:open) being discussed. And one [os-resource-classes change](https://review.opendev.org/#/q/project:openstack/os-resource-classes+status:open). # Other Service Users New discoveries are added to the end. Merged stuff is removed. Anything that has had no activity in 4 weeks has been removed. * Nova: nova-manage: heal port allocations * nova-spec: Allow compute nodes to use DISK_GB from shared storage RP * Cyborg: Placement report * helm: add placement chart * Nova: Use OpenStack SDK for placement * Nova: Spec: Provider config YAML file * libvirt: report pmem namespaces resources by provider tree * Nova: Remove PlacementAPIConnectFailure handling from AggregateAPI * Nova: support move ops with qos ports * Nova: get_ksa_adapter: nix by-service-type confgrp hack * OSA: Add nova placement to placement migration * Nova: Defaults missing group_policy to 'none' * Blazar: Create placement client for each request * tempest: Define the Integrated-gate-placement gate template * Nova: Restore RT.old_resources if ComputeNode.save() fails * nova: Support filtering of hosts by forbidden aggregates * blazar: Send global_request_id for tracing calls * Nova: Update HostState.\*\_allocation_ratio earlier * neutron: segments: fix rp inventory update * Nova: WIP: Add a placement audit command * OSA: Add placement client for neutron * zun: [WIP] Use placement for unified resource management # End A colleague suggested yesterday that the universe doesn't have an over subscription problem, rather there's localized contention, and what we really have is a placement problem. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent From jayachander.it at gmail.com Fri Jul 12 10:48:07 2019 From: jayachander.it at gmail.com (Jay See) Date: Fri, 12 Jul 2019 12:48:07 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: Yes, cinder is running. 
root at h017:~$ service --status-all | grep cinder [ + ] cinder-volume On Fri, Jul 12, 2019 at 11:53 AM Ignazio Cassano wrote: > Sorry ...the question was : how many compute nodes do you have ? > instead of how many compute nodes do gli have... > > > Anycase; > Did you configured cinder ? > > Il giorno ven 12 lug 2019 alle ore 11:26 Jay See > ha scritto: > >> Ignazio, >> >> One instance is stuck in error state not able to recover it. All other >> instances are running now. >> >> root at h004:~$ nova reset-state --all-tenants my-instance-1-2 >> Reset state for server my-instance-1-2 succeeded; new state is error >> >> I have several compute nodes (14). I am not sure what is gli? >> Live migration is not working, i have tried it was not throwing any >> errors. But nothing seems to happen. >> I am not completely sure, I haven't heard about gli before. (This setup >> is deployed by someone else). >> >> ~Jay. >> >> On Fri, Jul 12, 2019 at 6:12 AM Ignazio Cassano >> wrote: >> >>> Jay, for recovering vm state use the command nova reset-state.... >>> >>> nova help reset-state to check the command requested parameters. >>> >>> Ad far as evacuation la concerned, how many compute nodes do gli have ? >>> Instance live migration works? >>> Are gli using shared cinder storage? >>> Ignazio >>> >>> Il Gio 11 Lug 2019 20:51 Jay See ha scritto: >>> >>>> Thanks for explanation Ignazio. >>>> >>>> I have tried same same by trying to put the compute node on a failure >>>> (echo 'c' > /proc/sysrq-trigger ). Compute node was stuck and I was not >>>> able connect to it. >>>> All the VMs are now in Error state. >>>> >>>> Running the host-evacaute was successful on controller node, but now I >>>> am not able to use the VMs. Because they are all in error state now. >>>> >>>> root at h004:~$ nova host-evacuate h017 >>>> >>>> +--------------------------------------+-------------------+---------------+ >>>> | Server UUID | Evacuate Accepted | Error >>>> Message | >>>> >>>> +--------------------------------------+-------------------+---------------+ >>>> | f3545f7d-b85e-49ee-b407-333a4c5b5ab9 | True | >>>> | >>>> | 9094494b-cfa3-459b-8d51-d9aae0ea9636 | True | >>>> | >>>> | abe7075b-ac22-4168-bf3d-d302ba37d80e | True | >>>> | >>>> | c9919371-5f2e-4155-a01a-5f41d9c8b0e7 | True | >>>> | >>>> | ffd983bb-851e-4314-9d1d-375303c278f3 | True | >>>> | >>>> >>>> +--------------------------------------+-------------------+---------------+ >>>> >>>> Now I have restarted the compute node manually , now I am able to >>>> connect to the compute node but VMs are still in Error state. >>>> 1. Any ideas, how to recover the VMs? >>>> 2. Are there any other methods to evacuate, as this method seems to be >>>> not working in mitaka version. >>>> >>>> ~Jay. >>>> >>>> On Thu, Jul 11, 2019 at 1:33 PM Ignazio Cassano < >>>> ignaziocassano at gmail.com> wrote: >>>> >>>>> Ok Jay, >>>>> let me to describe my environment. >>>>> I have an openstack made up of 3 controllers nodes ad several compute >>>>> nodes. >>>>> The controller nodes services are controlled by pacemaker and the >>>>> compute nodes services are controlled by remote pacemaker. >>>>> My hardware is Dell so I am using ipmi fencing device . >>>>> I wrote a service controlled by pacemaker: >>>>> this service controls if a compude node fails and for avoiding split >>>>> brains if a compute node does nod respond on the management network and on >>>>> storage network the stonith poweroff the node and then execute a nova >>>>> host-evacuate. 
>>>>> >>>>> Anycase to have a simulation before writing the service I described >>>>> above you can do as follows: >>>>> >>>>> connect on one compute node where some virtual machines are running >>>>> run the command: echo 'c' > /proc/sysrq-trigger (it stops immediately >>>>> the node like in case of failure) >>>>> On a controller node run: nova host-evacuate "name of failed compute >>>>> node" >>>>> Instances running on the failed compute node should be restarted on >>>>> another compute node >>>>> >>>>> >>>>> Ignazio >>>>> >>>>> Il giorno gio 11 lug 2019 alle ore 11:57 Jay See < >>>>> jayachander.it at gmail.com> ha scritto: >>>>> >>>>>> Hi , >>>>>> >>>>>> I have tried on a failed compute node which is in power off state now. >>>>>> I have tried on a running compute node, no errors. But >>>>>> nothing happens. >>>>>> On running compute node - Disabled the compute service and tried >>>>>> migration also. >>>>>> >>>>>> May be I might have not followed proper steps. Just wanted to know >>>>>> the steps you have followed. Otherwise, I was planning to manual migration >>>>>> also if possible. >>>>>> ~Jay. >>>>>> >>>>>> On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano < >>>>>> ignaziocassano at gmail.com> wrote: >>>>>> >>>>>>> Hi Jay, >>>>>>> would you like to evacuate a failed compute node or evacuate a >>>>>>> running compute node ? >>>>>>> >>>>>>> Ignazio >>>>>>> >>>>>>> Il giorno gio 11 lug 2019 alle ore 11:48 Jay See < >>>>>>> jayachander.it at gmail.com> ha scritto: >>>>>>> >>>>>>>> Hi Ignazio, >>>>>>>> >>>>>>>> I am trying to evacuate the compute host on older version (mitaka). >>>>>>>> Could please share the process you followed. I am not able to >>>>>>>> succeed with openstack live-migration fails with error message (this is >>>>>>>> known issue in older versions) and nova live-ligration - nothing happens >>>>>>>> even after initiating VM migration. It is almost 4 days. >>>>>>>> >>>>>>>> ~Jay. >>>>>>>> >>>>>>>> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano < >>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>> >>>>>>>>> I am sorry. >>>>>>>>> For simulating an host crash I used a wrong procedure. >>>>>>>>> Using "echo 'c' > /proc/sysrq-trigger" all work fine >>>>>>>>> >>>>>>>>> Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < >>>>>>>>> ignaziocassano at gmail.com> ha scritto: >>>>>>>>> >>>>>>>>>> Hello All, >>>>>>>>>> on ocata when I poweroff a node with active instance , doing a >>>>>>>>>> nova host-evacuate works fine >>>>>>>>>> and instances are restartd on an active node. >>>>>>>>>> On queens it does non evacuate instances but nova-api reports for >>>>>>>>>> each instance the following: >>>>>>>>>> >>>>>>>>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi >>>>>>>>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 >>>>>>>>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: >>>>>>>>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is >>>>>>>>>> in task_state powering-off >>>>>>>>>> >>>>>>>>>> So it poweroff all instance on the failed node but does not start >>>>>>>>>> them on active nodes >>>>>>>>>> >>>>>>>>>> What is changed ? 
>>>>>>>>>> Ignazio >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> ​ >>>>>>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>>>>>> necessary.* >>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> ​ >>>>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>>>> necessary.* >>>>>> >>>>> >>>> >>>> -- >>>> ​ >>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>> necessary.* >>>> >>> >> >> -- >> ​ >> P *SAVE PAPER – Please do not print this e-mail unless absolutely >> necessary.* >> > -- ​ P *SAVE PAPER – Please do not print this e-mail unless absolutely necessary.* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Fri Jul 12 11:18:01 2019 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 12 Jul 2019 13:18:01 +0200 Subject: [queens][nova] nova host-evacuate errot In-Reply-To: References: Message-ID: Ok. But your virtual machine are using a root volume on cinder or are ephemeral ? Anycase when you try a live migration, look at nova compute log on the kvm node where the instance is migrate from Il giorno ven 12 lug 2019 alle ore 12:48 Jay See ha scritto: > Yes, cinder is running. > > root at h017:~$ service --status-all | grep cinder > [ + ] cinder-volume > > On Fri, Jul 12, 2019 at 11:53 AM Ignazio Cassano > wrote: > >> Sorry ...the question was : how many compute nodes do you have ? >> instead of how many compute nodes do gli have... >> >> >> Anycase; >> Did you configured cinder ? >> >> Il giorno ven 12 lug 2019 alle ore 11:26 Jay See < >> jayachander.it at gmail.com> ha scritto: >> >>> Ignazio, >>> >>> One instance is stuck in error state not able to recover it. All other >>> instances are running now. >>> >>> root at h004:~$ nova reset-state --all-tenants my-instance-1-2 >>> Reset state for server my-instance-1-2 succeeded; new state is error >>> >>> I have several compute nodes (14). I am not sure what is gli? >>> Live migration is not working, i have tried it was not throwing any >>> errors. But nothing seems to happen. >>> I am not completely sure, I haven't heard about gli before. (This setup >>> is deployed by someone else). >>> >>> ~Jay. >>> >>> On Fri, Jul 12, 2019 at 6:12 AM Ignazio Cassano < >>> ignaziocassano at gmail.com> wrote: >>> >>>> Jay, for recovering vm state use the command nova reset-state.... >>>> >>>> nova help reset-state to check the command requested parameters. >>>> >>>> Ad far as evacuation la concerned, how many compute nodes do gli have >>>> ? >>>> Instance live migration works? >>>> Are gli using shared cinder storage? >>>> Ignazio >>>> >>>> Il Gio 11 Lug 2019 20:51 Jay See ha scritto: >>>> >>>>> Thanks for explanation Ignazio. >>>>> >>>>> I have tried same same by trying to put the compute node on a failure >>>>> (echo 'c' > /proc/sysrq-trigger ). Compute node was stuck and I was not >>>>> able connect to it. >>>>> All the VMs are now in Error state. >>>>> >>>>> Running the host-evacaute was successful on controller node, but now I >>>>> am not able to use the VMs. Because they are all in error state now. 
>>>>> >>>>> root at h004:~$ nova host-evacuate h017 >>>>> >>>>> +--------------------------------------+-------------------+---------------+ >>>>> | Server UUID | Evacuate Accepted | Error >>>>> Message | >>>>> >>>>> +--------------------------------------+-------------------+---------------+ >>>>> | f3545f7d-b85e-49ee-b407-333a4c5b5ab9 | True | >>>>> | >>>>> | 9094494b-cfa3-459b-8d51-d9aae0ea9636 | True | >>>>> | >>>>> | abe7075b-ac22-4168-bf3d-d302ba37d80e | True | >>>>> | >>>>> | c9919371-5f2e-4155-a01a-5f41d9c8b0e7 | True | >>>>> | >>>>> | ffd983bb-851e-4314-9d1d-375303c278f3 | True | >>>>> | >>>>> >>>>> +--------------------------------------+-------------------+---------------+ >>>>> >>>>> Now I have restarted the compute node manually , now I am able to >>>>> connect to the compute node but VMs are still in Error state. >>>>> 1. Any ideas, how to recover the VMs? >>>>> 2. Are there any other methods to evacuate, as this method seems to be >>>>> not working in mitaka version. >>>>> >>>>> ~Jay. >>>>> >>>>> On Thu, Jul 11, 2019 at 1:33 PM Ignazio Cassano < >>>>> ignaziocassano at gmail.com> wrote: >>>>> >>>>>> Ok Jay, >>>>>> let me to describe my environment. >>>>>> I have an openstack made up of 3 controllers nodes ad several compute >>>>>> nodes. >>>>>> The controller nodes services are controlled by pacemaker and the >>>>>> compute nodes services are controlled by remote pacemaker. >>>>>> My hardware is Dell so I am using ipmi fencing device . >>>>>> I wrote a service controlled by pacemaker: >>>>>> this service controls if a compude node fails and for avoiding split >>>>>> brains if a compute node does nod respond on the management network and on >>>>>> storage network the stonith poweroff the node and then execute a nova >>>>>> host-evacuate. >>>>>> >>>>>> Anycase to have a simulation before writing the service I described >>>>>> above you can do as follows: >>>>>> >>>>>> connect on one compute node where some virtual machines are running >>>>>> run the command: echo 'c' > /proc/sysrq-trigger (it stops immediately >>>>>> the node like in case of failure) >>>>>> On a controller node run: nova host-evacuate "name of failed compute >>>>>> node" >>>>>> Instances running on the failed compute node should be restarted on >>>>>> another compute node >>>>>> >>>>>> >>>>>> Ignazio >>>>>> >>>>>> Il giorno gio 11 lug 2019 alle ore 11:57 Jay See < >>>>>> jayachander.it at gmail.com> ha scritto: >>>>>> >>>>>>> Hi , >>>>>>> >>>>>>> I have tried on a failed compute node which is in power off state >>>>>>> now. >>>>>>> I have tried on a running compute node, no errors. But >>>>>>> nothing happens. >>>>>>> On running compute node - Disabled the compute service and tried >>>>>>> migration also. >>>>>>> >>>>>>> May be I might have not followed proper steps. Just wanted to know >>>>>>> the steps you have followed. Otherwise, I was planning to manual migration >>>>>>> also if possible. >>>>>>> ~Jay. >>>>>>> >>>>>>> On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano < >>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>> >>>>>>>> Hi Jay, >>>>>>>> would you like to evacuate a failed compute node or evacuate a >>>>>>>> running compute node ? >>>>>>>> >>>>>>>> Ignazio >>>>>>>> >>>>>>>> Il giorno gio 11 lug 2019 alle ore 11:48 Jay See < >>>>>>>> jayachander.it at gmail.com> ha scritto: >>>>>>>> >>>>>>>>> Hi Ignazio, >>>>>>>>> >>>>>>>>> I am trying to evacuate the compute host on older version (mitaka). >>>>>>>>> Could please share the process you followed. 
I am not able to >>>>>>>>> succeed with openstack live-migration fails with error message (this is >>>>>>>>> known issue in older versions) and nova live-ligration - nothing happens >>>>>>>>> even after initiating VM migration. It is almost 4 days. >>>>>>>>> >>>>>>>>> ~Jay. >>>>>>>>> >>>>>>>>> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano < >>>>>>>>> ignaziocassano at gmail.com> wrote: >>>>>>>>> >>>>>>>>>> I am sorry. >>>>>>>>>> For simulating an host crash I used a wrong procedure. >>>>>>>>>> Using "echo 'c' > /proc/sysrq-trigger" all work fine >>>>>>>>>> >>>>>>>>>> Il giorno gio 11 lug 2019 alle ore 11:01 Ignazio Cassano < >>>>>>>>>> ignaziocassano at gmail.com> ha scritto: >>>>>>>>>> >>>>>>>>>>> Hello All, >>>>>>>>>>> on ocata when I poweroff a node with active instance , doing a >>>>>>>>>>> nova host-evacuate works fine >>>>>>>>>>> and instances are restartd on an active node. >>>>>>>>>>> On queens it does non evacuate instances but nova-api reports >>>>>>>>>>> for each instance the following: >>>>>>>>>>> >>>>>>>>>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi >>>>>>>>>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9 >>>>>>>>>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown: >>>>>>>>>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is >>>>>>>>>>> in task_state powering-off >>>>>>>>>>> >>>>>>>>>>> So it poweroff all instance on the failed node but does not >>>>>>>>>>> start them on active nodes >>>>>>>>>>> >>>>>>>>>>> What is changed ? >>>>>>>>>>> Ignazio >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> ​ >>>>>>>>> P *SAVE PAPER – Please do not print this e-mail unless >>>>>>>>> absolutely necessary.* >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> -- >>>>>>> ​ >>>>>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>>>>> necessary.* >>>>>>> >>>>>> >>>>> >>>>> -- >>>>> ​ >>>>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>>>> necessary.* >>>>> >>>> >>> >>> -- >>> ​ >>> P *SAVE PAPER – Please do not print this e-mail unless absolutely >>> necessary.* >>> >> > > -- > ​ > P *SAVE PAPER – Please do not print this e-mail unless absolutely > necessary.* > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ssbarnea at redhat.com Fri Jul 12 11:27:04 2019 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Fri, 12 Jul 2019 12:27:04 +0100 Subject: [doc8] future development and maintenance In-Reply-To: <273715c4f2d4933232fc6b26cb982609daa1a2c7.camel@redhat.com> References: <9D091EF4-60F2-490C-BF76-AAD7A68FB8A1@redhat.com> <273715c4f2d4933232fc6b26cb982609daa1a2c7.camel@redhat.com> Message-ID: As 2/4 emails returned as invalid accounts, my hopes to hear back are getting lower and lower (tried few other channels too). I also believe that doc8 may be better suited outside OpenStack and especially under GitHub due to extra visibility to other developers. So temporary I created a fork at https://github.com/pycontribs/doc8 when I enabled travis and merged few fixes. If we do not hear back I propose to use ^ for future development and maintenance. If Sphinx organization wants to adopt it, even better, it can be transferred without loosing any tickets/PRs/... To be able to make a new release on https://pypi.org/project/doc8/ from Travis, I would need to have the "pycontribs" user added as maintainer, or I could always attempt to publish it as "doc9" but I would prefer to avoid a schism. 
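For anyone who wants to try the fork in the meantime, something along these lines should be enough (the doc path is whatever your project uses for its rst sources):

$ pip install git+https://github.com/pycontribs/doc8    # install straight from the fork
$ doc8 doc/source                                        # same style checks as the released doc8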
/sorin > On 11 Jul 2019, at 17:10, Stephen Finucane wrote: > > On Thu, 2019-07-11 at 12:01 +0100, Sorin Sbarnea wrote: >> It seems that the doc8 project is lacking some love, regardless the >> fact that is used by >90 projects from opendev.org. >> >> https://review.opendev.org/#/q/project:x/doc8 >> >> Last merge and release was more than 2 years ago and no reviews were >> performed either. >> >> I think it would be in our interest to assure that doc8 maintenance >> continues and that we can keep it usable. >> >> I would like to propose extenting the list of cores from the current >> 4 ones that I already listed in CC with 3 more, so we can effectively >> make a change that gets merged and later released (anyone willing to >> help?) >> >> If current cores agree, I would be happy to help with maintenance. I >> estimate that the effort needed would likely be less than 1h/month in >> longer term. If there is a desire to move it to github/travis, I >> would not mind either. > > I'd be tentatively interested in helping out here, though it's not as > if I don't already have a lot on my plate :) > > I wonder if this might have better success outside of OpenStack, > perhaps in the sphinx-doc or sphinx-contrib GitHub repo? > > Stephen > >> Thanks >> Sorin Sbarnea >> Red Hat TripleO CI From jose.castro.leon at cern.ch Fri Jul 12 13:03:14 2019 From: jose.castro.leon at cern.ch (Jose Castro Leon) Date: Fri, 12 Jul 2019 13:03:14 +0000 Subject: [Manila] CephFS deferred deletion Message-ID: Dear all, Lately, one of our clients stored 300k files in a manila cephfs share. Then he deleted the share in Manila. This event make the driver unresponsive for several hours until all the data was removed in the cluster. We had a quick look at the code in manila [1] and the deletion is done first by calling the following api calls in the ceph bindings (delete_volume[1] and then purge_volume[2]). The first call moves the directory to a volumes_deleted directory. The second call does a deletion in depth of all the contents of that directory. The last operation is the one that trigger the issue. We had a similar issue in the past in Cinder. There, Arne proposed to do a deferred deletion of volumes. I think we could do the same in Manila for the cephfs driver. The idea is to continue to call to the delete_volume. And then inside a periodic task in the driver, asynchronously it will get the contents of that directory and trigger the purge command. I can propose the change and contribute with the code, but before going to deep I would like to know if there is a reason of having a singleton for the volume_client connection. If I compare with cinder code the connection is established and closed in each operation with the backend. If you are not the maintainer, could you please point me to he/she? I can post it in the mailing list if you prefer Cheers Jose Castro Leon CERN Cloud Infrastructure [1] https://github.com/openstack/manila/blob/master/manila/share/drivers/cephfs/driver.py#L260-L267 [2] https://github.com/ceph/ceph/blob/master/src/pybind/ceph_volume_client.py#L700-L734 [2] https://github.com/ceph/ceph/blob/master/src/pybind/ceph_volume_client.py#L736-L790 PS: The issue was triggered by one of our clients in kubernetes using the Manila CSI driver From tomas.bredar at gmail.com Fri Jul 12 13:06:55 2019 From: tomas.bredar at gmail.com (=?UTF-8?B?VG9tw6HFoSBCcmVkw6Fy?=) Date: Fri, 12 Jul 2019 15:06:55 +0200 Subject: [tripleo][cinder][netapp] In-Reply-To: References: Message-ID: Hi Emilien! Thanks for your help. 
Yes with this I am able to define multiple stanzas in cinder.conf. However netapp driver needs a .conf file with the nfs shares listed in it. Defining multiple configuration files with nfs share details in each is not possible with the manual you've sent nor with the templates in my first email. I'm wondering if it's possible to define a second backend by creating another service, for example "OS::TripleO::Services::CinderBackendNetApp2" ? Tomas št 11. 7. 2019 o 14:35 Emilien Macchi napísal(a): > On Thu, Jul 11, 2019 at 7:32 AM Tomáš Bredár > wrote: > >> Hi community, >> >> I'm trying to define multiple NetApp storage backends via Tripleo >> installer. >> According to [1] the puppet manifest supports multiple backends. >> The current templates [2] [3] support only single backend. >> Does anyone know how to define multiple netapp backends in the >> tripleo-heat environment files / templates? >> > > We don't support that via the templates that you linked, however if you > follow this manual you should be able to configure multiple NetApp backends: > > https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/cinder_custom_backend.html > > Let us know how it worked! > -- > Emilien Macchi > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tpb at dyncloud.net Fri Jul 12 13:15:40 2019 From: tpb at dyncloud.net (Tom Barron) Date: Fri, 12 Jul 2019 09:15:40 -0400 Subject: [Manila] CephFS deferred deletion In-Reply-To: References: Message-ID: <20190712131540.3eqvltysfix6eivd@barron.net> On 12/07/19 13:03 +0000, Jose Castro Leon wrote: >Dear all, > >Lately, one of our clients stored 300k files in a manila cephfs share. >Then he deleted the share in Manila. This event make the driver >unresponsive for several hours until all the data was removed in the >cluster. > >We had a quick look at the code in manila [1] and the deletion is done >first by calling the following api calls in the ceph bindings >(delete_volume[1] and then purge_volume[2]). The first call moves the >directory to a volumes_deleted directory. The second call does a >deletion in depth of all the contents of that directory. > >The last operation is the one that trigger the issue. > >We had a similar issue in the past in Cinder. There, Arne proposed to >do a deferred deletion of volumes. I think we could do the same in >Manila for the cephfs driver. > >The idea is to continue to call to the delete_volume. And then inside a >periodic task in the driver, asynchronously it will get the contents of >that directory and trigger the purge command. > >I can propose the change and contribute with the code, but before going >to deep I would like to know if there is a reason of having a singleton >for the volume_client connection. If I compare with cinder code the >connection is established and closed in each operation with the >backend. > >If you are not the maintainer, could you please point me to he/she? 
>I can post it in the mailing list if you prefer > >Cheers >Jose Castro Leon >CERN Cloud Infrastructure > >[1] >https://github.com/openstack/manila/blob/master/manila/share/drivers/cephfs/driver.py#L260-L267 > > >[2] >https://github.com/ceph/ceph/blob/master/src/pybind/ceph_volume_client.py#L700-L734 > > >[2] >https://github.com/ceph/ceph/blob/master/src/pybind/ceph_volume_client.py#L736-L790 > > >PS: The issue was triggered by one of our clients in kubernetes using >the Manila CSI driver Hi Jose, Let's get this fixed since there's a lot of interest in Manila CSI driver and I think we can expect more batched deletes with it than we have had historically. I've copied Ramana Raja and Patrick Donnelly since they will be able to answer your question about the singleton volume_client connection more authoritatively than I can. Thanks for volunteering to propose a review to deal with this issue! -- Tom Barron From tpb at dyncloud.net Fri Jul 12 13:25:07 2019 From: tpb at dyncloud.net (Tom Barron) Date: Fri, 12 Jul 2019 09:25:07 -0400 Subject: [manila][ptg] Manila PTG planning Message-ID: <20190712132507.de6cuud5jmnx2okd@barron.net> >From our weekly community meetings [1] we know there are five or six people expecting to participate in the Shanghai PTG this coming November. But those meetings are at 1500 UTC -- not the best time for folks in Asia -- so let's check via email as well. Please update the Manila Shanghai PTG Planning etherpad [2] to indicate if you plan (or are trying) to attend the PTG in person or remotely. There's also a section for Topic Proposals, so we can start adding to it now as well. Thanks! -- Tom Barron [1] https://wiki.openstack.org/wiki/Manila/Meetings [2] https://etherpad.openstack.org/p/manila-shanghai-ptg-planning From dpeacock at redhat.com Fri Jul 12 13:45:48 2019 From: dpeacock at redhat.com (David Peacock) Date: Fri, 12 Jul 2019 09:45:48 -0400 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: References: Message-ID: Hi James, On Wed, Jul 10, 2019 at 4:20 PM James Slagle wrote: > There's been a fair amount of recent work around simplifying our Heat > templates and migrating the software configuration part of our > deployment entirely to Ansible. > > As part of this effort, it became apparent that we could render much > of the data that we need out of Heat in a way that is generic per > node, and then have Ansible render the node specific data during > config-download runtime. > I find this endeavour very exciting. Do you have any early indications of performance gains that you can share? Cheers, David -------------- next part -------------- An HTML attachment was scrubbed... URL: From km.giuseppesannino at gmail.com Fri Jul 12 13:57:48 2019 From: km.giuseppesannino at gmail.com (Giuseppe Sannino) Date: Fri, 12 Jul 2019 15:57:48 +0200 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host Message-ID: Hi community, I need your help ,tips, advices. *> Environment <* I have deployed Openstack "Stein" using the latest kolla-ansible on the following deployment topology: 1) OS Controller running as VM on a "cloud" location 2) OS Compute running on a baremetal server remotely (wrt OS Controller) location 3) Network node running on the Compute host As per the above info, Controller and compute run on two different networks. 
Kolla-Ansible is not really designed for such scenario but after manipulating the globals.yml and the inventory files (basically I had to move node specific network settings from the globals to the inventory file), eventually the deployment works fine. *> Problem <* I have no specific issue working with this deployment except the following: "SSH connection to the VM is quite slow". It takes around 20 seconds for me to log into the VM (Ubuntu, CentOS, whatever). *> Observations <* - Except for the slowness during the SSH login, I don't have any further specific issue working with this envirorment - With the Network on the Compute I can turn the OS controller off with no impact on the VM. Still the connection is slow - I tried different type of images (Ubuntu, CentOS, Windows) always with the same result. - SSH connection is slow even if I try to login into the VM within the IP Namespace >From the ssh -vvv, I can see that the authentication gets stuck here: debug1: Authentication succeeded (publickey). Authenticated to ***** debug1: channel 0: new [client-session] debug3: ssh_session2_open: channel_new: 0 debug2: channel 0: send open debug3: send packet: type 90 debug1: Requesting no-more-sessions at openssh.com debug3: send packet: type 80 debug1: Entering interactive session. debug1: pledge: network >>>>> 10 to 15 seconds later debug3: receive packet: type 80 debug1: client_input_global_request: rtype hostkeys-00 at openssh.com want_reply 0 debug3: receive packet: type 91 debug2: callback start debug2: fd 3 setting TCP_NODELAY debug3: ssh_packet_set_tos: set IP_TOS 0x10 debug2: client_session2_setup: id 0 debug2: channel 0: request pty-req confirm 1 Have you ever experienced such issue ? Any suggestion? Many thanks /Giuseppe -------------- next part -------------- An HTML attachment was scrubbed... URL: From abishop at redhat.com Fri Jul 12 14:11:03 2019 From: abishop at redhat.com (Alan Bishop) Date: Fri, 12 Jul 2019 07:11:03 -0700 Subject: [tripleo][cinder][netapp] In-Reply-To: References: Message-ID: On Fri, Jul 12, 2019 at 6:09 AM Tomáš Bredár wrote: > Hi Emilien! > > Thanks for your help. Yes with this I am able to define multiple stanzas > in cinder.conf. However netapp driver needs a .conf file with the nfs > shares listed in it. Defining multiple configuration files with nfs share > details in each is not possible with the manual you've sent nor with the > templates in my first email. > Hi Tomas, When deploying a single backend, the tripleo template takes care of generating the nfs shares file (actually, puppet-cinder generates the file, but it's triggered by tripleo). But when you use the custom backend method that Emilien pointed you to use, then you are responsible for supplying all the pieces for the backend(s) to function correctly. This means you will need to generate the nfs shares file on the host (controller), and then bind mount the file using CinderVolumeOptVolumes so that the shares file on the host is visible to the cinder-volume process running in a container. I'm wondering if it's possible to define a second backend by creating > another service, for example "OS::TripleO::Services::CinderBackendNetApp2" ? > Sorry, this won't work. TripleO will trying to deploy two completely separate instances of the cinder-volume service, but the two deployments will step all over each other. There has been a long standing goal of enhancing tripleo so that it can deploy multiple instances of a cinder backend, but it's a complex task that will require non-trivial changes to tripleo. 
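To make that a bit more concrete, here is a rough sketch of the two extra pieces (the export address, backend file name and paths below are made up, and please double check the CinderVolumeOptVolumes format against your tripleo-heat-templates version):

# on each controller, the shares list for the second backend
$ sudo tee /etc/cinder/nfs_shares_backend2 <<'EOF'
10.0.0.50:/vol_netapp_backend2
EOF

# extra environment file for the deploy command, bind mounting that file
# into the cinder_volume container so the backend's nfs_shares_config
# option can point at it
$ cat > cinder-netapp2-volumes.yaml <<'EOF'
parameter_defaults:
  CinderVolumeOptVolumes:
    - /etc/cinder/nfs_shares_backend2:/etc/cinder/nfs_shares_backend2:ro
EOF

The backend stanza itself still comes from the custom backend environment you already have working, it just needs nfs_shares_config pointing at the container side path.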
Alan Tomas > > št 11. 7. 2019 o 14:35 Emilien Macchi napísal(a): > >> On Thu, Jul 11, 2019 at 7:32 AM Tomáš Bredár >> wrote: >> >>> Hi community, >>> >>> I'm trying to define multiple NetApp storage backends via Tripleo >>> installer. >>> According to [1] the puppet manifest supports multiple backends. >>> The current templates [2] [3] support only single backend. >>> Does anyone know how to define multiple netapp backends in the >>> tripleo-heat environment files / templates? >>> >> >> We don't support that via the templates that you linked, however if you >> follow this manual you should be able to configure multiple NetApp backends: >> >> https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/cinder_custom_backend.html >> >> Let us know how it worked! >> -- >> Emilien Macchi >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From haleyb.dev at gmail.com Fri Jul 12 14:23:18 2019 From: haleyb.dev at gmail.com (Brian Haley) Date: Fri, 12 Jul 2019 10:23:18 -0400 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: References: Message-ID: On 7/12/19 9:57 AM, Giuseppe Sannino wrote: > Hi community, > I need your help ,tips, advices. > > > *> Environment <* > I have deployed Openstack "Stein" using the latest kolla-ansible on the > following deployment topology: > > 1) OS Controller running as VM on a "cloud" location > 2) OS Compute running on a baremetal server remotely (wrt OS Controller) > location > 3) Network node running on the Compute host > > As per the above info, Controller and compute run on two different networks. > > Kolla-Ansible is not really designed for such scenario but after > manipulating the globals.yml and the inventory files (basically I had to > move node specific network settings from the globals to the inventory > file), eventually the deployment works fine. > > > *> Problem <* > I have no specific issue working with this deployment except the following: > > "SSH connection to the VM is quite slow". > > It takes around 20 seconds for me to log into the VM (Ubuntu, CentOS, > whatever). But once logged-in things are OK? For example, an scp stalls the same way, but the transfer is fast? > *> Observations <* > > * Except for the slowness during the SSH login, I don't have any > further specific issue working with this envirorment > * With the Network on the Compute I can turn the OS controller off > with no impact on the VM. Still the connection is slow > * I tried different type of images (Ubuntu, CentOS, Windows) always > with the same result. > * SSH connection is slow even if I try to login into the VM within the > IP Namespace > > From the ssh -vvv, I can see that the authentication gets stuck here: > > debug1: Authentication succeeded (publickey). > Authenticated to ***** > debug1: channel 0: new [client-session] > debug3: ssh_session2_open: channel_new: 0 > debug2: channel 0: send open > debug3: send packet: type 90 > debug1: Requesting no-more-sessions at openssh.com > > debug3: send packet: type 80 > debug1: Entering interactive session. > debug1: pledge: network > > >>>>> 10 to 15 seconds later What is sshd doing at this time? Have you tried enabling debug or running tcpdump when a new connection is attempted? At first glance I'd say it's a DNS issue since it eventually succeeds, the logs would help to point in a direction. 
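For example, something along these lines from inside the guest (the port and interface name are just examples):

# run a second sshd in debug mode, then connect with 'ssh -p 2222 ...'
# and watch where it stalls
$ sudo /usr/sbin/sshd -ddd -p 2222

# in another session, look for DNS queries going out while the login hangs
$ sudo tcpdump -ni eth0 port 53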
-Brian > debug3: receive packet: type 80 > debug1: client_input_global_request: rtype hostkeys-00 at openssh.com > want_reply 0 > debug3: receive packet: type 91 > debug2: callback start > debug2: fd 3 setting TCP_NODELAY > debug3: ssh_packet_set_tos: set IP_TOS 0x10 > debug2: client_session2_setup: id 0 > debug2: channel 0: request pty-req confirm 1 > > > Have you ever experienced such issue ? > Any suggestion? > > Many thanks > > /Giuseppe > > > From mark at stackhpc.com Fri Jul 12 15:35:50 2019 From: mark at stackhpc.com (Mark Goddard) Date: Fri, 12 Jul 2019 16:35:50 +0100 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: References: Message-ID: On Fri, 12 Jul 2019 at 15:24, Brian Haley wrote: > > > On 7/12/19 9:57 AM, Giuseppe Sannino wrote: > > Hi community, > > I need your help ,tips, advices. > > > > > > *> Environment <* > > I have deployed Openstack "Stein" using the latest kolla-ansible on the > > following deployment topology: > > > > 1) OS Controller running as VM on a "cloud" location > > 2) OS Compute running on a baremetal server remotely (wrt OS Controller) > > location > > 3) Network node running on the Compute host > > > > As per the above info, Controller and compute run on two different > networks. > > > > Kolla-Ansible is not really designed for such scenario but after > > manipulating the globals.yml and the inventory files (basically I had to > > move node specific network settings from the globals to the inventory > > file), eventually the deployment works fine. > > > > > > *> Problem <* > > I have no specific issue working with this deployment except the > following: > > > > "SSH connection to the VM is quite slow". > > > > It takes around 20 seconds for me to log into the VM (Ubuntu, CentOS, > > whatever). > > But once logged-in things are OK? For example, an scp stalls the same > way, but the transfer is fast? > > > *> Observations <* > > > > * Except for the slowness during the SSH login, I don't have any > > further specific issue working with this envirorment > > * With the Network on the Compute I can turn the OS controller off > > with no impact on the VM. Still the connection is slow > > * I tried different type of images (Ubuntu, CentOS, Windows) always > > with the same result. > > * SSH connection is slow even if I try to login into the VM within the > > IP Namespace > > > > From the ssh -vvv, I can see that the authentication gets stuck here: > > > > debug1: Authentication succeeded (publickey). > > Authenticated to ***** > > debug1: channel 0: new [client-session] > > debug3: ssh_session2_open: channel_new: 0 > > debug2: channel 0: send open > > debug3: send packet: type 90 > > debug1: Requesting no-more-sessions at openssh.com > > > > debug3: send packet: type 80 > > debug1: Entering interactive session. > > debug1: pledge: network > > > > >>>>> 10 to 15 seconds later > > What is sshd doing at this time? Have you tried enabling debug or > running tcpdump when a new connection is attempted? At first glance I'd > say it's a DNS issue since it eventually succeeds, the logs would help > to point in a direction. > +1 - ~30s timeout on SSH login is normally a DNS issue. 
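If it does turn out to be reverse DNS in the guest, the usual quick checks/workarounds are something like this (just a sketch, adjust to your image):

# can the instance resolve its own name locally?
$ grep "$(hostname)" /etc/hosts || echo "127.0.1.1 $(hostname)" | sudo tee -a /etc/hosts

# stop sshd doing reverse lookups on incoming connections
$ echo "UseDNS no" | sudo tee -a /etc/ssh/sshd_config
$ sudo systemctl restart sshd    # the unit is 'ssh' on Ubuntu images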
> -Brian > > > > debug3: receive packet: type 80 > > debug1: client_input_global_request: rtype hostkeys-00 at openssh.com > > want_reply 0 > > debug3: receive packet: type 91 > > debug2: callback start > > debug2: fd 3 setting TCP_NODELAY > > debug3: ssh_packet_set_tos: set IP_TOS 0x10 > > debug2: client_session2_setup: id 0 > > debug2: channel 0: request pty-req confirm 1 > > > > > > Have you ever experienced such issue ? > > Any suggestion? > > > > Many thanks > > > > /Giuseppe > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Fri Jul 12 15:40:58 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Fri, 12 Jul 2019 17:40:58 +0200 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: References: Message-ID: <01AA2BA1-3EEC-4142-BCDB-F9861707102E@redhat.com> Hi, I suspect some problems with names resolving. Can You check if You have also such delay when doing e.g. “sudo” commands after You ssh to the instance? > On 12 Jul 2019, at 16:23, Brian Haley wrote: > > > > On 7/12/19 9:57 AM, Giuseppe Sannino wrote: >> Hi community, >> I need your help ,tips, advices. >> *> Environment <* >> I have deployed Openstack "Stein" using the latest kolla-ansible on the following deployment topology: >> 1) OS Controller running as VM on a "cloud" location >> 2) OS Compute running on a baremetal server remotely (wrt OS Controller) location >> 3) Network node running on the Compute host >> As per the above info, Controller and compute run on two different networks. >> Kolla-Ansible is not really designed for such scenario but after manipulating the globals.yml and the inventory files (basically I had to move node specific network settings from the globals to the inventory file), eventually the deployment works fine. >> *> Problem <* >> I have no specific issue working with this deployment except the following: >> "SSH connection to the VM is quite slow". >> It takes around 20 seconds for me to log into the VM (Ubuntu, CentOS, whatever). > > But once logged-in things are OK? For example, an scp stalls the same way, but the transfer is fast? > >> *> Observations <* >> * Except for the slowness during the SSH login, I don't have any >> further specific issue working with this envirorment >> * With the Network on the Compute I can turn the OS controller off >> with no impact on the VM. Still the connection is slow >> * I tried different type of images (Ubuntu, CentOS, Windows) always >> with the same result. >> * SSH connection is slow even if I try to login into the VM within the >> IP Namespace >> From the ssh -vvv, I can see that the authentication gets stuck here: >> debug1: Authentication succeeded (publickey). >> Authenticated to ***** >> debug1: channel 0: new [client-session] >> debug3: ssh_session2_open: channel_new: 0 >> debug2: channel 0: send open >> debug3: send packet: type 90 >> debug1: Requesting no-more-sessions at openssh.com >> debug3: send packet: type 80 >> debug1: Entering interactive session. >> debug1: pledge: network >> >>>>> 10 to 15 seconds later > > What is sshd doing at this time? Have you tried enabling debug or running tcpdump when a new connection is attempted? At first glance I'd say it's a DNS issue since it eventually succeeds, the logs would help to point in a direction. 
> > -Brian > > >> debug3: receive packet: type 80 >> debug1: client_input_global_request: rtype hostkeys-00 at openssh.com want_reply 0 >> debug3: receive packet: type 91 >> debug2: callback start >> debug2: fd 3 setting TCP_NODELAY >> debug3: ssh_packet_set_tos: set IP_TOS 0x10 >> debug2: client_session2_setup: id 0 >> debug2: channel 0: request pty-req confirm 1 >> Have you ever experienced such issue ? >> Any suggestion? >> Many thanks >> /Giuseppe — Slawek Kaplonski Senior software engineer Red Hat From thiagocmartinsc at gmail.com Fri Jul 12 15:42:46 2019 From: thiagocmartinsc at gmail.com (=?UTF-8?B?TWFydGlueCAtIOOCuOOCp+ODvOODoOOCug==?=) Date: Fri, 12 Jul 2019 11:42:46 -0400 Subject: [nova-lxd] retiring nova-lxd In-Reply-To: References: Message-ID: Oh, that's very sad news... I was about to deploy a relatively big OpenStack Cloud (for ~500 LXD Instances) running on nova-lxd but now, I'll cancel it. No more bare-metal cloud?! :-( On Wed, 10 Jul 2019 at 09:08, James Page wrote: > Hi All > > I’m slightly sad to announce that we’re retiring the LXD driver for Nova > aka “nova-lxd”. > > Developing a driver for Nova for container based machines has been a fun > and technically challenging ride over the last four years but we’ve never > really seen any level of serious production deployment; as a result we’ve > decided that it’s time to call it a day for nova-lxd. > > I’d like to thank all of the key contributors for their efforts over the > years - specifically Chuck Short, Paul Hummer, Chris MacNaughton, Sahid > Orentino and Alex Kavanaugh who have led or contributed to the development > of the driver over its lifetime. > > I’ll be raising a review to leave a note for future followers as to the > fate of nova-lxd. If anyone else would like to continue development of the > driver they are more than welcome to revert my commit and become a part of > the development team! > > We’ll continue to support our current set of stable branches for another > ~12 months. > > Note that development of LXD and the pylxd Python module continues; its > just the integration of OpenStack with LXD that we’re ceasing development > of. > > Regards > > James > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Fri Jul 12 16:11:32 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Sat, 13 Jul 2019 01:11:32 +0900 Subject: OpenInfra Days Vietnam 2019 (Hanoi) - Call For Presentations In-Reply-To: References: Message-ID: Hello teams, Less than 2 months and the OpenInfra Days Vietnam will happen. We love to hear the voices of Kata, StarlingX, Airship, Zuul, and other OpenStack projects team. Please join us in one day event in Hanoi this August 24th. - *Event's website:* http://day.vietopeninfra.org/ - *Call for Presentation (extended deadline 30th July):* https://forms.gle/iiRBxxyRv1mGFbgi7 - *Buy tickets:* https://ticketbox.vn/event/vietnam-openinfra-days-2019-75375 Tell me if you have any questions. Yours, Trinh On Tue, May 21, 2019 at 2:36 PM Trinh Nguyen wrote: > Hello, > > Hope you're doing well :) > > The OpenInfra Days Vietnam 2019 [1] is looking for speakers in many > different topics (e.g., container, CI, deployment, edge computing, etc.). > If you would love to have a taste of Hanoi, the capital of Vietnam, please > join us this one-day event and submit your presentation [2]. 
> > *- Date:* 24 AUGUST 2019 > *- Location:* INTERCONTINENTAL HANOI LANDMARK72, HANOI, VIETNAM > > Especially this time, we're honored to have the Upstream Institute > Training [3] hosted by the OpenStack Foundation on the next day (25 August > 2019). > > [1] http://day.vietopeninfra.org/ > [2] https://forms.gle/iiRBxxyRv1mGFbgi7 > [3] > https://docs.openstack.org/upstream-training/upstream-training-content.html > > See you in Hanoi! > > Bests, > > On behalf of the VietOpenInfra Group. > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From ssbarnea at redhat.com Fri Jul 12 16:37:03 2019 From: ssbarnea at redhat.com (Sorin Sbarnea) Date: Fri, 12 Jul 2019 17:37:03 +0100 Subject: [tripleo] tripleo-ci team investigating lowering the time check jobs are waiting Message-ID: <26F98673-AC01-4C61-BCA5-8AA28937D5E7@redhat.com> Tripleo CI team is trying to find ways to lower the resource loads on upstream zuul by identifying jobs running in check or gate which should have being skipped. This is an exercise in tuning which jobs launch against the various TripleO gerrit repos and not a complete overhaul of the coverage. There have been some obvious mistakes made in the zuul configurations in the past, we are looking to resolve all these issues, but also wanted to reach out to the broader community for more input. Since yesterday we started to track the queue lengths for both tripleo-check and tripleo-gate [1], so soon we will be able to see some trends. We have also been monitoring the number of jobs launched in both check and gate and their pass rates [2][3] Even if you only have some doubts about some jobs, it would be useful to report them to us so we can investigate further. Report on: https://etherpad.openstack.org/p/tripleo-jobs-tunning [1] http://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=1&fullscreen&panelId=398 [2] check http://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=1&fullscreen&panelId=157 [3] gate http://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=1&fullscreen&panelId=307 Thanks, Sorin Sbarnea TripleO CI From james.slagle at gmail.com Fri Jul 12 16:43:13 2019 From: james.slagle at gmail.com (James Slagle) Date: Fri, 12 Jul 2019 12:43:13 -0400 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: References: Message-ID: On Fri, Jul 12, 2019 at 9:46 AM David Peacock wrote: > > Hi James, > > On Wed, Jul 10, 2019 at 4:20 PM James Slagle wrote: >> >> There's been a fair amount of recent work around simplifying our Heat >> templates and migrating the software configuration part of our >> deployment entirely to Ansible. >> >> As part of this effort, it became apparent that we could render much >> of the data that we need out of Heat in a way that is generic per >> node, and then have Ansible render the node specific data during >> config-download runtime. > > > I find this endeavour very exciting. Do you have any early indications of performance gains that you can share? No hard numbers yet, but I can say that I can get to the Ansible stage of the deployment with any number of nodes with an undercloud that just meets the minimum requirements. This is significant because previously we could not get to this stage without first deploying a huge Heat stack which required a lot of physical resources, tuning, tweaking, or going the undercloud minion route. Also, it's less about performance and more about scale. 
Certainly the Heat stack operation will be much faster as the number of nodes in the deployment increases. The stack operation time will in fact be constant in relation to the number of nodes in the deployment. It will depend on the number of *roles*, but typically those are ~< 5 per deployment, and the most I've seen is 12. The total work done by Ansible does increase as we move more logic into roles and tasks. However, I expect the total Ansible run time to be roughly equivalent to what we have today since the sum of all that Ansible applies is roughly equal. In terms of scale however, it allows us to move beyond the ~300 node limit we're at today. And it keeps the Heat performance constant as opposed to increasing with the node count. -- -- James Slagle -- From donny at fortnebula.com Fri Jul 12 19:37:05 2019 From: donny at fortnebula.com (Donny Davis) Date: Fri, 12 Jul 2019 15:37:05 -0400 Subject: [Nova] Instances can't be started after compute nodes unexpectedly shut down because of power outage In-Reply-To: References: Message-ID: How is the recovery coming along Gökhan? I am curious to hear. On Fri, Jul 12, 2019 at 3:46 AM Gökhan IŞIK wrote: > Awesome, thanks! Donny, > I followed below steps and rescue my instance. > > 1. > > Find instance id and compute host > > root at infra1-utility-container-50bcf920:~# openstack server show 1d2e8a39-97ee-4ce7-a612-1b50f90cc51e -c id -c OS-EXT-SRV-ATTR:hypervisor_hostname > +-------------------------------------+--------------------------------------+ > | Field | Value | > +-------------------------------------+--------------------------------------+ > | OS-EXT-SRV-ATTR:hypervisor_hostname | compute06 | > | id | 1d2e8a39-97ee-4ce7-a612-1b50f90cc51e | > +-------------------------------------+--------------------------------------+ > > > 2. > > Find image and backing image file on compute host > > root at compute06:~# qemu-img info -U --backing-chain /var/lib/nova/instances/1d2e8a39-97ee-4ce7-a612-1b50f90cc51e/disk > image: /var/lib/nova/instances/1d2e8a39-97ee-4ce7-a612-1b50f90cc51e/disk > file format: qcow2 > virtual size: 160G (171798691840 bytes) > disk size: 32G > cluster_size: 65536 > backing file: /var/lib/nova/instances/_base/a1960f539532979a591c5f837ad604eedd9c7323 > Format specific information: > compat: 1.1 > lazy refcounts: false > refcount bits: 16 > corrupt: false > image: /var/lib/nova/instances/_base/a1960f539532979a591c5f837ad604eedd9c7323 > file format: raw > virtual size: 160G (171798691840 bytes) > disk size: 18G > > > > 3. Copy image and backing image file > > > root at compute06:~# cp /var/lib/nova/instances/1d2e8a39-97ee-4ce7-a612-1b50f90cc51e/disk master > root at compute06:~# cp /var/lib/nova/instances/_base/a1960f539532979a591c5f837ad604eedd9c7323 new-master > > > 4. > > Rebase the image file that was backed off the original file so that > it uses the new file i.e new-master then commit those changes back to > original file master back into the new base new-master > > root at compute06:~# qemu-img rebase -b new-master -U master > > root at compute06:~# qemu-img commit master > > root at compute06:~# qemu-img info new-master > > > > > 5. > > Convert raw image to qcow2 > > root at compute06:~# qemu-img convert -f raw -O qcow2 new-master new-master.qcow2 > > > 6. Time to upload glance and then launch instance from this image :) > > > Thanks, > Gökhan. 
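For step 6 above, something like the following should do it (the image, flavor and network names are only placeholders, reuse whatever the original instance had):

$ openstack image create --disk-format qcow2 --container-format bare \
    --file new-master.qcow2 my-instance-1-2-restored
$ openstack server create --image my-instance-1-2-restored \
    --flavor <original-flavor> --network <original-network> my-instance-1-2-restored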
> > Donny Davis , 12 Tem 2019 Cum, 00:56 tarihinde şunu > yazdı: > >> Of course you can also always just pull the disk images from the vm >> folders, merge them back with the base file, upload to glance and then >> relaunch the instances. >> >> You can give this method a spin with the lowest risk to your instances >> >> >> https://medium.com/@kumar_pravin/qemu-merge-snapshot-and-backing-file-into-standalone-disk-c8d3a2b17c0e >> >> >> >> >> >> On Thu, Jul 11, 2019 at 4:10 PM Donny Davis wrote: >> >>> You surely want to leave locking turned on. >>> >>> You may want to ask qemu-devel about the locking of a image file and how >>> it works. This isn't really an Openstack issue, seems to be a layer below. >>> >>> Depending on how mission critical your VM's are, you could probably work >>> around it by just passing in --force-share into the command openstack is >>> trying to run. >>> >>> I cannot recommend this path, the best way is to find out how you remove >>> the lock. >>> >>> >>> >>> >>> >>> >>> On Thu, Jul 11, 2019 at 3:23 PM Gökhan IŞIK >>> wrote: >>> >>>> In [1] it says "Image locking is added and enabled by default. >>>> Multiple QEMU processes cannot write to the same image as long as the host >>>> supports OFD or posix locking, unless options are specified otherwise." May >>>> be need to do something on nova side. >>>> >>>> I run this command and get same error. Output is in >>>> http://paste.openstack.org/show/754311/ >>>> >>>> İf I run qemu-img info instance-0000219b with -U , it doesn't give any >>>> errors. >>>> >>>> [1] https://wiki.qemu.org/ChangeLog/2.10 >>>> >>>> Donny Davis , 11 Tem 2019 Per, 22:11 tarihinde >>>> şunu yazdı: >>>> >>>>> Well that is interesting. If you look in your libvirt config directory >>>>> (/etc/libvirt on Centos) you can get a little more info on what is being >>>>> used for locking. >>>>> >>>>> Maybe strace can shed some light on it. Try something like >>>>> >>>>> strace -ttt -f qemu-img info >>>>> /var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> On Thu, Jul 11, 2019 at 2:39 PM Gökhan IŞIK >>>>> wrote: >>>>> >>>>>> I run virsh list --all command and output is below: >>>>>> >>>>>> root at compute06:~# virsh list --all >>>>>> Id Name State >>>>>> ---------------------------------------------------- >>>>>> - instance-000012f9 shut off >>>>>> - instance-000013b6 shut off >>>>>> - instance-000016fb shut off >>>>>> - instance-0000190a shut off >>>>>> - instance-00001a8a shut off >>>>>> - instance-00001e05 shut off >>>>>> - instance-0000202a shut off >>>>>> - instance-00002135 shut off >>>>>> - instance-00002141 shut off >>>>>> - instance-000021b6 shut off >>>>>> - instance-000021ec shut off >>>>>> - instance-000023db shut off >>>>>> - instance-00002ad7 shut off >>>>>> >>>>>> And also when I try start instances with virsh , output is below: >>>>>> >>>>>> root at compute06:~# virsh start instance-0000219b >>>>>> error: Failed to start domain instance-000012f9 >>>>>> error: internal error: process exited while connecting to monitor: >>>>>> 2019-07-11T18:36:34.229534Z qemu-system-x86_64: -chardev >>>>>> pty,id=charserial0,logfile=/dev/fdset/2,logappend=on: char device >>>>>> redirected to /dev/pts/3 (label charserial0) >>>>>> 2019-07-11T18:36:34.243395Z qemu-system-x86_64: -drive >>>>>> file=/var/lib/nova/instances/659b5853-d094-4425-85a9-5bcacf88c84e/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,discard=ignore: >>>>>> Failed to get "write" lock >>>>>> Is another process using the image? 
>>>>>> >>>>>> Thanks, >>>>>> Gökhan >>>>>> >>>>>> Donny Davis , 11 Tem 2019 Per, 21:06 tarihinde >>>>>> şunu yazdı: >>>>>> >>>>>>> Can you ssh to the hypervisor and run virsh list to make sure your >>>>>>> instances are in fact down? >>>>>>> >>>>>>> On Thu, Jul 11, 2019 at 3:02 AM Gökhan IŞIK >>>>>>> wrote: >>>>>>> >>>>>>>> Can anyone help me please ? I can no't rescue my instances yet :( >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Gökhan >>>>>>>> >>>>>>>> Gökhan IŞIK , 9 Tem 2019 Sal, 15:46 >>>>>>>> tarihinde şunu yazdı: >>>>>>>> >>>>>>>>> Hi folks, >>>>>>>>> Because of power outage, Most of our compute nodes unexpectedly >>>>>>>>> shut down and now I can not start our instances. Error message is "Failed >>>>>>>>> to get "write" lock another process using the image?". Instances Power >>>>>>>>> status is No State. Full error log is >>>>>>>>> http://paste.openstack.org/show/754107/. My environment is >>>>>>>>> OpenStack Pike on Ubuntu 16.04 LTS servers and Instances are on a nfs >>>>>>>>> shared storage. Nova version is 16.1.6.dev2. qemu version is 2.10.1. >>>>>>>>> libvirt version is 3.6.0. I saw a commit [1], but it doesn't solve this >>>>>>>>> problem. >>>>>>>>> There are important instances on my environment. How can I rescue >>>>>>>>> my instances? What would you suggest ? >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Gökhan >>>>>>>>> >>>>>>>>> [1] https://review.opendev.org/#/c/509774/ >>>>>>>>> >>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From hjensas at redhat.com Fri Jul 12 19:59:34 2019 From: hjensas at redhat.com (Harald =?ISO-8859-1?Q?Jens=E5s?=) Date: Fri, 12 Jul 2019 21:59:34 +0200 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: References: Message-ID: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> On Wed, 2019-07-10 at 16:17 -0400, James Slagle wrote: > There's been a fair amount of recent work around simplifying our Heat > templates and migrating the software configuration part of our > deployment entirely to Ansible. > > As part of this effort, it became apparent that we could render much > of the data that we need out of Heat in a way that is generic per > node, and then have Ansible render the node specific data during > config-download runtime. > > To illustrate the point, consider when we specify ComputeCount:10 in > our templates, that much of the work that Heat is doing across those > 10 sets of resources for each Compute node is duplication. However, > it's been necessary so that Heat can render data structures such as > list of IP's, lists of hostnames, contents of /etc/hosts files, etc > etc etc. If all that was driven by Ansible using host facts, then > Heat > doesn't need to do those 10 sets of resources to begin with. > > The goal is to get to a point where we can deploy the Heat stack with > a count of 1 for each role, and then deploy any number of nodes per > role using Ansible. To that end, I've been referring to this effort > as > N=1. > > The value in this work is that it directly addresses our scaling > issues with Heat (by just deploying a much smaller stack). Obviously > we'd still be relying heavily on Ansible to scale to the required > levels, but I feel that is much better understood challenge at this > point in the evolution of configuration tools. > > With the patches that we've been working on recently, I've got a POC > running where I can deploy additional compute nodes with just > Ansible. 
> This is done by just adding the additional nodes to the Ansible > inventory with a small set of facts to include IP addresses on each > enabled network and a hostname. > > These patches are at > https://review.opendev.org/#/q/topic:bp/reduce-deployment-resources > and reviews/feedback are welcome. > > Other points: > > - Baremetal provisioning and port creation are presently handled by > Heat. With the ongoing efforts to migrate baremetal provisioning out > of Heat (nova-less deploy), I think these efforts are very > complimentary. Eventually, we get to a point where Heat is not > actually creating any other OpenStack API resources. For now, the > patches only work when using pre-provisioned nodes. > I've said this before, but I think we should turn this nova-less around. Now with nova-less we create a bunch of servers, and write up the parameters file to use the deployed-server approach. Effectively we still neet to have the resource group in heat making a server resource for every server. Creating the fake server resource is fast, because Heat does'nt call Nova,Ironic to create any resources. But the stack is equally big, with a stack for every node. i.e not N=1. What you are doing here, is essentially to say we don't create a resource group that then creates N number of role stacks, one for each overcloud node. You are creating a single generic "server" definition per Role. So we drop the resource group and create OS::Triple::{{Role}}.Server 1-time (once). To me it's backwards to push a large struct with properties for N=many nodes into the creation of that stack. Currently the puppet/role-role.yaml creates all the network ports etc. As you only want to create it once, it instead could simply output the UUID of the networks+subnets. These are identical for all servers in the role. So we end up with a small heat stack. Once the stack is created we could use that generic "server" role data to feed into something (ansible?, python?, mistral?) that calls metalsmith to build the servers, then create ports for each server in neutron, one port for each network+subnet defined in the role. Then feed that output into the json (hieradata) that is pushed to each node and used during service configuration, all the things we need to configure network interfaces, /etc/hosts and so on. We need a way to keep track of which ports belong to wich node, but I guess something simple like using the node's ironic UUID in either the name, description or tag field of the neutron port will work. There is also the extra filed in Ironic which is json type, so we could place a map of network->port_uuid in there as well. Another idea I've been pondering is if we put credentials on the overcloud nodes so that the node itself could make the call to neutron on the undercloud to create ports in neutron. I.e we just push the UUID of the correct network and subnet where the resource should be created, and let the overcloud node do the create. The problem with this is that we wouldn't have a way to build the /etc/hosts and probably other things that include ips etc for all the nodes. Maby if all the nodes was part of an etcd cluster, and pushed it's data there? I think the creation of the actual Networks and Subnets can be left in heat, it's typically 5-6 networks and 5-6 subnets so it's not a lot of resources. Even in a large DCN deployment having 50-100 subnets per network or even 50-100 networks I think this is'nt a problem. 
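To make the port bookkeeping idea above a bit more concrete, a very rough
sketch using plain CLI commands (the node, network and subnet names are only
examples, and a real implementation would call the APIs directly rather than
shell out):

NODE=overcloud-novacompute-3   # example; the Ironic node name and the
                               # overcloud hostname may well differ in practice
NODE_UUID=$(openstack baremetal node show "$NODE" -f value -c uuid)
for net in ctlplane internal_api tenant storage; do
    # assumes a <network>_subnet naming convention for the role's subnets
    PORT_ID=$(openstack port create --network "$net" \
              --fixed-ip subnet="${net}_subnet" \
              --description "$NODE_UUID" -f value -c id "${NODE}-${net}")
    # keep the network->port mapping on the Ironic node itself
    openstack baremetal node set "$NODE_UUID" --extra "port_${net}=${PORT_ID}"
done

That would give us both lookups we need later: port->node via the description,
and node->port via the extra field.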
> - We need to consider how we'd manage the Ansible inventory going > forward if we open up an interface for operators to manipulate it > directly. That's something we'd want to manage and preserve (version > control) as it's critical data for the deployment. > > Given the progress that we've made with the POC, my sense is that > we'll keep pushing in this overall direction. I'd like to get some > feedback on the approach. We have an etherpad we are using to track > some of the work at a high level: > > https://etherpad.openstack.org/p/tripleo-reduce-deployment-resources > > I'll be adding some notes on how I setup the POC to that etherpad if > others would like to try it out. > From corey.bryant at canonical.com Fri Jul 12 20:57:29 2019 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 12 Jul 2019 16:57:29 -0400 Subject: [goal][python3] Train unit tests weekly update (goal-9) Message-ID: This is the goal-9 weekly update for the "Update Python 3 test runtimes for Train" goal [1]. There are 9 weeks remaining for completion of Train community goals [2]. == What's the Goal? == To ensure (in the Train cycle) that all official OpenStack repositories with Python 3 unit tests are exclusively using the 'openstack-python3-train-jobs' Zuul template or one of its variants (e.g. 'openstack-python3-train-jobs-neutron') to run unit tests, and that tests are passing. This will ensure that all official projects are running py36 and py37 unit tests in Train. For complete details please see [1]. == Ongoing Work == Patches have been submitted for all applicable projects except for one, OpenStack Charms. I hope to have those submitted next week. Open patches needing reviews: https://review.openstack.org/#/q/topic:python3-train+is:open Failing patches: https://review.openstack.org/#/q/topic:python3-train+status:open+(+label:Verified-1+OR+label:Verified-2+) Patch automation scripts needing review: https://review.opendev.org/#/c/666934 == Completed Work == Merged patches: https://review.openstack.org/#/q/topic:python3-train+is:merged == How can you help? == Please take a look at the failing patches and help fix any failing unit tests for your projects. Python 3.7 unit tests will be self-testing in Zuul. == Reference Material == [1] Goal description: https://governance.openstack.org/tc/goals /train/python3-updates.html [2] Train release schedule: https://releases.openstack.org/train/schedule.html (see R-5 for "Train Community Goals Completed") Storyboard: https://storyboard.openstack.org/#!/story/2005924 Porting to Python 3.7: https://docs.python.org/3/whatsnew/3.7.html#porting-to-python-3-7 Python Update Process: https://opendev.org/openstack/governance/src/branch/master/resolutions/20181024-python-update-process.rst Train runtimes: https://opendev.org/openstack/governance/src/branch/master/reference/runtimes/train.rst Thanks, Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From colleen at gazlene.net Sat Jul 13 00:24:32 2019 From: colleen at gazlene.net (Colleen Murphy) Date: Fri, 12 Jul 2019 17:24:32 -0700 Subject: [keystone] Keystone Team Update - Week of 8 July 2019 Message-ID: <4f0e2f82-1580-4074-8814-d336f918a339@www.fastmail.com> # Keystone Team Update - Week of 8 July 2019 ## News ### Midcycle Planning We've set dates for the week of Milestone 2 for our virtual midcycle[1]. Topic suggestion is still open[2] and I will try to propose a draft agenda early next week. You can help by adding votes to topics you think are especially wortwhile in the etherpad. 
[1] http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007593.html [2] https://etherpad.openstack.org/p/keystone-train-midcycle-topics ### PTG Planning It's also already time to start thinking about the PTG[3]. If you have an inkling of whether you will be able to attend the PTG in Shanghai in November and would participate in the keystone room, please add your name to the planning etherpad[4]. [3] http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007639.html [4] https://etherpad.openstack.org/p/keystone-shanghai-ptg ### Handling Defunct APIs While cleaning up deprecated config options related to PKI tokens[5], we found that these config options affect the OS-SIMPLE-CERT API and discussed[6][7] how best to handle the changing API without needing a version change. If you have thoughts on this matter, let us know in the review. [5] https://review.opendev.org/659434 [6] http://eavesdrop.openstack.org/meetings/keystone/2019/keystone.2019-07-09-16.00.log.html#l-99 [7] http://eavesdrop.openstack.org/irclogs/%23openstack-sdks/%23openstack-sdks.2019-07-11.log.html#t2019-07-11T16:18:01 ### External Auth Prior to federated authentication, we supported external authentication using the REMOTE_USER HTTPD variable. Now that we have federated authentication available to us, we discussed whether it's time to discourage the use of regular external authentication[8]. If you are using external authentication or have an opinion about it, please join in the discussion with us[9]. [8] http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2019-07-09.log.html#t2019-07-09T23:42:19 [9] https://review.opendev.org/669959 ## Open Specs Train specs: https://bit.ly/2uZ2tRl Ongoing specs: https://bit.ly/2OyDLTh Since we do not have anyone who can commit to the implementation of the "Expose root domain as assignment target" spec[10] for Train, the latest patchset proposes it to the backlog rather than to the Train cycle. [10] https://review.opendev.org/661837 ## Recently Merged Changes Search query: https://bit.ly/2pquOwT We merged 10 changes this week. ## Changes that need Attention Search query: https://bit.ly/2tymTje There are 46 changes that are passing CI, not in merge conflict, have no negative reviews and aren't proposed by bots. ## Bugs This week we opened 1 new bugs and closed 1. Bugs opened (1) Bug #1836390 (oslo.policy:Undecided) opened by Vadym Markov https://bugs.launchpad.net/oslo.policy/+bug/1836390 Bugs fixed (1) Bug #1811771 (keystone:Low) fixed by Vishakha Agarwal https://bugs.launchpad.net/keystone/+bug/1811771 Although we did not open many new bugs this week, the backlog of unconfirmed and untriaged bugs has been slowly growing[11], please help by reproducing and triaging new bugs. [11] https://bugs.launchpad.net/keystone/+bugs?search=Search&field.status=New ## Milestone Outlook https://releases.openstack.org/train/schedule.html Milestone 2 is in two weeks, which means spec freeze is coming up followed closely by feature proposal freeze. ## Shout-outs Big thanks to Guang Yee for taking on an overhaul of the tokenless authentication documentation[12] which has been in severe need of an update and cleanup. [12] https://review.opendev.org/669790 ## Help with this newsletter Help contribute to this newsletter by editing the etherpad: https://etherpad.openstack.org/p/keystone-team-newsletter From ianyrchoi at gmail.com Sat Jul 13 01:42:57 2019 From: ianyrchoi at gmail.com (Ian Y. 
Choi) Date: Sat, 13 Jul 2019 10:42:57 +0900 Subject: OpenInfra Days Vietnam 2019 (Hanoi) - Call For Presentations In-Reply-To: References: Message-ID: (Copying to OpenStack & Community mailing lists) Trinh Nguyen wrote on 7/13/2019 1:11 AM: > Hello teams, > > Less than 2 months and the OpenInfra Days Vietnam will happen. We love > to hear the voices of Kata, StarlingX, Airship, Zuul, and other > OpenStack projects team. Please join us in one day event in Hanoi this > August 24th. > > * *Event's website:* http://day.vietopeninfra.org/ > * *Call for Presentation (extended deadline 30th July):* > https://forms.gle/iiRBxxyRv1mGFbgi7 > * *Buy tickets:* > https://ticketbox.vn/event/vietnam-openinfra-days-2019-75375 > > Tell me if you have any questions. > > Yours, > Trinh > > > On Tue, May 21, 2019 at 2:36 PM Trinh Nguyen > wrote: > > Hello, > > Hope you're doing well :) > > The OpenInfra Days Vietnam 2019 [1] is looking for speakers in > many different topics (e.g., container, CI, deployment, edge > computing, etc.). If you would love to have a taste of Hanoi, the > capital of Vietnam, please join us this one-day event and submit > your presentation [2]. > > *- Date:* 24 AUGUST 2019 > *- Location:* INTERCONTINENTAL HANOI LANDMARK72, HANOI, VIETNAM > > Especially this time, we're honored to have the Upstream Institute > Training [3] hosted by the OpenStack Foundation on the next day > (25 August 2019). > > [1] http://day.vietopeninfra.org/ > [2] https://forms.gle/iiRBxxyRv1mGFbgi7 > [3] > https://docs.openstack.org/upstream-training/upstream-training-content.html > > See you in Hanoi! > > Bests, > > On behalf of the VietOpenInfra Group. > > -- > *Trinh Nguyen* > _www.edlab.xyz _ > > > > -- > *Trinh Nguyen* > _www.edlab.xyz _ > From amotoki at gmail.com Sat Jul 13 08:19:18 2019 From: amotoki at gmail.com (Akihiro Motoki) Date: Sat, 13 Jul 2019 17:19:18 +0900 Subject: [all] [ptls] [tc] [nova] [neutron] [tripleo] Volunteers that know TeX for PDF community goal In-Reply-To: <80e2e8550fd39cf9e224e24b4e6ad806acdc9e16.camel@redhat.com> References: <20190624155629.GA26343@sinanju.localdomain> <80e2e8550fd39cf9e224e24b4e6ad806acdc9e16.camel@redhat.com> Message-ID: On Wed, Jun 26, 2019 at 6:24 PM Stephen Finucane wrote: > > On Mon, 2019-06-24 at 16:36 -0700, Michael Johnson wrote: > > 4. We should document how to ignore or re-order the docs. We have an > > internal API reference that comes through as the first section, but is > > of little use to anyone outside the developers. It is also confusing > > as the actual Octavia API-REF link doesn't render. > > I think this happens because it renders pages in the order that it > encounters them. If you ensure there's a table of contents on the index > page of each subsection ('/user', '/admin', '/config', ...), and that > there's a top-level table of contents linking to each of these from > your 'master_doc' as defined in 'conf.py (typically 'index') then > things _should_ render in the correct order. These table of contents > _could_ be hidden, though I haven't tested that yet. > > I plan to rework the nova docs according to the above...as soon as I > get the darn thing building. I found time to explore TOC related stuffs a bit deeper for the neutron document. I share my experiences below. BTW, it looks better to prepare some place to share our knowledge, etherpad? P.S. I still have a trouble in sample config files (relative paths and literalinclude), but it is a different topic. It would be nice if someone can dive into this topic. 
Let's start the detail on TOC. [TOC structure] It seems the HTML and LaTeX builders handle toctree in a bit different manners. The following are what I hit. I noted solutions/workarounds for individual items, but another solution is to use a separate top page (See "Separate top page" below). The following style like [1] is used commonly in our OpenStack project documents. This is useful for HTML documents but it is not for PDF documents :-( Installation Guide ------------------ .. toctree:: :maxdepth: 2 install/index leads to the following TOC in a PDF doc. It does not sound nice :p 1 Installation Guide 1.1 Networking service Installation Guide 2 Networking Guide 2.1 OpenStack Networking Guide To avoid this, we need to use toctree without sections in the top page. [Search page] "search" page works only for HTML pages. It is meaningless in PDF docs. The straight-forward solution is to use "only" directive. .. only:: html * :ref:`search` [URLs in toctree] > > It is also confusing > > as the actual Octavia API-REF link doesn't render. It seems the latex builder does not consider a direct link in toctree like below. In case of the neutron doc, I created a new rst file to point the API reference [2], but it is just a workaround. .. toctree :maxdepth: 2 ... API Reference ... [Separate top page] You can also have a separate top page (master_doc) for PDF document [3]. We usually specify a file specified in 'master_doc' (default: 'index') to 'startdocname' but we can use a different one. latex_documents = [ ('pdf-index', 'neutron.tex', u'Neutron Documentation', u'Neutron development team', 'manual'), ] [TOC level] By default, the first two levels are shown in TOC. If you would like to show deeper levels, you need to set 'tocdepth'. To show the fourth level, the following needs to be configured in doc/source/conf.py [4] (The number starts from 0, so 3 means the fourth level): latex_elements = { 'preamble': r'\setcounter{tocdepth}{3}', } [Stop generating the index] It seems the latex builder always tries to generate the module index by default, but if we do not load all modules in a document the index would be partial and not useful. To disable this, the following configuration in doc/source/conf.py would work: latex_elements = { 'makeindex': '', 'printindex': '', } [1] https://opendev.org/openstack/neutron/blame/commit/9d4161f955681c83bf121d84f1bc335cef758fce/doc/source/index.rst#L37-L44 [2] https://review.opendev.org/#/c/667345/6/doc/source/reference/rest-api.rst [3] https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-latex_documents [4] 'preamble' in https://www.sphinx-doc.org/en/master/latex.html#the-latex-elements-configuration-setting -- Akihiro Motoki (irc:amotoki) From gmann at ghanshyammann.com Sat Jul 13 12:16:32 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sat, 13 Jul 2019 21:16:32 +0900 Subject: [dev][release][qa] patrole stable/stein is created by mistake ? Message-ID: <16beb436c21.c9d98154147017.455401003873410397@ghanshyammann.com> Hi Release team, Today I noticed while doing patrole review that stable/stain has been created for patrole which is wrong. Patrole is branchless[1] and I remember I have not requested the stable branch while releasing the patrole. Is it created by mistakenly ? or intentional? 
[1] https://docs.openstack.org/patrole/latest/overview.html#release-versioning -gmann From gmann at ghanshyammann.com Sat Jul 13 12:19:35 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sat, 13 Jul 2019 21:19:35 +0900 Subject: [dev][release][qa] patrole stable/stein is created by mistake ? In-Reply-To: <16beb436c21.c9d98154147017.455401003873410397@ghanshyammann.com> References: <16beb436c21.c9d98154147017.455401003873410397@ghanshyammann.com> Message-ID: <16beb463457.1193bd7a4147042.5135658897011450644@ghanshyammann.com> ---- On Sat, 13 Jul 2019 21:16:32 +0900 Ghanshyam Mann wrote ---- > Hi Release team, > > Today I noticed while doing patrole review that stable/stain has been created for patrole which is wrong. > Patrole is branchless[1] and I remember I have not requested the stable branch while releasing the patrole. > > Is it created by mistakenly ? or intentional? I found the patch which created this - https://review.opendev.org/#/c/650173/1. Can we revert that but that patch has more changes? -gmann > > [1] https://docs.openstack.org/patrole/latest/overview.html#release-versioning > > -gmann > From aj at suse.com Sat Jul 13 19:08:32 2019 From: aj at suse.com (Andreas Jaeger) Date: Sat, 13 Jul 2019 21:08:32 +0200 Subject: [docs][tc][infra] what to do with developer.openstack.org and api-site? Message-ID: <48048bf0-a79c-6abf-b88f-a1132afc0d6b@suse.com> The website developer.openstack.org has the tagline "Development resources for OpenStack clouds" and hosts development and API information, especially: * An index page * The OpenStack api-guide * The document "Writing Your First OpenStack Application" ("firstapp guide") * The individual api-guides and api-references which are published from the individual projects like Nova or Keystone. The index page, the OpenStack api-guide and the document "Writing your first OpenStack Application" are hosted in the api-site repository which was formerly part of the docs team but is now orphaned. Let's look at the content hosted in api-site repository * The index page needs little updates * The OpenStack Api-Guide: This is a small guide that needs little maintenance (when projects create/retire API versions) * Writing your first OpenStack Application: This was never finished and only covers a few programming language, it's dead. With moving the repo out of docs (see https://review.opendev.org/485249 for change), the hope was that somebody took it over - but this did not happen. We have an official entry point for OpenStack's development resources and it is served by api-site repo and I suggest we look at the situation and solve it fully. I see the following options: 1) Retiring developer.openstack.org completely, this would mean we would host the api-guides and api-references on docs.openstack.org (perhaps with moving them into doc/source). If we go down this road, we need to discuss what this means (redirects) and what to do with the Api-Guide and the FirstApp guide. 2) Fully revitialize the repo and have it owned by an official team or SIG (this means reverting parts of https://review.opendev.org/485249/) 3) Retire the document "Writing your first OpenStack Application", and unretire api-site and have it owned by some official team/SIG. Any other options? What shall we do? Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE Software Solutions Germany GmbH, Maxfeldstr. 
5, D 90409 Nürnberg GF: Nils Brauckmann, Felix Imendörffer, Enrica Angelone, HRB 247165 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From james.slagle at gmail.com Sat Jul 13 20:19:13 2019 From: james.slagle at gmail.com (James Slagle) Date: Sat, 13 Jul 2019 16:19:13 -0400 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> Message-ID: On Fri, Jul 12, 2019 at 3:59 PM Harald Jensås wrote: > I've said this before, but I think we should turn this nova-less > around. Now with nova-less we create a bunch of servers, and write up > the parameters file to use the deployed-server approach. Effectively we > still neet to have the resource group in heat making a server resource > for every server. Creating the fake server resource is fast, because > Heat does'nt call Nova,Ironic to create any resources. But the stack is > equally big, with a stack for every node. i.e not N=1. > > What you are doing here, is essentially to say we don't create a > resource group that then creates N number of role stacks, one for each > overcloud node. You are creating a single generic "server" definition > per Role. So we drop the resource group and create > OS::Triple::{{Role}}.Server 1-time (once). To me it's backwards to push > a large struct with properties for N=many nodes into the creation of > that stack. I'm not entirely following what you're saying is backwards. What I've proposed is that we *don't* have any node specific data in the stack. It sounds like you're saying the way we do it today is backwards. It's correct that what's been proposed with metalsmith currently still requires the full ResourceGroup with a member for each node. With the template changes I'm proposing, that wouldn't be required, so we could actually do the Heat stack first, then metalsmith. > > Currently the puppet/role-role.yaml creates all the network ports etc. > As you only want to create it once, it instead could simply output the > UUID of the networks+subnets. These are identical for all servers in > the role. So we end up with a small heat stack. > > Once the stack is created we could use that generic "server" role data > to feed into something (ansible?, python?, mistral?) that calls > metalsmith to build the servers, then create ports for each server in > neutron, one port for each network+subnet defined in the role. Then > feed that output into the json (hieradata) that is pushed to each node > and used during service configuration, all the things we need to > configure network interfaces, /etc/hosts and so on. We need a way to > keep track of which ports belong to wich node, but I guess something > simple like using the node's ironic UUID in either the name, > description or tag field of the neutron port will work. There is also > the extra filed in Ironic which is json type, so we could place a map > of network->port_uuid in there as well. It won't matter whether we do baremetal provisioning before or after the Heat stack. Heat won't care, as it won't have any expectation to create any servers or that they are already created. We can define where we end up calling the metalsmith piece as it should be independent of the Heat stack if we make these template changes. 
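To give a feel for the operator-facing side of the POC, scaling out an
existing role is essentially appending an entry to the config-download
inventory and re-running the deployment playbooks. Something like the
following (the file path and fact names here are illustrative only, not a
committed interface):

cat >> inventory.yaml <<'EOF'
    overcloud-novacompute-3:
      ansible_host: 192.168.24.23
      ctlplane_ip: 192.168.24.23
      internal_api_ip: 172.16.2.23
      storage_ip: 172.16.1.23
      tenant_ip: 172.16.0.23
      canonical_hostname: overcloud-novacompute-3.localdomain
EOF

The point being that nothing else has to change: no stack update, just a new
inventory entry carrying the small set of node-specific facts Ansible needs.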
> Another idea I've been pondering is if we put credentials on the > overcloud nodes so that the node itself could make the call to neutron > on the undercloud to create ports in neutron. I.e we just push the UUID > of the correct network and subnet where the resource should be created, > and let the overcloud node do the create. The problem with this is that I don't think it would be a good idea to have undercloud credentials on the overcloud nodes. > I think the creation of the actual Networks and Subnets can be left in > heat, it's typically 5-6 networks and 5-6 subnets so it's not a lot of > resources. Even in a large DCN deployment having 50-100 subnets per > network or even 50-100 networks I think this is'nt a problem. Agreed, I'm not specifically proposing we move those pieces at this time. -- -- James Slagle -- From sean.mcginnis at gmx.com Sat Jul 13 20:38:28 2019 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Sat, 13 Jul 2019 15:38:28 -0500 Subject: [dev][release][qa] patrole stable/stein is created by mistake ? In-Reply-To: <16beb463457.1193bd7a4147042.5135658897011450644@ghanshyammann.com> References: <16beb436c21.c9d98154147017.455401003873410397@ghanshyammann.com> <16beb463457.1193bd7a4147042.5135658897011450644@ghanshyammann.com> Message-ID: <20190713203828.GA29711@sm-workstation> On Sat, Jul 13, 2019 at 09:19:35PM +0900, Ghanshyam Mann wrote: > ---- On Sat, 13 Jul 2019 21:16:32 +0900 Ghanshyam Mann wrote ---- > > Hi Release team, > > > > Today I noticed while doing patrole review that stable/stain has been created for patrole which is wrong. > > Patrole is branchless[1] and I remember I have not requested the stable branch while releasing the patrole. > > > > Is it created by mistakenly ? or intentional? > > I found the patch which created this - https://review.opendev.org/#/c/650173/1. > > Can we revert that but that patch has more changes? > > -gmann > We can't really just revert it. If you would like to change this, please update the patrole release type to tempest-plugin or move it to be an independent release deliverable if it is not actually cycle based. You can also remove the branching information with that change, then after that is gone and there isn't a risk of recreating it, the infra team may be able to assist in deleting the existing branch. Sean From gmann at ghanshyammann.com Sun Jul 14 00:26:10 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sun, 14 Jul 2019 09:26:10 +0900 Subject: [dev][release][qa] patrole stable/stein is created by mistake ? In-Reply-To: <20190713203828.GA29711@sm-workstation> References: <16beb436c21.c9d98154147017.455401003873410397@ghanshyammann.com> <16beb463457.1193bd7a4147042.5135658897011450644@ghanshyammann.com> <20190713203828.GA29711@sm-workstation> Message-ID: <16beddf697c.fd5e3e64150304.4626321901014847129@ghanshyammann.com> ---- On Sun, 14 Jul 2019 05:38:28 +0900 Sean McGinnis wrote ---- > On Sat, Jul 13, 2019 at 09:19:35PM +0900, Ghanshyam Mann wrote: > > ---- On Sat, 13 Jul 2019 21:16:32 +0900 Ghanshyam Mann wrote ---- > > > Hi Release team, > > > > > > Today I noticed while doing patrole review that stable/stain has been created for patrole which is wrong. > > > Patrole is branchless[1] and I remember I have not requested the stable branch while releasing the patrole. > > > > > > Is it created by mistakenly ? or intentional? > > > > I found the patch which created this - https://review.opendev.org/#/c/650173/1. > > > > Can we revert that but that patch has more changes? 
> > > > -gmann > > > > We can't really just revert it. If you would like to change this, please update > the patrole release type to tempest-plugin or move it to be an independent > release deliverable if it is not actually cycle based. > > You can also remove the branching information with that change, then after that > is gone and there isn't a risk of recreating it, the infra team may be able to > assist in deleting the existing branch. Release model for patrole is 'cycle-with-intermediary' which is right, we do not need to change that. We just need to remove the stable/stein and branch information which was added for the only stein. I will push the change. Can we have a tag which clearly says the branchless and no-stable branch nature for deliverables? 'tempest-plugins' was introduced for a different purpose. new tag can be applicable for other deliverables also (current or in future). That can help to avoid these errrors. -gmann > > Sean > > From masha.atakova at mail.com Sun Jul 14 12:12:32 2019 From: masha.atakova at mail.com (Mary Atakova) Date: Sun, 14 Jul 2019 15:12:32 +0300 Subject: Horizon and nova-cli inconsistency in live migration parameters Message-ID: <4d5629c4-e342-0bb6-4dd1-dd9d22d2e099@mail.com> Hi everyone, I think I have found an inconsistency between nova cli and Horizon in the way they send live migration commands to nova API. I'm not sure where this needs to be fixed, so please advise on that. The situation: I try to live-migrate a VM, and I do it through nova cli and through Horizon, and nova-api logs contain different data depending on the way I do it. 1. nova cli I run the command: nova live-migration 17359460-d23c-4acc-a9b1-5cf117b54430 and it shows up in nova-api logs as: {"os-migrateLive": {"block_migration": "auto", "host": null}} 2. horizon I run live migration via Horizon with unchecked Disk Overcommit and unchecked Block Migration. It shows up in nova-api logs as: {"os-migrateLive": {"disk_over_commit": false, "block_migration": false, "host": null}} This difference can lead to different parameters provided for live migration when block_migration: auto is translated into block_migration: true. My environment: Openstack Stein CentOS Linux release 7.6.1810 nova versions: # yum list installed | grep nova openstack-nova-api.noarch       1:19.0.1-1.el7 @centos-openstack-stein openstack-nova-common.noarch    1:19.0.1-1.el7 @centos-openstack-stein openstack-nova-conductor.noarch 1:19.0.1-1.el7 @centos-openstack-stein openstack-nova-console.noarch   1:19.0.1-1.el7 @centos-openstack-stein openstack-nova-novncproxy.noarch openstack-nova-placement-api.noarch openstack-nova-scheduler.noarch 1:19.0.1-1.el7 @centos-openstack-stein python2-nova.noarch             1:19.0.1-1.el7 @centos-openstack-stein python2-novaclient.noarch       1:13.0.0-1.el7 @centos-openstack-stein horizon version: # yum list installed | grep dashboard openstack-dashboard.noarch         1:15.1.0-1.el7 @centos-openstack-stein openstack-dashboard-theme.noarch   1:15.1.0-1.el7 @centos-openstack-stein Thank you in advance for your attention Best regards, Mary From masha.atakova at mail.com Sun Jul 14 12:35:29 2019 From: masha.atakova at mail.com (Mary Atakova) Date: Sun, 14 Jul 2019 15:35:29 +0300 Subject: [nova] [horizon] Horizon and nova-cli inconsistency in live migration parameters Message-ID: <98059178-00c5-8bc0-7bbf-bf5c55c53397@mail.com> Hi everyone, I think I have found an inconsistency between nova cli and Horizon in the way they send live migration commands to nova API. 
I'm not sure where this needs to be fixed, so please advise on that. The situation: I try to live-migrate a VM, and I do it through nova cli and through Horizon, and nova-api logs contain different data depending on the way I do it. 1. nova cli I run the command: nova live-migration 17359460-d23c-4acc-a9b1-5cf117b54430 and it shows up in nova-api logs as: {"os-migrateLive": {"block_migration": "auto", "host": null}} 2. horizon I run live migration via Horizon with unchecked Disk Overcommit and unchecked Block Migration. It shows up in nova-api logs as: {"os-migrateLive": {"disk_over_commit": false, "block_migration": false, "host": null}} This difference can lead to different parameters provided for live migration when block_migration: auto is translated into block_migration: true. My environment: Openstack Stein CentOS Linux release 7.6.1810 nova versions: # yum list installed | grep nova openstack-nova-api.noarch       1:19.0.1-1.el7 @centos-openstack-stein openstack-nova-common.noarch    1:19.0.1-1.el7 @centos-openstack-stein openstack-nova-conductor.noarch 1:19.0.1-1.el7 @centos-openstack-stein openstack-nova-console.noarch   1:19.0.1-1.el7 @centos-openstack-stein openstack-nova-novncproxy.noarch openstack-nova-placement-api.noarch openstack-nova-scheduler.noarch 1:19.0.1-1.el7 @centos-openstack-stein python2-nova.noarch             1:19.0.1-1.el7 @centos-openstack-stein python2-novaclient.noarch       1:13.0.0-1.el7 @centos-openstack-stein horizon version: # yum list installed | grep dashboard openstack-dashboard.noarch         1:15.1.0-1.el7 @centos-openstack-stein openstack-dashboard-theme.noarch   1:15.1.0-1.el7 @centos-openstack-stein Thank you in advance for your attention Best regards, Mary -------------- next part -------------- An HTML attachment was scrubbed... URL: From bansalnehal26 at gmail.com Fri Jul 12 05:00:31 2019 From: bansalnehal26 at gmail.com (Nehal Bansal) Date: Fri, 12 Jul 2019 10:30:31 +0530 Subject: [Ceilometer] [gnocchi] Regarding problem in installation of gnocchi-api Message-ID: Hi, I am trying to install Ceilometer service with Openstack queens release. On installing gnocchi-api with apt-get install gnocchi-api I get the following error: Registering service and endpoints for gnocchi with type metric at http://192.168.133.81:8041 Failed to discover available identity versions when contacting http://127.0.0.1:35357/v3/. Attempting to parse version from URL. Unable to establish connection to http://127.0.0.1:35357/v3/auth/tokens: HTTPConnectionPool(host='127.0.0.1', port=35357): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused',)) Warning - data is empty Failed to discover available identity versions when contacting http://127.0.0.1:35357/v3/. Attempting to parse version from URL. Unable to establish connection to http://127.0.0.1:35357/v3/auth/tokens: HTTPConnectionPool(host='127.0.0.1', port=35357): Max retries exceeded with url: /v3/auth/tokens (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 111] Connection refused',)) dpkg: error processing package gnocchi-api (--configure): subprocess installed post-installation script returned error exit status 1 Errors were encountered while processing: gnocchi-api E: Sub-process /usr/bin/dpkg returned an error code (1) Kindly, let me know how to proceed from here. Regards, Nehal -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fungi at yuggoth.org Sun Jul 14 22:34:13 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Sun, 14 Jul 2019 22:34:13 +0000 Subject: [tc][ptl][infra] Resolution: Mandatory Repository Retirement Message-ID: <20190714223412.wc34j47lvgd3kykm@yuggoth.org> When an effort ceases to be an official part (deliverable) of OpenStack with the intent to continue development outside OpenStack's technical governance, the maintainers of that codebase may end up later abandoning it or making other choices at odds with OpenStack policies with no clear indication to consumers. Some folks may have been using it when it was still part of OpenStack, and so may continue to have expectations which are no longer in sync with reality. To solve this, a resolution has been proposed asserting that software which is removed from OpenStack's technical governance in the future can only be continued as a fork, so that its original Git repository in the "openstack" namespace on OpenDev can be retired with a clear message letting consumers of that source code know what has happened and where it has gone. Some breathing room is allowed, as the initial draft includes measures to allow any deliverable repositories which move out of OpenStack prior to the end of the current development cycle (Train) to be renamed into new OpenDev namespaces with a redirect before the new policy goes into effect. The text of the proposed resolution can be found here: https://review.opendev.org/670741 Please follow up on the above review with any comments/concerns, or you can express them as a reply to this ML thread, or in private to me or any other TC members. Thanks for reading! -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From berndbausch at gmail.com Sun Jul 14 22:39:42 2019 From: berndbausch at gmail.com (Bernd Bausch) Date: Mon, 15 Jul 2019 07:39:42 +0900 Subject: [Ceilometer] [gnocchi] Regarding problem in installation of gnocchi-api In-Reply-To: References: Message-ID: <1539d343-766b-33fb-742c-1c2b7e9f9469@gmail.com> Either Keystone is not running, but that would cause other problems in the cloud. In short, the cloud would not function at all. Or the URL http://127.0.0.1:35357 in the Gnocchi configuration is incorrect. 127.0.0.1 (localhost) is an unusual address, to say the least; try 192.168.133.81 instead. Also ensure that Keystone listens at port 35357 and correct it if not. Bernd. On 7/12/2019 2:00 PM, Nehal Bansal wrote: > I am trying to install Ceilometer service with Openstack queens > release. On installing gnocchi-api with apt-get install gnocchi-api I > get the following error: > > Registering service and endpoints for gnocchi with type metric at > http://192.168.133.81:8041 > Failed to discover available identity versions when contacting > http://127.0.0.1:35357/v3/. Attempting to parse version from URL. > Unable to establish connection to > http://127.0.0.1:35357/v3/auth/tokens: > HTTPConnectionPool(host='127.0.0.1', port=35357): Max retries exceeded > with url: /v3/auth/tokens (Caused by > NewConnectionError(' 0x7f0bf2672250>: Failed to establish a new connection: [Errno 111] > Connection refused',)) > > Kindly, let me know how to proceed from here. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dangtrinhnt at gmail.com Mon Jul 15 01:23:01 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 15 Jul 2019 10:23:01 +0900 Subject: [telemetry] Looking for new meeting time Message-ID: Hi team, I've changed job and could not afford doing the meeting at 02:00 UTC on Thursday anymore. My possible time slot is from 13:00-15:00 UTC Tuesday-Thursday. Please propose a better time frame for our team meeting. Sincerely, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From dangtrinhnt at gmail.com Mon Jul 15 01:26:55 2019 From: dangtrinhnt at gmail.com (Trinh Nguyen) Date: Mon, 15 Jul 2019 10:26:55 +0900 Subject: [searchlight] Team meeting today Message-ID: Hi team, Train-2 milestone is coming. Let's get together for a team meeting today if possible. There are a couple things we need to finish. Yours, -- *Trinh Nguyen* *www.edlab.xyz * -------------- next part -------------- An HTML attachment was scrubbed... URL: From anlin.kong at gmail.com Mon Jul 15 04:04:55 2019 From: anlin.kong at gmail.com (Lingxian Kong) Date: Mon, 15 Jul 2019 16:04:55 +1200 Subject: [telemetry] Looking for new meeting time In-Reply-To: References: Message-ID: congratulations first :-) I'm on UTC+12, so meeting during 13:00-15:00 UTC is hard for me :-( Best regards, Lingxian Kong Catalyst Cloud On Mon, Jul 15, 2019 at 1:23 PM Trinh Nguyen wrote: > Hi team, > > I've changed job and could not afford doing the meeting at 02:00 UTC on > Thursday anymore. My possible time slot is from 13:00-15:00 UTC > Tuesday-Thursday. > > Please propose a better time frame for our team meeting. > > Sincerely, > > -- > *Trinh Nguyen* > *www.edlab.xyz * > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From noonedeadpunk at ya.ru Mon Jul 15 05:28:17 2019 From: noonedeadpunk at ya.ru (=?UTF-8?B?0JTQvNC40YLRgNC40Lkg0KDQsNCx0L7RgtGP0LPQvtCy?=) Date: Mon, 15 Jul 2019 08:28:17 +0300 Subject: [Ceilometer] [gnocchi] Regarding problem in installation of gnocchi-api In-Reply-To: References: Message-ID: Hi, This looks like packaging problem while running postinst hook since endpoint is hardcoded in pkgos_get_id(). As a workaround I may advice you to configure gnocchi as wsgi application: https://gnocchi.xyz/operating.html#running-api-as-a-wsgi-application or just install service using pip (I guess you'll need 2.x version for Queens). пт, 12 июл. 2019 г. в 08:00, Nehal Bansal : > Hi, > I am trying to install Ceilometer service with Openstack queens release. > On installing gnocchi-api with apt-get install gnocchi-api I get the > following error: > > Registering service and endpoints for gnocchi with type metric at > http://192.168.133.81:8041 > Failed to discover available identity versions when contacting > http://127.0.0.1:35357/v3/. Attempting to parse version from URL. > Unable to establish connection to http://127.0.0.1:35357/v3/auth/tokens: > HTTPConnectionPool(host='127.0.0.1', port=35357): Max retries exceeded with > url: /v3/auth/tokens (Caused by > NewConnectionError(' 0x7f0bf2672250>: Failed to establish a new connection: [Errno 111] > Connection refused',)) > Warning - data is empty > Failed to discover available identity versions when contacting > http://127.0.0.1:35357/v3/. Attempting to parse version from URL. 
> Unable to establish connection to http://127.0.0.1:35357/v3/auth/tokens: > HTTPConnectionPool(host='127.0.0.1', port=35357): Max retries exceeded with > url: /v3/auth/tokens (Caused by > NewConnectionError(' 0x7fa1c8b0cb90>: Failed to establish a new connection: [Errno 111] > Connection refused',)) > dpkg: error processing package gnocchi-api (--configure): > subprocess installed post-installation script returned error exit status 1 > Errors were encountered while processing: > gnocchi-api > E: Sub-process /usr/bin/dpkg returned an error code (1) > > Kindly, let me know how to proceed from here. > > Regards, > Nehal > -- -- С Уважением, Дмитрий -------------- next part -------------- An HTML attachment was scrubbed... URL: From rony.khan at brilliant.com.bd Mon Jul 15 06:26:16 2019 From: rony.khan at brilliant.com.bd (Md. Farhad Hasan Khan) Date: Mon, 15 Jul 2019 12:26:16 +0600 Subject: OpenStack Neutron IPv6 Message-ID: <03c201d53ad6$3555de90$a0019bb0$@brilliant.com.bd> Hi, How to configure IPv6 in OpenStack Rocky neutron? Please help me. Thanks & B'Rgds, Rony -------------- next part -------------- An HTML attachment was scrubbed... URL: From ruslanas at lpic.lt Mon Jul 15 06:42:27 2019 From: ruslanas at lpic.lt (=?UTF-8?Q?Ruslanas_G=C5=BEibovskis?=) Date: Mon, 15 Jul 2019 08:42:27 +0200 Subject: OpenStack Neutron IPv6 In-Reply-To: <03c201d53ad6$3555de90$a0019bb0$@brilliant.com.bd> References: <03c201d53ad6$3555de90$a0019bb0$@brilliant.com.bd> Message-ID: What do you want to configure with IPv6? Provider network? Local network? VIM? Provider and Local networks can be configured using horizon very easily, just in a drop down specify IPv6. I think it do not require any additional actions, unlease during cloud setup you specified another way. On Mon, 15 Jul 2019 at 08:30, Md. Farhad Hasan Khan < rony.khan at brilliant.com.bd> wrote: > Hi, > > How to configure IPv6 in OpenStack Rocky neutron? Please help me. > > > > Thanks & B’Rgds, > > Rony > > > > -- Ruslanas Gžibovskis +370 6030 7030 -------------- next part -------------- An HTML attachment was scrubbed... URL: From sungn2 at lenovo.com Mon Jul 15 06:55:16 2019 From: sungn2 at lenovo.com (Guannan GN2 Sun) Date: Mon, 15 Jul 2019 06:55:16 +0000 Subject: [ironic] Using ironic in a devstack environment to deploy OS on bare-metal host. Message-ID: <8a82c02e5ae64fc79e02e10629a65556@lenovo.com> Hi team, I met some problem when using ironic in a devstack environment to deploy OS on bare-metal host. By now I can successfully boot into the tiny os and set ip, also I can ping the ip from my devstack server. However, when the tiny os running some configuration, it will failed after about 5 minutes. I found some log information in "/var/log/ironic-python-agent.log" in the tiny os that may relate to the problem: "DEBUG oslo_concurrency.processutils [-] u'iscsistart -f' failed. Not Retrying. execute /usr/local/lib/python2.7/site-packages/oslo_concurrency/processutils.py:457" "DEBUG root [-] No iscsi connection detected. Skipping iscsi. Error: [Errno 2] No such file or directory _check_for_iscsi /usr/local/lib/python2.7/site-packages/ironic_python_agent/hardware.py:107" "WARNING root [-] The root device was not detected in 27 seconds: DeviceNotFound: Error finding the disk or partition device to deploy the image onto: No suitable device was found for deployment - root device hints were not provided and all found block devices are smaller than 4294967296B." 
"WARNING root [-] Can't find field vendor for device usb0 in device class net: IOError: [Errno 2] No such file or directory: '/sys/class/net/usb0/device/vendor'" "WARNING root [-] Can't find field vendor for device tunl0 in device class net: IOError: [Errno 2] No such file or directory: '/sys/class/net/tunl0/device/vendor'" "DEBUG root [-] No Mellanox devices found evaluate_hardware_support /usr/local/lib/python2.7/site-packages/ironic_python_agent/hardware_managers/mlnx.py:84" And when it deploy faild and I run command "openstack baremetal node show server-51", it will show following information on 'last_error' field: "Asynchronous exception: Node failed to deploy. Exception: Connection to agent failed: Failed to connect to the agent running on node f8794658-bc69-40cb-9673-be75f78ced21 for invoking command iscsi.start_iscsi_target. Error: ('Connection aborted.', BadStatusLine("''",)) for node" I'm trying to debug, but have not find the root cause, please give me your advice if someone have experience on it. Thank you! Best Regards, Guannan -------------- next part -------------- An HTML attachment was scrubbed... URL: From manulachathurika at gmail.com Mon Jul 15 08:44:11 2019 From: manulachathurika at gmail.com (Manula Thantriwatte) Date: Mon, 15 Jul 2019 14:14:11 +0530 Subject: ImportError: No module named django.core.wsgi - DevStack Message-ID: Hi All, I have successfully install DevStack in Ubuntu 18.04. But when I'm accessing the dashboard I'm getting 500 error. What I'm getting in horizon_error.log is, 2019-07-15 14:10:43.218296 mod_wsgi (pid=31763): Target WSGI script '/opt/stack/horizon/openstack_dashboard/wsgi.py' cannot be loaded as Python module. 2019-07-15 14:10:43.218323 mod_wsgi (pid=31763): Exception occurred processing WSGI script '/opt/stack/horizon/openstack_dashboard/wsgi.py'. 2019-07-15 14:10:43.218349 Traceback (most recent call last): 2019-07-15 14:10:43.218370 File "/opt/stack/horizon/openstack_dashboard/wsgi.py", line 21, in 2019-07-15 14:10:43.218401 from django.core.wsgi import get_wsgi_application 2019-07-15 14:10:43.218422 ImportError: No module named django.core.wsgi Python Version is : 3.6.8 Django version is : 2.0.5 I tried with uninstalling Djanago and reinstalling it. But it didn't work for me. Can someone help me on how to resole this issue ? Thanks ! -- Regards, Manula Chathurika Thantriwatte phone : (+94) 772492511 email : manulachathurika at gmail.com Linkedin : *http://lk.linkedin.com/in/manulachathurika * blog : http://manulachathurika.blogspot.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hjensas at redhat.com Mon Jul 15 09:12:46 2019 From: hjensas at redhat.com (Harald =?ISO-8859-1?Q?Jens=E5s?=) Date: Mon, 15 Jul 2019 11:12:46 +0200 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> Message-ID: On Sat, 2019-07-13 at 16:19 -0400, James Slagle wrote: > On Fri, Jul 12, 2019 at 3:59 PM Harald Jensås > wrote: > > I've said this before, but I think we should turn this nova-less > > around. Now with nova-less we create a bunch of servers, and write > > up > > the parameters file to use the deployed-server approach. > > Effectively we > > still neet to have the resource group in heat making a server > > resource > > for every server. Creating the fake server resource is fast, > > because > > Heat does'nt call Nova,Ironic to create any resources. But the > > stack is > > equally big, with a stack for every node. 
i.e not N=1. > > > > What you are doing here, is essentially to say we don't create a > > resource group that then creates N number of role stacks, one for > > each > > overcloud node. You are creating a single generic "server" > > definition > > per Role. So we drop the resource group and create > > OS::Triple::{{Role}}.Server 1-time (once). To me it's backwards to > > push > > a large struct with properties for N=many nodes into the creation > > of > > that stack. > > I'm not entirely following what you're saying is backwards. What I've > proposed is that we *don't* have any node specific data in the stack. > It sounds like you're saying the way we do it today is backwards. > What I mean to say is that I think the way we are integrating nova-less by first deploying the servers, to then provide the data to Heat to create the resource groups as we do today becomes backwards when your work on N=1 is introduced. > It's correct that what's been proposed with metalsmith currently > still > requires the full ResourceGroup with a member for each node. With the > template changes I'm proposing, that wouldn't be required, so we > could > actually do the Heat stack first, then metalsmith. > Yes, this is what I think we should do. Especially if your changes here removes the resource group entirely. It makes more sense to create the stack, and once that is created we can do deployment, scaling etc without updating the stack again. > > Currently the puppet/role-role.yaml creates all the network ports > > etc. > > As you only want to create it once, it instead could simply output > > the > > UUID of the networks+subnets. These are identical for all servers > > in > > the role. So we end up with a small heat stack. > > > > Once the stack is created we could use that generic "server" role > > data > > to feed into something (ansible?, python?, mistral?) that calls > > metalsmith to build the servers, then create ports for each server > > in > > neutron, one port for each network+subnet defined in the role. Then > > feed that output into the json (hieradata) that is pushed to each > > node > > and used during service configuration, all the things we need to > > configure network interfaces, /etc/hosts and so on. We need a way > > to > > keep track of which ports belong to wich node, but I guess > > something > > simple like using the node's ironic UUID in either the name, > > description or tag field of the neutron port will work. There is > > also > > the extra filed in Ironic which is json type, so we could place a > > map > > of network->port_uuid in there as well. > > It won't matter whether we do baremetal provisioning before or after > the Heat stack. Heat won't care, as it won't have any expectation to > create any servers or that they are already created. We can define > where we end up calling the metalsmith piece as it should be > independent of the Heat stack if we make these template changes. > This is true. But, in your previous mail in this thread you wrote: """ Other points: - Baremetal provisioning and port creation are presently handled by Heat. With the ongoing efforts to migrate baremetal provisioning out of Heat (nova-less deploy), I think these efforts are very complimentary. Eventually, we get to a point where Heat is not actually creating any other OpenStack API resources. For now, the patches only work when using pre-provisioned nodes. """ IMO "baremetal provision and port creation" fit together. (I read the above statement so as well.) 
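Something along these lines, very roughly (I am writing the metalsmith options
from memory, so treat the exact flags as assumptions rather than a working
recipe, and the names are just examples):

metalsmith deploy --resource-class baremetal --image overcloud-full \
    --network ctlplane --hostname overcloud-novacompute-3
for net in internal_api tenant storage; do
    openstack port create --network "$net" \
        --description overcloud-novacompute-3 "overcloud-novacompute-3-${net}"
done

i.e. the provisioning step that creates the ctlplane port is also the natural
place to create the rest of the node's ports.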
Currently nova-less creates the ctlplane port and provision the baremetal node. If we want to do both baremetal provisioning and port creation togheter (I think this makes sense), we have to do it after the stack has created the networks. What I envision is to have one method that creates all the ports, ctlplane + composable networks in a unified way. Today these are created differently, the ctlplane port is part of the server resource (or metalsmith in nova-less case) and the other ports are created by heat. > > I think the creation of the actual Networks and Subnets can be left > > in > > heat, it's typically 5-6 networks and 5-6 subnets so it's not a lot > > of > > resources. Even in a large DCN deployment having 50-100 subnets per > > network or even 50-100 networks I think this is'nt a problem. > > Agreed, I'm not specifically proposing we move those pieces at this > time. > +1 From thierry at openstack.org Mon Jul 15 09:50:05 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 15 Jul 2019 11:50:05 +0200 Subject: [docs][tc][infra] what to do with developer.openstack.org and api-site? In-Reply-To: <48048bf0-a79c-6abf-b88f-a1132afc0d6b@suse.com> References: <48048bf0-a79c-6abf-b88f-a1132afc0d6b@suse.com> Message-ID: Andreas Jaeger wrote: > [...] > I see the following options: > > 1) Retiring developer.openstack.org completely, this would mean we would > host the api-guides and api-references on docs.openstack.org (perhaps > with moving them into doc/source). If we go down this road, we need to > discuss what this means (redirects) and what to do with the Api-Guide > and the FirstApp guide. > > 2) Fully revitialize the repo and have it owned by an official team or > SIG (this means reverting parts of https://review.opendev.org/485249/) > > 3) Retire the document "Writing your first OpenStack Application", and > unretire api-site and have it owned by some official team/SIG. > > Any other options? What shall we do? Thanks Andreas for raising this. As an extra data point, my long-term plan was to have SDKs and CLIs properly listed in the Software pages under SDKs[1], including third-party ones in their own subtab, all driven from the osf/openstack-map repository[2]. With that in mind, I think it would make sense to look into retiring developer.openstack.org, and move docs to docs.openstack.org. We could also revive https://www.openstack.org/appdev/ and use it as the base landing page to direct application-side people to the various pieces. [1] https://www.openstack.org/software/project-navigator/sdks [2] https://opendev.org/osf/openstack-map/ -- Thierry Carrez (ttx) From thierry at openstack.org Mon Jul 15 10:30:26 2019 From: thierry at openstack.org (Thierry Carrez) Date: Mon, 15 Jul 2019 12:30:26 +0200 Subject: [dev][release][qa] patrole stable/stein is created by mistake ? 
In-Reply-To: <16beddf697c.fd5e3e64150304.4626321901014847129@ghanshyammann.com> References: <16beb436c21.c9d98154147017.455401003873410397@ghanshyammann.com> <16beb463457.1193bd7a4147042.5135658897011450644@ghanshyammann.com> <20190713203828.GA29711@sm-workstation> <16beddf697c.fd5e3e64150304.4626321901014847129@ghanshyammann.com> Message-ID: <0a85c82c-f312-a8a9-34af-f5b812c8f2b3@openstack.org> Ghanshyam Mann wrote: > ---- On Sun, 14 Jul 2019 05:38:28 +0900 Sean McGinnis wrote ---- > > On Sat, Jul 13, 2019 at 09:19:35PM +0900, Ghanshyam Mann wrote: > > > ---- On Sat, 13 Jul 2019 21:16:32 +0900 Ghanshyam Mann wrote ---- > > > > Hi Release team, > > > > > > > > Today I noticed while doing patrole review that stable/stain has been created for patrole which is wrong. > > > > Patrole is branchless[1] and I remember I have not requested the stable branch while releasing the patrole. > > > > > > > > Is it created by mistakenly ? or intentional? > > > > > > I found the patch which created this - https://review.opendev.org/#/c/650173/1. > > > > > > Can we revert that but that patch has more changes? > > > > > > -gmann > > > > > > > We can't really just revert it. If you would like to change this, please update > > the patrole release type to tempest-plugin or move it to be an independent > > release deliverable if it is not actually cycle based. > > > > You can also remove the branching information with that change, then after that > > is gone and there isn't a risk of recreating it, the infra team may be able to > > assist in deleting the existing branch. > > Release model for patrole is 'cycle-with-intermediary' which is right, we do not need > to change that. We just need to remove the stable/stein and branch information which > was added for the only stein. I will push the change. > > Can we have a tag which clearly says the branchless and no-stable branch nature for deliverables? > > 'tempest-plugins' was introduced for a different purpose. new tag can be applicable for other > deliverables also (current or in future). That can help to avoid these errrors. Creating a new release model or a new deliverable type sounds a bit overkill. I think the simpler would be to add a new value to the existing "stable-branch-type" key. Like "stable-branch-type: none" and then set that value for all tempest plugins, tempest itself and patrole. See https://review.opendev.org/670808 as a strawman. -- Thierry Carrez (ttx) From tomas.bredar at gmail.com Mon Jul 15 11:10:49 2019 From: tomas.bredar at gmail.com (=?UTF-8?B?VG9tw6HFoSBCcmVkw6Fy?=) Date: Mon, 15 Jul 2019 13:10:49 +0200 Subject: [tripleo][cinder][netapp] In-Reply-To: References: Message-ID: Hi Alan! Thanks for the pointers. For now I'm going with a single backend with two NFS shares, so I'll use the tripleo templates for netapp. For the future, could you point me to the right template / puppet-cinder code which can create multiple nfs share files for me? Or should I create my own puppet manifest? Thanks again. Tomas pi 12. 7. 2019 o 16:11 Alan Bishop napísal(a): > > On Fri, Jul 12, 2019 at 6:09 AM Tomáš Bredár > wrote: > >> Hi Emilien! >> >> Thanks for your help. Yes with this I am able to define multiple stanzas >> in cinder.conf. However netapp driver needs a .conf file with the nfs >> shares listed in it. Defining multiple configuration files with nfs share >> details in each is not possible with the manual you've sent nor with the >> templates in my first email. 
>> > > Hi Tomas, > > When deploying a single backend, the tripleo template takes care of > generating the nfs shares file (actually, puppet-cinder generates the file, > but it's triggered by tripleo). But when you use the custom backend method > that Emilien pointed you to use, then you are responsible for supplying all > the pieces for the backend(s) to function correctly. This means you will > need to generate the nfs shares file on the host (controller), and then > bind mount the file using CinderVolumeOptVolumes so that the shares file on > the host is visible to the cinder-volume process running in a container. > > I'm wondering if it's possible to define a second backend by creating >> another service, for example "OS::TripleO::Services::CinderBackendNetApp2" ? >> > > Sorry, this won't work. TripleO will trying to deploy two completely > separate instances of the cinder-volume service, but the two deployments > will step all over each other. There has been a long standing goal of > enhancing tripleo so that it can deploy multiple instances of a cinder > backend, but it's a complex task that will require non-trivial changes to > tripleo. > > Alan > > Tomas >> >> št 11. 7. 2019 o 14:35 Emilien Macchi napísal(a): >> >>> On Thu, Jul 11, 2019 at 7:32 AM Tomáš Bredár >>> wrote: >>> >>>> Hi community, >>>> >>>> I'm trying to define multiple NetApp storage backends via Tripleo >>>> installer. >>>> According to [1] the puppet manifest supports multiple backends. >>>> The current templates [2] [3] support only single backend. >>>> Does anyone know how to define multiple netapp backends in the >>>> tripleo-heat environment files / templates? >>>> >>> >>> We don't support that via the templates that you linked, however if you >>> follow this manual you should be able to configure multiple NetApp backends: >>> >>> https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/cinder_custom_backend.html >>> >>> Let us know how it worked! >>> -- >>> Emilien Macchi >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From skaplons at redhat.com Mon Jul 15 12:30:56 2019 From: skaplons at redhat.com (Slawek Kaplonski) Date: Mon, 15 Jul 2019 14:30:56 +0200 Subject: [neutron] Bug deputy - week of July 8th Message-ID: Hi, I was bug deputy last week. It was relatively easy week. 
Below is my summary of what happened there: *High* https://bugs.launchpad.net/neutron/+bug/1835731 - DVR router with configured external_network_bridge option don't have external connectivity, patch proposed https://review.opendev.org/#/c/669640/ https://bugs.launchpad.net/neutron/+bug/1836023 - OVS agent "hangs" while processing trusted ports - fix proposed, I added loadimact tag to it, https://bugs.launchpad.net/neutron/+bug/1836095 - Improve "OVSFirewallDriver.process_trusted_ports" - fix also in progress - this is “child” of 1836023 https://bugs.launchpad.net/neutron/+bug/1836015 - [neutron-fwaas]firewall goup status is inactive when updating policy in fwg - fix proposed already *Medium* https://bugs.launchpad.net/neutron/+bug/1835914 - Test test_show_network_segment_range failing - gate failure issue, set to medium as it is not happening very often - we need volunteer for this one, https://bugs.launchpad.net/neutron/+bug/1836028 - Functional Test script results in an authentication error, fix in progress: https://review.opendev.org/#/c/670021/ https://bugs.launchpad.net/neutron/+bug/1836037 - Routed provider networks nova inventory update fails - fix in progress: https://review.opendev.org/670105 https://bugs.launchpad.net/neutron/+bug/1836253 - Sometimes InstanceMetada API returns 404 due to invalid InstaceID returned by _get_instance_and_tenant_id() - we need volunteer for this https://bugs.launchpad.net/neutron/+bug/1836565 - Functional test test_keepalived_state_change_notification may fail do to race condition - fix in progress: https://review.opendev.org/670815 *Low* https://bugs.launchpad.net/neutron/+bug/1836263 - [DOCS] doc: PUT /ports/{port_id} updates selectively - we need volunteer for this one *Incomplete* https://bugs.launchpad.net/neutron/+bug/1835848 - Check the list of ports, fixed IP is empty and still connected devices are displayed - waiting for some more info from reporter https://bugs.launchpad.net/neutron/+bug/1775644 - Neutron fwaas v2 group port binding failed - waiting for info from reporter *Old bugs revived recently* https://bugs.launchpad.net/neutron/+bug/1806032 - shim API extension proposed by Bence was merged long time ago but nothing else happens since then, maybe we should get back to this? — Slawek Kaplonski Senior software engineer Red Hat From abishop at redhat.com Mon Jul 15 13:56:05 2019 From: abishop at redhat.com (Alan Bishop) Date: Mon, 15 Jul 2019 06:56:05 -0700 Subject: [tripleo][cinder][netapp] In-Reply-To: References: Message-ID: On Mon, Jul 15, 2019 at 4:11 AM Tomáš Bredár wrote: > Hi Alan! > Thanks for the pointers. For now I'm going with a single backend with two > NFS shares, so I'll use the tripleo templates for netapp. > For the future, could you point me to the right template / puppet-cinder > code which can create multiple nfs share files for me? Or should I create > my own puppet manifest? > Hi Tomas, The puppet-cinder code that renders the shares config file is [1], and the data comes from puppet-tripleo [2]. This is puppet hiera data, and the value is bound to the CindeNetappNfsShares tripleo parameter [3]. So you should be able to deploy a single backend that accesses multiple shares by adding something like this to your tripleo deployment. 
parameter_defaults: CinderNetappNfsShares: 'host_1:/path/to/share_1,host_2:/path/to/share_2' [1] https://opendev.org/openstack/puppet-cinder/src/branch/stable/queens/manifests/backend/netapp.pp#L280 [2] https://opendev.org/openstack/puppet-tripleo/src/branch/stable/queens/manifests/profile/base/cinder/volume/netapp.pp#L38 [3] https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/queens/puppet/services/cinder-backend-netapp.yaml#L137 Alan Thanks again. > > Tomas > > pi 12. 7. 2019 o 16:11 Alan Bishop napísal(a): > >> >> On Fri, Jul 12, 2019 at 6:09 AM Tomáš Bredár >> wrote: >> >>> Hi Emilien! >>> >>> Thanks for your help. Yes with this I am able to define multiple stanzas >>> in cinder.conf. However netapp driver needs a .conf file with the nfs >>> shares listed in it. Defining multiple configuration files with nfs share >>> details in each is not possible with the manual you've sent nor with the >>> templates in my first email. >>> >> >> Hi Tomas, >> >> When deploying a single backend, the tripleo template takes care of >> generating the nfs shares file (actually, puppet-cinder generates the file, >> but it's triggered by tripleo). But when you use the custom backend method >> that Emilien pointed you to use, then you are responsible for supplying all >> the pieces for the backend(s) to function correctly. This means you will >> need to generate the nfs shares file on the host (controller), and then >> bind mount the file using CinderVolumeOptVolumes so that the shares file on >> the host is visible to the cinder-volume process running in a container. >> >> I'm wondering if it's possible to define a second backend by creating >>> another service, for example "OS::TripleO::Services::CinderBackendNetApp2" ? >>> >> >> Sorry, this won't work. TripleO will trying to deploy two completely >> separate instances of the cinder-volume service, but the two deployments >> will step all over each other. There has been a long standing goal of >> enhancing tripleo so that it can deploy multiple instances of a cinder >> backend, but it's a complex task that will require non-trivial changes to >> tripleo. >> >> Alan >> >> Tomas >>> >>> št 11. 7. 2019 o 14:35 Emilien Macchi napísal(a): >>> >>>> On Thu, Jul 11, 2019 at 7:32 AM Tomáš Bredár >>>> wrote: >>>> >>>>> Hi community, >>>>> >>>>> I'm trying to define multiple NetApp storage backends via Tripleo >>>>> installer. >>>>> According to [1] the puppet manifest supports multiple backends. >>>>> The current templates [2] [3] support only single backend. >>>>> Does anyone know how to define multiple netapp backends in the >>>>> tripleo-heat environment files / templates? >>>>> >>>> >>>> We don't support that via the templates that you linked, however if you >>>> follow this manual you should be able to configure multiple NetApp backends: >>>> >>>> https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/cinder_custom_backend.html >>>> >>>> Let us know how it worked! >>>> -- >>>> Emilien Macchi >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From raubvogel at gmail.com Mon Jul 15 14:19:19 2019 From: raubvogel at gmail.com (Mauricio Tavares) Date: Mon, 15 Jul 2019 10:19:19 -0400 Subject: Configuring the network interfaces of an instance Message-ID: If I have an instance (vm guest) with more than one network interface, how can I tell it during the install process to use a specific one? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alifshit at redhat.com Mon Jul 15 14:24:36 2019 From: alifshit at redhat.com (Artom Lifshitz) Date: Mon, 15 Jul 2019 10:24:36 -0400 Subject: [nova][neutron][CI] test_server_connectivity_cold_migration_revert failing again Message-ID: Looks like we merged [1] too soon, and despite external events handling during migration revert now being handled correctly in Nova with the merger of [2], there's something else broken, perhaps at the Neutron level, that prevents test_server_connectivity_cold_migration_revert from consistently passing. Until we figure out what that "something" is, I've proposed to skip the test in tempest [3] and filed a bug for it [4]. The bug is filed under Neutron, but that's more of a placeholder based on preliminary guessing. We can easily change it to Nova or some other component once we understand better what's going on. [1] https://review.opendev.org/#/c/663405/ [2] https://review.opendev.org/#/c/667177/ [3] https://review.opendev.org/#/c/670848/1 [4] https://bugs.launchpad.net/neutron/+bug/1836595 From smooney at redhat.com Mon Jul 15 14:28:33 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 15 Jul 2019 15:28:33 +0100 Subject: [docs][tc][infra] what to do with developer.openstack.org and api-site? In-Reply-To: References: <48048bf0-a79c-6abf-b88f-a1132afc0d6b@suse.com> Message-ID: <116050ecdd0b7c5ecbe728d914b67d3f0770a2ea.camel@redhat.com> On Mon, 2019-07-15 at 11:50 +0200, Thierry Carrez wrote: > Andreas Jaeger wrote: > > [...] > > I see the following options: > > > > 1) Retiring developer.openstack.org completely, this would mean we would > > host the api-guides and api-references on docs.openstack.org (perhaps > > with moving them into doc/source). If we go down this road, we need to > > discuss what this means (redirects) and what to do with the Api-Guide > > and the FirstApp guide. > > > > 2) Fully revitialize the repo and have it owned by an official team or > > SIG (this means reverting parts of https://review.opendev.org/485249/) > > > > 3) Retire the document "Writing your first OpenStack Application", and > > unretire api-site and have it owned by some official team/SIG. > > > > Any other options? What shall we do? > > Thanks Andreas for raising this. > > As an extra data point, my long-term plan was to have SDKs and CLIs > properly listed in the Software pages under SDKs[1], including > third-party ones in their own subtab, all driven from the > osf/openstack-map repository[2]. > > With that in mind, I think it would make sense to look into retiring > developer.openstack.org, i use https://developer.openstack.org/api-ref/compute/ almost daily so unless we host the api ref somwhere else and put redirect in place i would hope we can keep this inplace. if we move it under docs like the config stuff https://docs.openstack.org/nova/latest/configuration/config.html or somewhere else that is fine but i fine it very useful to be able to link the rendered api docs to people on irc that ask questions. i can obviosly point peple to github https://github.com/openstack/nova/blob/master/api-ref/source/servers.inc but unlike the configs the api ref is much less readable with out rendering it with sphinx > and move docs to docs.openstack.org. We could > also revive https://www.openstack.org/appdev/ and use it as the base > landing page to direct application-side people to the various pieces. 
> > [1] https://www.openstack.org/software/project-navigator/sdks > [2] https://opendev.org/osf/openstack-map/ > From km.giuseppesannino at gmail.com Mon Jul 15 14:29:47 2019 From: km.giuseppesannino at gmail.com (Giuseppe Sannino) Date: Mon, 15 Jul 2019 16:29:47 +0200 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: <01AA2BA1-3EEC-4142-BCDB-F9861707102E@redhat.com> References: <01AA2BA1-3EEC-4142-BCDB-F9861707102E@redhat.com> Message-ID: Hi! first of all, thanks for the fast replies. I do appreciate that. I did some more test trying to figure out the issue. - Set UseDNS to "no" in sshd_config => Issue persists - Installed and configured Telnet => Telnet login is slow as well >From the "top" or "auth.log"nothing specific popped up. I can sshd taking some cpu for a short while but nothing more than that. Once logged in the VM is not too slow. CLI doesn't get stuck or similar. One thing worthwhile to mention, it seems like the writing throughput on the disk is a bit slow: 67MB/s wrt around 318MB/s of another VM running on a "datacenter" Openstack installation. The Cinder Volume docker is running on the Compute Host and Cinder is using the filesystem as backend. BR /Giuseppe On Fri, 12 Jul 2019 at 17:41, Slawek Kaplonski wrote: > Hi, > > I suspect some problems with names resolving. Can You check if You have > also such delay when doing e.g. “sudo” commands after You ssh to the > instance? > > > On 12 Jul 2019, at 16:23, Brian Haley wrote: > > > > > > > > On 7/12/19 9:57 AM, Giuseppe Sannino wrote: > >> Hi community, > >> I need your help ,tips, advices. > >> *> Environment <* > >> I have deployed Openstack "Stein" using the latest kolla-ansible on the > following deployment topology: > >> 1) OS Controller running as VM on a "cloud" location > >> 2) OS Compute running on a baremetal server remotely (wrt OS > Controller) location > >> 3) Network node running on the Compute host > >> As per the above info, Controller and compute run on two different > networks. > >> Kolla-Ansible is not really designed for such scenario but after > manipulating the globals.yml and the inventory files (basically I had to > move node specific network settings from the globals to the inventory > file), eventually the deployment works fine. > >> *> Problem <* > >> I have no specific issue working with this deployment except the > following: > >> "SSH connection to the VM is quite slow". > >> It takes around 20 seconds for me to log into the VM (Ubuntu, CentOS, > whatever). > > > > But once logged-in things are OK? For example, an scp stalls the same > way, but the transfer is fast? > > > >> *> Observations <* > >> * Except for the slowness during the SSH login, I don't have any > >> further specific issue working with this envirorment > >> * With the Network on the Compute I can turn the OS controller off > >> with no impact on the VM. Still the connection is slow > >> * I tried different type of images (Ubuntu, CentOS, Windows) always > >> with the same result. > >> * SSH connection is slow even if I try to login into the VM within the > >> IP Namespace > >> From the ssh -vvv, I can see that the authentication gets stuck here: > >> debug1: Authentication succeeded (publickey). 
> >> Authenticated to ***** > >> debug1: channel 0: new [client-session] > >> debug3: ssh_session2_open: channel_new: 0 > >> debug2: channel 0: send open > >> debug3: send packet: type 90 > >> debug1: Requesting no-more-sessions at openssh.com no-more-sessions at openssh.com> > >> debug3: send packet: type 80 > >> debug1: Entering interactive session. > >> debug1: pledge: network > >> >>>>> 10 to 15 seconds later > > > > What is sshd doing at this time? Have you tried enabling debug or > running tcpdump when a new connection is attempted? At first glance I'd > say it's a DNS issue since it eventually succeeds, the logs would help to > point in a direction. > > > > -Brian > > > > > >> debug3: receive packet: type 80 > >> debug1: client_input_global_request: rtype hostkeys-00 at openssh.com > want_reply 0 > >> debug3: receive packet: type 91 > >> debug2: callback start > >> debug2: fd 3 setting TCP_NODELAY > >> debug3: ssh_packet_set_tos: set IP_TOS 0x10 > >> debug2: client_session2_setup: id 0 > >> debug2: channel 0: request pty-req confirm 1 > >> Have you ever experienced such issue ? > >> Any suggestion? > >> Many thanks > >> /Giuseppe > > — > Slawek Kaplonski > Senior software engineer > Red Hat > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tomas.bredar at gmail.com Mon Jul 15 14:33:50 2019 From: tomas.bredar at gmail.com (=?UTF-8?B?VG9tw6HFoSBCcmVkw6Fy?=) Date: Mon, 15 Jul 2019 16:33:50 +0200 Subject: [tripleo][cinder][netapp] In-Reply-To: References: Message-ID: Hi Alan, Yes, this is something I was able to achieve. Sorry I think I didn't express myself clearly. My question was how to correctly create multiple configuration files? po 15. 7. 2019 o 15:56 Alan Bishop napísal(a): > > On Mon, Jul 15, 2019 at 4:11 AM Tomáš Bredár > wrote: > >> Hi Alan! >> Thanks for the pointers. For now I'm going with a single backend with two >> NFS shares, so I'll use the tripleo templates for netapp. >> For the future, could you point me to the right template / puppet-cinder >> code which can create multiple nfs share files for me? Or should I create >> my own puppet manifest? >> > > Hi Tomas, > > The puppet-cinder code that renders the shares config file is [1], and the > data comes from puppet-tripleo [2]. This is puppet hiera data, and the > value is bound to the CindeNetappNfsShares tripleo parameter [3]. > > So you should be able to deploy a single backend that accesses multiple > shares by adding something like this to your tripleo deployment. > > parameter_defaults: > CinderNetappNfsShares: 'host_1:/path/to/share_1,host_2:/path/to/share_2' > > [1] > https://opendev.org/openstack/puppet-cinder/src/branch/stable/queens/manifests/backend/netapp.pp#L280 > [2] > https://opendev.org/openstack/puppet-tripleo/src/branch/stable/queens/manifests/profile/base/cinder/volume/netapp.pp#L38 > [3] > https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/queens/puppet/services/cinder-backend-netapp.yaml#L137 > > Alan > > Thanks again. >> >> Tomas >> >> pi 12. 7. 2019 o 16:11 Alan Bishop napísal(a): >> >>> >>> On Fri, Jul 12, 2019 at 6:09 AM Tomáš Bredár >>> wrote: >>> >>>> Hi Emilien! >>>> >>>> Thanks for your help. Yes with this I am able to define multiple >>>> stanzas in cinder.conf. However netapp driver needs a .conf file with the >>>> nfs shares listed in it. Defining multiple configuration files with nfs >>>> share details in each is not possible with the manual you've sent nor with >>>> the templates in my first email. 
>>>> >>> >>> Hi Tomas, >>> >>> When deploying a single backend, the tripleo template takes care of >>> generating the nfs shares file (actually, puppet-cinder generates the file, >>> but it's triggered by tripleo). But when you use the custom backend method >>> that Emilien pointed you to use, then you are responsible for supplying all >>> the pieces for the backend(s) to function correctly. This means you will >>> need to generate the nfs shares file on the host (controller), and then >>> bind mount the file using CinderVolumeOptVolumes so that the shares file on >>> the host is visible to the cinder-volume process running in a container. >>> >>> I'm wondering if it's possible to define a second backend by creating >>>> another service, for example "OS::TripleO::Services::CinderBackendNetApp2" ? >>>> >>> >>> Sorry, this won't work. TripleO will trying to deploy two completely >>> separate instances of the cinder-volume service, but the two deployments >>> will step all over each other. There has been a long standing goal of >>> enhancing tripleo so that it can deploy multiple instances of a cinder >>> backend, but it's a complex task that will require non-trivial changes to >>> tripleo. >>> >>> Alan >>> >>> Tomas >>>> >>>> št 11. 7. 2019 o 14:35 Emilien Macchi napísal(a): >>>> >>>>> On Thu, Jul 11, 2019 at 7:32 AM Tomáš Bredár >>>>> wrote: >>>>> >>>>>> Hi community, >>>>>> >>>>>> I'm trying to define multiple NetApp storage backends via Tripleo >>>>>> installer. >>>>>> According to [1] the puppet manifest supports multiple backends. >>>>>> The current templates [2] [3] support only single backend. >>>>>> Does anyone know how to define multiple netapp backends in the >>>>>> tripleo-heat environment files / templates? >>>>>> >>>>> >>>>> We don't support that via the templates that you linked, however if >>>>> you follow this manual you should be able to configure multiple NetApp >>>>> backends: >>>>> >>>>> https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/cinder_custom_backend.html >>>>> >>>>> Let us know how it worked! >>>>> -- >>>>> Emilien Macchi >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From smooney at redhat.com Mon Jul 15 16:18:57 2019 From: smooney at redhat.com (Sean Mooney) Date: Mon, 15 Jul 2019 17:18:57 +0100 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: References: <01AA2BA1-3EEC-4142-BCDB-F9861707102E@redhat.com> Message-ID: <9f46833c423d5e87a19e93ad77f7999e44cc5268.camel@redhat.com> On Mon, 2019-07-15 at 16:29 +0200, Giuseppe Sannino wrote: > Hi! > first of all, thanks for the fast replies. I do appreciate that. > > I did some more test trying to figure out the issue. > - Set UseDNS to "no" in sshd_config => Issue persists > - Installed and configured Telnet => Telnet login is slow as well > > From the "top" or "auth.log"nothing specific popped up. I can sshd taking > some cpu for a short while but nothing more than that. > > Once logged in the VM is not too slow. CLI doesn't get stuck or similar. > One thing worthwhile to mention, it seems like the writing throughput on > the disk is a bit slow: 67MB/s wrt around 318MB/s of another VM running on > a "datacenter" Openstack installation. unless you see iowait in the guest its likely not related to the disk speed. you might be able to improve the disk performace by changeing the chache mode but unless you are seeing io wait that is just an optimisation to try later. 
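for example (illustrative only) a quick way to check for io wait from inside the guest:

  vmstat 1 5     # watch the 'wa' column
  iostat -x 1 5  # from sysstat; watch %iowait and the per-device await

and if you later want to experiment with the cache mode, that is normally done on the
compute host via the nova.conf [libvirt] section, along the lines of:

  [libvirt]
  disk_cachemodes = file=writeback

treat writeback as an example value to test rather than a recommendation.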
when you are logged into the vm have you tried ssh again via localhost to determin if the long login time is related to the network or the vm. if its related to the network it will be fast over localhost if its related to the vm, e.g. because of disk, cpu load, memory load or ssh server configuration then the local ssh will be slow. > > The Cinder Volume docker is running on the Compute Host and Cinder is using > the filesystem as backend. > > BR > /Giuseppe > > > > > > > > > > > > > > > > > > > > On Fri, 12 Jul 2019 at 17:41, Slawek Kaplonski wrote: > > > Hi, > > > > I suspect some problems with names resolving. Can You check if You have > > also such delay when doing e.g. “sudo” commands after You ssh to the > > instance? > > > > > On 12 Jul 2019, at 16:23, Brian Haley wrote: > > > > > > > > > > > > On 7/12/19 9:57 AM, Giuseppe Sannino wrote: > > > > Hi community, > > > > I need your help ,tips, advices. > > > > *> Environment <* > > > > I have deployed Openstack "Stein" using the latest kolla-ansible on the > > > > following deployment topology: > > > > 1) OS Controller running as VM on a "cloud" location > > > > 2) OS Compute running on a baremetal server remotely (wrt OS > > > > Controller) location > > > > 3) Network node running on the Compute host > > > > As per the above info, Controller and compute run on two different > > > > networks. > > > > Kolla-Ansible is not really designed for such scenario but after > > > > manipulating the globals.yml and the inventory files (basically I had to > > move node specific network settings from the globals to the inventory > > file), eventually the deployment works fine. > > > > *> Problem <* > > > > I have no specific issue working with this deployment except the > > > > following: > > > > "SSH connection to the VM is quite slow". > > > > It takes around 20 seconds for me to log into the VM (Ubuntu, CentOS, > > > > whatever). > > > > > > But once logged-in things are OK? For example, an scp stalls the same > > > > way, but the transfer is fast? > > > > > > > *> Observations <* > > > > * Except for the slowness during the SSH login, I don't have any > > > > further specific issue working with this envirorment > > > > * With the Network on the Compute I can turn the OS controller off > > > > with no impact on the VM. Still the connection is slow > > > > * I tried different type of images (Ubuntu, CentOS, Windows) always > > > > with the same result. > > > > * SSH connection is slow even if I try to login into the VM within the > > > > IP Namespace > > > > From the ssh -vvv, I can see that the authentication gets stuck here: > > > > debug1: Authentication succeeded (publickey). > > > > Authenticated to ***** > > > > debug1: channel 0: new [client-session] > > > > debug3: ssh_session2_open: channel_new: 0 > > > > debug2: channel 0: send open > > > > debug3: send packet: type 90 > > > > debug1: Requesting no-more-sessions at openssh.com > > > no-more-sessions at openssh.com> > > > > debug3: send packet: type 80 > > > > debug1: Entering interactive session. > > > > debug1: pledge: network > > > > > > > > > 10 to 15 seconds later > > > > > > What is sshd doing at this time? Have you tried enabling debug or > > > > running tcpdump when a new connection is attempted? At first glance I'd > > say it's a DNS issue since it eventually succeeds, the logs would help to > > point in a direction. 
> > > > > > -Brian > > > > > > > > > > debug3: receive packet: type 80 > > > > debug1: client_input_global_request: rtype hostkeys-00 at openssh.com > > > > want_reply 0 > > > > debug3: receive packet: type 91 > > > > debug2: callback start > > > > debug2: fd 3 setting TCP_NODELAY > > > > debug3: ssh_packet_set_tos: set IP_TOS 0x10 > > > > debug2: client_session2_setup: id 0 > > > > debug2: channel 0: request pty-req confirm 1 > > > > Have you ever experienced such issue ? > > > > Any suggestion? > > > > Many thanks > > > > /Giuseppe > > > > — > > Slawek Kaplonski > > Senior software engineer > > Red Hat > > > > From fungi at yuggoth.org Mon Jul 15 16:29:36 2019 From: fungi at yuggoth.org (Jeremy Stanley) Date: Mon, 15 Jul 2019 16:29:36 +0000 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: References: <01AA2BA1-3EEC-4142-BCDB-F9861707102E@redhat.com> Message-ID: <20190715162936.qowsv7aq2j6bqriq@yuggoth.org> On 2019-07-15 16:29:47 +0200 (+0200), Giuseppe Sannino wrote: [...] > Once logged in the VM is not too slow. CLI doesn't get stuck or similar. > One thing worthwhile to mention, it seems like the writing throughput on > the disk is a bit slow: 67MB/s wrt around 318MB/s of another VM running on > a "datacenter" Openstack installation. [...] Have you checked dmesg in the guest instance to see if there is any I/O problem reported by the kernel? The login process will block on updating /var/log/wtmp or similar, so if writes to whatever backing store that lives on are delayed, that can explain the symptom. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From km.giuseppesannino at gmail.com Mon Jul 15 16:34:36 2019 From: km.giuseppesannino at gmail.com (Giuseppe Sannino) Date: Mon, 15 Jul 2019 18:34:36 +0200 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: <9f46833c423d5e87a19e93ad77f7999e44cc5268.camel@redhat.com> References: <01AA2BA1-3EEC-4142-BCDB-F9861707102E@redhat.com> <9f46833c423d5e87a19e93ad77f7999e44cc5268.camel@redhat.com> Message-ID: Hi Sean, the ssh to localhost is slow as well. "telnet localhost" is also slow. /Giuseppe On Mon, 15 Jul 2019 at 18:18, Sean Mooney wrote: > On Mon, 2019-07-15 at 16:29 +0200, Giuseppe Sannino wrote: > > Hi! > > first of all, thanks for the fast replies. I do appreciate that. > > > > I did some more test trying to figure out the issue. > > - Set UseDNS to "no" in sshd_config => Issue persists > > - Installed and configured Telnet => Telnet login is slow as well > > > > From the "top" or "auth.log"nothing specific popped up. I can sshd taking > > some cpu for a short while but nothing more than that. > > > > Once logged in the VM is not too slow. CLI doesn't get stuck or similar. > > One thing worthwhile to mention, it seems like the writing throughput on > > the disk is a bit slow: 67MB/s wrt around 318MB/s of another VM running > on > > a "datacenter" Openstack installation. > unless you see iowait in the guest its likely not related to the disk > speed. > you might be able to improve the disk performace by changeing the chache > mode > but unless you are seeing io wait that is just an optimisation to try > later. > > when you are logged into the vm have you tried ssh again via localhost to > determin if the long login time is related to the network or the vm. 
> > if its related to the network it will be fast over localhost > if its related to the vm, e.g. because of disk, cpu load, memory load or > ssh server configuration > then the local ssh will be slow. > > > > > The Cinder Volume docker is running on the Compute Host and Cinder is > using > > the filesystem as backend. > > > > BR > > /Giuseppe > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, 12 Jul 2019 at 17:41, Slawek Kaplonski > wrote: > > > > > Hi, > > > > > > I suspect some problems with names resolving. Can You check if You have > > > also such delay when doing e.g. “sudo” commands after You ssh to the > > > instance? > > > > > > > On 12 Jul 2019, at 16:23, Brian Haley wrote: > > > > > > > > > > > > > > > > On 7/12/19 9:57 AM, Giuseppe Sannino wrote: > > > > > Hi community, > > > > > I need your help ,tips, advices. > > > > > *> Environment <* > > > > > I have deployed Openstack "Stein" using the latest kolla-ansible > on the > > > > > > following deployment topology: > > > > > 1) OS Controller running as VM on a "cloud" location > > > > > 2) OS Compute running on a baremetal server remotely (wrt OS > > > > > > Controller) location > > > > > 3) Network node running on the Compute host > > > > > As per the above info, Controller and compute run on two different > > > > > > networks. > > > > > Kolla-Ansible is not really designed for such scenario but after > > > > > > manipulating the globals.yml and the inventory files (basically I had > to > > > move node specific network settings from the globals to the inventory > > > file), eventually the deployment works fine. > > > > > *> Problem <* > > > > > I have no specific issue working with this deployment except the > > > > > > following: > > > > > "SSH connection to the VM is quite slow". > > > > > It takes around 20 seconds for me to log into the VM (Ubuntu, > CentOS, > > > > > > whatever). > > > > > > > > But once logged-in things are OK? For example, an scp stalls the > same > > > > > > way, but the transfer is fast? > > > > > > > > > *> Observations <* > > > > > * Except for the slowness during the SSH login, I don't have any > > > > > further specific issue working with this envirorment > > > > > * With the Network on the Compute I can turn the OS controller off > > > > > with no impact on the VM. Still the connection is slow > > > > > * I tried different type of images (Ubuntu, CentOS, Windows) > always > > > > > with the same result. > > > > > * SSH connection is slow even if I try to login into the VM > within the > > > > > IP Namespace > > > > > From the ssh -vvv, I can see that the authentication gets stuck > here: > > > > > debug1: Authentication succeeded (publickey). > > > > > Authenticated to ***** > > > > > debug1: channel 0: new [client-session] > > > > > debug3: ssh_session2_open: channel_new: 0 > > > > > debug2: channel 0: send open > > > > > debug3: send packet: type 90 > > > > > debug1: Requesting no-more-sessions at openssh.com > > > > > no-more-sessions at openssh.com> > > > > > debug3: send packet: type 80 > > > > > debug1: Entering interactive session. > > > > > debug1: pledge: network > > > > > > > > > > 10 to 15 seconds later > > > > > > > > What is sshd doing at this time? Have you tried enabling debug or > > > > > > running tcpdump when a new connection is attempted? At first glance > I'd > > > say it's a DNS issue since it eventually succeeds, the logs would help > to > > > point in a direction. 
> > > > > > > > -Brian > > > > > > > > > > > > > debug3: receive packet: type 80 > > > > > debug1: client_input_global_request: rtype hostkeys-00 at openssh.com > > > > > > want_reply 0 > > > > > debug3: receive packet: type 91 > > > > > debug2: callback start > > > > > debug2: fd 3 setting TCP_NODELAY > > > > > debug3: ssh_packet_set_tos: set IP_TOS 0x10 > > > > > debug2: client_session2_setup: id 0 > > > > > debug2: channel 0: request pty-req confirm 1 > > > > > Have you ever experienced such issue ? > > > > > Any suggestion? > > > > > Many thanks > > > > > /Giuseppe > > > > > > — > > > Slawek Kaplonski > > > Senior software engineer > > > Red Hat > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From km.giuseppesannino at gmail.com Mon Jul 15 16:43:58 2019 From: km.giuseppesannino at gmail.com (Giuseppe Sannino) Date: Mon, 15 Jul 2019 18:43:58 +0200 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: <20190715162936.qowsv7aq2j6bqriq@yuggoth.org> References: <01AA2BA1-3EEC-4142-BCDB-F9861707102E@redhat.com> <20190715162936.qowsv7aq2j6bqriq@yuggoth.org> Message-ID: Ciao Jeremy, dmesg reports no error on the guest. syslog and auth.log look clean as well. /G On Mon, 15 Jul 2019 at 18:30, Jeremy Stanley wrote: > On 2019-07-15 16:29:47 +0200 (+0200), Giuseppe Sannino wrote: > [...] > > Once logged in the VM is not too slow. CLI doesn't get stuck or similar. > > One thing worthwhile to mention, it seems like the writing throughput on > > the disk is a bit slow: 67MB/s wrt around 318MB/s of another VM running > on > > a "datacenter" Openstack installation. > [...] > > Have you checked dmesg in the guest instance to see if there is any > I/O problem reported by the kernel? The login process will block on > updating /var/log/wtmp or similar, so if writes to whatever backing > store that lives on are delayed, that can explain the symptom. > -- > Jeremy Stanley > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aschultz at redhat.com Mon Jul 15 16:45:59 2019 From: aschultz at redhat.com (Alex Schultz) Date: Mon, 15 Jul 2019 10:45:59 -0600 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: References: <01AA2BA1-3EEC-4142-BCDB-F9861707102E@redhat.com> <9f46833c423d5e87a19e93ad77f7999e44cc5268.camel@redhat.com> Message-ID: On Mon, Jul 15, 2019 at 10:40 AM Giuseppe Sannino < km.giuseppesannino at gmail.com> wrote: > Hi Sean, > the ssh to localhost is slow as well. > "telnet localhost" is also slow. > > Are you having dns issues? Historically if you have UseDNS set to true and your dns servers are bad it can just be slow to connect as it tries to do the reverse lookup. > /Giuseppe > > On Mon, 15 Jul 2019 at 18:18, Sean Mooney wrote: > >> On Mon, 2019-07-15 at 16:29 +0200, Giuseppe Sannino wrote: >> > Hi! >> > first of all, thanks for the fast replies. I do appreciate that. >> > >> > I did some more test trying to figure out the issue. >> > - Set UseDNS to "no" in sshd_config => Issue persists >> > - Installed and configured Telnet => Telnet login is slow as well >> > >> > From the "top" or "auth.log"nothing specific popped up. I can sshd >> taking >> > some cpu for a short while but nothing more than that. >> > >> > Once logged in the VM is not too slow. CLI doesn't get stuck or similar. 
>> > One thing worthwhile to mention, it seems like the writing throughput on >> > the disk is a bit slow: 67MB/s wrt around 318MB/s of another VM >> running on >> > a "datacenter" Openstack installation. >> unless you see iowait in the guest its likely not related to the disk >> speed. >> you might be able to improve the disk performace by changeing the chache >> mode >> but unless you are seeing io wait that is just an optimisation to try >> later. >> >> when you are logged into the vm have you tried ssh again via localhost to >> determin if the long login time is related to the network or the vm. >> >> if its related to the network it will be fast over localhost >> if its related to the vm, e.g. because of disk, cpu load, memory load or >> ssh server configuration >> then the local ssh will be slow. >> >> > >> > The Cinder Volume docker is running on the Compute Host and Cinder is >> using >> > the filesystem as backend. >> > >> > BR >> > /Giuseppe >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> > On Fri, 12 Jul 2019 at 17:41, Slawek Kaplonski >> wrote: >> > >> > > Hi, >> > > >> > > I suspect some problems with names resolving. Can You check if You >> have >> > > also such delay when doing e.g. “sudo” commands after You ssh to the >> > > instance? >> > > >> > > > On 12 Jul 2019, at 16:23, Brian Haley wrote: >> > > > >> > > > >> > > > >> > > > On 7/12/19 9:57 AM, Giuseppe Sannino wrote: >> > > > > Hi community, >> > > > > I need your help ,tips, advices. >> > > > > *> Environment <* >> > > > > I have deployed Openstack "Stein" using the latest kolla-ansible >> on the >> > > >> > > following deployment topology: >> > > > > 1) OS Controller running as VM on a "cloud" location >> > > > > 2) OS Compute running on a baremetal server remotely (wrt OS >> > > >> > > Controller) location >> > > > > 3) Network node running on the Compute host >> > > > > As per the above info, Controller and compute run on two different >> > > >> > > networks. >> > > > > Kolla-Ansible is not really designed for such scenario but after >> > > >> > > manipulating the globals.yml and the inventory files (basically I had >> to >> > > move node specific network settings from the globals to the inventory >> > > file), eventually the deployment works fine. >> > > > > *> Problem <* >> > > > > I have no specific issue working with this deployment except the >> > > >> > > following: >> > > > > "SSH connection to the VM is quite slow". >> > > > > It takes around 20 seconds for me to log into the VM (Ubuntu, >> CentOS, >> > > >> > > whatever). >> > > > >> > > > But once logged-in things are OK? For example, an scp stalls the >> same >> > > >> > > way, but the transfer is fast? >> > > > >> > > > > *> Observations <* >> > > > > * Except for the slowness during the SSH login, I don't have any >> > > > > further specific issue working with this envirorment >> > > > > * With the Network on the Compute I can turn the OS controller >> off >> > > > > with no impact on the VM. Still the connection is slow >> > > > > * I tried different type of images (Ubuntu, CentOS, Windows) >> always >> > > > > with the same result. >> > > > > * SSH connection is slow even if I try to login into the VM >> within the >> > > > > IP Namespace >> > > > > From the ssh -vvv, I can see that the authentication gets stuck >> here: >> > > > > debug1: Authentication succeeded (publickey). 
>> > > > > Authenticated to ***** >> > > > > debug1: channel 0: new [client-session] >> > > > > debug3: ssh_session2_open: channel_new: 0 >> > > > > debug2: channel 0: send open >> > > > > debug3: send packet: type 90 >> > > > > debug1: Requesting no-more-sessions at openssh.com > > > >> > > no-more-sessions at openssh.com> >> > > > > debug3: send packet: type 80 >> > > > > debug1: Entering interactive session. >> > > > > debug1: pledge: network >> > > > > > > > > > 10 to 15 seconds later >> > > > >> > > > What is sshd doing at this time? Have you tried enabling debug or >> > > >> > > running tcpdump when a new connection is attempted? At first glance >> I'd >> > > say it's a DNS issue since it eventually succeeds, the logs would >> help to >> > > point in a direction. >> > > > >> > > > -Brian >> > > > >> > > > >> > > > > debug3: receive packet: type 80 >> > > > > debug1: client_input_global_request: rtype >> hostkeys-00 at openssh.com >> > > >> > > want_reply 0 >> > > > > debug3: receive packet: type 91 >> > > > > debug2: callback start >> > > > > debug2: fd 3 setting TCP_NODELAY >> > > > > debug3: ssh_packet_set_tos: set IP_TOS 0x10 >> > > > > debug2: client_session2_setup: id 0 >> > > > > debug2: channel 0: request pty-req confirm 1 >> > > > > Have you ever experienced such issue ? >> > > > > Any suggestion? >> > > > > Many thanks >> > > > > /Giuseppe >> > > >> > > — >> > > Slawek Kaplonski >> > > Senior software engineer >> > > Red Hat >> > > >> > > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From km.giuseppesannino at gmail.com Mon Jul 15 17:05:13 2019 From: km.giuseppesannino at gmail.com (Giuseppe Sannino) Date: Mon, 15 Jul 2019 19:05:13 +0200 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: References: <01AA2BA1-3EEC-4142-BCDB-F9861707102E@redhat.com> <9f46833c423d5e87a19e93ad77f7999e44cc5268.camel@redhat.com> Message-ID: Hi Alex, yeah, it was the first suspect also based on the various research on internet. I currently have the "useDNS" set to no but still the issue persists. /G On Mon, 15 Jul 2019 at 18:46, Alex Schultz wrote: > > On Mon, Jul 15, 2019 at 10:40 AM Giuseppe Sannino < > km.giuseppesannino at gmail.com> wrote: > >> Hi Sean, >> the ssh to localhost is slow as well. >> "telnet localhost" is also slow. >> >> > Are you having dns issues? Historically if you have UseDNS set to true and > your dns servers are bad it can just be slow to connect as it tries to do > the reverse lookup. > > >> /Giuseppe >> >> On Mon, 15 Jul 2019 at 18:18, Sean Mooney wrote: >> >>> On Mon, 2019-07-15 at 16:29 +0200, Giuseppe Sannino wrote: >>> > Hi! >>> > first of all, thanks for the fast replies. I do appreciate that. >>> > >>> > I did some more test trying to figure out the issue. >>> > - Set UseDNS to "no" in sshd_config => Issue persists >>> > - Installed and configured Telnet => Telnet login is slow as well >>> > >>> > From the "top" or "auth.log"nothing specific popped up. I can sshd >>> taking >>> > some cpu for a short while but nothing more than that. >>> > >>> > Once logged in the VM is not too slow. CLI doesn't get stuck or >>> similar. >>> > One thing worthwhile to mention, it seems like the writing throughput >>> on >>> > the disk is a bit slow: 67MB/s wrt around 318MB/s of another VM >>> running on >>> > a "datacenter" Openstack installation. >>> unless you see iowait in the guest its likely not related to the disk >>> speed. 
>>> you might be able to improve the disk performace by changeing the chache >>> mode >>> but unless you are seeing io wait that is just an optimisation to try >>> later. >>> >>> when you are logged into the vm have you tried ssh again via localhost to >>> determin if the long login time is related to the network or the vm. >>> >>> if its related to the network it will be fast over localhost >>> if its related to the vm, e.g. because of disk, cpu load, memory load or >>> ssh server configuration >>> then the local ssh will be slow. >>> >>> > >>> > The Cinder Volume docker is running on the Compute Host and Cinder is >>> using >>> > the filesystem as backend. >>> > >>> > BR >>> > /Giuseppe >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > On Fri, 12 Jul 2019 at 17:41, Slawek Kaplonski >>> wrote: >>> > >>> > > Hi, >>> > > >>> > > I suspect some problems with names resolving. Can You check if You >>> have >>> > > also such delay when doing e.g. “sudo” commands after You ssh to the >>> > > instance? >>> > > >>> > > > On 12 Jul 2019, at 16:23, Brian Haley >>> wrote: >>> > > > >>> > > > >>> > > > >>> > > > On 7/12/19 9:57 AM, Giuseppe Sannino wrote: >>> > > > > Hi community, >>> > > > > I need your help ,tips, advices. >>> > > > > *> Environment <* >>> > > > > I have deployed Openstack "Stein" using the latest kolla-ansible >>> on the >>> > > >>> > > following deployment topology: >>> > > > > 1) OS Controller running as VM on a "cloud" location >>> > > > > 2) OS Compute running on a baremetal server remotely (wrt OS >>> > > >>> > > Controller) location >>> > > > > 3) Network node running on the Compute host >>> > > > > As per the above info, Controller and compute run on two >>> different >>> > > >>> > > networks. >>> > > > > Kolla-Ansible is not really designed for such scenario but after >>> > > >>> > > manipulating the globals.yml and the inventory files (basically I >>> had to >>> > > move node specific network settings from the globals to the inventory >>> > > file), eventually the deployment works fine. >>> > > > > *> Problem <* >>> > > > > I have no specific issue working with this deployment except the >>> > > >>> > > following: >>> > > > > "SSH connection to the VM is quite slow". >>> > > > > It takes around 20 seconds for me to log into the VM (Ubuntu, >>> CentOS, >>> > > >>> > > whatever). >>> > > > >>> > > > But once logged-in things are OK? For example, an scp stalls the >>> same >>> > > >>> > > way, but the transfer is fast? >>> > > > >>> > > > > *> Observations <* >>> > > > > * Except for the slowness during the SSH login, I don't have any >>> > > > > further specific issue working with this envirorment >>> > > > > * With the Network on the Compute I can turn the OS controller >>> off >>> > > > > with no impact on the VM. Still the connection is slow >>> > > > > * I tried different type of images (Ubuntu, CentOS, Windows) >>> always >>> > > > > with the same result. >>> > > > > * SSH connection is slow even if I try to login into the VM >>> within the >>> > > > > IP Namespace >>> > > > > From the ssh -vvv, I can see that the authentication gets stuck >>> here: >>> > > > > debug1: Authentication succeeded (publickey). 
>>> > > > > Authenticated to ***** >>> > > > > debug1: channel 0: new [client-session] >>> > > > > debug3: ssh_session2_open: channel_new: 0 >>> > > > > debug2: channel 0: send open >>> > > > > debug3: send packet: type 90 >>> > > > > debug1: Requesting no-more-sessions at openssh.com >> > > >>> > > no-more-sessions at openssh.com> >>> > > > > debug3: send packet: type 80 >>> > > > > debug1: Entering interactive session. >>> > > > > debug1: pledge: network >>> > > > > > > > > > 10 to 15 seconds later >>> > > > >>> > > > What is sshd doing at this time? Have you tried enabling debug or >>> > > >>> > > running tcpdump when a new connection is attempted? At first glance >>> I'd >>> > > say it's a DNS issue since it eventually succeeds, the logs would >>> help to >>> > > point in a direction. >>> > > > >>> > > > -Brian >>> > > > >>> > > > >>> > > > > debug3: receive packet: type 80 >>> > > > > debug1: client_input_global_request: rtype >>> hostkeys-00 at openssh.com >>> > > >>> > > want_reply 0 >>> > > > > debug3: receive packet: type 91 >>> > > > > debug2: callback start >>> > > > > debug2: fd 3 setting TCP_NODELAY >>> > > > > debug3: ssh_packet_set_tos: set IP_TOS 0x10 >>> > > > > debug2: client_session2_setup: id 0 >>> > > > > debug2: channel 0: request pty-req confirm 1 >>> > > > > Have you ever experienced such issue ? >>> > > > > Any suggestion? >>> > > > > Many thanks >>> > > > > /Giuseppe >>> > > >>> > > — >>> > > Slawek Kaplonski >>> > > Senior software engineer >>> > > Red Hat >>> > > >>> > > >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From radoslaw.piliszek at gmail.com Mon Jul 15 17:26:52 2019 From: radoslaw.piliszek at gmail.com (=?UTF-8?Q?Rados=C5=82aw_Piliszek?=) Date: Mon, 15 Jul 2019 19:26:52 +0200 Subject: [kolla][nova][neutron] Access to VMs is slow when running on a remote compute host In-Reply-To: References: <01AA2BA1-3EEC-4142-BCDB-F9861707102E@redhat.com> <9f46833c423d5e87a19e93ad77f7999e44cc5268.camel@redhat.com> Message-ID: For completeness - always also ensure that 'GSSAPIAuthentication' is set to 'no' because in default config it might require DNS lookups too. (Obviously you can run GSSAPIAuthentication and avoid DNS lookups by configuring GSSAPI appropriately ;-) ). Kind regards, Radek pon., 15 lip 2019 o 19:14 Giuseppe Sannino napisał(a): > Hi Alex, > yeah, it was the first suspect also based on the various research on > internet. > I currently have the "useDNS" set to no but still the issue persists. > > /G > > On Mon, 15 Jul 2019 at 18:46, Alex Schultz wrote: > >> >> On Mon, Jul 15, 2019 at 10:40 AM Giuseppe Sannino < >> km.giuseppesannino at gmail.com> wrote: >> >>> Hi Sean, >>> the ssh to localhost is slow as well. >>> "telnet localhost" is also slow. >>> >>> >> Are you having dns issues? Historically if you have UseDNS set to true >> and your dns servers are bad it can just be slow to connect as it tries to >> do the reverse lookup. >> >> >>> /Giuseppe >>> >>> On Mon, 15 Jul 2019 at 18:18, Sean Mooney wrote: >>> >>>> On Mon, 2019-07-15 at 16:29 +0200, Giuseppe Sannino wrote: >>>> > Hi! >>>> > first of all, thanks for the fast replies. I do appreciate that. >>>> > >>>> > I did some more test trying to figure out the issue. >>>> > - Set UseDNS to "no" in sshd_config => Issue persists >>>> > - Installed and configured Telnet => Telnet login is slow as well >>>> > >>>> > From the "top" or "auth.log"nothing specific popped up. I can sshd >>>> taking >>>> > some cpu for a short while but nothing more than that. 
>>>> > >>>> > Once logged in the VM is not too slow. CLI doesn't get stuck or >>>> similar. >>>> > One thing worthwhile to mention, it seems like the writing throughput >>>> on >>>> > the disk is a bit slow: 67MB/s wrt around 318MB/s of another VM >>>> running on >>>> > a "datacenter" Openstack installation. >>>> unless you see iowait in the guest its likely not related to the disk >>>> speed. >>>> you might be able to improve the disk performace by changeing the >>>> chache mode >>>> but unless you are seeing io wait that is just an optimisation to try >>>> later. >>>> >>>> when you are logged into the vm have you tried ssh again via localhost >>>> to >>>> determin if the long login time is related to the network or the vm. >>>> >>>> if its related to the network it will be fast over localhost >>>> if its related to the vm, e.g. because of disk, cpu load, memory load >>>> or ssh server configuration >>>> then the local ssh will be slow. >>>> >>>> > >>>> > The Cinder Volume docker is running on the Compute Host and Cinder is >>>> using >>>> > the filesystem as backend. >>>> > >>>> > BR >>>> > /Giuseppe >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > On Fri, 12 Jul 2019 at 17:41, Slawek Kaplonski >>>> wrote: >>>> > >>>> > > Hi, >>>> > > >>>> > > I suspect some problems with names resolving. Can You check if You >>>> have >>>> > > also such delay when doing e.g. “sudo” commands after You ssh to the >>>> > > instance? >>>> > > >>>> > > > On 12 Jul 2019, at 16:23, Brian Haley >>>> wrote: >>>> > > > >>>> > > > >>>> > > > >>>> > > > On 7/12/19 9:57 AM, Giuseppe Sannino wrote: >>>> > > > > Hi community, >>>> > > > > I need your help ,tips, advices. >>>> > > > > *> Environment <* >>>> > > > > I have deployed Openstack "Stein" using the latest >>>> kolla-ansible on the >>>> > > >>>> > > following deployment topology: >>>> > > > > 1) OS Controller running as VM on a "cloud" location >>>> > > > > 2) OS Compute running on a baremetal server remotely (wrt OS >>>> > > >>>> > > Controller) location >>>> > > > > 3) Network node running on the Compute host >>>> > > > > As per the above info, Controller and compute run on two >>>> different >>>> > > >>>> > > networks. >>>> > > > > Kolla-Ansible is not really designed for such scenario but after >>>> > > >>>> > > manipulating the globals.yml and the inventory files (basically I >>>> had to >>>> > > move node specific network settings from the globals to the >>>> inventory >>>> > > file), eventually the deployment works fine. >>>> > > > > *> Problem <* >>>> > > > > I have no specific issue working with this deployment except the >>>> > > >>>> > > following: >>>> > > > > "SSH connection to the VM is quite slow". >>>> > > > > It takes around 20 seconds for me to log into the VM (Ubuntu, >>>> CentOS, >>>> > > >>>> > > whatever). >>>> > > > >>>> > > > But once logged-in things are OK? For example, an scp stalls the >>>> same >>>> > > >>>> > > way, but the transfer is fast? >>>> > > > >>>> > > > > *> Observations <* >>>> > > > > * Except for the slowness during the SSH login, I don't have >>>> any >>>> > > > > further specific issue working with this envirorment >>>> > > > > * With the Network on the Compute I can turn the OS controller >>>> off >>>> > > > > with no impact on the VM. Still the connection is slow >>>> > > > > * I tried different type of images (Ubuntu, CentOS, Windows) >>>> always >>>> > > > > with the same result. 
>>>> > > > > * SSH connection is slow even if I try to login into the VM >>>> within the >>>> > > > > IP Namespace >>>> > > > > From the ssh -vvv, I can see that the authentication gets stuck >>>> here: >>>> > > > > debug1: Authentication succeeded (publickey). >>>> > > > > Authenticated to ***** >>>> > > > > debug1: channel 0: new [client-session] >>>> > > > > debug3: ssh_session2_open: channel_new: 0 >>>> > > > > debug2: channel 0: send open >>>> > > > > debug3: send packet: type 90 >>>> > > > > debug1: Requesting no-more-sessions at openssh.com >>> > > >>>> > > no-more-sessions at openssh.com> >>>> > > > > debug3: send packet: type 80 >>>> > > > > debug1: Entering interactive session. >>>> > > > > debug1: pledge: network >>>> > > > > > > > > > 10 to 15 seconds later >>>> > > > >>>> > > > What is sshd doing at this time? Have you tried enabling debug or >>>> > > >>>> > > running tcpdump when a new connection is attempted? At first >>>> glance I'd >>>> > > say it's a DNS issue since it eventually succeeds, the logs would >>>> help to >>>> > > point in a direction. >>>> > > > >>>> > > > -Brian >>>> > > > >>>> > > > >>>> > > > > debug3: receive packet: type 80 >>>> > > > > debug1: client_input_global_request: rtype >>>> hostkeys-00 at openssh.com >>>> > > >>>> > > want_reply 0 >>>> > > > > debug3: receive packet: type 91 >>>> > > > > debug2: callback start >>>> > > > > debug2: fd 3 setting TCP_NODELAY >>>> > > > > debug3: ssh_packet_set_tos: set IP_TOS 0x10 >>>> > > > > debug2: client_session2_setup: id 0 >>>> > > > > debug2: channel 0: request pty-req confirm 1 >>>> > > > > Have you ever experienced such issue ? >>>> > > > > Any suggestion? >>>> > > > > Many thanks >>>> > > > > /Giuseppe >>>> > > >>>> > > — >>>> > > Slawek Kaplonski >>>> > > Senior software engineer >>>> > > Red Hat >>>> > > >>>> > > >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Mon Jul 15 17:52:04 2019 From: amy at demarco.com (Amy Marrich) Date: Mon, 15 Jul 2019 12:52:04 -0500 Subject: [Diversity] Grace Hopper Lead Co-Mentor(s) needed Message-ID: OpenStack has participated in Open Source Day at the Grace Hopper Conference for the last 5 years and this year it is in Orlando on October 3rd. Due to unforeseen circumstances we are one lead mentor short to participate this year and are looking for someone willing to help lead the session. The Anita Borg Institute will provide a ticket for the entire conference, the downside is you or your company will be responsible for your travel, lodging and food. This year CityNetwork is supplying the VM infrastructure for us to use. The session submitted is as follows: Participants will sign up for the necessary accounts needed to contribute to OpenStack and install Git and Gerrit. They will configure their systems and learn how to file bugs and review code within the Open Dev infrastructure. Due to time and resource requirements the groups will be given a virtual machine running Devstack in which to create their own theme for the Horizon Dashboard. The group will decide on which humanitarian effort they wish to support but suggestions will be provided for those groups that can not decide. The group will also work as a team to locate the necessary documentation on the OpenStack site if they have not done so in preparation. The selected Project Manager will lead these discussions as well as making sure efforts are staying on track to meet the deadline of the demonstrations. 
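(For anyone new to Horizon theming, the kind of change each group would be making is roughly the following; the theme name and paths are only illustrative and the Horizon theming documentation is the authoritative reference:

    /opt/stack/horizon/openstack_dashboard/themes/<your_theme>/
        static/_variables.scss    # override SCSS variables such as the brand colours
        static/_styles.scss       # any additional style rules

plus listing the new theme in AVAILABLE_THEMES in local_settings.py and restarting the web server so the dashboard picks it up.)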
The Developers in the group will work to locate the necessary files and directories on the VM in order to complete their task. They will be in charge of creating/changing the necessary files to create a new theme utilizing SCSS and restarting any services needed to implement. The Graphic Designer if present will create any necessary graphics for the theme and will either scp them themselves to the virtual machine or provide to the developer. If there is no designated graphic designer the PM and devs will divide this duty up. After the themes are designed relevant files will be scped back down to the participants machines and they will be able to commit their code for review in a sandbox environment for other groups to review. If you are interested in participating please let me know by July 20th as we need to send additional information to the OSD folks on July 21st. Any questions please reach out on the lists, IRC, or privately! Thanks, Amy (spotz) -------------- next part -------------- An HTML attachment was scrubbed... URL: From dsneddon at redhat.com Mon Jul 15 18:25:34 2019 From: dsneddon at redhat.com (Dan Sneddon) Date: Mon, 15 Jul 2019 11:25:34 -0700 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> Message-ID: On Mon, Jul 15, 2019 at 2:13 AM Harald Jensås wrote: > On Sat, 2019-07-13 at 16:19 -0400, James Slagle wrote: > > On Fri, Jul 12, 2019 at 3:59 PM Harald Jensås > > wrote: > > > I've said this before, but I think we should turn this nova-less > > > around. Now with nova-less we create a bunch of servers, and write > > > up > > > the parameters file to use the deployed-server approach. > > > Effectively we > > > still neet to have the resource group in heat making a server > > > resource > > > for every server. Creating the fake server resource is fast, > > > because > > > Heat does'nt call Nova,Ironic to create any resources. But the > > > stack is > > > equally big, with a stack for every node. i.e not N=1. > > > > > > What you are doing here, is essentially to say we don't create a > > > resource group that then creates N number of role stacks, one for > > > each > > > overcloud node. You are creating a single generic "server" > > > definition > > > per Role. So we drop the resource group and create > > > OS::Triple::{{Role}}.Server 1-time (once). To me it's backwards to > > > push > > > a large struct with properties for N=many nodes into the creation > > > of > > > that stack. > > > > I'm not entirely following what you're saying is backwards. What I've > > proposed is that we *don't* have any node specific data in the stack. > > It sounds like you're saying the way we do it today is backwards. > > > > What I mean to say is that I think the way we are integrating nova-less > by first deploying the servers, to then provide the data to Heat to > create the resource groups as we do today becomes backwards when your > work on N=1 is introduced. > > > > It's correct that what's been proposed with metalsmith currently > > still > > requires the full ResourceGroup with a member for each node. With the > > template changes I'm proposing, that wouldn't be required, so we > > could > > actually do the Heat stack first, then metalsmith. > > > > Yes, this is what I think we should do. Especially if your changes here > removes the resource group entirely. 
It makes more sense to create the > stack, and once that is created we can do deployment, scaling etc > without updating the stack again. > > > > Currently the puppet/role-role.yaml creates all the network ports > > > etc. > > > As you only want to create it once, it instead could simply output > > > the > > > UUID of the networks+subnets. These are identical for all servers > > > in > > > the role. So we end up with a small heat stack. > > > > > > Once the stack is created we could use that generic "server" role > > > data > > > to feed into something (ansible?, python?, mistral?) that calls > > > metalsmith to build the servers, then create ports for each server > > > in > > > neutron, one port for each network+subnet defined in the role. Then > > > feed that output into the json (hieradata) that is pushed to each > > > node > > > and used during service configuration, all the things we need to > > > configure network interfaces, /etc/hosts and so on. We need a way > > > to > > > keep track of which ports belong to wich node, but I guess > > > something > > > simple like using the node's ironic UUID in either the name, > > > description or tag field of the neutron port will work. There is > > > also > > > the extra filed in Ironic which is json type, so we could place a > > > map > > > of network->port_uuid in there as well. > > > > It won't matter whether we do baremetal provisioning before or after > > the Heat stack. Heat won't care, as it won't have any expectation to > > create any servers or that they are already created. We can define > > where we end up calling the metalsmith piece as it should be > > independent of the Heat stack if we make these template changes. > > > > This is true. But, in your previous mail in this thread you wrote: > > """ > Other points: > > - Baremetal provisioning and port creation are presently handled by > Heat. With the ongoing efforts to migrate baremetal provisioning out > of Heat (nova-less deploy), I think these efforts are very > complimentary. Eventually, we get to a point where Heat is not > actually creating any other OpenStack API resources. For now, the > patches only work when using pre-provisioned nodes. > """ > > IMO "baremetal provision and port creation" fit together. (I read the > above statement so as well.) Currently nova-less creates the ctlplane > port and provision the baremetal node. If we want to do both baremetal > provisioning and port creation togheter (I think this makes sense), we > have to do it after the stack has created the networks. > > What I envision is to have one method that creates all the ports, > ctlplane + composable networks in a unified way. Today these are > created differently, the ctlplane port is part of the server resource > (or metalsmith in nova-less case) and the other ports are created by > heat. > This is my main question about this proposal. When TripleO was in its infancy, there wasn't a mechanism to create Neutron ports separately from the server, so we created a Nova Server resource that specified which network the port was on (originally there was only one port created, now we create additional ports in Neutron). This can be seen in the puppet/-role.yaml file, for example: resources: Controller: type: OS::TripleO::ControllerServer deletion_policy: {get_param: ServerDeletionPolicy} metadata: os-collect-config: command: {get_param: ConfigCommand} splay: {get_param: ConfigCollectSplay} properties: [...] 
networks: - if: - ctlplane_fixed_ip_set - network: ctlplane subnet: {get_param: ControllerControlPlaneSubnet} fixed_ip: yaql: expression: $.data.where(not isEmpty($)).first() data: - get_param: [ControllerIPs, 'ctlplane', {get_param: NodeIndex}] - network: ctlplane subnet: {get_param: ControllerControlPlaneSubnet} This has the side-effect that the ports are created by Nova calling Neutron rather than by Heat calling Neutron for port creation. We have maintained this mechanism even in the latest versions of THT for backwards compatibility. This would all be easier if we were creating the Neutron ctlplane port and then assigning it to the server, but that breaks backwards-compatibility. How would the creation of the ctlplane port be handled in this proposal? If metalsmith is creating the ctlplane port, do we still need a separate Server resource for every node? If so, I imagine it would have a much smaller stack than what we currently create for each server. If not, would metalsmith create a port on the ctlplane as part of the provisioning steps, and then pass this port back? We still need to be able to support fixed IPs for ctlplane ports, so we need to be able to pass a specific IP to metalsmith. > > > I think the creation of the actual Networks and Subnets can be left > > > in > > > heat, it's typically 5-6 networks and 5-6 subnets so it's not a lot > > > of > > > resources. Even in a large DCN deployment having 50-100 subnets per > > > network or even 50-100 networks I think this is'nt a problem. > > > > Agreed, I'm not specifically proposing we move those pieces at this > > time. > > > > +1 > > > > > -- Dan Sneddon | Senior Principal Software Engineer dsneddon at redhat.com | redhat.com/cloud dsneddon:irc | @dxs:twitter -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Mon Jul 15 18:37:05 2019 From: doug at doughellmann.com (Doug Hellmann) Date: Mon, 15 Jul 2019 14:37:05 -0400 Subject: [dev][release][qa] patrole stable/stein is created by mistake ? In-Reply-To: <0a85c82c-f312-a8a9-34af-f5b812c8f2b3@openstack.org> References: <16beb436c21.c9d98154147017.455401003873410397@ghanshyammann.com> <16beb463457.1193bd7a4147042.5135658897011450644@ghanshyammann.com> <20190713203828.GA29711@sm-workstation> <16beddf697c.fd5e3e64150304.4626321901014847129@ghanshyammann.com> <0a85c82c-f312-a8a9-34af-f5b812c8f2b3@openstack.org> Message-ID: <717CD206-8A3A-438F-A52B-22396CFA9BAD@doughellmann.com> > On Jul 15, 2019, at 6:30 AM, Thierry Carrez wrote: > > Ghanshyam Mann wrote: >> ---- On Sun, 14 Jul 2019 05:38:28 +0900 Sean McGinnis wrote ---- >> > On Sat, Jul 13, 2019 at 09:19:35PM +0900, Ghanshyam Mann wrote: >> > > ---- On Sat, 13 Jul 2019 21:16:32 +0900 Ghanshyam Mann wrote ---- >> > > > Hi Release team, >> > > > >> > > > Today I noticed while doing patrole review that stable/stain has been created for patrole which is wrong. >> > > > Patrole is branchless[1] and I remember I have not requested the stable branch while releasing the patrole. >> > > > >> > > > Is it created by mistakenly ? or intentional? >> > > >> > > I found the patch which created this - https://review.opendev.org/#/c/650173/1. >> > > >> > > Can we revert that but that patch has more changes? >> > > >> > > -gmann >> > > >> > >> > We can't really just revert it. If you would like to change this, please update >> > the patrole release type to tempest-plugin or move it to be an independent >> > release deliverable if it is not actually cycle based. 
>> > >> > You can also remove the branching information with that change, then after that >> > is gone and there isn't a risk of recreating it, the infra team may be able to >> > assist in deleting the existing branch. >> Release model for patrole is 'cycle-with-intermediary' which is right, we do not need >> to change that. We just need to remove the stable/stein and branch information which >> was added for the only stein. I will push the change. >> Can we have a tag which clearly says the branchless and no-stable branch nature for deliverables? >> 'tempest-plugins' was introduced for a different purpose. new tag can be applicable for other >> deliverables also (current or in future). That can help to avoid these errrors. > > Creating a new release model or a new deliverable type sounds a bit overkill. > > I think the simpler would be to add a new value to the existing "stable-branch-type" key. Like "stable-branch-type: none" and then set that value for all tempest plugins, tempest itself and patrole. > > See https://review.opendev.org/670808 as a strawman. > > -- > Thierry Carrez (ttx) If the project isn’t creating stable branches each cycle, how is it cycle-based? What’s wrong with moving it to use the independent release model, which was created for cases like this? Doug -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Mon Jul 15 21:46:13 2019 From: mnaser at vexxhost.com (Mohammed Naser) Date: Mon, 15 Jul 2019 17:46:13 -0400 Subject: [tc] Weekly update Message-ID: Hi everyone, This is the weekly update for what happened inside the OpenStack TC, you can get more information by checking for changes in the openstack/governance repository. # Retired projects - release-schedule-generator (from releases) - tripleo-ansible-roles (from tripleo) - docs-specs (from docs) # New projects - octavia-diskimage-retrofit (under charms) - kayobe (under kolla) # General changes - The "U" release will be based in the China region as per the upcoming summit. - The documentation team is starting a transition to a SIG - Few changes to the PDF goal proposal patches alongside storyboard links - Typo fixes (we can't spell liaison apparently!) - Jeremy Stanley has volunteered to become TC liaison for the image encryption popup team - Each project has been assigned 2 TC liaisons to provide assistance within our community. Thanks for tuning in :) Regards, Mohammed From sbaker at redhat.com Mon Jul 15 22:26:10 2019 From: sbaker at redhat.com (Steve Baker) Date: Tue, 16 Jul 2019 10:26:10 +1200 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> Message-ID: <6bdebb51-0b3c-2888-a691-720bd5ac039a@redhat.com> On 15/07/19 9:12 PM, Harald Jensås wrote: > On Sat, 2019-07-13 at 16:19 -0400, James Slagle wrote: >> On Fri, Jul 12, 2019 at 3:59 PM Harald Jensås >> wrote: >>> I've said this before, but I think we should turn this nova-less >>> around. Now with nova-less we create a bunch of servers, and write >>> up >>> the parameters file to use the deployed-server approach. >>> Effectively we >>> still neet to have the resource group in heat making a server >>> resource >>> for every server. Creating the fake server resource is fast, >>> because >>> Heat does'nt call Nova,Ironic to create any resources. But the >>> stack is >>> equally big, with a stack for every node. i.e not N=1. 
>>> >>> What you are doing here, is essentially to say we don't create a >>> resource group that then creates N number of role stacks, one for >>> each >>> overcloud node. You are creating a single generic "server" >>> definition >>> per Role. So we drop the resource group and create >>> OS::Triple::{{Role}}.Server 1-time (once). To me it's backwards to >>> push >>> a large struct with properties for N=many nodes into the creation >>> of >>> that stack. >> I'm not entirely following what you're saying is backwards. What I've >> proposed is that we *don't* have any node specific data in the stack. >> It sounds like you're saying the way we do it today is backwards. >> > What I mean to say is that I think the way we are integrating nova-less > by first deploying the servers, to then provide the data to Heat to > create the resource groups as we do today becomes backwards when your > work on N=1 is introduced. > > >> It's correct that what's been proposed with metalsmith currently >> still >> requires the full ResourceGroup with a member for each node. With the >> template changes I'm proposing, that wouldn't be required, so we >> could >> actually do the Heat stack first, then metalsmith. >> > Yes, this is what I think we should do. Especially if your changes here > removes the resource group entirely. It makes more sense to create the > stack, and once that is created we can do deployment, scaling etc > without updating the stack again. I think this is something we can move towards after James has finished this work. It would probably mean deprecating "openstack overcloud node provision" and providing some other way of running the baremetal provisioning in isolation after the heat stack operation, like an equivalent to "openstack overcloud deploy --config-download-only" From mark.kirkwood at catalyst.net.nz Mon Jul 15 23:21:44 2019 From: mark.kirkwood at catalyst.net.nz (Mark Kirkwood) Date: Tue, 16 Jul 2019 11:21:44 +1200 Subject: [devstack] [trove] Enabling other datastores, plugin variable design less than tidy Message-ID: Hi, I'm doing some work with the Trove Postgres datastore. So the first thing I did was attempt to get Devstack to set up a stack with Postgres instead of Mysql. Reading the docs, it seemed that all I needed to do was this: $ cat local.conf [[local|localrc]] ADMIN_PASSWORD=password DATABASE_PASSWORD=$ADMIN_PASSWORD RABBIT_PASSWORD=$ADMIN_PASSWORD SERVICE_PASSWORD=$ADMIN_PASSWORD TROVE_DATASTORE_TYPE=postgresql TROVE_DATASTORE_VERSION=9.6 TROVE_DATASTORE_PACKAGE=postgresql-9.6 enable_plugin trove https://opendev.org/openstack/trove Wrong! After watching the plugin try to build a Mysql guest image of version 9.6 (!), I realized that more reading of the plugin source was required. So iteration 2 (or maybe it was 3...lol), of my local.conf is: [[local|localrc]] ADMIN_PASSWORD=password DATABASE_PASSWORD=$ADMIN_PASSWORD RABBIT_PASSWORD=$ADMIN_PASSWORD SERVICE_PASSWORD=$ADMIN_PASSWORD TROVE_DATASTORE_TYPE=postgresql TROVE_DATASTORE_VERSION=9.6 TROVE_DATASTORE_PACKAGE=postgresql-9.6 SERVICE_TYPE=$TROVE_DATASTORE_TYPE DATASTORE_VERSION=$TROVE_DATASTORE_VERSION VM=/opt/stack/images/ubuntu_postgresql/ubuntu_postgresql enable_plugin trove https://opendev.org/openstack/trove This works. However, it seems like those last 3 variable substitutions should not be required. 
i.e: - SERVICE_TYPE should not exist (we should use TROVE_DATASTORE_TYPE) - DATASTORE_VERSION should not exist (we should use TROVE_DATASTORE_VERSION) - VM should be constructed out of DISTRO and TROVE_DATASTORE_TYPE Thoughts? regards Mark From abishop at redhat.com Mon Jul 15 23:44:58 2019 From: abishop at redhat.com (Alan Bishop) Date: Mon, 15 Jul 2019 16:44:58 -0700 Subject: [tripleo][cinder][netapp] In-Reply-To: References: Message-ID: On Mon, Jul 15, 2019 at 7:34 AM Tomáš Bredár wrote: > Hi Alan, > Yes, this is something I was able to achieve. Sorry I think I didn't > express myself clearly. My question was how to correctly create multiple > configuration files? > Hi Tomas, Sorry, but I'm not aware of any existing template or puppet module you could use to create multiple share config files. However, if you're willing to get your hands dirty and spend time experimenting, there are a couple of tripleo facilities you might be able to use to get the job done. One approach would be to use an "extra config" hook [1], in which you could execute a script that generates the contents of your share config files (e.g. a bash script that echos data to a file). Tripleo's extraconfig/services directory [2] might provide some ideas. For example, you could create a template that defines a new, minimal tripleo "service" whose host_prep_tasks (essentially a list of ansible tasks) create the files. [1] http://tripleo.org/install/advanced_deployment/extra_config.html [2] https://github.com/openstack/tripleo-heat-templates/tree/stable/queens/extraconfig/services Unfortunately, it's been quite a while since I dabbled in these areas, so I can't point you to a concise example. Maybe a tripleo expert can provide better guidance. Alan > po 15. 7. 2019 o 15:56 Alan Bishop napísal(a): > >> >> On Mon, Jul 15, 2019 at 4:11 AM Tomáš Bredár >> wrote: >> >>> Hi Alan! >>> Thanks for the pointers. For now I'm going with a single backend with >>> two NFS shares, so I'll use the tripleo templates for netapp. >>> For the future, could you point me to the right template / puppet-cinder >>> code which can create multiple nfs share files for me? Or should I create >>> my own puppet manifest? >>> >> >> Hi Tomas, >> >> The puppet-cinder code that renders the shares config file is [1], and >> the data comes from puppet-tripleo [2]. This is puppet hiera data, and the >> value is bound to the CindeNetappNfsShares tripleo parameter [3]. >> >> So you should be able to deploy a single backend that accesses multiple >> shares by adding something like this to your tripleo deployment. >> >> parameter_defaults: >> CinderNetappNfsShares: 'host_1:/path/to/share_1,host_2:/path/to/share_2' >> >> [1] >> https://opendev.org/openstack/puppet-cinder/src/branch/stable/queens/manifests/backend/netapp.pp#L280 >> [2] >> https://opendev.org/openstack/puppet-tripleo/src/branch/stable/queens/manifests/profile/base/cinder/volume/netapp.pp#L38 >> [3] >> https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/queens/puppet/services/cinder-backend-netapp.yaml#L137 >> >> Alan >> >> Thanks again. >>> >>> Tomas >>> >>> pi 12. 7. 2019 o 16:11 Alan Bishop napísal(a): >>> >>>> >>>> On Fri, Jul 12, 2019 at 6:09 AM Tomáš Bredár >>>> wrote: >>>> >>>>> Hi Emilien! >>>>> >>>>> Thanks for your help. Yes with this I am able to define multiple >>>>> stanzas in cinder.conf. However netapp driver needs a .conf file with the >>>>> nfs shares listed in it. 
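To make the host_prep_tasks suggestion above a little more concrete, a sketch of such a minimal extra "service" template could look like the following. The service name, destination path and share addresses are purely illustrative, and this is not a drop-in template (parameter plumbing and wiring it into a role are omitted):

    heat_template_version: queens

    description: Writes an extra NFS shares file for a second NetApp backend

    outputs:
      role_data:
        description: Extra NetApp shares file service
        value:
          service_name: cinder_netapp_extra_shares
          host_prep_tasks:
            - name: Write NFS shares file for the second backend
              copy:
                dest: /var/lib/cinder/netapp_nfs_shares_2
                content: |
                  hostA:/path/to/share_a
                  hostB:/path/to/share_b

The file written on the host then still has to be made visible inside the cinder_volume container so the driver can read it; the CinderVolumeOptVolumes bind-mount parameter quoted further down in this thread is the mechanism pointed to for that.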
Defining multiple configuration files with nfs >>>>> share details in each is not possible with the manual you've sent nor with >>>>> the templates in my first email. >>>>> >>>> >>>> Hi Tomas, >>>> >>>> When deploying a single backend, the tripleo template takes care of >>>> generating the nfs shares file (actually, puppet-cinder generates the file, >>>> but it's triggered by tripleo). But when you use the custom backend method >>>> that Emilien pointed you to use, then you are responsible for supplying all >>>> the pieces for the backend(s) to function correctly. This means you will >>>> need to generate the nfs shares file on the host (controller), and then >>>> bind mount the file using CinderVolumeOptVolumes so that the shares file on >>>> the host is visible to the cinder-volume process running in a container. >>>> >>>> I'm wondering if it's possible to define a second backend by creating >>>>> another service, for example "OS::TripleO::Services::CinderBackendNetApp2" ? >>>>> >>>> >>>> Sorry, this won't work. TripleO will trying to deploy two completely >>>> separate instances of the cinder-volume service, but the two deployments >>>> will step all over each other. There has been a long standing goal of >>>> enhancing tripleo so that it can deploy multiple instances of a cinder >>>> backend, but it's a complex task that will require non-trivial changes to >>>> tripleo. >>>> >>>> Alan >>>> >>>> Tomas >>>>> >>>>> št 11. 7. 2019 o 14:35 Emilien Macchi napísal(a): >>>>> >>>>>> On Thu, Jul 11, 2019 at 7:32 AM Tomáš Bredár >>>>>> wrote: >>>>>> >>>>>>> Hi community, >>>>>>> >>>>>>> I'm trying to define multiple NetApp storage backends via Tripleo >>>>>>> installer. >>>>>>> According to [1] the puppet manifest supports multiple backends. >>>>>>> The current templates [2] [3] support only single backend. >>>>>>> Does anyone know how to define multiple netapp backends in the >>>>>>> tripleo-heat environment files / templates? >>>>>>> >>>>>> >>>>>> We don't support that via the templates that you linked, however if >>>>>> you follow this manual you should be able to configure multiple NetApp >>>>>> backends: >>>>>> >>>>>> https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/cinder_custom_backend.html >>>>>> >>>>>> Let us know how it worked! >>>>>> -- >>>>>> Emilien Macchi >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From gmann at ghanshyammann.com Tue Jul 16 03:42:00 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 16 Jul 2019 12:42:00 +0900 Subject: [dev][release][qa] patrole stable/stein is created by mistake ? In-Reply-To: <0a85c82c-f312-a8a9-34af-f5b812c8f2b3@openstack.org> References: <16beb436c21.c9d98154147017.455401003873410397@ghanshyammann.com> <16beb463457.1193bd7a4147042.5135658897011450644@ghanshyammann.com> <20190713203828.GA29711@sm-workstation> <16beddf697c.fd5e3e64150304.4626321901014847129@ghanshyammann.com> <0a85c82c-f312-a8a9-34af-f5b812c8f2b3@openstack.org> Message-ID: <16bf8df6c0c.e19126b3193710.3862555704510892753@ghanshyammann.com> ---- On Mon, 15 Jul 2019 19:30:26 +0900 Thierry Carrez wrote ---- > Ghanshyam Mann wrote: > > ---- On Sun, 14 Jul 2019 05:38:28 +0900 Sean McGinnis wrote ---- > > > On Sat, Jul 13, 2019 at 09:19:35PM +0900, Ghanshyam Mann wrote: > > > > ---- On Sat, 13 Jul 2019 21:16:32 +0900 Ghanshyam Mann wrote ---- > > > > > Hi Release team, > > > > > > > > > > Today I noticed while doing patrole review that stable/stain has been created for patrole which is wrong. 
> > > > > Patrole is branchless[1] and I remember I have not requested the stable branch while releasing the patrole. > > > > > > > > > > Is it created by mistakenly ? or intentional? > > > > > > > > I found the patch which created this - https://review.opendev.org/#/c/650173/1. > > > > > > > > Can we revert that but that patch has more changes? > > > > > > > > -gmann > > > > > > > > > > We can't really just revert it. If you would like to change this, please update > > > the patrole release type to tempest-plugin or move it to be an independent > > > release deliverable if it is not actually cycle based. > > > > > > You can also remove the branching information with that change, then after that > > > is gone and there isn't a risk of recreating it, the infra team may be able to > > > assist in deleting the existing branch. > > > > Release model for patrole is 'cycle-with-intermediary' which is right, we do not need > > to change that. We just need to remove the stable/stein and branch information which > > was added for the only stein. I will push the change. > > > > Can we have a tag which clearly says the branchless and no-stable branch nature for deliverables? > > > > 'tempest-plugins' was introduced for a different purpose. new tag can be applicable for other > > deliverables also (current or in future). That can help to avoid these errrors. > > Creating a new release model or a new deliverable type sounds a bit > overkill. > > I think the simpler would be to add a new value to the existing > "stable-branch-type" key. Like "stable-branch-type: none" and then set > that value for all tempest plugins, tempest itself and patrole. Thanks, that will work perfectly in Tempest, and its plugins cases. -gmann > > See https://review.opendev.org/670808 as a strawman. > > -- > Thierry Carrez (ttx) > > From gmann at ghanshyammann.com Tue Jul 16 03:51:45 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 16 Jul 2019 12:51:45 +0900 Subject: [dev][release][qa] patrole stable/stein is created by mistake ? In-Reply-To: <717CD206-8A3A-438F-A52B-22396CFA9BAD@doughellmann.com> References: <16beb436c21.c9d98154147017.455401003873410397@ghanshyammann.com> <16beb463457.1193bd7a4147042.5135658897011450644@ghanshyammann.com> <20190713203828.GA29711@sm-workstation> <16beddf697c.fd5e3e64150304.4626321901014847129@ghanshyammann.com> <0a85c82c-f312-a8a9-34af-f5b812c8f2b3@openstack.org> <717CD206-8A3A-438F-A52B-22396CFA9BAD@doughellmann.com> Message-ID: <16bf8e85983.b41e1d9d193754.93039526410042655@ghanshyammann.com> ---- On Tue, 16 Jul 2019 03:37:05 +0900 Doug Hellmann wrote ---- > > > On Jul 15, 2019, at 6:30 AM, Thierry Carrez wrote: > Ghanshyam Mann wrote: > ---- On Sun, 14 Jul 2019 05:38:28 +0900 Sean McGinnis wrote ---- > > On Sat, Jul 13, 2019 at 09:19:35PM +0900, Ghanshyam Mann wrote: > > > ---- On Sat, 13 Jul 2019 21:16:32 +0900 Ghanshyam Mann wrote ---- > > > > Hi Release team, > > > > > > > > Today I noticed while doing patrole review that stable/stain has been created for patrole which is wrong. > > > > Patrole is branchless[1] and I remember I have not requested the stable branch while releasing the patrole. > > > > > > > > Is it created by mistakenly ? or intentional? > > > > > > I found the patch which created this - https://review.opendev.org/#/c/650173/1. > > > > > > Can we revert that but that patch has more changes? > > > > > > -gmann > > > > > > > We can't really just revert it. 
If you would like to change this, please update > > the patrole release type to tempest-plugin or move it to be an independent > > release deliverable if it is not actually cycle based. > > > > You can also remove the branching information with that change, then after that > > is gone and there isn't a risk of recreating it, the infra team may be able to > > assist in deleting the existing branch. > Release model for patrole is 'cycle-with-intermediary' which is right, we do not need > to change that. We just need to remove the stable/stein and branch information which > was added for the only stein. I will push the change. > Can we have a tag which clearly says the branchless and no-stable branch nature for deliverables? > 'tempest-plugins' was introduced for a different purpose. new tag can be applicable for other > deliverables also (current or in future). That can help to avoid these errrors. > > Creating a new release model or a new deliverable type sounds a bit overkill. > > I think the simpler would be to add a new value to the existing "stable-branch-type" key. Like "stable-branch-type: none" and then set that value for all tempest plugins, tempest itself and patrole. > > See https://review.opendev.org/670808 as a strawman. > > -- > Thierry Carrez (ttx) > If the project isn’t creating stable branches each cycle, how is it cycle-based? What’s wrong with moving it to use the independent release model, which was created for cases like this? Tempest and all Tempest plugins are "cycle-with-intermediary" because we want a particular release tag per OpenStack release. That is what we decided in ML discussion while deciding the release model for tempest plugins [1]. With independent release mode, we face the issue of not having the compatible versions of all Tempest plugins which user need to run on against their production cloud. [1] http://lists.openstack.org/pipermail/openstack-dev/2018-June/131837.html > Doug > From dtantsur at redhat.com Tue Jul 16 07:15:50 2019 From: dtantsur at redhat.com (Dmitry Tantsur) Date: Tue, 16 Jul 2019 09:15:50 +0200 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: <6bdebb51-0b3c-2888-a691-720bd5ac039a@redhat.com> References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> <6bdebb51-0b3c-2888-a691-720bd5ac039a@redhat.com> Message-ID: <49e44a0f-84c9-195a-99d5-b843e4045454@redhat.com> On 7/16/19 12:26 AM, Steve Baker wrote: > > On 15/07/19 9:12 PM, Harald Jensås wrote: >> On Sat, 2019-07-13 at 16:19 -0400, James Slagle wrote: >>> On Fri, Jul 12, 2019 at 3:59 PM Harald Jensås >>> wrote: >>>> I've said this before, but I think we should turn this nova-less >>>> around. Now with nova-less we create a bunch of servers, and write >>>> up >>>> the parameters file to use the deployed-server approach. >>>> Effectively we >>>> still neet to have the resource group in heat making a server >>>> resource >>>> for every server. Creating the fake server resource is fast, >>>> because >>>> Heat does'nt call Nova,Ironic to create any resources. But the >>>> stack is >>>> equally big, with a stack for every node. i.e not N=1. >>>> >>>> What you are doing here, is essentially to say we don't create a >>>> resource group that then creates N number of role stacks, one for >>>> each >>>> overcloud node. You are creating a single generic "server" >>>> definition >>>> per Role. So we drop the resource group and create >>>> OS::Triple::{{Role}}.Server 1-time (once). 
To me it's backwards to >>>> push >>>> a large struct with properties for N=many nodes into the creation >>>> of >>>> that stack. >>> I'm not entirely following what you're saying is backwards. What I've >>> proposed is that we *don't* have any node specific data in the stack. >>> It sounds like you're saying the way we do it today is backwards. >>> >> What I mean to say is that I think the way we are integrating nova-less >> by first deploying the servers, to then provide the data to Heat to >> create the resource groups as we do today becomes backwards when your >> work on N=1 is introduced. >> >> >>> It's correct that what's been proposed with metalsmith currently >>> still >>> requires the full ResourceGroup with a member for each node. With the >>> template changes I'm proposing, that wouldn't be required, so we >>> could >>> actually do the Heat stack first, then metalsmith. >>> >> Yes, this is what I think we should do. Especially if your changes here >> removes the resource group entirely. It makes more sense to create the >> stack, and once that is created we can do deployment, scaling etc >> without updating the stack again. > > I think this is something we can move towards after James has finished this > work. It would probably mean deprecating "openstack overcloud node provision" > and providing some other way of running the baremetal provisioning in isolation > after the heat stack operation, like an equivalent to "openstack overcloud > deploy --config-download-only" > > I'm very much against on deprecating "openstack overcloud node provision", it's one of the reasons of this whole effort. I'm equally -2 on making the bare metal provisioning depending on heat in any way for the same reason. Dmitry From rraja at redhat.com Tue Jul 16 07:19:37 2019 From: rraja at redhat.com (Ramana Venkatesh Raja) Date: Tue, 16 Jul 2019 12:49:37 +0530 Subject: [Manila] CephFS deferred deletion In-Reply-To: <20190712131540.3eqvltysfix6eivd@barron.net> References: <20190712131540.3eqvltysfix6eivd@barron.net> Message-ID: Re-sending the email as it didn't get posted in the ML. On Fri, Jul 12, 2019 at 6:45 PM Tom Barron wrote: > > On 12/07/19 13:03 +0000, Jose Castro Leon wrote: > >Dear all, > > > >Lately, one of our clients stored 300k files in a manila cephfs share. > >Then he deleted the share in Manila. This event make the driver > >unresponsive for several hours until all the data was removed in the > >cluster. > > > >We had a quick look at the code in manila [1] and the deletion is done > >first by calling the following api calls in the ceph bindings > >(delete_volume[1] and then purge_volume[2]). The first call moves the > >directory to a volumes_deleted directory. The second call does a > >deletion in depth of all the contents of that directory. > > > >The last operation is the one that trigger the issue. > > > >We had a similar issue in the past in Cinder. There, Arne proposed to > >do a deferred deletion of volumes. I think we could do the same in > >Manila for the cephfs driver. > > > >The idea is to continue to call to the delete_volume. And then inside a > >periodic task in the driver, asynchronously it will get the contents of > >that directory and trigger the purge command. > > > >I can propose the change and contribute with the code, but before going > >to deep I would like to know if there is a reason of having a singleton > >for the volume_client connection. If I compare with cinder code the > >connection is established and closed in each operation with the > >backend. 
> > > >If you are not the maintainer, could you please point me to he/she? > >I can post it in the mailing list if you prefer > > > >Cheers > >Jose Castro Leon > >CERN Cloud Infrastructure > > > >[1] > >https://github.com/openstack/manila/blob/master/manila/share/drivers/cephfs/driver.py#L260-L267 > > > > > >[2] > >https://github.com/ceph/ceph/blob/master/src/pybind/ceph_volume_client.py#L700-L734 > > > > > >[2] > >https://github.com/ceph/ceph/blob/master/src/pybind/ceph_volume_client.py#L736-L790 > > > > > >PS: The issue was triggered by one of our clients in kubernetes using > >the Manila CSI driver > > Hi Jose, > > Let's get this fixed since there's a lot of interest in Manila CSI > driver and I think we can expect more batched deletes with it than we > have had historically. The plan is to have manila's CephFS driver use the ceph-mgr's new volumes module, https://github.com/ceph/ceph/blob/master/src/pybind/mgr/volumes/module.py to create/delete manila groups/shares/snapshots, authorize/de-authorize access to the shares. Manila shares, essentially CephFS subdirectories with a specific data layout and quota, are referred to as FS subvolumes, and Ceph filesystems as FS volumes in the ceph-mgr volumes module. The ceph-mgr volumes modules is under active development. The latest Ceph CSI (v1.1.0) release is the first consumer of this module. The Ceph CSI issues CLI calls to the ceph-mgr to manage the lifecycle of the FS subvolumes, https://github.com/ceph/ceph-csi/pull/400 We're implementing the asynchronous purge of FS subvolumes in the ceph-mgr module. The PR is close to being merged, https://github.com/ceph/ceph/pull/28003/ https://github.com/ceph/ceph/pull/28003/commits/483a2141fe8c9a58bc25a544412cdf5b047ad772 http://tracker.ceph.com/issues/40036 Issuing the `ceph fs subvolume rm` command in the Ceph CSI driver (and later in the manila driver) will move the FS subvolume to a trash directory, whose contents will be asynchronously purged by a set of worker threads. > > I've copied Ramana Raja and Patrick Donnelly since they will be able > to answer your question about the singleton volume_client connection > more authoritatively than I can. Currently, in the mgr-volumes module we establish and close connection to a FS volume (a Ceph filesystem) for each FS subvolume (CephFS subdirectory within the filesystem) operation, https://github.com/ceph/ceph/pull/28082/commits/8d29816f0f3db6c7d287bbb7469db77c9de701d1#diff-cfd3b6f517caccc18f7f066395e8a4bdR174 Instead, we want to maintain a connection to a FS volume and perform operations on its subvolumes, until the FS volume is deleted. This would reduce the time taken to perform subvolume operations, important in CSI work loads (and in OpenStack workloads?). The code is in review, https://github.com/ceph/ceph/pull/28003/commits/5c41e949af9acabd612b0644de0603e374b4b42a Thanks, Ramana > > Thanks for volunteering to propose a review to deal with this issue! 
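For readers who have not used it yet, the ceph-mgr volumes interface referred to here is plain CLI, roughly of this shape (arguments are illustrative; the linked Ceph PRs define the exact interface):

    ceph fs subvolume create <vol_name> <subvol_name> <size_in_bytes>
    ceph fs subvolume rm <vol_name> <subvol_name>    # moves the subvolume to a trash
                                                     # directory; purge runs asynchronously

so once Manila's CephFS driver calls this instead of purging directories itself, the hours-long synchronous delete described at the top of the thread moves out of the driver and into the mgr's worker threads.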
> > -- Tom Barron > From dsneddon at redhat.com Tue Jul 16 07:34:56 2019 From: dsneddon at redhat.com (Dan Sneddon) Date: Tue, 16 Jul 2019 00:34:56 -0700 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: <49e44a0f-84c9-195a-99d5-b843e4045454@redhat.com> References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> <6bdebb51-0b3c-2888-a691-720bd5ac039a@redhat.com> <49e44a0f-84c9-195a-99d5-b843e4045454@redhat.com> Message-ID: On Tue, Jul 16, 2019 at 12:19 AM Dmitry Tantsur wrote: > On 7/16/19 12:26 AM, Steve Baker wrote: > > > > On 15/07/19 9:12 PM, Harald Jensås wrote: > >> On Sat, 2019-07-13 at 16:19 -0400, James Slagle wrote: > >>> On Fri, Jul 12, 2019 at 3:59 PM Harald Jensås > >>> wrote: > >>>> I've said this before, but I think we should turn this nova-less > >>>> around. Now with nova-less we create a bunch of servers, and write > >>>> up > >>>> the parameters file to use the deployed-server approach. > >>>> Effectively we > >>>> still neet to have the resource group in heat making a server > >>>> resource > >>>> for every server. Creating the fake server resource is fast, > >>>> because > >>>> Heat does'nt call Nova,Ironic to create any resources. But the > >>>> stack is > >>>> equally big, with a stack for every node. i.e not N=1. > >>>> > >>>> What you are doing here, is essentially to say we don't create a > >>>> resource group that then creates N number of role stacks, one for > >>>> each > >>>> overcloud node. You are creating a single generic "server" > >>>> definition > >>>> per Role. So we drop the resource group and create > >>>> OS::Triple::{{Role}}.Server 1-time (once). To me it's backwards to > >>>> push > >>>> a large struct with properties for N=many nodes into the creation > >>>> of > >>>> that stack. > >>> I'm not entirely following what you're saying is backwards. What I've > >>> proposed is that we *don't* have any node specific data in the stack. > >>> It sounds like you're saying the way we do it today is backwards. > >>> > >> What I mean to say is that I think the way we are integrating nova-less > >> by first deploying the servers, to then provide the data to Heat to > >> create the resource groups as we do today becomes backwards when your > >> work on N=1 is introduced. > >> > >> > >>> It's correct that what's been proposed with metalsmith currently > >>> still > >>> requires the full ResourceGroup with a member for each node. With the > >>> template changes I'm proposing, that wouldn't be required, so we > >>> could > >>> actually do the Heat stack first, then metalsmith. > >>> > >> Yes, this is what I think we should do. Especially if your changes here > >> removes the resource group entirely. It makes more sense to create the > >> stack, and once that is created we can do deployment, scaling etc > >> without updating the stack again. > > > > I think this is something we can move towards after James has finished > this > > work. It would probably mean deprecating "openstack overcloud node > provision" > > and providing some other way of running the baremetal provisioning in > isolation > > after the heat stack operation, like an equivalent to "openstack > overcloud > > deploy --config-download-only" > > > > > > I'm very much against on deprecating "openstack overcloud node provision", > it's > one of the reasons of this whole effort. I'm equally -2 on making the bare > metal > provisioning depending on heat in any way for the same reason. > > Dmitry > > My concerns about network ports boil down to technical debt with Heat. 
It would be great if we can make the individual nodes completely independent of Heat, and somehow migrate from the old Heat-based definition for upgrades. -- Dan Sneddon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bdobreli at redhat.com Tue Jul 16 09:26:11 2019 From: bdobreli at redhat.com (Bogdan Dobrelya) Date: Tue, 16 Jul 2019 11:26:11 +0200 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> <6bdebb51-0b3c-2888-a691-720bd5ac039a@redhat.com> <49e44a0f-84c9-195a-99d5-b843e4045454@redhat.com> Message-ID: <91b5d9aa-3210-1645-f0c7-adf91b84d007@redhat.com> On 16.07.2019 9:34, Dan Sneddon wrote: > On Tue, Jul 16, 2019 at 12:19 AM Dmitry Tantsur > wrote: > > On 7/16/19 12:26 AM, Steve Baker wrote: > > > > On 15/07/19 9:12 PM, Harald Jensås wrote: > >> On Sat, 2019-07-13 at 16:19 -0400, James Slagle wrote: > >>> On Fri, Jul 12, 2019 at 3:59 PM Harald Jensås > > > >>> wrote: > >>>> I've said this before, but I think we should turn this nova-less > >>>> around. Now with nova-less we create a bunch of servers, and write > >>>> up > >>>> the parameters file to use the deployed-server approach. > >>>> Effectively we > >>>> still neet to have the resource group in heat making a server > >>>> resource > >>>> for every server. Creating the fake server resource is fast, > >>>> because > >>>> Heat does'nt call Nova,Ironic to create any resources. But the > >>>> stack is > >>>> equally big, with a stack for every node. i.e not N=1. > >>>> > >>>> What you are doing here, is essentially to say we don't create a > >>>> resource group that then creates N number of role stacks, one for > >>>> each > >>>> overcloud node. You are creating a single generic "server" > >>>> definition > >>>> per Role. So we drop the resource group and create > >>>> OS::Triple::{{Role}}.Server 1-time (once). To me it's backwards to > >>>> push > >>>> a large struct with properties for N=many nodes into the creation > >>>> of > >>>> that stack. > >>> I'm not entirely following what you're saying is backwards. > What I've > >>> proposed is that we *don't* have any node specific data in the > stack. > >>> It sounds like you're saying the way we do it today is backwards. > >>> > >> What I mean to say is that I think the way we are integrating > nova-less > >> by first deploying the servers, to then provide the data to Heat to > >> create the resource groups as we do today becomes backwards when > your > >> work on N=1 is introduced. > >> > >> > >>> It's correct that what's been proposed with metalsmith currently > >>> still > >>> requires the full ResourceGroup with a member for each node. > With the > >>> template changes I'm proposing, that wouldn't be required, so we > >>> could > >>> actually do the Heat stack first, then metalsmith. > >>> > >> Yes, this is what I think we should do. Especially if your > changes here > >> removes the resource group entirely. It makes more sense to > create the > >> stack, and once that is created we can do deployment, scaling etc > >> without updating the stack again. > > > > I think this is something we can move towards after James has > finished this > > work. 
It would probably mean deprecating "openstack overcloud > node provision" > > and providing some other way of running the baremetal > provisioning in isolation > > after the heat stack operation, like an equivalent to "openstack > overcloud > > deploy --config-download-only" > > > > > > I'm very much against on deprecating "openstack overcloud node > provision", it's > one of the reasons of this whole effort. I'm equally -2 on making > the bare metal > provisioning depending on heat in any way for the same reason. > > Dmitry > > > My concerns about network ports boil down to technical debt with Heat. > It would be great if we can make the individual nodes completely > independent of Heat, and somehow migrate from the old Heat-based > definition for upgrades. As it was earlier mentioned in the thread, we'll highly likely need some external data store to migrate/upgrade things out of Heat smoothly. That probably should be etcd? I don't think a clever ansible inventory could handle that fully replacing such a data store. > > -- > Dan Sneddon -- Best regards, Bogdan Dobrelya, Irc #bogdando From dharmendra.kushwaha at india.nec.com Tue Jul 16 09:29:34 2019 From: dharmendra.kushwaha at india.nec.com (Dharmendra Kushwaha) Date: Tue, 16 Jul 2019 09:29:34 +0000 Subject: [Tacker] Proposing changes in Tacker core team In-Reply-To: References: Message-ID: Welcome Hiroyuki Jo in Tacker Core-Team. :) Thanks & Regards Dharmendra Kushwaha ________________________________________ From: Dharmendra Kushwaha Sent: Tuesday, July 9, 2019 12:04 PM To: openstack-discuss at lists.openstack.org Subject: [Tacker] Proposing changes in Tacker core team Hello Team, I am proposing below changes in Team: I would like to propose Hiroyuki Jo to join Tacker core team. Hiroyuki Jo have lead multiple valuable features level activities like affinity policy, VDU-healing, and VNF reservation [1] in Rocky & Stein cycle, and made it sure to be completed timely. And currently working on VNF packages [2] and ETSI NFV-SOL specification support [3]. Hiroyuki has a good understanding of NFV and Tacker project, and helping team by providing sensible reviews. I believe it is a good addition in Tacker core team, and Tacker project will benefit from this nomination. On the other hand, I wanted to thank to Bharath Thiruveedula for his great & valuable contribution in the project. He helped a lot to make Tacker better in early days. But now he doesn't seem to be active in project and he decided to step-down from core team. Whenever you will decide to come back to the project, I will be happy to add you in core-team. Core-Team, Please respond with your +1/-1. If no objection, I will do these changes in next week. [1] https://review.opendev.org/#/q/project:openstack/tacker-specs+owner:%22Hiroyuki+Jo+%253Chiroyuki.jo.mt%2540hco.ntt.co.jp%253E%22 [2] https://blueprints.launchpad.net/tacker/+spec/tosca-csar-mgmt-driver [3] https://blueprints.launchpad.net/tacker/+spec/support-etsi-nfv-specs Thanks & Regards Dharmendra Kushwaha From manulachathurika at gmail.com Tue Jul 16 10:01:59 2019 From: manulachathurika at gmail.com (Manula Thantriwatte) Date: Tue, 16 Jul 2019 15:31:59 +0530 Subject: ImportError: No module named django.core.wsgi - DevStack - Stable/Stein - Ubuntu 18.04 In-Reply-To: References: Message-ID: Hi All, I have successfully install DevStack in Ubuntu 18.04. But when I'm accessing the dashboard I'm getting 500 error. 
What I'm getting in horizon_error.log is, 2019-07-15 14:10:43.218296 mod_wsgi (pid=31763): Target WSGI script '/opt/stack/horizon/openstack_dashboard/wsgi.py' cannot be loaded as Python module. 2019-07-15 14:10:43.218323 mod_wsgi (pid=31763): Exception occurred processing WSGI script '/opt/stack/horizon/openstack_dashboard/wsgi.py'. 2019-07-15 14:10:43.218349 Traceback (most recent call last): 2019-07-15 14:10:43.218370 File "/opt/stack/horizon/openstack_dashboard/wsgi.py", line 21, in 2019-07-15 14:10:43.218401 from django.core.wsgi import get_wsgi_application 2019-07-15 14:10:43.218422 ImportError: No module named django.core.wsgi Python Version is : 3.6.8 Django version is : 2.0.5 I tried with uninstalling Djanago and reinstalling it. But it didn't work for me. Can someone help me on how to resole this issue ? Thanks ! -- Regards, Manula Chathurika Thantriwatte phone : (+94) 772492511 email : manulachathurika at gmail.com Linkedin : *http://lk.linkedin.com/in/manulachathurika * blog : http://manulachathurika.blogspot.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Tue Jul 16 10:38:03 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Tue, 16 Jul 2019 12:38:03 +0200 Subject: Problems running db_sync for nova (Ocata --> Pike) Message-ID: Hi We are trying to update our Ocata cloud to Pike (this is the first step: we will go through Ocata --> Pike --> Queens --> Rocky) but we have a problem with nova-manage db sync: [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db sync" nova ERROR: Could not access cell0. Has the nova_api database been created? Has the nova_cell0 database been created? Has "nova-manage api_db sync" been run? Has "nova-manage cell_v2 map_cell0" been run? Is [api_database]/connection set in nova.conf? Is the cell0 database connection URL correct? Error: "Database schema file with version 390 doesn't exist." We have these settings in nova.conf: [database] connection = mysql+pymysql://nova_prod:xyz at 192.168.60.10:6306/nova_prod [api_database] connection = mysql+pymysql:// nova_api_prod:xyz at 192.168.60.10:6306/nova_api_prod I can't see problems accessing the databases. i.e. 
these commands work: mysql -u nova_api_prod -pxyz -h 192.168.60.10 -P 6306 nova_api_prod mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod_cell0 If I try to rerun nova-manage cell_v2 map_cell0: [root at cld-ctrl-01 ~]# nova-manage cell_v2 map_cell0 Cell0 is already setup [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage api_db version" nova 45 [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db version" nova 362 [root at cld-ctrl-01 ~]# nova-manage cell_v2 list_cells +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ | Name | UUID | Transport URL | Database Connection | +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ | cell0 | 00000000-0000-0000-0000-000000000000 | none:/// | mysql+pymysql://nova_prod:****@ 192.168.60.10:6306/nova_prod_cell0 | | cell1 | 5e42faa0-710b-4967-bb42-fcf53602c96e | rabbit://openstack_prod:****@192.168.60.183:5672 | mysql+pymysql://nova_prod:****@192.168.60.10:6306/nova_prod | +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ Following some doc we also tried to specify the --database_connection / --database-connection argument, but this seems not working Setting nova in debug mode, this is the last entry in the nova-manage log file: 2019-07-16 12:11:42.774 22013 DEBUG migrate.versioning.repository [req-8ca52407-dbe1-4a62-ad6c-16631c2a9a06 - - - - -] Config: OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'), ('repository_id', 'nova'), ('version_table', 'migrate_version'), ('required_dbs', '[]')]))]) __init__ /usr/lib/python2.7/site-packages/migrate/versioning/repository.py:83 Any hints ? Thanks, Massimo -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Tue Jul 16 10:43:01 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Tue, 16 Jul 2019 12:43:01 +0200 Subject: [nova] [ops] Problems running db sync for nova (Ocata --> Pike) Message-ID: Resending with the right tags in the subject line ... Hi We are trying to update our Ocata cloud to Pike (this is the first step: we will go through Ocata --> Pike --> Queens --> Rocky) but we have a problem with nova-manage db sync: [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db sync" nova ERROR: Could not access cell0. Has the nova_api database been created? Has the nova_cell0 database been created? Has "nova-manage api_db sync" been run? Has "nova-manage cell_v2 map_cell0" been run? Is [api_database]/connection set in nova.conf? Is the cell0 database connection URL correct? Error: "Database schema file with version 390 doesn't exist." We have these settings in nova.conf: [database] connection = mysql+pymysql://nova_prod:xyz at 192.168.60.10:6306/nova_prod [api_database] connection = mysql+pymysql:// nova_api_prod:xyz at 192.168.60.10:6306/nova_api_prod I can't see problems accessing the databases. i.e. 
these commands work: mysql -u nova_api_prod -pxyz -h 192.168.60.10 -P 6306 nova_api_prod mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod_cell0 If I try to rerun nova-manage cell_v2 map_cell0: [root at cld-ctrl-01 ~]# nova-manage cell_v2 map_cell0 Cell0 is already setup [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage api_db version" nova 45 [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db version" nova 362 [root at cld-ctrl-01 ~]# nova-manage cell_v2 list_cells +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ | Name | UUID | Transport URL | Database Connection | +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ | cell0 | 00000000-0000-0000-0000-000000000000 | none:/// | mysql+pymysql://nova_prod:****@ 192.168.60.10:6306/nova_prod_cell0 | | cell1 | 5e42faa0-710b-4967-bb42-fcf53602c96e | rabbit://openstack_prod:****@192.168.60.183:5672 | mysql+pymysql://nova_prod:****@192.168.60.10:6306/nova_prod | +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ Following some doc we also tried to specify the --database_connection / --database-connection argument, but this seems not working Setting nova in debug mode, this is the last entry in the nova-manage log file: 2019-07-16 12:11:42.774 22013 DEBUG migrate.versioning.repository [req-8ca52407-dbe1-4a62-ad6c-16631c2a9a06 - - - - -] Config: OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'), ('repository_id', 'nova'), ('version_table', 'migrate_version'), ('required_dbs', '[]')]))]) __init__ /usr/lib/python2.7/site-packages/migrate/versioning/repository.py:83 Any hints ? Thanks, Massimo   OpenStack Discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Tue Jul 16 10:48:39 2019 From: eblock at nde.ag (Eugen Block) Date: Tue, 16 Jul 2019 10:48:39 +0000 Subject: [nova] [ops] Problems running db sync for nova (Ocata --> Pike) In-Reply-To: Message-ID: <20190716104839.Horde.Pe3wSo-woFy0u8rF9gnehyD@webmail.nde.ag> Hi, I think you need to run nova-manage db sync --local_cell in Ocata. I can't quite remember the reason, but you should try it. I had the same problems and this command worked out for me. Regards, Eugen Zitat von Massimo Sgaravatto : > Resending with the right tags in the subject line ... > > > Hi > > We are trying to update our Ocata cloud to Pike (this is the first step: > we will go through Ocata --> Pike --> Queens --> Rocky) but we have a > problem with nova-manage db sync: > > > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db sync" nova > ERROR: Could not access cell0. > Has the nova_api database been created? > Has the nova_cell0 database been created? > Has "nova-manage api_db sync" been run? > Has "nova-manage cell_v2 map_cell0" been run? > Is [api_database]/connection set in nova.conf? > Is the cell0 database connection URL correct? > Error: "Database schema file with version 390 doesn't exist." 
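One quick way to narrow this down is to look at the schema version each database has actually recorded in its migrate_version table (the same table named in the debug log quoted above); comparing the cell and cell0 databases can show whether one of them is recorded at an unexpected version. A sketch reusing the connection details from the quoted report:

    mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod \
        -e 'SELECT * FROM migrate_version;'
    mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod_cell0 \
        -e 'SELECT * FROM migrate_version;'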
> > We have these settings in nova.conf: > > [database] > connection = mysql+pymysql://nova_prod:xyz at 192.168.60.10:6306/nova_prod > [api_database] > connection = mysql+pymysql:// > nova_api_prod:xyz at 192.168.60.10:6306/nova_api_prod > > I can't see problems accessing the databases. i.e. these commands work: > > mysql -u nova_api_prod -pxyz -h 192.168.60.10 -P 6306 nova_api_prod > > mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod > > mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod_cell0 > > If I try to rerun nova-manage cell_v2 map_cell0: > > [root at cld-ctrl-01 ~]# nova-manage cell_v2 map_cell0 > Cell0 is already setup > > > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage api_db version" nova > 45 > > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db version" nova > 362 > > [root at cld-ctrl-01 ~]# nova-manage cell_v2 list_cells > +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ > | Name | UUID | Transport > URL | Database Connection > | > +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ > | cell0 | 00000000-0000-0000-0000-000000000000 | > none:/// | mysql+pymysql://nova_prod:****@ > 192.168.60.10:6306/nova_prod_cell0 | > | cell1 | 5e42faa0-710b-4967-bb42-fcf53602c96e | > rabbit://openstack_prod:****@192.168.60.183:5672 | > mysql+pymysql://nova_prod:****@192.168.60.10:6306/nova_prod | > +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ > > Following some doc we also tried to specify the --database_connection / > --database-connection argument, but this seems not working > > > Setting nova in debug mode, this is the last entry in the nova-manage log > file: > > 2019-07-16 12:11:42.774 22013 DEBUG migrate.versioning.repository > [req-8ca52407-dbe1-4a62-ad6c-16631c2a9a06 - - - - -] Config: > OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'), > ('repository_id', 'nova'), ('version_table', 'migrate_version'), > ('required_dbs', '[]')]))]) __init__ > /usr/lib/python2.7/site-packages/migrate/versioning/repository.py:83 > > Any hints ? > > Thanks, Massimo >  >  > OpenStack Discuss From ruslanas at lpic.lt Tue Jul 16 10:59:48 2019 From: ruslanas at lpic.lt (=?UTF-8?Q?Ruslanas_G=C5=BEibovskis?=) Date: Tue, 16 Jul 2019 12:59:48 +0200 Subject: ImportError: No module named django.core.wsgi - DevStack - Stable/Stein - Ubuntu 18.04 In-Reply-To: References: Message-ID: I had something similar, try reinstalling python from OSP repo, not common, as it rewrites some python lib file... at least it helped me. On Tue, 16 Jul 2019 at 12:05, Manula Thantriwatte < manulachathurika at gmail.com> wrote: > Hi All, > > I have successfully install DevStack in Ubuntu 18.04. But when I'm > accessing the dashboard I'm getting 500 error. What I'm getting > in horizon_error.log is, > > 2019-07-15 14:10:43.218296 mod_wsgi (pid=31763): Target WSGI script > '/opt/stack/horizon/openstack_dashboard/wsgi.py' cannot be loaded as Python > module. > 2019-07-15 14:10:43.218323 mod_wsgi (pid=31763): Exception occurred > processing WSGI script '/opt/stack/horizon/openstack_dashboard/wsgi.py'. 
> 2019-07-15 14:10:43.218349 Traceback (most recent call last): > 2019-07-15 14:10:43.218370 File > "/opt/stack/horizon/openstack_dashboard/wsgi.py", line 21, in > 2019-07-15 14:10:43.218401 from django.core.wsgi import > get_wsgi_application > 2019-07-15 14:10:43.218422 ImportError: No module named django.core.wsgi > > Python Version is : 3.6.8 > Django version is : 2.0.5 > > I tried with uninstalling Djanago and reinstalling it. But it didn't work > for me. > > Can someone help me on how to resole this issue ? > > Thanks ! > -- > Regards, > Manula Chathurika Thantriwatte > phone : (+94) 772492511 > email : manulachathurika at gmail.com > Linkedin : *http://lk.linkedin.com/in/manulachathurika > * > blog : http://manulachathurika.blogspot.com/ > > -- Ruslanas Gžibovskis +370 6030 7030 -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Tue Jul 16 11:05:54 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Tue, 16 Jul 2019 13:05:54 +0200 Subject: [nova] [ops] Problems running db sync for nova (Ocata --> Pike) In-Reply-To: <20190716104839.Horde.Pe3wSo-woFy0u8rF9gnehyD@webmail.nde.ag> References: <20190716104839.Horde.Pe3wSo-woFy0u8rF9gnehyD@webmail.nde.ag> Message-ID: Mmm. Doesn't "--local_cell" mean that cell0 is not updated ? I was already using cell0 in Ocata ... Cheers, Massimo On Tue, Jul 16, 2019 at 12:52 PM Eugen Block wrote: > Hi, > > I think you need to run > > nova-manage db sync --local_cell > > in Ocata. I can't quite remember the reason, but you should try it. > I had the same problems and this command worked out for me. > > Regards, > Eugen > > > > Zitat von Massimo Sgaravatto : > > > Resending with the right tags in the subject line ... > > > > > > Hi > > > > We are trying to update our Ocata cloud to Pike (this is the first step: > > we will go through Ocata --> Pike --> Queens --> Rocky) but we have a > > problem with nova-manage db sync: > > > > > > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db sync" nova > > ERROR: Could not access cell0. > > Has the nova_api database been created? > > Has the nova_cell0 database been created? > > Has "nova-manage api_db sync" been run? > > Has "nova-manage cell_v2 map_cell0" been run? > > Is [api_database]/connection set in nova.conf? > > Is the cell0 database connection URL correct? > > Error: "Database schema file with version 390 doesn't exist." > > > > We have these settings in nova.conf: > > > > [database] > > connection = mysql+pymysql://nova_prod:xyz at 192.168.60.10:6306/nova_prod > > [api_database] > > connection = mysql+pymysql:// > > nova_api_prod:xyz at 192.168.60.10:6306/nova_api_prod > > > > I can't see problems accessing the databases. i.e. 
these commands work: > > > > mysql -u nova_api_prod -pxyz -h 192.168.60.10 -P 6306 nova_api_prod > > > > mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod > > > > mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod_cell0 > > > > If I try to rerun nova-manage cell_v2 map_cell0: > > > > [root at cld-ctrl-01 ~]# nova-manage cell_v2 map_cell0 > > Cell0 is already setup > > > > > > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage api_db version" nova > > 45 > > > > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db version" nova > > 362 > > > > [root at cld-ctrl-01 ~]# nova-manage cell_v2 list_cells > > > +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ > > | Name | UUID | > Transport > > URL | Database Connection > > | > > > +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ > > | cell0 | 00000000-0000-0000-0000-000000000000 | > > none:/// | mysql+pymysql://nova_prod:****@ > > 192.168.60.10:6306/nova_prod_cell0 | > > | cell1 | 5e42faa0-710b-4967-bb42-fcf53602c96e | > > rabbit://openstack_prod:****@192.168.60.183:5672 | > > mysql+pymysql://nova_prod:****@192.168.60.10:6306/nova_prod | > > > +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ > > > > Following some doc we also tried to specify the --database_connection / > > --database-connection argument, but this seems not working > > > > > > Setting nova in debug mode, this is the last entry in the nova-manage log > > file: > > > > 2019-07-16 12:11:42.774 22013 DEBUG migrate.versioning.repository > > [req-8ca52407-dbe1-4a62-ad6c-16631c2a9a06 - - - - -] Config: > > OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'), > > ('repository_id', 'nova'), ('version_table', 'migrate_version'), > > ('required_dbs', '[]')]))]) __init__ > > /usr/lib/python2.7/site-packages/migrate/versioning/repository.py:83 > > > > Any hints ? > > > > Thanks, Massimo > >  > >  > > OpenStack Discuss > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Tue Jul 16 11:11:51 2019 From: eblock at nde.ag (Eugen Block) Date: Tue, 16 Jul 2019 11:11:51 +0000 Subject: [nova] [ops] Problems running db sync for nova (Ocata --> Pike) In-Reply-To: References: <20190716104839.Horde.Pe3wSo-woFy0u8rF9gnehyD@webmail.nde.ag> Message-ID: <20190716111151.Horde.2mXkgZIEbUsVwdTqrDfdbD4@webmail.nde.ag> AFAIK cell0 is the local cell, isn't it? I'm not sure, maybe someone else can shed some light on this. Zitat von Massimo Sgaravatto : > Mmm. > Doesn't "--local_cell" mean that cell0 is not updated ? > I was already using cell0 in Ocata ... > > Cheers, Massimo > > On Tue, Jul 16, 2019 at 12:52 PM Eugen Block wrote: > >> Hi, >> >> I think you need to run >> >> nova-manage db sync --local_cell >> >> in Ocata. I can't quite remember the reason, but you should try it. >> I had the same problems and this command worked out for me. >> >> Regards, >> Eugen >> >> >> >> Zitat von Massimo Sgaravatto : >> >> > Resending with the right tags in the subject line ... 
>> > >> > >> > Hi >> > >> > We are trying to update our Ocata cloud to Pike (this is the first step: >> > we will go through Ocata --> Pike --> Queens --> Rocky) but we have a >> > problem with nova-manage db sync: >> > >> > >> > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db sync" nova >> > ERROR: Could not access cell0. >> > Has the nova_api database been created? >> > Has the nova_cell0 database been created? >> > Has "nova-manage api_db sync" been run? >> > Has "nova-manage cell_v2 map_cell0" been run? >> > Is [api_database]/connection set in nova.conf? >> > Is the cell0 database connection URL correct? >> > Error: "Database schema file with version 390 doesn't exist." >> > >> > We have these settings in nova.conf: >> > >> > [database] >> > connection = mysql+pymysql://nova_prod:xyz at 192.168.60.10:6306/nova_prod >> > [api_database] >> > connection = mysql+pymysql:// >> > nova_api_prod:xyz at 192.168.60.10:6306/nova_api_prod >> > >> > I can't see problems accessing the databases. i.e. these commands work: >> > >> > mysql -u nova_api_prod -pxyz -h 192.168.60.10 -P 6306 nova_api_prod >> > >> > mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod >> > >> > mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod_cell0 >> > >> > If I try to rerun nova-manage cell_v2 map_cell0: >> > >> > [root at cld-ctrl-01 ~]# nova-manage cell_v2 map_cell0 >> > Cell0 is already setup >> > >> > >> > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage api_db version" nova >> > 45 >> > >> > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db version" nova >> > 362 >> > >> > [root at cld-ctrl-01 ~]# nova-manage cell_v2 list_cells >> > >> +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ >> > | Name | UUID | >> Transport >> > URL | Database Connection >> > | >> > >> +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ >> > | cell0 | 00000000-0000-0000-0000-000000000000 | >> > none:/// | mysql+pymysql://nova_prod:****@ >> > 192.168.60.10:6306/nova_prod_cell0 | >> > | cell1 | 5e42faa0-710b-4967-bb42-fcf53602c96e | >> > rabbit://openstack_prod:****@192.168.60.183:5672 | >> > mysql+pymysql://nova_prod:****@192.168.60.10:6306/nova_prod | >> > >> +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ >> > >> > Following some doc we also tried to specify the --database_connection / >> > --database-connection argument, but this seems not working >> > >> > >> > Setting nova in debug mode, this is the last entry in the nova-manage log >> > file: >> > >> > 2019-07-16 12:11:42.774 22013 DEBUG migrate.versioning.repository >> > [req-8ca52407-dbe1-4a62-ad6c-16631c2a9a06 - - - - -] Config: >> > OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'), >> > ('repository_id', 'nova'), ('version_table', 'migrate_version'), >> > ('required_dbs', '[]')]))]) __init__ >> > /usr/lib/python2.7/site-packages/migrate/versioning/repository.py:83 >> > >> > Any hints ? 
>> > >> > Thanks, Massimo >> >  >> >  >> > OpenStack Discuss >> >> >> >> >> From massimo.sgaravatto at gmail.com Tue Jul 16 11:16:19 2019 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Tue, 16 Jul 2019 13:16:19 +0200 Subject: [nova] [ops] Problems running db sync for nova (Ocata --> Pike) In-Reply-To: <20190716111151.Horde.2mXkgZIEbUsVwdTqrDfdbD4@webmail.nde.ag> References: <20190716104839.Horde.Pe3wSo-woFy0u8rF9gnehyD@webmail.nde.ag> <20190716111151.Horde.2mXkgZIEbUsVwdTqrDfdbD4@webmail.nde.ag> Message-ID: My understanding (e.g. from #comment 1 of this bug: https://bugs.launchpad.net/grenade/+bug/1761775) is that --local-cell updates only the database specified as connection in [database], i.e. not also the cell0 database On Tue, Jul 16, 2019 at 1:12 PM Eugen Block wrote: > AFAIK cell0 is the local cell, isn't it? I'm not sure, maybe someone > else can shed some light on this. > > > Zitat von Massimo Sgaravatto : > > > Mmm. > > Doesn't "--local_cell" mean that cell0 is not updated ? > > I was already using cell0 in Ocata ... > > > > Cheers, Massimo > > > > On Tue, Jul 16, 2019 at 12:52 PM Eugen Block wrote: > > > >> Hi, > >> > >> I think you need to run > >> > >> nova-manage db sync --local_cell > >> > >> in Ocata. I can't quite remember the reason, but you should try it. > >> I had the same problems and this command worked out for me. > >> > >> Regards, > >> Eugen > >> > >> > >> > >> Zitat von Massimo Sgaravatto : > >> > >> > Resending with the right tags in the subject line ... > >> > > >> > > >> > Hi > >> > > >> > We are trying to update our Ocata cloud to Pike (this is the first > step: > >> > we will go through Ocata --> Pike --> Queens --> Rocky) but we have a > >> > problem with nova-manage db sync: > >> > > >> > > >> > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db sync" nova > >> > ERROR: Could not access cell0. > >> > Has the nova_api database been created? > >> > Has the nova_cell0 database been created? > >> > Has "nova-manage api_db sync" been run? > >> > Has "nova-manage cell_v2 map_cell0" been run? > >> > Is [api_database]/connection set in nova.conf? > >> > Is the cell0 database connection URL correct? > >> > Error: "Database schema file with version 390 doesn't exist." > >> > > >> > We have these settings in nova.conf: > >> > > >> > [database] > >> > connection = mysql+pymysql:// > nova_prod:xyz at 192.168.60.10:6306/nova_prod > >> > [api_database] > >> > connection = mysql+pymysql:// > >> > nova_api_prod:xyz at 192.168.60.10:6306/nova_api_prod > >> > > >> > I can't see problems accessing the databases. i.e. 
these commands > work: > >> > > >> > mysql -u nova_api_prod -pxyz -h 192.168.60.10 -P 6306 nova_api_prod > >> > > >> > mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod > >> > > >> > mysql -u nova_prod -pxyz -h 192.168.60.10 -P 6306 nova_prod_cell0 > >> > > >> > If I try to rerun nova-manage cell_v2 map_cell0: > >> > > >> > [root at cld-ctrl-01 ~]# nova-manage cell_v2 map_cell0 > >> > Cell0 is already setup > >> > > >> > > >> > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage api_db version" > nova > >> > 45 > >> > > >> > [root at cld-ctrl-01 ~]# su -s /bin/sh -c "nova-manage db version" nova > >> > 362 > >> > > >> > [root at cld-ctrl-01 ~]# nova-manage cell_v2 list_cells > >> > > >> > +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ > >> > | Name | UUID | > >> Transport > >> > URL | Database Connection > >> > | > >> > > >> > +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ > >> > | cell0 | 00000000-0000-0000-0000-000000000000 | > >> > none:/// | mysql+pymysql://nova_prod:****@ > >> > 192.168.60.10:6306/nova_prod_cell0 | > >> > | cell1 | 5e42faa0-710b-4967-bb42-fcf53602c96e | > >> > rabbit://openstack_prod:****@192.168.60.183:5672 | > >> > mysql+pymysql://nova_prod:****@192.168.60.10:6306/nova_prod | > >> > > >> > +-------+--------------------------------------+--------------------------------------------------+-------------------------------------------------------------------+ > >> > > >> > Following some doc we also tried to specify the --database_connection > / > >> > --database-connection argument, but this seems not working > >> > > >> > > >> > Setting nova in debug mode, this is the last entry in the nova-manage > log > >> > file: > >> > > >> > 2019-07-16 12:11:42.774 22013 DEBUG migrate.versioning.repository > >> > [req-8ca52407-dbe1-4a62-ad6c-16631c2a9a06 - - - - -] Config: > >> > OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'), > >> > ('repository_id', 'nova'), ('version_table', 'migrate_version'), > >> > ('required_dbs', '[]')]))]) __init__ > >> > /usr/lib/python2.7/site-packages/migrate/versioning/repository.py:83 > >> > > >> > Any hints ? > >> > > >> > Thanks, Massimo > >> >  > >> >  > >> > OpenStack Discuss > >> > >> > >> > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From witold.bedyk at suse.com Tue Jul 16 11:27:19 2019 From: witold.bedyk at suse.com (Witek Bedyk) Date: Tue, 16 Jul 2019 13:27:19 +0200 Subject: [monasca] Virtual Midcycle Meeting scheduling In-Reply-To: <40121ffb-3306-ef6b-b2ba-9d75e2a1e130@suse.de> References: <40121ffb-3306-ef6b-b2ba-9d75e2a1e130@suse.de> Message-ID: <92004787-2fc1-650f-91d0-be1fe28c97bd@suse.de> Hello everyone, our Virtual Midcycle Meeting will take place: Wed, Jul 24, 2019 2:00 PM - 4:00 PM UTC Here the URL for joining the meeting: https://global.gotomeeting.com/join/402046285 More details are available in the etherpad below. Everyone is welcome to join. Cheers Witek On 7/9/19 1:45 PM, Witek Bedyk wrote: > Hello, > > as discussed in the last team meeting, we will hold a virtual Midcycle > Meeting. The goal is to sync on the progress of the development and > update the stories if needed. I plan with 2 hours meeting. 
Please select > the times which work best for you: > > https://doodle.com/poll/zszfxakcbfm6sdha > > Please fill in the topics you would like to discuss or update on in the > etherpad: > > https://etherpad.openstack.org/p/monasca-train-midcycle > > Thanks > Witek > > -- Witek Bedyk Cloud Developer SUSE Linux GmbH, Maxfeldstr. 5, D-90409 Nürnberg Tel: +49-911-74053-0; Fax: +49-911-7417755; https://www.suse.com/ SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg) From hjensas at redhat.com Tue Jul 16 11:28:59 2019 From: hjensas at redhat.com (Harald =?ISO-8859-1?Q?Jens=E5s?=) Date: Tue, 16 Jul 2019 13:28:59 +0200 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> Message-ID: <37937901d16a52f3ef5e6761a34812f4b87723cd.camel@redhat.com> On Mon, 2019-07-15 at 11:25 -0700, Dan Sneddon wrote: > This is my main question about this proposal. When TripleO was in its > infancy, there wasn't a mechanism to create Neutron ports separately > from the server, so we created a Nova Server resource that specified > which network the port was on (originally there was only one port > created, now we create additional ports in Neutron). This can be seen > in the puppet/-role.yaml file, for example: > > resources: > Controller: > type: OS::TripleO::ControllerServer > deletion_policy: {get_param: ServerDeletionPolicy} > metadata: > os-collect-config: > command: {get_param: ConfigCommand} > splay: {get_param: ConfigCollectSplay} > properties: > [...] > networks: > - if: > - ctlplane_fixed_ip_set > - network: ctlplane > subnet: {get_param: ControllerControlPlaneSubnet} > fixed_ip: > yaql: > expression: $.data.where(not isEmpty($)).first() > data: > - get_param: [ControllerIPs, 'ctlplane', > {get_param: NodeIndex}] > - network: ctlplane > subnet: {get_param: ControllerControlPlaneSubnet} > > This has the side-effect that the ports are created by Nova calling > Neutron rather than by Heat calling Neutron for port creation. We > have maintained this mechanism even in the latest versions of THT for > backwards compatibility. This would all be easier if we were creating > the Neutron ctlplane port and then assigning it to the server, but > that breaks backwards-compatibility. > This is indeed an issue that both nova-less and N=1 need to find a solution for. As soon as the nova server resources are removed from a stack the server and ctlplane port will be deleted. We loose track of which IP was assigned to which server at that point. I believe the plan in nova-less is to use the "protected" flag for Ironic nodes to ensure the baremetal node is not unprovisioned (destroyed). So the overcloud node will keep running. This however does'nt solve the problem with the ctlplane port being deleted. We need to ensure that the port is either not deleted, or that a new port is immediately created using the same IP address as before. If we don't we will very likely have duplicate IP issues on next scale out. > How would the creation of the ctlplane port be handled in this > proposal? If metalsmith is creating the ctlplane port, do we still > need a separate Server resource for every node? If so, I imagine it > would have a much smaller stack than what we currently create for > each server. If not, would metalsmith create a port on the ctlplane > as part of the provisioning steps, and then pass this port back? 
We > still need to be able to support fixed IPs for ctlplane ports, so we > need to be able to pass a specific IP to metalsmith. > The way nova-less works is that "openstack overcloud node provision" call's metalsmith to create a port and deploy the server. Once done the data for the servers are placed in a heat environment file defining the 'DeployedServerPortMap' parameter etc so that the already existing pre- deployed-server workflow[1] can be utilized. Using fixed IPs for ctlplane ports is possible with nova-less. But the interface to do so is changed, see[2]. [1] https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/deployed_server.html#network-configuration [2] https://specs.openstack.org/openstack/tripleo-specs/specs/stein/nova-less-deploy.html#examples From james.slagle at gmail.com Tue Jul 16 12:15:02 2019 From: james.slagle at gmail.com (James Slagle) Date: Tue, 16 Jul 2019 08:15:02 -0400 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> Message-ID: On Mon, Jul 15, 2019 at 2:25 PM Dan Sneddon wrote: > > > > On Mon, Jul 15, 2019 at 2:13 AM Harald Jensås wrote: >> >> On Sat, 2019-07-13 at 16:19 -0400, James Slagle wrote: >> > On Fri, Jul 12, 2019 at 3:59 PM Harald Jensås >> > wrote: >> > > I've said this before, but I think we should turn this nova-less >> > > around. Now with nova-less we create a bunch of servers, and write >> > > up >> > > the parameters file to use the deployed-server approach. >> > > Effectively we >> > > still neet to have the resource group in heat making a server >> > > resource >> > > for every server. Creating the fake server resource is fast, >> > > because >> > > Heat does'nt call Nova,Ironic to create any resources. But the >> > > stack is >> > > equally big, with a stack for every node. i.e not N=1. >> > > >> > > What you are doing here, is essentially to say we don't create a >> > > resource group that then creates N number of role stacks, one for >> > > each >> > > overcloud node. You are creating a single generic "server" >> > > definition >> > > per Role. So we drop the resource group and create >> > > OS::Triple::{{Role}}.Server 1-time (once). To me it's backwards to >> > > push >> > > a large struct with properties for N=many nodes into the creation >> > > of >> > > that stack. >> > >> > I'm not entirely following what you're saying is backwards. What I've >> > proposed is that we *don't* have any node specific data in the stack. >> > It sounds like you're saying the way we do it today is backwards. >> > >> >> What I mean to say is that I think the way we are integrating nova-less >> by first deploying the servers, to then provide the data to Heat to >> create the resource groups as we do today becomes backwards when your >> work on N=1 is introduced. >> >> >> > It's correct that what's been proposed with metalsmith currently >> > still >> > requires the full ResourceGroup with a member for each node. With the >> > template changes I'm proposing, that wouldn't be required, so we >> > could >> > actually do the Heat stack first, then metalsmith. >> > >> >> Yes, this is what I think we should do. Especially if your changes here >> removes the resource group entirely. It makes more sense to create the >> stack, and once that is created we can do deployment, scaling etc >> without updating the stack again. >> >> > > Currently the puppet/role-role.yaml creates all the network ports >> > > etc. 
>> > > As you only want to create it once, it instead could simply output >> > > the >> > > UUID of the networks+subnets. These are identical for all servers >> > > in >> > > the role. So we end up with a small heat stack. >> > > >> > > Once the stack is created we could use that generic "server" role >> > > data >> > > to feed into something (ansible?, python?, mistral?) that calls >> > > metalsmith to build the servers, then create ports for each server >> > > in >> > > neutron, one port for each network+subnet defined in the role. Then >> > > feed that output into the json (hieradata) that is pushed to each >> > > node >> > > and used during service configuration, all the things we need to >> > > configure network interfaces, /etc/hosts and so on. We need a way >> > > to >> > > keep track of which ports belong to wich node, but I guess >> > > something >> > > simple like using the node's ironic UUID in either the name, >> > > description or tag field of the neutron port will work. There is >> > > also >> > > the extra filed in Ironic which is json type, so we could place a >> > > map >> > > of network->port_uuid in there as well. >> > >> > It won't matter whether we do baremetal provisioning before or after >> > the Heat stack. Heat won't care, as it won't have any expectation to >> > create any servers or that they are already created. We can define >> > where we end up calling the metalsmith piece as it should be >> > independent of the Heat stack if we make these template changes. >> > >> >> This is true. But, in your previous mail in this thread you wrote: >> >> """ >> Other points: >> >> - Baremetal provisioning and port creation are presently handled by >> Heat. With the ongoing efforts to migrate baremetal provisioning out >> of Heat (nova-less deploy), I think these efforts are very >> complimentary. Eventually, we get to a point where Heat is not >> actually creating any other OpenStack API resources. For now, the >> patches only work when using pre-provisioned nodes. >> """ >> >> IMO "baremetal provision and port creation" fit together. (I read the >> above statement so as well.) Currently nova-less creates the ctlplane >> port and provision the baremetal node. If we want to do both baremetal >> provisioning and port creation togheter (I think this makes sense), we >> have to do it after the stack has created the networks. >> >> What I envision is to have one method that creates all the ports, >> ctlplane + composable networks in a unified way. Today these are >> created differently, the ctlplane port is part of the server resource >> (or metalsmith in nova-less case) and the other ports are created by >> heat. > > > This is my main question about this proposal. When TripleO was in its infancy, there wasn't a mechanism to create Neutron ports separately from the server, so we created a Nova Server resource that specified which network the port was on (originally there was only one port created, now we create additional ports in Neutron). This can be seen in the puppet/-role.yaml file, for example: > > resources: > Controller: > type: OS::TripleO::ControllerServer > deletion_policy: {get_param: ServerDeletionPolicy} > metadata: > os-collect-config: > command: {get_param: ConfigCommand} > splay: {get_param: ConfigCollectSplay} > properties: > [...] 
> networks: > - if: > - ctlplane_fixed_ip_set > - network: ctlplane > subnet: {get_param: ControllerControlPlaneSubnet} > fixed_ip: > yaql: > expression: $.data.where(not isEmpty($)).first() > data: > - get_param: [ControllerIPs, 'ctlplane', {get_param: NodeIndex}] > - network: ctlplane > subnet: {get_param: ControllerControlPlaneSubnet} > > This has the side-effect that the ports are created by Nova calling Neutron rather than by Heat calling Neutron for port creation. We have maintained this mechanism even in the latest versions of THT for backwards compatibility. This would all be easier if we were creating the Neutron ctlplane port and then assigning it to the server, but that breaks backwards-compatibility. > > How would the creation of the ctlplane port be handled in this proposal? If metalsmith is creating the ctlplane port, do we still need a separate Server resource for every node? If so, I imagine it would have a much smaller stack than what we currently create for each server. If not, would metalsmith create a port on the ctlplane as part of the provisioning steps, and then pass this port back? We still need to be able to support fixed IPs for ctlplane ports, so we need to be able to pass a specific IP to metalsmith. I think most of your questions pertain to defining the right interface for baremetal provisioning with metalsmith. We more or less have a clean slate there in terms of how we want that to look going forward. Given that it won't use Nova, my understanding is that the port(s) will be created via Neutron directly. We won't need separate server resources in the stack for every node once provisioning is not part of the stack. We will need to look at how we are creating the other network isolation ports per server however. It's something that we'll need to look at to see if we want to keep using Neutron just for IPAM. It seems a little wasteful to me, but perhaps it's not an issue even with thousands of ports. Initially, you'd be able to scale with just Ansible as long as the operator does mistakenly use overlapping IP's. We could also add ansible tasks that created the ports in Neutron (or verified they were already created) so that the actual IPAM usage is properly reflected in Neutron. -- -- James Slagle -- From james.slagle at gmail.com Tue Jul 16 12:21:32 2019 From: james.slagle at gmail.com (James Slagle) Date: Tue, 16 Jul 2019 08:21:32 -0400 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: <37937901d16a52f3ef5e6761a34812f4b87723cd.camel@redhat.com> References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> <37937901d16a52f3ef5e6761a34812f4b87723cd.camel@redhat.com> Message-ID: On Tue, Jul 16, 2019 at 7:29 AM Harald Jensås wrote: > As soon as the nova server resources are removed from a stack the > server and ctlplane port will be deleted. We loose track of which IP > was assigned to which server at that point. > > I believe the plan in nova-less is to use the "protected" flag for > Ironic nodes to ensure the baremetal node is not unprovisioned > (destroyed). So the overcloud node will keep running. This however > does'nt solve the problem with the ctlplane port being deleted. We > need to ensure that the port is either not deleted, or that a new port > is immediately created using the same IP address as before. If we don't > we will very likely have duplicate IP issues on next scale out. Heat provides a supported interface mechanism to override it's built-in resource types. 
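[Editor's note: the "supported interface mechanism" referred to here is Heat's resource-plugin API. The sketch below is a generic illustration of that interface, not the actual tripleo-common code; the class name and the delete behaviour are assumptions for illustration, and a Heat engine would only pick such a plugin up if it is configured to load it (e.g. via plugin_dirs).]

  # Illustrative Heat resource plugin that overrides a built-in type.
  from heat.engine.resources.openstack.nova import server

  class RetainingServer(server.Server):
      """Behaves like OS::Nova::Server, but keeps the backing server on delete."""

      def handle_delete(self):
          # Returning None skips the Nova delete call, so removing the
          # resource from the stack leaves the node itself untouched.
          return None

  def resource_mapping():
      # Registering the class under the built-in type name is what makes
      # the override take effect for stacks managed by this engine.
      return {'OS::Nova::Server': RetainingServer}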
We in fact already do this for both OS::Nova::Server and OS::Neutron::Port. When we manage resources of those types in our stack, we are actually using our custom plugins from tripleo-common. We can add additional logic there to handle this case and define whatever we want to have happen when the resources are deleted from the stack. This would address that issue for both N=1 and nova-less. -- -- James Slagle -- From james.slagle at gmail.com Tue Jul 16 12:25:16 2019 From: james.slagle at gmail.com (James Slagle) Date: Tue, 16 Jul 2019 08:25:16 -0400 Subject: [TripleO] Scaling node counts with only Ansible (N=1) In-Reply-To: <49e44a0f-84c9-195a-99d5-b843e4045454@redhat.com> References: <23924034ea0981350b7e241aed5e99c5e769b291.camel@redhat.com> <6bdebb51-0b3c-2888-a691-720bd5ac039a@redhat.com> <49e44a0f-84c9-195a-99d5-b843e4045454@redhat.com> Message-ID: On Tue, Jul 16, 2019 at 3:23 AM Dmitry Tantsur wrote: > > On 7/16/19 12:26 AM, Steve Baker wrote: > > > > On 15/07/19 9:12 PM, Harald Jensås wrote: > >> On Sat, 2019-07-13 at 16:19 -0400, James Slagle wrote: > >>> On Fri, Jul 12, 2019 at 3:59 PM Harald Jensås > >>> wrote: > >>>> I've said this before, but I think we should turn this nova-less > >>>> around. Now with nova-less we create a bunch of servers, and write > >>>> up > >>>> the parameters file to use the deployed-server approach. > >>>> Effectively we > >>>> still neet to have the resource group in heat making a server > >>>> resource > >>>> for every server. Creating the fake server resource is fast, > >>>> because > >>>> Heat does'nt call Nova,Ironic to create any resources. But the > >>>> stack is > >>>> equally big, with a stack for every node. i.e not N=1. > >>>> > >>>> What you are doing here, is essentially to say we don't create a > >>>> resource group that then creates N number of role stacks, one for > >>>> each > >>>> overcloud node. You are creating a single generic "server" > >>>> definition > >>>> per Role. So we drop the resource group and create > >>>> OS::Triple::{{Role}}.Server 1-time (once). To me it's backwards to > >>>> push > >>>> a large struct with properties for N=many nodes into the creation > >>>> of > >>>> that stack. > >>> I'm not entirely following what you're saying is backwards. What I've > >>> proposed is that we *don't* have any node specific data in the stack. > >>> It sounds like you're saying the way we do it today is backwards. > >>> > >> What I mean to say is that I think the way we are integrating nova-less > >> by first deploying the servers, to then provide the data to Heat to > >> create the resource groups as we do today becomes backwards when your > >> work on N=1 is introduced. > >> > >> > >>> It's correct that what's been proposed with metalsmith currently > >>> still > >>> requires the full ResourceGroup with a member for each node. With the > >>> template changes I'm proposing, that wouldn't be required, so we > >>> could > >>> actually do the Heat stack first, then metalsmith. > >>> > >> Yes, this is what I think we should do. Especially if your changes here > >> removes the resource group entirely. It makes more sense to create the > >> stack, and once that is created we can do deployment, scaling etc > >> without updating the stack again. > > > > I think this is something we can move towards after James has finished this > > work. 
It would probably mean deprecating "openstack overcloud node provision" > > and providing some other way of running the baremetal provisioning in isolation > > after the heat stack operation, like an equivalent to "openstack overcloud > > deploy --config-download-only" > > > > > > I'm very much against on deprecating "openstack overcloud node provision", it's > one of the reasons of this whole effort. I'm equally -2 on making the bare metal > provisioning depending on heat in any way for the same reason. I think what's being proposed here is just that we'd change the ordering of the workflow in that we'd do the Heat stack first. That being said, I see the lack of dependency working both ways. Baremetal provisioning should not depend on Heat, and Heat should not depend on baremetal provisioning. You should be able to create the Heat stack without the servers actually existing (same as you can do today with pre-provisioned nodes). -- -- James Slagle -- From gmann at ghanshyammann.com Tue Jul 16 12:26:51 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 16 Jul 2019 21:26:51 +0900 Subject: [docs][tc][infra] what to do with developer.openstack.org and api-site? In-Reply-To: References: <48048bf0-a79c-6abf-b88f-a1132afc0d6b@suse.com> Message-ID: <16bfabff222.12ae1a1f0212237.3798556913715077596@ghanshyammann.com> ---- On Mon, 15 Jul 2019 18:50:05 +0900 Thierry Carrez wrote ---- > Andreas Jaeger wrote: > > [...] > > I see the following options: > > > > 1) Retiring developer.openstack.org completely, this would mean we would > > host the api-guides and api-references on docs.openstack.org (perhaps > > with moving them into doc/source). If we go down this road, we need to > > discuss what this means (redirects) and what to do with the Api-Guide > > and the FirstApp guide. +1 on option 1. openstack api-guides (not the individual projects api-guides) make more sense under os-api-ref which is nothing but overall openstack APIs state and less maintenance effort as you mentioned. api-references content is on project side ./api-ref/source. Do you mean to move them to doc/source ? or cannot we host docs.openstack.org from the same existing ./api-ref/source location? -gmann > > > > 2) Fully revitialize the repo and have it owned by an official team or > > SIG (this means reverting parts of https://review.opendev.org/485249/) > > > > 3) Retire the document "Writing your first OpenStack Application", and > > unretire api-site and have it owned by some official team/SIG. > > > > Any other options? What shall we do? > > Thanks Andreas for raising this. > > As an extra data point, my long-term plan was to have SDKs and CLIs > properly listed in the Software pages under SDKs[1], including > third-party ones in their own subtab, all driven from the > osf/openstack-map repository[2]. > > With that in mind, I think it would make sense to look into retiring > developer.openstack.org, and move docs to docs.openstack.org. We could > also revive https://www.openstack.org/appdev/ and use it as the base > landing page to direct application-side people to the various pieces. > > [1] https://www.openstack.org/software/project-navigator/sdks > [2] https://opendev.org/osf/openstack-map/ > > -- > Thierry Carrez (ttx) > > From gmann at ghanshyammann.com Tue Jul 16 12:29:42 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 16 Jul 2019 21:29:42 +0900 Subject: [docs][tc][infra] what to do with developer.openstack.org and api-site? 
In-Reply-To: <116050ecdd0b7c5ecbe728d914b67d3f0770a2ea.camel@redhat.com> References: <48048bf0-a79c-6abf-b88f-a1132afc0d6b@suse.com> <116050ecdd0b7c5ecbe728d914b67d3f0770a2ea.camel@redhat.com> Message-ID: <16bfac28ea9.11c8a05f6212408.2764941371109330116@ghanshyammann.com> ---- On Mon, 15 Jul 2019 23:28:33 +0900 Sean Mooney wrote ---- > On Mon, 2019-07-15 at 11:50 +0200, Thierry Carrez wrote: > > Andreas Jaeger wrote: > > > [...] > > > I see the following options: > > > > > > 1) Retiring developer.openstack.org completely, this would mean we would > > > host the api-guides and api-references on docs.openstack.org (perhaps > > > with moving them into doc/source). If we go down this road, we need to > > > discuss what this means (redirects) and what to do with the Api-Guide > > > and the FirstApp guide. > > > > > > 2) Fully revitialize the repo and have it owned by an official team or > > > SIG (this means reverting parts of https://review.opendev.org/485249/) > > > > > > 3) Retire the document "Writing your first OpenStack Application", and > > > unretire api-site and have it owned by some official team/SIG. > > > > > > Any other options? What shall we do? > > > > Thanks Andreas for raising this. > > > > As an extra data point, my long-term plan was to have SDKs and CLIs > > properly listed in the Software pages under SDKs[1], including > > third-party ones in their own subtab, all driven from the > > osf/openstack-map repository[2]. > > > > With that in mind, I think it would make sense to look into retiring > > developer.openstack.org, > i use https://developer.openstack.org/api-ref/compute/ almost daily so unless > we host the api ref somwhere else and put redirect in place i would hope we > can keep this inplace. Yeah, this link is used very frequently by many developers as well as by users. Redirect is something much needed for this site. -gmann > if we move it under docs like the config stuff > https://docs.openstack.org/nova/latest/configuration/config.html > or somewhere else that is fine but i fine it very useful to be able > to link the rendered api docs to people on irc that ask questions. > > i can obviosly point peple to github > https://github.com/openstack/nova/blob/master/api-ref/source/servers.inc > but unlike the configs the api ref is much less readable with out rendering > it with sphinx > > > and move docs to docs.openstack.org. We could > > also revive https://www.openstack.org/appdev/ and use it as the base > > landing page to direct application-side people to the various pieces. > > > > [1] https://www.openstack.org/software/project-navigator/sdks > > [2] https://opendev.org/osf/openstack-map/ > > > > > From aj at suse.com Tue Jul 16 12:35:43 2019 From: aj at suse.com (Andreas Jaeger) Date: Tue, 16 Jul 2019 14:35:43 +0200 Subject: [docs][tc][infra] what to do with developer.openstack.org and api-site? In-Reply-To: <16bfabff222.12ae1a1f0212237.3798556913715077596@ghanshyammann.com> References: <48048bf0-a79c-6abf-b88f-a1132afc0d6b@suse.com> <16bfabff222.12ae1a1f0212237.3798556913715077596@ghanshyammann.com> Message-ID: On 16/07/2019 14.26, Ghanshyam Mann wrote: > ---- On Mon, 15 Jul 2019 18:50:05 +0900 Thierry Carrez wrote ---- > > Andreas Jaeger wrote: > > > [...] > > > I see the following options: > > > > > > 1) Retiring developer.openstack.org completely, this would mean we would > > > host the api-guides and api-references on docs.openstack.org (perhaps > > > with moving them into doc/source). 
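[Editor's note: an illustrative sketch of the kind of redirect being asked for here, expressed as an Apache mod_alias rule. This is not the actual api-site/openstack-manuals configuration; the rule and target path are assumptions and would have to match wherever the guides are finally published.]

  RedirectMatch 301 "^/api-ref/(.*)$" "https://docs.openstack.org/api-ref/$1"

A permanent (301) redirect of this shape would keep existing links such as https://developer.openstack.org/api-ref/compute/ working after a move to docs.openstack.org.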
If we go down this road, we need to > > > discuss what this means (redirects) and what to do with the Api-Guide > > > and the FirstApp guide. > > +1 on option 1. > > openstack api-guides (not the individual projects api-guides) make more sense under > os-api-ref which is nothing but overall openstack APIs state and less maintenance effort as you mentioned. I propose move it to openstack-manuals instead of os-api-ref to make publishing easier. os-api-ref is a tool used by others. > api-references content is on project side ./api-ref/source. Do you mean to move them to > doc/source ? or cannot we host docs.openstack.org from the same existing ./api-ref/source location? The idea some time ago was to move them to doc/source But short-term we can just change the publishing jobs to publish to docs.openstack.org/api-reference instead of developer.openstack.org/api-reference, Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From zhangbailin at inspur.com Tue Jul 16 12:39:41 2019 From: zhangbailin at inspur.com (=?utf-8?B?QnJpbiBaaGFuZyjlvKDnmb7mnpcp?=) Date: Tue, 16 Jul 2019 12:39:41 +0000 Subject: =?utf-8?B?IFtsaXN0cy5vcGVuc3RhY2sub3Jn5Luj5Y+RXVJlOiBbZG9jc11bdGNdW2lu?= =?utf-8?B?ZnJhXSB3aGF0IHRvIGRvIHdpdGggZGV2ZWxvcGVyLm9wZW5zdGFjay5vcmcg?= =?utf-8?Q?and_api-site=3F?= Message-ID: <0b8be13f4fe54fe99b2f71d56b40b022@inspur.com> >>> Yeah, this link is used very frequently by many developers as well as by users. Redirect is something much needed for this site. +1, yeah. https://developer.openstack.org/api-ref/compute/ is used at the highest frequency, I think redirect this is necessary. ---- On Mon, 15 Jul 2019 23:28:33 +0900 Sean Mooney wrote ---- > On Mon, 2019-07-15 at 11:50 +0200, Thierry Carrez wrote: > > Andreas Jaeger wrote: > > > [...] > > > I see the following options: > > > > > > 1) Retiring developer.openstack.org completely, this would mean we would > > > host the api-guides and api-references on docs.openstack.org (perhaps > > > with moving them into doc/source). If we go down this road, we need to > > > discuss what this means (redirects) and what to do with the Api-Guide > > > and the FirstApp guide. > > > > > > 2) Fully revitialize the repo and have it owned by an official team or > > > SIG (this means reverting parts of https://review.opendev.org/485249/) > > > > > > 3) Retire the document "Writing your first OpenStack Application", and > > > unretire api-site and have it owned by some official team/SIG. > > > > > > Any other options? What shall we do? > > > > Thanks Andreas for raising this. > > > > As an extra data point, my long-term plan was to have SDKs and CLIs > > properly listed in the Software pages under SDKs[1], including > > third-party ones in their own subtab, all driven from the > > osf/openstack-map repository[2]. > > > > With that in mind, I think it would make sense to look into retiring > > developer.openstack.org, > i use https://developer.openstack.org/api-ref/compute/ almost daily so unless > we host the api ref somwhere else and put redirect in place i would hope we > can keep this inplace. Yeah, this link is used very frequently by many developers as well as by users. Redirect is something much needed for this site. 
-gmann > if we move it under docs like the config stuff > https://docs.openstack.org/nova/latest/configuration/config.html > or somewhere else that is fine but i fine it very useful to be able > to link the rendered api docs to people on irc that ask questions. > > i can obviosly point peple to github > https://github.com/openstack/nova/blob/master/api-ref/source/servers.inc > but unlike the configs the api ref is much less readable with out rendering > it with sphinx > > > and move docs to docs.openstack.org. We could > > also revive https://www.openstack.org/appdev/ and use it as the base > > landing page to direct application-side people to the various pieces. > > > > [1] https://www.openstack.org/software/project-navigator/sdks > > [2] https://opendev.org/osf/openstack-map/ > > > > > From gmann at ghanshyammann.com Tue Jul 16 12:49:15 2019 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Tue, 16 Jul 2019 21:49:15 +0900 Subject: [docs][tc][infra] what to do with developer.openstack.org and api-site? In-Reply-To: References: <48048bf0-a79c-6abf-b88f-a1132afc0d6b@suse.com> <16bfabff222.12ae1a1f0212237.3798556913715077596@ghanshyammann.com> Message-ID: <16bfad474a0.c753acc3213344.8952022068766030087@ghanshyammann.com> ---- On Tue, 16 Jul 2019 21:35:43 +0900 Andreas Jaeger wrote ---- > On 16/07/2019 14.26, Ghanshyam Mann wrote: > > ---- On Mon, 15 Jul 2019 18:50:05 +0900 Thierry Carrez wrote ---- > > > Andreas Jaeger wrote: > > > > [...] > > > > I see the following options: > > > > > > > > 1) Retiring developer.openstack.org completely, this would mean we would > > > > host the api-guides and api-references on docs.openstack.org (perhaps > > > > with moving them into doc/source). If we go down this road, we need to > > > > discuss what this means (redirects) and what to do with the Api-Guide > > > > and the FirstApp guide. > > > > +1 on option 1. > > > > openstack api-guides (not the individual projects api-guides) make more sense under > > os-api-ref which is nothing but overall openstack APIs state and less maintenance effort as you mentioned. > > I propose move it to openstack-manuals instead of os-api-ref to make > publishing easier. os-api-ref is a tool used by others. I am ok to move under openstack-manual also. Though we do not know the future home of openstack-manual as docs team is moving towards SIG. But where ever it will go, 'OpenStack api guide' can go with this. > > > api-references content is on project side ./api-ref/source. Do you mean to move them to > > doc/source ? or cannot we host docs.openstack.org from the same existing ./api-ref/source location? > > The idea some time ago was to move them to doc/source > > But short-term we can just change the publishing jobs to publish to > docs.openstack.org/api-reference instead of > developer.openstack.org/api-reference, Yeah, I remember there were no clear consesus of moving the api-ref/source to doc/source so that might take time. publishing from api-ref/source as short term options looks good to me. Thanks for initiating this discussion and caring about these sites. -gmann > > Andreas > -- > Andreas Jaeger aj at suse.com Twitter: jaegerandi > SUSE LINUX GmbH, Maxfeldstr. 
5, 90409 Nürnberg, Germany > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah > HRB 21284 (AG Nürnberg) > GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 > From aj at suse.com Tue Jul 16 12:54:39 2019 From: aj at suse.com (Andreas Jaeger) Date: Tue, 16 Jul 2019 14:54:39 +0200 Subject: [docs][tc][infra] what to do with developer.openstack.org and api-site? In-Reply-To: <16bfad474a0.c753acc3213344.8952022068766030087@ghanshyammann.com> References: <48048bf0-a79c-6abf-b88f-a1132afc0d6b@suse.com> <16bfabff222.12ae1a1f0212237.3798556913715077596@ghanshyammann.com> <16bfad474a0.c753acc3213344.8952022068766030087@ghanshyammann.com> Message-ID: <9cfa698b-4638-88ba-e0db-799cf9543ae4@suse.com> On 16/07/2019 14.49, Ghanshyam Mann wrote: > [...] > I am ok to move under openstack-manual also. Though we do not know the > future home of openstack-manual as docs team is moving towards SIG. But where ever > it will go, 'OpenStack api guide' can go with this. I think a SIG can own repositories, so there's no technical blocker for that, Andreas -- Andreas Jaeger aj at suse.com Twitter: jaegerandi SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126 From thierry at openstack.org Tue Jul 16 13:26:39 2019 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 16 Jul 2019 15:26:39 +0200 Subject: [dev][release][qa] patrole stable/stein is created by mistake ? In-Reply-To: <16bf8e85983.b41e1d9d193754.93039526410042655@ghanshyammann.com> References: <16beb436c21.c9d98154147017.455401003873410397@ghanshyammann.com> <16beb463457.1193bd7a4147042.5135658897011450644@ghanshyammann.com> <20190713203828.GA29711@sm-workstation> <1