From poby1972 at gmail.com  Sun Jul  1 09:02:14 2018
From: poby1972 at gmail.com (Kevin Kwon)
Date: Sun, 1 Jul 2018 18:02:14 +0900
Subject: [Openstack] Cinder volume Troubleshoot
Message-ID:

Dear All!

Would you please let me know how I can troubleshoot the case below? I don't know why the storage server shown below is down. Please help me figure it out.

root@OpenStack-Controller:~# openstack volume service list
+------------------+-----------------------+------+---------+-------+----------------------------+
| Binary           | Host                  | Zone | Status  | State | Updated At                 |
+------------------+-----------------------+------+---------+-------+----------------------------+
| cinder-scheduler | OpenStack-Controller  | nova | enabled | up    | 2018-07-01T08:58:19.000000 |
| cinder-volume    | OpenStack-Storage@lvm | nova | enabled | down  | 2018-07-01T07:28:59.000000 |
+------------------+-----------------------+------+---------+-------+----------------------------+
root@OpenStack-Controller:~#

Kevin

From satish.txt at gmail.com  Sun Jul  1 13:52:21 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Sun, 1 Jul 2018 09:52:21 -0400
Subject: [Openstack] flavor metadata
Message-ID:

Folks,

We recently built OpenStack for production and I have a question about flavor metadata.

I have 3 kinds of servers: 8-core, 32-core, and 40-core. Now I want to tell OpenStack that one specific application should always go to a 32-core machine. How do I tell that to the flavor metadata?

Or should I use the availability zone option and create two groups?

From satish.txt at gmail.com  Sun Jul  1 14:06:47 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Sun, 1 Jul 2018 10:06:47 -0400
Subject: [Openstack] DNS integration
Message-ID:

Folks,

Is there a way to tell OpenStack to add instances to an external DNS using some kind of API call when you launch them?

We are using an external pDNS (PowerDNS) and want our VMs to register themselves as soon as we launch them. Is this possible through Neutron, or should we use cloud-init?

From berndbausch at gmail.com  Sun Jul  1 15:02:47 2018
From: berndbausch at gmail.com (Bernd Bausch)
Date: Mon, 2 Jul 2018 00:02:47 +0900
Subject: [Openstack] Cinder volume Troubleshoot
In-Reply-To:
References:
Message-ID: <4AB3FABB-DFC3-4D3D-8DB3-2BD0D846E857 at gmail.com>

When cinder-volume has problems, you read the cinder-volume log.

Bernd

> On Jul 1, 2018, at 18:02, Kevin Kwon wrote:
>
> Dear All!
>
> Would you please let me know how I can troubleshoot the case below?
> I don't know why the storage server shown below is down.
> Please help me figure it out.
>
> root@OpenStack-Controller:~# openstack volume service list
> +------------------+-----------------------+------+---------+-------+----------------------------+
> | Binary           | Host                  | Zone | Status  | State | Updated At                 |
> +------------------+-----------------------+------+---------+-------+----------------------------+
> | cinder-scheduler | OpenStack-Controller  | nova | enabled | up    | 2018-07-01T08:58:19.000000 |
> | cinder-volume    | OpenStack-Storage@lvm | nova | enabled | down  | 2018-07-01T07:28:59.000000 |
> +------------------+-----------------------+------+---------+-------+----------------------------+
> root@OpenStack-Controller:~#
>
> Kevin

From berndbausch at gmail.com  Sun Jul  1 15:10:31 2018
From: berndbausch at gmail.com (Bernd Bausch)
Date: Mon, 2 Jul 2018 00:10:31 +0900
Subject: [Openstack] flavor metadata
In-Reply-To:
References:
Message-ID:

Not availability zones, but host aggregates. See
https://docs.openstack.org/nova/latest/user/aggregates.html.

Bernd

> On Jul 1, 2018, at 22:52, Satish Patel wrote:
>
> Folks,
>
> We recently built OpenStack for production and I have a question about flavor metadata.
>
> I have 3 kinds of servers: 8-core, 32-core, and 40-core. Now I want to tell OpenStack that one specific application should always go to a 32-core machine. How do I tell that to the flavor metadata?
>
> Or should I use the availability zone option and create two groups?

From berndbausch at gmail.com  Sun Jul  1 15:15:34 2018
From: berndbausch at gmail.com (Bernd Bausch)
Date: Mon, 2 Jul 2018 00:15:34 +0900
Subject: [Openstack] DNS integration
In-Reply-To:
References:
Message-ID:

You need DNS integration with OpenStack Designate [1]. The process is described in the Networking guide [2].

[1] https://docs.openstack.org/designate/latest
[2] https://docs.openstack.org/neutron/latest/admin/config-dns-int-ext-serv.html

Bernd

> On Jul 1, 2018, at 23:06, Satish Patel wrote:
>
> Folks,
>
> Is there a way to tell OpenStack to add instances to an external DNS using some kind of API call when you launch them?
>
> We are using an external pDNS (PowerDNS) and want our VMs to register themselves as soon as we launch them. Is this possible through Neutron, or should we use cloud-init?

From simon.leinen at switch.ch  Sun Jul  1 15:37:09 2018
From: simon.leinen at switch.ch (Simon Leinen)
Date: Sun, 1 Jul 2018 17:37:09 +0200
Subject: [Openstack] DNS integration
In-Reply-To: (Satish Patel's message of "Sun, 1 Jul 2018 10:06:47 -0400")
References:
Message-ID:

Satish Patel writes:
> Is there a way to tell OpenStack to add instances to an external DNS
> using some kind of API call when you launch them?
Yes, this is supported using external DNS integration. See e.g.

  https://docs.openstack.org/neutron/latest/admin/config-dns-int-ext-serv.html

for configuration examples.

> We are using an external pDNS (PowerDNS) and want our VMs to register
> themselves as soon as we launch them. Is this possible through
> Neutron, or should we use cloud-init?

Designate supports pDNS as a back-end, but I have only tested configurations where Designate has full control over the primary nameserver (BIND9 in my case, though as I said pDNS should work too).
--
Simon.

From satish.txt at gmail.com  Sun Jul  1 15:48:59 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Sun, 1 Jul 2018 11:48:59 -0400
Subject: [Openstack] DNS integration
In-Reply-To:
References:
Message-ID:

Thanks for the clue, I will give it a try and get back here.

Sent from my iPhone

> On Jul 1, 2018, at 11:37 AM, Simon Leinen wrote:
>
> Satish Patel writes:
>> Is there a way to tell OpenStack to add instances to an external DNS
>> using some kind of API call when you launch them?
>
> Yes, this is supported using external DNS integration. See e.g.
>
>   https://docs.openstack.org/neutron/latest/admin/config-dns-int-ext-serv.html
>
> for configuration examples.
>
>> We are using an external pDNS (PowerDNS) and want our VMs to register
>> themselves as soon as we launch them. Is this possible through
>> Neutron, or should we use cloud-init?
>
> Designate supports pDNS as a back-end, but I have only tested
> configurations where Designate has full control over the primary
> nameserver (BIND9 in my case, though as I said pDNS should work too).
> --
> Simon.

From satish.txt at gmail.com  Sun Jul  1 22:14:32 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Sun, 1 Jul 2018 18:14:32 -0400
Subject: [Openstack] openstack-ansible variable overwrite question
Message-ID:

I have deployed OSA and everything is good, but I'm having a small issue with the following file:

# cat /etc/neutron/plugins/ml2/linuxbridge_agent.ini

[linux_bridge]
physical_interface_mappings = flat:eth12,vlan:br-vlan

I WANT TO CHANGE IT TO THE FOLLOWING:

physical_interface_mappings = vlan:br-vlan

I have overridden the variable in the following file like this:

# cat /etc/openstack_deploy/user_variables.yml

neutron_linuxbridge_agent_ini_overrides:
  linux_bridge:
    physical_interface_mappings = vlan:br-vlan

After re-running the playbook, it updated the file in the following format. Am I missing something here?

[DEFAULT]
linux_bridge = physical_interface_mappings = vlan:br-vlan

[linux_bridge]
physical_interface_mappings = flat:eth12,vlan:br-vlan

From mnaser at vexxhost.com  Sun Jul  1 23:09:48 2018
From: mnaser at vexxhost.com (Mohammed Naser)
Date: Sun, 1 Jul 2018 19:09:48 -0400
Subject: [Openstack] openstack-ansible variable overwrite question
In-Reply-To:
References:
Message-ID:

On Sun, Jul 1, 2018 at 6:14 PM, Satish Patel wrote:
> I have deployed OSA and everything is good, but I'm having a small issue with the following file:
>
> # cat /etc/neutron/plugins/ml2/linuxbridge_agent.ini
>
> [linux_bridge]
> physical_interface_mappings = flat:eth12,vlan:br-vlan
>
> I WANT TO CHANGE IT TO THE FOLLOWING:
>
> physical_interface_mappings = vlan:br-vlan
>
> I have overridden the variable in the following file like this:
>
> # cat /etc/openstack_deploy/user_variables.yml
>
> neutron_linuxbridge_agent_ini_overrides:
>   linux_bridge:
>     physical_interface_mappings = vlan:br-vlan

It's a YAML structure, you want this:

neutron_linuxbridge_agent_ini_overrides:
  linux_bridge:
    physical_interface_mappings: vlan:br-vlan

> After re-running the playbook, it updated the file in the following format. Am I missing something here?
>
> [DEFAULT]
> linux_bridge = physical_interface_mappings = vlan:br-vlan
>
> [linux_bridge]
> physical_interface_mappings = flat:eth12,vlan:br-vlan

--
Mohammed Naser — vexxhost
-----------------------------------------------------
D. 514-316-8872
D. 800-910-1726 ext. 200
E. mnaser at vexxhost.com
W. http://vexxhost.com

From satish.txt at gmail.com  Mon Jul  2 01:25:53 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Sun, 1 Jul 2018 21:25:53 -0400
Subject: [Openstack] openstack-ansible variable overwrite question
In-Reply-To:
References:
Message-ID:

Damnn it... you are right... it works now.

Thanks

On Sun, Jul 1, 2018 at 7:09 PM, Mohammed Naser wrote:
> On Sun, Jul 1, 2018 at 6:14 PM, Satish Patel wrote:
>> I have deployed OSA and everything is good, but I'm having a small issue with the following file:
>>
>> # cat /etc/neutron/plugins/ml2/linuxbridge_agent.ini
>>
>> [linux_bridge]
>> physical_interface_mappings = flat:eth12,vlan:br-vlan
>>
>> I WANT TO CHANGE IT TO THE FOLLOWING:
>>
>> physical_interface_mappings = vlan:br-vlan
>>
>> I have overridden the variable in the following file like this:
>>
>> # cat /etc/openstack_deploy/user_variables.yml
>>
>> neutron_linuxbridge_agent_ini_overrides:
>>   linux_bridge:
>>     physical_interface_mappings = vlan:br-vlan
>
> It's a YAML structure, you want this:
>
> neutron_linuxbridge_agent_ini_overrides:
>   linux_bridge:
>     physical_interface_mappings: vlan:br-vlan
>
>> After re-running the playbook, it updated the file in the following format. Am I missing something here?
>>
>> [DEFAULT]
>> linux_bridge = physical_interface_mappings = vlan:br-vlan
>>
>> [linux_bridge]
>> physical_interface_mappings = flat:eth12,vlan:br-vlan
>
> --
> Mohammed Naser — vexxhost
> -----------------------------------------------------
> D. 514-316-8872
> D. 800-910-1726 ext. 200
> E. mnaser at vexxhost.com
> W. http://vexxhost.com

From satish.txt at gmail.com  Mon Jul  2 11:50:33 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Mon, 2 Jul 2018 07:50:33 -0400
Subject: [Openstack] Openstack-ansible ceph storage question
Message-ID: <1228704E-DBD8-44F0-B67E-CD9A7785922C at gmail.com>

I have built a 15-node OpenStack cluster, and now I have 5 nodes for Ceph storage. They are all HP DL360p G8s, each with 10 HDD trays. Ceph's minimum requirement is 3 monitor nodes; with only 5 Ceph nodes, it would be a little tight if I gave 3 nodes to monitors and 2 to OSDs. I was thinking: can we use the 3 infra* nodes from openstack-ansible for the monitors? How much risk is there in running a Ceph mon on an infra node, and how much memory and CPU does a monitor node need?

Sent from my iPhone

From torin.woltjer at granddial.com  Mon Jul  2 12:43:08 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Mon, 02 Jul 2018 12:43:08 GMT
Subject: [Openstack] masakari client (cannot list/add segment)
Message-ID: <7452ea5858bd4c70be986b257d67a46d at granddial.com>

Installed masakari 4.0.0 on Queens. Hostmonitor, instancemonitor, and processmonitor are all running on the compute nodes. The API and engine are running on the controller nodes.
I've tried using the masakari client to list/add segments; any of those commands does nothing and returns:

("'NoneType' object has no attribute 'auth_url'", <open file '<stderr>', mode 'w' at 0x7f26bb4b71e0>)

I cannot find any log file for the masakari client to troubleshoot this further.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

From torin.woltjer at granddial.com  Mon Jul  2 12:48:48 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Mon, 02 Jul 2018 12:48:48 GMT
Subject: [Openstack] flavor metadata
Message-ID:

I would recommend using availability zones for this.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: Satish Patel
Sent: 7/1/18 9:56 AM
To: openstack
Subject: [Openstack] flavor metadata

Folks,

We recently built OpenStack for production and I have a question about flavor metadata.

I have 3 kinds of servers: 8-core, 32-core, and 40-core. Now I want to tell OpenStack that one specific application should always go to a 32-core machine. How do I tell that to the flavor metadata?

Or should I use the availability zone option and create two groups?

From torin.woltjer at granddial.com  Mon Jul  2 12:52:02 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Mon, 02 Jul 2018 12:52:02 GMT
Subject: [Openstack] DNS integration
Message-ID: <9b39e501fda3409081621cbf76f6e038 at granddial.com>

Have a look at Designate: https://wiki.openstack.org/wiki/Designate
It has support for PowerDNS and sounds like what you're looking for.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: Satish Patel
Sent: 7/1/18 10:27 AM
To: openstack
Subject: [Openstack] DNS integration

Folks,

Is there a way to tell OpenStack to add instances to an external DNS using some kind of API call when you launch them?

We are using an external pDNS (PowerDNS) and want our VMs to register themselves as soon as we launch them. Is this possible through Neutron, or should we use cloud-init?

From satish.txt at gmail.com  Mon Jul  2 13:31:36 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Mon, 2 Jul 2018 09:31:36 -0400
Subject: [Openstack] flavor metadata
In-Reply-To:
References:
Message-ID:

Now I am confused. What is the best option, and can you give me an example of how I should use them?

~S

On Mon, Jul 2, 2018 at 8:48 AM, Torin Woltjer wrote:
> I would recommend using availability zones for this.
>
> Torin Woltjer
> Grand Dial Communications - A ZK Tech Inc. Company
> 616.776.1066 ext. 2006
> www.granddial.com
>
> ________________________________
> From: Satish Patel
> Sent: 7/1/18 9:56 AM
> To: openstack
> Subject: [Openstack] flavor metadata
>
> Folks,
>
> We recently built OpenStack for production and I have a question about flavor metadata.
>
> I have 3 kinds of servers: 8-core, 32-core, and 40-core. Now I want to tell OpenStack that one specific application should always go to a 32-core machine. How do I tell that to the flavor metadata?
>
> Or should I use the availability zone option and create two groups?

From satish.txt at gmail.com  Mon Jul  2 13:32:26 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Mon, 2 Jul 2018 09:32:26 -0400
Subject: [Openstack] DNS integration
In-Reply-To: <9b39e501fda3409081621cbf76f6e038 at granddial.com>
References: <9b39e501fda3409081621cbf76f6e038 at granddial.com>
Message-ID:

Yes, I am looking at it, but the documentation is a little confusing too.

On Mon, Jul 2, 2018 at 8:52 AM, Torin Woltjer wrote:
> Have a look at Designate: https://wiki.openstack.org/wiki/Designate
> It has support for PowerDNS and sounds like what you're looking for.
>
> Torin Woltjer
> Grand Dial Communications - A ZK Tech Inc. Company
> 616.776.1066 ext. 2006
> www.granddial.com
>
> ________________________________
> From: Satish Patel
> Sent: 7/1/18 10:27 AM
> To: openstack
> Subject: [Openstack] DNS integration
>
> Folks,
>
> Is there a way to tell OpenStack to add instances to an external DNS using some kind of API call when you launch them?
>
> We are using an external pDNS (PowerDNS) and want our VMs to register themselves as soon as we launch them. Is this possible through Neutron, or should we use cloud-init?

From dtroyer at gmail.com  Mon Jul  2 13:41:37 2018
From: dtroyer at gmail.com (Dean Troyer)
Date: Mon, 2 Jul 2018 08:41:37 -0500
Subject: [Openstack] masakari client (cannot list/add segment)
In-Reply-To: <7452ea5858bd4c70be986b257d67a46d at granddial.com>
References: <7452ea5858bd4c70be986b257d67a46d at granddial.com>
Message-ID:

On Mon, Jul 2, 2018 at 7:43 AM, Torin Woltjer wrote:
> Installed masakari 4.0.0 on Queens. Hostmonitor, instancemonitor, and
> processmonitor are all running on the compute nodes. The API and engine
> are running on the controller nodes. I've tried using the masakari client
> to list/add segments; any of those commands does nothing and returns:
>
> ("'NoneType' object has no attribute 'auth_url'", <open file '<stderr>', mode 'w' at 0x7f26bb4b71e0>)

That error is not very helpful, but the clue is the 'auth_url' part. This is usually caused by not having your auth variables set properly, in this case OS_AUTH_URL. Check all of your auth configuration for the client.

dt

--
Dean Troyer
dtroyer at gmail.com

From elbouanani.houssam at gmail.com  Mon Jul  2 13:45:39 2018
From: elbouanani.houssam at gmail.com (Houssam ElBouanani)
Date: Mon, 2 Jul 2018 14:45:39 +0100
Subject: [Openstack] Compute node on bare metal?
Message-ID:

Hi,

I have recently finished installing a minimal OpenStack Queens environment for a school project, and was asked whether it is possible to deploy an additional compute node on bare metal, i.e. without an underlying operating system, in order to eliminate the operating system overhead and thus maximize performance.

I personally guess not, because the compute node needs to run some services in order to function. Am I wrong?

Please keep in mind that I am not referring to the Ironic service; I wish to install a compute node on bare metal, not to provision bare metal machines to users.

Thanks!

From jaypipes at gmail.com  Mon Jul  2 14:45:07 2018
From: jaypipes at gmail.com (Jay Pipes)
Date: Mon, 2 Jul 2018 10:45:07 -0400
Subject: [Openstack] Compute node on bare metal?
In-Reply-To:
References:
Message-ID: <16443207-a393-b881-fca1-caeebb58efda at gmail.com>

On 07/02/2018 09:45 AM, Houssam ElBouanani wrote:
> Hi,
>
> I have recently finished installing a minimal OpenStack Queens
> environment for a school project, and was asked whether it is possible
> to deploy an additional compute node on bare metal, i.e. without an
> underlying operating system, in order to eliminate the operating system
> overhead and thus maximize performance.

Whoever asked you about this must be confusing a *hypervisor* with an operating system. Using baremetal means you eliminate the overhead of the *hypervisor* (virtualization). It doesn't mean you eliminate the operating system. You can't do much of anything with a baremetal machine that has no operating system on it.

Best,
-jay

From Remo at italy1.com  Mon Jul  2 15:08:03 2018
From: Remo at italy1.com (Remo Mattei)
Date: Mon, 2 Jul 2018 09:08:03 -0600
Subject: [Openstack] DNS integration
In-Reply-To:
References: <9b39e501fda3409081621cbf76f6e038 at granddial.com>
Message-ID: <0BF9563E-2201-4CFF-80C2-39760541871E at italy1.com>

Are you using OOO? Or what? HA mode?

Remo

> On Jul 2, 2018, at 7:32 AM, Satish Patel wrote:
>
> Yes, I am looking at it, but the documentation is a little confusing too.
>
> On Mon, Jul 2, 2018 at 8:52 AM, Torin Woltjer wrote:
>> Have a look at Designate: https://wiki.openstack.org/wiki/Designate
>> It has support for PowerDNS and sounds like what you're looking for.
>>
>> Torin Woltjer
>> Grand Dial Communications - A ZK Tech Inc. Company
>> 616.776.1066 ext. 2006
>> www.granddial.com
>>
>> ________________________________
>> From: Satish Patel
>> Sent: 7/1/18 10:27 AM
>> To: openstack
>> Subject: [Openstack] DNS integration
>>
>> Folks,
>>
>> Is there a way to tell OpenStack to add instances to an external DNS using some kind of API call when you launch them?
>>
>> We are using an external pDNS (PowerDNS) and want our VMs to register themselves as soon as we launch them. Is this possible through Neutron, or should we use cloud-init?

From satish.txt at gmail.com  Mon Jul  2 15:19:10 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Mon, 2 Jul 2018 11:19:10 -0400
Subject: [Openstack] DNS integration
In-Reply-To: <0BF9563E-2201-4CFF-80C2-39760541871E at italy1.com>
References: <9b39e501fda3409081621cbf76f6e038 at granddial.com> <0BF9563E-2201-4CFF-80C2-39760541871E at italy1.com>
Message-ID:

I am using the openstack-ansible deployment method in HA mode.

On Mon, Jul 2, 2018 at 11:08 AM, Remo Mattei wrote:
> Are you using OOO? Or what? HA mode?
>
> Remo
>
>> On Jul 2, 2018, at 7:32 AM, Satish Patel wrote:
>>
>> Yes, I am looking at it, but the documentation is a little confusing too.
>>
>> On Mon, Jul 2, 2018 at 8:52 AM, Torin Woltjer wrote:
>>> Have a look at Designate: https://wiki.openstack.org/wiki/Designate
>>> It has support for PowerDNS and sounds like what you're looking for.
>>>
>>> ________________________________
>>> From: Satish Patel
>>> Sent: 7/1/18 10:27 AM
>>> To: openstack
>>> Subject: [Openstack] DNS integration
>>>
>>> Folks,
>>>
>>> Is there a way to tell OpenStack to add instances to an external DNS using some kind of API call when you launch them?
>>>
>>> We are using an external pDNS (PowerDNS) and want our VMs to register themselves as soon as we launch them. Is this possible through Neutron, or should we use cloud-init?

From torin.woltjer at granddial.com  Mon Jul  2 15:39:52 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Mon, 02 Jul 2018 15:39:52 GMT
Subject: [Openstack] masakari client (cannot list/add segment)
Message-ID: <017329dd04934285a379b99357ac223a at granddial.com>

Running the command with the -d debug option provides this Python traceback:

Traceback (most recent call last):
  File "/usr/local/bin/masakari", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/shell.py", line 189, in main
    MasakariShell().main(args)
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/shell.py", line 160, in main
    sc = self._setup_masakari_client(api_ver, args)
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/shell.py", line 116, in _setup_masakari_client
    return masakari_client.Client(api_ver, user_agent=USER_AGENT, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/client.py", line 28, in Client
    return cls(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/v1/client.py", line 22, in __init__
    prof=prof, user_agent=user_agent, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/sdk/ha/connection.py", line 48, in create_connection
    raise e
AttributeError: 'NoneType' object has no attribute 'auth_url'

Specifying --os-auth-url http://controller:5000 doesn't change this. Is python-masakariclient incorrectly installed?

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: "Torin Woltjer"
Sent: 7/2/18 8:43 AM
To: "openstack at lists.openstack.org"
Subject: masakari client (cannot list/add segment)

Installed masakari 4.0.0 on Queens. Hostmonitor, instancemonitor, and processmonitor are all running on the compute nodes. The API and engine are running on the controller nodes. I've tried using the masakari client to list/add segments; any of those commands does nothing and returns:

("'NoneType' object has no attribute 'auth_url'", <open file '<stderr>', mode 'w' at 0x7f26bb4b71e0>)

I cannot find any log file for the masakari client to troubleshoot this further.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

From torin.woltjer at granddial.com  Mon Jul  2 18:11:00 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Mon, 02 Jul 2018 18:11:00 GMT
Subject: [Openstack] masakari client (cannot list/add segment)
Message-ID:

Installing it with tox instead of pip seems to have precisely the same effect. Is there a config file for the masakari client that I am not aware of? Nothing seems to be provided with it, and documentation is nonexistent.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: "Torin Woltjer"
Sent: 7/2/18 11:45 AM
To:
Subject: Re: [Openstack] masakari client (cannot list/add segment)

Running the command with the -d debug option provides this Python traceback:

Traceback (most recent call last):
  File "/usr/local/bin/masakari", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/shell.py", line 189, in main
    MasakariShell().main(args)
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/shell.py", line 160, in main
    sc = self._setup_masakari_client(api_ver, args)
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/shell.py", line 116, in _setup_masakari_client
    return masakari_client.Client(api_ver, user_agent=USER_AGENT, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/client.py", line 28, in Client
    return cls(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/v1/client.py", line 22, in __init__
    prof=prof, user_agent=user_agent, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/masakariclient/sdk/ha/connection.py", line 48, in create_connection
    raise e
AttributeError: 'NoneType' object has no attribute 'auth_url'

Specifying --os-auth-url http://controller:5000 doesn't change this. Is python-masakariclient incorrectly installed?

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com
----------------------------------------
From: "Torin Woltjer"
Sent: 7/2/18 8:43 AM
To: "openstack at lists.openstack.org"
Subject: masakari client (cannot list/add segment)

Installed masakari 4.0.0 on Queens. Hostmonitor, instancemonitor, and processmonitor are all running on the compute nodes. The API and engine are running on the controller nodes. I've tried using the masakari client to list/add segments; any of those commands does nothing and returns:

("'NoneType' object has no attribute 'auth_url'", <open file '<stderr>', mode 'w' at 0x7f26bb4b71e0>)

I cannot find any log file for the masakari client to troubleshoot this further.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

From berndbausch at gmail.com  Mon Jul  2 22:26:53 2018
From: berndbausch at gmail.com (Bernd Bausch)
Date: Tue, 3 Jul 2018 07:26:53 +0900
Subject: [Openstack] flavor metadata
In-Reply-To:
References:
Message-ID: <196613d0-744d-c82f-a381-05b69e4c107a at gmail.com>

The purpose of availability zones is segregating your servers so that downtime of a group of servers doesn't affect another group of servers. For example, servers in different buildings or hooked up to different power lines. An important detail: a server can be in a single availability zone at most.

This is not the same as segregating servers according to their capabilities. For example, you may want to categorize your servers by core count, storage type, and network bandwidth. In this case, a server may be in three groups: the 24-core group, the SSD group, and the 40Gbit group.

For that, you need host aggregates. The Nova documentation tells you how to use host aggregates: https://docs.openstack.org/nova/latest/user/aggregates.html.

Bernd Bausch

On 7/2/2018 10:31 PM, Satish Patel wrote:
> Now I am confused. What is the best option, and can you give me an
> example of how I should use them?
>
> ~S
>
> On Mon, Jul 2, 2018 at 8:48 AM, Torin Woltjer wrote:
>> I would recommend using availability zones for this.
>>
>> Torin Woltjer
>> Grand Dial Communications - A ZK Tech Inc. Company
>> 616.776.1066 ext. 2006
>> www.granddial.com
>>
>> ________________________________
>> From: Satish Patel
>> Sent: 7/1/18 9:56 AM
>> To: openstack
>> Subject: [Openstack] flavor metadata
>>
>> Folks,
>>
>> We recently built OpenStack for production and I have a question about flavor metadata.
>>
>> I have 3 kinds of servers: 8-core, 32-core, and 40-core. Now I want to tell OpenStack that one specific application should always go to a 32-core machine. How do I tell that to the flavor metadata?
>>
>> Or should I use the availability zone option and create two groups?

From berndbausch at gmail.com  Mon Jul  2 22:34:01 2018
From: berndbausch at gmail.com (Bernd Bausch)
Date: Tue, 3 Jul 2018 07:34:01 +0900
Subject: [Openstack] Compute node on bare metal?
In-Reply-To: <16443207-a393-b881-fca1-caeebb58efda at gmail.com>
References: <16443207-a393-b881-fca1-caeebb58efda at gmail.com>
Message-ID: <95f9e1dd-3e92-fb18-1ee3-bd262a841e0d at gmail.com>

They are probably thinking of VMware ESXi, which is both an operating system kernel, named vmkernel, and a hypervisor. OpenStack is not a hypervisor. It *uses* hypervisors to manage virtual machines.

Furthermore, OpenStack is written in Python, so that, as a minimum, your "baremetal" would have to be able to run Python code.
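To make that concrete (an illustration only, assuming a stock KVM compute node; service and package names vary by distribution), the "hypervisor" that Nova drives is just ordinary software hosted by a Linux operating system:

# on the compute node
systemctl status libvirtd        # the hypervisor management daemon
systemctl status nova-compute    # Nova's agent that drives libvirt/KVM
python -c "import nova; print(nova.__file__)"   # nova-compute is plain Python

Take away the operating system and none of these have anywhere to run.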
On 7/2/2018 11:45 PM, Jay Pipes wrote:
> On 07/02/2018 09:45 AM, Houssam ElBouanani wrote:
>> Hi,
>>
>> I have recently finished installing a minimal OpenStack Queens
>> environment for a school project, and was asked whether it is possible
>> to deploy an additional compute node on bare metal, i.e. without an
>> underlying operating system, in order to eliminate the operating system
>> overhead and thus maximize performance.
>
> Whoever asked you about this must be confusing a *hypervisor* with an
> operating system. Using baremetal means you eliminate the overhead of
> the *hypervisor* (virtualization). It doesn't mean you eliminate the
> operating system. You can't do much of anything with a baremetal
> machine that has no operating system on it.
>
> Best,
> -jay

From openstack at medberry.net  Mon Jul  2 23:11:45 2018
From: openstack at medberry.net (David Medberry)
Date: Mon, 2 Jul 2018 17:11:45 -0600
Subject: [Openstack] flavor metadata
In-Reply-To:
References:
Message-ID:

Bernd has this right. Host aggregates (sometimes called Haggs) is the right solution to this problem. You can set up a flavor to only run on a certain hagg. This works well (in production, at scale).

On Sun, Jul 1, 2018 at 7:52 AM, Satish Patel wrote:
> Folks,
>
> We recently built OpenStack for production and I have a question about flavor metadata.
>
> I have 3 kinds of servers: 8-core, 32-core, and 40-core. Now I want to tell OpenStack that one specific application should always go to a 32-core machine. How do I tell that to the flavor metadata?
>
> Or should I use the availability zone option and create two groups?

From torin.woltjer at granddial.com  Tue Jul  3 15:28:02 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Tue, 03 Jul 2018 15:28:02 GMT
Subject: [Openstack] Recovering from full outage
Message-ID:

We just suffered a power outage in our data center and I'm having trouble recovering the OpenStack cluster. All of the nodes are back online; every instance shows active, but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present, and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the Neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks?

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

From jimmy at openstack.org  Tue Jul  3 15:44:38 2018
From: jimmy at openstack.org (Jimmy McArthur)
Date: Tue, 03 Jul 2018 10:44:38 -0500
Subject: [Openstack] Recovering from full outage
In-Reply-To:
References:
Message-ID: <5B3B99E6.4070306 at openstack.org>

I'm adding this to the OpenStack Operators list as it's a bit better for these types of questions.

Torin Woltjer wrote:
> We just suffered a power outage in our data center and I'm having
> trouble recovering the OpenStack cluster. All of the nodes are back
> online; every instance shows active, but `virsh list --all` on the
> compute nodes shows that all of the VMs are actually shut down. Running
> `ip addr` on any of the nodes shows that none of the bridges are
> present, and `ip netns` shows that all of the network namespaces are
> missing as well. So despite all of the Neutron services running, none
> of the networking appears to be active, which is concerning. How do I
> solve this without recreating all of the networks?
>
> Torin Woltjer
> Grand Dial Communications - A ZK Tech Inc. Company
> 616.776.1066 ext. 2006
> www.granddial.com

From martialmichel at datamachines.io  Tue Jul  3 20:34:54 2018
From: martialmichel at datamachines.io (Martial Michel)
Date: Tue, 3 Jul 2018 16:34:54 -0400
Subject: [Openstack] [Scientific] No Scientific SIG meeting on July 4th
Message-ID:

We will skip this week's Scientific SIG IRC meeting (originally scheduled for 2018-07-04 1100 UTC in channel #openstack-meeting).

For my colleagues located in the USA, Happy July 4th! -- Martial

From torin.woltjer at granddial.com  Tue Jul  3 21:03:30 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Tue, 03 Jul 2018 21:03:30 GMT
Subject: [Openstack] Recovering from full outage
Message-ID:

Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. The most notable of the logs is neutron-server.log, which shows the following:
http://paste.openstack.org/show/724917/

I realized that rabbitmq was in a failed state, so I bootstrapped it and rebooted the controllers, and all of the agents now show online:
http://paste.openstack.org/show/724921/

And all of the instances can be properly started; however, I cannot ping any of the instances' floating IPs or the Neutron router. And when logging into an instance with the console, there is no IP address on any interface.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu
Sent: 7/3/18 11:50 AM
To: torin.woltjer at granddial.com
Subject: Re: [Openstack] Recovering from full outage

Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes.

On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote:

We just suffered a power outage in our data center and I'm having trouble recovering the OpenStack cluster. All of the nodes are back online; every instance shows active, but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down.
Running `ip addr` on any of the nodes shows that none of the bridges are present, and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the Neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks?

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

From torin.woltjer at granddial.com  Tue Jul  3 21:34:26 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Tue, 03 Jul 2018 21:34:26 GMT
Subject: [Openstack] Recovering from full outage
Message-ID:

The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/

No such errors are on the compute nodes themselves.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: "Torin Woltjer"
Sent: 7/3/18 5:14 PM
To:
Cc: "openstack-operators at lists.openstack.org", "openstack at lists.openstack.org"
Subject: Re: [Openstack] Recovering from full outage

Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. The most notable of the logs is neutron-server.log, which shows the following:
http://paste.openstack.org/show/724917/

I realized that rabbitmq was in a failed state, so I bootstrapped it and rebooted the controllers, and all of the agents now show online:
http://paste.openstack.org/show/724921/

And all of the instances can be properly started; however, I cannot ping any of the instances' floating IPs or the Neutron router. And when logging into an instance with the console, there is no IP address on any interface.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu
Sent: 7/3/18 11:50 AM
To: torin.woltjer at granddial.com
Subject: Re: [Openstack] Recovering from full outage

Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes.

On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote:

We just suffered a power outage in our data center and I'm having trouble recovering the OpenStack cluster. All of the nodes are back online; every instance shows active, but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present, and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the Neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks?

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

From lmihaiescu at gmail.com  Tue Jul  3 23:47:13 2018
From: lmihaiescu at gmail.com (George Mihaiescu)
Date: Tue, 3 Jul 2018 19:47:13 -0400
Subject: [Openstack] Recovering from full outage
In-Reply-To:
References:
Message-ID: <06CB62EE-0078-4C35-AF6C-7E7099DBC474 at gmail.com>

Did you set a lock_path in the neutron's config?

> On Jul 3, 2018, at 17:34, Torin Woltjer wrote:
>
> The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/
>
> No such errors are on the compute nodes themselves.
>
> Torin Woltjer
> Grand Dial Communications - A ZK Tech Inc. Company
> 616.776.1066 ext. 2006
> www.granddial.com
>
> From: "Torin Woltjer"
> Sent: 7/3/18 5:14 PM
> To:
> Cc: "openstack-operators at lists.openstack.org", "openstack at lists.openstack.org"
> Subject: Re: [Openstack] Recovering from full outage
>
> Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. The most notable of the logs is neutron-server.log, which shows the following:
> http://paste.openstack.org/show/724917/
>
> I realized that rabbitmq was in a failed state, so I bootstrapped it and rebooted the controllers, and all of the agents now show online:
> http://paste.openstack.org/show/724921/
>
> And all of the instances can be properly started; however, I cannot ping any of the instances' floating IPs or the Neutron router. And when logging into an instance with the console, there is no IP address on any interface.
>
> From: George Mihaiescu
> Sent: 7/3/18 11:50 AM
> To: torin.woltjer at granddial.com
> Subject: Re: [Openstack] Recovering from full outage
>
> Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes.
>
> On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote:
>
> We just suffered a power outage in our data center and I'm having trouble recovering the OpenStack cluster. All of the nodes are back online; every instance shows active, but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present, and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the Neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks?
>
> Torin Woltjer
> Grand Dial Communications - A ZK Tech Inc. Company
> 616.776.1066 ext. 2006
> www.granddial.com
From nguyentrongtan124 at gmail.com  Wed Jul  4 10:28:59 2018
From: nguyentrongtan124 at gmail.com (Nguyễn Trọng Tấn)
Date: Wed, 4 Jul 2018 17:28:59 +0700
Subject: [Openstack] Novaclient redirect endpoint https into http
Message-ID: <00be01d41381$d37b2940$7a717bc0$ at gmail.com>

Hi Nova development team!

I built OpenStack Queens with 05 nodes (03 controller nodes, 02 compute nodes). I use HAProxy to load-balance all services.

First, I created the endpoints with the http protocol. Everything was successful. Next, I changed the endpoints from http to https. After completing that, I can use the openstack command normally, but I cannot use the nova command; the nova endpoint gets redirected from https to http. Here: http://prntscr.com/k2e8s6 (command: nova --insecure service list)

And this is the error log: Unable to establish connection to http://192.168.30.70:8774/v2.1/: ('Connection aborted.', BadStatusLine("''",))

Endpoint after the change: https://prnt.sc/k2cgjm
Horizon error: https://prnt.sc/k2chwm
Nova version: http://prntscr.com/k2eaxh

My lab: Ubuntu 16.04 LTS. Please help me fix this bug.

Thanks and Best Regards!
Nguyen Trong Tan
OpenStack user group Vietnam.

From laszlo.budai at gmail.com  Wed Jul  4 11:27:00 2018
From: laszlo.budai at gmail.com (Budai Laszlo)
Date: Wed, 4 Jul 2018 14:27:00 +0300
Subject: [Openstack] openstack-ansible question
Message-ID: <5cc31a5a-1496-5127-9e49-f04f61f4aee4 at gmail.com>

Dear all,

Is it possible to define exceptions in the group_binds definition for a network? For instance, we have something like this:

- network:
    container_bridge: "vlan4"
    container_type: "veth"
    container_interface: "eth1"
    ip_from_q: "container"
    type: "raw"
    group_binds:
      *- all_containers*
      - hosts
    is_container_address: true
    is_ssh_address: true

So, instead of *all_containers* I would be interested in something like "all_containers except those running on the ceph nodes".

Any ideas are welcome.

Thank you,
Laszlo

From bogdan.katynski at workday.com  Wed Jul  4 14:50:26 2018
From: bogdan.katynski at workday.com (Bogdan Katynski)
Date: Wed, 4 Jul 2018 14:50:26 +0000
Subject: [Openstack] Novaclient redirect endpoint https into http
In-Reply-To: <00be01d41381$d37b2940$7a717bc0$ at gmail.com>
References: <00be01d41381$d37b2940$7a717bc0$ at gmail.com>
Message-ID: <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4 at workday.com>

>
> But I cannot use the nova command; the nova endpoint gets redirected from https to http. Here: http://prntscr.com/k2e8s6 (command: nova --insecure service list)

First of all, it seems that the nova client is hitting the /v2.1 URI instead of /v2.1/, and this seems to be triggering the redirect.

Since the openstack CLI works, I presume it must be using the correct URL and hence it's not getting redirected.

>
> And this is the error log: Unable to establish connection to http://192.168.30.70:8774/v2.1/: ('Connection aborted.', BadStatusLine("''",))
>

It looks to me like nova-api does a redirect to an absolute URL. I suspect SSL is terminated on the HAProxy and nova-api itself is configured without SSL, so it redirects to an http URL.

In my opinion, nova would be more load-balancer friendly if it used a relative URI in the redirect, but that's outside of the scope of this question, and since I don't know the context behind choosing the absolute URL, I could be wrong on that.
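If SSL termination on the proxy is indeed the cause, the usual cure is to have the load balancer pass the original scheme along to the API. A minimal sketch, assuming SSL terminates on HAProxy (the frontend/backend names and the certificate path below are made up for illustration; your HAProxy layout may differ):

frontend nova-api-ssl
    bind 192.168.30.70:8774 ssl crt /etc/haproxy/certs/nova.pem
    # tell nova-api that the client actually connected over https
    http-request set-header X-Forwarded-Proto https
    default_backend nova-api

nova-api then also needs to be configured to trust that header (for example via oslo.middleware's proxy header parsing), so that the URLs it generates and redirects to use https.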
I had a similar problem with heat-api running behind an Apache reverse proxy, and managed to resolve it by applying the workaround from this bug report:

https://bugs.launchpad.net/python-heatclient/+bug/1420907

Setting

X-Forwarded-Proto: https

before forwarding the request to heat-api fixed the issue for me.

--
Bogdan Katyński
freenode: bodgix

From toni.mueller at oeko.net  Wed Jul  4 15:08:34 2018
From: toni.mueller at oeko.net (Toni Mueller)
Date: Wed, 4 Jul 2018 16:08:34 +0100
Subject: [Openstack] NUMA some of the time?
Message-ID: <20180704150834.4szwgh6cjrs4gq6m at bla.tonimueller.org>

Hi,

I am still trying to figure out how to best utilise the small set of hardware, and discovered the NUMA configuration mechanism. It allows me to configure reserved cores for certain VMs, but it does not seem to allow me to say "you can share these cores, but VMs of, say, an appropriate flavour take precedence and will throw you off these cores in case they need more power".

How can I achieve that, dynamically?

TIA!

Thanks,
Toni

From nguyentrongtan124 at gmail.com  Thu Jul  5 01:12:39 2018
From: nguyentrongtan124 at gmail.com (Nguyễn Trọng Tấn)
Date: Thu, 5 Jul 2018 08:12:39 +0700
Subject: [Openstack] Novaclient redirect endpoint https into http
In-Reply-To: <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4 at workday.com>
References: <00be01d41381$d37b2940$7a717bc0$ at gmail.com> <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4 at workday.com>
Message-ID: <00da01d413fd$4600d3e0$d2027ba0$ at gmail.com>

Thank you, Katynski, for the response.

But I had configured HAProxy correctly. Here is my config: http://prntscr.com/k2ofwv

And when I use the openstack command, it is successful. Here: http://prntscr.com/k2ogau

I don't think my config is wrong. I can create, delete, list, and show any VM with the openstack command successfully.

Thanks and Best Regards!
Nguyen Trong Tan
OpenStack user group Vietnam.

-----Original Message-----
From: Bogdan Katynski [mailto:bogdan.katynski at workday.com]
Sent: Wednesday, July 4, 2018 9:50 PM
To: Nguyễn Trọng Tấn
Cc: openstack-operators at lists.openstack.org; openstack at lists.openstack.org; Lê Quang Long (VDC-IT)
Subject: Re: [Openstack] Novaclient redirect endpoint https into http

> > But I cannot use the nova command; the nova endpoint gets redirected from https to http. Here: http://prntscr.com/k2e8s6 (command: nova --insecure service list)

First of all, it seems that the nova client is hitting the /v2.1 URI instead of /v2.1/, and this seems to be triggering the redirect.

Since the openstack CLI works, I presume it must be using the correct URL and hence it's not getting redirected.

> > And this is the error log: Unable to establish connection to http://192.168.30.70:8774/v2.1/: ('Connection aborted.', BadStatusLine("''",))

It looks to me like nova-api does a redirect to an absolute URL. I suspect SSL is terminated on the HAProxy and nova-api itself is configured without SSL, so it redirects to an http URL.

In my opinion, nova would be more load-balancer friendly if it used a relative URI in the redirect, but that's outside of the scope of this question, and since I don't know the context behind choosing the absolute URL, I could be wrong on that.

I had a similar problem with heat-api running behind an Apache reverse proxy, and managed to resolve it by applying the workaround from this bug report:

https://bugs.launchpad.net/python-heatclient/+bug/1420907

Setting

X-Forwarded-Proto: https

before forwarding the request to heat-api fixed the issue for me.

--
Bogdan Katyński
freenode: bodgix

From jaosorior at gmail.com  Thu Jul  5 01:36:59 2018
From: jaosorior at gmail.com (Juan Antonio Osorio)
Date: Wed, 4 Jul 2018 20:36:59 -0500
Subject: [Openstack] Novaclient redirect endpoint https into http
In-Reply-To: <00da01d413fd$4600d3e0$d2027ba0$ at gmail.com>
References: <00be01d41381$d37b2940$7a717bc0$ at gmail.com> <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4 at workday.com> <00da01d413fd$4600d3e0$d2027ba0$ at gmail.com>
Message-ID:

Are you using http_to_wsgi_middleware? You've got to enable that in the nova config and make sure it's in your paste config.

On Wed, 4 Jul 2018, 20:22 Nguyễn Trọng Tấn, wrote:

> Thank you, Katynski, for the response.
>
> But I had configured HAProxy correctly. Here is my config:
> http://prntscr.com/k2ofwv
>
> And when I use the openstack command, it is successful. Here:
> http://prntscr.com/k2ogau
>
> I don't think my config is wrong. I can create, delete, list, and show
> any VM with the openstack command successfully.
>
> Thanks and Best Regards!
> Nguyen Trong Tan
> OpenStack user group Vietnam.

From nguyentrongtan124 at gmail.com  Thu Jul  5 01:54:59 2018
From: nguyentrongtan124 at gmail.com (Nguyễn Trọng Tấn)
Date: Thu, 5 Jul 2018 08:54:59 +0700
Subject: [Openstack] Novaclient redirect endpoint https into http
In-Reply-To:
References: <00be01d41381$d37b2940$7a717bc0$ at gmail.com> <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4 at workday.com> <00da01d413fd$4600d3e0$d2027ba0$ at gmail.com>
Message-ID: <00dd01d41403$2fa0c5f0$8ee251d0$ at gmail.com>

Thank you, Juan Antonio Osorio! With your response, I have fixed this error. I had to add more config in nova.conf. Here:

[oslo_middleware]
secure_proxy_ssl_header = X-Forwarded-Proto
enable_proxy_headers_parsing = true

Now I can use the nova command normally: http://prntscr.com/k2oq7o

Thank you very much.

Thanks and Best Regards!
Nguyen Trong Tan
OpenStack user group Vietnam.

From: Juan Antonio Osorio [mailto:jaosorior at gmail.com]
Sent: Thursday, July 5, 2018 8:37 AM
To: Nguyễn Trọng Tấn
Cc: Bogdan Katynski; openstack at lists.openstack.org; Lê Quang Long (VDC-IT)
Subject: Re: [Openstack] Novaclient redirect endpoint https into http

Are you using http_to_wsgi_middleware? You've got to enable that in the nova config and make sure it's in your paste config.

On Wed, 4 Jul 2018, 20:22 Nguyễn Trọng Tấn, wrote:

> Thank you, Katynski, for the response.
>
> But I had configured HAProxy correctly. Here is my config:
> http://prntscr.com/k2ofwv
>
> And when I use the openstack command, it is successful. Here:
> http://prntscr.com/k2ogau
>
> I don't think my config is wrong. I can create, delete, list, and show
> any VM with the openstack command successfully.

From torin.woltjer at granddial.com  Thu Jul  5 12:43:43 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Thu, 05 Jul 2018 12:43:43 GMT
Subject: [Openstack] Recovering from full outage
Message-ID:

There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on the compute nodes as well as the controllers?

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu
Sent: 7/3/18 7:47 PM
To: torin.woltjer at granddial.com
Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage

Did you set a lock_path in the neutron's config?

On Jul 3, 2018, at 17:34, Torin Woltjer wrote:

The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/

No such errors are on the compute nodes themselves.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: "Torin Woltjer"
Sent: 7/3/18 5:14 PM
To:
Cc: "openstack-operators at lists.openstack.org", "openstack at lists.openstack.org"
Subject: Re: [Openstack] Recovering from full outage

Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. The most notable of the logs is neutron-server.log, which shows the following:
http://paste.openstack.org/show/724917/

I realized that rabbitmq was in a failed state, so I bootstrapped it and rebooted the controllers, and all of the agents now show online:
http://paste.openstack.org/show/724921/

And all of the instances can be properly started; however, I cannot ping any of the instances' floating IPs or the Neutron router. And when logging into an instance with the console, there is no IP address on any interface.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext.
>
> In my opinion, nova would be more load-balancer friendly if it used a
> relative URI in the redirect, but that's outside the scope of this
> question, and since I don't know the context behind choosing the
> absolute URL, I could be wrong on that.
>
> I had a similar problem with heat-api running behind an Apache reverse
> proxy, and managed to resolve it by applying the workaround from this bug
> report:
>
> https://bugs.launchpad.net/python-heatclient/+bug/1420907
>
> Setting
>
> X-Forwarded-Proto: https
>
> before forwarding the request to heat-api fixed the issue for me.
>
> --
> Bogdan Katyński
> freenode: bodgix
>
> _______________________________________________
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to : openstack at lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From nguyentrongtan124 at gmail.com Thu Jul 5 01:54:59 2018 From: nguyentrongtan124 at gmail.com (Nguyễn Trọng Tấn) Date: Thu, 5 Jul 2018 08:54:59 +0700 Subject: [Openstack] Novaclient redirect endpoint https into http In-Reply-To: References: <00be01d41381$d37b2940$7a717bc0$@gmail.com> <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4@workday.com> <00da01d413fd$4600d3e0$d2027ba0$@gmail.com> Message-ID: <00dd01d41403$2fa0c5f0$8ee251d0$@gmail.com>

Thank you, Juan Antonio Osorio! With your response, I fixed this error. I had to add more config in nova.conf. Here:

[oslo_middleware]
secure_proxy_ssl_header = X-Forwarded-Proto
enable_proxy_headers_parsing = true

Now I can use the nova command normally. http://prntscr.com/k2oq7o Thank you very much.

Thanks and Best Regards! Nguyen Trong Tan OpenStack user group Vietnam.

From: Juan Antonio Osorio [mailto:jaosorior at gmail.com] Sent: Thursday, July 5, 2018 8:37 AM To: Nguyễn Trọng Tấn Cc: Bogdan Katynski; openstack at lists.openstack.org; Lê Quang Long (VDC-IT) Subject: Re: [Openstack] Novaclient redirect endpoint https into http

Are you using the http_proxy_to_wsgi middleware? You have to enable it in the nova config and make sure it's in your paste config.

On Wed, 4 Jul 2018, 20:22 Nguyễn Trọng Tấn wrote:

Thank you, Katynski, for the response. I believe my HAProxy is configured correctly. Here is my config: http://prntscr.com/k2ofwv And when I use the openstack command, it succeeds. Here: http://prntscr.com/k2ogau I don't think my configuration is wrong: I can create, delete, list, and show VMs with the openstack command successfully.

Thanks and Best Regards! Nguyen Trong Tan OpenStack user group Vietnam.

-----Original Message----- From: Bogdan Katynski [mailto:bogdan.katynski at workday.com] Sent: Wednesday, July 4, 2018 9:50 PM To: Nguyễn Trọng Tấn Cc: openstack-operators at lists.openstack.org; openstack at lists.openstack.org; Lê Quang Long (VDC-IT) Subject: Re: [Openstack] Novaclient redirect endpoint https into http

> > But I cannot use the nova command: the nova endpoint gets redirected from https to http. Here: http://prntscr.com/k2e8s6 (command: nova --insecure service list)

First of all, it seems that the nova client is hitting /v2.1 instead of the /v2.1/ URI, and this seems to be what triggers the redirect.

Since the openstack CLI works, I presume it must be using the correct URL and hence it's not getting redirected.
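Pulling the nova-side half of the fix into one place: enable_proxy_headers_parsing tells nova-api to trust X-Forwarded-Proto (and related headers) from the proxy, and the middleware that actually consumes the header has to be in the WSGI pipeline. A sketch of both pieces (the paste stanza follows oslo.middleware's documented factory and should already be present in a stock Queens api-paste.ini; verify against your distribution rather than copying blindly):

    # /etc/nova/nova.conf
    [oslo_middleware]
    enable_proxy_headers_parsing = true

    # /etc/nova/api-paste.ini (illustrative; usually shipped already)
    [filter:http_proxy_to_wsgi]
    paste.filter_factory = oslo_middleware.http_proxy_to_wsgi:HTTPProxyToWSGI.factory

Note that secure_proxy_ssl_header is deprecated in oslo.middleware, so of the two options quoted earlier in this thread, enable_proxy_headers_parsing is likely the one doing the work.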
> > And this is error log: Unable to establish connection to http://192.168.30.70:8774/v2.1/: ('Connection aborted.', BadStatusLine("''",)) > Looks to me that nova-api does a redirect to an absolute URL. I suspect SSL is terminated on the HAProxy and nova-api itself is configured without SSL so it redirects to an http URL. In my opinion, nova would be more load-balancer friendly if it used a relative URI in the redirect but that’s outside of the scope of this question and since I don’t know the context behind choosing the absolute URL, I could be wrong on that. I had a similar problem with heat-api running behind an Apache reverse proxy, and managed to resolve it by applying the workaround from this bug report: https://bugs.launchpad.net/python-heatclient/+bug/1420907 Setting X-Forwarded-Proto: https before forwarding the request to heat-api fixed the issue for me. -- Bogdan Katyński freenode: bodgix _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Thu Jul 5 12:43:43 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Thu, 05 Jul 2018 12:43:43 GMT Subject: [Openstack] Recovering from full outage Message-ID: There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Thu Jul 5 14:30:10 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Thu, 05 Jul 2018 14:30:10 GMT Subject: [Openstack] Recovering from full outage Message-ID: <5d62f81a0e864009ab7a1b12097e0b2f@granddial.com> The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From fsoppels at redhat.com Thu Jul 5 15:35:00 2018 From: fsoppels at redhat.com (Fabrizio Soppelsa) Date: Thu, 5 Jul 2018 17:35:00 +0200 Subject: [Openstack] NUMA some of the time? In-Reply-To: <20180704150834.4szwgh6cjrs4gq6m@bla.tonimueller.org> References: <20180704150834.4szwgh6cjrs4gq6m@bla.tonimueller.org> Message-ID: Greetings Toni, Not sure I'm answering, but just as a quick note you can prioritize NUMA and CPU-pinning by creating dedicated flavors [1] with something like: openstack flavor set m1.largenuma --property hw:numa_cpus.0=0,1 --property hw:numa_mem.0=2048 Usually NUMA cores are reserved or shared only for specific fixed ops, and I'm not sure what you mean by "throwing off more cores" in case of need, probably you need to look into something like Heat autoscale? Cheers, Fabrizio [1] https://docs.openstack.org/nova/pike/admin/cpu-topologies.html On Wed, Jul 4, 2018 at 5:19 PM Toni Mueller wrote: > > Hi, > > I am still trying to figure how to best utilise the small set of > hardware, and discovered the NUMA configuration mechanism. 
It allows me > to configure reserved cores for certain VMs, but it does not seem to > allow me to say "you can share these cores, but VMs of, say, appropriate > flavour take precedence and will throw you off these cores in case they > need more power". > > How can I achieve that, dynamically? > > TIA! > > > Thanks, > Toni > > > _______________________________________________ > Mailing list: > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Thu Jul 5 16:39:49 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Thu, 05 Jul 2018 16:39:49 GMT Subject: [Openstack] Recovering from full outage Message-ID: <4cb6b48da9734ad1899ff99db02db307@granddial.com> Yes, I've done this. The VMs hang for awhile waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmihaiescu at gmail.com Thu Jul 5 16:56:07 2018 From: lmihaiescu at gmail.com (George Mihaiescu) Date: Thu, 5 Jul 2018 12:56:07 -0400 Subject: [Openstack] Recovering from full outage In-Reply-To: <4cb6b48da9734ad1899ff99db02db307@granddial.com> References: <4cb6b48da9734ad1899ff99db02db307@granddial.com> Message-ID: You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: > Yes, I've done this. The VMs hang for awhile waiting for DHCP and > eventually come up with no addresses. neutron-dhcp-agent has been restarted > on both controllers. The qdhcp netns's were all present; I stopped the > service, removed the qdhcp netns's, noted the dhcp agents show offline by > `neutron agent-list`, restarted all neutron services, noted the qdhcp > netns's were recreated, restarted a VM again and it still fails to pull an > IP address. > > *Torin Woltjer* > > *Grand Dial Communications - A ZK Tech Inc. Company* > > *616.776.1066 ext. 
2006* > * www.granddial.com * > > ------------------------------ > *From*: George Mihaiescu > *Sent*: 7/5/18 10:38 AM > *To*: torin.woltjer at granddial.com > *Subject*: Re: [Openstack] Recovering from full outage > Did you restart the neutron-dhcp-agent and rebooted the VMs? > > On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer < > torin.woltjer at granddial.com> wrote: > >> The qrouter netns appears once the lock_path is specified, the neutron >> router is pingable as well. However, instances are not pingable. If I log >> in via console, the instances have not been given IP addresses, if I >> manually give them an address and route they are pingable and seem to work. >> So the router is working correctly but dhcp is not working. >> >> No errors in any of the neutron or nova logs on controllers or compute >> nodes. >> >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 2006* >> * >> www.granddial.com * >> >> ------------------------------ >> *From*: "Torin Woltjer" >> *Sent*: 7/5/18 8:53 AM >> *To*: >> *Cc*: openstack-operators at lists.openstack.org, >> openstack at lists.openstack.org >> *Subject*: Re: [Openstack] Recovering from full outage >> There is no lock path set in my neutron configuration. Does it ultimately >> matter what it is set to as long as it is consistent? Does it need to be >> set on compute nodes as well as controllers? >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 2006* >> * >> >> www.granddial.com * >> >> ------------------------------ >> *From*: George Mihaiescu >> *Sent*: 7/3/18 7:47 PM >> *To*: torin.woltjer at granddial.com >> *Cc*: openstack-operators at lists.openstack.org, >> openstack at lists.openstack.org >> *Subject*: Re: [Openstack] Recovering from full outage >> >> Did you set a lock_path in the neutron’s config? >> >> On Jul 3, 2018, at 17:34, Torin Woltjer >> wrote: >> >> The following errors appear in the neutron-linuxbridge-agent.log on both >> controllers: >> >> >> >> >> http://paste.openstack.org/sho >> w/724930/ >> >> No such errors are on the compute nodes themselves. >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 2006* >> * >> >> >> www.granddial.com * >> >> ------------------------------ >> *From*: "Torin Woltjer" >> *Sent*: 7/3/18 5:14 PM >> *To*: >> *Cc*: "openstack-operators at lists.openstack.org" < >> openstack-operators at lists.openstack.org>, "openstack at lists.openstack.org" >> >> *Subject*: Re: [Openstack] Recovering from full outage >> Running `openstack server reboot` on an instance just causes the instance >> to be stuck in a rebooting status. Most notable of the logs is >> neutron-server.log which shows the following: >> >> >> >> >> >> >> >> http://paste.openstack.org/sho >> w/724917/ >> >> I realized that rabbitmq was in a failed state, so I bootstrapped it, >> rebooted controllers, and all of the agents show online. >> >> >> >> >> >> >> >> http://paste.openstack.org/sho >> w/724921/ >> And all of the instances can be properly started, however I cannot ping >> any of the instances floating IPs or the neutron router. And when logging >> into an instance with the console, there is no IP address on any interface. >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 
2006* >> * >> >> >> >> www.granddial.com * >> >> ------------------------------ >> *From*: George Mihaiescu >> *Sent*: 7/3/18 11:50 AM >> *To*: torin.woltjer at granddial.com >> *Subject*: Re: [Openstack] Recovering from full outage >> Try restarting them using "openstack server reboot" and also check the >> nova-compute.log and neutron agents logs on the compute nodes. >> >> On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer < >> torin.woltjer at granddial.com> wrote: >> >>> We just suffered a power outage in out data center and I'm having >>> trouble recovering the Openstack cluster. All of the nodes are back online, >>> every instance shows active but `virsh list --all` on the compute nodes >>> show that all of the VMs are actually shut down. Running `ip addr` on any >>> of the nodes shows that none of the bridges are present and `ip netns` >>> shows that all of the network namespaces are missing as well. So despite >>> all of the neutron service running, none of the networking appears to be >>> active, which is concerning. How do I solve this without recreating all of >>> the networks? >>> >>> *Torin Woltjer* >>> >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> >>> *616.776.1066 ext. 2006* >>> * >>> >>> >>> >>> >>> www.granddial.com * >>> >>> _______________________________________________ >>> Mailing list: >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> Post to : openstack at lists.openstack.org >>> Unsubscribe : >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Thu Jul 5 18:55:47 2018 From: melwittt at gmail.com (melanie witt) Date: Thu, 5 Jul 2018 11:55:47 -0700 Subject: [Openstack] [nova][api] Novaclient redirect endpoint https into http In-Reply-To: <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4@workday.com> References: <00be01d41381$d37b2940$7a717bc0$@gmail.com> <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4@workday.com> Message-ID: +openstack-dev@ On Wed, 4 Jul 2018 14:50:26 +0000, Bogdan Katynski wrote: >> But, I can not use nova command, endpoint nova have been redirected from https to http. Here:http://prntscr.com/k2e8s6 (command: nova –insecure service list) > First of all, it seems that the nova client is hitting /v2.1 instead of /v2.1/ URI and this seems to be triggering the redirect. > > Since openstack CLI works, I presume it must be using the correct URL and hence it’s not getting redirected. > >> >> And this is error log: Unable to establish connection tohttp://192.168.30.70:8774/v2.1/: ('Connection aborted.', BadStatusLine("''",)) >> > Looks to me that nova-api does a redirect to an absolute URL. I suspect SSL is terminated on the HAProxy and nova-api itself is configured without SSL so it redirects to an http URL. > > In my opinion, nova would be more load-balancer friendly if it used a relative URI in the redirect but that’s outside of the scope of this question and since I don’t know the context behind choosing the absolute URL, I could be wrong on that. Thanks for mentioning this. We do have a bug open in python-novaclient around a similar issue [1]. I've added comments based on this thread and will consult with the API subteam to see if there's something we can do about this in nova-api. 
-melanie [1] https://bugs.launchpad.net/python-novaclient/+bug/1776928 From torin.woltjer at granddial.com Thu Jul 5 20:06:17 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Thu, 05 Jul 2018 20:06:17 GMT Subject: [Openstack] Recovering from full outage Message-ID: Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. created a new instance from an ubuntu 18.04 image to test with, the hostname was not set to the name of the instance and could not login as users I had specified in the configuration. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 12:57 PM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: Yes, I've done this. The VMs hang for awhile waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? 
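For anyone following along, that option lives in neutron.conf under [oslo_concurrency]. The directory below is the one the installation guides use; any path writable by the neutron user works, as long as every neutron service on the host agrees on it:

    [oslo_concurrency]
    lock_path = /var/lib/neutron/tmp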
On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From mordred at inaugust.com Thu Jul 5 20:10:08 2018 From: mordred at inaugust.com (Monty Taylor) Date: Thu, 5 Jul 2018 15:10:08 -0500 Subject: [Openstack] [openstack-dev] [nova][api] Novaclient redirect endpoint https into http In-Reply-To: References: <00be01d41381$d37b2940$7a717bc0$@gmail.com> <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4@workday.com> Message-ID: On 07/05/2018 01:55 PM, melanie witt wrote: > +openstack-dev@ > > On Wed, 4 Jul 2018 14:50:26 +0000, Bogdan Katynski wrote: >>> But, I can not use nova command, endpoint nova have been redirected >>> from https to http. Here:http://prntscr.com/k2e8s6  (command: nova >>> –insecure service list) >> First of all, it seems that the nova client is hitting /v2.1 instead >> of /v2.1/ URI and this seems to be triggering the redirect. 
>> Since the openstack CLI works, I presume it must be using the correct URL
>> and hence it's not getting redirected.
>>
>>> And this is the error log: Unable to establish connection
>>> to http://192.168.30.70:8774/v2.1/: ('Connection aborted.',
>>> BadStatusLine("''",))
>>
>> Looks to me like nova-api does a redirect to an absolute URL. I
>> suspect SSL is terminated on the HAProxy and nova-api itself is
>> configured without SSL, so it redirects to an http URL.
>>
>> In my opinion, nova would be more load-balancer friendly if it used a
>> relative URI in the redirect, but that's outside the scope of this
>> question, and since I don't know the context behind choosing the
>> absolute URL, I could be wrong on that.
>
> Thanks for mentioning this. We do have a bug open in python-novaclient
> around a similar issue [1]. I've added comments based on this thread and
> will consult with the API subteam to see if there's something we can do
> about this in nova-api.

A similar thing came up the other day related to keystone and version discovery. Version discovery documents tend to return full urls - even though relative urls would make public/internal API endpoints work better. (Also, sometimes people don't configure things properly and the version discovery url winds up being incorrect.)

In shade/sdk, we actually construct a wholly new discovery url based on the url used for the catalog and the url in the discovery document, since we've learned that the version discovery urls are frequently broken. This is problematic because SOMETIMES people have public urls deployed as a sub-url and internal urls deployed on a port - so you have:

Catalog:
  public: https://example.com/compute
  internal: https://compute.example.com:1234
Version discovery:
  https://example.com/compute/v2.1

When we go to combine the catalog url and the versioned url, if the user is hitting internal, we produce https://compute.example.com:1234/compute/v2.1 - because we have no way of systematically knowing that /compute should also be stripped.

VERY LONG WINDED WAY of saying 2 things:

a) Relative URLs would be *way* friendlier (and incidentally are supported by keystoneauth, openstacksdk and shade - and are written up as being a thing people *should* support in the documents about API consumption)

b) Can we get agreement that changing behavior to return or redirect to a relative URL would not be considered an api contract break? (it's possible the answer to this is 'no' - so it's a real question)

Monty

From torin.woltjer at granddial.com Fri Jul 6 13:38:55 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Fri, 06 Jul 2018 13:38:55 GMT Subject: [Openstack] Recovering from full outage Message-ID:

I have done tcpdumps on both the controllers and on a compute node.

Controller:
`ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67`
`tcpdump -vnes0 -i any port 67`
Compute:
`tcpdump -vnes0 -i brqd85c2a00-a6 port 68`

For the first command on the controller, there are no packets captured at all. The second command on the controller captures packets, but they don't appear to be relevant to OpenStack. The dump from the compute node shows that requests are constantly being sent by OpenStack instances.

In summary: DHCP requests are being sent, but are never received.

Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext.
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 4:50 PM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage The cloud-init requires network connectivity by default in order to reach the metadata server for the hostname, ssh-key, etc You can configure cloud-init to use the config-drive, but the lack of network connectivity will make the instance useless anyway, even though it will have you ssh-key and hostname... Did you check the things I told you? On Jul 5, 2018, at 16:06, Torin Woltjer wrote: Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. created a new instance from an ubuntu 18.04 image to test with, the hostname was not set to the name of the instance and could not login as users I had specified in the configuration. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 12:57 PM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: Yes, I've done this. The VMs hang for awhile waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmihaiescu at gmail.com Fri Jul 6 15:14:23 2018 From: lmihaiescu at gmail.com (George Mihaiescu) Date: Fri, 6 Jul 2018 11:14:23 -0400 Subject: [Openstack] Recovering from full outage In-Reply-To: References: Message-ID: Can you manually assign an IP address to a VM and once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least. 
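A quick sketch of that test from the VM's console (the interface name and the spare address are assumptions; the router and DHCP agent addresses shown are the ones from the selfservice subnet that appears later in this thread):

    # inside the guest, pick an address that is unused in the subnet
    ip addr add 172.16.1.50/24 dev ens3
    ip route add default via 172.16.1.1
    # then try the DHCP agents directly
    ping 172.16.1.2
    ping 172.16.1.3

If the DHCP agents answer pings but leases still never arrive, the broadcast path rather than basic connectivity is the suspect.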
Also, on the controller node where the dhcp server for that network is, check the "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" and make sure there are entries corresponding to your instances. In my experience, if neutron is broken after working fine (so excluding any miss-configuration), then an agent is out-of-sync and restart usually fixes things. On Fri, Jul 6, 2018 at 9:38 AM, Torin Woltjer wrote: > I have done tcpdumps on both the controllers and on a compute node. > Controller: > `ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 > -i ns-83d68c76-b8 port 67` > `tcpdump -vnes0 -i any port 67` > Compute: > `tcpdump -vnes0 -i brqd85c2a00-a6 port 68` > > For the first command on the controller, there are no packets captured at > all. The second command on the controller captures packets, but they don't > appear to be relevant to openstack. The dump from the compute node shows > constant requests are getting sent by openstack instances. > > In summary; DHCP requests are being sent, but are never received. > > *Torin Woltjer* > > *Grand Dial Communications - A ZK Tech Inc. Company* > > *616.776.1066 ext. 2006* > * www.granddial.com * > > ------------------------------ > *From*: George Mihaiescu > *Sent*: 7/5/18 4:50 PM > *To*: torin.woltjer at granddial.com > *Subject*: Re: [Openstack] Recovering from full outage > > The cloud-init requires network connectivity by default in order to reach > the metadata server for the hostname, ssh-key, etc > > You can configure cloud-init to use the config-drive, but the lack of > network connectivity will make the instance useless anyway, even though it > will have you ssh-key and hostname... > > Did you check the things I told you? > > On Jul 5, 2018, at 16:06, Torin Woltjer > wrote: > > Are IP addresses set by cloud-init on boot? I noticed that cloud-init > isn't working on my VMs. created a new instance from an ubuntu 18.04 image > to test with, the hostname was not set to the name of the instance and > could not login as users I had specified in the configuration. > > *Torin Woltjer* > > *Grand Dial Communications - A ZK Tech Inc. Company* > > *616.776.1066 ext. 2006* > * > www.granddial.com * > > ------------------------------ > *From*: George Mihaiescu > *Sent*: 7/5/18 12:57 PM > *To*: torin.woltjer at granddial.com > *Cc*: "openstack at lists.openstack.org" , " > openstack-operators at lists.openstack.org" openstack.org> > *Subject*: Re: [Openstack] Recovering from full outage > You should tcpdump inside the qdhcp namespace to see if the requests make > it there, and also check iptables rules on the compute nodes for the return > traffic. > > > On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer < > torin.woltjer at granddial.com> wrote: > >> Yes, I've done this. The VMs hang for awhile waiting for DHCP and >> eventually come up with no addresses. neutron-dhcp-agent has been restarted >> on both controllers. The qdhcp netns's were all present; I stopped the >> service, removed the qdhcp netns's, noted the dhcp agents show offline by >> `neutron agent-list`, restarted all neutron services, noted the qdhcp >> netns's were recreated, restarted a VM again and it still fails to pull an >> IP address. >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 
2006* >> * >> >> www.granddial.com * >> >> ------------------------------ >> *From*: George Mihaiescu >> *Sent*: 7/5/18 10:38 AM >> *To*: torin.woltjer at granddial.com >> *Subject*: Re: [Openstack] Recovering from full outage >> Did you restart the neutron-dhcp-agent and rebooted the VMs? >> >> On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer < >> torin.woltjer at granddial.com> wrote: >> >>> The qrouter netns appears once the lock_path is specified, the neutron >>> router is pingable as well. However, instances are not pingable. If I log >>> in via console, the instances have not been given IP addresses, if I >>> manually give them an address and route they are pingable and seem to work. >>> So the router is working correctly but dhcp is not working. >>> >>> No errors in any of the neutron or nova logs on controllers or compute >>> nodes. >>> >>> >>> *Torin Woltjer* >>> >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> >>> *616.776.1066 ext. 2006* >>> * >>> >>> >>> www.granddial.com * >>> >>> ------------------------------ >>> *From*: "Torin Woltjer" >>> *Sent*: 7/5/18 8:53 AM >>> *To*: >>> *Cc*: openstack-operators at lists.openstack.org, >>> openstack at lists.openstack.org >>> *Subject*: Re: [Openstack] Recovering from full outage >>> There is no lock path set in my neutron configuration. Does it >>> ultimately matter what it is set to as long as it is consistent? Does it >>> need to be set on compute nodes as well as controllers? >>> >>> *Torin Woltjer* >>> >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> >>> *616.776.1066 ext. 2006* >>> * >>> >>> >>> >>> www.granddial.com * >>> >>> ------------------------------ >>> *From*: George Mihaiescu >>> *Sent*: 7/3/18 7:47 PM >>> *To*: torin.woltjer at granddial.com >>> *Cc*: openstack-operators at lists.openstack.org, >>> openstack at lists.openstack.org >>> *Subject*: Re: [Openstack] Recovering from full outage >>> >>> Did you set a lock_path in the neutron’s config? >>> >>> On Jul 3, 2018, at 17:34, Torin Woltjer >>> wrote: >>> >>> The following errors appear in the neutron-linuxbridge-agent.log on both >>> controllers: >>> >>> >>> >>> >>> >>> >>> >>> >>> http://paste.openstack.org/sho >>> w/724930/ >>> >>> No such errors are on the compute nodes themselves. >>> >>> *Torin Woltjer* >>> >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> >>> *616.776.1066 ext. 2006* >>> * >>> >>> >>> >>> >>> www.granddial.com * >>> >>> ------------------------------ >>> *From*: "Torin Woltjer" >>> *Sent*: 7/3/18 5:14 PM >>> *To*: >>> *Cc*: "openstack-operators at lists.openstack.org" < >>> openstack-operators at lists.openstack.org>, "openstack at lists.openstack.org" >>> >>> *Subject*: Re: [Openstack] Recovering from full outage >>> Running `openstack server reboot` on an instance just causes the >>> instance to be stuck in a rebooting status. Most notable of the logs is >>> neutron-server.log which shows the following: >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> http://paste.openstack.org/sho >>> w/724917/ >>> >>> I realized that rabbitmq was in a failed state, so I bootstrapped it, >>> rebooted controllers, and all of the agents show online. >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> http://paste.openstack.org/sho >>> w/724921/ >>> And all of the instances can be properly started, however I cannot ping >>> any of the instances floating IPs or the neutron router. And when logging >>> into an instance with the console, there is no IP address on any interface. 
>>>
>>> *Torin Woltjer*
>>>
>>> *Grand Dial Communications - A ZK Tech Inc. Company*
>>>
>>> *616.776.1066 ext. 2006*
>>> * www.granddial.com *
>>>
>>> ------------------------------
>>> *From*: George Mihaiescu
>>> *Sent*: 7/3/18 11:50 AM
>>> *To*: torin.woltjer at granddial.com
>>> *Subject*: Re: [Openstack] Recovering from full outage
>>> Try restarting them using "openstack server reboot" and also check the
>>> nova-compute.log and the neutron agent logs on the compute nodes.
>>>
>>> On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer <torin.woltjer at granddial.com> wrote:
>>>
>>>> We just suffered a power outage in our data center and I'm having
>>>> trouble recovering the OpenStack cluster. All of the nodes are back online;
>>>> every instance shows active, but `virsh list --all` on the compute nodes
>>>> shows that all of the VMs are actually shut down. Running `ip addr` on any
>>>> of the nodes shows that none of the bridges are present, and `ip netns`
>>>> shows that all of the network namespaces are missing as well. So despite
>>>> all of the neutron services running, none of the networking appears to be
>>>> active, which is concerning. How do I solve this without recreating all of
>>>> the networks?
>>>>
>>>> *Torin Woltjer*
>>>>
>>>> *Grand Dial Communications - A ZK Tech Inc. Company*
>>>>
>>>> *616.776.1066 ext. 2006*
>>>> * www.granddial.com *
>>>>
>>>> _______________________________________________
>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>> Post to : openstack at lists.openstack.org
>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From torin.woltjer at granddial.com Fri Jul 6 15:49:58 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Fri, 06 Jul 2018 15:49:58 GMT Subject: [Openstack] Recovering from full outage Message-ID:

Interestingly, I can ping the neutron router at 172.16.1.1 just fine, but DHCP (located at 172.16.1.2 and 172.16.1.3) fails. The instance that I manually added the IP address to has a floating IP, and oddly enough I am able to ping DHCP on the provider network, which suggests that DHCP may be working on other networks but not on my selfservice network. I was able to confirm this by creating a new virtual machine directly on the provider network: I was able to ping it and SSH into it right off the bat, as it obtained the proper address on its own.

On one controller, "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" is empty.
"/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" contains: fa:16:3e:3f:94:17,host-172-16-1-8.openstacklocal,172.16.1.8 fa:16:3e:e0:57:e7,host-172-16-1-7.openstacklocal,172.16.1.7 fa:16:3e:db:a7:cb,host-172-16-1-12.openstacklocal,172.16.1.12 fa:16:3e:f8:10:99,host-172-16-1-10.openstacklocal,172.16.1.10 fa:16:3e:a7:82:4c,host-172-16-1-3.openstacklocal,172.16.1.3 fa:16:3e:f8:23:1d,host-172-16-1-14.openstacklocal,172.16.1.14 fa:16:3e:63:53:a4,host-172-16-1-1.openstacklocal,172.16.1.1 fa:16:3e:b7:41:a8,host-172-16-1-2.openstacklocal,172.16.1.2 fa:16:3e:5e:25:5f,host-172-16-1-4.openstacklocal,172.16.1.4 fa:16:3e:3a:a2:53,host-172-16-1-100.openstacklocal,172.16.1.100 fa:16:3e:46:39:e2,host-172-16-1-13.openstacklocal,172.16.1.13 fa:16:3e:06:de:e0,host-172-16-1-18.openstacklocal,172.16.1.18 I've done system restarts since the power outage and the agent hasn't corrected itself. I've restarted all neutron services as I've done things, I could also try stopping and starting dnsmasq. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/6/18 11:15 AM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" , pgsousa at gmail.com Subject: Re: [Openstack] Recovering from full outage Can you manually assign an IP address to a VM and once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least. Also, on the controller node where the dhcp server for that network is, check the "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" and make sure there are entries corresponding to your instances. In my experience, if neutron is broken after working fine (so excluding any miss-configuration), then an agent is out-of-sync and restart usually fixes things. On Fri, Jul 6, 2018 at 9:38 AM, Torin Woltjer wrote: I have done tcpdumps on both the controllers and on a compute node. Controller: `ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67` `tcpdump -vnes0 -i any port 67` Compute: `tcpdump -vnes0 -i brqd85c2a00-a6 port 68` For the first command on the controller, there are no packets captured at all. The second command on the controller captures packets, but they don't appear to be relevant to openstack. The dump from the compute node shows constant requests are getting sent by openstack instances. In summary; DHCP requests are being sent, but are never received. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 4:50 PM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage The cloud-init requires network connectivity by default in order to reach the metadata server for the hostname, ssh-key, etc You can configure cloud-init to use the config-drive, but the lack of network connectivity will make the instance useless anyway, even though it will have you ssh-key and hostname... Did you check the things I told you? On Jul 5, 2018, at 16:06, Torin Woltjer wrote: Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. created a new instance from an ubuntu 18.04 image to test with, the hostname was not set to the name of the instance and could not login as users I had specified in the configuration. 
Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 12:57 PM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: Yes, I've done this. The VMs hang for awhile waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. 
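For reference on the lock_path exchange above: the setting belongs in the [oslo_concurrency] section of neutron.conf and should be set on every node that runs neutron services, compute nodes included. The directory below is the one commonly used in the install guides and is an assumption here, not something quoted from this thread; any directory writable by the neutron user works, as long as it is set consistently:

  [oslo_concurrency]
  lock_path = /var/lib/neutron/tmp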
Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaypipes at gmail.com Fri Jul 6 16:46:04 2018 From: jaypipes at gmail.com (Jay Pipes) Date: Fri, 6 Jul 2018 12:46:04 -0400 Subject: [Openstack] NUMA some of the time? In-Reply-To: <20180704150834.4szwgh6cjrs4gq6m@bla.tonimueller.org> References: <20180704150834.4szwgh6cjrs4gq6m@bla.tonimueller.org> Message-ID: <2c04c635-443b-9ee8-00cc-8a7a669b4b18@gmail.com> Hi Tony, The short answer is that you cannot do that today. Today, each Nova compute node is either "all in" for NUMA and CPU pinning or it's not. This means that for resource-constrained environments like "The Edge!", there are not very good ways to finely divide up a compute node and make the most efficient use of its resources. There is no current way to say "On this dual-Xeon compute node, put all workloads that don't care about dedicated CPUs on this socket and all workloads that DO care about dedicated CPUs on the other socket.". That said, we have had lengthy discussions about tracking dedicated guest CPU resources and dividing up the available logical host processors into buckets for "shared CPU" and "dedicated CPU" workloads on the following spec: https://review.openstack.org/#/c/555081/ It is not going to land in Rocky. However, we should be able to make good progress towards the goals in that spec in early Stein. Best, -jay On 07/04/2018 11:08 AM, Toni Mueller wrote: > > Hi, > > I am still trying to figure how to best utilise the small set of > hardware, and discovered the NUMA configuration mechanism. 
It allows me > to configure reserved cores for certain VMs, but it does not seem to > allow me to say "you can share these cores, but VMs of, say, appropriate > flavour take precedence and will throw you off these cores in case they > need more power". > > How can I achieve that, dynamically? > > TIA! > > > Thanks, > Toni > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > From torin.woltjer at granddial.com Fri Jul 6 18:13:20 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Fri, 06 Jul 2018 18:13:20 GMT Subject: [Openstack] Recovering from full outage Message-ID: I explored creating a second "selfservice" vxlan to see if DHCP would work on it as it does on my external "provider" network. The new vxlan network shares the same problems as the old vxlan network. Am I having problems with VXLAN in particular? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/6/18 12:05 PM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Interestingly, I can ping the neutron router at 172.16.1.1 just fine, but DHCP (located at 172.16.1.2 and 172.16.1.3) fails. The instance that I manually added the IP address to has a floating IP, and oddly enough I am able to ping DHCP on the provider network, which suggests that DHCP may be working on other networks but not on my selfservice network. I was able to confirm this by creating a new virtual machine directly on the provider network, I was able to ping to it and SSH into it right off of the bat, as it obtained the proper address on its own. "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" is empty. "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" contains: fa:16:3e:3f:94:17,host-172-16-1-8.openstacklocal,172.16.1.8 fa:16:3e:e0:57:e7,host-172-16-1-7.openstacklocal,172.16.1.7 fa:16:3e:db:a7:cb,host-172-16-1-12.openstacklocal,172.16.1.12 fa:16:3e:f8:10:99,host-172-16-1-10.openstacklocal,172.16.1.10 fa:16:3e:a7:82:4c,host-172-16-1-3.openstacklocal,172.16.1.3 fa:16:3e:f8:23:1d,host-172-16-1-14.openstacklocal,172.16.1.14 fa:16:3e:63:53:a4,host-172-16-1-1.openstacklocal,172.16.1.1 fa:16:3e:b7:41:a8,host-172-16-1-2.openstacklocal,172.16.1.2 fa:16:3e:5e:25:5f,host-172-16-1-4.openstacklocal,172.16.1.4 fa:16:3e:3a:a2:53,host-172-16-1-100.openstacklocal,172.16.1.100 fa:16:3e:46:39:e2,host-172-16-1-13.openstacklocal,172.16.1.13 fa:16:3e:06:de:e0,host-172-16-1-18.openstacklocal,172.16.1.18 I've done system restarts since the power outage and the agent hasn't corrected itself. I've restarted all neutron services as I've done things, I could also try stopping and starting dnsmasq. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/6/18 11:15 AM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" , pgsousa at gmail.com Subject: Re: [Openstack] Recovering from full outage Can you manually assign an IP address to a VM and once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least. 
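On the VXLAN suspicion above, one hedged way to test the overlay itself rather than DHCP is to inspect the vxlan devices the linuxbridge agent builds on each node (device names depend on the segmentation ID neutron assigned, so these commands simply enumerate them):

  # show vxlan interfaces with their VNI, local VTEP and dstport details
  ip -d link show type vxlan
  # look for forwarding entries pointing at the other nodes' VTEP addresses
  bridge fdb show | grep -i vxlan

With the l2population mechanism, missing fdb entries for a specific MAC can make the DHCP ports unreachable while the router's MAC still answers, which would match the symptoms described.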
Also, on the controller node where the dhcp server for that network is, check the "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" and make sure there are entries corresponding to your instances. On Fri, Jul 6, 2018 at 9:38 AM, Torin Woltjer wrote: I have done tcpdumps on both the controllers and on a compute node. In summary; DHCP requests are being sent, but are never received. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext.
2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL:
From mrhillsman at gmail.com Mon Jul 9 17:27:55 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Mon, 9 Jul 2018 12:27:55 -0500 Subject: [Openstack] Reminder: UC Meeting Today 1800UTC / 1300CST Message-ID: Hey everyone, Please see https://wiki.openstack.org/wiki/Governance/Foundation/UserCommittee for UC meeting info and add additional agenda items if needed. -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL:
From john.studarus at openstacksandiego.org Mon Jul 9 18:40:17 2018 From: john.studarus at openstacksandiego.org (John Studarus) Date: Mon, 09 Jul 2018 11:40:17 -0700 Subject: [Openstack] OpenStack event tomorrow July 10 SF Bay - Spread the word Message-ID: <16480582615.b8e317ec159095.7683600829540190778@openstacksandiego.org> Just a reminder that tomorrow is the OpenStack 8th Birthday at the Intel campus in Santa Clara. We're looking for your help to spread the word and let all your friends, coworkers, and others who might be interested know about it. Would you mind taking a second and forwarding this email along to them? We've got a half day of talks and labs followed by an evening social, with a number of top speakers including the Cloud CIO of Cisco, Lew Tucker; the CEO of VexxHost, Mohammed Naser; the original K8s PM and GKE co-creator, Eric Han; and Platform9, talking about their migration off AWS onto a private OpenStack cloud. If you don't have time for the afternoon talks and labs, simply join us for the keynotes, or in the evening for the social, networking, Lightning Talks, and party! Great presentations, free food, fancy drinks, cupcakes, and tons of swag! What's not to love? Here are all the details: https://www.meetup.com/openstack/events/252368073/ July 10, 2018 1:00 PM - 6:00 PM Presentations and labs 6:00 PM - 9:00 PM Lightning talks, dinner, fancy drinks, give-aways, and cupcakes. Intel Altera Campus Building 3, Auditorium 131 Innovation Drive San Jose, CA 95134 Thank you to all our sponsors including: Canonical (https://www.canonical.com/) Cisco (http://www.cisco.com/) Cisco DevNet (https://developer.cisco.com/) Datera (http://www.datera.io/) Intel (https://software.intel.com/networking/) Packet Host (https://www.packet.net/) Platform9 (https://platform9.com/) Portworx (https://portworx.com) Red Hat (http://www.redhat.com/) Redis Labs (https://redislabs.com/) SoftIron (https://softiron.com/) -------------- next part -------------- An HTML attachment was scrubbed... URL:
From skinnyh92 at gmail.com Tue Jul 10 00:33:34 2018 From: skinnyh92 at gmail.com (Hang Yang) Date: Mon, 9 Jul 2018 17:33:34 -0700 Subject: [Openstack] How to make Neutron select a specific subnet when booting an instance Message-ID: Hi there, I have a question about choosing a specific subnet when booting a VM. My OpenStack cluster is on Queens and I have multiple subnets in one network. What I want is that when I issue a boot command, the instance gets an IP from only one subnet by default. I know there is a way to achieve that by creating a port with --fixed-ip subnet=xxx and then passing the port ID to the boot command.
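For reference, that two-step workaround looks roughly like this with the unified CLI; the network, subnet, and server names are placeholders:

  # pre-create a port pinned to the desired subnet
  openstack port create --network mynet --fixed-ip subnet=subnet-a vm1-port
  # boot against the port instead of the network
  openstack server create --flavor m1.small --image ubuntu-18.04 \
    --nic port-id=<port-uuid> vm1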
Is there another way that does not require manually creating the port? Is there any configuration I can do so that Neutron picks only one subnet from a network at boot by default? Thanks for any help. Best regards, Hang -------------- next part -------------- An HTML attachment was scrubbed... URL:
From eblock at nde.ag Tue Jul 10 07:31:39 2018 From: eblock at nde.ag (Eugen Block) Date: Tue, 10 Jul 2018 07:31:39 +0000 Subject: [Openstack] How to make Neutron select a specific subnet when booting an instance In-Reply-To: Message-ID: <20180710073139.Horde.WnAWO9KJKFeJjsCnfBHIOn7@webmail.nde.ag> Hi, depending on your workflow there would be a way with scripting [0]; not sure if this would be a suitable approach for you. There has also been a blueprint [1] for the selection of subnets during instance creation since Juno, but I can't find anything about an implementation. Regards, Eugen [0] https://ask.openstack.org/en/question/95573/how-to-select-a-subnet-when-booting-an-instance/ [1] https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/selecting-subnet-when-creating-vm.html Quoting Hang Yang : > Hi there, > > I have a question about choosing a specific subnet when booting a VM. My > OpenStack cluster is on Queens and I have multiple subnets in one network. > What I want is that when I issue a boot command, the instance gets an IP > from only one subnet by default. I know there is a way to achieve that by > creating a port with --fixed-ip subnet=xxx and then passing the port ID to > the boot command. Is there another way that does not require manually > creating the port? Is there any configuration I can do so that Neutron > picks only one subnet from a network at boot by default? > > Thanks for any help. > > Best regards, > Hang
From eblock at nde.ag Tue Jul 10 07:38:09 2018 From: eblock at nde.ag (Eugen Block) Date: Tue, 10 Jul 2018 07:38:09 +0000 Subject: [Openstack] How to make Neutron select a specific subnet when booting an instance In-Reply-To: Message-ID: <20180710073809.Horde.PUUk_Qm2eM6GKpbfiKjgyQj@webmail.nde.ag> There has been some work on this [2], but it didn't make it into the Kilo release (abandoned), and I don't see it in later releases either. [2] https://blueprints.launchpad.net/nova/+spec/selecting-subnet-when-creating-vm Quoting Hang Yang : > Hi there, > > I have a question about choosing a specific subnet when booting a VM. My > OpenStack cluster is on Queens and I have multiple subnets in one network. > What I want is that when I issue a boot command, the instance gets an IP > from only one subnet by default. I know there is a way to achieve that by > creating a port with --fixed-ip subnet=xxx and then passing the port ID to > the boot command. Is there another way that does not require manually > creating the port? Is there any configuration I can do so that Neutron > picks only one subnet from a network at boot by default? > > Thanks for any help. > > Best regards, > Hang
From lhinds at redhat.com Tue Jul 10 08:20:45 2018 From: lhinds at redhat.com (Luke Hinds) Date: Tue, 10 Jul 2018 09:20:45 +0100 Subject: [Openstack] [OSSN-0084] Data retained after deletion of a ScaleIO volume Message-ID: <2dc4a6ce-b5a3-8cee-724c-4295fd595f54@redhat.com> Data retained after deletion of a ScaleIO volume --- ### Summary ### Certain storage volume configurations allow newly created volumes to contain previous data. This could lead to leakage of sensitive information between tenants.
### Affected Services / Software ###
Cinder releases up to and including Queens with ScaleIO volumes using thick volumes without zero padding.
### Discussion ###
Using thick volumes without zero padding does not ensure data contained in a volume is actually deleted. The default volume provisioning rule is set to thin, so most installations are likely not affected. Operators can check their configuration in `cinder.conf` or check for zero padding with this command: `scli --query_all`.
#### Recommended Actions ####
Operators can use the following two workarounds until the release of Rocky (planned 30th August 2018), which resolves the issue.
1. Swap to thin volumes
2. Ensure ScaleIO storage pools use zero-padding with: `scli --modify_zero_padding_policy (((--protection_domain_id | --protection_domain_name ) --storage_pool_name ) | --storage_pool_id ) (--enable_zero_padding | --disable_zero_padding)`
### Contacts / References ###
Author: Nick Tait
This OSSN : https://wiki.openstack.org/wiki/OSSN/OSSN-0084
Original LaunchPad Bug : https://bugs.launchpad.net/ossn/+bug/1699573
Mailing List : [Security] tag on openstack-dev at lists.openstack.org
OpenStack Security Project : https://launchpad.net/~openstack-ossg
-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL:
From hamzy at us.ibm.com Tue Jul 10 14:44:41 2018 From: hamzy at us.ibm.com (Mark Hamzy) Date: Tue, 10 Jul 2018 09:44:41 -0500 Subject: [Openstack] [tripleo] What is the proper way to use NetConfigDataLookup? Message-ID: What is the proper way to use NetConfigDataLookup? I tried the following:
(undercloud) [stack at oscloud5 ~]$ cat << '__EOF__' > ~/templates/mapping-info.yaml
parameter_defaults:
  NetConfigDataLookup:
    control1:
      nic1: '5c:f3:fc:36:dd:68'
      nic2: '5c:f3:fc:36:dd:6c'
      nic3: '6c:ae:8b:29:27:fa' # 9.114.219.34
      nic4: '6c:ae:8b:29:27:fb' # 9.114.118.???
      nic5: '6c:ae:8b:29:27:fc'
      nic6: '6c:ae:8b:29:27:fd'
    compute1:
      nic1: '6c:ae:8b:25:34:ea' # 9.114.219.44
      nic2: '6c:ae:8b:25:34:eb'
      nic3: '6c:ae:8b:25:34:ec' # 9.114.118.???
      nic4: '6c:ae:8b:25:34:ed'
    compute2:
      nic1: '00:0a:f7:73:3c:c0'
      nic2: '00:0a:f7:73:3c:c1'
      nic3: '00:0a:f7:73:3c:c2' # 9.114.118.156
      nic4: '00:0a:f7:73:3c:c3' # 9.114.112.???
      nic5: '00:0a:f7:73:73:f4'
      nic6: '00:0a:f7:73:73:f5'
      nic7: '00:0a:f7:73:73:f6' # 9.114.219.134
      nic8: '00:0a:f7:73:73:f7'
__EOF__
(undercloud) [stack at oscloud5 ~]$ openstack overcloud deploy --templates -e ~/templates/node-info.yaml -e ~/templates/mapping-info.yaml -e ~/templates/overcloud_images.yaml -e ~/templates/environments/network-environment.yaml -e ~/templates/environments/network-isolation.yaml -e ~/templates/environments/config-debug.yaml --disable-validations --ntp-server pool.ntp.org --control-scale 1 --compute-scale
But I did not see a /etc/os-net-config/mapping.yaml get created. Also, is this configuration used when the system boots IronicPythonAgent to provision the disk? -- Mark You must be the change you wish to see in the world. -- Mahatma Gandhi Never let the future disturb you. You will meet it, if you have to, with the same weapons of reason which today arm you against the present. -- Marcus Aurelius -------------- next part -------------- An HTML attachment was scrubbed...
URL: From torin.woltjer at granddial.com Tue Jul 10 18:58:32 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Tue, 10 Jul 2018 18:58:32 GMT Subject: [Openstack] Recovering from full outage Message-ID: <167ce30bac124c85a16061c83353553a@granddial.com> DHCP is working again so instances are getting their addresses. For some reason cloud-init isn't working correctly. Hostnames aren't getting set, and SSH key pair isn't getting set. The neutron-metadata service is in control of this? neutron-metadata-agent.log: 2018-07-10 08:01:42.046 5518 INFO eventlet.wsgi.server [-] 109.73.185.195, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0622332 2018-07-10 09:49:42.604 5518 INFO eventlet.wsgi.server [-] 197.149.85.150, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0645461 2018-07-10 10:52:50.845 5517 INFO eventlet.wsgi.server [-] 88.249.225.204, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0659041 2018-07-10 11:43:20.471 5518 INFO eventlet.wsgi.server [-] 143.208.186.168, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0618532 2018-07-10 11:53:15.574 5511 INFO eventlet.wsgi.server [-] 194.40.240.254, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0636070 2018-07-10 13:26:46.795 5518 INFO eventlet.wsgi.server [-] 109.73.177.149, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0611560 2018-07-10 13:27:38.795 5513 INFO eventlet.wsgi.server [-] 125.167.69.238, "GET / HTTP/1.0" status: 404 len: 195 time: 0.0631371 2018-07-10 13:30:49.551 5514 INFO eventlet.wsgi.server [-] 155.93.152.111, "GET / HTTP/1.0" status: 404 len: 195 time: 0.0609179 2018-07-10 14:12:42.008 5521 INFO eventlet.wsgi.server [-] 190.85.38.173, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0597739 No other log files show abnormal behavior. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/6/18 2:33 PM To: "lmihaiescu at gmail.com" Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage I explored creating a second "selfservice" vxlan to see if DHCP would work on it as it does on my external "provider" network. The new vxlan network shares the same problems as the old vxlan network. Am I having problems with VXLAN in particular? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/6/18 12:05 PM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Interestingly, I can ping the neutron router at 172.16.1.1 just fine, but DHCP (located at 172.16.1.2 and 172.16.1.3) fails. The instance that I manually added the IP address to has a floating IP, and oddly enough I am able to ping DHCP on the provider network, which suggests that DHCP may be working on other networks but not on my selfservice network. I was able to confirm this by creating a new virtual machine directly on the provider network, I was able to ping to it and SSH into it right off of the bat, as it obtained the proper address on its own. "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" is empty. 
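A note on the neutron-metadata-agent log above: those entries are bare "GET /" requests from public source addresses, so they look more like Internet scanners reaching the floating IP range than instances fetching metadata, and they neither confirm nor rule out a metadata problem. A quick end-to-end test from inside a guest, using the standard EC2-style paths:

  # should print the instance hostname if the metadata chain works
  curl http://169.254.169.254/latest/meta-data/hostname
  # should print the key cloud-init is expected to install
  curl http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key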
"/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" contains: fa:16:3e:3f:94:17,host-172-16-1-8.openstacklocal,172.16.1.8 fa:16:3e:e0:57:e7,host-172-16-1-7.openstacklocal,172.16.1.7 fa:16:3e:db:a7:cb,host-172-16-1-12.openstacklocal,172.16.1.12 fa:16:3e:f8:10:99,host-172-16-1-10.openstacklocal,172.16.1.10 fa:16:3e:a7:82:4c,host-172-16-1-3.openstacklocal,172.16.1.3 fa:16:3e:f8:23:1d,host-172-16-1-14.openstacklocal,172.16.1.14 fa:16:3e:63:53:a4,host-172-16-1-1.openstacklocal,172.16.1.1 fa:16:3e:b7:41:a8,host-172-16-1-2.openstacklocal,172.16.1.2 fa:16:3e:5e:25:5f,host-172-16-1-4.openstacklocal,172.16.1.4 fa:16:3e:3a:a2:53,host-172-16-1-100.openstacklocal,172.16.1.100 fa:16:3e:46:39:e2,host-172-16-1-13.openstacklocal,172.16.1.13 fa:16:3e:06:de:e0,host-172-16-1-18.openstacklocal,172.16.1.18 I've done system restarts since the power outage and the agent hasn't corrected itself. I've restarted all neutron services as I've done things, I could also try stopping and starting dnsmasq. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/6/18 11:15 AM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" , pgsousa at gmail.com Subject: Re: [Openstack] Recovering from full outage Can you manually assign an IP address to a VM and once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least. Also, on the controller node where the dhcp server for that network is, check the "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" and make sure there are entries corresponding to your instances. In my experience, if neutron is broken after working fine (so excluding any miss-configuration), then an agent is out-of-sync and restart usually fixes things. On Fri, Jul 6, 2018 at 9:38 AM, Torin Woltjer wrote: I have done tcpdumps on both the controllers and on a compute node. Controller: `ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67` `tcpdump -vnes0 -i any port 67` Compute: `tcpdump -vnes0 -i brqd85c2a00-a6 port 68` For the first command on the controller, there are no packets captured at all. The second command on the controller captures packets, but they don't appear to be relevant to openstack. The dump from the compute node shows constant requests are getting sent by openstack instances. In summary; DHCP requests are being sent, but are never received. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 4:50 PM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage The cloud-init requires network connectivity by default in order to reach the metadata server for the hostname, ssh-key, etc You can configure cloud-init to use the config-drive, but the lack of network connectivity will make the instance useless anyway, even though it will have you ssh-key and hostname... Did you check the things I told you? On Jul 5, 2018, at 16:06, Torin Woltjer wrote: Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. created a new instance from an ubuntu 18.04 image to test with, the hostname was not set to the name of the instance and could not login as users I had specified in the configuration. 
Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 12:57 PM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: Yes, I've done this. The VMs hang for awhile waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. 
Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed Jul 11 14:49:03 2018 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 11 Jul 2018 10:49:03 -0400 Subject: [Openstack] openstack-ansible ceph deployment issue Message-ID: Building pike (OSAD) on CentOS7.3 and getting following error during running openstack-ansible ceph-install.yml TASK [ceph-config : template ceph_conf_overrides] *********************************************************************************************************** Wednesday 11 July 2018 01:36:38 -0400 (0:00:02.055) 0:01:51.602 ******** fatal: [ostack-controller-03_ceph-mon_container-f1024d18 -> localhost]: FAILED! 
=> {"changed": false, "checksum": "bf21a9e8fbc5a3846fb05b4fa0859e0917b2202f", "failed": true, "msg": "Aborting, target uses selinux but python bindings (libselinux-python) aren't installed!"} NO MORE HOSTS LEFT ****************************************************************************************************************************************** PLAY RECAP ************************************************************************************************************************************************** ostack-controller-03_ceph-mon_container-f1024d18 : ok=39 changed=2 unreachable=0 failed=1 From ruth at ivimey.org Wed Jul 11 18:44:31 2018 From: ruth at ivimey.org (Ruth Ivimey-Cook) Date: Wed, 11 Jul 2018 19:44:31 +0100 Subject: [Openstack] Fatal error during container create (ansible-openstack on bionic) Message-ID: <79fb4658-c748-3e6c-f3ee-1ca884299ff6@ivimey.org> I am getting a fatal error in lxc_create when running openstack-ansible playbooks/setup-hosts.yml and hoping someone can push me in the right direction. Logs below... I am interpreting the fatal error as some sort of missing config, which is why I included the warnings that happened earlier in the above. Is that right? Is there any way I can isolate where exactly in the ansible setup this happens? The only significant changes I've made to the ansible setup are - comment out `linux-image-extra-{{ ansible_kernel }}` package from the ubuntu config as it no longer exists. - create /etc/ansible/.../*ubuntu-18.04.yml files by copying the equivalent ubuntu-16.04.yml file, where no 18.04 version was already present. > ~/openstack-ansible$ sudo openstack-ansible playbooks/setup-hosts.yml > > Variable files: "-e @/etc/openstack_deploy/user_secrets.yml -e @/etc/openstack_deploy/user_variables.yml " > >  [WARNING]: Unable to parse /etc/openstack_deploy/inventory.ini as an inventory source > [DEPRECATION WARNING]: 'include' for playbook includes. You should use 'import_playbook' instead. This > > feature will be removed in version 2.8. Deprecation warnings can be disabled by setting > > deprecation_warnings=False in ansible.cfg. > >  [WARNING]: Could not match supplied host pattern, ignoring: all_lxc_containers > >  [WARNING]: Could not match supplied host pattern, ignoring: all_nspawn_containers > > PLAY [Install Ansible prerequisites] ************************************************************************* > > TASK [Ensure python is installed] **************************************************************************** > > ok: [aio1] ... lots of stuff that works... 
> TASK [Create the new LXC service log directory] ************************************************************** > > ok: [aio1] > > TASK [Create the LXC service log aggregation link] *********************************************************** > > ok: [aio1] > > TASK [apt_package_pinning : Add apt pin preferences] ********************************************************* > > TASK [lxc_hosts : Check for the presence of a public key file on the deployment host] ************************ > > ok: [aio1 -> localhost] > > TASK [lxc_hosts : Fail if a ssh public key is not set in a var and is not present on the deployment host] **** > > TASK [lxc_hosts : Gather variables for each operating system] ************************************************ > > ok: [aio1] => (item=/etc/ansible/roles/lxc_hosts/vars/ubuntu-18.04-host.yml) > > TASK [lxc_hosts : Gather container variables] **************************************************************** > >  [WARNING]: Invalid request to find a file that matches a "null" value > > ok: [aio1] => (item=/etc/ansible/roles/lxc_hosts/vars/ubuntu-18.04.yml) > > TASK [lxc_hosts : include_tasks] ***************************************************************************** > > included: /etc/ansible/roles/lxc_hosts/tasks/lxc_pre_install.yml for aio1 A little later in the same run: > TASK [lxc_container_create : Check the physical_host variable is set] **************************************** > > TASK [lxc_container_create : Collect physical host facts if missing] ***************************************** > > TASK [lxc_container_create : Kernel version and LXC backing store check] ************************************* > > TASK [lxc_container_create : Gather variables for each operating system] ************************************* > >  [WARNING]: Invalid request to find a file that matches a "null" value > >  [WARNING]: Invalid request to find a file that matches a "null" value > > ok: [aio1_cinder_api_container-3255dd97] => (item=/etc/ansible/roles/lxc_container_create/vars/ubuntu-18.04.yml) > >  [WARNING]: Invalid request to find a file that matches a "null" value > > ok: [aio1_designate_container-54f1c305] => (item=/etc/ansible/roles/lxc_container_create/vars/ubuntu-18.04.yml) > >  [WARNING]: Invalid request to find a file that matches a "null" value > >  [WARNING]: Invalid request to find a file that matches a "null" value And then, finally, the fatal error: > TASK [lxc_container_create : include_tasks] ****************************************************************** > > included: /etc/ansible/roles/lxc_container_create/tasks/lxc_container_create_dir.yml for aio1_cinder_api_container-3255dd97, aio1_designate_container-54f1c305, aio1_galera_container-b332cdef, aio1_glance_container-8d10cc70, aio1_heat_api_container-362fdd4a, aio1_horizon_container-d76a2adc, aio1_keystone_container-78616d24, aio1_memcached_container-916a4563, aio1_neutron_server_container-3bf65b1d, aio1_nova_api_container-91ebf932, aio1_repo_container-f56147bc, aio1_rabbit_mq_container-bfd8534a, aio1_rsyslog_container-ce40ff7f, aio1_swift_proxy_container-eada6cf1, aio1_utility_container-195113e0 > > TASK [lxc_container_create : Create container (dir)] ********************************************************* > > fatal: [aio1_cinder_api_container-3255dd97 -> 172.29.236.100]: FAILED! 
=> {"changed": false, "module_stderr": "Shared connection to 172.29.236.100 closed.\r\n", "module_stdout": "Failed to load config for aio1_cinder_api_container-3255dd97\r\n443: error creating container aio1_cinder_api_container-3255dd97\r\nTraceback (most recent call last):\r\n  File \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 1772, in \r\n    main()\r\n  File \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 1767, in main\r\n    lxc_manage = LxcContainerManagement(module=module)\r\n  File \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 619, in __init__\r\n    self.container = self.get_container_bind()\r\n File \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 624, in get_container_bind\r\n    return lxc.Container(name=self.container_name)\r\n  File \"/usr/lib/python2.7/dist-packages/lxc/__init__.py\", line 153, in __init__\r\n    _lxc.Container.__init__(self, name)\r\nSystemError: NULL result without error in PyObject_Call\r\n", "msg": "MODULE FAILURE", "rc": 1} Context: I want to run openstack on ubuntu bionic, and using ansible seemed to be the best way forward. I know openstack-ansible is only supported on xenial, but as I'm a software developer I thought I'd give it a go. I first commented out the OS checks... and have got a good deal of progress since. However, I have hit a problem and am hoping someone can help. I also posted this question on the ask.openstack pages but it's still awaiting moderation :( https://ask.openstack.org/en/question/115193/fatal-error-during-container-create-ansible-openstack-on-bionic/ From mnaser at vexxhost.com Wed Jul 11 19:35:03 2018 From: mnaser at vexxhost.com (Mohammed Naser) Date: Wed, 11 Jul 2018 12:35:03 -0700 Subject: [Openstack] Fatal error during container create (ansible-openstack on bionic) In-Reply-To: <79fb4658-c748-3e6c-f3ee-1ca884299ff6@ivimey.org> References: <79fb4658-c748-3e6c-f3ee-1ca884299ff6@ivimey.org> Message-ID: Hi there, Bionic isn't current supported and we're working on adding support for it in the Rocky cycle! We recommend you deploy on Xenial. Thanks, Mohammed On Wed, Jul 11, 2018 at 11:44 AM, Ruth Ivimey-Cook wrote: > I am getting a fatal error in lxc_create when running openstack-ansible > playbooks/setup-hosts.yml and hoping someone can push me in the right > direction. Logs below... > > I am interpreting the fatal error as some sort of missing config, which is > why I included the warnings that happened earlier in the above. Is that > right? Is there any way I can isolate where exactly in the ansible setup > this happens? > > The only significant changes I've made to the ansible setup are > > - comment out `linux-image-extra-{{ ansible_kernel }}` package from the > ubuntu config as it no longer exists. > - create /etc/ansible/.../*ubuntu-18.04.yml files by copying the equivalent > ubuntu-16.04.yml file, where no 18.04 version was already present. > >> ~/openstack-ansible$ sudo openstack-ansible playbooks/setup-hosts.yml >> >> Variable files: "-e @/etc/openstack_deploy/user_secrets.yml -e >> @/etc/openstack_deploy/user_variables.yml " >> >> [WARNING]: Unable to parse /etc/openstack_deploy/inventory.ini as an >> inventory source > >> [DEPRECATION WARNING]: 'include' for playbook includes. You should use >> 'import_playbook' instead. This >> >> feature will be removed in version 2.8. Deprecation warnings can be >> disabled by setting >> >> deprecation_warnings=False in ansible.cfg. 
>> >> [WARNING]: Could not match supplied host pattern, ignoring: >> all_lxc_containers >> >> [WARNING]: Could not match supplied host pattern, ignoring: >> all_nspawn_containers >> >> PLAY [Install Ansible prerequisites] >> ************************************************************************* >> >> TASK [Ensure python is installed] >> **************************************************************************** >> >> ok: [aio1] > > > ... lots of stuff that works... > >> TASK [Create the new LXC service log directory] >> ************************************************************** >> >> ok: [aio1] >> >> TASK [Create the LXC service log aggregation link] >> *********************************************************** >> >> ok: [aio1] >> >> TASK [apt_package_pinning : Add apt pin preferences] >> ********************************************************* >> >> TASK [lxc_hosts : Check for the presence of a public key file on the >> deployment host] ************************ >> >> ok: [aio1 -> localhost] >> >> TASK [lxc_hosts : Fail if a ssh public key is not set in a var and is not >> present on the deployment host] **** >> >> TASK [lxc_hosts : Gather variables for each operating system] >> ************************************************ >> >> ok: [aio1] => >> (item=/etc/ansible/roles/lxc_hosts/vars/ubuntu-18.04-host.yml) >> >> TASK [lxc_hosts : Gather container variables] >> **************************************************************** >> >> [WARNING]: Invalid request to find a file that matches a "null" value >> >> ok: [aio1] => (item=/etc/ansible/roles/lxc_hosts/vars/ubuntu-18.04.yml) >> >> TASK [lxc_hosts : include_tasks] >> ***************************************************************************** >> >> included: /etc/ansible/roles/lxc_hosts/tasks/lxc_pre_install.yml for aio1 > > A little later in the same run: > >> TASK [lxc_container_create : Check the physical_host variable is set] >> **************************************** >> >> TASK [lxc_container_create : Collect physical host facts if missing] >> ***************************************** >> >> TASK [lxc_container_create : Kernel version and LXC backing store check] >> ************************************* >> >> TASK [lxc_container_create : Gather variables for each operating system] >> ************************************* >> >> [WARNING]: Invalid request to find a file that matches a "null" value >> >> [WARNING]: Invalid request to find a file that matches a "null" value >> >> ok: [aio1_cinder_api_container-3255dd97] => >> (item=/etc/ansible/roles/lxc_container_create/vars/ubuntu-18.04.yml) >> >> [WARNING]: Invalid request to find a file that matches a "null" value >> >> ok: [aio1_designate_container-54f1c305] => >> (item=/etc/ansible/roles/lxc_container_create/vars/ubuntu-18.04.yml) >> >> [WARNING]: Invalid request to find a file that matches a "null" value >> >> [WARNING]: Invalid request to find a file that matches a "null" value > > And then, finally, the fatal error: > >> TASK [lxc_container_create : include_tasks] >> ****************************************************************** >> >> included: >> /etc/ansible/roles/lxc_container_create/tasks/lxc_container_create_dir.yml >> for aio1_cinder_api_container-3255dd97, aio1_designate_container-54f1c305, >> aio1_galera_container-b332cdef, aio1_glance_container-8d10cc70, >> aio1_heat_api_container-362fdd4a, aio1_horizon_container-d76a2adc, >> aio1_keystone_container-78616d24, aio1_memcached_container-916a4563, >> aio1_neutron_server_container-3bf65b1d, 
aio1_nova_api_container-91ebf932, >> aio1_repo_container-f56147bc, aio1_rabbit_mq_container-bfd8534a, >> aio1_rsyslog_container-ce40ff7f, aio1_swift_proxy_container-eada6cf1, >> aio1_utility_container-195113e0 >> >> TASK [lxc_container_create : Create container (dir)] >> ********************************************************* >> >> fatal: [aio1_cinder_api_container-3255dd97 -> 172.29.236.100]: FAILED! => >> {"changed": false, "module_stderr": "Shared connection to 172.29.236.100 >> closed.\r\n", "module_stdout": "Failed to load config for >> aio1_cinder_api_container-3255dd97\r\n443: error creating container >> aio1_cinder_api_container-3255dd97\r\nTraceback (most recent call last):\r\n >> File \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 1772, in >> \r\n main()\r\n File >> \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 1767, in >> main\r\n lxc_manage = LxcContainerManagement(module=module)\r\n File >> \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 619, in >> __init__\r\n self.container = self.get_container_bind()\r\n File >> \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 624, in >> get_container_bind\r\n return lxc.Container(name=self.container_name)\r\n >> File \"/usr/lib/python2.7/dist-packages/lxc/__init__.py\", line 153, in >> __init__\r\n _lxc.Container.__init__(self, name)\r\nSystemError: NULL >> result without error in PyObject_Call\r\n", "msg": "MODULE FAILURE", "rc": >> 1} > > Context: I want to run openstack on ubuntu bionic, and using ansible seemed > to be the best way forward. I know openstack-ansible is only supported on > xenial, but as I'm a software developer I thought I'd give it a go. I first > commented out the OS checks... and have got a good deal of progress since. > However, I have hit a problem and am hoping someone can help. > > I also posted this question on the ask.openstack pages but it's still > awaiting moderation :( > > https://ask.openstack.org/en/question/115193/fatal-error-during-container-create-ansible-openstack-on-bionic/ > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From torin.woltjer at granddial.com Wed Jul 11 21:23:30 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Wed, 11 Jul 2018 21:23:30 GMT Subject: [Openstack] Recovering from full outage Message-ID: If I run `ip netns exec qrouter netstat -lnp` or `ip netns exec qdhcp netstat -lnp` on the controller, should I see anything listening on the metadata port (8775)? When I run these commands I don't see that listening, but I have no example of a working system to check against. Can anybody verify this? Thanks, Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/10/18 2:58 PM To: Cc: , Subject: Re: [Openstack] Recovering from full outage DHCP is working again so instances are getting their addresses. For some reason cloud-init isn't working correctly. Hostnames aren't getting set, and SSH key pair isn't getting set. The neutron-metadata service is in control of this? 
neutron-metadata-agent.log:
2018-07-10 08:01:42.046 5518 INFO eventlet.wsgi.server [-] 109.73.185.195, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0622332
2018-07-10 09:49:42.604 5518 INFO eventlet.wsgi.server [-] 197.149.85.150, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0645461
2018-07-10 10:52:50.845 5517 INFO eventlet.wsgi.server [-] 88.249.225.204, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0659041
2018-07-10 11:43:20.471 5518 INFO eventlet.wsgi.server [-] 143.208.186.168, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0618532
2018-07-10 11:53:15.574 5511 INFO eventlet.wsgi.server [-] 194.40.240.254, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0636070
2018-07-10 13:26:46.795 5518 INFO eventlet.wsgi.server [-] 109.73.177.149, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0611560
2018-07-10 13:27:38.795 5513 INFO eventlet.wsgi.server [-] 125.167.69.238, "GET / HTTP/1.0" status: 404 len: 195 time: 0.0631371
2018-07-10 13:30:49.551 5514 INFO eventlet.wsgi.server [-] 155.93.152.111, "GET / HTTP/1.0" status: 404 len: 195 time: 0.0609179
2018-07-10 14:12:42.008 5521 INFO eventlet.wsgi.server [-] 190.85.38.173, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0597739

No other log files show abnormal behavior.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: "Torin Woltjer"
Sent: 7/6/18 2:33 PM
To: "lmihaiescu at gmail.com"
Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage

I explored creating a second "selfservice" vxlan to see if DHCP would work on it as it does on my external "provider" network. The new vxlan network shares the same problems as the old vxlan network. Am I having problems with VXLAN in particular?

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: "Torin Woltjer"
Sent: 7/6/18 12:05 PM
To:
Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage

Interestingly, I can ping the neutron router at 172.16.1.1 just fine, but DHCP (located at 172.16.1.2 and 172.16.1.3) fails. The instance that I manually added the IP address to has a floating IP, and oddly enough I am able to ping DHCP on the provider network, which suggests that DHCP may be working on other networks but not on my selfservice network. I was able to confirm this by creating a new virtual machine directly on the provider network; I was able to ping it and SSH into it right off the bat, as it obtained the proper address on its own.

"/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" is empty.

"/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" contains:
fa:16:3e:3f:94:17,host-172-16-1-8.openstacklocal,172.16.1.8
fa:16:3e:e0:57:e7,host-172-16-1-7.openstacklocal,172.16.1.7
fa:16:3e:db:a7:cb,host-172-16-1-12.openstacklocal,172.16.1.12
fa:16:3e:f8:10:99,host-172-16-1-10.openstacklocal,172.16.1.10
fa:16:3e:a7:82:4c,host-172-16-1-3.openstacklocal,172.16.1.3
fa:16:3e:f8:23:1d,host-172-16-1-14.openstacklocal,172.16.1.14
fa:16:3e:63:53:a4,host-172-16-1-1.openstacklocal,172.16.1.1
fa:16:3e:b7:41:a8,host-172-16-1-2.openstacklocal,172.16.1.2
fa:16:3e:5e:25:5f,host-172-16-1-4.openstacklocal,172.16.1.4
fa:16:3e:3a:a2:53,host-172-16-1-100.openstacklocal,172.16.1.100
fa:16:3e:46:39:e2,host-172-16-1-13.openstacklocal,172.16.1.13
fa:16:3e:06:de:e0,host-172-16-1-18.openstacklocal,172.16.1.18

I've done system restarts since the power outage and the agent hasn't corrected itself. I've restarted all neutron services as I've gone along; I could also try stopping and starting dnsmasq.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu
Sent: 7/6/18 11:15 AM
To: torin.woltjer at granddial.com
Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" , pgsousa at gmail.com
Subject: Re: [Openstack] Recovering from full outage

Can you manually assign an IP address to a VM and, once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least.

Also, on the controller node where the dhcp server for that network is, check the "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" file and make sure there are entries corresponding to your instances.

In my experience, if neutron is broken after working fine (so excluding any misconfiguration), then an agent is out-of-sync and a restart usually fixes things.

On Fri, Jul 6, 2018 at 9:38 AM, Torin Woltjer wrote:

I have done tcpdumps on both the controllers and on a compute node.
Controller:
`ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67`
`tcpdump -vnes0 -i any port 67`
Compute:
`tcpdump -vnes0 -i brqd85c2a00-a6 port 68`

For the first command on the controller, there are no packets captured at all. The second command on the controller captures packets, but they don't appear to be relevant to openstack. The dump from the compute node shows constant requests are getting sent by openstack instances. In summary: DHCP requests are being sent, but are never received.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu
Sent: 7/5/18 4:50 PM
To: torin.woltjer at granddial.com
Subject: Re: [Openstack] Recovering from full outage

Cloud-init requires network connectivity by default in order to reach the metadata server for the hostname, ssh key, etc. You can configure cloud-init to use the config-drive, but the lack of network connectivity will make the instance useless anyway, even though it will have your ssh key and hostname...

Did you check the things I told you?

On Jul 5, 2018, at 16:06, Torin Woltjer wrote:

Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. I created a new instance from an ubuntu 18.04 image to test with; the hostname was not set to the name of the instance, and I could not log in as the users I had specified in the configuration.
Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu
Sent: 7/5/18 12:57 PM
To: torin.woltjer at granddial.com
Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org"
Subject: Re: [Openstack] Recovering from full outage

You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check the iptables rules on the compute nodes for the return traffic.

On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote:

Yes, I've done this. The VMs hang for a while waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents showed offline in `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again, and it still fails to pull an IP address.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu
Sent: 7/5/18 10:38 AM
To: torin.woltjer at granddial.com
Subject: Re: [Openstack] Recovering from full outage

Did you restart the neutron-dhcp-agent and reboot the VMs?

On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote:

The qrouter netns appears once the lock_path is specified, and the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses; if I manually give them an address and a route, they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: "Torin Woltjer"
Sent: 7/5/18 8:53 AM
To:
Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage

There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers?

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu
Sent: 7/3/18 7:47 PM
To: torin.woltjer at granddial.com
Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org
Subject: Re: [Openstack] Recovering from full outage

Did you set a lock_path in neutron's config?

On Jul 3, 2018, at 17:34, Torin Woltjer wrote:

The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/

No such errors are on the compute nodes themselves.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: "Torin Woltjer"
Sent: 7/3/18 5:14 PM
To:
Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org"
Subject: Re: [Openstack] Recovering from full outage

Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status.
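(If a server stays wedged in a transitional state like that, the saved state can usually be reset and a hard reboot retried. These are standard openstackclient commands; the UUID is a placeholder:

openstack server set --state active <instance-uuid>
openstack server reboot --hard <instance-uuid>
)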
Most notable of the logs is neutron-server.log, which shows the following: http://paste.openstack.org/show/724917/

I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted the controllers, and all of the agents show online. http://paste.openstack.org/show/724921/

And all of the instances can be properly started; however, I cannot ping any of the instances' floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu
Sent: 7/3/18 11:50 AM
To: torin.woltjer at granddial.com
Subject: Re: [Openstack] Recovering from full outage

Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agent logs on the compute nodes.

On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote:

We just suffered a power outage in our data center and I'm having trouble recovering the OpenStack cluster. All of the nodes are back online, and every instance shows active, but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present, and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks?

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack at lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From torin.woltjer at granddial.com Thu Jul 12 12:20:32 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Thu, 12 Jul 2018 12:20:32 GMT
Subject: [Openstack] [Openstack-operators] Recovering from full outage
Message-ID: <0742b8e467364769a2c2cdac10067e2f@granddial.com>

The neutron-metadata-agent service is running, the agent is alive, and it is listening on port 8775. However, new instances still do not get any information like hostname or keypair. If I run `curl 192.168.116.22:8775` from the compute nodes, I do get a response. The metadata agent is running, listening, and accessible from the compute nodes; and it worked previously.

I'm stumped.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: அருண் குமார் (Arun Kumar)
Sent: 7/12/18 12:01 AM
To: torin.woltjer at granddial.com
Cc: "openstack at lists.openstack.org" , openstack-operators at lists.openstack.org
Subject: Re: [Openstack-operators] [Openstack] Recovering from full outage

Hi Torin,

> If I run `ip netns exec qrouter netstat -lnp` or `ip netns exec qdhcp netstat -lnp` on the controller, should I see anything listening on the metadata port (8775)? When I run these commands I don't see that listening, but I have no example of a working system to check against. Can anybody verify this?

In the qrouter/qdhcp namespaces you won't see port 8775; instead, check whether the metadata service is running on the neutron controller node(s) and listening on port 8775.
Also, you can verify the metadata and neutron services using the following commands:

service neutron-metadata-agent status
neutron agent-list
netstat -ntplua | grep :8775

Thanks & Regards
Arun

ஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃ
With love,
Arun
Let us make technology flourish in our own language
http://thangamaniarun.wordpress.com
ஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃ

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jpetrini at coredial.com Thu Jul 12 13:16:15 2018
From: jpetrini at coredial.com (John Petrini)
Date: Thu, 12 Jul 2018 09:16:15 -0400
Subject: [Openstack] [Openstack-operators] Recovering from full outage
In-Reply-To: <0742b8e467364769a2c2cdac10067e2f@granddial.com>
References: <0742b8e467364769a2c2cdac10067e2f@granddial.com>
Message-ID:

Are your instances receiving a route to the metadata service (169.254.169.254) from DHCP? Can you curl the endpoint?

curl http://169.254.169.254/latest/meta-data

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From torin.woltjer at granddial.com Thu Jul 12 14:21:17 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Thu, 12 Jul 2018 14:21:17 GMT
Subject: [Openstack] [Openstack-operators] Recovering from full outage
Message-ID: <863528991ac14ddf87d2449c763071e1@granddial.com>

I tested this on two instances. The first instance has existed since before I began having this issue. The second is created from a cirros test image.

On the first instance:
The route exists: 169.254.169.254 via 172.16.1.1 dev ens3 proto dhcp metric 100.
curl returns information, for example: `curl http://169.254.169.254/latest/meta-data/public-keys` 0=nextcloud

On the second instance:
The route exists: 169.254.169.254 via 172.16.1.1 dev eth0
curl fails: `curl http://169.254.169.254/latest/meta-data` curl: (7) Failed to connect to 169.254.169.254 port 80: Connection timed out

I am curious why one instance is able to connect but not the other. Both the first and second instances were running on the same compute node.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: John Petrini
Sent: 7/12/18 9:16 AM
To: torin.woltjer at granddial.com
Cc: thangam.arunx at gmail.com, OpenStack Operators , OpenStack Mailing List
Subject: Re: [Openstack-operators] [Openstack] Recovering from full outage

Are your instances receiving a route to the metadata service (169.254.169.254) from DHCP? Can you curl the endpoint?

curl http://169.254.169.254/latest/meta-data

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From haleyb.dev at gmail.com Thu Jul 12 14:29:54 2018
From: haleyb.dev at gmail.com (Brian Haley)
Date: Thu, 12 Jul 2018 10:29:54 -0400
Subject: [Openstack] [Openstack-operators] Recovering from full outage
In-Reply-To: <0742b8e467364769a2c2cdac10067e2f@granddial.com>
References: <0742b8e467364769a2c2cdac10067e2f@granddial.com>
Message-ID: <2104c379-cda9-7822-13d9-b968595527af@gmail.com>
There is also a metadata proxy that runs in the qrouter namespace, you can verify it's running and getting requests by looking at both iptables and netstat output. $ sudo ip netns exec qrouter-$ID iptables-save -c | grep 169 [16:960] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697 [96:7968] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff The numbers inside [] represent packets:bytes, so non-zero is good. $ sudo ip netns exec qrouter-$ID netstat -anep | grep 9697 tcp 0 0 0.0.0.0:9697 0.0.0.0:* LISTEN 0 294339 4867/haproxy If you have a running instance you can log into, running curl to the metadata IP would be helpful to try and diagnose since it would go through this entire path. -Brian > /*Torin Woltjer*/ > *Grand Dial Communications - A ZK Tech Inc. Company* > *616.776.1066 ext. 2006* > /*www.granddial.com */ > > ------------------------------------------------------------------------ > *From*: அருண் குமார் (Arun Kumar) > *Sent*: 7/12/18 12:01 AM > *To*: torin.woltjer at granddial.com > *Cc*: "openstack at lists.openstack.org" , > openstack-operators at lists.openstack.org > *Subject*: Re: [Openstack-operators] [Openstack] Recovering from full outage > Hi Torin, > > If I run `ip netns exec qrouter netstat -lnp` or `ip netns exec > qdhcp netstat -lnp` on the controller, should I see anything > listening on the metadata port (8775)? When I run these commands I > don't see that listening, but I have no example of a working system > to check against. Can anybody verify this? > > > Either on qrouter/qdhcp namespaces, you won't see port 8775, instead > check whether meta-data service is running on the neutron controller > node(s) and listening on port 8775? Aslo, you can verify metadata and > neturon services using following commands > > service neutron-metadata-agent status > neutron agent-list > netstat -ntplua | grep :8775 > > > Thanks & Regards > Arun > > ஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃ > அன்புடன் > அருண் > நுட்பம் நம்மொழியில் தழைக்கச் செய்வோம் > http://thangamaniarun.wordpress.com > ஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃ > > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > From jpetrini at coredial.com Thu Jul 12 14:33:10 2018 From: jpetrini at coredial.com (John Petrini) Date: Thu, 12 Jul 2018 10:33:10 -0400 Subject: [Openstack] [Openstack-operators] Recovering from full outage In-Reply-To: <863528991ac14ddf87d2449c763071e1@granddial.com> References: <863528991ac14ddf87d2449c763071e1@granddial.com> Message-ID: You might want to try giving the neutron-dhcp and metadata agents a restart. -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From torin.woltjer at granddial.com Thu Jul 12 15:03:26 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Thu, 12 Jul 2018 15:03:26 GMT
Subject: [Openstack] [Openstack-operators] Recovering from full outage
Message-ID: <373f719b15654b4a8ae5832d8e12229f@granddial.com>

Checking iptables for the metadata-proxy inside of qrouter provides the following:

$ ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e iptables-save -c | grep 169
[0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697
[0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff

Packets:Bytes are both 0, so no traffic is touching this rule?

Interestingly, the command:

$ ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e netstat -anep | grep 9697

returns nothing, so there isn't actually anything running on 9697 in the network namespace... This is the output without grep:

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address    Foreign Address    State    User    Inode    PID/Program name
raw        0      0 0.0.0.0:112      0.0.0.0:*          7        0       76154    8404/keepalived
raw        0      0 0.0.0.0:112      0.0.0.0:*          7        0       76153    8404/keepalived
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags   Type    State    I-Node    PID/Program name
unix  2      [ ]     DGRAM            64501     7567/python2
unix  2      [ ]     DGRAM            79953     8403/keepalived

Could the reason no traffic is touching the rule be that nothing is listening on that port, or is there a second issue further down the chain? Curl fails even after restarting the neutron-dhcp-agent & neutron-metadata-agent.

Thank you for this, and any future help.

-------------- next part --------------
An HTML attachment was scrubbed...
Is that >> right? Is there any way I can isolate where exactly in the ansible setup >> this happens? >> >> The only significant changes I've made to the ansible setup are >> >> - comment out `linux-image-extra-{{ ansible_kernel }}` package from the >> ubuntu config as it no longer exists. >> - create /etc/ansible/.../*ubuntu-18.04.yml files by copying the equivalent >> ubuntu-16.04.yml file, where no 18.04 version was already present. >> >>> ~/openstack-ansible$ sudo openstack-ansible playbooks/setup-hosts.yml >>> >>> Variable files: "-e @/etc/openstack_deploy/user_secrets.yml -e >>> @/etc/openstack_deploy/user_variables.yml " >>> >>> [WARNING]: Unable to parse /etc/openstack_deploy/inventory.ini as an >>> inventory source >>> [DEPRECATION WARNING]: 'include' for playbook includes. You should use >>> 'import_playbook' instead. This >>> >>> feature will be removed in version 2.8. Deprecation warnings can be >>> disabled by setting >>> >>> deprecation_warnings=False in ansible.cfg. >>> >>> [WARNING]: Could not match supplied host pattern, ignoring: >>> all_lxc_containers >>> >>> [WARNING]: Could not match supplied host pattern, ignoring: >>> all_nspawn_containers >>> >>> PLAY [Install Ansible prerequisites] >>> ************************************************************************* >>> >>> TASK [Ensure python is installed] >>> **************************************************************************** >>> >>> ok: [aio1] >> >> ... lots of stuff that works... >> >>> TASK [Create the new LXC service log directory] >>> ************************************************************** >>> >>> ok: [aio1] >>> >>> TASK [Create the LXC service log aggregation link] >>> *********************************************************** >>> >>> ok: [aio1] >>> >>> TASK [apt_package_pinning : Add apt pin preferences] >>> ********************************************************* >>> >>> TASK [lxc_hosts : Check for the presence of a public key file on the >>> deployment host] ************************ >>> >>> ok: [aio1 -> localhost] >>> >>> TASK [lxc_hosts : Fail if a ssh public key is not set in a var and is not >>> present on the deployment host] **** >>> >>> TASK [lxc_hosts : Gather variables for each operating system] >>> ************************************************ >>> >>> ok: [aio1] => >>> (item=/etc/ansible/roles/lxc_hosts/vars/ubuntu-18.04-host.yml) >>> >>> TASK [lxc_hosts : Gather container variables] >>> **************************************************************** >>> >>> [WARNING]: Invalid request to find a file that matches a "null" value >>> >>> ok: [aio1] => (item=/etc/ansible/roles/lxc_hosts/vars/ubuntu-18.04.yml) >>> >>> TASK [lxc_hosts : include_tasks] >>> ***************************************************************************** >>> >>> included: /etc/ansible/roles/lxc_hosts/tasks/lxc_pre_install.yml for aio1 >> A little later in the same run: >> >>> TASK [lxc_container_create : Check the physical_host variable is set] >>> **************************************** >>> >>> TASK [lxc_container_create : Collect physical host facts if missing] >>> ***************************************** >>> >>> TASK [lxc_container_create : Kernel version and LXC backing store check] >>> ************************************* >>> >>> TASK [lxc_container_create : Gather variables for each operating system] >>> ************************************* >>> >>> [WARNING]: Invalid request to find a file that matches a "null" value >>> >>> [WARNING]: Invalid request to find a file that matches a "null" value >>> >>> 
ok: [aio1_cinder_api_container-3255dd97] => >>> (item=/etc/ansible/roles/lxc_container_create/vars/ubuntu-18.04.yml) >>> >>> [WARNING]: Invalid request to find a file that matches a "null" value >>> >>> ok: [aio1_designate_container-54f1c305] => >>> (item=/etc/ansible/roles/lxc_container_create/vars/ubuntu-18.04.yml) >>> >>> [WARNING]: Invalid request to find a file that matches a "null" value >>> >>> [WARNING]: Invalid request to find a file that matches a "null" value >> And then, finally, the fatal error: >> >>> TASK [lxc_container_create : include_tasks] >>> ****************************************************************** >>> >>> included: >>> /etc/ansible/roles/lxc_container_create/tasks/lxc_container_create_dir.yml >>> for aio1_cinder_api_container-3255dd97, aio1_designate_container-54f1c305, >>> aio1_galera_container-b332cdef, aio1_glance_container-8d10cc70, >>> aio1_heat_api_container-362fdd4a, aio1_horizon_container-d76a2adc, >>> aio1_keystone_container-78616d24, aio1_memcached_container-916a4563, >>> aio1_neutron_server_container-3bf65b1d, aio1_nova_api_container-91ebf932, >>> aio1_repo_container-f56147bc, aio1_rabbit_mq_container-bfd8534a, >>> aio1_rsyslog_container-ce40ff7f, aio1_swift_proxy_container-eada6cf1, >>> aio1_utility_container-195113e0 >>> >>> TASK [lxc_container_create : Create container (dir)] >>> ********************************************************* >>> >>> fatal: [aio1_cinder_api_container-3255dd97 -> 172.29.236.100]: FAILED! => >>> {"changed": false, "module_stderr": "Shared connection to 172.29.236.100 >>> closed.\r\n", "module_stdout": "Failed to load config for >>> aio1_cinder_api_container-3255dd97\r\n443: error creating container >>> aio1_cinder_api_container-3255dd97\r\nTraceback (most recent call last):\r\n >>> File \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 1772, in >>> \r\n main()\r\n File >>> \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 1767, in >>> main\r\n lxc_manage = LxcContainerManagement(module=module)\r\n File >>> \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 619, in >>> __init__\r\n self.container = self.get_container_bind()\r\n File >>> \"/tmp/ansible_GSPHbc/ansible_module_lxc_container.py\", line 624, in >>> get_container_bind\r\n return lxc.Container(name=self.container_name)\r\n >>> File \"/usr/lib/python2.7/dist-packages/lxc/__init__.py\", line 153, in >>> __init__\r\n _lxc.Container.__init__(self, name)\r\nSystemError: NULL >>> result without error in PyObject_Call\r\n", "msg": "MODULE FAILURE", "rc": >>> 1} >> Context: I want to run openstack on ubuntu bionic, and using ansible seemed >> to be the best way forward. I know openstack-ansible is only supported on >> xenial, but as I'm a software developer I thought I'd give it a go. I first >> commented out the OS checks... and have got a good deal of progress since. >> However, I have hit a problem and am hoping someone can help. 
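(On the question of isolating where it fails: the openstack-ansible wrapper passes extra arguments straight through to ansible-playbook, so the stock verbosity and task-selection flags work. A sketch, reusing the task and container names from the failing run:

openstack-ansible playbooks/setup-hosts.yml -vvv
openstack-ansible playbooks/setup-hosts.yml --start-at-task "Create container (dir)" --limit aio1_cinder_api_container-3255dd97

With -vvv, Ansible prints the raw module invocation for the failing task, which is usually enough to replay it by hand on the target host.)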
>> >> I also posted this question on the ask.openstack pages but it's still >> awaiting moderation :( >> >> https://ask.openstack.org/en/question/115193/fatal-error-during-container-create-ansible-openstack-on-bionic/ >> >> >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From marcioprado at marcioprado.eti.br Fri Jul 13 10:34:06 2018 From: marcioprado at marcioprado.eti.br (Marcio Prado) Date: Fri, 13 Jul 2018 07:34:06 -0300 Subject: [Openstack] Monitor Instances KVM Message-ID: <8cdde50ecf6be684e5026641246b36a1@marcioprado.eti.br> Good Morning, How can I monitor openstack KVM instances? Does Ceilometer do this? Thank you -- Marcio Prado Analista de TI - Infraestrutura e Redes Fone: (35) 9.9821-3561 www.marcioprado.eti.br From jd8lester at gmail.com Sat Jul 14 03:41:40 2018 From: jd8lester at gmail.com (John Lester) Date: Fri, 13 Jul 2018 23:41:40 -0400 Subject: [Openstack] [networking-sfc] Create port pair post commit fails - sql null host_id cannot be null Message-ID: Hello, I'm having some difficulties with creating a port pair in the networking-sfc project. See below log file. Any help would be greatly appreciated. Thanks! neutron-server.log 2018-07-13 23:13:26.965 3361 INFO neutron.wsgi [req-1e70ad3f-6bd7-4291-bb06-d5671cae4ef2 c51f741dd6c34637a8b66bade75ed570 3a1091e3257a42398eda7b366cec3e7a - default default] 192.168.14.112 "GET /v2.0/ports?id=9a1a302e-c862-4683-80d0-e91d08b063d8 HTTP/1.1" status: 200 len: 1136 time: 0.6060951 2018-07-13 23:13:27.019 3361 INFO neutron.wsgi [req-367bbef4-9644-4c49-a050-b0f2fd2de2e6 c51f741dd6c34637a8b66bade75ed570 3a1091e3257a42398eda7b366cec3e7a - default default] 192.168.14.112 "GET /v2.0/ports?id=0fe15cd5-9a92-4d11-bf4e-206cf4b8deb5 HTTP/1.1" status: 200 len: 1132 time: 0.0466549 2018-07-13 23:13:27.483 3361 WARNING networking_sfc.services.sfc.drivers.ovs.driver [req-e059eb52-c097-4e8f-8444-c3d1d280e47c c51f741dd6c34637a8b66bade75ed570 3a1091e3257a42398eda7b366cec3e7a - default default] Currently only support vxlan network 2018-07-13 23:13:27.582 3361 WARNING networking_sfc.services.sfc.drivers.ovs.driver [req-e059eb52-c097-4e8f-8444-c3d1d280e47c c51f741dd6c34637a8b66bade75ed570 3a1091e3257a42398eda7b366cec3e7a - default default] Currently only support vxlan network 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters [req-e059eb52-c097-4e8f-8444-c3d1d280e47c c51f741dd6c34637a8b66bade75ed570 3a1091e3257a42398eda7b366cec3e7a - default default] DBAPIError exception wrapped from (pymysql.err.IntegrityError) (1048, u"Column 'host_id' cannot be null") [SQL: u'INSERT INTO sfc_portpair_details (project_id, id, ingress, egress, host_id, in_mac_address, mac_address, network_type, segment_id, local_endpoint, correlation) VALUES (%(project_id)s, %(id)s, %(ingress)s, %(egress)s, %(host_id)s, %(in_mac_address)s, %(mac_address)s, %(network_type)s, %(segment_id)s, %(local_endpoint)s, %(correlation)s)'] [parameters: {'ingress': u'9a1a302e-c862-4683-80d0-e91d08b063d8', 'segment_id': None, 'correlation': None, 'id': '7e6ad744-fd8d-4a58-9606-cfe8d00587c1', 'local_endpoint': None, 'in_mac_address': None, 'egress': u'0fe15cd5-9a92-4d11-bf4e-206cf4b8deb5', 'mac_address': None, 'host_id': None, 'project_id': u'3a1091e3257a42398eda7b366cec3e7a', 'network_type': None}]: IntegrityError: (1048, u"Column 'host_id' cannot be null") 2018-07-13 23:13:27.585 3361 
ERROR oslo_db.sqlalchemy.exc_filters Traceback (most recent call last): 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters context) 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 470, in do_execute 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters cursor.execute(statement, parameters) 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/cursors.py", line 166, in execute 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters result = self._query(query) 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/cursors.py", line 322, in _query 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters conn.query(q) 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 856, in query 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters self._affected_rows = self._read_query_result(unbuffered=unbuffered) 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1057, in _read_query_result 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters result.read() 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1340, in read 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters first_packet = self.connection._read_packet() 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1014, in _read_packet 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters packet.check_error() 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 393, in check_error 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters err.raise_mysql_exception(self._data) 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters File "/usr/lib/python2.7/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters raise errorclass(errno, errval) 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters IntegrityError: (1048, u"Column 'host_id' cannot be null") 2018-07-13 23:13:27.585 3361 ERROR oslo_db.sqlalchemy.exc_filters 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager [req-e059eb52-c097-4e8f-8444-c3d1d280e47c c51f741dd6c34637a8b66bade75ed570 3a1091e3257a42398eda7b366cec3e7a - default default] (pymysql.err.IntegrityError) (1048, u"Column 'host_id' cannot be null") [SQL: u'INSERT INTO sfc_portpair_details (project_id, id, ingress, egress, host_id, in_mac_address, mac_address, network_type, segment_id, local_endpoint, correlation) VALUES (%(project_id)s, %(id)s, %(ingress)s, %(egress)s, %(host_id)s, %(in_mac_address)s, %(mac_address)s, %(network_type)s, %(segment_id)s, %(local_endpoint)s, %(correlation)s)'] [parameters: {'ingress': u'9a1a302e-c862-4683-80d0-e91d08b063d8', 'segment_id': None, 'correlation': None, 'id': 
'7e6ad744-fd8d-4a58-9606-cfe8d00587c1', 'local_endpoint': None, 'in_mac_address': None, 'egress': u'0fe15cd5-9a92-4d11-bf4e-206cf4b8deb5', 'mac_address': None, 'host_id': None, 'project_id': u'3a1091e3257a42398eda7b366cec3e7a', 'network_type': None}]: DBError: (pymysql.err.IntegrityError) (1048, u"Column 'host_id' cannot be null") [SQL: u'INSERT INTO sfc_portpair_details (project_id, id, ingress, egress, host_id, in_mac_address, mac_address, network_type, segment_id, local_endpoint, correlation) VALUES (%(project_id)s, %(id)s, %(ingress)s, %(egress)s, %(host_id)s, %(in_mac_address)s, %(mac_address)s, %(network_type)s, %(segment_id)s, %(local_endpoint)s, %(correlation)s)'] [parameters: {'ingress': u'9a1a302e-c862-4683-80d0-e91d08b063d8', 'segment_id': None, 'correlation': None, 'id': '7e6ad744-fd8d-4a58-9606-cfe8d00587c1', 'local_endpoint': None, 'in_mac_address': None, 'egress': u'0fe15cd5-9a92-4d11-bf4e-206cf4b8deb5', 'mac_address': None, 'host_id': None, 'project_id': u'3a1091e3257a42398eda7b366cec3e7a', 'network_type': None}] 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager Traceback (most recent call last): 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/driver_manager.py", line 100, in _call_drivers 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager getattr(driver.obj, method_name)(context) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/drivers/base.py", line 79, in create_port_pair_postcommit 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager self.create_port_pair(context) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/oslo_log/helpers.py", line 67, in wrapper 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager return method(*args, **kwargs) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/drivers/ovs/driver.py", line 1168, in create_port_pair 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager self._create_port_pair_detail(port_pair) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/oslo_log/helpers.py", line 67, in wrapper 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager return method(*args, **kwargs) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/drivers/ovs/driver.py", line 1161, in _create_port_pair_detail 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager r = self.create_port_pair_detail(portpair_detail) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/drivers/ovs/db.py", line 244, in create_port_pair_detail 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager return self._make_port_detail_dict(port_obj) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__ 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager 
self.gen.next() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 1029, in _transaction_scope 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager yield resource 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__ 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager self.gen.next() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 641, in _session 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager self.session.rollback() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager self.force_reraise() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager six.reraise(self.type_, self.value, self.tb) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 638, in _session 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager self._end_session_transaction(self.session) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/oslo_db/sqlalchemy/enginefacade.py", line 666, in _end_session_transaction 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager session.commit() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 921, in commit 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager self.transaction.commit() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 461, in commit 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager self._prepare_impl() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 430, in _prepare_impl 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager self.session.dispatch.before_commit(self.session) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/event/attr.py", line 218, in __call__ 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager fn(*args, **kw) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 267, in load_one_to_manys 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager session.flush() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 2192, in flush 2018-07-13 23:13:27.599 3361 ERROR 
networking_sfc.services.sfc.driver_manager self._flush(objects) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 2312, in _flush 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager transaction.rollback(_capture_exception=True) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__ 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager compat.reraise(exc_type, exc_value, exc_tb) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 2276, in _flush 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager flush_context.execute() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 389, in execute 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager rec.execute(self) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/unitofwork.py", line 548, in execute 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager uow 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 181, in save_obj 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager mapper, table, insert) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/persistence.py", line 799, in _emit_insert_statements 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager execute(statement, multiparams) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 945, in execute 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager return meth(self, multiparams, params) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager return connection._execute_clauseelement(self, multiparams, params) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager compiled_sql, distilled_params 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager context) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1398, in _handle_dbapi_exception 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager util.raise_from_cause(newraise, exc_info) 2018-07-13 
23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager reraise(type(exception), exception, tb=exc_tb, cause=cause) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager context) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 470, in do_execute 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager cursor.execute(statement, parameters) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/pymysql/cursors.py", line 166, in execute 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager result = self._query(query) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/pymysql/cursors.py", line 322, in _query 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager conn.query(q) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 856, in query 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager self._affected_rows = self._read_query_result(unbuffered=unbuffered) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1057, in _read_query_result 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager result.read() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1340, in read 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager first_packet = self.connection._read_packet() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 1014, in _read_packet 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager packet.check_error() 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/pymysql/connections.py", line 393, in check_error 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager err.raise_mysql_exception(self._data) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager File "/usr/lib/python2.7/dist-packages/pymysql/err.py", line 107, in raise_mysql_exception 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager raise errorclass(errno, errval) 2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager DBError: (pymysql.err.IntegrityError) (1048, u"Column 'host_id' cannot be null") [SQL: u'INSERT INTO sfc_portpair_details (project_id, id, ingress, egress, host_id, in_mac_address, mac_address, network_type, segment_id, local_endpoint, correlation) VALUES (%(project_id)s, %(id)s, %(ingress)s, %(egress)s, %(host_id)s, %(in_mac_address)s, %(mac_address)s, %(network_type)s, %(segment_id)s, 
%(local_endpoint)s, %(correlation)s)'] [parameters: {'ingress': u'9a1a302e-c862-4683-80d0-e91d08b063d8', 'segment_id': None, 'correlation': None, 'id': '7e6ad744-fd8d-4a58-9606-cfe8d00587c1', 'local_endpoint': None, 'in_mac_address': None, 'egress': u'0fe15cd5-9a92-4d11-bf4e-206cf4b8deb5', 'mac_address': None, 'host_id': None, 'project_id': u'3a1091e3257a42398eda7b366cec3e7a', 'network_type': None}]
2018-07-13 23:13:27.599 3361 ERROR networking_sfc.services.sfc.driver_manager
2018-07-13 23:13:27.603 3361 ERROR networking_sfc.services.sfc.driver_manager [req-e059eb52-c097-4e8f-8444-c3d1d280e47c c51f741dd6c34637a8b66bade75ed570 3a1091e3257a42398eda7b366cec3e7a - default default] SFC driver 'ovs' failed in create_port_pair_postcommit: DBError: (pymysql.err.IntegrityError) (1048, u"Column 'host_id' cannot be null") [SQL: u'INSERT INTO sfc_portpair_details (project_id, id, ingress, egress, host_id, in_mac_address, mac_address, network_type, segment_id, local_endpoint, correlation) VALUES (%(project_id)s, %(id)s, %(ingress)s, %(egress)s, %(host_id)s, %(in_mac_address)s, %(mac_address)s, %(network_type)s, %(segment_id)s, %(local_endpoint)s, %(correlation)s)'] [parameters: {'ingress': u'9a1a302e-c862-4683-80d0-e91d08b063d8', 'segment_id': None, 'correlation': None, 'id': '7e6ad744-fd8d-4a58-9606-cfe8d00587c1', 'local_endpoint': None, 'in_mac_address': None, 'egress': u'0fe15cd5-9a92-4d11-bf4e-206cf4b8deb5', 'mac_address': None, 'host_id': None, 'project_id': u'3a1091e3257a42398eda7b366cec3e7a', 'network_type': None}]
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin [req-e059eb52-c097-4e8f-8444-c3d1d280e47c c51f741dd6c34637a8b66bade75ed570 3a1091e3257a42398eda7b366cec3e7a - default default] create_port_pair_postcommit failed.: SfcDriverError: create_port_pair_postcommit failed.
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin Traceback (most recent call last):
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin   File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/plugin.py", line 118, in create_port_pair
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin     self.driver_manager.create_port_pair_postcommit(portpair_context)
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin   File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/driver_manager.py", line 141, in create_port_pair_postcommit
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin     self._call_drivers("create_port_pair_postcommit", context)
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin   File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/driver_manager.py", line 112, in _call_drivers
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin     method=method_name
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin SfcDriverError: create_port_pair_postcommit failed.
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin
2018-07-13 23:13:27.604 3361 ERROR networking_sfc.services.sfc.plugin [req-e059eb52-c097-4e8f-8444-c3d1d280e47c c51f741dd6c34637a8b66bade75ed570 3a1091e3257a42398eda7b366cec3e7a - default default] Create port pair failed, deleting port_pair '4392a1c0-5e36-4c2f-a477-03767e8140dd': SfcDriverError: create_port_pair_postcommit failed.
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource [req-e059eb52-c097-4e8f-8444-c3d1d280e47c c51f741dd6c34637a8b66bade75ed570 3a1091e3257a42398eda7b366cec3e7a - default default] create failed: No details.: SfcDriverError: create_port_pair_postcommit failed.
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource Traceback (most recent call last):
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/neutron/api/v2/resource.py", line 98, in resource
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     result = method(request=request, **args)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/neutron/api/v2/base.py", line 435, in create
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     return self._create(request, body, **kwargs)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 93, in wrapped
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     setattr(e, '_RETRY_EXCEEDED', True)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     self.force_reraise()
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     six.reraise(self.type_, self.value, self.tb)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 89, in wrapped
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     return f(*args, **kwargs)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 150, in wrapper
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     ectxt.value = e.inner_exc
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     self.force_reraise()
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     six.reraise(self.type_, self.value, self.tb)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 138, in wrapper
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     return f(*args, **kwargs)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 128, in wrapped
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     LOG.debug("Retry wrapper got retriable exception: %s", e)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     self.force_reraise()
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     six.reraise(self.type_, self.value, self.tb)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 124, in wrapped
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     return f(*dup_args, **dup_kwargs)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/neutron/api/v2/base.py", line 548, in _create
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     obj = do_create(body)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/neutron/api/v2/base.py", line 530, in do_create
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     request.context, reservation.reservation_id)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     self.force_reraise()
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     six.reraise(self.type_, self.value, self.tb)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/neutron/api/v2/base.py", line 523, in do_create
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     return obj_creator(request.context, **kwargs)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_log/helpers.py", line 67, in wrapper
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     return method(*args, **kwargs)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/plugin.py", line 125, in create_port_pair
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     self.delete_port_pair(context, portpair_db['id'])
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     self.force_reraise()
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     six.reraise(self.type_, self.value, self.tb)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/plugin.py", line 118, in create_port_pair
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     self.driver_manager.create_port_pair_postcommit(portpair_context)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/driver_manager.py", line 141, in create_port_pair_postcommit
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     self._call_drivers("create_port_pair_postcommit", context)
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource   File "/usr/local/lib/python2.7/dist-packages/networking_sfc/services/sfc/driver_manager.py", line 112, in _call_drivers
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource     method=method_name
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource SfcDriverError: create_port_pair_postcommit failed.
2018-07-13 23:13:27.747 3361 ERROR neutron.api.v2.resource
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From abogott at wikimedia.org  Sun Jul 15 17:55:55 2018
From: abogott at wikimedia.org (Andrew Bogott)
Date: Sun, 15 Jul 2018 12:55:55 -0500
Subject: [Openstack] [Designate] designate-sink for multiple regions?
Message-ID: 

    During our migration from nova-network to neutron, we'll be running
two nova regions in parallel in our cloud.  I have designate-sink working
just fine in our existing (nova-network) region, but since sink is only
listening to the rabbit queue of that region it's oblivious to events
that happen in the Neutron region.

    My tentative plan is to run a second designate instance in the second
region, but point it to the same database as designate in the nova-network
region so that the db (and associated mdns services) contain the aggregate
knowledge from both regions.  At first blush that sounds terrible and
prone to a million race conditions, but it's not /that/ different from
what the HA designate guide suggests (apart from using different queues,
which may or may not be a deal breaker.)

    Am I on the road to deadlocks and database corruption?  Is there a
more straightforward solution to this problem that I'm missing?

    I'm running version Mitaka -- I've investigated using the direct
Neutron integration workflow in the new region instead, but it appears to
be broken in Mitaka as per https://bugs.launchpad.net/neutron/+bug/1616274.
I'd be happy to be wrong about that!

Thanks!

-Andrew

From torin.woltjer at granddial.com  Mon Jul 16 12:41:16 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Mon, 16 Jul 2018 12:41:16 GMT
Subject: [Openstack] [Openstack-operators] Recovering from full outage
Message-ID: <1361da1cb6954d29955d92d0b0f3ddae@granddial.com>

$ip netns exec qdhcp-87a5200d-057f-475d-953d-17e873a47454 curl http://169.254.169.254
404 Not Found
The resource could not be found.
$ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e curl http://169.254.169.254 curl: (7) Couldn't connect to server Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/12/18 11:16 AM To: , , "jpetrini at coredial.com" Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] [Openstack-operators] Recovering from full outage Checking iptables for the metadata-proxy inside of qrouter provides the following: $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e iptables-save -c | grep 169 [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697 [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff Packets:Bytes are both 0, so no traffic is touching this rule? Interestingly the command: $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e netstat -anep | grep 9697 returns nothing, so there isn't actually anything running on 9697 in the network namespace... This is the output without grep: Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name raw 0 0 0.0.0.0:112 0.0.0.0:* 7 0 76154 8404/keepalived raw 0 0 0.0.0.0:112 0.0.0.0:* 7 0 76153 8404/keepalived Active UNIX domain sockets (servers and established) Proto RefCnt Flags Type State I-Node PID/Program name Path unix 2 [ ] DGRAM 64501 7567/python2 unix 2 [ ] DGRAM 79953 8403/keepalived Could the reason no traffic touching the rule be that nothing is listening on that port, or is there a second issue down the chain? Curl fails even after restarting the neutron-dhcp-agent & neutron-metadata agent. Thank you for this, and any future help. -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.vamsikrishna at ericsson.com Mon Jul 16 12:44:28 2018 From: a.vamsikrishna at ericsson.com (A Vamsikrishna) Date: Mon, 16 Jul 2018 12:44:28 +0000 Subject: [Openstack] [networking-odl] Builds are failing in Stable/pike in networking-odl Message-ID: Hi All, Builds are failing in Stable/pike in networking-odl on below review: https://review.openstack.org/#/c/582745/ looks that issue is here: http://logs.openstack.org/45/582745/5/check/networking-odl-rally-dsvm-carbon-snapshot/be4abe3/logs/devstacklog.txt.gz#_2018-07-15_18_23_41_854 There is 404 from opendaylight.org service and snapshot version is missing & only /-SNAPSHOT/maven-metadata.xml, it should be 0.8.3-SNAPSHOT or 0.9.0-SNAPSHOT This job is making use of carbon based ODL version & not able to find it. Any idea how to fix / proceed further to make stable/pike builds to be successful ? Thanks, Vamsi -------------- next part -------------- An HTML attachment was scrubbed... URL: From toni.mueller at oeko.net Mon Jul 16 14:25:25 2018 From: toni.mueller at oeko.net (Toni Mueller) Date: Mon, 16 Jul 2018 15:25:25 +0100 Subject: [Openstack] NUMA some of the time? In-Reply-To: References: <20180704150834.4szwgh6cjrs4gq6m@bla.tonimueller.org> Message-ID: <20180716142525.x6z7mcxl7figb764@bla.tonimueller.org> Hi Fabrizio! thank you for your answer! 
On Thu, Jul 05, 2018 at 05:35:00PM +0200, Fabrizio Soppelsa wrote: > and CPU-pinning by creating dedicated flavors [1] with something like: > > openstack flavor set m1.largenuma --property hw:numa_cpus.0=0,1 --property > hw:numa_mem.0=2048 I had suggested this already, but this approach was rejected by the manager because they want to overbook more. But I found something with "cpu_weight", which seems to accomplish a similar task without having to pin the CPU, and thus make it unusable for everyone else. Kind regards, Toni From toni.mueller at oeko.net Mon Jul 16 14:30:41 2018 From: toni.mueller at oeko.net (Toni Mueller) Date: Mon, 16 Jul 2018 15:30:41 +0100 Subject: [Openstack] NUMA some of the time? In-Reply-To: <2c04c635-443b-9ee8-00cc-8a7a669b4b18@gmail.com> References: <20180704150834.4szwgh6cjrs4gq6m@bla.tonimueller.org> <2c04c635-443b-9ee8-00cc-8a7a669b4b18@gmail.com> Message-ID: <20180716143041.sirupsip7iyedos4@bla.tonimueller.org> Hi Jay, On Fri, Jul 06, 2018 at 12:46:04PM -0400, Jay Pipes wrote: > There is no current way to say "On this dual-Xeon compute node, put all > workloads that don't care about dedicated CPUs on this socket and all > workloads that DO care about dedicated CPUs on the other socket.". it turned out that this is not what I should want to say. What I should say instead is: "Run all VMs on all cores, but if certain VMs suddenly spike, give them all they ask for at the expense of everyone else, and also avoid moving them around between cores, if possible." The idea is that these high priority VMs are (probably) idle most of the time, but at other times need high performance. It was thus deemed to be a huge waste to reserve cores for them. > https://review.openstack.org/#/c/555081/ Thank you for the pointer! Thanks, Toni From a.vamsikrishna at ericsson.com Mon Jul 16 15:04:00 2018 From: a.vamsikrishna at ericsson.com (A Vamsikrishna) Date: Mon, 16 Jul 2018 15:04:00 +0000 Subject: [Openstack] [networking-odl] Builds are failing in Stable/pike in networking-odl In-Reply-To: References: Message-ID: +Isaku Hi Isaku, I found the reason for the build failure. below path it should be distribution-artifacts instead of distribution-karaf https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/integration/distribution-karaf/-SNAPSHOT/maven-metadata.xml Line no: 7 is causing the problem https://github.com/openstack/networking-odl/blob/stable/pike/devstack/functions >From logs: http://logs.openstack.org/45/582745/5/check/networking-odl-rally-dsvm-carbon-snapshot/be4abe3/logs/devstacklog.txt.gz#_2018-07-15_18_23_41_854 opt/stack/new/networking-odl/devstack/functions:_odl_nexus_path:7 : echo https://nexus.opendaylight.org/content/repositories/opendaylight.snapshot/org/opendaylight/integration/distribution-karaf I think below code needs a fix, Can you please help us out ? 
https://github.com/openstack/networking-odl/blob/stable/pike/devstack/settings.odl#L72-L81 case "$ODL_RELEASE" in latest-snapshot|nitrogen-snapshot-0.7*) # use karaf because distribution-karaf isn't available for Nitrogen at the moment # TODO(yamahata): when distriution-karaf is available, remove this ODL_URL_DISTRIBUTION_KARAF_PATH=${ODL_URL_DISTRIBUTION_KARAF_PATH:-org/opendaylight/integration/karaf} ;; *) ODL_URL_DISTRIBUTION_KARAF_PATH=${ODL_URL_DISTRIBUTION_KARAF_PATH:-org/opendaylight/integration/distribution-karaf} ;; Esac Thanks, Vamsi From: A Vamsikrishna Sent: Monday, July 16, 2018 6:14 PM To: 'openstack-dev at lists.openstack.org' ; openstack at lists.openstack.org Subject: [networking-odl] Builds are failing in Stable/pike in networking-odl Hi All, Builds are failing in Stable/pike in networking-odl on below review: https://review.openstack.org/#/c/582745/ looks that issue is here: http://logs.openstack.org/45/582745/5/check/networking-odl-rally-dsvm-carbon-snapshot/be4abe3/logs/devstacklog.txt.gz#_2018-07-15_18_23_41_854 There is 404 from opendaylight.org service and snapshot version is missing & only /-SNAPSHOT/maven-metadata.xml, it should be 0.8.3-SNAPSHOT or 0.9.0-SNAPSHOT This job is making use of carbon based ODL version & not able to find it. Any idea how to fix / proceed further to make stable/pike builds to be successful ? Thanks, Vamsi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaypipes at gmail.com Mon Jul 16 15:10:10 2018 From: jaypipes at gmail.com (Jay Pipes) Date: Mon, 16 Jul 2018 11:10:10 -0400 Subject: [Openstack] NUMA some of the time? In-Reply-To: <20180716143041.sirupsip7iyedos4@bla.tonimueller.org> References: <20180704150834.4szwgh6cjrs4gq6m@bla.tonimueller.org> <2c04c635-443b-9ee8-00cc-8a7a669b4b18@gmail.com> <20180716143041.sirupsip7iyedos4@bla.tonimueller.org> Message-ID: On 07/16/2018 10:30 AM, Toni Mueller wrote: > > Hi Jay, > > On Fri, Jul 06, 2018 at 12:46:04PM -0400, Jay Pipes wrote: >> There is no current way to say "On this dual-Xeon compute node, put all >> workloads that don't care about dedicated CPUs on this socket and all >> workloads that DO care about dedicated CPUs on the other socket.". > > it turned out that this is not what I should want to say. What I should > say instead is: > > "Run all VMs on all cores, but if certain VMs suddenly spike, give them > all they ask for at the expense of everyone else, and also avoid moving > them around between cores, if possible." > > The idea is that these high priority VMs are (probably) idle most of the > time, but at other times need high performance. It was thus deemed to be > a huge waste to reserve cores for them. You're looking for something like VMWare DRS, then: https://www.vmware.com/products/vsphere/drs-dpm.html This isn't something Nova is looking to implement. Best, -jay From haleyb.dev at gmail.com Mon Jul 16 20:38:31 2018 From: haleyb.dev at gmail.com (Brian Haley) Date: Mon, 16 Jul 2018 16:38:31 -0400 Subject: [Openstack] [Openstack-operators] Recovering from full outage In-Reply-To: <1361da1cb6954d29955d92d0b0f3ddae@granddial.com> References: <1361da1cb6954d29955d92d0b0f3ddae@granddial.com> Message-ID: On 07/16/2018 08:41 AM, Torin Woltjer wrote: > $ip netns exec qdhcp-87a5200d-057f-475d-953d-17e873a47454 curl > http://169.254.169.254 > > >  404 Not Found > > >  

> The resource could not be found.

> > Strange, don't know where the reply came from for that. > $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e curl > http://169.254.169.254 > curl: (7) Couldn't connect to server Based on your iptables output below, I would think the metadata proxy is running in the qrouter namespace. However, a curl from there will not work since it is restricted to only work for incoming packets from the qr- device(s). You would have to try curl from a running instance. Is there an haproxy process running? And is it listening on port 9697 in the qrouter namespace? -Brian > ------------------------------------------------------------------------ > *From*: "Torin Woltjer" > *Sent*: 7/12/18 11:16 AM > *To*: , , > "jpetrini at coredial.com" > *Cc*: openstack-operators at lists.openstack.org, openstack at lists.openstack.org > *Subject*: Re: [Openstack] [Openstack-operators] Recovering from full outage > Checking iptables for the metadata-proxy inside of qrouter provides the > following: > $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e > iptables-save -c | grep 169 > [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p > tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697 > [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p > tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff > Packets:Bytes are both 0, so no traffic is touching this rule? > > Interestingly the command: > $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e netstat > -anep | grep 9697 > returns nothing, so there isn't actually anything running on 9697 in the > network namespace... > > This is the output without grep: > Active Internet connections (servers and established) > Proto Recv-Q Send-Q Local Address           Foreign Address > State       User       Inode      PID/Program name > raw        0      0 0.0.0.0:112             0.0.0.0:*               7 >         0          76154      8404/keepalived > raw        0      0 0.0.0.0:112             0.0.0.0:*               7 >         0          76153      8404/keepalived > Active UNIX domain sockets (servers and established) > Proto RefCnt Flags       Type       State         I-Node   PID/Program > name     Path > unix  2      [ ]         DGRAM                    64501    7567/python2 > unix  2      [ ]         DGRAM                    79953    8403/keepalived > > Could the reason no traffic touching the rule be that nothing is > listening on that port, or is there a second issue down the chain? > > Curl fails even after restarting the neutron-dhcp-agent & > neutron-metadata agent. > > Thank you for this, and any future help. From torin.woltjer at granddial.com Mon Jul 16 20:54:28 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Mon, 16 Jul 2018 20:54:28 GMT Subject: [Openstack] [Openstack-operators] Recovering from full outage Message-ID: I feel pretty dumb about this, but it was fixed by adding a rule to my security groups. I'm still very confused about some of the other behavior that I saw, but at least the problem is fixed now. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: Brian Haley Sent: 7/16/18 4:39 PM To: torin.woltjer at granddial.com, thangam.arunx at gmail.com, jpetrini at coredial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] [Openstack-operators] Recovering from full outage On 07/16/2018 08:41 AM, Torin Woltjer wrote: > $ip netns exec qdhcp-87a5200d-057f-475d-953d-17e873a47454 curl > http://169.254.169.254 > > > 404 Not Found > > > 404 Not Found > The resource could not be found. > > Strange, don't know where the reply came from for that. > $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e curl > http://169.254.169.254 > curl: (7) Couldn't connect to server Based on your iptables output below, I would think the metadata proxy is running in the qrouter namespace. However, a curl from there will not work since it is restricted to only work for incoming packets from the qr- device(s). You would have to try curl from a running instance. Is there an haproxy process running? And is it listening on port 9697 in the qrouter namespace? -Brian > ------------------------------------------------------------------------ > *From*: "Torin Woltjer" > *Sent*: 7/12/18 11:16 AM > *To*: , , > "jpetrini at coredial.com" > *Cc*: openstack-operators at lists.openstack.org, openstack at lists.openstack.org > *Subject*: Re: [Openstack] [Openstack-operators] Recovering from full outage > Checking iptables for the metadata-proxy inside of qrouter provides the > following: > $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e > iptables-save -c | grep 169 > [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p > tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697 > [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p > tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff > Packets:Bytes are both 0, so no traffic is touching this rule? > > Interestingly the command: > $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e netstat > -anep | grep 9697 > returns nothing, so there isn't actually anything running on 9697 in the > network namespace... > > This is the output without grep: > Active Internet connections (servers and established) > Proto Recv-Q Send-Q Local Address Foreign Address > State User Inode PID/Program name > raw 0 0 0.0.0.0:112 0.0.0.0:* 7 > 0 76154 8404/keepalived > raw 0 0 0.0.0.0:112 0.0.0.0:* 7 > 0 76153 8404/keepalived > Active UNIX domain sockets (servers and established) > Proto RefCnt Flags Type State I-Node PID/Program > name Path > unix 2 [ ] DGRAM 64501 7567/python2 > unix 2 [ ] DGRAM 79953 8403/keepalived > > Could the reason no traffic touching the rule be that nothing is > listening on that port, or is there a second issue down the chain? > > Curl fails even after restarting the neutron-dhcp-agent & > neutron-metadata agent. > > Thank you for this, and any future help. -------------- next part -------------- An HTML attachment was scrubbed... URL: From manuel.sb at garvan.org.au Tue Jul 17 04:34:44 2018 From: manuel.sb at garvan.org.au (Manuel Sopena Ballesteros) Date: Tue, 17 Jul 2018 04:34:44 +0000 Subject: [Openstack] Monitor Instances KVM In-Reply-To: <8cdde50ecf6be684e5026641246b36a1@marcioprado.eti.br> References: <8cdde50ecf6be684e5026641246b36a1@marcioprado.eti.br> Message-ID: <9D8A2486E35F0941A60430473E29F15B0173A4839F@MXDB1.ad.garvan.unsw.edu.au> Hi Marcio, I used to use gnocchi and grafana. 
This may be a starting point https://julien.danjou.info/openstack-gnocchi-grafana/ thanks Manuel -----Original Message----- From: Marcio Prado [mailto:marcioprado at marcioprado.eti.br] Sent: Friday, July 13, 2018 8:34 PM To: openstack at lists.openstack.org Subject: [Openstack] Monitor Instances KVM Good Morning, How can I monitor openstack KVM instances? Does Ceilometer do this? Thank you -- Marcio Prado Analista de TI - Infraestrutura e Redes Fone: (35) 9.9821-3561 www.marcioprado.eti.br _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack NOTICE Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed. From mark.kirkwood at catalyst.net.nz Wed Jul 18 02:33:33 2018 From: mark.kirkwood at catalyst.net.nz (Mark Kirkwood) Date: Wed, 18 Jul 2018 14:33:33 +1200 Subject: [Openstack] Swift3 bucket naming conventions In-Reply-To: References: Message-ID: <45235977-47de-1477-31cf-07c48f992500@catalyst.net.nz> Hi, I've been caught by this myself - by default s3api has the parameter: dns_compliant_bucket_names = True which will forbid _ in the bucket name. Just set this to False under your [s3api] section (or the [swift3] section if it is called that in your proxy pipeline). regards Mark On 23/06/18 04:22, Clay Gerrard wrote: > Swift containers can certainly have underscores in them... almost any > character is valid. > > But I guess s3api thinks that's maybe not a valid bucket name? > > https://github.com/openstack/swift/blob/master/test/unit/common/middleware/s3api/test_utils.py#L38 > > -Clay > > > On Thu, Jun 21, 2018 at 3:27 AM, Shyam Prasad N > > wrote: > > Hi, > > On my openstack swift s3 interface, I tried to create bucket names > similar to what I have in my AWS S3. But swift3 doesn't seem to > allow bucket names containing underscore. Once I remove the > underscore and try to create the bucket, it works. Is there a way > to overcome this? > > -- > -Shyam > > _______________________________________________ > Mailing list: > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > Post to     : openstack at lists.openstack.org > > Unsubscribe : > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From satish.txt at gmail.com Wed Jul 18 03:30:57 2018 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 17 Jul 2018 23:30:57 -0400 Subject: [Openstack] openstack-ansible noVNC console issue Message-ID: I have upgrade OSA from 16.0.14 to 16.0.15 minor version and found spice-html5 has been replaced by noVNC After upgrade i found my GUI console stopped working and after set log-level debug found following errors. 
Full error output: http://paste.openstack.org/show/726154/ 2018-07-17 18:41:27.482 675 DEBUG nova.console.websocketproxy [-] 192.168.100.1: new handler Process vmsg /openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/websockify/websocket.py:875 2018-07-17 18:41:27.493 692 INFO nova.console.websocketproxy [-] 192.168.100.1 - - [17/Jul/2018 18:41:27] code 404, message No such file 2018-07-17 18:41:27.554 675 DEBUG nova.console.websocketproxy [-] 192.168.100.2: new handler Process vmsg /openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/websockify/websocket.py:875 2018-07-17 18:41:27.566 693 INFO nova.console.websocketproxy [-] 192.168.100.2 - - [17/Jul/2018 18:41:27] code 404, message No such file 2018-07-17 18:41:29.162 675 DEBUG nova.console.websocketproxy [-] 192.168.100.3: new handler Process vmsg /openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/websockify/websocket.py:875 2018-07-17 18:41:29.174 694 INFO nova.console.websocketproxy [-] 192.168.100.3 - - [17/Jul/2018 18:41:29] code 404, message No such file 2018-07-17 18:41:29.177 694 INFO nova.console.websocketproxy [-] handler exception: [Errno 104] Connection reset by peer 2018-07-17 18:41:29.178 694 DEBUG nova.console.websocketproxy [-] exception vmsg /openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/websockify/websocket.py:875 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy Traceback (most recent call last): 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy File "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/websockify/websocket.py", line 930, in top_new_client 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy client = self.do_handshake(startsock, address) 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy File "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/websockify/websocket.py", line 860, in do_handshake 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy self.RequestHandlerClass(retsock, address, self) 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy File "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/console/websocketproxy.py", line 176, in __init__ 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy websockify.ProxyRequestHandler.__init__(self, *args, **kwargs) 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy File "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/websockify/websocket.py", line 114, in __init__ 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy SimpleHTTPRequestHandler.__init__(self, req, addr, server) 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy File "/usr/lib64/python2.7/SocketServer.py", line 649, in __init__ 2018-07-17 18:41:29.178 694 ERROR nova.console.websocketproxy self.handle() From nspmangalore at gmail.com Wed Jul 18 06:46:34 2018 From: nspmangalore at gmail.com (Shyam Prasad N) Date: Wed, 18 Jul 2018 12:16:34 +0530 Subject: [Openstack] Swift3 bucket naming conventions In-Reply-To: <45235977-47de-1477-31cf-07c48f992500@catalyst.net.nz> References: <45235977-47de-1477-31cf-07c48f992500@catalyst.net.nz> Message-ID: Great. That provides me another option. Thanks, Mark. :) On Wed, Jul 18, 2018 at 8:03 AM, Mark Kirkwood < mark.kirkwood at catalyst.net.nz> wrote: > Hi, > > I've been caught by this myself - by default s3api has the parameter: > > dns_compliant_bucket_names = True > > which will forbid _ in the bucket name. 
Just set this to False under your > [s3api] section (or the [swift3] section if it is called that in your proxy > pipeline). > > regards > Mark > > On 23/06/18 04:22, Clay Gerrard wrote: > >> Swift containers can certainly have underscores in them... almost any >> character is valid. >> >> But I guess s3api thinks that's maybe not a valid bucket name? >> >> https://github.com/openstack/swift/blob/master/test/unit/com >> mon/middleware/s3api/test_utils.py#L38 >> >> -Clay >> >> >> On Thu, Jun 21, 2018 at 3:27 AM, Shyam Prasad N > > wrote: >> >> Hi, >> >> On my openstack swift s3 interface, I tried to create bucket names >> similar to what I have in my AWS S3. But swift3 doesn't seem to >> allow bucket names containing underscore. Once I remove the >> underscore and try to create the bucket, it works. Is there a way >> to overcome this? >> >> -- -Shyam >> >> _______________________________________________ >> Mailing list: >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> >> Post to : openstack at lists.openstack.org >> >> Unsubscribe : >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> >> >> >> >> >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi >> -bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi >> -bin/mailman/listinfo/openstack >> > > -- -Shyam -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.lake at surrey.ac.uk Thu Jul 19 08:05:56 2018 From: d.lake at surrey.ac.uk (d.lake at surrey.ac.uk) Date: Thu, 19 Jul 2018 08:05:56 +0000 Subject: [Openstack] Issues with Hypervisor after Devstack Message-ID: Hello I’m reinstalling a single-node devstack system and everything looks OK except the compute node never appears in he list of Hypervisors. I do a “discover-hosts” and nothing is found. Note - this is a reinstall form an unchanged local.conf - in other words, it has worked before. I am suspecting that this is something to do with name resolution but I have no idea what. I have a DNS server in /etc/resolv.conf to get the source code. But the name of my system “Openstack” does not appear in the DNS. I have declared the IP address in local.conf as 127.0.0.1 and the host name as OpenStack. I have entries in /etc/host for both OpenStack to my IP address and to 127.0.0.1 I have no idea how to go around debugging this ! As far as I can see, nslookup gives me no response on the DNS as expected. I know devstack may not be the best way to do this but it has worked in the past and I have no idea why it now fails. I just need to restore service for now. Thanks David Sent from my iPhone -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaosong1 at syswin.com Thu Jul 19 08:29:14 2018 From: gaosong1 at syswin.com (=?gb2312?B?uN/LyQ==?=) Date: Thu, 19 Jul 2018 08:29:14 +0000 Subject: [Openstack] [Horizon] Horizon responds very slowly Message-ID: <6ED5A4C0760EC04A8DFC2AF5B93AE14EB4F2ABD3@MAIL01.syswin.com> After kill one node of a cluster which consist of three nodes, I found that Horizon based on keystone with provider set to fernet respondes very slowly. Admin login will cost at least 20 senconds. And cli verbose command return show making authentication is stuck about 5 senconds. Any help will be appreciated. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eblock at nde.ag Thu Jul 19 08:47:07 2018 From: eblock at nde.ag (Eugen Block) Date: Thu, 19 Jul 2018 08:47:07 +0000 Subject: [Openstack] [Horizon] Horizon responds very slowly In-Reply-To: <6ED5A4C0760EC04A8DFC2AF5B93AE14EB4F2ABD3@MAIL01.syswin.com> Message-ID: <20180719084707.Horde.nJPNJA-tAWUutMcU6gz_cHh@webmail.nde.ag> Hi, we also had to deal with slow dashboard, in our case it was a misconfiguration of memcached [0], [1]. Check with your configuration and make sure you use oslo.cache. Hope this helps! [0] https://bugs.launchpad.net/keystone/+bug/1587777 [1] https://ask.openstack.org/en/question/102611/how-to-configure-memcache-in-openstack-ha/ Zitat von 高松 : > After kill one node of a cluster which consist of three nodes, > I found that Horizon based on keystone with provider set to fernet > respondes very slowly. > Admin login will cost at least 20 senconds. > And cli verbose command return show making authentication is stuck > about 5 senconds. > Any help will be appreciated. From gaosong_1250 at 163.com Fri Jul 20 01:18:53 2018 From: gaosong_1250 at 163.com (gao.song) Date: Fri, 20 Jul 2018 09:18:53 +0800 (CST) Subject: [Openstack] [horizon] Horizon responds very slowly Message-ID: <34584fe7.1849.164b544ac1a.Coremail.gaosong_1250@163.com> After kill one node of a cluster which consist of three nodes, I found that Horizon based on keystone with provider set to fernet respondes very slowly. Admin login will cost at least 20 senconds. And cli verbose command return show making authentication is stuck about 5 senconds. Any help will be appreciated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.lake at surrey.ac.uk Fri Jul 20 07:32:49 2018 From: d.lake at surrey.ac.uk (d.lake at surrey.ac.uk) Date: Fri, 20 Jul 2018 07:32:49 +0000 Subject: [Openstack] VM as a router with ODL/OpenStack Message-ID: Hello I’m trying to use a VM as a router in an OpenStack + ODL installation. I have the VM set up with two internal addresses - 10.10.5.21 and 10.10.6.21. They are allocated floating public addresses of 10.201.81.21 and 10.201.82.21 respectively. I am using a TREx load generator which sources from 16.0.0.0/8 and sinks to 48.0.0.0/8. I have added routes both ways on the routers between the floating and private addresses. I have read that I need to disable “port security” on the VM ports to allow IP spoofing - does this also include the router ports? Also, when I start a test session generating traffic from 16.0.0.0 -> 48.0.0.0. I see a flow in OVS which matches but has an action of “drop.” How do I overcome this? Thanks in advance David Sent from my iPhone -------------- next part -------------- An HTML attachment was scrubbed... URL: From berndbausch at gmail.com Fri Jul 20 08:45:34 2018 From: berndbausch at gmail.com (Bernd Bausch) Date: Fri, 20 Jul 2018 17:45:34 +0900 Subject: [Openstack] [gnocchi][aodh] Unable to trigger aggregate alarms Message-ID: This is on a Newton Packstack. I try to trigger alarms based on average cpu_util of a group of instances. *Problem: *The alarm perpetually remains in state "insufficient data". Ceilometer is configured to use Gnocchi and the medium archive policy (which stores data once a minute).  The intervals in pipeline.yaml are set to 60. I run two instances with high CPU usage. Both have a metadata item "metering.server_group=hicpu". The alarm uses a query "server_group==hicpu", has a granularity of 60 and evalution periods set to 1. I expect it to be in state /alarm /or /ok /after less than 2 minutes. 
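For reference, this is roughly the aodh CLI call that created the alarm (a
sketch reconstructed from the 'openstack alarm show' output below, not the
exact command line I typed; the http://localhost:1234 URLs are just dummy
notification sinks I use for testing):

  openstack alarm create --name cpuhigh-agg \
    -t gnocchi_aggregation_by_resources_threshold \
    --metric cpu_util --threshold 80 --comparison-operator gt \
    --aggregation-method sum --granularity 60 --evaluation-periods 1 \
    --resource-type instance --query '{"=": {"server_group": "hicpu"}}' \
    --alarm-action http://localhost:1234 --ok-action http://localhost:1234
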
From Gnocchi, I can retrieve measures, both of the two individual instances and of aggregate measures. *Why "insufficient data"? **How can I find out what's going on in Aodh's mind? *More info below. Thanks. Bernd Bausch My alarm: $ openstack alarm show cpuhigh-agg +---------------------------+--------------------------------------------------+ | Field                     | Value                                            | +---------------------------+--------------------------------------------------+ | aggregation_method        | sum                                              | | alarm_actions             | [u'http://localhost:1234']                       | | alarm_id                  | 6adb333a-b306-470d-b673-2c8e72c7a468             | | comparison_operator       | gt                                               | | description               | gnocchi_aggregation_by_resources_threshold alarm | |                           | rule                                             | | enabled                   | True                                             | | evaluation_periods        | 1                                                | | granularity               | 60                                               | | insufficient_data_actions | []                                               | | metric                    | cpu_util                                         | | name                      | cpuhigh-agg                                      | | ok_actions                | [u'http://localhost:1234']                       | | project_id                | 55a05c4f3908490ca2419591837575ba                 | | query                     | {"and": [{"=": {"created_by_project_id":         | |                           | "55a05c4f3908490ca2419591837575ba"}}, {"=":      | |                           | {"server_group": "hicpu"}}]}                     | | repeat_actions            | False                                            | | resource_type             | instance                                         | | severity                  | low                                              | *| state                     | insufficient data                                |* | state_timestamp           | 2018-07-19T11:05:38.098000                       | | threshold                 | 80.0                                             | | time_constraints          | []                                               | | timestamp                 | 2018-07-19T11:05:38.098000                       | | type                      | gnocchi_aggregation_by_resources_threshold       | | user_id                   | 96ce6a7200a54c79add0cc27ded03422                 | +---------------------------+--------------------------------------------------+ My instances look like this: $ openstack server show cpu-user1 +--------------------------------------+---------------------------------------+ | Field                                | Value                                 | +--------------------------------------+---------------------------------------+ ... | project_id                           | 55a05c4f3908490ca2419591837575ba      | | properties                           | *metering.server_group='hicpu'*         | | security_groups                      | [{u'name': u'default'}, {u'name':     | |                                      | u'ssh'}]                              | | status                               | ACTIVE                                | ... 
+--------------------------------------+---------------------------------------+ Gnocchi contains enough data I would think: gnocchi measures aggregation -m cpu_util --query server_group=hicpu --aggregation sum --resource-type instance +---------------------------+-------------+---------------+ | timestamp                 | granularity |         value | +---------------------------+-------------+---------------+ | 2018-07-19T09:00:00+00:00 |      3600.0 | 676.454821872 | | 2018-07-19T10:00:00+00:00 |      3600.0 | 927.148462196 | | 2018-07-19T09:46:00+00:00 |        60.0 | 79.0149064873 | | 2018-07-19T09:47:00+00:00 |        60.0 | 54.6575832468 | | 2018-07-19T09:48:00+00:00 |        60.0 | 46.0457056053 | | 2018-07-19T09:49:00+00:00 |        60.0 | 52.5139041993 | | 2018-07-19T09:50:00+00:00 |        60.0 | 42.7994058262 | | 2018-07-19T09:51:00+00:00 |        60.0 | 40.0215359957 | ... -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From d.lake at surrey.ac.uk Fri Jul 20 09:57:59 2018 From: d.lake at surrey.ac.uk (d.lake at surrey.ac.uk) Date: Fri, 20 Jul 2018 09:57:59 +0000 Subject: [Openstack] [netvirt-dev] VM as a router with ODL/OpenStack In-Reply-To: References: Message-ID: Hi Aswin From a “ovs-dpctl dump-flows” I see this: recirc_id(0),in_port(5),eth(src=a0:36:9f:f6:f9:98,dst=fa:16:3e:f1:8e:3d),eth_type(0x0800),ipv4(src=16.0.0.0/240.0.0.0,dst=48.0.0.0/255.0.0.0,frag=no), packets:1438, bytes:105356, used:0.005s, flags:S, actions:drop The src MAC address is the traffic generator. The dst is the MAC address of the floating IP. David From: Aswin Suryanarayanan [mailto:asuryana at redhat.com] Sent: 20 July 2018 10:45 To: Lake D Mr (PG/R - Elec Electronic Eng) Cc: odl netvirt dev ; openstack at lists.openstack.org; Ge C Dr (Elec Electronic Eng) Subject: Re: [netvirt-dev] VM as a router with ODL/OpenStack On Fri, Jul 20, 2018 at 1:02 PM, > wrote: Hello I’m trying to use a VM as a router in an OpenStack + ODL installation. I have the VM set up with two internal addresses - 10.10.5.21 and 10.10.6.21. They are allocated floating public addresses of 10.201.81.21 and 10.201.82.21 respectively. I am using a TREx load generator which sources from 16.0.0.0/8 and sinks to 48.0.0.0/8. I have added routes both ways on the routers between the floating and private addresses. I have read that I need to disable “port security” on the VM ports to allow IP spoofing - does this also include the router ports? Router ports have port security disabled by default , no need to do that explicitly. Also, when I start a test session generating traffic from 16.0.0.0 -> 48.0.0.0. I see a flow in OVS which matches but has an action of “drop.” Which table exactly is the packet dropped? How do I overcome this? Thanks in advance David Sent from my iPhone _______________________________________________ netvirt-dev mailing list netvirt-dev at lists.opendaylight.org https://lists.opendaylight.org/mailman/listinfo/netvirt-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From d.lake at surrey.ac.uk Fri Jul 20 11:02:01 2018 From: d.lake at surrey.ac.uk (d.lake at surrey.ac.uk) Date: Fri, 20 Jul 2018 11:02:01 +0000 Subject: [Openstack] [netvirt-dev] VM as a router with ODL/OpenStack In-Reply-To: References: Message-ID: Hi Aswin I’ve just noticed that I don’t think the packet is ever actually making it through to OVS. If I do a “ovs-dpctl dump-flows” then I see the immediate drop on ingress port 5. But if I extend that to “ovs-ofctl -O OpenFlow13 dump-flows br-int” the only entry I see is: cookie=0x8000003, duration=3823.308s, table=21, n_packets=0, n_bytes=0, priority=18,ip,metadata=0x30d40/0xfffffe,nw_dst=48.0.0.0/8 actions=group:150007 I’ve just checked the port names and “Port 5” is: name : "br-prov2-patch" ofport : 5 David From: Aswin Suryanarayanan [mailto:asuryana at redhat.com] Sent: 20 July 2018 10:45 To: Lake D Mr (PG/R - Elec Electronic Eng) Cc: odl netvirt dev ; openstack at lists.openstack.org; Ge C Dr (Elec Electronic Eng) Subject: Re: [netvirt-dev] VM as a router with ODL/OpenStack On Fri, Jul 20, 2018 at 1:02 PM, > wrote: Hello I’m trying to use a VM as a router in an OpenStack + ODL installation. I have the VM set up with two internal addresses - 10.10.5.21 and 10.10.6.21. They are allocated floating public addresses of 10.201.81.21 and 10.201.82.21 respectively. I am using a TREx load generator which sources from 16.0.0.0/8 and sinks to 48.0.0.0/8. I have added routes both ways on the routers between the floating and private addresses. I have read that I need to disable “port security” on the VM ports to allow IP spoofing - does this also include the router ports? Router ports have port security disabled by default , no need to do that explicitly. Also, when I start a test session generating traffic from 16.0.0.0 -> 48.0.0.0. I see a flow in OVS which matches but has an action of “drop.” Which table exactly is the packet dropped? How do I overcome this? Thanks in advance David Sent from my iPhone _______________________________________________ netvirt-dev mailing list netvirt-dev at lists.opendaylight.org https://lists.opendaylight.org/mailman/listinfo/netvirt-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.lake at surrey.ac.uk Fri Jul 20 11:30:41 2018 From: d.lake at surrey.ac.uk (d.lake at surrey.ac.uk) Date: Fri, 20 Jul 2018 11:30:41 +0000 Subject: [Openstack] [netvirt-dev] VM as a router with ODL/OpenStack In-Reply-To: References: Message-ID: With “ovs-ofctl -O OpenFlow13 dump-flows br-int” I don’t see ANY entries for packets to 48.0.0.0/8 or 16.0.0.0/8 Only this one entry (which I think is a static route which I have in the router between the floating network and the private network). David From: Aswin Suryanarayanan [mailto:asuryana at redhat.com] Sent: 20 July 2018 12:28 To: Lake D Mr (PG/R - Elec Electronic Eng) Cc: odl netvirt dev ; openstack at lists.openstack.org; Ge C Dr (Elec Electronic Eng) Subject: Re: [netvirt-dev] VM as a router with ODL/OpenStack On Fri, Jul 20, 2018 at 4:32 PM, > wrote: Hi Aswin I’ve just noticed that I don’t think the packet is ever actually making it through to OVS. If I do a “ovs-dpctl dump-flows” then I see the immediate drop on ingress port 5. But if I extend that to “ovs-ofctl -O OpenFlow13 dump-flows br-int” the only entry I see is: cookie=0x8000003, duration=3823.308s, table=21, n_packets=0, n_bytes=0, priority=18,ip,metadata=0x30d40/0xfffffe,nw_dst=48.0.0.0/8 actions=group:150007 Oh I think it is hard to understand the reason from this flow. 
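One way to pinpoint the table (a sketch -- the MACs are taken from your
ovs-dpctl output, the 16.x/48.x addresses are sample traffic-generator
endpoints, and in_port must be the OpenFlow port number of br-prov2-patch
on br-int, which is not necessarily the same as datapath port 5, so check
"ovs-ofctl -O OpenFlow13 show br-int" first) is to trace a synthetic
packet through the OpenFlow pipeline:

  ovs-appctl ofproto/trace br-int in_port=5,ip,dl_src=a0:36:9f:f6:f9:98,dl_dst=fa:16:3e:f1:8e:3d,nw_src=16.0.0.1,nw_dst=48.0.0.1

ofproto/trace prints each table the packet visits and the rule it matches,
so the table with the drop action shows up directly in the output.
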
Were you able to identify where the packet is dropped from “ovs-ofctl -O OpenFlow13 dump-flows br-int” ? I’ve just checked the port names and “Port 5” is: name : "br-prov2-patch" ofport : 5 David From: Aswin Suryanarayanan [mailto:asuryana at redhat.com] Sent: 20 July 2018 10:45 To: Lake D Mr (PG/R - Elec Electronic Eng) > Cc: odl netvirt dev >; openstack at lists.openstack.org; Ge C Dr (Elec Electronic Eng) > Subject: Re: [netvirt-dev] VM as a router with ODL/OpenStack On Fri, Jul 20, 2018 at 1:02 PM, > wrote: Hello I’m trying to use a VM as a router in an OpenStack + ODL installation. I have the VM set up with two internal addresses - 10.10.5.21 and 10.10.6.21. They are allocated floating public addresses of 10.201.81.21 and 10.201.82.21 respectively. I am using a TREx load generator which sources from 16.0.0.0/8 and sinks to 48.0.0.0/8. I have added routes both ways on the routers between the floating and private addresses. I have read that I need to disable “port security” on the VM ports to allow IP spoofing - does this also include the router ports? Router ports have port security disabled by default , no need to do that explicitly. Also, when I start a test session generating traffic from 16.0.0.0 -> 48.0.0.0. I see a flow in OVS which matches but has an action of “drop.” Which table exactly is the packet dropped? How do I overcome this? Thanks in advance David Sent from my iPhone _______________________________________________ netvirt-dev mailing list netvirt-dev at lists.opendaylight.org https://lists.opendaylight.org/mailman/listinfo/netvirt-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.lake at surrey.ac.uk Fri Jul 20 13:50:04 2018 From: d.lake at surrey.ac.uk (d.lake at surrey.ac.uk) Date: Fri, 20 Jul 2018 13:50:04 +0000 Subject: [Openstack] [netvirt-dev] VM as a router with ODL/OpenStack In-Reply-To: <916d48a523ea183cb9f735e9aa4df60965ed7f5d.camel@suse.de> References: <916d48a523ea183cb9f735e9aa4df60965ed7f5d.camel@suse.de> Message-ID: Hi Jaime Thank you - I will try this and see how it works. David -----Original Message----- From: Jaime Caamaño Ruiz [mailto:jcaamano at suse.de] Sent: 20 July 2018 14:23 To: Lake D Mr (PG/R - Elec Electronic Eng) ; netvirt-dev at lists.opendaylight.org; openstack at lists.openstack.org Cc: Ge C Dr (Elec Electronic Eng) Subject: Re: [netvirt-dev] VM as a router with ODL/OpenStack Hello David On the router VM, you would probably dedicate a port for management with a floating ip assigned. The you would have extra ports for as many nets the router is servicing, two in the case of trex simple setup. These ports would have port security disabled: openstack port set --no-security-group --disable-port-security If running trex in the cloud vm, more less the same. Have one port for management. Then two other ports for trex traffic. On these ports, add allowed address pairs for 16.0.0.0/8 and 48.0.0.0/8 respectively openstack port set --allowed-address ip-address=16.0.0.0/8 If you have any routers in the middle, add static routes. Not actually tried with ODL, but this works with neutron ovs driver. BR Jaime. -----Original Message----- From: d.lake at surrey.ac.uk To: netvirt-dev at lists.opendaylight.org, openstack at lists.openstack.org, jcaamano at suse.de Cc: c.ge at surrey.ac.uk Subject: [netvirt-dev] VM as a router with ODL/OpenStack Date: Fri, 20 Jul 2018 07:32:49 +0000 Hello I’m trying to use a VM as a router in an OpenStack + ODL installation. 
I have the VM set up with two internal addresses - 10.10.5.21 and 10.10.6.21. They are allocated floating public addresses of 10.201.81.21 and 10.201.82.21 respectively. I am using a TREx load generator which sources from 16.0.0.0/8 and sinks to 48.0.0.0/8. I have added routes both ways on the routers between the floating and private addresses. I have read that I need to disable “port security” on the VM ports to allow IP spoofing - does this also include the router ports? Also, when I start a test session generating traffic from 16.0.0.0 -> 48.0.0.0. I see a flow in OVS which matches but has an action of “drop.” How do I overcome this? Thanks in advance David Sent from my iPhone _______________________________________________ netvirt-dev mailing list netvirt-dev at lists.opendaylight.org https://lists.opendaylight.org/mailman/listinfo/netvirt-dev From phuoc.hc at dcn.ssu.ac.kr Fri Jul 20 16:24:38 2018 From: phuoc.hc at dcn.ssu.ac.kr (Cong Phuoc Hoang) Date: Sat, 21 Jul 2018 01:24:38 +0900 Subject: [Openstack] [gnocchi][aodh] Unable to trigger aggregate alarms In-Reply-To: References: Message-ID: Last time I tried and it worked. But now I meet the same issue with Ceilometer master version. On Fri, Jul 20, 2018 at 5:54 PM Bernd Bausch wrote: > This is on a Newton Packstack. > > I try to trigger alarms based on average cpu_util of a group of instances. *Problem: > *The alarm perpetually remains in state "insufficient data". > > Ceilometer is configured to use Gnocchi and the medium archive policy > (which stores data once a minute). The intervals in pipeline.yaml are set > to 60. > > I run two instances with high CPU usage. Both have a metadata item > "metering.server_group=hicpu". The alarm uses a query > "server_group==hicpu", has a granularity of 60 and evalution periods set to > 1. I expect it to be in state *alarm *or *ok *after less than 2 minutes. > > From Gnocchi, I can retrieve measures, both of the two individual > instances and of aggregate measures. > > *Why "insufficient data"? **How can I find out what's going on in Aodh's > mind? *More info below. Thanks. 
> > Bernd Bausch > > My alarm: > > $ openstack alarm show cpuhigh-agg > > +---------------------------+--------------------------------------------------+ > | Field | > Value | > > +---------------------------+--------------------------------------------------+ > | aggregation_method | > sum | > | alarm_actions | [u'http://localhost:1234'] > | > | alarm_id | > 6adb333a-b306-470d-b673-2c8e72c7a468 | > | comparison_operator | > gt | > | description | gnocchi_aggregation_by_resources_threshold > alarm | > | | > rule | > | enabled | > True | > | evaluation_periods | > 1 | > | granularity | > 60 | > | insufficient_data_actions | > [] | > | metric | > cpu_util | > | name | > cpuhigh-agg | > | ok_actions | [u'http://localhost:1234'] > | > | project_id | > 55a05c4f3908490ca2419591837575ba | > | query | {"and": [{"=": > {"created_by_project_id": | > | | "55a05c4f3908490ca2419591837575ba"}}, > {"=": | > | | {"server_group": > "hicpu"}}]} | > | repeat_actions | > False | > | resource_type | > instance | > | severity | > low | > *| state | insufficient > data |* > | state_timestamp | > 2018-07-19T11:05:38.098000 | > | threshold | > 80.0 | > | time_constraints | > [] | > | timestamp | > 2018-07-19T11:05:38.098000 | > | type | > gnocchi_aggregation_by_resources_threshold | > | user_id | > 96ce6a7200a54c79add0cc27ded03422 | > > +---------------------------+--------------------------------------------------+ > > My instances look like this: > > $ openstack server show cpu-user1 > > +--------------------------------------+---------------------------------------+ > | Field | > Value | > > +--------------------------------------+---------------------------------------+ > ... > | project_id | > 55a05c4f3908490ca2419591837575ba | > | properties | *metering.server_group='hicpu'* > | > | security_groups | [{u'name': u'default'}, > {u'name': | > | | > u'ssh'}] | > | status | > ACTIVE | > ... > > +--------------------------------------+---------------------------------------+ > > Gnocchi contains enough data I would think: > > gnocchi measures aggregation -m cpu_util --query server_group=hicpu > --aggregation sum --resource-type instance > +---------------------------+-------------+---------------+ > | timestamp | granularity | value | > +---------------------------+-------------+---------------+ > | 2018-07-19T09:00:00+00:00 | 3600.0 | 676.454821872 | > | 2018-07-19T10:00:00+00:00 | 3600.0 | 927.148462196 | > | 2018-07-19T09:46:00+00:00 | 60.0 | 79.0149064873 | > | 2018-07-19T09:47:00+00:00 | 60.0 | 54.6575832468 | > | 2018-07-19T09:48:00+00:00 | 60.0 | 46.0457056053 | > | 2018-07-19T09:49:00+00:00 | 60.0 | 52.5139041993 | > | 2018-07-19T09:50:00+00:00 | 60.0 | 42.7994058262 | > | 2018-07-19T09:51:00+00:00 | 60.0 | 40.0215359957 | > ... > > > _______________________________________________ > Mailing list: > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skinnyh92 at gmail.com Fri Jul 20 18:11:19 2018 From: skinnyh92 at gmail.com (Hang Yang) Date: Fri, 20 Jul 2018 11:11:19 -0700 Subject: [Openstack] [Senlin] Admin is not able to delete other users' clusters Message-ID: Hi there, I'm using Senlin in stable/queens and find admin user is able to list all users' cluster but not able to delete other users's clusters/profiles. 
The debug log shows it gets ResouceNotFound error Wondering if that is an expected behavior? How should admin user manage all the Senlin clusters? Any help is appreciated. hangyang at ows-api1-qe1[ows_qe]:~$ openstack cluster list --global-project +----------+-----------------------+--------+----------------------+----------------------+------------+ | id | name | status | created_at | updated_at | project_id | +----------+-----------------------+--------+----------------------+----------------------+------------+ | a2294060 | my_test_cluster_YB61d | ACTIVE | 2018-06-12T21:49:19Z | 2018-06-12T21:49:19Z | 152690aa | hangyang at ows-api1-qe1[ows_qe]:~$ openstack cluster delete a2294060 --debug ... RESP BODY: {"code": 404, "error": {"code": 404, "message": "The cluster 'a2294060' could not be found.", "type": "ResourceNotFound"}, "explanation": "The resource could not be found.", "title": "Not Found"} a2294060: failed due to 'Unable to delete Cluster for a2294060' ... Senlin policy.json { "context_is_admin": "role:admin", "deny_everybody": "!", "build_info:build_info": "", "profile_types:index": "", "profile_types:get": "", "profile_types:ops": "", "policy_types:index": "", "policy_types:get": "", "clusters:index": "", "clusters:create": "", "clusters:delete": "", "clusters:get": "", "clusters:action": "", "clusters:update": "", "clusters:collect": "", "clusters:operation": "", ... Regards, Hang -------------- next part -------------- An HTML attachment was scrubbed... URL: From berndbausch at gmail.com Sat Jul 21 04:39:09 2018 From: berndbausch at gmail.com (Bernd Bausch) Date: Sat, 21 Jul 2018 13:39:09 +0900 Subject: [Openstack] [gnocchi][aodh] Unable to trigger aggregate alarms In-Reply-To: References: Message-ID: Additional info: I found /var/log/aodh/evaluator.log. In there, each time Aodh evaluates alarm conditions, it issues this message: pruned statistics to 0 This occurs in .../aodh/evaluator/gnocchi.py. I don't understand the logic of the code, in particular why I end up with 0 statistics, but my guess is that "insufficient data" is caused by this. At least, I have the confirmation that Aodh uses Gnocchi to get mesaures. I tried the autoscaling example in Red Hat's documentation https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/manual_installation_procedures/sect-ceilometer-gnocchi-backend. Same result: The alarm remains at "insufficient data". Assuming that the documented code works, I guess something is wrong with my configuration. But what? Bernd Bausch On 7/20/2018 5:45 PM, Bernd Bausch wrote: > > This is on a Newton Packstack. > > I try to trigger alarms based on average cpu_util of a group of > instances. *Problem: *The alarm perpetually remains in state > "insufficient data". > > Ceilometer is configured to use Gnocchi and the medium archive policy > (which stores data once a minute).  The intervals in pipeline.yaml are > set to 60. > > I run two instances with high CPU usage. Both have a metadata item > "metering.server_group=hicpu". The alarm uses a query > "server_group==hicpu", has a granularity of 60 and evalution periods > set to 1. I expect it to be in state /alarm /or /ok /after less than 2 > minutes. > > From Gnocchi, I can retrieve measures, both of the two individual > instances and of aggregate measures. > > *Why "insufficient data"? **How can I find out what's going on in > Aodh's mind? *More info below. Thanks. 
From berndbausch at gmail.com  Sat Jul 21 04:39:09 2018
From: berndbausch at gmail.com (Bernd Bausch)
Date: Sat, 21 Jul 2018 13:39:09 +0900
Subject: [Openstack] [gnocchi][aodh] Unable to trigger aggregate alarms
In-Reply-To: 
References: 
Message-ID: 

Additional info: I found /var/log/aodh/evaluator.log. In there, each time
Aodh evaluates alarm conditions, it issues this message:

    pruned statistics to 0

This occurs in .../aodh/evaluator/gnocchi.py. I don't understand the logic
of the code, in particular why I end up with 0 statistics, but my guess is
that "insufficient data" is caused by this. At least, I have confirmation
that Aodh uses Gnocchi to get measures.

I tried the autoscaling example in Red Hat's documentation:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/manual_installation_procedures/sect-ceilometer-gnocchi-backend
Same result: the alarm remains at "insufficient data". Assuming that the
documented code works, I guess something is wrong with my configuration.
But what?

Bernd Bausch

On 7/20/2018 5:45 PM, Bernd Bausch wrote:
> This is on a Newton Packstack.
>
> I try to trigger alarms based on average cpu_util of a group of
> instances. Problem: the alarm perpetually remains in state
> "insufficient data".
>
> Ceilometer is configured to use Gnocchi and the medium archive policy
> (which stores data once a minute). The intervals in pipeline.yaml are
> set to 60.
>
> I run two instances with high CPU usage. Both have a metadata item
> "metering.server_group=hicpu". The alarm uses a query
> "server_group==hicpu", has a granularity of 60 and evaluation periods
> set to 1. I expect it to be in state alarm or ok after less than 2
> minutes.
>
> From Gnocchi, I can retrieve measures, both of the two individual
> instances and of aggregate measures.
>
> Why "insufficient data"? How can I find out what's going on in Aodh's
> mind? More info below. Thanks.
>
> [...]
From fazy at niif.hu  Mon Jul 23 09:28:50 2018
From: fazy at niif.hu (Erdősi Péter)
Date: Mon, 23 Jul 2018 11:28:50 +0200
Subject: [Openstack] ask.openstack.org down?!
Message-ID: 

Hi,

I got connection refused when trying to open ask.openstack.org.
I've tried it from HBONE (AS 1955) and UPC (AS 6830) networks with no luck.

Can someone investigate or escalate this to the right place?

Thanks:
 Peter ERDOSI

From berndbausch at gmail.com  Mon Jul 23 09:47:16 2018
From: berndbausch at gmail.com (Bernd Bausch)
Date: Mon, 23 Jul 2018 18:47:16 +0900
Subject: [Openstack] [gnocchi][aodh] Unable to trigger aggregate alarms
In-Reply-To: 
References: 
Message-ID: <3412430c-f317-12a1-6db4-f31ac481f16c@gmail.com>

My problem is related to
https://docs.openstack.org/releasenotes/aodh/newton.html#bug-fixes.

The default value of Aodh's config variable gnocchi_external_project_owner
is "service", but in Newton-based Packstack the service project is named
"services" - plural. The fix: adding gnocchi_external_project_owner =
services to aodh.conf.

Bernd Bausch

On 7/20/2018 5:45 PM, Bernd Bausch wrote:
> This is on a Newton Packstack.
>
> I try to trigger alarms based on average cpu_util of a group of
> instances. Problem: the alarm perpetually remains in state
> "insufficient data".
> [...]
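[For readers hitting the same symptom, the fix above boils down to one line
of configuration. A minimal sketch - the [DEFAULT] placement and the RDO
service name are assumptions based on a Newton/Packstack layout, so adjust
to your installer:

    # /etc/aodh/aodh.conf
    [DEFAULT]
    # Ceilometer creates the Gnocchi resources as the "services" project on
    # Packstack; Aodh's default of "service" (singular) makes the evaluator
    # filter on a project that owns nothing, so every statistic gets pruned
    # to 0 and the alarm stays at "insufficient data"
    gnocchi_external_project_owner = services

followed by a restart of the evaluator, e.g.
systemctl restart openstack-aodh-evaluator on RDO-style systems.]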
From fungi at yuggoth.org  Mon Jul 23 13:15:26 2018
From: fungi at yuggoth.org (Jeremy Stanley)
Date: Mon, 23 Jul 2018 13:15:26 +0000
Subject: [Openstack] ask.openstack.org down?!
In-Reply-To: 
References: 
Message-ID: <20180723131526.omqi6vfrvhwfwfyf@yuggoth.org>

On 2018-07-23 11:28:50 +0200 (+0200), Erdősi Péter wrote:
> I got connection refused when trying to open ask.openstack.org.
> I've tried it from HBONE (AS 1955) and UPC (AS 6830) networks with
> no luck.

It wasn't a routing issue. The server's Web service was actually not
running. Something seems to have occurred (perhaps it hit a race
condition bug) around the time of the server's scheduled jobs (log
rotation, et cetera) which raised a Django WSGI exception and killed the
parent Apache process. Starting Apache again has restored the site to
working order. Thanks for reporting!

> Can someone investigate or escalate this to the right place?

A more appropriate list would have been
openstack-infra at lists.openstack.org (Cc'd on my reply).
-- 
Jeremy Stanley

From fazy at niif.hu  Mon Jul 23 13:28:56 2018
From: fazy at niif.hu (Erdősi Péter)
Date: Mon, 23 Jul 2018 15:28:56 +0200
Subject: [Openstack] ask.openstack.org down?!
In-Reply-To: <20180723131526.omqi6vfrvhwfwfyf@yuggoth.org>
References: <20180723131526.omqi6vfrvhwfwfyf@yuggoth.org>
Message-ID: <8dfc06f8-a9fb-6f95-2b39-081c5b8ec1c4@niif.hu>

On 2018. 07. 23. 15:15, Jeremy Stanley wrote:
> It wasn't a routing issue. The server's Web service was actually not
> running. [...] Starting Apache again has restored the site to working
> order. Thanks for reporting!

Thank you for the fast repair ;)

Regards:
 Peter ERDOSI

From satish.txt at gmail.com  Mon Jul 23 13:29:04 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Mon, 23 Jul 2018 09:29:04 -0400
Subject: [Openstack] ask.openstack.org down?!
In-Reply-To: 
References: 
Message-ID: 

It's up now.

On Mon, Jul 23, 2018 at 5:28 AM, Erdősi Péter wrote:
> Hi,
>
> I got connection refused when trying to open ask.openstack.org.
> [...]

From fazy at niif.hu  Mon Jul 23 15:51:49 2018
From: fazy at niif.hu (Erdősi Péter)
Date: Mon, 23 Jul 2018 17:51:49 +0200
Subject: [Openstack] Converting from flat to vlan network type
Message-ID: 

Hi there!

We want to change a few things in our production OpenStack setup around
networking. (It's Mitaka now, but we are planning the upgrade soon.)

A few notes on how it looks now:
 - We use OVS
 - Have 2 network nodes (l3 agent, dhcp agent, etc)
 - Have multiple compute nodes
 - Have independent controllers (with neutron-api)

You can choose from two network types - please ignore the naming for now:
 - "Flat"
 - "Smart"

The "Smart" dataflow looks like this: backbone -> network nodes (where the
kernel removes the VLAN tag and adds the untagged traffic to br-smart) ->
OVS/qrouter -> vxlan tunnel -> compute node OVS -> VM

The "Flat" dataflow looks like this: backbone -> compute node (same as the
network node; we have bondX.100, bondX.101 etc. interfaces) -> OVS bridge
(br-flat) -> OVS -> VM

(The network nodes also have interfaces in our "flat" networks, since they
do dhcp/metadata, but the gateway is two HW routers with VRRP, not a
dynamic qrouter. Of course, there are firewalls before the VM, but that's
irrelevant here.)

So, our goal is to unconfigure this mass of bridges and do the VLAN
tagging/untagging in OVS. AFAIK it can work if I make a trunk port in OVS
and then choose the VLAN type with the right segmentation ID. (I'm
probably okay with the configuration itself; the next part looks harder.)

As you can imagine, we have multiple VMs in "smart" and "flat" networks
too (with smart, there are a lot of floating IPs associated; the flat
type's DHCP agent offers public IP addresses too).

My question is: am I able to change the segmentation type from flat to
vlan somehow? (Both on our "flat" and "smart" networking.) Downtime will
be OK for this, but reassigning different IP addresses to VMs is not.

Thanks:
 Peter ERDOSI
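[There is no supported API for changing an existing network's segmentation
type in Mitaka; where people have done this it has been direct database
surgery on the ML2 segment records. A very rough sketch follows - the table
and column names are assumptions for a Mitaka ML2 schema (verify them
against your own neutron database first), stop neutron-server and take a
backup before touching anything, and the physnet label and VLAN id are
placeholders:

    -- assumed Mitaka table name; newer releases renamed it to networksegments
    UPDATE ml2_network_segments
       SET network_type     = 'vlan',
           physical_network = 'physnet1',  -- placeholder: your bridge mapping label
           segmentation_id  = 100          -- placeholder: the VLAN id on the backbone
     WHERE network_id = '<network-uuid>';

Existing ports keep their IP allocations, which is the hard requirement
here; the agents need a restart afterwards so they rewire the flows. Treat
this as unsupported and rehearse it on a copy of the environment first.]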
From e0ne at e0ne.info  Mon Jul 23 18:35:09 2018
From: e0ne at e0ne.info (Ivan Kolodyazhny)
Date: Mon, 23 Jul 2018 21:35:09 +0300
Subject: [Openstack] [Horizon] Horizon responds very slowly
In-Reply-To: <20180719084707.Horde.nJPNJA-tAWUutMcU6gz_cHh@webmail.nde.ag>
References: <6ED5A4C0760EC04A8DFC2AF5B93AE14EB4F2ABD3@MAIL01.syswin.com>
	<20180719084707.Horde.nJPNJA-tAWUutMcU6gz_cHh@webmail.nde.ag>
Message-ID: 

Hi,

It could be a common issue between Horizon and Keystone. As a temporary
workaround, you can apply this [1] patch to redirect the admin user to a
different page.

[1] https://review.openstack.org/#/c/577090/

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Thu, Jul 19, 2018 at 11:47 AM, Eugen Block wrote:

> Hi,
>
> we also had to deal with a slow dashboard; in our case it was a
> misconfiguration of memcached [0], [1].
>
> Check your configuration and make sure you use oslo.cache.
>
> Hope this helps!
>
> [0] https://bugs.launchpad.net/keystone/+bug/1587777
> [1] https://ask.openstack.org/en/question/102611/how-to-configure-memcache-in-openstack-ha/
>
> Quoting 高松:
>
>> After killing one node of a cluster which consists of three nodes,
>> I found that Horizon, based on Keystone with provider set to fernet,
>> responds very slowly. Admin login takes at least 20 seconds, and a
>> verbose CLI command shows that authentication is stuck for about 5
>> seconds. Any help will be appreciated.

From berndbausch at gmail.com  Tue Jul 24 04:23:37 2018
From: berndbausch at gmail.com (Bernd Bausch)
Date: Tue, 24 Jul 2018 13:23:37 +0900
Subject: [Openstack] [gnocchi] Clarification what is a metric?
Message-ID: <0363cc7f-6aa8-c23e-95f0-d411218044db@gmail.com>

I have been under the impression that a metric associates a name and a
resource. For example, all resources of type instance have a cpu metric.
Perhaps I am wrong. Who can explain it to me or, better, point me to an
explanation?

I want to create time series of temperatures in various places. There are
already three resources of type generic named tokyo, osaka and frankfurt.

Next step: each resource needs a metric named heat. So I try:

    gnocchi metric create --resource-id tokyo --archive-policy medium --unit celsius heat
    gnocchi metric create --resource-id osaka --archive-policy medium --unit celsius heat
    gnocchi metric create --resource-id frankfurt --archive-policy medium --unit celsius heat

The first command succeeds. The two others don't:

    Named metric heat already exists (HTTP 409)

Checking:

    gnocchi metric show --resource-id frankfurt heat
    Metric heat does not exist (HTTP 404)

I also tried

    gnocchi metric create 37e04566-bcfe-52b6-81e3-371bdf71c813/heat

but that creates a metric named 37e04566-bcfe-52b6-81e3-371bdf71c813/heat
which is not associated with resource 37e04566-bcfe-52b6-81e3-371bdf71c813.

What can I do to associate all three resources with a metric named heat?

From berndbausch at gmail.com  Tue Jul 24 05:28:13 2018
From: berndbausch at gmail.com (Bernd Bausch)
Date: Tue, 24 Jul 2018 14:28:13 +0900
Subject: [Openstack] [gnocchi] Clarification what is a metric?
In-Reply-To: <0363cc7f-6aa8-c23e-95f0-d411218044db@gmail.com>
References: <0363cc7f-6aa8-c23e-95f0-d411218044db@gmail.com>
Message-ID: <05f5ecca-05f4-3874-6458-bf0802ba087c@gmail.com>

Never mind. It seems to be working now. I don't know what went wrong
earlier.

On 7/24/2018 1:23 PM, Bernd Bausch wrote:
> I have been under the impression that a metric associates a name and a
> resource.
> [...]
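[For the archives, the end-to-end flow that a setup like this aims at looks
roughly like the sketch below. The metric create lines are the ones from
the message above; the measures commands and the made-up temperature value
are from memory of the gnocchiclient syntax of that era (measure format is
timestamp@value), so verify with gnocchi help measures:

    # one "heat" metric per resource, addressed as resource id + metric name
    gnocchi metric create --resource-id tokyo --archive-policy medium --unit celsius heat
    gnocchi metric create --resource-id osaka --archive-policy medium --unit celsius heat
    gnocchi metric create --resource-id frankfurt --archive-policy medium --unit celsius heat

    # push one sample and read the series back
    gnocchi measures add --resource-id tokyo -m 2018-07-24T13:00:00@28.5 heat
    gnocchi measures show --resource-id tokyo heat

The 409 in the first attempt suggests the name collided somewhere; per-
resource metric names are meant to be unique only within a resource.]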
From amy at demarco.com  Tue Jul 24 15:05:51 2018
From: amy at demarco.com (Amy Marrich)
Date: Tue, 24 Jul 2018 10:05:51 -0500
Subject: [Openstack] UC Election - Looking for Election Officials
Message-ID: 

Hey Stackers,

We are getting ready for the Summer UC election and we need to have at
least two Election Officials. I was wondering if you would like to help us
with that process. You can find all the details of the election at
https://governance.openstack.org/uc/reference/uc-election-aug2018.html.

I do want to point out to those who are new that Election Officials are
unable to run in the election itself but can of course vote.

The election dates will be:

August 6 - August 17, 05:59 UTC: Open candidacy for UC positions
August 20 - August 24, 11:59 UTC: UC elections (voting)

Please reach out to any of the current UC members or simply reply to this
email if you can help us in this community process.

Thanks,

OpenStack User Committee
Amy, Leong, Matt, Melvin, and Saverio

From chkumar246 at gmail.com  Tue Jul 24 15:14:44 2018
From: chkumar246 at gmail.com (Chandan kumar)
Date: Tue, 24 Jul 2018 20:44:44 +0530
Subject: [Openstack] UC Election - Looking for Election Officials
In-Reply-To: 
References: 
Message-ID: 

Hello Amy,

On Tue, Jul 24, 2018 at 8:41 PM Amy Marrich wrote:
> We are getting ready for the Summer UC election and we need to have at
> least two Election Officials.
> [...]

I want to help with this. Let me know how I can proceed.

Thanks,
Chandan Kumar

From amy at demarco.com  Tue Jul 24 16:39:18 2018
From: amy at demarco.com (Amy Marrich)
Date: Tue, 24 Jul 2018 11:39:18 -0500
Subject: [Openstack] UC Election - Looking for Election Officials
In-Reply-To: 
References: 
Message-ID: 

Just wanted to say THANK you as we now have 3 officials! Please
participate in the User Committee elections as a candidate and, perhaps
most importantly, by voting!

Thanks,

Amy (spotz)

On Tue, Jul 24, 2018 at 10:05 AM, Amy Marrich wrote:
> Hey Stackers,
>
> We are getting ready for the Summer UC election and we need to have at
> least two Election Officials.
> [...]

From jalnliu at lbl.gov  Tue Jul 24 18:06:12 2018
From: jalnliu at lbl.gov (Jialin Liu)
Date: Tue, 24 Jul 2018 11:06:12 -0700
Subject: [Openstack] error in deleting objects
Message-ID: 

Hi,

When I tried to delete all objects with the command 'delete -a', it seems
some containers are not found:

Error Deleting: some.h5_9958: u"Container u'some.h5_9958' not found"
container:some.h5_9958

Does anyone know how to deal with this?

Best,
Jialin
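[One common explanation is the account listing lagging behind reality
(Swift is eventually consistent), so a bulk 'delete -a' iterates over
containers that are already gone or not yet visible, and simply retrying
usually converges. A blunt sketch, assuming the standard python-swiftclient
CLI with credentials already exported in the environment:

    # re-drive the delete container by container; rerun until `swift list`
    # prints nothing
    for c in $(swift list); do
        swift delete "$c" || echo "will retry: $c"
    done

If a container keeps reappearing, inspecting it directly with
swift stat <container> narrows down whether it really exists.]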
From amy at demarco.com  Tue Jul 24 19:24:25 2018
From: amy at demarco.com (Amy Marrich)
Date: Tue, 24 Jul 2018 14:24:25 -0500
Subject: [Openstack] UC Election - Looking for Election Officials
In-Reply-To: 
References: 
Message-ID: 

And for those curious... our officials are..... Ed Leafe, Chandan Kumar
and then Mohamed Elsakhawy

Thanks,

Amy (spotz)
(Who's claiming lack of sleep for not including the names earlier)

On Tue, Jul 24, 2018 at 11:39 AM, Amy Marrich wrote:
> Just wanted to say THANK you as we now have 3 officials!
> [...]

From gaosong_1250 at 163.com  Wed Jul 25 02:07:14 2018
From: gaosong_1250 at 163.com (gao.song)
Date: Wed, 25 Jul 2018 10:07:14 +0800 (CST)
Subject: [Openstack] [Horizon] Horizon responds very slowly
In-Reply-To: 
Message-ID: <4cd434ee.2978.164cf30baea.Coremail.gaosong_1250@163.com>

Sorry for the delay - different timezone! Thanks a lot! We'll try the
solution!

At 2018-07-24 02:35:09, "Ivan Kolodyazhny" wrote:
> It could be a common issue between Horizon and Keystone. As a temporary
> workaround, you can apply this [1] patch to redirect the admin user to
> a different page.
>
> [1] https://review.openstack.org/#/c/577090/
> [...]
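[For anyone chasing the memcached angle Eugen mentioned rather than the
patch, the usual shape of the fix is to point Keystone's token cache at a
working memcached pool via oslo.cache. A minimal sketch; the addresses are
placeholders and the exact values depend on your deployment:

    # /etc/keystone/keystone.conf -- sketch of an oslo.cache setup
    [cache]
    enabled = true
    backend = oslo_cache.memcache_pool
    # list every memcached in the cluster so a dead node is skipped quickly
    memcache_servers = 10.0.0.11:11211,10.0.0.12:11211,10.0.0.13:11211

The symptom in this thread - logins hanging for ~20 seconds after one node
of three is killed - matches a client waiting out timeouts to an
unreachable memcached, which is what the pool backend with multiple
servers is meant to soften.]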
From satish.txt at gmail.com  Wed Jul 25 05:22:44 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Wed, 25 Jul 2018 01:22:44 -0400
Subject: [Openstack] Live migration failed with ceph storage
Message-ID: 

I have OpenStack with Ceph storage set up and am trying to test live
migration, but it failed with the following error.

nova.conf

    # ceph rbd support
    live_migration_uri = "qemu+tcp://%s/system"
    live_migration_tunnelled = True

libvirtd.conf

    listen_tls = 0
    listen_tcp = 1
    unix_sock_group = "libvirt"
    unix_sock_ro_perms = "0777"
    unix_sock_rw_perms = "0770"
    auth_unix_ro = "none"
    auth_unix_rw = "none"
    auth_tcp = "none"

This is the error I am getting; I googled it but didn't find any reference:

] [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Live migration failed.: AttributeError: 'Guest' object has no attribute 'migrate_configure_max_speed'
2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Traceback (most recent call last):
2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97]   File "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/compute/manager.py", line 5580, in _do_live_migration
2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97]     block_migration, migrate_data)
2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97]   File "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6436, in live_migration
2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97]     migrate_data)
2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97]   File "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6944, in _live_migration
2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97]     guest.migrate_configure_max_speed(
2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] AttributeError: 'Guest' object has no attribute 'migrate_configure_max_speed'
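[The config above uses the qemu+tcp transport, and that leg is easy to
sanity-check on its own before digging into the traceback. A quick sketch;
the host name is a placeholder:

    # from one compute node; should print the peer's hostname, no prompt
    virsh -c qemu+tcp://<other-compute-host>/system hostname

If that hangs or fails, revisit the listen_tcp/auth_tcp settings and TCP
port 16509 reachability between compute nodes; if it works, the
AttributeError discussed in the replies below is a separate, code-level
problem.]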
-d On Tue, Jul 24, 2018 at 11:22 PM, Satish Patel wrote: > I have openstack with ceph storage setup and trying to test Live > migration but somehow it failed and showing following error > > nova.conf > > # ceph rbd support > live_migration_uri = "qemu+tcp://%s/system" > live_migration_tunnelled = True > > libvirtd.conf > > listen_tls = 0 > listen_tcp = 1 > unix_sock_group = "libvirt" > unix_sock_ro_perms = "0777" > unix_sock_rw_perms = "0770" > auth_unix_ro = "none" > auth_unix_rw = "none" > auth_tcp = "none" > > > This is the error i am getting, i google it but didn't find any reference > > > > ] [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Live migration > failed.: AttributeError: 'Guest' object has no attribute > 'migrate_configure_max_speed' > 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Traceback (most recent call > last): > 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File > "/openstack/venvs/nova-16.0.16/lib/python2.7/site- > packages/nova/compute/manager.py", > line 5580, in _do_live_migration > 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] block_migration, > migrate_data) > 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File > "/openstack/venvs/nova-16.0.16/lib/python2.7/site- > packages/nova/virt/libvirt/driver.py", > line 6436, in live_migration > 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] migrate_data) > 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File > "/openstack/venvs/nova-16.0.16/lib/python2.7/site- > packages/nova/virt/libvirt/driver.py", > line 6944, in _live_migration > 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] > guest.migrate_configure_max_speed( > 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] AttributeError: 'Guest' object > has no attribute 'migrate_configure_max_speed' > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/ > openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/ > openstack > -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Wed Jul 25 12:33:53 2018 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 25 Jul 2018 08:33:53 -0400 Subject: [Openstack] Live migration failed with ceph storage In-Reply-To: References: Message-ID: Thanks David, [root at ostack-compute-01 ~]# nova --version 9.1.2 I am using Pike 16.0.15 (My deployment tool is openstack-ansible) What are my option here? On Wed, Jul 25, 2018 at 8:19 AM, David Medberry wrote: > It's not clear what version of Nova you are running but perhaps it is badly > patched. The 16.x.x (Pike) release of Nova has no > "migrate_configure_max_speed" but as best I can tell you are running a > patched version of Nova Pike so it may be inconsistent. > > This parameter was introduced on 2017-08-24: > https://github.com/openstack/nova/commit/23446a9552b5be3b040278646149a0f481d0a005 > > That parameter showed up in Queens (not Pike) initially. 
> > -d > > On Tue, Jul 24, 2018 at 11:22 PM, Satish Patel wrote: >> >> I have openstack with ceph storage setup and trying to test Live >> migration but somehow it failed and showing following error >> >> nova.conf >> >> # ceph rbd support >> live_migration_uri = "qemu+tcp://%s/system" >> live_migration_tunnelled = True >> >> libvirtd.conf >> >> listen_tls = 0 >> listen_tcp = 1 >> unix_sock_group = "libvirt" >> unix_sock_ro_perms = "0777" >> unix_sock_rw_perms = "0770" >> auth_unix_ro = "none" >> auth_unix_rw = "none" >> auth_tcp = "none" >> >> >> This is the error i am getting, i google it but didn't find any reference >> >> >> >> ] [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Live migration >> failed.: AttributeError: 'Guest' object has no attribute >> 'migrate_configure_max_speed' >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Traceback (most recent call >> last): >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/compute/manager.py", >> line 5580, in _do_live_migration >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] block_migration, >> migrate_data) >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", >> line 6436, in live_migration >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] migrate_data) >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", >> line 6944, in _live_migration >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] >> guest.migrate_configure_max_speed( >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] AttributeError: 'Guest' object >> has no attribute 'migrate_configure_max_speed' >> >> _______________________________________________ >> Mailing list: >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > From openstack at medberry.net Wed Jul 25 13:45:15 2018 From: openstack at medberry.net (David Medberry) Date: Wed, 25 Jul 2018 07:45:15 -0600 Subject: [Openstack] Live migration failed with ceph storage In-Reply-To: References: Message-ID: I think that nova --version is the version of the client (not of nova itself). I'm looking at OSAD 16.0.15 to see what it is pulling for nova. If I see anything of interest, I'll reply. On Wed, Jul 25, 2018 at 6:33 AM, Satish Patel wrote: > Thanks David, > > [root at ostack-compute-01 ~]# nova --version > 9.1.2 > > I am using Pike 16.0.15 (My deployment tool is openstack-ansible) > > > What are my option here? > > > On Wed, Jul 25, 2018 at 8:19 AM, David Medberry > wrote: > > It's not clear what version of Nova you are running but perhaps it is > badly > > patched. The 16.x.x (Pike) release of Nova has no > > "migrate_configure_max_speed" but as best I can tell you are running a > > patched version of Nova Pike so it may be inconsistent. 
> > > > This parameter was introduced on 2017-08-24: > > https://github.com/openstack/nova/commit/23446a9552b5be3b040278646149a0 > f481d0a005 > > > > That parameter showed up in Queens (not Pike) initially. > > > > -d > > > > On Tue, Jul 24, 2018 at 11:22 PM, Satish Patel > wrote: > >> > >> I have openstack with ceph storage setup and trying to test Live > >> migration but somehow it failed and showing following error > >> > >> nova.conf > >> > >> # ceph rbd support > >> live_migration_uri = "qemu+tcp://%s/system" > >> live_migration_tunnelled = True > >> > >> libvirtd.conf > >> > >> listen_tls = 0 > >> listen_tcp = 1 > >> unix_sock_group = "libvirt" > >> unix_sock_ro_perms = "0777" > >> unix_sock_rw_perms = "0770" > >> auth_unix_ro = "none" > >> auth_unix_rw = "none" > >> auth_tcp = "none" > >> > >> > >> This is the error i am getting, i google it but didn't find any > reference > >> > >> > >> > >> ] [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Live migration > >> failed.: AttributeError: 'Guest' object has no attribute > >> 'migrate_configure_max_speed' > >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Traceback (most recent call > >> last): > >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File > >> > >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site- > packages/nova/compute/manager.py", > >> line 5580, in _do_live_migration > >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] block_migration, > >> migrate_data) > >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File > >> > >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site- > packages/nova/virt/libvirt/driver.py", > >> line 6436, in live_migration > >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] migrate_data) > >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File > >> > >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site- > packages/nova/virt/libvirt/driver.py", > >> line 6944, in _live_migration > >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] > >> guest.migrate_configure_max_speed( > >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: > >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] AttributeError: 'Guest' object > >> has no attribute 'migrate_configure_max_speed' > >> > >> _______________________________________________ > >> Mailing list: > >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >> Post to : openstack at lists.openstack.org > >> Unsubscribe : > >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From satish.txt at gmail.com Wed Jul 25 14:04:25 2018 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 25 Jul 2018 10:04:25 -0400 Subject: [Openstack] Live migration failed with ceph storage In-Reply-To: References: Message-ID: David, look like OSAD 16.0.15 using following repo, if i am not wrong - name: os_nova scm: git src: https://git.openstack.org/openstack/openstack-ansible-os_nova version: 378cf6c83f9ad23c2e0d37e9df06796fee02cc27 On Wed, Jul 25, 2018 at 9:45 AM, David Medberry wrote: > I think that nova --version is the version of the client (not of nova > itself). > > I'm looking at OSAD 16.0.15 to see what it is pulling for nova. > > If I see anything of interest, I'll reply. > > On Wed, Jul 25, 2018 at 6:33 AM, Satish Patel wrote: >> >> Thanks David, >> >> [root at ostack-compute-01 ~]# nova --version >> 9.1.2 >> >> I am using Pike 16.0.15 (My deployment tool is openstack-ansible) >> >> >> What are my option here? >> >> >> On Wed, Jul 25, 2018 at 8:19 AM, David Medberry >> wrote: >> > It's not clear what version of Nova you are running but perhaps it is >> > badly >> > patched. The 16.x.x (Pike) release of Nova has no >> > "migrate_configure_max_speed" but as best I can tell you are running a >> > patched version of Nova Pike so it may be inconsistent. >> > >> > This parameter was introduced on 2017-08-24: >> > >> > https://github.com/openstack/nova/commit/23446a9552b5be3b040278646149a0f481d0a005 >> > >> > That parameter showed up in Queens (not Pike) initially. >> > >> > -d >> > >> > On Tue, Jul 24, 2018 at 11:22 PM, Satish Patel >> > wrote: >> >> >> >> I have openstack with ceph storage setup and trying to test Live >> >> migration but somehow it failed and showing following error >> >> >> >> nova.conf >> >> >> >> # ceph rbd support >> >> live_migration_uri = "qemu+tcp://%s/system" >> >> live_migration_tunnelled = True >> >> >> >> libvirtd.conf >> >> >> >> listen_tls = 0 >> >> listen_tcp = 1 >> >> unix_sock_group = "libvirt" >> >> unix_sock_ro_perms = "0777" >> >> unix_sock_rw_perms = "0770" >> >> auth_unix_ro = "none" >> >> auth_unix_rw = "none" >> >> auth_tcp = "none" >> >> >> >> >> >> This is the error i am getting, i google it but didn't find any >> >> reference >> >> >> >> >> >> >> >> ] [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Live migration >> >> failed.: AttributeError: 'Guest' object has no attribute >> >> 'migrate_configure_max_speed' >> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Traceback (most recent call >> >> last): >> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >> >> >> >> >> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/compute/manager.py", >> >> line 5580, in _do_live_migration >> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] block_migration, >> >> migrate_data) >> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >> >> >> >> >> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", >> >> line 6436, in live_migration >> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] migrate_data) >> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >> >> >> >> >> >> 
"/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", >> >> line 6944, in _live_migration >> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] >> >> guest.migrate_configure_max_speed( >> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] AttributeError: 'Guest' object >> >> has no attribute 'migrate_configure_max_speed' >> >> >> >> _______________________________________________ >> >> Mailing list: >> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> >> Post to : openstack at lists.openstack.org >> >> Unsubscribe : >> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> > >> > > > From satish.txt at gmail.com Wed Jul 25 14:06:18 2018 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 25 Jul 2018 10:06:18 -0400 Subject: [Openstack] Live migration failed with ceph storage In-Reply-To: References: Message-ID: Oh wait i believe following. https://github.com/openstack/openstack-ansible/blob/0e03f46a2ebb0ffc6f12384f19ec1184434e7a09/playbooks/defaults/repo_packages/openstack_services.yml#L148 On Wed, Jul 25, 2018 at 10:04 AM, Satish Patel wrote: > David, > > look like OSAD 16.0.15 using following repo, if i am not wrong > > - name: os_nova > scm: git > src: https://git.openstack.org/openstack/openstack-ansible-os_nova > version: 378cf6c83f9ad23c2e0d37e9df06796fee02cc27 > > On Wed, Jul 25, 2018 at 9:45 AM, David Medberry wrote: >> I think that nova --version is the version of the client (not of nova >> itself). >> >> I'm looking at OSAD 16.0.15 to see what it is pulling for nova. >> >> If I see anything of interest, I'll reply. >> >> On Wed, Jul 25, 2018 at 6:33 AM, Satish Patel wrote: >>> >>> Thanks David, >>> >>> [root at ostack-compute-01 ~]# nova --version >>> 9.1.2 >>> >>> I am using Pike 16.0.15 (My deployment tool is openstack-ansible) >>> >>> >>> What are my option here? >>> >>> >>> On Wed, Jul 25, 2018 at 8:19 AM, David Medberry >>> wrote: >>> > It's not clear what version of Nova you are running but perhaps it is >>> > badly >>> > patched. The 16.x.x (Pike) release of Nova has no >>> > "migrate_configure_max_speed" but as best I can tell you are running a >>> > patched version of Nova Pike so it may be inconsistent. >>> > >>> > This parameter was introduced on 2017-08-24: >>> > >>> > https://github.com/openstack/nova/commit/23446a9552b5be3b040278646149a0f481d0a005 >>> > >>> > That parameter showed up in Queens (not Pike) initially. 
>>> > >>> > -d >>> > >>> > On Tue, Jul 24, 2018 at 11:22 PM, Satish Patel >>> > wrote: >>> >> >>> >> I have openstack with ceph storage setup and trying to test Live >>> >> migration but somehow it failed and showing following error >>> >> >>> >> nova.conf >>> >> >>> >> # ceph rbd support >>> >> live_migration_uri = "qemu+tcp://%s/system" >>> >> live_migration_tunnelled = True >>> >> >>> >> libvirtd.conf >>> >> >>> >> listen_tls = 0 >>> >> listen_tcp = 1 >>> >> unix_sock_group = "libvirt" >>> >> unix_sock_ro_perms = "0777" >>> >> unix_sock_rw_perms = "0770" >>> >> auth_unix_ro = "none" >>> >> auth_unix_rw = "none" >>> >> auth_tcp = "none" >>> >> >>> >> >>> >> This is the error i am getting, i google it but didn't find any >>> >> reference >>> >> >>> >> >>> >> >>> >> ] [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Live migration >>> >> failed.: AttributeError: 'Guest' object has no attribute >>> >> 'migrate_configure_max_speed' >>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Traceback (most recent call >>> >> last): >>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >>> >> >>> >> >>> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/compute/manager.py", >>> >> line 5580, in _do_live_migration >>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] block_migration, >>> >> migrate_data) >>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >>> >> >>> >> >>> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", >>> >> line 6436, in live_migration >>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] migrate_data) >>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >>> >> >>> >> >>> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", >>> >> line 6944, in _live_migration >>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] >>> >> guest.migrate_configure_max_speed( >>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] AttributeError: 'Guest' object >>> >> has no attribute 'migrate_configure_max_speed' >>> >> >>> >> _______________________________________________ >>> >> Mailing list: >>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> >> Post to : openstack at lists.openstack.org >>> >> Unsubscribe : >>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> > >>> > >> >> From satish.txt at gmail.com Wed Jul 25 14:15:02 2018 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 25 Jul 2018 10:15:02 -0400 Subject: [Openstack] Live migration failed with ceph storage In-Reply-To: References: Message-ID: David, I did this on compute node [root at ostack-compute-01 ~]# locate test_guest.py /openstack/venvs/nova-16.0.14/lib/python2.7/site-packages/nova/tests/unit/virt/libvirt/test_guest.py /openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/tests/unit/virt/libvirt/test_guest.py I didn't find option [root at ostack-compute-01 ~]# grep -i "test_migrate_configure_max_speed" 
/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/tests/unit/virt/libvirt/test_guest.py [root at ostack-compute-01 ~]# On Wed, Jul 25, 2018 at 10:06 AM, Satish Patel wrote: > Oh wait i believe following. > > https://github.com/openstack/openstack-ansible/blob/0e03f46a2ebb0ffc6f12384f19ec1184434e7a09/playbooks/defaults/repo_packages/openstack_services.yml#L148 > > On Wed, Jul 25, 2018 at 10:04 AM, Satish Patel wrote: >> David, >> >> look like OSAD 16.0.15 using following repo, if i am not wrong >> >> - name: os_nova >> scm: git >> src: https://git.openstack.org/openstack/openstack-ansible-os_nova >> version: 378cf6c83f9ad23c2e0d37e9df06796fee02cc27 >> >> On Wed, Jul 25, 2018 at 9:45 AM, David Medberry wrote: >>> I think that nova --version is the version of the client (not of nova >>> itself). >>> >>> I'm looking at OSAD 16.0.15 to see what it is pulling for nova. >>> >>> If I see anything of interest, I'll reply. >>> >>> On Wed, Jul 25, 2018 at 6:33 AM, Satish Patel wrote: >>>> >>>> Thanks David, >>>> >>>> [root at ostack-compute-01 ~]# nova --version >>>> 9.1.2 >>>> >>>> I am using Pike 16.0.15 (My deployment tool is openstack-ansible) >>>> >>>> >>>> What are my option here? >>>> >>>> >>>> On Wed, Jul 25, 2018 at 8:19 AM, David Medberry >>>> wrote: >>>> > It's not clear what version of Nova you are running but perhaps it is >>>> > badly >>>> > patched. The 16.x.x (Pike) release of Nova has no >>>> > "migrate_configure_max_speed" but as best I can tell you are running a >>>> > patched version of Nova Pike so it may be inconsistent. >>>> > >>>> > This parameter was introduced on 2017-08-24: >>>> > >>>> > https://github.com/openstack/nova/commit/23446a9552b5be3b040278646149a0f481d0a005 >>>> > >>>> > That parameter showed up in Queens (not Pike) initially. 
>>>> > >>>> > -d >>>> > >>>> > On Tue, Jul 24, 2018 at 11:22 PM, Satish Patel >>>> > wrote: >>>> >> >>>> >> I have openstack with ceph storage setup and trying to test Live >>>> >> migration but somehow it failed and showing following error >>>> >> >>>> >> nova.conf >>>> >> >>>> >> # ceph rbd support >>>> >> live_migration_uri = "qemu+tcp://%s/system" >>>> >> live_migration_tunnelled = True >>>> >> >>>> >> libvirtd.conf >>>> >> >>>> >> listen_tls = 0 >>>> >> listen_tcp = 1 >>>> >> unix_sock_group = "libvirt" >>>> >> unix_sock_ro_perms = "0777" >>>> >> unix_sock_rw_perms = "0770" >>>> >> auth_unix_ro = "none" >>>> >> auth_unix_rw = "none" >>>> >> auth_tcp = "none" >>>> >> >>>> >> >>>> >> This is the error i am getting, i google it but didn't find any >>>> >> reference >>>> >> >>>> >> >>>> >> >>>> >> ] [instance: 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Live migration >>>> >> failed.: AttributeError: 'Guest' object has no attribute >>>> >> 'migrate_configure_max_speed' >>>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] Traceback (most recent call >>>> >> last): >>>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >>>> >> >>>> >> >>>> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/compute/manager.py", >>>> >> line 5580, in _do_live_migration >>>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] block_migration, >>>> >> migrate_data) >>>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >>>> >> >>>> >> >>>> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", >>>> >> line 6436, in live_migration >>>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] migrate_data) >>>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] File >>>> >> >>>> >> >>>> >> "/openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", >>>> >> line 6944, in _live_migration >>>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] >>>> >> guest.migrate_configure_max_speed( >>>> >> 2018-07-25 01:00:59.214 9331 ERROR nova.compute.manager [instance: >>>> >> 2b92ca5b-e433-4ac7-8dc8-619c9523ba97] AttributeError: 'Guest' object >>>> >> has no attribute 'migrate_configure_max_speed' >>>> >> >>>> >> _______________________________________________ >>>> >> Mailing list: >>>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>>> >> Post to : openstack at lists.openstack.org >>>> >> Unsubscribe : >>>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>>> > >>>> > >>> >>> From satish.txt at gmail.com Wed Jul 25 15:47:01 2018 From: satish.txt at gmail.com (Satish Patel) Date: Wed, 25 Jul 2018 11:47:01 -0400 Subject: [Openstack] Live migration failed with ceph storage In-Reply-To: References: Message-ID: Look like i do have this option in whatever version of nova i am running [root at ostack-compute-02 site-packages]# pwd /openstack/venvs/nova-16.0.16/lib/python2.7/site-packages [root at ostack-compute-02 site-packages]# grep "migrate_configure_max_speed" * -r nova/tests/unit/virt/libvirt/test_driver.py: guest.migrate_configure_max_speed = mock.MagicMock() 
On Wed, Jul 25, 2018 at 10:15 AM, Satish Patel wrote:
> David,
>
> I did this on the compute node:
>
> [root at ostack-compute-01 ~]# locate test_guest.py
> /openstack/venvs/nova-16.0.14/lib/python2.7/site-packages/nova/tests/unit/virt/libvirt/test_guest.py
> /openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/tests/unit/virt/libvirt/test_guest.py
>
> I didn't find the option:
>
> [root at ostack-compute-01 ~]# grep -i "test_migrate_configure_max_speed"
> /openstack/venvs/nova-16.0.16/lib/python2.7/site-packages/nova/tests/unit/virt/libvirt/test_guest.py
> [root at ostack-compute-01 ~]#
>
> [snip]
From satish.txt at gmail.com  Wed Jul 25 17:18:40 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Wed, 25 Jul 2018 13:18:40 -0400
Subject: [Openstack] URGENT - Live migration error DestinationDiskExists
Message-ID:

I am using the Pike 16.0.15 version and seeing the following error during
live migration. I am using Ceph for shared storage. Any idea what is going
on?
2018-07-25 13:15:00.773 52312 ERROR oslo_messaging.rpc.server
DestinationDiskExists: The supplied disk path
(/var/lib/nova/instances/5f56bc2b-74c8-47c1-834c-00796fafe6ae) already
exists, it is expected not to exist.

From prometheanfire at gentoo.org  Wed Jul 25 17:56:54 2018
From: prometheanfire at gentoo.org (Matthew Thode)
Date: Wed, 25 Jul 2018 12:56:54 -0500
Subject: [Openstack] [OSSA-2018-002] GET /v3/OS-FEDERATION/projects leaks
 project information (CVE-2018-14432)
Message-ID: <20180725175654.tus4pp3wi3ywrfzt@gentoo.org>

=======================================================================
OSSA-2018-002: GET /v3/OS-FEDERATION/projects leaks project information
=======================================================================

:Date: July 25, 2018
:CVE: CVE-2018-14432

Affects
~~~~~~~
- Keystone: <11.0.4, ==12.0.0, ==13.0.0

Description
~~~~~~~~~~~
Kristi Nikolla with Boston University reported a vulnerability in
Keystone federation. By doing GET /v3/OS-FEDERATION/projects an
authenticated user may discover projects they have no authority to
access, leaking all projects in the deployment and their attributes.
Only Keystone with the /v3/OS-FEDERATION endpoint enabled via
policy.json is affected.

Patches
~~~~~~~
- https://review.openstack.org/585802 (Ocata)
- https://review.openstack.org/585792 (Pike)
- https://review.openstack.org/585788 (Queens)
- https://review.openstack.org/585782 (Rocky)

Credits
~~~~~~~
- Kristi Nikolla from Boston University (CVE-2018-14432)

References
~~~~~~~~~~
- https://launchpad.net/bugs/1779205
- http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-14432

From skinnyh92 at gmail.com  Wed Jul 25 22:52:07 2018
From: skinnyh92 at gmail.com (Hang Yang)
Date: Wed, 25 Jul 2018 15:52:07 -0700
Subject: [Openstack] [Telemetry] RPC Message TTL for Ceilometer
 Notification Agent
Message-ID:

Hi there,

I have a question about rpc_message_ttl for the ceilometer service. I'm
using Queens ceilometer with gnocchi and rabbitmq. I recently noticed that
the ceilometer notification agent was receiving old metrics, sent by the
polling agent on a hypervisor a few days earlier. Since the default
rpc_message_ttl is set to 300s, does anyone know how that could happen?

I don't want to receive those old metrics: there were a large number of
them and they choked the notification agent (stuck at high CPU usage). I
had to purge rabbitmq to fix the issue, but I am wondering if there is any
configuration I can set to prevent it from happening again? Any help is
appreciated.

Thanks,
Hang

From eblock at nde.ag  Thu Jul 26 06:50:09 2018
From: eblock at nde.ag (Eugen Block)
Date: Thu, 26 Jul 2018 06:50:09 +0000
Subject: [Openstack] URGENT - Live migration error DestinationDiskExists
In-Reply-To:
Message-ID: <20180726065009.Horde.m7Yx8r0jZElGelqHRpzU890@webmail.nde.ag>

I assume /var/lib/nova/ uses shared storage and is mounted on the compute
node(s)?

It sounds like the directory already existed before it was configured to
use shared storage, or something similar. I believe I had a comparable
issue some time ago; I can't remember every detail, but although I believed
that /var/lib/nova was mounted, it actually was not. So make sure your
configuration is correct, and maybe delete the respective directory.

Since you are using ceph as backend, there isn't any data except a
console.log file in that directory, so it should be safe. But you'll have
to double-check that before deleting anything, of course!

Regards

Zitat von Satish Patel :

> I am using the Pike 16.0.15 version and seeing the following error during
> live migration [snip]
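One way to sanity-check the theory above before deleting anything is to
verify, on both compute nodes, whether /var/lib/nova/instances really is a
mount point of a shared filesystem, and what the leftover directory actually
contains. A minimal sketch; the instance UUID is the one from the error
above:

# on the source and destination compute nodes
findmnt /var/lib/nova/instances
# empty output means the directory lives on the local root filesystem,
# i.e. it is not a shared-storage mount
ls -la /var/lib/nova/instances/5f56bc2b-74c8-47c1-834c-00796fafe6ae
# with a ceph backend this should hold little more than console.log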
From marcioprado at marcioprado.eti.br  Thu Jul 26 14:36:30 2018
From: marcioprado at marcioprado.eti.br (Marcio Prado)
Date: Thu, 26 Jul 2018 11:36:30 -0300
Subject: [Openstack] Error Neutron: RTNETLINK answers: File exists
Message-ID: <7aae93de29de39cd26a67c75ba04efd9@marcioprado.eti.br>

Good afternoon,

For no apparent reason my Neutron stopped working.

I deleted the networks, subnets and routers, and created everything again.

But it does not work. The logs are:

2018-07-26 11:29:16.101 3272 INFO neutron.plugins.ml2.drivers.agent._common_agent [req-9ba0ca9f-aeaf-44b2-ba24-c08556aae0ac - - - - -] Linux bridge agent Agent out of sync with plugin!
2018-07-26 11:29:16.101 3272 INFO neutron.agent.securitygroups_rpc [req-9ba0ca9f-aeaf-44b2-ba24-c08556aae0ac - - - - -] Preparing filters for devices set(['tap69feb7be-2b', 'tap0efd5228-b0', 'tap83a57ce5-a8', 'tapd50d137f-f6'])
2018-07-26 11:29:18.218 3272 INFO neutron.plugins.ml2.drivers.agent._common_agent [req-9ba0ca9f-aeaf-44b2-ba24-c08556aae0ac - - - - -] Port tap69feb7be-2b updated. Details: {u'profile': {}, u'network_qos_policy_id': None, u'qos_policy_id': None, u'allowed_address_pairs': [], u'admin_state_up': True, u'network_id': u'0f293447-ad01-465e-a034-fdaa136a4488', u'segmentation_id': None, u'device_owner': u'network:router_gateway', u'physical_network': u'provider', u'mac_address': u'fa:16:3e:a3:be:5c', u'device': u'tap69feb7be-2b', u'port_security_enabled': False, u'port_id': u'69feb7be-2b9c-4604-a078-32c984d7075a', u'fixed_ips': [{u'subnet_id': u'5ef3df97-d88a-4c60-969c-5a862f04c1e0', u'ip_address': u'192.168.0.14'}], u'network_type': u'flat'}
2018-07-26 11:29:18.871 3272 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.arp_protect [req-9ba0ca9f-aeaf-44b2-ba24-c08556aae0ac - - - - -] Skipping ARP spoofing rules for port 'tap69feb7be-2b' because it has port security disabled
2018-07-26 11:29:20.208 3272 ERROR neutron.agent.linux.utils [req-9ba0ca9f-aeaf-44b2-ba24-c08556aae0ac - - - - -] Exit code: 2; Stdin: ; Stdout: ; Stderr: RTNETLINK answers: File exists

2018-07-26 11:29:20.219 3272 ERROR neutron.plugins.ml2.drivers.agent._common_agent [req-9ba0ca9f-aeaf-44b2-ba24-c08556aae0ac - - - - -] Error in agent loop.
Devices info: {'current': set(['tap69feb7be-2b', 'tap0efd5228-b0', 'tap83a57ce5-a8', 'tapd50d137f-f6']), 'timestamps': {'tap0efd5228-b0': 9, 'tap69feb7be-2b': 13, 'tap83a57ce5-a8': 10, 'tapd50d137f-f6': 8}, 'removed': set([]), 'added': set(['tap69feb7be-2b', 'tap0efd5228-b0', 'tap83a57ce5-a8', 'tapd50d137f-f6']), 'updated': set([])}
2018-07-26 11:29:20.219 3272 ERROR neutron.plugins.ml2.drivers.agent._common_agent Traceback (most recent call last):
2018-07-26 11:29:20.219 3272 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 453, in daemon_loop
2018-07-26 11:29:20.219 3272 ERROR neutron.plugins.ml2.drivers.agent._common_agent     sync = self.process_network_devices(device_info)
2018-07-26 11:29:20.219 3272 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/dist-packages/osprofiler/profiler.py", line 153, in wrapper
2018-07-26 11:29:20.219 3272 ERROR neutron.plugins.ml2.drivers.agent._common_agent     return f(*args, **kwargs)
2018-07-26 11:29:20.219 3272 ERROR neutron.plugins.ml2.drivers.agent._common_agent   File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/agent/_common_agent.py", line 210, in process_network_devices

Has anyone had similar experience?

-- 
Marcio Prado
Analista de TI - Infraestrutura e Redes
Fone: (35) 9.9821-3561
www.marcioprado.eti.br

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack at lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

From eblock at nde.ag  Fri Jul 27 08:05:36 2018
From: eblock at nde.ag (Eugen Block)
Date: Fri, 27 Jul 2018 08:05:36 +0000
Subject: [Openstack] Error Neutron: RTNETLINK answers: File exists
In-Reply-To: <7aae93de29de39cd26a67c75ba04efd9@marcioprado.eti.br>
Message-ID: <20180727080536.Horde.GzHVtxaoCBgtq_alRDaYS0d@webmail.nde.ag>

Hi,

is there anything in the linuxbridge-agent logs on the control and/or
compute node(s)? Which neutron services don't start? Can you paste
"openstack network agent list" output?

The important question is: what was the cause of "neutron stopped working",
and why did you delete the existing networks? It would be helpful to know
the reason in order to prevent such problems in the future. Or are the
provided logs from before?

We experience network/neutron troubles from time to time, and sometimes the
only way to fix it is a reboot.

Regards,
Eugen

Zitat von Marcio Prado :

> Good afternoon,
>
> For no apparent reason my Neutron stopped working. [snip]
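For readers hitting the same error: the agent logs Eugen refers to live in
the usual neutron log directory. A quick sketch; exact paths and service
unit names vary by distribution and deployment tool:

# on the controller and on each compute node
tail -n 100 /var/log/neutron/neutron-linuxbridge-agent.log
journalctl -u neutron-linuxbridge-agent --since "1 hour ago"
# and from any node with admin credentials loaded:
openstack network agent list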
From marcioprado at marcioprado.eti.br  Fri Jul 27 11:32:47 2018
From: marcioprado at marcioprado.eti.br (Marcio Prado)
Date: Fri, 27 Jul 2018 08:32:47 -0300
Subject: [Openstack] Error Neutron: RTNETLINK answers: File exists
In-Reply-To: <20180727080536.Horde.GzHVtxaoCBgtq_alRDaYS0d@webmail.nde.ag>
References: <20180727080536.Horde.GzHVtxaoCBgtq_alRDaYS0d@webmail.nde.ag>
Message-ID: <99001d52739f470ba2abf7d951600436@marcioprado.eti.br>

Thanks for the help, Eugen.

This log is from the linuxbridge agent of the controller node. The compute
nodes are not logging errors.
The output of "openstack network agent list" follows:

+--------------------------------------+--------------------+------------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type         | Host       | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+--------------------+------------+-------------------+-------+-------+---------------------------+
| 590f5a6d-379b-4e8d-87ec-f1060cecf230 | Linux bridge agent | controller | None              | True  | UP    | neutron-linuxbridge-agent |
| 88fb87c9-4c03-4faa-8286-95be3586fc94 | DHCP agent         | controller | nova              | True  | UP    | neutron-dhcp-agent        |
| b982382e-438c-46a9-8d4e-d58d554150fd | Linux bridge agent | compute1   | None              | True  | UP    | neutron-linuxbridge-agent |
| c7a9ba41-1fae-46cd-b61f-30bcacb0a4e8 | L3 agent           | controller | nova              | True  | UP    | neutron-l3-agent          |
| c9a1ea4b-2d5d-4bda-9849-cd6e302a2917 | Metadata agent     | controller | None              | True  | UP    | neutron-metadata-agent    |
| e690d4b9-9285-4ddd-a87a-f28ea99d9a73 | Linux bridge agent | compute3   | None              | False | UP    | neutron-linuxbridge-agent |
| fdd8f615-f5d6-4100-826e-59f8270df715 | Linux bridge agent | compute2   | None              | False | UP    | neutron-linuxbridge-agent |
+--------------------------------------+--------------------+------------+-------------------+-------+-------+---------------------------+

compute2 and compute3 are turned off intentionally.

Log from compute1, /var/log/neutron/neutron-linuxbridge-agent.log:

2018-07-27 07:43:57.242 1895 INFO neutron.common.config [-] /usr/bin/neutron-linuxbridge-agent version 10.0.0
2018-07-27 07:43:57.243 1895 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Interface mappings: {'provider': 'eno3'}
2018-07-27 07:43:57.243 1895 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Bridge mappings: {}
2018-07-27 07:44:00.954 1895 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Agent initialized successfully, now running...
2018-07-27 07:44:01.582 1895 INFO neutron.plugins.ml2.drivers.agent._common_agent [req-3a8a42dc-32fc-40fc-8a4f-ddbb4d8c5f5b - - - - -] RPC agent_id: lb525400d52f59
2018-07-27 07:44:01.589 1895 INFO neutron.agent.agent_extensions_manager [req-3a8a42dc-32fc-40fc-8a4f-ddbb4d8c5f5b - - - - -] Loaded agent extensions: []
2018-07-27 07:44:01.716 1895 INFO neutron.plugins.ml2.drivers.agent._common_agent [-] Linux bridge agent Agent has just been revived. Doing a full sync.
2018-07-27 07:44:01.778 1895 INFO neutron.plugins.ml2.drivers.agent._common_agent [req-3a8a42dc-32fc-40fc-8a4f-ddbb4d8c5f5b - - - - -] Linux bridge agent Agent RPC Daemon Started!
2018-07-27 07:44:01.779 1895 INFO neutron.plugins.ml2.drivers.agent._common_agent [req-3a8a42dc-32fc-40fc-8a4f-ddbb4d8c5f5b - - - - -] Linux bridge agent Agent out of sync with plugin!
2018-07-27 07:44:02.418 1895 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.arp_protect [req-3a8a42dc-32fc-40fc-8a4f-ddbb4d8c5f5b - - - - -] Clearing orphaned ARP spoofing entries for devices []

I'm using this OpenStack cloud to run my master's experiment. I turned off
all the nodes, and when I powered them on again after a few days the VMs
were no longer remotely accessible. So I deleted the existing networks and
re-created them, in an attempt to solve the problem.

An image is attached: Neutron is creating multiple interfaces on the
10.0.0.0 network on the router.
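The "RTNETLINK answers: File exists" in the log usually means the agent is
trying to create a device, address or route that is already present, for
example one left over from the deleted networks. A hedged, read-only way to
look for stale devices on the controller; the tap name is the conflicting
port from the log above:

# linuxbridge agent devices: brq* bridges, tap* ports
ip -o link show | grep -E 'tap|brq'
# which taps are enslaved to which bridge
bridge link show
# router/dhcp namespaces (qrouter-*, qdhcp-*)
ip netns list
# does the conflicting device already exist with an address?
ip addr show tap69feb7be-2b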
Em 27-07-2018 05:05, Eugen Block escreveu:
> Hi,
>
> is there anything in the linuxbridge-agent logs on the control and/or
> compute node(s)? [snip]
-- 
Marcio Prado
Analista de TI - Infraestrutura e Redes
Fone: (35) 9.9821-3561
www.marcioprado.eti.br

-------------- next part --------------
[attachment scrubbed: ErrorNeutron.png, image/png, 50453 bytes]

From ralf.teckelmann at bertelsmann.de  Fri Jul 27 12:50:09 2018
From: ralf.teckelmann at bertelsmann.de (Teckelmann, Ralf, NMU-OIP)
Date: Fri, 27 Jul 2018 12:50:09 +0000
Subject: [Openstack] OpenStack Charm Versions and juju revisions explained
Message-ID:

Hello,

My colleagues and I are currently preparing an upgrade of a production
OpenStack deployment. The deployment is managed with a tool called Juju,
from Canonical.

What's puzzling us is how to match the "juju charms version", e.g.
stable/17.11, to the "revision" given for the charm, e.g. #279. Can anyone
explain how to match those?

Have a nice weekend,

Ralf Teckelmann
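For what it's worth, the charm revision a deployed application runs can be
read from juju status. A sketch only: the application name is hypothetical
and the exact output format varies by Juju version; cross-referencing a
store revision like 279 against a charms release like 17.11 is then a matter
of checking that release's notes or the charm's git tags:

juju status nova-compute --format=yaml | grep -i 'charm'
# e.g.  charm: cs:nova-compute-279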
From felluslior at gmail.com  Sat Jul 28 10:53:08 2018
From: felluslior at gmail.com (Lior Fellus)
Date: Sat, 28 Jul 2018 13:53:08 +0300
Subject: [Openstack] restore lost files
Message-ID:

Hi,

I am using Mirantis OpenStack Mitaka. The OS volume of every instance is
stored in /var/lib/nova/instances as a qcow file. The volumes I use as
storage disks are NFS shares on my EMC, and I back those volumes up to
another server.

I have to restore files from one volume I have on the backup server. I
tried to mount this volume with guestmount on my Ubuntu PC, but the
filesystem is not recognised. Any suggestions would be appreciated.

Regards

From berndbausch at gmail.com  Sat Jul 28 22:31:35 2018
From: berndbausch at gmail.com (Bernd Bausch)
Date: Sun, 29 Jul 2018 07:31:35 +0900
Subject: [Openstack] restore lost files
In-Reply-To:
References:
Message-ID:

You don't provide any details, but if you copied the qcow file while the
instance was running, you can't be certain that the file is in a consistent
state. There might be unflushed data in the instance buffer cache and in
the host buffer cache. Did you consider instance snapshots?

What do you get from guestmount -i?

Bernd

> On Jul 28, 2018, at 19:53, Lior Fellus wrote:
>
> Hi
>
> i am using Mirantis OpenStack mitaka. [snip]
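Before guessing at mount options, the libguestfs tools can report what they
actually see inside the image. A minimal sketch; the backup filename is
hypothetical:

# is the copy even a valid image, and in what format?
qemu-img info instance-backup.qcow2
# list partitions and filesystems inside the image
virt-filesystems -a instance-backup.qcow2 --all --long
# then try a read-only inspection mount
mkdir -p /mnt/restore
guestmount -a instance-backup.qcow2 -i --ro /mnt/restore
# if inspection fails, the copy may be inconsistent, as Bernd notes,
# because it was taken while the instance was running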
From gaosong_1250 at 163.com  Mon Jul 30 14:02:46 2018
From: gaosong_1250 at 163.com (gao.song)
Date: Mon, 30 Jul 2018 22:02:46 +0800 (CST)
Subject: [Openstack] [Horizon] Horizon responds very slowly
In-Reply-To:
References: <6ED5A4C0760EC04A8DFC2AF5B93AE14EB4F2ABD3@MAIL01.syswin.com>
 <20180719084707.Horde.nJPNJA-tAWUutMcU6gz_cHh@webmail.nde.ag>
Message-ID: <583e07eb.916a.164eb7f9e18.Coremail.gaosong_1250@163.com>

A further report: we finally figured it out.

It was caused by the original memcache_servers configuration, which made
services try to load keys from the powered-off controller.

Configuration example:

[cache]
backend = oslo_cache.memcache_pool
enabled = True
memcache_servers = controller1:11211,controller2:11211,controller3:11211

After changing the setting to controller_vip:11211, the problem was solved.

At 2018-07-24 02:35:09, "Ivan Kolodyazhny" wrote:

Hi,

It could be a common issue between horizon and keystone.

As a temporary workaround for this, you can apply this [1] patch to
redirect the admin user to a different page.

[1] https://review.openstack.org/#/c/577090/

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Thu, Jul 19, 2018 at 11:47 AM, Eugen Block wrote:

> Hi,
>
> we also had to deal with a slow dashboard; in our case it was a
> misconfiguration of memcached [0], [1].
>
> Check your configuration and make sure you use oslo.cache.
>
> Hope this helps!
>
> [0] https://bugs.launchpad.net/keystone/+bug/1587777
> [1] https://ask.openstack.org/en/question/102611/how-to-configure-memcache-in-openstack-ha/
>
> Zitat von 高松 :
>
>> After killing one node of a cluster that consists of three nodes, I
>> found that Horizon, based on keystone with the token provider set to
>> fernet, responds very slowly. Admin login takes at least 20 seconds,
>> and a verbose CLI command shows that authentication is stuck for
>> about 5 seconds. Any help will be appreciated.

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack at lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

From e0ne at e0ne.info  Mon Jul 30 14:03:39 2018
From: e0ne at e0ne.info (Ivan Kolodyazhny)
Date: Mon, 30 Jul 2018 17:03:39 +0300
Subject: [Openstack] [Horizon] Horizon responds very slowly
In-Reply-To: <583e07eb.916a.164eb7f9e18.Coremail.gaosong_1250@163.com>
References: <6ED5A4C0760EC04A8DFC2AF5B93AE14EB4F2ABD3@MAIL01.syswin.com>
 <20180719084707.Horde.nJPNJA-tAWUutMcU6gz_cHh@webmail.nde.ag>
 <583e07eb.916a.164eb7f9e18.Coremail.gaosong_1250@163.com>
Message-ID:

Thanks for the update, Gao! Let me know if any help is needed.

Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/

On Mon, Jul 30, 2018 at 5:02 PM, gao.song wrote:
> A further report: we finally figured it out. [snip]

From clay.gerrard at gmail.com  Mon Jul 30 16:28:09 2018
From: clay.gerrard at gmail.com (Clay Gerrard)
Date: Mon, 30 Jul 2018 11:28:09 -0500
Subject: [Openstack] [Openstack-operators] swift question
In-Reply-To:
References:
Message-ID:

Sure! python swiftclient's upload command has a --changed option:

https://docs.openstack.org/python-swiftclient/latest/cli/index.html#swift-upload

But you might be happier with something more sophisticated like rclone:

https://rclone.org/

The nice thing about object storage is that you can access it from anywhere
via HTTP and PUT anything you want in there ;)

-Clay

On Mon, Jul 30, 2018 at 9:54 AM Alfredo De Luca wrote:
> Hi all.
> I wonder if I can sync a directory on a server to the object store
> (swift). What I do now is just a backup, but I'd like to implement a sort
> of file rotation, locally and on the object store.
> Any idea?
>
> --
> *Alfredo*
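A sketch of the swiftclient approach Clay mentions; the container and
directory names are hypothetical, and it assumes the usual OS_* auth
variables are set in the environment:

# upload only files that have changed since the last upload
swift upload --changed my-backups /srv/backup/dir
# run from cron for a simple periodic sync, e.g.:
# 0 2 * * * . /root/openrc && swift upload --changed my-backups /srv/backup/dir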
From satish.txt at gmail.com  Tue Jul 31 12:40:14 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Tue, 31 Jul 2018 08:40:14 -0400
Subject: [Openstack] Openstack monitoring
Message-ID:

Folks,

I am sure many folks have asked this question in the past, including me:
what is the best and easiest way to monitor OpenStack?

I just finished an openstack-ansible deployment in LXC containers, and it
seems there are many more components I need to monitor. I have a couple of
questions for folks who already use OSA or a similar deployment tool for
their cloud: how do you monitor your cloud?

This is what I have on the table, and I want a second opinion from the
community before I commit to it:

1. Monasca (I have zero experience with it)
2. Nagios
3. Icinga
4. ELK (this is an analyzer, so not true monitoring)
5. Xymon
6. Zabbix
7. ???

I am using OSA LXC containers, so do I have to install a monitoring agent
on each container to get true service alerts, or is there another method we
can use to monitor?

From eblock at nde.ag  Tue Jul 31 12:40:26 2018
From: eblock at nde.ag (Eugen Block)
Date: Tue, 31 Jul 2018 12:40:26 +0000
Subject: [Openstack] [Horizon] Horizon responds very slowly
In-Reply-To: <583e07eb.916a.164eb7f9e18.Coremail.gaosong_1250@163.com>
References: <6ED5A4C0760EC04A8DFC2AF5B93AE14EB4F2ABD3@MAIL01.syswin.com>
 <20180719084707.Horde.nJPNJA-tAWUutMcU6gz_cHh@webmail.nde.ag>
 <583e07eb.916a.164eb7f9e18.Coremail.gaosong_1250@163.com>
Message-ID: <20180731124026.Horde.CUL9oU0nDcVsPS2themtB7m@webmail.nde.ag>

Interesting: the HA guide [2] states that memcached should be configured
with the full list of hosts:

> Access to Memcached is not handled by HAProxy because replicated
> access is currently in an experimental state.
> Instead, OpenStack services must be supplied with the full list of
> hosts running Memcached.

On the other hand, it would be only one of many incorrect statements in
that guide since I've dealt with it, so maybe this is just outdated
information (although the page was modified on July 25th).

Which OpenStack version are you deploying?

Regards,
Eugen

[2] https://docs.openstack.org/ha-guide/controller-ha-memcached.html

Zitat von "gao.song" :

> A further report: we finally figured it out. [snip]