From eblock at nde.ag Mon Sep 3 11:27:37 2018 From: eblock at nde.ag (Eugen Block) Date: Mon, 03 Sep 2018 11:27:37 +0000 Subject: [Openstack] [nova] Nova-scheduler: when are filters applied? In-Reply-To: References: <20180830141938.Horde.oWg04EkYxTBMGLQrn__TgQg@webmail.nde.ag> <9996fe76-a744-d3b8-baab-9efbb6389ffe@gmail.com> <20180830145435.Horde.qGpUxaiNIIbQGcCo43g-PRn@webmail.nde.ag> Message-ID: <20180903112737.Horde.yS7Os9MoA6nGHZVlOD43KIO@webmail.nde.ag> Hi, > To echo what cfriesen said, if you set your allocation ratio to 1.0, > the system will not overcommit memory. Shut down instances consume > memory from an inventory management perspective. If you don't want > any danger of an instance causing an OOM, you must set you > ram_allocation_ratio to 1.0. let's forget about the scheduler, I'll try to make my question a bit clearer. Let's say I have a ratio of 1.0 on my hypervisor, and let it have 24 GB of RAM available, ignoring the OS for a moment. Now I launch 6 instances, each with a flavor requesting 4 GB of RAM, that would leave no space for further instances, right? Then I shutdown two instances (freeing 8 GB RAM) and create a new one with 8 GB of RAM, the compute node is full again (assuming all instances actually consume all of their RAM). Now I boot one of the shutdown instances again, the compute node would require additional 4 GB of RAM for that instance, and this would lead to OOM, isn't that correct? So a ratio of 1.0 would not prevent that from happening, would it? Regards, Eugen Zitat von Jay Pipes : > On 08/30/2018 10:54 AM, Eugen Block wrote: >> Hi Jay, >> >>> You need to set your ram_allocation_ratio nova.CONF option to 1.0 >>> if you're running into OOM issues. This will prevent overcommit of >>> memory on your compute nodes. >> >> I understand that, the overcommitment works quite well most of the time. >> >> It just has been an issue twice when I booted an instance that had >> been shutdown a while ago. In the meantime there were new instances >> created on that hypervisor, and this old instance caused the OOM. >> >> I would expect that with a ratio of 1.0 I would experience the same >> issue, wouldn't I? As far as I understand the scheduler only checks >> at instance creation, not when booting existing instances. Is that >> a correct assumption? > > To echo what cfriesen said, if you set your allocation ratio to 1.0, > the system will not overcommit memory. Shut down instances consume > memory from an inventory management perspective. If you don't want > any danger of an instance causing an OOM, you must set you > ram_allocation_ratio to 1.0. > > The scheduler doesn't really have anything to do with this. > > Best, > -jay > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From balazs.gibizer at ericsson.com Mon Sep 3 11:55:30 2018 From: balazs.gibizer at ericsson.com (=?iso-8859-1?q?Bal=E1zs?= Gibizer) Date: Mon, 03 Sep 2018 13:55:30 +0200 Subject: [Openstack] [nova] Nova-scheduler: when are filters applied? 
In-Reply-To: <20180903112737.Horde.yS7Os9MoA6nGHZVlOD43KIO@webmail.nde.ag> References: <20180830141938.Horde.oWg04EkYxTBMGLQrn__TgQg@webmail.nde.ag> <9996fe76-a744-d3b8-baab-9efbb6389ffe@gmail.com> <20180830145435.Horde.qGpUxaiNIIbQGcCo43g-PRn@webmail.nde.ag> <20180903112737.Horde.yS7Os9MoA6nGHZVlOD43KIO@webmail.nde.ag> Message-ID: <1535975730.32321.6@smtp.office365.com> On Mon, Sep 3, 2018 at 1:27 PM, Eugen Block wrote: > Hi, > >> To echo what cfriesen said, if you set your allocation ratio to 1.0, >> the system will not overcommit memory. Shut down instances consume >> memory from an inventory management perspective. If you don't want >> any danger of an instance causing an OOM, you must set you >> ram_allocation_ratio to 1.0. > > let's forget about the scheduler, I'll try to make my question a bit > clearer. > > Let's say I have a ratio of 1.0 on my hypervisor, and let it have 24 > GB of RAM available, ignoring the OS for a moment. Now I launch 6 > instances, each with a flavor requesting 4 GB of RAM, that would > leave no space for further instances, right? > Then I shutdown two instances (freeing 8 GB RAM) and create a new one > with 8 GB of RAM, the compute node is full again (assuming all > instances actually consume all of their RAM). When you shutdown the two instances the phyisical RAM will be deallocated BUT nova will not remove the resource allocation in placement. Therefore your new instance which requires 8GB RAM will not be placed to the host in question because on that host all the 24G RAM is still allocated even if physically not consumed at the moment. > Now I boot one of the shutdown instances again, the compute node > would require additional 4 GB of RAM for that instance, and this > would lead to OOM, isn't that correct? So a ratio of 1.0 would not > prevent that from happening, would it? Nova did not place the instance require 8G RAM to this host above. Therefore you can freely start up the two 4G consuming instances on this host later. > Regards, > Eugen > > > Zitat von Jay Pipes : > >> On 08/30/2018 10:54 AM, Eugen Block wrote: >>> Hi Jay, >>> >>>> You need to set your ram_allocation_ratio nova.CONF option to 1.0 >>>> if you're running into OOM issues. This will prevent overcommit >>>> of memory on your compute nodes. >>> >>> I understand that, the overcommitment works quite well most of the >>> time. >>> >>> It just has been an issue twice when I booted an instance that had >>> been shutdown a while ago. In the meantime there were new >>> instances created on that hypervisor, and this old instance >>> caused the OOM. >>> >>> I would expect that with a ratio of 1.0 I would experience the same >>> issue, wouldn't I? As far as I understand the scheduler only >>> checks at instance creation, not when booting existing instances. >>> Is that a correct assumption? >> >> To echo what cfriesen said, if you set your allocation ratio to 1.0, >> the system will not overcommit memory. Shut down instances consume >> memory from an inventory management perspective. If you don't want >> any danger of an instance causing an OOM, you must set you >> ram_allocation_ratio to 1.0. >> >> The scheduler doesn't really have anything to do with this. 
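As a side note, the allocation bookkeeping described earlier in this thread can be inspected directly from the command line. A minimal sketch, assuming the osc-placement CLI plugin is installed and admin credentials are loaded; the provider UUID is a placeholder:

  openstack resource provider list
  # pick the UUID of the compute node in question, then:
  openstack resource provider inventory list <provider-uuid>
  # shows total MEMORY_MB and the allocation_ratio currently in effect
  openstack resource provider usage show <provider-uuid>
  # shows MEMORY_MB currently allocated; stopped instances still count here

If usage already equals inventory at a ratio of 1.0, the scheduler will not place another instance on that host, whether or not the existing instances are powered on.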
>> >> Best, >> -jay >> >> _______________________________________________ >> Mailing list: >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > _______________________________________________ > Mailing list: > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From jaypipes at gmail.com Mon Sep 3 11:56:59 2018 From: jaypipes at gmail.com (Jay Pipes) Date: Mon, 3 Sep 2018 07:56:59 -0400 Subject: [Openstack] [nova] Nova-scheduler: when are filters applied? In-Reply-To: <20180903112737.Horde.yS7Os9MoA6nGHZVlOD43KIO@webmail.nde.ag> References: <20180830141938.Horde.oWg04EkYxTBMGLQrn__TgQg@webmail.nde.ag> <9996fe76-a744-d3b8-baab-9efbb6389ffe@gmail.com> <20180830145435.Horde.qGpUxaiNIIbQGcCo43g-PRn@webmail.nde.ag> <20180903112737.Horde.yS7Os9MoA6nGHZVlOD43KIO@webmail.nde.ag> Message-ID: On 09/03/2018 07:27 AM, Eugen Block wrote: > Hi, > >> To echo what cfriesen said, if you set your allocation ratio to 1.0, >> the system will not overcommit memory. Shut down instances consume >> memory from an inventory management perspective. If you don't want any >> danger of an instance causing an OOM, you must set you >> ram_allocation_ratio to 1.0. > > let's forget about the scheduler, I'll try to make my question a bit > clearer. > > Let's say I have a ratio of 1.0 on my hypervisor, and let it have 24 GB > of RAM available, ignoring the OS for a moment. Now I launch 6 > instances, each with a flavor requesting 4 GB of RAM, that would leave > no space for further instances, right? > Then I shutdown two instances (freeing 8 GB RAM) and create a new one > with 8 GB of RAM, the compute node is full again (assuming all instances > actually consume all of their RAM). > Now I boot one of the shutdown instances again, the compute node would > require additional 4 GB of RAM for that instance, and this would lead to > OOM, isn't that correct? So a ratio of 1.0 would not prevent that from > happening, would it? I'm not entirely sure what you mean by "shut down an instance". Perhaps this is what is leading to confusion. I consider "shutting down an instance" to be stopping or suspending an instance. As I mentioned below, shutdown instances consume memory from an inventory management perspective. If you stop or suspend an instance on your host, that instance is still consuming the same amount of memory in the placement service. You will *not* be able to launch a new instance on that same compute host *unless* your allocation ratio is >1.0. Now, if by "shut down an instance", you actually mean "terminate an instance" or possibly "shelve and then offload an instance", then that is a different thing, and in both of *those* cases, resources are released on the compute host. Best, -jay > Zitat von Jay Pipes : > >> On 08/30/2018 10:54 AM, Eugen Block wrote: >>> Hi Jay, >>> >>>> You need to set your ram_allocation_ratio nova.CONF option to 1.0 if >>>> you're running into OOM issues. This will prevent overcommit of >>>> memory on your compute nodes. >>> >>> I understand that, the overcommitment works quite well most of the time. >>> >>> It just has been an issue twice when I booted an instance that had >>> been shutdown a while ago. In the meantime there were new instances >>> created on that hypervisor, and this old instance caused the OOM. 
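To make the stop-versus-shelve distinction concrete, a short sketch; the option name is the standard nova.conf one, the commands assume current python-openstackclient naming, and the server name is a placeholder:

  # nova.conf, [DEFAULT] section, on the compute node
  ram_allocation_ratio = 1.0

  openstack server stop <server>       # instance keeps its placement allocation
  openstack server shelve <server>     # once offloaded, the allocation is released
  openstack server unshelve <server>   # re-runs the scheduler before powering the instance back on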
>>> >>> I would expect that with a ratio of 1.0 I would experience the same >>> issue, wouldn't I? As far as I understand the scheduler only checks >>> at instance creation, not when booting existing instances. Is that a >>> correct assumption? >> >> To echo what cfriesen said, if you set your allocation ratio to 1.0, >> the system will not overcommit memory. Shut down instances consume >> memory from an inventory management perspective. If you don't want any >> danger of an instance causing an OOM, you must set you >> ram_allocation_ratio to 1.0. >> >> The scheduler doesn't really have anything to do with this. >> >> Best, >> -jay >> >> _______________________________________________ >> Mailing list: >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to     : openstack at lists.openstack.org >> Unsubscribe : >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to     : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From eblock at nde.ag Mon Sep 3 12:00:53 2018 From: eblock at nde.ag (Eugen Block) Date: Mon, 03 Sep 2018 12:00:53 +0000 Subject: [Openstack] [nova] Nova-scheduler: when are filters applied? In-Reply-To: <1535975730.32321.6@smtp.office365.com> References: <20180830141938.Horde.oWg04EkYxTBMGLQrn__TgQg@webmail.nde.ag> <9996fe76-a744-d3b8-baab-9efbb6389ffe@gmail.com> <20180830145435.Horde.qGpUxaiNIIbQGcCo43g-PRn@webmail.nde.ag> <20180903112737.Horde.yS7Os9MoA6nGHZVlOD43KIO@webmail.nde.ag> <1535975730.32321.6@smtp.office365.com> Message-ID: <20180903120053.Horde.-5iW4P8l6b-JMzwEiCmbTNh@webmail.nde.ag> Thanks, that is a very good explanation, I get it now. Thank you very much for your answers! Zitat von Balázs Gibizer : > On Mon, Sep 3, 2018 at 1:27 PM, Eugen Block wrote: >> Hi, >> >>> To echo what cfriesen said, if you set your allocation ratio to >>> 1.0, the system will not overcommit memory. Shut down instances >>> consume memory from an inventory management perspective. If you >>> don't want any danger of an instance causing an OOM, you must set >>> you ram_allocation_ratio to 1.0. >> >> let's forget about the scheduler, I'll try to make my question a >> bit clearer. >> >> Let's say I have a ratio of 1.0 on my hypervisor, and let it have >> 24 GB of RAM available, ignoring the OS for a moment. Now I launch >> 6 instances, each with a flavor requesting 4 GB of RAM, that would >> leave no space for further instances, right? >> Then I shutdown two instances (freeing 8 GB RAM) and create a new >> one with 8 GB of RAM, the compute node is full again (assuming all >> instances actually consume all of their RAM). > > When you shutdown the two instances the phyisical RAM will be > deallocated BUT nova will not remove the resource allocation in > placement. Therefore your new instance which requires 8GB RAM will > not be placed to the host in question because on that host all the > 24G RAM is still allocated even if physically not consumed at the > moment. > > >> Now I boot one of the shutdown instances again, the compute node >> would require additional 4 GB of RAM for that instance, and this >> would lead to OOM, isn't that correct? So a ratio of 1.0 would not >> prevent that from happening, would it? > > Nova did not place the instance require 8G RAM to this host above. 
> Therefore you can freely start up the two 4G consuming instances on > this host later. > >> Regards, >> Eugen >> >> >> Zitat von Jay Pipes : >> >>> On 08/30/2018 10:54 AM, Eugen Block wrote: >>>> Hi Jay, >>>> >>>>> You need to set your ram_allocation_ratio nova.CONF option to >>>>> 1.0 if you're running into OOM issues. This will prevent >>>>> overcommit of memory on your compute nodes. >>>> >>>> I understand that, the overcommitment works quite well most of the time. >>>> >>>> It just has been an issue twice when I booted an instance that >>>> had been shutdown a while ago. In the meantime there were new >>>> instances created on that hypervisor, and this old instance >>>> caused the OOM. >>>> >>>> I would expect that with a ratio of 1.0 I would experience the >>>> same issue, wouldn't I? As far as I understand the scheduler >>>> only checks at instance creation, not when booting existing >>>> instances. Is that a correct assumption? >>> >>> To echo what cfriesen said, if you set your allocation ratio to >>> 1.0, the system will not overcommit memory. Shut down instances >>> consume memory from an inventory management perspective. If you >>> don't want any danger of an instance causing an OOM, you must set >>> you ram_allocation_ratio to 1.0. >>> >>> The scheduler doesn't really have anything to do with this. >>> >>> Best, >>> -jay >>> >>> _______________________________________________ >>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> Post to : openstack at lists.openstack.org >>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> >> >> >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From mahati.chamarthy at gmail.com Mon Sep 3 13:49:24 2018 From: mahati.chamarthy at gmail.com (Mahati C) Date: Mon, 3 Sep 2018 19:19:24 +0530 Subject: [Openstack] Call for OpenStack Outreachy internship project proposals and funding Message-ID: Hello everyone! An update on the Outreachy program, including a request for volunteer mentors and funding. Outreachy helps people from underrepresented groups get involved in free and open source software by matching interns with established mentors in the upstream community. OpenStack is a participating organization in the Outreachy Dec 2018 to Mar 2019 internships. If you're interested to be a mentor, please publish your project ideas on this page https://www.outreachy.org/communities/cfp/openstack/submit-project/. Here is a link that helps you get acquainted with mentorship process: https://wiki.openstack.org/wiki/Outreachy/Mentors. We have funding for two interns so far. We are looking for additional sponsors to help support OpenStack applicants. The sponsorship cost is 6,500 USD per intern, which is used to provide them a stipend for the three-month program. You can learn more about sponsorship here: https://www.outreachy.org/sponsor/ . Outreachy has been one of the most important and effective diversity efforts we’ve invested in. We have had many interns turn into long term OpenStack contributors. Please help spread the word. If you are interested in becoming a mentor or sponsoring an intern, please contact me (mahati.chamarthy AT intel.com) or Sameul ( samueldmq AT gmail.com ). Thank you! Best, Mahati -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From corey.bryant at canonical.com Tue Sep 4 12:55:50 2018 From: corey.bryant at canonical.com (Corey Bryant) Date: Tue, 4 Sep 2018 08:55:50 -0400 Subject: [Openstack] installation of Gnocchi on Queens In-Reply-To: <20180831145738.dxfyegat6kdqk73w@sileht.net> References: <67746dd8-c0ff-8371-6765-e7ef6995d65a@evolved-intelligence.com> <20180831144238.qb3zp22yv45dxlih@sileht.net> <596564a3-eefd-48cc-6fc4-6008c9b58e6f@evolved-intelligence.com> <20180831145738.dxfyegat6kdqk73w@sileht.net> Message-ID: On Fri, Aug 31, 2018 at 10:57 AM, Mehdi Abaakouk wrote: > On Fri, Aug 31, 2018 at 03:54:50PM +0100, Terry Lundin wrote: > >> >> >> On 31/08/18 15:42, Mehdi Abaakouk wrote: >> >>> On Fri, Aug 31, 2018 at 01:27:46PM +0100, Terry Lundin wrote: >>> >>>> Hello, >>>> >>>> We are trying to install ceilometer and gnocchi on Openstack Queens >>>> (Ubuntu 16.04) following the official instructions from >>>> https://docs.openstack.org/ceilometer/queens/install/install >>>> -base-ubuntu.html we end up in serious problems. When we issue the >>>> gnocchi install via apt-get, e.g. >>>> >>>> # apt-get install gnocchi-api gnocchi-metricd python-gnocchiclient >>>> >>>> *it will uninstall the dashboard, keystone and placement api *(it was a >>>> nice few hours fixing that). >>>> >>>> A suggestion to solve this was to use the pre-release archive: >>>> >>>> sudo add-apt-repository cloud-archive:queens-proposed >>>> sudo apt-get update >>>> >>>> This installs gnocchi without removing keystone, but the gnocchi api >>>> won't install as a service anymore. It seems like the gnocchi version is >>>> not compatible with Queens. >>>> >>> >>> For sure, Gnocchi is Queens compatible for sure. This is an bug of the >>> Ubuntu Cloud Archive packaging. I think you hitting this: >>> >>> https://bugs.launchpad.net/ubuntu/+source/gnocchi/+bug/1746992 >>> >> >> Yes, hitting that one. I understand it's an issue with gnocchi-api >> requiring python-3 while openstack queens is running on python-2. What is >> the work-around? >> > > According the bug tracker a new package update will come soon in > queens-proposed. So just waiting > The package has been in queens-proposed for a while, it's just waiting on someone to verify it and tag it appropriately. > Install gnocchi/gnocchi api on a separate apache server outside openstack? >> > > Yes that's a good solution. The gnocchi package was Py3-only and still installs Py3 by default. We provided Py2 support through the bug mentioned above to help folks out who still needed it (mainly all-in-one installs). With the package that is currently in queens-proposed you need to install libapache2-mod-wsgi and python-gnocchi (as opposed to libapache2-mod-wsgi-py3 and python3-gnocchi) if you want Py2 support. Corey That's could work. > > > -- > Mehdi Abaakouk > mail: sileht at sileht.net > irc: sileht > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstac > k > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstac > k > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ken at jots.org Tue Sep 4 19:20:03 2018 From: ken at jots.org (Ken D'Ambrosio) Date: Tue, 04 Sep 2018 15:20:03 -0400 Subject: [Openstack] Viewing VM's hypervisors as a non-admin user? Message-ID: <5b6acfe819bfbb6283f5c92551a07ec6@jots.org> Hey, all. 
We've got a Juno cloud, and it would be really handy for some of our engineers if they could see which VMs wound up on which hypervisors. I'm unsure how to make that happen; I'm afraid the documentation on the options of the policy.json file is a bit opaque. How would I go about making this happen, assuming it's even possible? Thanks! -Ken From berndbausch at gmail.com Tue Sep 4 20:39:03 2018 From: berndbausch at gmail.com (Bernd Bausch) Date: Wed, 5 Sep 2018 05:39:03 +0900 Subject: [Openstack] Viewing VM's hypervisors as a non-admin user? In-Reply-To: <5b6acfe819bfbb6283f5c92551a07ec6@jots.org> References: <5b6acfe819bfbb6283f5c92551a07ec6@jots.org> Message-ID: <0A83490B-D0B8-4931-A2A0-872E42E5E60E@gmail.com> It’s probably this policy rule: "os_compute_api:os-extended-server-attributes": "rule:admin_api" See also http://git.openstack.org/cgit/openstack/nova/tree/nova/policies/extended_server_attributes.py Bernd > On Sep 5, 2018, at 4:20, Ken D'Ambrosio wrote: > > Hey, all. We've got a Juno cloud, and it would be really handy for some of our engineers if they could see which VMs wound up on which hypervisors. I'm unsure how to make that happen; I'm afraid the documentation on the options of the policy.json file is a bit opaque. How would I go about making this happen, assuming it's even possible? > > Thanks! > > -Ken > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Tue Sep 4 20:53:42 2018 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 4 Sep 2018 16:53:42 -0400 Subject: [Openstack] Viewing VM's hypervisors as a non-admin user? In-Reply-To: <0A83490B-D0B8-4931-A2A0-872E42E5E60E@gmail.com> References: <5b6acfe819bfbb6283f5c92551a07ec6@jots.org> <0A83490B-D0B8-4931-A2A0-872E42E5E60E@gmail.com> Message-ID: hostId in the API exposes that without telling you the exact host On Tue, Sep 4, 2018 at 4:49 PM Bernd Bausch wrote: > > It’s probably this policy rule: > > "os_compute_api:os-extended-server-attributes": "rule:admin_api" > > > See also http://git.openstack.org/cgit/openstack/nova/tree/nova/policies/extended_server_attributes.py > > Bernd > > On Sep 5, 2018, at 4:20, Ken D'Ambrosio wrote: > > Hey, all. We've got a Juno cloud, and it would be really handy for some of our engineers if they could see which VMs wound up on which hypervisors. I'm unsure how to make that happen; I'm afraid the documentation on the options of the policy.json file is a bit opaque. How would I go about making this happen, assuming it's even possible? > > Thanks! > > -Ken > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -- Mohammed Naser — vexxhost ----------------------------------------------------- D. 514-316-8872 D. 800-910-1726 ext. 200 E. mnaser at vexxhost.com W. 
http://vexxhost.com From dpanarese at enter.eu Wed Sep 5 10:14:56 2018 From: dpanarese at enter.eu (Davide Panarese) Date: Wed, 5 Sep 2018 12:14:56 +0200 Subject: [Openstack] [Manila] Metrics about shares into ceilometer/gnocchi Message-ID: <8B58FEF0-7066-45FD-A84C-79F54EF4D739@enter.eu> Hi, I’m testing Manila Queens service with CEPHFS backend and it’s quite good. But I don’t understand if there are metrics about shares into ceilometer. I configured notification sections like other services but I can’t see anything about my shares into ceilometer meter-list. I’m using Ceilometer Mitaka version (maybe it doesn’t work with oldest version?!). Anyone could explain me if manila provide native metrics or I need to manage manually custom metrics for this service? $ cat manila.conf [oslo_messaging_notifications] driver = messagingv2 transport_url = rabbit://rabbit:password at rabbithost.local:5672/ topics = notifications $ manila list +--------------------------------------+--------------+------+-------------+-----------+-----------+-----------------+------+-------------------+ | ID | Name | Size | Share Proto | Status | Is Public | Share Type Name | Host | Availability Zone | +--------------------------------------+--------------+------+-------------+-----------+-----------+-----------------+------+-------------------+ | 15b209ca-cab3-47eb-8c36-8452bc8187f8 | test-share01 | 10 | NFS | available | False | cephfsnfstype | | nova | +--------------------------------------+--------------+------+-------------+-----------+-----------+-----------------+------+-------------------+ $ ceilometer meter-list +-------------------------------+-------+------+--------------------------------------+----------------------------------+----------------------------------+ | Name | Type | Unit | Resource ID | User ID | Project ID | +-------------------------------+-------+------+--------------------------------------+----------------------------------+----------------------------------+ | bandwidth | delta | B | 693ff8e8-0448-4c30-bb2a-25771292b005 | None | 856cebe7051a461baa00cf26faaca24c | | bandwidth | delta | B | 92ca8ccb-339a-4766-9c63-8575578ff330 | None | 48e622c4c017407b9fbc7f12f9b8b8f5 | | image.size | gauge | B | 294b8125-deec-459f-a8a0-13b6a661c355 | None | 48e622c4c017407b9fbc7f12f9b8b8f5 | | image.size | gauge | B | 88bec41c-b64e-4c32-8273-3d6d08118759 | None | 48e622c4c017407b9fbc7f12f9b8b8f5 | +-------------------------------+-------+------+--------------------------------------+----------------------------------+----------------------------------+ Thanks a lot, Davide -------------- next part -------------- An HTML attachment was scrubbed... URL: From ksameersrk at gmail.com Thu Sep 6 05:13:16 2018 From: ksameersrk at gmail.com (Sameer Kulkarni) Date: Thu, 6 Sep 2018 10:43:16 +0530 Subject: [Openstack] Fwd: Study of Swift performance degradation during drive failure In-Reply-To: References: Message-ID: Hi All, We are trying to understand and study how Swift handles drive failures. >From the book we have learnt that a drive failure triggers replication by default where as a node failure doesnt. We are trying to study the performance impact of this replication on the handoff nodes. If during the replication of an entire partition P to one of the handoff nodes N1, an object is upload whose 1 of the 3 replicas is destined to node N1, then is one operation going to have a higher priority ? 
i.e is does a normal upload operation take priority over the replication that is in progress or does it wait for the replication to complete. Also in the above scenario I do not believe the user experiences much performance degradation as the proxy server would have recieved the quorum of successful responses from the other 2 nodes. This brings us to our next question, what would be the simplest way to quantify the performance degradation due to a drive failure(maybe multiple) on a Swift setup using as few drives as possible. Any help or pointers would be appreciated. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From me at not.mn Thu Sep 6 16:01:55 2018 From: me at not.mn (John Dickinson) Date: Thu, 06 Sep 2018 09:01:55 -0700 Subject: [Openstack] Study of Swift performance degradation during drive failure In-Reply-To: References: Message-ID: <5FD76377-9F41-47A4-9266-87961E04579B@not.mn> On 5 Sep 2018, at 22:08, Sameer Kulkarni wrote: > Hi All, > > We are trying to understand and study how Swift handles drive > failures. > From the book we have learnt that a drive failure triggers replication > by > default where as a node failure doesnt. We are trying to study the > performance impact of this replication on the handoff nodes. > > If during the replication of an entire partition P to one of the > handoff > nodes N1, an object is upload whose 1 of the 3 replicas is destined to > node > N1, then is one operation going to have a higher priority ? i.e is > does a > normal upload operation take priority over the replication that is in > progress or does it wait for the replication to complete. > > Also in the above scenario I do not believe the user experiences much > performance degradation as the proxy server would have recieved the > quorum > of successful responses from the other 2 nodes. This brings us to our > next > question, what would be the simplest way to quantify the performance > degradation due to a drive failure(maybe multiple) on a Swift setup > using > as few drives as possible. > > Any help or pointers would be appreciated. > > Thank you. Some very short answers: no, Swift does not automatically prioritize one type of operation over another, although there are config settings that operators may adjust to balance background tasks and client requests. I would love for Swift to be able to do this, and we're slowly working towards that goal with a few ongoing pieces of work. There is likely no simple way to quantify performance degradation due to hardware failure. That's the "fun" of distributed systems. It depends too much on specifics of the hardware, the current workload, and the particular characteristics of the failure. I cannot give you a general answer. Normally deployers will run benchmarks against their cluster under different circumstances to measure actual impact of expected failure modes. --John From qishiyexu2 at 126.com Fri Sep 7 06:02:59 2018 From: qishiyexu2 at 126.com (=?GBK?B?s8Ke3Q==?=) Date: Fri, 7 Sep 2018 14:02:59 +0800 (CST) Subject: [Openstack] [openstack][nova]Can I specify a unique certificate for every instance(spice connection)? Message-ID: Hi, Opesntack can configure a global tls certificate for all instances with spice connection via /etc/libvert/qemu.conf, but can I configure a certificate for every instance seperately? BR Don -------------- next part -------------- An HTML attachment was scrubbed... 
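For reference, the host-wide settings referred to here are libvirt's rather than Nova's, and they apply per compute node, not per instance. The option names below are the stock libvirt qemu.conf ones; whether anything finer-grained can be done through Nova is left open here:

  # /etc/libvirt/qemu.conf on each compute node
  spice_tls = 1
  spice_tls_x509_cert_dir = "/etc/pki/libvirt-spice"
  # the directory holds ca-cert.pem, server-cert.pem and server-key.pem for that host

libvirtd needs a restart after changing this, and the certificate is then used for every SPICE-enabled guest started on that hypervisor.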
URL: From corey.bryant at canonical.com Fri Sep 7 15:18:37 2018 From: corey.bryant at canonical.com (Corey Bryant) Date: Fri, 7 Sep 2018 11:18:37 -0400 Subject: [Openstack] OpenStack Rocky for Ubuntu 18.04 LTS Message-ID: The Ubuntu OpenStack team at Canonical is pleased to announce the general availability of OpenStack Rocky on Ubuntu 18.04 LTS via the Ubuntu Cloud Archive. Details of the Rocky release can be found at: https://www.openstack.org/software/rocky To get access to the Ubuntu Rocky packages: Ubuntu 18.04 LTS ----------------------- You can enable the Ubuntu Cloud Archive pocket for OpenStack Rocky on Ubuntu 18.04 installations by running the following commands: sudo add-apt-repository cloud-archive:rocky sudo apt update The Ubuntu Cloud Archive for Rocky includes updates for: aodh, barbican, ceilometer, ceph (13.2.1), cinder, designate, designate-dashboard, glance, gnocchi, heat, heat-dashboard, horizon, ironic, keystone, magnum, manila, manila-ui, mistral, murano, murano-dashboard, networking-bagpipe, networking-bgpvpn, networking-hyperv, networking-l2gw, networking-odl, networking-ovn, networking-sfc, neutron, neutron-dynamic-routing, neutron-fwaas, neutron-lbaas, neutron-lbaas-dashboard, neutron-vpnaas, nova, nova-lxd, octavia, openstack-trove, openvswitch (2.10.0), panko, sahara, sahara-dashboard, senlin, swift, trove-dashboard, vmware-nsx, watcher, and zaqar. For a full list of packages and versions, please refer to: http://reqorts.qa.ubuntu.com/reports/ubuntu-server/cloud-archive/rocky_versions.html Python 3 support --------------------- Python 3 packages are now available for all of the above packages except swift. All of these packages have successfully been unit tested with at least Python 3.6. Function testing is ongoing and fixes will continue to be backported to Rocky. Python 3 enablement -------------------------- In Rocky, Python 2 packages will still be installed by default for all packages except gnocchi and octavia, which are Python 3 by default. In a future release, we will switch all packages to Python 3 by default. To enable Python 3 for existing installations: # upgrade to latest Rocky package versions first, then: sudo apt install python3- [1] sudo apt install libapache2-mod-wsgi-py3 # not required for all packages [2] sudo apt purge python- [1] sudo apt autoremove --purge sudo systemctl restart -* sudo systemctl restart apache2 # not required for all packages [2] For example: sudo apt install aodh-* sudo apt install python3-aodh libapache2-mod-wsgi-py3 sudo apt purge python-aodh sudo apt autoremove --purge sudo systemctl restart aodh-* apache2 To enable Python 3 for new installations: sudo apt install python3- [1] sudo apt install libapache2-mod-wsgi-py3 # not required for all packages [2] sudo apt install - For example: sudo apt install python3-aodh libapache2-mod-wsgi-py3 aodh-api [1] The naming convention of python packages is generally python- and python3-. For horizon, however, the packages are named python-django-horizon and python3-django-horizon. 
[2] The following packages are run under apache2 and require installation of libapache2-mod-wsgi-py3 to enable Python 3 support: aodh-api, cinder-api, barbican-api, keystone, nova-placement-api, openstack-dashboard, panko-api, sahara-api Other notable changes ---------------------------- sahara-api: sahara API now runs under apache2 with mod_wsgi Branch Package Builds ----------------------------- If you would like to try out the latest updates to branches, we deliver continuously integrated packages on each upstream commit via the following PPA’s: sudo add-apt-repository ppa:openstack-ubuntu-testing/mitaka sudo add-apt-repository ppa:openstack-ubuntu-testing/ocata sudo add-apt-repository ppa:openstack-ubuntu-testing/pike sudo add-apt-repository ppa:openstack-ubuntu-testing/queens sudo add-apt-repository ppa:openstack-ubuntu-testing/rocky Reporting bugs ------------------- If you have any issues please report bugs using the 'ubuntu-bug' tool to ensure that bugs get logged in the right place in Launchpad: sudo ubuntu-bug nova-conductor Thanks to everyone who has contributed to OpenStack Rocky, both upstream and downstream. Special thanks to the Puppet OpenStack modules team and the OpenStack Charms team for their continued early testing of the Ubuntu Cloud Archive, as well as the Ubuntu and Debian OpenStack teams for all of their contributions. Have fun and see you in Stein! Cheers, Corey (on behalf of the Ubuntu OpenStack team) -------------- next part -------------- An HTML attachment was scrubbed... URL: From navdeep.uniyal at bristol.ac.uk Fri Sep 7 15:43:18 2018 From: navdeep.uniyal at bristol.ac.uk (Navdeep Uniyal) Date: Fri, 7 Sep 2018 15:43:18 +0000 Subject: [Openstack] Openstack Pike SRIOV Enablement issue Message-ID: Dear All, I am facing some issues while trying to enable SRIOV interfaces with OpenStack Pike. I am using Solarflare NICs. In my case I have done all the required configurations as stated in : But on starting the sriov-nic-agent, I am getting following error: neutron-sriov-nic-agent --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/sriov_agent.ini Guru meditation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports. 2018-09-07 16:30:47.097 7503 INFO neutron.common.config [-] Logging enabled! 2018-09-07 16:30:47.097 7503 INFO neutron.common.config [-] /usr/bin/neutron-sriov-nic-agent version 11.0.5 2018-09-07 16:30:47.097 7503 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [-] Physical Devices mappings: {'sriovprovider': ['enp1s0f1']} 2018-09-07 16:30:47.097 7503 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [-] Exclude Devices: {} 2018-09-07 16:30:47.098 7503 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-a6d17b33-6256-46cb-9652-936d7d22c7a1 - - - - -] RPC agent_id: nic-switch-agent.box 2018-09-07 16:30:47.100 7503 INFO neutron.agent.agent_extensions_manager [req-a6d17b33-6256-46cb-9652-936d7d22c7a1 - - - - -] Loaded agent extensions: [] 2018-09-07 16:30:47.130 7503 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-a6d17b33-6256-46cb-9652-936d7d22c7a1 - - - - -] Agent initialized successfully, now running... 2018-09-07 16:30:47.130 7503 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-a6d17b33-6256-46cb-9652-936d7d22c7a1 - - - - -] SRIOV NIC Agent RPC Daemon Started! 
2018-09-07 16:30:47.130 7503 INFO neutron.plugins.ml2.drivers.mech_sriov.agent.sriov_nic_agent [req-a6d17b33-6256-46cb-9652-936d7d22c7a1 - - - - -] Agent out of sync with plugin! 2018-09-07 16:30:47.229 7503 WARNING neutron.plugins.ml2.drivers.mech_sriov.agent.pci_lib [req-a6d17b33-6256-46cb-9652-936d7d22c7a1 - - - - -] failed to parse vf link show line vf 6 MAC 9e:87:7b:bf:75:d2, link-state auto: for enp1s0f1 2018-09-07 16:30:47.229 7503 WARNING neutron.plugins.ml2.drivers.mech_sriov.agent.pci_lib [req-a6d17b33-6256-46cb-9652-936d7d22c7a1 - - - - -] failed to parse vf link show line vf 7 MAC 1e:46:04:79:cd:6b, link-state auto: for enp1s0f1 2018-09-07 16:30:49.239 7503 WARNING neutron.plugins.ml2.drivers.mech_sriov.agent.pci_lib [req-a6d17b33-6256-46cb-9652-936d7d22c7a1 - - - - -] failed to parse vf link show line vf 6 MAC 9e:87:7b:bf:75:d2, link-state auto: for enp1s0f1 2018-09-07 16:30:49.239 7503 WARNING neutron.plugins.ml2.drivers.mech_sriov.agent.pci_lib [req-a6d17b33-6256-46cb-9652-936d7d22c7a1 - - - - -] failed to parse vf link show line vf 7 MAC 1e:46:04:79:cd:6b, link-state auto: for enp1s0f1 Please help on resolving this issue. Any pointers would be welcome. Kind Regards, Navdeep -------------- next part -------------- An HTML attachment was scrubbed... URL: From skinnyh92 at gmail.com Fri Sep 7 19:46:58 2018 From: skinnyh92 at gmail.com (Hang Yang) Date: Fri, 7 Sep 2018 12:46:58 -0700 Subject: [Openstack] [diskimage-builder] Element pip-and-virtualenv failed to install pip Message-ID: Hi there, I'm new to the DIB tool and ran into an issue when used 2.16.0 DIB tool to build a CentOS based image with pip-and-virtualenv element. It failed at https://github.com/openstack/diskimage-builder/blob/master/diskimage_builder/elements/pip-and-virtualenv/install.d/pip-and-virtualenv-source-install/04-install-pip#L78 due to cannot find pip command. I found the /tmp/get_pip.py was there but totally empty. I have to manually add a wget step to retreat the get_pip.py right before the failed step then it worked. But should the get_pip.py be downloaded automatically by this https://github.com/openstack/diskimage-builder/blob/master/diskimage_builder/elements/pip-and-virtualenv/source-repository-pip-and-virtualenv ? Does anyone know how could this issue happen? Thanks in advance for any help. Best, Hang -------------- next part -------------- An HTML attachment was scrubbed... URL: From rleander at redhat.com Sat Sep 8 16:24:49 2018 From: rleander at redhat.com (Rain Leander) Date: Sat, 8 Sep 2018 18:24:49 +0200 Subject: [Openstack] [ptg] Interviews at OpenStack PTG Denver Message-ID: Hello all! I'm attending PTG this week to conduct project interviews [0]. These interviews have several purposes. Please consider all of the following when thinking about what you might want to say in your interview: * Tell the users/customers/press what you've been working on in Rocky * Give them some idea of what's (what might be?) coming in Stein * Put a human face on the OpenStack project and encourage new participants to join us * You're welcome to promote your company's involvement in OpenStack but we ask that you avoid any kind of product pitches or job recruitment In the interview I'll ask some leading questions and it'll go easier if you've given some thought to them ahead of time: * Who are you? (Your name, your employer, and the project(s) on which you are active.) * What did you accomplish in Rocky? 
(Focus on the 2-3 things that will be most interesting to cloud operators) * What do you expect to be the focus in Stein? (At the time of your interview, it's likely that the meetings will not yet have decided anything firm. That's ok.) * Anything further about the project(s) you work on or the OpenStack community in general. Finally, note that there are only 40 interview slots available, so please consider coordinating with your project to designate the people that you want to represent the project, so that we don't end up with 12 interview about Neutron, or whatever. I mean, love me some Neutron, but twelve interviews is a bit too many, eh? It's fine to have multiple people in one interview - Maximum 3, probably. Interview slots are 30 minutes, in which time we hope to capture somewhere between 10 and 20 minutes of content. It's fine to run shorter but 15 minutes is probably an ideal length. See you SOON! [0] https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing -- K Rain Leander OpenStack Community Liaison Open Source and Standards Team https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Sat Sep 8 17:22:04 2018 From: satish.txt at gmail.com (Satish Patel) Date: Sat, 8 Sep 2018 13:22:04 -0400 Subject: [Openstack] How to check capacity from command line Message-ID: Folks, I have deploy openstack cloud with 40 compute node and i have create two aggregated host group based on two kind of hardware. group-A - HP DL460 group-B - HP DL360 Now i want to check per group capacity from command line, so how do i check per group capacity? From reza.b2008 at gmail.com Mon Sep 10 05:11:46 2018 From: reza.b2008 at gmail.com (Reza Bakhshayeshi) Date: Mon, 10 Sep 2018 09:41:46 +0430 Subject: [Openstack] masakari instance monitor sending notification error Message-ID: Hi all, I'm going to setup masakari environment on Neutron. I've installed masakari api & engine on Controller and masakari-monitors & python-masakariclient on two Compute nodes. When an instance goes down masakari instance monitor encounter following error: 2018-09-09 05:56:41.230 25262 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback AttributeError: 'Connection' object has no attribute 'sdk' I've also applied this patch with same error: object has no attribute 'vmha https://review.openstack.org/#/c/395433/3/masakarimonitors/instancemonitor/libvirt_handler/callback.py Do you have any idea what I'm missing? Regards, Reza -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From reza.b2008 at gmail.com Mon Sep 10 12:35:38 2018 From: reza.b2008 at gmail.com (Reza Bakhshayeshi) Date: Mon, 10 Sep 2018 17:05:38 +0430 Subject: [Openstack] masakari client returns empty subcommand list Message-ID: Hi, I've installed python-masakariclient on controller node, and receiving empty subcommand list on newton: root at controller:~/python-masakariclient# masakari segment-list usage: masakari [--masakari-api-version] [--debug] [--os-auth-plugin AUTH_PLUGIN] [--os-auth-url AUTH_URL] [--os-project-id PROJECT_ID] [--os-project-name PROJECT_NAME] [--os-tenant-id TENANT_ID] [--os-tenant-name TENANT_NAME] [--os-domain-id DOMAIN_ID] [--os-domain-name DOMAIN_NAME] [--os-project-domain-id PROJECT_DOMAIN_ID] [--os-project-domain-name PROJECT_DOMAIN_NAME] [--os-user-domain-id USER_DOMAIN_ID] [--os-user-domain-name USER_DOMAIN_NAME] [--os-username USERNAME] [--os-user-id USER_ID] [--os-password PASSWORD] [--os-trust-id TRUST_ID] [--os-cacert CA_BUNDLE_FILE | --verify | --insecure] [--os-token TOKEN] [--os-access-info ACCESS_INFO] ... masakari: error: argument : invalid choice: 'segment-list' (choose from 'bash_completion') Unfortunately there is no step by step documentation for any of masakari services... Do you have any idea what is missing? Regards, Reza -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam47priya at gmail.com Mon Sep 10 14:46:34 2018 From: sam47priya at gmail.com (Sam P) Date: Mon, 10 Sep 2018 08:46:34 -0600 Subject: [Openstack] masakari instance monitor sending notification error In-Reply-To: References: Message-ID: Hi Reza, Sorry for the inconveniences you had. Are you try to install masakari with OpenStack stable/newton? In that case, what masakari release you used? Because, masakari released from stable/Ocata and no official release for Newton. If you need to work it with Newton, then you have to isolate the masakari services by using containers. --- Regards, Sampath On Sun, Sep 9, 2018 at 11:21 PM Reza Bakhshayeshi wrote: > Hi all, > > I'm going to setup masakari environment on Neutron. I've installed > masakari api & engine on Controller and masakari-monitors & > python-masakariclient on two Compute nodes. > When an instance goes down masakari instance monitor encounter following > error: > > 2018-09-09 05:56:41.230 25262 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback AttributeError: > 'Connection' object has no attribute 'sdk' > > I've also applied this patch with same error: object has no attribute 'vmha > > > https://review.openstack.org/#/c/395433/3/masakarimonitors/instancemonitor/libvirt_handler/callback.py > > Do you have any idea what I'm missing? > > Regards, > Reza > _______________________________________________ > Mailing list: > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rleander at redhat.com Mon Sep 10 16:40:52 2018 From: rleander at redhat.com (Rain Leander) Date: Mon, 10 Sep 2018 18:40:52 +0200 Subject: [Openstack] [ptg] Interviews at OpenStack PTG Denver In-Reply-To: References: Message-ID: Today I'm conducting interviews in Blanca Peak and I'd love to talk about what you worked on in Rocky and / or what you're planning to do in Stein. I'm especially keen to hear about how new collaborators can join your project! See you soon! ~R. 
On Sat, Sep 8, 2018 at 6:24 PM Rain Leander wrote: > Hello all! > > I'm attending PTG this week to conduct project interviews [0]. These > interviews have several purposes. Please consider all of the following when > thinking about what you might want to say in your interview: > > * Tell the users/customers/press what you've been working on in Rocky > * Give them some idea of what's (what might be?) coming in Stein > * Put a human face on the OpenStack project and encourage new participants > to join us > * You're welcome to promote your company's involvement in OpenStack but we > ask that you avoid any kind of product pitches or job recruitment > > In the interview I'll ask some leading questions and it'll go easier if > you've given some thought to them ahead of time: > > * Who are you? (Your name, your employer, and the project(s) on which you > are active.) > * What did you accomplish in Rocky? (Focus on the 2-3 things that will be > most interesting to cloud operators) > * What do you expect to be the focus in Stein? (At the time of your > interview, it's likely that the meetings will not yet have decided anything > firm. That's ok.) > * Anything further about the project(s) you work on or the OpenStack > community in general. > > Finally, note that there are only 40 interview slots available, so please > consider coordinating with your project to designate the people that you > want to represent the project, so that we don't end up with 12 interview > about Neutron, or whatever. I mean, love me some Neutron, but twelve > interviews is a bit too many, eh? > > It's fine to have multiple people in one interview - Maximum 3, probably. > > Interview slots are 30 minutes, in which time we hope to capture somewhere > between 10 and 20 minutes of content. It's fine to run shorter but 15 > minutes is probably an ideal length. > > See you SOON! > > [0] > https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing > -- > K Rain Leander > OpenStack Community Liaison > Open Source and Standards Team > https://www.rdoproject.org/ > http://community.redhat.com > -- K Rain Leander OpenStack Community Liaison Open Source and Standards Team https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From reza.b2008 at gmail.com Tue Sep 11 07:53:48 2018 From: reza.b2008 at gmail.com (Reza Bakhshayeshi) Date: Tue, 11 Sep 2018 12:23:48 +0430 Subject: [Openstack] masakari instance monitor sending notification error In-Reply-To: References: Message-ID: Hi Sam, Sorry about misspelling, Yes, I'm sure about stable/newton. I've cloned it by this command: git clone https://github.com/openstack/masakari -b newton-eol and my version is masakari==2.0.0 Do you mean that installing stable/Ocata inside a container, while I'm using Newton controller and compute? Do you have any example documentation for this purpose or any of these configuration files: /etc/masakari/masakari.conf /etc/masakarimonitors/masakarimonitors.conf Regards, Reza On Mon, 10 Sep 2018 at 19:16, Sam P wrote: > Hi Reza, > > Sorry for the inconveniences you had. > Are you try to install masakari with OpenStack stable/newton? > In that case, what masakari release you used? > Because, masakari released from stable/Ocata and no official release for > Newton. > If you need to work it with Newton, then you have to isolate the masakari > services by using containers. 
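Not a direct answer to the container question, but one generic way to get a reference copy of both files is to generate the sample configuration from the source trees with the usual oslo tooling. This is a sketch of a common OpenStack pattern; the exact tox target and generator file names in the masakari and masakari-monitors repos have not been verified here, so treat the paths as assumptions:

  # from a checkout of the masakari repo
  tox -e genconfig
  # or, if the generator config file is present in the tree:
  oslo-config-generator --config-file etc/masakari/masakari-config-generator.conf
  # repeating the same steps in a masakari-monitors checkout should produce a
  # masakarimonitors.conf sample

The generated *.conf.sample files list every recognised option with its default and help text, which is usually the closest thing to example documentation for these services.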
> > --- Regards, > Sampath > > > > On Sun, Sep 9, 2018 at 11:21 PM Reza Bakhshayeshi > wrote: > >> Hi all, >> >> I'm going to setup masakari environment on Neutron. I've installed >> masakari api & engine on Controller and masakari-monitors & >> python-masakariclient on two Compute nodes. >> When an instance goes down masakari instance monitor encounter following >> error: >> >> 2018-09-09 05:56:41.230 25262 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback AttributeError: >> 'Connection' object has no attribute 'sdk' >> >> I've also applied this patch with same error: object has no attribute >> 'vmha >> >> >> https://review.openstack.org/#/c/395433/3/masakarimonitors/instancemonitor/libvirt_handler/callback.py >> >> Do you have any idea what I'm missing? >> >> Regards, >> Reza >> _______________________________________________ >> Mailing list: >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doka.ua at gmx.com Tue Sep 11 07:58:49 2018 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Tue, 11 Sep 2018 10:58:49 +0300 Subject: [Openstack] boot order with multiple attachments Message-ID: <8ddbf904-27bd-ecbf-3a13-efc3f697067b@gmx.com> Hi colleagues, is there any mechanism to ensure boot disk when attaching more than two volumes to server? At the moment, I can't find a way to make it predictable. I have two bootable images with the following properties: 1) hw_boot_menu='true', hw_disk_bus='scsi', hw_qemu_guest_agent='yes', hw_scsi_model='virtio-scsi', img_hide_hypervisor_id='true', locations='[{u'url': u'swift+config:...', u'metadata': {}}]' which corresponds to the following volume: - attachments: [{u'server_id': u'...', u'attachment_id': u'...', u'attached_at': u'...', u'host_name': u'...', u'volume_id': u'', u'device': u'/dev/sda', u'id': u'...'}] - volume_image_metadata: {u'checksum': u'...', u'hw_qemu_guest_agent': u'yes', u'disk_format': u'raw', u'image_name': u'bionic-Qpub', u'hw_scsi_model': u'virtio-scsi', u'image_id': u'...', u'hw_boot_menu': u'true', u'min_ram': u'0', u'container_format': u'bare', u'min_disk': u'0', u'img_hide_hypervisor_id': u'true', u'hw_disk_bus': u'scsi', u'size': u'...'} and second image: 2) hw_disk_bus='scsi', hw_qemu_guest_agent='yes', hw_scsi_model='virtio-scsi', img_hide_hypervisor_id='true', locations='[{u'url': u'cinder://...', u'metadata': {}}]' which corresponds to the following volume: - attachments: [{u'server_id': u'...', u'attachment_id': u'...', u'attached_at': u'...', u'host_name': u'...', u'volume_id': u'', u'device': u'/dev/sdb', u'id': u'...'}] - volume_image_metadata: {u'checksum': u'...', u'hw_qemu_guest_agent': u'yes', u'disk_format': u'raw', u'image_name': u'xenial', u'hw_scsi_model': u'virtio-scsi', u'image_id': u'...', u'min_ram': u'0', u'container_format': u'bare', u'min_disk': u'0', u'img_hide_hypervisor_id': u'true', u'hw_disk_bus': u'scsi', u'size': u'...'} Using Heat, I'm creating the following block_devices_mapping_v2 scheme: block_device_mapping_v2:         - volume_id:           delete_on_termination: false           device_type: disk           disk_bus: scsi           boot_index: 0         - volume_id:           delete_on_termination: false           device_type: disk           disk_bus: scsi           boot_index: -1 which maps to the following nova-api debug log: Action: 'create', 
calling method: >, body: {"ser ver": {"name": "jex-n1", "imageRef": "", "block_device_mapping_v2": [{"boot_index": 0, "uuid": "", "disk_bus": "scsi", "source_type": "volume" , "device_type": "disk", "destination_type": "volume", "delete_on_termination": false}, {"boot_index": -1, "uuid": "", "disk_bus": "scsi", "so urce_type": "volume", "device_type": "disk", "destination_type": "volume", "delete_on_termination": false}], "flavorRef": "4b3da838-3d81-461a-b946-d3613fb6f4b3", "user_data": "...", "max_count": 1, "min_count": 1, "networks": [{"port": "9044f884-1a3d-4dc6-981e-f585f5e45dd1"}], "config_drive": true}} _process_stack /usr/lib/python2.7/dist-packages/nova/api/openstack/wsgi.py:604 Regardless of boot_index value, server boots from VOLUME2 (/dev/sdb), while having attached VOLUME1 as well as /dev/sda I'm using Queens. Where I'm wrong? Thank you. -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison From doka.ua at gmx.com Tue Sep 11 08:54:50 2018 From: doka.ua at gmx.com (Volodymyr Litovka) Date: Tue, 11 Sep 2018 11:54:50 +0300 Subject: [Openstack] boot order with multiple attachments In-Reply-To: <8ddbf904-27bd-ecbf-3a13-efc3f697067b@gmx.com> References: <8ddbf904-27bd-ecbf-3a13-efc3f697067b@gmx.com> Message-ID: Hi again, there is similar case - https://bugs.launchpad.net/nova/+bug/1570107 - but I get same result (booting from VOLUME2) regardless of whether I use or don't use device_type/disk_bus properties in BDM description. Any ideas on how to solve this issue? Thanks. On 9/11/18 10:58 AM, Volodymyr Litovka wrote: > Hi colleagues, > > is there any mechanism to ensure boot disk when attaching more than > two volumes to server? At the moment, I can't find a way to make it > predictable. > > I have two bootable images with the following properties: > 1) hw_boot_menu='true', hw_disk_bus='scsi', hw_qemu_guest_agent='yes', > hw_scsi_model='virtio-scsi', img_hide_hypervisor_id='true', > locations='[{u'url': u'swift+config:...', u'metadata': {}}]' > > which corresponds to the following volume: > > - attachments: [{u'server_id': u'...', u'attachment_id': u'...', > u'attached_at': u'...', u'host_name': u'...', u'volume_id': > u'', u'device': u'/dev/sda', u'id': u'...'}] > - volume_image_metadata: {u'checksum': u'...', u'hw_qemu_guest_agent': > u'yes', u'disk_format': u'raw', u'image_name': u'bionic-Qpub', > u'hw_scsi_model': u'virtio-scsi', u'image_id': u'...', > u'hw_boot_menu': u'true', u'min_ram': u'0', u'container_format': > u'bare', u'min_disk': u'0', u'img_hide_hypervisor_id': u'true', > u'hw_disk_bus': u'scsi', u'size': u'...'} > > and second image: > 2) hw_disk_bus='scsi', hw_qemu_guest_agent='yes', > hw_scsi_model='virtio-scsi', img_hide_hypervisor_id='true', > locations='[{u'url': u'cinder://...', u'metadata': {}}]' > > which corresponds to the following volume: > > - attachments: [{u'server_id': u'...', u'attachment_id': u'...', > u'attached_at': u'...', u'host_name': u'...', u'volume_id': > u'', u'device': u'/dev/sdb', u'id': u'...'}] > - volume_image_metadata: {u'checksum': u'...', u'hw_qemu_guest_agent': > u'yes', u'disk_format': u'raw', u'image_name': u'xenial', > u'hw_scsi_model': u'virtio-scsi', u'image_id': u'...', u'min_ram': > u'0', u'container_format': u'bare', u'min_disk': u'0', > u'img_hide_hypervisor_id': u'true', u'hw_disk_bus': u'scsi', u'size': > u'...'} > > Using Heat, I'm creating the following block_devices_mapping_v2 scheme: > > block_device_mapping_v2: >         - volume_id: >           delete_on_termination: false >           
device_type: disk >           disk_bus: scsi >           boot_index: 0 >         - volume_id: >           delete_on_termination: false >           device_type: disk >           disk_bus: scsi >           boot_index: -1 > > which maps to the following nova-api debug log: > > Action: 'create', calling method: ServersController.create of > 0x7f6b08dd4890>>, body: {"ser > ver": {"name": "jex-n1", "imageRef": "", "block_device_mapping_v2": > [{"boot_index": 0, "uuid": "", "disk_bus": "scsi", > "source_type": "volume" > , "device_type": "disk", "destination_type": "volume", > "delete_on_termination": false}, {"boot_index": -1, "uuid": > "", "disk_bus": "scsi", "so > urce_type": "volume", "device_type": "disk", "destination_type": > "volume", "delete_on_termination": false}], "flavorRef": > "4b3da838-3d81-461a-b946-d3613fb6f4b3", "user_data": "...", > "max_count": 1, "min_count": 1, "networks": [{"port": > "9044f884-1a3d-4dc6-981e-f585f5e45dd1"}], "config_drive": true}} > _process_stack > /usr/lib/python2.7/dist-packages/nova/api/openstack/wsgi.py:604 > > Regardless of boot_index value, server boots from VOLUME2 (/dev/sdb), > while having attached VOLUME1 as well as /dev/sda > > I'm using Queens. Where I'm wrong? > > Thank you. > -- Volodymyr Litovka "Vision without Execution is Hallucination." -- Thomas Edison From rleander at redhat.com Tue Sep 11 16:01:57 2018 From: rleander at redhat.com (Rain Leander) Date: Tue, 11 Sep 2018 18:01:57 +0200 Subject: [Openstack] [ptg] Interviews at OpenStack PTG Denver In-Reply-To: References: Message-ID: Today I'm conducting interviews in the lunch room - look for the Interviews sign! https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing On Mon, Sep 10, 2018 at 6:40 PM Rain Leander wrote: > Today I'm conducting interviews in Blanca Peak and I'd love to talk about > what you worked on in Rocky and / or what you're planning to do in Stein. > I'm especially keen to hear about how new collaborators can join your > project! See you soon! > > ~R. > > On Sat, Sep 8, 2018 at 6:24 PM Rain Leander wrote: > >> Hello all! >> >> I'm attending PTG this week to conduct project interviews [0]. These >> interviews have several purposes. Please consider all of the following when >> thinking about what you might want to say in your interview: >> >> * Tell the users/customers/press what you've been working on in Rocky >> * Give them some idea of what's (what might be?) coming in Stein >> * Put a human face on the OpenStack project and encourage new >> participants to join us >> * You're welcome to promote your company's involvement in OpenStack but >> we ask that you avoid any kind of product pitches or job recruitment >> >> In the interview I'll ask some leading questions and it'll go easier if >> you've given some thought to them ahead of time: >> >> * Who are you? (Your name, your employer, and the project(s) on which you >> are active.) >> * What did you accomplish in Rocky? (Focus on the 2-3 things that will be >> most interesting to cloud operators) >> * What do you expect to be the focus in Stein? (At the time of your >> interview, it's likely that the meetings will not yet have decided anything >> firm. That's ok.) >> * Anything further about the project(s) you work on or the OpenStack >> community in general. 
>> >> Finally, note that there are only 40 interview slots available, so please >> consider coordinating with your project to designate the people that you >> want to represent the project, so that we don't end up with 12 interview >> about Neutron, or whatever. I mean, love me some Neutron, but twelve >> interviews is a bit too many, eh? >> >> It's fine to have multiple people in one interview - Maximum 3, probably. >> >> Interview slots are 30 minutes, in which time we hope to capture >> somewhere between 10 and 20 minutes of content. It's fine to run shorter >> but 15 minutes is probably an ideal length. >> >> See you SOON! >> >> [0] >> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >> -- >> K Rain Leander >> OpenStack Community Liaison >> Open Source and Standards Team >> https://www.rdoproject.org/ >> http://community.redhat.com >> > > > -- > K Rain Leander > OpenStack Community Liaison > Open Source and Standards Team > https://www.rdoproject.org/ > http://community.redhat.com > -- K Rain Leander OpenStack Community Liaison Open Source and Standards Team https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From rleander at redhat.com Tue Sep 11 20:05:34 2018 From: rleander at redhat.com (Rain Leander) Date: Tue, 11 Sep 2018 22:05:34 +0200 Subject: [Openstack] [ptg] Interviews at OpenStack PTG Denver In-Reply-To: References: Message-ID: This afternoon I'm in Boulder Creek; look for the Interviews sign. If you're a PTL, I wanna talk with you. If you want contributors, I wanna talk with you. If you're here at the PTG, I wanna talk with you. https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing ~Rain. On Tue, Sep 11, 2018 at 6:01 PM Rain Leander wrote: > Today I'm conducting interviews in the lunch room - look for the > Interviews sign! > https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing > > On Mon, Sep 10, 2018 at 6:40 PM Rain Leander wrote: > >> Today I'm conducting interviews in Blanca Peak and I'd love to talk about >> what you worked on in Rocky and / or what you're planning to do in Stein. >> I'm especially keen to hear about how new collaborators can join your >> project! See you soon! >> >> ~R. >> >> On Sat, Sep 8, 2018 at 6:24 PM Rain Leander wrote: >> >>> Hello all! >>> >>> I'm attending PTG this week to conduct project interviews [0]. These >>> interviews have several purposes. Please consider all of the following when >>> thinking about what you might want to say in your interview: >>> >>> * Tell the users/customers/press what you've been working on in Rocky >>> * Give them some idea of what's (what might be?) coming in Stein >>> * Put a human face on the OpenStack project and encourage new >>> participants to join us >>> * You're welcome to promote your company's involvement in OpenStack but >>> we ask that you avoid any kind of product pitches or job recruitment >>> >>> In the interview I'll ask some leading questions and it'll go easier if >>> you've given some thought to them ahead of time: >>> >>> * Who are you? (Your name, your employer, and the project(s) on which >>> you are active.) >>> * What did you accomplish in Rocky? (Focus on the 2-3 things that will >>> be most interesting to cloud operators) >>> * What do you expect to be the focus in Stein? 
(At the time of your >>> interview, it's likely that the meetings will not yet have decided anything >>> firm. That's ok.) >>> * Anything further about the project(s) you work on or the OpenStack >>> community in general. >>> >>> Finally, note that there are only 40 interview slots available, so >>> please consider coordinating with your project to designate the people that >>> you want to represent the project, so that we don't end up with 12 >>> interview about Neutron, or whatever. I mean, love me some Neutron, but >>> twelve interviews is a bit too many, eh? >>> >>> It's fine to have multiple people in one interview - Maximum 3, probably. >>> >>> Interview slots are 30 minutes, in which time we hope to capture >>> somewhere between 10 and 20 minutes of content. It's fine to run shorter >>> but 15 minutes is probably an ideal length. >>> >>> See you SOON! >>> >>> [0] >>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >>> -- >>> K Rain Leander >>> OpenStack Community Liaison >>> Open Source and Standards Team >>> https://www.rdoproject.org/ >>> http://community.redhat.com >>> >> >> >> -- >> K Rain Leander >> OpenStack Community Liaison >> Open Source and Standards Team >> https://www.rdoproject.org/ >> http://community.redhat.com >> > > > -- > K Rain Leander > OpenStack Community Liaison > Open Source and Standards Team > https://www.rdoproject.org/ > http://community.redhat.com > -- K Rain Leander OpenStack Community Liaison Open Source and Standards Team https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From reza.b2008 at gmail.com Wed Sep 12 11:31:18 2018 From: reza.b2008 at gmail.com (Reza Bakhshayeshi) Date: Wed, 12 Sep 2018 16:01:18 +0430 Subject: [Openstack] [masakari-monitors] sending notification conflict error Message-ID: Hi In newton masakari-monitors (version==2), when I shut off an instance, masakari engine repeatedly start and stop instance! I get the following error on compute node: 2018-09-12 11:15:27.584 3141 INFO masakarimonitors.instancemonitor.libvirt_handler.callback [-] Send a notification. 
2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback [-] HttpException: Conflict 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback Traceback (most recent call last): 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback File "/usr/local/lib/python2.7/dist-packages/masakarimonitors/instancemonitor/libvirt_handler/callback.py", line 89, in _post_event 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback payload=payload) 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback File "/usr/local/lib/python2.7/dist-packages/masakariclient/sdk/vmha/v1/_proxy.py", line 65, in create_notification 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback return self._create(_notification.Notification, **attrs) 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback File "/usr/lib/python2.7/dist-packages/openstack/proxy2.py", line 193, in _create 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback return res.create(self.session) 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback File "/usr/lib/python2.7/dist-packages/openstack/resource2.py", line 570, in create 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback json=request.body, headers=request.headers) 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback File "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 675, in post 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback return self.request(url, 'POST', **kwargs) 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback File "/usr/lib/python2.7/dist-packages/openstack/session.py", line 52, in map_exceptions_wrapper 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback http_status=e.http_status, cause=e) 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback HttpException: HttpException: Conflict 2018-09-12 11:15:27.897 3141 ERROR masakarimonitors.instancemonitor.libvirt_handler.callback 2018-09-12 11:15:27.993 3141 DEBUG masakarimonitors.instancemonitor.libvirt_handler.eventfilter [-] libvirt Event Received.type = VM hostname = compute1 uuid = 9e38b13e-ed53-4855-9037-1589da26415e time = 2018-09-12 11:15:27.993473 eventID = 0 eventType = 5 detail = 0 vir_event_filter /usr/local/lib/python2.7/dist-packages/masakarimonitors/instancemonitor/libvirt_handler/eventfilter.py:56 2018-09-12 11:15:27.994 3141 DEBUG masakarimonitors.instancemonitor.libvirt_handler.eventfilter [-] Event Filter Matched. vir_event_filter /usr/local/lib/python2.7/dist-packages/masakarimonitors/instancemonitor/libvirt_handler/eventfilter.py:60 2018-09-12 11:15:27.994 3141 INFO masakarimonitors.instancemonitor.libvirt_handler.callback [-] libvirt Event: type=VM hostname=compute1 uuid=9e38b13e-ed53-4855-9037-1589da26415e time=2018-09-12 11:15:27.993473 eventID=LIFECYCLE detail=STOPPED_SHUTDOWN 2018-09-12 11:15:27.995 3141 INFO masakarimonitors.instancemonitor.libvirt_handler.callback [-] Send a notification. 2018-09-12 11:15:28.263 3141 WARNING masakarimonitors.instancemonitor.libvirt_handler.callback [-] Retry sending a notification. 
(HttpException: Conflict) 2018-09-12 11:15:28.561 3141 WARNING masakarimonitors.instancemonitor.libvirt_handler.callback [-] Retry sending a notification. (HttpException: Conflict) Do you have any idea? Regards, Reza -------------- next part -------------- An HTML attachment was scrubbed... URL: From rleander at redhat.com Wed Sep 12 15:50:22 2018 From: rleander at redhat.com (Rain Leander) Date: Wed, 12 Sep 2018 09:50:22 -0600 Subject: [Openstack] [ptg] Interviews at OpenStack PTG Denver In-Reply-To: References: Message-ID: Today I’m fully booked in Platte River on the ballroom level. I’d love to talk to as many projects as possible during PTG. Let’s talk about what we accomplished in Rocky, our hopes for Stein, and how contributors can get in touch. See you soon! https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU ~Rain. On Tue, 11 Sep 2018 at 14:05, Rain Leander wrote: > This afternoon I'm in Boulder Creek; look for the Interviews sign. If > you're a PTL, I wanna talk with you. If you want contributors, I wanna talk > with you. If you're here at the PTG, I wanna talk with you. > > > https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing > > ~Rain. > > On Tue, Sep 11, 2018 at 6:01 PM Rain Leander wrote: > >> Today I'm conducting interviews in the lunch room - look for the >> Interviews sign! >> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >> >> On Mon, Sep 10, 2018 at 6:40 PM Rain Leander wrote: >> >>> Today I'm conducting interviews in Blanca Peak and I'd love to talk >>> about what you worked on in Rocky and / or what you're planning to do in >>> Stein. I'm especially keen to hear about how new collaborators can join >>> your project! See you soon! >>> >>> ~R. >>> >>> On Sat, Sep 8, 2018 at 6:24 PM Rain Leander wrote: >>> >>>> Hello all! >>>> >>>> I'm attending PTG this week to conduct project interviews [0]. These >>>> interviews have several purposes. Please consider all of the following when >>>> thinking about what you might want to say in your interview: >>>> >>>> * Tell the users/customers/press what you've been working on in Rocky >>>> * Give them some idea of what's (what might be?) coming in Stein >>>> * Put a human face on the OpenStack project and encourage new >>>> participants to join us >>>> * You're welcome to promote your company's involvement in OpenStack but >>>> we ask that you avoid any kind of product pitches or job recruitment >>>> >>>> In the interview I'll ask some leading questions and it'll go easier if >>>> you've given some thought to them ahead of time: >>>> >>>> * Who are you? (Your name, your employer, and the project(s) on which >>>> you are active.) >>>> * What did you accomplish in Rocky? (Focus on the 2-3 things that will >>>> be most interesting to cloud operators) >>>> * What do you expect to be the focus in Stein? (At the time of your >>>> interview, it's likely that the meetings will not yet have decided anything >>>> firm. That's ok.) >>>> * Anything further about the project(s) you work on or the OpenStack >>>> community in general. >>>> >>>> Finally, note that there are only 40 interview slots available, so >>>> please consider coordinating with your project to designate the people that >>>> you want to represent the project, so that we don't end up with 12 >>>> interview about Neutron, or whatever. I mean, love me some Neutron, but >>>> twelve interviews is a bit too many, eh? 
>>>> >>>> It's fine to have multiple people in one interview - Maximum 3, >>>> probably. >>>> >>>> Interview slots are 30 minutes, in which time we hope to capture >>>> somewhere between 10 and 20 minutes of content. It's fine to run shorter >>>> but 15 minutes is probably an ideal length. >>>> >>>> See you SOON! >>>> >>>> [0] >>>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >>>> -- >>>> K Rain Leander >>>> OpenStack Community Liaison >>>> Open Source and Standards Team >>>> https://www.rdoproject.org/ >>>> http://community.redhat.com >>>> >>> >>> >>> -- >>> K Rain Leander >>> OpenStack Community Liaison >>> Open Source and Standards Team >>> https://www.rdoproject.org/ >>> http://community.redhat.com >>> >> >> >> -- >> K Rain Leander >> OpenStack Community Liaison >> Open Source and Standards Team >> https://www.rdoproject.org/ >> http://community.redhat.com >> > > > -- > K Rain Leander > OpenStack Community Liaison > Open Source and Standards Team > https://www.rdoproject.org/ > http://community.redhat.com > -- K Rain Leander OpenStack Community Liaison Open Source and Standards Team https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From openflow.vrr at gmail.com Wed Sep 12 18:09:24 2018 From: openflow.vrr at gmail.com (Valdinei Rodrigues dos reis) Date: Wed, 12 Sep 2018 15:09:24 -0300 Subject: [Openstack] HA configuration misunderstanding Message-ID: Hi there. I'm configuring HA for Openstack services, have just configured GaleraDB with 5 nodes, and get stucked in indicating to Openstack services how to use this cluster. HA documentation guide says "OpenStack services are configured with the list of these IP addresses" But I cant figure out how to do this. Without HA configuration goes like: connection = mysql+pymysql://user:password at 172.16.0.226/glance I really appreciate any help. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam47priya at gmail.com Wed Sep 12 19:27:45 2018 From: sam47priya at gmail.com (Sam P) Date: Wed, 12 Sep 2018 12:27:45 -0700 Subject: [Openstack] [masakari-monitors] sending notification conflict error In-Reply-To: References: Message-ID: Hi Reza, Can you please share openstacksdk version in our compute node? --- Regards, Sampath On Wed, Sep 12, 2018 at 4:40 AM Reza Bakhshayeshi wrote: > Hi > In newton masakari-monitors (version==2), when I shut off an instance, > masakari engine repeatedly start and stop instance! > I get the following error on compute node: > > 2018-09-12 11:15:27.584 3141 INFO > masakarimonitors.instancemonitor.libvirt_handler.callback [-] Send a > notification. 
> 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback [-] > HttpException: Conflict > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback Traceback (most > recent call last): > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback File > "/usr/local/lib/python2.7/dist-packages/masakarimonitors/instancemonitor/libvirt_handler/callback.py", > line 89, in _post_event > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback > payload=payload) > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback File > "/usr/local/lib/python2.7/dist-packages/masakariclient/sdk/vmha/v1/_proxy.py", > line 65, in create_notification > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback return > self._create(_notification.Notification, **attrs) > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback File > "/usr/lib/python2.7/dist-packages/openstack/proxy2.py", line 193, in _create > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback return > res.create(self.session) > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback File > "/usr/lib/python2.7/dist-packages/openstack/resource2.py", line 570, in > create > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback > json=request.body, headers=request.headers) > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback File > "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 675, in > post > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback return > self.request(url, 'POST', **kwargs) > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback File > "/usr/lib/python2.7/dist-packages/openstack/session.py", line 52, in > map_exceptions_wrapper > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback > http_status=e.http_status, cause=e) > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback HttpException: > HttpException: Conflict > 2018-09-12 11:15:27.897 3141 ERROR > masakarimonitors.instancemonitor.libvirt_handler.callback > 2018-09-12 11:15:27.993 3141 DEBUG > masakarimonitors.instancemonitor.libvirt_handler.eventfilter [-] libvirt > Event Received.type = VM hostname = compute1 uuid = > 9e38b13e-ed53-4855-9037-1589da26415e time = 2018-09-12 11:15:27.993473 > eventID = 0 eventType = 5 detail = 0 vir_event_filter > /usr/local/lib/python2.7/dist-packages/masakarimonitors/instancemonitor/libvirt_handler/eventfilter.py:56 > 2018-09-12 11:15:27.994 3141 DEBUG > masakarimonitors.instancemonitor.libvirt_handler.eventfilter [-] Event > Filter Matched. 
vir_event_filter > /usr/local/lib/python2.7/dist-packages/masakarimonitors/instancemonitor/libvirt_handler/eventfilter.py:60 > 2018-09-12 11:15:27.994 3141 INFO > masakarimonitors.instancemonitor.libvirt_handler.callback [-] libvirt > Event: type=VM hostname=compute1 > uuid=9e38b13e-ed53-4855-9037-1589da26415e time=2018-09-12 > 11:15:27.993473 eventID=LIFECYCLE detail=STOPPED_SHUTDOWN > 2018-09-12 11:15:27.995 3141 INFO > masakarimonitors.instancemonitor.libvirt_handler.callback [-] Send a > notification. > 2018-09-12 11:15:28.263 3141 WARNING > masakarimonitors.instancemonitor.libvirt_handler.callback [-] Retry sending > a notification. (HttpException: Conflict) > 2018-09-12 11:15:28.561 3141 WARNING > masakarimonitors.instancemonitor.libvirt_handler.callback [-] Retry sending > a notification. (HttpException: Conflict) > > Do you have any idea? > > Regards, > Reza > _______________________________________________ > Mailing list: > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > -------------- next part -------------- An HTML attachment was scrubbed... URL: From reza.b2008 at gmail.com Wed Sep 12 22:02:00 2018 From: reza.b2008 at gmail.com (Reza Bakhshayeshi) Date: Thu, 13 Sep 2018 02:32:00 +0430 Subject: [Openstack] [masakari-monitors] sending notification conflict error In-Reply-To: References: Message-ID: Hi Sam, Thank you very much for your attention, The python-openstacksdk version on compute node is 0.9.5 Regards, Reza On Thu, 13 Sep 2018 at 00:02, Sam P wrote: > Hi Reza, > Can you please share openstacksdk version in our compute node? > > --- Regards, > Sampath > > > > On Wed, Sep 12, 2018 at 4:40 AM Reza Bakhshayeshi > wrote: > >> Hi >> In newton masakari-monitors (version==2), when I shut off an instance, >> masakari engine repeatedly start and stop instance! >> I get the following error on compute node: >> >> 2018-09-12 11:15:27.584 3141 INFO >> masakarimonitors.instancemonitor.libvirt_handler.callback [-] Send a >> notification. 
>> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback [-] >> HttpException: Conflict >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback Traceback (most >> recent call last): >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback File >> "/usr/local/lib/python2.7/dist-packages/masakarimonitors/instancemonitor/libvirt_handler/callback.py", >> line 89, in _post_event >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback >> payload=payload) >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback File >> "/usr/local/lib/python2.7/dist-packages/masakariclient/sdk/vmha/v1/_proxy.py", >> line 65, in create_notification >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback return >> self._create(_notification.Notification, **attrs) >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback File >> "/usr/lib/python2.7/dist-packages/openstack/proxy2.py", line 193, in _create >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback return >> res.create(self.session) >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback File >> "/usr/lib/python2.7/dist-packages/openstack/resource2.py", line 570, in >> create >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback >> json=request.body, headers=request.headers) >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback File >> "/usr/lib/python2.7/dist-packages/keystoneauth1/session.py", line 675, in >> post >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback return >> self.request(url, 'POST', **kwargs) >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback File >> "/usr/lib/python2.7/dist-packages/openstack/session.py", line 52, in >> map_exceptions_wrapper >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback >> http_status=e.http_status, cause=e) >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback HttpException: >> HttpException: Conflict >> 2018-09-12 11:15:27.897 3141 ERROR >> masakarimonitors.instancemonitor.libvirt_handler.callback >> 2018-09-12 11:15:27.993 3141 DEBUG >> masakarimonitors.instancemonitor.libvirt_handler.eventfilter [-] libvirt >> Event Received.type = VM hostname = compute1 uuid = >> 9e38b13e-ed53-4855-9037-1589da26415e time = 2018-09-12 11:15:27.993473 >> eventID = 0 eventType = 5 detail = 0 vir_event_filter >> /usr/local/lib/python2.7/dist-packages/masakarimonitors/instancemonitor/libvirt_handler/eventfilter.py:56 >> 2018-09-12 11:15:27.994 3141 DEBUG >> masakarimonitors.instancemonitor.libvirt_handler.eventfilter [-] Event >> Filter Matched. 
vir_event_filter >> /usr/local/lib/python2.7/dist-packages/masakarimonitors/instancemonitor/libvirt_handler/eventfilter.py:60 >> 2018-09-12 11:15:27.994 3141 INFO >> masakarimonitors.instancemonitor.libvirt_handler.callback [-] libvirt >> Event: type=VM hostname=compute1 >> uuid=9e38b13e-ed53-4855-9037-1589da26415e time=2018-09-12 >> 11:15:27.993473 eventID=LIFECYCLE detail=STOPPED_SHUTDOWN >> 2018-09-12 11:15:27.995 3141 INFO >> masakarimonitors.instancemonitor.libvirt_handler.callback [-] Send a >> notification. >> 2018-09-12 11:15:28.263 3141 WARNING >> masakarimonitors.instancemonitor.libvirt_handler.callback [-] Retry sending >> a notification. (HttpException: Conflict) >> 2018-09-12 11:15:28.561 3141 WARNING >> masakarimonitors.instancemonitor.libvirt_handler.callback [-] Retry sending >> a notification. (HttpException: Conflict) >> >> Do you have any idea? >> >> Regards, >> Reza >> _______________________________________________ >> Mailing list: >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> > _______________________________________________ > Mailing list: > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > -------------- next part -------------- An HTML attachment was scrubbed... URL: From berndbausch at gmail.com Thu Sep 13 06:27:51 2018 From: berndbausch at gmail.com (Bernd Bausch) Date: Thu, 13 Sep 2018 15:27:51 +0900 Subject: [Openstack] [Magnum] no documentation for openstack client commands Message-ID: <38afb58a-b34f-f25c-7004-cb3a5cc4ee4b@gmail.com> The Magnum user guide says [1]: " Refer to the OpenStack Command-Line Interface Reference for a full list of the commands supported by the openstack coe command-line client. " Unfortunately, the openstack CLI reference doesn't mention openstack coe commands. Or the description is hidden so well that I don't find it. To add insult to injury, when I click on the bug icon of the user guide page, I am told "Launchpad doesn't know what bug tracker Magnum uses. ". [1] https://docs.openstack.org/magnum/latest/user/index.html#using-the-command-line-client -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From eblock at nde.ag Thu Sep 13 06:35:56 2018 From: eblock at nde.ag (Eugen Block) Date: Thu, 13 Sep 2018 06:35:56 +0000 Subject: [Openstack] HA configuration misunderstanding In-Reply-To: Message-ID: <20180913063556.Horde.J_HkZyJeftGB6nc89GKDDHB@webmail.nde.ag> Hi, > HA documentation guide says "OpenStack services are configured with the > list of these IP addresses" I created a bug report for this issue [1] a couple of months ago. The HA guide is not very good at the moment. You'll have to use a virtual IP and configure OpenStack services to use that IP (or the respective hostname): [database] connection = mysql+pymysql://keystone:@/keystone I'm not finished yet, but I use a combination of HA guide, more or less old blog posts and trying to figure out the error messages. So good luck to you. ;-) Regards, Eugen [1] https://bugs.launchpad.net/openstack-manuals/+bug/1755108 Zitat von Valdinei Rodrigues dos reis : > Hi there. 
> > I'm configuring HA for Openstack services, have just configured GaleraDB > with 5 nodes, and get stucked in indicating to Openstack services how to > use this cluster. > > HA documentation guide says "OpenStack services are configured with the > list of these IP addresses" > > But I cant figure out how to do this. Without HA configuration goes like: > > connection = mysql+pymysql://user:password at 172.16.0.226/glance > > I really appreciate any help. From fungi at yuggoth.org Thu Sep 13 13:32:27 2018 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 13 Sep 2018 13:32:27 +0000 Subject: [Openstack] [Magnum] no documentation for openstack client commands In-Reply-To: <38afb58a-b34f-f25c-7004-cb3a5cc4ee4b@gmail.com> References: <38afb58a-b34f-f25c-7004-cb3a5cc4ee4b@gmail.com> Message-ID: <20180913133227.ghtoukz6wn2dwv6x@yuggoth.org> On 2018-09-13 15:27:51 +0900 (+0900), Bernd Bausch wrote: [...] > To add insult to injury, when I click on the bug icon of the user > guide page, I am told "Launchpad doesn't know what bug tracker > Magnum uses. ". [...] At the top of https://launchpad.net/magnum it states "Note: Please file bugs in https://storyboard.openstack.org" (the last time I looked, it was still not possible to indicate arbitrary defect trackers from the bugs sub-pages in LP, only certain trackers the developers of LP decided to build integration for). It might be a little more user-friendly if they at least linked https://storyboard.openstack.org/#!/project/openstack/magnum instead of the base URL for SB on that main project page, but hopefully it's enough for most people to find what they're looking for. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From fungi at yuggoth.org Thu Sep 13 14:03:11 2018 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 13 Sep 2018 14:03:11 +0000 Subject: [Openstack] [Magnum] no documentation for openstack client commands In-Reply-To: <20180913133227.ghtoukz6wn2dwv6x@yuggoth.org> References: <38afb58a-b34f-f25c-7004-cb3a5cc4ee4b@gmail.com> <20180913133227.ghtoukz6wn2dwv6x@yuggoth.org> Message-ID: <20180913140311.2dgwwqhdxdmka32p@yuggoth.org> On 2018-09-13 13:32:27 +0000 (+0000), Jeremy Stanley wrote: > On 2018-09-13 15:27:51 +0900 (+0900), Bernd Bausch wrote: > [...] > > To add insult to injury, when I click on the bug icon of the user > > guide page, I am told "Launchpad doesn't know what bug tracker > > Magnum uses. ". > [...] > > At the top of https://launchpad.net/magnum it states "Note: Please > file bugs in https://storyboard.openstack.org" (the last time I > looked, it was still not possible to indicate arbitrary defect > trackers from the bugs sub-pages in LP, only certain trackers the > developers of LP decided to build integration for). It might be a > little more user-friendly if they at least linked > https://storyboard.openstack.org/#!/project/openstack/magnum instead > of the base URL for SB on that main project page, but hopefully it's > enough for most people to find what they're looking for. And yes, it does appear from the results of `git grep -i launchpad` in the openstack/magnum repo that they have quite a few places (contributing, readme, docs index...) where they need to update the URL for defect reporting to no longer be on launchpad.net. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From rleander at redhat.com Thu Sep 13 14:52:14 2018 From: rleander at redhat.com (Rain Leander) Date: Thu, 13 Sep 2018 08:52:14 -0600 Subject: [Openstack] [ptg] Interviews at OpenStack PTG Denver In-Reply-To: References: Message-ID: I have two shifts open today in Longs Peak on the third floor! https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU On Wed, 12 Sep 2018 at 09:50, Rain Leander wrote: > Today I’m fully booked in Platte River on the ballroom level. I’d love to > talk to as many projects as possible during PTG. Let’s talk about what we > accomplished in Rocky, our hopes for Stein, and how contributors can get in > touch. > > See you soon! > > > https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU > > ~Rain. > > On Tue, 11 Sep 2018 at 14:05, Rain Leander wrote: > >> This afternoon I'm in Boulder Creek; look for the Interviews sign. If >> you're a PTL, I wanna talk with you. If you want contributors, I wanna talk >> with you. If you're here at the PTG, I wanna talk with you. >> >> >> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >> >> ~Rain. >> >> On Tue, Sep 11, 2018 at 6:01 PM Rain Leander wrote: >> >>> Today I'm conducting interviews in the lunch room - look for the >>> Interviews sign! >>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >>> >>> On Mon, Sep 10, 2018 at 6:40 PM Rain Leander >>> wrote: >>> >>>> Today I'm conducting interviews in Blanca Peak and I'd love to talk >>>> about what you worked on in Rocky and / or what you're planning to do in >>>> Stein. I'm especially keen to hear about how new collaborators can join >>>> your project! See you soon! >>>> >>>> ~R. >>>> >>>> On Sat, Sep 8, 2018 at 6:24 PM Rain Leander >>>> wrote: >>>> >>>>> Hello all! >>>>> >>>>> I'm attending PTG this week to conduct project interviews [0]. These >>>>> interviews have several purposes. Please consider all of the following when >>>>> thinking about what you might want to say in your interview: >>>>> >>>>> * Tell the users/customers/press what you've been working on in Rocky >>>>> * Give them some idea of what's (what might be?) coming in Stein >>>>> * Put a human face on the OpenStack project and encourage new >>>>> participants to join us >>>>> * You're welcome to promote your company's involvement in OpenStack >>>>> but we ask that you avoid any kind of product pitches or job recruitment >>>>> >>>>> In the interview I'll ask some leading questions and it'll go easier >>>>> if you've given some thought to them ahead of time: >>>>> >>>>> * Who are you? (Your name, your employer, and the project(s) on which >>>>> you are active.) >>>>> * What did you accomplish in Rocky? (Focus on the 2-3 things that will >>>>> be most interesting to cloud operators) >>>>> * What do you expect to be the focus in Stein? (At the time of your >>>>> interview, it's likely that the meetings will not yet have decided anything >>>>> firm. That's ok.) >>>>> * Anything further about the project(s) you work on or the OpenStack >>>>> community in general. >>>>> >>>>> Finally, note that there are only 40 interview slots available, so >>>>> please consider coordinating with your project to designate the people that >>>>> you want to represent the project, so that we don't end up with 12 >>>>> interview about Neutron, or whatever. 
I mean, love me some Neutron, but >>>>> twelve interviews is a bit too many, eh? >>>>> >>>>> It's fine to have multiple people in one interview - Maximum 3, >>>>> probably. >>>>> >>>>> Interview slots are 30 minutes, in which time we hope to capture >>>>> somewhere between 10 and 20 minutes of content. It's fine to run shorter >>>>> but 15 minutes is probably an ideal length. >>>>> >>>>> See you SOON! >>>>> >>>>> [0] >>>>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >>>>> -- >>>>> K Rain Leander >>>>> OpenStack Community Liaison >>>>> Open Source and Standards Team >>>>> https://www.rdoproject.org/ >>>>> http://community.redhat.com >>>>> >>>> >>>> >>>> -- >>>> K Rain Leander >>>> OpenStack Community Liaison >>>> Open Source and Standards Team >>>> https://www.rdoproject.org/ >>>> http://community.redhat.com >>>> >>> >>> >>> -- >>> K Rain Leander >>> OpenStack Community Liaison >>> Open Source and Standards Team >>> https://www.rdoproject.org/ >>> http://community.redhat.com >>> >> >> >> -- >> K Rain Leander >> OpenStack Community Liaison >> Open Source and Standards Team >> https://www.rdoproject.org/ >> http://community.redhat.com >> > -- > K Rain Leander > OpenStack Community Liaison > Open Source and Standards Team > https://www.rdoproject.org/ > http://community.redhat.com > -- K Rain Leander OpenStack Community Liaison Open Source and Standards Team https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From berndbausch at gmail.com Thu Sep 13 16:01:37 2018 From: berndbausch at gmail.com (Bernd Bausch) Date: Fri, 14 Sep 2018 01:01:37 +0900 Subject: [Openstack] [Magnum] no documentation for openstack client commands In-Reply-To: <20180913140311.2dgwwqhdxdmka32p@yuggoth.org> References: <38afb58a-b34f-f25c-7004-cb3a5cc4ee4b@gmail.com> <20180913133227.ghtoukz6wn2dwv6x@yuggoth.org> <20180913140311.2dgwwqhdxdmka32p@yuggoth.org> Message-ID: Thanks Jeremy. But the bug symbol is so cute, they should take advantage of it. Good opportunity to learn about Storyboard. Bernd. On 9/13/2018 11:03 PM, Jeremy Stanley wrote: > On 2018-09-13 13:32:27 +0000 (+0000), Jeremy Stanley wrote: >> On 2018-09-13 15:27:51 +0900 (+0900), Bernd Bausch wrote: >> [...] >>> To add insult to injury, when I click on the bug icon of the user >>> guide page, I am told "Launchpad doesn't know what bug tracker >>> Magnum uses. ". >> [...] >> >> At the top of https://launchpad.net/magnum it states "Note: Please >> file bugs in https://storyboard.openstack.org" (the last time I >> looked, it was still not possible to indicate arbitrary defect >> trackers from the bugs sub-pages in LP, only certain trackers the >> developers of LP decided to build integration for). It might be a >> little more user-friendly if they at least linked >> https://storyboard.openstack.org/#!/project/openstack/magnum instead >> of the base URL for SB on that main project page, but hopefully it's >> enough for most people to find what they're looking for. > And yes, it does appear from the results of `git grep -i launchpad` > in the openstack/magnum repo that they have quite a few places > (contributing, readme, docs index...) where they need to update the > URL for defect reporting to no longer be on launchpad.net. 
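(As a rough illustration of how those stale references can be spotted, the paths below are assumptions rather than an exact recipe; something along these lines in a local clone of openstack/magnum lists the candidates:

  git grep -in 'launchpad.net/magnum' -- README.rst CONTRIBUTING.rst doc/
  git grep -in 'bugs.launchpad.net' -- doc/ releasenotes/

Each hit is a candidate to point at https://storyboard.openstack.org/#!/project/openstack/magnum instead.)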
> > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From rleander at redhat.com Thu Sep 13 20:08:48 2018 From: rleander at redhat.com (Rain Leander) Date: Thu, 13 Sep 2018 22:08:48 +0200 Subject: [Openstack] [ptg] Interviews at OpenStack PTG Denver In-Reply-To: References: Message-ID: There are only THREE shifts available tomorrow in Platte River on the ballroom level. Sign up, quick quick like a bunny, as I'd love to talk with you about what you worked on in Rocky, what's your focus in Stein, and how we can generate more collaborators for your project or SIG. https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing ~Rain. On Thu, Sep 13, 2018 at 4:52 PM Rain Leander wrote: > I have two shifts open today in Longs Peak on the third floor! > > > https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU > > > On Wed, 12 Sep 2018 at 09:50, Rain Leander wrote: > >> Today I’m fully booked in Platte River on the ballroom level. I’d love to >> talk to as many projects as possible during PTG. Let’s talk about what we >> accomplished in Rocky, our hopes for Stein, and how contributors can get in >> touch. >> >> See you soon! >> >> >> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU >> >> ~Rain. >> >> On Tue, 11 Sep 2018 at 14:05, Rain Leander wrote: >> >>> This afternoon I'm in Boulder Creek; look for the Interviews sign. If >>> you're a PTL, I wanna talk with you. If you want contributors, I wanna talk >>> with you. If you're here at the PTG, I wanna talk with you. >>> >>> >>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >>> >>> ~Rain. >>> >>> On Tue, Sep 11, 2018 at 6:01 PM Rain Leander >>> wrote: >>> >>>> Today I'm conducting interviews in the lunch room - look for the >>>> Interviews sign! >>>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >>>> >>>> On Mon, Sep 10, 2018 at 6:40 PM Rain Leander >>>> wrote: >>>> >>>>> Today I'm conducting interviews in Blanca Peak and I'd love to talk >>>>> about what you worked on in Rocky and / or what you're planning to do in >>>>> Stein. I'm especially keen to hear about how new collaborators can join >>>>> your project! See you soon! >>>>> >>>>> ~R. >>>>> >>>>> On Sat, Sep 8, 2018 at 6:24 PM Rain Leander >>>>> wrote: >>>>> >>>>>> Hello all! >>>>>> >>>>>> I'm attending PTG this week to conduct project interviews [0]. These >>>>>> interviews have several purposes. Please consider all of the following when >>>>>> thinking about what you might want to say in your interview: >>>>>> >>>>>> * Tell the users/customers/press what you've been working on in Rocky >>>>>> * Give them some idea of what's (what might be?) 
coming in Stein >>>>>> * Put a human face on the OpenStack project and encourage new >>>>>> participants to join us >>>>>> * You're welcome to promote your company's involvement in OpenStack >>>>>> but we ask that you avoid any kind of product pitches or job recruitment >>>>>> >>>>>> In the interview I'll ask some leading questions and it'll go easier >>>>>> if you've given some thought to them ahead of time: >>>>>> >>>>>> * Who are you? (Your name, your employer, and the project(s) on which >>>>>> you are active.) >>>>>> * What did you accomplish in Rocky? (Focus on the 2-3 things that >>>>>> will be most interesting to cloud operators) >>>>>> * What do you expect to be the focus in Stein? (At the time of your >>>>>> interview, it's likely that the meetings will not yet have decided anything >>>>>> firm. That's ok.) >>>>>> * Anything further about the project(s) you work on or the OpenStack >>>>>> community in general. >>>>>> >>>>>> Finally, note that there are only 40 interview slots available, so >>>>>> please consider coordinating with your project to designate the people that >>>>>> you want to represent the project, so that we don't end up with 12 >>>>>> interview about Neutron, or whatever. I mean, love me some Neutron, but >>>>>> twelve interviews is a bit too many, eh? >>>>>> >>>>>> It's fine to have multiple people in one interview - Maximum 3, >>>>>> probably. >>>>>> >>>>>> Interview slots are 30 minutes, in which time we hope to capture >>>>>> somewhere between 10 and 20 minutes of content. It's fine to run shorter >>>>>> but 15 minutes is probably an ideal length. >>>>>> >>>>>> See you SOON! >>>>>> >>>>>> [0] >>>>>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >>>>>> -- >>>>>> K Rain Leander >>>>>> OpenStack Community Liaison >>>>>> Open Source and Standards Team >>>>>> https://www.rdoproject.org/ >>>>>> http://community.redhat.com >>>>>> >>>>> >>>>> >>>>> -- >>>>> K Rain Leander >>>>> OpenStack Community Liaison >>>>> Open Source and Standards Team >>>>> https://www.rdoproject.org/ >>>>> http://community.redhat.com >>>>> >>>> >>>> >>>> -- >>>> K Rain Leander >>>> OpenStack Community Liaison >>>> Open Source and Standards Team >>>> https://www.rdoproject.org/ >>>> http://community.redhat.com >>>> >>> >>> >>> -- >>> K Rain Leander >>> OpenStack Community Liaison >>> Open Source and Standards Team >>> https://www.rdoproject.org/ >>> http://community.redhat.com >>> >> -- >> K Rain Leander >> OpenStack Community Liaison >> Open Source and Standards Team >> https://www.rdoproject.org/ >> http://community.redhat.com >> > -- > K Rain Leander > OpenStack Community Liaison > Open Source and Standards Team > https://www.rdoproject.org/ > http://community.redhat.com > -- K Rain Leander OpenStack Community Liaison Open Source and Standards Team https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From rleander at redhat.com Fri Sep 14 14:57:48 2018 From: rleander at redhat.com (Rain Leander) Date: Fri, 14 Sep 2018 08:57:48 -0600 Subject: [Openstack] [ptg] Interviews at OpenStack PTG Denver In-Reply-To: References: Message-ID: Only ONE open shift left for interviews in Platte River. Sign up if you’re keen. https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU On Thu, 13 Sep 2018 at 14:08, Rain Leander wrote: > There are only THREE shifts available tomorrow in Platte River on the > ballroom level. 
Sign up, quick quick like a bunny, as I'd love to talk with > you about what you worked on in Rocky, what's your focus in Stein, and how > we can generate more collaborators for your project or SIG. > > > https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing > > ~Rain. > > > On Thu, Sep 13, 2018 at 4:52 PM Rain Leander wrote: > >> I have two shifts open today in Longs Peak on the third floor! >> >> >> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU >> >> >> On Wed, 12 Sep 2018 at 09:50, Rain Leander wrote: >> >>> Today I’m fully booked in Platte River on the ballroom level. I’d love >>> to talk to as many projects as possible during PTG. Let’s talk about what >>> we accomplished in Rocky, our hopes for Stein, and how contributors can get >>> in touch. >>> >>> See you soon! >>> >>> >>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU >>> >>> ~Rain. >>> >>> On Tue, 11 Sep 2018 at 14:05, Rain Leander wrote: >>> >>>> This afternoon I'm in Boulder Creek; look for the Interviews sign. If >>>> you're a PTL, I wanna talk with you. If you want contributors, I wanna talk >>>> with you. If you're here at the PTG, I wanna talk with you. >>>> >>>> >>>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >>>> >>>> ~Rain. >>>> >>>> On Tue, Sep 11, 2018 at 6:01 PM Rain Leander >>>> wrote: >>>> >>>>> Today I'm conducting interviews in the lunch room - look for the >>>>> Interviews sign! >>>>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >>>>> >>>>> On Mon, Sep 10, 2018 at 6:40 PM Rain Leander >>>>> wrote: >>>>> >>>>>> Today I'm conducting interviews in Blanca Peak and I'd love to talk >>>>>> about what you worked on in Rocky and / or what you're planning to do in >>>>>> Stein. I'm especially keen to hear about how new collaborators can join >>>>>> your project! See you soon! >>>>>> >>>>>> ~R. >>>>>> >>>>>> On Sat, Sep 8, 2018 at 6:24 PM Rain Leander >>>>>> wrote: >>>>>> >>>>>>> Hello all! >>>>>>> >>>>>>> I'm attending PTG this week to conduct project interviews [0]. These >>>>>>> interviews have several purposes. Please consider all of the following when >>>>>>> thinking about what you might want to say in your interview: >>>>>>> >>>>>>> * Tell the users/customers/press what you've been working on in Rocky >>>>>>> * Give them some idea of what's (what might be?) coming in Stein >>>>>>> * Put a human face on the OpenStack project and encourage new >>>>>>> participants to join us >>>>>>> * You're welcome to promote your company's involvement in OpenStack >>>>>>> but we ask that you avoid any kind of product pitches or job recruitment >>>>>>> >>>>>>> In the interview I'll ask some leading questions and it'll go easier >>>>>>> if you've given some thought to them ahead of time: >>>>>>> >>>>>>> * Who are you? (Your name, your employer, and the project(s) on >>>>>>> which you are active.) >>>>>>> * What did you accomplish in Rocky? (Focus on the 2-3 things that >>>>>>> will be most interesting to cloud operators) >>>>>>> * What do you expect to be the focus in Stein? (At the time of your >>>>>>> interview, it's likely that the meetings will not yet have decided anything >>>>>>> firm. That's ok.) >>>>>>> * Anything further about the project(s) you work on or the OpenStack >>>>>>> community in general. 
>>>>>>> >>>>>>> Finally, note that there are only 40 interview slots available, so >>>>>>> please consider coordinating with your project to designate the people that >>>>>>> you want to represent the project, so that we don't end up with 12 >>>>>>> interview about Neutron, or whatever. I mean, love me some Neutron, but >>>>>>> twelve interviews is a bit too many, eh? >>>>>>> >>>>>>> It's fine to have multiple people in one interview - Maximum 3, >>>>>>> probably. >>>>>>> >>>>>>> Interview slots are 30 minutes, in which time we hope to capture >>>>>>> somewhere between 10 and 20 minutes of content. It's fine to run shorter >>>>>>> but 15 minutes is probably an ideal length. >>>>>>> >>>>>>> See you SOON! >>>>>>> >>>>>>> [0] >>>>>>> https://docs.google.com/spreadsheets/d/19XjQPeE9ZobK1b49aM-J7P-xQC-OKLNeFnwxavyQCgU/edit?usp=sharing >>>>>>> -- >>>>>>> K Rain Leander >>>>>>> OpenStack Community Liaison >>>>>>> Open Source and Standards Team >>>>>>> https://www.rdoproject.org/ >>>>>>> http://community.redhat.com >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> K Rain Leander >>>>>> OpenStack Community Liaison >>>>>> Open Source and Standards Team >>>>>> https://www.rdoproject.org/ >>>>>> http://community.redhat.com >>>>>> >>>>> >>>>> >>>>> -- >>>>> K Rain Leander >>>>> OpenStack Community Liaison >>>>> Open Source and Standards Team >>>>> https://www.rdoproject.org/ >>>>> http://community.redhat.com >>>>> >>>> >>>> >>>> -- >>>> K Rain Leander >>>> OpenStack Community Liaison >>>> Open Source and Standards Team >>>> https://www.rdoproject.org/ >>>> http://community.redhat.com >>>> >>> -- >>> K Rain Leander >>> OpenStack Community Liaison >>> Open Source and Standards Team >>> https://www.rdoproject.org/ >>> http://community.redhat.com >>> >> -- >> K Rain Leander >> OpenStack Community Liaison >> Open Source and Standards Team >> https://www.rdoproject.org/ >> http://community.redhat.com >> > > > -- > K Rain Leander > OpenStack Community Liaison > Open Source and Standards Team > https://www.rdoproject.org/ > http://community.redhat.com > -- K Rain Leander OpenStack Community Liaison Open Source and Standards Team https://www.rdoproject.org/ http://community.redhat.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From codeology.lab at gmail.com Fri Sep 14 22:21:29 2018 From: codeology.lab at gmail.com (Cody) Date: Fri, 14 Sep 2018 18:21:29 -0400 Subject: [Openstack] [neutron][tripleo] Neutron external bridge setting Message-ID: Hello everyone, Could someone kindly help explain the following paragraphs taken from the OpenStack TripleO documentation [1]? "By default, Neutron is configured with an empty string for the Neutron external bridge mapping. This results in the physical interface being patched to br-int, rather than using br-ex directly (as in previous versions). This model allows for multiple floating IP networks, using either VLANs or multiple physical connections. When using only one floating IP network on the native VLAN of a bridge, then you can optionally set the Neutron external bridge to e.g. “br-ex”. This results in the packets only having to traverse one bridge (instead of two), and may result in slightly lower CPU when passing traffic over the floating IP network." Here I have difficulties to understand the traffic flow in the case of setting an empty string for the Neutron external bridge mapping (NeutronExternalNetworkBridge). My interpretation is that an empty string would map a physical interface directly to the br-int instead of the br-ex. 
Does it mean to bypass the br-ex and use the br-int for external traffics? If so, how come this would be less efficient as oppose to using br-int <-> br-ex <-> physical interface? This is where I got confused. Any helps would be greatly appreciated. Thank you very much to all. [1]: https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/network_isolation.html#using-the-native-vlan-for-floating-ips Regards, Cody From satish.txt at gmail.com Sat Sep 15 20:22:28 2018 From: satish.txt at gmail.com (Satish Patel) Date: Sat, 15 Sep 2018 16:22:28 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance Message-ID: Folks, I need some advice or suggestion to find out what is going on with my network, we have notice high packet loss on openstack instance and not sure what is going on, same time if i check on host machine and it has zero packet loss.. this is what i did for test... ping 8.8.8.8 from instance: 50% packet loss from compute host: 0% packet loss I have disabled TSO/GSO/SG setting on physical compute node but still getting packet loss. We have 10G NIC on our network, look like something related to tap interface setting.. From satish.txt at gmail.com Sun Sep 16 04:52:01 2018 From: satish.txt at gmail.com (Satish Patel) Date: Sun, 16 Sep 2018 00:52:01 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: Message-ID: [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop RX errors 0 dropped 0 overruns 0 frame 0 TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 Noticed tap interface dropping TX packets and even after increasing txqueue from 1000 to 10000 nothing changed, still getting packet drops. On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > > Folks, > > I need some advice or suggestion to find out what is going on with my > network, we have notice high packet loss on openstack instance and not > sure what is going on, same time if i check on host machine and it has > zero packet loss.. this is what i did for test... > > ping 8.8.8.8 > > from instance: 50% packet loss > from compute host: 0% packet loss > > I have disabled TSO/GSO/SG setting on physical compute node but still > getting packet loss. > > We have 10G NIC on our network, look like something related to tap > interface setting.. From limao at cisco.com Sun Sep 16 06:27:21 2018 From: limao at cisco.com (Liping Mao (limao)) Date: Sun, 16 Sep 2018 06:27:21 +0000 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: Message-ID: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> Hi Satish, Did your packet loss happen always or it only happened when heavy load? AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. If it happened in heavy load, couple of things you can try: 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. 
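As a rough sketch of what 1) and 2) above usually look like in practice (the tap name is the one from the earlier ifconfig output; the image UUID and queue count are only examples):

  # 1) raise the tap device transmit queue length on the compute node
  ip link set tap5af7f525-5f txqueuelen 10000

  # 2) enable virtio multiqueue via the image property, then boot a new
  #    instance from that image; nova creates one queue pair per vCPU
  openstack image set --property hw_vif_multiqueue_enabled=true <image-uuid>

  # inside the guest, activate the extra queues (up to the vCPU count)
  ethtool -L eth0 combined 4

The multiqueue property generally only takes effect for instances created from the image after it is set, so existing VMs have to be re-created to pick it up.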
I did not actually used them in our env, you may refer to [3] [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html Regards, Liping Mao 在 2018/9/16 13:07,“Satish Patel” 写入: [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop RX errors 0 dropped 0 overruns 0 frame 0 TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 Noticed tap interface dropping TX packets and even after increasing txqueue from 1000 to 10000 nothing changed, still getting packet drops. On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > > Folks, > > I need some advice or suggestion to find out what is going on with my > network, we have notice high packet loss on openstack instance and not > sure what is going on, same time if i check on host machine and it has > zero packet loss.. this is what i did for test... > > ping 8.8.8.8 > > from instance: 50% packet loss > from compute host: 0% packet loss > > I have disabled TSO/GSO/SG setting on physical compute node but still > getting packet loss. > > We have 10G NIC on our network, look like something related to tap > interface setting.. _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From satish.txt at gmail.com Sun Sep 16 13:18:39 2018 From: satish.txt at gmail.com (Satish Patel) Date: Sun, 16 Sep 2018 09:18:39 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> Message-ID: <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> Hi Liping, Thank you for your reply, We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. For SRIOV I have to look if I have support in my nic. We are using queens so I think queue size option not possible :( We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. I will share my result as soon as I try multiqueue. Sent from my iPhone > On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > > Hi Satish, > > > > Did your packet loss happen always or it only happened when heavy load? > > AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > > > > If it happened in heavy load, couple of things you can try: > > 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > > 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > > 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > > > > If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. 
> > I did not actually used them in our env, you may refer to [3] > > > > [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > > [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > > [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > > > > Regards, > > Liping Mao > > > > 在 2018/9/16 13:07,“Satish Patel” 写入: > > > > [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > > RX errors 0 dropped 0 overruns 0 frame 0 > > TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > > > > Noticed tap interface dropping TX packets and even after increasing > > txqueue from 1000 to 10000 nothing changed, still getting packet > > drops. > > > >> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: >> >> > >> Folks, > >> > >> I need some advice or suggestion to find out what is going on with my > >> network, we have notice high packet loss on openstack instance and not > >> sure what is going on, same time if i check on host machine and it has > >> zero packet loss.. this is what i did for test... > >> > >> ping 8.8.8.8 > >> > >> from instance: 50% packet loss > >> from compute host: 0% packet loss > >> > >> I have disabled TSO/GSO/SG setting on physical compute node but still > >> getting packet loss. > >> > >> We have 10G NIC on our network, look like something related to tap > >> interface setting.. > > > > _______________________________________________ > > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > Post to : openstack at lists.openstack.org > > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > From limao at cisco.com Sun Sep 16 14:50:58 2018 From: limao at cisco.com (Liping Mao (limao)) Date: Sun, 16 Sep 2018 14:50:58 +0000 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com>, <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> Message-ID: <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. 
Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ Regards, Liping Mao 在 2018年9月16日,21:18,Satish Patel > 写道: Hi Liping, Thank you for your reply, We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. For SRIOV I have to look if I have support in my nic. We are using queens so I think queue size option not possible :( We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. I will share my result as soon as I try multiqueue. Sent from my iPhone On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) > wrote: Hi Satish, Did your packet loss happen always or it only happened when heavy load? AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. If it happened in heavy load, couple of things you can try: 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. I did not actually used them in our env, you may refer to [3] [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html Regards, Liping Mao 在 2018/9/16 13:07,“Satish Patel”> 写入: [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop RX errors 0 dropped 0 overruns 0 frame 0 TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 Noticed tap interface dropping TX packets and even after increasing txqueue from 1000 to 10000 nothing changed, still getting packet drops. On Sat, Sep 15, 2018 at 4:22 PM Satish Patel > wrote: Folks, I need some advice or suggestion to find out what is going on with my network, we have notice high packet loss on openstack instance and not sure what is going on, same time if i check on host machine and it has zero packet loss.. this is what i did for test... ping 8.8.8.8 from instance: 50% packet loss from compute host: 0% packet loss I have disabled TSO/GSO/SG setting on physical compute node but still getting packet loss. We have 10G NIC on our network, look like something related to tap interface setting.. 
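The dedicated-CPU / host-aggregate wiring mentioned earlier in this
message can be sketched roughly as follows; the aggregate, flavor and
host names are made up for illustration, but the extra specs are the
standard Nova ones (same approach as the blog post in [a]):

    # Aggregate for compute nodes reserved for pinned, NUMA-aware guests
    openstack aggregate create --property pinned=true agg-voip-pinned
    openstack aggregate add host agg-voip-pinned compute-47

    # Flavor asking for dedicated pCPUs on a single NUMA node, steered
    # to that aggregate (needs AggregateInstanceExtraSpecsFilter enabled)
    openstack flavor create --vcpus 8 --ram 8192 --disk 40 voip.pinned
    openstack flavor set voip.pinned \
      --property hw:cpu_policy=dedicated \
      --property hw:numa_nodes=1 \
      --property aggregate_instance_extra_specs:pinned=true

On those hosts it also helps to keep guests off the cores reserved for
the hypervisor (vcpu_pin_set in nova.conf), and the physical NIC ring
buffers mentioned above can be checked with "ethtool -g <nic>" and raised
with "ethtool -G <nic> rx <max> tx <max>" within what the hardware reports.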
_______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Sun Sep 16 15:08:55 2018 From: satish.txt at gmail.com (Satish Patel) Date: Sun, 16 Sep 2018 11:08:55 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> Message-ID: Thanks Liping, I am using libvertd 3.9.0 version so look like i am eligible take advantage of that feature. phew! [root at compute-47 ~]# libvirtd -V libvirtd (libvirt) 3.9.0 Let me tell you how i am running instance on my openstack, my compute has 32 core / 32G memory and i have created two instance on compute node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have kept 2 core for compute node). on compute node i disabled overcommit using ratio (1.0) I didn't configure NUMA yet because i wasn't aware of this feature, as per your last post do you think numa will help to fix this issue? following is my numa view [root at compute-47 ~]# numactl --hardware available: 2 nodes (0-1) node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 node 0 size: 16349 MB node 0 free: 133 MB node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 node 1 size: 16383 MB node 1 free: 317 MB node distances: node 0 1 0: 10 20 1: 20 10 I am not using any L3 router, i am using provide VLAN network and using Cisco Nexus switch for my L3 function so i am not seeing any bottleneck there. This is the 10G NIC i have on all my compute node, dual 10G port with bonding (20G) 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10) 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10) On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: > > It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > > We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. > > Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. > > BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. 
Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. > > If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > > [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > > Regards, > Liping Mao > > 在 2018年9月16日,21:18,Satish Patel 写道: > > Hi Liping, > > Thank you for your reply, > > We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. > > For SRIOV I have to look if I have support in my nic. > > We are using queens so I think queue size option not possible :( > > We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. > > I will share my result as soon as I try multiqueue. > > > > Sent from my iPhone > > On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > > > Hi Satish, > > > > > Did your packet loss happen always or it only happened when heavy load? > > > AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > > > > > If it happened in heavy load, couple of things you can try: > > > 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > > > 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > > > 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > > > > > If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. > > > I did not actually used them in our env, you may refer to [3] > > > > > [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > > > [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > > > [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > > > > > Regards, > > > Liping Mao > > > > > 在 2018/9/16 13:07,“Satish Patel” 写入: > > > > > [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > > > RX errors 0 dropped 0 overruns 0 frame 0 > > > TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > > > > > Noticed tap interface dropping TX packets and even after increasing > > > txqueue from 1000 to 10000 nothing changed, still getting packet > > > drops. > > > > > On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > > > > > Folks, > > > > > I need some advice or suggestion to find out what is going on with my > > > network, we have notice high packet loss on openstack instance and not > > > sure what is going on, same time if i check on host machine and it has > > > zero packet loss.. this is what i did for test... > > > > > ping 8.8.8.8 > > > > > from instance: 50% packet loss > > > from compute host: 0% packet loss > > > > > I have disabled TSO/GSO/SG setting on physical compute node but still > > > getting packet loss. 
> > > > > We have 10G NIC on our network, look like something related to tap > > > interface setting.. > > > > > _______________________________________________ > > > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > Post to : openstack at lists.openstack.org > > > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > From limao at cisco.com Sun Sep 16 15:25:06 2018 From: limao at cisco.com (Liping Mao (limao)) Date: Sun, 16 Sep 2018 15:25:06 +0000 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> Message-ID: I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. And your Nexus 5k/7k will not be bottleneck for sure ;-) Thanks, Liping Mao > 在 2018年9月16日,23:09,Satish Patel 写道: > > Thanks Liping, > > I am using libvertd 3.9.0 version so look like i am eligible take > advantage of that feature. phew! > > [root at compute-47 ~]# libvirtd -V > libvirtd (libvirt) 3.9.0 > > Let me tell you how i am running instance on my openstack, my compute > has 32 core / 32G memory and i have created two instance on compute > node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > kept 2 core for compute node). on compute node i disabled overcommit > using ratio (1.0) > > I didn't configure NUMA yet because i wasn't aware of this feature, as > per your last post do you think numa will help to fix this issue? > following is my numa view > > [root at compute-47 ~]# numactl --hardware > available: 2 nodes (0-1) > node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > node 0 size: 16349 MB > node 0 free: 133 MB > node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > node 1 size: 16383 MB > node 1 free: 317 MB > node distances: > node 0 1 > 0: 10 20 > 1: 20 10 > > > I am not using any L3 router, i am using provide VLAN network and > using Cisco Nexus switch for my L3 function so i am not seeing any > bottleneck there. > > This is the 10G NIC i have on all my compute node, dual 10G port with > bonding (20G) > > 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > Gigabit Ethernet (rev 10) > 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > Gigabit Ethernet (rev 10) > > >> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: >> >> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) >> >> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. 
you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. >> >> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. >> >> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. >> >> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. >> >> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ >> >> Regards, >> Liping Mao >> >> 在 2018年9月16日,21:18,Satish Patel 写道: >> >> Hi Liping, >> >> Thank you for your reply, >> >> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. >> >> For SRIOV I have to look if I have support in my nic. >> >> We are using queens so I think queue size option not possible :( >> >> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. >> >> I will share my result as soon as I try multiqueue. >> >> >> >> Sent from my iPhone >> >> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: >> >> >> Hi Satish, >> >> >> >> >> Did your packet loss happen always or it only happened when heavy load? >> >> >> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. >> >> >> >> >> If it happened in heavy load, couple of things you can try: >> >> >> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) >> >> >> 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check >> >> >> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. >> >> >> >> >> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. 
>> >> >> I did not actually used them in our env, you may refer to [3] >> >> >> >> >> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html >> >> >> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html >> >> >> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html >> >> >> >> >> Regards, >> >> >> Liping Mao >> >> >> >> >> 在 2018/9/16 13:07,“Satish Patel” 写入: >> >> >> >> >> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop >> >> >> RX errors 0 dropped 0 overruns 0 frame 0 >> >> >> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 >> >> >> >> >> Noticed tap interface dropping TX packets and even after increasing >> >> >> txqueue from 1000 to 10000 nothing changed, still getting packet >> >> >> drops. >> >> >> >> >> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: >> >> >> >> >> Folks, >> >> >> >> >> I need some advice or suggestion to find out what is going on with my >> >> >> network, we have notice high packet loss on openstack instance and not >> >> >> sure what is going on, same time if i check on host machine and it has >> >> >> zero packet loss.. this is what i did for test... >> >> >> >> >> ping 8.8.8.8 >> >> >> >> >> from instance: 50% packet loss >> >> >> from compute host: 0% packet loss >> >> >> >> >> I have disabled TSO/GSO/SG setting on physical compute node but still >> >> >> getting packet loss. >> >> >> >> >> We have 10G NIC on our network, look like something related to tap >> >> >> interface setting.. >> >> >> >> >> _______________________________________________ >> >> >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> >> >> Post to : openstack at lists.openstack.org >> >> >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> >> >> >> >> From satish.txt at gmail.com Sun Sep 16 15:51:53 2018 From: satish.txt at gmail.com (Satish Patel) Date: Sun, 16 Sep 2018 11:51:53 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> Message-ID: I am currently playing with those setting and trying to generate traffic with hping3 tools, do you have any tool to test traffic performance for specially udp style small packets. I am going to share all my result and see what do you feel because i have noticed you went through this pain :) I will try every single option which you suggested to make sure we are good before i move forward to production. On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) wrote: > > I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. > > You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. 
And your Nexus 5k/7k will not be bottleneck for sure ;-) > > > Thanks, > Liping Mao > > > 在 2018年9月16日,23:09,Satish Patel 写道: > > > > Thanks Liping, > > > > I am using libvertd 3.9.0 version so look like i am eligible take > > advantage of that feature. phew! > > > > [root at compute-47 ~]# libvirtd -V > > libvirtd (libvirt) 3.9.0 > > > > Let me tell you how i am running instance on my openstack, my compute > > has 32 core / 32G memory and i have created two instance on compute > > node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > > kept 2 core for compute node). on compute node i disabled overcommit > > using ratio (1.0) > > > > I didn't configure NUMA yet because i wasn't aware of this feature, as > > per your last post do you think numa will help to fix this issue? > > following is my numa view > > > > [root at compute-47 ~]# numactl --hardware > > available: 2 nodes (0-1) > > node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > > node 0 size: 16349 MB > > node 0 free: 133 MB > > node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > > node 1 size: 16383 MB > > node 1 free: 317 MB > > node distances: > > node 0 1 > > 0: 10 20 > > 1: 20 10 > > > > > > I am not using any L3 router, i am using provide VLAN network and > > using Cisco Nexus switch for my L3 function so i am not seeing any > > bottleneck there. > > > > This is the 10G NIC i have on all my compute node, dual 10G port with > > bonding (20G) > > > > 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > Gigabit Ethernet (rev 10) > > 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > Gigabit Ethernet (rev 10) > > > > > >> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: > >> > >> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > >> > >> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. > >> > >> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. > >> > >> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. > >> > >> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. 
Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > >> > >> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > >> > >> Regards, > >> Liping Mao > >> > >> 在 2018年9月16日,21:18,Satish Patel 写道: > >> > >> Hi Liping, > >> > >> Thank you for your reply, > >> > >> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. > >> > >> For SRIOV I have to look if I have support in my nic. > >> > >> We are using queens so I think queue size option not possible :( > >> > >> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. > >> > >> I will share my result as soon as I try multiqueue. > >> > >> > >> > >> Sent from my iPhone > >> > >> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > >> > >> > >> Hi Satish, > >> > >> > >> > >> > >> Did your packet loss happen always or it only happened when heavy load? > >> > >> > >> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > >> > >> > >> > >> > >> If it happened in heavy load, couple of things you can try: > >> > >> > >> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > >> > >> > >> 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > >> > >> > >> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > >> > >> > >> > >> > >> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. > >> > >> > >> I did not actually used them in our env, you may refer to [3] > >> > >> > >> > >> > >> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > >> > >> > >> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > >> > >> > >> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > >> > >> > >> > >> > >> Regards, > >> > >> > >> Liping Mao > >> > >> > >> > >> > >> 在 2018/9/16 13:07,“Satish Patel” 写入: > >> > >> > >> > >> > >> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > >> > >> > >> RX errors 0 dropped 0 overruns 0 frame 0 > >> > >> > >> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > >> > >> > >> > >> > >> Noticed tap interface dropping TX packets and even after increasing > >> > >> > >> txqueue from 1000 to 10000 nothing changed, still getting packet > >> > >> > >> drops. > >> > >> > >> > >> > >> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > >> > >> > >> > >> > >> Folks, > >> > >> > >> > >> > >> I need some advice or suggestion to find out what is going on with my > >> > >> > >> network, we have notice high packet loss on openstack instance and not > >> > >> > >> sure what is going on, same time if i check on host machine and it has > >> > >> > >> zero packet loss.. this is what i did for test... 
> >> > >> > >> > >> > >> ping 8.8.8.8 > >> > >> > >> > >> > >> from instance: 50% packet loss > >> > >> > >> from compute host: 0% packet loss > >> > >> > >> > >> > >> I have disabled TSO/GSO/SG setting on physical compute node but still > >> > >> > >> getting packet loss. > >> > >> > >> > >> > >> We have 10G NIC on our network, look like something related to tap > >> > >> > >> interface setting.. > >> > >> > >> > >> > >> _______________________________________________ > >> > >> > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >> > >> > >> Post to : openstack at lists.openstack.org > >> > >> > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >> > >> > >> > >> > >> From satish.txt at gmail.com Sun Sep 16 15:53:47 2018 From: satish.txt at gmail.com (Satish Patel) Date: Sun, 16 Sep 2018 11:53:47 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> Message-ID: Hi Liping, >> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). Is there a way i can automate this last task to update queue number action after reboot vm :) otherwise i can use cloud-init to make sure all VM build with same config. On Sun, Sep 16, 2018 at 11:51 AM Satish Patel wrote: > > I am currently playing with those setting and trying to generate > traffic with hping3 tools, do you have any tool to test traffic > performance for specially udp style small packets. > > I am going to share all my result and see what do you feel because i > have noticed you went through this pain :) I will try every single > option which you suggested to make sure we are good before i move > forward to production. > On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) wrote: > > > > I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. > > > > You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. And your Nexus 5k/7k will not be bottleneck for sure ;-) > > > > > > Thanks, > > Liping Mao > > > > > 在 2018年9月16日,23:09,Satish Patel 写道: > > > > > > Thanks Liping, > > > > > > I am using libvertd 3.9.0 version so look like i am eligible take > > > advantage of that feature. phew! > > > > > > [root at compute-47 ~]# libvirtd -V > > > libvirtd (libvirt) 3.9.0 > > > > > > Let me tell you how i am running instance on my openstack, my compute > > > has 32 core / 32G memory and i have created two instance on compute > > > node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > > > kept 2 core for compute node). on compute node i disabled overcommit > > > using ratio (1.0) > > > > > > I didn't configure NUMA yet because i wasn't aware of this feature, as > > > per your last post do you think numa will help to fix this issue? 
> > > following is my numa view > > > > > > [root at compute-47 ~]# numactl --hardware > > > available: 2 nodes (0-1) > > > node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > > > node 0 size: 16349 MB > > > node 0 free: 133 MB > > > node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > > > node 1 size: 16383 MB > > > node 1 free: 317 MB > > > node distances: > > > node 0 1 > > > 0: 10 20 > > > 1: 20 10 > > > > > > > > > I am not using any L3 router, i am using provide VLAN network and > > > using Cisco Nexus switch for my L3 function so i am not seeing any > > > bottleneck there. > > > > > > This is the 10G NIC i have on all my compute node, dual 10G port with > > > bonding (20G) > > > > > > 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > Gigabit Ethernet (rev 10) > > > 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > Gigabit Ethernet (rev 10) > > > > > > > > >> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: > > >> > > >> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > > >> > > >> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. > > >> > > >> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. > > >> > > >> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. > > >> > > >> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > > >> > > >> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > > >> > > >> Regards, > > >> Liping Mao > > >> > > >> 在 2018年9月16日,21:18,Satish Patel 写道: > > >> > > >> Hi Liping, > > >> > > >> Thank you for your reply, > > >> > > >> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. > > >> > > >> For SRIOV I have to look if I have support in my nic. > > >> > > >> We are using queens so I think queue size option not possible :( > > >> > > >> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. 
> > >> > > >> I will share my result as soon as I try multiqueue. > > >> > > >> > > >> > > >> Sent from my iPhone > > >> > > >> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > > >> > > >> > > >> Hi Satish, > > >> > > >> > > >> > > >> > > >> Did your packet loss happen always or it only happened when heavy load? > > >> > > >> > > >> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > > >> > > >> > > >> > > >> > > >> If it happened in heavy load, couple of things you can try: > > >> > > >> > > >> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > > >> > > >> > > >> 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > > >> > > >> > > >> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > > >> > > >> > > >> > > >> > > >> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. > > >> > > >> > > >> I did not actually used them in our env, you may refer to [3] > > >> > > >> > > >> > > >> > > >> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > > >> > > >> > > >> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > > >> > > >> > > >> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > > >> > > >> > > >> > > >> > > >> Regards, > > >> > > >> > > >> Liping Mao > > >> > > >> > > >> > > >> > > >> 在 2018/9/16 13:07,“Satish Patel” 写入: > > >> > > >> > > >> > > >> > > >> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > > >> > > >> > > >> RX errors 0 dropped 0 overruns 0 frame 0 > > >> > > >> > > >> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > > >> > > >> > > >> > > >> > > >> Noticed tap interface dropping TX packets and even after increasing > > >> > > >> > > >> txqueue from 1000 to 10000 nothing changed, still getting packet > > >> > > >> > > >> drops. > > >> > > >> > > >> > > >> > > >> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > > >> > > >> > > >> > > >> > > >> Folks, > > >> > > >> > > >> > > >> > > >> I need some advice or suggestion to find out what is going on with my > > >> > > >> > > >> network, we have notice high packet loss on openstack instance and not > > >> > > >> > > >> sure what is going on, same time if i check on host machine and it has > > >> > > >> > > >> zero packet loss.. this is what i did for test... > > >> > > >> > > >> > > >> > > >> ping 8.8.8.8 > > >> > > >> > > >> > > >> > > >> from instance: 50% packet loss > > >> > > >> > > >> from compute host: 0% packet loss > > >> > > >> > > >> > > >> > > >> I have disabled TSO/GSO/SG setting on physical compute node but still > > >> > > >> > > >> getting packet loss. > > >> > > >> > > >> > > >> > > >> We have 10G NIC on our network, look like something related to tap > > >> > > >> > > >> interface setting.. 
> > >> > > >> > > >> > > >> > > >> _______________________________________________ > > >> > > >> > > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > >> > > >> > > >> Post to : openstack at lists.openstack.org > > >> > > >> > > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > >> > > >> > > >> > > >> > > >> From qiaokang1213 at gmail.com Sun Sep 16 16:25:52 2018 From: qiaokang1213 at gmail.com (Qiao Kang) Date: Sun, 16 Sep 2018 11:25:52 -0500 Subject: [Openstack] Can any user add or delete OpenStack Swift middleware? Message-ID: Hi, I'm wondering whether Swift allows any user (not the administrator) to specify which middleware that she/he wants his data object to go throught. For instance, Alice wants to install a middleware but doesn't want Bob to use it, where Alice and Bob are two accounts in a single Swift cluster. Or maybe all middlewares are pre-installed globally and cannot be customized on a per-account basis? Thanks, Qiao -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Sun Sep 16 17:41:29 2018 From: satish.txt at gmail.com (Satish Patel) Date: Sun, 16 Sep 2018 13:41:29 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> Message-ID: I successful reproduce this error with hping3 tool and look like multiqueue is our solution :) but i have few question you may have answer of that. 1. I have created two instance (vm1.example.com & vm2.example.com) 2. I have flood traffic from vm1 using "hping3 vm2.example.com --flood" and i have noticed drops on tap interface. ( This is without multiqueue) 3. Enable multiqueue in image and run same test and again got packet drops on tap interface ( I didn't update queue on vm2 guest, so definitely i was expecting packet drops) 4. Now i have try to update vm2 queue using ethtool and i got following error, I have 15vCPU and i was trying to add 15 queue [root at bar-mq ~]# ethtool -L eth0 combined 15 Cannot set device channel parameters: Invalid argument Then i have tried 8 queue which works. [root at bar-mq ~]# ethtool -L eth0 combined 8 combined unmodified, ignoring no channel parameters changed, aborting current values: tx 0 rx 0 other 0 combined 8 Now i am not seeing any packet drops on tap interface, I have measure PPS and i was able to get 160kpps without packet drops. Question: 1. why i am not able to add 15 queue? ( is this NIC or driver limitation?) 2. how do i automate "ethtool -L eth0 combined 8" command in instance so i don't need to tell my customer to do this manually? On Sun, Sep 16, 2018 at 11:53 AM Satish Patel wrote: > > Hi Liping, > > >> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > Is there a way i can automate this last task to update queue number > action after reboot vm :) otherwise i can use cloud-init to make sure > all VM build with same config. > On Sun, Sep 16, 2018 at 11:51 AM Satish Patel wrote: > > > > I am currently playing with those setting and trying to generate > > traffic with hping3 tools, do you have any tool to test traffic > > performance for specially udp style small packets. 
> > > > I am going to share all my result and see what do you feel because i > > have noticed you went through this pain :) I will try every single > > option which you suggested to make sure we are good before i move > > forward to production. > > On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) wrote: > > > > > > I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > > > Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. > > > > > > You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. And your Nexus 5k/7k will not be bottleneck for sure ;-) > > > > > > > > > Thanks, > > > Liping Mao > > > > > > > 在 2018年9月16日,23:09,Satish Patel 写道: > > > > > > > > Thanks Liping, > > > > > > > > I am using libvertd 3.9.0 version so look like i am eligible take > > > > advantage of that feature. phew! > > > > > > > > [root at compute-47 ~]# libvirtd -V > > > > libvirtd (libvirt) 3.9.0 > > > > > > > > Let me tell you how i am running instance on my openstack, my compute > > > > has 32 core / 32G memory and i have created two instance on compute > > > > node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > > > > kept 2 core for compute node). on compute node i disabled overcommit > > > > using ratio (1.0) > > > > > > > > I didn't configure NUMA yet because i wasn't aware of this feature, as > > > > per your last post do you think numa will help to fix this issue? > > > > following is my numa view > > > > > > > > [root at compute-47 ~]# numactl --hardware > > > > available: 2 nodes (0-1) > > > > node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > > > > node 0 size: 16349 MB > > > > node 0 free: 133 MB > > > > node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > > > > node 1 size: 16383 MB > > > > node 1 free: 317 MB > > > > node distances: > > > > node 0 1 > > > > 0: 10 20 > > > > 1: 20 10 > > > > > > > > > > > > I am not using any L3 router, i am using provide VLAN network and > > > > using Cisco Nexus switch for my L3 function so i am not seeing any > > > > bottleneck there. > > > > > > > > This is the 10G NIC i have on all my compute node, dual 10G port with > > > > bonding (20G) > > > > > > > > 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > Gigabit Ethernet (rev 10) > > > > 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > Gigabit Ethernet (rev 10) > > > > > > > > > > > >> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: > > > >> > > > >> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > > > >> > > > >> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. 
You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. > > > >> > > > >> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. > > > >> > > > >> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. > > > >> > > > >> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > > > >> > > > >> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > > > >> > > > >> Regards, > > > >> Liping Mao > > > >> > > > >> 在 2018年9月16日,21:18,Satish Patel 写道: > > > >> > > > >> Hi Liping, > > > >> > > > >> Thank you for your reply, > > > >> > > > >> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. > > > >> > > > >> For SRIOV I have to look if I have support in my nic. > > > >> > > > >> We are using queens so I think queue size option not possible :( > > > >> > > > >> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. > > > >> > > > >> I will share my result as soon as I try multiqueue. > > > >> > > > >> > > > >> > > > >> Sent from my iPhone > > > >> > > > >> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > > > >> > > > >> > > > >> Hi Satish, > > > >> > > > >> > > > >> > > > >> > > > >> Did your packet loss happen always or it only happened when heavy load? > > > >> > > > >> > > > >> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > > > >> > > > >> > > > >> > > > >> > > > >> If it happened in heavy load, couple of things you can try: > > > >> > > > >> > > > >> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > > > >> > > > >> > > > >> 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > > > >> > > > >> > > > >> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > > > >> > > > >> > > > >> > > > >> > > > >> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. 
> > > >> > > > >> > > > >> I did not actually used them in our env, you may refer to [3] > > > >> > > > >> > > > >> > > > >> > > > >> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > > > >> > > > >> > > > >> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > > > >> > > > >> > > > >> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > > > >> > > > >> > > > >> > > > >> > > > >> Regards, > > > >> > > > >> > > > >> Liping Mao > > > >> > > > >> > > > >> > > > >> > > > >> 在 2018/9/16 13:07,“Satish Patel” 写入: > > > >> > > > >> > > > >> > > > >> > > > >> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > > > >> > > > >> > > > >> RX errors 0 dropped 0 overruns 0 frame 0 > > > >> > > > >> > > > >> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > > > >> > > > >> > > > >> > > > >> > > > >> Noticed tap interface dropping TX packets and even after increasing > > > >> > > > >> > > > >> txqueue from 1000 to 10000 nothing changed, still getting packet > > > >> > > > >> > > > >> drops. > > > >> > > > >> > > > >> > > > >> > > > >> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > > > >> > > > >> > > > >> > > > >> > > > >> Folks, > > > >> > > > >> > > > >> > > > >> > > > >> I need some advice or suggestion to find out what is going on with my > > > >> > > > >> > > > >> network, we have notice high packet loss on openstack instance and not > > > >> > > > >> > > > >> sure what is going on, same time if i check on host machine and it has > > > >> > > > >> > > > >> zero packet loss.. this is what i did for test... > > > >> > > > >> > > > >> > > > >> > > > >> ping 8.8.8.8 > > > >> > > > >> > > > >> > > > >> > > > >> from instance: 50% packet loss > > > >> > > > >> > > > >> from compute host: 0% packet loss > > > >> > > > >> > > > >> > > > >> > > > >> I have disabled TSO/GSO/SG setting on physical compute node but still > > > >> > > > >> > > > >> getting packet loss. > > > >> > > > >> > > > >> > > > >> > > > >> We have 10G NIC on our network, look like something related to tap > > > >> > > > >> > > > >> interface setting.. > > > >> > > > >> > > > >> > > > >> > > > >> _______________________________________________ > > > >> > > > >> > > > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > >> > > > >> > > > >> Post to : openstack at lists.openstack.org > > > >> > > > >> > > > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > >> > > > >> > > > >> > > > >> > > > >> From satish.txt at gmail.com Sun Sep 16 20:08:52 2018 From: satish.txt at gmail.com (Satish Patel) Date: Sun, 16 Sep 2018 16:08:52 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> Message-ID: Update on my last email. I am able to achieve 150kpps with queue=8 and my goal is to do 300kpps because some of voice application using 300kps. Here i am trying to increase rx_queue_size & tx_queue_size but its not working somehow. I have tired following. 1. add rx/tx size in /etc/nova/nova.conf in libvirt section - (didn't work) 2. add /etc/libvirtd/qemu.conf - (didn't work) I have try to edit virsh edit file but somehow my changes not getting reflected, i did virsh define after change and hard reboot guest but no luck.. 
how do i edit that option in xml if i want to do that? On Sun, Sep 16, 2018 at 1:41 PM Satish Patel wrote: > > I successful reproduce this error with hping3 tool and look like > multiqueue is our solution :) but i have few question you may have > answer of that. > > 1. I have created two instance (vm1.example.com & vm2.example.com) > > 2. I have flood traffic from vm1 using "hping3 vm2.example.com > --flood" and i have noticed drops on tap interface. ( This is without > multiqueue) > > 3. Enable multiqueue in image and run same test and again got packet > drops on tap interface ( I didn't update queue on vm2 guest, so > definitely i was expecting packet drops) > > 4. Now i have try to update vm2 queue using ethtool and i got > following error, I have 15vCPU and i was trying to add 15 queue > > [root at bar-mq ~]# ethtool -L eth0 combined 15 > Cannot set device channel parameters: Invalid argument > > Then i have tried 8 queue which works. > > [root at bar-mq ~]# ethtool -L eth0 combined 8 > combined unmodified, ignoring > no channel parameters changed, aborting > current values: tx 0 rx 0 other 0 combined 8 > > Now i am not seeing any packet drops on tap interface, I have measure > PPS and i was able to get 160kpps without packet drops. > > Question: > > 1. why i am not able to add 15 queue? ( is this NIC or driver limitation?) > 2. how do i automate "ethtool -L eth0 combined 8" command in instance > so i don't need to tell my customer to do this manually? > On Sun, Sep 16, 2018 at 11:53 AM Satish Patel wrote: > > > > Hi Liping, > > > > >> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > Is there a way i can automate this last task to update queue number > > action after reboot vm :) otherwise i can use cloud-init to make sure > > all VM build with same config. > > On Sun, Sep 16, 2018 at 11:51 AM Satish Patel wrote: > > > > > > I am currently playing with those setting and trying to generate > > > traffic with hping3 tools, do you have any tool to test traffic > > > performance for specially udp style small packets. > > > > > > I am going to share all my result and see what do you feel because i > > > have noticed you went through this pain :) I will try every single > > > option which you suggested to make sure we are good before i move > > > forward to production. > > > On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) wrote: > > > > > > > > I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > > > > > Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. > > > > > > > > You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. And your Nexus 5k/7k will not be bottleneck for sure ;-) > > > > > > > > > > > > Thanks, > > > > Liping Mao > > > > > > > > > 在 2018年9月16日,23:09,Satish Patel 写道: > > > > > > > > > > Thanks Liping, > > > > > > > > > > I am using libvertd 3.9.0 version so look like i am eligible take > > > > > advantage of that feature. phew! 
> > > > > > > > > > [root at compute-47 ~]# libvirtd -V > > > > > libvirtd (libvirt) 3.9.0 > > > > > > > > > > Let me tell you how i am running instance on my openstack, my compute > > > > > has 32 core / 32G memory and i have created two instance on compute > > > > > node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > > > > > kept 2 core for compute node). on compute node i disabled overcommit > > > > > using ratio (1.0) > > > > > > > > > > I didn't configure NUMA yet because i wasn't aware of this feature, as > > > > > per your last post do you think numa will help to fix this issue? > > > > > following is my numa view > > > > > > > > > > [root at compute-47 ~]# numactl --hardware > > > > > available: 2 nodes (0-1) > > > > > node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > > > > > node 0 size: 16349 MB > > > > > node 0 free: 133 MB > > > > > node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > > > > > node 1 size: 16383 MB > > > > > node 1 free: 317 MB > > > > > node distances: > > > > > node 0 1 > > > > > 0: 10 20 > > > > > 1: 20 10 > > > > > > > > > > > > > > > I am not using any L3 router, i am using provide VLAN network and > > > > > using Cisco Nexus switch for my L3 function so i am not seeing any > > > > > bottleneck there. > > > > > > > > > > This is the 10G NIC i have on all my compute node, dual 10G port with > > > > > bonding (20G) > > > > > > > > > > 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > > Gigabit Ethernet (rev 10) > > > > > 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > > Gigabit Ethernet (rev 10) > > > > > > > > > > > > > > >> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: > > > > >> > > > > >> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > > > > >> > > > > >> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. > > > > >> > > > > >> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. > > > > >> > > > > >> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. > > > > >> > > > > >> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. 
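A hedged example of checking and raising those rx/tx ring sizes on the host NIC (the interface name is hypothetical, it would be a bond member on the compute node, and the usable maximum depends on the NIC driver):

    # show current and hardware-maximum rx/tx ring sizes
    ethtool -g em1
    # raise them toward the reported maximum
    ethtool -G em1 rx 4096 tx 4096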
Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > > > > >> > > > > >> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > > > > >> > > > > >> Regards, > > > > >> Liping Mao > > > > >> > > > > >> 在 2018年9月16日,21:18,Satish Patel 写道: > > > > >> > > > > >> Hi Liping, > > > > >> > > > > >> Thank you for your reply, > > > > >> > > > > >> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. > > > > >> > > > > >> For SRIOV I have to look if I have support in my nic. > > > > >> > > > > >> We are using queens so I think queue size option not possible :( > > > > >> > > > > >> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. > > > > >> > > > > >> I will share my result as soon as I try multiqueue. > > > > >> > > > > >> > > > > >> > > > > >> Sent from my iPhone > > > > >> > > > > >> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > > > > >> > > > > >> > > > > >> Hi Satish, > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Did your packet loss happen always or it only happened when heavy load? > > > > >> > > > > >> > > > > >> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> If it happened in heavy load, couple of things you can try: > > > > >> > > > > >> > > > > >> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > > > > >> > > > > >> > > > > >> 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > > > > >> > > > > >> > > > > >> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. 
> > > > >> > > > > >> > > > > >> I did not actually used them in our env, you may refer to [3] > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > > > > >> > > > > >> > > > > >> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > > > > >> > > > > >> > > > > >> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Regards, > > > > >> > > > > >> > > > > >> Liping Mao > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> 在 2018/9/16 13:07,“Satish Patel” 写入: > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > > > > >> > > > > >> > > > > >> RX errors 0 dropped 0 overruns 0 frame 0 > > > > >> > > > > >> > > > > >> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Noticed tap interface dropping TX packets and even after increasing > > > > >> > > > > >> > > > > >> txqueue from 1000 to 10000 nothing changed, still getting packet > > > > >> > > > > >> > > > > >> drops. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Folks, > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> I need some advice or suggestion to find out what is going on with my > > > > >> > > > > >> > > > > >> network, we have notice high packet loss on openstack instance and not > > > > >> > > > > >> > > > > >> sure what is going on, same time if i check on host machine and it has > > > > >> > > > > >> > > > > >> zero packet loss.. this is what i did for test... > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> ping 8.8.8.8 > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> from instance: 50% packet loss > > > > >> > > > > >> > > > > >> from compute host: 0% packet loss > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> I have disabled TSO/GSO/SG setting on physical compute node but still > > > > >> > > > > >> > > > > >> getting packet loss. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> We have 10G NIC on our network, look like something related to tap > > > > >> > > > > >> > > > > >> interface setting.. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> _______________________________________________ > > > > >> > > > > >> > > > > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > >> > > > > >> > > > > >> Post to : openstack at lists.openstack.org > > > > >> > > > > >> > > > > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> From Remo at italy1.com Sun Sep 16 23:05:26 2018 From: Remo at italy1.com (Remo Mattei) Date: Sun, 16 Sep 2018 16:05:26 -0700 Subject: [Openstack] Can any user add or delete OpenStack Swift middleware? In-Reply-To: References: Message-ID: <1247B03A-F2E2-4C30-A96C-726141A4B64F@italy1.com> Users cannot install middleware. You can use ACL for users with the same share. Remo > On Sep 16, 2018, at 09:25, Qiao Kang wrote: > > Hi, > > I'm wondering whether Swift allows any user (not the administrator) to specify which middleware that she/he wants his data object to go throught. 
For instance, Alice wants to install a middleware but doesn't want Bob to use it, where Alice and Bob are two accounts in a single Swift cluster. > > Or maybe all middlewares are pre-installed globally and cannot be customized on a per-account basis? > > Thanks, > Qiao > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From limao at cisco.com Sun Sep 16 23:06:07 2018 From: limao at cisco.com (Liping Mao (limao)) Date: Sun, 16 Sep 2018 23:06:07 +0000 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> Message-ID: <8089BF19-A95B-4CF5-A2D4-0CB2B7415362@cisco.com> Hi Satish, There are hard limitations in nova's code, I did not actually used more thant 8 queues: def _get_max_tap_queues(self): # NOTE(kengo.sakai): In kernels prior to 3.0, # multiple queues on a tap interface is not supported. # In kernels 3.x, the number of queues on a tap interface # is limited to 8. From 4.0, the number is 256. # See: https://bugs.launchpad.net/nova/+bug/1570631 kernel_version = int(os.uname()[2].split(".")[0]) if kernel_version <= 2: return 1 elif kernel_version == 3: return 8 elif kernel_version == 4: return 256 else: return None > I am currently playing with those setting and trying to generate traffic with hping3 tools, do you have any tool to test traffic performance for specially udp style small packets. Hping3 is good enough to reproduce it, we have app level test tool, but that is not your case. > Here i am trying to increase rx_queue_size & tx_queue_size but its not working somehow. I have tired following. Since you are not rocky code, it should only works in qemu.conf, maybe check if this bug[1] affect you. > Is there a way i can automate this last task to update queue number action after reboot vm :) otherwise i can use cloud-init to make sure all VM build with same config. Cloud-init or rc.local could be the place to do that. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1541960 Regards, Liping Mao 在 2018/9/17 04:09,“Satish Patel” 写入: Update on my last email. I am able to achieve 150kpps with queue=8 and my goal is to do 300kpps because some of voice application using 300kps. Here i am trying to increase rx_queue_size & tx_queue_size but its not working somehow. I have tired following. 1. add rx/tx size in /etc/nova/nova.conf in libvirt section - (didn't work) 2. add /etc/libvirtd/qemu.conf - (didn't work) I have try to edit virsh edit file but somehow my changes not getting reflected, i did virsh define after change and hard reboot guest but no luck.. how do i edit that option in xml if i want to do that? On Sun, Sep 16, 2018 at 1:41 PM Satish Patel wrote: > > I successful reproduce this error with hping3 tool and look like > multiqueue is our solution :) but i have few question you may have > answer of that. > > 1. I have created two instance (vm1.example.com & vm2.example.com) > > 2. I have flood traffic from vm1 using "hping3 vm2.example.com > --flood" and i have noticed drops on tap interface. ( This is without > multiqueue) > > 3. 
Enable multiqueue in image and run same test and again got packet > drops on tap interface ( I didn't update queue on vm2 guest, so > definitely i was expecting packet drops) > > 4. Now i have try to update vm2 queue using ethtool and i got > following error, I have 15vCPU and i was trying to add 15 queue > > [root at bar-mq ~]# ethtool -L eth0 combined 15 > Cannot set device channel parameters: Invalid argument > > Then i have tried 8 queue which works. > > [root at bar-mq ~]# ethtool -L eth0 combined 8 > combined unmodified, ignoring > no channel parameters changed, aborting > current values: tx 0 rx 0 other 0 combined 8 > > Now i am not seeing any packet drops on tap interface, I have measure > PPS and i was able to get 160kpps without packet drops. > > Question: > > 1. why i am not able to add 15 queue? ( is this NIC or driver limitation?) > 2. how do i automate "ethtool -L eth0 combined 8" command in instance > so i don't need to tell my customer to do this manually? > On Sun, Sep 16, 2018 at 11:53 AM Satish Patel wrote: > > > > Hi Liping, > > > > >> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > Is there a way i can automate this last task to update queue number > > action after reboot vm :) otherwise i can use cloud-init to make sure > > all VM build with same config. > > On Sun, Sep 16, 2018 at 11:51 AM Satish Patel wrote: > > > > > > I am currently playing with those setting and trying to generate > > > traffic with hping3 tools, do you have any tool to test traffic > > > performance for specially udp style small packets. > > > > > > I am going to share all my result and see what do you feel because i > > > have noticed you went through this pain :) I will try every single > > > option which you suggested to make sure we are good before i move > > > forward to production. > > > On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) wrote: > > > > > > > > I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > > > > > Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. > > > > > > > > You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. And your Nexus 5k/7k will not be bottleneck for sure ;-) > > > > > > > > > > > > Thanks, > > > > Liping Mao > > > > > > > > > 在 2018年9月16日,23:09,Satish Patel 写道: > > > > > > > > > > Thanks Liping, > > > > > > > > > > I am using libvertd 3.9.0 version so look like i am eligible take > > > > > advantage of that feature. phew! > > > > > > > > > > [root at compute-47 ~]# libvirtd -V > > > > > libvirtd (libvirt) 3.9.0 > > > > > > > > > > Let me tell you how i am running instance on my openstack, my compute > > > > > has 32 core / 32G memory and i have created two instance on compute > > > > > node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > > > > > kept 2 core for compute node). on compute node i disabled overcommit > > > > > using ratio (1.0) > > > > > > > > > > I didn't configure NUMA yet because i wasn't aware of this feature, as > > > > > per your last post do you think numa will help to fix this issue? 
> > > > > following is my numa view > > > > > > > > > > [root at compute-47 ~]# numactl --hardware > > > > > available: 2 nodes (0-1) > > > > > node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > > > > > node 0 size: 16349 MB > > > > > node 0 free: 133 MB > > > > > node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > > > > > node 1 size: 16383 MB > > > > > node 1 free: 317 MB > > > > > node distances: > > > > > node 0 1 > > > > > 0: 10 20 > > > > > 1: 20 10 > > > > > > > > > > > > > > > I am not using any L3 router, i am using provide VLAN network and > > > > > using Cisco Nexus switch for my L3 function so i am not seeing any > > > > > bottleneck there. > > > > > > > > > > This is the 10G NIC i have on all my compute node, dual 10G port with > > > > > bonding (20G) > > > > > > > > > > 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > > Gigabit Ethernet (rev 10) > > > > > 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > > Gigabit Ethernet (rev 10) > > > > > > > > > > > > > > >> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: > > > > >> > > > > >> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > > > > >> > > > > >> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. > > > > >> > > > > >> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. > > > > >> > > > > >> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. > > > > >> > > > > >> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > > > > >> > > > > >> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > > > > >> > > > > >> Regards, > > > > >> Liping Mao > > > > >> > > > > >> 在 2018年9月16日,21:18,Satish Patel 写道: > > > > >> > > > > >> Hi Liping, > > > > >> > > > > >> Thank you for your reply, > > > > >> > > > > >> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. 
> > > > >> > > > > >> For SRIOV I have to look if I have support in my nic. > > > > >> > > > > >> We are using queens so I think queue size option not possible :( > > > > >> > > > > >> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. > > > > >> > > > > >> I will share my result as soon as I try multiqueue. > > > > >> > > > > >> > > > > >> > > > > >> Sent from my iPhone > > > > >> > > > > >> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > > > > >> > > > > >> > > > > >> Hi Satish, > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Did your packet loss happen always or it only happened when heavy load? > > > > >> > > > > >> > > > > >> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> If it happened in heavy load, couple of things you can try: > > > > >> > > > > >> > > > > >> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > > > > >> > > > > >> > > > > >> 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > > > > >> > > > > >> > > > > >> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. > > > > >> > > > > >> > > > > >> I did not actually used them in our env, you may refer to [3] > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > > > > >> > > > > >> > > > > >> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > > > > >> > > > > >> > > > > >> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Regards, > > > > >> > > > > >> > > > > >> Liping Mao > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> 在 2018/9/16 13:07,“Satish Patel” 写入: > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > > > > >> > > > > >> > > > > >> RX errors 0 dropped 0 overruns 0 frame 0 > > > > >> > > > > >> > > > > >> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Noticed tap interface dropping TX packets and even after increasing > > > > >> > > > > >> > > > > >> txqueue from 1000 to 10000 nothing changed, still getting packet > > > > >> > > > > >> > > > > >> drops. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Folks, > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> I need some advice or suggestion to find out what is going on with my > > > > >> > > > > >> > > > > >> network, we have notice high packet loss on openstack instance and not > > > > >> > > > > >> > > > > >> sure what is going on, same time if i check on host machine and it has > > > > >> > > > > >> > > > > >> zero packet loss.. this is what i did for test... 
> > > > >> > > > > >> > > > > >> > > > > >> > > > > >> ping 8.8.8.8 > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> from instance: 50% packet loss > > > > >> > > > > >> > > > > >> from compute host: 0% packet loss > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> I have disabled TSO/GSO/SG setting on physical compute node but still > > > > >> > > > > >> > > > > >> getting packet loss. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> We have 10G NIC on our network, look like something related to tap > > > > >> > > > > >> > > > > >> interface setting.. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> _______________________________________________ > > > > >> > > > > >> > > > > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > >> > > > > >> > > > > >> Post to : openstack at lists.openstack.org > > > > >> > > > > >> > > > > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> From qiaokang1213 at gmail.com Sun Sep 16 23:15:56 2018 From: qiaokang1213 at gmail.com (Qiao Kang) Date: Sun, 16 Sep 2018 18:15:56 -0500 Subject: [Openstack] Can any user add or delete OpenStack Swift middleware? In-Reply-To: <1247B03A-F2E2-4C30-A96C-726141A4B64F@italy1.com> References: <1247B03A-F2E2-4C30-A96C-726141A4B64F@italy1.com> Message-ID: Thanks Remo. What did you mean by using ACL? Does it mean different users can see different middleware pipelines? For instance, Alice: middleware_1 -> middleware_2 -> middleware_3 ... Bob: middleware_2 -> middleware_4 ... Is that feasible? Thanks, Qiao On Sun, Sep 16, 2018 at 6:05 PM Remo Mattei wrote: > > Users cannot install middleware. > You can use ACL for users with the same share. > > Remo > > > On Sep 16, 2018, at 09:25, Qiao Kang wrote: > > > > Hi, > > > > I'm wondering whether Swift allows any user (not the administrator) to specify which middleware that she/he wants his data object to go throught. For instance, Alice wants to install a middleware but doesn't want Bob to use it, where Alice and Bob are two accounts in a single Swift cluster. > > > > Or maybe all middlewares are pre-installed globally and cannot be customized on a per-account basis? > > > > Thanks, > > Qiao > > _______________________________________________ > > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > Post to : openstack at lists.openstack.org > > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > From Remo at italy1.com Sun Sep 16 23:43:11 2018 From: Remo at italy1.com (Remo Mattei) Date: Sun, 16 Sep 2018 16:43:11 -0700 Subject: [Openstack] Can any user add or delete OpenStack Swift middleware? In-Reply-To: References: <1247B03A-F2E2-4C30-A96C-726141A4B64F@italy1.com> Message-ID: <6FD05DE2-C79D-46C4-8C8F-600122632614@italy1.com> https://docs.openstack.org/swift/latest/overview_acl.html This should help. Remo > On Sep 16, 2018, at 16:15, Qiao Kang wrote: > > Thanks Remo. > > What did you mean by using ACL? Does it mean different users can see > different middleware pipelines? > > For instance, > Alice: middleware_1 -> middleware_2 -> middleware_3 ... > Bob: middleware_2 -> middleware_4 ... > > Is that feasible? > > Thanks, > Qiao > > On Sun, Sep 16, 2018 at 6:05 PM Remo Mattei wrote: >> >> Users cannot install middleware. >> You can use ACL for users with the same share. 
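As a hedged illustration of the container ACLs described at the overview_acl.html link above (project and container names are hypothetical):

    # allow users of another project read access to one container
    swift post --read-acl 'other_project_id:*' shared-container
    # optionally allow writes as well
    swift post --write-acl 'other_project_id:*' shared-container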
>> >> Remo >> >>> On Sep 16, 2018, at 09:25, Qiao Kang wrote: >>> >>> Hi, >>> >>> I'm wondering whether Swift allows any user (not the administrator) to specify which middleware that she/he wants his data object to go throught. For instance, Alice wants to install a middleware but doesn't want Bob to use it, where Alice and Bob are two accounts in a single Swift cluster. >>> >>> Or maybe all middlewares are pre-installed globally and cannot be customized on a per-account basis? >>> >>> Thanks, >>> Qiao >>> _______________________________________________ >>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> Post to : openstack at lists.openstack.org >>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From me at not.mn Sun Sep 16 23:59:19 2018 From: me at not.mn (John Dickinson) Date: Sun, 16 Sep 2018 16:59:19 -0700 Subject: [Openstack] Can any user add or delete OpenStack Swift middleware? In-Reply-To: References: Message-ID: You may be interested in Storlets. It's another OpenStack project, maintained by a Swift core reviewer, that provides this sort of user-defined middleware functionality. You can also ask about it in #openstack-swift --John On 16 Sep 2018, at 9:25, Qiao Kang wrote: > Hi, > > I'm wondering whether Swift allows any user (not the administrator) to > specify which middleware that she/he wants his data object to go > throught. > For instance, Alice wants to install a middleware but doesn't want Bob > to > use it, where Alice and Bob are two accounts in a single Swift > cluster. > > Or maybe all middlewares are pre-installed globally and cannot be > customized on a per-account basis? > > Thanks, > Qiao > _______________________________________________ > Mailing list: > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From satish.txt at gmail.com Mon Sep 17 03:48:11 2018 From: satish.txt at gmail.com (Satish Patel) Date: Sun, 16 Sep 2018 23:48:11 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: <8089BF19-A95B-4CF5-A2D4-0CB2B7415362@cisco.com> References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> <8089BF19-A95B-4CF5-A2D4-0CB2B7415362@cisco.com> Message-ID: Thanks Liping, I will check bug for tx/rx queue size and see if i can make it work but look like my 10G NIC support SR-IOV so i am trying that path because it will be better for long run. I have deploy my cloud using openstack-ansible so now i need to figure out how do i wire that up with openstack-ansible deployment, here is the article [1] Question: I have br-vlan interface mapp with bond0 to run my VM (VLAN traffic), so do i need to do anything in bond0 to enable VF/PF function? Just confused because currently my VM nic map with compute node br-vlan bridge. 
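For reference, a rough sketch of the pieces the SR-IOV guides referenced in this thread describe (untested here; the PF name "em1" and the physnet name "vlan" are only examples, and SR-IOV VFs bypass the br-vlan/tap path entirely):

    # nova.conf on the compute node: expose the PF's VFs for passthrough
    [pci]
    passthrough_whitelist = { "devname": "em1", "physical_network": "vlan" }

    # ml2_conf.ini: keep whichever mechanism driver is already in use, plus sriovnicswitch
    [ml2]
    mechanism_drivers = linuxbridge,sriovnicswitch

    # sriov_agent.ini: map the physical network to the PF
    [sriov_nic]
    physical_device_mappings = vlan:em1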
[root at compute-65 ~]# lspci -nn | grep -i ethernet 03:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10) 03:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10) 03:01.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.2 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.3 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.4 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.5 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.6 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] [1] https://docs.openstack.org/openstack-ansible-os_neutron/latest/configure-network-services.html On Sun, Sep 16, 2018 at 7:06 PM Liping Mao (limao) wrote: > > Hi Satish, > > > > > > There are hard limitations in nova's code, I did not actually used more thant 8 queues: > > def _get_max_tap_queues(self): > > # NOTE(kengo.sakai): In kernels prior to 3.0, > > # multiple queues on a tap interface is not supported. > > # In kernels 3.x, the number of queues on a tap interface > > # is limited to 8. From 4.0, the number is 256. > > # See: https://bugs.launchpad.net/nova/+bug/1570631 > > kernel_version = int(os.uname()[2].split(".")[0]) > > if kernel_version <= 2: > > return 1 > > elif kernel_version == 3: > > return 8 > > elif kernel_version == 4: > > return 256 > > else: > > return None > > > > > I am currently playing with those setting and trying to generate > > traffic with hping3 tools, do you have any tool to test traffic > > performance for specially udp style small packets. > > > > Hping3 is good enough to reproduce it, we have app level test tool, but that is not your case. > > > > > > > Here i am trying to increase rx_queue_size & tx_queue_size but its not > > working somehow. I have tired following. > > > > Since you are not rocky code, it should only works in qemu.conf, maybe check if this bug[1] affect you. > > > > > > > Is there a way i can automate this last task to update queue number > > action after reboot vm :) otherwise i can use cloud-init to make sure > > all VM build with same config. > > > > Cloud-init or rc.local could be the place to do that. > > > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1541960 > > > > Regards, > > Liping Mao > > > > 在 2018/9/17 04:09,“Satish Patel” 写入: > > > > Update on my last email. > > > > I am able to achieve 150kpps with queue=8 and my goal is to do 300kpps > > because some of voice application using 300kps. > > > > Here i am trying to increase rx_queue_size & tx_queue_size but its not > > working somehow. I have tired following. > > > > 1. add rx/tx size in /etc/nova/nova.conf in libvirt section - (didn't work) > > 2. add /etc/libvirtd/qemu.conf - (didn't work) > > > > I have try to edit virsh edit file but somehow my changes not > > getting reflected, i did virsh define after change and hard > > reboot guest but no luck.. how do i edit that option in xml if i want > > to do that? 
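A minimal sketch of the cloud-init / rc.local route suggested above, so the queue setting is re-applied after reboots (assumes the guest NIC is eth0 and a CentOS-style image with /etc/rc.d/rc.local):

    #cloud-config
    # user-data applied at first boot: set the queues now and persist the command for later boots
    runcmd:
      - ethtool -L eth0 combined 8
      - echo 'ethtool -L eth0 combined 8' >> /etc/rc.d/rc.local
      - chmod +x /etc/rc.d/rc.local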
> > On Sun, Sep 16, 2018 at 1:41 PM Satish Patel wrote: > > > > > > I successful reproduce this error with hping3 tool and look like > > > multiqueue is our solution :) but i have few question you may have > > > answer of that. > > > > > > 1. I have created two instance (vm1.example.com & vm2.example.com) > > > > > > 2. I have flood traffic from vm1 using "hping3 vm2.example.com > > > --flood" and i have noticed drops on tap interface. ( This is without > > > multiqueue) > > > > > > 3. Enable multiqueue in image and run same test and again got packet > > > drops on tap interface ( I didn't update queue on vm2 guest, so > > > definitely i was expecting packet drops) > > > > > > 4. Now i have try to update vm2 queue using ethtool and i got > > > following error, I have 15vCPU and i was trying to add 15 queue > > > > > > [root at bar-mq ~]# ethtool -L eth0 combined 15 > > > Cannot set device channel parameters: Invalid argument > > > > > > Then i have tried 8 queue which works. > > > > > > [root at bar-mq ~]# ethtool -L eth0 combined 8 > > > combined unmodified, ignoring > > > no channel parameters changed, aborting > > > current values: tx 0 rx 0 other 0 combined 8 > > > > > > Now i am not seeing any packet drops on tap interface, I have measure > > > PPS and i was able to get 160kpps without packet drops. > > > > > > Question: > > > > > > 1. why i am not able to add 15 queue? ( is this NIC or driver limitation?) > > > 2. how do i automate "ethtool -L eth0 combined 8" command in instance > > > so i don't need to tell my customer to do this manually? > > > On Sun, Sep 16, 2018 at 11:53 AM Satish Patel wrote: > > > > > > > > Hi Liping, > > > > > > > > >> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > > > > > Is there a way i can automate this last task to update queue number > > > > action after reboot vm :) otherwise i can use cloud-init to make sure > > > > all VM build with same config. > > > > On Sun, Sep 16, 2018 at 11:51 AM Satish Patel wrote: > > > > > > > > > > I am currently playing with those setting and trying to generate > > > > > traffic with hping3 tools, do you have any tool to test traffic > > > > > performance for specially udp style small packets. > > > > > > > > > > I am going to share all my result and see what do you feel because i > > > > > have noticed you went through this pain :) I will try every single > > > > > option which you suggested to make sure we are good before i move > > > > > forward to production. > > > > > On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) wrote: > > > > > > > > > > > > I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > > > > > > > > > Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. > > > > > > > > > > > > You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. 
And your Nexus 5k/7k will not be bottleneck for sure ;-) > > > > > > > > > > > > > > > > > > Thanks, > > > > > > Liping Mao > > > > > > > > > > > > > 在 2018年9月16日,23:09,Satish Patel 写道: > > > > > > > > > > > > > > Thanks Liping, > > > > > > > > > > > > > > I am using libvertd 3.9.0 version so look like i am eligible take > > > > > > > advantage of that feature. phew! > > > > > > > > > > > > > > [root at compute-47 ~]# libvirtd -V > > > > > > > libvirtd (libvirt) 3.9.0 > > > > > > > > > > > > > > Let me tell you how i am running instance on my openstack, my compute > > > > > > > has 32 core / 32G memory and i have created two instance on compute > > > > > > > node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > > > > > > > kept 2 core for compute node). on compute node i disabled overcommit > > > > > > > using ratio (1.0) > > > > > > > > > > > > > > I didn't configure NUMA yet because i wasn't aware of this feature, as > > > > > > > per your last post do you think numa will help to fix this issue? > > > > > > > following is my numa view > > > > > > > > > > > > > > [root at compute-47 ~]# numactl --hardware > > > > > > > available: 2 nodes (0-1) > > > > > > > node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > > > > > > > node 0 size: 16349 MB > > > > > > > node 0 free: 133 MB > > > > > > > node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > > > > > > > node 1 size: 16383 MB > > > > > > > node 1 free: 317 MB > > > > > > > node distances: > > > > > > > node 0 1 > > > > > > > 0: 10 20 > > > > > > > 1: 20 10 > > > > > > > > > > > > > > > > > > > > > I am not using any L3 router, i am using provide VLAN network and > > > > > > > using Cisco Nexus switch for my L3 function so i am not seeing any > > > > > > > bottleneck there. > > > > > > > > > > > > > > This is the 10G NIC i have on all my compute node, dual 10G port with > > > > > > > bonding (20G) > > > > > > > > > > > > > > 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > > > > Gigabit Ethernet (rev 10) > > > > > > > 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > > > > Gigabit Ethernet (rev 10) > > > > > > > > > > > > > > > > > > > > >> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: > > > > > > >> > > > > > > >> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > > > > > > >> > > > > > > >> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. > > > > > > >> > > > > > > >> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. 
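For the "flavor with special metadata and dedicated Host Agg" approach mentioned above, a hedged sketch along the lines of [a] (flavor and aggregate names are only examples, compute-47 is the host from earlier in the thread, and the scheduler needs NUMATopologyFilter / AggregateInstanceExtraSpecsFilter enabled):

    openstack flavor set voip.pinned \
      --property hw:cpu_policy=dedicated \
      --property hw:numa_nodes=1 \
      --property aggregate_instance_extra_specs:pinned=true
    openstack aggregate create numa-pinned
    openstack aggregate set --property pinned=true numa-pinned
    openstack aggregate add host numa-pinned compute-47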
> > > > > > >> > > > > > > >> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. > > > > > > >> > > > > > > >> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > > > > > > >> > > > > > > >> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > > > > > > >> > > > > > > >> Regards, > > > > > > >> Liping Mao > > > > > > >> > > > > > > >> 在 2018年9月16日,21:18,Satish Patel 写道: > > > > > > >> > > > > > > >> Hi Liping, > > > > > > >> > > > > > > >> Thank you for your reply, > > > > > > >> > > > > > > >> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. > > > > > > >> > > > > > > >> For SRIOV I have to look if I have support in my nic. > > > > > > >> > > > > > > >> We are using queens so I think queue size option not possible :( > > > > > > >> > > > > > > >> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. > > > > > > >> > > > > > > >> I will share my result as soon as I try multiqueue. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Sent from my iPhone > > > > > > >> > > > > > > >> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > > > > > > >> > > > > > > >> > > > > > > >> Hi Satish, > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Did your packet loss happen always or it only happened when heavy load? > > > > > > >> > > > > > > >> > > > > > > >> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> If it happened in heavy load, couple of things you can try: > > > > > > >> > > > > > > >> > > > > > > >> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > > > > > > >> > > > > > > >> > > > > > > >> 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > > > > > > >> > > > > > > >> > > > > > > >> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. 
> > > > > > >> > > > > > > >> > > > > > > >> I did not actually used them in our env, you may refer to [3] > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > > > > > > >> > > > > > > >> > > > > > > >> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > > > > > > >> > > > > > > >> > > > > > > >> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Regards, > > > > > > >> > > > > > > >> > > > > > > >> Liping Mao > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> 在 2018/9/16 13:07,“Satish Patel” 写入: > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > > > > > > >> > > > > > > >> > > > > > > >> RX errors 0 dropped 0 overruns 0 frame 0 > > > > > > >> > > > > > > >> > > > > > > >> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Noticed tap interface dropping TX packets and even after increasing > > > > > > >> > > > > > > >> > > > > > > >> txqueue from 1000 to 10000 nothing changed, still getting packet > > > > > > >> > > > > > > >> > > > > > > >> drops. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Folks, > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> I need some advice or suggestion to find out what is going on with my > > > > > > >> > > > > > > >> > > > > > > >> network, we have notice high packet loss on openstack instance and not > > > > > > >> > > > > > > >> > > > > > > >> sure what is going on, same time if i check on host machine and it has > > > > > > >> > > > > > > >> > > > > > > >> zero packet loss.. this is what i did for test... > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> ping 8.8.8.8 > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> from instance: 50% packet loss > > > > > > >> > > > > > > >> > > > > > > >> from compute host: 0% packet loss > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> I have disabled TSO/GSO/SG setting on physical compute node but still > > > > > > >> > > > > > > >> > > > > > > >> getting packet loss. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> We have 10G NIC on our network, look like something related to tap > > > > > > >> > > > > > > >> > > > > > > >> interface setting.. 
> > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> _______________________________________________ > > > > > > >> > > > > > > >> > > > > > > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > > >> > > > > > > >> > > > > > > >> Post to : openstack at lists.openstack.org > > > > > > >> > > > > > > >> > > > > > > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > From limao at cisco.com Mon Sep 17 05:27:32 2018 From: limao at cisco.com (Liping Mao (limao)) Date: Mon, 17 Sep 2018 05:27:32 +0000 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> <8089BF19-A95B-4CF5-A2D4-0CB2B7415362@cisco.com> Message-ID: > Question: I have br-vlan interface mapp with bond0 to run my VM (VLAN traffic), so do i need to do anything in bond0 to enable VF/PF function? Just confused because currently my VM nic map with compute node br-vlan bridge. I had not actually used SRIOV in my env~ maybe others could help. Thanks, Liping Mao 在 2018/9/17 11:48,“Satish Patel” 写入: Thanks Liping, I will check bug for tx/rx queue size and see if i can make it work but look like my 10G NIC support SR-IOV so i am trying that path because it will be better for long run. I have deploy my cloud using openstack-ansible so now i need to figure out how do i wire that up with openstack-ansible deployment, here is the article [1] Question: I have br-vlan interface mapp with bond0 to run my VM (VLAN traffic), so do i need to do anything in bond0 to enable VF/PF function? Just confused because currently my VM nic map with compute node br-vlan bridge. [root at compute-65 ~]# lspci -nn | grep -i ethernet 03:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10) 03:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10) 03:01.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.2 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.3 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.4 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.5 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] 03:01.6 Ethernet controller [0200]: Broadcom Limited NetXtreme II BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] [1] https://docs.openstack.org/openstack-ansible-os_neutron/latest/configure-network-services.html On Sun, Sep 16, 2018 at 7:06 PM Liping Mao (limao) wrote: > > Hi Satish, > > > > > > There are hard limitations in nova's code, I did not actually used more thant 8 queues: > > def _get_max_tap_queues(self): > > # NOTE(kengo.sakai): In kernels prior to 3.0, > > # multiple queues on a tap interface is not supported. > > # In kernels 3.x, the number of queues on a tap interface > > # is limited to 8. 
From 4.0, the number is 256. > > # See: https://bugs.launchpad.net/nova/+bug/1570631 > > kernel_version = int(os.uname()[2].split(".")[0]) > > if kernel_version <= 2: > > return 1 > > elif kernel_version == 3: > > return 8 > > elif kernel_version == 4: > > return 256 > > else: > > return None > > > > > I am currently playing with those setting and trying to generate > > traffic with hping3 tools, do you have any tool to test traffic > > performance for specially udp style small packets. > > > > Hping3 is good enough to reproduce it, we have app level test tool, but that is not your case. > > > > > > > Here i am trying to increase rx_queue_size & tx_queue_size but its not > > working somehow. I have tired following. > > > > Since you are not rocky code, it should only works in qemu.conf, maybe check if this bug[1] affect you. > > > > > > > Is there a way i can automate this last task to update queue number > > action after reboot vm :) otherwise i can use cloud-init to make sure > > all VM build with same config. > > > > Cloud-init or rc.local could be the place to do that. > > > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1541960 > > > > Regards, > > Liping Mao > > > > 在 2018/9/17 04:09,“Satish Patel” 写入: > > > > Update on my last email. > > > > I am able to achieve 150kpps with queue=8 and my goal is to do 300kpps > > because some of voice application using 300kps. > > > > Here i am trying to increase rx_queue_size & tx_queue_size but its not > > working somehow. I have tired following. > > > > 1. add rx/tx size in /etc/nova/nova.conf in libvirt section - (didn't work) > > 2. add /etc/libvirtd/qemu.conf - (didn't work) > > > > I have try to edit virsh edit file but somehow my changes not > > getting reflected, i did virsh define after change and hard > > reboot guest but no luck.. how do i edit that option in xml if i want > > to do that? > > On Sun, Sep 16, 2018 at 1:41 PM Satish Patel wrote: > > > > > > I successful reproduce this error with hping3 tool and look like > > > multiqueue is our solution :) but i have few question you may have > > > answer of that. > > > > > > 1. I have created two instance (vm1.example.com & vm2.example.com) > > > > > > 2. I have flood traffic from vm1 using "hping3 vm2.example.com > > > --flood" and i have noticed drops on tap interface. ( This is without > > > multiqueue) > > > > > > 3. Enable multiqueue in image and run same test and again got packet > > > drops on tap interface ( I didn't update queue on vm2 guest, so > > > definitely i was expecting packet drops) > > > > > > 4. Now i have try to update vm2 queue using ethtool and i got > > > following error, I have 15vCPU and i was trying to add 15 queue > > > > > > [root at bar-mq ~]# ethtool -L eth0 combined 15 > > > Cannot set device channel parameters: Invalid argument > > > > > > Then i have tried 8 queue which works. > > > > > > [root at bar-mq ~]# ethtool -L eth0 combined 8 > > > combined unmodified, ignoring > > > no channel parameters changed, aborting > > > current values: tx 0 rx 0 other 0 combined 8 > > > > > > Now i am not seeing any packet drops on tap interface, I have measure > > > PPS and i was able to get 160kpps without packet drops. > > > > > > Question: > > > > > > 1. why i am not able to add 15 queue? ( is this NIC or driver limitation?) > > > 2. how do i automate "ethtool -L eth0 combined 8" command in instance > > > so i don't need to tell my customer to do this manually? 
> > > On Sun, Sep 16, 2018 at 11:53 AM Satish Patel wrote: > > > > > > > > Hi Liping, > > > > > > > > >> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > > > > > Is there a way i can automate this last task to update queue number > > > > action after reboot vm :) otherwise i can use cloud-init to make sure > > > > all VM build with same config. > > > > On Sun, Sep 16, 2018 at 11:51 AM Satish Patel wrote: > > > > > > > > > > I am currently playing with those setting and trying to generate > > > > > traffic with hping3 tools, do you have any tool to test traffic > > > > > performance for specially udp style small packets. > > > > > > > > > > I am going to share all my result and see what do you feel because i > > > > > have noticed you went through this pain :) I will try every single > > > > > option which you suggested to make sure we are good before i move > > > > > forward to production. > > > > > On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) wrote: > > > > > > > > > > > > I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > > > > > > > > > Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. > > > > > > > > > > > > You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. And your Nexus 5k/7k will not be bottleneck for sure ;-) > > > > > > > > > > > > > > > > > > Thanks, > > > > > > Liping Mao > > > > > > > > > > > > > 在 2018年9月16日,23:09,Satish Patel 写道: > > > > > > > > > > > > > > Thanks Liping, > > > > > > > > > > > > > > I am using libvertd 3.9.0 version so look like i am eligible take > > > > > > > advantage of that feature. phew! > > > > > > > > > > > > > > [root at compute-47 ~]# libvirtd -V > > > > > > > libvirtd (libvirt) 3.9.0 > > > > > > > > > > > > > > Let me tell you how i am running instance on my openstack, my compute > > > > > > > has 32 core / 32G memory and i have created two instance on compute > > > > > > > node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > > > > > > > kept 2 core for compute node). on compute node i disabled overcommit > > > > > > > using ratio (1.0) > > > > > > > > > > > > > > I didn't configure NUMA yet because i wasn't aware of this feature, as > > > > > > > per your last post do you think numa will help to fix this issue? > > > > > > > following is my numa view > > > > > > > > > > > > > > [root at compute-47 ~]# numactl --hardware > > > > > > > available: 2 nodes (0-1) > > > > > > > node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > > > > > > > node 0 size: 16349 MB > > > > > > > node 0 free: 133 MB > > > > > > > node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > > > > > > > node 1 size: 16383 MB > > > > > > > node 1 free: 317 MB > > > > > > > node distances: > > > > > > > node 0 1 > > > > > > > 0: 10 20 > > > > > > > 1: 20 10 > > > > > > > > > > > > > > > > > > > > > I am not using any L3 router, i am using provide VLAN network and > > > > > > > using Cisco Nexus switch for my L3 function so i am not seeing any > > > > > > > bottleneck there. 
> > > > > > > > > > > > > > This is the 10G NIC i have on all my compute node, dual 10G port with > > > > > > > bonding (20G) > > > > > > > > > > > > > > 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > > > > Gigabit Ethernet (rev 10) > > > > > > > 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > > > > Gigabit Ethernet (rev 10) > > > > > > > > > > > > > > > > > > > > >> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: > > > > > > >> > > > > > > >> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > > > > > > >> > > > > > > >> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. > > > > > > >> > > > > > > >> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. > > > > > > >> > > > > > > >> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. > > > > > > >> > > > > > > >> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > > > > > > >> > > > > > > >> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > > > > > > >> > > > > > > >> Regards, > > > > > > >> Liping Mao > > > > > > >> > > > > > > >> 在 2018年9月16日,21:18,Satish Patel 写道: > > > > > > >> > > > > > > >> Hi Liping, > > > > > > >> > > > > > > >> Thank you for your reply, > > > > > > >> > > > > > > >> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. > > > > > > >> > > > > > > >> For SRIOV I have to look if I have support in my nic. > > > > > > >> > > > > > > >> We are using queens so I think queue size option not possible :( > > > > > > >> > > > > > > >> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. > > > > > > >> > > > > > > >> I will share my result as soon as I try multiqueue. 
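For reference, the "flavor with special metadata" and "enable multiqueue in image" steps mentioned above usually boil down to one flavor extra spec and one image property. A rough sketch that simply wraps the openstack CLI, assuming admin credentials are already loaded in the environment; the flavor and image names and the disk size are placeholders:

---cut here---
# Hedged sketch: pinned-CPU flavor plus virtio multiqueue image property.
# "voip.pinned" / "centos7-voip" and the sizes are made-up placeholders;
# hw:cpu_policy=dedicated and hw_vif_multiqueue_enabled are the knobs
# behind the CPU-pinning article [a] and the multiqueue spec [1] above.
import subprocess

def run(*cmd):
    print(" ".join(cmd))
    subprocess.check_call(cmd)

# Dedicated (pinned) vCPUs for the 15 vCPU / 14G VoIP flavor from this thread.
run("openstack", "flavor", "create", "--vcpus", "15", "--ram", "14336",
    "--disk", "40", "voip.pinned")
run("openstack", "flavor", "set", "--property", "hw:cpu_policy=dedicated",
    "voip.pinned")

# Ask nova to wire one virtio queue per vCPU (capped by the guest kernel).
run("openstack", "image", "set", "--property",
    "hw_vif_multiqueue_enabled=true", "centos7-voip")
---cut here---

New instances then need to be booted from the tagged image with the pinned flavor for the properties to take effect, and the in-guest ethtool step from the sketch earlier in this thread is still required afterwards.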
> > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Sent from my iPhone > > > > > > >> > > > > > > >> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > > > > > > >> > > > > > > >> > > > > > > >> Hi Satish, > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Did your packet loss happen always or it only happened when heavy load? > > > > > > >> > > > > > > >> > > > > > > >> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> If it happened in heavy load, couple of things you can try: > > > > > > >> > > > > > > >> > > > > > > >> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > > > > > > >> > > > > > > >> > > > > > > >> 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > > > > > > >> > > > > > > >> > > > > > > >> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. > > > > > > >> > > > > > > >> > > > > > > >> I did not actually used them in our env, you may refer to [3] > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > > > > > > >> > > > > > > >> > > > > > > >> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > > > > > > >> > > > > > > >> > > > > > > >> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Regards, > > > > > > >> > > > > > > >> > > > > > > >> Liping Mao > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> 在 2018/9/16 13:07,“Satish Patel” 写入: > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > > > > > > >> > > > > > > >> > > > > > > >> RX errors 0 dropped 0 overruns 0 frame 0 > > > > > > >> > > > > > > >> > > > > > > >> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Noticed tap interface dropping TX packets and even after increasing > > > > > > >> > > > > > > >> > > > > > > >> txqueue from 1000 to 10000 nothing changed, still getting packet > > > > > > >> > > > > > > >> > > > > > > >> drops. 
> > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> Folks, > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> I need some advice or suggestion to find out what is going on with my > > > > > > >> > > > > > > >> > > > > > > >> network, we have notice high packet loss on openstack instance and not > > > > > > >> > > > > > > >> > > > > > > >> sure what is going on, same time if i check on host machine and it has > > > > > > >> > > > > > > >> > > > > > > >> zero packet loss.. this is what i did for test... > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> ping 8.8.8.8 > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> from instance: 50% packet loss > > > > > > >> > > > > > > >> > > > > > > >> from compute host: 0% packet loss > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> I have disabled TSO/GSO/SG setting on physical compute node but still > > > > > > >> > > > > > > >> > > > > > > >> getting packet loss. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> We have 10G NIC on our network, look like something related to tap > > > > > > >> > > > > > > >> > > > > > > >> interface setting.. > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> _______________________________________________ > > > > > > >> > > > > > > >> > > > > > > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > > >> > > > > > > >> > > > > > > >> Post to : openstack at lists.openstack.org > > > > > > >> > > > > > > >> > > > > > > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > From eblock at nde.ag Mon Sep 17 07:55:37 2018 From: eblock at nde.ag (Eugen Block) Date: Mon, 17 Sep 2018 07:55:37 +0000 Subject: [Openstack] boot order with multiple attachments In-Reply-To: References: <8ddbf904-27bd-ecbf-3a13-efc3f697067b@gmx.com> Message-ID: <20180917075537.Horde.QwKH-mYRZAKRJSDpJ0K25pT@webmail.nde.ag> Hi Volodymyr, I didn't really try to reproduce this, but here's an excerpt from a template we have been using successfully: ---cut here--- [...] vm-vda: type: OS::Cinder::Volume properties: description: VM vda image: image-vda name: disk-vda size: 100 vm-vdb: type: OS::Cinder::Volume properties: description: VM vdb image: image-vdb name: disk-vdb size: 120 vm: type: OS::Nova::Server depends_on: [vm_subnet, vm_floating_port, vm-vda, vm-vdb, service] properties: flavor: big-flavor block_device_mapping: - { device_name: "vda", volume_id : { get_resource : vm-vda }, delete_on_termination : "true" } - { device_name: "vdb", volume_id : { get_resource : vm-vdb }, delete_on_termination : "true" } networks: [...] ---cut here--- So basically, this way you tell the instance which volume has to be /dev/vda, vdb etc. We don't use any boot_index for this. Hope this helps! Regards, Eugen Zitat von Volodymyr Litovka : > Hi again, > > there is similar case - https://bugs.launchpad.net/nova/+bug/1570107 > - but I get same result (booting from VOLUME2) regardless of whether > I use or don't use device_type/disk_bus properties in BDM description. > > Any ideas on how to solve this issue? > > Thanks. 
> > On 9/11/18 10:58 AM, Volodymyr Litovka wrote: >> Hi colleagues, >> >> is there any mechanism to ensure boot disk when attaching more than >> two volumes to server? At the moment, I can't find a way to make it >> predictable. >> >> I have two bootable images with the following properties: >> 1) hw_boot_menu='true', hw_disk_bus='scsi', >> hw_qemu_guest_agent='yes', hw_scsi_model='virtio-scsi', >> img_hide_hypervisor_id='true', locations='[{u'url': >> u'swift+config:...', u'metadata': {}}]' >> >> which corresponds to the following volume: >> >> - attachments: [{u'server_id': u'...', u'attachment_id': u'...', >> u'attached_at': u'...', u'host_name': u'...', u'volume_id': >> u'', u'device': u'/dev/sda', u'id': u'...'}] >> - volume_image_metadata: {u'checksum': u'...', >> u'hw_qemu_guest_agent': u'yes', u'disk_format': u'raw', >> u'image_name': u'bionic-Qpub', u'hw_scsi_model': u'virtio-scsi', >> u'image_id': u'...', u'hw_boot_menu': u'true', u'min_ram': u'0', >> u'container_format': u'bare', u'min_disk': u'0', >> u'img_hide_hypervisor_id': u'true', u'hw_disk_bus': u'scsi', >> u'size': u'...'} >> >> and second image: >> 2) hw_disk_bus='scsi', hw_qemu_guest_agent='yes', >> hw_scsi_model='virtio-scsi', img_hide_hypervisor_id='true', >> locations='[{u'url': u'cinder://...', u'metadata': {}}]' >> >> which corresponds to the following volume: >> >> - attachments: [{u'server_id': u'...', u'attachment_id': u'...', >> u'attached_at': u'...', u'host_name': u'...', u'volume_id': >> u'', u'device': u'/dev/sdb', u'id': u'...'}] >> - volume_image_metadata: {u'checksum': u'...', >> u'hw_qemu_guest_agent': u'yes', u'disk_format': u'raw', >> u'image_name': u'xenial', u'hw_scsi_model': u'virtio-scsi', >> u'image_id': u'...', u'min_ram': u'0', u'container_format': >> u'bare', u'min_disk': u'0', u'img_hide_hypervisor_id': u'true', >> u'hw_disk_bus': u'scsi', u'size': u'...'} >> >> Using Heat, I'm creating the following block_devices_mapping_v2 scheme: >> >> block_device_mapping_v2: >>         - volume_id: >>           delete_on_termination: false >>           device_type: disk >>           disk_bus: scsi >>           boot_index: 0 >>         - volume_id: >>           delete_on_termination: false >>           device_type: disk >>           disk_bus: scsi >>           boot_index: -1 >> >> which maps to the following nova-api debug log: >> >> Action: 'create', calling method: > ServersController.create of >> > 0x7f6b08dd4890>>, body: {"ser >> ver": {"name": "jex-n1", "imageRef": "", "block_device_mapping_v2": >> [{"boot_index": 0, "uuid": "", "disk_bus": "scsi", >> "source_type": "volume" >> , "device_type": "disk", "destination_type": "volume", >> "delete_on_termination": false}, {"boot_index": -1, "uuid": >> "", "disk_bus": "scsi", "so >> urce_type": "volume", "device_type": "disk", "destination_type": >> "volume", "delete_on_termination": false}], "flavorRef": >> "4b3da838-3d81-461a-b946-d3613fb6f4b3", "user_data": "...", >> "max_count": 1, "min_count": 1, "networks": [{"port": >> "9044f884-1a3d-4dc6-981e-f585f5e45dd1"}], "config_drive": true}} >> _process_stack >> /usr/lib/python2.7/dist-packages/nova/api/openstack/wsgi.py:604 >> >> Regardless of boot_index value, server boots from VOLUME2 >> (/dev/sdb), while having attached VOLUME1 as well as /dev/sda >> >> I'm using Queens. Where I'm wrong? >> >> Thank you. >> > > -- > Volodymyr Litovka > "Vision without Execution is Hallucination." 
-- Thomas Edison > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From satish.txt at gmail.com Mon Sep 17 12:33:42 2018 From: satish.txt at gmail.com (Satish Patel) Date: Mon, 17 Sep 2018 08:33:42 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> <8089BF19-A95B-4CF5-A2D4-0CB2B7415362@cisco.com> Message-ID: Thanks Liping, I will try to reach out or open a new thread to get SR-IOV info. By the way, what version of OpenStack are you running, and what hardware, especially which NIC? Just trying to see whether it's hardware related. I'm running kernel 3.10.x; do you think the kernel could be a factor? Sent from my iPhone On Sep 17, 2018, at 1:27 AM, Liping Mao (limao) wrote: >> Question: I have br-vlan interface mapp with bond0 to run my VM (VLAN > > traffic), so do i need to do anything in bond0 to enable VF/PF > > function? Just confused because currently my VM nic map with compute > > node br-vlan bridge.
> > > > [root at compute-65 ~]# lspci -nn | grep -i ethernet > > 03:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10) > > 03:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10) > > 03:01.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.2 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.3 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.4 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.5 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.6 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > > > > > [1] https://docs.openstack.org/openstack-ansible-os_neutron/latest/configure-network-services.html > >> On Sun, Sep 16, 2018 at 7:06 PM Liping Mao (limao) wrote: >> >> > >> Hi Satish, > >> > >> > >> > >> > >> > >> There are hard limitations in nova's code, I did not actually used more thant 8 queues: > >> > >> def _get_max_tap_queues(self): > >> > >> # NOTE(kengo.sakai): In kernels prior to 3.0, > >> > >> # multiple queues on a tap interface is not supported. > >> > >> # In kernels 3.x, the number of queues on a tap interface > >> > >> # is limited to 8. From 4.0, the number is 256. > >> > >> # See: https://bugs.launchpad.net/nova/+bug/1570631 > >> > >> kernel_version = int(os.uname()[2].split(".")[0]) > >> > >> if kernel_version <= 2: > >> > >> return 1 > >> > >> elif kernel_version == 3: > >> > >> return 8 > >> > >> elif kernel_version == 4: > >> > >> return 256 > >> > >> else: > >> > >> return None > >> > >> > >> > >>> I am currently playing with those setting and trying to generate > >> > >> traffic with hping3 tools, do you have any tool to test traffic > >> > >> performance for specially udp style small packets. > >> > >> > >> > >> Hping3 is good enough to reproduce it, we have app level test tool, but that is not your case. > >> > >> > >> > >> > >> > >>> Here i am trying to increase rx_queue_size & tx_queue_size but its not > >> > >> working somehow. I have tired following. > >> > >> > >> > >> Since you are not rocky code, it should only works in qemu.conf, maybe check if this bug[1] affect you. > >> > >> > >> > >> > >> > >>> Is there a way i can automate this last task to update queue number > >> > >> action after reboot vm :) otherwise i can use cloud-init to make sure > >> > >> all VM build with same config. > >> > >> > >> > >> Cloud-init or rc.local could be the place to do that. > >> > >> > >> > >> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1541960 > >> > >> > >> > >> Regards, > >> > >> Liping Mao > >> > >> > >> > >> 在 2018/9/17 04:09,“Satish Patel” 写入: > >> > >> > >> > >> Update on my last email. > >> > >> > >> > >> I am able to achieve 150kpps with queue=8 and my goal is to do 300kpps > >> > >> because some of voice application using 300kps. 
> >> > >> > >> > >> Here i am trying to increase rx_queue_size & tx_queue_size but its not > >> > >> working somehow. I have tired following. > >> > >> > >> > >> 1. add rx/tx size in /etc/nova/nova.conf in libvirt section - (didn't work) > >> > >> 2. add /etc/libvirtd/qemu.conf - (didn't work) > >> > >> > >> > >> I have try to edit virsh edit file but somehow my changes not > >> > >> getting reflected, i did virsh define after change and hard > >> > >> reboot guest but no luck.. how do i edit that option in xml if i want > >> > >> to do that? > >> > >>> On Sun, Sep 16, 2018 at 1:41 PM Satish Patel wrote: >> >> > >>> > >> > >>> I successful reproduce this error with hping3 tool and look like > >> > >>> multiqueue is our solution :) but i have few question you may have > >> > >>> answer of that. > >> > >>> > >> > >>> 1. I have created two instance (vm1.example.com & vm2.example.com) > >> > >>> > >> > >>> 2. I have flood traffic from vm1 using "hping3 vm2.example.com > >> > >>> --flood" and i have noticed drops on tap interface. ( This is without > >> > >>> multiqueue) > >> > >>> > >> > >>> 3. Enable multiqueue in image and run same test and again got packet > >> > >>> drops on tap interface ( I didn't update queue on vm2 guest, so > >> > >>> definitely i was expecting packet drops) > >> > >>> > >> > >>> 4. Now i have try to update vm2 queue using ethtool and i got > >> > >>> following error, I have 15vCPU and i was trying to add 15 queue > >> > >>> > >> > >>> [root at bar-mq ~]# ethtool -L eth0 combined 15 > >> > >>> Cannot set device channel parameters: Invalid argument > >> > >>> > >> > >>> Then i have tried 8 queue which works. > >> > >>> > >> > >>> [root at bar-mq ~]# ethtool -L eth0 combined 8 > >> > >>> combined unmodified, ignoring > >> > >>> no channel parameters changed, aborting > >> > >>> current values: tx 0 rx 0 other 0 combined 8 > >> > >>> > >> > >>> Now i am not seeing any packet drops on tap interface, I have measure > >> > >>> PPS and i was able to get 160kpps without packet drops. > >> > >>> > >> > >>> Question: > >> > >>> > >> > >>> 1. why i am not able to add 15 queue? ( is this NIC or driver limitation?) > >> > >>> 2. how do i automate "ethtool -L eth0 combined 8" command in instance > >> > >>> so i don't need to tell my customer to do this manually? > >> > >>>> On Sun, Sep 16, 2018 at 11:53 AM Satish Patel wrote: >> >> > >>>> > >> > >>>> Hi Liping, > >> > >>>> > >> > >>>>>> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > >> > >>>> > >> > >>>> Is there a way i can automate this last task to update queue number > >> > >>>> action after reboot vm :) otherwise i can use cloud-init to make sure > >> > >>>> all VM build with same config. > >> > >>>>> On Sun, Sep 16, 2018 at 11:51 AM Satish Patel wrote: >> >> > >>>>> > >> > >>>>> I am currently playing with those setting and trying to generate > >> > >>>>> traffic with hping3 tools, do you have any tool to test traffic > >> > >>>>> performance for specially udp style small packets. > >> > >>>>> > >> > >>>>> I am going to share all my result and see what do you feel because i > >> > >>>>> have noticed you went through this pain :) I will try every single > >> > >>>>> option which you suggested to make sure we are good before i move > >> > >>>>> forward to production. 
> >> > >>>>>> On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) wrote: >> >> > >>>>>> > >> > >>>>>> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > >> > >>>>>> > >> > >>>>>> Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. > >> > >>>>>> > >> > >>>>>> You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. And your Nexus 5k/7k will not be bottleneck for sure ;-) > >> > >>>>>> > >> > >>>>>> > >> > >>>>>> Thanks, > >> > >>>>>> Liping Mao > >> > >>>>>> > >> > >>>>>>>> 在 2018年9月16日,23:09,Satish Patel 写道: >> >> > >>>>>>> > >> > >>>>>>> Thanks Liping, > >> > >>>>>>> > >> > >>>>>>> I am using libvertd 3.9.0 version so look like i am eligible take > >> > >>>>>>> advantage of that feature. phew! > >> > >>>>>>> > >> > >>>>>>> [root at compute-47 ~]# libvirtd -V > >> > >>>>>>> libvirtd (libvirt) 3.9.0 > >> > >>>>>>> > >> > >>>>>>> Let me tell you how i am running instance on my openstack, my compute > >> > >>>>>>> has 32 core / 32G memory and i have created two instance on compute > >> > >>>>>>> node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > >> > >>>>>>> kept 2 core for compute node). on compute node i disabled overcommit > >> > >>>>>>> using ratio (1.0) > >> > >>>>>>> > >> > >>>>>>> I didn't configure NUMA yet because i wasn't aware of this feature, as > >> > >>>>>>> per your last post do you think numa will help to fix this issue? > >> > >>>>>>> following is my numa view > >> > >>>>>>> > >> > >>>>>>> [root at compute-47 ~]# numactl --hardware > >> > >>>>>>> available: 2 nodes (0-1) > >> > >>>>>>> node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > >> > >>>>>>> node 0 size: 16349 MB > >> > >>>>>>> node 0 free: 133 MB > >> > >>>>>>> node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > >> > >>>>>>> node 1 size: 16383 MB > >> > >>>>>>> node 1 free: 317 MB > >> > >>>>>>> node distances: > >> > >>>>>>> node 0 1 > >> > >>>>>>> 0: 10 20 > >> > >>>>>>> 1: 20 10 > >> > >>>>>>> > >> > >>>>>>> > >> > >>>>>>> I am not using any L3 router, i am using provide VLAN network and > >> > >>>>>>> using Cisco Nexus switch for my L3 function so i am not seeing any > >> > >>>>>>> bottleneck there. > >> > >>>>>>> > >> > >>>>>>> This is the 10G NIC i have on all my compute node, dual 10G port with > >> > >>>>>>> bonding (20G) > >> > >>>>>>> > >> > >>>>>>> 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > >> > >>>>>>> Gigabit Ethernet (rev 10) > >> > >>>>>>> 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > >> > >>>>>>> Gigabit Ethernet (rev 10) > >> > >>>>>>> > >> > >>>>>>> > >> > >>>>>>>>> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: >> >> > >>>>>>>> > >> > >>>>>>>> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > >> > >>>>>>>> > >> > >>>>>>>> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. 
you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. > >> > >>>>>>>> > >> > >>>>>>>> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. > >> > >>>>>>>> > >> > >>>>>>>> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. > >> > >>>>>>>> > >> > >>>>>>>> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > >> > >>>>>>>> > >> > >>>>>>>> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > >> > >>>>>>>> > >> > >>>>>>>> Regards, > >> > >>>>>>>> Liping Mao > >> > >>>>>>>> > >> > >>>>>>>>> 在 2018年9月16日,21:18,Satish Patel 写道: >> >> > >>>>>>>> > >> > >>>>>>>> Hi Liping, > >> > >>>>>>>> > >> > >>>>>>>> Thank you for your reply, > >> > >>>>>>>> > >> > >>>>>>>> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. > >> > >>>>>>>> > >> > >>>>>>>> For SRIOV I have to look if I have support in my nic. > >> > >>>>>>>> > >> > >>>>>>>> We are using queens so I think queue size option not possible :( > >> > >>>>>>>> > >> > >>>>>>>> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. > >> > >>>>>>>> > >> > >>>>>>>> I will share my result as soon as I try multiqueue. > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> Sent from my iPhone > >> > >>>>>>>> > >> > >>>>>>>>> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: >> >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> Hi Satish, > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> Did your packet loss happen always or it only happened when heavy load? > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> If it happened in heavy load, couple of things you can try: > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. 
You can check > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> I did not actually used them in our env, you may refer to [3] > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> Regards, > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> Liping Mao > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> 在 2018/9/16 13:07,“Satish Patel” 写入: > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> RX errors 0 dropped 0 overruns 0 frame 0 > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> Noticed tap interface dropping TX packets and even after increasing > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> txqueue from 1000 to 10000 nothing changed, still getting packet > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> drops. > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>>> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: >> >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> Folks, > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> I need some advice or suggestion to find out what is going on with my > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> network, we have notice high packet loss on openstack instance and not > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> sure what is going on, same time if i check on host machine and it has > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> zero packet loss.. this is what i did for test... > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> ping 8.8.8.8 > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> from instance: 50% packet loss > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> from compute host: 0% packet loss > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> I have disabled TSO/GSO/SG setting on physical compute node but still > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> getting packet loss. > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> We have 10G NIC on our network, look like something related to tap > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> interface setting.. 
> >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> _______________________________________________ > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> Post to : openstack at lists.openstack.org > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >>>>>>>> > >> > >> > >> > >> > > > > From openstack-dev at storpool.com Mon Sep 17 13:39:33 2018 From: openstack-dev at storpool.com (Peter Penchev) Date: Mon, 17 Sep 2018 16:39:33 +0300 Subject: [Openstack] [nova][cinder] Migrate instances between regions or between clusters? Message-ID: <20180917133933.GE4292@straylight.m.ringlet.net> Hi, So here's a possibly stupid question - or rather, a series of such :) Let's say a company has two (or five, or a hundred) datacenters in geographically different locations and wants to deploy OpenStack in both. What would be a deployment scenario that would allow relatively easy migration (cold, not live) of instances from one datacenter to another? My understanding is that for servers located far away from one another regions would be a better metaphor than availability zones, if only because it would be faster for the various storage, compute, etc. services to communicate with each other for the common case of doing actions within the same datacenter. Is this understanding wrong - is it considered all right for groups of servers located in far away places to be treated as different availability zones in the same cluster? If the groups of servers are put in different regions, though, this brings me to the real question: how can an instance be migrated across regions? Note that the instance will almost certainly have some shared-storage volume attached, and assume (not quite the common case, but still) that the underlying shared storage technology can be taught about another storage cluster in another location and can transfer volumes and snapshots to remote clusters. 
From what I've found, there are three basic ways: - do it pretty much by hand: create snapshots of the volumes used in the underlying storage system, transfer them to the other storage cluster, then tell the Cinder volume driver to manage them, and spawn an instance with the newly-managed newly-transferred volumes - use Cinder to backup the volumes from one region, then restore them to the other; if this is combined with a storage-specific Cinder backup driver that knows that "backing up" is "creating a snapshot" and "restoring to the other region" is "transferring that snapshot to the remote storage cluster", it seems to be the easiest way forward (once the Cinder backup driver has been written) - use Nova's "server image create" command, transfer the resulting Glance image somehow (possibly by downloading it from the Glance storage in one region and simulateneously uploading it to the Glance instance in the other), then spawn an instance off that image The "server image create" approach seems to be the simplest one, although it is a bit hard to imagine how it would work without transferring data unnecessarily (the online articles I've seen advocating it seem to imply that a Nova instance in a region cannot be spawned off a Glance image in another region, so there will need to be at least one set of "download the image and upload it to the other side", even if the volume-to-image and image-to-volume transfers are instantaneous, e.g. using glance-cinderclient). However, when I tried it with a Nova instance backed by a StorPool volume (no ephemeral image at all), the Glance image was zero bytes in length and only its metadata contained some information about a volume snapshot created at that point, so this seems once again to go back to options 1 and 2 for the different ways to transfer a Cinder volume or snapshot to the other region. Or have I missed something, is there a way to get the "server image create / image download / image create" route to handle volumes attached to the instance? So... have I missed something else, too, or are these the options for transferring a Nova instance between two distant locations? Thanks for reading this far, and thanks in advance for your help! Best regards, Peter -- Peter Penchev openstack-dev at storpool.com https://storpool.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: not available URL: From jaypipes at gmail.com Mon Sep 17 21:43:57 2018 From: jaypipes at gmail.com (Jay Pipes) Date: Mon, 17 Sep 2018 17:43:57 -0400 Subject: [Openstack] [nova][cinder] Migrate instances between regions or between clusters? In-Reply-To: <20180917133933.GE4292@straylight.m.ringlet.net> References: <20180917133933.GE4292@straylight.m.ringlet.net> Message-ID: On 09/17/2018 09:39 AM, Peter Penchev wrote: > Hi, > > So here's a possibly stupid question - or rather, a series of such :) > Let's say a company has two (or five, or a hundred) datacenters in > geographically different locations and wants to deploy OpenStack in both. > What would be a deployment scenario that would allow relatively easy > migration (cold, not live) of instances from one datacenter to another? > > My understanding is that for servers located far away from one another > regions would be a better metaphor than availability zones, if only > because it would be faster for the various storage, compute, etc. 
> services to communicate with each other for the common case of doing > actions within the same datacenter. Is this understanding wrong - is it > considered all right for groups of servers located in far away places to > be treated as different availability zones in the same cluster? > > If the groups of servers are put in different regions, though, this > brings me to the real question: how can an instance be migrated across > regions? Note that the instance will almost certainly have some > shared-storage volume attached, and assume (not quite the common case, > but still) that the underlying shared storage technology can be taught > about another storage cluster in another location and can transfer > volumes and snapshots to remote clusters. From what I've found, there > are three basic ways: > > - do it pretty much by hand: create snapshots of the volumes used in > the underlying storage system, transfer them to the other storage > cluster, then tell the Cinder volume driver to manage them, and spawn > an instance with the newly-managed newly-transferred volumes Yes, this is a perfectly reasonable solution. In fact, when I was at AT&T, this was basically how we allowed tenants to spin up instances in multiple regions: snapshot the instance, it gets stored in the Swift storage for the region, tenant starts the instance in a different region, and Nova pulls the image from the Swift storage in the other region. It's slow the first time it's launched in the new region, of course, since the bits need to be pulled from the other region's Swift storage, but after that, local image caching speeds things up quite a bit. This isn't migration, though. Namely, the tenant doesn't keep their instance ID, their instance's IP addresses, or anything like that. I've heard some users care about that stuff, unfortunately, which is why we have shelve [offload]. There's absolutely no way to perform a cross-region migration that keeps the instance ID and instance IP addresses. > - use Cinder to backup the volumes from one region, then restore them to > the other; if this is combined with a storage-specific Cinder backup > driver that knows that "backing up" is "creating a snapshot" and > "restoring to the other region" is "transferring that snapshot to the > remote storage cluster", it seems to be the easiest way forward (once > the Cinder backup driver has been written) Still won't have the same instance ID and IP address, which is what certain users tend to complain about needing with move operations. > - use Nova's "server image create" command, transfer the resulting > Glance image somehow (possibly by downloading it from the Glance > storage in one region and simulateneously uploading it to the Glance > instance in the other), then spawn an instance off that image Still won't have the same instance ID and IP address :) Best, -jay > The "server image create" approach seems to be the simplest one, > although it is a bit hard to imagine how it would work without > transferring data unnecessarily (the online articles I've seen > advocating it seem to imply that a Nova instance in a region cannot be > spawned off a Glance image in another region, so there will need to be > at least one set of "download the image and upload it to the other > side", even if the volume-to-image and image-to-volume transfers are > instantaneous, e.g. using glance-cinderclient). 
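To make the "server image create" route a bit more concrete for the image-backed case (as the next paragraph notes, it does not capture attached volumes), here is a rough openstacksdk sketch. The cloud names "dc1"/"dc2", all resource names, and the flavor/network are placeholders, and proxy method names can differ between SDK releases, so treat it as an outline of the workflow rather than a tested tool:

---cut here---
# Untested outline of option 3 for an image-backed instance: snapshot in the
# source region, copy the image bits, upload them to the destination
# region's Glance, then boot there. "dc1"/"dc2" are clouds.yaml entries.
import openstack

src = openstack.connect(cloud="dc1")
dst = openstack.connect(cloud="dc2")

server = src.compute.find_server("my-instance")
src.compute.stop_server(server)                       # cold move
src.compute.create_server_image(server, "my-instance-xfer")

snap = src.image.find_image("my-instance-xfer")
snap = src.image.wait_for_status(snap, status="active")

# Download from region one, re-upload to region two (stream large images).
with open("/tmp/my-instance-xfer.img", "wb") as f:
    f.write(src.image.download_image(snap))

new_image = dst.create_image("my-instance-xfer",
                             filename="/tmp/my-instance-xfer.img",
                             disk_format=snap.disk_format,
                             container_format=snap.container_format,
                             wait=True)

# The copy is a brand-new server: new UUID, new ports and IP addresses,
# exactly as noted above.
dst.create_server("my-instance", image=new_image.id, flavor="m1.medium",
                  network="private", wait=True)
---cut here---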
However, when I tried > it with a Nova instance backed by a StorPool volume (no ephemeral image > at all), the Glance image was zero bytes in length and only its metadata > contained some information about a volume snapshot created at that > point, so this seems once again to go back to options 1 and 2 for the > different ways to transfer a Cinder volume or snapshot to the other > region. Or have I missed something, is there a way to get the "server > image create / image download / image create" route to handle volumes > attached to the instance? > > So... have I missed something else, too, or are these the options for > transferring a Nova instance between two distant locations? > > Thanks for reading this far, and thanks in advance for your help! > > Best regards, > Peter > > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > From tsuyuzaki.kota at lab.ntt.co.jp Tue Sep 18 02:07:26 2018 From: tsuyuzaki.kota at lab.ntt.co.jp (Kota TSUYUZAKI) Date: Tue, 18 Sep 2018 11:07:26 +0900 Subject: [Openstack] Can any user add or delete OpenStack Swift middleware? In-Reply-To: References: Message-ID: <5BA05DDE.5060606@lab.ntt.co.jp> With Storlets, users will be able to create their own applications that are able to run like as a Swift middeleware. The application (currently Python and Java are supported as the language but the apps can calls any binaries in the workspace) can be uploaded as a Swift object, then, users can invoke them with just an extra header that specifies your apps. To fit your own use case, we may have to consider to invole or to integrate the system for you but I believe Storlets could be a choice for you. In detail, Storlets documantation is around there, Top Level Index: https://docs.openstack.org/storlets/latest/index.html System Overview: https://docs.openstack.org/storlets/latest/storlet_engine_overview.html APIs: https://docs.openstack.org/storlets/latest/api/overview_api.html Thanks, Kota (2018/09/17 8:59), John Dickinson wrote: > You may be interested in Storlets. It's another OpenStack project, maintained by a Swift core reviewer, that provides this sort of user-defined middleware functionality. > > You can also ask about it in #openstack-swift > > --John > > > > On 16 Sep 2018, at 9:25, Qiao Kang wrote: > >> Hi, >> >> I'm wondering whether Swift allows any user (not the administrator) to >> specify which middleware that she/he wants his data object to go throught. >> For instance, Alice wants to install a middleware but doesn't want Bob to >> use it, where Alice and Bob are two accounts in a single Swift cluster. >> >> Or maybe all middlewares are pre-installed globally and cannot be >> customized on a per-account basis? 
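As a small illustration of the "invoke them with just an extra header" part above, here is a minimal sketch using python-swiftclient. The storage URL, token, container/object names and the storlet name are placeholders, and the header name comes from the Storlets API document linked above:

---cut here---
# Minimal sketch: run an already-deployed storlet on a GET request.
# STORAGE_URL/TOKEN are assumed to come from a normal Keystone auth;
# "mystorlet-1.0.py" stands in for whatever storlet object was uploaded.
from swiftclient import client as swift_client

STORAGE_URL = "http://swift.example.com/v1/AUTH_tenant"   # placeholder
TOKEN = "gAAAA..."                                         # placeholder

# An ordinary object GET; the extra header asks Swift to run the storlet
# over the object data on its way back to the client.
resp_headers, body = swift_client.get_object(
    STORAGE_URL, TOKEN, "my-container", "my-object",
    headers={"X-Run-Storlet": "mystorlet-1.0.py"})
print(len(body))
---cut here---

The same header can also be sent on PUT, which is where the compress-on-upload discussion below picks up.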
>> >> Thanks, >> Qiao >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -- ---------------------------------------------------------- Kota Tsuyuzaki(露﨑 浩太) NTT Software Innovation Center Distributed Computing Technology Project Phone 0422-59-2837 Fax 0422-59-2965 ----------------------------------------------------------- From qiaokang1213 at gmail.com Tue Sep 18 02:34:58 2018 From: qiaokang1213 at gmail.com (Qiao Kang) Date: Mon, 17 Sep 2018 21:34:58 -0500 Subject: [Openstack] Can any user add or delete OpenStack Swift middleware? In-Reply-To: <5BA05DDE.5060606@lab.ntt.co.jp> References: <5BA05DDE.5060606@lab.ntt.co.jp> Message-ID: Kota, Thanks for your reply, very helpful! I know Storlets can provide user-defined computation functionalities, but I guess some capabilities can only be achieved using middleware. For example, a user may want such a feature: upon each PUT request, it creates a compressed copy of the object and stores both the original copy and compressed copy. It's feasible using middlware but I don't think Storlets provide such capability. Another example is that a user may want to install a Swift3-like middleware to provide APIs to a 3rd party, but she doesn't want other users to see this middleware. Regards, Qiao On Mon, Sep 17, 2018 at 9:19 PM Kota TSUYUZAKI wrote: > > With Storlets, users will be able to create their own applications that are able to run like as a Swift middeleware. The application (currently Python and Java are supported as the language but the > apps can calls any binaries in the workspace) can be uploaded as a Swift object, then, users can invoke them with just an extra header that specifies your apps. > > To fit your own use case, we may have to consider to invole or to integrate the system for you but I believe Storlets could be a choice for you. > > In detail, Storlets documantation is around there, > > Top Level Index: https://docs.openstack.org/storlets/latest/index.html > System Overview: https://docs.openstack.org/storlets/latest/storlet_engine_overview.html > APIs: https://docs.openstack.org/storlets/latest/api/overview_api.html > > Thanks, > > Kota > > (2018/09/17 8:59), John Dickinson wrote: > > You may be interested in Storlets. It's another OpenStack project, maintained by a Swift core reviewer, that provides this sort of user-defined middleware functionality. > > > > You can also ask about it in #openstack-swift > > > > --John > > > > > > > > On 16 Sep 2018, at 9:25, Qiao Kang wrote: > > > >> Hi, > >> > >> I'm wondering whether Swift allows any user (not the administrator) to > >> specify which middleware that she/he wants his data object to go throught. > >> For instance, Alice wants to install a middleware but doesn't want Bob to > >> use it, where Alice and Bob are two accounts in a single Swift cluster. > >> > >> Or maybe all middlewares are pre-installed globally and cannot be > >> customized on a per-account basis? 
> >> > >> Thanks, > >> Qiao > >> _______________________________________________ > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >> Post to : openstack at lists.openstack.org > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > _______________________________________________ > > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > Post to : openstack at lists.openstack.org > > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > -- > ---------------------------------------------------------- > Kota Tsuyuzaki(露﨑 浩太) > NTT Software Innovation Center > Distributed Computing Technology Project > Phone 0422-59-2837 > Fax 0422-59-2965 > ----------------------------------------------------------- > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From berndbausch at gmail.com Tue Sep 18 03:17:20 2018 From: berndbausch at gmail.com (Bernd Bausch) Date: Tue, 18 Sep 2018 12:17:20 +0900 Subject: [Openstack] [neutron][install] Neutron server fails citing unknown option auth_admin_prefix Message-ID: I am trying the Neutron Rocky installation guide, setting up the tenant networks option. The OS is Centos 7. After going through all configurations on the controller, the Neutron server fails to start with the following error message in its log:     NoSuchOptError: no such option auth_admin_prefix in group [keystone] This baffles me. There is no file under /etc that contains the string auth_admin_prefix. This config variable has been deprecated since Newton at least. Where does it come from? How can I remove it? I see the string mentioned in a few launchpad bug descriptions, but only in the context of deprecation warnings. The associated stack trace is reproduced below, plus a few additional log messages at the beginning. While "__file__" and "here" (mentioned in bug 1722444[1]) don't seem to trip Neutron or Keystone, "auth_admin_prefix" does. Thanks for inputs. Bernd. [1] https://bugs.launchpad.net/keystonemiddleware/+bug/1722444 --- 2018-09-18 11:16:10.421 16109 WARNING keystonemiddleware._common.config [req-52d15bb4-b80e-47fc-80f0-2e9a856e3796 - - - - -] The option "__file__" in conf is not known to auth_token 2018-09-18 11:16:10.422 16109 WARNING keystonemiddleware._common.config [req-52d15bb4-b80e-47fc-80f0-2e9a856e3796 - - - - -] The option "here" in conf is not known to auth_token 2018-09-18 11:16:10.427 16109 WARNING keystonemiddleware.auth_token [req-52d15bb4-b80e-47fc-80f0-2e9a856e3796 - - - - -] AuthToken middleware is set with keystone_authtoken.service_token_roles_required set to False. This is backwards compatible but deprecated behaviour. Please set this to True. 
2018-09-18 11:16:10.429 16109 ERROR neutron.service [req-52d15bb4-b80e-47fc-80f0-2e9a856e3796 - - - - -] Unrecoverable error: please check log for details.: NoSuchOptError: no such option auth_admin_prefix in group [keystone]
2018-09-18 11:22:22.937 18401 CRITICAL neutron [req-35daa9ce-b9ce-481f-a9af-6304528b843a - - - - -] Unhandled error: NoSuchOptError: no such option auth_admin_prefix in group [keystone]
2018-09-18 11:22:22.937 18401 ERROR neutron Traceback (most recent call last):
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/bin/neutron-server", line 10, in
2018-09-18 11:22:22.937 18401 ERROR neutron     sys.exit(main())
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/cmd/eventlet/server/__init__.py", line 19, in main
2018-09-18 11:22:22.937 18401 ERROR neutron     server.boot_server(wsgi_eventlet.eventlet_wsgi_server)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/server/__init__.py", line 68, in boot_server
2018-09-18 11:22:22.937 18401 ERROR neutron     server_func()
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/server/wsgi_eventlet.py", line 23, in eventlet_wsgi_server
2018-09-18 11:22:22.937 18401 ERROR neutron     neutron_api = service.serve_wsgi(service.NeutronApiService)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/service.py", line 89, in serve_wsgi
2018-09-18 11:22:22.937 18401 ERROR neutron     LOG.exception('Unrecoverable error: please check log '
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2018-09-18 11:22:22.937 18401 ERROR neutron     self.force_reraise()
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2018-09-18 11:22:22.937 18401 ERROR neutron     six.reraise(self.type_, self.value, self.tb)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/service.py", line 86, in serve_wsgi
2018-09-18 11:22:22.937 18401 ERROR neutron     service.start()
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/service.py", line 62, in start
2018-09-18 11:22:22.937 18401 ERROR neutron     self.wsgi_app = _run_wsgi(self.app_name)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/service.py", line 291, in _run_wsgi
2018-09-18 11:22:22.937 18401 ERROR neutron     app = config.load_paste_app(app_name)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/common/config.py", line 125, in load_paste_app
2018-09-18 11:22:22.937 18401 ERROR neutron     app = loader.load_app(app_name)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/oslo_service/wsgi.py", line 353, in load_app
2018-09-18 11:22:22.937 18401 ERROR neutron     return deploy.loadapp("config:%s" % self.config_path, name=name)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 247, in loadapp
2018-09-18 11:22:22.937 18401 ERROR neutron     return loadobj(APP, uri, name=name, **kw)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 272, in loadobj
2018-09-18 11:22:22.937 18401 ERROR neutron     return context.create()
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 710, in create
2018-09-18 11:22:22.937 18401 ERROR neutron     return self.object_type.invoke(self)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 144, in invoke
2018-09-18 11:22:22.937 18401 ERROR neutron     **context.local_conf)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/paste/deploy/util.py", line 55, in fix_call
2018-09-18 11:22:22.937 18401 ERROR neutron     val = callable(*args, **kw)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/paste/urlmap.py", line 25, in urlmap_factory
2018-09-18 11:22:22.937 18401 ERROR neutron     app = loader.get_app(app_name, global_conf=global_conf)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 350, in get_app
2018-09-18 11:22:22.937 18401 ERROR neutron     name=name, global_conf=global_conf).create()
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 710, in create
2018-09-18 11:22:22.937 18401 ERROR neutron     return self.object_type.invoke(self)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/paste/deploy/loadwsgi.py", line 144, in invoke
2018-09-18 11:22:22.937 18401 ERROR neutron     **context.local_conf)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/paste/deploy/util.py", line 55, in fix_call
2018-09-18 11:22:22.937 18401 ERROR neutron     val = callable(*args, **kw)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/neutron/auth.py", line 50, in pipeline_factory
2018-09-18 11:22:22.937 18401 ERROR neutron     app = filter(app)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/keystonemiddleware/auth_token/__init__.py", line 988, in auth_filter
2018-09-18 11:22:22.937 18401 ERROR neutron     return AuthProtocol(app, conf)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/keystonemiddleware/auth_token/__init__.py", line 576, in __init__
2018-09-18 11:22:22.937 18401 ERROR neutron     self._auth = self._create_auth_plugin()
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/keystonemiddleware/auth_token/__init__.py", line 897, in _create_auth_plugin
2018-09-18 11:22:22.937 18401 ERROR neutron     group=group),
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/keystonemiddleware/_common/config.py", line 115, in get
2018-09-18 11:22:22.937 18401 ERROR neutron     return self.oslo_conf_obj[group][name]
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 3555, in __getitem__
2018-09-18 11:22:22.937 18401 ERROR neutron     return self.__getattr__(key)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 3551, in __getattr__
2018-09-18 11:22:22.937 18401 ERROR neutron     return self._conf._get(name, self._group)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 3073, in _get
2018-09-18 11:22:22.937 18401 ERROR neutron     value, loc = self._do_get(name, group, namespace)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 3091, in _do_get
2018-09-18 11:22:22.937 18401 ERROR neutron     info = self._get_opt_info(name, group)
2018-09-18 11:22:22.937 18401 ERROR neutron   File "/usr/lib/python2.7/site-packages/oslo_config/cfg.py", line 3267, in _get_opt_info
2018-09-18 11:22:22.937 18401 ERROR neutron     raise NoSuchOptError(opt_name, group)
2018-09-18 11:22:22.937 18401 ERROR neutron NoSuchOptError: no such option auth_admin_prefix in group [keystone]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: 

From tsuyuzaki.kota at lab.ntt.co.jp Tue Sep 18 04:42:42 2018
From: tsuyuzaki.kota at lab.ntt.co.jp (Kota TSUYUZAKI)
Date: Tue, 18 Sep 2018 13:42:42 +0900
Subject: [Openstack] Can any user add or delete OpenStack Swift middleware?
In-Reply-To: 
References: <5BA05DDE.5060606@lab.ntt.co.jp>
Message-ID: <5BA08242.9090305@lab.ntt.co.jp>

Hi Qiao,

> I know Storlets can provide user-defined computation functionalities,
> but I guess some capabilities can only be achieved using middleware.
> For example, a user may want such a feature: upon each PUT request, it
> creates a compressed copy of the object and stores both the original
> copy and compressed copy. It's feasible using middlware but I don't
> think Storlets provide such capability.

Interesting. You are right that writing to multiple objects from a single PUT request is not currently supported, but as with other middlewares we could add that capability to Storlets if you prefer. Right now only multi-read (i.e. GET from multiple sources) is available, and I think we would be able to expand that logic to PUT requests too. IIRC, we had discussions about such multi-out use cases back then, and I'm sure the data structures inside Storlets are designed to be capable of that expansion. At that time we called it a "Tee" application on Storlets; I could not find the historical discussion logs about how to implement it, though, sorry. I believe that would be a use case for Storlets if you prefer user-defined application flexibility rather than operator-defined Swift middleware.

An example of multi-read (GET from multiple sources) is here:

https://github.com/openstack/storlets/blob/master/tests/functional/python/test_multiinput_storlet.py

And if you would like to try writing multi-write, please join us; I'm happy to help you anytime.

> Another example is that a user may want to install a Swift3-like
> middleware to provide APIs to a 3rd party, but she doesn't want other
> users to see this middleware.
>

If the definition can be made by operators, one possible solution is to prepare different proxy-server endpoints for different users, i.e. one user uses a proxy without s3api in its pipeline, while the others use a different proxy-server endpoint that has s3api in the pipeline. Or it sounds somewhat like the defaulter middleware [1], though I don't think turning middlewares on/off per user is in its scope for now.

1: https://review.openstack.org/#/c/342857/

Best,
Kota

(2018/09/18 11:34), Qiao Kang wrote:
> Kota,
>
> Thanks for your reply, very helpful!
>
> I know Storlets can provide user-defined computation functionalities,
> but I guess some capabilities can only be achieved using middleware.
> For example, a user may want such a feature: upon each PUT request, it
> creates a compressed copy of the object and stores both the original
> copy and compressed copy.
> It's feasible using middlware but I don't
> think Storlets provide such capability.
>
> Another example is that a user may want to install a Swift3-like
> middleware to provide APIs to a 3rd party, but she doesn't want other
> users to see this middleware.
>
> Regards,
> Qiao
>
> On Mon, Sep 17, 2018 at 9:19 PM Kota TSUYUZAKI
> wrote:
>>
>> With Storlets, users will be able to create their own applications that are able to run like as a Swift middeleware. The application (currently Python and Java are supported as the language but the
>> apps can calls any binaries in the workspace) can be uploaded as a Swift object, then, users can invoke them with just an extra header that specifies your apps.
>>
>> To fit your own use case, we may have to consider to invole or to integrate the system for you but I believe Storlets could be a choice for you.
>>
>> In detail, Storlets documantation is around there,
>>
>> Top Level Index: https://docs.openstack.org/storlets/latest/index.html
>> System Overview: https://docs.openstack.org/storlets/latest/storlet_engine_overview.html
>> APIs: https://docs.openstack.org/storlets/latest/api/overview_api.html
>>
>> Thanks,
>>
>> Kota
>>
>> (2018/09/17 8:59), John Dickinson wrote:
>>> You may be interested in Storlets. It's another OpenStack project, maintained by a Swift core reviewer, that provides this sort of user-defined middleware functionality.
>>>
>>> You can also ask about it in #openstack-swift
>>>
>>> --John
>>>
>>>
>>>
>>> On 16 Sep 2018, at 9:25, Qiao Kang wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm wondering whether Swift allows any user (not the administrator) to
>>>> specify which middleware that she/he wants his data object to go throught.
>>>> For instance, Alice wants to install a middleware but doesn't want Bob to
>>>> use it, where Alice and Bob are two accounts in a single Swift cluster.
>>>>
>>>> Or maybe all middlewares are pre-installed globally and cannot be
>>>> customized on a per-account basis?
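Two short sketches may help ground the middleware-versus-storlet distinction discussed in this thread. First, an operator-defined Swift middleware is ordinary paste.deploy WSGI filter code wired into the proxy pipeline. The skeleton below is purely illustrative - the class, egg and filter names are invented - and it also shows why such a filter cannot be enabled per account today: it only takes effect where the operator adds it to proxy-server.conf, so it applies to every request passing through that proxy endpoint.

class CompressOnPutMiddleware(object):
    """Pass-through WSGI filter skeleton; a real implementation would
    act on PUT requests, e.g. by also storing a compressed copy."""

    def __init__(self, app, conf):
        self.app = app
        self.conf = conf

    def __call__(self, env, start_response):
        if env.get('REQUEST_METHOD') == 'PUT':
            # A real middleware would wrap env['wsgi.input'] here and/or
            # make an extra internal request to store the second copy.
            pass
        return self.app(env, start_response)


def filter_factory(global_conf, **local_conf):
    conf = global_conf.copy()
    conf.update(local_conf)

    def compress_on_put_filter(app):
        return CompressOnPutMiddleware(app, conf)
    return compress_on_put_filter

# Hypothetical operator-side wiring in proxy-server.conf:
#   [pipeline:main]
#   pipeline = catch_errors ... compress_on_put proxy-server
#
#   [filter:compress_on_put]
#   use = egg:example_package#compress_on_put

Second, on the user-defined side, a single-output Python storlet that emits a gzip-compressed copy of whatever object it is invoked on is already expressible today. The sketch below is not code from the Storlets tree; it assumes the usual Python storlet convention of a class that takes a logger and is called with in_files/out_files/params descriptors, and the class name is made up. Storing both the original and the compressed copy from one PUT is exactly the multi-write "Tee" capability that Kota describes above as not yet supported.

import gzip
import shutil


class GzipCopyStorlet(object):
    """Illustrative single-output storlet: write a gzip-compressed copy
    of the input object to the single output descriptor."""

    def __init__(self, logger):
        self.logger = logger

    def __call__(self, in_files, out_files, params):
        in_file = in_files[0]
        out_file = out_files[0]

        # Pass the user metadata through and mark the copy as compressed.
        metadata = in_file.get_metadata()
        metadata['compressed-with'] = 'gzip'
        out_file.set_metadata(metadata)

        # Stream-compress the object body into the output descriptor.
        gz = gzip.GzipFile(fileobj=out_file, mode='wb')
        shutil.copyfileobj(in_file, gz)
        gz.close()

        in_file.close()
        out_file.close()

Invoked with the storlet extra header on a PUT or GET, this would store or return only the compressed copy; keeping the original as well still needs either the middleware route or the future multi-output support.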
>>>>
>>>> Thanks,
>>>> Qiao
>>>> _______________________________________________
>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>> Post to : openstack at lists.openstack.org
>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>
>>> _______________________________________________
>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>> Post to : openstack at lists.openstack.org
>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>
>>
>> --
>> ----------------------------------------------------------
>> Kota Tsuyuzaki(露﨑 浩太)
>> NTT Software Innovation Center
>> Distributed Computing Technology Project
>> Phone 0422-59-2837
>> Fax 0422-59-2965
>> -----------------------------------------------------------
>>
>>
>> _______________________________________________
>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> Post to : openstack at lists.openstack.org
>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>
>

--
----------------------------------------------------------
Kota Tsuyuzaki(露﨑 浩太)
NTT Software Innovation Center
Distributed Computing Technology Project
Phone 0422-59-2837
Fax 0422-59-2965
-----------------------------------------------------------


From dabarren at gmail.com Tue Sep 18 06:40:36 2018
From: dabarren at gmail.com (Eduardo Gonzalez)
Date: Tue, 18 Sep 2018 08:40:36 +0200
Subject: [Openstack] [kolla] Stein forum topics proposal
Message-ID: 

Hi,

Berlin forum brainstorming has started; please add your proposal topics before 26 September:

https://etherpad.openstack.org/p/kolla-forum-stein

Forum topics are proposed using the same method as Summit presentations:

https://www.openstack.org/summit/berlin-2018/call-for-presentations

Regards
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From afazekas at redhat.com Tue Sep 18 09:32:37 2018
From: afazekas at redhat.com (Attila Fazekas)
Date: Tue, 18 Sep 2018 11:32:37 +0200
Subject: [Openstack] [nova][cinder] Migrate instances between regions or between clusters?
In-Reply-To: 
References: <20180917133933.GE4292@straylight.m.ringlet.net>
Message-ID: 

Create a volume-transfer VM/machine in each region: attach the volume -> dd -> compress -> internet -> decompress -> new volume, then attach the new volume to (or boot with it on) the final machine. If you have frequent transfers you may keep the machines up for the next one.

In case the storage is just on the compute node: snapshot -> glance download -> glance upload.

It would be nice if cinder/glance could take the credentials for another OpenStack and move the volume/image to another cinder/glance.

If you want the same IP, specify the IP at instance boot time (port create), but you cannot be sure the same IP is always available or really routable in a different region... unless a VPN-like solution is in place...

The UUID is not expected to be changed by users or admins (unsafe), but you can use other metadata for a description/your own UUID.

On Mon, Sep 17, 2018 at 11:43 PM, Jay Pipes wrote:

> On 09/17/2018 09:39 AM, Peter Penchev wrote:
>
>> Hi,
>>
>> So here's a possibly stupid question - or rather, a series of such :)
>> Let's say a company has two (or five, or a hundred) datacenters in
>> geographically different locations and wants to deploy OpenStack in both.
>> What would be a deployment scenario that would allow relatively easy >> migration (cold, not live) of instances from one datacenter to another? >> >> My understanding is that for servers located far away from one another >> regions would be a better metaphor than availability zones, if only >> because it would be faster for the various storage, compute, etc. >> services to communicate with each other for the common case of doing >> actions within the same datacenter. Is this understanding wrong - is it >> considered all right for groups of servers located in far away places to >> be treated as different availability zones in the same cluster? >> >> If the groups of servers are put in different regions, though, this >> brings me to the real question: how can an instance be migrated across >> regions? Note that the instance will almost certainly have some >> shared-storage volume attached, and assume (not quite the common case, >> but still) that the underlying shared storage technology can be taught >> about another storage cluster in another location and can transfer >> volumes and snapshots to remote clusters. From what I've found, there >> are three basic ways: >> >> - do it pretty much by hand: create snapshots of the volumes used in >> the underlying storage system, transfer them to the other storage >> cluster, then tell the Cinder volume driver to manage them, and spawn >> an instance with the newly-managed newly-transferred volumes >> > > Yes, this is a perfectly reasonable solution. In fact, when I was at AT&T, > this was basically how we allowed tenants to spin up instances in multiple > regions: snapshot the instance, it gets stored in the Swift storage for the > region, tenant starts the instance in a different region, and Nova pulls > the image from the Swift storage in the other region. It's slow the first > time it's launched in the new region, of course, since the bits need to be > pulled from the other region's Swift storage, but after that, local image > caching speeds things up quite a bit. > > This isn't migration, though. Namely, the tenant doesn't keep their > instance ID, their instance's IP addresses, or anything like that. > > I've heard some users care about that stuff, unfortunately, which is why > we have shelve [offload]. There's absolutely no way to perform a > cross-region migration that keeps the instance ID and instance IP addresses. > > - use Cinder to backup the volumes from one region, then restore them to >> the other; if this is combined with a storage-specific Cinder backup >> driver that knows that "backing up" is "creating a snapshot" and >> "restoring to the other region" is "transferring that snapshot to the >> remote storage cluster", it seems to be the easiest way forward (once >> the Cinder backup driver has been written) >> > > Still won't have the same instance ID and IP address, which is what > certain users tend to complain about needing with move operations. 
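To make the Cinder backup option above concrete, a rough, untested client-side sketch with python-cinderclient follows. All endpoints, credentials and IDs are placeholders, backup record export/import is an admin-only operation, and the restore in the second region can only work if its backup service can actually reach the backed-up data, e.g. via a storage-specific backup driver of the kind described in that bullet.

from keystoneauth1 import identity, session
from cinderclient import client as cinder_client


def region_cinder(region_name):
    # Placeholder credentials/endpoint, for illustration only.
    auth = identity.Password(
        auth_url='https://keystone.example.com:5000/v3',
        username='admin', password='secret', project_name='admin',
        user_domain_id='default', project_domain_id='default')
    return cinder_client.Client(
        '3', session=session.Session(auth=auth), region_name=region_name)


cinder_a = region_cinder('RegionA')
cinder_b = region_cinder('RegionB')

# 1. Back up the volume in region A (force=True allows an attached volume).
#    Waiting for the backup to reach 'available' status is omitted here.
backup = cinder_a.backups.create('VOLUME_UUID', name='xregion-copy',
                                 force=True)

# 2. Export the backup record from region A and import it into region B,
#    so region B's backup service knows where the backup data lives.
record = cinder_a.backups.export_record(backup.id)
imported = cinder_b.backups.import_record(record['backup_service'],
                                          record['backup_url'])

# 3. Restore into a brand-new volume in region B.  As noted above, the
#    instance later built on top of it gets a new ID and new IP addresses.
restore = cinder_b.restores.restore(backup_id=imported['id'])
print('New volume in region B:', restore.volume_id)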
> > - use Nova's "server image create" command, transfer the resulting >> Glance image somehow (possibly by downloading it from the Glance >> storage in one region and simulateneously uploading it to the Glance >> instance in the other), then spawn an instance off that image >> > > Still won't have the same instance ID and IP address :) > > Best, > -jay > > The "server image create" approach seems to be the simplest one, >> although it is a bit hard to imagine how it would work without >> transferring data unnecessarily (the online articles I've seen >> advocating it seem to imply that a Nova instance in a region cannot be >> spawned off a Glance image in another region, so there will need to be >> at least one set of "download the image and upload it to the other >> side", even if the volume-to-image and image-to-volume transfers are >> instantaneous, e.g. using glance-cinderclient). However, when I tried >> it with a Nova instance backed by a StorPool volume (no ephemeral image >> at all), the Glance image was zero bytes in length and only its metadata >> contained some information about a volume snapshot created at that >> point, so this seems once again to go back to options 1 and 2 for the >> different ways to transfer a Cinder volume or snapshot to the other >> region. Or have I missed something, is there a way to get the "server >> image create / image download / image create" route to handle volumes >> attached to the instance? >> >> So... have I missed something else, too, or are these the options for >> transferring a Nova instance between two distant locations? >> >> Thanks for reading this far, and thanks in advance for your help! >> >> Best regards, >> Peter >> >> >> >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi >> -bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi >> -bin/mailman/listinfo/openstack >> >> > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstac > k > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstac > k > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anyrude10 at gmail.com Tue Sep 18 10:43:02 2018 From: anyrude10 at gmail.com (Anirudh Gupta) Date: Tue, 18 Sep 2018 16:13:02 +0530 Subject: [Openstack] [Openstack-Ansible] Unable to install Openstack Queens using Ansible Message-ID: Hi Team, I am installing Open Stack Queens using the Openstack Ansible and facing some issues *System Configuration* *Controller/Deployment Host* RAM - 12 GB Hard disk - 100 GB Linux - Ubuntu 16.04 Kernel Version - 4.4.0-135-generic *Compute* RAM - 4 GB Hard disk - 100 GB Linux - Ubuntu 16.04 Kernel Version - 4.4.0-135-generic *Issue Observed:* When we run the below playbook openstack-ansible setup-openstack.yml *Error Observed:* After running for some duration, it throws the error of "Out of Memory Killing mysqld" In the "top" command, we see only haproxy processes and the system gets so slow that we are not even able to login into the system. Can you please help me in resolving the issue. Regards Anirudh Gupta -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From laszlo.budai at gmail.com Tue Sep 18 11:04:09 2018 From: laszlo.budai at gmail.com (Budai Laszlo) Date: Tue, 18 Sep 2018 14:04:09 +0300 Subject: [Openstack] [Openstack-Ansible] Unable to install Openstack Queens using Ansible In-Reply-To: References: Message-ID: Hi, run dmesg on your deployment host. It should print which process has been evicted by the OOM killer. We had similar issues with our deployment host. We had to increase its memory to 9G to have openstack-ansiblle working properly. You should also monitor the memory usage of your processes on the controller/deployment host. good luck, Laszlo On 18.09.2018 13:43, Anirudh Gupta wrote: > Hi Team, > > I am installing Open Stack Queens using the Openstack Ansible and facing some issues > > > *System Configuration* > > > *Controller/Deployment Host* > > RAM - 12 GB > > Hard disk - 100 GB > > Linux - Ubuntu 16.04 > > Kernel Version - 4.4.0-135-generic > > > > *Compute* > > RAM - 4 GB > > Hard disk - 100 GB > > Linux - Ubuntu 16.04 > > Kernel Version - 4.4.0-135-generic > > *Issue Observed:* > > When we run the below playbook > > openstack-ansible setup-openstack.yml > > *Error Observed:* > > After running for some duration, it throws the error of "Out of Memory Killing mysqld" > > > In the "top" command, we see only haproxy processes and the system gets so slow that we are not even able to login into the system. > > > Can you please help me in resolving the issue. > > > Regards > > Anirudh Gupta > > > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > From openstack-dev at storpool.com Tue Sep 18 12:09:09 2018 From: openstack-dev at storpool.com (Peter Penchev) Date: Tue, 18 Sep 2018 15:09:09 +0300 Subject: [Openstack] [nova][cinder] Migrate instances between regions or between clusters? In-Reply-To: References: <20180917133933.GE4292@straylight.m.ringlet.net> Message-ID: <20180918120909.GA32020@straylight.m.ringlet.net> On Tue, Sep 18, 2018 at 11:32:37AM +0200, Attila Fazekas wrote: [format recovered; top-posting after an inline reply looks confusing] > On Mon, Sep 17, 2018 at 11:43 PM, Jay Pipes wrote: > > > On 09/17/2018 09:39 AM, Peter Penchev wrote: > > > >> Hi, > >> > >> So here's a possibly stupid question - or rather, a series of such :) > >> Let's say a company has two (or five, or a hundred) datacenters in > >> geographically different locations and wants to deploy OpenStack in both. > >> What would be a deployment scenario that would allow relatively easy > >> migration (cold, not live) of instances from one datacenter to another? > >> > >> My understanding is that for servers located far away from one another > >> regions would be a better metaphor than availability zones, if only > >> because it would be faster for the various storage, compute, etc. > >> services to communicate with each other for the common case of doing > >> actions within the same datacenter. Is this understanding wrong - is it > >> considered all right for groups of servers located in far away places to > >> be treated as different availability zones in the same cluster? > >> > >> If the groups of servers are put in different regions, though, this > >> brings me to the real question: how can an instance be migrated across > >> regions? 
Note that the instance will almost certainly have some > >> shared-storage volume attached, and assume (not quite the common case, > >> but still) that the underlying shared storage technology can be taught > >> about another storage cluster in another location and can transfer > >> volumes and snapshots to remote clusters. From what I've found, there > >> are three basic ways: > >> > >> - do it pretty much by hand: create snapshots of the volumes used in > >> the underlying storage system, transfer them to the other storage > >> cluster, then tell the Cinder volume driver to manage them, and spawn > >> an instance with the newly-managed newly-transferred volumes > >> > > > > Yes, this is a perfectly reasonable solution. In fact, when I was at AT&T, > > this was basically how we allowed tenants to spin up instances in multiple > > regions: snapshot the instance, it gets stored in the Swift storage for the > > region, tenant starts the instance in a different region, and Nova pulls > > the image from the Swift storage in the other region. It's slow the first > > time it's launched in the new region, of course, since the bits need to be > > pulled from the other region's Swift storage, but after that, local image > > caching speeds things up quite a bit. > > > > This isn't migration, though. Namely, the tenant doesn't keep their > > instance ID, their instance's IP addresses, or anything like that. Right, sorry, I should have clarified that what we're interested in is technically creating a new instance with the same disk contents, so that's fine. Thanks for confirming that there is not a simpler way that I've missed, I guess :) > > I've heard some users care about that stuff, unfortunately, which is why > > we have shelve [offload]. There's absolutely no way to perform a > > cross-region migration that keeps the instance ID and instance IP addresses. > > > > - use Cinder to backup the volumes from one region, then restore them to > >> the other; if this is combined with a storage-specific Cinder backup > >> driver that knows that "backing up" is "creating a snapshot" and > >> "restoring to the other region" is "transferring that snapshot to the > >> remote storage cluster", it seems to be the easiest way forward (once > >> the Cinder backup driver has been written) > >> > > > > Still won't have the same instance ID and IP address, which is what > > certain users tend to complain about needing with move operations. > > > > - use Nova's "server image create" command, transfer the resulting > >> Glance image somehow (possibly by downloading it from the Glance > >> storage in one region and simulateneously uploading it to the Glance > >> instance in the other), then spawn an instance off that image > >> > > > > Still won't have the same instance ID and IP address :) > > > > Best, > > -jay > > > > The "server image create" approach seems to be the simplest one, > >> although it is a bit hard to imagine how it would work without > >> transferring data unnecessarily (the online articles I've seen > >> advocating it seem to imply that a Nova instance in a region cannot be > >> spawned off a Glance image in another region, so there will need to be > >> at least one set of "download the image and upload it to the other > >> side", even if the volume-to-image and image-to-volume transfers are > >> instantaneous, e.g. using glance-cinderclient). 
However, when I tried > >> it with a Nova instance backed by a StorPool volume (no ephemeral image > >> at all), the Glance image was zero bytes in length and only its metadata > >> contained some information about a volume snapshot created at that > >> point, so this seems once again to go back to options 1 and 2 for the > >> different ways to transfer a Cinder volume or snapshot to the other > >> region. Or have I missed something, is there a way to get the "server > >> image create / image download / image create" route to handle volumes > >> attached to the instance? > >> > >> So... have I missed something else, too, or are these the options for > >> transferring a Nova instance between two distant locations? > >> > >> Thanks for reading this far, and thanks in advance for your help! > >> > >> Best regards, > >> Peter > > Create a volume transfer VM/machine in each region. > attache the volume -> dd -> compress -> internet ->decompress -> new > volume, attache(/boot with) to the volume to the final machine. > In case you have frequent transfers you may keep up the machines for the > next one.. Thanks for the advice, but this would involve transferring *a lot* more data than if we leave it to the underlying storage :) As I mentioned, the underlying storage can be taught about remote clusters and can be told to create a remote snapshot of a volume; this will be the base on which we will write our Cinder backup driver. So both my options 1 (do it "by hand" with the underlying storage) and 2 (cinder volume backup/restore) would be preferable. > In case the storage is just on the compute node: snapshot ->glance download > ->glance upload Right, as I mentioned in my description of the third option, this does not really work with attached volumes (thus your "just on the compute node") and as I mentioned before listing the options, the instances will almost certainly have attached volumes. > Would be nice if cinder/glance could take the credentials for another > openstack and move the volume/image to another cinder/glance. > > If you want the same IP , specify the ip at instance boot time (port > create), > but you cannot be sure the same ip is always available or really route-able > to different region.. unless... VPN like solution in place... > > The uuid not expected to be changed by the users or admins (unsafe), > but you can use other metadata for description/your uuid. Best regards, Peter -- Peter Penchev openstack-dev at storpool.com https://storpool.com/ From afazekas at redhat.com Tue Sep 18 13:07:45 2018 From: afazekas at redhat.com (Attila Fazekas) Date: Tue, 18 Sep 2018 15:07:45 +0200 Subject: [Openstack] [nova][cinder] Migrate instances between regions or between clusters? In-Reply-To: <20180918120909.GA32020@straylight.m.ringlet.net> References: <20180917133933.GE4292@straylight.m.ringlet.net> <20180918120909.GA32020@straylight.m.ringlet.net> Message-ID: On Tue, Sep 18, 2018 at 2:09 PM, Peter Penchev wrote: > On Tue, Sep 18, 2018 at 11:32:37AM +0200, Attila Fazekas wrote: > [format recovered; top-posting after an inline reply looks confusing] > > On Mon, Sep 17, 2018 at 11:43 PM, Jay Pipes wrote: > > > > > On 09/17/2018 09:39 AM, Peter Penchev wrote: > > > > > >> Hi, > > >> > > >> So here's a possibly stupid question - or rather, a series of such :) > > >> Let's say a company has two (or five, or a hundred) datacenters in > > >> geographically different locations and wants to deploy OpenStack in > both. 
> > >> What would be a deployment scenario that would allow relatively easy > > >> migration (cold, not live) of instances from one datacenter to > another? > > >> > > >> My understanding is that for servers located far away from one another > > >> regions would be a better metaphor than availability zones, if only > > >> because it would be faster for the various storage, compute, etc. > > >> services to communicate with each other for the common case of doing > > >> actions within the same datacenter. Is this understanding wrong - is > it > > >> considered all right for groups of servers located in far away places > to > > >> be treated as different availability zones in the same cluster? > > >> > > >> If the groups of servers are put in different regions, though, this > > >> brings me to the real question: how can an instance be migrated across > > >> regions? Note that the instance will almost certainly have some > > >> shared-storage volume attached, and assume (not quite the common case, > > >> but still) that the underlying shared storage technology can be taught > > >> about another storage cluster in another location and can transfer > > >> volumes and snapshots to remote clusters. From what I've found, there > > >> are three basic ways: > > >> > > >> - do it pretty much by hand: create snapshots of the volumes used in > > >> the underlying storage system, transfer them to the other storage > > >> cluster, then tell the Cinder volume driver to manage them, and > spawn > > >> an instance with the newly-managed newly-transferred volumes > > >> > > > > > > Yes, this is a perfectly reasonable solution. In fact, when I was at > AT&T, > > > this was basically how we allowed tenants to spin up instances in > multiple > > > regions: snapshot the instance, it gets stored in the Swift storage > for the > > > region, tenant starts the instance in a different region, and Nova > pulls > > > the image from the Swift storage in the other region. It's slow the > first > > > time it's launched in the new region, of course, since the bits need > to be > > > pulled from the other region's Swift storage, but after that, local > image > > > caching speeds things up quite a bit. > > > > > > This isn't migration, though. Namely, the tenant doesn't keep their > > > instance ID, their instance's IP addresses, or anything like that. > > Right, sorry, I should have clarified that what we're interested in is > technically creating a new instance with the same disk contents, so > that's fine. Thanks for confirming that there is not a simpler way that > I've missed, I guess :) > > > > I've heard some users care about that stuff, unfortunately, which is > why > > > we have shelve [offload]. There's absolutely no way to perform a > > > cross-region migration that keeps the instance ID and instance IP > addresses. > > > > > > - use Cinder to backup the volumes from one region, then restore them > to > > >> the other; if this is combined with a storage-specific Cinder > backup > > >> driver that knows that "backing up" is "creating a snapshot" and > > >> "restoring to the other region" is "transferring that snapshot to > the > > >> remote storage cluster", it seems to be the easiest way forward > (once > > >> the Cinder backup driver has been written) > > >> > > > > > > Still won't have the same instance ID and IP address, which is what > > > certain users tend to complain about needing with move operations. 
> > > > > > - use Nova's "server image create" command, transfer the resulting > > >> Glance image somehow (possibly by downloading it from the Glance > > >> storage in one region and simulateneously uploading it to the > Glance > > >> instance in the other), then spawn an instance off that image > > >> > > > > > > Still won't have the same instance ID and IP address :) > > > > > > Best, > > > -jay > > > > > > The "server image create" approach seems to be the simplest one, > > >> although it is a bit hard to imagine how it would work without > > >> transferring data unnecessarily (the online articles I've seen > > >> advocating it seem to imply that a Nova instance in a region cannot be > > >> spawned off a Glance image in another region, so there will need to be > > >> at least one set of "download the image and upload it to the other > > >> side", even if the volume-to-image and image-to-volume transfers are > > >> instantaneous, e.g. using glance-cinderclient). However, when I tried > > >> it with a Nova instance backed by a StorPool volume (no ephemeral > image > > >> at all), the Glance image was zero bytes in length and only its > metadata > > >> contained some information about a volume snapshot created at that > > >> point, so this seems once again to go back to options 1 and 2 for the > > >> different ways to transfer a Cinder volume or snapshot to the other > > >> region. Or have I missed something, is there a way to get the "server > > >> image create / image download / image create" route to handle volumes > > >> attached to the instance? > > >> > > >> So... have I missed something else, too, or are these the options for > > >> transferring a Nova instance between two distant locations? > > >> > > >> Thanks for reading this far, and thanks in advance for your help! > > >> > > >> Best regards, > > >> Peter > > > > Create a volume transfer VM/machine in each region. > > attache the volume -> dd -> compress -> internet ->decompress -> new > > volume, attache(/boot with) to the volume to the final machine. > > In case you have frequent transfers you may keep up the machines for the > > next one.. > > Thanks for the advice, but this would involve transferring *a lot* more > data than if we leave it to the underlying storage :) As I mentioned, > the underlying storage can be taught about remote clusters and can be told > to create a remote snapshot of a volume; this will be the base on which > we will write our Cinder backup driver. So both my options 1 (do it "by > hand" with the underlying storage) and 2 (cinder volume backup/restore) > would be preferable. > Cinder might get a feature for `rescue` a volume in case accidentally someone deleted the DB record or some other bad thing happened. This needs to be admin only op where you would need to specify where is the volume, If just a new volume `shows up` on the storage, but without the knowledge of cinder, it could be rescued as well. Among same storage types probably cinder could have an admin only API for transfer. I am not sure is volume backup/restore is really better across regions than the above steps properly piped however it is very infrastructure dependent, bandwidth and latency across regions matters. The direct storage usage likely better than the pipe/dd on close proximity, but in case of good internal networking on longer distance (external net) the diff will not be big, you might wait more on openstack api/client than on the actual data transfer, in case of small the size. 
The internal network total bandwith nowadays is very huge a little internal data move (storage->vm->internetet->vm->storage) might not even shows ups on the internal monitors ;-) The internet is the high latency thing even if you have the best internet connection on the world. The light is not getting faster. ;-) > > > In case the storage is just on the compute node: snapshot ->glance > download > > ->glance upload > > Right, as I mentioned in my description of the third option, this does > not really work with attached volumes (thus your "just on the compute > node") > and as I mentioned before listing the options, the instances will almost > certainly have attached volumes. > > yes, you need to use both way. > Would be nice if cinder/glance could take the credentials for another > > openstack and move the volume/image to another cinder/glance. > > > > If you want the same IP , specify the ip at instance boot time (port > > create), > > but you cannot be sure the same ip is always available or really > route-able > > to different region.. unless... VPN like solution in place... > > > > The uuid not expected to be changed by the users or admins (unsafe), > > but you can use other metadata for description/your uuid. > > Best regards, > Peter > > -- > Peter Penchev openstack-dev at storpool.com https://storpool.com/ > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/ > openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/ > openstack > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mnaser at vexxhost.com Tue Sep 18 13:17:18 2018 From: mnaser at vexxhost.com (Mohammed Naser) Date: Tue, 18 Sep 2018 09:17:18 -0400 Subject: [Openstack] [Openstack-Ansible] Unable to install Openstack Queens using Ansible In-Reply-To: References: Message-ID: <668D484C-7553-49CB-8A88-B4BD63F49BBD@vexxhost.com> Hi, 4GB of memory is not enough for a deployment unfortunately. You’ll have to bump it up. Thanks Mohammed Sent from my iPhone > On Sep 18, 2018, at 7:04 AM, Budai Laszlo wrote: > > Hi, > > run dmesg on your deployment host. It should print which process has been evicted by the OOM killer. > We had similar issues with our deployment host. We had to increase its memory to 9G to have openstack-ansiblle working properly. > You should also monitor the memory usage of your processes on the controller/deployment host. > > good luck, > Laszlo > >> On 18.09.2018 13:43, Anirudh Gupta wrote: >> Hi Team, >> I am installing Open Stack Queens using the Openstack Ansible and facing some issues >> *System Configuration* >> *Controller/Deployment Host* >> RAM - 12 GB >> Hard disk - 100 GB >> Linux - Ubuntu 16.04 >> Kernel Version - 4.4.0-135-generic >> *Compute* >> RAM - 4 GB >> Hard disk - 100 GB >> Linux - Ubuntu 16.04 >> Kernel Version - 4.4.0-135-generic >> *Issue Observed:* >> When we run the below playbook >> openstack-ansible setup-openstack.yml >> *Error Observed:* >> After running for some duration, it throws the error of "Out of Memory Killing mysqld" >> In the "top" command, we see only haproxy processes and the system gets so slow that we are not even able to login into the system. >> Can you please help me in resolving the issue. 
>> Regards >> Anirudh Gupta >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From afazekas at redhat.com Tue Sep 18 13:26:38 2018 From: afazekas at redhat.com (Attila Fazekas) Date: Tue, 18 Sep 2018 15:26:38 +0200 Subject: [Openstack] [nova][cinder] Migrate instances between regions or between clusters? In-Reply-To: References: <20180917133933.GE4292@straylight.m.ringlet.net> <20180918120909.GA32020@straylight.m.ringlet.net> Message-ID: On Tue, Sep 18, 2018 at 3:07 PM, Attila Fazekas wrote: > > > On Tue, Sep 18, 2018 at 2:09 PM, Peter Penchev > wrote: > >> On Tue, Sep 18, 2018 at 11:32:37AM +0200, Attila Fazekas wrote: >> [format recovered; top-posting after an inline reply looks confusing] >> > On Mon, Sep 17, 2018 at 11:43 PM, Jay Pipes wrote: >> > >> > > On 09/17/2018 09:39 AM, Peter Penchev wrote: >> > > >> > >> Hi, >> > >> >> > >> So here's a possibly stupid question - or rather, a series of such :) >> > >> Let's say a company has two (or five, or a hundred) datacenters in >> > >> geographically different locations and wants to deploy OpenStack in >> both. >> > >> What would be a deployment scenario that would allow relatively easy >> > >> migration (cold, not live) of instances from one datacenter to >> another? >> > >> >> > >> My understanding is that for servers located far away from one >> another >> > >> regions would be a better metaphor than availability zones, if only >> > >> because it would be faster for the various storage, compute, etc. >> > >> services to communicate with each other for the common case of doing >> > >> actions within the same datacenter. Is this understanding wrong - >> is it >> > >> considered all right for groups of servers located in far away >> places to >> > >> be treated as different availability zones in the same cluster? >> > >> >> > >> If the groups of servers are put in different regions, though, this >> > >> brings me to the real question: how can an instance be migrated >> across >> > >> regions? Note that the instance will almost certainly have some >> > >> shared-storage volume attached, and assume (not quite the common >> case, >> > >> but still) that the underlying shared storage technology can be >> taught >> > >> about another storage cluster in another location and can transfer >> > >> volumes and snapshots to remote clusters. From what I've found, >> there >> > >> are three basic ways: >> > >> >> > >> - do it pretty much by hand: create snapshots of the volumes used in >> > >> the underlying storage system, transfer them to the other storage >> > >> cluster, then tell the Cinder volume driver to manage them, and >> spawn >> > >> an instance with the newly-managed newly-transferred volumes >> > >> >> > > >> > > Yes, this is a perfectly reasonable solution. 
In fact, when I was at >> AT&T, >> > > this was basically how we allowed tenants to spin up instances in >> multiple >> > > regions: snapshot the instance, it gets stored in the Swift storage >> for the >> > > region, tenant starts the instance in a different region, and Nova >> pulls >> > > the image from the Swift storage in the other region. It's slow the >> first >> > > time it's launched in the new region, of course, since the bits need >> to be >> > > pulled from the other region's Swift storage, but after that, local >> image >> > > caching speeds things up quite a bit. >> > > >> > > This isn't migration, though. Namely, the tenant doesn't keep their >> > > instance ID, their instance's IP addresses, or anything like that. >> >> Right, sorry, I should have clarified that what we're interested in is >> technically creating a new instance with the same disk contents, so >> that's fine. Thanks for confirming that there is not a simpler way that >> I've missed, I guess :) >> >> > > I've heard some users care about that stuff, unfortunately, which is >> why >> > > we have shelve [offload]. There's absolutely no way to perform a >> > > cross-region migration that keeps the instance ID and instance IP >> addresses. >> > > >> > > - use Cinder to backup the volumes from one region, then restore them >> to >> > >> the other; if this is combined with a storage-specific Cinder >> backup >> > >> driver that knows that "backing up" is "creating a snapshot" and >> > >> "restoring to the other region" is "transferring that snapshot to >> the >> > >> remote storage cluster", it seems to be the easiest way forward >> (once >> > >> the Cinder backup driver has been written) >> > >> >> > > >> > > Still won't have the same instance ID and IP address, which is what >> > > certain users tend to complain about needing with move operations. >> > > >> > > - use Nova's "server image create" command, transfer the resulting >> > >> Glance image somehow (possibly by downloading it from the Glance >> > >> storage in one region and simulateneously uploading it to the >> Glance >> > >> instance in the other), then spawn an instance off that image >> > >> >> > > >> > > Still won't have the same instance ID and IP address :) >> > > >> > > Best, >> > > -jay >> > > >> > > The "server image create" approach seems to be the simplest one, >> > >> although it is a bit hard to imagine how it would work without >> > >> transferring data unnecessarily (the online articles I've seen >> > >> advocating it seem to imply that a Nova instance in a region cannot >> be >> > >> spawned off a Glance image in another region, so there will need to >> be >> > >> at least one set of "download the image and upload it to the other >> > >> side", even if the volume-to-image and image-to-volume transfers are >> > >> instantaneous, e.g. using glance-cinderclient). However, when I >> tried >> > >> it with a Nova instance backed by a StorPool volume (no ephemeral >> image >> > >> at all), the Glance image was zero bytes in length and only its >> metadata >> > >> contained some information about a volume snapshot created at that >> > >> point, so this seems once again to go back to options 1 and 2 for the >> > >> different ways to transfer a Cinder volume or snapshot to the other >> > >> region. Or have I missed something, is there a way to get the >> "server >> > >> image create / image download / image create" route to handle volumes >> > >> attached to the instance? >> > >> >> > >> So... 
have I missed something else, too, or are these the options for >> > >> transferring a Nova instance between two distant locations? >> > >> >> > >> Thanks for reading this far, and thanks in advance for your help! >> > >> >> > >> Best regards, >> > >> Peter >> > >> > Create a volume transfer VM/machine in each region. >> > attache the volume -> dd -> compress -> internet ->decompress -> new >> > volume, attache(/boot with) to the volume to the final machine. >> > In case you have frequent transfers you may keep up the machines for the >> > next one.. >> >> Thanks for the advice, but this would involve transferring *a lot* more >> data than if we leave it to the underlying storage :) As I mentioned, >> the underlying storage can be taught about remote clusters and can be told >> to create a remote snapshot of a volume; this will be the base on which >> we will write our Cinder backup driver. So both my options 1 (do it "by >> hand" with the underlying storage) and 2 (cinder volume backup/restore) >> would be preferable. >> > > Cinder might get a feature for `rescue` a volume in case accidentally > someone > deleted the DB record or some other bad thing happened. > This needs to be admin only op where you would need to specify where is > the volume, > If just a new volume `shows up` on the storage, but without > the knowledge of cinder, it could be rescued as well. > > Among same storage types probably cinder could have an admin only > API for transfer. > > I am not sure is volume backup/restore is really better across regions > than the above steps properly piped however > it is very infrastructure dependent, > bandwidth and latency across regions matters. > > The direct storage usage likely better than the pipe/dd on close proximity, > but in case of good internal networking on longer distance (external net) > the diff will not be big, > you might wait more on openstack api/client than on the > actual data transfer, in case of small the size. > The internal network total bandwith nowadays is very huge > a little internal data move (storage->vm->internetet->vm->storage) > might not even shows ups on the internal monitors ;-) > The internet is the high latency thing even > if you have the best internet connection on the world. > > The light is not getting faster. ;-) > > One other thing I forget to mention, depending on your compress/encrypt/hash method you might have a bottleneck on a single CPU core. Splitting the images/volume to nr cpu parts might help. > >> > In case the storage is just on the compute node: snapshot ->glance >> download >> > ->glance upload >> >> Right, as I mentioned in my description of the third option, this does >> not really work with attached volumes (thus your "just on the compute >> node") >> and as I mentioned before listing the options, the instances will almost >> certainly have attached volumes. >> >> yes, you need to use both way. > > > Would be nice if cinder/glance could take the credentials for another >> > openstack and move the volume/image to another cinder/glance. >> > >> > If you want the same IP , specify the ip at instance boot time (port >> > create), >> > but you cannot be sure the same ip is always available or really >> route-able >> > to different region.. unless... VPN like solution in place... >> > >> > The uuid not expected to be changed by the users or admins (unsafe), >> > but you can use other metadata for description/your uuid. 
>> >> Best regards, >> Peter >> >> -- >> Peter Penchev openstack-dev at storpool.com https://storpool.com/ >> >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi >> -bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi >> -bin/mailman/listinfo/openstack >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From openstack-dev at storpool.com Tue Sep 18 14:31:36 2018 From: openstack-dev at storpool.com (Peter Penchev) Date: Tue, 18 Sep 2018 17:31:36 +0300 Subject: [Openstack] [nova][cinder] Migrate instances between regions or between clusters? In-Reply-To: References: <20180917133933.GE4292@straylight.m.ringlet.net> <20180918120909.GA32020@straylight.m.ringlet.net> Message-ID: <20180918143136.GA31205@straylight.m.ringlet.net> On Tue, Sep 18, 2018 at 03:07:45PM +0200, Attila Fazekas wrote: > On Tue, Sep 18, 2018 at 2:09 PM, Peter Penchev > wrote: > > > On Tue, Sep 18, 2018 at 11:32:37AM +0200, Attila Fazekas wrote: > > [format recovered; top-posting after an inline reply looks confusing] > > > On Mon, Sep 17, 2018 at 11:43 PM, Jay Pipes wrote: > > > > > > > On 09/17/2018 09:39 AM, Peter Penchev wrote: > > > > > > > >> Hi, > > > >> > > > >> So here's a possibly stupid question - or rather, a series of such :) > > > >> Let's say a company has two (or five, or a hundred) datacenters in > > > >> geographically different locations and wants to deploy OpenStack in > > both. > > > >> What would be a deployment scenario that would allow relatively easy > > > >> migration (cold, not live) of instances from one datacenter to > > another? > > > >> > > > >> My understanding is that for servers located far away from one another > > > >> regions would be a better metaphor than availability zones, if only > > > >> because it would be faster for the various storage, compute, etc. > > > >> services to communicate with each other for the common case of doing > > > >> actions within the same datacenter. Is this understanding wrong - is > > it > > > >> considered all right for groups of servers located in far away places > > to > > > >> be treated as different availability zones in the same cluster? > > > >> > > > >> If the groups of servers are put in different regions, though, this > > > >> brings me to the real question: how can an instance be migrated across > > > >> regions? Note that the instance will almost certainly have some > > > >> shared-storage volume attached, and assume (not quite the common case, > > > >> but still) that the underlying shared storage technology can be taught > > > >> about another storage cluster in another location and can transfer > > > >> volumes and snapshots to remote clusters. From what I've found, there > > > >> are three basic ways: > > > >> > > > >> - do it pretty much by hand: create snapshots of the volumes used in > > > >> the underlying storage system, transfer them to the other storage > > > >> cluster, then tell the Cinder volume driver to manage them, and > > spawn > > > >> an instance with the newly-managed newly-transferred volumes > > > >> > > > > > > > > Yes, this is a perfectly reasonable solution. 
In fact, when I was at > > AT&T, > > > > this was basically how we allowed tenants to spin up instances in > > multiple > > > > regions: snapshot the instance, it gets stored in the Swift storage > > for the > > > > region, tenant starts the instance in a different region, and Nova > > pulls > > > > the image from the Swift storage in the other region. It's slow the > > first > > > > time it's launched in the new region, of course, since the bits need > > to be > > > > pulled from the other region's Swift storage, but after that, local > > image > > > > caching speeds things up quite a bit. > > > > > > > > This isn't migration, though. Namely, the tenant doesn't keep their > > > > instance ID, their instance's IP addresses, or anything like that. > > > > Right, sorry, I should have clarified that what we're interested in is > > technically creating a new instance with the same disk contents, so > > that's fine. Thanks for confirming that there is not a simpler way that > > I've missed, I guess :) > > > > > > I've heard some users care about that stuff, unfortunately, which is > > why > > > > we have shelve [offload]. There's absolutely no way to perform a > > > > cross-region migration that keeps the instance ID and instance IP > > addresses. > > > > > > > > - use Cinder to backup the volumes from one region, then restore them > > to > > > >> the other; if this is combined with a storage-specific Cinder > > backup > > > >> driver that knows that "backing up" is "creating a snapshot" and > > > >> "restoring to the other region" is "transferring that snapshot to > > the > > > >> remote storage cluster", it seems to be the easiest way forward > > (once > > > >> the Cinder backup driver has been written) > > > >> > > > > > > > > Still won't have the same instance ID and IP address, which is what > > > > certain users tend to complain about needing with move operations. > > > > > > > > - use Nova's "server image create" command, transfer the resulting > > > >> Glance image somehow (possibly by downloading it from the Glance > > > >> storage in one region and simulateneously uploading it to the > > Glance > > > >> instance in the other), then spawn an instance off that image > > > >> > > > > > > > > Still won't have the same instance ID and IP address :) > > > > > > > > Best, > > > > -jay > > > > > > > > The "server image create" approach seems to be the simplest one, > > > >> although it is a bit hard to imagine how it would work without > > > >> transferring data unnecessarily (the online articles I've seen > > > >> advocating it seem to imply that a Nova instance in a region cannot be > > > >> spawned off a Glance image in another region, so there will need to be > > > >> at least one set of "download the image and upload it to the other > > > >> side", even if the volume-to-image and image-to-volume transfers are > > > >> instantaneous, e.g. using glance-cinderclient). However, when I tried > > > >> it with a Nova instance backed by a StorPool volume (no ephemeral > > image > > > >> at all), the Glance image was zero bytes in length and only its > > metadata > > > >> contained some information about a volume snapshot created at that > > > >> point, so this seems once again to go back to options 1 and 2 for the > > > >> different ways to transfer a Cinder volume or snapshot to the other > > > >> region. Or have I missed something, is there a way to get the "server > > > >> image create / image download / image create" route to handle volumes > > > >> attached to the instance? 
> > > >> > > > >> So... have I missed something else, too, or are these the options for > > > >> transferring a Nova instance between two distant locations? > > > >> > > > >> Thanks for reading this far, and thanks in advance for your help! > > > >> > > > >> Best regards, > > > >> Peter > > > > > > Create a volume transfer VM/machine in each region. > > > attache the volume -> dd -> compress -> internet ->decompress -> new > > > volume, attache(/boot with) to the volume to the final machine. > > > In case you have frequent transfers you may keep up the machines for the > > > next one.. > > > > Thanks for the advice, but this would involve transferring *a lot* more > > data than if we leave it to the underlying storage :) As I mentioned, > > the underlying storage can be taught about remote clusters and can be told > > to create a remote snapshot of a volume; this will be the base on which > > we will write our Cinder backup driver. So both my options 1 (do it "by > > hand" with the underlying storage) and 2 (cinder volume backup/restore) > > would be preferable. > > > > Cinder might get a feature for `rescue` a volume in case accidentally > someone > deleted the DB record or some other bad thing happened. > This needs to be admin only op where you would need to specify where is the > volume, > If just a new volume `shows up` on the storage, but without > the knowledge of cinder, it could be rescued as well. Hmm, is this not what the Cinder "manage" command does? > Among same storage types probably cinder could have an admin only > API for transfer. > > I am not sure is volume backup/restore is really better across regions > than the above steps properly piped however > it is very infrastructure dependent, > bandwidth and latency across regions matters. [snip discussion] Well, the reason my initial message said "assume the underlying storage can do that" was that I did not want to go into marketing/advertisement territory and say flat out that the StorPool storage system can do that :) Best regards, Peter -- Peter Penchev openstack-dev at storpool.com https://storpool.com/ From afazekas at redhat.com Tue Sep 18 14:59:31 2018 From: afazekas at redhat.com (Attila Fazekas) Date: Tue, 18 Sep 2018 16:59:31 +0200 Subject: [Openstack] [nova][cinder] Migrate instances between regions or between clusters? In-Reply-To: <20180918143136.GA31205@straylight.m.ringlet.net> References: <20180917133933.GE4292@straylight.m.ringlet.net> <20180918120909.GA32020@straylight.m.ringlet.net> <20180918143136.GA31205@straylight.m.ringlet.net> Message-ID: On Tue, Sep 18, 2018 at 4:31 PM, Peter Penchev wrote: > On Tue, Sep 18, 2018 at 03:07:45PM +0200, Attila Fazekas wrote: > > On Tue, Sep 18, 2018 at 2:09 PM, Peter Penchev < > openstack-dev at storpool.com> > > wrote: > > > > > On Tue, Sep 18, 2018 at 11:32:37AM +0200, Attila Fazekas wrote: > > > [format recovered; top-posting after an inline reply looks confusing] > > > > On Mon, Sep 17, 2018 at 11:43 PM, Jay Pipes > wrote: > > > > > > > > > On 09/17/2018 09:39 AM, Peter Penchev wrote: > > > > > > > > > >> Hi, > > > > >> > > > > >> So here's a possibly stupid question - or rather, a series of > such :) > > > > >> Let's say a company has two (or five, or a hundred) datacenters in > > > > >> geographically different locations and wants to deploy OpenStack > in > > > both. > > > > >> What would be a deployment scenario that would allow relatively > easy > > > > >> migration (cold, not live) of instances from one datacenter to > > > another? 
> > > > >> > > > > >> My understanding is that for servers located far away from one > another > > > > >> regions would be a better metaphor than availability zones, if > only > > > > >> because it would be faster for the various storage, compute, etc. > > > > >> services to communicate with each other for the common case of > doing > > > > >> actions within the same datacenter. Is this understanding wrong > - is > > > it > > > > >> considered all right for groups of servers located in far away > places > > > to > > > > >> be treated as different availability zones in the same cluster? > > > > >> > > > > >> If the groups of servers are put in different regions, though, > this > > > > >> brings me to the real question: how can an instance be migrated > across > > > > >> regions? Note that the instance will almost certainly have some > > > > >> shared-storage volume attached, and assume (not quite the common > case, > > > > >> but still) that the underlying shared storage technology can be > taught > > > > >> about another storage cluster in another location and can transfer > > > > >> volumes and snapshots to remote clusters. From what I've found, > there > > > > >> are three basic ways: > > > > >> > > > > >> - do it pretty much by hand: create snapshots of the volumes used > in > > > > >> the underlying storage system, transfer them to the other > storage > > > > >> cluster, then tell the Cinder volume driver to manage them, and > > > spawn > > > > >> an instance with the newly-managed newly-transferred volumes > > > > >> > > > > > > > > > > Yes, this is a perfectly reasonable solution. In fact, when I was > at > > > AT&T, > > > > > this was basically how we allowed tenants to spin up instances in > > > multiple > > > > > regions: snapshot the instance, it gets stored in the Swift storage > > > for the > > > > > region, tenant starts the instance in a different region, and Nova > > > pulls > > > > > the image from the Swift storage in the other region. It's slow the > > > first > > > > > time it's launched in the new region, of course, since the bits > need > > > to be > > > > > pulled from the other region's Swift storage, but after that, local > > > image > > > > > caching speeds things up quite a bit. > > > > > > > > > > This isn't migration, though. Namely, the tenant doesn't keep their > > > > > instance ID, their instance's IP addresses, or anything like that. > > > > > > Right, sorry, I should have clarified that what we're interested in is > > > technically creating a new instance with the same disk contents, so > > > that's fine. Thanks for confirming that there is not a simpler way > that > > > I've missed, I guess :) > > > > > > > > I've heard some users care about that stuff, unfortunately, which > is > > > why > > > > > we have shelve [offload]. There's absolutely no way to perform a > > > > > cross-region migration that keeps the instance ID and instance IP > > > addresses. 
> > > > > > > > > > - use Cinder to backup the volumes from one region, then restore > them > > > to > > > > >> the other; if this is combined with a storage-specific Cinder > > > backup > > > > >> driver that knows that "backing up" is "creating a snapshot" > and > > > > >> "restoring to the other region" is "transferring that snapshot > to > > > the > > > > >> remote storage cluster", it seems to be the easiest way forward > > > (once > > > > >> the Cinder backup driver has been written) > > > > >> > > > > > > > > > > Still won't have the same instance ID and IP address, which is what > > > > > certain users tend to complain about needing with move operations. > > > > > > > > > > - use Nova's "server image create" command, transfer the resulting > > > > >> Glance image somehow (possibly by downloading it from the > Glance > > > > >> storage in one region and simulateneously uploading it to the > > > Glance > > > > >> instance in the other), then spawn an instance off that image > > > > >> > > > > > > > > > > Still won't have the same instance ID and IP address :) > > > > > > > > > > Best, > > > > > -jay > > > > > > > > > > The "server image create" approach seems to be the simplest one, > > > > >> although it is a bit hard to imagine how it would work without > > > > >> transferring data unnecessarily (the online articles I've seen > > > > >> advocating it seem to imply that a Nova instance in a region > cannot be > > > > >> spawned off a Glance image in another region, so there will need > to be > > > > >> at least one set of "download the image and upload it to the other > > > > >> side", even if the volume-to-image and image-to-volume transfers > are > > > > >> instantaneous, e.g. using glance-cinderclient). However, when I > tried > > > > >> it with a Nova instance backed by a StorPool volume (no ephemeral > > > image > > > > >> at all), the Glance image was zero bytes in length and only its > > > metadata > > > > >> contained some information about a volume snapshot created at that > > > > >> point, so this seems once again to go back to options 1 and 2 for > the > > > > >> different ways to transfer a Cinder volume or snapshot to the > other > > > > >> region. Or have I missed something, is there a way to get the > "server > > > > >> image create / image download / image create" route to handle > volumes > > > > >> attached to the instance? > > > > >> > > > > >> So... have I missed something else, too, or are these the options > for > > > > >> transferring a Nova instance between two distant locations? > > > > >> > > > > >> Thanks for reading this far, and thanks in advance for your help! > > > > >> > > > > >> Best regards, > > > > >> Peter > > > > > > > > Create a volume transfer VM/machine in each region. > > > > attache the volume -> dd -> compress -> internet ->decompress -> > new > > > > volume, attache(/boot with) to the volume to the final machine. > > > > In case you have frequent transfers you may keep up the machines for > the > > > > next one.. > > > > > > Thanks for the advice, but this would involve transferring *a lot* more > > > data than if we leave it to the underlying storage :) As I mentioned, > > > the underlying storage can be taught about remote clusters and can be > told > > > to create a remote snapshot of a volume; this will be the base on which > > > we will write our Cinder backup driver. So both my options 1 (do it > "by > > > hand" with the underlying storage) and 2 (cinder volume backup/restore) > > > would be preferable. 
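For reference, option 2 would look roughly like the following from the CLI once a backend-aware backup driver exists; the region and volume/backup names are placeholders, and it assumes the backup record is somehow visible to the destination region's Cinder (or that the custom driver/tooling registers it there), which is exactly the part that still has to be solved:

  # region A: back up the volume; a backend-specific backup driver is what
  # would turn this into a remote snapshot instead of a full data copy
  cinder --os-region-name RegionOne backup-create --name myvol-bak --force myvol

  # region B: restore the backup into a fresh volume
  cinder --os-region-name RegionTwo backup-restore <backup-id>

For option 1, once the storage itself has copied the volume to the remote cluster, the destination region's Cinder can simply adopt it without moving any data through OpenStack:

  cinder --os-region-name RegionTwo manage --name myvol-copy <host@backend#pool> <backend-volume-id>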
> > > > > > > Cinder might get a feature for `rescue` a volume in case accidentally > > someone > > deleted the DB record or some other bad thing happened. > > This needs to be admin only op where you would need to specify where is > the > > volume, > > If just a new volume `shows up` on the storage, but without > > the knowledge of cinder, it could be rescued as well. > > Hmm, is this not what the Cinder "manage" command does? > > Sounds like it does: https://blueprints.launchpad.net/horizon/+spec/add-manage-unmanage-volume > > Among same storage types probably cinder could have an admin only > > API for transfer. > > > > I am not sure is volume backup/restore is really better across regions > > than the above steps properly piped however > > it is very infrastructure dependent, > > bandwidth and latency across regions matters. > [snip discussion] > > Well, the reason my initial message said "assume the underlying storage > can do that" was that I did not want to go into marketing/advertisement > territory and say flat out that the StorPool storage system can do that :) > > Best regards, > Peter > > -- > Peter Penchev openstack-dev at storpool.com https://storpool.com/ > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/ > openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/ > openstack > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vimal7370 at gmail.com Tue Sep 18 18:31:18 2018 From: vimal7370 at gmail.com (Vimal Kumar) Date: Wed, 19 Sep 2018 00:01:18 +0530 Subject: [Openstack] [trove] publish_exists_event fails with "This is not a recognized Fernet token" Message-ID: Hi, Trove on Pike is running into a known bug: https://bugs.launchpad.net/trove/+bug/1700586 which seems to have been fixed in trove v9.0.0. Is Trove from Queens backward compatible with Pike? I am thinking whether I can install Trove from Queens repo on another server and make it work with the rest of the Pike installation. Can this be done? Please advice. -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Tue Sep 18 18:43:33 2018 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 18 Sep 2018 14:43:33 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> <8089BF19-A95B-4CF5-A2D4-0CB2B7415362@cisco.com> Message-ID: Liping, Last 2 days i am running test with hping3 and found following behavior, if you noticed my result UDP doing very bad if i increase number of queue, do you know why ? UDP: If i set "ethtool -L eth0 combined 1" then UDP pps rate is 100kpps if i set "ethtool -L eth0 combined 8" then UDP pps rate is 40kpps TCP: If i set "ethtool -L eth0 combined 1" then UDP pps rate is ~150kpps If i set "ethtool -L eth0 combined 1" then UDP pps rate is ~150kpps On Mon, Sep 17, 2018 at 8:33 AM Satish Patel wrote: > > Thanks Liping, > > I will try to reach out or open new thread to get sriov info. > > By the way what version of openstack you guys using and what hardware specially NIC. Just trying to see if it's hardware related. > > I'm running kernel 3.10.x do you think it's not something related kernel. 
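Reading the numbers above: TCP stays around 150kpps with either queue count, while UDP falls from roughly 100kpps to 40kpps when going from one to eight queues. One possible explanation is that the flood is not actually spreading across the queues, or that spreading a single sender's traffic costs more than it gains; before blaming the kernel it is worth checking where the packets land and where they get dropped. The tap name below is the one from earlier in the thread, and <target-ip> is a placeholder:

  # guest: are all virtio queues taking interrupts, or just one?
  grep virtio /proc/interrupts

  # guest: per-CPU softnet counters, the second field is packets dropped
  # from the backlog
  cat /proc/net/softnet_stat

  # guest: configured queue count and, if the virtio driver exposes them,
  # per-queue counters
  ethtool -l eth0
  ethtool -S eth0

  # host: watch the tap drops while the flood runs
  watch -n1 'ifconfig tap5af7f525-5f | grep -i drop'

  # compare a single fixed UDP flow against hping3's default behaviour of
  # walking the source port (i.e. many flows)
  hping3 --udp --flood -k -s 5000 -p 9 <target-ip>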
> > Sent from my iPhone > > On Sep 17, 2018, at 1:27 AM, Liping Mao (limao) wrote: > > >> Question: I have br-vlan interface mapp with bond0 to run my VM (VLAN > > > > traffic), so do i need to do anything in bond0 to enable VF/PF > > > > function? Just confused because currently my VM nic map with compute > > > > node br-vlan bridge. > > > > > > > > I had not actually used SRIOV in my env~ maybe others could help. > > > > > > > > Thanks, > > > > Liping Mao > > > > > > > > 在 2018/9/17 11:48,“Satish Patel” 写入: > > > > > > > > Thanks Liping, > > > > > > > > I will check bug for tx/rx queue size and see if i can make it work > > > > but look like my 10G NIC support SR-IOV so i am trying that path > > > > because it will be better for long run. > > > > > > > > I have deploy my cloud using openstack-ansible so now i need to figure > > > > out how do i wire that up with openstack-ansible deployment, here is > > > > the article [1] > > > > > > > > Question: I have br-vlan interface mapp with bond0 to run my VM (VLAN > > > > traffic), so do i need to do anything in bond0 to enable VF/PF > > > > function? Just confused because currently my VM nic map with compute > > > > node br-vlan bridge. > > > > > > > > [root at compute-65 ~]# lspci -nn | grep -i ethernet > > > > 03:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > > > BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10) > > > > 03:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > > > BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10) > > > > 03:01.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > > > 03:01.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > > > 03:01.2 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > > > 03:01.3 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > > > 03:01.4 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > > > 03:01.5 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > > > 03:01.6 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > > > > > > > > > > > [1] https://docs.openstack.org/openstack-ansible-os_neutron/latest/configure-network-services.html > > > >> On Sun, Sep 16, 2018 at 7:06 PM Liping Mao (limao) wrote: > >> > >> > > > >> Hi Satish, > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> There are hard limitations in nova's code, I did not actually used more thant 8 queues: > > > >> > > > >> def _get_max_tap_queues(self): > > > >> > > > >> # NOTE(kengo.sakai): In kernels prior to 3.0, > > > >> > > > >> # multiple queues on a tap interface is not supported. > > > >> > > > >> # In kernels 3.x, the number of queues on a tap interface > > > >> > > > >> # is limited to 8. From 4.0, the number is 256. 
> > > >> > > > >> # See: https://bugs.launchpad.net/nova/+bug/1570631 > > > >> > > > >> kernel_version = int(os.uname()[2].split(".")[0]) > > > >> > > > >> if kernel_version <= 2: > > > >> > > > >> return 1 > > > >> > > > >> elif kernel_version == 3: > > > >> > > > >> return 8 > > > >> > > > >> elif kernel_version == 4: > > > >> > > > >> return 256 > > > >> > > > >> else: > > > >> > > > >> return None > > > >> > > > >> > > > >> > > > >>> I am currently playing with those setting and trying to generate > > > >> > > > >> traffic with hping3 tools, do you have any tool to test traffic > > > >> > > > >> performance for specially udp style small packets. > > > >> > > > >> > > > >> > > > >> Hping3 is good enough to reproduce it, we have app level test tool, but that is not your case. > > > >> > > > >> > > > >> > > > >> > > > >> > > > >>> Here i am trying to increase rx_queue_size & tx_queue_size but its not > > > >> > > > >> working somehow. I have tired following. > > > >> > > > >> > > > >> > > > >> Since you are not rocky code, it should only works in qemu.conf, maybe check if this bug[1] affect you. > > > >> > > > >> > > > >> > > > >> > > > >> > > > >>> Is there a way i can automate this last task to update queue number > > > >> > > > >> action after reboot vm :) otherwise i can use cloud-init to make sure > > > >> > > > >> all VM build with same config. > > > >> > > > >> > > > >> > > > >> Cloud-init or rc.local could be the place to do that. > > > >> > > > >> > > > >> > > > >> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1541960 > > > >> > > > >> > > > >> > > > >> Regards, > > > >> > > > >> Liping Mao > > > >> > > > >> > > > >> > > > >> 在 2018/9/17 04:09,“Satish Patel” 写入: > > > >> > > > >> > > > >> > > > >> Update on my last email. > > > >> > > > >> > > > >> > > > >> I am able to achieve 150kpps with queue=8 and my goal is to do 300kpps > > > >> > > > >> because some of voice application using 300kps. > > > >> > > > >> > > > >> > > > >> Here i am trying to increase rx_queue_size & tx_queue_size but its not > > > >> > > > >> working somehow. I have tired following. > > > >> > > > >> > > > >> > > > >> 1. add rx/tx size in /etc/nova/nova.conf in libvirt section - (didn't work) > > > >> > > > >> 2. add /etc/libvirtd/qemu.conf - (didn't work) > > > >> > > > >> > > > >> > > > >> I have try to edit virsh edit file but somehow my changes not > > > >> > > > >> getting reflected, i did virsh define after change and hard > > > >> > > > >> reboot guest but no luck.. how do i edit that option in xml if i want > > > >> > > > >> to do that? > > > >> > > > >>> On Sun, Sep 16, 2018 at 1:41 PM Satish Patel wrote: > >> > >> > > > >>> > > > >> > > > >>> I successful reproduce this error with hping3 tool and look like > > > >> > > > >>> multiqueue is our solution :) but i have few question you may have > > > >> > > > >>> answer of that. > > > >> > > > >>> > > > >> > > > >>> 1. I have created two instance (vm1.example.com & vm2.example.com) > > > >> > > > >>> > > > >> > > > >>> 2. I have flood traffic from vm1 using "hping3 vm2.example.com > > > >> > > > >>> --flood" and i have noticed drops on tap interface. ( This is without > > > >> > > > >>> multiqueue) > > > >> > > > >>> > > > >> > > > >>> 3. Enable multiqueue in image and run same test and again got packet > > > >> > > > >>> drops on tap interface ( I didn't update queue on vm2 guest, so > > > >> > > > >>> definitely i was expecting packet drops) > > > >> > > > >>> > > > >> > > > >>> 4. 
Now i have try to update vm2 queue using ethtool and i got > > > >> > > > >>> following error, I have 15vCPU and i was trying to add 15 queue > > > >> > > > >>> > > > >> > > > >>> [root at bar-mq ~]# ethtool -L eth0 combined 15 > > > >> > > > >>> Cannot set device channel parameters: Invalid argument > > > >> > > > >>> > > > >> > > > >>> Then i have tried 8 queue which works. > > > >> > > > >>> > > > >> > > > >>> [root at bar-mq ~]# ethtool -L eth0 combined 8 > > > >> > > > >>> combined unmodified, ignoring > > > >> > > > >>> no channel parameters changed, aborting > > > >> > > > >>> current values: tx 0 rx 0 other 0 combined 8 > > > >> > > > >>> > > > >> > > > >>> Now i am not seeing any packet drops on tap interface, I have measure > > > >> > > > >>> PPS and i was able to get 160kpps without packet drops. > > > >> > > > >>> > > > >> > > > >>> Question: > > > >> > > > >>> > > > >> > > > >>> 1. why i am not able to add 15 queue? ( is this NIC or driver limitation?) > > > >> > > > >>> 2. how do i automate "ethtool -L eth0 combined 8" command in instance > > > >> > > > >>> so i don't need to tell my customer to do this manually? > > > >> > > > >>>> On Sun, Sep 16, 2018 at 11:53 AM Satish Patel wrote: > >> > >> > > > >>>> > > > >> > > > >>>> Hi Liping, > > > >> > > > >>>> > > > >> > > > >>>>>> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > >> > > > >>>> > > > >> > > > >>>> Is there a way i can automate this last task to update queue number > > > >> > > > >>>> action after reboot vm :) otherwise i can use cloud-init to make sure > > > >> > > > >>>> all VM build with same config. > > > >> > > > >>>>> On Sun, Sep 16, 2018 at 11:51 AM Satish Patel wrote: > >> > >> > > > >>>>> > > > >> > > > >>>>> I am currently playing with those setting and trying to generate > > > >> > > > >>>>> traffic with hping3 tools, do you have any tool to test traffic > > > >> > > > >>>>> performance for specially udp style small packets. > > > >> > > > >>>>> > > > >> > > > >>>>> I am going to share all my result and see what do you feel because i > > > >> > > > >>>>> have noticed you went through this pain :) I will try every single > > > >> > > > >>>>> option which you suggested to make sure we are good before i move > > > >> > > > >>>>> forward to production. > > > >> > > > >>>>>> On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) wrote: > >> > >> > > > >>>>>> > > > >> > > > >>>>>> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > >> > > > >>>>>> > > > >> > > > >>>>>> Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. > > > >> > > > >>>>>> > > > >> > > > >>>>>> You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. 
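To make the queue setting survive reboots without asking the customer to run ethtool by hand, one option is a short boot-time script baked into the image (cloud-init's runcmd or a small systemd unit would do the same job); this assumes the image already carries the multiqueue property, ships ethtool, and the image name below is a placeholder:

  # once, against the glance image; new instances booted from it get the extra queues
  openstack image set --property hw_vif_multiqueue_enabled=true <image>

  # inside the guest image, e.g. /etc/rc.d/rc.local (must be executable):
  NQ=$(nproc)
  # tap multiqueue is capped at 8 on 3.x host kernels, per the nova code quoted above
  [ "$NQ" -gt 8 ] && NQ=8
  ethtool -L eth0 combined "$NQ" || true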
And your Nexus 5k/7k will not be bottleneck for sure ;-) > > > >> > > > >>>>>> > > > >> > > > >>>>>> > > > >> > > > >>>>>> Thanks, > > > >> > > > >>>>>> Liping Mao > > > >> > > > >>>>>> > > > >> > > > >>>>>>>> 在 2018年9月16日,23:09,Satish Patel 写道: > >> > >> > > > >>>>>>> > > > >> > > > >>>>>>> Thanks Liping, > > > >> > > > >>>>>>> > > > >> > > > >>>>>>> I am using libvertd 3.9.0 version so look like i am eligible take > > > >> > > > >>>>>>> advantage of that feature. phew! > > > >> > > > >>>>>>> > > > >> > > > >>>>>>> [root at compute-47 ~]# libvirtd -V > > > >> > > > >>>>>>> libvirtd (libvirt) 3.9.0 > > > >> > > > >>>>>>> > > > >> > > > >>>>>>> Let me tell you how i am running instance on my openstack, my compute > > > >> > > > >>>>>>> has 32 core / 32G memory and i have created two instance on compute > > > >> > > > >>>>>>> node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > > > >> > > > >>>>>>> kept 2 core for compute node). on compute node i disabled overcommit > > > >> > > > >>>>>>> using ratio (1.0) > > > >> > > > >>>>>>> > > > >> > > > >>>>>>> I didn't configure NUMA yet because i wasn't aware of this feature, as > > > >> > > > >>>>>>> per your last post do you think numa will help to fix this issue? > > > >> > > > >>>>>>> following is my numa view > > > >> > > > >>>>>>> > > > >> > > > >>>>>>> [root at compute-47 ~]# numactl --hardware > > > >> > > > >>>>>>> available: 2 nodes (0-1) > > > >> > > > >>>>>>> node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > > > >> > > > >>>>>>> node 0 size: 16349 MB > > > >> > > > >>>>>>> node 0 free: 133 MB > > > >> > > > >>>>>>> node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > > > >> > > > >>>>>>> node 1 size: 16383 MB > > > >> > > > >>>>>>> node 1 free: 317 MB > > > >> > > > >>>>>>> node distances: > > > >> > > > >>>>>>> node 0 1 > > > >> > > > >>>>>>> 0: 10 20 > > > >> > > > >>>>>>> 1: 20 10 > > > >> > > > >>>>>>> > > > >> > > > >>>>>>> > > > >> > > > >>>>>>> I am not using any L3 router, i am using provide VLAN network and > > > >> > > > >>>>>>> using Cisco Nexus switch for my L3 function so i am not seeing any > > > >> > > > >>>>>>> bottleneck there. > > > >> > > > >>>>>>> > > > >> > > > >>>>>>> This is the 10G NIC i have on all my compute node, dual 10G port with > > > >> > > > >>>>>>> bonding (20G) > > > >> > > > >>>>>>> > > > >> > > > >>>>>>> 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > >> > > > >>>>>>> Gigabit Ethernet (rev 10) > > > >> > > > >>>>>>> 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > >> > > > >>>>>>> Gigabit Ethernet (rev 10) > > > >> > > > >>>>>>> > > > >> > > > >>>>>>> > > > >> > > > >>>>>>>>> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: > >> > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. 
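The flavor and host-aggregate side of that looks roughly as follows, in the spirit of the article referenced as [a] below; all names and sizes here are illustrative, nova.conf on the pinned hosts still needs vcpu_pin_set, and the scheduler needs NUMATopologyFilter and AggregateInstanceExtraSpecsFilter enabled:

  # a host aggregate holding the computes reserved for pinned guests
  openstack aggregate create pinned-computes
  openstack aggregate set --property pinned=true pinned-computes
  openstack aggregate add host pinned-computes compute-47

  # a flavor that asks for dedicated CPUs confined to one NUMA node
  openstack flavor create --vcpus 15 --ram 14336 --disk 40 m1.voip.pinned
  openstack flavor set \
      --property hw:cpu_policy=dedicated \
      --property hw:numa_nodes=1 \
      --property aggregate_instance_extra_specs:pinned=true \
      m1.voip.pinned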
Dedicated CPU is very good for media service. It makes the CPU performance stable. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Regards, > > > >> > > > >>>>>>>> Liping Mao > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>>> 在 2018年9月16日,21:18,Satish Patel 写道: > >> > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Hi Liping, > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Thank you for your reply, > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> For SRIOV I have to look if I have support in my nic. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> We are using queens so I think queue size option not possible :( > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> I will share my result as soon as I try multiqueue. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Sent from my iPhone > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>>> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > >> > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Hi Satish, > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Did your packet loss happen always or it only happened when heavy load? > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> If it happened in heavy load, couple of things you can try: > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> 2) Try to use virtio multi queues feature , see [1]. 
Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> I did not actually used them in our env, you may refer to [3] > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Regards, > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Liping Mao > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> 在 2018/9/16 13:07,“Satish Patel” 写入: > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> RX errors 0 dropped 0 overruns 0 frame 0 > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Noticed tap interface dropping TX packets and even after increasing > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> txqueue from 1000 to 10000 nothing changed, still getting packet > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> drops. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>>> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > >> > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Folks, > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> I need some advice or suggestion to find out what is going on with my > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> network, we have notice high packet loss on openstack instance and not > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> sure what is going on, same time if i check on host machine and it has > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> zero packet loss.. this is what i did for test... 
> > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> ping 8.8.8.8 > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> from instance: 50% packet loss > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> from compute host: 0% packet loss > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> I have disabled TSO/GSO/SG setting on physical compute node but still > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> getting packet loss. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> We have 10G NIC on our network, look like something related to tap > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> interface setting.. > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> _______________________________________________ > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Post to : openstack at lists.openstack.org > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >>>>>>>> > > > >> > > > >> > > > >> > > > >> > > > > > > > > From qiaokang1213 at gmail.com Tue Sep 18 20:52:56 2018 From: qiaokang1213 at gmail.com (Qiao Kang) Date: Tue, 18 Sep 2018 15:52:56 -0500 Subject: [Openstack] Can any user add or delete OpenStack Swift middleware? In-Reply-To: <5BA08242.9090305@lab.ntt.co.jp> References: <5BA05DDE.5060606@lab.ntt.co.jp> <5BA08242.9090305@lab.ntt.co.jp> Message-ID: Dear Kota, On Mon, Sep 17, 2018 at 11:43 PM Kota TSUYUZAKI wrote: > > Hi Quio, > > > I know Storlets can provide user-defined computation functionalities, > > but I guess some capabilities can only be achieved using middleware. > > For example, a user may want such a feature: upon each PUT request, it > > creates a compressed copy of the object and stores both the original > > copy and compressed copy. It's feasible using middlware but I don't > > think Storlets provide such capability. > > Interesting, exactly currently it's not supported to write to multi objects for a PUT request but as well as other middlewares we could adopt the feasibility into Storlets if you prefer. > Right now, the multi read (i.e. GET from multi sources) is only available and I think we would be able to expand the logic to PUT requests too. IIRC, in those days, we had discussion on sort of the > multi-out use cases and I'm sure the data structure inside Storlets are designed to be capable to that expantion. At that time, we called them "Tee" application on Storlets, I could not find the > historical discussion logs about how to implement tho, sorry. I believe that would be an use case for storlets if you prefer the user-defined application flexibilities rather than operator defined > Swift middleware. 
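For context on how a storlet gets attached to a request: the client names exactly one storlet per request via a header, which is why the multi-output ("Tee") behaviour really needs the engine-side support discussed above rather than stacking invocations. A rough curl example, where the token and storage URL come from the usual auth exchange and the storlet object name (here an already-deployed Java storlet called compress-1.0.jar) is an assumption:

  # run the storlet on download
  curl -H "X-Auth-Token: $TOKEN" \
       -H "X-Run-Storlet: compress-1.0.jar" \
       "$STORAGE_URL/photos/picture.raw" -o picture.out

  # run it on upload; today this still results in a single stored object
  curl -X PUT -H "X-Auth-Token: $TOKEN" \
       -H "X-Run-Storlet: compress-1.0.jar" \
       --data-binary @picture.raw \
       "$STORAGE_URL/photos/picture.raw"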
> > The example of multi-read (GET from multi sources) are here: > https://github.com/openstack/storlets/blob/master/tests/functional/python/test_multiinput_storlet.py > > And if you like to try to write multi write, please join us, I'm happy to help you anytime. > Thanks! I'm interested and would like to join, as well as contribute! Another potential use case: imagine I want to compress objects upon PUTs using two different algorithms X and Y, and use the future 'multi-write' feature to store three objects upon any single PUT (original copy, X-compressed copy and Y-compressed copy). I can install two Storlets which implement X and Y respectively. However, seems Storlets engine can only invoke one per PUT, so this is still not feasible. Is that correct? > > > Another example is that a user may want to install a Swift3-like > > middleware to provide APIs to a 3rd party, but she doesn't want other > > users to see this middleware. > > > > If the definition can be made by operators, perhaps one possible solution that preparing different proxy-server endpoint for different users is available. i.e. an user uses no-s3api available proxy, > then the others use a different proxy-server endpoint that has the s3api in the pipeline. > > Or, it sounds like kinda defaulter middleware[1], I don't think it has the scope turning on/off the middlewares for now. > > 1: https://review.openstack.org/#/c/342857/ I see, thanks for pointing out the defaulter project! Best, Qiao > > Best, > Kota > > (2018/09/18 11:34), Qiao Kang wrote: > > Kota, > > > > Thanks for your reply, very helpful! > > > > I know Storlets can provide user-defined computation functionalities, > > but I guess some capabilities can only be achieved using middleware. > > For example, a user may want such a feature: upon each PUT request, it > > creates a compressed copy of the object and stores both the original > > copy and compressed copy. It's feasible using middlware but I don't > > think Storlets provide such capability. > > > > Another example is that a user may want to install a Swift3-like > > middleware to provide APIs to a 3rd party, but she doesn't want other > > users to see this middleware. > > > > Regards, > > Qiao > > > > On Mon, Sep 17, 2018 at 9:19 PM Kota TSUYUZAKI > > wrote: > >> > >> With Storlets, users will be able to create their own applications that are able to run like as a Swift middeleware. The application (currently Python and Java are supported as the language but the > >> apps can calls any binaries in the workspace) can be uploaded as a Swift object, then, users can invoke them with just an extra header that specifies your apps. > >> > >> To fit your own use case, we may have to consider to invole or to integrate the system for you but I believe Storlets could be a choice for you. > >> > >> In detail, Storlets documantation is around there, > >> > >> Top Level Index: https://docs.openstack.org/storlets/latest/index.html > >> System Overview: https://docs.openstack.org/storlets/latest/storlet_engine_overview.html > >> APIs: https://docs.openstack.org/storlets/latest/api/overview_api.html > >> > >> Thanks, > >> > >> Kota > >> > >> (2018/09/17 8:59), John Dickinson wrote: > >>> You may be interested in Storlets. It's another OpenStack project, maintained by a Swift core reviewer, that provides this sort of user-defined middleware functionality. 
> >>> > >>> You can also ask about it in #openstack-swift > >>> > >>> --John > >>> > >>> > >>> > >>> On 16 Sep 2018, at 9:25, Qiao Kang wrote: > >>> > >>>> Hi, > >>>> > >>>> I'm wondering whether Swift allows any user (not the administrator) to > >>>> specify which middleware that she/he wants his data object to go throught. > >>>> For instance, Alice wants to install a middleware but doesn't want Bob to > >>>> use it, where Alice and Bob are two accounts in a single Swift cluster. > >>>> > >>>> Or maybe all middlewares are pre-installed globally and cannot be > >>>> customized on a per-account basis? > >>>> > >>>> Thanks, > >>>> Qiao > >>>> _______________________________________________ > >>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >>>> Post to : openstack at lists.openstack.org > >>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >>> > >>> _______________________________________________ > >>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >>> Post to : openstack at lists.openstack.org > >>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >> > >> > >> -- > >> ---------------------------------------------------------- > >> Kota Tsuyuzaki(露﨑 浩太) > >> NTT Software Innovation Center > >> Distributed Computing Technology Project > >> Phone 0422-59-2837 > >> Fax 0422-59-2965 > >> ----------------------------------------------------------- > >> > >> > >> _______________________________________________ > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >> Post to : openstack at lists.openstack.org > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > > > -- > ---------------------------------------------------------- > Kota Tsuyuzaki(露﨑 浩太) > NTT Software Innovation Center > Distributed Computing Technology Project > Phone 0422-59-2837 > Fax 0422-59-2965 > ----------------------------------------------------------- > From satish.txt at gmail.com Tue Sep 18 23:16:03 2018 From: satish.txt at gmail.com (Satish Patel) Date: Tue, 18 Sep 2018 19:16:03 -0400 Subject: [Openstack] [Openstack-Ansible] Unable to install Openstack Queens using Ansible In-Reply-To: <668D484C-7553-49CB-8A88-B4BD63F49BBD@vexxhost.com> References: <668D484C-7553-49CB-8A88-B4BD63F49BBD@vexxhost.com> Message-ID: You have 12GB on controller node where mysql going to run. I would say you need to tweak some setting in /etc/openstack_deploy/user_variables.yml file default setting is grab as much resource you can so i would say play with those setting and limit mem use something like this.. and there are more to set.. following is example, don't try in production. 
## Galera settings galera_monitoring_allowed_source: "192.168.100.246 192.168.100.239 192.168.100.88 192.168.100.3 192.168.100.2 192.168.100.1 1.1.1.1 2.2.2.2 127.0.0.1" galera_innodb_buffer_pool_size: 16M galera_innodb_log_buffer_size: 4M galera_wsrep_provider_options: - { option: "gcache.size", value: "4M" } ## Neutron settings neutron_metadata_checksum_fix: True ### Set workers for all services to optimise memory usage ## Repo repo_nginx_threads: 2 ## Keystone keystone_httpd_mpm_start_servers: 2 keystone_httpd_mpm_min_spare_threads: 1 keystone_httpd_mpm_max_spare_threads: 2 keystone_httpd_mpm_thread_limit: 2 keystone_httpd_mpm_thread_child: 1 keystone_wsgi_threads: 1 keystone_wsgi_processes_max: 2 ## Barbican barbican_wsgi_processes: 2 barbican_wsgi_threads: 1 ## Glance glance_api_threads_max: 2 glance_api_threads: 1 glance_api_workers: 1 glance_registry_workers: 1 ## Nova nova_wsgi_threads: 1 nova_wsgi_processes_max: 2 nova_wsgi_processes: 2 nova_wsgi_buffer_size: 16384 nova_api_threads_max: 2 nova_api_threads: 1 nova_osapi_compute_workers: 1 nova_conductor_workers: 1 # rbd_user: "{{ cinder_ceph_client }}" # rbd_secret_uuid: "{{ cinder_ceph_client_uuid }}" # report_discard_supported: true ## Glance glance_api_threads_max: 2 glance_api_threads: 1 glance_api_workers: 1 glance_registry_workers: 1 ## Nova nova_wsgi_threads: 1 nova_wsgi_processes_max: 2 nova_wsgi_processes: 2 nova_wsgi_buffer_size: 16384 nova_api_threads_max: 2 nova_api_threads: 1 nova_osapi_compute_workers: 1 nova_conductor_workers: 1 nova_metadata_workers: 1 ## tux - new console (spice-html5 has been removed) nova_console_type: novnc ## Tux - Live migration nova_libvirtd_listen_tls: 0 nova_libvirtd_listen_tcp: 1 nova_libvirtd_auth_tcp: "none" ## Neutron neutron_rpc_workers: 1 neutron_metadata_workers: 1 neutron_api_workers: 1 neutron_api_threads_max: 2 neutron_api_threads: 2 neutron_num_sync_threads: 1 neutron_linuxbridge_agent_ini_overrides: linux_bridge: physical_interface_mappings: vlan:br-vlan ## Heat heat_api_workers: 1 heat_api_threads_max: 2 heat_api_threads: 1 heat_wsgi_threads: 1 heat_wsgi_processes_max: 2 heat_wsgi_processes: 1 heat_wsgi_buffer_size: 16384 ## Horizon horizon_wsgi_processes: 1 horizon_wsgi_threads: 1 horizon_wsgi_threads_max: 2 ## Ceilometer ceilometer_notification_workers_max: 2 ceilometer_notification_workers: 1 ## AODH aodh_wsgi_threads: 1 aodh_wsgi_processes_max: 2 aodh_wsgi_processes: 1 ## Gnocchi gnocchi_wsgi_threads: 1 gnocchi_wsgi_processes_max: 2 gnocchi_wsgi_processes: 1 ## Swift swift_account_server_replicator_workers: 1 swift_server_replicator_workers: 1 swift_object_replicator_workers: 1 swift_account_server_workers: 1 swift_container_server_workers: 1 swift_object_server_workers: 1 swift_proxy_server_workers_max: 2 swift_proxy_server_workers_not_capped: 1 swift_proxy_server_workers_capped: 1 ## Heat heat_api_workers: 1 heat_api_threads_max: 2 heat_api_threads: 1 heat_wsgi_threads: 1 heat_wsgi_processes_max: 2 heat_wsgi_processes: 1 heat_wsgi_buffer_size: 16384 ## Horizon horizon_wsgi_processes: 1 horizon_wsgi_threads: 1 horizon_wsgi_threads_max: 2 ## Ceilometer ceilometer_notification_workers_max: 2 ceilometer_notification_workers: 1 ## AODH aodh_wsgi_threads: 1 aodh_wsgi_processes_max: 2 aodh_wsgi_processes: 1 ## Gnocchi gnocchi_wsgi_threads: 1 gnocchi_wsgi_processes_max: 2 gnocchi_wsgi_processes: 1 On Tue, Sep 18, 2018 at 9:23 AM Mohammed Naser wrote: > > Hi, > > 4GB of memory is not enough for a deployment unfortunately. > > You’ll have to bump it up. 
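Before or while bumping the memory, it may also help to confirm that it really was the OOM killer and to buy enough headroom for the playbooks to finish; the swap file is only a stopgap and assumes the root filesystem supports fallocate-backed swap (otherwise create it with dd):

  # confirm the OOM kill and see which process went first
  dmesg -T | egrep -i 'out of memory|killed process'
  journalctl -k | grep -i oom

  # temporary swap file while RAM is being added / worker counts trimmed
  fallocate -l 8G /swapfile
  chmod 600 /swapfile
  mkswap /swapfile
  swapon /swapfile
  free -m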
> > Thanks > Mohammed > > Sent from my iPhone > > > On Sep 18, 2018, at 7:04 AM, Budai Laszlo wrote: > > > > Hi, > > > > run dmesg on your deployment host. It should print which process has been evicted by the OOM killer. > > We had similar issues with our deployment host. We had to increase its memory to 9G to have openstack-ansiblle working properly. > > You should also monitor the memory usage of your processes on the controller/deployment host. > > > > good luck, > > Laszlo > > > >> On 18.09.2018 13:43, Anirudh Gupta wrote: > >> Hi Team, > >> I am installing Open Stack Queens using the Openstack Ansible and facing some issues > >> *System Configuration* > >> *Controller/Deployment Host* > >> RAM - 12 GB > >> Hard disk - 100 GB > >> Linux - Ubuntu 16.04 > >> Kernel Version - 4.4.0-135-generic > >> *Compute* > >> RAM - 4 GB > >> Hard disk - 100 GB > >> Linux - Ubuntu 16.04 > >> Kernel Version - 4.4.0-135-generic > >> *Issue Observed:* > >> When we run the below playbook > >> openstack-ansible setup-openstack.yml > >> *Error Observed:* > >> After running for some duration, it throws the error of "Out of Memory Killing mysqld" > >> In the "top" command, we see only haproxy processes and the system gets so slow that we are not even able to login into the system. > >> Can you please help me in resolving the issue. > >> Regards > >> Anirudh Gupta > >> _______________________________________________ > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >> Post to : openstack at lists.openstack.org > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > > _______________________________________________ > > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > Post to : openstack at lists.openstack.org > > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From vimal7370 at gmail.com Wed Sep 19 08:56:06 2018 From: vimal7370 at gmail.com (Vimal Kumar) Date: Wed, 19 Sep 2018 14:26:06 +0530 Subject: [Openstack] [trove] Request needs authorization error Message-ID: Hi, Trove install on OpenStack Pike shows error in logs related to "Manager.publish_exists_event". Has anyone managed to install Trove on Pike without getting this error? Any help would be appreciated. Logs and config included below. Thank you! Regards, Vimal Trove Log: 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task [-] Error during Manager.publish_exists_event: Unauthorized: The request you have made requires authentication. 
(HTTP 401) (Request-ID: req-5dece909-c107-491a-91fc-9eaa8c7b8a91) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task Traceback (most recent call last): 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_service/periodic_task.py", line 220, in run_periodic_tasks 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task task(self, context) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/trove/taskmanager/manager.py", line 433, in publish_exists_event 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task self.admin_context) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/trove/extensions/mgmt/instances/models.py", line 176, in publish_exist_events 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task notifications = transformer() 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/trove/extensions/mgmt/instances/models.py", line 268, in __call__ 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task client=self.nova_client) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/trove/extensions/mgmt/instances/models.py", line 38, in load_mgmt_instances 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task mgmt_servers = client.servers.list(search_opts={'all_tenants': 1}) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/novaclient/v2/servers.py", line 884, in list 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task "servers") 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/novaclient/base.py", line 254, in _list 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task resp, body = self.api.client.get(url) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 288, in get 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task return self.request(url, 'GET', **kwargs) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/novaclient/client.py", line 77, in request 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task **kwargs) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 447, in request 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task resp = super(LegacyJsonAdapter, self).request(*args, **kwargs) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/adapter.py", line 192, in request 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task return self.session.request(url, method, **kwargs) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/positional/__init__.py", line 101, in inner 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task return wrapped(*args, **kwargs) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 578, in request 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task auth_headers = self.get_auth_headers(auth) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File 
"/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 905, in get_auth_headers 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task return auth.get_headers(self, **kwargs) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/plugin.py", line 90, in get_headers 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task token = self.get_token(session) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/base.py", line 89, in get_token 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task return self.get_access(session).auth_token 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/base.py", line 135, in get_access 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task self.auth_ref = self.get_auth_ref(session) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/generic/base.py", line 198, in get_auth_ref 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task return self._plugin.get_auth_ref(session, **kwargs) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/identity/v2.py", line 65, in get_auth_ref 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task authenticated=False, log=False) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 853, in post 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task return self.request(url, 'POST', **kwargs) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/positional/__init__.py", line 101, in inner 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task return wrapped(*args, **kwargs) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/keystoneauth1/session.py", line 742, in request 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task raise exceptions.from_response(resp, method, url) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task Unauthorized: The request you have made requires authentication. (HTTP 401) (Request-ID: req-5dece909-c107-491a-91fc-9eaa8c7b8a91) 2018-09-19 16:42:31.337 8681 ERROR oslo_service.periodic_task Keystone Log: 2018-09-19 16:42:31.327 938 DEBUG keystone.middleware.auth [req-5dece909-c107-491a-91fc-9eaa8c7b8a91 - - - - -] There is either no auth token in the request or the certificate issuer is not trusted. No auth context will be set. fill_context /usr/lib/python2.7/site-packages/keystone/middleware/auth.py:203 2018-09-19 16:42:31.329 938 INFO keystone.common.wsgi [req-5dece909-c107-491a-91fc-9eaa8c7b8a91 - - - - -] POST http://keystone:35357/v2.0/tokens 2018-09-19 16:42:31.332 938 DEBUG keystone.common.fernet_utils [req-5dece909-c107-491a-91fc-9eaa8c7b8a91 - - - - -] Loaded 2 Fernet keys from /etc/keystone/fernet-keys/, but `[fernet_tokens] max_active_keys = 3`; perhaps there have not been enough key rotations to reach `max_active_keys` yet? load_keys /usr/lib/python2.7/site-packages/keystone/common/fernet_utils.py:306 2018-09-19 16:42:31.333 938 WARNING keystone.common.wsgi [req-5dece909-c107-491a-91fc-9eaa8c7b8a91 - - - - -] Authorization failed. The request you have made requires authentication. 
from 10.0.0.1: Unauthorized: The request you have made requires authentication. trove.conf: [DEFAULT] trove_api_workers = 2 default_datastore = mysql debug = True verbose = True #use_syslog = True bind_host = 0.0.0.0 bind_port = 8779 rpc_backend = rabbit control_exchange = trove db_api_implementation = "trove.db.sqlalchemy.api" #trove_auth_url = http://keystone:35357/v3 trove_auth_url = http://keystone:35357 nova_compute_url = http://openstack:8774/v2 neutron_url = http://openstack:9696/ notifier_queue_hostname = openstack trove_volume_support = True block_device_mapping = vdb device_path = /dev/vdb max_accepted_volume_size = 10 max_instances_per_tenant = 5 max_volumes_per_tenant = 100 max_backups_per_tenant = 5 volume_time_out=30 http_get_rate = 200 http_post_rate = 200 http_put_rate = 200 http_delete_rate = 200 http_mgmt_post_rate = 200 trove_dns_support = False dns_account_id = 123456 dns_auth_url = http://127.0.0.1/identity/v2.0 dns_username = user dns_passkey = password dns_ttl = 3600 dns_domain_name = 'trove.com.' dns_domain_id = 11111111-1111-1111-1111-111111111111 dns_driver = trove.dns.designate.driver.DesignateDriver dns_instance_entry_factory = trove.dns.designate.driver.DesignateInstanceEntryFactory dns_endpoint_url = http://127.0.0.1/v1/ dns_service_type = dns network_driver = trove.network.neutron.NeutronDriver default_neutron_networks = taskmanager_queue = taskmanager admin_roles = admin agent_heartbeat_time = 10 agent_call_low_timeout = 5 agent_call_high_timeout = 150 reboot_time_out = 60 api_paste_config = /etc/trove/api-paste.ini log_file = /var/log/trove/trove.log auth_strategy = keystone add_addresses = True network_label_regex = .* ip_regex = .* black_list_regex = log_dir = /var/log/trove [keystone_authtoken] auth_url = http://keystone:5000 auth_type = password project_domain_name = default user_domain_name = default project_name = service username = trove password = 2ffa4772223b858064e5 memcached_servers = controller01:11211,controller02:11211,controller03:11211 [database] connection = mysql+pymysql://trove:2ffa4772223b858064e5 at controller03/trove idle_timeout = 3600 [profiler] [ssl] [oslo_messaging_rabbit] rabbit_hosts = openstack rabbit_userid = trove rabbit_password = 5c5014aa32cf7999d195 [mysql] root_on_create = False tcp_ports = 3306 volume_support = True device_path = /dev/vdb ignore_users = os_admin, root ignore_dbs = mysql, information_schema, performance_schema [redis] tcp_ports = 6379, 16379 volume_support = False [cassandra] tcp_ports = 7000, 7001, 9042, 9160 volume_support = True device_path = /dev/vdb [couchbase] tcp_ports = 8091, 8092, 4369, 11209-11211, 21100-21199 volume_support = True device_path = /dev/vdb [mongodb] tcp_ports = 2500, 27017, 27019 volume_support = True device_path = /dev/vdb num_config_servers_per_cluster = 1 num_query_routers_per_cluster = 1 [vertica] tcp_ports = 5433, 5434, 22, 5444, 5450, 4803 udp_ports = 5433, 4803, 4804, 6453 volume_support = True device_path = /dev/vdb cluster_support = True cluster_member_count = 3 api_strategy = trove.common.strategies.cluster.experimental.vertica.api.VerticaAPIStrategy [cors] [cors.subdomain] [oslo_middleware] trove-taskmanager.conf: [DEFAULT] debug = True verbose = True #use_syslog = True rpc_backend = rabbit control_exchange = trove update_status_on_fail = True control_exchange = trove db_api_implementation = trove.db.sqlalchemy.api #trove_auth_url = http://keystone:35357/v3 trove_auth_url = http://keystone:35357 nova_compute_url = http://openstack:8774/v2 notifier_queue_hostname = 
openstack trove_volume_support = True block_device_mapping = vdb device_path = /dev/vdb mount_point = /var/lib/mysql volume_time_out=30 server_delete_time_out=480 use_nova_server_config_drive = True nova_proxy_admin_user = trove nova_proxy_admin_pass = 2ffa4772223b858064e5 nova_proxy_admin_tenant_name = service taskmanager_manager=trove.taskmanager.manager.Manager exists_notification_transformer = trove.extensions.mgmt.instances.models.NovaNotificationTransformer exists_notification_ticks = 30 notification_service_id = mysql:2f3ff068-2bfb-4f70-9a9d-a6bb65bc084b trove_dns_support = False dns_account_id = 123456 dns_auth_url = http://127.0.0.1/identity/v2.0 dns_username = user dns_passkey = password dns_ttl = 3600 dns_domain_name = 'trove.com.' dns_domain_id = 11111111-1111-1111-1111-111111111111 dns_driver = trove.dns.designate.driver.DesignateDriver dns_instance_entry_factory = trove.dns.designate.driver.DesignateInstanceEntryFactory dns_endpoint_url = http://127.0.0.1/v1/ dns_service_type = dns network_driver = trove.network.neutron.NeutronDriver network_label_regex = .* ip_regex = .* black_list_regex = default_neutron_networks = trove_security_groups_support = True trove_security_group_rule_cidr = 0.0.0.0/0 agent_heartbeat_time = 10 agent_call_low_timeout = 20 agent_call_high_timeout = 150 agent_replication_snapshot_timeout = 36000 use_nova_server_volume = False template_path = /etc/trove/templates/ pydev_debug = disabled [database] connection = mysql+pymysql://trove:2ffa4772223b858064e5 at controller03/trove idle_timeout = 3600 [profiler] [oslo_messaging_rabbit] rabbit_hosts = openstack rabbit_userid = trove rabbit_password = 5c5014aa32cf7999d195 [mysql] icmp = True tcp_ports = 22, 3306 volume_support = True device_path = /dev/vdb [redis] tcp_ports = 22, 6379, 16379 volume_support = False [cassandra] tcp_ports = 22, 7000, 7001, 7199, 9042, 9160 volume_support = True device_path = /dev/vdb [couchbase] tcp_ports = 22, 8091, 8092, 4369, 11209-11211, 21100-21199 volume_support = True device_path = /dev/vdb [couchdb] tcp_ports = 22, 5984 [db2] tcp_ports = 22, 50000 [mariadb] tcp_ports = 22, 3306, 4444, 4567, 4568 [mongodb] volume_support = True device_path = /dev/vdb tcp_ports = 22, 2500, 27017, 27019 [percona] tcp_ports = 22, 3306 [postgresql] tcp_ports = 22, 5432 [pxc] tcp_ports = 22, 3306, 4444, 4567, 4568 [vertica] tcp_ports = 22, 5433, 5434, 22, 5444, 5450, 4803 udp_ports = 5433, 4803, 4804, 6453 volume_support = True device_path = /dev/vdb mount_point = /var/lib/vertica taskmanager_strategy = trove.common.strategies.cluster.experimental.vertica.taskmanager.VerticaTaskManagerStrategy trove-conductor.conf: [DEFAULT] debug = True verbose = True #use_syslog = True #trove_auth_url = http://keystone:35357/v3 trove_auth_url = http://keystone:35357 nova_compute_url = http://openstack:8774/v2 notifier_queue_hostname = openstack conductor_manager = trove.conductor.manager.Manager control_exchange = trove [profiler] [database] connection = mysql+pymysql://trove:2ffa4772223b858064e5 at controller03/trove [oslo_messaging_rabbit] rabbit_hosts = openstack rabbit_userid = trove rabbit_password = 5c5014aa32cf7999d195 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From florian.engelmann at everyware.ch Wed Sep 19 15:54:53 2018 From: florian.engelmann at everyware.ch (Florian Engelmann) Date: Wed, 19 Sep 2018 17:54:53 +0200 Subject: [Openstack] White label Message-ID: <812b6b86-b08d-47d7-fb24-ece5e4e9aacd@everyware.ch> Hi, anyone with experiences in white labelling Openstack? Horizion is easy but how to white label all the API endpoints? All the best, Florian -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5210 bytes Desc: not available URL: From fungi at yuggoth.org Thu Sep 20 16:32:49 2018 From: fungi at yuggoth.org (Jeremy Stanley) Date: Thu, 20 Sep 2018 16:32:49 +0000 Subject: [Openstack] [all] We're combining the lists! (was: Bringing the community together...) In-Reply-To: <20180830170350.wrz4wlanb276kncb@yuggoth.org> References: <20180830170350.wrz4wlanb276kncb@yuggoth.org> Message-ID: <20180920163248.oia5t7zjqcfwluwz@yuggoth.org> tl;dr: The openstack, openstack-dev, openstack-sigs and openstack-operators mailing lists (to which this is being sent) will be replaced by a new openstack-discuss at lists.openstack.org mailing list. The new list is open for subscriptions[0] now, but is not yet accepting posts until Monday November 19 and it's strongly recommended to subscribe before that date so as not to miss any messages posted there. The old lists will be configured to no longer accept posts starting on Monday December 3, but in the interim posts to the old lists will also get copied to the new list so it's safe to unsubscribe from them any time after the 19th and not miss any messages. Now on to the details... The original proposal[1] I cross-posted to these lists in August received overwhelmingly positive feedback (indeed only one strong objection[2] was posted, thanks Thomas for speaking up, and my apologies in advance if this makes things less convenient for you), which is unusual since our community usually tends to operate on silent assent and tacit agreement. Seeing what we can only interpret as majority consensus for the plan among the people reading messages posted to these lists, a group of interested individuals met last week in the Infrastructure team room at the PTG to work out the finer details[3]. We devised a phased timeline: During the first phase (which begins with this announcement) the new openstack-discuss mailing list will accept subscriptions but not posts. Its short and full descriptions indicate this, as does the welcome message sent to all new subscribers during this phase. The list is configured for "emergency moderation" mode so that all posts, even those from subscribers, immediately land in the moderation queue and can be rejected with an appropriate message. We strongly recommend everyone who is on any of the current general openstack, openstack-dev, openstack-operators and openstack-sigs lists subscribe to openstack-discuss during this phase in order to avoid missing any messages to the new list. Phase one lasts roughly one month and ends on Monday November 19, just after the OpenStack Stein Summit in Berlin. The second phase picks up at the end of the first. During this phase, emergency moderation is no longer in effect and subscribers can post to the list normally (non-subscribers are subject to moderation of course in order to limit spam). Any owners/moderators from the original lists who wish it will be added to the new one to collaborate on moderation tasks. 
At this time the openstack-discuss list address itself will be subscribed to posts from the openstack, openstack-dev, openstack-operators and openstack-sigs mailing lists so anyone who wishes to unsubscribe from those can do so at any time during this phase without missing any replies sent there. The list descriptions and welcome message will also be updated to their production prose. Phase two runs for two weeks ending on Monday December 3. The third and final phase begins at the end of the second, when further posts to the general openstack, openstack-dev, openstack-operators and openstack-sigs lists will be refused and the descriptions for those lists updated to indicate they're indefinitely retired from use. The old archives will still be preserved of course, but no new content will appear in them. A note about DMARC/DKIM: during the planning discussion we also spoke briefly about the problems we encounter on the current lists whereby subscriber MTAs which check DKIM signatures appearing in some posts reject them and cause those subscribers to get unsubscribed after too many of these bounces. While reviewing the various possible mitigation options available to us, we eventually resolved that the least objectionable solution was to cease modifying the list subject and body. As such, for the new openstack-discuss list you won't see [openstack-discuss] prepended to message subjects, and there will be no list footer block added to the message body. Rest assured the usual RFC 2369 List-* headers[4] will still be added so MUAs can continue to take filtering actions based on them as on our other lists. I'm also including a couple of FAQs which have come up over the course of this... Why make a new list instead of just directing people to join an existing one such as the openstack general ML? For one, the above list behavior change to address DMARC/DKIM issues is a good reason to want a new list; making those changes to any of the existing lists is already likely to be disruptive anyway as subscribers may be relying on the subject mangling for purposes of filtering list traffic. Also as noted earlier in the thread for the original proposal, we have many suspected defunct subscribers who are not bouncing (either due to abandoned mailboxes or MTAs black-holing them) so this is a good opportunity to clean up the subscriber list and reduce the overall amount of E-mail unnecessarily sent by the server. Why not simply auto-subscribe everyone from the four older lists to the new one and call it a day? Well, I personally would find it rude if a list admin mass-subscribed me to a mailing list I hadn't directly requested. Doing so may even be illegal in some jurisdictions (we could probably make a case that it's warranted, but it's cleaner to not need to justify such an action). Much like the answer to the previous question, the changes in behavior (and also in the list name itself) are likely to cause lots of subscribers to need to update their message filtering rules anyway. I know by default it would all start landing in my main inbox, and annoy me mightily. What subject tags are we going to be using to identify messages of interest and to be able to skip those we don't care about? We're going to continuously deploy a list of recommended subject tags in a visible space, either on the listserv's WebUI or the Infra Manual and link to it liberally. There is already an initial set of suggestions[5] being brainstormed, so feel free to add any there you feel might be missing. 
It's not yet been decided whether we'll also include these in the Mailman "Topics" configuration to enable server-side filtering on them (as there's a good chance we'll be unable to continue supporting that after an upgrade to Mailman 3), so for now it's best to assume you may need to add them to your client-side filters if you rely on that capability. If you have any further questions, please feel free to respond to this announcement so we can make sure they're answered. [0] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-discuss [1] http://lists.openstack.org/pipermail/openstack-sigs/2018-August/000493.html [2] http://lists.openstack.org/pipermail/openstack-dev/2018-August/134074.html [3] https://etherpad.openstack.org/p/infra-ptg-denver-2018 [4] https://www.ietf.org/rfc/rfc2369.txt [5] https://etherpad.openstack.org/p/common-openstack-ml-topics -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From satish.txt at gmail.com Thu Sep 20 21:21:47 2018 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 20 Sep 2018 17:21:47 -0400 Subject: [Openstack] SR-IOV packet drops stats Message-ID: Folks, I am running openstack with Linuxbridge+vlan and found getting TX drop on tap interface so i have plan to implement SR-IOV and now running VM on SR-IOV but question is how do i check VF interface status for packet drops? is there a way to find our packet drops on VF interface or i should be checking drops on instance OS interface with ifconfig ethX ? From mrhillsman at gmail.com Thu Sep 20 22:30:32 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Thu, 20 Sep 2018 17:30:32 -0500 Subject: [Openstack] Capturing Feedback/Input Message-ID: Hey everyone, During the TC meeting at the PTG we discussed the ideal way to capture user-centric feedback; particular from our various groups like SIGs, WGs, etc. Options that were mentioned ranged from a wiki page to a standalone solution like discourse. While there is no perfect solution it was determined that Storyboard could facilitate this. It would play out where there is a project group openstack-uc? and each of the SIGs, WGs, etc would have a project under this group; if I am wrong someone else in the room correct me. The entire point is a first step (maybe final) in centralizing user-centric feedback that does not require any extra overhead be it cost, time, or otherwise. Just kicking off a discussion so others have a chance to chime in before anyone pulls the plug or pushes the button on anything and we settle as a community on what makes sense. -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tsuyuzaki.kota at lab.ntt.co.jp Fri Sep 21 07:59:47 2018 From: tsuyuzaki.kota at lab.ntt.co.jp (Kota TSUYUZAKI) Date: Fri, 21 Sep 2018 16:59:47 +0900 Subject: [Openstack] Can any user add or delete OpenStack Swift middleware? In-Reply-To: References: <5BA05DDE.5060606@lab.ntt.co.jp> <5BA08242.9090305@lab.ntt.co.jp> Message-ID: <5BA4A4F3.8070702@lab.ntt.co.jp> Hi Qiao, > Thanks! I'm interested and would like to join, as well as contribute! 
> One example, that is how the multi-READ works, is around [1], the storlets middleware can make a subrequest against to the backend Swift then, attach the request input to the application in the Docker container by passing the file descriptor to be readable[2][3][4]*. After all of the prepartion for the invocation, the input descriptors will be readable in the storlet app as the InputFile. * At first, the runtime prepares the extra source stub at [2], then creates a set of pipes for each sources to be communicated with the app inside the docker daemon[3], then, the runtime module reads the extra data from Swift GET and flushes all buffers into the descriptor [4]. 1: https://github.com/openstack/storlets/blob/master/storlets/swift_middleware/handlers/proxy.py#L294-L305 2: https://github.com/openstack/storlets/blob/master/storlets/gateway/gateways/docker/runtime.py#L571-L581 3: https://github.com/openstack/storlets/blob/master/storlets/gateway/gateways/docker/runtime.py#L665-L666 4: https://github.com/openstack/storlets/blob/master/storlets/gateway/gateways/docker/runtime.py#L833-L840 Following the mechanism, IMO, what we can do to enable multi-out is - add the capability to create some PUT subrequest at swift_middleware module (define the new API header too) - create the extra communication write-able fds in the storlets runtime (perhaps, storlets daemon is also needed to be changed) - pass all data from the write-able fds as to the sub PUT request input If you have any nice idea rather than me, it's always welcome tho :) > Another potential use case: imagine I want to compress objects upon > PUTs using two different algorithms X and Y, and use the future > 'multi-write' feature to store three objects upon any single PUT > (original copy, X-compressed copy and Y-compressed copy). I can > install two Storlets which implement X and Y respectively. However, > seems Storlets engine can only invoke one per PUT, so this is still > not feasible. Is that correct? > It sounds interesting. As you know, yes, one Storlet application can be invoked per one PUT. On the other hand, Storlets has been capable to run several applications as you want. One idea using the capability, if you develop an application like: - Storlet app invokes several multi threads with their output descriptor - Storlet app reads the input stream, then pushes the data into the threads - Each thread performs as you want (one does as X compression, the other does as Y compression) then, writes its own result to the output descriptor It might work for your use case. Thanks, Kota (2018/09/19 5:52), Qiao Kang wrote: > Dear Kota, > > On Mon, Sep 17, 2018 at 11:43 PM Kota TSUYUZAKI > wrote: >> >> Hi Quio, >> >>> I know Storlets can provide user-defined computation functionalities, >>> but I guess some capabilities can only be achieved using middleware. >>> For example, a user may want such a feature: upon each PUT request, it >>> creates a compressed copy of the object and stores both the original >>> copy and compressed copy. It's feasible using middlware but I don't >>> think Storlets provide such capability. >> >> Interesting, exactly currently it's not supported to write to multi objects for a PUT request but as well as other middlewares we could adopt the feasibility into Storlets if you prefer. >> Right now, the multi read (i.e. GET from multi sources) is only available and I think we would be able to expand the logic to PUT requests too. 
IIRC, in those days, we had discussion on sort of the >> multi-out use cases and I'm sure the data structure inside Storlets are designed to be capable to that expantion. At that time, we called them "Tee" application on Storlets, I could not find the >> historical discussion logs about how to implement tho, sorry. I believe that would be an use case for storlets if you prefer the user-defined application flexibilities rather than operator defined >> Swift middleware. >> >> The example of multi-read (GET from multi sources) are here: >> https://github.com/openstack/storlets/blob/master/tests/functional/python/test_multiinput_storlet.py >> >> And if you like to try to write multi write, please join us, I'm happy to help you anytime. >> > > Thanks! I'm interested and would like to join, as well as contribute! > > Another potential use case: imagine I want to compress objects upon > PUTs using two different algorithms X and Y, and use the future > 'multi-write' feature to store three objects upon any single PUT > (original copy, X-compressed copy and Y-compressed copy). I can > install two Storlets which implement X and Y respectively. However, > seems Storlets engine can only invoke one per PUT, so this is still > not feasible. Is that correct? > >> >>> Another example is that a user may want to install a Swift3-like >>> middleware to provide APIs to a 3rd party, but she doesn't want other >>> users to see this middleware. >>> >> >> If the definition can be made by operators, perhaps one possible solution that preparing different proxy-server endpoint for different users is available. i.e. an user uses no-s3api available proxy, >> then the others use a different proxy-server endpoint that has the s3api in the pipeline. >> >> Or, it sounds like kinda defaulter middleware[1], I don't think it has the scope turning on/off the middlewares for now. >> >> 1: https://review.openstack.org/#/c/342857/ > > I see, thanks for pointing out the defaulter project! > > Best, > Qiao > >> >> Best, >> Kota >> >> (2018/09/18 11:34), Qiao Kang wrote: >>> Kota, >>> >>> Thanks for your reply, very helpful! >>> >>> I know Storlets can provide user-defined computation functionalities, >>> but I guess some capabilities can only be achieved using middleware. >>> For example, a user may want such a feature: upon each PUT request, it >>> creates a compressed copy of the object and stores both the original >>> copy and compressed copy. It's feasible using middlware but I don't >>> think Storlets provide such capability. >>> >>> Another example is that a user may want to install a Swift3-like >>> middleware to provide APIs to a 3rd party, but she doesn't want other >>> users to see this middleware. >>> >>> Regards, >>> Qiao >>> >>> On Mon, Sep 17, 2018 at 9:19 PM Kota TSUYUZAKI >>> wrote: >>>> >>>> With Storlets, users will be able to create their own applications that are able to run like as a Swift middeleware. The application (currently Python and Java are supported as the language but the >>>> apps can calls any binaries in the workspace) can be uploaded as a Swift object, then, users can invoke them with just an extra header that specifies your apps. >>>> >>>> To fit your own use case, we may have to consider to invole or to integrate the system for you but I believe Storlets could be a choice for you. 
>>>> >>>> In detail, Storlets documantation is around there, >>>> >>>> Top Level Index: https://docs.openstack.org/storlets/latest/index.html >>>> System Overview: https://docs.openstack.org/storlets/latest/storlet_engine_overview.html >>>> APIs: https://docs.openstack.org/storlets/latest/api/overview_api.html >>>> >>>> Thanks, >>>> >>>> Kota >>>> >>>> (2018/09/17 8:59), John Dickinson wrote: >>>>> You may be interested in Storlets. It's another OpenStack project, maintained by a Swift core reviewer, that provides this sort of user-defined middleware functionality. >>>>> >>>>> You can also ask about it in #openstack-swift >>>>> >>>>> --John >>>>> >>>>> >>>>> >>>>> On 16 Sep 2018, at 9:25, Qiao Kang wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> I'm wondering whether Swift allows any user (not the administrator) to >>>>>> specify which middleware that she/he wants his data object to go throught. >>>>>> For instance, Alice wants to install a middleware but doesn't want Bob to >>>>>> use it, where Alice and Bob are two accounts in a single Swift cluster. >>>>>> >>>>>> Or maybe all middlewares are pre-installed globally and cannot be >>>>>> customized on a per-account basis? >>>>>> >>>>>> Thanks, >>>>>> Qiao >>>>>> _______________________________________________ >>>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>>>>> Post to : openstack at lists.openstack.org >>>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>>>> >>>>> _______________________________________________ >>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>>>> Post to : openstack at lists.openstack.org >>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>>> >>>> >>>> -- >>>> ---------------------------------------------------------- >>>> Kota Tsuyuzaki(露﨑 浩太) >>>> NTT Software Innovation Center >>>> Distributed Computing Technology Project >>>> Phone 0422-59-2837 >>>> Fax 0422-59-2965 >>>> ----------------------------------------------------------- >>>> >>>> >>>> _______________________________________________ >>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>>> Post to : openstack at lists.openstack.org >>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> >>> >> >> >> -- >> ---------------------------------------------------------- >> Kota Tsuyuzaki(露﨑 浩太) >> NTT Software Innovation Center >> Distributed Computing Technology Project >> Phone 0422-59-2837 >> Fax 0422-59-2965 >> ----------------------------------------------------------- >> > > From jkzcristiano at gmail.com Fri Sep 21 15:20:05 2018 From: jkzcristiano at gmail.com (jkzcristiano) Date: Fri, 21 Sep 2018 17:20:05 +0200 Subject: [Openstack] Zun on Devstack Pike Message-ID: Dear all, I have recently came across Zun project that adds container capabilities to OpenStack. >From its git repository [1] I can see that it has a stable/pike branch. [image: image.png] However, in the devstack documentation to install Zun [2] the *local.conf* file must enable the kuryr-libnetwork plugin [3]: enable_plugin kuryr-libnetwork https://git.openstack.org/openstack/kuryr-libnetwork The thing is that the git repository of kuryr-libnetwork [3] has only two stable branches: one for rocky and another for queens, i.e., there is no pike. [image: image.png] Therefore, can I install Zun on Devstack Pike following the documentation [2] or do I need some special configuration? 
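For reference, devstack's enable_plugin directive accepts an optional third argument naming a branch, tag or commit, so a plugin can be pinned even where no stable/pike branch exists. A rough local.conf sketch along those lines (the kuryr-libnetwork ref below is a placeholder, not a tested value, and whether a newer kuryr-libnetwork actually works against a Pike Zun is something only a test run will tell):

    enable_plugin zun https://git.openstack.org/openstack/zun stable/pike
    enable_plugin kuryr-libnetwork https://git.openstack.org/openstack/kuryr-libnetwork <tag-or-commit-from-the-pike-timeframe>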
Sincerely, Xoan [1] https://github.com/openstack/zun [2] https://docs.openstack.org/zun/pike/contributor/manual-devstack.html [3] https://github.com/openstack/kuryr-libnetwork -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 15856 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 15643 bytes Desc: not available URL: From mrhillsman at gmail.com Fri Sep 21 17:55:09 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Fri, 21 Sep 2018 12:55:09 -0500 Subject: [Openstack] [Openstack-sigs] Capturing Feedback/Input In-Reply-To: <1537546393-sup-9882@lrrr.local> References: <1537540740-sup-4229@lrrr.local> <1537546393-sup-9882@lrrr.local> Message-ID: On Fri, Sep 21, 2018 at 11:16 AM Doug Hellmann wrote: > Excerpts from Melvin Hillsman's message of 2018-09-21 10:18:26 -0500: > > On Fri, Sep 21, 2018 at 9:41 AM Doug Hellmann > wrote: > > > > > Excerpts from Melvin Hillsman's message of 2018-09-20 17:30:32 -0500: > > > > Hey everyone, > > > > > > > > During the TC meeting at the PTG we discussed the ideal way to > capture > > > > user-centric feedback; particular from our various groups like SIGs, > WGs, > > > > etc. > > > > > > > > Options that were mentioned ranged from a wiki page to a standalone > > > > solution like discourse. > > > > > > > > While there is no perfect solution it was determined that Storyboard > > > could > > > > facilitate this. It would play out where there is a project group > > > > openstack-uc? and each of the SIGs, WGs, etc would have a project > under > > > > this group; if I am wrong someone else in the room correct me. > > > > > > > > The entire point is a first step (maybe final) in centralizing > > > user-centric > > > > feedback that does not require any extra overhead be it cost, time, > or > > > > otherwise. Just kicking off a discussion so others have a chance to > chime > > > > in before anyone pulls the plug or pushes the button on anything and > we > > > > settle as a community on what makes sense. > > > > > > > > > > I like the idea of tracking the information in storyboard. That > > > said, one of the main purposes of creating SIGs was to separate > > > those groups from the appearance that they were "managed" by the > > > TC or UC. So, rather than creating a UC-focused project group, if > > > we need a single project group at all, I would rather we call it > > > "SIGs" or something similar. > > > > > > > What you bring up re appearances makes sense definitely. Maybe we call it > > openstack-feedback since the purpose is focused on that and I actually > > looked at -uc as user-centric rather than user-committee; but > appearances :) > > Feedback implies that SIGs aren't engaged in creating OpenStack, though, > and I think that's the perception we're trying to change. > > > I think limiting it to SIGs will well, limit it to SIGs, and again could > > appear to be specific to those groups rather than for example the Public > > Cloud WG or Financial Team. > > OK, I thought those groups were SIGs. > > Maybe we're overthinking the organization on this. What is special about > the items that would be on this list compared to items opened directly > against projects? > Yeah unfortunately we do have a tendency to overthink/complicate things. 
Not saying Storyboard is the right tool but suggested rather than having something extra to maintain was what I understood. There are at least 3 things that were to be addressed: - single pane so folks know where to provide/see updates - it is not a catchall/dumpsite - something still needs to be flushed out/prioritized (Public Cloud WG's missing features spreadsheet for example) - not specific to a single project (i thought this was a given since there is already a process/workflow for single project) I could very well be wrong so I am open to be corrected. From my perspective the idea in the room was to not circumvent anything internal but rather make it easy for external viewers, passerbys, etc. When feedback is gathered from Ops Meetup, OpenStack Days, Local meetups/events, we discussed putting that here as well. > > Doug > > _______________________________________________ > openstack-sigs mailing list > openstack-sigs at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-sigs > -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Fri Sep 21 19:24:32 2018 From: fungi at yuggoth.org (Jeremy Stanley) Date: Fri, 21 Sep 2018 19:24:32 +0000 Subject: [Openstack] [Openstack-sigs] Capturing Feedback/Input In-Reply-To: References: <1537540740-sup-4229@lrrr.local> <1537546393-sup-9882@lrrr.local> Message-ID: <20180921192432.k23x2u3w7626cder@yuggoth.org> On 2018-09-21 12:55:09 -0500 (-0500), Melvin Hillsman wrote: [...] > Yeah unfortunately we do have a tendency to overthink/complicate > things. Not saying Storyboard is the right tool but suggested > rather than having something extra to maintain was what I > understood. There are at least 3 things that were to be addressed: > > - single pane so folks know where to provide/see updates Not all OpenStack projects use the same task trackers currently and there's no guarantee that they ever will, so this is a best effort only. Odds are you may wind up duplicating some information also present in the Nova project on Launchpad, the Tripleo project on Trello and the Foobly project on Bugzilla (I made this last one up, in case it's not obvious). > - it is not a catchall/dumpsite If it looks generic enough, it will become that unless there are people actively devoted to triaging and pruning submissions to curate them... a tedious and thankless long-term commitment, to be sure. > - something still needs to be flushed out/prioritized (Public > Cloud WG's missing features spreadsheet for example) This is definitely a good source of input, but still needs someone to determine which various projects/services the tasks for them get slotted into and then help prioritizing and managing spec submissions on a per-team basis. > - not specific to a single project (i thought this was a given > since there is already a process/workflow for single project) The way to do that on storyboard.openstack.org is to give it a project of its own. Basically just couple it to a new, empty Git repository and then the people doing these tasks still have the option of also putting that repository to some use later (for example, to house their workflow documentation). > I could very well be wrong so I am open to be corrected. From my > perspective the idea in the room was to not circumvent anything > internal but rather make it easy for external viewers, passerbys, > etc. 
When feedback is gathered from Ops Meetup, OpenStack Days, > Local meetups/events, we discussed putting that here as well. It seems a fine plan, just keep in mind that documenting and publishing feedback doesn't magically translate into developers acting on any of it (and this is far from the first time it's been attempted). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From berndbausch at gmail.com Tue Sep 25 07:46:33 2018 From: berndbausch at gmail.com (Bernd Bausch) Date: Tue, 25 Sep 2018 16:46:33 +0900 Subject: [Openstack] [cinder] How to set microversion in the openstack client? Message-ID: <51f6a31b-7c50-7d46-8c4a-17db58aa28ca@gmail.com> I want to try out a few new features, such as creating a volume from backup. On a Rocky Devstack, I get this:     export OS_VOLUME_API_VERSION=3.47     $ openstack volume list     volume version 3.47 is not in supported versions: 1, 2, 3 How can I select a certain microversion? Or: Why is the openstack client limited? The cinder client works as expected. Bernd. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From ristovvtt at gmail.com Tue Sep 25 13:29:18 2018 From: ristovvtt at gmail.com (Risto Vaaraniemi) Date: Tue, 25 Sep 2018 16:29:18 +0300 Subject: [Openstack] [nova]Capacity discrepancy between command line and MySQL query In-Reply-To: <6d211944-6001-722a-60ea-c17dec096914@gmail.com> References: <6d211944-6001-722a-60ea-c17dec096914@gmail.com> Message-ID: On Tue, Aug 28, 2018 at 5:12 PM Jay Pipes wrote: > > On 08/27/2018 09:40 AM, Risto Vaaraniemi wrote: > > Hi, > > > > I tried to migrate a guest to another host but it failed with a > > message saying there's not enough capacity on the target host even > > though the server should me nearly empty. > > > > I did make a few attempts to resize the guest that now runs on > > compute1 but for some reason they failed and by default the resize > > tries to restart the resized guest on a different host (compute1). > > In the end I was able to do the resize on the same host (compute2). > > I was wondering if the resize attempts messed up the compute1 resource > > management. > > Very likely, yes. > > Have you tried restarting the nova-compute services on both compute > nodes and seeing whether the placement service tries to adjust > allocations upon restart? > > Also, please check the logs on the nova-compute workers looking for any > warnings or errors related to communication with placement. I tried restarting but that didn't help. In the end, I was able to solve the situation by making a backup image of the guest that caused the problem and re-creating the guest with the same flavor. From the placement outputs I could see that it was the the guest allocation that was wrong for some reason. The host did not have any extra hanging allocations after that. Thanks for your time, anyway. :) BR, Risto From rivawahyuda at gmail.com Tue Sep 25 15:48:23 2018 From: rivawahyuda at gmail.com (Riva Wahyuda) Date: Tue, 25 Sep 2018 22:48:23 +0700 Subject: [Openstack] Fwd: Need Information ( urgent ) In-Reply-To: References: Message-ID: ---------- Forwarded message --------- From: Riva Wahyuda Date: Tue, Sep 25, 2018 at 10:44 PM Subject: Need Information ( urgent ) To: I try to deploy openstack (queens) using TripleO. 
Success with the containerized deployment option, but without the containerized deployment option the deployment process failed. How can I deploy OpenStack Queens without the containerized deployment method?

Thanks.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From danny.rotscher at tu-dresden.de  Wed Sep 26 09:56:14 2018
From: danny.rotscher at tu-dresden.de (Danny Marc Rotscher)
Date: Wed, 26 Sep 2018 09:56:14 +0000
Subject: [Openstack] Packstack different ethernet device names
Message-ID: <3c94ca9bb14f4e76b4ec29060f44d470@MSX-L101.msx.ad.zih.tu-dresden.de>

Dear all,

is it possible to address multiple network device names in the answer file, for example for the tunnel interface?
My controller runs in a VM and has device names like eth*, while my hypervisor hosts use the new naming scheme (enp*).
I know I can switch the hypervisor hosts back to eth*, but that is only my last resort.

Kind regards,
Danny
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 6102 bytes
Desc: not available
URL:

From sumagowda9394 at gmail.com  Wed Sep 26 12:00:48 2018
From: sumagowda9394 at gmail.com (Suma Gowda)
Date: Wed, 26 Sep 2018 17:30:48 +0530
Subject: [Openstack] Create a PNDA on openstack
Message-ID:

I need to create a PNDA on OpenStack.
1. I installed OpenStack with DevStack in VirtualBox. What do I have to do after that? Please send me the screenshot; I am a beginner to this. What are the CLI, Heat, etc.? Please give one example.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ebiibe82 at gmail.com  Wed Sep 26 12:06:06 2018
From: ebiibe82 at gmail.com (Amit Kumar)
Date: Wed, 26 Sep 2018 17:36:06 +0530
Subject: [Openstack] [OpenStack][Neutron][SFC] Regarding SFC support on provider VLAN N/W
Message-ID:

Hi All,

We are using the Ocata release and we have installed networking-sfc for Service Function Chaining functionality. Installation was successful, but when we then tried to create port pairs on a VLAN network it failed. Creating port pairs on a VXLAN based network worked. So, is SFC functionality supported only on VXLAN based networks?

Regards,
Amit
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From eblock at nde.ag  Wed Sep 26 13:06:47 2018
From: eblock at nde.ag (Eugen Block)
Date: Wed, 26 Sep 2018 13:06:47 +0000
Subject: [Openstack] Create a PNDA on openstack
In-Reply-To:
Message-ID: <20180926130647.Horde.2D-_CtmzU5BjWEBceeL9JBc@webmail.nde.ag>

Hi,

to be honest, I think if you are supposed to work with PNDA and OpenStack is your platform, you should at least get an overview of what components it has and what they are for. No example or screenshot will help you understand how those components interact.

The PNDA pages mention OpenStack Mitaka, so you should study the respective docs [1]. Different guides for installation, operations and administration are available for different platforms (Ubuntu, openSUSE/SLES and RedHat/CentOS); pick the one suitable for your environment and understand the basics (creating tenants and users, neutron networking, nova compute etc.). In addition to the mandatory services (neutron, nova, glance, cinder) you'll need Swift (object storage) and Heat (orchestration). Maybe you already get an idea, this is not covered with "send me a screenshot".
;-) Another mandatory requirement is a basic understanding of the command line interface (CLI), this enables you to access and manage your openstack cloud. A quick search will also reveal some tutorials or videos [3] about the basic concepts. Regards, Eugen [1] https://docs.openstack.org/mitaka/ [2] https://docs.openstack.org/mitaka/cli-reference/ [3] https://opensource.com/business/14/2/openstack-beginners-guide Zitat von Suma Gowda : > i need to create a PNDA on openstack.. > 1. i installed openstack on devstack in vitualbox. ..after that what i have > to do.. send me the screenshot.. i am beginner to this. What is > cli,heat..etc..please and one example From dabarren at gmail.com Wed Sep 26 13:45:15 2018 From: dabarren at gmail.com (Eduardo Gonzalez) Date: Wed, 26 Sep 2018 15:45:15 +0200 Subject: [Openstack] [openstack-dev] [kolla] ceph osd deploy fails In-Reply-To: References: Message-ID: CC openstack so others can see the thread El mié., 26 sept. 2018 a las 15:44, Eduardo Gonzalez () escribió: > Hi, i'm not sure at this moment at what your issue may be. Using external > ceph with kolla-ansible is supported. > Just to make sure, rocky is not released yet in kolla/-ansible, only a > release candidate and a proposal for release candidate 2 this week. > > To dig more into your issue, what are your config? Anything out of the box > in the servers? What steps was made to define the osd disks? > > Regards > > El mié., 26 sept. 2018 a las 15:08, Florian Engelmann (< > florian.engelmann at everyware.ch>) escribió: > >> Dear Eduardo, >> >> thank you for your fast response! I recognized those fixes and we are >> using stable/rocky from yesterday because of those commits (using the >> tarballs - not the git repository). >> >> I guess you are talking about: >> >> https://github.com/openstack/kolla-ansible/commit/ef6921e6d7a0922f68ffb05bd022aab7c2882473 >> >> I saw that one in kolla as well: >> >> https://github.com/openstack/kolla/commit/60f0ea10bfdff12d847d9cb3b51ce02ffe96d6e1 >> >> So we are using Ceph 12.2.4 right now and everything up to 24th of >> september in stable/rocky. >> >> Anything else we could test/change? >> >> We are at the point to deploy ceph seperated from kolla (using >> ceph-ansible) because we need a working environment tomorrow. Do you see >> a real chance get ceph via kolla-ansible up and running today? >> >> >> All the best, >> Flo >> >> >> >> >> Am 26.09.18 um 14:44 schrieb Eduardo Gonzalez: >> > Hi, what version of rocky are you using. Maybe was in the middle of a >> > backport which temporally broke ceph. >> > >> > Could you try latest stable/rocky branch? >> > >> > It is now working properly. 
>> > >> > Regards >> > >> > On Wed, Sep 26, 2018, 2:32 PM Florian Engelmann >> > > >> >> > wrote: >> > >> > Hi, >> > >> > I tried to deploy Rocky in a multinode setup but ceph-osd fails >> with: >> > >> > >> > failed: [xxxxxxxxxxx-poc2] (item=[0, {u'fs_uuid': u'', >> u'bs_wal_label': >> > u'', u'external_journal': False, u'bs_blk_label': u'', >> > u'bs_db_partition_num': u'', u'journal_device': u'', u'journal': >> u'', >> > u'partition': u'/dev/nvme0n1', u'bs_wal_partition_num': u'', >> > u'fs_label': u'', u'journal_num': 0, u'bs_wal_device': u'', >> > u'partition_num': u'1', u'bs_db_label': u'', >> u'bs_blk_partition_num': >> > u'', u'device': u'/dev/nvme0n1', u'bs_db_device': u'', >> > u'partition_label': u'KOLLA_CEPH_OSD_BOOTSTRAP_BS', >> u'bs_blk_device': >> > u''}]) => { >> > "changed": true, >> > "item": [ >> > 0, >> > { >> > "bs_blk_device": "", >> > "bs_blk_label": "", >> > "bs_blk_partition_num": "", >> > "bs_db_device": "", >> > "bs_db_label": "", >> > "bs_db_partition_num": "", >> > "bs_wal_device": "", >> > "bs_wal_label": "", >> > "bs_wal_partition_num": "", >> > "device": "/dev/nvme0n1", >> > "external_journal": false, >> > "fs_label": "", >> > "fs_uuid": "", >> > "journal": "", >> > "journal_device": "", >> > "journal_num": 0, >> > "partition": "/dev/nvme0n1", >> > "partition_label": "KOLLA_CEPH_OSD_BOOTSTRAP_BS", >> > "partition_num": "1" >> > } >> > ] >> > } >> > >> > MSG: >> > >> > Container exited with non-zero return code 2 >> > >> > We tried to debug the error message by starting the container with a >> > modified endpoint but we are stuck at the following point right now: >> > >> > >> > docker run -e "HOSTNAME=10.0.153.11" -e "JOURNAL_DEV=" -e >> > "JOURNAL_PARTITION=" -e "JOURNAL_PARTITION_NUM=0" -e >> > "KOLLA_BOOTSTRAP=null" -e "KOLLA_CONFIG_STRATEGY=COPY_ALWAYS" -e >> > "KOLLA_SERVICE_NAME=bootstrap-osd-0" -e "OSD_BS_BLK_DEV=" -e >> > "OSD_BS_BLK_LABEL=" -e "OSD_BS_BLK_PARTNUM=" -e "OSD_BS_DB_DEV=" -e >> > "OSD_BS_DB_LABEL=" -e "OSD_BS_DB_PARTNUM=" -e >> "OSD_BS_DEV=/dev/nvme0n1" >> > -e "OSD_BS_LABEL=KOLLA_CEPH_OSD_BOOTSTRAP_BS" -e "OSD_BS_PARTNUM=1" >> -e >> > "OSD_BS_WAL_DEV=" -e "OSD_BS_WAL_LABEL=" -e "OSD_BS_WAL_PARTNUM=" -e >> > "OSD_DEV=/dev/nvme0n1" -e "OSD_FILESYSTEM=xfs" -e >> > "OSD_INITIAL_WEIGHT=1" >> > -e "OSD_PARTITION=/dev/nvme0n1" -e "OSD_PARTITION_NUM=1" -e >> > "OSD_STORETYPE=bluestore" -e "USE_EXTERNAL_JOURNAL=false" -v >> > "/etc/kolla//ceph-osd/:/var/lib/kolla/config_files/:ro" -v >> > "/etc/localtime:/etc/localtime:ro" -v "/dev/:/dev/" -v >> > "kolla_logs:/var/log/kolla/" -ti --privileged=true --entrypoint >> > /bin/bash >> > >> 10.0.128.7:5000/openstack/openstack-kolla-cfg/ubuntu-source-ceph-osd:7.0.0.3 >> > < >> http://10.0.128.7:5000/openstack/openstack-kolla-cfg/ubuntu-source-ceph-osd:7.0.0.3 >> > >> > >> > >> > >> > cat /var/lib/kolla/config_files/ceph.client.admin.keyring > >> > /etc/ceph/ceph.client.admin.keyring >> > >> > >> > cat /var/lib/kolla/config_files/ceph.conf > /etc/ceph/ceph.conf >> > >> > >> > (bootstrap-osd-0)[root at 985e2dee22bc /]# /usr/bin/ceph-osd -d >> > --public-addr 10.0.153.11 --cluster-addr 10.0.153.11 >> > usage: ceph-osd -i [flags] >> > --osd-data PATH data directory >> > --osd-journal PATH >> > journal file or block device >> > --mkfs create a [new] data directory >> > --mkkey generate a new secret key. 
This is normally >> > used in >> > combination with --mkfs >> > --convert-filestore >> > run any pending upgrade operations >> > --flush-journal flush all data out of journal >> > --mkjournal initialize a new journal >> > --check-wants-journal >> > check whether a journal is desired >> > --check-allows-journal >> > check whether a journal is allowed >> > --check-needs-journal >> > check whether a journal is required >> > --debug_osd set debug level (e.g. 10) >> > --get-device-fsid PATH >> > get OSD fsid for the given block device >> > >> > --conf/-c FILE read configuration from the given >> > configuration file >> > --id/-i ID set ID portion of my name >> > --name/-n TYPE.ID set name >> > --cluster NAME set cluster name (default: ceph) >> > --setuser USER set uid to user or uid (and gid to user's gid) >> > --setgroup GROUP set gid to group or gid >> > --version show version and quit >> > >> > -d run in foreground, log to stderr. >> > -f run in foreground, log to usual location. >> > --debug_ms N set message debug level (e.g. 1) >> > 2018-09-26 12:28:07.801066 7fbda64b4e40 0 ceph version 12.2.4 >> > (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable), >> process >> > (unknown), pid 46 >> > 2018-09-26 12:28:07.801078 7fbda64b4e40 -1 must specify '-i #' >> where # >> > is the osd number >> > >> > >> > But it looks like "-i" is not set anywere? >> > >> > grep command >> > >> /opt/stack/kolla-ansible/ansible/roles/ceph/templates/ceph-osd.json.j2 >> > "command": "/usr/bin/ceph-osd -f --public-addr {{ >> > hostvars[inventory_hostname]['ansible_' + >> > storage_interface]['ipv4']['address'] }} --cluster-addr {{ >> > hostvars[inventory_hostname]['ansible_' + >> > cluster_interface]['ipv4']['address'] }}", >> > >> > What's wrong with our setup? >> > >> > All the best, >> > Flo >> > >> > >> > -- >> > >> > EveryWare AG >> > Florian Engelmann >> > Systems Engineer >> > Zurlindenstrasse 52a >> > CH-8003 Zürich >> > >> > tel: +41 44 466 60 00 >> > fax: +41 44 466 60 10 >> > mail: mailto:florian.engelmann at everyware.ch >> > >> > web: http://www.everyware.ch >> > >> __________________________________________________________________________ >> > OpenStack Development Mailing List (not for usage questions) >> > Unsubscribe: >> > OpenStack-dev-request at lists.openstack.org?subject:unsubscribe >> > < >> http://OpenStack-dev-request at lists.openstack.org?subject:unsubscribe> >> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >> > >> > >> > >> __________________________________________________________________________ >> > OpenStack Development Mailing List (not for usage questions) >> > Unsubscribe: >> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe >> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >> > >> >> -- >> >> EveryWare AG >> Florian Engelmann >> Systems Engineer >> Zurlindenstrasse 52a >> CH-8003 Zürich >> >> tel: +41 44 466 60 00 >> fax: +41 44 466 60 10 >> mail: mailto:florian.engelmann at everyware.ch >> web: http://www.everyware.ch >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From remo at italy1.com Wed Sep 26 14:34:12 2018 From: remo at italy1.com (Remo Mattei) Date: Wed, 26 Sep 2018 07:34:12 -0700 Subject: [Openstack] Packstack different ethernet device names In-Reply-To: <3c94ca9bb14f4e76b4ec29060f44d470@MSX-L101.msx.ad.zih.tu-dresden.de> References: <3c94ca9bb14f4e76b4ec29060f44d470@MSX-L101.msx.ad.zih.tu-dresden.de> Message-ID: Yes packstack has a section to map the nic. 
I will have to check my old old config on my computer then share it. Remo > Il giorno 26 set 2018, alle ore 02:56, Danny Marc Rotscher ha scritto: > > Dear all, > > it is possible to address multible network device names in the answer file for example for the tunnel interface? > Because my controller run on a vm and has the device name eth* and my hypervisor hosts have something like enp*, which is the new one. > I know I can switch back to eth* for the hypervisor hosts, but that is only the last way I would prefer. > > Kind regards, > Danny > -------------- next part -------------- An HTML attachment was scrubbed... URL: From qiaokang1213 at gmail.com Wed Sep 26 15:58:56 2018 From: qiaokang1213 at gmail.com (Qiao Kang) Date: Wed, 26 Sep 2018 10:58:56 -0500 Subject: [Openstack] Can any user add or delete OpenStack Swift middleware? In-Reply-To: <5BA4A4F3.8070702@lab.ntt.co.jp> References: <5BA05DDE.5060606@lab.ntt.co.jp> <5BA08242.9090305@lab.ntt.co.jp> <5BA4A4F3.8070702@lab.ntt.co.jp> Message-ID: Kota, Sorry for the late response, see more below: On Fri, Sep 21, 2018 at 2:59 AM Kota TSUYUZAKI wrote: > > Hi Qiao, > > > Thanks! I'm interested and would like to join, as well as contribute! > > > > One example, that is how the multi-READ works, is around [1], the storlets middleware can make a subrequest against to the backend Swift then, > attach the request input to the application in the Docker container by passing the file descriptor to be readable[2][3][4]*. > After all of the prepartion for the invocation, the input descriptors will be readable in the storlet app as the InputFile. > > * At first, the runtime prepares the extra source stub at [2], then creates a set of pipes for each sources to be communicated with the app > inside the docker daemon[3], then, the runtime module reads the extra data from Swift GET and flushes all buffers into the descriptor [4]. > > 1: https://github.com/openstack/storlets/blob/master/storlets/swift_middleware/handlers/proxy.py#L294-L305 > 2: https://github.com/openstack/storlets/blob/master/storlets/gateway/gateways/docker/runtime.py#L571-L581 > 3: https://github.com/openstack/storlets/blob/master/storlets/gateway/gateways/docker/runtime.py#L665-L666 > 4: https://github.com/openstack/storlets/blob/master/storlets/gateway/gateways/docker/runtime.py#L833-L840 > > Following the mechanism, IMO, what we can do to enable multi-out is > > - add the capability to create some PUT subrequest at swift_middleware module (define the new API header too) > - create the extra communication write-able fds in the storlets runtime (perhaps, storlets daemon is also needed to be changed) > - pass all data from the write-able fds as to the sub PUT request input > > > If you have any nice idea rather than me, it's always welcome tho :) I think your approach is clear and straightforward. One quick question: > - create the extra communication write-able fds in the storlets runtime (perhaps, storlets daemon is also needed to be changed) So the Storlet app will write to those fds? Are these fds temporary and need to be destroyed after PUT requests in step 3? > > > Another potential use case: imagine I want to compress objects upon > > PUTs using two different algorithms X and Y, and use the future > > 'multi-write' feature to store three objects upon any single PUT > > (original copy, X-compressed copy and Y-compressed copy). I can > > install two Storlets which implement X and Y respectively. 
However, > > seems Storlets engine can only invoke one per PUT, so this is still > > not feasible. Is that correct? > > > > It sounds interesting. As you know, yes, one Storlet application can be invoked per one PUT. > On the other hand, Storlets has been capable to run several applications as you want. > One idea using the capability, if you develop an application like: > > - Storlet app invokes several multi threads with their output descriptor > - Storlet app reads the input stream, then pushes the data into the threads > - Each thread performs as you want (one does as X compression, the other does as Y compression) > then, writes its own result to the output descriptor > > It might work for your use case. Sounds great, I guess it should work as well. I'm also concerned with "performance isolation" in Storlets. For instance, is it possible for a user to launch several very heavy-loaded Storlets apps to consume lots of CPU/memory resources to affect other users? Does Storlets do performance/resource isolation? Thanks, Qiao > > Thanks, > Kota > > > (2018/09/19 5:52), Qiao Kang wrote: > > Dear Kota, > > > > On Mon, Sep 17, 2018 at 11:43 PM Kota TSUYUZAKI > > wrote: > >> > >> Hi Quio, > >> > >>> I know Storlets can provide user-defined computation functionalities, > >>> but I guess some capabilities can only be achieved using middleware. > >>> For example, a user may want such a feature: upon each PUT request, it > >>> creates a compressed copy of the object and stores both the original > >>> copy and compressed copy. It's feasible using middlware but I don't > >>> think Storlets provide such capability. > >> > >> Interesting, exactly currently it's not supported to write to multi objects for a PUT request but as well as other middlewares we could adopt the feasibility into Storlets if you prefer. > >> Right now, the multi read (i.e. GET from multi sources) is only available and I think we would be able to expand the logic to PUT requests too. IIRC, in those days, we had discussion on sort of the > >> multi-out use cases and I'm sure the data structure inside Storlets are designed to be capable to that expantion. At that time, we called them "Tee" application on Storlets, I could not find the > >> historical discussion logs about how to implement tho, sorry. I believe that would be an use case for storlets if you prefer the user-defined application flexibilities rather than operator defined > >> Swift middleware. > >> > >> The example of multi-read (GET from multi sources) are here: > >> https://github.com/openstack/storlets/blob/master/tests/functional/python/test_multiinput_storlet.py > >> > >> And if you like to try to write multi write, please join us, I'm happy to help you anytime. > >> > > > > Thanks! I'm interested and would like to join, as well as contribute! > > > > Another potential use case: imagine I want to compress objects upon > > PUTs using two different algorithms X and Y, and use the future > > 'multi-write' feature to store three objects upon any single PUT > > (original copy, X-compressed copy and Y-compressed copy). I can > > install two Storlets which implement X and Y respectively. However, > > seems Storlets engine can only invoke one per PUT, so this is still > > not feasible. Is that correct? > > > >> > >>> Another example is that a user may want to install a Swift3-like > >>> middleware to provide APIs to a 3rd party, but she doesn't want other > >>> users to see this middleware. 
> >>> > >> > >> If the definition can be made by operators, perhaps one possible solution that preparing different proxy-server endpoint for different users is available. i.e. an user uses no-s3api available proxy, > >> then the others use a different proxy-server endpoint that has the s3api in the pipeline. > >> > >> Or, it sounds like kinda defaulter middleware[1], I don't think it has the scope turning on/off the middlewares for now. > >> > >> 1: https://review.openstack.org/#/c/342857/ > > > > I see, thanks for pointing out the defaulter project! > > > > Best, > > Qiao > > > >> > >> Best, > >> Kota > >> > >> (2018/09/18 11:34), Qiao Kang wrote: > >>> Kota, > >>> > >>> Thanks for your reply, very helpful! > >>> > >>> I know Storlets can provide user-defined computation functionalities, > >>> but I guess some capabilities can only be achieved using middleware. > >>> For example, a user may want such a feature: upon each PUT request, it > >>> creates a compressed copy of the object and stores both the original > >>> copy and compressed copy. It's feasible using middlware but I don't > >>> think Storlets provide such capability. > >>> > >>> Another example is that a user may want to install a Swift3-like > >>> middleware to provide APIs to a 3rd party, but she doesn't want other > >>> users to see this middleware. > >>> > >>> Regards, > >>> Qiao > >>> > >>> On Mon, Sep 17, 2018 at 9:19 PM Kota TSUYUZAKI > >>> wrote: > >>>> > >>>> With Storlets, users will be able to create their own applications that are able to run like as a Swift middeleware. The application (currently Python and Java are supported as the language but the > >>>> apps can calls any binaries in the workspace) can be uploaded as a Swift object, then, users can invoke them with just an extra header that specifies your apps. > >>>> > >>>> To fit your own use case, we may have to consider to invole or to integrate the system for you but I believe Storlets could be a choice for you. > >>>> > >>>> In detail, Storlets documantation is around there, > >>>> > >>>> Top Level Index: https://docs.openstack.org/storlets/latest/index.html > >>>> System Overview: https://docs.openstack.org/storlets/latest/storlet_engine_overview.html > >>>> APIs: https://docs.openstack.org/storlets/latest/api/overview_api.html > >>>> > >>>> Thanks, > >>>> > >>>> Kota > >>>> > >>>> (2018/09/17 8:59), John Dickinson wrote: > >>>>> You may be interested in Storlets. It's another OpenStack project, maintained by a Swift core reviewer, that provides this sort of user-defined middleware functionality. > >>>>> > >>>>> You can also ask about it in #openstack-swift > >>>>> > >>>>> --John > >>>>> > >>>>> > >>>>> > >>>>> On 16 Sep 2018, at 9:25, Qiao Kang wrote: > >>>>> > >>>>>> Hi, > >>>>>> > >>>>>> I'm wondering whether Swift allows any user (not the administrator) to > >>>>>> specify which middleware that she/he wants his data object to go throught. > >>>>>> For instance, Alice wants to install a middleware but doesn't want Bob to > >>>>>> use it, where Alice and Bob are two accounts in a single Swift cluster. > >>>>>> > >>>>>> Or maybe all middlewares are pre-installed globally and cannot be > >>>>>> customized on a per-account basis? 
> >>>>>> > >>>>>> Thanks, > >>>>>> Qiao > >>>>>> _______________________________________________ > >>>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >>>>>> Post to : openstack at lists.openstack.org > >>>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >>>>> > >>>>> _______________________________________________ > >>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >>>>> Post to : openstack at lists.openstack.org > >>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >>>> > >>>> > >>>> -- > >>>> ---------------------------------------------------------- > >>>> Kota Tsuyuzaki(露﨑 浩太) > >>>> NTT Software Innovation Center > >>>> Distributed Computing Technology Project > >>>> Phone 0422-59-2837 > >>>> Fax 0422-59-2965 > >>>> ----------------------------------------------------------- > >>>> > >>>> > >>>> _______________________________________________ > >>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >>>> Post to : openstack at lists.openstack.org > >>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > >>> > >>> > >> > >> > >> -- > >> ---------------------------------------------------------- > >> Kota Tsuyuzaki(露﨑 浩太) > >> NTT Software Innovation Center > >> Distributed Computing Technology Project > >> Phone 0422-59-2837 > >> Fax 0422-59-2965 > >> ----------------------------------------------------------- > >> > > > > > > > From danny.rotscher at tu-dresden.de Thu Sep 27 12:10:48 2018 From: danny.rotscher at tu-dresden.de (Danny Marc Rotscher) Date: Thu, 27 Sep 2018 12:10:48 +0000 Subject: [Openstack] Packstack different ethernet device names In-Reply-To: References: <3c94ca9bb14f4e76b4ec29060f44d470@MSX-L101.msx.ad.zih.tu-dresden.de> Message-ID: <4b9c59da070b48b6a047ca39e52eb245@MSX-L102.msx.ad.zih.tu-dresden.de> Hello Remo, a configuration example would be great, thank you! Kind regards, Danny Von: Remo Mattei [mailto:remo at italy1.com] Gesendet: Mittwoch, 26. September 2018 16:34 An: Rotscher, Danny Marc Cc: openstack at lists.openstack.org Betreff: Re: [Openstack] Packstack different ethernet device names Yes packstack has a section to map the nic. I will have to check my old old config on my computer then share it. Remo Il giorno 26 set 2018, alle ore 02:56, Danny Marc Rotscher > ha scritto: Dear all, it is possible to address multible network device names in the answer file for example for the tunnel interface? Because my controller run on a vm and has the device name eth* and my hypervisor hosts have something like enp*, which is the new one. I know I can switch back to eth* for the hypervisor hosts, but that is only the last way I would prefer. Kind regards, Danny -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: smime.p7s Type: application/pkcs7-signature Size: 6102 bytes Desc: not available URL: From satish.txt at gmail.com Thu Sep 27 17:25:43 2018 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 27 Sep 2018 13:25:43 -0400 Subject: [Openstack] URGENT: packet loss on openstack instance In-Reply-To: References: <4126044E-505F-4A48-B126-0625D5F40D72@cisco.com> <11A2F68B-1A87-433B-A4D6-CA495DA88F5C@gmail.com> <5A9B804F-BEA1-468D-BBAB-3C50181A6190@cisco.com> <8089BF19-A95B-4CF5-A2D4-0CB2B7415362@cisco.com> Message-ID: Hey Liping, Follow up on this issue, i have configured SR-IOV and now i am not seeing any packetloss or any latency issue. On Mon, Sep 17, 2018 at 1:27 AM Liping Mao (limao) wrote: > > > Question: I have br-vlan interface mapp with bond0 to run my VM (VLAN > > traffic), so do i need to do anything in bond0 to enable VF/PF > > function? Just confused because currently my VM nic map with compute > > node br-vlan bridge. > > > > I had not actually used SRIOV in my env~ maybe others could help. > > > > Thanks, > > Liping Mao > > > > 在 2018/9/17 11:48,“Satish Patel” 写入: > > > > Thanks Liping, > > > > I will check bug for tx/rx queue size and see if i can make it work > > but look like my 10G NIC support SR-IOV so i am trying that path > > because it will be better for long run. > > > > I have deploy my cloud using openstack-ansible so now i need to figure > > out how do i wire that up with openstack-ansible deployment, here is > > the article [1] > > > > Question: I have br-vlan interface mapp with bond0 to run my VM (VLAN > > traffic), so do i need to do anything in bond0 to enable VF/PF > > function? Just confused because currently my VM nic map with compute > > node br-vlan bridge. > > > > [root at compute-65 ~]# lspci -nn | grep -i ethernet > > 03:00.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10) > > 03:00.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet [14e4:168e] (rev 10) > > 03:01.0 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.1 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.2 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.3 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.4 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.5 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > 03:01.6 Ethernet controller [0200]: Broadcom Limited NetXtreme II > > BCM57810 10 Gigabit Ethernet Virtual Function [14e4:16af] > > > > > > [1] https://docs.openstack.org/openstack-ansible-os_neutron/latest/configure-network-services.html > > On Sun, Sep 16, 2018 at 7:06 PM Liping Mao (limao) wrote: > > > > > > Hi Satish, > > > > > > > > > > > > > > > > > > There are hard limitations in nova's code, I did not actually used more thant 8 queues: > > > > > > def _get_max_tap_queues(self): > > > > > > # NOTE(kengo.sakai): In kernels prior to 3.0, > > > > > > # multiple queues on a tap interface is not supported. > > > > > > # In kernels 3.x, the number of queues on a tap interface > > > > > > # is limited to 8. 
From 4.0, the number is 256. > > > > > > # See: https://bugs.launchpad.net/nova/+bug/1570631 > > > > > > kernel_version = int(os.uname()[2].split(".")[0]) > > > > > > if kernel_version <= 2: > > > > > > return 1 > > > > > > elif kernel_version == 3: > > > > > > return 8 > > > > > > elif kernel_version == 4: > > > > > > return 256 > > > > > > else: > > > > > > return None > > > > > > > > > > > > > I am currently playing with those setting and trying to generate > > > > > > traffic with hping3 tools, do you have any tool to test traffic > > > > > > performance for specially udp style small packets. > > > > > > > > > > > > Hping3 is good enough to reproduce it, we have app level test tool, but that is not your case. > > > > > > > > > > > > > > > > > > > Here i am trying to increase rx_queue_size & tx_queue_size but its not > > > > > > working somehow. I have tired following. > > > > > > > > > > > > Since you are not rocky code, it should only works in qemu.conf, maybe check if this bug[1] affect you. > > > > > > > > > > > > > > > > > > > Is there a way i can automate this last task to update queue number > > > > > > action after reboot vm :) otherwise i can use cloud-init to make sure > > > > > > all VM build with same config. > > > > > > > > > > > > Cloud-init or rc.local could be the place to do that. > > > > > > > > > > > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1541960 > > > > > > > > > > > > Regards, > > > > > > Liping Mao > > > > > > > > > > > > 在 2018/9/17 04:09,“Satish Patel” 写入: > > > > > > > > > > > > Update on my last email. > > > > > > > > > > > > I am able to achieve 150kpps with queue=8 and my goal is to do 300kpps > > > > > > because some of voice application using 300kps. > > > > > > > > > > > > Here i am trying to increase rx_queue_size & tx_queue_size but its not > > > > > > working somehow. I have tired following. > > > > > > > > > > > > 1. add rx/tx size in /etc/nova/nova.conf in libvirt section - (didn't work) > > > > > > 2. add /etc/libvirtd/qemu.conf - (didn't work) > > > > > > > > > > > > I have try to edit virsh edit file but somehow my changes not > > > > > > getting reflected, i did virsh define after change and hard > > > > > > reboot guest but no luck.. how do i edit that option in xml if i want > > > > > > to do that? > > > > > > On Sun, Sep 16, 2018 at 1:41 PM Satish Patel wrote: > > > > > > > > > > > > > > I successful reproduce this error with hping3 tool and look like > > > > > > > multiqueue is our solution :) but i have few question you may have > > > > > > > answer of that. > > > > > > > > > > > > > > 1. I have created two instance (vm1.example.com & vm2.example.com) > > > > > > > > > > > > > > 2. I have flood traffic from vm1 using "hping3 vm2.example.com > > > > > > > --flood" and i have noticed drops on tap interface. ( This is without > > > > > > > multiqueue) > > > > > > > > > > > > > > 3. Enable multiqueue in image and run same test and again got packet > > > > > > > drops on tap interface ( I didn't update queue on vm2 guest, so > > > > > > > definitely i was expecting packet drops) > > > > > > > > > > > > > > 4. Now i have try to update vm2 queue using ethtool and i got > > > > > > > following error, I have 15vCPU and i was trying to add 15 queue > > > > > > > > > > > > > > [root at bar-mq ~]# ethtool -L eth0 combined 15 > > > > > > > Cannot set device channel parameters: Invalid argument > > > > > > > > > > > > > > Then i have tried 8 queue which works. 
> > > > > > > > > > > > > > [root at bar-mq ~]# ethtool -L eth0 combined 8 > > > > > > > combined unmodified, ignoring > > > > > > > no channel parameters changed, aborting > > > > > > > current values: tx 0 rx 0 other 0 combined 8 > > > > > > > > > > > > > > Now i am not seeing any packet drops on tap interface, I have measure > > > > > > > PPS and i was able to get 160kpps without packet drops. > > > > > > > > > > > > > > Question: > > > > > > > > > > > > > > 1. why i am not able to add 15 queue? ( is this NIC or driver limitation?) > > > > > > > 2. how do i automate "ethtool -L eth0 combined 8" command in instance > > > > > > > so i don't need to tell my customer to do this manually? > > > > > > > On Sun, Sep 16, 2018 at 11:53 AM Satish Patel wrote: > > > > > > > > > > > > > > > > Hi Liping, > > > > > > > > > > > > > > > > >> I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > > > > > > > > > > > > > Is there a way i can automate this last task to update queue number > > > > > > > > action after reboot vm :) otherwise i can use cloud-init to make sure > > > > > > > > all VM build with same config. > > > > > > > > On Sun, Sep 16, 2018 at 11:51 AM Satish Patel wrote: > > > > > > > > > > > > > > > > > > I am currently playing with those setting and trying to generate > > > > > > > > > traffic with hping3 tools, do you have any tool to test traffic > > > > > > > > > performance for specially udp style small packets. > > > > > > > > > > > > > > > > > > I am going to share all my result and see what do you feel because i > > > > > > > > > have noticed you went through this pain :) I will try every single > > > > > > > > > option which you suggested to make sure we are good before i move > > > > > > > > > forward to production. > > > > > > > > > On Sun, Sep 16, 2018 at 11:25 AM Liping Mao (limao) wrote: > > > > > > > > > > > > > > > > > > > > I think multi queue feature should help.(be careful to make sure the ethtool update queue number action also did after reboot the vm). > > > > > > > > > > > > > > > > > > > > Numa cpu pin and queue length will be a plus in my exp. You may need yo have performance test in your situatuon,in my case cpu numa helpped the app get very stable 720p/1080p transcoding performance. Not sure if your app get benifit. > > > > > > > > > > > > > > > > > > > > You are not using L3,this will let you avoid a lot of performance issue. And since only two instance with 80kpps packets,so in your case,HW interface should not be bottleneck too. And your Nexus 5k/7k will not be bottleneck for sure ;-) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Liping Mao > > > > > > > > > > > > > > > > > > > > > 在 2018年9月16日,23:09,Satish Patel 写道: > > > > > > > > > > > > > > > > > > > > > > Thanks Liping, > > > > > > > > > > > > > > > > > > > > > > I am using libvertd 3.9.0 version so look like i am eligible take > > > > > > > > > > > advantage of that feature. phew! > > > > > > > > > > > > > > > > > > > > > > [root at compute-47 ~]# libvirtd -V > > > > > > > > > > > libvirtd (libvirt) 3.9.0 > > > > > > > > > > > > > > > > > > > > > > Let me tell you how i am running instance on my openstack, my compute > > > > > > > > > > > has 32 core / 32G memory and i have created two instance on compute > > > > > > > > > > > node 15vcpu and 14G memory ( two instance using 30 vcpu core, i have > > > > > > > > > > > kept 2 core for compute node). 
on compute node i disabled overcommit > > > > > > > > > > > using ratio (1.0) > > > > > > > > > > > > > > > > > > > > > > I didn't configure NUMA yet because i wasn't aware of this feature, as > > > > > > > > > > > per your last post do you think numa will help to fix this issue? > > > > > > > > > > > following is my numa view > > > > > > > > > > > > > > > > > > > > > > [root at compute-47 ~]# numactl --hardware > > > > > > > > > > > available: 2 nodes (0-1) > > > > > > > > > > > node 0 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23 > > > > > > > > > > > node 0 size: 16349 MB > > > > > > > > > > > node 0 free: 133 MB > > > > > > > > > > > node 1 cpus: 8 9 10 11 12 13 14 15 24 25 26 27 28 29 30 31 > > > > > > > > > > > node 1 size: 16383 MB > > > > > > > > > > > node 1 free: 317 MB > > > > > > > > > > > node distances: > > > > > > > > > > > node 0 1 > > > > > > > > > > > 0: 10 20 > > > > > > > > > > > 1: 20 10 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I am not using any L3 router, i am using provide VLAN network and > > > > > > > > > > > using Cisco Nexus switch for my L3 function so i am not seeing any > > > > > > > > > > > bottleneck there. > > > > > > > > > > > > > > > > > > > > > > This is the 10G NIC i have on all my compute node, dual 10G port with > > > > > > > > > > > bonding (20G) > > > > > > > > > > > > > > > > > > > > > > 03:00.0 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > > > > > > > > Gigabit Ethernet (rev 10) > > > > > > > > > > > 03:00.1 Ethernet controller: Broadcom Limited NetXtreme II BCM57810 10 > > > > > > > > > > > Gigabit Ethernet (rev 10) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> On Sun, Sep 16, 2018 at 10:50 AM Liping Mao (limao) wrote: > > > > > > > > > > >> > > > > > > > > > > >> It is still possible to update rx and tx queues length if your qemu and libvirt version is higher than the version recorded in [3]. (You should possible to update directly in libvirt configuration if my memory is correct) > > > > > > > > > > >> > > > > > > > > > > >> We also have some similar use case which run audio/vedio serivcs. They are CPU consuming and have UDP small packets. Another possible tunning is using CPU pin for the vm. you can use numa awared cpu feature to get stable cpu performance ,vm network dropped packets sometimes because of the vm cpu is too busy,with numa cpu it works better performance,our way is similar with [a]. You need to create flavor with special metadata and dedicated Host Agg for numa awared VMs. Dedicated CPU is very good for media service. It makes the CPU performance stable. > > > > > > > > > > >> > > > > > > > > > > >> Another packet loss case we get is because of vm kernel, some of our app are using 32bit OS, that cause memory issue, when traffic larger then 50kpps, it dropped a lot,sometimes,it even crash. In this case, 32bit os can actually use very limited memory, we have to add swap for the vm. Hope your app is using 64 bit OS. Because 32 bit could cause tons of trouble. > > > > > > > > > > >> > > > > > > > > > > >> BTW,if you are using vrouter on L3, you’d better to move provider network(no vrouter). I did not tried DVR, but if you are running without DVR, the L3 node will be bottleneck very quick. Especially default iptables conntrack is 65535, you will reach to it and drop packet on L3, even after you tun that value, it still hard to more that 1Mpps for your network node. 
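A quick way to see how close the L3 node is to that conntrack ceiling is to read the counters straight from /proc; a small sketch (assumes the nf_conntrack module is loaded, paths are the standard Linux ones):

    def conntrack_usage():
        # Packets get dropped on the L3 node once the table is full, so watch
        # how close the current entry count is to the configured maximum.
        with open('/proc/sys/net/netfilter/nf_conntrack_count') as f:
            count = int(f.read())
        with open('/proc/sys/net/netfilter/nf_conntrack_max') as f:
            limit = int(f.read())
        return count, limit

    if __name__ == '__main__':
        count, limit = conntrack_usage()
        print('conntrack: %d / %d (%.1f%% full)' % (count, limit, 100.0 * count / limit))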
> > > > > > > > > > >> > > > > > > > > > > >> If your App more than 200kpps per compute node, you may be better also have a look your physical network driver tx/rx configuration. Most of the HW default value for tx/rx queues number and length are very poor,you may start to get packet on eth interface on physical host when rx queue is full. > > > > > > > > > > >> > > > > > > > > > > >> [a]https://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/ > > > > > > > > > > >> > > > > > > > > > > >> Regards, > > > > > > > > > > >> Liping Mao > > > > > > > > > > >> > > > > > > > > > > >> 在 2018年9月16日,21:18,Satish Patel 写道: > > > > > > > > > > >> > > > > > > > > > > >> Hi Liping, > > > > > > > > > > >> > > > > > > > > > > >> Thank you for your reply, > > > > > > > > > > >> > > > > > > > > > > >> We notice packet drops during high load, I did try txqueue and didn't help so I believe I am going to try miltiqueue. > > > > > > > > > > >> > > > > > > > > > > >> For SRIOV I have to look if I have support in my nic. > > > > > > > > > > >> > > > > > > > > > > >> We are using queens so I think queue size option not possible :( > > > > > > > > > > >> > > > > > > > > > > >> We are using voip application and traffic is udp so our pps rate is 60k to 80k per vm instance. > > > > > > > > > > >> > > > > > > > > > > >> I will share my result as soon as I try multiqueue. > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> Sent from my iPhone > > > > > > > > > > >> > > > > > > > > > > >> On Sep 16, 2018, at 2:27 AM, Liping Mao (limao) wrote: > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> Hi Satish, > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> Did your packet loss happen always or it only happened when heavy load? > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> AFAIK, if you do not tun anything, the vm tap can process about 50kpps before the tap device start to drop packets. > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> If it happened in heavy load, couple of things you can try: > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> 1) increase tap queue length, usually the default value is 500, you can try larger. (seems like you already tried) > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> 2) Try to use virtio multi queues feature , see [1]. Virtio use one queue for rx/tx in vm, with this feature you can get more queues. You can check > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> 3) In rock version, you can use [2] to increase virtio queue size, the default queues size is 256/512, you may increase it to 1024, this would help to increase pps of the tap device. > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> If all these things can not get your network performance requirement, you may need to move to use dpdk / sriov stuff to get more vm performance. 
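For point 3, once the deployment is on a release that has the options from [2] (the spec is a Rocky one), the queue sizes become plain nova.conf options on the compute node; an illustrative snippet only, check the spec for the supported interface types and minimum qemu/libvirt versions:

    [libvirt]
    # larger virtio rings help small-packet PPS; example values only
    rx_queue_size = 1024
    tx_queue_size = 1024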
> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> I did not actually used them in our env, you may refer to [3] > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> [1] https://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/libvirt-virtiomq.html > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> [2] https://specs.openstack.org/openstack/nova-specs/specs/rocky/implemented/libvirt-virtio-set-queue-sizes.html > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> [3] https://docs.openstack.org/ocata/networking-guide/config-sriov.html > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> Regards, > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> Liping Mao > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> 在 2018/9/16 13:07,“Satish Patel” 写入: > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> [root at compute-33 ~]# ifconfig tap5af7f525-5f | grep -i drop > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> RX errors 0 dropped 0 overruns 0 frame 0 > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> TX errors 0 dropped 2528788837 overruns 0 carrier 0 collisions 0 > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> Noticed tap interface dropping TX packets and even after increasing > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> txqueue from 1000 to 10000 nothing changed, still getting packet > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> drops. > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> On Sat, Sep 15, 2018 at 4:22 PM Satish Patel wrote: > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> Folks, > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> I need some advice or suggestion to find out what is going on with my > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> network, we have notice high packet loss on openstack instance and not > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> sure what is going on, same time if i check on host machine and it has > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> zero packet loss.. this is what i did for test... > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> ping 8.8.8.8 > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> from instance: 50% packet loss > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> from compute host: 0% packet loss > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> I have disabled TSO/GSO/SG setting on physical compute node but still > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> getting packet loss. 
> > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> We have 10G NIC on our network, look like something related to tap > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> interface setting.. > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> _______________________________________________ > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> Post to : openstack at lists.openstack.org > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > > > > > > > From remo at italy1.com Thu Sep 27 17:40:18 2018 From: remo at italy1.com (Remo Mattei) Date: Thu, 27 Sep 2018 10:40:18 -0700 Subject: [Openstack] Packstack different ethernet device names In-Reply-To: <4b9c59da070b48b6a047ca39e52eb245@MSX-L102.msx.ad.zih.tu-dresden.de> References: <3c94ca9bb14f4e76b4ec29060f44d470@MSX-L101.msx.ad.zih.tu-dresden.de> <4b9c59da070b48b6a047ca39e52eb245@MSX-L102.msx.ad.zih.tu-dresden.de> Message-ID: <67CAEE06-01D0-4B14-B997-9654781E5139@italy1.com> I will send it later sorry doing some production OpenStack tweakings so time has been limited Remo > Il giorno 27 set 2018, alle ore 05:10, Danny Marc Rotscher ha scritto: > > Hello Remo, > > a configuration example would be great, thank you! > > Kind regards, > Danny > > Von: Remo Mattei [mailto:remo at italy1.com] > Gesendet: Mittwoch, 26. September 2018 16:34 > An: Rotscher, Danny Marc > Cc: openstack at lists.openstack.org > Betreff: Re: [Openstack] Packstack different ethernet device names > > Yes packstack has a section to map the nic. > > I will have to check my old old config on my computer then share it. > > Remo > > Il giorno 26 set 2018, alle ore 02:56, Danny Marc Rotscher ha scritto: > > Dear all, > > it is possible to address multible network device names in the answer file for example for the tunnel interface? > Because my controller run on a vm and has the device name eth* and my hypervisor hosts have something like enp*, which is the new one. > I know I can switch back to eth* for the hypervisor hosts, but that is only the last way I would prefer. > > Kind regards, > Danny > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Remo at italy1.com Thu Sep 27 18:12:44 2018 From: Remo at italy1.com (Remo Mattei) Date: Thu, 27 Sep 2018 11:12:44 -0700 Subject: [Openstack] Packstack different ethernet device names In-Reply-To: <67CAEE06-01D0-4B14-B997-9654781E5139@italy1.com> References: <3c94ca9bb14f4e76b4ec29060f44d470@MSX-L101.msx.ad.zih.tu-dresden.de> <4b9c59da070b48b6a047ca39e52eb245@MSX-L102.msx.ad.zih.tu-dresden.de> <67CAEE06-01D0-4B14-B997-9654781E5139@italy1.com> Message-ID: <4BC25EAF-EB1E-443D-8611-1C2B9D19E4D1@italy1.com> Look for CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=physnet1:br-ex CONFIG_NEUTRON_OVS_BRIDGE_IFACES=br-ex:eth0 > On Sep 27, 2018, at 10:40, Remo Mattei wrote: > > I will send it later sorry doing some production OpenStack tweakings so time has been limited > > Remo > > Il giorno 27 set 2018, alle ore 05:10, Danny Marc Rotscher > ha scritto: > >> Hello Remo, <> >> >> a configuration example would be great, thank you! >> >> Kind regards, >> Danny >> >> Von: Remo Mattei [mailto:remo at italy1.com ] >> Gesendet: Mittwoch, 26. September 2018 16:34 >> An: Rotscher, Danny Marc > >> Cc: openstack at lists.openstack.org >> Betreff: Re: [Openstack] Packstack different ethernet device names >> >> Yes packstack has a section to map the nic. >> >> I will have to check my old old config on my computer then share it. >> >> Remo >> >> Il giorno 26 set 2018, alle ore 02:56, Danny Marc Rotscher > ha scritto: >> >> Dear all, >> >> it is possible to address multible network device names in the answer file for example for the tunnel interface? >> Because my controller run on a vm and has the device name eth* and my hypervisor hosts have something like enp*, which is the new one. >> I know I can switch back to eth* for the hypervisor hosts, but that is only the last way I would prefer. >> >> Kind regards, >> Danny >> > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From satish.txt at gmail.com Fri Sep 28 03:19:32 2018 From: satish.txt at gmail.com (Satish Patel) Date: Thu, 27 Sep 2018 23:19:32 -0400 Subject: [Openstack] [Openstack-operators] UDP Buffer Filling In-Reply-To: References: <1B42179E-E753-4D1E-A766-41ABCD8FEA62@cisco.com> Message-ID: I know this thread is old but still wanted to post my finding which may help other folks to understand issue. I am dealing with same issue in my openstack network, we are media company and dealing with lots of VoIP applications where we need to handle high stream of udp packets, Virtio-net isn't meant to handl high PPS rate, i ran couple of test and found no matter what txqueue or multiqueue you set it will start dropping packet after 50kpps, I have tried numa too but result is negative. Finally i have decided to move and and try SR-IOV and now i am very very happy, SR-IOV reduce my VM guest CPU load 50% and now my NIC can handle 200kpps without dropping any packet. I would say use "iptraf-ng" utility to find out packet rate and see if its above ~40kpps. On Fri, Jul 28, 2017 at 8:39 PM Eugene Nikanorov wrote: > > John, > > multiqueue support will require qemu 2.5+ > I wonder why do you need this feature. It only will help in case of a really huge incoming pps or bandwidth. > I'm not sure udp packet loss can be solved with this, but of course better try. > > my 2c. > > Thanks, > Eugene. 
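If iptraf-ng is not handy, the same per-interface packet rate can be read from the kernel counters; a rough sketch (the interface name is just an example):

    import time

    def pps(dev='eth0', interval=1.0):
        # Sample the rx/tx packet counters twice and return packets per second.
        def read(counter):
            with open('/sys/class/net/%s/statistics/%s' % (dev, counter)) as f:
                return int(f.read())
        rx0, tx0 = read('rx_packets'), read('tx_packets')
        time.sleep(interval)
        rx1, tx1 = read('rx_packets'), read('tx_packets')
        return (rx1 - rx0) / interval, (tx1 - tx0) / interval

    if __name__ == '__main__':
        rx, tx = pps('eth0')
        print('rx %.0f pps, tx %.0f pps' % (rx, tx))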
> > On Fri, Jul 28, 2017 at 5:00 PM, Liping Mao (limao) wrote: >> >> > We already tune these values in the VM. Would you suggest tuning them on the compute nodes as well? >> No need on compute nodes.(AFAIK) >> >> >> How much pps your vm need to handle? >> You can monitor CPU usage ,especially si to see where may drop. If you see vhost almost reach to 100% CPU ,multi queue may help in some case. >> >> Thanks. >> >> Regards, >> Liping Mao >> >> > 在 2017年7月28日,22:45,John Petrini 写道: >> > >> > We already tune these values in the VM. Would you suggest tuning them on the compute nodes as well? >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From remo at italy1.com Fri Sep 28 05:45:32 2018 From: remo at italy1.com (Remo Mattei) Date: Thu, 27 Sep 2018 22:45:32 -0700 Subject: [Openstack] [Openstack-operators] UDP Buffer Filling In-Reply-To: References: <1B42179E-E753-4D1E-A766-41ABCD8FEA62@cisco.com> Message-ID: <2049EE5F-4DAF-46A6-9A79-1D7D2C7D2D68@italy1.com> Are you using ubutu, TripleO??? Thanks > Il giorno 27 set 2018, alle ore 20:19, Satish Patel ha scritto: > > I know this thread is old but still wanted to post my finding which > may help other folks to understand issue. > > I am dealing with same issue in my openstack network, we are media > company and dealing with lots of VoIP applications where we need to > handle high stream of udp packets, Virtio-net isn't meant to handl > high PPS rate, i ran couple of test and found no matter what txqueue > or multiqueue you set it will start dropping packet after 50kpps, I > have tried numa too but result is negative. > > Finally i have decided to move and and try SR-IOV and now i am very > very happy, SR-IOV reduce my VM guest CPU load 50% and now my NIC can > handle 200kpps without dropping any packet. > > I would say use "iptraf-ng" utility to find out packet rate and see if > its above ~40kpps. > On Fri, Jul 28, 2017 at 8:39 PM Eugene Nikanorov > wrote: >> >> John, >> >> multiqueue support will require qemu 2.5+ >> I wonder why do you need this feature. It only will help in case of a really huge incoming pps or bandwidth. >> I'm not sure udp packet loss can be solved with this, but of course better try. >> >> my 2c. >> >> Thanks, >> Eugene. >> >>> On Fri, Jul 28, 2017 at 5:00 PM, Liping Mao (limao) wrote: >>> >>>> We already tune these values in the VM. Would you suggest tuning them on the compute nodes as well? >>> No need on compute nodes.(AFAIK) >>> >>> >>> How much pps your vm need to handle? >>> You can monitor CPU usage ,especially si to see where may drop. If you see vhost almost reach to 100% CPU ,multi queue may help in some case. >>> >>> Thanks. >>> >>> Regards, >>> Liping Mao >>> >>>> 在 2017年7月28日,22:45,John Petrini 写道: >>>> >>>> We already tune these values in the VM. Would you suggest tuning them on the compute nodes as well? 
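One way to confirm the drops really are the receive buffer (the sysctls being tuned in the VM, typically net.core.rmem_max and friends) rather than something further down the stack is the UDP counters in /proc/net/snmp; a climbing RcvbufErrors means the socket buffer is the bottleneck. Small sketch:

    def udp_counters():
        # /proc/net/snmp carries a header line and a value line per protocol.
        with open('/proc/net/snmp') as f:
            udp = [line.split() for line in f if line.startswith('Udp:')]
        header, values = udp[0][1:], udp[1][1:]
        return dict(zip(header, (int(v) for v in values)))

    if __name__ == '__main__':
        c = udp_counters()
        print('InDatagrams=%(InDatagrams)d InErrors=%(InErrors)d '
              'RcvbufErrors=%(RcvbufErrors)d' % c)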
>>> _______________________________________________ >>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> Post to : openstack at lists.openstack.org >>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> >> >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack From dpanarese at enter.eu Fri Sep 28 10:23:43 2018 From: dpanarese at enter.eu (Davide Panarese) Date: Fri, 28 Sep 2018 12:23:43 +0200 Subject: [Openstack] [Horizon][Keystone] Migration to keystone v3 Message-ID: Goodmorning every one, i'm finally approaching migration to keystone v3 but i want to maintain keystone v2 compatibility for all users that have custom scripts for authentication to our openstack. Migration seems to be pretty simple, change endpoint direct into database changing http://keystone:5000/v2.0 to http://keystone:5000 ; Openstack client have the capability to add /v2.0 or /v3 at the end of url retrieved from catalog. But i'm stuck with horizon dashboard, login works but compute information are not available and error log show: “ Forbidden: You are not authorized to perform the requested action: rescope a scoped token. (HTTP 403)" All other tabs works properly. I think that is a keystone issue but i don't understand why with openstack client works perfectly and with horizon not. Anyone can explain what i missed in migration? Thanks a lot, Davide Panarese -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Fri Sep 28 11:50:51 2018 From: eblock at nde.ag (Eugen Block) Date: Fri, 28 Sep 2018 11:50:51 +0000 Subject: [Openstack] [Horizon][Keystone] Migration to keystone v3 In-Reply-To: Message-ID: <20180928115051.Horde.ZC_55UzSXeK4hiOjJt6tajA@webmail.nde.ag> Hi, what is your current horizon configuration? control:~ # grep KEYSTONE_URL /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.py OPENSTACK_KEYSTONE_URL = "http://%s:5000/v3" % OPENSTACK_HOST Maybe this still configured to v2? Regards, Eugen Zitat von Davide Panarese : > Goodmorning every one, > i'm finally approaching migration to keystone v3 but i want to > maintain keystone v2 compatibility for all users that have custom > scripts for authentication to our openstack. > Migration seems to be pretty simple, change endpoint direct into > database changing http://keystone:5000/v2.0 to http://keystone:5000 > ; Openstack client have the capability to add > /v2.0 or /v3 at the end of url retrieved from catalog. > But i'm stuck with horizon dashboard, login works but compute > information are not available and error log show: > “ Forbidden: You are not authorized to perform the requested action: > rescope a scoped token. (HTTP 403)" > All other tabs works properly. > I think that is a keystone issue but i don't understand why with > openstack client works perfectly and with horizon not. > Anyone can explain what i missed in migration? 
> > Thanks a lot, > Davide Panarese From dpanarese at enter.eu Fri Sep 28 12:45:39 2018 From: dpanarese at enter.eu (Davide Panarese) Date: Fri, 28 Sep 2018 14:45:39 +0200 Subject: [Openstack] [Horizon][Keystone] Migration to keystone v3 In-Reply-To: <20180928115051.Horde.ZC_55UzSXeK4hiOjJt6tajA@webmail.nde.ag> References: <20180928115051.Horde.ZC_55UzSXeK4hiOjJt6tajA@webmail.nde.ag> Message-ID: @Paul Yes keystone:5000 is my endpoint. @Eugen OPENSTACK_KEYSTONE_URL = "http://%s/v3 " % OPENSTACK_HOST Still not working. Davide Panarese > On 28 Sep 2018, at 13:50, Eugen Block wrote: > > Hi, > > what is your current horizon configuration? > > control:~ # grep KEYSTONE_URL /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.py > OPENSTACK_KEYSTONE_URL = "http://%s:5000/v3" % OPENSTACK_HOST > > Maybe this still configured to v2? > > Regards, > Eugen > > > Zitat von Davide Panarese : > >> Goodmorning every one, >> i'm finally approaching migration to keystone v3 but i want to maintain keystone v2 compatibility for all users that have custom scripts for authentication to our openstack. >> Migration seems to be pretty simple, change endpoint direct into database changing http://keystone:5000/v2.0 to http://keystone:5000 ; Openstack client have the capability to add /v2.0 or /v3 at the end of url retrieved from catalog. >> But i'm stuck with horizon dashboard, login works but compute information are not available and error log show: >> “ Forbidden: You are not authorized to perform the requested action: rescope a scoped token. (HTTP 403)" >> All other tabs works properly. >> I think that is a keystone issue but i don't understand why with openstack client works perfectly and with horizon not. >> Anyone can explain what i missed in migration? >> >> Thanks a lot, >> Davide Panarese > > > > > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > -- > Questo messaggio e' stato analizzato con Libra ESVA ed e' risultato non infetto. > Seguire il link qui sotto per segnalarlo come spam:http://mx01.enter.it/cgi-bin/learn-msg.cgi?id=D389145856.A899A > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eblock at nde.ag Fri Sep 28 12:52:24 2018 From: eblock at nde.ag (Eugen Block) Date: Fri, 28 Sep 2018 12:52:24 +0000 Subject: [Openstack] [Horizon][Keystone] Migration to keystone v3 In-Reply-To: References: <20180928115051.Horde.ZC_55UzSXeK4hiOjJt6tajA@webmail.nde.ag> Message-ID: <20180928125224.Horde.33aqtdk0B9Ncylg-zxjA5to@webmail.nde.ag> Since nova-compute reports that failure, what is your auth_url in /etc/nova/nova.conf in the [placement] section? Zitat von Davide Panarese : > @Paul > Yes keystone:5000 is my endpoint. > > @Eugen > OPENSTACK_KEYSTONE_URL = "http://%s/v3 " % OPENSTACK_HOST > > Still not working. > > > Davide Panarese > > >> On 28 Sep 2018, at 13:50, Eugen Block wrote: >> >> Hi, >> >> what is your current horizon configuration? >> >> control:~ # grep KEYSTONE_URL >> /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.py >> OPENSTACK_KEYSTONE_URL = "http://%s:5000/v3" % OPENSTACK_HOST >> >> Maybe this still configured to v2? 
>> >> Regards, >> Eugen >> >> >> Zitat von Davide Panarese : >> >>> Goodmorning every one, >>> i'm finally approaching migration to keystone v3 but i want to >>> maintain keystone v2 compatibility for all users that have custom >>> scripts for authentication to our openstack. >>> Migration seems to be pretty simple, change endpoint direct into >>> database changing http://keystone:5000/v2.0 to >>> http://keystone:5000 ; Openstack client >>> have the capability to add /v2.0 or /v3 at the end of url >>> retrieved from catalog. >>> But i'm stuck with horizon dashboard, login works but compute >>> information are not available and error log show: >>> “ Forbidden: You are not authorized to perform the requested >>> action: rescope a scoped token. (HTTP 403)" >>> All other tabs works properly. >>> I think that is a keystone issue but i don't understand why with >>> openstack client works perfectly and with horizon not. >>> Anyone can explain what i missed in migration? >>> >>> Thanks a lot, >>> Davide Panarese >> >> >> >> >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> >> -- >> Questo messaggio e' stato analizzato con Libra ESVA ed e' risultato >> non infetto. >> Seguire il link qui sotto per segnalarlo come >> spam:http://mx01.enter.it/cgi-bin/learn-msg.cgi?id=D389145856.A899A >> >> From dpanarese at enter.eu Fri Sep 28 15:33:42 2018 From: dpanarese at enter.eu (Davide Panarese) Date: Fri, 28 Sep 2018 17:33:42 +0200 Subject: [Openstack] [Horizon][Keystone] Migration to keystone v3 In-Reply-To: <20180928125224.Horde.33aqtdk0B9Ncylg-zxjA5to@webmail.nde.ag> References: <20180928115051.Horde.ZC_55UzSXeK4hiOjJt6tajA@webmail.nde.ag> <20180928125224.Horde.33aqtdk0B9Ncylg-zxjA5to@webmail.nde.ag> Message-ID: <9F3C86CE-862D-469A-AD79-3F334CD5DB41@enter.eu> It’s not nova-compute that report the issue but keystone authentication on computing tab. As I said before, openstack cli working properly with all services, nova included. Davide Panarese Cloud & Solution Architect Enter | The open network and cloud provider Via privata Stefanardo da Vimercate, 28 20128 Milano enter.eu Mobile: +39 3386369591 Phone: +39 02 25514 837 This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited. > On 28 Sep 2018, at 14:52, Eugen Block wrote: > > Since nova-compute reports that failure, what is your auth_url in /etc/nova/nova.conf in the [placement] section? > > > > Zitat von Davide Panarese : > >> @Paul >> Yes keystone:5000 is my endpoint. >> >> @Eugen >> OPENSTACK_KEYSTONE_URL = "http://%s/v3 " % OPENSTACK_HOST >> >> Still not working. 
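A quick way to take Horizon out of the picture is to drive the v3 endpoint directly with keystoneauth; if this works, the endpoint and catalog are fine and the remaining difference is in how Horizon scopes (and re-scopes) its token. Rough sketch, URL and credentials are placeholders:

    from keystoneauth1 import session
    from keystoneauth1.identity import v3

    auth = v3.Password(auth_url='http://keystone:5000/v3',
                       username='admin', password='secret',
                       user_domain_name='Default',
                       project_name='admin',
                       project_domain_name='Default')
    sess = session.Session(auth=auth)
    # Getting a project-scoped token proves password auth and scoping work on v3.
    print(sess.get_token())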
>> >> >> Davide Panarese >> >> >>> On 28 Sep 2018, at 13:50, Eugen Block wrote: >>> >>> Hi, >>> >>> what is your current horizon configuration? >>> >>> control:~ # grep KEYSTONE_URL /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.py >>> OPENSTACK_KEYSTONE_URL = "http://%s:5000/v3" % OPENSTACK_HOST >>> >>> Maybe this still configured to v2? >>> >>> Regards, >>> Eugen >>> >>> >>> Zitat von Davide Panarese : >>> >>>> Goodmorning every one, >>>> i'm finally approaching migration to keystone v3 but i want to maintain keystone v2 compatibility for all users that have custom scripts for authentication to our openstack. >>>> Migration seems to be pretty simple, change endpoint direct into database changing http://keystone:5000/v2.0 to http://keystone:5000 ; Openstack client have the capability to add /v2.0 or /v3 at the end of url retrieved from catalog. >>>> But i'm stuck with horizon dashboard, login works but compute information are not available and error log show: >>>> “ Forbidden: You are not authorized to perform the requested action: rescope a scoped token. (HTTP 403)" >>>> All other tabs works properly. >>>> I think that is a keystone issue but i don't understand why with openstack client works perfectly and with horizon not. >>>> Anyone can explain what i missed in migration? >>>> >>>> Thanks a lot, >>>> Davide Panarese >>> >>> >>> >>> >>> _______________________________________________ >>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> Post to : openstack at lists.openstack.org >>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> >>> -- >>> Questo messaggio e' stato analizzato con Libra ESVA ed e' risultato non infetto. >>> Seguire il link qui sotto per segnalarlo come spam:http://mx01.enter.it/cgi-bin/learn-msg.cgi?id=D389145856.A899A >>> >>> > > > > > -- > Questo messaggio e' stato analizzato con Libra ESVA ed e' risultato non infetto. > Seguire il link qui sotto per segnalarlo come spam:http://mx01.enter.it/cgi-bin/learn-msg.cgi?id=9946B46CDF.A2B74 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From emccormick at cirrusseven.com Fri Sep 28 18:28:24 2018 From: emccormick at cirrusseven.com (Erik McCormick) Date: Fri, 28 Sep 2018 14:28:24 -0400 Subject: [Openstack] [Horizon][Keystone] Migration to keystone v3 In-Reply-To: <9F3C86CE-862D-469A-AD79-3F334CD5DB41@enter.eu> References: <20180928115051.Horde.ZC_55UzSXeK4hiOjJt6tajA@webmail.nde.ag> <20180928125224.Horde.33aqtdk0B9Ncylg-zxjA5to@webmail.nde.ag> <9F3C86CE-862D-469A-AD79-3F334CD5DB41@enter.eu> Message-ID: Add yourself as an admin of the domain. I think it uses a domain scored token for that tab. In V2 you would have only been admin of a project. -Erik On Fri, Sep 28, 2018, 11:47 AM Davide Panarese wrote: > It’s not nova-compute that report the issue but keystone authentication on > computing tab. > As I said before, openstack cli working properly with all services, nova > included. > > > > *Davide Panarese* > Cloud & Solution Architect > > *Enter | The open network and cloud provider* > > Via privata Stefanardo da Vimercate, 28 > 20128 Milano > enter.eu > > Mobile: +39 3386369591 > Phone: +39 02 25514 837 > > This email and any files transmitted with it are confidential and intended > solely for the use of the individual or entity to whom they are addressed. > If you have received this email in error please notify the system manager. 
> This message contains confidential information and is intended only for the > individual named. If you are not the named addressee you should not > disseminate, distribute or copy this e-mail. Please notify the sender > immediately by e-mail if you have received this e-mail by mistake and > delete this e-mail from your system. If you are not the intended recipient > you are notified that disclosing, copying, distributing or taking any > action in reliance on the contents of this information is strictly > prohibited. > > On 28 Sep 2018, at 14:52, Eugen Block wrote: > > Since nova-compute reports that failure, what is your auth_url in > /etc/nova/nova.conf in the [placement] section? > > > > Zitat von Davide Panarese : > > @Paul > Yes keystone:5000 is my endpoint. > > @Eugen > OPENSTACK_KEYSTONE_URL = "http://%s/v3 " % OPENSTACK_HOST > > Still not working. > > > Davide Panarese > > > On 28 Sep 2018, at 13:50, Eugen Block wrote: > > Hi, > > what is your current horizon configuration? > > control:~ # grep KEYSTONE_URL > /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.py > OPENSTACK_KEYSTONE_URL = "http://%s:5000/v3" % OPENSTACK_HOST > > Maybe this still configured to v2? > > Regards, > Eugen > > > Zitat von Davide Panarese : > > Goodmorning every one, > i'm finally approaching migration to keystone v3 but i want to maintain > keystone v2 compatibility for all users that have custom scripts for > authentication to our openstack. > Migration seems to be pretty simple, change endpoint direct into database > changing http://keystone:5000/v2.0 to http://keystone:5000 < > http://keystone:5000/>; Openstack client have the capability to add /v2.0 > or /v3 at the end of url retrieved from catalog. > But i'm stuck with horizon dashboard, login works but compute information > are not available and error log show: > “ Forbidden: You are not authorized to perform the requested action: > rescope a scoped token. (HTTP 403)" > All other tabs works properly. > I think that is a keystone issue but i don't understand why with openstack > client works perfectly and with horizon not. > Anyone can explain what i missed in migration? > > Thanks a lot, > Davide Panarese > > > > > > _______________________________________________ > Mailing list: > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > > -- > Questo messaggio e' stato analizzato con Libra ESVA ed e' risultato non > infetto. > Seguire il link qui sotto per segnalarlo come spam: > http://mx01.enter.it/cgi-bin/learn-msg.cgi?id=D389145856.A899A > > > > > > > -- > Questo messaggio e' stato analizzato con Libra ESVA ed e' risultato non > infetto. > Seguire il link qui sotto per segnalarlo come spam: > http://mx01.enter.it/cgi-bin/learn-msg.cgi?id=9946B46CDF.A2B74 > > > > _______________________________________________ > Mailing list: > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From satish.txt at gmail.com Fri Sep 28 20:11:10 2018 From: satish.txt at gmail.com (Satish Patel) Date: Fri, 28 Sep 2018 16:11:10 -0400 Subject: [Openstack] SR-IOV error libvirtError: Node device not found: no node device with matching name Message-ID: Folks, I am configuring SR-IOV and encounter that error, does anyone know what is that error related, I am using Queens I found this but it didn't help me http://lists.openstack.org/pipermail/openstack/2018-January/045982.html I have restarted libvert and whole compute node also but i can't see "net_enp3s1f4_00_00_00_00_00_00" device in virsh nodedev-list 2018-09-28 16:06:16.396 28663 WARNING nova.compute.monitors [req-6d056524-d0ed-48b8-b2cb-50a1fdc91dd1 - - - - -] Excluding nova.compute.monitors.cpu monitor virt_driver. Not in the list of enabled monitors (CONF.compute_monitors). 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager [req-6d056524-d0ed-48b8-b2cb-50a1fdc91dd1 - - - - -] Error updating resources for node ostack-compute-63.v1v0x.net.: libvirtError: Node device not found: no node device with matching name 'net_enp3s1f4_00_00_00_00_00_00' 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager Traceback (most recent call last): 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/compute/manager.py", line 7275, in update_available_resource_for_node 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager rt.update_available_resource(context, nodename) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 673, in update_available_resource 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager resources = self.driver.get_available_resource(nodename) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6442, in get_available_resource 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager self._get_pci_passthrough_devices() 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5938, in _get_pci_passthrough_devices 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager pci_info.append(self._get_pcidev_info(name)) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5899, in _get_pcidev_info 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager device.update(_get_device_capabilities(device, address)) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5870, in _get_device_capabilities 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager pcinet_info = self._get_pcinet_info(address) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5813, in _get_pcinet_info 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager virtdev = self._host.device_lookup_by_name(devname) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/nova/virt/libvirt/host.py", line 838, in device_lookup_by_name 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager return 
self.get_connection().nodeDeviceLookupByName(name) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/eventlet/tpool.py", line 186, in doit 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager result = proxy_call(self._autowrap, f, *args, **kwargs) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/eventlet/tpool.py", line 144, in proxy_call 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager rv = execute(f, *args, **kwargs) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/eventlet/tpool.py", line 125, in execute 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager six.reraise(c, e, tb) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/eventlet/tpool.py", line 83, in tworker 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager rv = meth(*args, **kwargs) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager File "/openstack/venvs/nova-17.0.8/lib/python2.7/site-packages/libvirt.py", line 4232, in nodeDeviceLookupByName 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager if ret is None:raise libvirtError('virNodeDeviceLookupByName() failed', conn=self) 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager libvirtError: Node device not found: no node device with matching name 'net_enp3s1f4_00_00_00_00_00_00' 2018-09-28 16:06:16.500 28663 ERROR nova.compute.manager From xin-ran.wang at intel.com Sat Sep 29 07:34:04 2018 From: xin-ran.wang at intel.com (Wang, Xin-ran) Date: Sat, 29 Sep 2018 07:34:04 +0000 Subject: [Openstack] [Nova][Cyborg] Cyborg-Nova integration -- new submitted Cyborg implementation code Message-ID: <607C549EEE0AF444B482D730EC151E71C9C313@SHSMSX101.ccr.corp.intel.com> Hi, I have noticed there are more and more developers who are interested in Cyborg at PTG, and there are also some valuable outputs after this PTG. According to the summary at this PTG(https://etherpad.openstack.org/p/cyborg-nova-ptg-stein ), 4 possible solutions about how Nova interact with Cyborg are proposed, including device profile proposal, device context etc. And the 4th solution(from Jay) is very similar with my draft implementation before PTG, and now I have submitted to upstream. FYI, the original description of 4th solution in etherpad as below: (https://etherpad.openstack.org/p/cyborg-nova-ptg-stein , L112-L123): a) CTX_UUID=$(cyborg device-context-create --resources=FPGA_GZIP:1 --requires=SOME_FOO_CAPABILITY --config-data=key1:val1) **or** cyborg device-context-create --device-profile-id=$PROFILE_ID ++ alex: the different is just create context/profile on the fly? Is $CTX_UUID a profile or a specific instance? If the former, this is #1; if the latter, it is #2 (jaypipes) it's neither. It's the equivalent of a port binding request, just for an accelerator or super amazing device thingy. It's not like #1 because the device context is (eventually) bound to an instance and a specific device/slot identifier. It's not like #2 because it's not pre-creating any device. (efried) oh, okay, so it's kind of like a "dynamic profile" - it only exists as long as this request. It's the equivalent of a Neutron port (bind request) or a Cinder attachment ID. Ight. 
b) nova boot --device-context-id=$CTX_UUID c) placement finds a compute node that has availability for 1 FPGA_GZIP and has SOME_FOO_CAPABILITY d) nova-conductor sends async call to Cyborg to identify a specific slot on the chosen compute host to start dedicating to the instance. This step could be called "pre_bind()" or something like that? My current code will work like this: when a new VM request comes in to Nova, the info about the acceleration device is stored in device-context, there will be 'resources' field and 'required traits' field. After selected a host by invoking placement, nova should call Cyborg API to allocate one accelerators(binding) on the selected host. The Cyborg's allocation API[1] will do that. But before that, nova should firstly invoke the parser() function in OS-ACC [2] to parse the device-context to the acceptable parameters in order to call Cyborg API with them. And OS-ACC also provides the attach() function for PCI device to generate the xml file. Hope my code can help to move faster for this problem, and welcome reviews. Ref. [1] Cyborg will provide 2 RESTFul APIs to do allocation(binding) and deallocation(unbinding) like described in https:/review.openstack.org/#/c/596187/ and https://review.openstack.org/#/c/597991/ . [2] The parser in OS-ACC to help parse these parameters into an acceptable format in order to invoke Cyborg API with them when binding or unbinding. The parser's related patch is the following: https://review.openstack.org/#/c/606036/. Thanks, Xin-Ran -------------- next part -------------- An HTML attachment was scrubbed... URL: From dpanarese at enter.eu Sat Sep 29 14:43:44 2018 From: dpanarese at enter.eu (Davide Panarese) Date: Sat, 29 Sep 2018 16:43:44 +0200 Subject: [Openstack] [Horizon][Keystone] Migration to keystone v3 In-Reply-To: References: <20180928115051.Horde.ZC_55UzSXeK4hiOjJt6tajA@webmail.nde.ag> <20180928125224.Horde.33aqtdk0B9Ncylg-zxjA5to@webmail.nde.ag> <9F3C86CE-862D-469A-AD79-3F334CD5DB41@enter.eu> Message-ID: I found the source of the issue. Into keystone configuration I set allow_rescope_scoped_token to false. Setting true this value horizon compute tab works. But now the question is: Why horizon try to rescope authentication token only for compute information? Thanks Davide Panarese Cloud & Solution Architect Enter | The open network and cloud provider Via privata Stefanardo da Vimercate, 28 20128 Milano enter.eu Mobile: +39 3386369591 Phone: +39 02 25514 837 This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited. > On 28 Sep 2018, at 20:28, Erik McCormick wrote: > > Add yourself as an admin of the domain. I think it uses a domain scored token for that tab. In V2 you would have only been admin of a project. 
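The rescope Erik describes can be reproduced outside Horizon: obtain a scoped token, then ask keystone to scope that token again (here to a domain, which is roughly what the admin panels do). With allow_rescope_scoped_token = false the second step is what comes back as the HTTP 403 above. Illustrative only, names and URL are placeholders:

    from keystoneauth1 import session
    from keystoneauth1.identity import v3

    password_auth = v3.Password(auth_url='http://keystone:5000/v3',
                                username='admin', password='secret',
                                user_domain_name='Default',
                                project_name='admin',
                                project_domain_name='Default')
    scoped_token = session.Session(auth=password_auth).get_token()

    # Exchange the already-scoped token for a domain-scoped one; keystone
    # refuses this exchange when rescoping of scoped tokens is disabled.
    rescope_auth = v3.Token(auth_url='http://keystone:5000/v3',
                            token=scoped_token,
                            domain_name='Default')
    print(session.Session(auth=rescope_auth).get_token())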
> 
> On Fri, Sep 28, 2018, 11:47 AM Davide Panarese wrote:
> It's not nova-compute that reports the issue but Keystone authentication on the Compute tab.
> As I said before, the OpenStack CLI is working properly with all services, Nova included.
> 
> Davide Panarese
> Cloud & Solution Architect
> 
>> On 28 Sep 2018, at 14:52, Eugen Block wrote:
>> 
>> Since nova-compute reports that failure, what is your auth_url in /etc/nova/nova.conf in the [placement] section?
>> 
>> Zitat von Davide Panarese:
>> 
>>> @Paul
>>> Yes, keystone:5000 is my endpoint.
>>> 
>>> @Eugen
>>> OPENSTACK_KEYSTONE_URL = "http://%s/v3" % OPENSTACK_HOST
>>> 
>>> Still not working.
>>> 
>>> Davide Panarese
>>> 
>>>> On 28 Sep 2018, at 13:50, Eugen Block wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> what is your current horizon configuration?
>>>> 
>>>> control:~ # grep KEYSTONE_URL /srv/www/openstack-dashboard/openstack_dashboard/local/local_settings.py
>>>> OPENSTACK_KEYSTONE_URL = "http://%s:5000/v3" % OPENSTACK_HOST
>>>> 
>>>> Maybe this is still configured for v2?
>>>> 
>>>> Regards,
>>>> Eugen
>>>> 
>>>> Zitat von Davide Panarese:
>>>> 
>>>>> Good morning everyone,
>>>>> I'm finally approaching the migration to Keystone v3, but I want to keep Keystone v2 compatibility for all users that have custom scripts for authenticating against our OpenStack.
>>>>> The migration seems to be pretty simple: change the endpoints directly in the database from http://keystone:5000/v2.0 to http://keystone:5000; the OpenStack client is able to append /v2.0 or /v3 to the URL retrieved from the catalog.
>>>>> But I'm stuck with the Horizon dashboard: login works, but compute information is not available and the error log shows:
>>>>> "Forbidden: You are not authorized to perform the requested action: rescope a scoped token. (HTTP 403)"
>>>>> All other tabs work properly.
>>>>> I think it is a Keystone issue, but I don't understand why it works perfectly with the OpenStack client and not with Horizon.
>>>>> Can anyone explain what I missed in the migration?
>>>>> 
>>>>> Thanks a lot,
>>>>> Davide Panarese
>>>> 
>>>> _______________________________________________
>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>> Post to : openstack at lists.openstack.org
>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> 
> _______________________________________________
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to : openstack at lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
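As a reference point for the v3 migration being discussed, a Horizon pointed at Keystone v3 typically only needs a handful of settings in local_settings.py. This is a minimal sketch, not a complete file; the host name and domain value are placeholders to adapt to your deployment:

    # openstack_dashboard/local/local_settings.py
    OPENSTACK_HOST = "keystone"
    OPENSTACK_KEYSTONE_URL = "http://%s:5000/v3" % OPENSTACK_HOST
    OPENSTACK_API_VERSIONS = {"identity": 3}
    OPENSTACK_KEYSTONE_MULTIDOMAIN_SUPPORT = True
    OPENSTACK_KEYSTONE_DEFAULT_DOMAIN = "Default"

With identity pinned to v3, Horizon requests domain-scoped tokens for some panels, which is why a user who was only a project admin under v2 can start seeing authorization errors there, as Erik notes above.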
From satish.txt at gmail.com  Sun Sep 30 13:57:00 2018
From: satish.txt at gmail.com (Satish Patel)
Date: Sun, 30 Sep 2018 09:57:00 -0400
Subject: [Openstack] [Openstack-operators] UDP Buffer Filling
In-Reply-To: <2049EE5F-4DAF-46A6-9A79-1D7D2C7D2D68@italy1.com>
References: <1B42179E-E753-4D1E-A766-41ABCD8FEA62@cisco.com>
 <2049EE5F-4DAF-46A6-9A79-1D7D2C7D2D68@italy1.com>
Message-ID: <38EE286C-069F-4C02-BCE9-12E60EF611C4@gmail.com>

I'm using openstack-ansible to deploy OpenStack on CentOS 7.5 (we are a 100% CentOS shop).

Sent from my iPhone

> On Sep 28, 2018, at 1:45 AM, Remo Mattei wrote:
> 
> Are you using Ubuntu, TripleO???
> 
> Thanks
> 
>> On 27 Sep 2018, at 20:19, Satish Patel wrote:
>> 
>> I know this thread is old, but I still wanted to post my findings, which
>> may help other folks understand the issue.
>> 
>> I am dealing with the same issue in my OpenStack network. We are a media
>> company with lots of VoIP applications where we need to handle a high
>> stream of UDP packets. Virtio-net isn't meant to handle a high PPS rate:
>> I ran a couple of tests and found that no matter what txqueue or
>> multiqueue settings you use, it starts dropping packets after about
>> 50 kpps. I tried NUMA tuning too, but the result was negative.
>> 
>> Finally I decided to move on and try SR-IOV, and now I am very, very
>> happy: SR-IOV reduced my VM guest CPU load by 50%, and now my NIC can
>> handle 200 kpps without dropping any packets.
>> 
>> I would say use the "iptraf-ng" utility to find out the packet rate and
>> see if it's above ~40 kpps.
>> On Fri, Jul 28, 2017 at 8:39 PM Eugene Nikanorov wrote:
>>> 
>>> John,
>>> 
>>> multiqueue support will require qemu 2.5+.
>>> I wonder why you need this feature. It will only help in the case of really huge incoming pps or bandwidth.
>>> I'm not sure UDP packet loss can be solved with this, but of course it's worth a try.
>>> 
>>> my 2c.
>>> 
>>> Thanks,
>>> Eugene.
>>> 
>>>>> On Fri, Jul 28, 2017 at 5:00 PM, Liping Mao (limao) wrote:
>>>>> 
>>>>> We already tune these values in the VM. Would you suggest tuning them on the compute nodes as well?
>>>> No need on the compute nodes (AFAIK).
>>>> 
>>>> How much pps does your VM need to handle?
>>>> You can monitor CPU usage, especially si (softirq), to see where the drops may happen. If you see vhost almost reach 100% CPU, multiqueue may help in some cases.
>>>> 
>>>> Thanks.
>>>> 
>>>> Regards,
>>>> Liping Mao
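To make Liping's suggestion concrete, the checks and the multiqueue knobs usually look roughly like the sketch below. The interface name, queue count and image reference are examples only, and hw_vif_multiqueue_enabled only takes effect on instances spawned after the image property is set.

    # Inside the guest: is the loss on the socket side or in softirq processing?
    netstat -su | grep -i errors        # "receive buffer errors" = socket buffers too small
    cat /proc/net/softnet_stat          # 2nd column per CPU = packets dropped at the backlog

    # Requesting virtio-net multiqueue (queue count is capped at the vCPU count):
    openstack image set --property hw_vif_multiqueue_enabled=true <image-uuid>
    # ...then, inside a freshly booted guest:
    ethtool -L eth0 combined 4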
>>>> 
>>>>> On Jul 28, 2017, at 22:45, John Petrini wrote:
>>>>> 
>>>>> We already tune these values in the VM. Would you suggest tuning them on the compute nodes as well?
>>>> _______________________________________________
>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>> Post to : openstack at lists.openstack.org
>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
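For completeness, the "values" John mentions tuning in the VM are typically the kernel's socket buffer and backlog limits. A sketch of the common knobs is below; the sizes are examples only and should be sized against your actual traffic, whether applied in the guest or, more rarely, on the compute node as discussed above.

    # /etc/sysctl.d/90-udp-buffers.conf
    net.core.rmem_max = 26214400
    net.core.rmem_default = 26214400
    net.core.netdev_max_backlog = 250000

    # apply without rebooting:
    # sysctl --system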