From ignaziocassano at gmail.com Sun Jul 1 10:25:24 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Sun, 1 Jul 2018 12:25:24 +0200 Subject: [Openstack-operators] diskimage-builder error Message-ID: Hi All, I just installed diskimage-builder on my CentOS 7 system. To create a CentOS 7 image I am using the same command I used 3 or 4 months ago, but with the latest diskimage-builder installed with pip I got the following error:

+ /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:20 : '[' -r /proc/meminfo ']'
++ /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:21 : awk '/^MemTotal/ { print $2 }' /proc/meminfo
+ /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:21 : total_kB=8157200
+ /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:24 : RAM_NEEDED=4
+ /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:25 : '[' 8157200 -lt 4194304 ']'
+ /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:25 : return 0
+ /usr/share/diskimage-builder/lib/common-functions:cleanup_image_dir:211 : timeout 120 sh -c 'while ! sudo umount -f /tmp/dib_image.azW5Wi4F; do sleep 1; done'
+ /usr/share/diskimage-builder/lib/common-functions:cleanup_image_dir:216 : rm -rf --one-file-system /tmp/dib_image.azW5Wi4F
+ /usr/share/diskimage-builder/lib/img-functions:trap_cleanup:46 : exit 1

Does anyone know if this is a bug? Please help me. Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrhillsman at gmail.com Mon Jul 2 01:30:57 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Sun, 1 Jul 2018 20:30:57 -0500 Subject: [Openstack-operators] Fwd: Reminder: User Committee Meeting - Monday July 2nd @1400UTC In-Reply-To: <5b366ec68bafdc64b1000004@polymail.io> References: <5b366ec68bafdc64b1000004@polymail.io> Message-ID: Hi everyone, Please be sure to join us - if not getting ready for firecrackers - on Monday July 2nd @1400UTC in #openstack-uc for the weekly User Committee meeting.
Also you can freely add to the meeting agenda here - Governance/Foundation/UserCommittee - OpenStack WIKI.OPENSTACK.ORG -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony at bakeyournoodle.com Mon Jul 2 01:34:20 2018 From: tony at bakeyournoodle.com (Tony Breeds) Date: Mon, 2 Jul 2018 11:34:20 +1000 Subject: [Openstack-operators] diskimage-builder error In-Reply-To: References: Message-ID: <20180702013419.GE21570@thor.bakeyournoodle.com> On Sun, Jul 01, 2018 at 12:25:24PM +0200, Ignazio Cassano wrote: > Hi All, > I just installed disk-image builder on my centos 7. > For creating centos7 image I am using the same command used 3 o 4 months > ago, but wiith the last diskimage-builder installed with pip I got the > following error: > > + > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:20 > : '[' -r /proc/meminfo ']' > ++ > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:21 > : awk '/^MemTotal/ { print $2 }' /proc/meminfo > + > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:21 > : total_kB=8157200 > + > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:24 > : RAM_NEEDED=4 > + > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:25 > : '[' 8157200 -lt 4194304 ']' > + > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:25 > : return 0 > + > /usr/share/diskimage-builder/lib/common-functions:cleanup_image_dir:211 > : timeout 120 sh -c 'while ! sudo umount -f /tmp/dib_image.azW5Wi4F; do > sleep 1; done' > + > /usr/share/diskimage-builder/lib/common-functions:cleanup_image_dir:216 > : rm -rf --one-file-system /tmp/dib_image.azW5Wi4F > + > /usr/share/diskimage-builder/lib/img-functions:trap_cleanup:46 > : exit 1 > > > > Anyone knows if is it a bug ? Not one I know of. 
It looks like the image-based install was close to completion: the tmpfs was unmounted, but for some reason the removal of the (empty) directory failed. The only reason I can think of is if an FS was still mounted. Are you able to send the full logs? It's possible something failed earlier and we're just seeing a secondary failure. If you can supply the logs, can you supply your command line and a link to the base image you're using? Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From ignaziocassano at gmail.com Mon Jul 2 05:55:13 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 2 Jul 2018 07:55:13 +0200 Subject: [Openstack-operators] diskimage-builder error In-Reply-To: <20180702013419.GE21570@thor.bakeyournoodle.com> References: <20180702013419.GE21570@thor.bakeyournoodle.com> Message-ID: Hi Tony, applying the patch reported here (https://review.openstack.org/#/c/561740/) the issue is solved. That patch was for another issue (distutils), but it also solves the cleanup error. In any case, I could unpatch the diskimage-builder code and send you the log. What do you think? Regards Ignazio 2018-07-02 3:34 GMT+02:00 Tony Breeds : > On Sun, Jul 01, 2018 at 12:25:24PM +0200, Ignazio Cassano wrote: > > Hi All, > > I just installed disk-image builder on my centos 7.
> > For creating centos7 image I am using the same command used 3 o 4 months > > ago, but wiith the last diskimage-builder installed with pip I got the > > following error: > > > > + > > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:20 > > : '[' -r /proc/meminfo ']' > > ++ > > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:21 > > : awk '/^MemTotal/ { print $2 }' /proc/meminfo > > + > > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:21 > > : total_kB=8157200 > > + > > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:24 > > : RAM_NEEDED=4 > > + > > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:25 > > : '[' 8157200 -lt 4194304 ']' > > + > > /usr/share/diskimage-builder/lib/common-functions:tmpfs_check:25 > > : return 0 > > + > > /usr/share/diskimage-builder/lib/common-functions:cleanup_image_dir:211 > > : timeout 120 sh -c 'while ! sudo umount -f /tmp/dib_image.azW5Wi4F; do > > sleep 1; done' > > + > > /usr/share/diskimage-builder/lib/common-functions:cleanup_image_dir:216 > > : rm -rf --one-file-system /tmp/dib_image.azW5Wi4F > > + > > /usr/share/diskimage-builder/lib/img-functions:trap_cleanup:46 > > : exit 1 > > > > > > > > Anyone knows if is it a bug ? > > Not one I know of. It looks like the image based install was close to > completion, the tmpfs was unmounted but for some reason the removal of > the (empty) directory failed. The only reason I can think of is if a > FS was still mounted. > > Are you able to send the full logs? It's possible something failed > earlier and we're just seeing a secondary failure. > > If you can supply the logs, can you supply your command line and link to > the base image you're using? > > Yours Tony. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tony at bakeyournoodle.com Mon Jul 2 06:11:45 2018 From: tony at bakeyournoodle.com (Tony Breeds) Date: Mon, 2 Jul 2018 16:11:45 +1000 Subject: [Openstack-operators] diskimage-builder error In-Reply-To: References: <20180702013419.GE21570@thor.bakeyournoodle.com> Message-ID: <20180702061144.GF21570@thor.bakeyournoodle.com> On Mon, Jul 02, 2018 at 07:55:13AM +0200, Ignazio Cassano wrote: > Hi Tony, > applying the patch reported here (https://review.openstack.org/#/c/561740/) > the issue is solved.. > The above path was related to another issue (distutils) but is solves also > the cleanup error. > Anycase I could unpatch the diskimage code and send you the log. > What do you think ? That would be helpful. Or if it's easier you can let us know how you're running DIB Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From ignaziocassano at gmail.com Mon Jul 2 06:13:39 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 2 Jul 2018 08:13:39 +0200 Subject: [Openstack-operators] diskimage-builder error In-Reply-To: <20180702061144.GF21570@thor.bakeyournoodle.com> References: <20180702013419.GE21570@thor.bakeyournoodle.com> <20180702061144.GF21570@thor.bakeyournoodle.com> Message-ID: Tony, do you mean the script I am using to create the image ? 2018-07-02 8:11 GMT+02:00 Tony Breeds : > On Mon, Jul 02, 2018 at 07:55:13AM +0200, Ignazio Cassano wrote: > > Hi Tony, > > applying the patch reported here (https://review.openstack.org/ > #/c/561740/) > > the issue is solved.. > > The above path was related to another issue (distutils) but is solves > also > > the cleanup error. > > Anycase I could unpatch the diskimage code and send you the log. > > What do you think ? > > That would be helpful. Or if it's easier you can let us know how you're > running DIB > > > Yours Tony. 
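For readers following the trace quoted earlier in the thread, the tmpfs_check lines reduce to a simple RAM-threshold test. Below is a minimal sketch of that logic (illustrative, parameterized for clarity — not the verbatim diskimage-builder code from lib/common-functions):

```shell
# Sketch of the tmpfs_check logic from the trace: diskimage-builder
# builds the image in a tmpfs only when the host has at least
# RAM_NEEDED GiB of RAM (MemTotal is reported in kB).
RAM_NEEDED=4

tmpfs_check() {
    total_kB=$1  # on a real host: awk '/^MemTotal/ { print $2 }' /proc/meminfo
    # 4 GiB expressed in kB is 4 * 1024 * 1024 = 4194304, matching the trace
    [ "$total_kB" -ge $((RAM_NEEDED * 1024 * 1024)) ]
}

# The trace shows total_kB=8157200, so '[' 8157200 -lt 4194304 ']' is
# false and tmpfs_check returns 0 (tmpfs build enabled).
tmpfs_check 8157200 && echo "tmpfs build enabled"
```

Note that the check passed in the failing run; the later umount/rm lines come from cleanup_image_dir and trap_cleanup, i.e. the error-path cleanup, which is consistent with the reading that the real failure happened earlier in the build.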
> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrhillsman at gmail.com Mon Jul 2 10:59:00 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Mon, 02 Jul 2018 03:59:00 -0700 Subject: [Openstack-operators] Reminder: User Committee Meeting - Monday July 2nd @1400UTC In-Reply-To: <5b366ec68bafdc64b1000004@polymail.io> References: <5b366ec68bafdc64b1000004@polymail.io> Message-ID: <5b3698532eef73480d000001@polymail.io> In case you did not get the reminder on Friday afternoon ;) --  Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 On Fri, Jun 29th, 2018 at 12:59 PM, Melvin Hillsman wrote: > > Hi everyone, > > > Please be sure to join us - if not getting ready for firecrackers - on > Monday July 2nd @1400UTC in #openstack-uc for weekly User Committee > meeting. > > > Also you can freely add to the meeting agenda here -  > ( > https://wiki.openstack.org/wiki/Governance/Foundation/UserCommittee#Meeting_Agenda.2FPrevious_Meeting_Logs > ) > > > > Governance/Foundation/UserCommittee - OpenStack ( > https://wiki.openstack.org/wiki/Governance/Foundation/UserCommittee#Meeting_Agenda.2FPrevious_Meeting_Logs > ) ( > https://wiki.openstack.org/wiki/Governance/Foundation/UserCommittee#Meeting_Agenda.2FPrevious_Meeting_Logs > ) WIKI.OPENSTACK.ORG ( > https://wiki.openstack.org/wiki/Governance/Foundation/UserCommittee#Meeting_Agenda.2FPrevious_Meeting_Logs > ) > > > > > > > > --  > Kind regards, > > Melvin Hillsman > mrhillsman at gmail.com > mobile: (832) 264-2646 > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mihalis68 at gmail.com Mon Jul 2 13:58:57 2018 From: mihalis68 at gmail.com (Chris Morgan) Date: Mon, 2 Jul 2018 09:58:57 -0400 Subject: [Openstack-operators] PTG survey reminder for ops Message-ID: Hello Everyone, We have 23 responses so far on the PTG survey for openstack operators to let the ops meetups team and the openstack foundation folk know preferences for the upcoming PTG in Denver. Perhaps some of you that intended to respond were, like me, sweltering in a heatwave and didn't touch your computers over the weekend. If so here is a final reminder to please share your preferences to help make this event as good as it can be. Survey link : https://www.surveymonkey.com/r/ZSLF9GB We need to close this today to allow detailed planning of the ops part of the event to proceed. Thanks to all those that already responded. I will share the results once it closes later today. Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony at bakeyournoodle.com Tue Jul 3 00:49:51 2018 From: tony at bakeyournoodle.com (Tony Breeds) Date: Tue, 3 Jul 2018 10:49:51 +1000 Subject: [Openstack-operators] diskimage-builder error In-Reply-To: References: <20180702013419.GE21570@thor.bakeyournoodle.com> <20180702061144.GF21570@thor.bakeyournoodle.com> Message-ID: <20180703004950.GA3734@thor.bakeyournoodle.com> On Mon, Jul 02, 2018 at 08:13:39AM +0200, Ignazio Cassano wrote: > Tony, do you mean the script I am using to create the image ? Yup, it'd be good to try and reproduce this outside your environment as that'll make fixing the underlying bug quicker. Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From ignaziocassano at gmail.com Tue Jul 3 04:12:21 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 3 Jul 2018 06:12:21 +0200 Subject: [Openstack-operators] diskimage-builder error In-Reply-To: <20180703004950.GA3734@thor.bakeyournoodle.com> References: <20180702013419.GE21570@thor.bakeyournoodle.com> <20180702061144.GF21570@thor.bakeyournoodle.com> <20180703004950.GA3734@thor.bakeyournoodle.com> Message-ID: Hi Tony, I sent log file and script yesterday. I hope you received them. Ignazio Il Mar 3 Lug 2018 02:49 Tony Breeds ha scritto: > On Mon, Jul 02, 2018 at 08:13:39AM +0200, Ignazio Cassano wrote: > > Tony, do you mean the script I am using to create the image ? > > Yup, it'd be good to try and reproduce this outside your environment as > that'll make fixing the underlying bug quicker. > > Yours Tony. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tony at bakeyournoodle.com Tue Jul 3 04:36:21 2018 From: tony at bakeyournoodle.com (Tony Breeds) Date: Tue, 3 Jul 2018 14:36:21 +1000 Subject: [Openstack-operators] diskimage-builder error In-Reply-To: References: <20180702013419.GE21570@thor.bakeyournoodle.com> <20180702061144.GF21570@thor.bakeyournoodle.com> <20180703004950.GA3734@thor.bakeyournoodle.com> Message-ID: <20180703043620.GB3734@thor.bakeyournoodle.com> On Tue, Jul 03, 2018 at 06:12:21AM +0200, Ignazio Cassano wrote: > Hi Tony, I sent log file and script yesterday. I hope you received them. Sorry I can't find them in any of my inboxes :( Yours Tony. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: not available URL: From ignaziocassano at gmail.com Tue Jul 3 06:08:09 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 3 Jul 2018 08:08:09 +0200 Subject: [Openstack-operators] diskimage-builder error In-Reply-To: <20180703043620.GB3734@thor.bakeyournoodle.com> References: <20180702013419.GE21570@thor.bakeyournoodle.com> <20180702061144.GF21570@thor.bakeyournoodle.com> <20180703004950.GA3734@thor.bakeyournoodle.com> <20180703043620.GB3734@thor.bakeyournoodle.com> Message-ID: Hi Tony, the message I sent is waiting for moderator approval because it is big. :-( 2018-07-03 6:36 GMT+02:00 Tony Breeds : > On Tue, Jul 03, 2018 at 06:12:21AM +0200, Ignazio Cassano wrote: > > Hi Tony, I sent log file and script yesterday. I hope you received them. > > Sorry I can't find them in any of my inboxes :( > > Yours Tony. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sergio.traldi at pd.infn.it Tue Jul 3 10:03:19 2018 From: sergio.traldi at pd.infn.it (Sergio Traldi) Date: Tue, 3 Jul 2018 12:03:19 +0200 Subject: [Openstack-operators] Ocata heat AWS::CloudFormation::WaitCondition doesn't work Message-ID: <21ec2111-aaf1-5bc3-5d3a-c34af1020e02@pd.infn.it> Hi, I have an earlier IaaS running the OpenStack Mitaka version, where my heat template with the AWS wait conditions works perfectly. Now the same template launches the first instance and never launches the second one. The relevant part of the template is: ----------------------------------------- .......
  node1_server_instance:
    type: OS::Nova::Server
    properties:
      name: "node1"
      key_name: { get_param: key_name_user }
      image: { get_param: image_centos_7 }
      flavor: "m1.small"
      networks:
        - port: { get_resource: pnode1_server_port }
      user_data_format: RAW
      user_data:
        str_replace:
          template: |
            #!/bin/bash
            curl -k -X PUT -H 'Content-Type:application/json' \
                -d '{"Status" : "SUCCESS","Reason" : "Configuration OK","UniqueId" : "NODE1","Data" : "Node1 started Configured."}' \
                "$wait_handle$"
          params:
            $wait_handle$: { get_resource: node1_instance_wait_handle }

  node1_instance_wait:
    type: "AWS::CloudFormation::WaitCondition"
    depends_on: node1_server_instance
    properties:
      Handle:
        get_resource: node1_instance_wait_handle
      Timeout: 3600

  node1_instance_wait_handle:
    type: "AWS::CloudFormation::WaitConditionHandle"

  node2_server_instance:
    type: OS::Nova::Server
    depends_on: node1_instance_wait
    properties:
      name: "node2"
......
--------------------------------------------------------------------
I tried to SSH into node1 and run the curl command with the $wait_handle$ value by hand, but I get a "User is not authorized to perform action" error:

curl -k -X PUT -H 'Content-Type:application/json' -d '{"Status" : "SUCCESS","Reason" : "Configuration OK","UniqueId" : "NODO2","Data" : "Nodo2 started Configured."}' -i "https://cloud-test.pd.infn.it:8000/v1/waitcondition/arn%3Aopenstack%3Aheat%3A%3A3beba6dd3f2648378263bc04d9c205fa%3Astacks%2Fvevever%2F66030fe2-56be-4e03-ad07-ce078a5a6f02%2Fresources%2Fnodo1_instance_wait_handle?Timestamp=2018-06-22T13%3A01%3A33Z&SignatureMethod=HmacSHA256&AWSAccessKeyId=38edd7e8c98e4e36b85331d4bca5601b&SignatureVersion=2&Signature=%2BT7%2FQVsHcvEpv63qfIe6wsGgG0enH54vEb%2FoWx5odfM%3D"

HTTP/1.1 403 AccessDenied
Content-Type: application/xml; charset=UTF-8
Content-Length: 149
Date: Fri, 22 Jun 2018 13:04:26 GMT
Connection: close

User is not authorized to perform action AccessDenied Sender

It seems to be the same error described here for the Kilo version: https://bugs.launchpad.net/openstack-ansible/+bug/1515485

I have these OpenStack versions of keystone and heat on CentOS 7:

[~]# rpm -qa | grep -e keystone -e heat | sort
openstack-heat-api-8.0.6-1.el7.noarch
openstack-heat-api-cfn-8.0.6-1.el7.noarch
openstack-heat-common-8.0.6-1.el7.noarch
openstack-heat-engine-8.0.6-1.el7.noarch
openstack-keystone-11.0.3-1.el7.noarch
python2-heatclient-1.8.2-1.el7.noarch
python2-keystoneauth1-2.18.0-1.el7.noarch
python2-keystoneclient-3.10.0-1.el7.noarch
python2-keystonemiddleware-4.14.0-1.el7.noarch
python-keystone-11.0.3-1.el7.noarch

I tried adding some configuration options for the heat clients, but with no luck. Can anyone suggest something? Cheers Sergio From mihalis68 at gmail.com Tue Jul 3 11:20:42 2018 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 3 Jul 2018 07:20:42 -0400 Subject: [Openstack-operators] Ops Survey Results! Message-ID: Question 1.
"Are you considering attending the OpenStack Project Technical Gathering (PTG) in Denver in September?"

83.33% yes
16.67% no

(24 respondents)

Question 2 "If you are considering attending, which is of more interest to you?"

17.39% A 2-day event focused on OpenStack Operators working sessions, just like prior Ops Meetups
82.61% A 5-day event including some specific operators sessions as well as community-wide sessions (SIGs, for example)

(23 respondents)

Question 3 "If you are interested in attending a longer event, which would you prefer?"

26.32% Cross-project sessions first, and then operator-focused sessions later in the week
73.68% Operator-specific content first, then time to join specific project teams

(19 respondents)

Thanks to all who responded. This will be a big help to those planning the PTG to know how to mix in the OpenStack Ops sessions into the overall agenda. Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Tue Jul 3 12:58:13 2018 From: doug at doughellmann.com (Doug Hellmann) Date: Tue, 03 Jul 2018 08:58:13 -0400 Subject: [Openstack-operators] Ops Survey Results! In-Reply-To: References: Message-ID: <1530622650-sup-4716@lrrr.local> Excerpts from Chris Morgan's message of 2018-07-03 07:20:42 -0400: > Question 1. "Are you considering attending the OpenStack Project Technical > Gathering (PTG) in Denver in September?" > > 83.33% yes > 16.67% no > > (24 respondents) How does the response rate to the survey compare to attendance at recent Ops Meetups? Doug From james.page at canonical.com Tue Jul 3 14:35:58 2018 From: james.page at canonical.com (James Page) Date: Tue, 3 Jul 2018 15:35:58 +0100 Subject: [Openstack-operators] [sig][upgrade] Todays IRC meeting Message-ID: Hi All Unfortunately I can't make today's IRC meeting at 1600 UTC. Should be back for next week, but I think we need to do some rescheduling to fit better with other ops and dev meetings.
Cheers James -------------- next part -------------- An HTML attachment was scrubbed... URL: From emccormick at cirrusseven.com Tue Jul 3 14:42:30 2018 From: emccormick at cirrusseven.com (Erik McCormick) Date: Tue, 3 Jul 2018 10:42:30 -0400 Subject: [Openstack-operators] Ops Survey Results! In-Reply-To: <1530622650-sup-4716@lrrr.local> References: <1530622650-sup-4716@lrrr.local> Message-ID: On Tue, Jul 3, 2018, 8:59 AM Doug Hellmann wrote: > Excerpts from Chris Morgan's message of 2018-07-03 07:20:42 -0400: > > Question 1. "Are you considering attending the OpenStack Project > Technical > > Gathering (PTG) in Denver in September?" > > > > 83.33% yes > > 16.67% no > > > > (24 respondents) > > How does the response rate to the survey compare to attendance at > recent Ops Meetups? > > Doug > We've been around 100ish so pretty light -Erik > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Tue Jul 3 14:45:46 2018 From: amy at demarco.com (Amy Marrich) Date: Tue, 3 Jul 2018 09:45:46 -0500 Subject: [Openstack-operators] Ops Survey Results! In-Reply-To: References: <1530622650-sup-4716@lrrr.local> Message-ID: But a quarter responses for a survey actually isn't bad statistically. Amy (spotz) On Tue, Jul 3, 2018 at 9:42 AM, Erik McCormick wrote: > > > On Tue, Jul 3, 2018, 8:59 AM Doug Hellmann wrote: > >> Excerpts from Chris Morgan's message of 2018-07-03 07:20:42 -0400: >> > Question 1. "Are you considering attending the OpenStack Project >> Technical >> > Gathering (PTG) in Denver in September?" >> > >> > 83.33% yes >> > 16.67% no >> > >> > (24 respondents) >> >> How does the response rate to the survey compare to attendance at >> recent Ops Meetups? 
>> >> Doug >> > > We've been around 100ish so pretty light > > -Erik > >> >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue Jul 3 14:47:33 2018 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 3 Jul 2018 14:47:33 +0000 Subject: [Openstack-operators] Ops Survey Results! In-Reply-To: References: <1530622650-sup-4716@lrrr.local> Message-ID: <20180703144732.r56ryvko2kkprwmp@yuggoth.org> On 2018-07-03 10:42:30 -0400 (-0400), Erik McCormick wrote: > On Tue, Jul 3, 2018, 8:59 AM Doug Hellmann wrote: > > > Excerpts from Chris Morgan's message of 2018-07-03 07:20:42 -0400: > > > Question 1. "Are you considering attending the OpenStack Project > > Technical > > > Gathering (PTG) in Denver in September?" > > > > > > 83.33% yes > > > 16.67% no > > > > > > (24 respondents) > > > > How does the response rate to the survey compare to attendance at > > recent Ops Meetups? > > > > Doug > > > > We've been around 100ish so pretty light How does it compare to respondent count from previous surveys about organizing operations-focused gatherings? -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From openstack at medberry.net Tue Jul 3 15:37:15 2018 From: openstack at medberry.net (David Medberry) Date: Tue, 3 Jul 2018 09:37:15 -0600 Subject: [Openstack-operators] Fwd: Feedback on Ops Meetup Planning meeting today after the fact In-Reply-To: References: Message-ID: I missed the ops meetup but have the scrollback of what was discussed. I'd definitely like to see upgrades (FFUpgrades, etc.) and LTS on the Ops agenda, socialized so that the distros and other concerned parties come join us. And I'd put them at the start of day two. [Looks like the actual plan may be to have us join the SIGs that have formed around those ideas, so that would be fine too as long as we're not double-booked.] I will be socializing an OpenStack Meetup the first night (as I think the PTG generally does a bigger event on Tuesday night so those arriving for Wed can attend.) But I'm open to any social events. I'm not sure what venue I will be able to arrange for that (the normal one is up in Superior near Boulder.) Stay tuned. Also, to reiterate what others said: not a big venue. The only "commute" between sessions that's more than 90 seconds is 3rd to 1st or vice versa. And it's difficult to defeat the elevator system for that. So figure 3-4 minutes for that.... -dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Tue Jul 3 15:44:38 2018 From: jimmy at openstack.org (Jimmy McArthur) Date: Tue, 03 Jul 2018 10:44:38 -0500 Subject: [Openstack-operators] [Openstack] Recovering from full outage In-Reply-To: References: Message-ID: <5B3B99E6.4070306@openstack.org> I'm adding this to the OpenStack Operators list as it's a bit better for these types of questions. Torin Woltjer wrote: > We just suffered a power outage in our data center and I'm having > trouble recovering the Openstack cluster.
All of the nodes are back > online, every instance shows active but `virsh list --all` on the > compute nodes show that all of the VMs are actually shut down. Running > `ip addr` on any of the nodes shows that none of the bridges are > present and `ip netns` shows that all of the network namespaces are > missing as well. So despite all of the neutron service running, none > of the networking appears to be active, which is concerning. How do I > solve this without recreating all of the networks? > > /*Torin Woltjer*/ > *Grand Dial Communications - A ZK Tech Inc. Company* > *616.776.1066 ext. 2006* > /*www.granddial.com */ > _______________________________________________ > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > Post to : openstack at lists.openstack.org > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Tue Jul 3 16:49:26 2018 From: doug at doughellmann.com (Doug Hellmann) Date: Tue, 03 Jul 2018 12:49:26 -0400 Subject: [Openstack-operators] Fwd: Feedback on Ops Meetup Planning meeting today after the fact In-Reply-To: References: Message-ID: <1530636431-sup-278@lrrr.local> Excerpts from David Medberry's message of 2018-07-03 09:37:15 -0600: > I missed the ops meetup but have the scrollback of what was discussed. > > I'd definitely like to see upgrades (FFUpgrades, etc) and LTS should be on > the Ops Agenda and socialized so that the distros and other concerned > parties come join us. And I'd put them at the start of day two. [Looks like > the actual plan may be to have us join the SIGs that have formed around > those ideas so that would be fine too as long as we're not double booked.] I hope folks will join the SIG meetings. One benefit of combining the PTG and Ops meetings is to be able to discuss these sorts of topics *together*. 
> I will be socializing an OpenStack Meetup the first night (as I think PTG > generally do a bigger event on Tuesday night so those arriving for Wed can > attend.) But I'm open to any social events. I'm not sure what venue I will > be able to arrange for that (the normal one is up in Superior near > Boulder.) Stay tuned. > > Also, reiterate what others said: Not a big venue. The only "commute" > between sessions that's more than 90 seconds is 3rd to 1st or visa versa. > And it's difficult to defeat the elevator system for that. So figure 3-4 > minutes for that.... > > -dave From mihalis68 at gmail.com Tue Jul 3 19:51:09 2018 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 3 Jul 2018 15:51:09 -0400 Subject: [Openstack-operators] Ops Survey Results! In-Reply-To: <20180703144732.r56ryvko2kkprwmp@yuggoth.org> References: <1530622650-sup-4716@lrrr.local> <20180703144732.r56ryvko2kkprwmp@yuggoth.org> Message-ID: I've spent some time trying to find the old polls, but I don't think we collected the numbers, we just announced the decision. Certainly for the ops meetup in mexico last august I found a link for a doodle poll about which venue, but the link is dead and the meeting minutes don't list the numbers. I agree with Erik that we normally think of these events (ops meetups not at main Summits) as being for about 100 people. Japan earlier this year was almost exactly that, mexico last august about half that, Philadelphia way over. If we take that as a rule then we had about 25% participation, which is not bad. We don't know how many people our potential audience is so in the end this is always going to be hand-wavy Chris On Tue, Jul 3, 2018 at 10:51 AM Jeremy Stanley wrote: > On 2018-07-03 10:42:30 -0400 (-0400), Erik McCormick wrote: > > On Tue, Jul 3, 2018, 8:59 AM Doug Hellmann > wrote: > > > > > Excerpts from Chris Morgan's message of 2018-07-03 07:20:42 -0400: > > > > Question 1. 
"Are you considering attending the OpenStack Project > > > Technical > > > > Gathering (PTG) in Denver in September?" > > > > > > > > 83.33% yes > > > > 16.67% no > > > > > > > > (24 respondents) > > > > > > How does the response rate to the survey compare to attendance at > > > recent Ops Meetups? > > > > > > Doug > > > > > > > We've been around 100ish so pretty light > > How does it compare to respondent count from previous surveys about > organizing operations-focused gatherings? > -- > Jeremy Stanley > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Tue Jul 3 21:34:26 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Tue, 03 Jul 2018 21:34:26 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. 
http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Tue Jul 3 21:41:45 2018 From: doug at doughellmann.com (Doug Hellmann) Date: Tue, 03 Jul 2018 17:41:45 -0400 Subject: [Openstack-operators] Ops Survey Results! 
In-Reply-To: References: <1530622650-sup-4716@lrrr.local> <20180703144732.r56ryvko2kkprwmp@yuggoth.org> Message-ID: <1530654026-sup-7379@lrrr.local> Excerpts from Chris Morgan's message of 2018-07-03 15:51:09 -0400: > I've spent some time trying to find the old polls, but I don't think we > collected the numbers, we just announced the decision. Certainly for the > ops meetup in mexico last august I found a link for a doodle poll about > which venue, but the link is dead and the meeting minutes don't list the > numbers. > > I agree with Erik that we normally think of these events (ops meetups not > at main Summits) as being for about 100 people. Japan earlier this year was > almost exactly that, mexico last august about half that, Philadelphia way > over. If we take that as a rule then we had about 25% participation, which > is not bad. We don't know how many people our potential audience is so in > the end this is always going to be hand-wavy That gives me an idea of how much weight to attach to the results. At first they seemed like they may not mean much because the numbers were low, but if there are only about 100 people attending the event anyway then maybe the results are more significant than they initially appeared. Doug From lmihaiescu at gmail.com Tue Jul 3 23:47:13 2018 From: lmihaiescu at gmail.com (George Mihaiescu) Date: Tue, 3 Jul 2018 19:47:13 -0400 Subject: [Openstack-operators] [Openstack] Recovering from full outage In-Reply-To: References: Message-ID: <06CB62EE-0078-4C35-AF6C-7E7099DBC474@gmail.com> Did you set a lock_path in the neutron’s config? > On Jul 3, 2018, at 17:34, Torin Woltjer wrote: > > The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ > > No such errors are on the compute nodes themselves. > > Torin Woltjer > > Grand Dial Communications - A ZK Tech Inc. Company > > 616.776.1066 ext. 
2006 > www.granddial.com > > From: "Torin Woltjer" > Sent: 7/3/18 5:14 PM > To: > Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" > Subject: Re: [Openstack] Recovering from full outage > Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: > http://paste.openstack.org/show/724917/ > > I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. > http://paste.openstack.org/show/724921/ > And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. > > Torin Woltjer > > Grand Dial Communications - A ZK Tech Inc. Company > > 616.776.1066 ext. 2006 > www.granddial.com > > From: George Mihaiescu > Sent: 7/3/18 11:50 AM > To: torin.woltjer at granddial.com > Subject: Re: [Openstack] Recovering from full outage > Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. > >> On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: >> We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? >> >> Torin Woltjer >> >> Grand Dial Communications - A ZK Tech Inc. 
Company >> >> 616.776.1066 ext. 2006 >> www.granddial.com >> >> _______________________________________________ >> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> Post to : openstack at lists.openstack.org >> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From laszlo.budai at gmail.com Wed Jul 4 09:09:14 2018 From: laszlo.budai at gmail.com (Budai Laszlo) Date: Wed, 4 Jul 2018 12:09:14 +0300 Subject: [Openstack-operators] [openstack-ansible] group_binds exceptions Message-ID: <7bb29350-657c-a19a-f476-a5c8940bfff2@gmail.com> Dear all, is there a way to define exceptions for the group_binds for a network definition? For instance, we have something like this:

- network:
    container_bridge: "vlan4"
    container_type: "veth"
    container_interface: "eth1"
    ip_from_q: "container"
    type: "raw"
    group_binds:
      - all_containers
      - hosts
    is_container_address: true
    is_ssh_address: true

So, instead of all_containers I would be interested in something like "all_containers except those running on the ceph nodes". Any ideas are welcome. Thank you, Laszlo From tobias.rydberg at citynetwork.eu Thu Jul 5 07:58:26 2018 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Thu, 5 Jul 2018 09:58:26 +0200 Subject: [Openstack-operators] [publiccloud-wg] Meeting this afternoon for Public Cloud WG Message-ID: Hi folks, Time for a new meeting for the Public Cloud WG. Agenda draft can be found at https://etherpad.openstack.org/p/publiccloud-wg, feel free to add items to that list.
See you all at IRC 1400 UTC in #openstack-publiccloud Cheers, Tobias -- Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED From jean-daniel.bonnetot at corp.ovh.com Thu Jul 5 08:02:34 2018 From: jean-daniel.bonnetot at corp.ovh.com (Jean-Daniel Bonnetot) Date: Thu, 5 Jul 2018 08:02:34 +0000 Subject: [Openstack-operators] [openstack-dev] [publiccloud-wg] Meeting this afternoon for Public Cloud WG In-Reply-To: References: Message-ID: <6AE18849-2F98-419B-833E-678C744A8CDE@corp.ovh.com> Sorry guys, I'm not available once again. See you next time. Jean-Daniel Bonnetot ovh.com | @pilgrimstack On 05/07/2018 09:59, "Tobias Rydberg" wrote: Hi folks, Time for a new meeting for the Public Cloud WG. Agenda draft can be found at https://etherpad.openstack.org/p/publiccloud-wg, feel free to add items to that list. See you all at IRC 1400 UTC in #openstack-publiccloud Cheers, Tobias -- Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED __________________________________________________________________________ OpenStack Development Mailing List (not for usage questions) Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev From torin.woltjer at granddial.com Thu Jul 5 12:43:43 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Thu, 05 Jul 2018 12:43:43 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. 
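George's suggestion above can be sketched in shell. This is only a hedged illustration, not a documented recovery procedure: the `active_ids` helper is hypothetical, and the `openstack` commands are assumed to be run against the affected cloud with admin credentials.

```shell
# Hypothetical helper: given "ID STATUS" pairs on stdin (one per line, as
# produced by `openstack server list -f value -c ID -c Status`), print the
# IDs of servers the API still reports as ACTIVE -- the ones whose API
# state disagrees with `virsh list --all` after the outage.
active_ids() {
  awk '$2 == "ACTIVE" { print $1 }'
}

# Intended use against a live cloud (not runnable here):
#   openstack server list --all-projects -f value -c ID -c Status \
#     | active_ids \
#     | while read -r id; do openstack server reboot --hard "$id"; done
```

A hard reboot should make nova tear down and recreate the libvirt domain; per George's advice, nova-compute.log and the neutron agent logs on the compute nodes are still worth checking afterwards.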
On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in our data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Thu Jul 5 14:30:10 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Thu, 05 Jul 2018 14:30:10 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: <5d62f81a0e864009ab7a1b12097e0b2f@granddial.com> The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext.
2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. 
And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Thu Jul 5 16:39:49 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Thu, 05 Jul 2018 16:39:49 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: <4cb6b48da9734ad1899ff99db02db307@granddial.com> Yes, I've done this. The VMs hang for awhile waiting for DHCP and eventually come up with no addresses. 
neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. 
On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmihaiescu at gmail.com Thu Jul 5 16:56:07 2018 From: lmihaiescu at gmail.com (George Mihaiescu) Date: Thu, 5 Jul 2018 12:56:07 -0400 Subject: [Openstack-operators] [Openstack] Recovering from full outage In-Reply-To: <4cb6b48da9734ad1899ff99db02db307@granddial.com> References: <4cb6b48da9734ad1899ff99db02db307@granddial.com> Message-ID: You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: > Yes, I've done this. The VMs hang for awhile waiting for DHCP and > eventually come up with no addresses. neutron-dhcp-agent has been restarted > on both controllers. 
The qdhcp netns's were all present; I stopped the > service, removed the qdhcp netns's, noted the dhcp agents show offline by > `neutron agent-list`, restarted all neutron services, noted the qdhcp > netns's were recreated, restarted a VM again and it still fails to pull an > IP address. > > *Torin Woltjer* > > *Grand Dial Communications - A ZK Tech Inc. Company* > > *616.776.1066 ext. 2006* > * www.granddial.com * > > ------------------------------ > *From*: George Mihaiescu > *Sent*: 7/5/18 10:38 AM > *To*: torin.woltjer at granddial.com > *Subject*: Re: [Openstack] Recovering from full outage > Did you restart the neutron-dhcp-agent and rebooted the VMs? > > On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer < > torin.woltjer at granddial.com> wrote: > >> The qrouter netns appears once the lock_path is specified, the neutron >> router is pingable as well. However, instances are not pingable. If I log >> in via console, the instances have not been given IP addresses, if I >> manually give them an address and route they are pingable and seem to work. >> So the router is working correctly but dhcp is not working. >> >> No errors in any of the neutron or nova logs on controllers or compute >> nodes. >> >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 2006* >> * >> www.granddial.com * >> >> ------------------------------ >> *From*: "Torin Woltjer" >> *Sent*: 7/5/18 8:53 AM >> *To*: >> *Cc*: openstack-operators at lists.openstack.org, >> openstack at lists.openstack.org >> *Subject*: Re: [Openstack] Recovering from full outage >> There is no lock path set in my neutron configuration. Does it ultimately >> matter what it is set to as long as it is consistent? Does it need to be >> set on compute nodes as well as controllers? >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 
2006* >> * >> >> www.granddial.com * >> >> ------------------------------ >> *From*: George Mihaiescu >> *Sent*: 7/3/18 7:47 PM >> *To*: torin.woltjer at granddial.com >> *Cc*: openstack-operators at lists.openstack.org, >> openstack at lists.openstack.org >> *Subject*: Re: [Openstack] Recovering from full outage >> >> Did you set a lock_path in the neutron’s config? >> >> On Jul 3, 2018, at 17:34, Torin Woltjer >> wrote: >> >> The following errors appear in the neutron-linuxbridge-agent.log on both >> controllers: >> >> >> >> >> http://paste.openstack.org/sho >> w/724930/ >> >> No such errors are on the compute nodes themselves. >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 2006* >> * >> >> >> www.granddial.com * >> >> ------------------------------ >> *From*: "Torin Woltjer" >> *Sent*: 7/3/18 5:14 PM >> *To*: >> *Cc*: "openstack-operators at lists.openstack.org" < >> openstack-operators at lists.openstack.org>, "openstack at lists.openstack.org" >> >> *Subject*: Re: [Openstack] Recovering from full outage >> Running `openstack server reboot` on an instance just causes the instance >> to be stuck in a rebooting status. Most notable of the logs is >> neutron-server.log which shows the following: >> >> >> >> >> >> >> >> http://paste.openstack.org/sho >> w/724917/ >> >> I realized that rabbitmq was in a failed state, so I bootstrapped it, >> rebooted controllers, and all of the agents show online. >> >> >> >> >> >> >> >> http://paste.openstack.org/sho >> w/724921/ >> And all of the instances can be properly started, however I cannot ping >> any of the instances floating IPs or the neutron router. And when logging >> into an instance with the console, there is no IP address on any interface. >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 
2006* >> * >> >> >> >> www.granddial.com * >> >> ------------------------------ >> *From*: George Mihaiescu >> *Sent*: 7/3/18 11:50 AM >> *To*: torin.woltjer at granddial.com >> *Subject*: Re: [Openstack] Recovering from full outage >> Try restarting them using "openstack server reboot" and also check the >> nova-compute.log and neutron agents logs on the compute nodes. >> >> On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer < >> torin.woltjer at granddial.com> wrote: >> >>> We just suffered a power outage in out data center and I'm having >>> trouble recovering the Openstack cluster. All of the nodes are back online, >>> every instance shows active but `virsh list --all` on the compute nodes >>> show that all of the VMs are actually shut down. Running `ip addr` on any >>> of the nodes shows that none of the bridges are present and `ip netns` >>> shows that all of the network namespaces are missing as well. So despite >>> all of the neutron service running, none of the networking appears to be >>> active, which is concerning. How do I solve this without recreating all of >>> the networks? >>> >>> *Torin Woltjer* >>> >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> >>> *616.776.1066 ext. 2006* >>> * >>> >>> >>> >>> >>> www.granddial.com * >>> >>> _______________________________________________ >>> Mailing list: >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> Post to : openstack at lists.openstack.org >>> Unsubscribe : >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From melwittt at gmail.com Thu Jul 5 18:55:47 2018 From: melwittt at gmail.com (melanie witt) Date: Thu, 5 Jul 2018 11:55:47 -0700 Subject: [Openstack-operators] [Openstack] [nova][api] Novaclient redirect endpoint https into http In-Reply-To: <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4@workday.com> References: <00be01d41381$d37b2940$7a717bc0$@gmail.com> <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4@workday.com> Message-ID: +openstack-dev@ On Wed, 4 Jul 2018 14:50:26 +0000, Bogdan Katynski wrote: >> But, I can not use nova command, endpoint nova have been redirected from https to http. Here:http://prntscr.com/k2e8s6 (command: nova –insecure service list) > First of all, it seems that the nova client is hitting /v2.1 instead of /v2.1/ URI and this seems to be triggering the redirect. > > Since openstack CLI works, I presume it must be using the correct URL and hence it’s not getting redirected. > >> >> And this is error log: Unable to establish connection tohttp://192.168.30.70:8774/v2.1/: ('Connection aborted.', BadStatusLine("''",)) >> > Looks to me that nova-api does a redirect to an absolute URL. I suspect SSL is terminated on the HAProxy and nova-api itself is configured without SSL so it redirects to an http URL. > > In my opinion, nova would be more load-balancer friendly if it used a relative URI in the redirect but that’s outside of the scope of this question and since I don’t know the context behind choosing the absolute URL, I could be wrong on that. Thanks for mentioning this. We do have a bug open in python-novaclient around a similar issue [1]. I've added comments based on this thread and will consult with the API subteam to see if there's something we can do about this in nova-api. 
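Bogdan's point about absolute versus relative redirects can be illustrated with a small sketch. This is not nova's or novaclient's actual code — just a hypothetical model of how a client resolves a `Location` header, using made-up hostnames. It shows why a backend behind an SSL-terminating HAProxy that emits an absolute `http://...` Location downgrades the client to plain http, while a root-relative Location would preserve the scheme and host the client originally used.

```shell
# Resolve a redirect Location against the URL the client originally used.
# Simplified model: handles only absolute URLs and root-relative paths.
resolve_location() {
  base=$1; location=$2
  proto=${base%%://*}                                # e.g. "https"
  hostport=${base#*://}; hostport=${hostport%%/*}    # e.g. "cloud.example:8774"
  case $location in
    /*) printf '%s://%s%s\n' "$proto" "$hostport" "$location" ;;  # relative: scheme kept
    *)  printf '%s\n' "$location" ;;                              # absolute: taken verbatim
  esac
}

# Relative Location keeps https (load-balancer friendly):
resolve_location "https://cloud.example:8774/v2.1" "/v2.1/"
# -> https://cloud.example:8774/v2.1/

# Absolute Location from a non-SSL backend downgrades the client to http:
resolve_location "https://cloud.example:8774/v2.1" "http://192.168.30.70:8774/v2.1/"
# -> http://192.168.30.70:8774/v2.1/
```

The second case models the failure reported in the thread: the client follows the redirect to a plain-http URL and the connection attempt fails.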
-melanie [1] https://bugs.launchpad.net/python-novaclient/+bug/1776928 From torin.woltjer at granddial.com Thu Jul 5 20:06:17 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Thu, 05 Jul 2018 20:06:17 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. I created a new instance from an Ubuntu 18.04 image to test with; the hostname was not set to the name of the instance and I could not log in as the users I had specified in the configuration. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 12:57 PM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: Yes, I've done this. The VMs hang for awhile waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs?
On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log, which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances' floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agent logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in our data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc.
Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From mordred at inaugust.com Thu Jul 5 20:10:08 2018 From: mordred at inaugust.com (Monty Taylor) Date: Thu, 5 Jul 2018 15:10:08 -0500 Subject: [Openstack-operators] [openstack-dev] [Openstack] [nova][api] Novaclient redirect endpoint https into http In-Reply-To: References: <00be01d41381$d37b2940$7a717bc0$@gmail.com> <0D8F95CB-0AAB-45FD-ADC8-3B917C1460D4@workday.com> Message-ID: On 07/05/2018 01:55 PM, melanie witt wrote: > +openstack-dev@ > > On Wed, 4 Jul 2018 14:50:26 +0000, Bogdan Katynski wrote: >>> But, I can not use nova command, endpoint nova have been redirected >>> from https to http. Here:http://prntscr.com/k2e8s6  (command: nova >>> –insecure service list) >> First of all, it seems that the nova client is hitting /v2.1 instead >> of /v2.1/ URI and this seems to be triggering the redirect. >> >> Since openstack CLI works, I presume it must be using the correct URL >> and hence it’s not getting redirected. >> >>> And this is error log: Unable to establish connection >>> tohttp://192.168.30.70:8774/v2.1/: ('Connection aborted.', >>> BadStatusLine("''",)) >> Looks to me that nova-api does a redirect to an absolute URL. I >> suspect SSL is terminated on the HAProxy and nova-api itself is >> configured without SSL so it redirects to an http URL. >> >> In my opinion, nova would be more load-balancer friendly if it used a >> relative URI in the redirect but that’s outside of the scope of this >> question and since I don’t know the context behind choosing the >> absolute URL, I could be wrong on that. > > Thanks for mentioning this. 
We do have a bug open in python-novaclient > around a similar issue [1]. I've added comments based on this thread and > will consult with the API subteam to see if there's something we can do > about this in nova-api. A similar thing came up the other day related to keystone and version discovery. Version discovery documents tend to return full urls - even though relative urls would make public/internal API endpoints work better. (also, sometimes people don't configure things properly and the version discovery url winds up being incorrect) In shade/sdk - we actually construct a wholly-new discovery url based on the url used for the catalog and the url in the discovery document since we've learned that the version discovery urls are frequently broken. This is problematic because SOMETIMES people have public urls deployed as a sub-url and internal urls deployed on a port - so you have: Catalog: public: https://example.com/compute internal: https://compute.example.com:1234 Version discovery: https://example.com/compute/v2.1 When we go to combine the catalog url and the versioned url, if the user is hitting internal, we produce https://compute.example.com:1234/compute/v2.1 - because we have no way of systematically knowing that /compute should also be stripped. VERY LONG WINDED WAY of saying 2 things: a) Relative URLs would be *way* friendlier (and incidentally are supported by keystoneauth, openstacksdk and shade - and are written up as being a thing people *should* support in the documents about API consumption) b) Can we get agreement that changing behavior to return or redirect to a relative URL would not be considered an api contract break?
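The combination problem described above can be made concrete with plain string handling. A minimal sketch of the naive join, using the example endpoints from the message (the logic is an illustration of the failure mode, not keystoneauth's actual code):

```shell
# Internal catalog endpoint and the (public) versioned discovery URL:
internal_catalog="https://compute.example.com:1234"
discovery_url="https://example.com/compute/v2.1"

# Naive combination: strip the public scheme+host, keep the path,
# and append it to the internal endpoint.
discovery_path="${discovery_url#https://example.com}"   # -> /compute/v2.1
combined="${internal_catalog}${discovery_path}"

# Nothing tells the client that the /compute prefix is public-only:
echo "$combined"   # https://compute.example.com:1234/compute/v2.1
```

A relative URL in the discovery document (e.g. just `v2.1/`) would sidestep the problem entirely, which is point a) above.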
(it's possible the answer to this is 'no' - so it's a real question) Monty From jp.methot at planethoster.info Thu Jul 5 21:30:11 2018 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Thu, 5 Jul 2018 17:30:11 -0400 Subject: [Openstack-operators] Storage concerns when switching from a single controller to a HA setup Message-ID: Hi, We’ve been running on Openstack for several years now and our setup has always counted a single controller. We are currently testing switching to a dual controller HA solution, but an unexpected issue has appeared, regarding storage. See, we use Dell compellent SAN for our block devices. I notice that when I create a volume on one controller, I am unable to make any operation on the same volume on the second controller (this is with an active/passive cinder-volume). Worse, this affects VMs directly as they can’t be migrated if the active controller isn’t the one that created their block device. I know this issue doesn’t happen on Ceph, so I’ve been wondering, is this a limitation of Openstack or the SAN driver? Also, is there actually a way to reach even active-passive high availability with this current storage solution? Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgsousa at gmail.com Thu Jul 5 22:57:44 2018 From: pgsousa at gmail.com (Pedro Sousa) Date: Thu, 5 Jul 2018 23:57:44 +0100 Subject: [Openstack-operators] [Openstack] Recovering from full outage In-Reply-To: References: Message-ID: Hi, that could be a problem with neutron metadata service, check the logs. Have you considered that the outage might have corrupted your databases, neutron, nova, etc? BR On Thu, Jul 5, 2018 at 9:07 PM Torin Woltjer wrote: > Are IP addresses set by cloud-init on boot? I noticed that cloud-init > isn't working on my VMs. 
created a new instance from an ubuntu 18.04 image > to test with, the hostname was not set to the name of the instance and > could not login as users I had specified in the configuration. > > *Torin Woltjer* > > *Grand Dial Communications - A ZK Tech Inc. Company* > > *616.776.1066 ext. 2006* > *www.granddial.com * > > ------------------------------ > *From*: George Mihaiescu > *Sent*: 7/5/18 12:57 PM > *To*: torin.woltjer at granddial.com > *Cc*: "openstack at lists.openstack.org" , " > openstack-operators at lists.openstack.org" < > openstack-operators at lists.openstack.org> > *Subject*: Re: [Openstack] Recovering from full outage > You should tcpdump inside the qdhcp namespace to see if the requests make > it there, and also check iptables rules on the compute nodes for the return > traffic. > > > On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer < > torin.woltjer at granddial.com> wrote: > >> Yes, I've done this. The VMs hang for awhile waiting for DHCP and >> eventually come up with no addresses. neutron-dhcp-agent has been restarted >> on both controllers. The qdhcp netns's were all present; I stopped the >> service, removed the qdhcp netns's, noted the dhcp agents show offline by >> `neutron agent-list`, restarted all neutron services, noted the qdhcp >> netns's were recreated, restarted a VM again and it still fails to pull an >> IP address. >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 2006* >> * www.granddial.com >> * >> >> ------------------------------ >> *From*: George Mihaiescu >> *Sent*: 7/5/18 10:38 AM >> *To*: torin.woltjer at granddial.com >> *Subject*: Re: [Openstack] Recovering from full outage >> Did you restart the neutron-dhcp-agent and rebooted the VMs? >> >> On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer < >> torin.woltjer at granddial.com> wrote: >> >>> The qrouter netns appears once the lock_path is specified, the neutron >>> router is pingable as well. 
However, instances are not pingable. If I log >>> in via console, the instances have not been given IP addresses, if I >>> manually give them an address and route they are pingable and seem to work. >>> So the router is working correctly but dhcp is not working. >>> >>> No errors in any of the neutron or nova logs on controllers or compute >>> nodes. >>> >>> >>> *Torin Woltjer* >>> >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> >>> *616.776.1066 ext. 2006* >>> * >>> www.granddial.com >>> * >>> >>> ------------------------------ >>> *From*: "Torin Woltjer" >>> *Sent*: 7/5/18 8:53 AM >>> *To*: >>> *Cc*: openstack-operators at lists.openstack.org, >>> openstack at lists.openstack.org >>> *Subject*: Re: [Openstack] Recovering from full outage >>> There is no lock path set in my neutron configuration. Does it >>> ultimately matter what it is set to as long as it is consistent? Does it >>> need to be set on compute nodes as well as controllers? >>> >>> *Torin Woltjer* >>> >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> >>> *616.776.1066 ext. 2006* >>> * >>> >>> www.granddial.com >>> * >>> >>> ------------------------------ >>> *From*: George Mihaiescu >>> *Sent*: 7/3/18 7:47 PM >>> *To*: torin.woltjer at granddial.com >>> *Cc*: openstack-operators at lists.openstack.org, >>> openstack at lists.openstack.org >>> *Subject*: Re: [Openstack] Recovering from full outage >>> >>> Did you set a lock_path in the neutron’s config? >>> >>> On Jul 3, 2018, at 17:34, Torin Woltjer >>> wrote: >>> >>> The following errors appear in the neutron-linuxbridge-agent.log on both >>> controllers: >>> >>> >>> >>> >>> >>> >>> http://paste.openstack.org/show/724930/ >>> >>> No such errors are on the compute nodes themselves. >>> >>> *Torin Woltjer* >>> >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> >>> *616.776.1066 ext. 
2006* >>> www.granddial.com >>> ------------------------------ >>> *From*: "Torin Woltjer" >>> *Sent*: 7/3/18 5:14 PM >>> *To*: >>> *Cc*: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" >>> *Subject*: Re: [Openstack] Recovering from full outage >>> Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log, which shows the following: >>> http://paste.openstack.org/show/724917/ >>> I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. >>> http://paste.openstack.org/show/724921/ >>> And all of the instances can be properly started, however I cannot ping any of the instances' floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. >>> *Torin Woltjer* >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> *616.776.1066 ext. 2006* >>> www.granddial.com >>> ------------------------------ >>> *From*: George Mihaiescu >>> *Sent*: 7/3/18 11:50 AM >>> *To*: torin.woltjer at granddial.com >>> *Subject*: Re: [Openstack] Recovering from full outage >>> Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agent logs on the compute nodes. >>> On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: >>>> We just suffered a power outage in our data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down.
Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? >>>> *Torin Woltjer* >>>> *Grand Dial Communications - A ZK Tech Inc. Company* >>>> *616.776.1066 ext. 2006* >>>> www.granddial.com >>>> _______________________________________________ >>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>>> Post to : openstack at lists.openstack.org >>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.zunker at codecentric.cloud Fri Jul 6 08:23:43 2018 From: christian.zunker at codecentric.cloud (Christian Zunker) Date: Fri, 6 Jul 2018 10:23:43 +0200 Subject: [Openstack-operators] Storage concerns when switching from a single controller to a HA setup In-Reply-To: References: Message-ID: Hi Jean-Philippe, we had the same issue with ceph as backend.
This fixed the problem in our setup: https://ask.openstack.org/en/question/87545/cinder-high-availability/ Although the above link talks about an active-active setup, the official docs mention the hostname in the configuration also for an active-passive setup: https://docs.openstack.org/ha-guide/storage-ha-block.html#configure-block-storage-api-service regards Christian Zunker Jean-Philippe Méthot schrieb am Fr., 6. Juli 2018 um 07:11 Uhr: > Hi, > > We’ve been running on Openstack for several years now and our setup has > always counted a single controller. We are currently testing switching to a > dual controller HA solution, but an unexpected issue has appeared, > regarding storage. See, we use Dell compellent SAN for our block devices. I > notice that when I create a volume on one controller, I am unable to make > any operation on the same volume on the second controller (this is with an > active/passive cinder-volume). Worse, this affects VMs directly as they > can’t be migrated if the active controller isn’t the one that created their > block device. > > I know this issue doesn’t happen on Ceph, so I’ve been wondering, is this > a limitation of Openstack or the SAN driver? Also, is there actually a way > to reach even active-passive high availability with this current storage > solution? > > > Jean-Philippe Méthot > Openstack system administrator > Administrateur système Openstack > PlanetHoster inc. > > > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From molenkam at uwo.ca Fri Jul 6 11:55:17 2018 From: molenkam at uwo.ca (Gary Molenkamp) Date: Fri, 6 Jul 2018 07:55:17 -0400 Subject: [Openstack-operators] After power outage, nearly all vm volumes corrupted and unmountable Message-ID: <5794a4af-03d9-1159-c385-aed7c277675e@uwo.ca> Good morning all, After losing all power to our DC last night due to a storm, nearly all of the volumes in our Pike cluster are unmountable.  Of the 30 VMs in use at the time, only one has been able to successfully mount and boot from its rootfs.   We are using Ceph as the backend storage to cinder and glance.  Any help or pointers to bring this back online would be appreciated.  What most of the volumes are seeing is [    2.622252] SGI XFS with ACLs, security attributes, no debug enabled [    2.629285] XFS (sda1): Mounting V5 Filesystem [    2.832223] sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [    2.838412] sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] [    2.842383] sd 2:0:0:0: [sda] Add. Sense: I/O process terminated [    2.846152] sd 2:0:0:0: [sda] CDB: Write(10) 2a 00 00 80 2c 19 00 04 00 00 [    2.850146] blk_update_request: I/O error, dev sda, sector 8399897 or [    2.590178] EXT4-fs (vda1): INFO: recovery required on readonly filesystem [    2.594319] EXT4-fs (vda1): write access will be enabled during recovery [    2.957742] print_req_error: I/O error, dev vda, sector 227328 [    2.962468] Buffer I/O error on dev vda1, logical block 0, lost async page write [    2.967933] Buffer I/O error on dev vda1, logical block 1, lost async page write [    2.973076] print_req_error: I/O error, dev vda, sector 229384 As a test for one of the less critical vms, I deleted the vm and mounted the volume on the one VM I managed to start.  
The results were not promising: # dmesg |tail [    5.136862] type=1305 audit(1530847244.811:4): audit_pid=496 old=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1 [    7.726331] nf_conntrack version 0.5.0 (65536 buckets, 262144 max) [29374.967315] scsi 2:0:0:1: Direct-Access     QEMU     QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5 [29374.988104] sd 2:0:0:1: [sdb] 83886080 512-byte logical blocks: (42.9 GB/40.0 GiB) [29374.991126] sd 2:0:0:1: Attached scsi generic sg1 type 0 [29374.995302] sd 2:0:0:1: [sdb] Write Protect is off [29374.997109] sd 2:0:0:1: [sdb] Mode Sense: 63 00 00 08 [29374.997186] sd 2:0:0:1: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [29375.005968]  sdb: sdb1 [29375.007746] sd 2:0:0:1: [sdb] Attached SCSI disk # parted /dev/sdb GNU Parted 3.1 Using /dev/sdb Welcome to GNU Parted! Type 'help' to view a list of commands. (parted) p Model: QEMU QEMU HARDDISK (scsi) Disk /dev/sdb: 42.9GB Sector size (logical/physical): 512B/512B Partition Table: msdos Disk Flags: Number  Start   End     Size    Type     File system  Flags  1      1049kB  42.9GB  42.9GB  primary  xfs          boot # mount -t xfs /dev/sdb temp mount: wrong fs type, bad option, bad superblock on /dev/sdb,        missing codepage or helper program, or other error        In some cases useful info is found in syslog - try        dmesg | tail or so. # xfs_repair /dev/sdb Phase 1 - find and verify superblock... bad primary superblock - bad magic number !!! attempting to find secondary superblock... Which eventually fails.   The ceph cluster looks healthy, I can export the volumes from rbd.  I can find no other errors in ceph of openstack indicating a fault in either system.     - Is this recoverable?     - What happened to all of these volumes and can this be prevented from occurring again?  Note that any shutdown vm at the time of the outage appears to be fine. 
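One conservative way to approach the "is this recoverable?" question is to attempt repair on a copy rather than on the live volume: export the RBD image, attach the export as a loop device, and dry-run xfs_repair against it first. This is a generic sketch with illustrative pool and image names, not a procedure taken from this thread:

```shell
# Export the Cinder volume's RBD image to a file (pool/image names illustrative):
rbd export volumes/volume-0a1b2c3d /var/tmp/volume-0a1b2c3d.raw

# Attach the copy as a loop device, scanning it for partitions (-P):
loopdev=$(sudo losetup --find --show -P /var/tmp/volume-0a1b2c3d.raw)

# Dry run first: -n reports problems without writing anything.
sudo xfs_repair -n ${loopdev}p1

# Only repair the copy once the dry-run output looks sane:
sudo xfs_repair ${loopdev}p1
```

Working on the export means the original image stays untouched while the root cause is still being diagnosed.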
Relevant versions:     Base OS:  all Centos 7.5     Ceph:  Luminous 12.2.5-0     Openstack:  Latest Pike releases in centos-release-openstack-pike-1-1         nova 16.1.4-1         cinder  11.1.1-1 -- Gary Molenkamp Computer Science/Science Technology Services Systems Administrator University of Western Ontario molenkam at uwo.ca http://www.csd.uwo.ca (519) 661-2111 x86882 (519) 661-3566 From molenkam at uwo.ca Fri Jul 6 13:17:20 2018 From: molenkam at uwo.ca (Gary Molenkamp) Date: Fri, 6 Jul 2018 09:17:20 -0400 Subject: [Openstack-operators] [ceph-users] After power outage, nearly all vm volumes corrupted and unmountable In-Reply-To: References: <5794a4af-03d9-1159-c385-aed7c277675e@uwo.ca> Message-ID: <0d6e8673-0bc4-43c8-7776-f4debded6d42@uwo.ca> Thank you Jason,  Not sure how I missed that step. On 2018-07-06 08:34 AM, Jason Dillaman wrote: > There have been several similar reports on the mailing list about this > [1][2][3][4] that are always a result of skipping step 6 from the > Luminous upgrade guide [5]. The new (starting Luminous) 'profile > rbd'-style caps are designed to try to simplify caps going forward [6]. > > TL;DR: your Openstack CephX users need to have permission to blacklist > dead clients that failed to properly release the exclusive lock. > > [1] > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-November/022278.html > [2] > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-November/022694.html > [3] > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-May/026496.html > [4] https://www.spinics.net/lists/ceph-users/msg45665.html > [5] > http://docs.ceph.com/docs/master/releases/luminous/#upgrade-from-jewel-or-kraken > [6] > http://docs.ceph.com/docs/luminous/rbd/rbd-openstack/#setup-ceph-client-authentication > > > On Fri, Jul 6, 2018 at 7:55 AM Gary Molenkamp > wrote: > > Good morning all, > > After losing all power to our DC last night due to a storm, nearly > all > of the volumes in our Pike cluster are unmountable.  
Of the 30 VMs in > use at the time, only one has been able to successfully mount and > boot > from its rootfs.   We are using Ceph as the backend storage to cinder > and glance.  Any help or pointers to bring this back online would be > appreciated. > >   What most of the volumes are seeing is > > [    2.622252] SGI XFS with ACLs, security attributes, no debug > enabled > [    2.629285] XFS (sda1): Mounting V5 Filesystem > [    2.832223] sd 2:0:0:0: [sda] FAILED Result: hostbyte=DID_OK > driverbyte=DRIVER_SENSE > [    2.838412] sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] > [    2.842383] sd 2:0:0:0: [sda] Add. Sense: I/O process terminated > [    2.846152] sd 2:0:0:0: [sda] CDB: Write(10) 2a 00 00 80 2c 19 > 00 04 > 00 00 > [    2.850146] blk_update_request: I/O error, dev sda, sector 8399897 > > or > > [    2.590178] EXT4-fs (vda1): INFO: recovery required on readonly > filesystem > [    2.594319] EXT4-fs (vda1): write access will be enabled during > recovery > [    2.957742] print_req_error: I/O error, dev vda, sector 227328 > [    2.962468] Buffer I/O error on dev vda1, logical block 0, lost > async > page write > [    2.967933] Buffer I/O error on dev vda1, logical block 1, lost > async > page write > [    2.973076] print_req_error: I/O error, dev vda, sector 229384 > > As a test for one of the less critical vms, I deleted the vm and > mounted > the volume on the one VM I managed to start.  
The results were not > promising: > > > # dmesg |tail > [    5.136862] type=1305 audit(1530847244.811:4): audit_pid=496 old=0 > auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 > res=1 > [    7.726331] nf_conntrack version 0.5.0 (65536 buckets, 262144 max) > [29374.967315] scsi 2:0:0:1: Direct-Access     QEMU     QEMU HARDDISK > 2.5+ PQ: 0 ANSI: 5 > [29374.988104] sd 2:0:0:1: [sdb] 83886080 512-byte logical blocks: > (42.9 > GB/40.0 GiB) > [29374.991126] sd 2:0:0:1: Attached scsi generic sg1 type 0 > [29374.995302] sd 2:0:0:1: [sdb] Write Protect is off > [29374.997109] sd 2:0:0:1: [sdb] Mode Sense: 63 00 00 08 > [29374.997186] sd 2:0:0:1: [sdb] Write cache: enabled, read cache: > enabled, doesn't support DPO or FUA > [29375.005968]  sdb: sdb1 > [29375.007746] sd 2:0:0:1: [sdb] Attached SCSI disk > > # parted /dev/sdb > GNU Parted 3.1 > Using /dev/sdb > Welcome to GNU Parted! Type 'help' to view a list of commands. > (parted) p > Model: QEMU QEMU HARDDISK (scsi) > Disk /dev/sdb: 42.9GB > Sector size (logical/physical): 512B/512B > Partition Table: msdos > Disk Flags: > > Number  Start   End     Size    Type     File system  Flags >   1      1049kB  42.9GB  42.9GB  primary  xfs          boot > > # mount -t xfs /dev/sdb temp > mount: wrong fs type, bad option, bad superblock on /dev/sdb, >         missing codepage or helper program, or other error > >         In some cases useful info is found in syslog - try >         dmesg | tail or so. > > # xfs_repair /dev/sdb > Phase 1 - find and verify superblock... > bad primary superblock - bad magic number !!! > > attempting to find secondary superblock... > > > > Which eventually fails.   The ceph cluster looks healthy, I can > export > the volumes from rbd.  I can find no other errors in ceph of > openstack > indicating a fault in either system. > >      - Is this recoverable? > >      - What happened to all of these volumes and can this be > prevented > from occurring again?  
Note that any shutdown vm at the time of the > outage appears to be fine. > > > Relevant versions: > >      Base OS:  all Centos 7.5 > >      Ceph:  Luminous 12.2.5-0 > >      Openstack:  Latest Pike releases in > centos-release-openstack-pike-1-1 > >          nova 16.1.4-1 > >          cinder  11.1.1-1 > > > > -- > Gary Molenkamp                  Computer Science/Science > Technology Services > Systems Administrator           University of Western Ontario > molenkam at uwo.ca http://www.csd.uwo.ca > (519) 661-2111 x86882           (519) 661-3566 > > _______________________________________________ > ceph-users mailing list > ceph-users at lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > -- > Jason -- Gary Molenkamp Computer Science/Science Technology Services Systems Administrator University of Western Ontario molenkam at uwo.ca http://www.csd.uwo.ca (519) 661-2111 x86882 (519) 661-3566 -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Fri Jul 6 13:38:55 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Fri, 06 Jul 2018 13:38:55 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: I have done tcpdumps on both the controllers and on a compute node. Controller: `ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67` `tcpdump -vnes0 -i any port 67` Compute: `tcpdump -vnes0 -i brqd85c2a00-a6 port 68` For the first command on the controller, there are no packets captured at all. The second command on the controller captures packets, but they don't appear to be relevant to openstack. The dump from the compute node shows constant requests are getting sent by openstack instances. In summary; DHCP requests are being sent, but are never received. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 4:50 PM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage The cloud-init requires network connectivity by default in order to reach the metadata server for the hostname, ssh-key, etc You can configure cloud-init to use the config-drive, but the lack of network connectivity will make the instance useless anyway, even though it will have you ssh-key and hostname... Did you check the things I told you? On Jul 5, 2018, at 16:06, Torin Woltjer wrote: Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. created a new instance from an ubuntu 18.04 image to test with, the hostname was not set to the name of the instance and could not login as users I had specified in the configuration. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 12:57 PM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: Yes, I've done this. The VMs hang for awhile waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. 
Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to: openstack at lists.openstack.org Unsubscribe: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmihaiescu at gmail.com Fri Jul 6 15:14:23 2018 From: lmihaiescu at gmail.com (George Mihaiescu) Date: Fri, 6 Jul 2018 11:14:23 -0400 Subject: [Openstack-operators] [Openstack] Recovering from full outage In-Reply-To: References: Message-ID: Can you manually assign an IP address to a VM and once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least. Also, on the controller node where the dhcp server for that network is, check the "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" and make sure there are entries corresponding to your instances. In my experience, if neutron is broken after working fine (so excluding any misconfiguration), then an agent is out of sync and a restart usually fixes things. On Fri, Jul 6, 2018 at 9:38 AM, Torin Woltjer wrote: > I have done tcpdumps on both the controllers and on a compute node. > Controller: > `ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67` > `tcpdump -vnes0 -i any port 67` > Compute: > `tcpdump -vnes0 -i brqd85c2a00-a6 port 68` > For the first command on the controller, there are no packets captured at all. The second command on the controller captures packets, but they don't appear to be relevant to openstack. The dump from the compute node shows constant requests are getting sent by openstack instances. > In summary: DHCP requests are being sent, but are never received. > *Torin Woltjer* > *Grand Dial Communications - A ZK Tech Inc. Company* > *616.776.1066 ext.
2006* > * www.granddial.com * > > ------------------------------ > *From*: George Mihaiescu > *Sent*: 7/5/18 4:50 PM > *To*: torin.woltjer at granddial.com > *Subject*: Re: [Openstack] Recovering from full outage > > The cloud-init requires network connectivity by default in order to reach > the metadata server for the hostname, ssh-key, etc > > You can configure cloud-init to use the config-drive, but the lack of > network connectivity will make the instance useless anyway, even though it > will have you ssh-key and hostname... > > Did you check the things I told you? > > On Jul 5, 2018, at 16:06, Torin Woltjer > wrote: > > Are IP addresses set by cloud-init on boot? I noticed that cloud-init > isn't working on my VMs. created a new instance from an ubuntu 18.04 image > to test with, the hostname was not set to the name of the instance and > could not login as users I had specified in the configuration. > > *Torin Woltjer* > > *Grand Dial Communications - A ZK Tech Inc. Company* > > *616.776.1066 ext. 2006* > * > www.granddial.com * > > ------------------------------ > *From*: George Mihaiescu > *Sent*: 7/5/18 12:57 PM > *To*: torin.woltjer at granddial.com > *Cc*: "openstack at lists.openstack.org" , " > openstack-operators at lists.openstack.org" openstack.org> > *Subject*: Re: [Openstack] Recovering from full outage > You should tcpdump inside the qdhcp namespace to see if the requests make > it there, and also check iptables rules on the compute nodes for the return > traffic. > > > On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer < > torin.woltjer at granddial.com> wrote: > >> Yes, I've done this. The VMs hang for awhile waiting for DHCP and >> eventually come up with no addresses. neutron-dhcp-agent has been restarted >> on both controllers. 
The qdhcp netns's were all present; I stopped the >> service, removed the qdhcp netns's, noted the dhcp agents show offline by >> `neutron agent-list`, restarted all neutron services, noted the qdhcp >> netns's were recreated, restarted a VM again and it still fails to pull an >> IP address. >> >> *Torin Woltjer* >> >> *Grand Dial Communications - A ZK Tech Inc. Company* >> >> *616.776.1066 ext. 2006* >> * >> >> www.granddial.com * >> >> ------------------------------ >> *From*: George Mihaiescu >> *Sent*: 7/5/18 10:38 AM >> *To*: torin.woltjer at granddial.com >> *Subject*: Re: [Openstack] Recovering from full outage >> Did you restart the neutron-dhcp-agent and rebooted the VMs? >> >> On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer < >> torin.woltjer at granddial.com> wrote: >> >>> The qrouter netns appears once the lock_path is specified, the neutron >>> router is pingable as well. However, instances are not pingable. If I log >>> in via console, the instances have not been given IP addresses, if I >>> manually give them an address and route they are pingable and seem to work. >>> So the router is working correctly but dhcp is not working. >>> >>> No errors in any of the neutron or nova logs on controllers or compute >>> nodes. >>> >>> >>> *Torin Woltjer* >>> >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> >>> *616.776.1066 ext. 2006* >>> * >>> >>> >>> www.granddial.com * >>> >>> ------------------------------ >>> *From*: "Torin Woltjer" >>> *Sent*: 7/5/18 8:53 AM >>> *To*: >>> *Cc*: openstack-operators at lists.openstack.org, >>> openstack at lists.openstack.org >>> *Subject*: Re: [Openstack] Recovering from full outage >>> There is no lock path set in my neutron configuration. Does it >>> ultimately matter what it is set to as long as it is consistent? Does it >>> need to be set on compute nodes as well as controllers? >>> >>> *Torin Woltjer* >>> >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> >>> *616.776.1066 ext. 
2006* >>> *www.granddial.com* >>> ------------------------------ >>> *From*: George Mihaiescu >>> *Sent*: 7/3/18 7:47 PM >>> *To*: torin.woltjer at granddial.com >>> *Cc*: openstack-operators at lists.openstack.org, openstack at lists.openstack.org >>> *Subject*: Re: [Openstack] Recovering from full outage >>> Did you set a lock_path in neutron's config? >>> On Jul 3, 2018, at 17:34, Torin Woltjer wrote: >>> The following errors appear in the neutron-linuxbridge-agent.log on both controllers: >>> http://paste.openstack.org/show/724930/ >>> No such errors are on the compute nodes themselves. >>> *Torin Woltjer* >>> *Grand Dial Communications - A ZK Tech Inc. Company* >>> *616.776.1066 ext. 2006* >>> *www.granddial.com* >>> ------------------------------ >>> *From*: "Torin Woltjer" >>> *Sent*: 7/3/18 5:14 PM >>> *To*: >>> *Cc*: "openstack-operators at lists.openstack.org", "openstack at lists.openstack.org" >>> *Subject*: Re: [Openstack] Recovering from full outage >>> Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log, which shows the following: >>> http://paste.openstack.org/show/724917/ >>> I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted the controllers, and all of the agents show online. >>> http://paste.openstack.org/show/724921/ >>> And all of the instances can be properly started; however, I cannot ping any of the instances' floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. >>> *Torin Woltjer* >>> *Grand Dial Communications - A ZK Tech Inc.
Company* >>> *616.776.1066 ext. 2006* >>> *www.granddial.com* >>> ------------------------------ >>> *From*: George Mihaiescu >>> *Sent*: 7/3/18 11:50 AM >>> *To*: torin.woltjer at granddial.com >>> *Subject*: Re: [Openstack] Recovering from full outage >>> Try restarting them using "openstack server reboot" and also check nova-compute.log and the neutron agent logs on the compute nodes. >>> On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: >>>> We just suffered a power outage in our data center and I'm having trouble recovering the OpenStack cluster. All of the nodes are back online; every instance shows active, but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present, and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? >>>> *Torin Woltjer* >>>> *Grand Dial Communications - A ZK Tech Inc. Company* >>>> *616.776.1066 ext. 2006* >>>> *www.granddial.com* >>>> _______________________________________________ >>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack >>>> Post to: openstack at lists.openstack.org >>>> Unsubscribe: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed...
URL: From torin.woltjer at granddial.com Fri Jul 6 15:49:58 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Fri, 06 Jul 2018 15:49:58 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: Interestingly, I can ping the neutron router at 172.16.1.1 just fine, but DHCP (located at 172.16.1.2 and 172.16.1.3) fails. The instance that I manually added the IP address to has a floating IP, and oddly enough I am able to ping DHCP on the provider network, which suggests that DHCP may be working on other networks but not on my selfservice network. I was able to confirm this by creating a new virtual machine directly on the provider network; I was able to ping it and SSH into it right off the bat, as it obtained the proper address on its own. "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" is empty. "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" contains:
fa:16:3e:3f:94:17,host-172-16-1-8.openstacklocal,172.16.1.8
fa:16:3e:e0:57:e7,host-172-16-1-7.openstacklocal,172.16.1.7
fa:16:3e:db:a7:cb,host-172-16-1-12.openstacklocal,172.16.1.12
fa:16:3e:f8:10:99,host-172-16-1-10.openstacklocal,172.16.1.10
fa:16:3e:a7:82:4c,host-172-16-1-3.openstacklocal,172.16.1.3
fa:16:3e:f8:23:1d,host-172-16-1-14.openstacklocal,172.16.1.14
fa:16:3e:63:53:a4,host-172-16-1-1.openstacklocal,172.16.1.1
fa:16:3e:b7:41:a8,host-172-16-1-2.openstacklocal,172.16.1.2
fa:16:3e:5e:25:5f,host-172-16-1-4.openstacklocal,172.16.1.4
fa:16:3e:3a:a2:53,host-172-16-1-100.openstacklocal,172.16.1.100
fa:16:3e:46:39:e2,host-172-16-1-13.openstacklocal,172.16.1.13
fa:16:3e:06:de:e0,host-172-16-1-18.openstacklocal,172.16.1.18
I've done system restarts since the power outage and the agent hasn't corrected itself. I've restarted all neutron services as I've gone along; I could also try stopping and starting dnsmasq. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext.
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/6/18 11:15 AM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" , pgsousa at gmail.com Subject: Re: [Openstack] Recovering from full outage Can you manually assign an IP address to a VM and once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least. Also, on the controller node where the dhcp server for that network is, check the "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" and make sure there are entries corresponding to your instances. In my experience, if neutron is broken after working fine (so excluding any misconfiguration), then an agent is out of sync and a restart usually fixes things. On Fri, Jul 6, 2018 at 9:38 AM, Torin Woltjer wrote: I have done tcpdumps on both the controllers and on a compute node. Controller: `ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67` `tcpdump -vnes0 -i any port 67` Compute: `tcpdump -vnes0 -i brqd85c2a00-a6 port 68` For the first command on the controller, there are no packets captured at all. The second command on the controller captures packets, but they don't appear to be relevant to openstack. The dump from the compute node shows constant requests are getting sent by openstack instances. In summary: DHCP requests are being sent, but are never received. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext.
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 4:50 PM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage The cloud-init requires network connectivity by default in order to reach the metadata server for the hostname, ssh-key, etc You can configure cloud-init to use the config-drive, but the lack of network connectivity will make the instance useless anyway, even though it will have you ssh-key and hostname... Did you check the things I told you? On Jul 5, 2018, at 16:06, Torin Woltjer wrote: Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. created a new instance from an ubuntu 18.04 image to test with, the hostname was not set to the name of the instance and could not login as users I had specified in the configuration. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 12:57 PM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: Yes, I've done this. The VMs hang for awhile waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. 
Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From Luiz.Gavioli at netapp.com Fri Jul 6 17:08:08 2018 From: Luiz.Gavioli at netapp.com (Gavioli, Luiz) Date: Fri, 6 Jul 2018 17:08:08 +0000 Subject: [Openstack-operators] Deprecation notice: Cinder Driver for NetApp E-Series Message-ID: <1530896888.7565.11.camel@netapp.com> Developers and Operators, NetApp’s various Cinder drivers currently provide platform integration for ONTAP powered systems, SolidFire, and E/EF-Series systems. Per systems-provided telemetry and discussion amongst our user community, we’ve learned that when E/EF-series systems are deployed with OpenStack they do not commonly make use of the platform specific Cinder driver (instead opting for use of the LVM driver or Ceph layered atop). Given that, we’re proposing to cease further development and maintenance of the E-Series drivers within OpenStack and will focus development on our widely used SolidFire and ONTAP options. In accordance with community policy [1], we are initiating the deprecation process for the NetApp E-Series drivers [2] set to conclude with their removal in the OpenStack Stein release. This will apply to both protocols currently supported in this driver: iSCSI and FC. What is being deprecated: Cinder drivers for NetApp E-Series Period of deprecation: E-Series drivers will be around in stable/rocky and will be removed in the Stein release (All milestones of this release) What should users/operators do: Any Cinder E-series deployers are encouraged to get in touch with NetApp via the community #openstack-netapp IRC channel on freenode or via the #OpenStack Slack channel on http://netapp.io. 
We encourage migration to the LVM driver for continued use of E-series systems in most cases via Cinder’s migrate facility [3]. [1] https://governance.openstack.org/reference/tags/assert_follows-standard-deprecation.html [2] https://review.openstack.org/#/c/580679/ [3] https://docs.openstack.org/admin-guide/blockstorage-volume-migration.html Thanks, Luiz Gavioli -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Fri Jul 6 18:13:20 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Fri, 06 Jul 2018 18:13:20 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: I explored creating a second "selfservice" vxlan network to see if DHCP would work on it as it does on my external "provider" network. The new vxlan network shares the same problems as the old vxlan network. Am I having problems with VXLAN in particular? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/6/18 12:05 PM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Interestingly, I can ping the neutron router at 172.16.1.1 just fine, but DHCP (located at 172.16.1.2 and 172.16.1.3) fails. The instance that I manually added the IP address to has a floating IP, and oddly enough I am able to ping DHCP on the provider network, which suggests that DHCP may be working on other networks but not on my selfservice network. I was able to confirm this by creating a new virtual machine directly on the provider network; I was able to ping it and SSH into it right off the bat, as it obtained the proper address on its own. "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" is empty.
"/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" contains:
fa:16:3e:3f:94:17,host-172-16-1-8.openstacklocal,172.16.1.8
fa:16:3e:e0:57:e7,host-172-16-1-7.openstacklocal,172.16.1.7
fa:16:3e:db:a7:cb,host-172-16-1-12.openstacklocal,172.16.1.12
fa:16:3e:f8:10:99,host-172-16-1-10.openstacklocal,172.16.1.10
fa:16:3e:a7:82:4c,host-172-16-1-3.openstacklocal,172.16.1.3
fa:16:3e:f8:23:1d,host-172-16-1-14.openstacklocal,172.16.1.14
fa:16:3e:63:53:a4,host-172-16-1-1.openstacklocal,172.16.1.1
fa:16:3e:b7:41:a8,host-172-16-1-2.openstacklocal,172.16.1.2
fa:16:3e:5e:25:5f,host-172-16-1-4.openstacklocal,172.16.1.4
fa:16:3e:3a:a2:53,host-172-16-1-100.openstacklocal,172.16.1.100
fa:16:3e:46:39:e2,host-172-16-1-13.openstacklocal,172.16.1.13
fa:16:3e:06:de:e0,host-172-16-1-18.openstacklocal,172.16.1.18
I've done system restarts since the power outage and the agent hasn't corrected itself. I've restarted all neutron services as I've gone along; I could also try stopping and starting dnsmasq. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/6/18 11:15 AM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" , pgsousa at gmail.com Subject: Re: [Openstack] Recovering from full outage Can you manually assign an IP address to a VM and once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least. Also, on the controller node where the dhcp server for that network is, check the "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" and make sure there are entries corresponding to your instances. In my experience, if neutron is broken after working fine (so excluding any misconfiguration), then an agent is out of sync and a restart usually fixes things.
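[Editor's note: the comma-separated `MAC,hostname,IP` entries quoted earlier in this thread can be cross-checked against the MAC addresses neutron reports for the network's ports. The sketch below is a hypothetical helper written for this thread, not part of neutron or dnsmasq tooling, and it assumes the file uses exactly the comma-separated format shown above.]

```python
# Hypothetical helper: parse dnsmasq per-network host entries of the form
# "fa:16:3e:3f:94:17,host-172-16-1-8.openstacklocal,172.16.1.8" (the format
# quoted in this thread) and report MACs with no entry. Assumption: the
# real file contains one such comma-separated record per line.

def parse_dnsmasq_hosts(text):
    """Return a {mac: ip} mapping from 'MAC,hostname,IP' lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        mac, _hostname, ip = line.split(",")
        entries[mac.lower()] = ip
    return entries

def missing_macs(entries, port_macs):
    """MACs that have no dnsmasq entry, i.e. instances that would never get a lease."""
    return sorted(m.lower() for m in port_macs if m.lower() not in entries)

sample = """\
fa:16:3e:3f:94:17,host-172-16-1-8.openstacklocal,172.16.1.8
fa:16:3e:e0:57:e7,host-172-16-1-7.openstacklocal,172.16.1.7"""

entries = parse_dnsmasq_hosts(sample)
print(entries["fa:16:3e:3f:94:17"])                  # -> 172.16.1.8
print(missing_macs(entries, ["fa:16:3e:aa:bb:cc"]))  # -> ['fa:16:3e:aa:bb:cc']
```

Feeding it the MAC column of `openstack port list --network <net-id>` would show at a glance which ports dnsmasq never learned about.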
On Fri, Jul 6, 2018 at 9:38 AM, Torin Woltjer wrote: I have done tcpdumps on both the controllers and on a compute node. Controller: `ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67` `tcpdump -vnes0 -i any port 67` Compute: `tcpdump -vnes0 -i brqd85c2a00-a6 port 68` For the first command on the controller, there are no packets captured at all. The second command on the controller captures packets, but they don't appear to be relevant to openstack. The dump from the compute node shows constant requests are getting sent by openstack instances. In summary; DHCP requests are being sent, but are never received. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 4:50 PM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage The cloud-init requires network connectivity by default in order to reach the metadata server for the hostname, ssh-key, etc You can configure cloud-init to use the config-drive, but the lack of network connectivity will make the instance useless anyway, even though it will have you ssh-key and hostname... Did you check the things I told you? On Jul 5, 2018, at 16:06, Torin Woltjer wrote: Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. created a new instance from an ubuntu 18.04 image to test with, the hostname was not set to the name of the instance and could not login as users I had specified in the configuration. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 12:57 PM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: Yes, I've done this. The VMs hang for awhile waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances floating IPs or the neutron router. 
And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agents logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in out data center and I'm having trouble recovering the Openstack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes show that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron service running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrhillsman at gmail.com Mon Jul 9 17:27:55 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Mon, 9 Jul 2018 12:27:55 -0500 Subject: [Openstack-operators] Reminder: UC Meeting Today 1800UTC / 1300CST Message-ID: Hey everyone, Please see https://wiki.openstack.org/wiki/Governance/Foundation/Us erCommittee for UC meeting info and add additional agenda items if needed. 
-- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.zunker at codecentric.cloud Tue Jul 10 04:54:55 2018 From: christian.zunker at codecentric.cloud (Christian Zunker) Date: Tue, 10 Jul 2018 06:54:55 +0200 Subject: [Openstack-operators] How are you handling billing/chargeback? In-Reply-To: References: <20180312192113.znz4eavfze5zg7yn@redhat.com> Message-ID: Hi, just some short feedback on my previous post. We switched from our self-written Python script to cloudkitty. We had to make some commits to openstack-ansible to integrate cloudkitty into our installation, but we got it working. cloudkitty replaced our self-written script completely. We needed some time to get used to it, but it reduced our maintenance. We will have to write a converter for the json/csv reports generated by cloudkitty, so we can import the data into our billing system. But we get the complete data we need from cloudkitty, which is worth a lot. Our next steps: - Perhaps migrate to gnocchi as the cloudkitty storage backend. We will have to test this first. - The Horizon pages aren't that nice, especially the user-facing one. We will have a look at that later. For now, we disabled it. regards Christian Christian Zunker wrote on Tue., May 8, 2018 at 08:36: > Hi, > > we are running a cloud based on openstack-ansible and now are trying to > integrate cloudkitty for billing. > > Until now we used a self-written Python script to query ceilometer for > needed data, but that got more tedious than we are willing to handle. We > hope it gets much easier once cloudkitty is set up. > > regards > Christian > > >> From: Lars Kellogg-Stedman >> Date: Mon., March 12, 2018 at 20:27 >> Subject: [Openstack-operators] How are you handling billing/chargeback?
>> To: openstack-operators at lists.openstack.org < >> openstack-operators at lists.openstack.org> >> >> >> Hey folks, >> >> I'm curious what folks out there are using for chargeback/billing in >> your OpenStack environment. >> >> Are you doing any sort of chargeback (or showback)? Are you using (or >> have you tried) CloudKitty? Or some other existing project? Have you >> rolled your own instead? >> >> I ask because I am helping out some folks get a handle on the >> operational side of their existing OpenStack environment, and they are >> interested in but have not yet deployed some sort of reporting >> mechanism. >> >> Thanks, >> >> >> -- >> Lars Kellogg-Stedman | larsks @ {irc,twitter,github} >> http://blog.oddbit.com/ | >> >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> > -- > cc cloud GmbH | Hochstr. 11 > | 42697 > Solingen | Deutschland > mobil: +49 175 1068513 <+49%20175%201068513> > www.codecentric.cloud | blog.codecentric.de | www.meettheexperts.de > Sitz der Gesellschaft: Solingen | HRB 28640| Amtsgericht Wuppertal > > Geschäftsführung: Werner Krandick . Rainer Vehns > > Diese E-Mail einschließlich evtl. beigefügter Dateien enthält vertrauliche > und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige > Adressat sind oder diese E-Mail irrtümlich erhalten haben, informieren Sie > bitte sofort den Absender und löschen Sie diese E-Mail und evtl. > beigefügter Dateien umgehend. Das unerlaubte Kopieren, Nutzen oder Öffnen > evtl. beigefügter Dateien sowie die unbefugte Weitergabe dieser E-Mail ist > nicht gestattet. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From berendt at betacloud-solutions.de Tue Jul 10 12:03:54 2018 From: berendt at betacloud-solutions.de (Christian Berendt) Date: Tue, 10 Jul 2018 14:03:54 +0200 Subject: [Openstack-operators] [glance] share image with domain Message-ID: <094CFFC9-1F54-4522-8178-1642F94724A0@betacloud-solutions.de> It is possible to add a domain as a member, however this is not taken into account. It should be mentioned that you can also add non-existent project IDs as a member. For me it looks like it is not possible to share an image with visibility “shared” with a domain. Are there known workarounds or scripts for that use case? Christian. -- Christian Berendt Chief Executive Officer (CEO) Mail: berendt at betacloud-solutions.de Web: https://www.betacloud-solutions.de Betacloud Solutions GmbH Teckstrasse 62 / 70190 Stuttgart / Deutschland Geschäftsführer: Christian Berendt Unternehmenssitz: Stuttgart Amtsgericht: Stuttgart, HRB 756139 From amy at demarco.com Tue Jul 10 15:32:42 2018 From: amy at demarco.com (Amy Marrich) Date: Tue, 10 Jul 2018 10:32:42 -0500 Subject: [Openstack-operators] [openstack-community] Openstack package repo In-Reply-To: References: Message-ID: Alfredo, Forwarding this to the OPS list in the hopes of it reaching the appropriate folks, but you might also want to check out the RDO repos https://trunk.rdoproject.org/centos7/current/ Thanks, Amy (spotz) On Tue, Jul 10, 2018 at 10:07 AM, Alfredo De Luca wrote: > Hi all. > I have centos/7 on a VM Virtualbox... I want to install all the openstack > python clients (nova, swift etc).
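On Christian's Glance question above: since Glance members are per-project, "share with a domain" has to be expanded by hand into one member per project in that domain. A sketch of that idea follows; the image and project IDs are hypothetical placeholders, and in practice the project list would come from `openstack project list --domain <domain> -f value -c ID`:

```python
# Sketch of a workaround for sharing a "shared"-visibility image with
# every project in a domain: expand the domain into its projects and
# add each one as an image member.  IDs below are hypothetical.

def member_commands(image_id, project_ids):
    """Build the `openstack image add project` calls for each project."""
    return [
        f"openstack image add project {image_id} {project_id}"
        for project_id in project_ids
    ]

# Project IDs would normally come from:
#   openstack project list --domain <domain> -f value -c ID
for cmd in member_commands("my-image-id", ["project-a", "project-b"]):
    print(cmd)
```

Note that each consuming project still has to accept the membership (e.g. `openstack image set --accept <image>` run with that project's credentials) before the image appears in its default listings, and new projects joining the domain later won't be picked up unless the script is re-run.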
> I installed > *yum install centos-release-openstack-queens * > > and all good but when I try to install one client I have the following > error: > > yum install python-swiftclient > > ** > Loaded plugins: fastestmirror > Loading mirror speeds from cached hostfile > * base: mirror.infonline.de > * extras: mirror.infonline.de > * updates: centos.mirrors.psw.services > centos-ceph-luminous > | 2.9 kB 00:00:00 > centos-openstack-queens > | 2.9 kB 00:00:00 > *http://mirror.centos.org/altarch/7/virt/x86_64/kvm-common/repodata/repomd.xml > : > [Errno 14] HTTP Error 404 - Not Found* > Trying other mirror. > To address this issue please refer to the below wiki article > > https://wiki.centos.org/yum-errors > ** > > Now the only way to install the package (or any other) is to disable that > repo > *yum-config-manager --disable centos-qemu-ev* > > then I can install the client... > > Any idea? > It looks like *http://mirror.centos.org/altarch/7/virt/x86_64 > doesn't exist.....* > > > > > > > -- > *Alfredo* > > > _______________________________________________ > Community mailing list > Community at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/community > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Tue Jul 10 18:58:32 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Tue, 10 Jul 2018 18:58:32 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: <167ce30bac124c85a16061c83353553a@granddial.com> DHCP is working again so instances are getting their addresses. For some reason cloud-init isn't working correctly. Hostnames aren't getting set, and SSH key pair isn't getting set. The neutron-metadata service is in control of this? 
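A note on reading the neutron-metadata-agent.log excerpt that follows: requests for "GET /" returning 404 from public source addresses look more like external scanners hitting the endpoint than like cloud-init, which requests paths such as /openstack/... or /latest/meta-data/. A small sketch for separating the two; the regex is inferred from the quoted log lines, not from the agent's actual format string:

```python
import re

# Minimal parser for eventlet.wsgi lines like the ones quoted in this
# thread, to separate real metadata requests (paths under /openstack or
# /latest/meta-data) from stray "GET /" probes.
LINE = re.compile(
    r'(?P<ip>\d+\.\d+\.\d+\.\d+),\s*"(?P<verb>\S+)\s+(?P<path>\S+)[^"]*"\s*'
    r'status:\s*(?P<status>\d+)'
)

def classify(line):
    """Return (ip, path, status, kind) or None if the line doesn't match."""
    m = LINE.search(line)
    if not m:
        return None
    kind = "metadata" if m.group("path") != "/" else "probe"
    return m.group("ip"), m.group("path"), int(m.group("status")), kind

sample = ('2018-07-10 08:01:42.046 5518 INFO eventlet.wsgi.server [-] '
          '109.73.185.195, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0622332')
print(classify(sample))  # → ('109.73.185.195', '/', 404, 'probe')
```

If no requests with real metadata paths show up at all, the problem is more likely in the path from the instance to the agent (169.254.169.254, the namespace metadata proxy, nova-api) than in the agent itself.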
neutron-metadata-agent.log: 2018-07-10 08:01:42.046 5518 INFO eventlet.wsgi.server [-] 109.73.185.195, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0622332 2018-07-10 09:49:42.604 5518 INFO eventlet.wsgi.server [-] 197.149.85.150, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0645461 2018-07-10 10:52:50.845 5517 INFO eventlet.wsgi.server [-] 88.249.225.204, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0659041 2018-07-10 11:43:20.471 5518 INFO eventlet.wsgi.server [-] 143.208.186.168, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0618532 2018-07-10 11:53:15.574 5511 INFO eventlet.wsgi.server [-] 194.40.240.254, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0636070 2018-07-10 13:26:46.795 5518 INFO eventlet.wsgi.server [-] 109.73.177.149, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0611560 2018-07-10 13:27:38.795 5513 INFO eventlet.wsgi.server [-] 125.167.69.238, "GET / HTTP/1.0" status: 404 len: 195 time: 0.0631371 2018-07-10 13:30:49.551 5514 INFO eventlet.wsgi.server [-] 155.93.152.111, "GET / HTTP/1.0" status: 404 len: 195 time: 0.0609179 2018-07-10 14:12:42.008 5521 INFO eventlet.wsgi.server [-] 190.85.38.173, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0597739 No other log files show abnormal behavior. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/6/18 2:33 PM To: "lmihaiescu at gmail.com" Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage I explored creating a second "selfservice" vxlan to see if DHCP would work on it as it does on my external "provider" network. The new vxlan network shares the same problems as the old vxlan network. Am I having problems with VXLAN in particular? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/6/18 12:05 PM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Interestingly, I can ping the neutron router at 172.16.1.1 just fine, but DHCP (located at 172.16.1.2 and 172.16.1.3) fails. The instance that I manually added the IP address to has a floating IP, and oddly enough I am able to ping DHCP on the provider network, which suggests that DHCP may be working on other networks but not on my selfservice network. I was able to confirm this by creating a new virtual machine directly on the provider network, I was able to ping to it and SSH into it right off of the bat, as it obtained the proper address on its own. "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" is empty. "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" contains: fa:16:3e:3f:94:17,host-172-16-1-8.openstacklocal,172.16.1.8 fa:16:3e:e0:57:e7,host-172-16-1-7.openstacklocal,172.16.1.7 fa:16:3e:db:a7:cb,host-172-16-1-12.openstacklocal,172.16.1.12 fa:16:3e:f8:10:99,host-172-16-1-10.openstacklocal,172.16.1.10 fa:16:3e:a7:82:4c,host-172-16-1-3.openstacklocal,172.16.1.3 fa:16:3e:f8:23:1d,host-172-16-1-14.openstacklocal,172.16.1.14 fa:16:3e:63:53:a4,host-172-16-1-1.openstacklocal,172.16.1.1 fa:16:3e:b7:41:a8,host-172-16-1-2.openstacklocal,172.16.1.2 fa:16:3e:5e:25:5f,host-172-16-1-4.openstacklocal,172.16.1.4 fa:16:3e:3a:a2:53,host-172-16-1-100.openstacklocal,172.16.1.100 fa:16:3e:46:39:e2,host-172-16-1-13.openstacklocal,172.16.1.13 fa:16:3e:06:de:e0,host-172-16-1-18.openstacklocal,172.16.1.18 I've done system restarts since the power outage and the agent hasn't corrected itself. I've restarted all neutron services as I've done things, I could also try stopping and starting dnsmasq. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/6/18 11:15 AM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" , pgsousa at gmail.com Subject: Re: [Openstack] Recovering from full outage Can you manually assign an IP address to a VM and once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least. Also, on the controller node where the dhcp server for that network is, check the "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" and make sure there are entries corresponding to your instances. In my experience, if neutron is broken after working fine (so excluding any misconfiguration), then an agent is out-of-sync and a restart usually fixes things. On Fri, Jul 6, 2018 at 9:38 AM, Torin Woltjer wrote: I have done tcpdumps on both the controllers and on a compute node. Controller: `ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67` `tcpdump -vnes0 -i any port 67` Compute: `tcpdump -vnes0 -i brqd85c2a00-a6 port 68` For the first command on the controller, there are no packets captured at all. The second command on the controller captures packets, but they don't appear to be relevant to OpenStack. The dump from the compute node shows constant requests are getting sent by OpenStack instances. In summary: DHCP requests are being sent, but are never received. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext.
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 4:50 PM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage cloud-init requires network connectivity by default in order to reach the metadata server for the hostname, ssh-key, etc. You can configure cloud-init to use the config-drive, but the lack of network connectivity will make the instance useless anyway, even though it will have your ssh-key and hostname... Did you check the things I told you? On Jul 5, 2018, at 16:06, Torin Woltjer wrote: Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. I created a new instance from an Ubuntu 18.04 image to test with; the hostname was not set to the name of the instance and I could not log in as the users I had specified in the configuration. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 12:57 PM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: Yes, I've done this. The VMs hang for a while waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext.
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/3/18 5:14 PM To: Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log which shows the following: http://paste.openstack.org/show/724917/ I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted controllers, and all of the agents show online. http://paste.openstack.org/show/724921/ And all of the instances can be properly started, however I cannot ping any of the instances' floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 11:50 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agent logs on the compute nodes. On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote: We just suffered a power outage in our data center and I'm having trouble recovering the OpenStack cluster. All of the nodes are back online, every instance shows active but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks? Torin Woltjer Grand Dial Communications - A ZK Tech Inc.
Company 616.776.1066 ext. 2006 www.granddial.com _______________________________________________ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openstack at lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrhillsman at gmail.com Tue Jul 10 23:16:18 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Tue, 10 Jul 2018 18:16:18 -0500 Subject: [Openstack-operators] [openstack-community] Openstack package repo In-Reply-To: References: Message-ID: May I suggest install python-pip and then pip install python-swiftclient (python-openstackclient, python-whateverclient, etc at that point) On Tue, Jul 10, 2018 at 10:32 AM, Amy Marrich wrote: > Alfredo, > > Forwarding this to the OPS list in the hopes of it reaching the > appropriate folks, but you might also want to checkout the RDO repos > > https://trunk.rdoproject.org/centos7/current/ > > > Thanks, > > > Amy (spotz) > > On Tue, Jul 10, 2018 at 10:07 AM, Alfredo De Luca < > alfredo.deluca at gmail.com> wrote: > >> Hi all. >> I have centos/7 on a VM Virtualbox... I want to install all the openstack >> python clients (nova, swift etc). >> I installed >> *yum install centos-release-openstack-queens * >> >> and all good but when I try to install one client I have the following >> error: >> >> yum install python-swiftclient >> >> ** >> Loaded plugins: fastestmirror >> Loading mirror speeds from cached hostfile >> * base: mirror.infonline.de >> * extras: mirror.infonline.de >> * updates: centos.mirrors.psw.services >> centos-ceph-luminous >> | 2.9 kB 00:00:00 >> centos-openstack-queens >> | 2.9 kB 00:00:00 >> *http://mirror.centos.org/altarch/7/virt/x86_64/kvm-common/repodata/repomd.xml >> : >> [Errno 14] HTTP Error 404 - Not Found* >> Trying other mirror. 
>> To address this issue please refer to the below wiki article >> >> https://wiki.centos.org/yum-errors >> ** >> >> Now the only way to install the package (or any other) is to disable that >> repo >> *yum-config-manager --disable centos-qemu-ev* >> >> then I can install the client... >> >> Any idea? >> It looks like *http://mirror.centos.org/altarch/7/virt/x86_64 >> doesn't exist.....* >> >> >> >> >> >> >> -- >> *Alfredo* >> >> >> _______________________________________________ >> Community mailing list >> Community at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/community >> >> > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From namnh at vn.fujitsu.com Wed Jul 11 16:44:39 2018 From: namnh at vn.fujitsu.com (Nguyen Hoai, Nam) Date: Wed, 11 Jul 2018 16:44:39 +0000 Subject: [Openstack-operators] Vietnam OpenInfra Days - Call for presentations Message-ID: <1531413407355.46226@vn.fujitsu.com> Hello everyone, We, VietOpenStack from Vietnam, would like to announce that OpenInfra Days will be held for a full day on Aug-25-2018. We are writing this email to invite everyone to join us as speakers or attendees. In this event, we will focus on topics like OpenStack, SDS (Ceph), SDN/NFV, Containers (K8S/Docker..), CI/CD (Jenkins/Gitlab/Zuul), Automation (Ansible..) and case studies in Cloud Native. We would be very glad to welcome you as a speaker on any of the above topics. Some important information is as follows: 1.
Common information • Powered by: OpenStack Foundation • Website: https://2018.vietopenstack.org • Time: 8:00 to 17:00, Sunday, Aug-25-2018 • Location: Hanoi, Vietnam • Email contact: contact at vietopenstack.org 2. How to become a speaker Registration link: https://2018.vietopenstack.org/2018/06/28/call-for-presentations Schedule for the CFP: • Deadline for CFP: Aug-01-2018 • Deadline for results: Aug-03-2018 • Deadline to send slides: Aug-10-2018 • Deadline for reviewing: Aug-20-2018 3. Additional information • Each presentation may have a maximum of 2 speakers • Speakers will be provided free ticket access; traveling and living costs are not included After your registration is approved, we will do our best to help you plan your trip to Vietnam, a beautiful and peaceful country. Thanks and best regards, Nam Nguyen Hoai -------------- next part -------------- An HTML attachment was scrubbed... URL: From melwittt at gmail.com Wed Jul 11 18:49:23 2018 From: melwittt at gmail.com (melanie witt) Date: Wed, 11 Jul 2018 11:49:23 -0700 Subject: [Openstack-operators] [nova] Denver Stein ptg planning Message-ID: <60144508-c601-95f8-1b39-3b5287b2ff76@gmail.com> Hello Devs and Ops, I've created an etherpad where we can start collecting ideas for topics to cover at the Stein PTG. Please feel free to add your comments and topics with your IRC nick next to it to make it easier to discuss with you. https://etherpad.openstack.org/p/nova-ptg-stein Cheers, -melanie From ashlee at openstack.org Wed Jul 11 19:34:33 2018 From: ashlee at openstack.org (Ashlee Ferguson) Date: Wed, 11 Jul 2018 14:34:33 -0500 Subject: [Openstack-operators] OpenStack Summit Berlin CFP Closes July 17 Message-ID: Hi everyone, The CFP deadline for the OpenStack Summit Berlin is less than one week away, so make sure to submit your talks before July 18 at 6:59am UTC (July 17 at 11:59pm PST).
Tracks: • CI/CD • Container Infrastructure • Edge Computing • Hands on Workshops • HPC / GPU / AI • Open Source Community • Private & Hybrid Cloud • Public Cloud • Telecom & NFV SUBMIT HERE Community voting, the first step in building the Summit schedule, will open in mid July. Once community voting concludes, a Programming Committee for each Track will build the schedule. Programming Committees are made up of individuals from many different open source communities working in open infrastructure, in addition to people who have participated in the past. Read the full selection process here . Register for the Summit - Early Bird pricing ends August 21 Become a Sponsor Cheers, Ashlee Ashlee Ferguson OpenStack Foundation ashlee at openstack.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Wed Jul 11 21:23:30 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Wed, 11 Jul 2018 21:23:30 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: If I run `ip netns exec qrouter netstat -lnp` or `ip netns exec qdhcp netstat -lnp` on the controller, should I see anything listening on the metadata port (8775)? When I run these commands I don't see that listening, but I have no example of a working system to check against. Can anybody verify this? Thanks, Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/10/18 2:58 PM To: Cc: , Subject: Re: [Openstack] Recovering from full outage DHCP is working again so instances are getting their addresses. For some reason cloud-init isn't working correctly. Hostnames aren't getting set, and SSH key pair isn't getting set. The neutron-metadata service is in control of this? 
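Regarding the netstat question at the top of this message: to my knowledge, 8775 is the port nova-api's metadata service listens on in the controller's root namespace, not inside the qrouter/qdhcp namespaces; inside the qrouter namespace the neutron metadata proxy usually listens on 9697 and receives instance traffic via an iptables REDIRECT of 169.254.169.254:80. So not seeing 8775 inside the namespace is expected. A small helper for filtering netstat -lnp output for a given port (the sample output is illustrative, not from a real system):

```python
# Grep `ip netns exec qrouter-<id> netstat -lnp`-style output for TCP
# listeners on a given port.  Port numbers in the comments above
# (9697 inside qrouter, 8775 for nova-api metadata outside) reflect the
# usual layout; verify against your own deployment.

def listeners_on(netstat_output, port):
    """Return netstat lines that are TCP LISTEN sockets on `port`."""
    hits = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # tcp lines: proto recv-q send-q local foreign state pid/program
        if len(fields) >= 6 and fields[5] == "LISTEN":
            local = fields[3]                      # e.g. 0.0.0.0:9697
            if local.rsplit(":", 1)[-1] == str(port):
                hits.append(line)
    return hits

sample = (
    "tcp  0  0 0.0.0.0:9697  0.0.0.0:*  LISTEN  2301/haproxy\n"
    "tcp  0  0 127.0.0.1:25  0.0.0.0:*  LISTEN  1024/master"
)
print(listeners_on(sample, 9697))
```

If nothing listens on 9697 (or whatever your release uses) inside the qrouter namespace, that would point at the neutron-metadata-agent/proxy side rather than at nova.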
neutron-metadata-agent.log: 2018-07-10 08:01:42.046 5518 INFO eventlet.wsgi.server [-] 109.73.185.195, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0622332 2018-07-10 09:49:42.604 5518 INFO eventlet.wsgi.server [-] 197.149.85.150, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0645461 2018-07-10 10:52:50.845 5517 INFO eventlet.wsgi.server [-] 88.249.225.204, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0659041 2018-07-10 11:43:20.471 5518 INFO eventlet.wsgi.server [-] 143.208.186.168, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0618532 2018-07-10 11:53:15.574 5511 INFO eventlet.wsgi.server [-] 194.40.240.254, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0636070 2018-07-10 13:26:46.795 5518 INFO eventlet.wsgi.server [-] 109.73.177.149, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0611560 2018-07-10 13:27:38.795 5513 INFO eventlet.wsgi.server [-] 125.167.69.238, "GET / HTTP/1.0" status: 404 len: 195 time: 0.0631371 2018-07-10 13:30:49.551 5514 INFO eventlet.wsgi.server [-] 155.93.152.111, "GET / HTTP/1.0" status: 404 len: 195 time: 0.0609179 2018-07-10 14:12:42.008 5521 INFO eventlet.wsgi.server [-] 190.85.38.173, "GET / HTTP/1.1" status: 404 len: 195 time: 0.0597739 No other log files show abnormal behavior. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/6/18 2:33 PM To: "lmihaiescu at gmail.com" Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage I explored creating a second "selfservice" vxlan to see if DHCP would work on it as it does on my external "provider" network. The new vxlan network shares the same problems as the old vxlan network. Am I having problems with VXLAN in particular? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/6/18 12:05 PM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Interestingly, I can ping the neutron router at 172.16.1.1 just fine, but DHCP (located at 172.16.1.2 and 172.16.1.3) fails. The instance that I manually added the IP address to has a floating IP, and oddly enough I am able to ping DHCP on the provider network, which suggests that DHCP may be working on other networks but not on my selfservice network. I was able to confirm this by creating a new virtual machine directly on the provider network, I was able to ping to it and SSH into it right off of the bat, as it obtained the proper address on its own. "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" is empty. "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" contains: fa:16:3e:3f:94:17,host-172-16-1-8.openstacklocal,172.16.1.8 fa:16:3e:e0:57:e7,host-172-16-1-7.openstacklocal,172.16.1.7 fa:16:3e:db:a7:cb,host-172-16-1-12.openstacklocal,172.16.1.12 fa:16:3e:f8:10:99,host-172-16-1-10.openstacklocal,172.16.1.10 fa:16:3e:a7:82:4c,host-172-16-1-3.openstacklocal,172.16.1.3 fa:16:3e:f8:23:1d,host-172-16-1-14.openstacklocal,172.16.1.14 fa:16:3e:63:53:a4,host-172-16-1-1.openstacklocal,172.16.1.1 fa:16:3e:b7:41:a8,host-172-16-1-2.openstacklocal,172.16.1.2 fa:16:3e:5e:25:5f,host-172-16-1-4.openstacklocal,172.16.1.4 fa:16:3e:3a:a2:53,host-172-16-1-100.openstacklocal,172.16.1.100 fa:16:3e:46:39:e2,host-172-16-1-13.openstacklocal,172.16.1.13 fa:16:3e:06:de:e0,host-172-16-1-18.openstacklocal,172.16.1.18 I've done system restarts since the power outage and the agent hasn't corrected itself. I've restarted all neutron services as I've done things, I could also try stopping and starting dnsmasq. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/6/18 11:15 AM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" , pgsousa at gmail.com Subject: Re: [Openstack] Recovering from full outage Can you manually assign an IP address to a VM and once inside, ping the address of the dhcp server? That would confirm if there is connectivity at least. Also, on the controller node where the dhcp server for that network is, check the "/var/lib/neutron/dhcp/d85c2a00-a637-4109-83f0-7c2949be4cad/leases" and make sure there are entries corresponding to your instances. In my experience, if neutron is broken after working fine (so excluding any misconfiguration), then an agent is out-of-sync and a restart usually fixes things. On Fri, Jul 6, 2018 at 9:38 AM, Torin Woltjer wrote: I have done tcpdumps on both the controllers and on a compute node. Controller: `ip netns exec qdhcp-d85c2a00-a637-4109-83f0-7c2949be4cad tcpdump -vnes0 -i ns-83d68c76-b8 port 67` `tcpdump -vnes0 -i any port 67` Compute: `tcpdump -vnes0 -i brqd85c2a00-a6 port 68` For the first command on the controller, there are no packets captured at all. The second command on the controller captures packets, but they don't appear to be relevant to OpenStack. The dump from the compute node shows constant requests are getting sent by OpenStack instances. In summary: DHCP requests are being sent, but are never received. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext.
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 4:50 PM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage cloud-init requires network connectivity by default in order to reach the metadata server for the hostname, ssh-key, etc. You can configure cloud-init to use the config-drive, but the lack of network connectivity will make the instance useless anyway, even though it will have your ssh-key and hostname... Did you check the things I told you? On Jul 5, 2018, at 16:06, Torin Woltjer wrote: Are IP addresses set by cloud-init on boot? I noticed that cloud-init isn't working on my VMs. I created a new instance from an Ubuntu 18.04 image to test with; the hostname was not set to the name of the instance and I could not log in as the users I had specified in the configuration. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 12:57 PM To: torin.woltjer at granddial.com Cc: "openstack at lists.openstack.org" , "openstack-operators at lists.openstack.org" Subject: Re: [Openstack] Recovering from full outage You should tcpdump inside the qdhcp namespace to see if the requests make it there, and also check iptables rules on the compute nodes for the return traffic. On Thu, Jul 5, 2018 at 12:39 PM, Torin Woltjer wrote: Yes, I've done this. The VMs hang for a while waiting for DHCP and eventually come up with no addresses. neutron-dhcp-agent has been restarted on both controllers. The qdhcp netns's were all present; I stopped the service, removed the qdhcp netns's, noted the dhcp agents show offline by `neutron agent-list`, restarted all neutron services, noted the qdhcp netns's were recreated, restarted a VM again and it still fails to pull an IP address. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext.
2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/5/18 10:38 AM To: torin.woltjer at granddial.com Subject: Re: [Openstack] Recovering from full outage Did you restart the neutron-dhcp-agent and rebooted the VMs? On Thu, Jul 5, 2018 at 10:30 AM, Torin Woltjer wrote: The qrouter netns appears once the lock_path is specified, the neutron router is pingable as well. However, instances are not pingable. If I log in via console, the instances have not been given IP addresses, if I manually give them an address and route they are pingable and seem to work. So the router is working correctly but dhcp is not working. No errors in any of the neutron or nova logs on controllers or compute nodes. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/5/18 8:53 AM To: Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage There is no lock path set in my neutron configuration. Does it ultimately matter what it is set to as long as it is consistent? Does it need to be set on compute nodes as well as controllers? Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: George Mihaiescu Sent: 7/3/18 7:47 PM To: torin.woltjer at granddial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] Recovering from full outage Did you set a lock_path in the neutron’s config? On Jul 3, 2018, at 17:34, Torin Woltjer wrote: The following errors appear in the neutron-linuxbridge-agent.log on both controllers: http://paste.openstack.org/show/724930/ No such errors are on the compute nodes themselves. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 
2006 www.granddial.com

----------------------------------------
From: "Torin Woltjer"
Sent: 7/3/18 5:14 PM
To:
Cc: "openstack-operators at lists.openstack.org" , "openstack at lists.openstack.org"
Subject: Re: [Openstack] Recovering from full outage

Running `openstack server reboot` on an instance just causes the instance to be stuck in a rebooting status. Most notable of the logs is neutron-server.log, which shows the following: http://paste.openstack.org/show/724917/

I realized that rabbitmq was in a failed state, so I bootstrapped it, rebooted the controllers, and all of the agents show online: http://paste.openstack.org/show/724921/

All of the instances can now be properly started; however, I cannot ping any of the instances' floating IPs or the neutron router. And when logging into an instance with the console, there is no IP address on any interface.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: George Mihaiescu
Sent: 7/3/18 11:50 AM
To: torin.woltjer at granddial.com
Subject: Re: [Openstack] Recovering from full outage

Try restarting them using "openstack server reboot" and also check the nova-compute.log and neutron agent logs on the compute nodes.

On Tue, Jul 3, 2018 at 11:28 AM, Torin Woltjer wrote:

We just suffered a power outage in our data center and I'm having trouble recovering the OpenStack cluster. All of the nodes are back online, and every instance shows active, but `virsh list --all` on the compute nodes shows that all of the VMs are actually shut down. Running `ip addr` on any of the nodes shows that none of the bridges are present, and `ip netns` shows that all of the network namespaces are missing as well. So despite all of the neutron services running, none of the networking appears to be active, which is concerning. How do I solve this without recreating all of the networks?

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc.
Company 616.776.1066 ext. 2006
www.granddial.com

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack at lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From thangam.arunx at gmail.com Thu Jul 12 03:59:56 2018
From: thangam.arunx at gmail.com (அருண் குமார் (Arun Kumar))
Date: Thu, 12 Jul 2018 11:59:56 +0800
Subject: [Openstack-operators] [Openstack] Recovering from full outage
In-Reply-To:
References:
Message-ID:

Hi Torin,

> If I run `ip netns exec qrouter netstat -lnp` or `ip netns exec qdhcp
> netstat -lnp` on the controller, should I see anything listening on the
> metadata port (8775)? When I run these commands I don't see that listening,
> but I have no example of a working system to check against. Can anybody
> verify this?

In either the qrouter or qdhcp namespaces, you won't see port 8775; instead, check whether the metadata service is running on the neutron controller node(s) and listening on port 8775.

Also, you can verify the metadata and neutron services using the following commands:

service neutron-metadata-agent status
neutron agent-list
netstat -ntplua | grep :8775

Thanks & Regards
Arun

ஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃ
அன்புடன்
அருண்
நுட்பம் நம்மொழியில் தழைக்கச் செய்வோம்
http://thangamaniarun.wordpress.com
ஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃ

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From adriant at catalyst.net.nz Thu Jul 12 04:01:21 2018
From: adriant at catalyst.net.nz (Adrian Turjak)
Date: Thu, 12 Jul 2018 16:01:21 +1200
Subject: [Openstack-operators] [publiccloud-wg] [adjutant] Input on Adjutant's official project status
Message-ID:

Hello fellow public cloud providers (and others)!
Adjutant is in the process of being voted in (or not) as an official project as part of OpenStack, but to help over the last few hurdles, some input from the people who would likely benefit the most directly from such a service existing would really be useful.

In the past you've probably talked to me about the need for some form of business-logic-related APIs and services in OpenStack (signup, account termination, project/user management, billing details management, etc.). In that space I've been trying to push Adjutant as a solution, not because it's the perfect solution, but because we are trying to keep the service a cloud-agnostic solution that can be tweaked for the unique requirements of various clouds. It's also a place where we can collaborate on these often rather miscellaneous business logic requirements, rather than each of us writing our own entirely distinct thing and wasting time and effort reinventing the wheel again and again.

The review in question, where this discussion has been happening for a while: https://review.openstack.org/#/c/553643/

And if you don't know much about Adjutant, here is a little background. The current mission statement is:

"To provide an extensible API framework for exposing to users an organization's automated business processes relating to account management across OpenStack and external systems, that can be adapted to the unique requirements of an organization's processes."

The docs: https://adjutant.readthedocs.io/en/latest/
The code: https://github.com/openstack/adjutant

And here is a rough feature list that was put together as part of the review process for official project status: https://etherpad.openstack.org/p/Adjutant_Features

If you have any questions about the service, don't hesitate to get in touch, but some input on the current discussion would be very welcome!

Cheers,
Adrian Turjak

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pEpkey.asc
Type: application/pgp-keys
Size: 1769 bytes
Desc: not available
URL:

From torin.woltjer at granddial.com Thu Jul 12 12:20:32 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Thu, 12 Jul 2018 12:20:32 GMT
Subject: [Openstack-operators] [Openstack] Recovering from full outage
Message-ID: <0742b8e467364769a2c2cdac10067e2f@granddial.com>

The neutron-metadata-agent service is running, the agent is alive, and it is listening on port 8775. However, new instances still do not get any information like hostname or keypair. If I run `curl 192.168.116.22:8775` from the compute nodes, I do get a response. The metadata agent is running, listening, and accessible from the compute nodes; and it worked previously. I'm stumped.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: அருண் குமார் (Arun Kumar)
Sent: 7/12/18 12:01 AM
To: torin.woltjer at granddial.com
Cc: "openstack at lists.openstack.org" , openstack-operators at lists.openstack.org
Subject: Re: [Openstack-operators] [Openstack] Recovering from full outage

Hi Torin,

If I run `ip netns exec qrouter netstat -lnp` or `ip netns exec qdhcp netstat -lnp` on the controller, should I see anything listening on the metadata port (8775)? When I run these commands I don't see that listening, but I have no example of a working system to check against. Can anybody verify this?

In either the qrouter or qdhcp namespaces, you won't see port 8775; instead, check whether the metadata service is running on the neutron controller node(s) and listening on port 8775.
Also, you can verify the metadata and neutron services using the following commands:

service neutron-metadata-agent status
neutron agent-list
netstat -ntplua | grep :8775

Thanks & Regards
Arun

ஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃ
அன்புடன்
அருண்
நுட்பம் நம்மொழியில் தழைக்கச் செய்வோம்
http://thangamaniarun.wordpress.com
ஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃஃ

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jpetrini at coredial.com Thu Jul 12 13:16:15 2018
From: jpetrini at coredial.com (John Petrini)
Date: Thu, 12 Jul 2018 09:16:15 -0400
Subject: [Openstack-operators] [Openstack] Recovering from full outage
In-Reply-To: <0742b8e467364769a2c2cdac10067e2f@granddial.com>
References: <0742b8e467364769a2c2cdac10067e2f@granddial.com>
Message-ID:

Are your instances receiving a route to the metadata service (169.254.169.254) from DHCP? Can you curl the endpoint?

curl http://169.254.169.254/latest/meta-data

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From torin.woltjer at granddial.com Thu Jul 12 14:21:17 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Thu, 12 Jul 2018 14:21:17 GMT
Subject: [Openstack-operators] [Openstack] Recovering from full outage
Message-ID: <863528991ac14ddf87d2449c763071e1@granddial.com>

I tested this on two instances. The first instance has existed since before I began having this issue. The second was created from a CirrOS test image.

On the first instance:
The route exists: 169.254.169.254 via 172.16.1.1 dev ens3 proto dhcp metric 100.
curl returns information, for example: `curl http://169.254.169.254/latest/meta-data/public-keys` returns 0=nextcloud

On the second instance:
The route exists: 169.254.169.254 via 172.16.1.1 dev eth0
curl fails: `curl http://169.254.169.254/latest/meta-data` returns curl: (7) Failed to connect to 169.254.169.254 port 80: Connection timed out

I am curious why it is the case that one is able to connect but not the other.
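[Editor's aside, not part of the thread: the contrast above separates two distinct failure modes — an HTTP reply (even a 404) means some server is answering on the metadata address, while a connect timeout or refusal means nothing is listening or packets are being dropped on the way. A minimal sketch of that probe in plain Python; the default host/port mirror the thread, everything else is illustrative:]

```python
# Sketch: classify a metadata-endpoint probe as "answered" vs "unreachable".
# An HTTP status (even 404) proves a listener exists; a timeout/refusal
# points at missing iptables REDIRECT rules, a dead proxy, or dropped packets.
import http.client
import socket


def probe_metadata(host="169.254.169.254", port=80, timeout=3):
    """Return 'http <status>' if any server answers, 'unreachable' otherwise."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        conn.request("GET", "/latest/meta-data")
        status = conn.getresponse().status
        conn.close()
        return "http %d" % status
    except (socket.timeout, OSError):
        # covers connection refused, timeouts, and unreachable networks
        return "unreachable"
```

Run inside the relevant namespace (e.g. via `ip netns exec ... python3 ...`), "http 404" and "unreachable" would correspond to the two curl results quoted in this thread.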
Both the first and second instances were running on the same compute node.

Torin Woltjer
Grand Dial Communications - A ZK Tech Inc. Company
616.776.1066 ext. 2006
www.granddial.com

----------------------------------------
From: John Petrini
Sent: 7/12/18 9:16 AM
To: torin.woltjer at granddial.com
Cc: thangam.arunx at gmail.com, OpenStack Operators , OpenStack Mailing List
Subject: Re: [Openstack-operators] [Openstack] Recovering from full outage

Are your instances receiving a route to the metadata service (169.254.169.254) from DHCP? Can you curl the endpoint?

curl http://169.254.169.254/latest/meta-data

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jpetrini at coredial.com Thu Jul 12 14:33:10 2018
From: jpetrini at coredial.com (John Petrini)
Date: Thu, 12 Jul 2018 10:33:10 -0400
Subject: [Openstack-operators] [Openstack] Recovering from full outage
In-Reply-To: <863528991ac14ddf87d2449c763071e1@granddial.com>
References: <863528991ac14ddf87d2449c763071e1@granddial.com>
Message-ID:

You might want to try giving the neutron-dhcp and metadata agents a restart.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From amy at demarco.com Thu Jul 12 14:42:17 2018
From: amy at demarco.com (Amy Marrich)
Date: Thu, 12 Jul 2018 09:42:17 -0500
Subject: [Openstack-operators] [openstack-community] Running instance snapshot
In-Reply-To:
References:
Message-ID:

Alfredo,

I've added the operators list, but in Newton you should be able to use the OpenStack CLI for this as well. Check out 'openstack server backup create --help'; if you get an error again, add --debug to the end to get some more information to troubleshoot.

Thanks,

Amy (spotz)

On Thu, Jul 12, 2018 at 8:25 AM, Alfredo De Luca wrote:

> Hi all.
> We have OS Newton and I wonder if it's possible to perform instance
> snapshot either on WUI or CLI.
>
> ​I tried with glance image-create or nova backup....
but I got the > following > > ERROR (BadRequest): The request is invalid. (HTTP 400) (Request-ID: > req-89154d7e-f0c5-4a2a-9bc9-b98c0c5e3182)​ > > > ​Any clue/info? > cheers > > > -- > *Alfredo* > > > _______________________________________________ > Community mailing list > Community at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/community > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From torin.woltjer at granddial.com Thu Jul 12 15:03:26 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Thu, 12 Jul 2018 15:03:26 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: <373f719b15654b4a8ae5832d8e12229f@granddial.com> Checking iptables for the metadata-proxy inside of qrouter provides the following: $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e iptables-save -c | grep 169 [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697 [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff Packets:Bytes are both 0, so no traffic is touching this rule? Interestingly the command: $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e netstat -anep | grep 9697 returns nothing, so there isn't actually anything running on 9697 in the network namespace... 
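[Editor's aside, not part of the thread: with `iptables-save -c`, every rule is prefixed by its `[packets:bytes]` counters, so "is this rule ever hit?" can be checked mechanically across many rules. A rough sketch; the first sample rule is copied from the output above, the second is invented for contrast:]

```python
# Sketch: extract [packets:bytes] counters from `iptables-save -c` output
# for rules mentioning a given string, to spot rules that never match.
import re

# `iptables-save -c` prints e.g. "[0:0] -A CHAIN ..." for each rule.
RULE = re.compile(r"^\[(\d+):(\d+)\]\s+(-A\s.*)$")


def counters(save_output, needle="169.254.169.254"):
    """Return (packets, bytes, rule) tuples for rules containing `needle`."""
    hits = []
    for line in save_output.splitlines():
        m = RULE.match(line.strip())
        if m and needle in m.group(3):
            hits.append((int(m.group(1)), int(m.group(2)), m.group(3)))
    return hits


sample = (
    "[0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ "
    "-p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697\n"
    # invented rule, only to show a non-zero counter:
    "[12:720] -A neutron-l3-agent-PREROUTING -d 10.0.0.0/8 -j ACCEPT\n"
)

for pkts, nbytes, rule in counters(sample):
    print("hit" if pkts else "never matched", "->", rule)
```

Feeding it the real `ip netns exec qrouter-... iptables-save -c` output would confirm whether the metadata REDIRECT rule is ever being matched.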
This is the output without grep: Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name raw 0 0 0.0.0.0:112 0.0.0.0:* 7 0 76154 8404/keepalived raw 0 0 0.0.0.0:112 0.0.0.0:* 7 0 76153 8404/keepalived Active UNIX domain sockets (servers and established) Proto RefCnt Flags Type State I-Node PID/Program name Path unix 2 [ ] DGRAM 64501 7567/python2 unix 2 [ ] DGRAM 79953 8403/keepalived Could the reason no traffic touching the rule be that nothing is listening on that port, or is there a second issue down the chain? Curl fails even after restarting the neutron-dhcp-agent & neutron-metadata agent. Thank you for this, and any future help. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Thu Jul 12 15:07:41 2018 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Thu, 12 Jul 2018 17:07:41 +0200 Subject: [Openstack-operators] [openstack-community] Openstack package repo In-Reply-To: References: Message-ID: Hi Melvin. Thanks for that. Anyway I was able to install the packages with the repo. Cheers On Wed, Jul 11, 2018 at 1:16 AM Melvin Hillsman wrote: > May I suggest install python-pip and then pip install python-swiftclient > (python-openstackclient, python-whateverclient, etc at that point) > > On Tue, Jul 10, 2018 at 10:32 AM, Amy Marrich wrote: > >> Alfredo, >> >> Forwarding this to the OPS list in the hopes of it reaching the >> appropriate folks, but you might also want to checkout the RDO repos >> >> https://trunk.rdoproject.org/centos7/current/ >> >> >> Thanks, >> >> >> Amy (spotz) >> >> On Tue, Jul 10, 2018 at 10:07 AM, Alfredo De Luca < >> alfredo.deluca at gmail.com> wrote: >> >>> Hi all. >>> I have centos/7 on a VM Virtualbox... I want to install all the >>> openstack python clients (nova, swift etc). 
>>> I installed >>> *yum install centos-release-openstack-queens * >>> >>> and all good but when I try to install one client I have the following >>> error: >>> >>> yum install python-swiftclient >>> >>> ** >>> Loaded plugins: fastestmirror >>> Loading mirror speeds from cached hostfile >>> * base: mirror.infonline.de >>> * extras: mirror.infonline.de >>> * updates: centos.mirrors.psw.services >>> centos-ceph-luminous >>> | 2.9 kB 00:00:00 >>> centos-openstack-queens >>> | 2.9 kB 00:00:00 >>> *http://mirror.centos.org/altarch/7/virt/x86_64/kvm-common/repodata/repomd.xml >>> : >>> [Errno 14] HTTP Error 404 - Not Found* >>> Trying other mirror. >>> To address this issue please refer to the below wiki article >>> >>> https://wiki.centos.org/yum-errors >>> ** >>> >>> Now the only way to install the package (or any other) is to disable >>> that repo >>> *yum-config-manager --disable centos-qemu-ev* >>> >>> then I can install the client... >>> >>> Any idea? >>> It looks like *http://mirror.centos.org/altarch/7/virt/x86_64 >>> doesn't exist.....* >>> >>> >>> >>> >>> >>> >>> -- >>> *Alfredo* >>> >>> >>> _______________________________________________ >>> Community mailing list >>> Community at lists.openstack.org >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/community >>> >>> >> >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> >> > > > -- > Kind regards, > > Melvin Hillsman > mrhillsman at gmail.com > mobile: (832) 264-2646 > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Thu Jul 12 15:09:16 2018 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Thu, 12 Jul 2018 17:09:16 +0200 Subject: [Openstack-operators] [openstack-community] Running instance snapshot In-Reply-To: References: Message-ID: Thanks Amy... 
I'll do that and post the result.

Cheers

On Thu, Jul 12, 2018 at 4:42 PM Amy Marrich wrote:

> Alfredo,
>
> I've added the operators list but in Newton you should be able to use the
> OpenStack CLI for this as well. Checkout 'openstack server backup create
> --help' if you get an error again add --debug to the end to get some more
> information to troubleshoot.
>
> Thanks,
>
> Amy (spotz)
>
> On Thu, Jul 12, 2018 at 8:25 AM, Alfredo De Luca
> wrote:
>
>> Hi all.
>> We have OS Newton and I wonder if it's possible to perform instance
>> snapshot either on WUI or CLI.
>>
>> ​I tried with glance image-create or nova backup.... but I got the
>> following
>>
>> ERROR (BadRequest): The request is invalid. (HTTP 400) (Request-ID:
>> req-89154d7e-f0c5-4a2a-9bc9-b98c0c5e3182)​
>>
>> ​Any clue/info?
>> cheers
>>
>> --
>> *Alfredo*
>>
>> _______________________________________________
>> Community mailing list
>> Community at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/community
>>
>

--
*Alfredo*

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jp.methot at planethoster.info Thu Jul 12 19:08:54 2018
From: jp.methot at planethoster.info (Jean-Philippe Méthot)
Date: Thu, 12 Jul 2018 15:08:54 -0400
Subject: [Openstack-operators] Cinder-volume and high availability
Message-ID:

Hi,

I've noticed that in the high-availability guide, it is not recommended to run cinder-volume in an active-active configuration. However, I have built an active-passive setup that uses keepalived and a virtual IP to redirect API traffic to only one controller at a time. In such a configuration, would I still need to have only one cinder-volume service running at a time? Also, the backend is a Dell Compellent SAN, if that makes a difference.

Jean-Philippe Méthot
Openstack system administrator
Administrateur système Openstack
PlanetHoster inc.
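[Editor's aside, not part of the thread: for readers unfamiliar with the active/passive pattern Jean-Philippe describes, the keepalived side is typically a vrrp_instance stanza along these lines. Every name, interface, and address below is illustrative, not taken from his deployment:]

```conf
vrrp_instance openstack_api {
    state BACKUP             # BACKUP on both nodes; priority elects the master
    interface eth0           # illustrative interface name
    virtual_router_id 51
    priority 100             # e.g. 101 on the preferred node
    nopreempt                # a recovered node does not steal the VIP back
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass changeme   # placeholder
    }
    virtual_ipaddress {
        203.0.113.10/24      # example API VIP (TEST-NET-3 documentation range)
    }
}
```

With state BACKUP plus nopreempt on both nodes, the VIP stays where it is when a failed node returns, avoiding needless flapping of API traffic; this only moves the API endpoint, though, and says nothing about how many cinder-volume services may safely run.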
-------------- next part -------------- An HTML attachment was scrubbed... URL: From lars at redhat.com Sun Jul 15 03:07:10 2018 From: lars at redhat.com (Lars Kellogg-Stedman) Date: Sat, 14 Jul 2018 23:07:10 -0400 Subject: [Openstack-operators] How are you handling billing/chargeback? In-Reply-To: References: <20180312192113.znz4eavfze5zg7yn@redhat.com> Message-ID: <20180715030710.uaqzkzhblnvjcvos@redhat.com> On Tue, Jul 10, 2018 at 06:54:55AM +0200, Christian Zunker wrote: > just a short feedback to my previous post. > [...] > cloudkitty replaced our self-written script completely. We needed some time > to get used to it, but it reduced our maintenance... Hi Christian, Thanks for following up! That all seems like a pretty strong recommendation. I haven't had the chance to look at cloudkitty myself, but once I get out from under my pile of tripleo patches I will try to take a look. -- Lars Kellogg-Stedman | larsks @ {irc,twitter,github} http://blog.oddbit.com/ | From mriedemos at gmail.com Sun Jul 15 14:18:51 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Sun, 15 Jul 2018 09:18:51 -0500 Subject: [Openstack-operators] [nova] Cinder cross_az_attach=False changes/fixes In-Reply-To: <9d5a9be4-7869-d67d-7931-d65ea3b8d10d@gmail.com> References: <9d5a9be4-7869-d67d-7931-d65ea3b8d10d@gmail.com> Message-ID: <23a931e6-9d30-344a-f938-601aa08c6a5d@gmail.com> Just an update on an old thread, but I've been working on the cross_az_attach=False issues again this past week and I think I have a couple of decent fixes. On 5/31/2017 6:08 PM, Matt Riedemann wrote: > This is a request for any operators out there that configure nova to set: > > [cinder] > cross_az_attach=False > > To check out these two bug fixes: > > 1. https://review.openstack.org/#/c/366724/ > > This is a case where nova is creating the volume during boot from volume > and providing an AZ to cinder during the volume create request. 
Today we > just pass the instance.availability_zone which is None if the instance > was created without an AZ set. It's unclear to me if that causes the > volume creation to fail (someone in IRC was showing the volume going > into ERROR state while Nova was waiting for it to be available), but I > think it will cause the later attach to fail here [1] because the > instance AZ (defaults to None) and volume AZ (defaults to nova) may not > match. I'm still looking for more details on the actual failure in that > one though. > > The proposed fix in this case is pass the AZ associated with any host > aggregate that the instance is in. This was indirectly fixed by change https://review.openstack.org/#/c/446053/ in Pike where we now set the instance.availability_zone in conductor after we get a selected host from the scheduler (we get the AZ for the host and set that on the instance before sending the instance to compute to build it). While investigating this on master, I found a new bug where we do an up-call to the API DB which fails in a split MQ setup, and I have a fix here: https://review.openstack.org/#/c/582342/ > > 2. https://review.openstack.org/#/c/469675/ > > This is similar, but rather than checking the AZ when we're on the > compute and the instance has a host, we're in the API and doing a boot > from volume where an existing volume is provided during server create. > By default, the volume's AZ is going to be 'nova'. The code doing the > check here is getting the AZ for the instance, and since the instance > isn't on a host yet, it's not in any aggregate, so the only AZ we can > get is from the server create request itself. If an AZ isn't provided > during the server create request, then we're comparing > instance.availability_zone (None) to volume['availability_zone'] > ("nova") and that results in a 400. > > My proposed fix is in the case of BFV checks from the API, we default > the AZ if one wasn't requested when comparing against the volume. 
By > default this is going to compare "nova" for nova and "nova" for cinder, > since CONF.default_availability_zone is "nova" by default in both projects. I've refined this fix a bit to be more flexible: https://review.openstack.org/#/c/469675/ So now if doing boot from volume and we're checking cross_az_attach=False in the API and the user didn't explicitly request an AZ for the instance, we do a few checks: 1. If [DEFAULT]/default_schedule_zone is not None (the default), we use that to compare against the volume AZ. 2. If the volume AZ is equal to the [DEFAULT]/default_availability_zone (nova by default in both nova and cinder), we're OK - no issues. 3. If the volume AZ is not equal to [DEFAULT]/default_availability_zone, it means either the volume was created with a specific AZ or cinder's default AZ is configured differently from nova's. In that case, I take the volume AZ and put it into the instance RequestSpec so that during scheduling, the nova scheduler picks a host in the same AZ as the volume - if that AZ isn't in nova, we fail to schedule (NoValidHost) (but that shouldn't really happen, why would one have cross_az_attach=False w/o mirrored AZ in both cinder and nova?). > > -- > > I'm requesting help from any operators that are setting > cross_az_attach=False because I have to imagine your users have run into > this and you're patching around it somehow, so I'd like input on how you > or your users are dealing with this. > > I'm also trying to recreate these in upstream CI [2] which I was already > able to do with the 2nd bug. This devstack patch has recreated both issues above and I'm adding the fixes to it as dependencies to show the problems are resolved. > > Having said all of this, I really hate cross_az_attach as it's > config-driven API behavior which is not interoperable across clouds. 
> Long-term I'd really love to deprecate this option but we need a
> replacement first, and I'm hoping placement with compute/volume resource
> providers in a shared aggregate can maybe make that happen.
>
> [1]
> https://github.com/openstack/nova/blob/f278784ccb06e16ee12a42a585c5615abe65edfe/nova/virt/block_device.py#L368
>
> [2] https://review.openstack.org/#/c/467674/

--

Thanks,

Matt

From lijie at unitedstack.com Mon Jul 16 08:55:46 2018
From: lijie at unitedstack.com (Rambo)
Date: Mon, 16 Jul 2018 16:55:46 +0800
Subject: [Openstack-operators] [cinder] about BlockDeviceDriver
Message-ID:

Hi all,

In the Cinder repository, I noticed that the BlockDeviceDriver is deprecated and will eventually be removed with the Queens release.
https://github.com/openstack/cinder/blob/stable/ocata/cinder/volume/drivers/block_device.py

In my use case, the instances using Cinder perform intense I/O, so iSCSI or LVM is not a viable option - I have benchmarked them several times since Juno, with unsatisfactory results. For data processing scenarios it is always better to use local storage than any SAN/NAS solution.

So I felt a great need to know why it was deprecated. Is there any better option to replace it? What do you suggest using once BlockDeviceDriver is removed? Can you tell me about this? Thank you very much!

Best Regards
Rambo

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From torin.woltjer at granddial.com Mon Jul 16 12:41:16 2018
From: torin.woltjer at granddial.com (Torin Woltjer)
Date: Mon, 16 Jul 2018 12:41:16 GMT
Subject: [Openstack-operators] [Openstack] Recovering from full outage
Message-ID: <1361da1cb6954d29955d92d0b0f3ddae@granddial.com>

$ip netns exec qdhcp-87a5200d-057f-475d-953d-17e873a47454 curl http://169.254.169.254
404 Not Found

404 Not Found
The resource could not be found.
$ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e curl http://169.254.169.254 curl: (7) Couldn't connect to server Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: "Torin Woltjer" Sent: 7/12/18 11:16 AM To: , , "jpetrini at coredial.com" Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] [Openstack-operators] Recovering from full outage Checking iptables for the metadata-proxy inside of qrouter provides the following: $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e iptables-save -c | grep 169 [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697 [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff Packets:Bytes are both 0, so no traffic is touching this rule? Interestingly the command: $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e netstat -anep | grep 9697 returns nothing, so there isn't actually anything running on 9697 in the network namespace... This is the output without grep: Active Internet connections (servers and established) Proto Recv-Q Send-Q Local Address Foreign Address State User Inode PID/Program name raw 0 0 0.0.0.0:112 0.0.0.0:* 7 0 76154 8404/keepalived raw 0 0 0.0.0.0:112 0.0.0.0:* 7 0 76153 8404/keepalived Active UNIX domain sockets (servers and established) Proto RefCnt Flags Type State I-Node PID/Program name Path unix 2 [ ] DGRAM 64501 7567/python2 unix 2 [ ] DGRAM 79953 8403/keepalived Could the reason no traffic touching the rule be that nothing is listening on that port, or is there a second issue down the chain? Curl fails even after restarting the neutron-dhcp-agent & neutron-metadata agent. Thank you for this, and any future help. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From torin.woltjer at granddial.com Mon Jul 16 20:54:28 2018 From: torin.woltjer at granddial.com (Torin Woltjer) Date: Mon, 16 Jul 2018 20:54:28 GMT Subject: [Openstack-operators] [Openstack] Recovering from full outage Message-ID: I feel pretty dumb about this, but it was fixed by adding a rule to my security groups. I'm still very confused about some of the other behavior that I saw, but at least the problem is fixed now. Torin Woltjer Grand Dial Communications - A ZK Tech Inc. Company 616.776.1066 ext. 2006 www.granddial.com ---------------------------------------- From: Brian Haley Sent: 7/16/18 4:39 PM To: torin.woltjer at granddial.com, thangam.arunx at gmail.com, jpetrini at coredial.com Cc: openstack-operators at lists.openstack.org, openstack at lists.openstack.org Subject: Re: [Openstack] [Openstack-operators] Recovering from full outage On 07/16/2018 08:41 AM, Torin Woltjer wrote: > $ip netns exec qdhcp-87a5200d-057f-475d-953d-17e873a47454 curl > http://169.254.169.254 > > > 404 Not Found > > > 404 Not Found > The resource could not be found. > > Strange, don't know where the reply came from for that. > $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e curl > http://169.254.169.254 > curl: (7) Couldn't connect to server Based on your iptables output below, I would think the metadata proxy is running in the qrouter namespace. However, a curl from there will not work since it is restricted to only work for incoming packets from the qr- device(s). You would have to try curl from a running instance. Is there an haproxy process running? And is it listening on port 9697 in the qrouter namespace? 
-Brian > ------------------------------------------------------------------------ > *From*: "Torin Woltjer" > *Sent*: 7/12/18 11:16 AM > *To*: , , > "jpetrini at coredial.com" > *Cc*: openstack-operators at lists.openstack.org, openstack at lists.openstack.org > *Subject*: Re: [Openstack] [Openstack-operators] Recovering from full outage > Checking iptables for the metadata-proxy inside of qrouter provides the > following: > $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e > iptables-save -c | grep 169 > [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p > tcp -m tcp --dport 80 -j REDIRECT --to-ports 9697 > [0:0] -A neutron-l3-agent-PREROUTING -d 169.254.169.254/32 -i qr-+ -p > tcp -m tcp --dport 80 -j MARK --set-xmark 0x1/0xffff > Packets:Bytes are both 0, so no traffic is touching this rule? > > Interestingly the command: > $ip netns exec qrouter-80c3bc40-b49c-446a-926f-99811adc0c5e netstat > -anep | grep 9697 > returns nothing, so there isn't actually anything running on 9697 in the > network namespace... > > This is the output without grep: > Active Internet connections (servers and established) > Proto Recv-Q Send-Q Local Address Foreign Address > State User Inode PID/Program name > raw 0 0 0.0.0.0:112 0.0.0.0:* 7 > 0 76154 8404/keepalived > raw 0 0 0.0.0.0:112 0.0.0.0:* 7 > 0 76153 8404/keepalived > Active UNIX domain sockets (servers and established) > Proto RefCnt Flags Type State I-Node PID/Program > name Path > unix 2 [ ] DGRAM 64501 7567/python2 > unix 2 [ ] DGRAM 79953 8403/keepalived > > Could the reason no traffic touching the rule be that nothing is > listening on that port, or is there a second issue down the chain? > > Curl fails even after restarting the neutron-dhcp-agent & > neutron-metadata agent. > > Thank you for this, and any future help. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mriedemos at gmail.com Mon Jul 16 21:40:11 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 16 Jul 2018 16:40:11 -0500 Subject: [Openstack-operators] [openstack-community] Running instance snapshot In-Reply-To: References: Message-ID: <9a9a1238-a152-058c-c25e-0fb4b8b0646d@gmail.com> On 7/12/2018 10:09 AM, Alfredo De Luca wrote: > ​I tried with glance image-create or nova backup.... but I got the > following Neither of those are server snapshot operations (well backup is, but it's probably not what you're looking for). glance image-create is creating an image in glance, not creating a snapshot from a server. That would be 'nova image-create': https://docs.openstack.org/python-novaclient/latest/cli/nova.html#nova-image-create What is the error message in the 400 response? It should be in the CLI output but if not, what's in the nova-api logs? -- Thanks, Matt From ashlee at openstack.org Tue Jul 17 13:34:03 2018 From: ashlee at openstack.org (Ashlee Ferguson) Date: Tue, 17 Jul 2018 08:34:03 -0500 Subject: [Openstack-operators] OpenStack Summit Berlin CFP Deadline Today Message-ID: Hi everyone, The CFP for the OpenStack Summit Berlin closes July 17 at 11:59pm PST (July 18 at 6:59am UTC), so make sure to press submit on your talks for: • CI/CD • Container Infrastructure • Edge Computing • Hands-on Workshops • HPC / GPU / AI • Open Source Community • Private & Hybrid Cloud • Public Cloud • Telecom & NFV SUBMIT HERE Register for the Summit - Early Bird pricing ends August 21 Become a Sponsor If you have any questions, please email summit at openstack.org . Cheers, Ashlee Ashlee Ferguson OpenStack Foundation ashlee at openstack.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gael.therond at gmail.com Tue Jul 17 15:13:41 2018 From: gael.therond at gmail.com (Flint WALRUS) Date: Tue, 17 Jul 2018 17:13:41 +0200 Subject: [Openstack-operators] [kolla-ansible][octavia-role] Message-ID: Hi guys, I'm trying to install Octavia as a new service on our cloud and facing a few issues that I've been able to manage so far, until this nova-api keypair related issue. When creating a loadbalancer with the following command: openstack --os-cloud loadbalancer create --name lb1 --vip-network-id My loadbalancer is in ERROR state with the following error from the NOVA API logs: 2018-07-17 14:03:58.721 25812 INFO nova.api.openstack.wsgi [req-69713077-c1e9-409a-9f9b-e3d5fb8006fc - - - - -] HTTP exception thrown: Invalid key_name provided. 2018-07-17 14:03:58.723 25812 INFO nova.osapi_compute.wsgi.server [req-69713077-c1e9-409a-9f9b-e3d5fb8006fc - - - - -] 10.1.0.10,172.21.0.21 "POST /v2.1/8dfa9231b14545bbab9d222c4425dd2f/servers HTTP/1.1" status: 400 len: 489 time: 0.8432851 From my understanding of the nova-api source code it seems to be related to nova-api not being able to find the expected ssh keypair, however if I'm doing: openstack --os-cloud keypair list I'm correctly seeing the octavia_ssh_key entry for my user. Has anyone already made it work using kolla? On a side note, I'm using stable/queens branch for both kolla docker images and kolla-ansible. Kind regards, G. -------------- next part -------------- An HTML attachment was scrubbed... URL: From iain.macdonnell at oracle.com Tue Jul 17 16:06:47 2018 From: iain.macdonnell at oracle.com (iain MacDonnell) Date: Tue, 17 Jul 2018 09:06:47 -0700 Subject: [Openstack-operators] [kolla-ansible][octavia-role] In-Reply-To: References: Message-ID: On 07/17/2018 08:13 AM, Flint WALRUS wrote: > Hi guys, I'm a trying to install Octavia as a new service on our cloud > and facing few issues that I've been able to manage so far, until this > nova-api keypair related issue.
> > When creating a loadbalancer with the following command: > > openstack --os-cloud loadbalancer create --name lb1 > --vip-network-id > > My loadbalancer is in ERROR state with the following error from the NOVA > API logs: > > 2018-07-17 14:03:58.721 25812 INFO nova.api.openstack.wsgi > [req-69713077-c1e9-409a-9f9b-e3d5fb8006fc - - - - -] HTTP exception > thrown: Invalid key_name provided. > > 2018-07-17 14:03:58.723 25812 INFO nova.osapi_compute.wsgi.server > [req-69713077-c1e9-409a-9f9b-e3d5fb8006fc - - - - -] > 10.1.0.10,172.21.0.21 "POST > /v2.1/8dfa9231b14545bbab9d222c4425dd2f/servers HTTP/1.1" status: 400 > len: 489 time: 0.8432851 > > From my understanding of the nova-api source code it seems to be > related to nova-api not being able to found out the expected ssh > keypair, however if I'm doing: > > openstack --os-cloud keypair list > > I'm correctly seing the octavia_ssh_key entry for my user. > > Has anyone already made it work using kolla? > On a side note, I'm using stable/queens branch for both kolla docker > images and kolla-ansible. Don't know how kolla handles it, but I'm fairly sure that the ssh key has to be created/owned by the user that creates the amphora instances, which is not the same as the user that creates the load-balancer. I believe it's the user specified in the service_auth section of octavia.conf. ~iain From johnsomor at gmail.com Tue Jul 17 16:55:08 2018 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 17 Jul 2018 09:55:08 -0700 Subject: [Openstack-operators] [kolla-ansible][octavia-role] In-Reply-To: References: Message-ID: Right. I am not familiar with the kolla role either, but you are correct. The keypair created in nova needs to be "owned" by the octavia service account. 
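Concretely (a sketch — the credential values below are deployment-specific assumptions, not something from this thread), that means creating the keypair while authenticated as the octavia service user, i.e. the account configured in the service_auth section:

```shell
# Source credentials for the octavia service account (values are
# placeholders -- use whatever [service_auth] in octavia.conf points at)
export OS_USERNAME=octavia
export OS_PROJECT_NAME=service
export OS_PASSWORD=<octavia-service-password>

# Create the amphora keypair under that account so the nova boot
# request issued by Octavia can resolve key_name=octavia_ssh_key
openstack keypair create --public-key ~/.ssh/octavia_amp.pub octavia_ssh_key
```

Running `openstack keypair list` as a regular user will then no longer show the key (keypairs are per-user in nova), but the amphora boot should succeed.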
Michael On Tue, Jul 17, 2018 at 9:07 AM iain MacDonnell wrote: > > > > On 07/17/2018 08:13 AM, Flint WALRUS wrote: > > Hi guys, I'm a trying to install Octavia as a new service on our cloud > > and facing few issues that I've been able to manage so far, until this > > nova-api keypair related issue. > > > > When creating a loadbalancer with the following command: > > > > openstack --os-cloud loadbalancer create --name lb1 > > --vip-network-id > > > > My loadbalancer is in ERROR state with the following error from the NOVA > > API logs: > > > > 2018-07-17 14:03:58.721 25812 INFO nova.api.openstack.wsgi > > [req-69713077-c1e9-409a-9f9b-e3d5fb8006fc - - - - -] HTTP exception > > thrown: Invalid key_name provided. > > > > 2018-07-17 14:03:58.723 25812 INFO nova.osapi_compute.wsgi.server > > [req-69713077-c1e9-409a-9f9b-e3d5fb8006fc - - - - -] > > 10.1.0.10,172.21.0.21 "POST > > /v2.1/8dfa9231b14545bbab9d222c4425dd2f/servers HTTP/1.1" status: 400 > > len: 489 time: 0.8432851 > > > > From my understanding of the nova-api source code it seems to be > > related to nova-api not being able to found out the expected ssh > > keypair, however if I'm doing: > > > > openstack --os-cloud keypair list > > > > I'm correctly seing the octavia_ssh_key entry for my user. > > > > Has anyone already made it work using kolla? > > On a side note, I'm using stable/queens branch for both kolla docker > > images and kolla-ansible. > > Don't know how kolla handles it, but I'm fairly sure that the ssh key > has to be created/owned by the user that creates the amphora instances, > which is not the same as the user that creates the load-balancer. I > believe it's the user specified in the service_auth section of octavia.conf. 
> > ~iain > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From gael.therond at gmail.com Tue Jul 17 19:29:30 2018 From: gael.therond at gmail.com (Flint WALRUS) Date: Tue, 17 Jul 2018 21:29:30 +0200 Subject: [Openstack-operators] [kolla-ansible][octavia-role] In-Reply-To: References: Message-ID: Oooooh! Ok! Thanks a lot for such information !! Indeed that was not clear and didn’t make sens to me why each user should have to get it’s own ssh key to log into the amphora! You rocks guys, I’ll test that on tomorrow morning! Thanks a lot! G. Le mar. 17 juil. 2018 à 18:55, Michael Johnson a écrit : > Right. I am not familiar with the kolla role either, but you are > correct. The keypair created in nova needs to be "owned" by the > octavia service account. > > Michael > On Tue, Jul 17, 2018 at 9:07 AM iain MacDonnell > wrote: > > > > > > > > On 07/17/2018 08:13 AM, Flint WALRUS wrote: > > > Hi guys, I'm a trying to install Octavia as a new service on our cloud > > > and facing few issues that I've been able to manage so far, until this > > > nova-api keypair related issue. > > > > > > When creating a loadbalancer with the following command: > > > > > > openstack --os-cloud loadbalancer create --name lb1 > > > --vip-network-id > > > > > > My loadbalancer is in ERROR state with the following error from the > NOVA > > > API logs: > > > > > > 2018-07-17 14:03:58.721 25812 INFO nova.api.openstack.wsgi > > > [req-69713077-c1e9-409a-9f9b-e3d5fb8006fc - - - - -] HTTP exception > > > thrown: Invalid key_name provided. 
> > > > > > 2018-07-17 14:03:58.723 25812 INFO nova.osapi_compute.wsgi.server > > > [req-69713077-c1e9-409a-9f9b-e3d5fb8006fc - - - - -] > > > 10.1.0.10,172.21.0.21 "POST > > > /v2.1/8dfa9231b14545bbab9d222c4425dd2f/servers HTTP/1.1" status: 400 > > > len: 489 time: 0.8432851 > > > > > > From my understanding of the nova-api source code it seems to be > > > related to nova-api not being able to found out the expected ssh > > > keypair, however if I'm doing: > > > > > > openstack --os-cloud keypair list > > > > > > I'm correctly seing the octavia_ssh_key entry for my user. > > > > > > Has anyone already made it work using kolla? > > > On a side note, I'm using stable/queens branch for both kolla docker > > > images and kolla-ansible. > > > > Don't know how kolla handles it, but I'm fairly sure that the ssh key > > has to be created/owned by the user that creates the amphora instances, > > which is not the same as the user that creates the load-balancer. I > > believe it's the user specified in the service_auth section of > octavia.conf. > > > > ~iain > > > > _______________________________________________ > > OpenStack-operators mailing list > > OpenStack-operators at lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tobias.rydberg at citynetwork.eu Wed Jul 18 12:06:46 2018 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Wed, 18 Jul 2018 14:06:46 +0200 Subject: [Openstack-operators] [publiccloud-wg][passport] Public Cloud Passport program v2 Message-ID: <35f0af3d-d106-c073-9440-4056fd14bd15@citynetwork.eu> Hi everyone, After the discussions we had in Vancouver at the related Forum session, we have been working on a new spec for the Passport program version 2. This spec includes the use case around coupon codes, requirements for being a member of the program, and changes to the Foundation homepage, to mention some of them. We would now like feedback from all members, member prospects, and others that have an interest in the project. Please vote +1 or -1 together with a comment; online comments are also appreciated. The spec can be found at: https://review.openstack.org/#/c/583529/ Feel free to join our weekly meeting tomorrow at 1400 UTC in #openstack-publiccloud Cheers, Tobias Rydberg Co-Chair Public Cloud WG From tobias.rydberg at citynetwork.eu Wed Jul 18 13:40:29 2018 From: tobias.rydberg at citynetwork.eu (Tobias Rydberg) Date: Wed, 18 Jul 2018 15:40:29 +0200 Subject: [Openstack-operators] [publiccloud-wg] Meeting tomorrow for Public Cloud WG Message-ID: Hi folks, Time for a new meeting for the Public Cloud WG. The agenda draft can be found at https://etherpad.openstack.org/p/publiccloud-wg; feel free to add items to that list.
See you all tomorrow at IRC 1400 UTC in #openstack-publiccloud Cheers, Tobias -- Tobias Rydberg Senior Developer Twitter & IRC: tobberydberg www.citynetwork.eu | www.citycloud.com INNOVATION THROUGH OPEN IT INFRASTRUCTURE ISO 9001, 14001, 27001, 27015 & 27018 CERTIFIED From thierry at openstack.org Fri Jul 20 14:44:35 2018 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 20 Jul 2018 16:44:35 +0200 Subject: [Openstack-operators] [all] [ptg] PTG track schedule published Message-ID: Hi everyone, Last month we published the tentative schedule layout for the 5 days of PTG. There was no major complaint, so that was confirmed as the PTG event schedule and published on the PTG website: https://www.openstack.org/ptg#tab_schedule You'll notice that: - The Ops meetup days were added. - Keystone track is split in two: one day on Monday for cross-project discussions around identity management, and two days on Thursday/Friday for team discussions. - The "Ask me anything" project helproom on Monday/Tuesday is for horizontal support teams (infrastructure, release management, stable maint, requirements...) to provide support for other teams, SIGs and workgroups and answer their questions. Goal champions should also be available there to help with Stein goal completion questions. - Like in Dublin, a number of tracks do not get pre-allocated time, and will be scheduled on the spot in available rooms at the time that makes the most sense for the participants. - Every track will be able to book extra time and space in available extra rooms at the event. To find more information about the event, register or book a room at the event hotel, visit: https://www.openstack.org/ptg Note that the second (and last) round of applications for travel support to the event is closing at the end of next week (July 29th) ! Apply if you need financial help attending the event: https://openstackfoundation.formstack.com/forms/travelsupportptg_denver_2018 See you there ! 
-- Thierry Carrez (ttx) From thierry at openstack.org Fri Jul 20 14:57:20 2018 From: thierry at openstack.org (Thierry Carrez) Date: Fri, 20 Jul 2018 16:57:20 +0200 Subject: [Openstack-operators] [all] [ptg] PTG track schedule published In-Reply-To: References: Message-ID: Thierry Carrez wrote: > Hi everyone, > > Last month we published the tentative schedule layout for the 5 days of > PTG. There was no major complaint, so that was confirmed as the PTG > event schedule and published on the PTG website: > > https://www.openstack.org/ptg#tab_schedule The tab temporarily disappeared, while it is being restored you can access the schedule at: https://docs.google.com/spreadsheets/d/e/2PACX-1vRM2UIbpnL3PumLjRaso_9qpOfnyV9VrPqGbTXiMVNbVgjiR3SIdl8VSBefk339MhrbJO5RficKt2Rr/pubhtml?gid=1156322660&single=true -- Thierry Carrez (ttx) From jonmills at gmail.com Mon Jul 23 15:43:52 2018 From: jonmills at gmail.com (Jonathan Mills) Date: Mon, 23 Jul 2018 11:43:52 -0400 Subject: [Openstack-operators] Couple of CellsV2 questions Message-ID: Good morning all, I am looking at implementing CellsV2 with multiple cells, and there's a few things I'm seeking clarification on: 1) How does a superconductor know that it is a superconductor? Is its operation different in any fundamental way? Is there any explicit configuration or a setting in the database required? Or does it simply not care one way or another? 2) When I ran the command "nova-manage cell_v2 create_cell --name=cell1 --verbose", the entry created for cell1 in the api database includes only one rabbitmq server, but I have three of them as an HA cluster. Does it only support talking to one rabbitmq server in this configuration? Or can I just update the cell1 transport_url in the database to point to all three? Is that a supported configuration? 3) Is there anything wrong with having one cell share the amqp bus with your control plane, while having additional cells use their own amqp buses? 
Certainly I realize that the point of CellsV2 is to shard the amqp bus for greater horizontal scalability. But in my case, my first cell is on the smaller side, and happens to be colocated with the control plane hardware (whereas other cells will be in other parts of the datacenter, or in other datacenters with high-speed links). I was thinking of just pointing that first cell back at the same rabbitmq servers used by the control plane, but perhaps directing them at their own rabbitmq vhost. Is that a terrible idea? Your feedback is highly appreciated! Thank you, Jonathan Mills NASA Center for Climate Simulation -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.page at canonical.com Mon Jul 23 16:01:39 2018 From: james.page at canonical.com (James Page) Date: Mon, 23 Jul 2018 17:01:39 +0100 Subject: [Openstack-operators] [sig][upgrades][ansible][charms][tripleo][kolla][airship] reboot or poweroff? Message-ID: Hi All tl;dr we (the original founders) have not managed to invest the time to get the Upgrades SIG booted - time to hit reboot or time to poweroff? Since Vancouver, two of the original SIG chairs have stepped down, leaving me in the hot seat with minimal participation from either deployment projects or operators in the IRC meetings. In addition, I've only been able to make every 3rd IRC meeting, so they have generally not been happening. I think the current timing is not good for a lot of folk, so finding a better slot is probably a must-have if the SIG is going to continue - and maybe moving to a monthly or bi-weekly schedule rather than the weekly slot we have now. In addition, I need some willing folk to help with leadership in the SIG. If you have an interest and would like to help, please let me know!
I'd also like to better engage with all deployment projects - upgrades is something that deployment tools should be looking to encapsulate as features, so it would be good to get deployment projects engaged in the SIG with nominated representatives. Based on the attendance in upgrades sessions in Vancouver and developer/operator appetite to discuss all things upgrade at said sessions I'm assuming that there is still interest in having a SIG for Upgrades but I may be wrong! Thoughts? James -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Mon Jul 23 21:57:16 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 23 Jul 2018 16:57:16 -0500 Subject: [Openstack-operators] [nova] Couple of CellsV2 questions In-Reply-To: References: Message-ID: <5eb59ccc-f860-b15c-7ed8-e1a04807adb7@gmail.com> I'll try to help a bit inline. Also cross-posting to openstack-dev and tagging with [nova] to highlight it. On 7/23/2018 10:43 AM, Jonathan Mills wrote: > I am looking at implementing CellsV2 with multiple cells, and there's a > few things I'm seeking clarification on: > > 1) How does a superconductor know that it is a superconductor?  Is its > operation different in any fundamental way?  Is there any explicit > configuration or a setting in the database required? Or does it simply > not care one way or another? It's a topology term, not really anything in config or the database that distinguishes the "super" conductor. I assume you've gone over the service layout in the docs: https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#service-layout There are also some summit talks from Dan about the topology linked here: https://docs.openstack.org/nova/latest/user/cells.html#cells-v2 The superconductor is the conductor service at the "top" of the tree which interacts with the API and scheduler (controller) services and routes operations to the cell. 
Then once in a cell, the operation should ideally be confined there. So, for example, reschedules during a build would be confined to the cell. The cell conductor doesn't go back "up" to the scheduler to get a new set of hosts for scheduling. This of course depends on which release you're using and your configuration, see the caveats section in the cellsv2-layout doc. > > 2) When I ran the command "nova-manage cell_v2 create_cell --name=cell1 > --verbose", the entry created for cell1 in the api database includes > only one rabbitmq server, but I have three of them as an HA cluster. > Does it only support talking to one rabbitmq server in this > configuration? Or can I just update the cell1 transport_url in the > database to point to all three? Is that a supported configuration? First, don't update stuff directly in the database if you don't have to. :) What you set on the transport_url should be whatever oslo.messaging can handle: https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.transport_url There is at least one reported bug for this but I'm not sure I fully grok it or what its status is at this point: https://bugs.launchpad.net/nova/+bug/1717915 > > 3) Is there anything wrong with having one cell share the amqp bus with > your control plane, while having additional cells use their own amqp > buses? Certainly I realize that the point of CellsV2 is to shard the > amqp bus for greater horizontal scalability.  But in my case, my first > cell is on the smaller side, and happens to be colocated with the > control plane hardware (whereas other cells will be in other parts of > the datacenter, or in other datacenters with high-speed links).  I was > thinking of just pointing that first cell back at the same rabbitmq > servers used by the control plane, but perhaps directing them at their > own rabbitmq vhost. Is that a terrible idea? 
Would need to get input from operators and/or Dan Smith's opinion on this one, but I'd say it's no worse than having a flat single cell deployment. However, if you're going to do multi-cell long-term anyway, then it would be best to get in the mindset and discipline of not relying on shared MQ between the controller services and the cells. In other words, just do the right thing from the start rather than have to worry about maybe changing the deployment / configuration for that one cell down the road when it's harder. -- Thanks, Matt From zhipengh512 at gmail.com Tue Jul 24 06:58:59 2018 From: zhipengh512 at gmail.com (Zhipeng Huang) Date: Tue, 24 Jul 2018 14:58:59 +0800 Subject: [Openstack-operators] [publiccloud-wg]New Meeting Time Starting This Week Message-ID: Hi Folks, As indicated in https://review.openstack.org/#/c/584389/, PCWG is moving towards a tick-tock meeting arrangement to better accommodate participants around the globe. For even weeks starting this Wed, we will have a new meeting time at UTC0700. For odd weeks we will keep the UTC1400 time slot. Looking forward to meeting you all at #openstack-publiccloud on Wed! -- Zhipeng (Howard) Huang Standard Engineer IT Standard & Patent/IT Product Line Huawei Technologies Co., Ltd Email: huangzhipeng at huawei.com Office: Huawei Industrial Base, Longgang, Shenzhen (Previous) Research Assistant Mobile Ad-Hoc Network Lab, Calit2 University of California, Irvine Email: zhipengh at uci.edu Office: Calit2 Building Room 2402 OpenStack, OPNFV, OpenDaylight, OpenCompute Aficionado -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ashlee at openstack.org Tue Jul 24 14:23:40 2018 From: ashlee at openstack.org (Ashlee Ferguson) Date: Tue, 24 Jul 2018 09:23:40 -0500 Subject: [Openstack-operators] OpenStack Summit Berlin - Community Voting Open Message-ID: <5D259863-CF2D-4D2C-B85C-4C029D686D75@openstack.org> Hi everyone, Session voting is now open for the November 2018 OpenStack Summit in Berlin! VOTE HERE Hurry, voting closes Thursday, July 26 at 11:59pm Pacific Time (Friday, July 27 at 6:59 UTC). The Programming Committees will ultimately determine the final schedule. Community votes are meant to help inform the decision, but are not considered to be the deciding factor. The Programming Committee members exercise judgment in their area of expertise and help ensure diversity. View full details of the session selection process here. Continue to visit https://www.openstack.org/summit/berlin-2018 for all Summit-related information. REGISTER Register for the Summit before prices increase in late August! VISA APPLICATION PROCESS Make sure to secure your Visa soon. More information about the Visa application process. TRAVEL SUPPORT PROGRAM August 30 is the last day to submit applications. Please submit your applications by 11:59pm Pacific Time (August 31 at 6:59am UTC). If you have any questions, please email summit at openstack.org . Cheers, Ashlee Ashlee Ferguson OpenStack Foundation ashlee at openstack.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonmills at gmail.com Tue Jul 24 14:38:05 2018 From: jonmills at gmail.com (Jonathan Mills) Date: Tue, 24 Jul 2018 10:38:05 -0400 Subject: [Openstack-operators] [nova] Couple of CellsV2 questions In-Reply-To: <5eb59ccc-f860-b15c-7ed8-e1a04807adb7@gmail.com> References: <5eb59ccc-f860-b15c-7ed8-e1a04807adb7@gmail.com> Message-ID: Thanks, Matt. Those are all good suggestions, and we will incorporate your feedback into our plans. 
On 07/23/2018 05:57 PM, Matt Riedemann wrote: > I'll try to help a bit inline. Also cross-posting to openstack-dev and > tagging with [nova] to highlight it. > > On 7/23/2018 10:43 AM, Jonathan Mills wrote: >> I am looking at implementing CellsV2 with multiple cells, and there's >> a few things I'm seeking clarification on: >> >> 1) How does a superconductor know that it is a superconductor?  Is its >> operation different in any fundamental way?  Is there any explicit >> configuration or a setting in the database required? Or does it simply >> not care one way or another? > > It's a topology term, not really anything in config or the database that > distinguishes the "super" conductor. I assume you've gone over the > service layout in the docs: > > https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#service-layout > > > There are also some summit talks from Dan about the topology linked here: > > https://docs.openstack.org/nova/latest/user/cells.html#cells-v2 > > The superconductor is the conductor service at the "top" of the tree > which interacts with the API and scheduler (controller) services and > routes operations to the cell. Then once in a cell, the operation should > ideally be confined there. So, for example, reschedules during a build > would be confined to the cell. The cell conductor doesn't go back "up" > to the scheduler to get a new set of hosts for scheduling. This of > course depends on which release you're using and your configuration, see > the caveats section in the cellsv2-layout doc. > >> >> 2) When I ran the command "nova-manage cell_v2 create_cell >> --name=cell1 --verbose", the entry created for cell1 in the api >> database includes only one rabbitmq server, but I have three of them >> as an HA cluster.  Does it only support talking to one rabbitmq server >> in this configuration? Or can I just update the cell1 transport_url in >> the database to point to all three? Is that a supported configuration? 
> > First, don't update stuff directly in the database if you don't have to. > :) What you set on the transport_url should be whatever oslo.messaging > can handle: > > https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.transport_url > > > There is at least one reported bug for this but I'm not sure I fully > grok it or what its status is at this point: > > https://bugs.launchpad.net/nova/+bug/1717915 > >> >> 3) Is there anything wrong with having one cell share the amqp bus >> with your control plane, while having additional cells use their own >> amqp buses? Certainly I realize that the point of CellsV2 is to shard >> the amqp bus for greater horizontal scalability.  But in my case, my >> first cell is on the smaller side, and happens to be colocated with >> the control plane hardware (whereas other cells will be in other parts >> of the datacenter, or in other datacenters with high-speed links).  I >> was thinking of just pointing that first cell back at the same >> rabbitmq servers used by the control plane, but perhaps directing them >> at their own rabbitmq vhost. Is that a terrible idea? > > Would need to get input from operators and/or Dan Smith's opinion on > this one, but I'd say it's no worse than having a flat single cell > deployment. However, if you're going to do multi-cell long-term anyway, > then it would be best to get in the mindset and discipline of not > relying on shared MQ between the controller services and the cells. In > other words, just do the right thing from the start rather than have to > worry about maybe changing the deployment / configuration for that one > cell down the road when it's harder. 
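On question 2 above: the multi-host case is handled in the URL itself — oslo.messaging accepts a comma-separated list of host:port pairs (each with credentials) in a single transport_url, so no direct database edit is needed. A sketch of building such a URL (hostnames, credentials, and vhost are placeholders for illustration):

```shell
# Build a transport_url listing all three RabbitMQ cluster members
user="openstack"; pass="secret"; vhost="nova_cell1"
hosts="rabbit1 rabbit2 rabbit3"

url="rabbit://"; sep=""
for h in $hosts; do
  url="${url}${sep}${user}:${pass}@${h}:5672"
  sep=","
done
url="${url}/${vhost}"
echo "$url"
# -> rabbit://openstack:secret@rabbit1:5672,openstack:secret@rabbit2:5672,openstack:secret@rabbit3:5672/nova_cell1
```

The resulting URL can then be applied with `nova-manage cell_v2 update_cell` rather than by editing the cell_mappings table by hand.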
> From amy at demarco.com Tue Jul 24 15:05:51 2018 From: amy at demarco.com (Amy Marrich) Date: Tue, 24 Jul 2018 10:05:51 -0500 Subject: [Openstack-operators] UC Election - Looking for Election Officials Message-ID: Hey Stackers, We are getting ready for the Summer UC election and we need to have at least two Election Officials. I was wondering if you would like to help us with that process. You can find all the details of the election at https://governance.openstack.org/uc/reference/uc-election-aug2018.html. I do want to point out to those who are new that Election Officials are unable to run in the election itself but can of course vote. The election dates will be: August 6 - August 17, 05:59 UTC: Open candidacy for UC positions August 20 - August 24, 11:59 UTC: UC elections (voting) Please reach out to any of the current UC members or simply reply to this email if you can help us in this community process. Thanks, OpenStack User Committee Amy, Leong, Matt, Melvin, and Saverio -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Tue Jul 24 16:39:18 2018 From: amy at demarco.com (Amy Marrich) Date: Tue, 24 Jul 2018 11:39:18 -0500 Subject: [Openstack-operators] UC Election - Looking for Election Officials In-Reply-To: References: Message-ID: Just wanted to say THANK you as we now have 3 officials! Please participate in the User Committee elections as a candidate and perhaps most importantly by voting! Thanks, Amy (spotz) On Tue, Jul 24, 2018 at 10:05 AM, Amy Marrich wrote: > Hey Stackers, > > > We are getting ready for the Summer UC election and we need to have at > least two Election Officials. I was wondering if you would like to help us > on that process. You can find all the details of the election at > https://governance.openstack.org/uc/reference/uc-election-aug2018.html. > > > I do want to point out to those who are new that Election Officials are > unable to run in the election itself but can of course vote.
> > > > The election dates will be: > > August 6 - August 17, 05:59 UTC: Open candidacy for UC positions > > August 20 - August 24, 11:59 UTC: UC elections (voting) > > > > Please, reach out to any of the current UC members or simple reply to this > email if you can help us in this community process. > > > > Thanks, > > > > OpenStack User Committee > > Amy, Leong, Matt, Melvin, and Saverio > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Tue Jul 24 19:24:25 2018 From: amy at demarco.com (Amy Marrich) Date: Tue, 24 Jul 2018 14:24:25 -0500 Subject: [Openstack-operators] UC Election - Looking for Election Officials In-Reply-To: References: Message-ID: And for those curious... our officials are..... Ed Leafe, Chandan Kumar and then Mohamed Elsakhawy Thanks, Amy (spotz) (Who's claiming lack of sleep for not including the names earlier) On Tue, Jul 24, 2018 at 11:39 AM, Amy Marrich wrote: > Just wanted to say THANK you as we now have 3 officials! Please > participate in the User Committee elections as a candidate and perhaps most > importantly by voting! > > Thanks, > > Amy (spotz) > > On Tue, Jul 24, 2018 at 10:05 AM, Amy Marrich wrote: > >> Hey Stackers, >> >> >> We are getting ready for the Summer UC election and we need to have at >> least two Election Officials. I was wondering if you would like to help us >> on that process. You can find all the details of the election at >> https://governance.openstack.org/uc/reference/uc-election-aug2018.html. >> >> >> I do want to point out to those who are new that Election Officials are >> unable to run in the election itself but can of course vote. >> >> >> >> The election dates will be: >> >> August 6 - August 17, 05:59 UTC: Open candidacy for UC positions >> >> August 20 - August 24, 11:59 UTC: UC elections (voting) >> >> >> >> Please, reach out to any of the current UC members or simple reply to >> this email if you can help us in this community process. 
>> >> >> >> Thanks, >> >> >> >> OpenStack User Committee >> >> Amy, Leong, Matt, Melvin, and Saverio >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ashlee at openstack.org Thu Jul 26 21:12:19 2018 From: ashlee at openstack.org (Ashlee Ferguson) Date: Thu, 26 Jul 2018 16:12:19 -0500 Subject: [Openstack-operators] OpenStack Summit Berlin - Community Voting Closing Soon Message-ID: Hi everyone, Session voting for the Berlin Summit closes in less than 8 hours! Submit your votes by July 26 at 11:59pm Pacific Time (Friday, July 27 at 6:59 UTC). VOTE HERE The Programming Committees will ultimately determine the final schedule. Community votes are meant to help inform the decision, but are not considered to be the deciding factor. The Programming Committee members exercise judgment in their area of expertise and help ensure diversity. View full details of the session selection process here. Continue to visit https://www.openstack.org/summit/berlin-2018 for all Summit-related information. REGISTER Register for the Summit for $699 before prices increase after August 21 at 11:59pm Pacific Time (August 22 at 6:59am UTC). VISA APPLICATION PROCESS Make sure to secure your Visa soon. More information about the Visa application process. TRAVEL SUPPORT PROGRAM August 30 is the last day to submit applications. Please submit your applications by 11:59pm Pacific Time (August 31 at 6:59am UTC). If you have any questions, please email summit at openstack.org . Cheers, Ashlee Ashlee Ferguson OpenStack Foundation ashlee at openstack.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From gilles.mocellin at nuagelibre.org Fri Jul 27 08:34:50 2018 From: gilles.mocellin at nuagelibre.org (Gilles Mocellin) Date: Fri, 27 Jul 2018 10:34:50 +0200 Subject: [Openstack-operators] [openstack-ansible] How to manage system upgrades ? Message-ID: <618796e0e942dc5bd5b0824950565ea1@nuagelibre.org> Hello ! 
Would be great to have a playbook to upgrade system parts of an OpenStack Cloud ! With OpenStack Ansible : LXC containers and hosts. It would be awesome to do a controlled rolling reboot of hosts when needed... Different conditions to check : - for controllers : check galera status... - for compute nodes : disable compute node and live-evacuate instances... - for storage : with Ceph : set no out... I know, I can do it and contribute, but perhaps someone already has something similar ? It could be hosted in one of these projects : - https://git.openstack.org/cgit/openstack/openstack-ansible-ops - https://git.openstack.org/cgit/openstack/ansible-role-openstack-operations (Why are there already two projects for the same goal ?) From christian.zunker at codecentric.cloud Fri Jul 27 11:58:20 2018 From: christian.zunker at codecentric.cloud (Christian Zunker) Date: Fri, 27 Jul 2018 13:58:20 +0200 Subject: [Openstack-operators] [openstack-ansible] How to manage system upgrades ? In-Reply-To: <618796e0e942dc5bd5b0824950565ea1@nuagelibre.org> References: <618796e0e942dc5bd5b0824950565ea1@nuagelibre.org> Message-ID: Hi Gilles, sounds like a good idea. We've just written a script for live evacuate, which we can contribute after some refactoring. Gilles Mocellin schrieb am Fr., 27. Juli 2018 um 10:44 Uhr: > Hello ! > > Would be great to have a playbook to upgrade system parts of an > OpenStack Cloud ! > With OpenStack Ansible : LXC containers and hosts. > > It would be awesome to do a controlled rolling reboot of hosts when > needed... > > Different conditions to check : > - for controllers : check galera status... > - for compute nodes : disable compute node and live-evacuate > instances... > - for storage : with Ceph : set no out... > > I know, I can do it and contribute, but perhaps someone already has > something similar ? 
> It could be hosted in one of these projects : > - https://git.openstack.org/cgit/openstack/openstack-ansible-ops > - > https://git.openstack.org/cgit/openstack/ansible-role-openstack-operations > > (Why already to project for the same goal ?) > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrhillsman at gmail.com Mon Jul 30 13:44:31 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Mon, 30 Jul 2018 08:44:31 -0500 Subject: [Openstack-operators] Reminder: User Committee @ 1800 UTC Message-ID: Hi everyone, UC meeting today in #openstack-uc Agenda: https://wiki.openstack.org/wiki/Governance/Foundation/UserCommittee -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From alfredo.deluca at gmail.com Mon Jul 30 14:53:20 2018 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Mon, 30 Jul 2018 16:53:20 +0200 Subject: [Openstack-operators] swift question Message-ID: Hi all. I wonder if I can sync a directory on a server to the obj store (swift). What I do now is just a backup, but I'd like to implement a sort of file rotation locally and on the obj store. Any idea? -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon Jul 30 15:33:10 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 30 Jul 2018 17:33:10 +0200 Subject: [Openstack-operators] dashboard show only project after upgrading Message-ID: Hello everyone, I upgraded openstack centos 7 from ocata to pike and the command line works fine, but the dashboard does not show any menu on the left. 
I am missing the following menus: Project Admin Identity You can find the image attached here. Could anyone help me? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot at 2018-07-30 17:31:50.png Type: image/png Size: 134429 bytes Desc: not available URL: From ignaziocassano at gmail.com Mon Jul 30 15:35:31 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 30 Jul 2018 17:35:31 +0200 Subject: [Openstack-operators] dashboard show only project after upgrading In-Reply-To: References: Message-ID: Sorry, I sent a wrong image. The correct screenshot is attached here. Regards 2018-07-30 17:33 GMT+02:00 Ignazio Cassano : > Hello everyone, > I upgraded openstack centos 7 from ocata to pike ad command line work fine > but dashboard does not show any menu on the left . > I missed the following menus: > > Project > Admin > Identity > > You can find the image attached here. > > Could anyone help me ? > Regards > Ignazio > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Mon Jul 30 15:52:59 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 30 Jul 2018 10:52:59 -0500 Subject: [Openstack-operators] [openstack-ansible] How to manage system upgrades ? In-Reply-To: <618796e0e942dc5bd5b0824950565ea1@nuagelibre.org> References: <618796e0e942dc5bd5b0824950565ea1@nuagelibre.org> Message-ID: On 7/27/2018 3:34 AM, Gilles Mocellin wrote: > - for compute nodes : disable compute node and live-evacuate instances... To be clear, what do you mean exactly by "live-evacuate"? I assume you mean live migration of all instances off each (disabled) compute node *before* you upgrade it. 
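That disable-then-live-migrate drain can be sketched as a small dry-run helper. This is a hypothetical illustration, not code from any of the projects discussed in this thread: the helper name is invented, and the exact CLI flags are assumptions (for example, `--live-migration` only exists in newer python-openstackclient releases; older clients use `--live`):

```python
# Sketch: build the CLI commands that drain a compute node before a
# system upgrade -- disable the nova-compute service so the scheduler
# stops placing new instances there, then live-migrate each server off.
# Actually running the commands and polling migration status is left
# to the caller.

def build_drain_commands(host, server_ids):
    """Return the commands (as argv lists) that drain `host`."""
    cmds = [
        # Step 1: take the host out of scheduling.
        ["openstack", "compute", "service", "set",
         "--disable", "--disable-reason", "maintenance",
         host, "nova-compute"],
    ]
    for server_id in server_ids:
        # Step 2: live migration keeps each guest running; "evacuate"
        # would instead rebuild it on another host, which is disruptive.
        cmds.append(["openstack", "server", "migrate",
                     "--live-migration", server_id])
    return cmds

if __name__ == "__main__":
    for cmd in build_drain_commands("compute-01", ["vm-a", "vm-b"]):
        print(" ".join(cmd))
```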
I wanted to ask because "evacuate" as a server operation is something else entirely (it's rebuild on another host which is definitely disruptive to the workload on that server). http://www.danplanet.com/blog/2016/03/03/evacuate-in-nova-one-command-to-confuse-us-all/ -- Thanks, Matt From bitskrieg at bitskrieg.net Mon Jul 30 15:53:00 2018 From: bitskrieg at bitskrieg.net (Chris Apsey) Date: Mon, 30 Jul 2018 11:53:00 -0400 Subject: [Openstack-operators] dashboard show only project after upgrading In-Reply-To: References: Message-ID: <164ebe48760.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net> Ignazio, Are your horizon instances in separate containers/VMS? If so, I'd highly recommend completely wiping them and rebuilding from scratch since horizon itself is stateless. I am not a fan of upgrades for reasons like this. If that's not possible, a purge of the horizon packages on your controller and a reinstallation should fix it. Chris On July 30, 2018 11:38:03 Ignazio Cassano wrote: > Sorry, I sent a wrong image. > The correct screenshot is attached here. > Regards > > 2018-07-30 17:33 GMT+02:00 Ignazio Cassano : > Hello everyone, > I upgraded openstack centos 7 from ocata to pike ad command line work fine > but dashboard does not show any menu on the left . > I missed the following menus: > > Project > Admin > Identity > > You can find the image attached here. > > Could anyone help me ? > Regards > Ignazio > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ignaziocassano at gmail.com Mon Jul 30 16:14:14 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 30 Jul 2018 18:14:14 +0200 Subject: [Openstack-operators] dashboard show only project after upgrading In-Reply-To: <164ebe48760.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net> References: <164ebe48760.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net> Message-ID: Hello Chris, I am not using containers yet. I will try to purge it. Many thanks. Ignazio Il Lun 30 Lug 2018 17:53 Chris Apsey ha scritto: > Ignazio, > > Are your horizon instances in separate containers/VMS? If so, I'd highly > recommend completely wiping them and rebuilding from scratch since horizon > itself is stateless. I am not a fan of upgrades for reasons like this. > > If that's not possible, a purge of the horizon packages on your controller > and a reinstallation should fix it. > > Chris > > On July 30, 2018 11:38:03 Ignazio Cassano > wrote: > >> Sorry, I sent a wrong image. >> The correct screenshot is attached here. >> Regards >> >> 2018-07-30 17:33 GMT+02:00 Ignazio Cassano : >> >>> Hello everyone, >>> I upgraded openstack centos 7 from ocata to pike ad command line work >>> fine >>> but dashboard does not show any menu on the left . >>> I missed the following menus: >>> >>> Project >>> Admin >>> Identity >>> >>> You can find the image attached here. >>> >>> Could anyone help me ? >>> Regards >>> Ignazio >>> >> >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From clay.gerrard at gmail.com Mon Jul 30 16:28:09 2018 From: clay.gerrard at gmail.com (Clay Gerrard) Date: Mon, 30 Jul 2018 11:28:09 -0500 Subject: [Openstack-operators] swift question In-Reply-To: References: Message-ID: Sure! 
python swiftclient's upload command has a --changed option: https://docs.openstack.org/python-swiftclient/latest/cli/index.html#swift-upload But you might be happier with something more sophisticated like rclone: https://rclone.org/ Nice thing about object storage is you can access it from anywhere via HTTP and PUT anything you want in there ;) -Clay On Mon, Jul 30, 2018 at 9:54 AM Alfredo De Luca wrote: > Hi all. > I wonder if i can sync a directory on a server to the obj store (swift). > What I do now is just a backup but I d like to implement a sort of file > rotate locally and on the obj store. > Any idea? > > > -- > *Alfredo* > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.urdin at crystone.com Mon Jul 30 17:08:50 2018 From: tobias.urdin at crystone.com (Tobias Urdin) Date: Mon, 30 Jul 2018 17:08:50 +0000 Subject: [Openstack-operators] [neutron] [neutron-dynamic-routing] bgp-dragent not sending BGP UPDATE messages Message-ID: Hello, I'm trying to get the neutron-bgp-dragent that is delivered by the neutron-dynamic-routing project to work. I've gotten it to open a BGP peer session without any issues but the no BGP UPDATE messages seems to be sent from the neutron-bgp-dragent daemon. I'm having a BGP peer with a machine running FreeBSD 11 with OpenBGPD, my goals is being able to announce IPv6 over IPv4 peers which should work but I'm unsure if python-ryu supports this. 
[root at controller ~]# openstack bgp speaker show bgp-speaker-ipv6 +-----------------------------------+-------------------------------------------+ | Field | Value | +-----------------------------------+-------------------------------------------+ | advertise_floating_ip_host_routes | False | | advertise_tenant_networks | True | | id | d22b30f2-50fe-49eb-9577-77cceb3fcc81 | | ip_version | 6 | | local_as | 64600 | | name | bgp-speaker-ipv6 | | networks | [u'fdcead67-8a12-42fe-a31d-8cb3a03d8ee0'] | | peers | [u'b42d808f-c2ef-41e7-93b5-859a51cf6a36'] | | project_id | 050c556faa5944a8953126c867313770 | | tenant_id | 050c556faa5944a8953126c867313770 | +-----------------------------------+-------------------------------------------+ [root at controller ~]# openstack bgp peer show b42d808f-c2ef-41e7-93b5-859a51cf6a36 +------------+--------------------------------------+ | Field | Value | +------------+--------------------------------------+ | auth_type | none | | id | b42d808f-c2ef-41e7-93b5-859a51cf6a36 | | name | bgp-peer-1 | | peer_ip | 172.20.x.y | | project_id | 050c556faa5944a8953126c867313770 | | remote_as | xxxx | | tenant_id | 050c556faa5944a8953126c867313770 | +------------+--------------------------------------+ [root at controller ~]# openstack bgp speaker list advertised routes bgp-speaker-ipv6 +----+--------------------+--------------+ | ID | Destination | Nexthop | +----+--------------------+--------------+ | | xxxx:xxxx:0:1::/64 | xxxx:xxxx::f | +----+--------------------+--------------+ 2018-07-30 19:00:57.302 2143006 INFO neutron_dynamic_routing.services.bgp.agent.driver.ryu.driver [-] Initializing Ryu driver for BGP Speaker functionality. 
2018-07-30 19:00:57.302 2143006 INFO neutron_dynamic_routing.services.bgp.agent.driver.ryu.driver [-] Initialized Ryu BGP Speaker driver interface with bgp_router_id=172.20.zz.yy 2018-07-30 19:00:57.351 2143006 INFO neutron_dynamic_routing.services.bgp.agent.bgp_dragent [-] BGP dynamic routing agent started 2018-07-30 19:00:57.450 2143006 INFO bgpspeaker.api.base [req-f15418e8-731b-4ebe-82a9-e2933e8df8b7 - - - - -] API method core.start called with args: {'router_id': '172.20.zz.yy', 'label_range': (100, 100000), 'waiter': , 'bgp_server_port': 0, 'local_as': 64600, 'allow_local_as_in_count': 0, 'refresh_stalepath_time': 0, 'cluster_id': None, 'local_pref': 100, 'refresh_max_eor_time': 0} 2018-07-30 19:00:57.455 2143006 INFO neutron_dynamic_routing.services.bgp.agent.driver.ryu.driver [req-f15418e8-731b-4ebe-82a9-e2933e8df8b7 - - - - -] Added BGP Speaker for local_as=64600 with router_id= 172.20.zz.yy. 2018-07-30 19:00:57.456 2143006 INFO bgpspeaker.api.base [req-f15418e8-731b-4ebe-82a9-e2933e8df8b7 - - - - -] API method neighbor.create called with args: {'connect_mode': 'active', 'cap_mbgp_evpn': False, 'remote_as': 35041, 'cap_mbgp_vpnv6': False, 'cap_mbgp_l2vpnfs': False, 'cap_four_octet_as_number': True, 'cap_mbgp_ipv6': False, 'is_next_hop_self': False, 'cap_mbgp_ipv4': True, 'cap_mbgp_ipv4fs': False, 'is_route_reflector_client': False, 'cap_mbgp_ipv6fs': False, 'is_route_server_client': False, 'cap_enhanced_refresh': False, 'peer_next_hop': None, 'password': None, 'ip_address': u'172.20.x.y', 'cap_mbgp_vpnv4fs': False, 'cap_mbgp_vpnv4': False, 'cap_mbgp_vpnv6fs': False} 2018-07-30 19:00:57.456 2143006 INFO neutron_dynamic_routing.services.bgp.agent.driver.ryu.driver [req-f15418e8-731b-4ebe-82a9-e2933e8df8b7 - - - - -] Added BGP Peer 172.20.x.y for remote_as=xxxx to BGP Speaker running for local_as=64600. 
2018-07-30 19:00:57.457 2143006 INFO bgpspeaker.api.base [req-f15418e8-731b-4ebe-82a9-e2933e8df8b7 - - - - -] API method network.add called with args: {'prefix': u'xxxx:xxxx:0:1::/64', 'next_hop': u'2a05:4545::f'} 2018-07-30 19:00:57.457 2143006 INFO neutron_dynamic_routing.services.bgp.agent.driver.ryu.driver [req-f15418e8-731b-4ebe-82a9-e2933e8df8b7 - - - - -] Route cidr=xxxx:xxxx:0:1::/64, nexthop=xxxx:xxxx::f is advertised for BGP Speaker running for local_as=64600. 2018-07-30 19:00:58.460 2143006 INFO bgpspeaker.peer [-] Connection to peer: 172.20.zz.yy established 2018-07-30 19:00:58.460 2143006 INFO neutron_dynamic_routing.services.bgp.agent.driver.ryu.driver [-] BGP Peer my.peer.id for remote_as=xxxx is UP. On the router side the peer is up but there is no BGP UPDATE messages so I don't get any prefixes. root at router:~ # bgpctl show sum Neighbor AS MsgRcvd MsgSent OutQ Up/Down State/PrfRcvd controllername 64600 488 491 0 00:02:20 0 root at dr20-1-sto1:~ # bgpctl show neighbor 172.20.104.192 BGP neighbor is 172.20.zz.yy, remote AS 64600 Description: controllername BGP version 4, remote router-id 172.20.zz.yy BGP state = Established, up for 00:03:03 Last read 00:00:01, holdtime 40s, keepalive interval 13s Neighbor capabilities: Multiprotocol extensions: IPv4 unicast Route Refresh 4-byte AS numbers Message statistics: Sent Received Opens 6 6 Notifications 0 0 Updates 0 0 Keepalives 489 486 Route Refresh 0 0 Total 495 492 Update statistics: Sent Received Updates 0 0 Withdraws 0 0 End-of-Rib 0 0 I'm wondering if this might be something related to the neighbor capabilities that is announces, see the output below and from the neutron-bgp-dragent log we can see this capabilities: 2018-07-30 19:00:57.456 2143006 INFO bgpspeaker.api.base [req-f15418e8-731b-4ebe-82a9-e2933e8df8b7 - - - - -] API method neighbor.create called with args: {'connect_mode': 'active', 'cap_mbgp_evpn': False, 'remote_as': 35041, 'cap_mbgp_vpnv6': False, 'cap_mbgp_l2vpnfs': False, 
'cap_four_octet_as_number': True, 'cap_mbgp_ipv6': False, 'is_next_hop_self': False, 'cap_mbgp_ipv4': True, 'cap_mbgp_ipv4fs': False, 'is_route_reflector_client': False, 'cap_mbgp_ipv6fs': False, 'is_route_server_client': False, 'cap_enhanced_refresh': False, 'peer_next_hop': None, 'password': None, 'ip_address': u'172.20.x.y', 'cap_mbgp_vpnv4fs': False, 'cap_mbgp_vpnv4': False, 'cap_mbgp_vpnv6fs': False} Here is an example on how the bgpd.conf looks like: group "peering AS64600" { remote-as 64600 softreconfig in yes transparent-as yes neighbor 172.20.zz.yy { announce none announce IPv6 unicast descr "controller" local-address 172.20.x.x depend on vlan10 } If I interpret the IPv6 section in this document correctly https://docs.openstack.org/mitaka/networking-guide/config-bgp-dynamic-routing.html it should work. Anybody have any ideas or know if it's supported? Appreciate any help or pointers. Best regards From tobias.urdin at crystone.com Mon Jul 30 19:16:44 2018 From: tobias.urdin at crystone.com (Tobias Urdin) Date: Mon, 30 Jul 2018 19:16:44 +0000 Subject: [Openstack-operators] [neutron] [neutron-dynamic-routing] bgp-dragent not sending BGP UPDATE messages In-Reply-To: References: Message-ID: <1532978205744.26838@crystone.com> So the real question is pretty much if Ryu supports MP-BGP and it does however it seems that neutron-dynamic-routing is disabling IPv6 if the peer IP is a IPv4 address :( So the link below answers my own question [1] [1] https://github.com/openstack/neutron-dynamic-routing/blob/98d3cf24d6d7b5eca55ca19eb19bdd2e7b1975ec/neutron_dynamic_routing/services/bgp/agent/driver/ryu/driver.py#L131 ________________________________________ From: Tobias Urdin Sent: Monday, July 30, 2018 7:08 PM To: openstack-operators at lists.openstack.org Subject: [Openstack-operators] [neutron] [neutron-dynamic-routing] bgp-dragent not sending BGP UPDATE messages _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From mgagne at calavera.ca Mon Jul 30 21:40:16 2018 From: mgagne at calavera.ca (=?UTF-8?Q?Mathieu_Gagn=C3=A9?=) Date: Mon, 30 Jul 2018 17:40:16 -0400 Subject: [Openstack-operators] dashboard show only project after upgrading In-Reply-To: References: Message-ID: Try enabling DEBUG in local_settings.py. Some dashboard or panel might fail loading for some reasons. I had a similar behavior last week and enabling DEBUG should show the error. -- Mathieu On Mon, Jul 30, 2018 at 11:35 AM, Ignazio Cassano wrote: > Sorry, I sent a wrong image. > The correct screenshot is attached here. > Regards > > 2018-07-30 17:33 GMT+02:00 Ignazio Cassano : >> >> Hello everyone, >> I upgraded openstack centos 7 from ocata to pike ad command line work fine >> but dashboard does not show any menu on the left . 
>> I missed the following menus: >> >> Project >> Admin >> Identity >> >> You can find the image attached here. >> >> Could anyone help me ? >> Regards >> Ignazio > > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > From iain.macdonnell at oracle.com Tue Jul 31 00:09:56 2018 From: iain.macdonnell at oracle.com (iain MacDonnell) Date: Mon, 30 Jul 2018 17:09:56 -0700 Subject: [Openstack-operators] neutron-server memcached connections In-Reply-To: <9598665a-8748-9fa8-147d-e618db3f7b94@oracle.com> References: <9598665a-8748-9fa8-147d-e618db3f7b94@oracle.com> Message-ID: <59bb939e-7aa0-6de6-4f2a-61fd2f4650ae@oracle.com> Following up on my own question, in case it's useful to others.... Turns out that keystonemiddleware uses eventlet, and, by default, creates a connection to memcached from each green thread (and doesn't clean them up), and the green threads are essentially unlimited. There is a solution for this, which implements a shared connection pool. It's enabled via the keystone_authtoken.memcache_use_advanced_pool config option. Unfortunately it was broken in a few different ways (I guess this means that no one is using it?) I've worked with the keystone devs, and we were able to get a fix (in keystonemiddleware) in just in time for the Rocky release. Related fixes have also been backported to Queens (for the next update), and a couple needed for Pike are pending completion. With this in place, so-far I have not seen more than one connection to memcached for each neutron-api worker process, and everything seems to be working well. 
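For reference, a minimal sketch of what enabling the pooled client looks like in a service's config file (e.g. neutron.conf). The memcached address is a placeholder; the option names are the standard keystonemiddleware `[keystone_authtoken]` ones:

```ini
[keystone_authtoken]
# Servers the auth_token middleware caches validated tokens in
# (placeholder address).
memcached_servers = 192.0.2.10:11211
# Use the shared connection pool instead of one connection per
# green thread.
memcache_use_advanced_pool = true
# Cap on pooled connections per worker; 10 is the documented default,
# shown here explicitly.
memcache_pool_maxsize = 10
```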
Some relevant changes: master: https://review.openstack.org/#/c/583695/ Queens: https://review.openstack.org/#/c/583698/ https://review.openstack.org/#/c/583684/ Pike: https://review.openstack.org/#/c/583699/ https://review.openstack.org/#/c/583835/ I do wonder how others are managing memcached connections for larger deployments... ~iain On 06/26/2018 12:59 PM, iain MacDonnell wrote: > > In diagnosing a situation where a Pike deployment was intermittently > slower (in general), I discovered that it was (sometimes) exceeding > memcached's maximum connection limit, which is set to 4096. > > Looking closer, ~2750 of the connections are from 8 neutron-server > process. neutron-server is configured with 8 API workers, and those 8 > processes have a combined total of ~2750 connections to memcached: > > # lsof -i TCP:11211 | awk '/^neutron-s/ {print $2}' | sort | uniq -c >     245 2611 >     306 2612 >     228 2613 >     406 2614 >     407 2615 >     385 2616 >     369 2617 >     398 2618 > # > > > There doesn't seem to be much turnover - comparing samples of the > connections (incl. source port) 15 mins apart, two were dropped, and one > new one added. > > In neutron.conf, keystone_authtoken.memcached_servers is configured, but > nothing else pertaining to caching, so > keystone_authtoken.memcache_pool_maxsize should default to 10. > > Am I misunderstanding something, or shouldn't I see a maximum of 10 > connections from each of the neutron-server API workers, with this > configuration? > > Any known issues, or pointers to what I'm missing? > > TIA, > >     ~iain From ignaziocassano at gmail.com Tue Jul 31 05:04:06 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 31 Jul 2018 07:04:06 +0200 Subject: [Openstack-operators] dashboard show only project after upgrading In-Reply-To: References: Message-ID: Ok, I will check. thanks Il Lun 30 Lug 2018 23:40 Mathieu Gagné ha scritto: > Try enabling DEBUG in local_settings.py. 
Some dashboard or panel might > fail loading for some reasons. > I had a similar behavior last week and enabling DEBUG should show the > error. > > -- > Mathieu > > > On Mon, Jul 30, 2018 at 11:35 AM, Ignazio Cassano > wrote: > > Sorry, I sent a wrong image. > > The correct screenshot is attached here. > > Regards > > > > 2018-07-30 17:33 GMT+02:00 Ignazio Cassano : > >> > >> Hello everyone, > >> I upgraded openstack centos 7 from ocata to pike ad command line work > fine > >> but dashboard does not show any menu on the left . > >> I missed the following menus: > >> > >> Project > >> Admin > >> Identity > >> > >> You can find the image attached here. > >> > >> Could anyone help me ? > >> Regards > >> Ignazio > > > > > > > > _______________________________________________ > > OpenStack-operators mailing list > > OpenStack-operators at lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sorrison at gmail.com Tue Jul 31 06:56:56 2018 From: sorrison at gmail.com (Sam Morrison) Date: Tue, 31 Jul 2018 16:56:56 +1000 Subject: [Openstack-operators] neutron-server memcached connections In-Reply-To: <59bb939e-7aa0-6de6-4f2a-61fd2f4650ae@oracle.com> References: <9598665a-8748-9fa8-147d-e618db3f7b94@oracle.com> <59bb939e-7aa0-6de6-4f2a-61fd2f4650ae@oracle.com> Message-ID: <23985AB0-B635-4311-BACF-0194D2306501@gmail.com> Great, yeah we have also seen these issues with nova-api with keystonemiddleware in newton and ocata. Thanks for the heads up as I was going to start digging deeper. Cheers, Sam > On 31 Jul 2018, at 10:09 am, iain MacDonnell wrote: > > > Following up on my own question, in case it's useful to others.... > > Turns out that keystonemiddleware uses eventlet, and, by default, creates a connection to memcached from each green thread (and doesn't clean them up), and the green threads are essentially unlimited. 
> > There is a solution for this, which implements a shared connection pool. It's enabled via the keystone_authtoken.memcache_use_advanced_pool config option. > > Unfortunately it was broken in a few different ways (I guess this means that no one is using it?) > > I've worked with the keystone devs, and we were able to get a fix (in keystonemiddleware) in just in time for the Rocky release. Related fixes have also been backported to Queens (for the next update), and a couple needed for Pike are pending completion. > > With this in place, so-far I have not seen more than one connection to memcached for each neutron-api worker process, and everything seems to be working well. > > Some relevant changes: > > master: > > https://review.openstack.org/#/c/583695/ > > > Queens: > > https://review.openstack.org/#/c/583698/ > https://review.openstack.org/#/c/583684/ > > > Pike: > > https://review.openstack.org/#/c/583699/ > https://review.openstack.org/#/c/583835/ > > > I do wonder how others are managing memcached connections for larger deployments... > > ~iain > > > > On 06/26/2018 12:59 PM, iain MacDonnell wrote: >> In diagnosing a situation where a Pike deployment was intermittently slower (in general), I discovered that it was (sometimes) exceeding memcached's maximum connection limit, which is set to 4096. >> Looking closer, ~2750 of the connections are from 8 neutron-server process. neutron-server is configured with 8 API workers, and those 8 processes have a combined total of ~2750 connections to memcached: >> # lsof -i TCP:11211 | awk '/^neutron-s/ {print $2}' | sort | uniq -c >> 245 2611 >> 306 2612 >> 228 2613 >> 406 2614 >> 407 2615 >> 385 2616 >> 369 2617 >> 398 2618 >> # >> There doesn't seem to be much turnover - comparing samples of the connections (incl. source port) 15 mins apart, two were dropped, and one new one added. 
>> In neutron.conf, keystone_authtoken.memcached_servers is configured, but nothing else pertaining to caching, so keystone_authtoken.memcache_pool_maxsize should default to 10. >> Am I misunderstanding something, or shouldn't I see a maximum of 10 connections from each of the neutron-server API workers, with this configuration? >> Any known issues, or pointers to what I'm missing? >> TIA, >> ~iain > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From alfredo.deluca at gmail.com Tue Jul 31 08:41:37 2018 From: alfredo.deluca at gmail.com (Alfredo De Luca) Date: Tue, 31 Jul 2018 10:41:37 +0200 Subject: [Openstack-operators] swift question In-Reply-To: References: Message-ID: Thanks Clay. I've been using --changed already but it doesn't sync the content of the folder remotely. I'll have a look at rclone as you suggested. Thanks On Mon, Jul 30, 2018 at 6:28 PM Clay Gerrard wrote: > Sure! python swiftclient's upload command has a --changed option: > > > https://docs.openstack.org/python-swiftclient/latest/cli/index.html#swift-upload > > But you might be happier with something more sophisticated like rclone: > > https://rclone.org/ > > Nice thing about object storage is you can access it from anywhere via > HTTP and PUT anything you want in there ;) > > -Clay > > On Mon, Jul 30, 2018 at 9:54 AM Alfredo De Luca > wrote: > >> Hi all. >> I wonder if i can sync a directory on a server to the obj store (swift). >> What I do now is just a backup but I d like to implement a sort of file >> rotate locally and on the obj store. >> Any idea? 
>> >> >> -- >> *Alfredo* >> >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> > -- *Alfredo* -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.therond at gmail.com Tue Jul 31 08:59:56 2018 From: gael.therond at gmail.com (Flint WALRUS) Date: Tue, 31 Jul 2018 10:59:56 +0200 Subject: [Openstack-operators] [OCTAVIA][KOLLA] - Amphora to control plan communication question. Message-ID: Hi Folks, I'm currently deploying the Octavia component into our testing environment which is based on KOLLA. So far I'm quite enjoying it as it is pretty much straightforward (except for some documentation pitfalls), but I'm now facing a weird and hard to debug situation. I actually have a hard time understanding how Amphora are communicating back and forth with the Control Plane components. 
URL: From ignaziocassano at gmail.com Tue Jul 31 12:23:04 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 31 Jul 2018 14:23:04 +0200 Subject: [Openstack-operators] manila-ui does not work after upgrading from ocata to pike Message-ID: Hi everyone, I upgraded my CentOS 7 OpenStack from ocata to pike. The OpenStack dashboard works fine only if I remove the openstack-manila-ui package. With the manila ui it gives me an internal server error. In httpd error log I read: KeyError: From johnsomor at gmail.com Tue Jul 31 16:14:53 2018 From: johnsomor at gmail.com (Michael Johnson) Date: Tue, 31 Jul 2018 09:14:53 -0700 Subject: [Openstack-operators] [OCTAVIA][KOLLA] - Amphora to control plan communication question. In-Reply-To: References: Message-ID: Hi Flint, We don't have a logical network diagram at this time (it's still on the to-do list), but I can talk you through it. The Octavia worker, health manager, and housekeeping need to be able to reach the amphora (service VM at this point) over the lb-mgmt-net on TCP 9443. It knows the amphora IP addresses on the lb-mgmt-net via the database and the information we save from the compute driver (i.e. what IP was assigned to the instance). The Octavia API process does not need to be connected to the lb-mgmt-net at this time. It only connects to the messaging bus and the Octavia database. Provider drivers may have other connectivity requirements for the Octavia API. The amphorae also send UDP packets back to the health manager on port 5555. This is the heartbeat packet from the amphora. It contains the health and statistics from that amphora. It knows its list of health manager endpoints from the configuration file "controller_ip_port_list" (https://docs.openstack.org/octavia/latest/configuration/configref.html#health_manager.controller_ip_port_list). Each amphora will rotate through that list of endpoints to reduce the chance of a network split impacting the heartbeat messages. 
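[Editor's note: as a concrete sketch, the health-manager side of this wiring in octavia.conf might look like the following (addresses are illustrative; the option names are from the Octavia configuration reference linked above):

```ini
[health_manager]
# UDP endpoints the amphorae rotate through when sending heartbeats
controller_ip_port_list = 192.0.2.10:5555,192.0.2.11:5555
# address/port this particular health manager listens on
bind_ip = 192.0.2.10
bind_port = 5555
```

Each health manager instance would use its own bind_ip, while the full controller_ip_port_list is what each amphora rotates through, as described above.]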
This is the only traffic that passes over this network. All of it is IP based and can be routed (it does not require L2 connectivity). Michael On Tue, Jul 31, 2018 at 2:00 AM Flint WALRUS wrote: > > Hi Folks, > > I'm currently deploying the Octavia component into our testing environment which is based on KOLLA. > > So far I'm quite enjoying it as it is pretty much straight forward (Except for some documentation pitfalls), but I'm now facing a weird and hard to debug situation. > > I actually have a hard time to understand how Amphora are communicating back and forth with the Control Plan components. > > From my understanding, as soon as I create a new LB, the Control Plan is spawning an instance using the configured Octavia Flavor and Image type, attach it to the LB-MGMT-NET and to the user provided subnet. > > What I think I'm misunderstanding is the discussion that follows between the amphora and the different components such as the HealthManager/HouseKeeper, the API and the Worker. > > How is the amphora agent able to found my control plan? Is the HealthManager or the Octavia Worker initiating the communication to the Amphora on port 9443 and so give the agent the API/Control plan internalURL? > > If anyone have a diagram of the workflow I would be more than happy ^^ > > Thanks a lot in advance to anyone willing to help :D > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From thierry at openstack.org Tue Jul 31 16:30:24 2018 From: thierry at openstack.org (Thierry Carrez) Date: Tue, 31 Jul 2018 18:30:24 +0200 Subject: [Openstack-operators] [ptg] Self-healing SIG meeting moved to Thursday morning Message-ID: <3ee9cf15-4587-7884-f8fd-b00ec22549fc@openstack.org> Hi! 
Quick heads-up: Following a request[1] from Adam Spiers (SIG lead), we modified the PTG schedule to move the Self-Healing SIG meeting from Friday (all day) to Thursday morning (only morning). You can see the resulting schedule at: https://www.openstack.org/ptg#tab_schedule Sorry for any inconvenience this may cause. [1] http://lists.openstack.org/pipermail/openstack-dev/2018-July/132392.html -- Thierry Carrez (ttx) From aspiers at suse.com Tue Jul 31 16:57:56 2018 From: aspiers at suse.com (Adam Spiers) Date: Tue, 31 Jul 2018 17:57:56 +0100 Subject: [Openstack-operators] [openstack-dev] [ptg] Self-healing SIG meeting moved to Thursday morning In-Reply-To: <3ee9cf15-4587-7884-f8fd-b00ec22549fc@openstack.org> References: <3ee9cf15-4587-7884-f8fd-b00ec22549fc@openstack.org> Message-ID: <20180731165755.dqxgittuzao2sdhu@pacific.linksys.moosehall> Thierry Carrez wrote: >Hi! Quick heads-up: > >Following a request[1] from Adam Spiers (SIG lead), we modified the >PTG schedule to move the Self-Healing SIG meeting from Friday (all >day) to Thursday morning (only morning). You can see the resulting >schedule at: > >https://www.openstack.org/ptg#tab_schedule > >Sorry for any inconvenience this may cause. It's me who should be apologising - Thierry only deserves thanks for accommodating my request at late notice ;-) From gael.therond at gmail.com Tue Jul 31 17:05:11 2018 From: gael.therond at gmail.com (Flint WALRUS) Date: Tue, 31 Jul 2018 19:05:11 +0200 Subject: [Openstack-operators] [OCTAVIA][KOLLA] - Amphora to control plan communication question. In-Reply-To: References: Message-ID: Hi Michael, thanks a lot for that explanation, it’s actually how I envisioned the flow. I’ll have to produce a diagram for my peers’ understanding; maybe I can share it with you. There is still one point that seems a little bit odd to me. How does the amphora agent know where to find the health manager and worker services? 
Is that because the worker is sending the agent some catalog information, or because we set that at diskimage-create time? If so, I think the CentOS-based amphora is missing the agent.conf, because currently my VMs don't have any. Once again thanks for your help! On Tue, Jul 31, 2018 at 18:15, Michael Johnson wrote: > Hi Flint, > > We don't have a logical network diagram at this time (it's still on > the to-do list), but I can talk you through it. > > The Octavia worker, health manager, and housekeeping need to be able > to reach the amphora (service VM at this point) over the lb-mgmt-net > on TCP 9443. It knows the amphora IP addresses on the lb-mgmt-net via > the database and the information we save from the compute driver (I.e. > what IP was assigned to the instance). > > The Octavia API process does not need to be connected to the > lb-mgmt-net at this time. It only connects the the messaging bus and > the Octavia database. Provider drivers may have other connectivity > requirements for the Octavia API. > > The amphorae also send UDP packets back to the health manager on port > 5555. This is the heartbeat packet from the amphora. It contains the > health and statistics from that amphora. It know it's list of health > manager endpoints from the configuration file > "controller_ip_port_list" > ( > https://docs.openstack.org/octavia/latest/configuration/configref.html#health_manager.controller_ip_port_list > ). > Each amphora will rotate through that list of endpoints to reduce the > chance of a network split impacting the heartbeat messages. > > This is the only traffic that passed over this network. All of it is > IP based and can be routed (it does not require L2 connectivity). > > Michael > > On Tue, Jul 31, 2018 at 2:00 AM Flint WALRUS > wrote: > > > > Hi Folks, > > > > I'm currently deploying the Octavia component into our testing > environment which is based on KOLLA. 
> > > > So far I'm quite enjoying it as it is pretty much straight forward > (Except for some documentation pitfalls), but I'm now facing a weird and > hard to debug situation. > > > > I actually have a hard time to understand how Amphora are communicating > back and forth with the Control Plan components. > > > > From my understanding, as soon as I create a new LB, the Control Plan is > spawning an instance using the configured Octavia Flavor and Image type, > attach it to the LB-MGMT-NET and to the user provided subnet. > > > > What I think I'm misunderstanding is the discussion that follows between > the amphora and the different components such as the > HealthManager/HouseKeeper, the API and the Worker. > > > > How is the amphora agent able to found my control plan? Is the > HealthManager or the Octavia Worker initiating the communication to the > Amphora on port 9443 and so give the agent the API/Control plan internalURL? > > > > If anyone have a diagram of the workflow I would be more than happy ^^ > > > > Thanks a lot in advance to anyone willing to help :D > > > > _______________________________________________ > > OpenStack-operators mailing list > > OpenStack-operators at lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tpb at dyncloud.net Tue Jul 31 17:12:17 2018 From: tpb at dyncloud.net (Tom Barron) Date: Tue, 31 Jul 2018 13:12:17 -0400 Subject: [Openstack-operators] manila-ui does not work after upgrading from ocata to pike In-Reply-To: References: Message-ID: <20180731171217.t47enqskb425zu3b@barron.net> On 31/07/18 14:23 +0200, Ignazio Cassano wrote: >Hi everyone, >I upgraded my centos 7 openstack from ocata to pike. >Openstack dashboard works fine only if I remove openstack manila ui package. >With the manila ui it gives me internal server error. 
> >In httpd error log I read: > > > > KeyError: > >Please, anyone has solved this issue yet ? > >Regards >Ignazio >_______________________________________________ >OpenStack-operators mailing list >OpenStack-operators at lists.openstack.org >http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators Seems likely to be a packaging issue. Might be the issue that this (queens) patch [1] addressed. To get someone to work on the pike issue please file a BZ against RDO [2]. -- Tom Barron (tbarron) [1] https://review.rdoproject.org/r/#/c/14049/ [2] https://bugzilla.redhat.com/enter_bug.cgi?product=RDO&component=openstack-manila-ui From ignaziocassano at gmail.com Tue Jul 31 19:19:25 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 31 Jul 2018 21:19:25 +0200 Subject: [Openstack-operators] manila-ui does not work after upgrading from ocata to pike In-Reply-To: <20180731171217.t47enqskb425zu3b@barron.net> References: <20180731171217.t47enqskb425zu3b@barron.net> Message-ID: Hello Tom, I will upgrade from Pike to Queens asap. I upgraded from Ocata to Pike to go step by step, but if it is useful for the community I can open a bug. What do you think? Thanks Ignazio On Tue, Jul 31, 2018 at 19:12, Tom Barron wrote: > On 31/07/18 14:23 +0200, Ignazio Cassano wrote: > >Hi everyone, > >I upgraded my centos 7 openstack from ocata to pike. > >Openstack dashboard works fine only if I remove openstack manila ui > package. > >With the manila ui it gives me internal server error. > > > >In httpd error log I read: > > > > > > KeyError: > > > > >Please, anyone has solved this issue yet ? > > > >Regards > >Ignazio > > >_______________________________________________ > >OpenStack-operators mailing list > >OpenStack-operators at lists.openstack.org > >http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > Seems likely to be a packaging issue. Might be the issue that this > (queens) patch [1] addressed. 
To get someone to work on the pike > issue please file a BZ against RDO [2]. > > -- Tom Barron (tbarron) > > [1] https://review.rdoproject.org/r/#/c/14049/ > > [2] > https://bugzilla.redhat.com/enter_bug.cgi?product=RDO&component=openstack-manila-ui > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tpb at dyncloud.net Tue Jul 31 19:25:14 2018 From: tpb at dyncloud.net (Tom Barron) Date: Tue, 31 Jul 2018 15:25:14 -0400 Subject: [Openstack-operators] manila-ui does not work after upgrading from ocata to pike In-Reply-To: References: <20180731171217.t47enqskb425zu3b@barron.net> Message-ID: <20180731192514.6whouw4h5af7arj2@barron.net> On 31/07/18 21:19 +0200, Ignazio Cassano wrote: >Hello Tom, I wiil upgrade from pike to Queens asap. >I upgraded from ocata to pike to go on step by step, but it is useful dir >the community I can open a bug. >What do you think ? >Thanks >Ignazio Opening a bug would help others. manila-ui has been a bit of a neglected step-child so I'm glad you are checking it out! -- Tom > > >Il Mar 31 Lug 2018 19:12 Tom Barron ha scritto: > >> On 31/07/18 14:23 +0200, Ignazio Cassano wrote: >> >Hi everyone, >> >I upgraded my centos 7 openstack from ocata to pike. >> >Openstack dashboard works fine only if I remove openstack manila ui >> package. >> >With the manila ui it gives me internal server error. >> > >> >In httpd error log I read: >> > >> > >> > KeyError: > > >> > >> >Please, anyone has solved this issue yet ? >> > >> >Regards >> >Ignazio >> >> >_______________________________________________ >> >OpenStack-operators mailing list >> >OpenStack-operators at lists.openstack.org >> >http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> >> Seems likely to be a packaging issue. Might be the issue that this >> (queens) patch [1] addressed. To get someone to work on the pike >> issue please file a BZ against RDO [2]. 
>> >> -- Tom Barron (tbarron) >> >> [1] https://review.rdoproject.org/r/#/c/14049/ >> >> [2] >> https://bugzilla.redhat.com/enter_bug.cgi?product=RDO&component=openstack-manila-ui >> >> From ignaziocassano at gmail.com Tue Jul 31 19:51:04 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 31 Jul 2018 21:51:04 +0200 Subject: [Openstack-operators] dashboard show only project after upgrading In-Reply-To: <164ebe48760.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net> References: <164ebe48760.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net> Message-ID: Hello, purging and reinstalling the dashboard solved it. Thanks Ignazio On Mon, Jul 30, 2018 at 17:53, Chris Apsey wrote: > Ignazio, > > Are your horizon instances in separate containers/VMs? If so, I'd highly > recommend completely wiping them and rebuilding from scratch since horizon > itself is stateless. I am not a fan of upgrades for reasons like this. > > If that's not possible, a purge of the horizon packages on your controller > and a reinstallation should fix it. > > Chris > > On July 30, 2018 11:38:03 Ignazio Cassano > wrote: > >> Sorry, I sent a wrong image. >> The correct screenshot is attached here. >> Regards >> >> 2018-07-30 17:33 GMT+02:00 Ignazio Cassano : >> >>> Hello everyone, >>> I upgraded openstack centos 7 from ocata to pike ad command line work >>> fine >>> but dashboard does not show any menu on the left . >>> I missed the following menus: >>> >>> Project >>> Admin >>> Identity >>> >>> You can find the image attached here. >>> >>> Could anyone help me ? >>> Regards >>> Ignazio >>> >> >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
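[Editor's note: a sketch of the purge-and-reinstall approach Chris describes, assuming an RDO/CentOS 7 controller. The package, path, and service names here are assumptions, and the script only echoes each step as a dry run; drop the wrapper to run it for real:

```shell
#!/bin/sh
# Dry-run sketch of purging and reinstalling horizon on an RDO controller.
# Assumed names: openstack-dashboard package, httpd service.
run() {
    # Print the command instead of executing it.
    echo "+ $*"
}

run yum -y remove openstack-dashboard
run rm -rf /usr/share/openstack-dashboard   # wipe leftovers, incl. stale static assets
run yum -y install openstack-dashboard
run systemctl restart httpd
```

Chris's stronger recommendation, rebuilding the container/VM from scratch, avoids this entirely, since horizon itself is stateless.]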