From natsume.takashi at lab.ntt.co.jp Tue May 1 01:02:48 2018 From: natsume.takashi at lab.ntt.co.jp (Takashi Natsume) Date: Tue, 1 May 2018 10:02:48 +0900 Subject: [Openstack-operators] Need feedback for nova aborting cold migration function Message-ID: Hi everyone, I'm going to add the aborting cold migration function [1] in nova. I would like to ask operators' feedback on this. The cold migration is an administrator operation by default. If administrators perform cold migration and it is stalled out, users cannot do their operations (e.g. starting the VM). In that case, if administrators can abort the cold migration by using this function, it enables users to operate their VMs. If you are a person like the following, would you reply to this mail? * Those who need this function * Those who will use this function if it is implemented * Those who think that it is better to have this function * Those who are interested in this function [1] https://review.openstack.org/#/c/334732/ Regards, Takashi Natsume NTT Software Innovation Center E-mail: natsume.takashi at lab.ntt.co.jp From dh3 at sanger.ac.uk Tue May 1 08:30:33 2018 From: dh3 at sanger.ac.uk (Dave Holland) Date: Tue, 1 May 2018 09:30:33 +0100 Subject: [Openstack-operators] [openstack-dev] [nova] Default scheduler filters survey In-Reply-To: References: Message-ID: <20180501083033.GF9259@sanger.ac.uk> On Mon, Apr 30, 2018 at 12:41:21PM -0400, Mathieu Gagné wrote: > Weighers for baremetal cells: > * ReservedHostForTenantWeigher [7] ... > [7] Used to favor reserved host over non-reserved ones based on project. Hello Mathieu, we are considering writing something like this, for virtual machines not for baremetal. Our use case is that a project buying some compute hardware is happy for others to use it, but when the compute "owner" wants sole use of it, other projects' instances must be migrated off or killed; a scheduler weigher like this might help us to minimise the number of instances needing migration or termination at that point. Would you be willing to share your source code please? thanks, Dave -- ** Dave Holland ** Systems Support -- Informatics Systems Group ** ** 01223 496923 ** Wellcome Sanger Institute, Hinxton, UK ** -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From Tim.Bell at cern.ch Tue May 1 13:10:56 2018 From: Tim.Bell at cern.ch (Tim Bell) Date: Tue, 1 May 2018 13:10:56 +0000 Subject: [Openstack-operators] [openstack-dev] [nova] Default scheduler filters survey In-Reply-To: <20180501083033.GF9259@sanger.ac.uk> References: <20180501083033.GF9259@sanger.ac.uk> Message-ID: You may also need something like pre-emptible instances to arrange the clean up of opportunistic VMs when the owner needs his resources back. Some details on the early implementation at http://openstack-in-production.blogspot.fr/2018/02/maximizing-resource-utilization-with.html. If you're in Vancouver, we'll be having a Forum session on this (https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21787/pre-emptible-instances-the-way-forward) and notes welcome on the etherpad (https://etherpad.openstack.org/p/YVR18-pre-emptible-instances) It would be good to find common implementations since this is a common scenario in the academic and research communities. 
Tim -----Original Message----- From: Dave Holland Date: Tuesday, 1 May 2018 at 10:40 To: Mathieu Gagné Cc: "OpenStack Development Mailing List (not for usage questions)" , openstack-operators Subject: Re: [Openstack-operators] [openstack-dev] [nova] Default scheduler filters survey On Mon, Apr 30, 2018 at 12:41:21PM -0400, Mathieu Gagné wrote: > Weighers for baremetal cells: > * ReservedHostForTenantWeigher [7] ... > [7] Used to favor reserved host over non-reserved ones based on project. Hello Mathieu, we are considering writing something like this, for virtual machines not for baremetal. Our use case is that a project buying some compute hardware is happy for others to use it, but when the compute "owner" wants sole use of it, other projects' instances must be migrated off or killed; a scheduler weigher like this might help us to minimise the number of instances needing migration or termination at that point. Would you be willing to share your source code please? thanks, Dave -- ** Dave Holland ** Systems Support -- Informatics Systems Group ** ** 01223 496923 ** Wellcome Sanger Institute, Hinxton, UK ** -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From emilien at redhat.com Tue May 1 14:03:55 2018 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 1 May 2018 07:03:55 -0700 Subject: [Openstack-operators] [openstack-dev] The Forum Schedule is now live In-Reply-To: References: <5AE34A02.8020802@openstack.org> <5AE73AA3.4030408@openstack.org> <5AE74CF2.9010804@openstack.org> Message-ID: On Mon, Apr 30, 2018 at 10:25 AM, Emilien Macchi wrote: > On Mon, Apr 30, 2018 at 10:05 AM, Jimmy McArthur > wrote: >> >> It looks like we have a spot held for you, but did not receive >> confirmation that TripleO would be moving forward with Project Update. If >> you all will be recording this, we have you down for Wednesday from 11:25 - >> 11:45am. Just let me know and I'll get it up on the schedule. >> > > This slot is perfect, and I'll run it with one of my tripleo co-workers > (Alex won't be here). > Jimmy, could you please confirm we have the TripleO Project Updates slot? I don't see it in the schedule. Thanks, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Tue May 1 14:18:16 2018 From: jimmy at openstack.org (Jimmy McArthur) Date: Tue, 01 May 2018 09:18:16 -0500 Subject: [Openstack-operators] [openstack-dev] The Forum Schedule is now live In-Reply-To: References: <5AE34A02.8020802@openstack.org> <5AE73AA3.4030408@openstack.org> <5AE74CF2.9010804@openstack.org> Message-ID: <5AE87728.1020804@openstack.org> Apologies for the delay, Emilien! I should be adding it today, but it's definitely yours. > Emilien Macchi > May 1, 2018 at 9:03 AM > > Jimmy, could you please confirm we have the TripleO Project Updates > slot? I don't see it in the schedule. 
> > Thanks, > -- > Emilien Macchi > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > Emilien Macchi > April 30, 2018 at 12:25 PM > > This slot is perfect, and I'll run it with one of my tripleo > co-workers (Alex won't be here). > > Thanks, > -- > Emilien Macchi > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > Jimmy McArthur > April 30, 2018 at 12:05 PM > Alex, > > It looks like we have a spot held for you, but did not receive > confirmation that TripleO would be moving forward with Project > Update. If you all will be recording this, we have you down for > Wednesday from 11:25 - 11:45am. Just let me know and I'll get it up > on the schedule. > > Thanks! > Jimmy > > > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > Alex Schultz > April 30, 2018 at 11:52 AM > On Mon, Apr 30, 2018 at 9:47 AM, Jimmy McArthur wrote: >> Project Updates are in their own track: >> https://www.openstack.org/summit/vancouver-2018/summit-schedule#track=223 >> > > TripleO is still missing? > > Thanks, > -Alex > >> As are SIG, BoF and Working Groups: >> https://www.openstack.org/summit/vancouver-2018/summit-schedule#track=218 >> >> Amy Marrich >> April 30, 2018 at 10:44 AM >> Emilien, >> >> I believe that the Project Updates are separate from the Forum? I know I saw >> some in the schedule before the Forum submittals were even closed. Maybe >> contact speaker support or Jimmy will answer here. >> >> Thanks, >> >> Amy (spotz) >> >> >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> Emilien Macchi >> April 30, 2018 at 10:33 AM >> >> >>> Hello all - >>> >>> Please take a look here for the posted Forum schedule: >>> https://www.openstack.org/summit/vancouver-2018/summit-schedule#track=224 >>> You should also see it update on your Summit App. >> Why TripleO doesn't have project update? >> Maybe we could combine it with TripleO - Project Onboarding if needed but it >> would be great to have it advertised as a project update! >> >> Thanks, >> -- >> Emilien Macchi >> __________________________________________________________________________ >> OpenStack Development Mailing List (not for usage questions) >> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >> Jimmy McArthur >> April 27, 2018 at 11:04 AM >> Hello all - >> >> Please take a look here for the posted Forum schedule: >> https://www.openstack.org/summit/vancouver-2018/summit-schedule#track=224 >> You should also see it update on your Summit App. >> >> Thank you and see you in Vancouver! 
>> Jimmy >> >> >> __________________________________________________________________________ >> OpenStack Development Mailing List (not for usage questions) >> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >> >> >> >> __________________________________________________________________________ >> OpenStack Development Mailing List (not for usage questions) >> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >> > > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > Jimmy McArthur > April 30, 2018 at 10:47 AM > Project Updates are in their own track: > https://www.openstack.org/summit/vancouver-2018/summit-schedule#track=223 > > As are SIG, BoF and Working Groups: > https://www.openstack.org/summit/vancouver-2018/summit-schedule#track=218 > > > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From mgagne at calavera.ca Tue May 1 14:35:13 2018 From: mgagne at calavera.ca (=?UTF-8?Q?Mathieu_Gagn=C3=A9?=) Date: Tue, 1 May 2018 10:35:13 -0400 Subject: [Openstack-operators] [openstack-dev] [nova] Default scheduler filters survey In-Reply-To: <20180501083033.GF9259@sanger.ac.uk> References: <20180501083033.GF9259@sanger.ac.uk> Message-ID: Hi Dave, On Tue, May 1, 2018 at 4:30 AM, Dave Holland wrote: > On Mon, Apr 30, 2018 at 12:41:21PM -0400, Mathieu Gagné wrote: >> Weighers for baremetal cells: >> * ReservedHostForTenantWeigher [7] > ... >> [7] Used to favor reserved host over non-reserved ones based on project. > > Hello Mathieu, > > we are considering writing something like this, for virtual machines not > for baremetal. Our use case is that a project buying some compute > hardware is happy for others to use it, but when the compute "owner" > wants sole use of it, other projects' instances must be migrated off or > killed; a scheduler weigher like this might help us to minimise the > number of instances needing migration or termination at that point. > Would you be willing to share your source code please? > I'm not sure how battle-tested this code is to be honest but here it is: https://gist.github.com/mgagne/659ca02e63779802de6f7aec8cda612a I had to merge 2 files in one (the weigher and the conf) so I'm not sure if it still works but I think you will get the idea. To use it, you need to define the "reserved_for_tenant_id" Ironic node property with the project ID to reserve it. (through Ironic API) This code also assumes you already filtered out hosts which are reserved for a different tenant. I included that code in the gist too. On a side note, our technicians generally use the forced host feature of Nova to target specific Ironic nodes: https://docs.openstack.org/nova/pike/admin/availability-zones.html But if the customer buys and reserves some machines, he should get them first before the ones in the "public pool". 
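For anyone who just wants the general shape of such a weigher without opening the gist, a minimal, untested sketch follows. It is not the actual code from the gist above: it assumes the standard BaseHostWeigher interface from nova.scheduler.weights, assumes the request spec passed to the weigher exposes project_id, and assumes the Ironic node's reserved_for_tenant_id property is surfaced in host_state.stats (adjust that lookup to wherever your host manager exposes it).

from nova.scheduler import weights


class ReservedHostForTenantWeigher(weights.BaseHostWeigher):
    """Favor hosts reserved for the requesting project.

    Returns 1.0 for a host reserved for the request's project and 0.0
    otherwise; the scheduler normalizes the values and applies the
    weigher's multiplier, so reserved hosts sort ahead of the public pool.
    """

    def _weigh_object(self, host_state, spec_obj):
        # Assumption: the Ironic node property ends up in host_state.stats.
        reserved_for = getattr(host_state, 'stats', {}).get(
            'reserved_for_tenant_id')
        if reserved_for and reserved_for == spec_obj.project_id:
            return 1.0
        return 0.0

As noted above, a companion filter is still needed to exclude hosts reserved for a different project before weighing; the weigher only decides ordering among the hosts that remain.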
-- Mathieu From mihalis68 at gmail.com Tue May 1 15:37:33 2018 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 1 May 2018 11:37:33 -0400 Subject: [Openstack-operators] ops meetups team : IRC meeting 2018-5-1 Message-ID: Lively meeting today on IRC. Minutes and log are here: Meeting ended Tue May 1 15:01:25 2018 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) 11:01 AM Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-05-01-14.00.html 11:01 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-05-01-14.00.txt 11:01 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-05-01-14.00.log.html We mostly focused on the upcoming PTG, see https://www.mail-archive.com/openstack-operators at lists.openstack.org/msg10021.html https://www.openstack.org/ptg As a reminder Early Bird: USD $199 (Deadline May 11 at 6:59 UTC) Regular: USD $399 (Deadline August 23 at 6:59 UTC) Late/Onsite: USD $599 Cheers Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimmy at openstack.org Tue May 1 20:20:49 2018 From: jimmy at openstack.org (Jimmy McArthur) Date: Tue, 01 May 2018 15:20:49 -0500 Subject: [Openstack-operators] OpenStack PTG Update Message-ID: <5AE8CC21.1070504@openstack.org> Hello Ops Folks - Wanted to reach out regarding some concerns that have been voiced around the pricing at the PTG. Part of the value of the event is allowing Ops and Devs to co-mingle, collaborate, and work together on solving problems with OpenStack. What we are betting on is the opportunity to show that Ops and Devs together, can make a better OpenStack. While this new price may be higher per attendee than previous ops meetups, attendees do receive a free ticket to the next two Summits. The result is an increased price for Ops Meetup/PTG, while lowering overall attendee costs for the PTG/Ops Meetup + Summits. Additionally, we are going to extend the deadline of the Early Bird offer to May 18, 6:59 UTC. After that time, the price will increase from USD $199 to USD $399. Please keep in mind that the OpenStack Foundation doesn’t profit on these events. Our goal is to provide the absolute best community experience/opportunity/value for the money. In short, we want and need you there! If you are concerned about cost and your organization will not fund your travel, you can apply for Travel Support . If your organization is interested in sponsoring the PTG or supporting attendees through Travel Support, please email ptg at openstack.org . I'm sure there will be plenty of questions. We are happy to host a video conference if it's something that would be of value to the Ops community. Thank you and we look forward to seeing you in Denver! Jimmy -------------- next part -------------- An HTML attachment was scrubbed... URL: From arvindn05 at gmail.com Tue May 1 22:26:58 2018 From: arvindn05 at gmail.com (Arvind N) Date: Tue, 1 May 2018 15:26:58 -0700 Subject: [Openstack-operators] [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild In-Reply-To: References: <221636a9-4b8f-1098-10b8-2240a7cb0ff7@gmail.com> <8eec45ab-f9ed-cd96-51a1-9be78849fb9b@gmail.com> <530903a4-701d-595e-acc3-05369697cf06@gmail.com> Message-ID: Reminder for Operators, Please provide feedback either way. 
In cases of rebuilding of an instance using a different image where the image traits have changed between the original launch and the rebuild, is it reasonable to ask to just re-launch a new instance with the new image? The argument for this approach is that given that the requirements have changed, we want the scheduler to pick and allocate the appropriate host for the instance. The approach above also gives you consistent results vs the other approaches where the rebuild may or may not succeed depending on how the original allocation of resources went. For example(from Alex Xu) ,if you launched an instance on a host which has two SRIOV nic. One is normal SRIOV nic(A), another one with some kind of offload feature(B). So, the original request is: resources=SRIOV_VF:1 The instance gets a VF from the normal SRIOV nic(A). But with a new image, the new request is: resources=SRIOV_VF:1 traits=HW_NIC_OFFLOAD_XX With all the solutions discussed in the thread, a rebuild request like above may or may not succeed depending on whether during the initial launch whether nic A or nic B was allocated. Remember that in rebuild new allocation don't happen, we have to reuse the existing allocations. Given the above background, there seems to be 2 competing options. 1. Fail in the API saying you can't rebuild with a new image with new required traits. 2. Look at the current allocations for the instance and try to match the new requirement from the image with the allocations. With #1, we get consistent results in regards to how rebuilds are treated when the image traits changed. With #2, the rebuild may or may not succeed, depending on how well the original allocations match up with the new requirements. #2 will also need to need to account for handling preferred traits or granular resource traits if we decide to implement them for images at some point... [1] https://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/glance-image-traits.html [2] https://review.openstack.org/#/c/560718/ On Tue, Apr 24, 2018 at 6:26 AM, Sylvain Bauza wrote: > Sorry folks for the late reply, I'll try to also weigh in the Gerrit > change. > > On Tue, Apr 24, 2018 at 2:55 PM, Jay Pipes wrote: > >> On 04/23/2018 05:51 PM, Arvind N wrote: >> >>> Thanks for the detailed options Matt/eric/jay. >>> >>> Just few of my thoughts, >>> >>> For #1, we can make the explanation very clear that we rejected the >>> request because the original traits specified in the original image and the >>> new traits specified in the new image do not match and hence rebuild is not >>> supported. >>> >> >> I believe I had suggested that on the spec amendment patch. Matt had >> concerns about an error message being a poor user experience (I don't >> necessarily disagree with that) and I had suggested a clearer error message >> to try and make that user experience slightly less sucky. >> >> For #3, >>> >>> Even though it handles the nested provider, there is a potential issue. >>> >>> Lets say a host with two SRIOV nic. One is normal SRIOV nic(VF1), >>> another one with some kind of offload feature(VF2).(Described by alex) >>> >>> Initial instance launch happens with VF:1 allocated, rebuild launches >>> with modified request with traits=HW_NIC_OFFLOAD_X, so basically we want >>> the instance to be allocated VF2. >>> >>> But the original allocation happens against VF1 and since in rebuild the >>> original allocations are not changed, we have wrong allocations. >>> >> >> Yep, that is certainly an issue. 
The only solution to this that I can see >> would be to have the conductor ask the compute node to do the pre-flight >> check. The compute node already has the entire tree of providers, their >> inventories and traits, along with information about providers that share >> resources with the compute node. It has this information in the >> ProviderTree object in the reportclient that is contained in the compute >> node resource tracker. >> >> The pre-flight check, if run on the compute node, would be able to grab >> the allocation records for the instance and determine if the required >> traits for the new image are present on the actual resource providers >> allocated against for the instance (and not including any child providers >> not allocated against). >> >> > Yup, that. We also have pre-flight checks for move operations like live > and cold migrations, and I'd really like to keep all the conditionals in > the conductor, because it knows better than the scheduler which operation > is asked. > I'm not really happy with adding more in the scheduler about "yeah, it's a > rebuild, so please do something exceptional", and I'm also not happy with > having a filter (that can be disabled) calling the Placement API. > > >> Or... we chalk this up as a "too bad" situation and just either go with >> option #1 or simply don't care about it. > > > Also, that too. Maybe just provide an error should be enough, nope? > Operators, what do you think ? (cross-calling openstack-operators@) > > -Sylvain > > >> >> Best, >> -jay >> >> ____________________________________________________________ >> ______________ >> OpenStack Development Mailing List (not for usage questions) >> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscrib >> e >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >> > > > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > -- Arvind N -------------- next part -------------- An HTML attachment was scrubbed... URL: From emilien at redhat.com Wed May 2 05:05:13 2018 From: emilien at redhat.com (Emilien Macchi) Date: Tue, 1 May 2018 22:05:13 -0700 Subject: [Openstack-operators] [openstack-dev] The Forum Schedule is now live In-Reply-To: <5AE87728.1020804@openstack.org> References: <5AE34A02.8020802@openstack.org> <5AE73AA3.4030408@openstack.org> <5AE74CF2.9010804@openstack.org> <5AE87728.1020804@openstack.org> Message-ID: On Tue, May 1, 2018 at 7:18 AM, Jimmy McArthur wrote: > Apologies for the delay, Emilien! I should be adding it today, but it's > definitely yours. > Could we change the title of the slot and actually be a TripleO Project Update session? It would have been great to have the onboarding session but I guess we also have 2 other sessions where we'll have occasions to meet: TripleO Ops and User feedback and TripleO and Ansible integration If it's possible to still have an onboarding session, awesome otherwise it's ok I think we'll deal with it. Thanks, -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gael.therond at gmail.com Wed May 2 08:52:24 2018 From: gael.therond at gmail.com (Flint WALRUS) Date: Wed, 02 May 2018 08:52:24 +0000 Subject: [Openstack-operators] Need feedback for nova aborting cold migration function In-Reply-To: References: Message-ID: As an operator dealing with platforms that do cold migration I would like to be able to abort and rollback the process. That would give us a better service quality and availability. We do have no choices but to use cold migration on some of our remote sites as they don’t get a unified storage such as CEPH for cost management. Those remote sites have to growth and gain traction before being budgeted for a truly powerful distributed storage backend. Due to such limitations I would love to be able to reduce the time our customers are impacted by such move while doing maintenance or any other jobs requiring us to do a migration. Thanks for the hard work on this topic! Le mar. 1 mai 2018 à 03:03, Takashi Natsume a écrit : > Hi everyone, > > I'm going to add the aborting cold migration function [1] in nova. > I would like to ask operators' feedback on this. > > The cold migration is an administrator operation by default. > If administrators perform cold migration and it is stalled out, > users cannot do their operations (e.g. starting the VM). > > In that case, if administrators can abort the cold migration by using > this function, > it enables users to operate their VMs. > > If you are a person like the following, would you reply to this mail? > > * Those who need this function > * Those who will use this function if it is implemented > * Those who think that it is better to have this function > * Those who are interested in this function > > [1] https://review.openstack.org/#/c/334732/ > > Regards, > Takashi Natsume > NTT Software Innovation Center > E-mail: natsume.takashi at lab.ntt.co.jp > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dh3 at sanger.ac.uk Wed May 2 09:57:38 2018 From: dh3 at sanger.ac.uk (Dave Holland) Date: Wed, 2 May 2018 10:57:38 +0100 Subject: [Openstack-operators] [openstack-dev] [nova] Default scheduler filters survey In-Reply-To: References: <20180501083033.GF9259@sanger.ac.uk> Message-ID: <20180502095738.GM9259@sanger.ac.uk> Thanks Tim, pre-emptible instances are definitely of interest too. I'll be in Vancouver, hope to meet up at some point. And thanks Mathieu for sharing the code, if we build anything of wider interest I'll try to get it shared. Cheers, Dave -- ** Dave Holland ** Systems Support -- Informatics Systems Group ** ** 01223 496923 ** Wellcome Sanger Institute, Hinxton, UK ** On Tue, May 01, 2018 at 01:10:56PM +0000, Tim Bell wrote: > You may also need something like pre-emptible instances to arrange the clean up of opportunistic VMs when the owner needs his resources back. Some details on the early implementation at http://openstack-in-production.blogspot.fr/2018/02/maximizing-resource-utilization-with.html. 
> > If you're in Vancouver, we'll be having a Forum session on this (https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21787/pre-emptible-instances-the-way-forward) and notes welcome on the etherpad (https://etherpad.openstack.org/p/YVR18-pre-emptible-instances) > > It would be good to find common implementations since this is a common scenario in the academic and research communities. > > Tim > > -----Original Message----- > From: Dave Holland > Date: Tuesday, 1 May 2018 at 10:40 > To: Mathieu Gagné > Cc: "OpenStack Development Mailing List (not for usage questions)" , openstack-operators > Subject: Re: [Openstack-operators] [openstack-dev] [nova] Default scheduler filters survey > > On Mon, Apr 30, 2018 at 12:41:21PM -0400, Mathieu Gagné wrote: > > Weighers for baremetal cells: > > * ReservedHostForTenantWeigher [7] > ... > > [7] Used to favor reserved host over non-reserved ones based on project. > > Hello Mathieu, > > we are considering writing something like this, for virtual machines not > for baremetal. Our use case is that a project buying some compute > hardware is happy for others to use it, but when the compute "owner" > wants sole use of it, other projects' instances must be migrated off or > killed; a scheduler weigher like this might help us to minimise the > number of instances needing migration or termination at that point. > Would you be willing to share your source code please? > > thanks, > Dave > -- > ** Dave Holland ** Systems Support -- Informatics Systems Group ** > ** 01223 496923 ** Wellcome Sanger Institute, Hinxton, UK ** > > > -- > The Wellcome Sanger Institute is operated by Genome Research > Limited, a charity registered in England with number 1021457 and a > company registered in England with number 2742969, whose registered > office is 215 Euston Road, London, NW1 2BE. > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. From jimmy at openstack.org Wed May 2 12:19:24 2018 From: jimmy at openstack.org (Jimmy McArthur) Date: Wed, 02 May 2018 07:19:24 -0500 Subject: [Openstack-operators] [openstack-dev] The Forum Schedule is now live In-Reply-To: References: <5AE34A02.8020802@openstack.org> <5AE73AA3.4030408@openstack.org> <5AE74CF2.9010804@openstack.org> <5AE87728.1020804@openstack.org> Message-ID: <5AE9ACCC.4010200@openstack.org> Emilien Macchi wrote: > Could we change the title of the slot and actually be a TripleO > Project Update session? > It would have been great to have the onboarding session but I guess we > also have 2 other sessions where we'll have occasions to meet: > TripleO Ops and User feedback and TripleO and Ansible integration > > If it's possible to still have an onboarding session, awesome > otherwise it's ok I think we'll deal with it. No problem, we have both on the schedule. I moved the Project Update to 11-11:20 so you can have a few minutes before the Onboarding starts at 11:50. https://www.openstack.org/summit/vancouver-2018/summit-schedule/global-search?t=TripleO Let me know if I can assist further. Thanks! 
Jimmy From emilien at redhat.com Wed May 2 12:53:12 2018 From: emilien at redhat.com (Emilien Macchi) Date: Wed, 2 May 2018 05:53:12 -0700 Subject: [Openstack-operators] [openstack-dev] The Forum Schedule is now live In-Reply-To: <5AE9ACCC.4010200@openstack.org> References: <5AE34A02.8020802@openstack.org> <5AE73AA3.4030408@openstack.org> <5AE74CF2.9010804@openstack.org> <5AE87728.1020804@openstack.org> <5AE9ACCC.4010200@openstack.org> Message-ID: On Wed, May 2, 2018 at 5:19 AM, Jimmy McArthur wrote: > > No problem, we have both on the schedule. I moved the Project Update to > 11-11:20 so you can have a few minutes before the Onboarding starts at > 11:50. > > https://www.openstack.org/summit/vancouver-2018/summit-sched > ule/global-search?t=TripleO > > Let me know if I can assist further. > Everything looks excellent to me now. Thanks for your help! -- Emilien Macchi -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed May 2 14:07:02 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 May 2018 09:07:02 -0500 Subject: [Openstack-operators] [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild In-Reply-To: References: <221636a9-4b8f-1098-10b8-2240a7cb0ff7@gmail.com> <8eec45ab-f9ed-cd96-51a1-9be78849fb9b@gmail.com> <530903a4-701d-595e-acc3-05369697cf06@gmail.com> Message-ID: <30e8e58b-a2f0-df83-49ba-d4d7a9aeddf3@gmail.com> On 5/1/2018 5:26 PM, Arvind N wrote: > In cases of rebuilding of an instance using a different image where the > image traits have changed between the original launch and the rebuild, > is it reasonable to ask to just re-launch a new instance with the new image? > > The argument for this approach is that given that the requirements have > changed, we want the scheduler to pick and allocate the appropriate host > for the instance. We don't know if the requirements have changed with the new image until we check them. Here is another option: What if the API compares the original image required traits against the new image required traits, and if the new image has required traits which weren't in the original image, then (punt) fail in the API? Then you would at least have a chance to rebuild with a new image that has required traits as long as those required traits are less than or equal to the originally validated traits for the host on which the instance is currently running. > > The approach above also gives you consistent results vs the other > approaches where the rebuild may or may not succeed depending on how the > original allocation of resources went. > Consistently frustrating, I agree. :) Because as a user, I can rebuild with some images (that don't have required traits) and can't rebuild with other images (that do have required traits). I see no difference with this and being able to rebuild (with a new image) some instances (image-backed) and not others (volume-backed). Given that, I expect if we punt on this, someone will just come along asking for the support later. Could be a couple of years from now when everyone has moved on and it then becomes someone else's problem. > For example(from Alex Xu) ,if you launched an instance on a host which > has two SRIOV nic. One is normal SRIOV nic(A), another one with some > kind of offload feature(B). > > So, the original request is: resources=SRIOV_VF:1 The instance gets a VF > from the normal SRIOV nic(A). 
> > But with a new image, the new request is: resources=SRIOV_VF:1 > traits=HW_NIC_OFFLOAD_XX > > With all the solutions discussed in the thread, a rebuild request like > above may or may not succeed depending on whether during the initial > launch whether nic A or nic B was allocated. > > Remember that in rebuild new allocation don't happen, we have to reuse > the existing allocations. > > Given the above background, there seems to be 2 competing options. > > 1. Fail in the API saying you can't rebuild with a new image with new > required traits. > > 2. Look at the current allocations for the instance and try to match the > new requirement from the image with the allocations. > > With #1, we get consistent results in regards to how rebuilds are > treated when the image traits changed. > > With #2, the rebuild may or may not succeed, depending on how well the > original allocations match up with the new requirements. > > #2 will also need to need to account for handling preferred traits or > granular resource traits if we decide to implement them for images at > some point... Option 10: Don't support image-defined traits at all. I know that won't happen though. At this point I'm exhausted with this entire issue and conversation and will probably bow out and need someone else to step in with different perspective, like melwitt or dansmith. All of the solutions are bad in their own way, either because they add technical debt and poor user experience, or because they make rebuild more complicated and harder to maintain for the developers. -- Thanks, Matt From arvindn05 at gmail.com Wed May 2 16:16:23 2018 From: arvindn05 at gmail.com (Arvind N) Date: Wed, 2 May 2018 09:16:23 -0700 Subject: [Openstack-operators] [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild In-Reply-To: <30e8e58b-a2f0-df83-49ba-d4d7a9aeddf3@gmail.com> References: <221636a9-4b8f-1098-10b8-2240a7cb0ff7@gmail.com> <8eec45ab-f9ed-cd96-51a1-9be78849fb9b@gmail.com> <530903a4-701d-595e-acc3-05369697cf06@gmail.com> <30e8e58b-a2f0-df83-49ba-d4d7a9aeddf3@gmail.com> Message-ID: > What if the API compares the original image required traits against the new image required traits, and if the new image has required traits which weren't in the original image, then (punt) fail in the API? Then you would at least have a chance > to rebuild with a new image that has required traits as long as those required traits are less than or equal to the originally validated traits for the host on which the instance is currently running. This is what i was proposing with #1, sorry if it was unclear. Will make it more explicit. 1. Reject the rebuild request indicating that rebuilding with a new image with **different** required traits compared to the original request is not supported. If the new image has the same or reduced set of traits as the old image, then the request will be passed through to the conductor etc Pseudo code > if not set(new_image.traits_required).issubset( set(original_image.traits_required)) > raise exception On Wed, May 2, 2018 at 7:07 AM, Matt Riedemann wrote: > On 5/1/2018 5:26 PM, Arvind N wrote: > >> In cases of rebuilding of an instance using a different image where the >> image traits have changed between the original launch and the rebuild, is >> it reasonable to ask to just re-launch a new instance with the new image? 
>> >> The argument for this approach is that given that the requirements have >> changed, we want the scheduler to pick and allocate the appropriate host >> for the instance. >> > > We don't know if the requirements have changed with the new image until we > check them. > > Here is another option: > > What if the API compares the original image required traits against the > new image required traits, and if the new image has required traits which > weren't in the original image, then (punt) fail in the API? Then you would > at least have a chance to rebuild with a new image that has required traits > as long as those required traits are less than or equal to the originally > validated traits for the host on which the instance is currently running. > > >> The approach above also gives you consistent results vs the other >> approaches where the rebuild may or may not succeed depending on how the >> original allocation of resources went. >> >> > Consistently frustrating, I agree. :) Because as a user, I can rebuild > with some images (that don't have required traits) and can't rebuild with > other images (that do have required traits). > > I see no difference with this and being able to rebuild (with a new image) > some instances (image-backed) and not others (volume-backed). Given that, I > expect if we punt on this, someone will just come along asking for the > support later. Could be a couple of years from now when everyone has moved > on and it then becomes someone else's problem. > > For example(from Alex Xu) ,if you launched an instance on a host which has >> two SRIOV nic. One is normal SRIOV nic(A), another one with some kind of >> offload feature(B). >> >> So, the original request is: resources=SRIOV_VF:1 The instance gets a VF >> from the normal SRIOV nic(A). >> >> But with a new image, the new request is: resources=SRIOV_VF:1 >> traits=HW_NIC_OFFLOAD_XX >> >> With all the solutions discussed in the thread, a rebuild request like >> above may or may not succeed depending on whether during the initial launch >> whether nic A or nic B was allocated. >> >> Remember that in rebuild new allocation don't happen, we have to reuse >> the existing allocations. >> >> Given the above background, there seems to be 2 competing options. >> >> 1. Fail in the API saying you can't rebuild with a new image with new >> required traits. >> >> 2. Look at the current allocations for the instance and try to match the >> new requirement from the image with the allocations. >> >> With #1, we get consistent results in regards to how rebuilds are treated >> when the image traits changed. >> >> With #2, the rebuild may or may not succeed, depending on how well the >> original allocations match up with the new requirements. >> >> #2 will also need to need to account for handling preferred traits or >> granular resource traits if we decide to implement them for images at some >> point... >> > > Option 10: Don't support image-defined traits at all. I know that won't > happen though. > > At this point I'm exhausted with this entire issue and conversation and > will probably bow out and need someone else to step in with different > perspective, like melwitt or dansmith. > > All of the solutions are bad in their own way, either because they add > technical debt and poor user experience, or because they make rebuild more > complicated and harder to maintain for the developers. 
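To make the pseudo code above a bit more concrete, a rough, untested sketch of that API-side check could look like the following. It is only an illustration: it treats image metadata as plain dicts, assumes required traits are expressed as trait:<NAME>=required image properties per the glance-image-traits spec, and the exception type and message are placeholders rather than a final choice.

from nova import exception


def image_required_traits(image_meta):
    # Collect trait names from "trait:<NAME>" properties marked "required".
    props = image_meta.get('properties', {})
    return {key[len('trait:'):] for key, value in props.items()
            if key.startswith('trait:') and value == 'required'}


def check_rebuild_image_traits(original_image_meta, new_image_meta):
    new_traits = image_required_traits(new_image_meta)
    original_traits = image_required_traits(original_image_meta)
    if not new_traits.issubset(original_traits):
        # Reject in the API: the new image asks for traits that were never
        # validated against the host the instance is running on.
        raise exception.Invalid(
            'Rebuild with an image that adds required traits %s is not '
            'supported' % sorted(new_traits - original_traits))

The subset test keeps the behavior consistent: rebuilds with the same or a reduced set of required traits pass through unchanged, and only requests that add new requirements are refused up front.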
> > -- > > Thanks, > > Matt > > > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > -- Arvind N -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed May 2 16:25:25 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 May 2018 11:25:25 -0500 Subject: [Openstack-operators] [nova][ironic] ironic_host_manager and baremetal scheduler options removal Message-ID: The baremetal scheduling options were deprecated in Pike [1] and the ironic_host_manager was deprecated in Queens [2] and is now being removed [3]. Deployments must use resource classes now for baremetal scheduling. [4] The large host subset size value is also no longer needed. [5] I've gone through all of the references to "ironic_host_manager" that I could find in codesearch.o.o and updated projects accordingly [6]. Please reply ASAP to this thread and/or [3] if you have issues with this. [1] https://review.openstack.org/#/c/493052/ [2] https://review.openstack.org/#/c/521648/ [3] https://review.openstack.org/#/c/565805/ [4] https://docs.openstack.org/ironic/latest/install/configure-nova-flavors.html#scheduling-based-on-resource-classes [5] https://review.openstack.org/565736/ [6] https://review.openstack.org/#/q/topic:exact-filters+(status:open+OR+status:merged) -- Thanks, Matt From mgagne at calavera.ca Wed May 2 16:40:56 2018 From: mgagne at calavera.ca (=?UTF-8?Q?Mathieu_Gagn=C3=A9?=) Date: Wed, 2 May 2018 12:40:56 -0400 Subject: [Openstack-operators] [openstack-dev] [nova][ironic] ironic_host_manager and baremetal scheduler options removal In-Reply-To: References: Message-ID: What's the state of caching_scheduler which could still be using those configs? Mathieu On Wed, May 2, 2018 at 12:25 PM, Matt Riedemann wrote: > The baremetal scheduling options were deprecated in Pike [1] and the > ironic_host_manager was deprecated in Queens [2] and is now being removed > [3]. Deployments must use resource classes now for baremetal scheduling. [4] > > The large host subset size value is also no longer needed. [5] > > I've gone through all of the references to "ironic_host_manager" that I > could find in codesearch.o.o and updated projects accordingly [6]. > > Please reply ASAP to this thread and/or [3] if you have issues with this. > > [1] https://review.openstack.org/#/c/493052/ > [2] https://review.openstack.org/#/c/521648/ > [3] https://review.openstack.org/#/c/565805/ > [4] > https://docs.openstack.org/ironic/latest/install/configure-nova-flavors.html#scheduling-based-on-resource-classes > [5] https://review.openstack.org/565736/ > [6] > https://review.openstack.org/#/q/topic:exact-filters+(status:open+OR+status:merged) > From mriedemos at gmail.com Wed May 2 16:49:46 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 May 2018 11:49:46 -0500 Subject: [Openstack-operators] [openstack-dev] [nova][ironic] ironic_host_manager and baremetal scheduler options removal In-Reply-To: References: Message-ID: <96f7142b-8838-93f8-d8a7-46ff7010c394@gmail.com> On 5/2/2018 11:40 AM, Mathieu Gagné wrote: > What's the state of caching_scheduler which could still be using those configs? The CachingScheduler has been deprecated since Pike [1]. 
We discussed the CachingScheduler at the Rocky PTG in Dublin [2] and have a TODO to write a nova-manage data migration tool to create allocations in Placement for instances that were scheduled using the CachingScheduler (since Pike) which don't have their own resource allocations set in Placement (remember that starting in Pike the FilterScheduler started creating allocations in Placement rather than the ResourceTracker in nova-compute). If you're running computes that are Ocata or Newton, then the ResourceTracker in the nova-compute service should be creating the allocations in Placement for you, assuming you have the compute service configured to talk to Placement (optional in Newton, required in Ocata). [1] https://review.openstack.org/#/c/492210/ [2] https://etherpad.openstack.org/p/nova-ptg-rocky-placement -- Thanks, Matt From mgagne at calavera.ca Wed May 2 17:00:46 2018 From: mgagne at calavera.ca (=?UTF-8?Q?Mathieu_Gagn=C3=A9?=) Date: Wed, 2 May 2018 13:00:46 -0400 Subject: [Openstack-operators] [openstack-dev] [nova][ironic] ironic_host_manager and baremetal scheduler options removal In-Reply-To: <96f7142b-8838-93f8-d8a7-46ff7010c394@gmail.com> References: <96f7142b-8838-93f8-d8a7-46ff7010c394@gmail.com> Message-ID: On Wed, May 2, 2018 at 12:49 PM, Matt Riedemann wrote: > On 5/2/2018 11:40 AM, Mathieu Gagné wrote: >> >> What's the state of caching_scheduler which could still be using those >> configs? > > > The CachingScheduler has been deprecated since Pike [1]. We discussed the > CachingScheduler at the Rocky PTG in Dublin [2] and have a TODO to write a > nova-manage data migration tool to create allocations in Placement for > instances that were scheduled using the CachingScheduler (since Pike) which > don't have their own resource allocations set in Placement (remember that > starting in Pike the FilterScheduler started creating allocations in > Placement rather than the ResourceTracker in nova-compute). > > If you're running computes that are Ocata or Newton, then the > ResourceTracker in the nova-compute service should be creating the > allocations in Placement for you, assuming you have the compute service > configured to talk to Placement (optional in Newton, required in Ocata). > > [1] https://review.openstack.org/#/c/492210/ > [2] https://etherpad.openstack.org/p/nova-ptg-rocky-placement If one can still run CachingScheduler (even if it's deprecated), I think we shouldn't remove the above options. As you can end up with a broken setup and IIUC no way to migrate to placement since migration script has yet to be written. -- Mathieu From mriedemos at gmail.com Wed May 2 17:39:03 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 May 2018 12:39:03 -0500 Subject: [Openstack-operators] [openstack-dev] [nova][ironic] ironic_host_manager and baremetal scheduler options removal In-Reply-To: References: <96f7142b-8838-93f8-d8a7-46ff7010c394@gmail.com> Message-ID: <60821a79-42a4-dfa4-cc65-2fbc068f8b35@gmail.com> On 5/2/2018 12:00 PM, Mathieu Gagné wrote: > If one can still run CachingScheduler (even if it's deprecated), I > think we shouldn't remove the above options. > As you can end up with a broken setup and IIUC no way to migrate to > placement since migration script has yet to be written. You're currently on cells v1 on mitaka right? So you have some time to get this sorted out before getting to Rocky where the IronicHostManager is dropped. 
I know you're just one case, but I don't know how many people are really running the CachingScheduler with ironic either, so it might be rare. It would be nice to get other operator input here, like I'm guessing CERN has their cells carved up so that certain cells are only serving baremetal requests while other cells are only VMs? FWIW, I think we can also backport the data migration CLI to stable branches once we have it available so you can do your migration in let's say Queens before getting to Rocky. -- Thanks, Matt From mgagne at calavera.ca Wed May 2 17:48:06 2018 From: mgagne at calavera.ca (=?UTF-8?Q?Mathieu_Gagn=C3=A9?=) Date: Wed, 2 May 2018 13:48:06 -0400 Subject: [Openstack-operators] [openstack-dev] [nova][ironic] ironic_host_manager and baremetal scheduler options removal In-Reply-To: <60821a79-42a4-dfa4-cc65-2fbc068f8b35@gmail.com> References: <96f7142b-8838-93f8-d8a7-46ff7010c394@gmail.com> <60821a79-42a4-dfa4-cc65-2fbc068f8b35@gmail.com> Message-ID: On Wed, May 2, 2018 at 1:39 PM, Matt Riedemann wrote: > > I know you're just one case, but I don't know how many people are really > running the CachingScheduler with ironic either, so it might be rare. It > would be nice to get other operator input here, like I'm guessing CERN has > their cells carved up so that certain cells are only serving baremetal > requests while other cells are only VMs? I found FilterScheduler to be near impossible to use with Ironic due to the huge number of hypervisors it had to handle. Using CachingScheduler made a huge difference, like day and night. > FWIW, I think we can also backport the data migration CLI to stable branches > once we have it available so you can do your migration in let's say Queens > before getting to Rocky. -- Mathieu From mriedemos at gmail.com Wed May 2 22:45:37 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 May 2018 17:45:37 -0500 Subject: [Openstack-operators] [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild In-Reply-To: References: <221636a9-4b8f-1098-10b8-2240a7cb0ff7@gmail.com> <8eec45ab-f9ed-cd96-51a1-9be78849fb9b@gmail.com> <530903a4-701d-595e-acc3-05369697cf06@gmail.com> <30e8e58b-a2f0-df83-49ba-d4d7a9aeddf3@gmail.com> Message-ID: On 5/2/2018 5:39 PM, Jay Pipes wrote: > My personal preference is to add less technical debt and go with a > solution that checks if image traits have changed in nova-api and if so, > simply refuse to perform a rebuild. So, what if when I created my server, the image I used, let's say image1, had required trait A and that fit the host. Then some external service removes (or somehow changes) trait A from the compute node resource provider (because people can and will do this, there are a few vmware specs up that rely on being able to manage traits out of band from nova), and then I rebuild my server with image2 that has required trait A. That would match the original trait A in image1 and we'd say, "yup, lgtm!" and do the rebuild even though the compute node resource provider wouldn't have trait A anymore. Having said that, it could technically happen before traits if the operator changed something on the underlying compute host which invalidated instances running on that host, but I'd think if that happened the operator would be migrating everything off the host and disabling it from scheduling before making whatever that kind of change would be, let's say they change the hypervisor or something less drastic but still image property invalidating. 
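For what it's worth, a rough sketch of the "check the instance's existing allocations" approach (option #2 in Arvind's summary) against the placement REST API could look like the following. This is only an illustration: placement below is assumed to be a keystoneauth1 Adapter already pointed at the placement endpoint, microversion selection and error handling are omitted, and it only does a simple union check across the allocated providers, so it does not address granular or preferred traits.

def allocated_providers_satisfy_traits(placement, consumer_uuid,
                                       required_traits):
    # Fetch the instance's current allocations (a rebuild reuses these).
    allocations = placement.get(
        '/allocations/%s' % consumer_uuid).json().get('allocations', {})

    found = set()
    for rp_uuid in allocations:
        # Look only at providers actually allocated against, not the
        # whole provider tree.
        traits = placement.get(
            '/resource_providers/%s/traits' % rp_uuid).json().get(
            'traits', [])
        found |= set(traits)

    return set(required_traits) <= found

Something like this could be called from nova-api or conductor before casting the rebuild, rejecting the request when it returns False, which is also where the out-of-band trait removal scenario above would surface.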
-- Thanks, Matt From arvindn05 at gmail.com Wed May 2 23:06:03 2018 From: arvindn05 at gmail.com (Arvind N) Date: Wed, 2 May 2018 16:06:03 -0700 Subject: [Openstack-operators] [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild In-Reply-To: References: <221636a9-4b8f-1098-10b8-2240a7cb0ff7@gmail.com> <8eec45ab-f9ed-cd96-51a1-9be78849fb9b@gmail.com> <530903a4-701d-595e-acc3-05369697cf06@gmail.com> <30e8e58b-a2f0-df83-49ba-d4d7a9aeddf3@gmail.com> Message-ID: Isnt this an existing issue with traits specified in flavor as well? Server is created using flavor1 requiring trait A on RP1. Before the rebuild is called, the underlying RP1 can be updated to remove trait A and when a rebuild is requested(regardless of whether the image is updated or not), we skip scheduling and allow the rebuild to go through. Now, even though the flavor1 requests trait A, the underlying RP1 does not have that trait the rebuild will succeed... I think maybe there should be some kind of report or query which runs periodically to ensure continued conformance with respect to instance running and their traits. But since traits are intend to provide hints for scheduling, this is different problem to solve IMO. On Wed, May 2, 2018 at 3:45 PM, Matt Riedemann wrote: > On 5/2/2018 5:39 PM, Jay Pipes wrote: > >> My personal preference is to add less technical debt and go with a >> solution that checks if image traits have changed in nova-api and if so, >> simply refuse to perform a rebuild. >> > > So, what if when I created my server, the image I used, let's say image1, > had required trait A and that fit the host. > > Then some external service removes (or somehow changes) trait A from the > compute node resource provider (because people can and will do this, there > are a few vmware specs up that rely on being able to manage traits out of > band from nova), and then I rebuild my server with image2 that has required > trait A. That would match the original trait A in image1 and we'd say, > "yup, lgtm!" and do the rebuild even though the compute node resource > provider wouldn't have trait A anymore. > > Having said that, it could technically happen before traits if the > operator changed something on the underlying compute host which invalidated > instances running on that host, but I'd think if that happened the operator > would be migrating everything off the host and disabling it from scheduling > before making whatever that kind of change would be, let's say they change > the hypervisor or something less drastic but still image property > invalidating. > > -- > > Thanks, > > Matt > > > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > -- Arvind N -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mriedemos at gmail.com Thu May 3 00:47:01 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 2 May 2018 19:47:01 -0500 Subject: [Openstack-operators] [openstack-dev] [nova][ironic] ironic_host_manager and baremetal scheduler options removal In-Reply-To: <60821a79-42a4-dfa4-cc65-2fbc068f8b35@gmail.com> References: <96f7142b-8838-93f8-d8a7-46ff7010c394@gmail.com> <60821a79-42a4-dfa4-cc65-2fbc068f8b35@gmail.com> Message-ID: <356c7795-b31e-4de6-47c6-61949f8a3e95@gmail.com> On 5/2/2018 12:39 PM, Matt Riedemann wrote: > FWIW, I think we can also backport the data migration CLI to stable > branches once we have it available so you can do your migration in let's > say Queens before g FYI, here is the start on the data migration CLI: https://review.openstack.org/#/c/565886/ -- Thanks, Matt From mrhillsman at gmail.com Fri May 4 16:58:17 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Fri, 04 May 2018 16:58:17 +0000 Subject: [Openstack-operators] Reminder: UC Meeting Monday 1800UTC Message-ID: Hey everyone, Please see https://wiki.openstack.org/wiki/Governance/Foundation/UserCommittee for UC meeting info and add additional agenda items if needed. -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon May 7 07:11:59 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 7 May 2018 09:11:59 +0200 Subject: [Openstack-operators] octavia on ocata Message-ID: Hello everyone, I'd like to know if anynone has tried to installa octavia lbaas on ocata centos 7 release . If yes, does it work ? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From shake.chen at gmail.com Mon May 7 07:17:05 2018 From: shake.chen at gmail.com (Shake Chen) Date: Mon, 7 May 2018 15:17:05 +0800 Subject: [Openstack-operators] octavia on ocata In-Reply-To: References: Message-ID: in kolla, ocata, Ocatavia is work. On Mon, May 7, 2018 at 3:11 PM, Ignazio Cassano wrote: > Hello everyone, > I'd like to know if anynone has tried to installa octavia lbaas on ocata > centos 7 release . > If yes, does it work ? > > Regards > Ignazio > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > -- Shake Chen -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon May 7 07:18:26 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 7 May 2018 09:18:26 +0200 Subject: [Openstack-operators] octavia on ocata In-Reply-To: References: Message-ID: Many thanks 2018-05-07 9:17 GMT+02:00 Shake Chen : > in kolla, ocata, Ocatavia is work. > > On Mon, May 7, 2018 at 3:11 PM, Ignazio Cassano > wrote: > >> Hello everyone, >> I'd like to know if anynone has tried to installa octavia lbaas on ocata >> centos 7 release . >> If yes, does it work ? >> >> Regards >> Ignazio >> >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> >> > > > -- > Shake Chen > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rico.lin.guanyu at gmail.com Mon May 7 10:27:48 2018 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Mon, 7 May 2018 18:27:48 +0800 Subject: [Openstack-operators] [openstack-dev][heat][all] Heat now migrated to StoryBoard!! In-Reply-To: References: Message-ID: Hi all, I updated more information to this guideline in [1]. Please must take a view on [1] to see what's been updated. will likely to keep update on that etherpad if new Q&A or issue found. Will keep trying to make this process as painless for you as possible, so please endure with us for now, and sorry for any inconvenience *[1] https://etherpad.openstack.org/p/Heat-StoryBoard-Migration-Info * 2018-05-05 12:15 GMT+08:00 Rico Lin : > looping heat-dashboard team > > 2018-05-05 12:02 GMT+08:00 Rico Lin : > >> Dear all Heat members and friends >> >> As you might award, OpenStack projects are scheduled to migrating ([5]) >> from Launchpad to StoryBoard [1]. >> For whom who like to know where to file a bug/blueprint, here are some >> heads up for you. >> >> *What's StoryBoard?* >> StoryBoard is a cross-project task-tracker, contains numbers of >> ``project``, each project contains numbers of ``story`` which you can think >> it as an issue or blueprint. Within each story, contains one or multiple >> ``task`` (task separate stories into the tasks to resolve/implement). To >> learn more about StoryBoard or how to make a good story, you can reference >> [6]. >> >> *How to file a bug?* >> This is actually simple, use your current ubuntu-one id to access to >> storyboard. Then find the corresponding project in [2] and create a story >> to it with a description of your issue. We should try to create tasks which >> to reference with patches in Gerrit. >> >> *How to work on a spec (blueprint)?* >> File a story like you used to file a Blueprint. Create tasks for your >> plan. Also you might want to create a task for adding spec( in heat-spec >> repo) if your blueprint needs documents to explain. >> I still leave current blueprint page open, so if you like to create a >> story from BP, you can still get information. Right now we will start work >> as task-driven workflow, so BPs should act no big difference with a bug in >> StoryBoard (which is a story with many tasks). >> >> *Where should I put my story?* >> We migrate all heat sub-projects to StoryBoard to try to keep the impact >> to whatever you're doing as small as possible. However, if you plan to >> create a new story, *please create it under heat project [4]* and tag it >> with what it might affect with (like python-heatclint, heat-dashboard, >> heat-agents). We do hope to let users focus their stories in one place so >> all stories will get better attention and project maintainers don't need to >> go around separate places to find it. >> >> *How to connect from Gerrit to StoryBoard?* >> We usually use following key to reference Launchpad >> Closes-Bug: ####### >> Partial-Bug: ####### >> Related-Bug: ####### >> >> Now in StoryBoard, you can use following key. >> Task: ###### >> Story: ###### >> you can find more info in [3]. >> >> *What I need to do for my exists bug/bps?* >> Your bug is automatically migrated to StoryBoard, however, the reference >> in your patches ware not, so you need to change your commit message to >> replace the old link to launchpad to new links to StoryBoard. >> >> *Do we still need Launchpad after all this migration are done?* >> As the plan, we won't need Launchpad for heat anymore once we have done >> with migrating. 
Will forbid new bugs/bps filed in Launchpad. Also, try to >> provide new information as many as possible. Hopefully, we can make >> everyone happy. For those newly created bugs during/after migration, don't >> worry we will disallow further create new bugs/bps and do a second migrate >> so we won't missed yours. >> >> [1] https://storyboard.openstack.org/ >> [2] https://storyboard.openstack.org/#!/project_group/82 >> [3] https://docs.openstack.org/infra/manual/developers.html# >> development-workflow >> [4] https://storyboard.openstack.org/#!/project/989 >> [5] https://docs.openstack.org/infra/storyboard/migration.html >> [6] https://docs.openstack.org/infra/storyboard/gui/tasks_ >> stories_tags.html#what-is-a-story >> >> >> >> -- >> May The Force of OpenStack Be With You, >> >> *Rico Lin*irc: ricolin >> >> > > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From amy at demarco.com Mon May 7 20:45:25 2018 From: amy at demarco.com (Amy Marrich) Date: Mon, 7 May 2018 15:45:25 -0500 Subject: [Openstack-operators] OpenStack User Survey Message-ID: Hi everyone, If you’re running OpenStack, please participate in the User Survey to share more about your technology implementations and provide feedback for the community. Please help us spread the word. We're trying to gather as much real-world deployment data as possible to share back with both the operator and developer communities. We have made it easier to complete, and the survey is* now available in 7 languages*—English, German, Indonesian, Japanese, Korean, traditional Chinese and simplified Chinese. Based on feedback from the operator community, we are only conducting one survey this year, collecting submissions until early August. The report will then be published in October prior to the Berlin Summit If you would like OpenStack user data in the meantime, check out the analytics dashboard updates in real time, throughout the year. The information provided is confidential and will only be presented in aggregate unless you consent to make it public. The deadline to complete the survey and be part of the next report is *Friday, August 3 at 23:59 UTC.* - You can login and complete the OpenStack User Survey here: http://www.openstack.org/user-survey - If you’re interested in joining the OpenStack User Survey Working Group to help with the survey analysis, please complete this form: https://openstackfoundation.formstack.com/forms/user_survey_working_group - Help us promote the User Survey: https://twitter.com/OpenStack/status/ 993589356312088577 Please let me know if you have any questions. Cheers, Amy Amy Marrich (spotz) OpenStack User Committee -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian.zunker at codecentric.cloud Tue May 8 06:36:38 2018 From: christian.zunker at codecentric.cloud (Christian Zunker) Date: Tue, 08 May 2018 06:36:38 +0000 Subject: [Openstack-operators] How are you handling billing/chargeback? In-Reply-To: References: <20180312192113.znz4eavfze5zg7yn@redhat.com> Message-ID: Hi, we are running a cloud based on openstack-ansible and now are trying to integrate cloudkitty for billing. Till now we used a self written python script to query ceilometer for needed data, but that got more tedious than we are willing to handle. We hope it gets much easier once cloudkitty is set up. 
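For illustration only, the sort of ad-hoc query such a home-grown script usually wraps looks roughly like the sketch below. The meter name, dates and per-project loop are assumptions (and it presumes the ceilometer v2 API is still enabled and admin credentials are sourced), not a recommendation over CloudKitty:

#!/bin/bash
# rough sketch: hourly statistics for one meter, per project, over one month
START="2018-05-01T00:00:00"
END="2018-06-01T00:00:00"
for project in $(openstack project list -f value -c ID); do
    echo "project ${project}"
    ceilometer statistics -m cpu_util -p 3600 \
        -q "project_id=${project};timestamp>=${START};timestamp<${END}"
done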
regards Christian > From: Lars Kellogg-Stedman > Date: Mo., 12. März 2018 um 20:27 Uhr > Subject: [Openstack-operators] How are you handling billing/chargeback? > To: openstack-operators at lists.openstack.org < > openstack-operators at lists.openstack.org> > > > Hey folks, > > I'm curious what folks out there are using for chargeback/billing in > your OpenStack environment. > > Are you doing any sort of chargeback (or showback)? Are you using (or > have you tried) CloudKitty? Or some other existing project? Have you > rolled your own instead? > > I ask because I am helping out some folks get a handle on the > operational side of their existing OpenStack environment, and they are > interested in but have not yet deployed some sort of reporting > mechanism. > > Thanks, > > > -- > Lars Kellogg-Stedman | larsks @ {irc,twitter,github} > http://blog.oddbit.com/ | > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -- cc cloud GmbH | Hochstr. 11 | 42697 Solingen | Deutschland mobil: +49 175 1068513 www.codecentric.cloud | blog.codecentric.de | www.meettheexperts.de Sitz der Gesellschaft: Solingen | HRB 28640| Amtsgericht Wuppertal Geschäftsführung: Werner Krandick . Rainer Vehns Diese E-Mail einschließlich evtl. beigefügter Dateien enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und löschen Sie diese E-Mail und evtl. beigefügter Dateien umgehend. Das unerlaubte Kopieren, Nutzen oder Öffnen evtl. beigefügter Dateien sowie die unbefugte Weitergabe dieser E-Mail ist nicht gestattet. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mihalis68 at gmail.com Tue May 8 15:04:21 2018 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 8 May 2018 11:04:21 -0400 Subject: [Openstack-operators] ops meetups team meeting minutes 2018-5-8 Message-ID: Today's Ops Meetups Team meeting was chaired by Shintaro Mizuno. Minutes here: Minutes: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-05-08-14.17.html 10:58 AM Minutes (text): http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-05-08-14.17.txt 10:58 AM Log: http://eavesdrop.openstack.org/meetings/ops_meetup_team/2018/ops_meetup_team.2018-05-08-14.17.log.html Please watch out for further updates about the upcoming Vancouver ops sessions, and also please note that early-bird tickets for the PTG in september will now remain available until May 18th. Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From martialmichel at datamachines.io Tue May 8 21:59:24 2018 From: martialmichel at datamachines.io (Martial Michel) Date: Tue, 08 May 2018 21:59:24 +0000 Subject: [Openstack-operators] [Scientific] Scientific SIG - IRC meeting Wed 9 at 1100UTC Message-ID: Hello, We will have our IRC meeting in the #openstack-meeting channel at 1100 UTC May 9th. Final agenda will be at: *https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_May_8st_2018 * 1. SIG Cycle Report 1. https://etherpad.openstack.org/p/scientific-sig-report-queens 2. Call for Lighting Talks 1. https://etherpad.openstack.org/p/scientific-sig-vancouver2018-lighting-talks 3. AOB All are welcome. 
Looking forward to seeing you there -- Martial -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed May 9 09:09:10 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 9 May 2018 11:09:10 +0200 Subject: [Openstack-operators] ocata /usr/bin/octavia-diskimage-create.sh -i centos fails Message-ID: Hi all, I am trying to create an octavia amphora image on ocata using the package openstack-octavia-diskimage-create on centos 7 but it fails: diskimage-builder fails to create the disk image - cannot uninstall virtualenv. I read this is a bug and a workaround could be setting DIB_INSTALLTYPE_pip_and_virtualenv to "package". In this case the command reported: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-obMDl9-build/ Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From cdent+os at anticdent.org Wed May 9 12:42:02 2018 From: cdent+os at anticdent.org (Chris Dent) Date: Wed, 9 May 2018 13:42:02 +0100 (BST) Subject: [Openstack-operators] [nova] [placement] placement extraction session at forum Message-ID: I've started an etherpad related to the Vancouver Forum session on extracting placement from nova. It's mostly just an outline for now but is evolving: https://etherpad.openstack.org/p/YVR-placement-extraction If we can get some real information in there before the session we are much more likely to have a productive session. Please feel free to add any notes or questions you have there. Or on this thread if you prefer. The (potentially overly-optimistic) hope is that we can complete any preparatory work before the end of Rocky and then do the extraction in Stein. If we are willing to accept (please, let's) some form of control plane downtime, data migration issues can be vastly eased. Getting agreement on how that might work is one of the goals of the session. Your input is very much appreciated. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From cdent+os at anticdent.org Wed May 9 12:56:58 2018 From: cdent+os at anticdent.org (Chris Dent) Date: Wed, 9 May 2018 13:56:58 +0100 (BST) Subject: [Openstack-operators] [cinder] [placement] cinder + placement forum session etherpad Message-ID: I've started an etherpad for the forum session in Vancouver devoted to discussing the possibility of tracking and allocating resources in Cinder using the Placement service. This is not a done deal. Instead the session is to discuss if it could work and how to make it happen if it seems like a good idea. The etherpad is at https://etherpad.openstack.org/p/YVR-cinder-placement but there's not a great deal there yet. Notably there's no description of how scheduling and resource tracking currently works in Cinder because I have no experience with that. This session is mostly for exploring and sharing information so the value of the etherpad may mostly be in the notes we take at the session, but anything we write in advance will help keep things a bit more structured and focused. If this is a topic of interest for you please add some notes to the etherpad, or if you prefer, here. Thanks.
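To make the discussion a little more concrete, here is a very rough sketch of what modelling a Cinder backend pool in Placement could look like from the command line using the osc-placement plugin. The provider name and sizes are invented, a recent placement API microversion is assumed, and whether Cinder itself would drive these calls is exactly the kind of question for the session:

# create a resource provider for a backend pool and give it DISK_GB inventory
openstack resource provider create cinder-pool-1
openstack resource provider inventory set <pool_rp_uuid> --resource DISK_GB=10240
# a request for a 100G volume would then roughly map to a candidate query
openstack allocation candidate list --resource DISK_GB=100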
-- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent From jp.methot at planethoster.info Thu May 10 01:11:06 2018 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Thu, 10 May 2018 10:11:06 +0900 Subject: [Openstack-operators] New project creation fails because of a Nova check in a multi-region cloud Message-ID: Hi, I currently operate a multi-region cloud split between 2 geographic locations. I have updated it to Pike not too long ago, but I've been running into a peculiar issue. Ever since the Pike release, Nova now asks Keystone if a new project exists in Keystone before configuring the project’s quotas. However, there doesn’t seem to be any region restriction regarding which endpoint Nova will query Keystone on. So, right now, if I create a new project in region one, Nova will query Keystone in region two. Because my keystone databases are not synched in real time between each region, the region two Keystone will tell it that the new project doesn't exist, while it exists in region one Keystone. Thinking that this could be a configuration error, I tried setting the region_name in keystone_authtoken, but that didn’t change much of anything. Right now I am thinking this may be a bug. Could someone confirm that this is indeed a bug and not a configuration error? To circumvent this issue, I am considering either modifying the database by hand or trying to implement realtime replication between both Keystone databases. Would there be another solution? (beside modifying the code for the Nova check) Jean-Philippe Méthot Openstack system administrator Administrateur système Openstack PlanetHoster inc. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sagaray at nttdata.co.jp Thu May 10 02:33:16 2018 From: sagaray at nttdata.co.jp (sagaray at nttdata.co.jp) Date: Thu, 10 May 2018 02:33:16 +0000 Subject: [Openstack-operators] Need feedback for nova aborting cold migration function Message-ID: <1525919628734.2105@nttdata.co.jp> Hi Takashi, and guys, We are operating large telco enterprise cloud. We always do the maintenance work on midnight during limited time-slot to minimize impact to our users. Operation planning of cold migration is difficult because cold migration time will vary drastically as it also depends on the load on storage servers at that point of time. If cold migration task stalls for any unknown reasons, operators may decide to cancel it manually. This requires several manual steps to be carried out for recovering from such situation such as kill the copy process, reset-state, stop, and start the VM. If we have the ability to cancel cold migration, we can resume our service safely even though the migration is not complete in the stipulated maintenance time window. As of today, we can solve the above issue by following manual procedure to recover instances from cold migration failure but we still need to follow these steps every time. We can build our own tool to automate this process but we will need to maintain it by ourselves as this feature is not supported by any OpenStack distribution. If Nova supports function to cancel cold migration, it’s definitely going to help us to bring instances back from cold migration failure thus improving service availability to our end users. Secondly, we don’t need to worry about maintaining procedure manual or proprietary tool by ourselves which will be a huge win for us. 
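For readers who have not hit this situation, the manual recovery described above boils down to something like the following sketch. The copy-process pattern, the server UUID placeholder and the exact ordering are deployment-specific assumptions, not an official procedure:

# on the source compute node: stop the stalled disk copy (rsync/scp) for the instance
pkill -f "<stalled copy process for the instance>"
# then clear the stuck migration state and restart the VM
nova reset-state --active <server_uuid>
openstack server stop <server_uuid>
openstack server start <server_uuid>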
We are definitely interested in this function and we would love to see it in the next coming release. Thank you for your hard work. -------------------------------------------------- Yukinori Sagara Platform Engineering Department, NTT DATA Corp. > Hi everyone, > > I'm going to add the aborting cold migration function [1] in nova. > I would like to ask operators' feedback on this. > > The cold migration is an administrator operation by default. > If administrators perform cold migration and it is stalled out, > users cannot do their operations (e.g. starting the VM). > > In that case, if administrators can abort the cold migration by using > this function, > it enables users to operate their VMs. > > If you are a person like the following, would you reply to this mail? > > * Those who need this function > * Those who will use this function if it is implemented > * Those who think that it is better to have this function > * Those who are interested in this function > > [1] https://review.openstack.org/#/c/334732/ > > Regards, > Takashi Natsume > NTT Software Innovation Center > E-mail: natsume.takashi at lab.ntt.co.jp From ignaziocassano at gmail.com Thu May 10 09:58:02 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 10 May 2018 11:58:02 +0200 Subject: [Openstack-operators] octavia worker on ocata Message-ID: Hello everyone, I've just installed octavia on ocata . All octavia services are running except worker. It reports the following error in worker.log: 2018-05-10 11:33:27.404 121193 ERROR oslo_service.service InvalidTarget: A server's target must have topic and server names specified: 2018-05-10 11:33:27.404 121193 ERROR oslo_service.service Could anyone help me ? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Thu May 10 10:05:57 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 10 May 2018 12:05:57 +0200 Subject: [Openstack-operators] octavia worker on ocata In-Reply-To: References: Message-ID: I am sorry, I forgot to setup topic attribute in oslo_messaging section. Regards Ignazio 2018-05-10 11:58 GMT+02:00 Ignazio Cassano : > Hello everyone, > I've just installed octavia on ocata . > All octavia services are running except worker. > It reports the following error in worker.log: > > 2018-05-10 11:33:27.404 121193 ERROR oslo_service.service InvalidTarget: A > server's target must have topic and server names specified: server=podto2-octavia> > 2018-05-10 11:33:27.404 121193 ERROR oslo_service.service > > Could anyone help me ? > Regards > Ignazio > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Thu May 10 10:42:05 2018 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Thu, 10 May 2018 18:42:05 +0800 Subject: [Openstack-operators] [openstack-dev][heat][all] Heat now migrated to StoryBoard!! In-Reply-To: References: Message-ID: Hi all, As we keep adding more info to the migration guideline [1], you might like to take a look again. And do hope it will make things easier for you. If not, please find me in irc or mail. [1] https://etherpad.openstack.org/p/Heat-StoryBoard-Migration-Info Here's the quick hint for you, your bug id is exactly your story id. 2018-05-07 18:27 GMT+08:00 Rico Lin : > Hi all, > > I updated more information to this guideline in [1]. > Please must take a view on [1] to see what's been updated. > will likely to keep update on that etherpad if new Q&A or issue found. 
> > Will keep trying to make this process as painless for you as possible, > so please endure with us for now, and sorry for any inconvenience > > *[1] https://etherpad.openstack.org/p/Heat-StoryBoard-Migration-Info > * > > 2018-05-05 12:15 GMT+08:00 Rico Lin : > >> looping heat-dashboard team >> >> 2018-05-05 12:02 GMT+08:00 Rico Lin : >> >>> Dear all Heat members and friends >>> >>> As you might award, OpenStack projects are scheduled to migrating ([5]) >>> from Launchpad to StoryBoard [1]. >>> For whom who like to know where to file a bug/blueprint, here are some >>> heads up for you. >>> >>> *What's StoryBoard?* >>> StoryBoard is a cross-project task-tracker, contains numbers of >>> ``project``, each project contains numbers of ``story`` which you can think >>> it as an issue or blueprint. Within each story, contains one or multiple >>> ``task`` (task separate stories into the tasks to resolve/implement). To >>> learn more about StoryBoard or how to make a good story, you can reference >>> [6]. >>> >>> *How to file a bug?* >>> This is actually simple, use your current ubuntu-one id to access to >>> storyboard. Then find the corresponding project in [2] and create a story >>> to it with a description of your issue. We should try to create tasks which >>> to reference with patches in Gerrit. >>> >>> *How to work on a spec (blueprint)?* >>> File a story like you used to file a Blueprint. Create tasks for your >>> plan. Also you might want to create a task for adding spec( in heat-spec >>> repo) if your blueprint needs documents to explain. >>> I still leave current blueprint page open, so if you like to create a >>> story from BP, you can still get information. Right now we will start work >>> as task-driven workflow, so BPs should act no big difference with a bug in >>> StoryBoard (which is a story with many tasks). >>> >>> *Where should I put my story?* >>> We migrate all heat sub-projects to StoryBoard to try to keep the impact >>> to whatever you're doing as small as possible. However, if you plan to >>> create a new story, *please create it under heat project [4]* and tag >>> it with what it might affect with (like python-heatclint, heat-dashboard, >>> heat-agents). We do hope to let users focus their stories in one place so >>> all stories will get better attention and project maintainers don't need to >>> go around separate places to find it. >>> >>> *How to connect from Gerrit to StoryBoard?* >>> We usually use following key to reference Launchpad >>> Closes-Bug: ####### >>> Partial-Bug: ####### >>> Related-Bug: ####### >>> >>> Now in StoryBoard, you can use following key. >>> Task: ###### >>> Story: ###### >>> you can find more info in [3]. >>> >>> *What I need to do for my exists bug/bps?* >>> Your bug is automatically migrated to StoryBoard, however, the reference >>> in your patches ware not, so you need to change your commit message to >>> replace the old link to launchpad to new links to StoryBoard. >>> >>> *Do we still need Launchpad after all this migration are done?* >>> As the plan, we won't need Launchpad for heat anymore once we have done >>> with migrating. Will forbid new bugs/bps filed in Launchpad. Also, try to >>> provide new information as many as possible. Hopefully, we can make >>> everyone happy. For those newly created bugs during/after migration, don't >>> worry we will disallow further create new bugs/bps and do a second migrate >>> so we won't missed yours. 
>>> >>> [1] https://storyboard.openstack.org/ >>> [2] https://storyboard.openstack.org/#!/project_group/82 >>> [3] https://docs.openstack.org/infra/manual/developers.html# >>> development-workflow >>> [4] https://storyboard.openstack.org/#!/project/989 >>> [5] https://docs.openstack.org/infra/storyboard/migration.html >>> [6] https://docs.openstack.org/infra/storyboard/gui/tasks_st >>> ories_tags.html#what-is-a-story >>> >>> >>> >>> -- >>> May The Force of OpenStack Be With You, >>> >>> *Rico Lin*irc: ricolin >>> >>> >> >> >> -- >> May The Force of OpenStack Be With You, >> >> *Rico Lin*irc: ricolin >> >> > > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From natsume.takashi at lab.ntt.co.jp Thu May 10 10:42:14 2018 From: natsume.takashi at lab.ntt.co.jp (Takashi Natsume) Date: Thu, 10 May 2018 19:42:14 +0900 Subject: [Openstack-operators] Need feedback for nova aborting cold migration function In-Reply-To: <1525919628734.2105@nttdata.co.jp> References: <1525919628734.2105@nttdata.co.jp> Message-ID: <5fa86256-c601-91cf-570e-04b63a688b47@lab.ntt.co.jp> Flint and Yukinori, Thank you for your replies! On 2018/05/10 11:33, sagaray at nttdata.co.jp wrote: > Hi Takashi, and guys, > > We are operating large telco enterprise cloud. > > We always do the maintenance work on midnight during limited time-slot to minimize impact to our users. > > Operation planning of cold migration is difficult because cold migration time will vary drastically as it also depends on the load on storage servers at that point of time. If cold migration task stalls for any unknown reasons, operators may decide to cancel it manually. This requires several manual steps to be carried out for recovering from such situation such as kill the copy process, reset-state, stop, and start the VM. If we have the ability to cancel cold migration, we can resume our service safely even though the migration is not complete in the stipulated maintenance time window. > > As of today, we can solve the above issue by following manual procedure to recover instances from cold migration failure but we still need to follow these steps every time. We can build our own tool to automate this process but we will need to maintain it by ourselves as this feature is not supported by any OpenStack distribution. > > If Nova supports function to cancel cold migration, it’s definitely going to help us to bring instances back from cold migration failure thus improving service availability to our end users. Secondly, we don’t need to worry about maintaining procedure manual or proprietary tool by ourselves which will be a huge win for us. > > We are definitely interested in this function and we would love to see it in the next coming release. > > Thank you for your hard work. > > -------------------------------------------------- > Yukinori Sagara > Platform Engineering Department, NTT DATA Corp. > >> Hi everyone, >> >> I'm going to add the aborting cold migration function [1] in nova. >> I would like to ask operators' feedback on this. >> >> The cold migration is an administrator operation by default. >> If administrators perform cold migration and it is stalled out, >> users cannot do their operations (e.g. starting the VM). >> >> In that case, if administrators can abort the cold migration by using >> this function, >> it enables users to operate their VMs. 
>> >> If you are a person like the following, would you reply to this mail? >> >> * Those who need this function >> * Those who will use this function if it is implemented >> * Those who think that it is better to have this function >> * Those who are interested in this function >> >> [1] https://review.openstack.org/#/c/334732/ >> >> Regards, >> Takashi Natsume >> NTT Software Innovation Center >> E-mail: natsume.takashi at lab.ntt.co.jp Regards, Takashi Natsume NTT Software Innovation Center E-mail: natsume.takashi at lab.ntt.co.jp From mriedemos at gmail.com Thu May 10 13:52:00 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 10 May 2018 08:52:00 -0500 Subject: [Openstack-operators] New project creation fails because of a Nova check in a multi-region cloud In-Reply-To: References: Message-ID: <82bad7a4-0f03-fe6d-7179-7f50b42f3502@gmail.com> On 5/9/2018 8:11 PM, Jean-Philippe Méthot wrote: > I currently operate a multi-region cloud split between 2 geographic > locations. I have updated it to Pike not too long ago, but I've been > running into a peculiar issue. Ever since the Pike release, Nova now > asks Keystone if a new project exists in Keystone before configuring the > project’s quotas. However, there doesn’t seem to be any region > restriction regarding which endpoint Nova will query Keystone on. So, > right now, if I create a new project in region one, Nova will query > Keystone in region two. Because my keystone databases are not synched in > real time between each region, the region two Keystone will tell it that > the new project doesn't exist, while it exists in region one Keystone. > > Thinking that this could be a configuration error, I tried setting the > region_name in keystone_authtoken, but that didn’t change much of > anything. Right now I am thinking this may be a bug. Could someone > confirm that this is indeed a bug and not a configuration error? > > To circumvent this issue, I am considering either modifying the database > by hand or trying to implement realtime replication between both > Keystone databases. Would there be another solution? (beside modifying > the code for the Nova check) This is the specific code you're talking about: https://github.com/openstack/nova/blob/stable/pike/nova/api/openstack/identity.py#L35 I don't see region_name as a config option for talking to keystone in Pike: https://docs.openstack.org/nova/pike/configuration/config.html#keystone But it is in Queens: https://docs.openstack.org/nova/queens/configuration/config.html#keystone That was added in this change: https://review.openstack.org/#/c/507693/ But I think what you're saying is, since you have multiple regions, the project could be in any of them at any given time until they synchronize so configuring nova for a specific region isn't probably going to help in this case, right? Isn't this somehow resolved with keystone federation? Granted, I'm not at all a keystone person, but I'd think this isn't a unique problem. 
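For reference, on Queens the [keystone]/region_name option discussed above would end up in nova.conf looking roughly like this (the region name here is only an example):

[keystone]
region_name = RegionOne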
-- Thanks, Matt From mriedemos at gmail.com Thu May 10 13:54:42 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 10 May 2018 08:54:42 -0500 Subject: [Openstack-operators] Need feedback for nova aborting cold migration function In-Reply-To: <1525919628734.2105@nttdata.co.jp> References: <1525919628734.2105@nttdata.co.jp> Message-ID: <0470c0af-6777-2771-35e1-69ee029b485d@gmail.com> On 5/9/2018 9:33 PM, sagaray at nttdata.co.jp wrote: > Operation planning of cold migration is difficult because cold migration time will vary drastically as it also depends on the load on storage servers at that point of time. If cold migration task stalls for any unknown reasons, operators may decide to cancel it manually. What storage backend are you using? What are some reasons that it has stalled in the past? -- Thanks, Matt From thierry at openstack.org Thu May 10 13:56:49 2018 From: thierry at openstack.org (Thierry Carrez) Date: Thu, 10 May 2018 15:56:49 +0200 Subject: [Openstack-operators] [forum] Etherpad for "Ops/Devs: One community" session Message-ID: <0e25c5a4-ef13-f877-0114-ec2468079b03@openstack.org> Hi! I have created an etherpad for the "Ops/Devs: One community" Forum session that will happen in Vancouver on Monday at 4:20pm. https://etherpad.openstack.org/p/YVR-ops-devs-one-community If you are interested in continuing breaking up the community silos and making everyone "contributors" with various backgrounds but a single objective, please add to it and join the session ! -- Thierry Carrez (ttx) From mriedemos at gmail.com Thu May 10 13:59:29 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 10 May 2018 08:59:29 -0500 Subject: [Openstack-operators] Need feedback for nova aborting cold migration function In-Reply-To: <1525919628734.2105@nttdata.co.jp> References: <1525919628734.2105@nttdata.co.jp> Message-ID: <5fea9373-021a-0a2e-ba91-d7fe62bd5ca9@gmail.com> On 5/9/2018 9:33 PM, sagaray at nttdata.co.jp wrote: > We always do the maintenance work on midnight during limited time-slot to minimize impact to our users. Also, why are you doing maintenance with cold migration? Why not do live migration for your maintenance (which already supports the abort function). -- Thanks, Matt From lbragstad at gmail.com Thu May 10 14:26:47 2018 From: lbragstad at gmail.com (Lance Bragstad) Date: Thu, 10 May 2018 09:26:47 -0500 Subject: [Openstack-operators] New project creation fails because of a Nova check in a multi-region cloud In-Reply-To: <82bad7a4-0f03-fe6d-7179-7f50b42f3502@gmail.com> References: <82bad7a4-0f03-fe6d-7179-7f50b42f3502@gmail.com> Message-ID: On 05/10/2018 08:52 AM, Matt Riedemann wrote: > On 5/9/2018 8:11 PM, Jean-Philippe Méthot wrote: >> I currently operate a multi-region cloud split between 2 geographic >> locations. I have updated it to Pike not too long ago, but I've been >> running into a peculiar issue. Ever since the Pike release, Nova now >> asks Keystone if a new project exists in Keystone before configuring >> the project’s quotas. However, there doesn’t seem to be any region >> restriction regarding which endpoint Nova will query Keystone on. So, >> right now, if I create a new project in region one, Nova will query >> Keystone in region two. Because my keystone databases are not synched >> in real time between each region, the region two Keystone will tell >> it that the new project doesn't exist, while it exists in region one >> Keystone. Are both keystone nodes completely separate? Do they share any information? 
>> >> Thinking that this could be a configuration error, I tried setting >> the region_name in keystone_authtoken, but that didn’t change much of >> anything. Right now I am thinking this may be a bug. Could someone >> confirm that this is indeed a bug and not a configuration error? >> >> To circumvent this issue, I am considering either modifying the >> database by hand or trying to implement realtime replication between >> both Keystone databases. Would there be another solution? (beside >> modifying the code for the Nova check) A variant of this just came up as a proposal for the Forum in a couple weeks [0]. A separate proposal was also discussed during this week's keystone meeting [1], which brought up an interesting solution. We should be seeing a specification soon that covers the proposal in greater detail and includes use cases. Either way, both sound like they may be relevant to you. [0] https://etherpad.openstack.org/p/YVR-edge-keystone-brainstorming [1] http://eavesdrop.openstack.org/meetings/keystone/2018/keystone.2018-05-08-16.00.log.html#l-156 > > This is the specific code you're talking about: > > https://github.com/openstack/nova/blob/stable/pike/nova/api/openstack/identity.py#L35 > > > I don't see region_name as a config option for talking to keystone in > Pike: > > https://docs.openstack.org/nova/pike/configuration/config.html#keystone > > But it is in Queens: > > https://docs.openstack.org/nova/queens/configuration/config.html#keystone > > That was added in this change: > > https://review.openstack.org/#/c/507693/ > > But I think what you're saying is, since you have multiple regions, > the project could be in any of them at any given time until they > synchronize so configuring nova for a specific region isn't probably > going to help in this case, right? > > Isn't this somehow resolved with keystone federation? Granted, I'm not > at all a keystone person, but I'd think this isn't a unique problem. Without knowing a whole lot about the current setup, I'm inclined to say it is. Keystone-to-keystone federation was developed for cases like this, and it's been something we've been trying to encourage in favor of building replication tooling outside of the database or over an API. The main concerns with taking a manual replication approach is that it could negatively impact overall performance and that keystone already assumes it will be in control of ID generation for most cases (replicating a project in RegionOne into RegionTwo will yield a different project ID, even though it is possible for both to have the same name). Additionally, there are some things that keystone doesn't expose over the API that would need to be replicated, like revocation events (I mentioned this in the etherpad linked above). -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From ignaziocassano at gmail.com Thu May 10 17:45:45 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 10 May 2018 19:45:45 +0200 Subject: [Openstack-operators] Octavia on ocata centos 7 Message-ID: Hi everyone, I am moving from lbaas v2 based on haproxy driver to octavia on centos 7 ocata. I installed a new host with octavia following the documentation. 
I removed all old load balancers, stopped the lbaas agent and configured neutron following this link: https://docs.openstack.org/octavia/queens/contributor/guides/dev-quick-start.html On the octavia server all services are active, amphora images are installed, but when I try to create a load balancer: neutron lbaas-loadbalancer-create --name lb1 private-subnet it tries to connect to 127.0.0.1:5000 In both octavia.conf and neutron.conf the keystone section is correctly configured to reach the controller address. The old lbaas v2 based on the haproxy driver worked fine before changing the configuration, but it was not possible to protect lbaas addresses with security groups (this is a very old problem) because security groups are applied only to VM ports. Since the Octavia load balancer is based on a VM derived from the amphora image, I'd like to use it to improve my security. Any suggestion for my octavia configuration, or alternatives to improve security on lbaas? Thanks and Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From iain.macdonnell at oracle.com Thu May 10 19:03:57 2018 From: iain.macdonnell at oracle.com (iain MacDonnell) Date: Thu, 10 May 2018 12:03:57 -0700 Subject: Re: [Openstack-operators] Octavia on ocata centos 7 In-Reply-To: References: Message-ID: On 05/10/2018 10:45 AM, Ignazio Cassano wrote: > I am moving from lbaas v2 based on haproxy driver to octavia on centos 7 > ocata. [snip] > On the octavia server all services are active, amphora images are > installed, but when I try to create a load balancer: > > nuutron lbaas-loadbalancer-create --name lb1 private-subnet > > it tries to connect to 127.0.0.1:5000 Google found: https://bugzilla.redhat.com/show_bug.cgi?id=1434904 => https://bugzilla.redhat.com/show_bug.cgi?id=1433728 Seems that you may be missing the service_auth section from neutron_lbaas.conf and/or octavia.conf? I've been through the frustration of trying to get Octavia working. The docs are a bit iffy, and it's ... "still maturing" (from my observation). I think I did have it working with neutron_lbaasv2 at one point. My neutron_lbaas.conf included:

[service_auth]
auth_url = http://mykeystonehost:35357/v3
admin_user = neutron
admin_tenant_name = service
admin_password = n0ttell1nU
admin_user_domain = default
admin_project_domain = default
region = myregion

and octavia.conf:

[service_auth]
memcached_servers = mymemcachedhost:11211
auth_url = http://mykeystonehost:35357
auth_type = password
project_domain_name = default
project_name = service
user_domain_name = default
username = octavia
password = n0ttell1nU

Not sure how correct those are, but IIRC it did basically work. I've since moved to pure Octavia on Queens, where there is no neutron_lbaas. GL! ~iain From ignaziocassano at gmail.com Thu May 10 19:08:53 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Thu, 10 May 2018 19:08:53 +0000 Subject: Re: [Openstack-operators] Octavia on ocata centos 7 In-Reply-To: References: Message-ID: Many thanks for your help. Ignazio On Thu, 10 May 2018 at 21:05, iain MacDonnell wrote: > > > On 05/10/2018 10:45 AM, Ignazio Cassano wrote: > > I am moving from lbaas v2 based on haproxy driver to octavia on centos 7 > > ocata.
> [snip] > > On the octavia server all services are active, amphora images are > > installed, but when I try to create a load balancer: > > > > nuutron lbaas-loadbalancer-create --name lb1 private-subnet > > > > it tries to connect to 127.0.0.1:5000 > > Google found: > > https://bugzilla.redhat.com/show_bug.cgi?id=1434904 => > https://bugzilla.redhat.com/show_bug.cgi?id=1433728 > > Seems that you may be missing the service_auth section from > neutron_lbaas.conf or/and octavia.conf ? > > I've been through the frustration of trying to get Octavia working. The > docs are bit iffy, and it's ... "still maturing" (from my observation). > > I think I did have it working with neutron_lbaasv2 at one point. My > neutron_lbaas.conf included: > > [service_auth] > auth_url = http://mykeystonehost:35357/v3 > admin_user = neutron > admin_tenant_name = service > admin_password = n0ttell1nU > admin_user_domain = default > admin_project_domain = default > region = myregion > > and octavia.conf: > > [service_auth] > memcached_servers = mymemcachedhost:11211 > auth_url = http://mykeystonehost:35357 > auth_type = password > project_domain_name = default > project_name = service > user_domain_name = default > username = octavia > password = n0ttell1nU > > > Not sure how correct those are, but IIRC it did basically work. > > I've since moved to pure Octavia on Queens, where there is no > neutron_lbaas. > > GL! > > ~iain > > > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jp.methot at planethoster.info Thu May 10 23:30:16 2018 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Fri, 11 May 2018 08:30:16 +0900 Subject: [Openstack-operators] New project creation fails because of a Nova check in a multi-region cloud In-Reply-To: References: <82bad7a4-0f03-fe6d-7179-7f50b42f3502@gmail.com> Message-ID: <1A1729FA-BE7F-4A42-A42F-BC9B772DFE73@planethoster.info> >> >>> I currently operate a multi-region cloud split between 2 geographic >>> locations. I have updated it to Pike not too long ago, but I've been >>> running into a peculiar issue. Ever since the Pike release, Nova now >>> asks Keystone if a new project exists in Keystone before configuring >>> the project’s quotas. However, there doesn’t seem to be any region >>> restriction regarding which endpoint Nova will query Keystone on. So, >>> right now, if I create a new project in region one, Nova will query >>> Keystone in region two. Because my keystone databases are not synched >>> in real time between each region, the region two Keystone will tell >>> it that the new project doesn't exist, while it exists in region one >>> Keystone. > Are both keystone nodes completely separate? Do they share any information? I share the DB information between both. In our use case, we very rarely make changes to keystone (password change, user creation, project creation) and there is a limited number of people who even have access to it, so I can get away with having my main DB in region 1 and hosting an exact copy in region 2. The original idea was to have a mysql slave in region 2, but that failed and we decided to go with manually replicating the keystone DB whenever we would make changes. 
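As an illustration only (the host name is invented, credentials are assumed to come from a client config, and this is not a recommendation over proper replication or federation), the manual copy described above can be as simple as something like:

# one-shot copy of the keystone DB from region 1 to region 2; overwrites the region-2 copy
mysqldump --single-transaction keystone | ssh region2-db mysql keystone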
This means I have the same users and projects in both regions, which is exactly what I want right now for my specific use case. Of course, that also means I only do operations in keystone in Region 1 and never from Region 2 to prevent discrepancies. >>> >>> Thinking that this could be a configuration error, I tried setting >>> the region_name in keystone_authtoken, but that didn’t change much of >>> anything. Right now I am thinking this may be a bug. Could someone >>> confirm that this is indeed a bug and not a configuration error? >>> >>> To circumvent this issue, I am considering either modifying the >>> database by hand or trying to implement realtime replication between >>> both Keystone databases. Would there be another solution? (beside >>> modifying the code for the Nova check) > A variant of this just came up as a proposal for the Forum in a couple > weeks [0]. A separate proposal was also discussed during this week's > keystone meeting [1], which brought up an interesting solution. We > should be seeing a specification soon that covers the proposal in > greater detail and includes use cases. Either way, both sound like they > may be relevant to you. > > [0] https://etherpad.openstack.org/p/YVR-edge-keystone-brainstorming > [1] > http://eavesdrop.openstack.org/meetings/keystone/2018/keystone.2018-05-08-16.00.log.html#l-156 This is interesting. Unfortunately I will not be in Vancouver, but I will keep an eye on it in the future. I will need to find a way to solve the current issue at hand shortly though. >> >> This is the specific code you're talking about: >> >> https://github.com/openstack/nova/blob/stable/pike/nova/api/openstack/identity.py#L35 >> >> >> I don't see region_name as a config option for talking to keystone in >> Pike: >> >> https://docs.openstack.org/nova/pike/configuration/config.html#keystone >> >> But it is in Queens: >> >> https://docs.openstack.org/nova/queens/configuration/config.html#keystone >> >> That was added in this change: >> >> https://review.openstack.org/#/c/507693/ >> >> But I think what you're saying is, since you have multiple regions, >> the project could be in any of them at any given time until they >> synchronize so configuring nova for a specific region isn't probably >> going to help in this case, right? >> >> Isn't this somehow resolved with keystone federation? Granted, I'm not >> at all a keystone person, but I'd think this isn't a unique problem. > Without knowing a whole lot about the current setup, I'm inclined to say > it is. Keystone-to-keystone federation was developed for cases like > this, and it's been something we've been trying to encourage in favor of > building replication tooling outside of the database or over an API. The > main concerns with taking a manual replication approach is that it could > negatively impact overall performance and that keystone already assumes > it will be in control of ID generation for most cases (replicating a > project in RegionOne into RegionTwo will yield a different project ID, > even though it is possible for both to have the same name). > Additionally, there are some things that keystone doesn't expose over > the API that would need to be replicated, like revocation events (I > mentioned this in the etherpad linked above). To answer the questions of both posts: 1.I was talking about the region-name parameter underneath keystone_authtoken. That is in the pike doc you linked, but I am unaware if this is only used for token generation or not. 
Anyhow, it doesn’t seem to have any impact on the issue at hand. 2.My understanding of the issue is this: -Keystone creates new project in region 1 -Nova wants to check if the project exists in keystone, so it asks keystone for its endpoint list. -Nova picks the first endpoint in the list, which happens to be the region 2 endpoint (my endpoint list has the endpoints of both regions since I manage from a single horizon/controller node). -Since there’s no real-time replication, region 2 replies that the project doesn’t exist, while it exists in region 1. I may be wrong about my assumption that it picks the region 2 endpoint, but the facts are that it does query region 2 keystone when it shouldn’t (I see the 404s in the region 2 logs) 3.I haven't really looked into keystone federation yet, but wouldn’t it cause issues if projects in 2 different regions have the same uuid? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu May 10 23:36:06 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 10 May 2018 18:36:06 -0500 Subject: [Openstack-operators] New project creation fails because of a Nova check in a multi-region cloud In-Reply-To: <1A1729FA-BE7F-4A42-A42F-BC9B772DFE73@planethoster.info> References: <82bad7a4-0f03-fe6d-7179-7f50b42f3502@gmail.com> <1A1729FA-BE7F-4A42-A42F-BC9B772DFE73@planethoster.info> Message-ID: <62e79005-7f7b-aa8d-0262-bfc267ca6b3f@gmail.com> On 5/10/2018 6:30 PM, Jean-Philippe Méthot wrote: > 1.I was talking about the region-name parameter underneath > keystone_authtoken. That is in the pike doc you linked, but I am unaware > if this is only used for token generation or not. Anyhow, it doesn’t > seem to have any impact on the issue at hand. The [keystone]/region_name config option in nova is used to pike the identity service endpoint so I think in that case region_one will matter if there are multiple identity endpoints in the service catalog. The only thing is you're on pike where [keystone]/region_name isn't in nova.conf and it's not used, it was added in queens for this lookup: https://review.openstack.org/#/c/507693/ So that might be why it doesn't seem to make a difference if you set it in nova.conf - because the nova code isn't actually using it. You could try backporting that patch into your pike deployment, set region_name to RegionOne and see if it makes a difference (although I thought RegionOne was the default if not specified?). -- Thanks, Matt From jp.methot at planethoster.info Fri May 11 00:04:32 2018 From: jp.methot at planethoster.info (=?utf-8?Q?Jean-Philippe_M=C3=A9thot?=) Date: Fri, 11 May 2018 09:04:32 +0900 Subject: [Openstack-operators] New project creation fails because of a Nova check in a multi-region cloud In-Reply-To: <62e79005-7f7b-aa8d-0262-bfc267ca6b3f@gmail.com> References: <82bad7a4-0f03-fe6d-7179-7f50b42f3502@gmail.com> <1A1729FA-BE7F-4A42-A42F-BC9B772DFE73@planethoster.info> <62e79005-7f7b-aa8d-0262-bfc267ca6b3f@gmail.com> Message-ID: <48915EC3-5BD0-4156-95C8-E67EDEB9AD2F@planethoster.info> > Le 11 mai 2018 à 08:36, Matt Riedemann a écrit : > > On 5/10/2018 6:30 PM, Jean-Philippe Méthot wrote: >> 1.I was talking about the region-name parameter underneath keystone_authtoken. That is in the pike doc you linked, but I am unaware if this is only used for token generation or not. Anyhow, it doesn’t seem to have any impact on the issue at hand. 
> > The [keystone]/region_name config option in nova is used to pike the identity service endpoint so I think in that case region_one will matter if there are multiple identity endpoints in the service catalog. The only thing is you're on pike where [keystone]/region_name isn't in nova.conf and it's not used, it was added in queens for this lookup: > > https://review.openstack.org/#/c/507693/ > > So that might be why it doesn't seem to make a difference if you set it in nova.conf - because the nova code isn't actually using it. I was talking about the parameter under [keystone_authtoken] ([keystone_authtoken]/region_name) and not the new one under [keystone] ([keystone]/region_name). It seems that we were talking about different parameters though, so this explains that. > You could try backporting that patch into your pike deployment, set region_name to RegionOne and see if it makes a difference (although I thought RegionOne was the default if not specified?). I will attempt this next week. Will update if I run into any issues. Also, from experience, most OpenStack services seem to pick a random endpoint when region_name isn't specified in a multi-region cloud. I've seen that several times ever since I built and started maintaining this infrastructure. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Fri May 11 08:57:02 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 11 May 2018 10:57:02 +0200 Subject: [Openstack-operators] octavia on ocata no amphora instances are created Message-ID: Hi everyone, I installed octavia on ocata centos 7. The load balancer, listener and pool are created and they are active, but I cannot see any amphora instance. There are no errors in the octavia logs. Could anyone help me? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Fri May 11 10:29:30 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 11 May 2018 12:29:30 +0200 Subject: [Openstack-operators] octavia amphora instances on ocata Message-ID: Hi everyone, I installed octavia on ocata centos 7 and now when I create a load balancer amphora instances are automatically created, but there are some problems: 1) the amphora-agent on the amphora instances is in an error state because it needs certificates (must I create the amphora image with certificates on it, or are certificates copied during instance deployment?) 2) health-manager.log reports: Amphora 4e6d19d3-bc19-4882-aeca-4772b069c53b health message reports 0 listeners when 1 expected Please, could anyone explain what happens? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Fri May 11 16:28:35 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Fri, 11 May 2018 18:28:35 +0200 Subject: [Openstack-operators] ocata octavia amphora ssl error Message-ID: Hi everyone, I am trying to configure octavia lbaas on ocata centos 7. When I create the load balancer a VM is created from the amphora image, but the worker log reports:
2018-05-11 17:38:56.013 125607 WARNING octavia.amphorae.drivers.haproxy.rest_api_driver [-] Could not connect to instance. Retrying.
2018-05-11 17:39:01.016 125607 WARNING octavia.amphorae.drivers.haproxy.rest_api_driver [-] Could not connect to instance. Retrying.
I think it is trying to connect to the amphora instance on port 9443, but in the amphora instance the /var/log/amphora-agent.log file reports the following:
[2018-05-11 15:38:45 +0000] [900] [DEBUG] Failed to send error message.
[2018-05-11 15:38:50 +0000] [900] [DEBUG] Error processing SSL request.
[2018-05-11 15:38:50 +0000] [900] [DEBUG] Invalid request from ip=::ffff:10.138.176.96: [SSL: SSL_HANDSHAKE_FAILURE] ssl handshake failure (_ssl.c:1977)
[2018-05-11 15:38:50 +0000] [900] [DEBUG] Failed to send error message.
[2018-05-11 15:38:55 +0000] [900] [DEBUG] Error processing SSL request.
[2018-05-11 15:38:55 +0000] [900] [DEBUG] Invalid request from ip=::ffff:10.138.176.96: [SSL: SSL_HANDSHAKE_FAILURE] ssl handshake failure (_ssl.c:1977)
[2018-05-11 15:38:55 +0000] [900] [DEBUG] Failed to send error message.
[2018-05-11 15:39:00 +0000] [900] [DEBUG] Error processing SSL request.
[2018-05-11 15:39:00 +0000] [900] [DEBUG] Invalid request from ip=::ffff:10.138.176.96: [SSL: SSL_HANDSHAKE_FAILURE] ssl handshake failure (_ssl.c:1977)
[2018-05-11 15:39:00 +0000] [900] [DEBUG] Failed to send error message.
10.138.176.96 is the address of my controller worker. Security groups allow any protocol on any port, and there are no connection problems between the networks. Probably there are some errors in the certificate creation. Can anyone help me, please? Is it possible to disable SSL for testing? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Mon May 14 14:16:15 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Mon, 14 May 2018 16:16:15 +0200 Subject: [Openstack-operators] ocata gnocchi file system : erasing old data Message-ID: Hi everyone, I am using ocata on centos 7 with ceilometer and gnocchi. The gnocchi backend is nfs and I would like to know if it is possible to remove old data on the backend file system. Some directories on the backend are 6 months old. Please, any suggestion? Thanks and Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From jungleboyj at gmail.com Mon May 14 21:15:53 2018 From: jungleboyj at gmail.com (Jay S Bryant) Date: Mon, 14 May 2018 16:15:53 -0500 Subject: [Openstack-operators] [cinder] forum etherpads now available ... Message-ID: <2caa6f0d-2084-27f5-196e-fdecbf10d6f2@gmail.com> All, I have etherpads created for our Cinder related Forum discussions:
* Tuesday, 5/22 11:00 to 11:40 - Room 221-222 - Cinder High Availability (HA) Discussion - https://etherpad.openstack.org/p/YVR18-cinder-ha-forum
* Tuesday, 5/22 11:50 to 12:30 - Room 221-222 - Multi-attach Introduction and Future Direction - https://etherpad.openstack.org/p/YVR18-cinder-mutiattach-forum
* Wednesday, 5/23 9:40 to 10:30 - Room 221-222 - Cinder's Documentation Discussion - https://etherpad.openstack.org/p/YVR18-cinder-documentation-forum
We also have the session on using the placement service:
* Monday 5/21 16:20 to 17:00 - Planning to use Placement in Cinder - https://etherpad.openstack.org/p/YVR-cinder-placement
Please take some time to look at the etherpads before the forum and add your thoughts/questions for discussion. Thank you! Jay Bryant (jungleboyj) -------------- next part -------------- An HTML attachment was scrubbed...
URL: From gord at live.ca Tue May 15 01:33:24 2018 From: gord at live.ca (gordon chung) Date: Tue, 15 May 2018 01:33:24 +0000 Subject: [Openstack-operators] ocata gnocchi file system : erasing old data In-Reply-To: References: Message-ID: On 2018-05-14 10:16 AM, Ignazio Cassano wrote: > Hi everyone, > I am osing ocata on centos 7 with ceilometer and gnocchi. > The gnocchi backend is nfs and I would like to know if it is possible > remove old data on the backend file system. > Dome directories on the backend are 6 months old. > > Please, any suggestion ? there isn't a way to this without manually modifying the files (which can get a bit sketchy). Gnocchi is designed to capture (at most) the amount of data you define in your policy and does not prune data based on 'now' so it won't shrink on it's own over time. i guess we could support a tool that could prune based on 'now' but that doesn't exist currently. the safest way to clean old data is to delete the metric. if you want to delete only some of the metric that will be difficult. it'll require you figuring out what time range of data is stored in a given file (which is not difficult if you look at code), then properly deserialising, pruning, and reserialising (probably difficult or at least annoying). cheers, -- gord From mizuno.shintaro at lab.ntt.co.jp Tue May 15 06:06:29 2018 From: mizuno.shintaro at lab.ntt.co.jp (Shintaro Mizuno) Date: Tue, 15 May 2018 15:06:29 +0900 Subject: [Openstack-operators] [Forum] "DPDK/SR-IOV NFV Operational issues and way forward" session etherpad Message-ID: <0eda6a49-352d-5c04-da87-3f1ae72516ac@lab.ntt.co.jp> Hi I have created an etherpad page for "DPDK/SR-IOV NFV Operational issues and way forward" session at the Vancouver Forum [1]. It will take place on Wed 23, 11:50am - 12:30pm Vancouver Convention Centre West - Level Two - Room 221-222 If you are using/testing DPDK/SR-IOV for NFV workloads and interested in discussing their pros/cons and possible next steps for NFV operators and developers, please come join the session. Please also add your comment/topic proposals to the etherpad beforehand. [1] https://etherpad.openstack.org/p/YVR-dpdk-sriov-way-forward Any input is highly appreciated. Regards, Shintaro -- Shintaro MIZUNO (水野伸太郎) NTT Software Innovation Center TEL: 0422-59-4977 E-mail: mizuno.shintaro at lab.ntt.co.jp From ignaziocassano at gmail.com Tue May 15 06:40:05 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 15 May 2018 08:40:05 +0200 Subject: [Openstack-operators] ocata gnocchi file system : erasing old data In-Reply-To: References: Message-ID: Hi Gordon, please, let me to understand better.... I am collecting only instance metrics: I need to remove data about instances that ah been removed. I could do that with: gnocchi resource list --type instance -c id -f value if the instance has been delete I could: gnocchi resource delete instance id Does the above procedure remove data either from database or /var/lib/gnocchi directory ? Any suggestion ? Thanks and Regards Ignazio 2018-05-15 3:33 GMT+02:00 gordon chung : > > > On 2018-05-14 10:16 AM, Ignazio Cassano wrote: > > Hi everyone, > > I am osing ocata on centos 7 with ceilometer and gnocchi. > > The gnocchi backend is nfs and I would like to know if it is possible > > remove old data on the backend file system. > > Dome directories on the backend are 6 months old. > > > > Please, any suggestion ? > > there isn't a way to this without manually modifying the files (which > can get a bit sketchy). 
Gnocchi is designed to capture (at most) the > amount of data you define in your policy and does not prune data based > on 'now' so it won't shrink on it's own over time. > > i guess we could support a tool that could prune based on 'now' but that > doesn't exist currently. > > the safest way to clean old data is to delete the metric. if you want to > delete only some of the metric that will be difficult. it'll require you > figuring out what time range of data is stored in a given file (which is > not difficult if you look at code), then properly deserialising, > pruning, and reserialising (probably difficult or at least annoying). > > cheers, > > -- > gord > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sagaray at nttdata.co.jp Tue May 15 08:48:18 2018 From: sagaray at nttdata.co.jp (sagaray at nttdata.co.jp) Date: Tue, 15 May 2018 08:48:18 +0000 Subject: [Openstack-operators] Need feedback for nova aborting cold migration function In-Reply-To: <5fea9373-021a-0a2e-ba91-d7fe62bd5ca9@gmail.com> References: <1525919628734.2105@nttdata.co.jp>, <5fea9373-021a-0a2e-ba91-d7fe62bd5ca9@gmail.com> Message-ID: <1526374144863.89140@nttdata.co.jp> Hi Matt, > On 5/9/2018 9:33 PM, sagaray at nttdata.co.jp wrote: > > Operation planning of cold migration is difficult because cold migration time will vary drastically as it also depends on the load on storage servers at that point of time. If cold migration task stalls for any unknown reasons, operators may decide to cancel it manually. > > What storage backend are you using? What are some reasons that it has > stalled in the past? Our storage backend is EMC VNX, and we have not shared the instance-store storage among compute nodes. The storage is also accessed by external system. We store the service logs which are created by VM on that storage. Our system needs to backup those logs by transferring to other storage. Those logs sometimes becomes very large, and the load of storage also becomes high. In those situation, migrating the VM takes more time than expected in advance, so we would like to cancel some migration task on the way if maintenance time being close to the end. > On 5/9/2018 9:33 PM, sagaray at nttdata.co.jp wrote: > > We always do the maintenance work on midnight during limited time-slot to minimize impact to our users. > > Also, why are you doing maintenance with cold migration? Why not do live > migration for your maintenance (which already supports the abort function). We would like to migrate stopped servers as it is. As the reason above, we think we can operate the system more flexible if we able to cancel cold-migration as live-migration can. -------------------------------------------------- Yukinori Sagara Platform Engineering Department, NTT DATA Corp. ________________________________________ 差出人: Matt Riedemann 送信日時: 2018年5月10日 22:59 宛先: openstack-operators at lists.openstack.org 件名: Re: [Openstack-operators] Need feedback for nova aborting cold migration function On 5/9/2018 9:33 PM, sagaray at nttdata.co.jp wrote: > We always do the maintenance work on midnight during limited time-slot to minimize impact to our users. Also, why are you doing maintenance with cold migration? Why not do live migration for your maintenance (which already supports the abort function). 
-- Thanks, Matt _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From martialmichel at datamachines.io Tue May 15 20:15:40 2018 From: martialmichel at datamachines.io (Martial Michel) Date: Tue, 15 May 2018 16:15:40 -0400 Subject: [Openstack-operators] [scientific] Scientific SIG - IRC meeting Tue 15th at 2100 UTC Message-ID: Hello, With a late email invitation, we will have our IRC meeting in the #openstack-meeting channel at 2100 UTC May 15th. Final agenda will be at https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_May_15th_2018 and to be a repeat of last week in preparation for the summit next week 1. SIG Cycle Report 1. https://etherpad.openstack.org/p/scientific-sig-report-queens 2. Call for Lighting Talks 1. https://etherpad.openstack.org/p/scientific -sig-vancouver2018-lighting-talks 3. AOB All are welcome. Looking forward to seeing you there -- Martial -------------- next part -------------- An HTML attachment was scrubbed... URL: From mebert at uvic.ca Tue May 15 20:58:57 2018 From: mebert at uvic.ca (Marcus Ebert) Date: Tue, 15 May 2018 13:58:57 -0700 (PDT) Subject: [Openstack-operators] [Openstack-sigs] [scientific] Scientific SIG - IRC meeting Tue 15th at 2100 UTC In-Reply-To: References: Message-ID: Hello all, I'm new to the list, so I would like to give a short introduction to what we do: I'm with the HEP Research Computing group at UVic, where we utilize (Openstack) clouds for the computing needs of different High Energy Physics groups. We don't use just single clouds but work on a system that unifies all clouds available to us in a way that it looks like a single computing resource for the user jobs, and for that it also handles the distribution of needed images to the different clouds. In addition, we are working on a system that unifies cloud storage on different clouds into a unified storage space with a single endpoint for all user jobs on any cloud, no matter on which clouds the data ends up or from where it is read. Although we have this storage federation in production now, it is still mainly work in progress. more general information can be found here: http://heprc.phys.uvic.ca/ https://heprc.blogspot.ca/ Unfortunately, I can't join the IRC meeting today, but will be at the summit next week in Vancouver. Could you please let me know which Scientific SIG activities are planned for it and on which days? (from the schedule, it's just Wednesday morning?) Cheers, Marcus On Tue, 15 May 2018, Martial Michel wrote: > Hello, > > With a late email invitation, we will have our IRC meeting in the > #openstack-meeting > channel at 2100 UTC May 15th. > Final agenda will be at > https://wiki.openstack.org/wiki/Scientific_SIG#IRC_Meeting_May_15th_2018 > and to be a repeat of last week in preparation for the summit next week > > > 1. SIG Cycle Report > 1. https://etherpad.openstack.org/p/scientific-sig-report-queens > 2. Call for Lighting Talks > 1. https://etherpad.openstack.org/p/scientific > -sig-vancouver2018-lighting-talks > 3. AOB > > > All are welcome. Looking forward to seeing you there -- Martial > From rico.lin.guanyu at gmail.com Wed May 16 06:18:09 2018 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Wed, 16 May 2018 14:18:09 +0800 Subject: [Openstack-operators] [openstack-dev][heat][all] Heat now migrated to StoryBoard!! 
In-Reply-To: References: Message-ID: Bump the last time Hi all, As we keep adding more info to the migration guideline [1], you might like to take a look again. And do hope it will make things easier for you. If not, please find me in irc or mail. [1] https://etherpad.openstack.org/p/Heat-StoryBoard-Migration-Info 2018-05-10 18:42 GMT+08:00 Rico Lin : > Hi all, > As we keep adding more info to the migration guideline [1], you might like > to take a look again. > And do hope it will make things easier for you. If not, please find me in > irc or mail. > > [1] https://etherpad.openstack.org/p/Heat-StoryBoard-Migration-Info > > Here's the quick hint for you, your bug id is exactly your story id. > > 2018-05-07 18:27 GMT+08:00 Rico Lin : > >> Hi all, >> >> I updated more information to this guideline in [1]. >> Please must take a view on [1] to see what's been updated. >> will likely to keep update on that etherpad if new Q&A or issue found. >> >> Will keep trying to make this process as painless for you as possible, >> so please endure with us for now, and sorry for any inconvenience >> >> *[1] https://etherpad.openstack.org/p/Heat-StoryBoard-Migration-Info >> * >> >> 2018-05-05 12:15 GMT+08:00 Rico Lin : >> >>> looping heat-dashboard team >>> >>> 2018-05-05 12:02 GMT+08:00 Rico Lin : >>> >>>> Dear all Heat members and friends >>>> >>>> As you might award, OpenStack projects are scheduled to migrating ([5]) >>>> from Launchpad to StoryBoard [1]. >>>> For whom who like to know where to file a bug/blueprint, here are some >>>> heads up for you. >>>> >>>> *What's StoryBoard?* >>>> StoryBoard is a cross-project task-tracker, contains numbers of >>>> ``project``, each project contains numbers of ``story`` which you can think >>>> it as an issue or blueprint. Within each story, contains one or multiple >>>> ``task`` (task separate stories into the tasks to resolve/implement). To >>>> learn more about StoryBoard or how to make a good story, you can reference >>>> [6]. >>>> >>>> *How to file a bug?* >>>> This is actually simple, use your current ubuntu-one id to access to >>>> storyboard. Then find the corresponding project in [2] and create a story >>>> to it with a description of your issue. We should try to create tasks which >>>> to reference with patches in Gerrit. >>>> >>>> *How to work on a spec (blueprint)?* >>>> File a story like you used to file a Blueprint. Create tasks for your >>>> plan. Also you might want to create a task for adding spec( in heat-spec >>>> repo) if your blueprint needs documents to explain. >>>> I still leave current blueprint page open, so if you like to create a >>>> story from BP, you can still get information. Right now we will start work >>>> as task-driven workflow, so BPs should act no big difference with a bug in >>>> StoryBoard (which is a story with many tasks). >>>> >>>> *Where should I put my story?* >>>> We migrate all heat sub-projects to StoryBoard to try to keep the >>>> impact to whatever you're doing as small as possible. However, if you plan >>>> to create a new story, *please create it under heat project [4]* and >>>> tag it with what it might affect with (like python-heatclint, >>>> heat-dashboard, heat-agents). We do hope to let users focus their stories >>>> in one place so all stories will get better attention and project >>>> maintainers don't need to go around separate places to find it. 
>>>> >>>> *How to connect from Gerrit to StoryBoard?* >>>> We usually use following key to reference Launchpad >>>> Closes-Bug: ####### >>>> Partial-Bug: ####### >>>> Related-Bug: ####### >>>> >>>> Now in StoryBoard, you can use following key. >>>> Task: ###### >>>> Story: ###### >>>> you can find more info in [3]. >>>> >>>> *What I need to do for my exists bug/bps?* >>>> Your bug is automatically migrated to StoryBoard, however, the >>>> reference in your patches ware not, so you need to change your commit >>>> message to replace the old link to launchpad to new links to StoryBoard. >>>> >>>> *Do we still need Launchpad after all this migration are done?* >>>> As the plan, we won't need Launchpad for heat anymore once we have done >>>> with migrating. Will forbid new bugs/bps filed in Launchpad. Also, try to >>>> provide new information as many as possible. Hopefully, we can make >>>> everyone happy. For those newly created bugs during/after migration, don't >>>> worry we will disallow further create new bugs/bps and do a second migrate >>>> so we won't missed yours. >>>> >>>> [1] https://storyboard.openstack.org/ >>>> [2] https://storyboard.openstack.org/#!/project_group/82 >>>> [3] https://docs.openstack.org/infra/manual/developers.html# >>>> development-workflow >>>> [4] https://storyboard.openstack.org/#!/project/989 >>>> [5] https://docs.openstack.org/infra/storyboard/migration.html >>>> [6] https://docs.openstack.org/infra/storyboard/gui/tasks_st >>>> ories_tags.html#what-is-a-story >>>> >>>> >>>> >>>> -- >>>> May The Force of OpenStack Be With You, >>>> >>>> *Rico Lin*irc: ricolin >>>> >>>> >>> >>> >>> -- >>> May The Force of OpenStack Be With You, >>> >>> *Rico Lin*irc: ricolin >>> >>> >> >> >> -- >> May The Force of OpenStack Be With You, >> >> *Rico Lin*irc: ricolin >> >> > > > -- > May The Force of OpenStack Be With You, > > *Rico Lin*irc: ricolin > > -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From eumel at arcor.de Wed May 16 10:43:06 2018 From: eumel at arcor.de (Frank Kloeker) Date: Wed, 16 May 2018 12:43:06 +0200 Subject: [Openstack-operators] [I18n] [Docs] Forum session Vancouver Message-ID: <8d2e092118bf02c028c17151b8a34af5@arcor.de> Good morning, just a quick note when packing the suitcase: We have a Docs/I18n Forum session on Monday 21th, 13:30, direct after lunch [1]. Take the chance to discuss topics about project onboarding with translation or documentation, usage of translated documents or tools. Or just come to say Hello :-) Looking forward to see you there! kind regards Frank (PTL I18n) [1] https://etherpad.openstack.org/p/docs-i18n-project-onboarding-vancouver From gord at live.ca Wed May 16 14:45:55 2018 From: gord at live.ca (gordon chung) Date: Wed, 16 May 2018 14:45:55 +0000 Subject: [Openstack-operators] ocata gnocchi file system : erasing old data In-Reply-To: References: Message-ID: On 2018-05-15 2:40 AM, Ignazio Cassano wrote: > gnocchi resource delete instance id > > > Does the above procedure remove data either from database or > /var/lib/gnocchi directory ? not immediately, it will mark the data for deletion. there is a 'janitor' service that runs periodically that will remove the data. this is defined by the `metric_cleanup_delay` in the configuration file. 
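Putting gordon's explanation together with the commands Ignazio listed earlier in the thread, here is a rough sketch of the cleanup flow; the [metricd] section name and the delay value shown are assumptions to double-check against your own gnocchi.conf:

  # list instance resources known to gnocchi together with their end date;
  # a non-empty ended_at generally means the instance was deleted in nova
  gnocchi resource list --type instance -c id -c ended_at

  # deleting a resource marks its metrics for deletion; the metricd janitor
  # then removes the measure files under /var/lib/gnocchi on its next pass
  gnocchi resource delete <resource-id>

  # gnocchi.conf -- how often the janitor wakes up (seconds)
  [metricd]
  metric_cleanup_delay = 300
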
cheers, -- gord From radu.popescu at emag.ro Wed May 16 15:30:59 2018 From: radu.popescu at emag.ro (Radu Popescu | eMAG, Technology) Date: Wed, 16 May 2018 15:30:59 +0000 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time Message-ID: Hi all, we have the following setup: - Openstack Ocata deployed with Openstack Ansible (v15.1.7) - 66 compute nodes, each having between 50 and 150 VMs, depending on their hardware configuration - we don't use Ceilometer (so not adding extra load on RabbitMQ cluster) - using Openvswitch HA with DVR - all messaging are going through a 3 servers RabbitMQ cluster - we now have 3 CCs hosting (initially had 2) hosting every other internal service What happens is, when we create a large number of VMs (it's something we do on a daily basis, just to test different types of VMs and apps, around 300 VMs), there are some of them that don't get the network interface attached in a reasonable time. After investigating, we can see that Neutron Openvswitch agent sees the port attached to the server, from an Openstack point of view, I can see the tap interface created in Openvswitch using both its logs and dmesg, but I can see nova attaching the interface after a huge amount of time. (I could see even 45 minutes delay) Since I can't see any reasonable errors I could take care of, my last chance is this mailing list. Only thing I can think of, is that maybe libvirt is not able to attach the interface in a reasonable amount of time. But still, 45 minutes is way too much. At the moment: vif_plugging_is_fatal = True vif_plugging_timeout = 600 (modified from default 300s) That's because we needed VMs with networking. Otherwise, if either with error, either with no network, it's the same thing for us. Thanks, -- Radu Popescu > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed May 16 18:37:50 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 16 May 2018 20:37:50 +0200 Subject: [Openstack-operators] ocata gnocchi file system : erasing old data In-Reply-To: References: Message-ID: Many thanks Ignazio 2018-05-16 16:45 GMT+02:00 gordon chung : > > > On 2018-05-15 2:40 AM, Ignazio Cassano wrote: > > gnocchi resource delete instance id > > > > > > Does the above procedure remove data either from database or > > /var/lib/gnocchi directory ? > > not immediately, it will mark the data for deletion. there is a > 'janitor' service that runs periodically that will remove the data. this > is defined by the `metric_cleanup_delay` in the configuration file. > > cheers, > > -- > gord > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ignaziocassano at gmail.com Wed May 16 18:46:06 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Wed, 16 May 2018 20:46:06 +0200 Subject: [Openstack-operators] ocata octavia http loadbalancer error 503 Service Unavailable Message-ID: Hi everyone, I am using octavia on centos7 ocata. When I define a http load balancer it does not work. Accessing the load balancer address returns 503 error. The above happens when the load balancer protocol specified is HTTP. If the load balancer protocol specified is TCP , it works. Probably the amphora instance haproxy is facing some issue with http health check ? Could anyone help me , please ? Thanks & Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... 
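A 503 served straight from the VIP usually means haproxy considers every pool member down, so the HTTP health monitor settings are the first thing to check when TCP works but HTTP does not. A rough illustration with the neutron-lbaas CLI available on Ocata; the option names are from memory and may differ slightly with your client version, and the pool name is a placeholder:

  # see whether the members behind the pool are being marked down
  neutron lbaas-member-list <pool-name-or-id>

  # an HTTP monitor that only expects a plain 200 from / on each member;
  # a mismatch between url-path/expected-codes and what the backends
  # actually return is a common cause of "TCP works, HTTP returns 503"
  neutron lbaas-healthmonitor-create --type HTTP --http-method GET \
    --url-path / --expected-codes 200 --delay 5 --timeout 3 \
    --max-retries 3 --pool <pool-name-or-id>
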
URL: From mriedemos at gmail.com Wed May 16 21:09:42 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 16 May 2018 16:09:42 -0500 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: References: Message-ID: On 5/16/2018 10:30 AM, Radu Popescu | eMAG, Technology wrote: > but I can see nova attaching the interface after a huge amount of time. What specifically are you looking for in the logs when you see this? Are you passing pre-created ports to attach to nova or are you passing a network ID so nova will create the port for you during the attach call? This is where the ComputeManager calls the driver to plug the vif on the host: https://github.com/openstack/nova/blob/stable/ocata/nova/compute/manager.py#L5187 Assuming you're using the libvirt driver, the host vif plug happens here: https://github.com/openstack/nova/blob/stable/ocata/nova/virt/libvirt/driver.py#L1463 And the guest is updated here: https://github.com/openstack/nova/blob/stable/ocata/nova/virt/libvirt/driver.py#L1472 vif_plugging_is_fatal and vif_plugging_timeout don't come into play here because we're attaching an interface to an existing server - or are you talking about during the initial creation of the guest, i.e. this code in the driver? https://github.com/openstack/nova/blob/stable/ocata/nova/virt/libvirt/driver.py#L5257 Are you seeing this in the logs for the given port? https://github.com/openstack/nova/blob/stable/ocata/nova/compute/manager.py#L6875 If not, it could mean that neutron-server never send the event to nova, so nova-compute timed out waiting for the vif plug callback event to tell us that the port is ready and the server can be changed to ACTIVE status. The neutron-server logs should log when external events are being sent to nova for the given port, you probably need to trace the requests and compare the nova-compute and neutron logs for a given server create request. -- Thanks, Matt From yu-kasuya at kddi-research.jp Thu May 17 05:39:03 2018 From: yu-kasuya at kddi-research.jp (Yuki Kasuya) Date: Thu, 17 May 2018 14:39:03 +0900 Subject: [Openstack-operators] [Forum] Fault Management/Monitoring for NFV/Edge/5G/IoT Message-ID: <0091929a-0ca3-ff11-5a41-4525c53a4fb9@kddi-research.jp> Hi All, I've created an etherpad for Fault Management/Monitoring for NFV/Edge/5G/IoT. It'll take place on Tuesday, May 22, 4:40pm-6:10pm @ Room 221-222. If you have any usecase/idea/challenge for FM at these new area, could you join this forum and add any topic/comment to etherpad. https://etherpad.openstack.org/p/YVR-fm-monitoring Best regards, Yuki -- --------------------------------------------- KDDI Research, Inc. Integrated Core Network Control And Management Laboratory Yuki Kasuya yu-kasuya at kddilabs.jp +81 80 9048 8405 From radu.popescu at emag.ro Thu May 17 11:49:48 2018 From: radu.popescu at emag.ro (Radu Popescu | eMAG, Technology) Date: Thu, 17 May 2018 11:49:48 +0000 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: Message-ID: Hi, unfortunately, didn't get the reply in my inbox, so I'm answering from the link here: http://lists.openstack.org/pipermail/openstack-operators/2018-May/015270.html (hopefully, my reply will go to the same thread) Anyway, I can see the neutron openvswitch agent logs processing the interface way after the VM is up (in this case, 30 minutes). And after the vif plugin timeout of 5 minutes (currently 10 minutes). 
After searching for logs, I came out with an example here: (replaced nova compute hostname with "nova.compute.hostname") http://paste.openstack.org/show/1VevKuimoBMs4G8X53Eu/ As you can see, the request for the VM starts around 3:27AM. Ports get created, openvswitch has the command to do it, has DHCP, but apparently Neutron server sends the callback after Neutron Openvswitch agent finishes. Callback is at 2018-05-10 03:57:36.177 while Neutron Openvswitch agent says it completed the setup and configuration at 2018-05-10 03:57:35.247. So, my question is, why is Neutron Openvswitch agent processing the request 30 minutes after the VM is started? And where can I search for logs for whatever happens during those 30 minutes? And yes, we're using libvirt. At some point, we added some new nova compute nodes and the new ones came with v3.2.0 and was breaking migration between hosts. That's why we downgraded (and versionlocked) everything at v2.0.0. Thanks, Radu -------------- next part -------------- An HTML attachment was scrubbed... URL: From lmihaiescu at gmail.com Thu May 17 14:46:59 2018 From: lmihaiescu at gmail.com (George Mihaiescu) Date: Thu, 17 May 2018 10:46:59 -0400 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: References: Message-ID: We use "vif_plugging_is_fatal = False" and "vif_plugging_timeout = 0" as well as "no-ping" in the dnsmasq-neutron.conf, and large rally tests of 500 instances complete with no issues. These are some good blogposts about Neutron performance: https://www.mirantis.com/blog/openstack-neutron-performance-and-scalability-testing-summary/ https://www.mirantis.com/blog/improving-dhcp-performance-openstack/ I would run a large rally test like this one and see where time is spent mostly: { "NovaServers.boot_and_delete_server": [ { "args": { "flavor": { "name": "c2.small" }, "image": { "name": "^Ubuntu 16.04 - latest$" }, "force_delete": false }, "runner": { "type": "constant", "times": 500, "concurrency": 100 } } ] } Cheers, George On Thu, May 17, 2018 at 7:49 AM, Radu Popescu | eMAG, Technology < radu.popescu at emag.ro> wrote: > Hi, > > unfortunately, didn't get the reply in my inbox, so I'm answering from the > link here: > http://lists.openstack.org/pipermail/openstack-operators/ > 2018-May/015270.html > (hopefully, my reply will go to the same thread) > > Anyway, I can see the neutron openvswitch agent logs processing the > interface way after the VM is up (in this case, 30 minutes). And after the > vif plugin timeout of 5 minutes (currently 10 minutes). > After searching for logs, I came out with an example here: (replaced nova > compute hostname with "nova.compute.hostname") > > http://paste.openstack.org/show/1VevKuimoBMs4G8X53Eu/ > > As you can see, the request for the VM starts around 3:27AM. Ports get > created, openvswitch has the command to do it, has DHCP, but apparently > Neutron server sends the callback after Neutron Openvswitch agent finishes. > Callback is at 2018-05-10 03:57:36.177 while Neutron Openvswitch agent says > it completed the setup and configuration at 2018-05-10 03:57:35.247. > > So, my question is, why is Neutron Openvswitch agent processing the > request 30 minutes after the VM is started? And where can I search for logs > for whatever happens during those 30 minutes? > And yes, we're using libvirt. At some point, we added some new nova > compute nodes and the new ones came with v3.2.0 and was breaking migration > between hosts. 
That's why we downgraded (and versionlocked) everything at > v2.0.0. > > Thanks, > Radu > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu May 17 15:42:32 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 17 May 2018 10:42:32 -0500 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: References: Message-ID: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> On 5/17/2018 9:46 AM, George Mihaiescu wrote: > and large rally tests of 500 instances complete with no issues. Sure, except you can't ssh into the guests. The whole reason the vif plugging is fatal and timeout and callback code was because the upstream CI was unstable without it. The server would report as ACTIVE but the ports weren't wired up so ssh would fail. Having an ACTIVE guest that you can't actually do anything with is kind of pointless. -- Thanks, Matt From lmihaiescu at gmail.com Thu May 17 15:50:49 2018 From: lmihaiescu at gmail.com (George Mihaiescu) Date: Thu, 17 May 2018 11:50:49 -0400 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> Message-ID: We have other scheduled tests that perform end-to-end (assign floating IP, ssh, ping outside) and never had an issue. I think we turned it off because the callback code was initially buggy and nova would wait forever while things were in fact ok, but I'll change "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run another large test, just to confirm. We usually run these large tests after a version upgrade to test the APIs under load. On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann wrote: > On 5/17/2018 9:46 AM, George Mihaiescu wrote: > >> and large rally tests of 500 instances complete with no issues. >> > > Sure, except you can't ssh into the guests. > > The whole reason the vif plugging is fatal and timeout and callback code > was because the upstream CI was unstable without it. The server would > report as ACTIVE but the ports weren't wired up so ssh would fail. Having > an ACTIVE guest that you can't actually do anything with is kind of > pointless. > > -- > > Thanks, > > Matt > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Thu May 17 16:39:13 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 17 May 2018 11:39:13 -0500 Subject: [Openstack-operators] Need feedback for nova aborting cold migration function In-Reply-To: <1526374144863.89140@nttdata.co.jp> References: <1525919628734.2105@nttdata.co.jp> <5fea9373-021a-0a2e-ba91-d7fe62bd5ca9@gmail.com> <1526374144863.89140@nttdata.co.jp> Message-ID: <9b1c9c3d-00dc-d073-96e7-4d6409521261@gmail.com> On 5/15/2018 3:48 AM, sagaray at nttdata.co.jp wrote: > We store the service logs which are created by VM on that storage. I don't mean to be glib, but have you considered maybe not doing that? 
-- Thanks, Matt From mriedemos at gmail.com Thu May 17 20:36:01 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 17 May 2018 15:36:01 -0500 Subject: [Openstack-operators] [nova] FYI on changes that might impact out of tree scheduler filters Message-ID: <58e08692-483a-9188-d2ee-e02978ce995c@gmail.com> CERN has upgraded to Cells v2 and is doing performance testing of the scheduler and were reporting some things today which got us back to this bug [1]. So I've starting pushing some patches related to this but also related to an older blueprint I created [2]. In summary, we do quite a bit of DB work just to load up a list of instance objects per host that the in-tree filters don't even use. The first change [3] is a simple optimization to avoid the default joins on the instance_info_caches and security_groups tables. If you have out of tree filters that, for whatever reason, rely on the HostState.instances objects to have info_cache or security_groups set, they'll continue to work, but will have to round-trip to the DB to lazy-load the fields, which is going to be a performance penalty on that filter. See the change for details. The second change in the series [4] is more drastic in that we'll do away with pulling the full Instance object per host, which means only a select set of optional fields can be lazy-loaded [5], and the rest will result in an exception. The patch currently has a workaround config option to continue doing things the old way if you have out of tree filters that rely on this, but for good citizens with only in-tree filters, you will get a performance improvement during scheduling. There are some other things we can do to optimize more of this flow, but this email is just about the ones that have patches up right now. [1] https://bugs.launchpad.net/nova/+bug/1737465 [2] https://blueprints.launchpad.net/nova/+spec/put-host-manager-instance-info-on-a-diet [3] https://review.openstack.org/#/c/569218/ [4] https://review.openstack.org/#/c/569247/ [5] https://github.com/openstack/nova/blob/de52fefa1fd52ccaac6807e5010c5f2a2dcbaab5/nova/objects/instance.py#L66 -- Thanks, Matt From gouthampravi at gmail.com Thu May 17 20:40:34 2018 From: gouthampravi at gmail.com (Goutham Pacha Ravi) Date: Thu, 17 May 2018 13:40:34 -0700 Subject: [Openstack-operators] [manila] manila operator's feedback forum etherpad available Message-ID: Cross posting from Openstack-dev because Tom's unable to post to this list yet. Manila operators, please note the session at the Forum next week. Thanks, Goutham ---------- Forwarded message ---------- From: Tom Barron Date: Thu, May 17, 2018 at 10:57 AM Subject: [openstack-dev] [manila] manila operator's feedback forum etherpad available To: openstack-operators at lists.openstack.org, openstack-dev at lists.openstack.org Next week at the Summit there is a forum session dedicated to Manila opertors' feedback on Thursday from 1:50-2:30pm [1] for which we have started an etherpad [2]. Please come and help manila developers do the right thing! We're particularly interested in experiences running the OpenStack share service at scale and overcoming any obstacles to deployment but are interested in getting any and all feedback from real deployments so that we can tailor our development and maintenance efforts to real world needs. Please feel free and encouraged to add to the etherpad starting now. See you there! 
-- Tom Barron Manila PTL irc: tbarron [1] https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21780/manila-ops-feedback-running-at-scale-overcoming-barriers-to-deployment [2] https://etherpad.openstack.org/p/YVR18-manila-forum-ops-feedback From rochelle.grober at huawei.com Fri May 18 00:55:22 2018 From: rochelle.grober at huawei.com (Rochelle Grober) Date: Fri, 18 May 2018 00:55:22 +0000 Subject: [Openstack-operators] [Forum] [all] [Stable] OpenStack is "mature" -- time to get serious on Maintainers -- Session etherpad and food for thought for discussion Message-ID: Folks, TL;DR The last session related to extended releases is: OpenStack is "mature" -- time to get serious on Maintainers It will be in room 220 at 11:00-11:40 The etherpad for the last session in the series on Extended releases is here: https://etherpad.openstack.org/p/YVR-openstack-maintainers-maint-pt3 There are links to info on other communities’ maintainer process/role/responsibilities also, as reference material on how other have made it work (or not). The nitty gritty details: The upcoming Forum is filled with sessions that are focused on issues needed to improve and maintain the sustainability of OpenStack projects for the long term. We have discussion on reducing technical debt, extended releases, fast forward installs, bringing Ops and User communities closer together, etc. The community is showing it is now invested in activities that are often part of “Sustaining Engineering” teams (corporate speak) or “Maintainers (OSS speak). We are doing this; we are thinking about the moving parts to do this; let’s think about the contributors who want to do these and bring some clarity to their roles and the processes they need to be successful. I am hoping you read this and keep these ideas in mind as you participate in the various Forum sessions. Then you can bring the ideas generated during all these discussions to the Maintainers session near the end of the Summit to brainstorm how to visualize and define this new(ish) component of our technical community. So, who has been doing the maintenance work so far? Mostly (mostly) unsung heroes like the Stable Release team, Release team, Oslo team, project liaisons and the community goals champions (yes, moving to py3 is a sustaining/maintenance type of activity). And some operators (Hi, mnaser!). We need to lean on their experience and what we think the community will need to reduce that technical debt to outline what the common tasks of maintainers should be, what else might fall in their purview, and how to partner with them to better serve them. With API lower limits, new tool versions, placement, py3, and even projects reaching “code complete” or “maintenance mode,” there is a lot of work for maintainers to do (I really don’t like that term, but is there one that fits OpenStack’s community?). It would be great if we could find a way to share the load such that we can have part time contributors here. We know that operators know how to cherrypick, test in there clouds, do bug fixes. How do we pair with them to get fixes upstreamed without requiring them to be full on developers? We have a bunch of alumni who have stopped being “cores” and sometimes even developers, but who love our community and might be willing and able to put in a few hours a week, maybe reviewing small patches, providing help with user/ops submitted patch requests, or whatever. They were trusted with +2 and +W in the past, so we should at least be able to trust they know what they know. 
We would need some way to identify them to Cores, since they would be sort of 1.5 on the voting scale, but…… So, burn out is high in other communities for maintainers. We need to find a way to make sustaining the stable parts of OpenStack sustainable. Hope you can make the talk, or add to the etherpad, or both. The etherpad is very much still a work in progress (trying to organize it to make sense). If you want to jump in now, go for it, otherwise it should be in reasonable shape for use at the session. I hope we get a good mix of community and a good collection of those who are already doing the job without title. Thanks and see you next week. --rocky ________________________________ 华为技术有限公司 Huawei Technologies Co., Ltd. [Company_logo] Rochelle Grober Sr. Staff Architect, Open Source Office Phone:408-330-5472 Email:rochelle.grober at huawei.com ________________________________ 本邮件及其附件含有华为公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件! This e-mail and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 5474 bytes Desc: image001.png URL: From radu.popescu at emag.ro Fri May 18 08:21:55 2018 From: radu.popescu at emag.ro (Radu Popescu | eMAG, Technology) Date: Fri, 18 May 2018 08:21:55 +0000 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> Message-ID: <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> Hi, so, nova says the VM is ACTIVE and it actually boots with no network. We are setting some metadata that we use later on and have cloud-init for different tasks. So, the VM is up and the OS is running, but networking only starts working after a random amount of time, which can get to around 45 minutes. Thing is, it's not happening to all VMs in that test (around 300), but it's happening to a fair amount - around 25%. I can see the callback coming a few seconds after the neutron openvswitch agent says it's completed the setup. My question is, why is it taking so long for the neutron openvswitch agent to configure the port? I can see the port up in both the host OS and openvswitch. I would assume it's doing the whole namespace and iptables setup. But still, 30 minutes? That seems like a lot! Thanks, Radu On Thu, 2018-05-17 at 11:50 -0400, George Mihaiescu wrote: We have other scheduled tests that perform end-to-end (assign floating IP, ssh, ping outside) and never had an issue. I think we turned it off because the callback code was initially buggy and nova would wait forever while things were in fact ok, but I'll change "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run another large test, just to confirm. We usually run these large tests after a version upgrade to test the APIs under load. On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann > wrote: On 5/17/2018 9:46 AM, George Mihaiescu wrote: and large rally tests of 500 instances complete with no issues.
Sure, except you can't ssh into the guests. The whole reason the vif plugging is fatal and timeout and callback code was because the upstream CI was unstable without it. The server would report as ACTIVE but the ports weren't wired up so ssh would fail. Having an ACTIVE guest that you can't actually do anything with is kind of pointless. _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators -------------- next part -------------- An HTML attachment was scrubbed... URL: From ghanshyammann at gmail.com Fri May 18 09:08:03 2018 From: ghanshyammann at gmail.com (Ghanshyam Mann) Date: Fri, 18 May 2018 18:08:03 +0900 Subject: [Openstack-operators] [Openstack-sigs] [First Contact] [SIG] [Forum] First Contact SIG Operator Inclusion Session Message-ID: Hi All, As you might know, FirstContact SIG is planning a forum sessions "First Contact SIG Operator Inclusion" [1] on Monday, May 21, 3:10pm. This session will discuss about Operators inclusion in FirstConact SIG to setup the operator bridge in this SIG. Hope to see more operators in this sessions and their valuable feedback/help. For more details, please go through the etherpad [2]. ..1 https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21712/first-contact-sig-operator-inclusion ..2 https://etherpad.openstack.org/p/FC-SIG-Ops-Inclusion -gmann From ebiibe82 at gmail.com Fri May 18 10:46:39 2018 From: ebiibe82 at gmail.com (Amit Kumar) Date: Fri, 18 May 2018 16:16:39 +0530 Subject: [Openstack-operators] [OpenStack-Operators][OpenStack] Regarding production grade OpenStack deployment Message-ID: Hi All, We want to deploy our private cloud using OpenStack as highly available (zero downtime (ZDT) - in normal course of action and during upgrades as well) production grade environment. We came across following tools. - We thought of using *Kolla-Kubernetes* as deployment tool, but we got feedback from Kolla IRC channel that this project is being retired. Moreover, we couldn't find latest documents having multi-node deployment steps and, High Availability support was also not mentioned at all anywhere in the documentation. - Another option to have Kubernetes based deployment is to use OpenStack-Helm, but it seems the OSH community has not made OSH 1.0 officially available yet. - Last option, is to use *Kolla-Ansible*, although it is not a Kubernetes deployment, but seems to have good community support around it. Also, its documentation talks a little about production grade deployment, probably it is being used in production grade environments. If you folks have used any of these tools for deploying OpenStack to fulfill these requirements: HA and ZDT, then please provide your inputs specifically about HA and ZDT support of the deployment tool, based on your experience. And please share if you have any reference links that you have used for achieving HA and ZDT for the respective tools. Lastly, if you think we should think that we have missed another more viable and stable options of deployment tools which can serve our requirement: HA and ZDT, then please do suggest the same. Regards, Amit -------------- next part -------------- An HTML attachment was scrubbed... 
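For context on the HA side of this question: with kolla-ansible, control-plane HA is mostly a matter of listing several controller hosts in the multinode inventory and letting the built-in haproxy/keepalived pair front the APIs on a VIP. A minimal sketch follows; the option names are taken from the kolla-ansible globals.yml of that era and the addresses and hostnames are placeholders, so check them against the release you deploy:

  # /etc/kolla/globals.yml (excerpt)
  kolla_base_distro: "centos"
  openstack_release: "queens"                 # pick your target release
  kolla_internal_vip_address: "10.10.10.254"  # keepalived-managed VIP
  enable_haproxy: "yes"                       # API endpoints served via the VIP

  # multinode inventory (excerpt): three controllers give a redundant
  # control plane; compute nodes go in their own group
  [control]
  ctrl01
  ctrl02
  ctrl03

  [compute]
  cmp[01:10]
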
URL: From gael.therond at gmail.com Fri May 18 12:17:58 2018 From: gael.therond at gmail.com (Flint WALRUS) Date: Fri, 18 May 2018 14:17:58 +0200 Subject: [Openstack-operators] [OpenStack-Operators][OpenStack] Regarding production grade OpenStack deployment In-Reply-To: References: Message-ID: Hi amit, I’m using kolla-ansible as a solution on our own infrastructure, however, be aware that because of the nature of Openstack you wont be able to achieve zero downtime if your hosted application do not take advantage of the distributed natre of ressources or if they’re not basically Cloud ready. Cheers. Le ven. 18 mai 2018 à 12:47, Amit Kumar a écrit : > Hi All, > > We want to deploy our private cloud using OpenStack as highly available > (zero downtime (ZDT) - in normal course of action and during upgrades as > well) production grade environment. We came across following tools. > > > - We thought of using *Kolla-Kubernetes* as deployment tool, but we > got feedback from Kolla IRC channel that this project is being retired. > Moreover, we couldn't find latest documents having multi-node deployment > steps and, High Availability support was also not mentioned at all anywhere > in the documentation. > - Another option to have Kubernetes based deployment is to use > OpenStack-Helm, but it seems the OSH community has not made OSH 1.0 > officially available yet. > - Last option, is to use *Kolla-Ansible*, although it is not a > Kubernetes deployment, but seems to have good community support around it. > Also, its documentation talks a little about production grade deployment, > probably it is being used in production grade environments. > > > If you folks have used any of these tools for deploying OpenStack to > fulfill these requirements: HA and ZDT, then please provide your inputs > specifically about HA and ZDT support of the deployment tool, based on your > experience. And please share if you have any reference links that you have > used for achieving HA and ZDT for the respective tools. > > Lastly, if you think we should think that we have missed another more > viable and stable options of deployment tools which can serve our > requirement: HA and ZDT, then please do suggest the same. > > Regards, > Amit > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.page at canonical.com Fri May 18 13:15:07 2018 From: james.page at canonical.com (James Page) Date: Fri, 18 May 2018 14:15:07 +0100 Subject: [Openstack-operators] [sig] [upgrades] inaugural meeting minutes & vancouver forum In-Reply-To: References: Message-ID: Hi All Lujin, Lee and myself held the inaugural IRC meeting for the Upgrades SIG this week (see [0]). Suffice to say that, due to other time pressures, setup of the SIG has taken a lot longer than desired, but hopefully now we have the ball rolling we can keep up a bit of momentum. The Upgrades SIG intended to meet weekly, alternating between slots that work for (hopefully) all time zones: http://eavesdrop.openstack.org/#Upgrades_SIG That said, we'll skip next weeks meeting due to the OpenStack Summit and Forum in Vancouver, where we have a BoF on the schedule (see [1]) instead. If you're interested in OpenStack Upgrades the BoF and Erik's sessions on Fast Forward Upgrades (see [2]) should be on your schedule for next week! 
Cheers James [0] http://eavesdrop.openstack.org/meetings/upgrade_sig/2018/upgrade_sig.2018-05-15-09.06.html [1] https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21855/upgrade-sig-bof [2] https://www.openstack.org/summit/vancouver-2018/summit-schedule/global-search?t=upgrades -------------- next part -------------- An HTML attachment was scrubbed... URL: From emccormick at cirrusseven.com Fri May 18 16:42:17 2018 From: emccormick at cirrusseven.com (Erik McCormick) Date: Fri, 18 May 2018 09:42:17 -0700 Subject: [Openstack-operators] Fast Forward Upgrades (FFU) Forum Sessions Message-ID: Hello all, There are two forum sessions in Vancouver covering Fast Forward Upgrades. Session 1 (Current State): Wednesday May 23rd, 09:00 - 09:40, Room 220 Session 2 (Future Work): Wednesday May 23rd, 09:50 - 10:30, Room 220 The combined etherpad for both sessions can be found at: https://etherpad.openstack.org/p/YVR-forum-fast-forward-upgrades Please take some time to add in topics you would like to see discussed or add any other pertinent information. There are several reference links at the top which are worth reviewing prior to the sessions if you have the time. See you all in Vancover! Cheers, Erik From ebiibe82 at gmail.com Fri May 18 18:59:52 2018 From: ebiibe82 at gmail.com (Amit Kumar) Date: Sat, 19 May 2018 00:29:52 +0530 Subject: [Openstack-operators] [OpenStack-Operators][OpenStack] Regarding production grade OpenStack deployment In-Reply-To: References: Message-ID: Hi, Thanks for sharing your experience. I am talking about HA of only OpenStack services and not the hosted applications or the OpenStack instances they are hosted on. So, for now it is not the requirement. But from your response, it seems that you have deployed OpenStack with Kolla-Ansible in multi node, multi-Controller architecture, right? And any experience with Kolla-Ansible from OpenStack release upgrade perspective? Is ZDT of OpenStack services feasible while upgrading? Regards, Amit On May 18, 2018 5:48 PM, "Flint WALRUS" wrote: Hi amit, I’m using kolla-ansible as a solution on our own infrastructure, however, be aware that because of the nature of Openstack you wont be able to achieve zero downtime if your hosted application do not take advantage of the distributed natre of ressources or if they’re not basically Cloud ready. Cheers. Le ven. 18 mai 2018 à 12:47, Amit Kumar a écrit : > Hi All, > > We want to deploy our private cloud using OpenStack as highly available > (zero downtime (ZDT) - in normal course of action and during upgrades as > well) production grade environment. We came across following tools. > > > - We thought of using *Kolla-Kubernetes* as deployment tool, but we > got feedback from Kolla IRC channel that this project is being retired. > Moreover, we couldn't find latest documents having multi-node deployment > steps and, High Availability support was also not mentioned at all anywhere > in the documentation. > - Another option to have Kubernetes based deployment is to use > OpenStack-Helm, but it seems the OSH community has not made OSH 1.0 > officially available yet. > - Last option, is to use *Kolla-Ansible*, although it is not a > Kubernetes deployment, but seems to have good community support around it. > Also, its documentation talks a little about production grade deployment, > probably it is being used in production grade environments. 
> > > If you folks have used any of these tools for deploying OpenStack to > fulfill these requirements: HA and ZDT, then please provide your inputs > specifically about HA and ZDT support of the deployment tool, based on your > experience. And please share if you have any reference links that you have > used for achieving HA and ZDT for the respective tools. > > Lastly, if you think we should think that we have missed another more > viable and stable options of deployment tools which can serve our > requirement: HA and ZDT, then please do suggest the same. > > Regards, > Amit > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Fox at pnnl.gov Fri May 18 20:07:01 2018 From: Kevin.Fox at pnnl.gov (Fox, Kevin M) Date: Fri, 18 May 2018 20:07:01 +0000 Subject: [Openstack-operators] [OpenStack-Operators][OpenStack] Regarding production grade OpenStack deployment In-Reply-To: References: Message-ID: <1A3C52DFCD06494D8528644858247BF01C0D11A7@EX10MBOX03.pnnl.gov> I don't think openstack itself can meet full zero downtime requirements. But if it can, then I also think none of the deployment tools try and support that use case either. Thanks, Kevin ________________________________ From: Amit Kumar [ebiibe82 at gmail.com] Sent: Friday, May 18, 2018 3:46 AM To: OpenStack Operators; Openstack Subject: [Openstack-operators] [OpenStack-Operators][OpenStack] Regarding production grade OpenStack deployment Hi All, We want to deploy our private cloud using OpenStack as highly available (zero downtime (ZDT) - in normal course of action and during upgrades as well) production grade environment. We came across following tools. * We thought of using Kolla-Kubernetes as deployment tool, but we got feedback from Kolla IRC channel that this project is being retired. Moreover, we couldn't find latest documents having multi-node deployment steps and, High Availability support was also not mentioned at all anywhere in the documentation. * Another option to have Kubernetes based deployment is to use OpenStack-Helm, but it seems the OSH community has not made OSH 1.0 officially available yet. * Last option, is to use Kolla-Ansible, although it is not a Kubernetes deployment, but seems to have good community support around it. Also, its documentation talks a little about production grade deployment, probably it is being used in production grade environments. If you folks have used any of these tools for deploying OpenStack to fulfill these requirements: HA and ZDT, then please provide your inputs specifically about HA and ZDT support of the deployment tool, based on your experience. And please share if you have any reference links that you have used for achieving HA and ZDT for the respective tools. Lastly, if you think we should think that we have missed another more viable and stable options of deployment tools which can serve our requirement: HA and ZDT, then please do suggest the same. Regards, Amit -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gael.therond at gmail.com Fri May 18 20:35:56 2018 From: gael.therond at gmail.com (Flint WALRUS) Date: Fri, 18 May 2018 22:35:56 +0200 Subject: [Openstack-operators] [OpenStack-Operators][OpenStack] Regarding production grade OpenStack deployment In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C0D11A7@EX10MBOX03.pnnl.gov> References: <1A3C52DFCD06494D8528644858247BF01C0D11A7@EX10MBOX03.pnnl.gov> Message-ID: Oh ok! Yes, if you only focus on the control plan, the answer is yes, I’m using kolla-ansible and it’s working really well. It helped us to bring more services online more quickly and solved our lifecycle management that was kind of tricky. I’m using a Blue/Green deployement method and yes I’m using multinode form. Remember that kolla-ansible is a simple shell script wrapping ansible-playbook and that if you’re curious of what the playbooks look like you just have to install it (or goes on github) with pip and then get your hands on it. Kind regards. Le ven. 18 mai 2018 à 22:07, Fox, Kevin M a écrit : > I don't think openstack itself can meet full zero downtime requirements. > But if it can, then I also think none of the deployment tools try and > support that use case either. > > Thanks, > Kevin > ------------------------------ > *From:* Amit Kumar [ebiibe82 at gmail.com] > *Sent:* Friday, May 18, 2018 3:46 AM > *To:* OpenStack Operators; Openstack > *Subject:* [Openstack-operators] [OpenStack-Operators][OpenStack] > Regarding production grade OpenStack deployment > > Hi All, > > We want to deploy our private cloud using OpenStack as highly available > (zero downtime (ZDT) - in normal course of action and during upgrades as > well) production grade environment. We came across following tools. > > > - We thought of using *Kolla-Kubernetes* as deployment tool, but we > got feedback from Kolla IRC channel that this project is being retired. > Moreover, we couldn't find latest documents having multi-node deployment > steps and, High Availability support was also not mentioned at all anywhere > in the documentation. > - Another option to have Kubernetes based deployment is to use > OpenStack-Helm, but it seems the OSH community has not made OSH 1.0 > officially available yet. > - Last option, is to use *Kolla-Ansible*, although it is not a > Kubernetes deployment, but seems to have good community support around it. > Also, its documentation talks a little about production grade deployment, > probably it is being used in production grade environments. > > > If you folks have used any of these tools for deploying OpenStack to > fulfill these requirements: HA and ZDT, then please provide your inputs > specifically about HA and ZDT support of the deployment tool, based on your > experience. And please share if you have any reference links that you have > used for achieving HA and ZDT for the respective tools. > > Lastly, if you think we should think that we have missed another more > viable and stable options of deployment tools which can serve our > requirement: HA and ZDT, then please do suggest the same. > > Regards, > Amit > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... 
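For anyone following along, the multinode workflow Flint describes boils down to a handful of wrapper commands around ansible-playbook. This is only an outline based on the kolla-ansible quick-start documentation, so adjust the inventory path, versions and password handling to your environment:

  pip install kolla-ansible
  cp /usr/share/kolla-ansible/ansible/inventory/multinode .
  kolla-genpwd                               # fills /etc/kolla/passwords.yml
  kolla-ansible -i ./multinode bootstrap-servers
  kolla-ansible -i ./multinode prechecks
  kolla-ansible -i ./multinode pull          # optional: pre-fetch images
  kolla-ansible -i ./multinode deploy
  kolla-ansible -i ./multinode post-deploy   # writes the admin openrc file

Upgrades follow the same pattern with "kolla-ansible -i ./multinode upgrade" after bumping openstack_release, which is where the (near) zero-downtime discussion in this thread really gets tested.
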
URL: From chris.friesen at windriver.com Fri May 18 20:39:39 2018 From: chris.friesen at windriver.com (Chris Friesen) Date: Fri, 18 May 2018 14:39:39 -0600 Subject: [Openstack-operators] [OpenStack-Operators][OpenStack] Regarding production grade OpenStack deployment In-Reply-To: <1A3C52DFCD06494D8528644858247BF01C0D11A7@EX10MBOX03.pnnl.gov> References: <1A3C52DFCD06494D8528644858247BF01C0D11A7@EX10MBOX03.pnnl.gov> Message-ID: <5AFF3A0B.3010400@windriver.com> Are you talking about downtime of instances (and the dataplane), or of the OpenStack API and control plane? And when you say "zero downtime" are you really talking about "five nines" or similar? Because nothing is truly zero downtime. If you care about HA then you'll need additional components outside of OpenStack proper. You'll need health checks on your physical nodes, health checks on your network links, possibly end-to-end health checks up into the applications running in your guests, redundant network paths, redundant controller nodes, HA storage, etc. You'll have to think about how to ensure your database and messaging service are HA. You may want to look at ensuring that your OpenStack services do not interfere with the VMs running on that node and vice versa. We ended up rolling our own install mechanisms because we weren't satisfied with any of the existing projects. That was a while ago now so I don't know how far they've come. Chris On 05/18/2018 02:07 PM, Fox, Kevin M wrote: > I don't think openstack itself can meet full zero downtime requirements. But if > it can, then I also think none of the deployment tools try and support that use > case either. > > Thanks, > Kevin > -------------------------------------------------------------------------------- > *From:* Amit Kumar [ebiibe82 at gmail.com] > *Sent:* Friday, May 18, 2018 3:46 AM > *To:* OpenStack Operators; Openstack > *Subject:* [Openstack-operators] [OpenStack-Operators][OpenStack] Regarding > production grade OpenStack deployment > > Hi All, > > We want to deploy our private cloud using OpenStack as highly available (zero > downtime (ZDT) - in normal course of action and during upgrades as well) > production grade environment. We came across following tools. > > * We thought of using /Kolla-Kubernetes/ as deployment tool, but we got > feedback from Kolla IRC channel that this project is being retired. > Moreover, we couldn't find latest documents having multi-node deployment > steps and, High Availability support was also not mentioned at all anywhere > in the documentation. > * Another option to have Kubernetes based deployment is to use OpenStack-Helm, > but it seems the OSH community has not made OSH 1.0 officially available yet. > * Last option, is to use /Kolla-Ansible/, although it is not a Kubernetes > deployment, but seems to have good community support around it. Also, its > documentation talks a little about production grade deployment, probably it > is being used in production grade environments. > > > If you folks have used any of these tools for deploying OpenStack to fulfill > these requirements: HA and ZDT, then please provide your inputs specifically > about HA and ZDT support of the deployment tool, based on your experience. And > please share if you have any reference links that you have used for achieving HA > and ZDT for the respective tools. > > Lastly, if you think we should think that we have missed another more viable and > stable options of deployment tools which can serve our requirement: HA and ZDT, > then please do suggest the same. 
> > Regards, > Amit > > > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > From lbragstad at gmail.com Fri May 18 21:02:47 2018 From: lbragstad at gmail.com (Lance Bragstad) Date: Fri, 18 May 2018 16:02:47 -0500 Subject: [Openstack-operators] [User-committee] [Forum] [all] [Stable] OpenStack is "mature" -- time to get serious on Maintainers -- Session etherpad and food for thought for discussion In-Reply-To: References: Message-ID: <1d7a6055-df34-c0f6-98a0-d8a8f9cfafa8@gmail.com> Here is the link to the session in case you'd like to add it to your schedule [0]. [0] https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21759/openstack-is-mature-time-to-get-serious-on-maintainers On 05/17/2018 07:55 PM, Rochelle Grober wrote: > > Folks, > >   > > TL;DR > > The last session related to extended releases is: OpenStack is > "mature" -- time to get serious on Maintainers > It will be in room 220 at 11:00-11:40 > > The etherpad for the last session in the series on Extended releases > is here: > > https://etherpad.openstack.org/p/YVR-openstack-maintainers-maint-pt3 > >   > > There are links to info on other communities’ maintainer > process/role/responsibilities also, as reference material on how other > have made it work (or not). > >   > > The nitty gritty details: > >   > > The upcoming Forum is filled with sessions that are focused on issues > needed to improve and maintain the sustainability of OpenStack > projects for the long term.  We have discussion on reducing technical > debt, extended releases, fast forward installs, bringing Ops and User > communities closer together, etc.  The community is showing it is now > invested in activities that are often part of “Sustaining Engineering” > teams (corporate speak) or “Maintainers (OSS speak).  We are doing > this; we are thinking about the moving parts to do this; let’s think > about the contributors who want to do these and bring some clarity to > their roles and the processes they need to be successful.  I am hoping > you read this and keep these ideas in mind as you participate in the > various Forum sessions.  Then you can bring the ideas generated during > all these discussions to the Maintainers session near the end of the > Summit to brainstorm how to visualize and define this new(ish) > component of our technical community. > >   > > So, who has been doing the maintenance work so far?  Mostly (mostly) > unsung heroes like the Stable Release team, Release team, Oslo team, > project liaisons and the community goals champions (yes, moving to py3 > is a sustaining/maintenance type of activity).  And some operators > (Hi, mnaser!).  We need to lean on their experience and what we think > the community will need to reduce that technical debt to outline what > the common tasks of maintainers should be, what else might fall in > their purview, and how to partner with them to better serve them. > >   > > With API lower limits, new tool versions, placement, py3, and even > projects reaching “code complete” or “maintenance mode,” there is a > lot of work for maintainers to do (I really don’t like that term, but > is there one that fits OpenStack’s community?).  It would be great if > we could find a way to share the load such that we can have part time > contributors here.  We know that operators know how to cherrypick, > test in there clouds, do bug fixes.  
How do we pair with them to get > fixes upstreamed without requiring them to be full on developers?  We > have a bunch of alumni who have stopped being “cores” and sometimes > even developers, but who love our community and might be willing and > able to put in a few hours a week, maybe reviewing small patches, > providing help with user/ops submitted patch requests, or whatever.  > They were trusted with +2 and +W in the past, so we should at least be > able to trust they know what they know.  We  would need some way to > identify them to Cores, since they would be sort of 1.5 on the voting > scale, but…… > >   > > So, burn out is high in other communities for maintainers.  We need to > find a way to make sustaining the stable parts of OpenStack sustainable. > >   > > Hope you can make the talk, or add to the etherpad, or both.  The > etherpad is very musch still a work in progress (trying to organize it > to make sense).  If you want to jump in now, go for it, otherwise it > should be in reasonable shape for use at the session.  I hope we get a > good mix of community and a good collection of those who are already > doing the job without title. > >   > > Thanks and see you next week. > > --rocky > >   > >   > >   > > ------------------------------------------------------------------------ > > 华为技术有限公司 Huawei Technologies Co., Ltd. > > Company_logo > > Rochelle Grober > > Sr. Staff Architect, Open Source > Office Phone:408-330-5472 > Email:rochelle.grober at huawei.com > > ------------------------------------------------------------------------ > > 本邮件及其附件含有华为公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁 > 止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中 > 的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件! > This e-mail and its attachments contain confidential information from > HUAWEI, which > is intended only for the person or entity whose address is listed > above. Any use of the > information contained herein in any way (including, but not limited > to, total or partial > disclosure, reproduction, or dissemination) by persons other than the > intended > recipient(s) is prohibited. If you receive this e-mail in error, > please notify the sender by > phone or email immediately and delete it! > >   > > > > _______________________________________________ > User-committee mailing list > User-committee at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 5474 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL: From rochelle.grober at huawei.com Fri May 18 21:07:46 2018 From: rochelle.grober at huawei.com (Rochelle Grober) Date: Fri, 18 May 2018 21:07:46 +0000 Subject: [Openstack-operators] [User-committee] [Forum] [all] [Stable] OpenStack is "mature" -- time to get serious on Maintainers -- Session etherpad and food for thought for discussion In-Reply-To: <1d7a6055-df34-c0f6-98a0-d8a8f9cfafa8@gmail.com> References: <1d7a6055-df34-c0f6-98a0-d8a8f9cfafa8@gmail.com> Message-ID: Thanks, Lance! Also, the more I think about it, the more I think Maintainer has too much baggage to use that term for this role. It really is “continuity” that we are looking for. Continuous important fixes, continuous updates of tools used to produce the SW. 
Keep this in the back of your minds for the discussion. And yes, this is a discussion to see if we are interested, and only if there is interest, how to move forward. --Rocky From: Lance Bragstad [mailto:lbragstad at gmail.com] Sent: Friday, May 18, 2018 2:03 PM To: Rochelle Grober ; openstack-dev ; openstack-operators ; user-committee Subject: Re: [User-committee] [Forum] [all] [Stable] OpenStack is "mature" -- time to get serious on Maintainers -- Session etherpad and food for thought for discussion Here is the link to the session in case you'd like to add it to your schedule [0]. [0] https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21759/openstack-is-mature-time-to-get-serious-on-maintainers On 05/17/2018 07:55 PM, Rochelle Grober wrote: Folks, TL;DR The last session related to extended releases is: OpenStack is "mature" -- time to get serious on Maintainers It will be in room 220 at 11:00-11:40 The etherpad for the last session in the series on Extended releases is here: https://etherpad.openstack.org/p/YVR-openstack-maintainers-maint-pt3 There are links to info on other communities’ maintainer process/role/responsibilities also, as reference material on how other have made it work (or not). The nitty gritty details: The upcoming Forum is filled with sessions that are focused on issues needed to improve and maintain the sustainability of OpenStack projects for the long term. We have discussion on reducing technical debt, extended releases, fast forward installs, bringing Ops and User communities closer together, etc. The community is showing it is now invested in activities that are often part of “Sustaining Engineering” teams (corporate speak) or “Maintainers (OSS speak). We are doing this; we are thinking about the moving parts to do this; let’s think about the contributors who want to do these and bring some clarity to their roles and the processes they need to be successful. I am hoping you read this and keep these ideas in mind as you participate in the various Forum sessions. Then you can bring the ideas generated during all these discussions to the Maintainers session near the end of the Summit to brainstorm how to visualize and define this new(ish) component of our technical community. So, who has been doing the maintenance work so far? Mostly (mostly) unsung heroes like the Stable Release team, Release team, Oslo team, project liaisons and the community goals champions (yes, moving to py3 is a sustaining/maintenance type of activity). And some operators (Hi, mnaser!). We need to lean on their experience and what we think the community will need to reduce that technical debt to outline what the common tasks of maintainers should be, what else might fall in their purview, and how to partner with them to better serve them. With API lower limits, new tool versions, placement, py3, and even projects reaching “code complete” or “maintenance mode,” there is a lot of work for maintainers to do (I really don’t like that term, but is there one that fits OpenStack’s community?). It would be great if we could find a way to share the load such that we can have part time contributors here. We know that operators know how to cherrypick, test in there clouds, do bug fixes. How do we pair with them to get fixes upstreamed without requiring them to be full on developers? 
We have a bunch of alumni who have stopped being “cores” and sometimes even developers, but who love our community and might be willing and able to put in a few hours a week, maybe reviewing small patches, providing help with user/ops submitted patch requests, or whatever. They were trusted with +2 and +W in the past, so we should at least be able to trust they know what they know. We would need some way to identify them to Cores, since they would be sort of 1.5 on the voting scale, but…… So, burn out is high in other communities for maintainers. We need to find a way to make sustaining the stable parts of OpenStack sustainable. Hope you can make the talk, or add to the etherpad, or both. The etherpad is very musch still a work in progress (trying to organize it to make sense). If you want to jump in now, go for it, otherwise it should be in reasonable shape for use at the session. I hope we get a good mix of community and a good collection of those who are already doing the job without title. Thanks and see you next week. --rocky ________________________________ 华为技术有限公司 Huawei Technologies Co., Ltd. [Company_logo] Rochelle Grober Sr. Staff Architect, Open Source Office Phone:408-330-5472 Email:rochelle.grober at huawei.com ________________________________  本邮件及其附件含有华为公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁 止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中 的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件! This e-mail and its attachments contain confidential information from HUAWEI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it! _______________________________________________ User-committee mailing list User-committee at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/user-committee -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 5474 bytes Desc: image001.png URL: From gmann at ghanshyammann.com Sat May 19 14:24:59 2018 From: gmann at ghanshyammann.com (Ghanshyam Mann) Date: Sat, 19 May 2018 23:24:59 +0900 Subject: [Openstack-operators] [openstack-dev] [openstack-operators][qa] Tempest removal of test_get_service_by_service_and_host_name Message-ID: Hi All, Patch https://review.openstack.org/#/c/569112/1 removed the test_get_service_by_service_and_host_name from tempest tree which looks ok as per bug and commit msg. This satisfy the condition of test removal as per process [1] and this mail is to complete the test removal process to check the external usage of this test. There is one place this tests is listed in Trio2o doc, i have raised patch in Trio2o to remove that to avoid any future confusion[2]. If this test is required by anyone, please respond to this mail, otherwise we are good here. 
..1 https://docs.openstack.org/tempest/latest/test_removal.html ..2 https://review.openstack.org/#/c/569568/ -gmann From amy at demarco.com Sun May 20 16:18:22 2018 From: amy at demarco.com (Amy Marrich) Date: Sun, 20 May 2018 09:18:22 -0700 Subject: [Openstack-operators] OPs and User Sessions at Summit Message-ID: <614A0623-1D14-43FA-8EC4-AE71FCEA4BCD@demarco.com> Hi everyone, There are a lot of great events and sessions going on at Summit next week that I wanted to bring your attention to! Forum sessions are extremely important for starting and continuing conversations and really are a can't miss! For those of you who may not know what the forum is, it’s the opportunities for operators and developers to gather together to discuss requirements for the release, provide feedback and have strategic discussions. Amy (spotz) User Committee Diversity WG Chair Sunday, May 20 - All day Board Meeting Sunday, May 20 @6:00 - 8:00pm https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21791/weareopenstack-diversity-happy-hour-sponsored-by-red-hat-rsvp-required - Diversity Happy Hour Monday, May 21 @2:10 - 2:30 https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21862/state-of-the-user-committee - Lightning Talk by Melvin & Matt Monday, May 21 @ 2:20 https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21712/first-contact-sig-operator-inclusion - First Contact SIG Operator Inclusion Monday, May 21 @ 4:20 https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21747/opsdevs-one-community - Ops/Devs One Community Monday, May 21 @5:10pm - 5: https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21788/ops-meetups-team-catch-up-and-ptg-merger-discussion - PTG Merger Discussion Monday, May 21, 6:00pm-7:00pm https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21865/ambassador-meet-and-greet-at-the-openinfra-mixer - Meet ambassadors. Have cocktails. Tuesday, May 22, 1:50 https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21785/upgrading-openstack-war-stories - Upgrading War Stories, Wednesday, May 23 @ 5:30 https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21725/openstack-operators-community-documentation - OpenStack Operators Community Documentation Thursday May 24 @ 9:00 and 9:50amam https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21721/extended-maintenance-part-i-past-present-and-future - Extended Maintenance: Parts I and II -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.studarus at openstacksandiego.org Sun May 20 20:28:09 2018 From: john.studarus at openstacksandiego.org (John Studarus) Date: Sun, 20 May 2018 13:28:09 -0700 Subject: [Openstack-operators] OpenStack US & CA speaker opportunities Message-ID: <1637f3cef87.11d475b0f196695.942736707121116595@openstacksandiego.org> Dear OpenStack PTLs, devs, operators, and community leaders, We're reaching out to those interested in presenting at events across the US & Canada. The first opportunity is this July 10th, at the Intel Campus in Santa Clara, CA. The SF Bay Area OpenStack group is organizing a half day of presentations and labs with an evening social event to showcase Open Infrastructure and Cloud Native technologies (like Containers, and SDN). We have a number of invited, sponsored breakout sessions and lightening talks available. If you're interested, feel free to contact us directly via email or at the submission page below. 
https://www.papercall.io/openstack-8th-san-jose We're also happy to co-ordinate events with the Meetup groups across the US and Canada. If you're looking to get out and talk, just drop us a note and we can co-ordinate which groups would be convenient to you. Perhaps you'll be traveling and have the evening free to speak to a local group? We can make it happen! All three of us will be in Vancouver this week if you'd like to talk in person. John, Lisa, & Stacy OpenStack Ambassadors for North America and Canada ---- John Studarus - OpenStack Ambassador - John at OpenStackSanDiego.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From rico.lin.guanyu at gmail.com Mon May 21 09:15:33 2018 From: rico.lin.guanyu at gmail.com (Rico Lin) Date: Mon, 21 May 2018 17:15:33 +0800 Subject: [Openstack-operators] [openstack-dev][heat] Heat sessions in Vancouver summit!! And they're all in Tuesday! Message-ID: Dear all As Summit is about to start, looking forward to meet all of you here. Don't miss out sessions from Heat team. They're all on Tuesday! Feel free to let me know if you hope to see anything or learn anything from sessions. Will try my best to prepare it for you. *Tuesday 229:00am - 9:40am Users & Ops feedback for Heat * Vancouver Convention Centre West - Level Two - Room 220 https://www.openstack.org/summit/vancouver-2018/summit- schedule/events/21713/users-and-ops-feedback-for-heat *11:00am - 11:20am Heat - Project Update* Vancouver Convention Centre West - Level Two - Room 212 https://www.openstack.org/summit/vancouver-2018/summit- schedule/events/21595/heat-project-update *1:50pm - 2:30pm Heat - Project Onboarding* Vancouver Convention Centre West - Level Two - Room 223 https://www.openstack.org/summit/vancouver-2018/summit- schedule/events/21629/heat-project-onboarding See you all on Tuesday!! -- May The Force of OpenStack Be With You, *Rico Lin*irc: ricolin -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Smith at ccur.com Mon May 21 18:51:05 2018 From: Eric.Smith at ccur.com (Smith, Eric) Date: Mon, 21 May 2018 18:51:05 +0000 Subject: [Openstack-operators] Multiple Ceph pools for Nova? Message-ID: I have 2 Ceph pools, one backed by SSDs and one backed by spinning disks (Separate roots within the CRUSH hierarchy). I’d like to run all instances in a single project / tenant on SSDs and the rest on spinning disks. How would I go about setting this up? -------------- next part -------------- An HTML attachment was scrubbed... URL: From emccormick at cirrusseven.com Mon May 21 19:17:24 2018 From: emccormick at cirrusseven.com (Erik McCormick) Date: Mon, 21 May 2018 12:17:24 -0700 Subject: [Openstack-operators] Multiple Ceph pools for Nova? In-Reply-To: References: Message-ID: Do you have enough hypervisors you can dedicate some to each purpose? You could make two availability zones each with a different backend. On Mon, May 21, 2018, 11:52 AM Smith, Eric wrote: > I have 2 Ceph pools, one backed by SSDs and one backed by spinning disks > (Separate roots within the CRUSH hierarchy). I’d like to run all instances > in a single project / tenant on SSDs and the rest on spinning disks. How > would I go about setting this up? > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... 
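If the two-backend split is done with availability zones as suggested above, the wiring is just a pair of host aggregates with zone names attached, plus per-host libvirt RBD settings. The following is a rough sketch only; the aggregate, zone, host and pool names are made-up placeholders, and the nova.conf fragment shows the commonly used option names rather than anything quoted from this thread.

# Two AZs, one per Ceph backend (names are illustrative only).
openstack aggregate create --zone az-ssd  agg-ssd
openstack aggregate create --zone az-sata agg-sata
openstack aggregate add host agg-ssd  compute01
openstack aggregate add host agg-sata compute11

# On each group of computes, nova.conf points ephemeral disks at its pool:
#   [libvirt]
#   images_type = rbd
#   images_rbd_pool = vms-ssd        # vms-sata on the spinning-disk hosts
#   images_rbd_ceph_conf = /etc/ceph/ceph.conf

# Instances then land on the matching backend by zone:
openstack server create --availability-zone az-ssd \
  --flavor m1.medium --image cirros --network private vm-on-ssd

The aggregate-plus-flavor variant discussed elsewhere in the thread works the same way, except the aggregate carries metadata that is matched by a flavor extra spec (via the AggregateInstanceExtraSpecsFilter) instead of a zone name.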
URL: From guilherme.pimentel at ccc.ufcg.edu.br Mon May 21 19:20:07 2018 From: guilherme.pimentel at ccc.ufcg.edu.br (Guilherme Steinmuller Pimentel Pimentel) Date: Mon, 21 May 2018 16:20:07 -0300 Subject: [Openstack-operators] Multiple Ceph pools for Nova? In-Reply-To: References: Message-ID: I usually separate things using host aggregate feature. In my deployment, I have 2 different nova pools. So, in nova.conf, I define the *images_rbd_pool* variable point to desired pool and then, I create an aggregate and put these computes into it. The flavor extra_spec metadata will define which aggregate the instance will be scheduled. 2018-05-21 15:51 GMT-03:00 Smith, Eric : > I have 2 Ceph pools, one backed by SSDs and one backed by spinning disks > (Separate roots within the CRUSH hierarchy). I’d like to run all instances > in a single project / tenant on SSDs and the rest on spinning disks. How > would I go about setting this up? > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guilherme.pimentel at ccc.ufcg.edu.br Mon May 21 19:31:40 2018 From: guilherme.pimentel at ccc.ufcg.edu.br (Guilherme Steinmuller Pimentel Pimentel) Date: Mon, 21 May 2018 16:31:40 -0300 Subject: [Openstack-operators] Multiple Ceph pools for Nova? In-Reply-To: References: Message-ID: 2018-05-21 16:17 GMT-03:00 Erik McCormick : > Do you have enough hypervisors you can dedicate some to each purpose? You > could make two availability zones each with a different backend. > I have about 20 hypervisors. Ten are using a nova pool with SAS disks and the other 10 are using another pool using SATA disks. Yes, making two availability zones is an option. I didn't dive deep into it when I was planning the deployment, so I am using the default nova availability zone and defining which pool to use by flavor/aggregate metadata. > > On Mon, May 21, 2018, 11:52 AM Smith, Eric wrote: > >> I have 2 Ceph pools, one backed by SSDs and one backed by spinning disks >> (Separate roots within the CRUSH hierarchy). I’d like to run all instances >> in a single project / tenant on SSDs and the rest on spinning disks. How >> would I go about setting this up? >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Tue May 22 04:51:36 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Mon, 21 May 2018 21:51:36 -0700 Subject: [Openstack-operators] Multiple Ceph pools for Nova? In-Reply-To: References: Message-ID: <66b14191-cd3b-b455-3be9-80cf3629517e@gmail.com> On 5/21/2018 11:51 AM, Smith, Eric wrote: > I have 2 Ceph pools, one backed by SSDs and one backed by spinning disks > (Separate roots within the CRUSH hierarchy). I’d like to run all > instances in a single project / tenant on SSDs and the rest on spinning > disks. How would I go about setting this up? 
As mentioned elsewhere, host aggregate would work for the compute hosts connected to each storage pool. Then you can have different flavors per aggregate and charge more for the SSD flavors or restrict the aggregates based on tenant [1]. Alternatively, if this is something you plan to eventually scale to a larger size, you could even separate the pools with separate cells and use resource provider aggregates in placement to mirror the host aggregates for tenant-per-cell filtering [2]. It sounds like this is very similar to what CERN does (cells per hardware characteristics and projects assigned to specific cells). So Belmiro could probably help give some guidance here too. Check out the talk he gave today at the summit [3]. [1] https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#aggregatemultitenancyisolation [2] https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#tenant-isolation-with-placement [3] https://www.openstack.org/videos/vancouver-2018/moving-from-cellsv1-to-cellsv2-at-cern -- Thanks, Matt From zioproto at gmail.com Tue May 22 13:29:45 2018 From: zioproto at gmail.com (Saverio Proto) Date: Tue, 22 May 2018 15:29:45 +0200 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> Message-ID: Hello Radu, do you have the Openstack rootwrap configured to work in daemon mode ? please read this article: 2018-05-18 10:21 GMT+02:00 Radu Popescu | eMAG, Technology : > Hi, > > so, nova says the VM is ACTIVE and actually boots with no network. We are > setting some metadata that we use later on and have cloud-init for different > tasks. > So, VM is up, OS is running, but network is working after a random amount of > time, that can get to around 45 minutes. Thing is, is not happening to all > VMs in that test (around 300), but it's happening to a fair amount - around > 25%. > > I can see the callback coming few seconds after neutron openvswitch agent > says it's completed the setup. My question is, why is it taking so long for > nova openvswitch agent to configure the port? I can see the port up in both > host OS and openvswitch. I would assume it's doing the whole namespace and > iptables setup. But still, 30 minutes? Seems a lot! > > Thanks, > Radu > > On Thu, 2018-05-17 at 11:50 -0400, George Mihaiescu wrote: > > We have other scheduled tests that perform end-to-end (assign floating IP, > ssh, ping outside) and never had an issue. > I think we turned it off because the callback code was initially buggy and > nova would wait forever while things were in fact ok, but I'll change > "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run > another large test, just to confirm. > > We usually run these large tests after a version upgrade to test the APIs > under load. > > > > On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann > wrote: > > On 5/17/2018 9:46 AM, George Mihaiescu wrote: > > and large rally tests of 500 instances complete with no issues. > > > Sure, except you can't ssh into the guests. > > The whole reason the vif plugging is fatal and timeout and callback code was > because the upstream CI was unstable without it. The server would report as > ACTIVE but the ports weren't wired up so ssh would fail. Having an ACTIVE > guest that you can't actually do anything with is kind of pointless. 
> > _______________________________________________ > > OpenStack-operators mailing list > > OpenStack-operators at lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > From zioproto at gmail.com Tue May 22 13:30:24 2018 From: zioproto at gmail.com (Saverio Proto) Date: Tue, 22 May 2018 15:30:24 +0200 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> Message-ID: Sorry email went out incomplete. Read this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ make sure that Openstack rootwrap configured to work in daemon mode Thank you Saverio 2018-05-22 15:29 GMT+02:00 Saverio Proto : > Hello Radu, > > do you have the Openstack rootwrap configured to work in daemon mode ? > > please read this article: > > 2018-05-18 10:21 GMT+02:00 Radu Popescu | eMAG, Technology > : >> Hi, >> >> so, nova says the VM is ACTIVE and actually boots with no network. We are >> setting some metadata that we use later on and have cloud-init for different >> tasks. >> So, VM is up, OS is running, but network is working after a random amount of >> time, that can get to around 45 minutes. Thing is, is not happening to all >> VMs in that test (around 300), but it's happening to a fair amount - around >> 25%. >> >> I can see the callback coming few seconds after neutron openvswitch agent >> says it's completed the setup. My question is, why is it taking so long for >> nova openvswitch agent to configure the port? I can see the port up in both >> host OS and openvswitch. I would assume it's doing the whole namespace and >> iptables setup. But still, 30 minutes? Seems a lot! >> >> Thanks, >> Radu >> >> On Thu, 2018-05-17 at 11:50 -0400, George Mihaiescu wrote: >> >> We have other scheduled tests that perform end-to-end (assign floating IP, >> ssh, ping outside) and never had an issue. >> I think we turned it off because the callback code was initially buggy and >> nova would wait forever while things were in fact ok, but I'll change >> "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run >> another large test, just to confirm. >> >> We usually run these large tests after a version upgrade to test the APIs >> under load. >> >> >> >> On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann >> wrote: >> >> On 5/17/2018 9:46 AM, George Mihaiescu wrote: >> >> and large rally tests of 500 instances complete with no issues. >> >> >> Sure, except you can't ssh into the guests. >> >> The whole reason the vif plugging is fatal and timeout and callback code was >> because the upstream CI was unstable without it. The server would report as >> ACTIVE but the ports weren't wired up so ssh would fail. Having an ACTIVE >> guest that you can't actually do anything with is kind of pointless. 
>> >> _______________________________________________ >> >> OpenStack-operators mailing list >> >> OpenStack-operators at lists.openstack.org >> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> >> >> >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> From ignaziocassano at gmail.com Tue May 22 13:32:36 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 22 May 2018 15:32:36 +0200 Subject: [Openstack-operators] community vs founation membership Message-ID: Hi all, please, what's the difference between community and foundation membership ? Regards Ignazio -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue May 22 13:56:02 2018 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 22 May 2018 13:56:02 +0000 Subject: [Openstack-operators] community vs founation membership In-Reply-To: References: Message-ID: <20180522135602.rotcqcy6pnx2sork@yuggoth.org> On 2018-05-22 15:32:36 +0200 (+0200), Ignazio Cassano wrote: > please, what's the difference between community and foundation > membership ? The "community" setting is just a means of indicating that you have a profile/account for any of various purposes (scheduling, speaker submissions, et cetera) but are not officially an Individual Member of the OpenStack Foundation. A foundation membership is necessary for some official activities, particularly for participating in elections (board of directors, user committee, technical committee, project team lead) as either a candidate or voter. Joining the OpenStack Foundation as an Individual Member comes with no cost other than a minute or two of your time to provide contact information at https://www.openstack.org/join/ but does obligate you to at least vote in OpenStack Foundation Board of Directors elections once you are eligible to do so. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From Eric.Smith at ccur.com Tue May 22 13:57:47 2018 From: Eric.Smith at ccur.com (Smith, Eric) Date: Tue, 22 May 2018 13:57:47 +0000 Subject: [Openstack-operators] Multiple Ceph pools for Nova? In-Reply-To: <66b14191-cd3b-b455-3be9-80cf3629517e@gmail.com> References: <66b14191-cd3b-b455-3be9-80cf3629517e@gmail.com> Message-ID: Thanks everyone for the feedback - I have a pretty small environment (11 nodes) and I was able to find the compute / volume pool segregation within nova.conf / cinder.conf. I think I should be able to just export / import my existing RBDs from the spinning disk compute pool to the SSD compute pool and update my nova.conf. Then I'll add an extra backend in cinder.conf to point new volumes to the SSD volumes pool. Thanks for all the help again. Eric On 5/22/18, 12:53 AM, "Matt Riedemann" wrote: On 5/21/2018 11:51 AM, Smith, Eric wrote: > I have 2 Ceph pools, one backed by SSDs and one backed by spinning disks > (Separate roots within the CRUSH hierarchy). I’d like to run all > instances in a single project / tenant on SSDs and the rest on spinning > disks. How would I go about setting this up? As mentioned elsewhere, host aggregate would work for the compute hosts connected to each storage pool. 
Then you can have different flavors per aggregate and charge more for the SSD flavors or restrict the aggregates based on tenant [1]. Alternatively, if this is something you plan to eventually scale to a larger size, you could even separate the pools with separate cells and use resource provider aggregates in placement to mirror the host aggregates for tenant-per-cell filtering [2]. It sounds like this is very similar to what CERN does (cells per hardware characteristics and projects assigned to specific cells). So Belmiro could probably help give some guidance here too. Check out the talk he gave today at the summit [3]. [1] https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#aggregatemultitenancyisolation [2] https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#tenant-isolation-with-placement [3] https://www.openstack.org/videos/vancouver-2018/moving-from-cellsv1-to-cellsv2-at-cern -- Thanks, Matt _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From ignaziocassano at gmail.com Tue May 22 14:06:43 2018 From: ignaziocassano at gmail.com (Ignazio Cassano) Date: Tue, 22 May 2018 16:06:43 +0200 Subject: [Openstack-operators] community vs founation membership In-Reply-To: <20180522135602.rotcqcy6pnx2sork@yuggoth.org> References: <20180522135602.rotcqcy6pnx2sork@yuggoth.org> Message-ID: Hi Jeremy, thanks for your help. I am interested in openstack testing (no code contributing). Becoming community member give me any advantage ? At this time I am testing on ocata on centos 7. My environment is in HA with pacemaker (3 controllers) and 5 kvm nodes. Regards Ignazio 2018-05-22 15:56 GMT+02:00 Jeremy Stanley : > On 2018-05-22 15:32:36 +0200 (+0200), Ignazio Cassano wrote: > > please, what's the difference between community and foundation > > membership ? > > The "community" setting is just a means of indicating that you have > a profile/account for any of various purposes (scheduling, speaker > submissions, et cetera) but are not officially an Individual Member > of the OpenStack Foundation. A foundation membership is necessary > for some official activities, particularly for participating in > elections (board of directors, user committee, technical committee, > project team lead) as either a candidate or voter. Joining the > OpenStack Foundation as an Individual Member comes with no cost > other than a minute or two of your time to provide contact > information at https://www.openstack.org/join/ but does obligate you > to at least vote in OpenStack Foundation Board of Directors > elections once you are eligible to do so. > -- > Jeremy Stanley > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fungi at yuggoth.org Tue May 22 14:48:26 2018 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 22 May 2018 14:48:26 +0000 Subject: [Openstack-operators] community vs founation membership In-Reply-To: References: <20180522135602.rotcqcy6pnx2sork@yuggoth.org> Message-ID: <20180522144826.nhal6rbdxdoreaaa@yuggoth.org> On 2018-05-22 16:06:43 +0200 (+0200), Ignazio Cassano wrote: > I am interested in openstack testing (no code contributing). 
> Becoming community member give me any advantage ?
> At this time I am testing on ocata on centos 7.
> My environment is in HA with pacemaker (3 controllers) and 5 kvm nodes.

If you wish to participate in the periodic OpenStack User Survey to provide details on your test deployment and any related experiences running OpenStack, then I think you'd need to create an account for https://www.openstack.org/ (so at least "community" level) to be able to do so. Participation in the survey is not mandatory, but still much appreciated as it helps the OpenStack project contributors better determine where improvements are most needed. -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jon at csail.mit.edu Tue May 22 15:02:34 2018 From: jon at csail.mit.edu (Jonathan D. Proulx) Date: Tue, 22 May 2018 08:02:34 -0700 Subject: [Openstack-operators] community vs founation membership In-Reply-To: References: <20180522135602.rotcqcy6pnx2sork@yuggoth.org> Message-ID: <20180522150234.fuwrqg2dzvwz33hy@csail.mit.edu>

On Tue, May 22, 2018 at 04:06:43PM +0200, Ignazio Cassano wrote:
: Hi Jeremy, thanks for your help.
: I am interested in openstack testing (no code contributing).
: Becoming community member give me any advantage ?
: At this time I am testing on ocata on centos 7.
: My environment is in HA with pacemaker (3 controllers) and 5 kvm nodes.

Anyone can report bugs of course. If you became a foundation member you could also comment on proposed fixes during code review even if you're not contributing the code yourself. Code review is a valuable service to the community, and something most projects are usually looking for more of, if that is something you're comfortable with. Personally, it's about where my Python skills fall. I can read enough to see if a fix doesn't quite fix what I'm looking for or sometimes if it has adverse side effects for me.

Again, seeing these reviews is public, but writing reviews requires foundation membership in the same way being a code contributor does.

If I recall, the "community" level was meant mostly for people who could not sign the contributor agreement, typically because of employer policies. If that's a concern, then become a "community" member; if not, joining as a "foundation" member would be my advice.

Welcome,
-Jon

From fungi at yuggoth.org Tue May 22 15:12:28 2018 From: fungi at yuggoth.org (Jeremy Stanley) Date: Tue, 22 May 2018 15:12:28 +0000 Subject: [Openstack-operators] community vs founation membership In-Reply-To: <20180522150234.fuwrqg2dzvwz33hy@csail.mit.edu> References: <20180522135602.rotcqcy6pnx2sork@yuggoth.org> <20180522150234.fuwrqg2dzvwz33hy@csail.mit.edu> Message-ID: <20180522151228.qevzpobtpumldspq@yuggoth.org>

On 2018-05-22 08:02:34 -0700 (-0700), Jonathan D. Proulx wrote:
[...]
> Again, seeing these reviews is public, but writing reviews requires
> foundation membership in the same way being a code contributor does.
[...]

In fact, commenting on https://review.openstack.org/ only requires creating an account at https://login.ubuntu.com/ (the OpenID service we're presently using for that) and logging in with it. Further, we dropped the need to become a member of the OpenStack Foundation in order to submit patches for review (it was never a legal requirement, but only a quirk of how we were previously linking accounts together to simplify technical elections).
Contributing patches to most OpenStack projects does require agreeing to the OpenStack Individual Contributor License Agreement (ICLA) in Gerrit for now, but this is not the same thing as becoming an Individual Member of the OpenStack Foundation and doesn't even require any account on www.openstack.org for now, just login.ubuntu.com (this will likely change in the future when we eventually switch OpenID providers). -- Jeremy Stanley -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From jon at csail.mit.edu Tue May 22 15:26:50 2018 From: jon at csail.mit.edu (Jonathan D. Proulx) Date: Tue, 22 May 2018 08:26:50 -0700 Subject: [Openstack-operators] community vs founation membership In-Reply-To: <20180522151228.qevzpobtpumldspq@yuggoth.org> References: <20180522135602.rotcqcy6pnx2sork@yuggoth.org> <20180522150234.fuwrqg2dzvwz33hy@csail.mit.edu> <20180522151228.qevzpobtpumldspq@yuggoth.org> Message-ID: <20180522152650.pyui54fh4m5amzct@csail.mit.edu> On Tue, May 22, 2018 at 03:12:28PM +0000, Jeremy Stanley wrote: :On 2018-05-22 08:02:34 -0700 (-0700), Jonathan D. Proulx wrote: :[...] :> Again seeing these reviews is public but writing reviews requires :> foundation memebership in the same way being a code contributor does. :[...] : :In fact, commenting on https://review.openstack.org/ only requires :creating an account at https://login.ubuntu.com/ (the OpenID service :we're presently using for that) and logging in with it. Further, we :dropped the need to become a member of the OpenStack Foundation in :order to submit patches for review (it was never a legal :requirement, but only a quirk of how we were previously linking :accounts together to simplify technical elections). Contributing :patches to most OpenStack projects does require agreeing to the :OpenStack Individual Contributor License Agreement (ICLA) in Gerrit :for now, but this is not the same thing as becoming an Individual :Member of the OpenStack Foundation and doesn't even require any :account on www.openstack.org for now, just login.ubuntu.com (this :will likely change in the future when we eventually switch OpenID :providers). Excellent! Glad to stand corrected, lower barriers are better barriers :) -Jon -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 963 bytes Desc: not available URL: From sundar.nadathur at intel.com Tue May 22 22:06:55 2018 From: sundar.nadathur at intel.com (Nadathur, Sundar) Date: Tue, 22 May 2018 15:06:55 -0700 Subject: [Openstack-operators] Followup to Cyborg/FPGA discussion at OpenStack Summit Message-ID: Hello operators,    We had a good discussion at the OpenStack Summit at Vancouver [1] on Cyborg/FPGA for Cloud/NFV. Cyborg [2] is the OpenStack project for life cycle management of accelerators, including GPUs and FPGAs. Thanks to those of you who attended. The discussion during the session has been captured in this etherpad [3]. Please feel free to respond on the etherpad. If you are interested in a follow-up discussion, please indicate in the same etherpad. [1]https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21720/cyborgfpga-support-for-cloudnfv [2] https://wiki.openstack.org/wiki/Cyborg [3] https://etherpad.openstack.org/p/Cyborg-FPGA-Support-for-Cloud-NFV Thanks. Regards, Sundar -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From radu.popescu at emag.ro Wed May 23 10:08:18 2018 From: radu.popescu at emag.ro (Radu Popescu | eMAG, Technology) Date: Wed, 23 May 2018 10:08:18 +0000 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> Message-ID: <43cd8579c761a13dcc81e6ffc9a69089fb421cda.camel@emag.ro> Hi, actually, I didn't know about that option. I'll enable it right now. Testing is done every morning at about 4:00AM ..so I'll know tomorrow morning if it changed anything. Thanks, Radu On Tue, 2018-05-22 at 15:30 +0200, Saverio Proto wrote: Sorry email went out incomplete. Read this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ make sure that Openstack rootwrap configured to work in daemon mode Thank you Saverio 2018-05-22 15:29 GMT+02:00 Saverio Proto >: Hello Radu, do you have the Openstack rootwrap configured to work in daemon mode ? please read this article: 2018-05-18 10:21 GMT+02:00 Radu Popescu | eMAG, Technology >: Hi, so, nova says the VM is ACTIVE and actually boots with no network. We are setting some metadata that we use later on and have cloud-init for different tasks. So, VM is up, OS is running, but network is working after a random amount of time, that can get to around 45 minutes. Thing is, is not happening to all VMs in that test (around 300), but it's happening to a fair amount - around 25%. I can see the callback coming few seconds after neutron openvswitch agent says it's completed the setup. My question is, why is it taking so long for nova openvswitch agent to configure the port? I can see the port up in both host OS and openvswitch. I would assume it's doing the whole namespace and iptables setup. But still, 30 minutes? Seems a lot! Thanks, Radu On Thu, 2018-05-17 at 11:50 -0400, George Mihaiescu wrote: We have other scheduled tests that perform end-to-end (assign floating IP, ssh, ping outside) and never had an issue. I think we turned it off because the callback code was initially buggy and nova would wait forever while things were in fact ok, but I'll change "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run another large test, just to confirm. We usually run these large tests after a version upgrade to test the APIs under load. On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann > wrote: On 5/17/2018 9:46 AM, George Mihaiescu wrote: and large rally tests of 500 instances complete with no issues. Sure, except you can't ssh into the guests. The whole reason the vif plugging is fatal and timeout and callback code was because the upstream CI was unstable without it. The server would report as ACTIVE but the ports weren't wired up so ssh would fail. Having an ACTIVE guest that you can't actually do anything with is kind of pointless. _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon at csail.mit.edu Wed May 23 20:41:29 2018 From: jon at csail.mit.edu (Jonathan D. 
Proulx) Date: Wed, 23 May 2018 13:41:29 -0700 Subject: [Openstack-operators] OSA migrating existing deployment to OSA Message-ID: <20180523204129.nzexkf4vojqiohlb@csail.mit.edu>

Hi All,

Having attended the Fast Forward Upgrade session and Upgrade SIG this morning in Vancouver, I'm convinced my upgrade problem is really a config management problem.

I'm running an old, deprecated config system that worked great in 2012 but has aged poorly. I've been meaning to move to OSA for a very long time, but with a small, fragmented team and a generally working cloud, something else always grabs priority. This also seems a more common rut than I realized, based on what I heard this morning.

I very much need to make my move over the next 3-4 months. If other people are looking to make a similar migration to OSA (or have recently completed one) I'd love to work together on documenting (if not codifying) mapping an existing cloud to OSA config. Obviously brownfield deployments are messy with site-specific oddities all over the place, but if I'm going to suffer, hopefully I can spare others some of that suffering.

Anyone else crazy enough to get in this boat with me?

-Jon

From sagaray at nttdata.co.jp Wed May 23 21:34:09 2018 From: sagaray at nttdata.co.jp (sagaray at nttdata.co.jp) Date: Wed, 23 May 2018 21:34:09 +0000 Subject: [Openstack-operators] Need feedback for nova aborting cold migration function In-Reply-To: <9b1c9c3d-00dc-d073-96e7-4d6409521261@gmail.com> References: <1525919628734.2105@nttdata.co.jp> <5fea9373-021a-0a2e-ba91-d7fe62bd5ca9@gmail.com> <1526374144863.89140@nttdata.co.jp>, <9b1c9c3d-00dc-d073-96e7-4d6409521261@gmail.com> Message-ID: <3231666a74104fae802f064bfa8ce88f@MP-MSGSS-MBX017.msg.nttdata.co.jp>

Hi Matt,

> > We store the service logs which are created by VM on that storage.
>
> I don't mean to be glib, but have you considered maybe not doing that?

The load issue on storage is due to the way we deploy our business software on VMs. The best way would be to introduce new storage and separate the SAN, but we cannot change our deployment method due to its cost and other limitations. In the long term, our operations team will change the deployment method to a better one to resolve this problem.

On the other hand, we would like to build a tool to support VM migration that is unaware of which migration method (cold or live) is used. Feature-parity-wise, if live migration supports a cancel feature, then we think that cold migration must support it as well.

--------------------------------------------------
Yukinori Sagara
Platform Engineering Department, NTT DATA Corp.

________________________________________
From: Matt Riedemann
Sent: May 18, 2018 1:39
To: openstack-operators at lists.openstack.org
Subject: Re: [Openstack-operators] Need feedback for nova aborting cold migration function

On 5/15/2018 3:48 AM, sagaray at nttdata.co.jp wrote:
> We store the service logs which are created by VM on that storage.

I don't mean to be glib, but have you considered maybe not doing that?
-- Thanks, Matt _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From gael.therond at gmail.com Wed May 23 21:59:37 2018 From: gael.therond at gmail.com (Flint WALRUS) Date: Wed, 23 May 2018 23:59:37 +0200 Subject: [Openstack-operators] Need feedback for nova aborting cold migration function In-Reply-To: <3231666a74104fae802f064bfa8ce88f@MP-MSGSS-MBX017.msg.nttdata.co.jp> References: <1525919628734.2105@nttdata.co.jp> <5fea9373-021a-0a2e-ba91-d7fe62bd5ca9@gmail.com> <1526374144863.89140@nttdata.co.jp> <9b1c9c3d-00dc-d073-96e7-4d6409521261@gmail.com> <3231666a74104fae802f064bfa8ce88f@MP-MSGSS-MBX017.msg.nttdata.co.jp> Message-ID: We are using multiple storage backend / topology on our side ranging from ScaleIO to CEPH passing by local compute host storage (were we need cold storage) and VNX, I have to said that CEPH is our best bet. Since we use it we clearly reduced our outages, allowed our user advanced features such as live-migration, boot from volumes and on top of that a better and more reliable performance. Yet we still need to get live and cold migration the same features set as our users/customers are really expecting us to provide a seamless experience between options. I can’t really speak out about real numbers but I’m within the video game industry if that help to drive support and traction/interest. Thanks for the survey btw. Kind regards, Gaël. Le mer. 23 mai 2018 à 23:36, a écrit : > Hi Matt, > > > > We store the service logs which are created by VM on that storage. > > > > I don't mean to be glib, but have you considered maybe not doing that? > > The load issue on storage is due to the way we deploy our business > softwares on VM. > The best way is introducing a new storage and separate the SAN, but we > cannot change our deployment method due to it's cost and other limitations. > On a long-term, our operation team will change the deployment method to > better one to resolve this problem. > > On the other hand, we would like to build a tool to support VM migration > that is unaware of which migration method is used for VM migration (Cold or > Live). Feature parity wise, if live migration supports cancel feature, then > we think that cold migration must support it as well. > > -------------------------------------------------- > Yukinori Sagara > Platform Engineering Department, NTT DATA Corp. > > ________________________________________ > 差出人: Matt Riedemann > 送信日時: 2018年5月18日 1:39 > 宛先: openstack-operators at lists.openstack.org > 件名: Re: [Openstack-operators] Need feedback for nova aborting cold > migration function > > On 5/15/2018 3:48 AM, sagaray at nttdata.co.jp wrote: > > We store the service logs which are created by VM on that storage. > > I don't mean to be glib, but have you considered maybe not doing that? > > -- > > Thanks, > > Matt > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From openstack at medberry.net Wed May 23 22:09:40 2018 From: openstack at medberry.net (David Medberry) Date: Wed, 23 May 2018 15:09:40 -0700 Subject: [Openstack-operators] Fwd: Follow Up: Private Enterprise Cloud Issues In-Reply-To: References: Message-ID: There was a great turnout at the Private Enterprise Cloud Issues session here in Vancouver. I'll propose a follow-on discussion for Denver PTG as well as trying to sift the data a bit and pre-populate. Look for that sifted data soon. For folks unable to participate locally, the etherpad is here: https://etherpad.openstack.org/p/YVR-private-enterprise-cloud-issues (and I've cached a copy offline in case it gets reset/etc.) -- -dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From ekcs.openstack at gmail.com Wed May 23 23:39:16 2018 From: ekcs.openstack at gmail.com (Eric K) Date: Wed, 23 May 2018 16:39:16 -0700 Subject: [Openstack-operators] [self-healing] BoF in Vancouver tomorrow Message-ID: For everyone interested in self-healing infra, come share your experience and your ideas with like-minded stackers, including folks from 10+ projects working together to make OpenStack self-healing a reality! Thursday, May 24, 1:50pm-2:30pm Vancouver Convention Centre West - Level Two - Room 217 https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21830/self-healing-sig-bof Brainstorming etherpad: https://etherpad.openstack.org/p/YVR-self-healing-brainstorming From mihalis68 at gmail.com Thu May 24 01:38:32 2018 From: mihalis68 at gmail.com (Chris Morgan) Date: Wed, 23 May 2018 18:38:32 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point Message-ID: Hello Everyone, In the Ops Community documentation working session today in Vancouver, we made some really good progress (etherpad here: https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all of the good stuff is yet written down). In short, we're going to course correct on maintaining the Operators Guide, the HA Guide and Architecture Guide, not edit-in-place via the wiki and instead try still maintaining them as code, but with a different, new set of owners, possibly in a new Ops-focused repo. There was a strong consensus that a) code workflow >> wiki workflow and that b) openstack core docs tools are just fine. There is a lot still to be decided on how where and when, but we do have an offer of a rewrite of the HA Guide, as long as the changes will be allowed to actually land, so we expect to actually start showing some progress. At the end of the session, people wanted to know how to follow along as various people work out how to do this... and so for now that place is this very email thread. The idea is if the code for those documents goes to live in a different repo, or if new contributors turn up, or if a new version we will announce/discuss it here until such time as we have a better home for this initiative. Cheers Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon at csail.mit.edu Thu May 24 01:46:26 2018 From: jon at csail.mit.edu (Jonathan D. Proulx) Date: Wed, 23 May 2018 18:46:26 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: References: Message-ID: <20180524014626.2e3n7kmjxdjb7rjv@csail.mit.edu> Thanks for kicking this off Chris. Were you going to create that new repository? If not I can take on the tasks of learning how and making it happen. 
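For anyone following along who hasn't touched the docs tooling before, "maintaining them as code" with the core docs tools boils down to reStructuredText built by Sphinx through tox, reviewed in Gerrit like any other change. A minimal sketch of what a new ops docs repo would need; the file names follow the usual OpenStack doc/ layout, but nothing here is a decided structure:

  # tox.ini would carry a docs environment roughly like:
  #   [testenv:docs]
  #   deps = -r{toxinidir}/doc/requirements.txt
  #   commands = sphinx-build -b html doc/source doc/build/html
  # with doc/requirements.txt listing sphinx (and most likely openstackdocstheme)
  # and doc/source/index.rst as the entry point. Locally the build is just:
  tox -e docs

The gate job then runs the same build on every proposed change, which is what makes the code-style workflow stick.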
-Jon On Wed, May 23, 2018 at 06:38:32PM -0700, Chris Morgan wrote: : Hello Everyone, : In the Ops Community documentation working session today in Vancouver, : we made some really good progress (etherpad : here: [1]https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but : not all of the good stuff is yet written down). : In short, we're going to course correct on maintaining the Operators : Guide, the HA Guide and Architecture Guide, not edit-in-place via the : wiki and instead try still maintaining them as code, but with a : different, new set of owners, possibly in a new Ops-focused repo. There : was a strong consensus that a) code workflow >> wiki workflow and that : b) openstack core docs tools are just fine. : There is a lot still to be decided on how where and when, but we do : have an offer of a rewrite of the HA Guide, as long as the changes will : be allowed to actually land, so we expect to actually start showing : some progress. : At the end of the session, people wanted to know how to follow along as : various people work out how to do this... and so for now that place is : this very email thread. The idea is if the code for those documents : goes to live in a different repo, or if new contributors turn up, or if : a new version we will announce/discuss it here until such time as we : have a better home for this initiative. : Cheers : Chris : -- : Chris Morgan <[2]mihalis68 at gmail.com> : :References : : 1. https://etherpad.openstack.org/p/YVR-Ops-Community-Docs : 2. mailto:mihalis68 at gmail.com :_______________________________________________ :OpenStack-operators mailing list :OpenStack-operators at lists.openstack.org :http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From mihalis68 at gmail.com Thu May 24 03:09:05 2018 From: mihalis68 at gmail.com (Chris Morgan) Date: Wed, 23 May 2018 20:09:05 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <20180524014626.2e3n7kmjxdjb7rjv@csail.mit.edu> References: <20180524014626.2e3n7kmjxdjb7rjv@csail.mit.edu> Message-ID: <9187FEF2-6403-42F7-87AC-E160E4529688@gmail.com> I hadn’t got that far in my thoughts. If you’re able to give that a go, then that would be great! Chris Sent from my iPhone > On May 23, 2018, at 6:46 PM, Jonathan D. Proulx wrote: > > > Thanks for kicking this off Chris. > > Were you going to create that new repository? If not I can take on > the tasks of learning how and making it happen. > > -Jon > > On Wed, May 23, 2018 at 06:38:32PM -0700, Chris Morgan wrote: > : Hello Everyone, > : In the Ops Community documentation working session today in Vancouver, > : we made some really good progress (etherpad > : here: [1]https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but > : not all of the good stuff is yet written down). > : In short, we're going to course correct on maintaining the Operators > : Guide, the HA Guide and Architecture Guide, not edit-in-place via the > : wiki and instead try still maintaining them as code, but with a > : different, new set of owners, possibly in a new Ops-focused repo. There > : was a strong consensus that a) code workflow >> wiki workflow and that > : b) openstack core docs tools are just fine. > : There is a lot still to be decided on how where and when, but we do > : have an offer of a rewrite of the HA Guide, as long as the changes will > : be allowed to actually land, so we expect to actually start showing > : some progress. 
> : At the end of the session, people wanted to know how to follow along as > : various people work out how to do this... and so for now that place is > : this very email thread. The idea is if the code for those documents > : goes to live in a different repo, or if new contributors turn up, or if > : a new version we will announce/discuss it here until such time as we > : have a better home for this initiative. > : Cheers > : Chris > : -- > : Chris Morgan <[2]mihalis68 at gmail.com> > : > :References > : > : 1. https://etherpad.openstack.org/p/YVR-Ops-Community-Docs > : 2. mailto:mihalis68 at gmail.com > > :_______________________________________________ > :OpenStack-operators mailing list > :OpenStack-operators at lists.openstack.org > :http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > From doug at doughellmann.com Thu May 24 04:23:53 2018 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 23 May 2018 21:23:53 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <9187FEF2-6403-42F7-87AC-E160E4529688@gmail.com> References: <20180524014626.2e3n7kmjxdjb7rjv@csail.mit.edu> <9187FEF2-6403-42F7-87AC-E160E4529688@gmail.com> Message-ID: <1527135398-sup-4094@lrrr.local> You will want to follow the steps of the "project creators' guide" [1]. Not all of them apply, because this is a docs repo and not a code project repo. Let me know if you have questions about which pieces do or do not apply as you go along, and we can work on improving that document as well. The openstack/tripleo-docs repo looks like it has a setup similar to the one you'll be creating, so when you get to the steps about setting up jobs you can probably copy what they have. After the session today it occurred to me that there is one governance-related thing that we would need to do in order to publish this content to docs.openstack.org. Right now we have a policy that only official teams can do that. I think if the guide is owned by a SIG or other group chartered either by the TC or UC we can make that work. We can do quite a lot of the setup work while we figure that out, though, so don't lose momentum in the mean time. Doug [1] https://docs.openstack.org/infra/manual/creators.html Excerpts from Chris Morgan's message of 2018-05-23 20:09:05 -0700: > I hadn’t got that far in my thoughts. If you’re able to give that a go, then that would be great! > > Chris > > Sent from my iPhone > > > On May 23, 2018, at 6:46 PM, Jonathan D. Proulx wrote: > > > > > > Thanks for kicking this off Chris. > > > > Were you going to create that new repository? If not I can take on > > the tasks of learning how and making it happen. > > > > -Jon > > > > On Wed, May 23, 2018 at 06:38:32PM -0700, Chris Morgan wrote: > > : Hello Everyone, > > : In the Ops Community documentation working session today in Vancouver, > > : we made some really good progress (etherpad > > : here: [1]https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but > > : not all of the good stuff is yet written down). > > : In short, we're going to course correct on maintaining the Operators > > : Guide, the HA Guide and Architecture Guide, not edit-in-place via the > > : wiki and instead try still maintaining them as code, but with a > > : different, new set of owners, possibly in a new Ops-focused repo. There > > : was a strong consensus that a) code workflow >> wiki workflow and that > > : b) openstack core docs tools are just fine. 
> > : There is a lot still to be decided on how where and when, but we do > > : have an offer of a rewrite of the HA Guide, as long as the changes will > > : be allowed to actually land, so we expect to actually start showing > > : some progress. > > : At the end of the session, people wanted to know how to follow along as > > : various people work out how to do this... and so for now that place is > > : this very email thread. The idea is if the code for those documents > > : goes to live in a different repo, or if new contributors turn up, or if > > : a new version we will announce/discuss it here until such time as we > > : have a better home for this initiative. > > : Cheers > > : Chris > > : -- > > : Chris Morgan <[2]mihalis68 at gmail.com> > > : > > :References > > : > > : 1. https://etherpad.openstack.org/p/YVR-Ops-Community-Docs > > : 2. mailto:mihalis68 at gmail.com > > > > :_______________________________________________ > > :OpenStack-operators mailing list > > :OpenStack-operators at lists.openstack.org > > :http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > > From eumel at arcor.de Thu May 24 04:56:47 2018 From: eumel at arcor.de (Frank Kloeker) Date: Thu, 24 May 2018 06:56:47 +0200 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: References: Message-ID: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> Hi Chris, thanks for summarizing our session today in Vancouver. As I18n PTL and one of the Docs Core reviewers I put Petr in Cc. He is currently Docs PTL, but unfortunately not on-site. I also couldn't get the full history of the story, and the idea is not to start finger pointing. As usual we are moving forward, and there are some interesting things to know about what happened. First of all: there is no "Docs-Team" anymore. If you look at [1], there are mostly part-time contributors like me, or people who are more involved in other projects and therefore busy. Because of that, the responsibility for documentation content has moved completely to the project teams. Each repo has a user guide, admin guide, deployment guide, and so on. The small Documentation Team only provides tooling and gives advice on how to write and publish a document. So it's up to you to re-use the old repo on [2] or set up a new one. I would recommend using the best of both worlds. There is a very good toolset in place for testing and publishing documents. There are also various text editors with rst support available, such as vim, notepad++ or online services. I understand the concerns, and why people are sad when their patches are ignored for months. But it's always a question of responsibility and of how people can spend their time. I would be available for help. As I18n PTL I could imagine an OpenStack Operations Guide being available in different languages and portable to different formats via Sphinx. For us as the translation team it's a good opportunity to get feedback about quality and to understand the requirements, for other documents as well. So let's move on. kind regards Frank [1] https://review.openstack.org/#/admin/groups/30,members [2] https://github.com/openstack/operations-guide On 2018-05-24 03:38, Chris Morgan wrote: > Hello Everyone, > > In the Ops Community documentation working session today in Vancouver, > we made some really good progress (etherpad here: > https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all of > the good stuff is yet written down). 
> > In short, we're going to course correct on maintaining the Operators > Guide, the HA Guide and Architecture Guide, not edit-in-place via the > wiki and instead try still maintaining them as code, but with a > different, new set of owners, possibly in a new Ops-focused repo. > There was a strong consensus that a) code workflow >> wiki workflow > and that b) openstack core docs tools are just fine. > > There is a lot still to be decided on how where and when, but we do > have an offer of a rewrite of the HA Guide, as long as the changes > will be allowed to actually land, so we expect to actually start > showing some progress. > > At the end of the session, people wanted to know how to follow along > as various people work out how to do this... and so for now that place > is this very email thread. The idea is if the code for those documents > goes to live in a different repo, or if new contributors turn up, or > if a new version we will announce/discuss it here until such time as > we have a better home for this initiative. > > Cheers > > Chris > > -- > Chris Morgan > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From mrhillsman at gmail.com Thu May 24 05:26:02 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Wed, 23 May 2018 22:26:02 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> Message-ID: Great to see this moving. I have some questions/concerns based on your statement Doug about docs.openstack.org publishing and do not want to detour the conversation but ask for feedback. Currently there are a number of repositories under osops- https://github.com/openstack-infra/project-config/blob/master/gerrit/projects.yaml#L5673-L5703 Generally active: osops-tools-contrib osops-tools-generic osops-tools-monitoring Probably dead: osops-tools-logging osops-coda osops-example-configs Because you are more familiar with how things work, is there a way to consolidate these vs coming up with another repo like osops-docs or whatever in this case? And second, is there already governance clearance to publish based on the following - https://launchpad.net/osops - which is where these repos originated. On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker wrote: > Hi Chris, > > thanks for summarize our session today in Vancouver. As I18n PTL and one > of the Docs Core I put Petr in Cc. He is currently Docs PTL, but > unfortunatelly not on-site. > I couldn't also not get the full history of the story and that's also not > the idea to starting finger pointing. As usualy we moving forward and there > are some interesting things to know what happened. > First of all: There are no "Docs-Team" anymore. If you look at [1] there > are mostly part-time contributors like me or people are more involved in > other projects and therefore busy. Because of that, the responsibility of > documentation content are moved completely to the project teams. Each repo > has a user guide, admin guide, deployment guide, and so on. The small > Documentation Team provides only tooling and give advices how to write and > publish a document. So it's up to you to re-use the old repo on [2] or > setup a new one. I would recommend to use the best of both worlds. 
There > are a very good toolset in place for testing and publishing documents. > There are also various text editors for rst extensions available, like in > vim, notepad++ or also online services. I understand the concerns and when > people are sad because their patches are ignored for months. But it's > alltime a question of responsibilty and how can spend people time. > I would be available for help. As I18n PTL I could imagine that a > OpenStack Operations Guide is available in different languages and portable > in different formats like in Sphinx. For us as translation team it's a good > possibility to get feedback about the quality and to understand the > requirements, also for other documents. > So let's move on. > > kind regards > > Frank > > [1] https://review.openstack.org/#/admin/groups/30,members > [2] https://github.com/openstack/operations-guide > > > Am 2018-05-24 03:38, schrieb Chris Morgan: > >> Hello Everyone, >> >> In the Ops Community documentation working session today in Vancouver, >> we made some really good progress (etherpad here: >> https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all of >> the good stuff is yet written down). >> >> In short, we're going to course correct on maintaining the Operators >> Guide, the HA Guide and Architecture Guide, not edit-in-place via the >> wiki and instead try still maintaining them as code, but with a >> different, new set of owners, possibly in a new Ops-focused repo. >> There was a strong consensus that a) code workflow >> wiki workflow >> and that b) openstack core docs tools are just fine. >> >> There is a lot still to be decided on how where and when, but we do >> have an offer of a rewrite of the HA Guide, as long as the changes >> will be allowed to actually land, so we expect to actually start >> showing some progress. >> >> At the end of the session, people wanted to know how to follow along >> as various people work out how to do this... and so for now that place >> is this very email thread. The idea is if the code for those documents >> goes to live in a different repo, or if new contributors turn up, or >> if a new version we will announce/discuss it here until such time as >> we have a better home for this initiative. >> >> Cheers >> >> Chris >> >> -- >> Chris Morgan >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrhillsman at gmail.com Thu May 24 05:28:26 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Wed, 23 May 2018 22:28:26 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> Message-ID: Also, apologies, if consolidation or reorganizing all these is reasonable, what do you think that would look like; i.e. osops > tools >> contrib >> generic >> monitoring >> logging > docs > example-configs On Wed, May 23, 2018 at 10:26 PM, Melvin Hillsman wrote: > Great to see this moving. 
I have some questions/concerns based on your > statement Doug about docs.openstack.org publishing and do not want to > detour the conversation but ask for feedback. Currently there are a number > of repositories under osops- > > https://github.com/openstack-infra/project-config/blob/ > master/gerrit/projects.yaml#L5673-L5703 > > Generally active: > osops-tools-contrib > osops-tools-generic > osops-tools-monitoring > > > Probably dead: > osops-tools-logging > osops-coda > osops-example-configs > > Because you are more familiar with how things work, is there a way to > consolidate these vs coming up with another repo like osops-docs or > whatever in this case? And second, is there already governance clearance to > publish based on the following - https://launchpad.net/osops - which is > where these repos originated. > > > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker wrote: > >> Hi Chris, >> >> thanks for summarize our session today in Vancouver. As I18n PTL and one >> of the Docs Core I put Petr in Cc. He is currently Docs PTL, but >> unfortunatelly not on-site. >> I couldn't also not get the full history of the story and that's also not >> the idea to starting finger pointing. As usualy we moving forward and there >> are some interesting things to know what happened. >> First of all: There are no "Docs-Team" anymore. If you look at [1] there >> are mostly part-time contributors like me or people are more involved in >> other projects and therefore busy. Because of that, the responsibility of >> documentation content are moved completely to the project teams. Each repo >> has a user guide, admin guide, deployment guide, and so on. The small >> Documentation Team provides only tooling and give advices how to write and >> publish a document. So it's up to you to re-use the old repo on [2] or >> setup a new one. I would recommend to use the best of both worlds. There >> are a very good toolset in place for testing and publishing documents. >> There are also various text editors for rst extensions available, like in >> vim, notepad++ or also online services. I understand the concerns and when >> people are sad because their patches are ignored for months. But it's >> alltime a question of responsibilty and how can spend people time. >> I would be available for help. As I18n PTL I could imagine that a >> OpenStack Operations Guide is available in different languages and portable >> in different formats like in Sphinx. For us as translation team it's a good >> possibility to get feedback about the quality and to understand the >> requirements, also for other documents. >> So let's move on. >> >> kind regards >> >> Frank >> >> [1] https://review.openstack.org/#/admin/groups/30,members >> [2] https://github.com/openstack/operations-guide >> >> >> Am 2018-05-24 03:38, schrieb Chris Morgan: >> >>> Hello Everyone, >>> >>> In the Ops Community documentation working session today in Vancouver, >>> we made some really good progress (etherpad here: >>> https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all of >>> the good stuff is yet written down). >>> >>> In short, we're going to course correct on maintaining the Operators >>> Guide, the HA Guide and Architecture Guide, not edit-in-place via the >>> wiki and instead try still maintaining them as code, but with a >>> different, new set of owners, possibly in a new Ops-focused repo. >>> There was a strong consensus that a) code workflow >> wiki workflow >>> and that b) openstack core docs tools are just fine. 
>>> >>> There is a lot still to be decided on how where and when, but we do >>> have an offer of a rewrite of the HA Guide, as long as the changes >>> will be allowed to actually land, so we expect to actually start >>> showing some progress. >>> >>> At the end of the session, people wanted to know how to follow along >>> as various people work out how to do this... and so for now that place >>> is this very email thread. The idea is if the code for those documents >>> goes to live in a different repo, or if new contributors turn up, or >>> if a new version we will announce/discuss it here until such time as >>> we have a better home for this initiative. >>> >>> Cheers >>> >>> Chris >>> >>> -- >>> Chris Morgan >>> _______________________________________________ >>> OpenStack-operators mailing list >>> OpenStack-operators at lists.openstack.org >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >>> >> >> >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> > > > > -- > Kind regards, > > Melvin Hillsman > mrhillsman at gmail.com > mobile: (832) 264-2646 > -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Thu May 24 05:58:40 2018 From: doug at doughellmann.com (Doug Hellmann) Date: Wed, 23 May 2018 22:58:40 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> Message-ID: <1527141275-sup-1922@lrrr.local> Excerpts from Melvin Hillsman's message of 2018-05-23 22:26:02 -0700: > Great to see this moving. I have some questions/concerns based on your > statement Doug about docs.openstack.org publishing and do not want to > detour the conversation but ask for feedback. Currently there are a number I'm just unclear on that, but don't consider it a blocker. We will sort out whatever governance or policy change is needed to let this move forward. > of repositories under osops- > > https://github.com/openstack-infra/project-config/blob/master/gerrit/projects.yaml#L5673-L5703 > > Generally active: > osops-tools-contrib > osops-tools-generic > osops-tools-monitoring > > > Probably dead: > osops-tools-logging > osops-coda > osops-example-configs > > Because you are more familiar with how things work, is there a way to > consolidate these vs coming up with another repo like osops-docs or > whatever in this case? And second, is there already governance clearance to > publish based on the following - https://launchpad.net/osops - which is > where these repos originated. I don't really know what any of those things are, or whether it makes sense to put this new content there. I assumed we would make a repo with a name like "operations-guide", but that's up to Chris and John. If they think reusing an existing repository makes sense, that would be OK with me, but it's cheap and easy to set up a new one, too. My main concern is that we remove the road blocks, now that we have people interested in contributing to this documentation. > > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker wrote: > > > Hi Chris, > > > > thanks for summarize our session today in Vancouver. As I18n PTL and one > > of the Docs Core I put Petr in Cc. He is currently Docs PTL, but > > unfortunatelly not on-site. 
> > I couldn't also not get the full history of the story and that's also not > > the idea to starting finger pointing. As usualy we moving forward and there > > are some interesting things to know what happened. > > First of all: There are no "Docs-Team" anymore. If you look at [1] there > > are mostly part-time contributors like me or people are more involved in > > other projects and therefore busy. Because of that, the responsibility of > > documentation content are moved completely to the project teams. Each repo > > has a user guide, admin guide, deployment guide, and so on. The small > > Documentation Team provides only tooling and give advices how to write and > > publish a document. So it's up to you to re-use the old repo on [2] or > > setup a new one. I would recommend to use the best of both worlds. There > > are a very good toolset in place for testing and publishing documents. > > There are also various text editors for rst extensions available, like in > > vim, notepad++ or also online services. I understand the concerns and when > > people are sad because their patches are ignored for months. But it's > > alltime a question of responsibilty and how can spend people time. > > I would be available for help. As I18n PTL I could imagine that a > > OpenStack Operations Guide is available in different languages and portable > > in different formats like in Sphinx. For us as translation team it's a good > > possibility to get feedback about the quality and to understand the > > requirements, also for other documents. > > So let's move on. > > > > kind regards > > > > Frank > > > > [1] https://review.openstack.org/#/admin/groups/30,members > > [2] https://github.com/openstack/operations-guide > > > > > > Am 2018-05-24 03:38, schrieb Chris Morgan: > > > >> Hello Everyone, > >> > >> In the Ops Community documentation working session today in Vancouver, > >> we made some really good progress (etherpad here: > >> https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all of > >> the good stuff is yet written down). > >> > >> In short, we're going to course correct on maintaining the Operators > >> Guide, the HA Guide and Architecture Guide, not edit-in-place via the > >> wiki and instead try still maintaining them as code, but with a > >> different, new set of owners, possibly in a new Ops-focused repo. > >> There was a strong consensus that a) code workflow >> wiki workflow > >> and that b) openstack core docs tools are just fine. > >> > >> There is a lot still to be decided on how where and when, but we do > >> have an offer of a rewrite of the HA Guide, as long as the changes > >> will be allowed to actually land, so we expect to actually start > >> showing some progress. > >> > >> At the end of the session, people wanted to know how to follow along > >> as various people work out how to do this... and so for now that place > >> is this very email thread. The idea is if the code for those documents > >> goes to live in a different repo, or if new contributors turn up, or > >> if a new version we will announce/discuss it here until such time as > >> we have a better home for this initiative. 
> >> > >> Cheers > >> > >> Chris > >> > >> -- > >> Chris Morgan > >> _______________________________________________ > >> OpenStack-operators mailing list > >> OpenStack-operators at lists.openstack.org > >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > >> > > > > > > _______________________________________________ > > OpenStack-operators mailing list > > OpenStack-operators at lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > > From mrhillsman at gmail.com Thu May 24 06:31:03 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Wed, 23 May 2018 23:31:03 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <1527141275-sup-1922@lrrr.local> References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> <1527141275-sup-1922@lrrr.local> Message-ID: Sure definitely, that's why I said I was not trying to detour the conversation, but rather asking for feedback. Definitely agree things should continue to plow forward and Chris has been doing an excellent job here and I think it is awesome that he is continuing to push this. On Wed, May 23, 2018 at 10:58 PM, Doug Hellmann wrote: > Excerpts from Melvin Hillsman's message of 2018-05-23 22:26:02 -0700: > > Great to see this moving. I have some questions/concerns based on your > > statement Doug about docs.openstack.org publishing and do not want to > > detour the conversation but ask for feedback. Currently there are a > number > > I'm just unclear on that, but don't consider it a blocker. We will sort > out whatever governance or policy change is needed to let this move > forward. > > > of repositories under osops- > > > > https://github.com/openstack-infra/project-config/blob/ > master/gerrit/projects.yaml#L5673-L5703 > > > > Generally active: > > osops-tools-contrib > > osops-tools-generic > > osops-tools-monitoring > > > > > > Probably dead: > > osops-tools-logging > > osops-coda > > osops-example-configs > > > > Because you are more familiar with how things work, is there a way to > > consolidate these vs coming up with another repo like osops-docs or > > whatever in this case? And second, is there already governance clearance > to > > publish based on the following - https://launchpad.net/osops - which is > > where these repos originated. > > I don't really know what any of those things are, or whether it > makes sense to put this new content there. I assumed we would make > a repo with a name like "operations-guide", but that's up to Chris > and John. If they think reusing an existing repository makes sense, > that would be OK with me, but it's cheap and easy to set up a new > one, too. > > My main concern is that we remove the road blocks, now that we have > people interested in contributing to this documentation. > > > > > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker wrote: > > > > > Hi Chris, > > > > > > thanks for summarize our session today in Vancouver. As I18n PTL and > one > > > of the Docs Core I put Petr in Cc. He is currently Docs PTL, but > > > unfortunatelly not on-site. > > > I couldn't also not get the full history of the story and that's also > not > > > the idea to starting finger pointing. As usualy we moving forward and > there > > > are some interesting things to know what happened. > > > First of all: There are no "Docs-Team" anymore. If you look at [1] > there > > > are mostly part-time contributors like me or people are more involved > in > > > other projects and therefore busy. 
Because of that, the responsibility > of > > > documentation content are moved completely to the project teams. Each > repo > > > has a user guide, admin guide, deployment guide, and so on. The small > > > Documentation Team provides only tooling and give advices how to write > and > > > publish a document. So it's up to you to re-use the old repo on [2] or > > > setup a new one. I would recommend to use the best of both worlds. > There > > > are a very good toolset in place for testing and publishing documents. > > > There are also various text editors for rst extensions available, like > in > > > vim, notepad++ or also online services. I understand the concerns and > when > > > people are sad because their patches are ignored for months. But it's > > > alltime a question of responsibilty and how can spend people time. > > > I would be available for help. As I18n PTL I could imagine that a > > > OpenStack Operations Guide is available in different languages and > portable > > > in different formats like in Sphinx. For us as translation team it's a > good > > > possibility to get feedback about the quality and to understand the > > > requirements, also for other documents. > > > So let's move on. > > > > > > kind regards > > > > > > Frank > > > > > > [1] https://review.openstack.org/#/admin/groups/30,members > > > [2] https://github.com/openstack/operations-guide > > > > > > > > > Am 2018-05-24 03:38, schrieb Chris Morgan: > > > > > >> Hello Everyone, > > >> > > >> In the Ops Community documentation working session today in Vancouver, > > >> we made some really good progress (etherpad here: > > >> https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all > of > > >> the good stuff is yet written down). > > >> > > >> In short, we're going to course correct on maintaining the Operators > > >> Guide, the HA Guide and Architecture Guide, not edit-in-place via the > > >> wiki and instead try still maintaining them as code, but with a > > >> different, new set of owners, possibly in a new Ops-focused repo. > > >> There was a strong consensus that a) code workflow >> wiki workflow > > >> and that b) openstack core docs tools are just fine. > > >> > > >> There is a lot still to be decided on how where and when, but we do > > >> have an offer of a rewrite of the HA Guide, as long as the changes > > >> will be allowed to actually land, so we expect to actually start > > >> showing some progress. > > >> > > >> At the end of the session, people wanted to know how to follow along > > >> as various people work out how to do this... and so for now that place > > >> is this very email thread. The idea is if the code for those documents > > >> goes to live in a different repo, or if new contributors turn up, or > > >> if a new version we will announce/discuss it here until such time as > > >> we have a better home for this initiative. 
> > >> > > >> Cheers > > >> > > >> Chris > > >> > > >> -- > > >> Chris Morgan > > >> _______________________________________________ > > >> OpenStack-operators mailing list > > >> OpenStack-operators at lists.openstack.org > > >> http://lists.openstack.org/cgi-bin/mailman/listinfo/ > openstack-operators > > >> > > > > > > > > > _______________________________________________ > > > OpenStack-operators mailing list > > > OpenStack-operators at lists.openstack.org > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/ > openstack-operators > > > > > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From radu.popescu at emag.ro Thu May 24 09:07:20 2018 From: radu.popescu at emag.ro (Radu Popescu | eMAG, Technology) Date: Thu, 24 May 2018 09:07:20 +0000 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: <43cd8579c761a13dcc81e6ffc9a69089fb421cda.camel@emag.ro> References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> <43cd8579c761a13dcc81e6ffc9a69089fb421cda.camel@emag.ro> Message-ID: <6ec89b3767f26b73451fa490d18ed7756266d3ec.camel@emag.ro> Hi, did the change yesterday. Had no issue this morning with neutron not being able to move fast enough. Still, we had some storage issues, but that's another thing. Anyway, I'll leave it like this for the next few days and report back in case I get the same slow neutron errors. Thanks a lot! Radu On Wed, 2018-05-23 at 10:08 +0000, Radu Popescu | eMAG, Technology wrote: Hi, actually, I didn't know about that option. I'll enable it right now. Testing is done every morning at about 4:00AM ..so I'll know tomorrow morning if it changed anything. Thanks, Radu On Tue, 2018-05-22 at 15:30 +0200, Saverio Proto wrote: Sorry email went out incomplete. Read this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ make sure that Openstack rootwrap configured to work in daemon mode Thank you Saverio 2018-05-22 15:29 GMT+02:00 Saverio Proto >: Hello Radu, do you have the Openstack rootwrap configured to work in daemon mode ? please read this article: 2018-05-18 10:21 GMT+02:00 Radu Popescu | eMAG, Technology >: Hi, so, nova says the VM is ACTIVE and actually boots with no network. We are setting some metadata that we use later on and have cloud-init for different tasks. So, VM is up, OS is running, but network is working after a random amount of time, that can get to around 45 minutes. Thing is, is not happening to all VMs in that test (around 300), but it's happening to a fair amount - around 25%. I can see the callback coming few seconds after neutron openvswitch agent says it's completed the setup. My question is, why is it taking so long for nova openvswitch agent to configure the port? I can see the port up in both host OS and openvswitch. I would assume it's doing the whole namespace and iptables setup. But still, 30 minutes? Seems a lot! Thanks, Radu On Thu, 2018-05-17 at 11:50 -0400, George Mihaiescu wrote: We have other scheduled tests that perform end-to-end (assign floating IP, ssh, ping outside) and never had an issue. 
I think we turned it off because the callback code was initially buggy and nova would wait forever while things were in fact ok, but I'll change "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run another large test, just to confirm. We usually run these large tests after a version upgrade to test the APIs under load. On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann > wrote: On 5/17/2018 9:46 AM, George Mihaiescu wrote: and large rally tests of 500 instances complete with no issues. Sure, except you can't ssh into the guests. The whole reason the vif plugging is fatal and timeout and callback code was because the upstream CI was unstable without it. The server would report as ACTIVE but the ports weren't wired up so ssh would fail. Having an ACTIVE guest that you can't actually do anything with is kind of pointless. _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators -------------- next part -------------- An HTML attachment was scrubbed... URL: From zioproto at gmail.com Thu May 24 09:51:10 2018 From: zioproto at gmail.com (Saverio Proto) Date: Thu, 24 May 2018 11:51:10 +0200 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: <6ec89b3767f26b73451fa490d18ed7756266d3ec.camel@emag.ro> References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> <43cd8579c761a13dcc81e6ffc9a69089fb421cda.camel@emag.ro> <6ec89b3767f26b73451fa490d18ed7756266d3ec.camel@emag.ro> Message-ID: Glad to hear it! Always monitor rabbitmq queues to identify bottlenecks !! :) Cheers Saverio Il gio 24 mag 2018, 11:07 Radu Popescu | eMAG, Technology < radu.popescu at emag.ro> ha scritto: > Hi, > > did the change yesterday. Had no issue this morning with neutron not being > able to move fast enough. Still, we had some storage issues, but that's > another thing. > Anyway, I'll leave it like this for the next few days and report back in > case I get the same slow neutron errors. > > Thanks a lot! > Radu > > On Wed, 2018-05-23 at 10:08 +0000, Radu Popescu | eMAG, Technology wrote: > > Hi, > > actually, I didn't know about that option. I'll enable it right now. > Testing is done every morning at about 4:00AM ..so I'll know tomorrow > morning if it changed anything. > > Thanks, > Radu > > On Tue, 2018-05-22 at 15:30 +0200, Saverio Proto wrote: > > Sorry email went out incomplete. > > Read this: > > https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ > > > make sure that Openstack rootwrap configured to work in daemon mode > > > Thank you > > > Saverio > > > > 2018-05-22 15:29 GMT+02:00 Saverio Proto : > > Hello Radu, > > > do you have the Openstack rootwrap configured to work in daemon mode ? > > > please read this article: > > > 2018-05-18 10:21 GMT+02:00 Radu Popescu | eMAG, Technology > > : > > Hi, > > > so, nova says the VM is ACTIVE and actually boots with no network. 
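To recap the two knobs discussed in this thread, plus Saverio's monitoring advice, in rough form (a sketch only; exact config file names and paths vary by distro and deployment tool):

  # 1) Run the neutron agents' rootwrap in daemon mode so port wiring does not
  #    pay the rootwrap start-up cost for every command, e.g. in
  #    openvswitch_agent.ini / l3_agent.ini / dhcp_agent.ini:
  #      [agent]
  #      root_helper_daemon = sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf
  # 2) Make nova-compute treat a missing vif-plugged callback as fatal instead
  #    of reporting ACTIVE with no networking, in nova.conf:
  #      [DEFAULT]
  #      vif_plugging_is_fatal = True
  #      vif_plugging_timeout = 300
  # 3) Watch RabbitMQ for queue build-up while a large test runs:
  rabbitmqctl list_queues name messages consumers | sort -n -k2 | tail -20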
We are > > setting some metadata that we use later on and have cloud-init for different > > tasks. > > So, VM is up, OS is running, but network is working after a random amount of > > time, that can get to around 45 minutes. Thing is, is not happening to all > > VMs in that test (around 300), but it's happening to a fair amount - around > > 25%. > > > I can see the callback coming few seconds after neutron openvswitch agent > > says it's completed the setup. My question is, why is it taking so long for > > nova openvswitch agent to configure the port? I can see the port up in both > > host OS and openvswitch. I would assume it's doing the whole namespace and > > iptables setup. But still, 30 minutes? Seems a lot! > > > Thanks, > > Radu > > > On Thu, 2018-05-17 at 11:50 -0400, George Mihaiescu wrote: > > > We have other scheduled tests that perform end-to-end (assign floating IP, > > ssh, ping outside) and never had an issue. > > I think we turned it off because the callback code was initially buggy and > > nova would wait forever while things were in fact ok, but I'll change > > "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run > > another large test, just to confirm. > > > We usually run these large tests after a version upgrade to test the APIs > > under load. > > > > > On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann > > wrote: > > > On 5/17/2018 9:46 AM, George Mihaiescu wrote: > > > and large rally tests of 500 instances complete with no issues. > > > > Sure, except you can't ssh into the guests. > > > The whole reason the vif plugging is fatal and timeout and callback code was > > because the upstream CI was unstable without it. The server would report as > > ACTIVE but the ports weren't wired up so ssh would fail. Having an ACTIVE > > guest that you can't actually do anything with is kind of pointless. > > > _______________________________________________ > > > OpenStack-operators mailing list > > > OpenStack-operators at lists.openstack.org > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > > > > _______________________________________________ > > OpenStack-operators mailing list > > OpenStack-operators at lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > > _______________________________________________ > > OpenStack-operators mailing list > > OpenStack-operators at lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From doug at doughellmann.com Thu May 24 14:07:10 2018 From: doug at doughellmann.com (Doug Hellmann) Date: Thu, 24 May 2018 07:07:10 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <1527141275-sup-1922@lrrr.local> References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> <1527141275-sup-1922@lrrr.local> Message-ID: <1527170041-sup-4254@lrrr.local> Excerpts from Doug Hellmann's message of 2018-05-23 22:58:40 -0700: > Excerpts from Melvin Hillsman's message of 2018-05-23 22:26:02 -0700: > > Great to see this moving. I have some questions/concerns based on your > > statement Doug about docs.openstack.org publishing and do not want to > > detour the conversation but ask for feedback. Currently there are a number > > I'm just unclear on that, but don't consider it a blocker. We will sort > out whatever governance or policy change is needed to let this move > forward. 
When I talked with Petr about it, he pointed to the Security SIG and Security Guide as a parallel precedent for this. IIRC, yesterday Adam mentioned that the Self-Healing SIG was also going to be managing some documentation, so we have two examples. Looking at https://governance.openstack.org/sigs/, I don't see another existing SIG that it would make sense to join, so, I think to deal with the publishing rights we would want set up a SIG for something like "Operator Documentation," which gives you some flexibility on exactly what content is managed. I know you wanted to avoid lots of governance overhead, so I want to just mention that establishing a SIG is meant to be a painless and light-weight way to declare that a group of interested people exists so that others can find them and participate in the work [1]. It shouldn't take much effort to do the setup, and any ongoing communication is something you would presumably by doing anyway among a group of people trying to collaborate on a project like this. Let me know if you have any questions or concerns about the process. Doug [1] https://governance.openstack.org/sigs/#process-to-create-a-sig > > > of repositories under osops- > > > > https://github.com/openstack-infra/project-config/blob/master/gerrit/projects.yaml#L5673-L5703 > > > > Generally active: > > osops-tools-contrib > > osops-tools-generic > > osops-tools-monitoring > > > > > > Probably dead: > > osops-tools-logging > > osops-coda > > osops-example-configs > > > > Because you are more familiar with how things work, is there a way to > > consolidate these vs coming up with another repo like osops-docs or > > whatever in this case? And second, is there already governance clearance to > > publish based on the following - https://launchpad.net/osops - which is > > where these repos originated. > > I don't really know what any of those things are, or whether it > makes sense to put this new content there. I assumed we would make > a repo with a name like "operations-guide", but that's up to Chris > and John. If they think reusing an existing repository makes sense, > that would be OK with me, but it's cheap and easy to set up a new > one, too. > > My main concern is that we remove the road blocks, now that we have > people interested in contributing to this documentation. > > > > > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker wrote: > > > > > Hi Chris, > > > > > > thanks for summarize our session today in Vancouver. As I18n PTL and one > > > of the Docs Core I put Petr in Cc. He is currently Docs PTL, but > > > unfortunatelly not on-site. > > > I couldn't also not get the full history of the story and that's also not > > > the idea to starting finger pointing. As usualy we moving forward and there > > > are some interesting things to know what happened. > > > First of all: There are no "Docs-Team" anymore. If you look at [1] there > > > are mostly part-time contributors like me or people are more involved in > > > other projects and therefore busy. Because of that, the responsibility of > > > documentation content are moved completely to the project teams. Each repo > > > has a user guide, admin guide, deployment guide, and so on. The small > > > Documentation Team provides only tooling and give advices how to write and > > > publish a document. So it's up to you to re-use the old repo on [2] or > > > setup a new one. I would recommend to use the best of both worlds. There > > > are a very good toolset in place for testing and publishing documents. 
> > > There are also various text editors for rst extensions available, like in > > > vim, notepad++ or also online services. I understand the concerns and when > > > people are sad because their patches are ignored for months. But it's > > > alltime a question of responsibilty and how can spend people time. > > > I would be available for help. As I18n PTL I could imagine that a > > > OpenStack Operations Guide is available in different languages and portable > > > in different formats like in Sphinx. For us as translation team it's a good > > > possibility to get feedback about the quality and to understand the > > > requirements, also for other documents. > > > So let's move on. > > > > > > kind regards > > > > > > Frank > > > > > > [1] https://review.openstack.org/#/admin/groups/30,members > > > [2] https://github.com/openstack/operations-guide > > > > > > > > > Am 2018-05-24 03:38, schrieb Chris Morgan: > > > > > >> Hello Everyone, > > >> > > >> In the Ops Community documentation working session today in Vancouver, > > >> we made some really good progress (etherpad here: > > >> https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all of > > >> the good stuff is yet written down). > > >> > > >> In short, we're going to course correct on maintaining the Operators > > >> Guide, the HA Guide and Architecture Guide, not edit-in-place via the > > >> wiki and instead try still maintaining them as code, but with a > > >> different, new set of owners, possibly in a new Ops-focused repo. > > >> There was a strong consensus that a) code workflow >> wiki workflow > > >> and that b) openstack core docs tools are just fine. > > >> > > >> There is a lot still to be decided on how where and when, but we do > > >> have an offer of a rewrite of the HA Guide, as long as the changes > > >> will be allowed to actually land, so we expect to actually start > > >> showing some progress. > > >> > > >> At the end of the session, people wanted to know how to follow along > > >> as various people work out how to do this... and so for now that place > > >> is this very email thread. The idea is if the code for those documents > > >> goes to live in a different repo, or if new contributors turn up, or > > >> if a new version we will announce/discuss it here until such time as > > >> we have a better home for this initiative. > > >> > > >> Cheers > > >> > > >> Chris > > >> > > >> -- > > >> Chris Morgan > > >> _______________________________________________ > > >> OpenStack-operators mailing list > > >> OpenStack-operators at lists.openstack.org > > >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > >> > > > > > > > > > _______________________________________________ > > > OpenStack-operators mailing list > > > OpenStack-operators at lists.openstack.org > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > > > > From jon at csail.mit.edu Thu May 24 14:19:29 2018 From: jon at csail.mit.edu (Jonathan D. Proulx) Date: Thu, 24 May 2018 07:19:29 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> Message-ID: <20180524141929.2vylwguebcgkjxa3@csail.mit.edu> My intention based on current understandign would be to create a git repo called "osops-docs" as this fits current naming an thin initial document we intend to put there and the others we may adopt from docs-team. 
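For reference, the "create the repo" step from the project creators' guide is mostly a couple of small reviews against openstack-infra/project-config plus a .gitreview file in the new repository. A rough sketch, where the repo name is only the placeholder proposed above and nothing decided:

  # gerrit/projects.yaml in openstack-infra/project-config gains an entry like:
  #   - project: openstack/osops-docs
  #     description: Operator-maintained documentation (ops guide, HA guide, ...)
  # plus a matching ACL file under gerrit/acls/, and then in the new repo itself
  # a .gitreview:
  #   [gerrit]
  #   host=review.openstack.org
  #   port=29418
  #   project=openstack/osops-docs.git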
My understanding is that the docs team doesn't want to host this type of documentation due to a much reduced team size and prefers it to live with subject matter experts. Is that correct? If that's not correct I'm not personally opposed to trying this under docs. We'll need to maintain enough contributors and reviewers to make the workflow go in either location, and that's my understanding of the basic issue, not where it lives. This naming would also match other repos which could be consolidated into an "osops" repo to rule them all. That may make sense as I think there's significant overlap in the set of people who might contribute, but that can be a parallel conversation. Doug, looking at the new project docs I think most of it is clear enough to me. Since it's not code I can skip all the PyPI stuff, yes? The repo creation seems pretty clear and I can steal the CI stuff from similar projects. I'm a little unclear on the Storyboard bit; I've not done much contribution lately and haven't used Storyboard. Is that relevant (or at least relevant at first) for this use case? If it is I probably have more questions. I agree governance can also be a parallel discussion. I don't have strong opinions there, but based on participants and content it seems like a "UC" thing. < shrug /> -Jon From jon at csail.mit.edu Thu May 24 14:26:55 2018 From: jon at csail.mit.edu (Jonathan D. Proulx) Date: Thu, 24 May 2018 07:26:55 -0700 Subject: Re: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <1527170041-sup-4254@lrrr.local> References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> <1527141275-sup-1922@lrrr.local> <1527170041-sup-4254@lrrr.local> Message-ID: <20180524142655.btcnop2tpexq32of@csail.mit.edu> On Thu, May 24, 2018 at 07:07:10AM -0700, Doug Hellmann wrote: :I know you wanted to avoid lots of governance overhead, so I want :to just mention that establishing a SIG is meant to be a painless :and light-weight way to declare that a group of interested people :exists so that others can find them and participate in the work :[1]. It shouldn't take much effort to do the setup, and any ongoing :communication is something you would presumably by doing anyway :among a group of people trying to collaborate on a project like :this. Yeah I can see SIG as a useful structure too. I'm just more familiar with UC "teams" because of my personal history. I do think SIG -vs- team would impact repo naming, and I'm still going over the creation doc, so I'll let this simmer here at least until YVR lunch time to see if there's consensus or controversy in the potential contributor community. Lacking either I think I will default to SIG-ops-docs. Thanks, -Jon : :Let me know if you have any questions or concerns about the process. : :Doug : :[1] https://governance.openstack.org/sigs/#process-to-create-a-sig : :> :> > of repositories under osops- :> > :> > https://github.com/openstack-infra/project-config/blob/master/gerrit/projects.yaml#L5673-L5703 :> > :> > Generally active: :> > osops-tools-contrib :> > osops-tools-generic :> > osops-tools-monitoring :> > :> > :> > Probably dead: :> > osops-tools-logging :> > osops-coda :> > osops-example-configs :> > :> > Because you are more familiar with how things work, is there a way to :> > consolidate these vs coming up with another repo like osops-docs or :> > whatever in this case? And second, is there already governance clearance to :> > publish based on the following - https://launchpad.net/osops - which is :> > where these repos originated. 
:> :> I don't really know what any of those things are, or whether it :> makes sense to put this new content there. I assumed we would make :> a repo with a name like "operations-guide", but that's up to Chris :> and John. If they think reusing an existing repository makes sense, :> that would be OK with me, but it's cheap and easy to set up a new :> one, too. :> :> My main concern is that we remove the road blocks, now that we have :> people interested in contributing to this documentation. :> :> > :> > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker wrote: :> > :> > > Hi Chris, :> > > :> > > thanks for summarize our session today in Vancouver. As I18n PTL and one :> > > of the Docs Core I put Petr in Cc. He is currently Docs PTL, but :> > > unfortunatelly not on-site. :> > > I couldn't also not get the full history of the story and that's also not :> > > the idea to starting finger pointing. As usualy we moving forward and there :> > > are some interesting things to know what happened. :> > > First of all: There are no "Docs-Team" anymore. If you look at [1] there :> > > are mostly part-time contributors like me or people are more involved in :> > > other projects and therefore busy. Because of that, the responsibility of :> > > documentation content are moved completely to the project teams. Each repo :> > > has a user guide, admin guide, deployment guide, and so on. The small :> > > Documentation Team provides only tooling and give advices how to write and :> > > publish a document. So it's up to you to re-use the old repo on [2] or :> > > setup a new one. I would recommend to use the best of both worlds. There :> > > are a very good toolset in place for testing and publishing documents. :> > > There are also various text editors for rst extensions available, like in :> > > vim, notepad++ or also online services. I understand the concerns and when :> > > people are sad because their patches are ignored for months. But it's :> > > alltime a question of responsibilty and how can spend people time. :> > > I would be available for help. As I18n PTL I could imagine that a :> > > OpenStack Operations Guide is available in different languages and portable :> > > in different formats like in Sphinx. For us as translation team it's a good :> > > possibility to get feedback about the quality and to understand the :> > > requirements, also for other documents. :> > > So let's move on. :> > > :> > > kind regards :> > > :> > > Frank :> > > :> > > [1] https://review.openstack.org/#/admin/groups/30,members :> > > [2] https://github.com/openstack/operations-guide :> > > :> > > :> > > Am 2018-05-24 03:38, schrieb Chris Morgan: :> > > :> > >> Hello Everyone, :> > >> :> > >> In the Ops Community documentation working session today in Vancouver, :> > >> we made some really good progress (etherpad here: :> > >> https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not all of :> > >> the good stuff is yet written down). :> > >> :> > >> In short, we're going to course correct on maintaining the Operators :> > >> Guide, the HA Guide and Architecture Guide, not edit-in-place via the :> > >> wiki and instead try still maintaining them as code, but with a :> > >> different, new set of owners, possibly in a new Ops-focused repo. :> > >> There was a strong consensus that a) code workflow >> wiki workflow :> > >> and that b) openstack core docs tools are just fine. 
:> > >> :> > >> There is a lot still to be decided on how where and when, but we do :> > >> have an offer of a rewrite of the HA Guide, as long as the changes :> > >> will be allowed to actually land, so we expect to actually start :> > >> showing some progress. :> > >> :> > >> At the end of the session, people wanted to know how to follow along :> > >> as various people work out how to do this... and so for now that place :> > >> is this very email thread. The idea is if the code for those documents :> > >> goes to live in a different repo, or if new contributors turn up, or :> > >> if a new version we will announce/discuss it here until such time as :> > >> we have a better home for this initiative. :> > >> :> > >> Cheers :> > >> :> > >> Chris :> > >> :> > >> -- :> > >> Chris Morgan :> > >> _______________________________________________ :> > >> OpenStack-operators mailing list :> > >> OpenStack-operators at lists.openstack.org :> > >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators :> > >> :> > > :> > > :> > > _______________________________________________ :> > > OpenStack-operators mailing list :> > > OpenStack-operators at lists.openstack.org :> > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators :> > > :> > : :_______________________________________________ :OpenStack-operators mailing list :OpenStack-operators at lists.openstack.org :http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From mrhillsman at gmail.com Thu May 24 20:26:07 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Thu, 24 May 2018 13:26:07 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <20180524142655.btcnop2tpexq32of@csail.mit.edu> References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> <1527141275-sup-1922@lrrr.local> <1527170041-sup-4254@lrrr.local> <20180524142655.btcnop2tpexq32of@csail.mit.edu> Message-ID: I think a great model we have in general as a community is if people show up to do the work, it is not something crazy, get out of their way; at least that is how I think of it. I apologize if there is any perception opposed to my previous statement by me bringing up the other repos. I tried to be clear in wanting to get feedback from Doug in hope that as we move forward in general, what are some thoughts on that front to ensure we continue to remove roadblocks if any exist in parallel to great work, like what Chris is driving here. On that front, please do what works best for those doing the work. On Thu, May 24, 2018 at 7:26 AM, Jonathan D. Proulx wrote: > On Thu, May 24, 2018 at 07:07:10AM -0700, Doug Hellmann wrote: > > :I know you wanted to avoid lots of governance overhead, so I want > :to just mention that establishing a SIG is meant to be a painless > :and light-weight way to declare that a group of interested people > :exists so that others can find them and participate in the work > :[1]. It shouldn't take much effort to do the setup, and any ongoing > :communication is something you would presumably by doing anyway > :among a group of people trying to collaborate on a project like > :this. > > Yeah I can see SIG as a useful structure too. I'm just more familiar > with UC "teams" because of my personal history. > > I do thing SIG -vs- team would impace repo naming, and I'm still going > over creation doc, so I'll let this simmer here at least until YVR lunch > time to see if there's consensus or cotroversy in the potential > contributer community. 
Lacking either I think I will default to > SIG-ops-docs. > > Thanks, > -Jon > > : > :Let me know if you have any questions or concerns about the process. > : > :Doug > : > :[1] https://governance.openstack.org/sigs/#process-to-create-a-sig > : > :> > :> > of repositories under osops- > :> > > :> > https://github.com/openstack-infra/project-config/blob/ > master/gerrit/projects.yaml#L5673-L5703 > :> > > :> > Generally active: > :> > osops-tools-contrib > :> > osops-tools-generic > :> > osops-tools-monitoring > :> > > :> > > :> > Probably dead: > :> > osops-tools-logging > :> > osops-coda > :> > osops-example-configs > :> > > :> > Because you are more familiar with how things work, is there a way to > :> > consolidate these vs coming up with another repo like osops-docs or > :> > whatever in this case? And second, is there already governance > clearance to > :> > publish based on the following - https://launchpad.net/osops - which > is > :> > where these repos originated. > :> > :> I don't really know what any of those things are, or whether it > :> makes sense to put this new content there. I assumed we would make > :> a repo with a name like "operations-guide", but that's up to Chris > :> and John. If they think reusing an existing repository makes sense, > :> that would be OK with me, but it's cheap and easy to set up a new > :> one, too. > :> > :> My main concern is that we remove the road blocks, now that we have > :> people interested in contributing to this documentation. > :> > :> > > :> > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker > wrote: > :> > > :> > > Hi Chris, > :> > > > :> > > thanks for summarize our session today in Vancouver. As I18n PTL > and one > :> > > of the Docs Core I put Petr in Cc. He is currently Docs PTL, but > :> > > unfortunatelly not on-site. > :> > > I couldn't also not get the full history of the story and that's > also not > :> > > the idea to starting finger pointing. As usualy we moving forward > and there > :> > > are some interesting things to know what happened. > :> > > First of all: There are no "Docs-Team" anymore. If you look at [1] > there > :> > > are mostly part-time contributors like me or people are more > involved in > :> > > other projects and therefore busy. Because of that, the > responsibility of > :> > > documentation content are moved completely to the project teams. > Each repo > :> > > has a user guide, admin guide, deployment guide, and so on. The > small > :> > > Documentation Team provides only tooling and give advices how to > write and > :> > > publish a document. So it's up to you to re-use the old repo on [2] > or > :> > > setup a new one. I would recommend to use the best of both worlds. > There > :> > > are a very good toolset in place for testing and publishing > documents. > :> > > There are also various text editors for rst extensions available, > like in > :> > > vim, notepad++ or also online services. I understand the concerns > and when > :> > > people are sad because their patches are ignored for months. But > it's > :> > > alltime a question of responsibilty and how can spend people time. > :> > > I would be available for help. As I18n PTL I could imagine that a > :> > > OpenStack Operations Guide is available in different languages and > portable > :> > > in different formats like in Sphinx. For us as translation team > it's a good > :> > > possibility to get feedback about the quality and to understand the > :> > > requirements, also for other documents. > :> > > So let's move on. 
> :> > > > :> > > kind regards > :> > > > :> > > Frank > :> > > > :> > > [1] https://review.openstack.org/#/admin/groups/30,members > :> > > [2] https://github.com/openstack/operations-guide > :> > > > :> > > > :> > > Am 2018-05-24 03:38, schrieb Chris Morgan: > :> > > > :> > >> Hello Everyone, > :> > >> > :> > >> In the Ops Community documentation working session today in > Vancouver, > :> > >> we made some really good progress (etherpad here: > :> > >> https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but not > all of > :> > >> the good stuff is yet written down). > :> > >> > :> > >> In short, we're going to course correct on maintaining the > Operators > :> > >> Guide, the HA Guide and Architecture Guide, not edit-in-place via > the > :> > >> wiki and instead try still maintaining them as code, but with a > :> > >> different, new set of owners, possibly in a new Ops-focused repo. > :> > >> There was a strong consensus that a) code workflow >> wiki workflow > :> > >> and that b) openstack core docs tools are just fine. > :> > >> > :> > >> There is a lot still to be decided on how where and when, but we do > :> > >> have an offer of a rewrite of the HA Guide, as long as the changes > :> > >> will be allowed to actually land, so we expect to actually start > :> > >> showing some progress. > :> > >> > :> > >> At the end of the session, people wanted to know how to follow > along > :> > >> as various people work out how to do this... and so for now that > place > :> > >> is this very email thread. The idea is if the code for those > documents > :> > >> goes to live in a different repo, or if new contributors turn up, > or > :> > >> if a new version we will announce/discuss it here until such time > as > :> > >> we have a better home for this initiative. > :> > >> > :> > >> Cheers > :> > >> > :> > >> Chris > :> > >> > :> > >> -- > :> > >> Chris Morgan > :> > >> _______________________________________________ > :> > >> OpenStack-operators mailing list > :> > >> OpenStack-operators at lists.openstack.org > :> > >> http://lists.openstack.org/cgi-bin/mailman/listinfo/ > openstack-operators > :> > >> > :> > > > :> > > > :> > > _______________________________________________ > :> > > OpenStack-operators mailing list > :> > > OpenStack-operators at lists.openstack.org > :> > > http://lists.openstack.org/cgi-bin/mailman/listinfo/ > openstack-operators > :> > > > :> > > : > :_______________________________________________ > :OpenStack-operators mailing list > :OpenStack-operators at lists.openstack.org > :http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -- Kind regards, Melvin Hillsman mrhillsman at gmail.com mobile: (832) 264-2646 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon at csail.mit.edu Thu May 24 20:31:29 2018 From: jon at csail.mit.edu (Jonathan D. 
Proulx) Date: Thu, 24 May 2018 13:31:29 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> <1527141275-sup-1922@lrrr.local> <1527170041-sup-4254@lrrr.local> <20180524142655.btcnop2tpexq32of@csail.mit.edu> Message-ID: <20180524203129.ovlsvq4qe6tcrh7b@csail.mit.edu> On Thu, May 24, 2018 at 01:26:07PM -0700, Melvin Hillsman wrote: : I think a great model we have in general as a community is if people : show up to do the work, it is not something crazy, get out of their : way; at least that is how I think of it. I apologize if there is any : perception opposed to my previous statement by me bringing up the other : repos. I tried to be clear in wanting to get feedback from Doug in hope : that as we move forward in general, what are some thoughts on that : front to ensure we continue to remove roadblocks if any exist in : parallel to great work, like what Chris is driving here. On that front, : please do what works best for those doing the work. No worries I feel the love :) Going to go forward implemnting as SIG + repo which seems lightest way forward, we can always adapt and evolve. -Jon : On Thu, May 24, 2018 at 7:26 AM, Jonathan D. Proulx : <[1]jon at csail.mit.edu> wrote: : : On Thu, May 24, 2018 at 07:07:10AM -0700, Doug Hellmann wrote: : :I know you wanted to avoid lots of governance overhead, so I want : :to just mention that establishing a SIG is meant to be a painless : :and light-weight way to declare that a group of interested people : :exists so that others can find them and participate in the work : :[1]. It shouldn't take much effort to do the setup, and any ongoing : :communication is something you would presumably by doing anyway : :among a group of people trying to collaborate on a project like : :this. : Yeah I can see SIG as a useful structure too. I'm just more : familiar : with UC "teams" because of my personal history. : I do thing SIG -vs- team would impace repo naming, and I'm still : going : over creation doc, so I'll let this simmer here at least until YVR : lunch : time to see if there's consensus or cotroversy in the potential : contributer community. Lacking either I think I will default to : SIG-ops-docs. : Thanks, : -Jon : : : :Let me know if you have any questions or concerns about the : process. : : : : :Doug : : : :[1] [2]https://governance.openstack.org/sigs/#process-to-create-a-sig : : : :> : :> > of repositories under osops- : :> > : :> > [3]https://github.com/openstack-infra/project-config/blob/ : master/gerrit/projects.yaml#L5673-L5703 : :> > : :> > Generally active: : :> > osops-tools-contrib : :> > osops-tools-generic : :> > osops-tools-monitoring : :> > : :> > : :> > Probably dead: : :> > osops-tools-logging : :> > osops-coda : :> > osops-example-configs : :> > : :> > Because you are more familiar with how things work, is there a way : to : :> > consolidate these vs coming up with another repo like osops-docs : or : :> > whatever in this case? And second, is there already governance : clearance to : :> > publish based on the following - [4]https://launchpad.net/osops - : which is : :> > where these repos originated. : :> : :> I don't really know what any of those things are, or whether it : :> makes sense to put this new content there. I assumed we would make : :> a repo with a name like "operations-guide", but that's up to Chris : :> and John. 
If they think reusing an existing repository makes sense, : :> that would be OK with me, but it's cheap and easy to set up a new : :> one, too. : :> : :> My main concern is that we remove the road blocks, now that we have : :> people interested in contributing to this documentation. : :> : :> > : :> > On Wed, May 23, 2018 at 9:56 PM, Frank Kloeker <[5]eumel at arcor.de> : wrote: : :> > : :> > > Hi Chris, : :> > > : :> > > thanks for summarize our session today in Vancouver. As I18n PTL : and one : :> > > of the Docs Core I put Petr in Cc. He is currently Docs PTL, but : :> > > unfortunatelly not on-site. : :> > > I couldn't also not get the full history of the story and that's : also not : :> > > the idea to starting finger pointing. As usualy we moving : forward and there : :> > > are some interesting things to know what happened. : :> > > First of all: There are no "Docs-Team" anymore. If you look at : [1] there : :> > > are mostly part-time contributors like me or people are more : involved in : :> > > other projects and therefore busy. Because of that, the : responsibility of : :> > > documentation content are moved completely to the project teams. : Each repo : :> > > has a user guide, admin guide, deployment guide, and so on. The : small : :> > > Documentation Team provides only tooling and give advices how to : write and : :> > > publish a document. So it's up to you to re-use the old repo on : [2] or : :> > > setup a new one. I would recommend to use the best of both : worlds. There : :> > > are a very good toolset in place for testing and publishing : documents. : :> > > There are also various text editors for rst extensions : available, like in : :> > > vim, notepad++ or also online services. I understand the : concerns and when : :> > > people are sad because their patches are ignored for months. But : it's : :> > > alltime a question of responsibilty and how can spend people : time. : :> > > I would be available for help. As I18n PTL I could imagine that : a : :> > > OpenStack Operations Guide is available in different languages : and portable : :> > > in different formats like in Sphinx. For us as translation team : it's a good : :> > > possibility to get feedback about the quality and to understand : the : :> > > requirements, also for other documents. : :> > > So let's move on. : :> > > : :> > > kind regards : :> > > : :> > > Frank : :> > > : :> > > [1] [6]https://review.openstack.org/#/admin/groups/30,members : :> > > [2] [7]https://github.com/openstack/operations-guide : :> > > : :> > > : :> > > Am 2018-05-24 03:38, schrieb Chris Morgan: : :> > > : :> > >> Hello Everyone, : :> > >> : :> > >> In the Ops Community documentation working session today in : Vancouver, : :> > >> we made some really good progress (etherpad here: : :> > >> [8]https://etherpad.openstack.org/p/YVR-Ops-Community-Docs but : not all of : :> > >> the good stuff is yet written down). : :> > >> : :> > >> In short, we're going to course correct on maintaining the : Operators : :> > >> Guide, the HA Guide and Architecture Guide, not edit-in-place : via the : :> > >> wiki and instead try still maintaining them as code, but with a : :> > >> different, new set of owners, possibly in a new Ops-focused : repo. : :> > >> There was a strong consensus that a) code workflow >> wiki : workflow : :> > >> and that b) openstack core docs tools are just fine. 
: :> > >> : :> > >> There is a lot still to be decided on how where and when, but : we do : :> > >> have an offer of a rewrite of the HA Guide, as long as the : changes : :> > >> will be allowed to actually land, so we expect to actually : start : :> > >> showing some progress. : :> > >> : :> > >> At the end of the session, people wanted to know how to follow : along : :> > >> as various people work out how to do this... and so for now : that place : :> > >> is this very email thread. The idea is if the code for those : documents : :> > >> goes to live in a different repo, or if new contributors turn : up, or : :> > >> if a new version we will announce/discuss it here until such : time as : :> > >> we have a better home for this initiative. : :> > >> : :> > >> Cheers : :> > >> : :> > >> Chris : :> > >> : :> > >> -- : :> > >> Chris Morgan <[9]mihalis68 at gmail.com> : :> > >> _______________________________________________ : :> > >> OpenStack-operators mailing list : :> > >> [10]OpenStack-operators at lists.openstack.org : :> > >> [11]http://lists.openstack.org/cgi-bin/mailman/listinfo/ : openstack-operators : :> > >> : :> > > : :> > > : :> > > _______________________________________________ : :> > > OpenStack-operators mailing list : :> > > [12]OpenStack-operators at lists.openstack.org : :> > > [13]http://lists.openstack.org/cgi-bin/mailman/listinfo/ : openstack-operators : :> > > : :> > : : : :_______________________________________________ : :OpenStack-operators mailing list : :[14]OpenStack-operators at lists.openstack.org : :[15]http://lists.openstack.org/cgi-bin/mailman/listinfo/ : openstack-operators : _______________________________________________ : OpenStack-operators mailing list : [16]OpenStack-operators at lists.openstack.org : [17]http://lists.openstack.org/cgi-bin/mailman/listinfo/ : openstack-operators : : -- : Kind regards, : Melvin Hillsman : [18]mrhillsman at gmail.com : mobile: (832) 264-2646 : :References : : 1. mailto:jon at csail.mit.edu : 2. https://governance.openstack.org/sigs/#process-to-create-a-sig : 3. https://github.com/openstack-infra/project-config/blob/master/gerrit/projects.yaml#L5673-L5703 : 4. https://launchpad.net/osops : 5. mailto:eumel at arcor.de : 6. https://review.openstack.org/#/admin/groups/30,members : 7. https://github.com/openstack/operations-guide : 8. https://etherpad.openstack.org/p/YVR-Ops-Community-Docs : 9. mailto:mihalis68 at gmail.com : 10. mailto:OpenStack-operators at lists.openstack.org : 11. http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators : 12. mailto:OpenStack-operators at lists.openstack.org : 13. http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators : 14. mailto:OpenStack-operators at lists.openstack.org : 15. http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators : 16. mailto:OpenStack-operators at lists.openstack.org : 17. http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators : 18. mailto:mrhillsman at gmail.com From blair.bethwaite at gmail.com Thu May 24 21:59:16 2018 From: blair.bethwaite at gmail.com (Blair Bethwaite) Date: Fri, 25 May 2018 07:59:16 +1000 Subject: [Openstack-operators] pci passthrough & numa affinity Message-ID: Hi Jon, Following up to the question you asked during the HPC on OpenStack panel at the summit yesterday... You might have already seen Daniel Berrange's blog on this topic: https://www.berrange.com/posts/2017/02/16/setting-up-a-nested-kvm-guest-for-developing-testing-pci-device-assignment-with-numa/ ? 
He essentially describes how you can get around the issue of the naive flat pci bus topology in the guest - exposing numa affinity of the PCIe root ports requires newish qemu and libvirt. However, best I can tell there is no way to do this with Nova today. Are you interested in working together on a spec for this? The other related feature of interest here (newer though - no libvirt support yet I think) is gpu cliques (https://github.com/qemu/qemu/commit/dfbee78db8fdf7bc8c151c3d29504bb47438480b), would be really nice to have a way to set these up through Nova once libvirt supports it. -- Cheers, ~Blairo From jon at csail.mit.edu Thu May 24 22:19:09 2018 From: jon at csail.mit.edu (Jonathan D. Proulx) Date: Thu, 24 May 2018 15:19:09 -0700 Subject: [Openstack-operators] pci passthrough & numa affinity In-Reply-To: References: Message-ID: <20180524221909.tgdivnx6dvotdwnl@csail.mit.edu> On Fri, May 25, 2018 at 07:59:16AM +1000, Blair Bethwaite wrote: :Hi Jon, : :Following up to the question you asked during the HPC on OpenStack :panel at the summit yesterday... : :You might have already seen Daniel Berrange's blog on this topic: :https://www.berrange.com/posts/2017/02/16/setting-up-a-nested-kvm-guest-for-developing-testing-pci-device-assignment-with-numa/ :? He essentially describes how you can get around the issue of the :naive flat pci bus topology in the guest - exposing numa affinity of :the PCIe root ports requires newish qemu and libvirt. Thanks for the pointer not sure if I've seen that one, I've seen a few ways to map manually. I would have been quite surprised if nova did this so I am poking at libvirt.xml outside nova for now :However, best I can tell there is no way to do this with Nova today. :Are you interested in working together on a spec for this? I'm not yet convinced it's worth the bother, that's the crux of the question I'm investigating. Is this worth the effort? There's a meta question "do I have time to find out" :) :The other related feature of interest here (newer though - no libvirt :support yet I think) is gpu cliques :(https://github.com/qemu/qemu/commit/dfbee78db8fdf7bc8c151c3d29504bb47438480b), :would be really nice to have a way to set these up through Nova once :libvirt supports it. Thanks, -Jon From mriedemos at gmail.com Thu May 24 22:19:49 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Thu, 24 May 2018 15:19:49 -0700 Subject: [Openstack-operators] [nova] Need some feedback on the proposed heal_allocations CLI Message-ID: I've written a nova-manage placement heal_allocations CLI [1] which was a TODO from the PTG in Dublin as a step toward getting existing CachingScheduler users to roll off that (which is deprecated). During the CERN cells v1 upgrade talk it was pointed out that CERN was able to go from placement-per-cell to centralized placement in Ocata because the nova-computes in each cell would automatically recreate the allocations in Placement in a periodic task, but that code is gone once you're upgraded to Pike or later. In various other talks during the summit this week, we've talked about things during upgrades where, for instance, if placement is down for some reason during an upgrade, a user deletes an instance and the allocation doesn't get cleaned up from placement so it's going to continue counting against resource usage on that compute node even though the server instance in nova is gone. So this CLI could be expanded to help clean up situations like that, e.g. 
provide it a specific server ID and the CLI can figure out if it needs to clean things up in placement. So there are plenty of things we can build into this, but the patch is already quite large. I expect we'll also be backporting this to stable branches to help operators upgrade/fix allocation issues. It already has several things listed in a code comment inline about things to build into this later. My question is, is this good enough for a first iteration or is there something severely missing before we can merge this, like the automatic marker tracking mentioned in the code (that will probably be a non-trivial amount of code to add). I could really use some operator feedback on this to just take a look at what it already is capable of and if it's not going to be useful in this iteration, let me know what's missing and I can add that in to the patch. [1] https://review.openstack.org/#/c/565886/ -- Thanks, Matt From openstack at fried.cc Thu May 24 23:34:06 2018 From: openstack at fried.cc (Eric Fried) Date: Thu, 24 May 2018 18:34:06 -0500 Subject: [Openstack-operators] pci passthrough & numa affinity In-Reply-To: <20180524221909.tgdivnx6dvotdwnl@csail.mit.edu> References: <20180524221909.tgdivnx6dvotdwnl@csail.mit.edu> Message-ID: How long are you willing to wait? The work we're doing to use Placement from Nova ought to allow us to model both of these things nicely from the virt driver, and request them nicely from the flavor. By the end of Rocky we will have laid a large percentage of the groundwork to enable this. This is all part of the road to what we've been calling "generic device management" (GDM) -- which we hope will eventually let us remove most/all of the existing PCI passthrough code. I/we would be interested in hearing more specifics of your requirements around this, as it will help inform the GDM roadmap. And of course, upstream help & contributions would be very welcome. Thanks, efried On 05/24/2018 05:19 PM, Jonathan D. Proulx wrote: > On Fri, May 25, 2018 at 07:59:16AM +1000, Blair Bethwaite wrote: > :Hi Jon, > : > :Following up to the question you asked during the HPC on OpenStack > :panel at the summit yesterday... > : > :You might have already seen Daniel Berrange's blog on this topic: > :https://www.berrange.com/posts/2017/02/16/setting-up-a-nested-kvm-guest-for-developing-testing-pci-device-assignment-with-numa/ > :? He essentially describes how you can get around the issue of the > :naive flat pci bus topology in the guest - exposing numa affinity of > :the PCIe root ports requires newish qemu and libvirt. > > Thanks for the pointer not sure if I've seen that one, I've seen a few > ways to map manually. I would have been quite surprised if nova did > this so I am poking at libvirt.xml outside nova for now > > :However, best I can tell there is no way to do this with Nova today. > :Are you interested in working together on a spec for this? > > I'm not yet convinced it's worth the bother, that's the crux of the > question I'm investigating. Is this worth the effort? There's a meta > question "do I have time to find out" :) > > :The other related feature of interest here (newer though - no libvirt > :support yet I think) is gpu cliques > :(https://github.com/qemu/qemu/commit/dfbee78db8fdf7bc8c151c3d29504bb47438480b), > :would be really nice to have a way to set these up through Nova once > :libvirt supports it. 
> > Thanks, > -Jon > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > From jon at csail.mit.edu Fri May 25 01:58:35 2018 From: jon at csail.mit.edu (Jonathan D. Proulx) Date: Thu, 24 May 2018 18:58:35 -0700 Subject: [Openstack-operators] pci passthrough & numa affinity In-Reply-To: References: <20180524221909.tgdivnx6dvotdwnl@csail.mit.edu> Message-ID: <20180525015835.kesc3lcdjrwlpsrg@csail.mit.edu> On Thu, May 24, 2018 at 06:34:06PM -0500, Eric Fried wrote: :How long are you willing to wait? : :The work we're doing to use Placement from Nova ought to allow us to :model both of these things nicely from the virt driver, and request them :nicely from the flavor. : :By the end of Rocky we will have laid a large percentage of the :groundwork to enable this. This is all part of the road to what we've :been calling "generic device management" (GDM) -- which we hope will :eventually let us remove most/all of the existing PCI passthrough code. : :I/we would be interested in hearing more specifics of your requirements :around this, as it will help inform the GDM roadmap. And of course, :upstream help & contributions would be very welcome. Sounds like good work. My use case is not yet very clear. I do have some upcoming discussions with users around requirements and funding so being able to say "this is on the road map and could be accelerated with developer hours" is useful. I expect patience is what will come of that but very good to know where to go when I get some clarity and if I get some resources. -Jon : :Thanks, :efried : :On 05/24/2018 05:19 PM, Jonathan D. Proulx wrote: :> On Fri, May 25, 2018 at 07:59:16AM +1000, Blair Bethwaite wrote: :> :Hi Jon, :> : :> :Following up to the question you asked during the HPC on OpenStack :> :panel at the summit yesterday... :> : :> :You might have already seen Daniel Berrange's blog on this topic: :> :https://www.berrange.com/posts/2017/02/16/setting-up-a-nested-kvm-guest-for-developing-testing-pci-device-assignment-with-numa/ :> :? He essentially describes how you can get around the issue of the :> :naive flat pci bus topology in the guest - exposing numa affinity of :> :the PCIe root ports requires newish qemu and libvirt. :> :> Thanks for the pointer not sure if I've seen that one, I've seen a few :> ways to map manually. I would have been quite surprised if nova did :> this so I am poking at libvirt.xml outside nova for now :> :> :However, best I can tell there is no way to do this with Nova today. :> :Are you interested in working together on a spec for this? :> :> I'm not yet convinced it's worth the bother, that's the crux of the :> question I'm investigating. Is this worth the effort? There's a meta :> question "do I have time to find out" :) :> :> :The other related feature of interest here (newer though - no libvirt :> :support yet I think) is gpu cliques :> :(https://github.com/qemu/qemu/commit/dfbee78db8fdf7bc8c151c3d29504bb47438480b), :> :would be really nice to have a way to set these up through Nova once :> :libvirt supports it. 
:> :> Thanks, :> -Jon :> :> :> _______________________________________________ :> OpenStack-operators mailing list :> OpenStack-operators at lists.openstack.org :> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators :> : :_______________________________________________ :OpenStack-operators mailing list :OpenStack-operators at lists.openstack.org :http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From doug at doughellmann.com Fri May 25 12:30:40 2018 From: doug at doughellmann.com (Doug Hellmann) Date: Fri, 25 May 2018 05:30:40 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <20180524141929.2vylwguebcgkjxa3@csail.mit.edu> References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> <20180524141929.2vylwguebcgkjxa3@csail.mit.edu> Message-ID: <1527251170-sup-2275@lrrr.local> Excerpts from Jonathan D. Proulx's message of 2018-05-24 07:19:29 -0700: > > My intention based on current understandign would be to create a git > repo called "osops-docs" as this fits current naming an thin initial > document we intend to put there and the others we may adopt from > docs-team. Normally I would say "yay, consistency!" In this case, let's verify that that name isn't going to have an undesirable effect when the content is published. I know the default destination directory for the publish job is taken from the repository name, which would mean we would have a URL like docs.openstack.org/osops-docs. I don't know if there is a way to override that, but the infra team will know. So, if you want a URL like docs.o.o/operations-guide instead, you'll want to check with the infra folks before creating the repo to make sure it's set up in a way to get the URL you want. Doug From jon at csail.mit.edu Fri May 25 17:37:39 2018 From: jon at csail.mit.edu (Jonathan Proulx) Date: Fri, 25 May 2018 10:37:39 -0700 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <1527251170-sup-2275@lrrr.local> References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> <20180524141929.2vylwguebcgkjxa3@csail.mit.edu> <1527251170-sup-2275@lrrr.local> Message-ID: <37FF3737-79CA-4ED2-B078-900F26F99C53@csail.mit.edu> On May 25, 2018 5:30:40 AM PDT, Doug Hellmann wrote: >Excerpts from Jonathan D. Proulx's message of 2018-05-24 07:19:29 >-0700: >> >> My intention based on current understandign would be to create a git >> repo called "osops-docs" as this fits current naming an thin initial >> document we intend to put there and the others we may adopt from >> docs-team. > >Normally I would say "yay, consistency!" In this case, let's verify >that that name isn't going to have an undesirable effect when the >content is published. > >I know the default destination directory for the publish job is >taken from the repository name, which would mean we would have a >URL like docs.openstack.org/osops-docs. I don't know if there is a >way to override that, but the infra team will know. So, if you want >a URL like docs.o.o/operations-guide instead, you'll want to check >with the infra folks before creating the repo to make sure it's set >up in a way to get the URL you want. Names are hard! Thanks for pointing out the implications. -Jon -- Sent from my Android device with K-9 Mail. Please excuse my brevity. 
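On the earlier PCI passthrough / NUMA affinity sub-thread (Blair's pointer to
Daniel Berrange's post, and Jon's hand-editing of libvirt.xml outside Nova):
the technique in that post amounts to adding a NUMA-node-aware PCIe expander
bus plus a root port to the guest, and then placing the passed-through device
behind that port. A rough, hand-written sketch of the idea is below; the
controller indexes, busNr and PCI addresses are made-up example values, the
guest also needs a matching <numa> topology defined, and the exact syntax
should be checked against the libvirt/QEMU versions in use rather than copied
from here:

```xml
<!-- Expander bus associated with guest NUMA node 1 (example values). -->
<controller type='pci' index='10' model='pcie-expander-bus'>
  <target busNr='180'>
    <node>1</node>
  </target>
</controller>
<!-- Root port plugged into the expander bus above (bus 0x0a = index 10). -->
<controller type='pci' index='11' model='pcie-root-port'>
  <address type='pci' domain='0x0000' bus='0x0a' slot='0x00' function='0x0'/>
</controller>
<!-- Passed-through host device placed behind that root port (bus 0x0b = index 11). -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x81' slot='0x00' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x0b' slot='0x00' function='0x0'/>
</hostdev>
```

Whether this is worth wiring into Nova itself, rather than doing one-off
libvirt.xml edits, is exactly the open question in that exchange.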
From sbauza at redhat.com Mon May 28 12:31:59 2018 From: sbauza at redhat.com (Sylvain Bauza) Date: Mon, 28 May 2018 14:31:59 +0200 Subject: [Openstack-operators] [nova] Need some feedback on the proposed heal_allocations CLI In-Reply-To: References: Message-ID: On Fri, May 25, 2018 at 12:19 AM, Matt Riedemann wrote: > I've written a nova-manage placement heal_allocations CLI [1] which was a > TODO from the PTG in Dublin as a step toward getting existing > CachingScheduler users to roll off that (which is deprecated). > > During the CERN cells v1 upgrade talk it was pointed out that CERN was > able to go from placement-per-cell to centralized placement in Ocata > because the nova-computes in each cell would automatically recreate the > allocations in Placement in a periodic task, but that code is gone once > you're upgraded to Pike or later. > > In various other talks during the summit this week, we've talked about > things during upgrades where, for instance, if placement is down for some > reason during an upgrade, a user deletes an instance and the allocation > doesn't get cleaned up from placement so it's going to continue counting > against resource usage on that compute node even though the server instance > in nova is gone. So this CLI could be expanded to help clean up situations > like that, e.g. provide it a specific server ID and the CLI can figure out > if it needs to clean things up in placement. > > So there are plenty of things we can build into this, but the patch is > already quite large. I expect we'll also be backporting this to stable > branches to help operators upgrade/fix allocation issues. It already has > several things listed in a code comment inline about things to build into > this later. > > My question is, is this good enough for a first iteration or is there > something severely missing before we can merge this, like the automatic > marker tracking mentioned in the code (that will probably be a non-trivial > amount of code to add). I could really use some operator feedback on this > to just take a look at what it already is capable of and if it's not going > to be useful in this iteration, let me know what's missing and I can add > that in to the patch. > > [1] https://review.openstack.org/#/c/565886/ > > It does sound for me a good way to help operators. That said, given I'm now working on using Nested Resource Providers for VGPU inventories, I wonder about a possible upgrade problem with VGPU allocations. Given that : - in Queens, VGPU inventories are for the root RP (ie. the compute node RP), but, - in Rocky, VGPU inventories will be for children RPs (ie. against a specific VGPU type), then if we have VGPU allocations in Queens, when upgrading to Rocky, we should maybe recreate the allocations to a specific other inventory ? Hope you see the problem with upgrading by creating nested RPs ? > -- > > Thanks, > > Matt > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From zioproto at gmail.com Mon May 28 12:50:10 2018 From: zioproto at gmail.com (Saverio Proto) Date: Mon, 28 May 2018 14:50:10 +0200 Subject: [Openstack-operators] [openstack-dev][publiccloud-wg][k8s][octavia] OpenStack Load Balancer APIs and K8s In-Reply-To: References: Message-ID: Hello Chris, I finally had the time to write about my deployment: https://cloudblog.switch.ch/2018/05/22/openstack-horizon-runs-on-kubernetes-in-production-at-switch/ in this blog post I explain why I use the kubernetes nginx-ingress instead of Openstack LBaaS. Cheers, Saverio 2018-03-15 23:55 GMT+01:00 Chris Hoge : > Hi everyone, > > I wanted to notify you of a thread I started in openstack-dev about the state > of the OpenStack load balancer APIs and the difficulty in integrating them > with Kubernetes. This in part directly relates to current public and private > deployments, and any feedback you have would be appreciated. Especially > feedback on which version of the load balancer APIs you deploy, and if you > haven't moved on to Octavia, why. > > http://lists.openstack.org/pipermail/openstack-dev/2018-March/128399.html > > Thanks in advance, > Chris > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators From pkovar at redhat.com Mon May 28 14:03:41 2018 From: pkovar at redhat.com (Petr Kovar) Date: Mon, 28 May 2018 16:03:41 +0200 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <20180524141929.2vylwguebcgkjxa3@csail.mit.edu> References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> <20180524141929.2vylwguebcgkjxa3@csail.mit.edu> Message-ID: <20180528160341.be386cd2a4562d2981f470fc@redhat.com> On Thu, 24 May 2018 07:19:29 -0700 "Jonathan D. Proulx" wrote: > My intention based on current understandign would be to create a git > repo called "osops-docs" as this fits current naming an thin initial > document we intend to put there and the others we may adopt from > docs-team. So, just to clarify, the current plan is for your group to take ownership of the following docs? https://github.com/openstack/openstack-manuals/tree/a1f1748478125ccd68d90a98ccc06c7ec359d3a0/doc/ops-guide https://github.com/openstack/openstack-manuals/tree/master/doc/arch-design https://github.com/openstack/openstack-manuals/tree/master/doc/ha-guide Note that there is also https://github.com/openstack/openstack-manuals/tree/master/doc/ha-guide-draft which you probably want to merge with the ha-guide going forward (or retire one or the other). As for naming the repo, this is really up to you, but it should be something clear and easily recognizable by your audience. I can help with moving some of the content around, but as Doug pointed out, a few points about actual publishing need to be clarified first with the infra team. > My understanding being they don't to have this type of > documentention due to much reduced team size and prefer it live with > subject matter experts. It that correct? If that's not correct I'm > not personally opposed to trying this under docs. We'll need to > maintain enough contributors and reviewers to make the work flow go in > either location and that's my understanding of the basic issue not > where it lives. If you want more reviewers involved, I'd recommended inviting the reviewers from the docs group. > This naming would also match other repos wich could be consolidated into an > "osops" repo to rule them all. 
That may make sense as I think there's > significant overlap in set of people who might contribute, but that > can be a parallel conversation. > > Doug looking at new project docs I think most of it is clear enough to > me. Since it's not code I can skip all th PyPi stuff yes? The repo > creation seems pretty clear and I can steal the CI stuff from similar > projects. Might be best to look into how https://github.com/openstack/security-doc is configured as that repo contains a number of separate documents, all managed by one group. > I'm a little unclear on the Storyboard bit I've not done > much contribution lately and haven't storyboarded. Is that relevant > (or at least relevent at first) for this use case? If it is I > probably have more questions. I'd suggest either having your own storyboard or launchpad project so that users can file bugs somewhere, and give you feedback. storyboard might be a better option since all OpenStack projects all likely to migrate to it from launchpad at some point or another. Cheers, pk From gael.therond at gmail.com Mon May 28 17:09:50 2018 From: gael.therond at gmail.com (Flint WALRUS) Date: Mon, 28 May 2018 19:09:50 +0200 Subject: [Openstack-operators] [openstack-dev][publiccloud-wg][k8s][octavia] OpenStack Load Balancer APIs and K8s In-Reply-To: References: Message-ID: Hi everyone, I’m currently deploying Octavia as our global LBaaS for a lot of various workload such as Kubernetes ingress LB. We use Queens and plan to upgrade to rocky as soon as it reach the stable release and we use the native Octavia APIv2 (Not a neutron redirect etc). What do you need to know? Le lun. 28 mai 2018 à 14:50, Saverio Proto a écrit : > Hello Chris, > > I finally had the time to write about my deployment: > > https://cloudblog.switch.ch/2018/05/22/openstack-horizon-runs-on-kubernetes-in-production-at-switch/ > > in this blog post I explain why I use the kubernetes nginx-ingress > instead of Openstack LBaaS. > > Cheers, > > Saverio > > > 2018-03-15 23:55 GMT+01:00 Chris Hoge : > > Hi everyone, > > > > I wanted to notify you of a thread I started in openstack-dev about the > state > > of the OpenStack load balancer APIs and the difficulty in integrating > them > > with Kubernetes. This in part directly relates to current public and > private > > deployments, and any feedback you have would be appreciated. Especially > > feedback on which version of the load balancer APIs you deploy, and if > you > > haven't moved on to Octavia, why. > > > > > http://lists.openstack.org/pipermail/openstack-dev/2018-March/128399.html > > > > > > Thanks in advance, > > Chris > > _______________________________________________ > > OpenStack-operators mailing list > > OpenStack-operators at lists.openstack.org > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zioproto at gmail.com Mon May 28 19:26:01 2018 From: zioproto at gmail.com (Saverio Proto) Date: Mon, 28 May 2018 21:26:01 +0200 Subject: [Openstack-operators] [openstack-dev][publiccloud-wg][k8s][octavia] OpenStack Load Balancer APIs and K8s In-Reply-To: References: Message-ID: Hello Flint, what version of Kubernetes are you deploying on top of Openstack ? are you using the external Openstack cloud controller ? 
I tested it and it works only if you have at least v1.10.3.

Look at this page:
https://github.com/kubernetes/cloud-provider-openstack/tree/master/examples/loadbalancers

Please test that you can do SSL termination on the load balancer,
describing it with Kubernetes yaml files. That is important for
production operation. Also test whether you have downtime when you have
to renew SSL certificates.

You will also want to check that the traffic that hits your pods has the
HTTP header X-Forwarded-For or, even better, that the IP packets you
receive at the Pods have the source IP address of the original client.

If needed, test everything with IPv6 as well.

I personally decided not to use Octavia, but to go for the Kubernetes
ingress-nginx
https://github.com/kubernetes/ingress-nginx

The key idea is that instead of OpenStack controlling the load balancer
(Octavia spinning up a VM running nginx), you have Kubernetes controlling
the load balancer, running an nginx container. In the end you need an
nginx reverse proxy either way; you have to decide whether that resource
is managed by OpenStack or by Kubernetes.

Keep in mind that if you go for a Kubernetes ingress controller you can
avoid using nginx. There is already an alternative ha-proxy
implementation:
https://www.haproxy.com/blog/haproxy_ingress_controller_for_kubernetes/

Cheers,

Saverio

2018-05-28 19:09 GMT+02:00 Flint WALRUS :
> Hi everyone, I’m currently deploying Octavia as our global LBaaS for a lot
> of various workload such as Kubernetes ingress LB.
>
> We use Queens and plan to upgrade to rocky as soon as it reach the stable
> release and we use the native Octavia APIv2 (Not a neutron redirect etc).
>
> What do you need to know?
>
> Le lun. 28 mai 2018 à 14:50, Saverio Proto a écrit :
>>
>> Hello Chris,
>>
>> I finally had the time to write about my deployment:
>>
>> https://cloudblog.switch.ch/2018/05/22/openstack-horizon-runs-on-kubernetes-in-production-at-switch/
>>
>> in this blog post I explain why I use the kubernetes nginx-ingress
>> instead of Openstack LBaaS.
>>
>> Cheers,
>>
>> Saverio
>>
>>
>> 2018-03-15 23:55 GMT+01:00 Chris Hoge :
>> > Hi everyone,
>> >
>> > I wanted to notify you of a thread I started in openstack-dev about the
>> > state
>> > of the OpenStack load balancer APIs and the difficulty in integrating
>> > them
>> > with Kubernetes. This in part directly relates to current public and
>> > private
>> > deployments, and any feedback you have would be appreciated. Especially
>> > feedback on which version of the load balancer APIs you deploy, and if
>> > you
>> > haven't moved on to Octavia, why.
>> >
>> >
>> > http://lists.openstack.org/pipermail/openstack-dev/2018-March/128399.html
>> >
>> >
>> > Thanks in advance,
>> > Chris
>> > _______________________________________________
>> > OpenStack-operators mailing list
>> > OpenStack-operators at lists.openstack.org
>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>> _______________________________________________
>> OpenStack-operators mailing list
>> OpenStack-operators at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

From gael.therond at gmail.com  Mon May 28 19:35:22 2018
From: gael.therond at gmail.com (Flint WALRUS)
Date: Mon, 28 May 2018 21:35:22 +0200
Subject: [Openstack-operators] [openstack-dev][publiccloud-wg][k8s][octavia] OpenStack Load Balancer APIs and K8s
In-Reply-To:
References:
Message-ID:

Using 1.10. Something, I’ll have to check tomorrow morning.
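For readers following the load-balancer options Saverio describes above: a
minimal Kubernetes Service of type LoadBalancer is enough to exercise the
cloud-provider integration he mentions. The sketch below is a generic example
(names and ports are placeholders, and any provider-specific annotations for
Octavia floating IPs or TLS termination should be taken from the
cloud-provider-openstack examples page rather than from here). Setting
externalTrafficPolicy: Local is the standard Kubernetes way to preserve the
original client source IP at the pods.

```yaml
# Generic LoadBalancer Service sketch; provider-specific annotations
# (floating IPs, TLS termination, etc.) are deliberately omitted and
# should be looked up in the cloud-provider-openstack examples.
apiVersion: v1
kind: Service
metadata:
  name: web-lb                 # placeholder name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local # preserve the client source IP
  selector:
    app: web                   # placeholder pod selector
  ports:
    - name: https
      port: 443
      targetPort: 8443
```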
I don’t want to use nginx or the provided haproxy as my Octavia LBaaS is a global service and because the less I rely on Kube the more I’m happy ;-) Le lun. 28 mai 2018 à 21:26, Saverio Proto a écrit : > Hello Flint, > > what version of Kubernetes are you deploying on top of Openstack ? > > are you using the external Openstack cloud controller ? I tested it an > it works only if you have at least v.1.10.3 > > Look at this page: > > https://github.com/kubernetes/cloud-provider-openstack/tree/master/examples/loadbalancers > > Please test that you can make a SSL termination on the loadbalancer, > describing it with Kubernetes yaml files. That is important for > production operation. Test also if you have downtime when you have to > renew SSL certificates. > > You will also want to check that traffic that hits your pods has the > HTTP header X-Forwarded-For, or even better the IP packets you receive > at the Pods have the source IP address of the original client. > > If needed test everything also with IPv6 > > I personally decided not to use Octavia, but to go for the Kubernetes > ingress-nginx > https://github.com/kubernetes/ingress-nginx > > The key idea is that instead of Openstack controlling the LoadBalancer > having Octavia spinning up a VM running nginx, you have Kubernetes > controlling the LoadBalancer, running a nginx-container. > At the end you need a nginx to reverse proxy, you have to decided if > this resource is managed by Openstack or Kubernetes. > > Keep in mind that if you go for a kubernetes ingress controller you > can avoid using nginx. There is already an alternative ha-proxy > implementation: > https://www.haproxy.com/blog/haproxy_ingress_controller_for_kubernetes/ > > Cheers, > > Saverio > > 2018-05-28 19:09 GMT+02:00 Flint WALRUS : > > Hi everyone, I’m currently deploying Octavia as our global LBaaS for a > lot > > of various workload such as Kubernetes ingress LB. > > > > We use Queens and plan to upgrade to rocky as soon as it reach the stable > > release and we use the native Octavia APIv2 (Not a neutron redirect etc). > > > > What do you need to know? > > > > Le lun. 28 mai 2018 à 14:50, Saverio Proto a écrit > : > >> > >> Hello Chris, > >> > >> I finally had the time to write about my deployment: > >> > >> > https://cloudblog.switch.ch/2018/05/22/openstack-horizon-runs-on-kubernetes-in-production-at-switch/ > >> > >> in this blog post I explain why I use the kubernetes nginx-ingress > >> instead of Openstack LBaaS. > >> > >> Cheers, > >> > >> Saverio > >> > >> > >> 2018-03-15 23:55 GMT+01:00 Chris Hoge : > >> > Hi everyone, > >> > > >> > I wanted to notify you of a thread I started in openstack-dev about > the > >> > state > >> > of the OpenStack load balancer APIs and the difficulty in integrating > >> > them > >> > with Kubernetes. This in part directly relates to current public and > >> > private > >> > deployments, and any feedback you have would be appreciated. > Especially > >> > feedback on which version of the load balancer APIs you deploy, and if > >> > you > >> > haven't moved on to Octavia, why. 
> >> > > >> > > http://lists.openstack.org/pipermail/openstack-dev/2018-March/128399.html > >> > < http://lists.openstack.org/pipermail/openstack-dev/2018-March/128399.html> > >> > > >> > Thanks in advance, > >> > Chris > >> > _______________________________________________ > >> > OpenStack-operators mailing list > >> > OpenStack-operators at lists.openstack.org > >> > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > >> > >> _______________________________________________ > >> OpenStack-operators mailing list > >> OpenStack-operators at lists.openstack.org > >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From massimo.sgaravatto at gmail.com  Tue May 29 09:41:46 2018
From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto)
Date: Tue, 29 May 2018 11:41:46 +0200
Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance
Message-ID:

I have a small testbed OpenStack cloud (running Ocata) where I am trying to
debug a problem with Nova scheduling. In short: I see different behaviors
when I create a new VM and when I try to migrate a VM.

Since I want to partition the Cloud so that each project uses only certain
compute nodes, I created one host aggregate per project (see also this
thread:
http://lists.openstack.org/pipermail/openstack-operators/2018-February/014831.html
)

The host-aggregate for my project is:

# nova aggregate-show 52
+----+-----------+-------------------+--------------------------------------------------------------+-----------------------------------------------------------------------------------------------+--------------------------------------+
| Id | Name      | Availability Zone | Hosts                                                        | Metadata                                                                                        | UUID                                 |
+----+-----------+-------------------+--------------------------------------------------------------+-----------------------------------------------------------------------------------------------+--------------------------------------+
| 52 | SgaraPrj1 | nova              | 'compute-01.cloud.pd.infn.it', 'compute-02.cloud.pd.infn.it' | 'availability_zone=nova', 'filter_tenant_id=ee1865a76440481cbcff08544c7d580a', 'size=normal'   | 675f6291-6997-470d-87e1-e9ea199a379f |
+----+-----------+-------------------+--------------------------------------------------------------+-----------------------------------------------------------------------------------------------+--------------------------------------+

The same compute nodes are shared by other projects (for which specific
host aggregates, like this one, have been created). The other compute node
(I have only 3 compute nodes in this small testbed) is targeted at other
projects (for which specific host aggregates exist).

This is what I have in nova.conf wrt scheduling filters:

enabled_filters = AggregateInstanceExtraSpecsFilter,AggregateMultiTenancyIsolation,RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,AggregateRamFilter,AggregateCoreFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter

If I try to create a VM, I see from the scheduler log [*] that
AggregateMultiTenancyIsolation selects only 2 compute nodes, as expected.

But if I then try to migrate the very same VM, it reports that no valid
host was found:

# nova migrate afaf2a2d-7ff8-4e52-a89a-031ee079a9ba
ERROR (BadRequest): No valid host was found.
No valid host found for cold migrate (HTTP 400) (Request-ID: req-45b8afd5-9683-40a6-8416-295563e37e34) And according to the scheduler log the problem is with the AggregateMultiTenancyIsolation which returned 0 hosts (while I would have expected one): 2018-05-29 11:12:56.375 19428 INFO nova.scheduler.host_manager [req-45b8afd5-9683-40a6-8416-295563e37e34 9bd03f63fa9d4beb8de31e6c2f2c8d12 56c3f5c047e74a78a714\ 38c4412e6e13 - - -\ ] Host filter ignoring hosts: compute-02.cloud.pd.infn.it 2018-05-29 11:12:56.375 19428 DEBUG nova.filters [req-45b8afd5-9683-40a6-8416-295563e37e34 9bd03f63fa9d4beb8de31e6c2f2c8d12 56c3f5c047e74a78a71438c4412e6e13 -\ - -] Starting wit\ h 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:70 2018-05-29 11:12:56.376 19428 DEBUG nova.filters [req-45b8afd5-9683-40a6-8416-295563e37e34 9bd03f63fa9d4beb8de31e6c2f2c8d12 56c3f5c047e74a78a71438c4412e6e13 -\ - -] Filter Aggre\ gateInstanceExtraSpecsFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:12:56.377 19428 DEBUG nova.scheduler.filters.aggregate_multitenancy_isolation [req-45b8afd5-9683-40a6-8416-295563e37e34 9bd03f63fa9d4beb8de31e6c\ 2f2c8d12 56c3f5c04\ 7e74a78a71438c4412e6e13 - - -] (compute-01.cloud.pd.infn.it, compute-01.cloud.pd.infn.it) ram: 12797MB disk: 48128MB io_ops: 0 instances: 0 fails tenant id on\ aggregate host_pa\ sses /usr/lib/python2.7/site-packages/nova/scheduler/filters/aggregate_multitenancy_isolation.py:50 2018-05-29 11:12:56.378 19428 DEBUG nova.scheduler.filters.aggregate_multitenancy_isolation [req-45b8afd5-9683-40a6-8416-295563e37e34 9bd03f63fa9d4beb8de31e6c\ 2f2c8d12 56c3f5c04\ 7e74a78a71438c4412e6e13 - - -] (compute-03.cloud.pd.infn.it, compute-03.cloud.pd.infn.it) ram: 8701MB disk: -4096MB io_ops: 0 instances: 0 fails tenant id on \ aggregate host_pas\ ses /usr/lib/python2.7/site-packages/nova/scheduler/filters/aggregate_multitenancy_isolation.py:50 2018-05-29 11:12:56.378 19428 INFO nova.filters [req-45b8afd5-9683-40a6-8416-295563e37e34 9bd03f63fa9d4beb8de31e6c2f2c8d12 56c3f5c047e74a78a71438c4412e6e13 - \ - -] Filter Aggreg\ ateMultiTenancyIsolation returned 0 hosts I am confused ... Any hints ? 
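For context on why the filter can return 0 hosts here: AggregateMultiTenancyIsolation only passes a host that belongs to a tenant-restricted aggregate when the project_id carried by the scheduling request matches one of the aggregate's filter_tenant_id values. A simplified paraphrase of that check (for illustration only, not the exact nova code) looks like this:

def host_passes(host_aggregate_metadata, request_project_id):
    # host_aggregate_metadata: merged metadata of the aggregates the host is
    # in, e.g. {'filter_tenant_id': {'ee1865a76440481cbcff08544c7d580a'}}
    tenant_ids = host_aggregate_metadata.get('filter_tenant_id')
    if not tenant_ids:
        return True   # host is not tenant-restricted
    if request_project_id not in tenant_ids:
        # this is the "fails tenant id on aggregate" line in the logs above
        return False
    return True

The restriction itself is plain aggregate metadata (set for example with nova aggregate-set-metadata 52 filter_tenant_id=<project_id>), so the interesting question becomes which project_id the migration request is actually carrying.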
Thanks, Massimo [*] 2018-05-29 11:09:54.328 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter AggregateInstanceExtraSpecsFilter returned 3 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.330 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter AggregateMultiTenancyIsolation returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.332 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter RetryFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.332 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter AvailabilityZoneFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.333 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter RamFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.334 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter CoreFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.334 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter AggregateRamFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.335 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter AggregateCoreFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.335 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter DiskFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.336 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter ComputeFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.337 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter ComputeCapabilitiesFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.338 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter ImagePropertiesFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.339 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 
ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter ServerGroupAntiAffinityFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 2018-05-29 11:09:54.339 19428 DEBUG nova.filters [req-1a838e77-8042-4550-b157-4943445119a2 ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ - -] Filter ServerGroupAffinityFilter returned 2 host(s) get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 -------------- next part -------------- An HTML attachment was scrubbed... URL: From mihalis68 at gmail.com Tue May 29 11:14:03 2018 From: mihalis68 at gmail.com (Chris Morgan) Date: Tue, 29 May 2018 07:14:03 -0400 Subject: [Openstack-operators] Proposing no Ops Meetups team meeting this week Message-ID: Some of us will be only just returning to work today after being away all week last week for the (successful) OpenStack Summit, therefore I propose we skip having a meeting today but regroup next week? Chris -- Chris Morgan -------------- next part -------------- An HTML attachment was scrubbed... URL: From tomi.juvonen at nokia.com Tue May 29 11:14:12 2018 From: tomi.juvonen at nokia.com (Juvonen, Tomi (Nokia - FI/Espoo)) Date: Tue, 29 May 2018 11:14:12 +0000 Subject: [Openstack-operators] New OpenStack project for rolling maintenance and upgrade in interaction with application on top of it Message-ID: Hi, I am the PTL of the OPNFV Doctor project. I have been working for a couple of years figuring out the infrastructure maintenance in interaction with application on top of it. Looked into Nova, Craton and had several Ops sessions. Past half a year there has been couple of different POCs, the last in March in the ONS [1] [2] In OpenStack Vancouver summit last week it was time to present [3]. In Forum discussion following the presentation it was whether to make this just by utilizing different existing projects, but to make this generic, pluggable, easily adapted and future proof, it now goes down to start what I almost started a couple of years ago; the OpenStack Fenix project [4]. On behalf of OPNFV Doctor I would welcome any last thoughts before starting the project and would also love to see somebody joining to make the Fenix fly. Main use cases to list most of them: * As a cloud admin I want to maintain and upgrade my infrastructure in a rolling fashion. * As a cloud admin I want to have a pluggable workflow to maintain and upgrade my infrastructure, to ensure it can be done with complicated infrastructure components and in interaction with different application payloads on top of it. * As a infrastructure service, I need to know whether infrastructure unavailability is because of planned maintenance. * As a critical application owner, I want to be aware of any planned downtime effecting to my service. * As a critical application owner, I want to have interaction with infrastructure rolling maintenance workflow to have a time window to ensure zero down time for my service and to be able to decide to make admin actions like migration of my instance. * As an application owner, I need to know when admin action like migration is complete. * As an application owner, I want to know about new capabilities coming because of infrastructure maintenance or upgrade, so I can take it also into use by my application. This could be hardware capability or for example OpenStack upgrade. 
* As a critical application that needs to scale by varying load, I need to interactively know about infrastructure resources scaling up and down, so I can scale my application at the same and keeping zero downtime for my service * As a critical application, I want to have retirement of my service done in controlled fashion. [1] Infrastructure Maintenance & Upgrade: Zero VNF Downtime with OPNFV Doctor on OCP Hardware video [2] Infrastructure Maintenance & Upgrade: Zero VNF Downtime with OPNFV Doctor on OCP Hardware slides [3] How to gain VNF zero down-time during Infrastructure Maintenance and Upgrade [4] Fenix project wiki [5] Doctor design guideline draft Best Regards, Tomi Juvonen -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean.mcginnis at gmx.com Tue May 29 11:42:34 2018 From: sean.mcginnis at gmx.com (Sean McGinnis) Date: Tue, 29 May 2018 06:42:34 -0500 Subject: [Openstack-operators] Proposing no Ops Meetups team meeting this week In-Reply-To: References: Message-ID: <1de2dd4e-d41b-5a31-84db-4d9ebbaae3c2@gmx.com> On 05/29/2018 06:14 AM, Chris Morgan wrote: > Some of us will be only just returning to work today after being away > all week last week for the (successful) OpenStack Summit, therefore I > propose we skip having a meeting today but regroup next week? > > Chris Makes sense to me. I know I have a lot of catching up to do. From emccormick at cirrusseven.com Tue May 29 11:53:19 2018 From: emccormick at cirrusseven.com (Erik McCormick) Date: Tue, 29 May 2018 07:53:19 -0400 Subject: [Openstack-operators] Proposing no Ops Meetups team meeting this week In-Reply-To: References: Message-ID: On Tue, May 29, 2018, 7:15 AM Chris Morgan wrote: > Some of us will be only just returning to work today after being away all > week last week for the (successful) OpenStack Summit, therefore I propose > we skip having a meeting today but regroup next week? > +1 > Chris > > -- > Chris Morgan > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators -Erik > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Tue May 29 14:02:58 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 29 May 2018 09:02:58 -0500 Subject: [Openstack-operators] [nova] Need some feedback on the proposed heal_allocations CLI In-Reply-To: References: Message-ID: <97fc2ff4-ef97-4c36-0f89-3ef8d9c874fb@gmail.com> On 5/28/2018 7:31 AM, Sylvain Bauza wrote: > That said, given I'm now working on using Nested Resource Providers for > VGPU inventories, I wonder about a possible upgrade problem with VGPU > allocations. Given that : >  - in Queens, VGPU inventories are for the root RP (ie. the compute > node RP), but, >  - in Rocky, VGPU inventories will be for children RPs (ie. against a > specific VGPU type), then > > if we have VGPU allocations in Queens, when upgrading to Rocky, we > should maybe recreate the allocations to a specific other inventory ? For how the heal_allocations CLI works today, if the instance has any allocations in placement, it skips that instance. So this scenario wouldn't be a problem. > > Hope you see the problem with upgrading by creating nested RPs ? Yes, the CLI doesn't attempt to have any knowledge about nested resource providers, it just takes the flavor embedded in the instance and creates allocations against the compute node provider using the flavor. 
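As a rough illustration of the "allocations from the flavor" behaviour described here (a sketch only, not the actual heal_allocations code; the exact placement payload depends on the microversion used):

def allocations_from_flavor(flavor, compute_node_rp_uuid):
    # Resource amounts implied by the flavor embedded in the instance ...
    resources = {
        'VCPU': flavor['vcpus'],
        'MEMORY_MB': flavor['ram'],
        'DISK_GB': flavor['root_gb'] + flavor['ephemeral_gb'],
    }
    # ... shaped as a placement allocation against the compute node resource
    # provider; the CLI would then PUT something like this to
    # /allocations/<instance_uuid> in the placement API.
    return {'allocations': {compute_node_rp_uuid: {'resources': resources}}}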
It has no explicit knowledge about granular request groups or more advanced features like that. -- Thanks, Matt From openstack at medberry.net Tue May 29 14:27:02 2018 From: openstack at medberry.net (David Medberry) Date: Tue, 29 May 2018 08:27:02 -0600 Subject: [Openstack-operators] Proposing no Ops Meetups team meeting this week In-Reply-To: References: Message-ID: Good plan. I'm just getting on email now and hadn't even considered IRC yet. :^) On Tue, May 29, 2018 at 5:53 AM, Erik McCormick wrote: > > > On Tue, May 29, 2018, 7:15 AM Chris Morgan wrote: > >> Some of us will be only just returning to work today after being away all >> week last week for the (successful) OpenStack Summit, therefore I propose >> we skip having a meeting today but regroup next week? >> > > +1 > > >> Chris >> >> -- >> Chris Morgan >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > > -Erik > >> >> > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaypipes at gmail.com Tue May 29 16:10:00 2018 From: jaypipes at gmail.com (Jay Pipes) Date: Tue, 29 May 2018 12:10:00 -0400 Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance In-Reply-To: References: Message-ID: The hosts you are attempting to migrate *to* do not have the filter_tenant_id property set to the same tenant ID as the compute host 2 that originally hosted the instance. That is why you see this in the scheduler logs when evaluating the fitness of compute host 1 and compute host 3: "fails tenant id" Best, -jay On 05/29/2018 05:41 AM, Massimo Sgaravatto wrote: > I have a small testbed OpenStack cloud (running Ocata) where I am trying > to debug a problem with Nova scheduling. 
> > > In short: I see different behaviors when I create a new VM and when I > try to migrate a VM > > > Since I want to partition the Cloud so that each project uses only > certain compute nodes, I created one host aggregate per project (see > also this thread: > http://lists.openstack.org/pipermail/openstack-operators/2018-February/014831.html) > > > The host-aggregate for my project is: > > # nova  aggregate-show 52 > +----+-----------+-------------------+--------------------------------------------------------------+----------------------------------------------------------------------------------------------+--------------------------------------+ > | Id | Name      | Availability Zone | Hosts >                             | Metadata >                                                    | UUID >                    | > +----+-----------+-------------------+--------------------------------------------------------------+----------------------------------------------------------------------------------------------+--------------------------------------+ > | 52 | SgaraPrj1 | nova              | 'compute-01.cloud.pd.infn.it > ', 'compute-02.cloud.pd.infn.it > ' | 'availability_zone=nova', > 'filter_tenant_id=ee1865a76440481cbcff08544c7d580a', 'size=normal' | > 675f6291-6997-470d-87e1-e9ea199a379f | > +----+-----------+-------------------+--------------------------------------------------------------+----------------------------------------------------------------------------------------------+--------------------------------------+ > > The same compute nodes are shared by other projects  (for which specific > host-aggregates, as this one, have been created) > The other compute node (I have only 3 compute nodes in this small > testbed) is targeted to other projects (for which specific > host-aggregates exist) > > > This is what I have in nova.conf wrt scheduling filters: > > enabled_filters = > AggregateInstanceExtraSpecsFilter,AggregateMultiTenancyIsolation,RetryFilter,AvailabilityZoneFilter,RamFilter,CoreFilter,AggregateRamFilter,AggregateCo > reFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter > > > > If I try to create a VM, I see from the scheduler log [*] that > the AggregateMultiTenancyIsolation selects only 2 compute nodes, as > expected. > > > But if I then try to migrate the very same VM, it reports that no valid > host was found: > > # nova migrate afaf2a2d-7ff8-4e52-a89a-031ee079a9ba > ERROR (BadRequest): No valid host was found. 
No valid host found for > cold migrate (HTTP 400) (Request-ID: > req-45b8afd5-9683-40a6-8416-295563e37e34) > > > And according to the scheduler log the problem is with the > AggregateMultiTenancyIsolation which returned 0 hosts (while I would > have expected one): > > 2018-05-29 11:12:56.375 19428 INFO nova.scheduler.host_manager > [req-45b8afd5-9683-40a6-8416-295563e37e34 > 9bd03f63fa9d4beb8de31e6c2f2c8d12 56c3f5c047e74a78a714\ > 38c4412e6e13 - - -\ > ] Host filter ignoring hosts: compute-02.cloud.pd.infn.it > > 2018-05-29 11:12:56.375 19428 DEBUG nova.filters > [req-45b8afd5-9683-40a6-8416-295563e37e34 > 9bd03f63fa9d4beb8de31e6c2f2c8d12 56c3f5c047e74a78a71438c4412e6e13 -\ >  - -] Starting wit\ > h 2 host(s) get_filtered_objects > /usr/lib/python2.7/site-packages/nova/filters.py:70 > 2018-05-29 11:12:56.376 19428 DEBUG nova.filters > [req-45b8afd5-9683-40a6-8416-295563e37e34 > 9bd03f63fa9d4beb8de31e6c2f2c8d12 56c3f5c047e74a78a71438c4412e6e13 -\ >  - -] Filter Aggre\ > gateInstanceExtraSpecsFilter returned 2 host(s) get_filtered_objects > /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:12:56.377 19428 DEBUG > nova.scheduler.filters.aggregate_multitenancy_isolation > [req-45b8afd5-9683-40a6-8416-295563e37e34 9bd03f63fa9d4beb8de31e6c\ > 2f2c8d12 56c3f5c04\ > 7e74a78a71438c4412e6e13 - - -] (compute-01.cloud.pd.infn.it > , compute-01.cloud.pd.infn.it > ) ram: 12797MB disk: 48128MB io_ops: > 0 instances: 0 fails tenant id on\ >  aggregate host_pa\ > sses > /usr/lib/python2.7/site-packages/nova/scheduler/filters/aggregate_multitenancy_isolation.py:50 > 2018-05-29 11:12:56.378 19428 DEBUG > nova.scheduler.filters.aggregate_multitenancy_isolation > [req-45b8afd5-9683-40a6-8416-295563e37e34 9bd03f63fa9d4beb8de31e6c\ > 2f2c8d12 56c3f5c04\ > 7e74a78a71438c4412e6e13 - - -] (compute-03.cloud.pd.infn.it > , compute-03.cloud.pd.infn.it > ) ram: 8701MB disk: -4096MB io_ops: > 0 instances: 0 fails tenant id on \ > aggregate host_pas\ > ses > /usr/lib/python2.7/site-packages/nova/scheduler/filters/aggregate_multitenancy_isolation.py:50 > 2018-05-29 11:12:56.378 19428 INFO nova.filters > [req-45b8afd5-9683-40a6-8416-295563e37e34 > 9bd03f63fa9d4beb8de31e6c2f2c8d12 56c3f5c047e74a78a71438c4412e6e13 - \ > - -] Filter Aggreg\ > ateMultiTenancyIsolation returned 0 hosts > > > > I am confused ... > Any hints ? 
> > Thanks, Massimo > > [*] > > > 2018-05-29 11:09:54.328 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter AggregateInstanceExtraSpecsFilter returned 3 host(s) > get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.330 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter AggregateMultiTenancyIsolation returned 2 host(s) > get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.332 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter RetryFilter returned 2 host(s) get_filtered_objects > /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.332 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter AvailabilityZoneFilter returned 2 host(s) > get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.333 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter RamFilter returned 2 host(s) get_filtered_objects > /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.334 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter CoreFilter returned 2 host(s) get_filtered_objects > /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.334 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter AggregateRamFilter returned 2 host(s) get_filtered_objects > /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.335 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter AggregateCoreFilter returned 2 host(s) > get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.335 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter DiskFilter returned 2 host(s) get_filtered_objects > /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.336 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter ComputeFilter returned 2 host(s) get_filtered_objects > /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.337 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter ComputeCapabilitiesFilter returned 2 host(s) > get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.338 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter ImagePropertiesFilter returned 2 host(s) > get_filtered_objects 
/usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.339 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter ServerGroupAntiAffinityFilter returned 2 host(s) > get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 > 2018-05-29 11:09:54.339 19428 DEBUG nova.filters > [req-1a838e77-8042-4550-b157-4943445119a2 > ab573ba3ea014b778193b6922ffffe6d ee1865a76440481cbcff08544c7d580a -\ >  - -] Filter ServerGroupAffinityFilter returned 2 host(s) > get_filtered_objects /usr/lib/python2.7/site-packages/nova/filters.py:104 > > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > From mriedemos at gmail.com Tue May 29 17:06:16 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 29 May 2018 12:06:16 -0500 Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance In-Reply-To: References: Message-ID: On 5/29/2018 11:10 AM, Jay Pipes wrote: > The hosts you are attempting to migrate *to* do not have the > filter_tenant_id property set to the same tenant ID as the compute host > 2 that originally hosted the instance. > > That is why you see this in the scheduler logs when evaluating the > fitness of compute host 1 and compute host 3: > > "fails tenant id" > > Best, > -jay Hmm, I'm not sure about that. This is the aggregate right? # nova aggregate-show 52 +----+-----------+-------------------+--------------------------------------------------------------+----------------------------------------------------------------------------------------------+--------------------------------------+ | Id | Name | Availability Zone | Hosts | Metadata | UUID | +----+-----------+-------------------+--------------------------------------------------------------+----------------------------------------------------------------------------------------------+--------------------------------------+ | 52 | SgaraPrj1 | nova | 'compute-01.cloud.pd.infn.it ', 'compute-02.cloud.pd.infn.it ' | 'availability_zone=nova', 'filter_tenant_id=ee1865a76440481cbcff08544c7d580a', 'size=normal' | 675f6291-6997-470d-87e1-e9ea199a379f | +----+-----------+-------------------+--------------------------------------------------------------+----------------------------------------------------------------------------------------------+--------------------------------------+ So compute-01 and compute-02 are in that aggregate for the same tenant ee1865a76440481cbcff08544c7d580a. From the logs, it skips compute-02 since the instance is already on that host. > 2018-05-29 11:12:56.375 19428 INFO nova.scheduler.host_manager [req-45b8afd5-9683-40a6-8416-295563e37e34 9bd03f63fa9d4beb8de31e6c2f2c8d12 56c3f5c047e74a78a714\ 38c4412e6e13 - - -\ ] Host filter ignoring hosts: compute-02.cloud.pd.infn.it So it processes compute-01 and compute-03. It should accept compute-01 since it's in the same tenant-specific aggregate and reject compute-03. But the filter rejects both hosts. It would be useful to know what the tenant_id is when comparing against the aggregate metadata: https://github.com/openstack/nova/blob/stable/ocata/nova/scheduler/filters/aggregate_multitenancy_isolation.py#L50 I'm wondering if the RequestSpec.project_id is null? 
Like, I wonder if you're hitting this bug: https://bugs.launchpad.net/nova/+bug/1739318 Although if this is a clean Ocata environment with new instances, you shouldn't have that problem. -- Thanks, Matt From doug at doughellmann.com Tue May 29 17:31:21 2018 From: doug at doughellmann.com (Doug Hellmann) Date: Tue, 29 May 2018 13:31:21 -0400 Subject: [Openstack-operators] Ops Community Documentation - first anchor point In-Reply-To: <20180528160341.be386cd2a4562d2981f470fc@redhat.com> References: <30d4f1a3668445a11fd34b271bc37e94@arcor.de> <20180524141929.2vylwguebcgkjxa3@csail.mit.edu> <20180528160341.be386cd2a4562d2981f470fc@redhat.com> Message-ID: <1527614644-sup-279@lrrr.local> Excerpts from Petr Kovar's message of 2018-05-28 16:03:41 +0200: > On Thu, 24 May 2018 07:19:29 -0700 > "Jonathan D. Proulx" wrote: > > > My intention based on current understandign would be to create a git > > repo called "osops-docs" as this fits current naming an thin initial > > document we intend to put there and the others we may adopt from > > docs-team. > > So, just to clarify, the current plan is for your group to take ownership > of the following docs? > > https://github.com/openstack/openstack-manuals/tree/a1f1748478125ccd68d90a98ccc06c7ec359d3a0/doc/ops-guide > https://github.com/openstack/openstack-manuals/tree/master/doc/arch-design > https://github.com/openstack/openstack-manuals/tree/master/doc/ha-guide Hmm, no, that's not what I thought we agreed to in the room. During the Pike cycle the Docs team indicated that it could no longer maintain the Operators Guide. That guide has *already* been handed off to new owners. They are changing from hosting it in the wiki to using a git repository. As part of that discussion, we talked about team ownership, and they indicated that they still wanted to be independent of the Documentation team. Those other repositories did come up, but without clear contributors I encouraged them to wait until they have the Operators Guide online before they try to take on any more work. At that point we can have the conversation about ownership. > > Note that there is also > https://github.com/openstack/openstack-manuals/tree/master/doc/ha-guide-draft > which you probably want to merge with the ha-guide going forward (or > retire one or the other). > > As for naming the repo, this is really up to you, but it should be > something clear and easily recognizable by your audience. > > I can help with moving some of the content around, but as Doug pointed out, > a few points about actual publishing need to be clarified first with the > infra team. The current plan is to create a SIG to own the repo so the owners can publish the results to docs.openstack.org somewhere. The exact URL is yet to be determined. > > > My understanding being they don't to have this type of > > documentention due to much reduced team size and prefer it live with > > subject matter experts. It that correct? If that's not correct I'm > > not personally opposed to trying this under docs. We'll need to > > maintain enough contributors and reviewers to make the work flow go in > > either location and that's my understanding of the basic issue not > > where it lives. > > If you want more reviewers involved, I'd recommended inviting the reviewers > from the docs group. Yes, it would be good to have reviews from the existing documentation team, especially any of them familiar with the content already and have the time to help. 
> > > This naming would also match other repos wich could be consolidated into an > > "osops" repo to rule them all. That may make sense as I think there's > > significant overlap in set of people who might contribute, but that > > can be a parallel conversation. > > > > Doug looking at new project docs I think most of it is clear enough to > > me. Since it's not code I can skip all th PyPi stuff yes? The repo > > creation seems pretty clear and I can steal the CI stuff from similar > > projects. > > Might be best to look into how https://github.com/openstack/security-doc is > configured as that repo contains a number of separate documents, all managed > by one group. That may be a good example. I still think we want 1 guide per repository, because it makes publishing much simpler. > > > I'm a little unclear on the Storyboard bit I've not done > > much contribution lately and haven't storyboarded. Is that relevant > > (or at least relevent at first) for this use case? If it is I > > probably have more questions. > > I'd suggest either having your own storyboard or launchpad project so that > users can file bugs somewhere, and give you feedback. storyboard might be a > better option since all OpenStack projects all likely to migrate to it from > launchpad at some point or another. Yes, please use storyboard for anything new. Doug > > Cheers, > pk > From jaypipes at gmail.com Tue May 29 17:44:51 2018 From: jaypipes at gmail.com (Jay Pipes) Date: Tue, 29 May 2018 13:44:51 -0400 Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance In-Reply-To: References: Message-ID: <83e68473-1c74-2a3c-2b94-2e6c8967ec90@gmail.com> On 05/29/2018 01:06 PM, Matt Riedemann wrote: > I'm wondering if the RequestSpec.project_id is null? Like, I wonder if > you're hitting this bug: > > https://bugs.launchpad.net/nova/+bug/1739318 > > Although if this is a clean Ocata environment with new instances, you > shouldn't have that problem. Looks very much like that bug, yes. Either that, or the wrong project_id is being used when attempting to migrate? Maybe the admin project_id is being used instead of the original project_id who launched the instance? There is only one way that "fails tenant ID" shows up in the logs, and it's when the project ID making the request isn't in the configured projects for the aggregate... https://github.com/openstack/nova/blob/master/nova/scheduler/filters/aggregate_multitenancy_isolation.py#L49-L51 Best, -jay From mriedemos at gmail.com Tue May 29 19:47:30 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 29 May 2018 14:47:30 -0500 Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance In-Reply-To: <83e68473-1c74-2a3c-2b94-2e6c8967ec90@gmail.com> References: <83e68473-1c74-2a3c-2b94-2e6c8967ec90@gmail.com> Message-ID: <416c49dc-1dfb-2432-ad76-be9ce71ee352@gmail.com> On 5/29/2018 12:44 PM, Jay Pipes wrote: > Either that, or the wrong project_id is being used when attempting to > migrate? Maybe the admin project_id is being used instead of the > original project_id who launched the instance? Could be, but we should be pulling the request spec from the database which was created when the instance was created. There is some shim code from Newton which will create an essentially fake request spec on-demand when doing a move operation if the instance was created before newton, which could go back to that bug I was referring to. 
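To make the debugging suggestion in this thread concrete: it amounts to a one-line local change to the Ocata filter linked above so that the failing comparison also logs the project id it is testing. Roughly (variable names are approximations of that file; treat this as a temporary diagnostic patch, not upstream code):

# inside AggregateMultiTenancyIsolation.host_passes(), where the filter today
# only logs the host, also log the tenant/project id being compared:
if tenant_id not in configured_tenant_ids:
    LOG.debug("%(host)s fails tenant id %(tenant)s on aggregate",
              {'host': host_state, 'tenant': tenant_id})
    return False

If that extra field shows the admin project rather than the instance owner's project, the filter is doing its job and the problem sits in how the request spec was built, which is where this thread ends up.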
Massimo - can you clarify if this is a new server created in your Ocata test environment that you're trying to move? Or is this a server created before Ocata? -- Thanks, Matt From massimo.sgaravatto at gmail.com Tue May 29 20:07:39 2018 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Tue, 29 May 2018 22:07:39 +0200 Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance In-Reply-To: <416c49dc-1dfb-2432-ad76-be9ce71ee352@gmail.com> References: <83e68473-1c74-2a3c-2b94-2e6c8967ec90@gmail.com> <416c49dc-1dfb-2432-ad76-be9ce71ee352@gmail.com> Message-ID: The VM that I am trying to migrate was created when the Cloud was already running Ocata Cheers, Massimo On Tue, May 29, 2018 at 9:47 PM, Matt Riedemann wrote: > On 5/29/2018 12:44 PM, Jay Pipes wrote: > >> Either that, or the wrong project_id is being used when attempting to >> migrate? Maybe the admin project_id is being used instead of the original >> project_id who launched the instance? >> > > Could be, but we should be pulling the request spec from the database > which was created when the instance was created. There is some shim code > from Newton which will create an essentially fake request spec on-demand > when doing a move operation if the instance was created before newton, > which could go back to that bug I was referring to. > > Massimo - can you clarify if this is a new server created in your Ocata > test environment that you're trying to move? Or is this a server created > before Ocata? > > -- > > Thanks, > > Matt > > > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Tue May 29 23:01:24 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Tue, 29 May 2018 18:01:24 -0500 Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance In-Reply-To: References: <83e68473-1c74-2a3c-2b94-2e6c8967ec90@gmail.com> <416c49dc-1dfb-2432-ad76-be9ce71ee352@gmail.com> Message-ID: <4f3531e3-e3bc-c5a6-5a5c-9621dd799127@gmail.com> On 5/29/2018 3:07 PM, Massimo Sgaravatto wrote: > The VM that I am trying to migrate was created when the Cloud was > already running Ocata OK, I'd added the tenant_id variable in scope to the log message here: https://github.com/openstack/nova/blob/stable/ocata/nova/scheduler/filters/aggregate_multitenancy_isolation.py#L50 And make sure when it fails, it matches what you'd expect. If it's None or '' or something weird then we have a bug. -- Thanks, Matt From bitskrieg at bitskrieg.net Wed May 30 01:23:50 2018 From: bitskrieg at bitskrieg.net (Chris Apsey) Date: Tue, 29 May 2018 21:23:50 -0400 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> <43cd8579c761a13dcc81e6ffc9a69089fb421cda.camel@emag.ro> <6ec89b3767f26b73451fa490d18ed7756266d3ec.camel@emag.ro> Message-ID: <163aea4dc70.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net> I want to echo the effectiveness of this change - we had vif failures when launching more than 50 or so cirros instances simultaneously, but moving to daemon mode made this issue disappear and we've tested 5x that amount. 
This has been the single biggest scalability improvement to date. This option should be the default in the official docs. On May 24, 2018 05:55:49 Saverio Proto wrote: > Glad to hear it! > Always monitor rabbitmq queues to identify bottlenecks !! :) > > Cheers > > Saverio > > > Il gio 24 mag 2018, 11:07 Radu Popescu | eMAG, Technology > ha scritto: > Hi, > > did the change yesterday. Had no issue this morning with neutron not being > able to move fast enough. Still, we had some storage issues, but that's > another thing. > Anyway, I'll leave it like this for the next few days and report back in > case I get the same slow neutron errors. > > Thanks a lot! > Radu > > On Wed, 2018-05-23 at 10:08 +0000, Radu Popescu | eMAG, Technology wrote: >> Hi, >> >> actually, I didn't know about that option. I'll enable it right now. >> Testing is done every morning at about 4:00AM ..so I'll know tomorrow >> morning if it changed anything. >> >> Thanks, >> Radu >> >> On Tue, 2018-05-22 at 15:30 +0200, Saverio Proto wrote: >>> Sorry email went out incomplete. >>> Read this: >>> https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ >>> >>> make sure that Openstack rootwrap configured to work in daemon mode >>> >>> Thank you >>> >>> Saverio >>> >>> >>> 2018-05-22 15:29 GMT+02:00 Saverio Proto : >>>> >>> >>> Hello Radu, >>> >>> do you have the Openstack rootwrap configured to work in daemon mode ? >>> >>> please read this article: >>> >>> 2018-05-18 10:21 GMT+02:00 Radu Popescu | eMAG, Technology >>> : >>>> >>> >>> Hi, >>> >>> so, nova says the VM is ACTIVE and actually boots with no network. We are >>> setting some metadata that we use later on and have cloud-init for different >>> tasks. >>> So, VM is up, OS is running, but network is working after a random amount of >>> time, that can get to around 45 minutes. Thing is, is not happening to all >>> VMs in that test (around 300), but it's happening to a fair amount - around >>> 25%. >>> >>> I can see the callback coming few seconds after neutron openvswitch agent >>> says it's completed the setup. My question is, why is it taking so long for >>> nova openvswitch agent to configure the port? I can see the port up in both >>> host OS and openvswitch. I would assume it's doing the whole namespace and >>> iptables setup. But still, 30 minutes? Seems a lot! >>> >>> Thanks, >>> Radu >>> >>> On Thu, 2018-05-17 at 11:50 -0400, George Mihaiescu wrote: >>> >>> We have other scheduled tests that perform end-to-end (assign floating IP, >>> ssh, ping outside) and never had an issue. >>> I think we turned it off because the callback code was initially buggy and >>> nova would wait forever while things were in fact ok, but I'll change >>> "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run >>> another large test, just to confirm. >>> >>> We usually run these large tests after a version upgrade to test the APIs >>> under load. >>> >>> >>> >>> On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann >>> wrote: >>> >>> On 5/17/2018 9:46 AM, George Mihaiescu wrote: >>> >>> and large rally tests of 500 instances complete with no issues. >>> >>> >>> Sure, except you can't ssh into the guests. >>> >>> The whole reason the vif plugging is fatal and timeout and callback code was >>> because the upstream CI was unstable without it. The server would report as >>> ACTIVE but the ports weren't wired up so ssh would fail. Having an ACTIVE >>> guest that you can't actually do anything with is kind of pointless. 
>>> >>> _______________________________________________ >>> >>> OpenStack-operators mailing list >>> >>> OpenStack-operators at lists.openstack.org >>> >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >>> >>> >>> >>> _______________________________________________ >>> OpenStack-operators mailing list >>> OpenStack-operators at lists.openstack.org >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators >> _______________________________________________ >> OpenStack-operators mailing list >> OpenStack-operators at lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators > _______________________________________________ > OpenStack-operators mailing list > OpenStack-operators at lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators -------------- next part -------------- An HTML attachment was scrubbed... URL: From radu.popescu at emag.ro Wed May 30 08:02:30 2018 From: radu.popescu at emag.ro (Radu Popescu | eMAG, Technology) Date: Wed, 30 May 2018 08:02:30 +0000 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: <163aea4dc70.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net> References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> <43cd8579c761a13dcc81e6ffc9a69089fb421cda.camel@emag.ro> <6ec89b3767f26b73451fa490d18ed7756266d3ec.camel@emag.ro> <163aea4dc70.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net> Message-ID: <46be77c4a7b362245a729f39af058cc93ee8d769.camel@emag.ro> Hi, just to let you know. Problem is now gone. Instances boot up with working network interface. Thanks a lot, Radu On Tue, 2018-05-29 at 21:23 -0400, Chris Apsey wrote: I want to echo the effectiveness of this change - we had vif failures when launching more than 50 or so cirros instances simultaneously, but moving to daemon mode made this issue disappear and we've tested 5x that amount. This has been the single biggest scalability improvement to date. This option should be the default in the official docs. On May 24, 2018 05:55:49 Saverio Proto wrote: Glad to hear it! Always monitor rabbitmq queues to identify bottlenecks !! :) Cheers Saverio Il gio 24 mag 2018, 11:07 Radu Popescu | eMAG, Technology > ha scritto: Hi, did the change yesterday. Had no issue this morning with neutron not being able to move fast enough. Still, we had some storage issues, but that's another thing. Anyway, I'll leave it like this for the next few days and report back in case I get the same slow neutron errors. Thanks a lot! Radu On Wed, 2018-05-23 at 10:08 +0000, Radu Popescu | eMAG, Technology wrote: Hi, actually, I didn't know about that option. I'll enable it right now. Testing is done every morning at about 4:00AM ..so I'll know tomorrow morning if it changed anything. Thanks, Radu On Tue, 2018-05-22 at 15:30 +0200, Saverio Proto wrote: Sorry email went out incomplete. Read this: https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/ make sure that Openstack rootwrap configured to work in daemon mode Thank you Saverio 2018-05-22 15:29 GMT+02:00 Saverio Proto >: Hello Radu, do you have the Openstack rootwrap configured to work in daemon mode ? please read this article: 2018-05-18 10:21 GMT+02:00 Radu Popescu | eMAG, Technology >: Hi, so, nova says the VM is ACTIVE and actually boots with no network. 
We are setting some metadata that we use later on and have cloud-init for different tasks. So, VM is up, OS is running, but network is working after a random amount of time, that can get to around 45 minutes. Thing is, is not happening to all VMs in that test (around 300), but it's happening to a fair amount - around 25%. I can see the callback coming few seconds after neutron openvswitch agent says it's completed the setup. My question is, why is it taking so long for nova openvswitch agent to configure the port? I can see the port up in both host OS and openvswitch. I would assume it's doing the whole namespace and iptables setup. But still, 30 minutes? Seems a lot! Thanks, Radu On Thu, 2018-05-17 at 11:50 -0400, George Mihaiescu wrote: We have other scheduled tests that perform end-to-end (assign floating IP, ssh, ping outside) and never had an issue. I think we turned it off because the callback code was initially buggy and nova would wait forever while things were in fact ok, but I'll change "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run another large test, just to confirm. We usually run these large tests after a version upgrade to test the APIs under load. On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann > wrote: On 5/17/2018 9:46 AM, George Mihaiescu wrote: and large rally tests of 500 instances complete with no issues. Sure, except you can't ssh into the guests. The whole reason the vif plugging is fatal and timeout and callback code was because the upstream CI was unstable without it. The server would report as ACTIVE but the ports weren't wired up so ssh would fail. Having an ACTIVE guest that you can't actually do anything with is kind of pointless. _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators _______________________________________________ OpenStack-operators mailing list OpenStack-operators at lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators -------------- next part -------------- An HTML attachment was scrubbed... URL: From massimo.sgaravatto at gmail.com Wed May 30 10:21:32 2018 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Wed, 30 May 2018 12:21:32 +0200 Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance In-Reply-To: <4f3531e3-e3bc-c5a6-5a5c-9621dd799127@gmail.com> References: <83e68473-1c74-2a3c-2b94-2e6c8967ec90@gmail.com> <416c49dc-1dfb-2432-ad76-be9ce71ee352@gmail.com> <4f3531e3-e3bc-c5a6-5a5c-9621dd799127@gmail.com> Message-ID: The problem is indeed with the tenant_id When I create a VM, tenant_id is ee1865a76440481cbcff08544c7d580a (SgaraPrj1), as expected But when, as admin, I run the "nova migrate" command to migrate the very same instance, the tenant_id is 56c3f5c047e74a78a71438c4412e6e13 (admin) ! 
Cheers, Massimo On Wed, May 30, 2018 at 1:01 AM, Matt Riedemann wrote: > On 5/29/2018 3:07 PM, Massimo Sgaravatto wrote: > >> The VM that I am trying to migrate was created when the Cloud was already >> running Ocata >> > > OK, I'd added the tenant_id variable in scope to the log message here: > > https://github.com/openstack/nova/blob/stable/ocata/nova/sch > eduler/filters/aggregate_multitenancy_isolation.py#L50 > > And make sure when it fails, it matches what you'd expect. If it's None or > '' or something weird then we have a bug. > > -- > > Thanks, > > Matt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mriedemos at gmail.com Wed May 30 14:30:51 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 30 May 2018 09:30:51 -0500 Subject: [Openstack-operators] attaching network cards to VMs taking a very long time In-Reply-To: <163aea4dc70.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net> References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> <43cd8579c761a13dcc81e6ffc9a69089fb421cda.camel@emag.ro> <6ec89b3767f26b73451fa490d18ed7756266d3ec.camel@emag.ro> <163aea4dc70.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net> Message-ID: On 5/29/2018 8:23 PM, Chris Apsey wrote: > I want to echo the effectiveness of this change - we had vif failures > when launching more than 50 or so cirros instances simultaneously, but > moving to daemon mode made this issue disappear and we've tested 5x that > amount.  This has been the single biggest scalability improvement to > date.  This option should be the default in the official docs. This is really good feedback. I'm not sure if there is any kind of centralized performance/scale-related documentation, does the LCOO team [1] have something that's current? There are also the performance docs [2] but that looks pretty stale. We could add a note to the neutron rootwrap configuration option such that if you're running into timeout issues you could consider running that in daemon mode, but it's probably not very discoverable. In fact, I couldn't find anything about it in the neutron docs, I only found this [3] because I know it's defined in oslo.rootwrap (I don't expect everyone to know where this is defined). I found root_helper_daemon in the neutron docs [4] but it doesn't mention anything about performance or related options, and it just makes it sound like it matters for xenserver, which I'd gloss over if I were using libvirt. The root_helper_daemon config option help in neutron should probably refer to the neutron-rootwrap-daemon which is in the setup.cfg [5]. For better discoverability of this, probably the best place to mention it is in the nova vif_plugging_timeout configuration option, since I expect that's the first place operators will be looking when they start hitting timeouts during vif plugging at scale. I can start pushing some docs patches and report back here for review help. 
[1] https://wiki.openstack.org/wiki/LCOO [2] https://docs.openstack.org/developer/performance-docs/ [3] https://docs.openstack.org/oslo.rootwrap/latest/user/usage.html#daemon-mode [4] https://docs.openstack.org/neutron/latest/configuration/neutron.html#agent.root_helper_daemon [5] https://github.com/openstack/neutron/blob/f486f0/setup.cfg#L54 -- Thanks, Matt From mriedemos at gmail.com Wed May 30 14:41:32 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 30 May 2018 09:41:32 -0500 Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance In-Reply-To: References: <83e68473-1c74-2a3c-2b94-2e6c8967ec90@gmail.com> <416c49dc-1dfb-2432-ad76-be9ce71ee352@gmail.com> <4f3531e3-e3bc-c5a6-5a5c-9621dd799127@gmail.com> Message-ID: On 5/30/2018 5:21 AM, Massimo Sgaravatto wrote: > The problem is indeed with the tenant_id > > When I create a VM, tenant_id is ee1865a76440481cbcff08544c7d580a > (SgaraPrj1), as expected > > But when, as admin, I run the "nova migrate" command to migrate the very > same instance, the tenant_id is 56c3f5c047e74a78a71438c4412e6e13 (admin) ! OK that's good information. Tracing the code for cold migrate in ocata, we get the request spec that was created when the instance was created here: https://github.com/openstack/nova/blob/stable/ocata/nova/compute/api.py#L3339 As I mentioned earlier, if it was cold migrating an instance created before Newton and the online data migration wasn't run on it, we'd create a temporary request spec here: https://github.com/openstack/nova/blob/stable/ocata/nova/conductor/manager.py#L263 But that shouldn't be the case in your scenario. Right before we call the scheduler, for some reason, we completely ignore the request spec retrieved in the API, and re-create it from local scope variables in conductor: https://github.com/openstack/nova/blob/stable/ocata/nova/conductor/tasks/migrate.py#L50 And *that* is precisely where this breaks down and takes the project_id from the current context (admin) rather than the instance: https://github.com/openstack/nova/blob/stable/ocata/nova/objects/request_spec.py#L407 Thanks for your patience in debugging this Massimo! I'll get a bug reported and patch posted to fix it. -- Thanks, Matt From mriedemos at gmail.com Wed May 30 18:06:21 2018 From: mriedemos at gmail.com (Matt Riedemann) Date: Wed, 30 May 2018 13:06:21 -0500 Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance In-Reply-To: References: <83e68473-1c74-2a3c-2b94-2e6c8967ec90@gmail.com> <416c49dc-1dfb-2432-ad76-be9ce71ee352@gmail.com> <4f3531e3-e3bc-c5a6-5a5c-9621dd799127@gmail.com> Message-ID: On 5/30/2018 9:41 AM, Matt Riedemann wrote: > Thanks for your patience in debugging this Massimo! I'll get a bug > reported and patch posted to fix it. I'm tracking the problem with this bug: https://bugs.launchpad.net/nova/+bug/1774205 I found that this has actually been fixed since Pike: https://review.openstack.org/#/c/449640/ But I've got a patch up for another related issue, and a functional test to avoid regressions which I can also use when backporting the fix to stable/ocata. 
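To summarise the root cause for anyone hitting the same symptom on Ocata: during a cold migration the conductor rebuilt the request spec from the context of the user running the migration, so the spec's project_id became the admin's project instead of the instance owner's, and tenant-restricted aggregates stopped matching. Schematically (simplified pseudo-code; build_request_spec is a placeholder name, not the real nova call):

# what effectively happened during "nova migrate" run as admin (Ocata):
request_spec = build_request_spec(project_id=context.project_id)
# context.project_id is the admin project -> "fails tenant id on aggregate"

# what the fix referenced above effectively does instead:
request_spec = build_request_spec(project_id=instance.project_id)
# the filter now compares the project that owns the instance, so the
# SgaraPrj1 aggregate matches again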
-- Thanks, Matt From massimo.sgaravatto at gmail.com Thu May 31 06:34:42 2018 From: massimo.sgaravatto at gmail.com (Massimo Sgaravatto) Date: Thu, 31 May 2018 08:34:42 +0200 Subject: [Openstack-operators] Problems with AggregateMultiTenancyIsolation while migrating an instance In-Reply-To: References: <83e68473-1c74-2a3c-2b94-2e6c8967ec90@gmail.com> <416c49dc-1dfb-2432-ad76-be9ce71ee352@gmail.com> <4f3531e3-e3bc-c5a6-5a5c-9621dd799127@gmail.com> Message-ID: Thanks a lot !! On Wed, May 30, 2018 at 8:06 PM, Matt Riedemann wrote: > On 5/30/2018 9:41 AM, Matt Riedemann wrote: > >> Thanks for your patience in debugging this Massimo! I'll get a bug >> reported and patch posted to fix it. >> > > I'm tracking the problem with this bug: > > https://bugs.launchpad.net/nova/+bug/1774205 > > I found that this has actually been fixed since Pike: > > https://review.openstack.org/#/c/449640/ > > But I've got a patch up for another related issue, and a functional test > to avoid regressions which I can also use when backporting the fix to > stable/ocata. > > -- > > Thanks, > > Matt > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrhillsman at gmail.com Thu May 31 18:03:20 2018 From: mrhillsman at gmail.com (Melvin Hillsman) Date: Thu, 31 May 2018 13:03:20 -0500 Subject: [Openstack-operators] OpenLab Cross-community Impact Message-ID: Hi everyone, I know we have sent out quite a bit of information over the past few days with the OpenStack Summit and other updates recently. Additionally there are plenty of meetings we all attend. I just want to take time to point to something very significant in my opinion and again give big thanks to Chris, Dims, Liusheng, Chenrui, Zhuli, Joe (gophercloud), and anyone else contributing to OpenLab. A member of the release team working on the testing infrastructure for Kubernetes did a shoutout to the team for the following: (AishSundar) Shoutout to @dims and OpenStack team for quickly getting their 1.11 Conformance results piped to CI runs and contributing results to Conformance dashboard ! https://k8s-testgrid.appspot.com/sig-release-1.11-all#Conformance%20-%20OpenStack&show-stale-tests= Here is why this is significant and those working on this who I previously mentioned should get recognition: (hogepodge) OpenStack and GCE are the first two clouds that will release block on conformance testing failures. Thanks @dims for building out the test pipeline and @mrhillsman for leading the OpenLab efforts that are reporting back to the test grid. @RuiChen for his contributions to the testing effort. Amazing work for the last six months. In other words, if the external cloud provider ci conformance tests we do in OpenLab are not passing, it will be one of the signals used for blocking the release. OpenStack and GCE are the first two clouds to achieve this and it is a significant accomplishment for the OpenLab team and the OpenStack community overall regarding our relationship with the Kubernetes community. Thanks again Chris, Dims, Joe, Liusheng, Chenrui, and Zhuli for the work you have done and continue to do in this space. Personally I hope we take a moment to really consider this milestone and work to ensure OpenLab's continued success as we embark on working on other integrations. We started OpenLab hoping we could make substantial impact specifically for the ecosystem that builds on top of OpenStack and this is evidence we can and should do more. 
--
Kind regards,

Melvin Hillsman
mrhillsman at gmail.com
mobile: (832) 264-2646
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From melwittt at gmail.com Thu May 31 18:35:43 2018
From: melwittt at gmail.com (melanie witt)
Date: Thu, 31 May 2018 11:35:43 -0700
Subject: [Openstack-operators] [nova] proposal to postpone nova-network core functionality removal to Stein
Message-ID: <29873b6f-8a3c-ae6e-0756-c90d2c52a306@gmail.com>

Hello Operators and Devs,

This cycle at the PTG, we had decided to start making some progress
toward removing nova-network [1] (thanks to those who have helped!)
and so far, we've landed some patches to extract common network
utilities from nova-network core functionality into separate utility
modules. And we've started proposing removal of nova-network REST APIs
[2].

At the cells v2 sync with operators forum session at the summit [3],
we learned that CERN is in the middle of migrating from nova-network
to neutron and that holding off on removal of nova-network core
functionality until Stein would help them out a lot to have a safety
net as they continue progressing through the migration.

If we recall correctly, they did say that removal of the nova-network
REST APIs would not impact their migration and Surya Seetharaman is
double-checking about that and will get back to us. If so, we were
thinking we can go ahead and work on nova-network REST API removals
this cycle to make some progress while holding off on removing the
core functionality of nova-network until Stein.

I wanted to send this to the ML to let everyone know what we were
thinking about this and to receive any additional feedback folks might
have about this plan.

Thanks,
-melanie

[1] https://etherpad.openstack.org/p/nova-ptg-rocky L301
[2] https://review.openstack.org/567682
[3] https://etherpad.openstack.org/p/YVR18-cellsv2-migration-sync-with-operators L30

From glavado at whitestack.com Thu May 31 19:13:05 2018
From: glavado at whitestack.com (Gianpietro Lavado)
Date: Thu, 31 May 2018 21:13:05 +0200
Subject: [Openstack-operators] [publiccloud] Feedback requested on use cases
Message-ID: 

Hi team,

As mentioned briefly during Vancouver's forum meeting, we are working on
deploying OpenStack at a new public cloud operator, who found some gaps
compared to their current platform (oVirt). We agreed to fix them and take
the opportunity to contribute upstream if applicable.

Since my experience is more related to the private cloud space, I'm not
sure if these cases are already there or if it's worth opening them for
discussion in Launchpad. So if possible, I would like to receive a little
bit of feedback before documenting them at a detailed level. I have a
longer list, but here are the three main ones:

1. *Instance HA:* being able to mark some VMs with priority levels so that
when a compute node goes down, they are automatically rebuilt on some
other node.

2. *Compute node rebalancing:* rely on live migrations to periodically and
automatically rebalance the capacity of compute nodes.

3. *VM Agent:* something like the oVirt Guest Agent, to be able to enhance
interaction and get richer information from the VMs.

Any feedback is welcome. Thanks!

Gianpietro
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mriedemos at gmail.com Thu May 31 20:06:05 2018
From: mriedemos at gmail.com (Matt Riedemann)
Date: Thu, 31 May 2018 15:06:05 -0500
Subject: [Openstack-operators] [openstack-dev] [nova] proposal to postpone nova-network core functionality removal to Stein
In-Reply-To: <1391ee64-90f7-9414-9168-3a4caf495555@gmail.com>
References: <29873b6f-8a3c-ae6e-0756-c90d2c52a306@gmail.com> <1391ee64-90f7-9414-9168-3a4caf495555@gmail.com>
Message-ID: 

+openstack-operators

On 5/31/2018 3:04 PM, Matt Riedemann wrote:
> On 5/31/2018 1:35 PM, melanie witt wrote:
>>
>> This cycle at the PTG, we had decided to start making some progress
>> toward removing nova-network [1] (thanks to those who have helped!)
>> and so far, we've landed some patches to extract common network
>> utilities from nova-network core functionality into separate utility
>> modules. And we've started proposing removal of nova-network REST APIs
>> [2].
>>
>> At the cells v2 sync with operators forum session at the summit [3],
>> we learned that CERN is in the middle of migrating from nova-network
>> to neutron and that holding off on removal of nova-network core
>> functionality until Stein would help them out a lot to have a safety
>> net as they continue progressing through the migration.
>>
>> If we recall correctly, they did say that removal of the nova-network
>> REST APIs would not impact their migration and Surya Seetharaman is
>> double-checking about that and will get back to us. If so, we were
>> thinking we can go ahead and work on nova-network REST API removals
>> this cycle to make some progress while holding off on removing the
>> core functionality of nova-network until Stein.
>>
>> I wanted to send this to the ML to let everyone know what we were
>> thinking about this and to receive any additional feedback folks might
>> have about this plan.
>>
>> Thanks,
>> -melanie
>>
>> [1] https://etherpad.openstack.org/p/nova-ptg-rocky L301
>> [2] https://review.openstack.org/567682
>> [3]
>> https://etherpad.openstack.org/p/YVR18-cellsv2-migration-sync-with-operators
>> L30
>
> As a reminder, this is the etherpad I started to document the nova-net
> specific compute REST APIs which are candidates for removal:
>
> https://etherpad.openstack.org/p/nova-network-removal-rocky
>

--

Thanks,

Matt

From mriedemos at gmail.com Thu May 31 21:46:54 2018
From: mriedemos at gmail.com (Matt Riedemann)
Date: Thu, 31 May 2018 16:46:54 -0500
Subject: [Openstack-operators] attaching network cards to VMs taking a very long time
In-Reply-To: 
References: <715adc7d-64f6-9545-1bf6-5eb13fb1d991@gmail.com> <350f070b9d654a0a5430fafb07bcc1d41c98d2f8.camel@emag.ro> <43cd8579c761a13dcc81e6ffc9a69089fb421cda.camel@emag.ro> <6ec89b3767f26b73451fa490d18ed7756266d3ec.camel@emag.ro> <163aea4dc70.2784.5f0d7f2baa7831a2bbe6450f254d9a24@bitskrieg.net>
Message-ID: <914e8aec-2a25-e7d0-270f-725b0aeba9d5@gmail.com>

On 5/30/2018 9:30 AM, Matt Riedemann wrote:
>
> I can start pushing some docs patches and report back here for review
> help.

Here are the docs patches in both nova and neutron:

https://review.openstack.org/#/q/topic:bug/1774217+(status:open+OR+status:merged)

--

Thanks,

Matt