[Openstack-operators] [openstack-dev] [Nova] Reconciling flavors and block device mappings

Andrew Laski andrew at lascii.com
Fri Aug 26 15:44:32 UTC 2016




On Fri, Aug 26, 2016, at 11:01 AM, John Griffith wrote:
>
>
> On Fri, Aug 26, 2016 at 7:37 AM, Andrew Laski
> <andrew at lascii.com> wrote:
>>
>>
>> On Fri, Aug 26, 2016, at 03:44 AM,
>> Kostiantyn.Volenbovskyi at swisscom.com wrote:
>> > Hi,
>>  > Option 1 (which is what the patches suggest) sounds totally fine.
>>  > Option 3 ("Allow block device mappings, when present, to mostly
>>  > determine instance packing") sounds like option 1 plus additional
>>  > logic (the keyword being 'mostly'). I don't think I understand the
>>  > part about 'undermining the purpose of the flavor'. Why might the
>>  > new behavior require one more parameter to limit the number of
>>  > instances per host? Won't those VMs be subject to the other flavor
>>  > constraints, such as CPU and RAM, anyway, and won't those be the
>>  > ones controlling 'instance packing'?
>>
>> Yes it is possible that CPU and RAM could be controlling instance
>>  packing. But my understanding is that since those are often
>>  oversubscribed
> I don't understand why the oversubscription ratio matters here?
>

My experience is with environments where oversubscription was used to be
a little loose with how many vCPUs or how much RAM was allocated, but
disk was strictly controlled.

>
>
>
>> while disk is not that it's actually the disk amounts
>>  that control the packing on some environments.
> Maybe an explanation of what you mean by "packing" here.  Customers
> that I've worked with over the years have used CPU and Mem as their
> levers and the main thing that they care about in terms of how many
> Instances go on a Node.  I'd like to learn more about why that's wrong
> and that disk space is the mechanism that deployers use for this.
>

By packing I just mean the various ways that different flavors fit on a
host. A host may be designed to hold 1 xlarge, or 2 large, or 4 mediums,
or 1 large and 2 mediums, etc. The challenge I see here is that the
constraint can be managed by using CPU or RAM or disk or some
combination of the three. For deployers using only disk, the patches
above will change behavior.
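
To make "packing" concrete, here is a minimal sketch. It is not Nova
code: the Resources type, the fits() helper and all host and flavor
sizes are hypothetical, chosen so that each flavor is a clean fraction
of the host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    vcpus: int
    ram_gb: int
    disk_gb: int


def fits(host: Resources, instances: list[Resources]) -> bool:
    """Return True if the summed demands of `instances` fit within `host`."""
    return (
        sum(i.vcpus for i in instances) <= host.vcpus
        and sum(i.ram_gb for i in instances) <= host.ram_gb
        and sum(i.disk_gb for i in instances) <= host.disk_gb
    )


# Hypothetical host and flavors (illustrative sizes only).
HOST = Resources(vcpus=8, ram_gb=32, disk_gb=400)
XLARGE = Resources(vcpus=8, ram_gb=32, disk_gb=400)
LARGE = Resources(vcpus=4, ram_gb=16, disk_gb=200)
MEDIUM = Resources(vcpus=2, ram_gb=8, disk_gb=100)

# Each of these combinations fills the host exactly.
for combo in ([XLARGE], [LARGE] * 2, [MEDIUM] * 4, [LARGE, MEDIUM, MEDIUM]):
    print(fits(HOST, combo))

# One xlarge plus one medium exceeds the host on every dimension.
print(fits(HOST, [XLARGE, MEDIUM]))
```

In this sketch all three resources bind at once. A deployer who loosens
CPU and RAM through oversubscription is left with disk_gb as the only
check that still enforces these combinations.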

It's not wrong to use CPU/RAM, but it's not what everyone is doing. One
purpose of this email was to gauge if it would be acceptable to only use
CPU/RAM for packing.


>
>
>> But that is a sub option
>>  here, just document that disk amounts should not be used to
>>  determine
>>  flavor packing on hosts and instead CPU and RAM must be used.
>>
>>  > Does option 3 cover the case where someone relied on e.g. the
>>  > flavor root disk size for instances booted from volume, and
>>  > instance packing will now change once the patches are implemented?
>>
>> That's the goal. In a simple case of having hosts with 16 CPUs, 128GB
>> of RAM and 2TB of disk and a flavor with VCPU=4, RAM=32GB,
>> root_gb=500GB, swap/ephemeral=0 the deployer is stating that they want
>> only 4 instances on that host.
> How do you arrive at that logic?  What if they actually wanted a
> single VCPU=4,RAM=32GB,root_gb=500 but then they wanted the remaining
> resources split among Instances that were all 1 VCPU, 1 G ram and a 1
> G root disk?

My example assumes the one stated flavor. But if they have a smaller
flavor then more than 4 instances would fit.
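
The arithmetic for the single-flavor example quoted above can be
sketched as follows. This is an illustration only, not the scheduler's
actual logic: instances_that_fit() is a hypothetical helper, 2TB is
taken as 2000GB, and the oversubscription ratios (16.0 for CPU, 1.5 for
RAM, 1.0 for disk) are assumed values picked to show the effect.

```python
HOST = {"vcpus": 16, "ram_gb": 128, "disk_gb": 2000}
FLAVOR = {"vcpus": 4, "ram_gb": 32, "disk_gb": 500}

# Assumed allocation ratios; a real deployment may use different values.
NO_OVERSUB = {"vcpus": 1.0, "ram_gb": 1.0, "disk_gb": 1.0}
OVERSUB = {"vcpus": 16.0, "ram_gb": 1.5, "disk_gb": 1.0}


def instances_that_fit(host, flavor, ratios, root_disk_is_local=True):
    """Return how many copies of `flavor` fit on an empty `host`.

    The answer is the smallest per-resource limit. When the root disk is
    a remote volume, the flavor's disk_gb consumes no local disk, so the
    disk limit is skipped.
    """
    limits = []
    for resource, capacity in host.items():
        demand = flavor[resource]
        if resource == "disk_gb" and not root_disk_is_local:
            demand = 0
        if demand > 0:
            limits.append(int(capacity * ratios[resource] // demand))
    return min(limits)


print(instances_that_fit(HOST, FLAVOR, NO_OVERSUB))  # 4: every resource binds
print(instances_that_fit(HOST, FLAVOR, OVERSUB))     # 4: disk still binds
print(instances_that_fit(HOST, FLAVOR, OVERSUB, root_disk_is_local=False))  # 6: RAM binds
```

With oversubscription on and the root disk moved to a volume, RAM
becomes the limit and the count rises from 4 to 6 under these assumed
ratios.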

>
>> If there is CPU and RAM oversubscription enabled then by using volumes
>> a user could end up with more than 4 instances on that host. So a
>> max_instances=4 setting could solve that. However I don't like the
>> idea of adding a new config, and I think it's too simplistic to cover
>> more complex use cases. But it's an option.
>
> I would venture to guess that most Operators would be sad to read
> that.  So rather than give them an explicit lever that does exactly
> what they want clearly and explicitly we should make it as complex as
> possible and have it be the result of a 4 or 5 variable equation?  Not
> to mention it's completely dynamic (because it seems like
> lots of clouds have more than one flavor).

Is that lever exactly what they want? That's part of what I'd like to
find out here. But currently it's possible to set up a situation where 1
large flavor or 4 small flavors fit on a host. So would the
max_instances=4 setting be desired? Keep in mind that if the above
patches merge, 4 large flavors could be put on that host if the
instances only use remote volumes and the deployer isn't using proper
CPU/RAM limits.
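
A small simulation of that scenario, under stated assumptions: the host
capacities are hypothetical and already include any oversubscription,
max_instances stands for the setting proposed in this thread, and the
greedy loop in count_placeable() is an illustration, not how the
scheduler places instances.

```python
# Hypothetical effective capacities: loose CPU/RAM, strict disk.
HOST = {"vcpus": 64, "ram_gb": 192, "disk_gb": 2000}
LARGE = {"vcpus": 8, "ram_gb": 32, "disk_gb": 2000}
SMALL = {"vcpus": 2, "ram_gb": 8, "disk_gb": 500}


def count_placeable(host, flavor, *, count_root_disk, max_instances=None):
    """Place copies of `flavor` on an empty `host` until one no longer fits.

    count_root_disk=False models a root disk on a remote volume, which
    consumes no local disk. max_instances, if set, caps the total count.
    """
    free = dict(host)
    placed = 0
    while max_instances is None or placed < max_instances:
        need = dict(flavor)
        if not count_root_disk:
            need["disk_gb"] = 0
        if any(need[k] > free[k] for k in need):
            break
        for k in need:
            free[k] -= need[k]
        placed += 1
    return placed


# Disk-based packing: 1 large or 4 small per host.
print(count_placeable(HOST, LARGE, count_root_disk=True))   # 1
print(count_placeable(HOST, SMALL, count_root_disk=True))   # 4

# Volume-backed large instances with a cap of 4: the cap is all that binds.
print(count_placeable(HOST, LARGE, count_root_disk=False, max_instances=4))  # 4

# Without the cap, the loose RAM limit allows 6.
print(count_placeable(HOST, LARGE, count_root_disk=False))  # 6
```

The single integer cap treats a large and a small instance the same,
which is why it cannot reproduce the "1 large or 4 small" intent on its
own.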

I probably was not clear enough in my original description or made some
bad assumptions. The concern I have is that if someone is currently
relying on disk sizes for their instance limits then the above patches
change behavior for them and affect capacity limits and planning. Is
this okay, and if not, what do we do?


>
> All I know is that the current state is broken.  It's not just the
> scheduling problem, I could live with that probably since it's too
> hard to fix... but keep in mind that you're reporting the complete
> wrong information for the Instance in these cases.  My flavor says
> it's 5G, but in reality it's 200 or whatever.  Rather than make it
> perfect we should just fix it.  Personally I thought the proposals for
> a scheduler check and the addition of the Instances/Node option was a
> win win for everyone.  What am I
> missing?  Would you rather a custom filter scheduler so it wasn't a
> config option?

There is another effort in progress to address the reporting issue. If
you poke around Nova specs or conversations you'll hear it referred to
as Resource Providers, though it's actually a series of specs with
various names. There's certainly a conversation that can be had about
waiting for that effort vs trying to address resource tracking in a
backportable manner, but that's not what I wanted to get into here.

>
>>
>> >
>>  > BR,
>>  > Konstantin
>>  >
>>  > > -----Original Message-----
>>  > > From: Andrew Laski [mailto:andrew at lascii.com]
>>  > > Sent: Thursday, August 25, 2016 10:20 PM
>>  > > To: openstack-dev at lists.openstack.org
>>  > > Cc: openstack-operators at lists.openstack.org
>>  > > Subject: [Openstack-operators] [Nova] Reconciling flavors and
>>  > > block device
>>  > > mappings
>>  > >
>>  > > Cross posting to gather some operator feedback.
>>  > >
>>  > > There have been a couple of contentious patches gathering
>>  > > attention recently
>>  > > about how to handle the case where a block device mapping
>>  > > supersedes flavor
>>  > > information. Before moving forward on either of those I think we
>>  > > should have a
>>  > > discussion about how best to handle the general case, and how to
>>  > > handle any
>>  > > changes in behavior that result from that.
>>  > >
>>  > > There are two cases presented:
>>  > >
>>  > > 1. A user boots an instance using a Cinder volume as a root disk,
>>  > >    however the flavor specifies root_gb = x where x > 0. The
>>  > >    current behavior in Nova is that the scheduler is given the
>>  > >    flavor root_gb info to take into account during scheduling.
>>  > >    This may disqualify some hosts from receiving the instance
>>  > >    even though that disk space is not necessary because the root
>>  > >    disk is a remote volume.
>>  > >    https://review.openstack.org/#/c/200870/
>>  > >
>>  > > 2. A user boots an instance and uses the block device mapping
>>  > >    parameters to specify a swap or ephemeral disk size that is
>>  > >    less than specified on the flavor. This leads to the same
>>  > >    problem as above, the scheduler is provided information that
>>  > >    doesn't match the actual disk space to be consumed.
>>  > >    https://review.openstack.org/#/c/352522/
>>  > >
>>  > > Now the issue: while it's easy enough to provide proper
>>  > > information to the
>>  > > scheduler on what the actual disk consumption will be when using
>>  > > block device
>>  > > mappings that undermines one of the purposes of flavors which is
>>  > > to control
>>  > > instance packing on hosts. So the outstanding question is to
>>  > > what extent should
>>  > > users have the ability to use block device mappings to bypass
>>  > > flavor constraints?
>>  > >
>>  > > One other thing to note is that while a flavor constrains how
>>  > > much local disk is
>>  > > used it does not constrain volume size at all. So a user can
>>  > > specify an
>>  > > ephemeral/swap disk <= to what the flavor provides but can have
>>  > > an arbitrary
>>  > > sized root disk if it's a remote volume.
>>  > >
>>  > > Some possibilities:
>>  > >
>>  > > Completely allow block device mappings, when present, to
>>  > > determine instance
>>  > > packing. This is what the patches above propose and there's a
>>  > > strong desire for
>>  > > this behavior from some folks. But it changes how many instances
>>  > > may fit on a
>>  > > host which could be undesirable to some.
>>  > >
>>  > > Keep the status quo. It's clear that is undesirable based on the
>>  > > bug reports and
>>  > > proposed patches above.
>>  > >
>>  > > Allow block device mappings, when present, to mostly determine
>>  > > instance
>>  > > packing. By that I mean that the scheduler only takes into
>>  > > account local disk that
>>  > > would be consumed, but we add additional configuration to Nova
>>  > > which limits
>>  > > the number of instances that can be placed on a host. This is a
>>  > > compromise
>>  > > solution but I fear that a single int value does not meet the
>>  > > needs of deployers
>>  > > wishing to limit instances on a host. They want it to take into
>>  > > account cpu
>>  > > allocations and ram and disk, in short a flavor :)
>>  > >
>>  > > And of course there may be some other unconsidered solution.
>>  > > That's where
>>  > > you, dear reader, come in.
>>  > >
>>  > > Thoughts?
>>  > >
>>  > > -Andrew
>>  > >
>>  > >
>>  > > _______________________________________________
>>  > > OpenStack-operators mailing list
>>  > > OpenStack-operators at lists.openstack.org
>>  > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>>  ___________________________________________________________________-
>>  _______
>>  OpenStack Development Mailing List (not for usage questions)
>>  Unsubscribe: OpenStack-dev-
>>  request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev