[Openstack-operators] Quota Templates

Narayan Desai narayan.desai at gmail.com
Mon Apr 7 18:33:08 UTC 2014


On Mon, Apr 7, 2014 at 11:52 AM, Jay Pipes <jaypipes at gmail.com> wrote:

> Thanks for your response, Narayan, this is an interesting discussion. See
> comments below inline.
>
> On Mon, 2014-04-07 at 11:38 -0500, Narayan Desai wrote:
>
> > I think that this assumption is the reason that we have the quota
> > system that we do. A resource like 1GB of memory isn't comparable
> > across all kinds of hardware. It is if you have a simple deployment
> > with relatively uniform hardware configurations, but heterogeneity
> > causes this not to work.
> >
> > For example, 1GB of memory on a 24 GB system is way different than 1GB
> > of memory on a 2TB system.
>
> How so? If you said that 1GB of some super-fast RAM is different than
> 1GB of normal RAM, I would agree with you (and point you to the
> discussion of weighting factors in my full previous ML response). But I
> don't see any difference between 1GB of memory on systems with different
> overall amounts of the same type of memory.


The ability to gang together more 1GB chunks on a 2TB node than on a 16GB
node makes them different. We consider their costs differently for this
reason.
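
To make the asymmetry concrete, here's a toy sketch (node sizes borrowed
from the example above; everything else is made up):

    # Toy illustration: a 1GB chunk consumed on a node forecloses the
    # largest single instance that node could otherwise host, so the
    # opportunity cost of the "same" 1GB differs with node size.
    NODE_CAPACITY_GB = {"standard": 24, "bigmem": 2048}

    def largest_flavor_foreclosed(node_type):
        # On an otherwise-empty node, using even 1GB means the
        # whole-node flavor can no longer be placed there.
        return NODE_CAPACITY_GB[node_type]

    print(largest_flavor_foreclosed("standard"))  # 24
    print(largest_flavor_foreclosed("bigmem"))    # 2048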


> >  The costs are different, the resource allocation considerations are
> > different, etc.
>
> No, the costs are the same. The resource allocation considerations are
> only different in that the things providing the resource (the hosts)
> have different available amounts of RAM. But that's not what quotas are
> about. That's what scheduling/placement is about. It's two different
> things.


We've ended up leaning pretty heavily on quotas for resource management
policies. This is going to be a different regime on our system than on a
public system, because we have a fixed resource configuration whose value
we're trying to maximize, as opposed to running with a continuous
acquisition model.

In effect, we want to limit the quantity of resources a single tenant can
allocate. We'd like to be able to say something like "tenant T can allocate
one 2TB instance, but only 128 1GB instances". In this kind of situation,
resources aren't fungible in the way you're asserting they are.
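
As a minimal sketch of what that policy would look like (this is not the
real Nova quota API; the flavor names and data structures here are
hypothetical):

    # Caps are expressed per instance type rather than in fungible GB,
    # so one 2TB instance and 128 1GB instances are distinct entitlements.
    per_flavor_quota = {
        ("tenant-T", "bigmem.2tb"): 1,
        ("tenant-T", "small.1gb"): 128,
    }

    def check_quota(tenant, flavor, in_use):
        """Allow a boot only if the tenant is under its cap for this flavor."""
        cap = per_flavor_quota.get((tenant, flavor), 0)
        return in_use.get((tenant, flavor), 0) < cap

    usage = {("tenant-T", "bigmem.2tb"): 1}  # already running the big one
    print(check_quota("tenant-T", "bigmem.2tb", usage))  # False: at the cap
    print(check_quota("tenant-T", "small.1gb", usage))   # True: 0 of 128 used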



> >  A model that assumes uniformity means that you can't do anything
> > sensible when resource costs and considerations aren't uniform across
> > your system.
>
> A model that assumes uniformity of the basic resource types is essential
> here. A single CPU unit is the same as any other CPU unit. A GB of RAM
> is the same as any other GB of RAM. With weighting factors for given
> extra specs, you can influence the overall resource used (see my full
> response for how weighting factors could be used). But if you don't have
> uniformity of the base resources, you have no foundation from which to
> apply quota management.


In the HPC world, we typically have a much richer set of mechanisms to
control job ingress, and jobs have explicitly bounded runtimes, which
makes these desired behaviors a lot easier to implement.

I honestly think that an assumption of uniform resources is a losing
proposition in the long run. Our system is likely more heterogeneous than
most, but I think it is a safe assumption that hardware platforms will get
more differentiated and interesting over the next few years.

We're already expecting different flavors of memory, with different
durability properties, all sorts of different kinds of cores, and way
different sorts of storage.


> > While keying off of instance types isn't perfect, and certainly
> > doesn't give you the model that you'd like, you can make things work
> > with an implicit, per-instance-type resource model in ways that you
> > can't through resource reductionism.
>
> Two problems with using the instance type for the model, instead of the
> base resource types that the instance type has (and the weighting
> factors of its extra specs):
>
> 1) Instance types can and do change over time. Unless you've decomposed
> the instance type into its base resource types, you don't have any way
> to adjust quotas properly that reflect the usage of the underlying
> resources.
>

If you've tied everything to instance types, you can easily adjust quotas
as the underlying resource amounts change. One nice thing about tying
things to instance types is that it is pretty clear how changes to an
instance type will affect project quotas.
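
A hedged sketch of why that adjustment is easy (flavor and quota data here
are hypothetical, not anything Nova actually stores this way):

    # When quotas are keyed to instance types, a project's effective
    # resource ceiling is just cap x flavor size, so editing a flavor
    # propagates to every project's entitlement in one obvious step.
    flavors = {"medium": {"ram_gb": 8}}
    caps = {"project-A": {"medium": 10}}

    def effective_ram_ceiling(project):
        return sum(count * flavors[f]["ram_gb"]
                   for f, count in caps[project].items())

    print(effective_ram_ceiling("project-A"))  # 80 GB
    flavors["medium"]["ram_gb"] = 16           # operator doubles the flavor
    print(effective_ram_ceiling("project-A"))  # 160 GB, no quota edits needed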


> 2) Quota setting based on instance types would require a human making
> decisions about capacity based on that human's (often flawed)
> understanding of how many of that instance type can "fit" into their
> deployment. Things that humans do are unreliable compared with an
> algorithmic approach to quota determination.
>

You must be using a much smarter OpenStack scheduler than I am. I'll agree
in principle that this could be true; however, fragmentation avoidance is a
tricky problem, particularly when you have a big range of potential
configurations. You can only imagine the sad trombone that played in my
office the first time the scheduler placed an 8GB instance on a bigmem
node, blocking that system's largest configuration from being usable. We've
personally had a lot more luck partitioning resources and picking a set of
favorable resource combinations than letting the scheduler deal with it.
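
For what it's worth, the partitioning instinct can be sketched as a simple
best-fit placement rule (simplified; this is not Nova's actual filter or
weigher interface):

    # Prefer the smallest host that fits, so an 8GB instance never lands
    # on a bigmem node while a standard node still has room, keeping the
    # largest configurations placeable.
    def pick_host(hosts, request_gb):
        """hosts: list of {'name': ..., 'free_gb': ...} dicts."""
        candidates = [h for h in hosts if h["free_gb"] >= request_gb]
        return min(candidates, key=lambda h: h["free_gb"], default=None)

    hosts = [{"name": "bigmem-1", "free_gb": 2048},
             {"name": "std-1", "free_gb": 16}]
    print(pick_host(hosts, 8)["name"])  # std-1; bigmem-1 stays whole for 2TB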

This is a great discussion. I think that we need more of this kind of thing
on the operators list.
 -nld