[Openstack-operators] Managing quota for Nova local storage?

Kris G. Lindgren klindgren at godaddy.com
Fri Nov 11 14:07:19 UTC 2016


I don’t mean to hijack your thread, but since you mentioned KSM: I realize that you run a private cloud, so you don’t have to worry about bad actors getting a server from you and doing malicious things with it. But do you have any concerns about the recent research [1] that uses Rowhammer + KSM + transparent hugepages + KVM to change the memory of co-located VMs? The researchers showed that they could target memory inside other VMs to do things like modify authorized_keys in memory so that they could log in with their own key. They also performed other attacks, like manipulating the update URL for Ubuntu VMs and modifying the GPG key (in memory), so that when an update is performed the VM installs packages from a malicious source. On the SSH attack, they showed that out of 300 attempts they were able to change the in-memory representation of authorized_keys in another VM in 252 (84.1%) of them, most of the time within 6 minutes, with a max time of 12.6 minutes.

The attack mainly works because of KSM + transparent hugepages. You obviously need Rowhammer-vulnerable memory chips. But let’s face it: with the majority of them susceptible, you most likely have vulnerable RAM somewhere in the machines in your datacenter.

[1] https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_razavi.pdf
___________________________________________________________________
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Van Leeuwen, Robert" <rovanleeuwen at ebay.com>
Date: Friday, November 11, 2016 at 12:10 AM
To: "Kris G. Lindgren" <klindgren at godaddy.com>, Edmund Rhudy <erhudy at bloomberg.net>, "warren at wangspeed.com" <warren at wangspeed.com>
Cc: "openstack-operators at lists.openstack.org" <openstack-operators at lists.openstack.org>
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

Thx for your stories,

I think we are now all doing pretty much the same thing to get around the issue but it still looks like a very useful feature.

So to share what we (eBay-ECG) are doing:
We also started out by scaling the flavor disk size with either memory or CPU (so e.g. large disk == large memory).
But our users started asking for flavors with quite different specs.
Not being able to give those would be hugely inefficient.

So now we have started giving flavors to specific tenants instead of making them public (and letting the quotas sort it out).
e.g. a flavor with 8 cores, 12G and 1TB of local storage will only be available for the tenants that really need it.

Looking at our hypervisor stats we either run out of memory or disk before cpu cycles so not having a tunable on disk is inconvenient.
Our latest spec hypervisors have 768GB and we run KSM so we will probably run out of DISK first there.
We run SSD-only on local storage so that space in the flavor is real $$$.

We started to run on zfs with compression on our latest config/iteration and that seems to alleviate the pain a bit.
It is a bit early to tell exactly, but it seems to run stably and the compression factor looks to be around 2.0.
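
To make the "which resource runs out first" reasoning concrete, here is a rough Python sketch. Only the 768GB of RAM, the 8-core/12G/1TB flavor and the ~2.0 compression factor come from this message; the core count, raw disk size and CPU overcommit ratio are made-up placeholders, 1TB is taken as 1000GB, and limiting_resource is a hypothetical helper, not anything that exists in Nova.

```python
def limiting_resource(capacity, flavor, overcommit=None):
    """Return (resource, count): the resource that caps how many copies of
    `flavor` fit on one hypervisor, and how many copies that cap allows."""
    overcommit = overcommit or {}
    fits = {
        name: int(capacity[name] * overcommit.get(name, 1.0) // flavor[name])
        for name in flavor
    }
    tightest = min(fits, key=fits.get)
    return tightest, fits[tightest]


# 768GB of RAM is from the message; vcpus and disk_gb are placeholder values.
hypervisor = {"vcpus": 56, "ram_gb": 768, "disk_gb": 8000}
# The tenant-specific flavor mentioned above: 8 cores, 12G, 1TB of local disk.
big_disk_flavor = {"vcpus": 8, "ram_gb": 12, "disk_gb": 1000}
# Assumed ratios: 4x CPU overcommit, no RAM overcommit, ~2.0x disk via compression.
ratios = {"vcpus": 4.0, "disk_gb": 2.0}

print(limiting_resource(hypervisor, big_disk_flavor, ratios))
```

With these placeholder numbers disk is still the first resource to run out, even with compression counted in, which is the situation described above.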

P.S. I noticed my search for blueprints was not good enough, so I closed mine and subscribed to the one that was already there:
https://blueprints.launchpad.net/nova/+spec/nova-disk-quota-tracking

Robert van Leeuwen

From: "Kris G. Lindgren" <klindgren at godaddy.com>
Date: Thursday, November 10, 2016 at 5:18 PM
To: Edmund Rhudy <erhudy at bloomberg.net>, "warren at wangspeed.com" <warren at wangspeed.com>, Robert Van Leeuwen <rovanleeuwen at ebay.com>
Cc: "openstack-operators at lists.openstack.org" <openstack-operators at lists.openstack.org>
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

This is what we have done as well.

We made our flavors stackable, starting with our average deployed flavor size and making things a multiple of that. I.e., if our average deployed flavor size is 8GB of RAM and 120GB of disk, our larger flavors are multiples of that. So if 16GB of RAM and 240GB of disk is the average, the next flavor up may be 32GB of RAM and 480GB of disk. From there it’s easy to say that with 256GB of RAM we will average ~30 VMs (at the 8GB/120GB average), which means we need ~3.6TB of local storage per node, assuming that you don’t overallocate disk or RAM. In practice, though, you can get a running average of the amount of disk space consumed and work towards that plus a bit of a buffer, and run with disk oversubscription.
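
The sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the 16GB held back for the host is an assumption added here to land on the ~30 VMs figure (256GB / 8GB would otherwise give 32).

```python
def stack_flavor(base_ram_gb, base_disk_gb, multiple):
    """Build a 'stackable' flavor as a whole multiple of the base flavor."""
    return {"ram_gb": base_ram_gb * multiple, "disk_gb": base_disk_gb * multiple}


def plan_node(node_ram_gb, base_ram_gb, base_disk_gb,
              host_reserved_ram_gb=0, disk_oversubscription=1.0):
    """Return (vm_count, local_disk_gb) for a node filled with base flavors.

    host_reserved_ram_gb and disk_oversubscription are assumed knobs,
    not values taken from the thread.
    """
    vm_count = (node_ram_gb - host_reserved_ram_gb) // base_ram_gb
    local_disk_gb = vm_count * base_disk_gb / disk_oversubscription
    return vm_count, local_disk_gb


# Base flavor from the message: 8GB of RAM, 120GB of disk.
print(stack_flavor(8, 120, 2))   # the 16GB / 240GB flavor
print(stack_flavor(8, 120, 4))   # the 32GB / 480GB flavor

# 256GB node with an assumed 16GB reserved for the host: 30 VMs, 3600GB.
print(plan_node(256, 8, 120, host_reserved_ram_gb=16))
```

Passing a disk_oversubscription above 1.0 models the "running average plus a buffer" approach, where less physical disk is bought than the flavors nominally promise.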

We currently have no desire to remove local storage.  We want the root disks to be on local storage.  That being said, in the future we will most likely give smaller root disks, and if people need more space, ask them to provision an RBD volume through Cinder.

___________________________________________________________________
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

From: "Edmund Rhudy (BLOOMBERG/ 120 PARK)" <erhudy at bloomberg.net>
Reply-To: Edmund Rhudy <erhudy at bloomberg.net>
Date: Thursday, November 10, 2016 at 8:47 AM
To: "warren at wangspeed.com" <warren at wangspeed.com>, "rovanleeuwen at ebay.com" <rovanleeuwen at ebay.com>
Cc: "openstack-operators at lists.openstack.org" <openstack-operators at lists.openstack.org>
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

We didn't come up with one. RAM on our HVs is the limiting factor since we don't run with memory overcommit, so the ability of people to run an HV out of disk space ended up being moot. ¯\_(ツ)_/¯

Long term we would like to switch to being exclusively RBD-backed and get rid of local storage entirely, but that is Distant Future at best.

From: rovanleeuwen at ebay.com
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?
Hi,

Found this thread in the archive so a bit of a late reaction.
We are hitting the same thing so I created a blueprint:
https://blueprints.launchpad.net/nova/+spec/nova-local-storage-quota

If you guys already found a nice solution to this problem I’d like to hear it :)

Robert van Leeuwen
eBay - ECG

From: Warren Wang <warren at wangspeed.com>
Date: Wednesday, February 17, 2016 at 8:00 PM
To: Ned Rhudy <erhudy at bloomberg.net>
Cc: "openstack-operators at lists.openstack.org" <openstack-operators at lists.openstack.org>
Subject: Re: [Openstack-operators] Managing quota for Nova local storage?

We are in the same boat. Can't get rid of ephemeral because of its speed and independence. I get it, but it makes management of all these tiny pools a scheduling and capacity nightmare.
Warren @ Walmart

On Wed, Feb 17, 2016 at 1:50 PM, Ned Rhudy (BLOOMBERG/ 731 LEX) <erhudy at bloomberg.net> wrote:
The subject says it all - does anyone know of a method by which quota can be enforced on storage provisioned via Nova rather than Cinder? Googling around appears to indicate that this is not possible out of the box (e.g., https://ask.openstack.org/en/question/8518/disk-quota-for-projects/).

The rationale is we offer two types of storage, RBD that goes via Cinder and LVM that goes directly via the libvirt driver in Nova. Users know they can escape the constraints of their volume quotas by using the LVM-backed instances, which were designed to provide a fast-but-unreliable RAID 0-backed alternative to slower-but-reliable RBD volumes. Eventually users will hit their max quota in some other dimension (CPU or memory), but we'd like to be able to limit based directly on how much local storage is used in a tenancy.
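
Since Nova does not enforce this out of the box, one stopgap is an out-of-band report that sums local disk per tenancy and flags anyone over a limit. The Python below is a minimal sketch of that idea only: the instance records, field names and limits are hypothetical, and nothing here calls a real Nova API or blocks a boot request.

```python
from collections import defaultdict


def local_disk_by_tenant(instances):
    """Sum root + ephemeral disk (GB) per tenant for locally backed instances."""
    usage = defaultdict(int)
    for inst in instances:
        usage[inst["tenant"]] += inst["root_gb"] + inst["ephemeral_gb"]
    return dict(usage)


def tenants_over_limit(instances, limits_gb):
    """Return {tenant: used_gb} for tenants above their local-disk limit.

    Tenants without an entry in limits_gb are treated as unlimited.
    """
    usage = local_disk_by_tenant(instances)
    return {
        tenant: used
        for tenant, used in usage.items()
        if used > limits_gb.get(tenant, float("inf"))
    }


# Hypothetical inventory, e.g. exported from whatever tooling lists instances.
instances = [
    {"tenant": "analytics", "root_gb": 40, "ephemeral_gb": 960},
    {"tenant": "analytics", "root_gb": 40, "ephemeral_gb": 960},
    {"tenant": "web", "root_gb": 40, "ephemeral_gb": 0},
]
limits_gb = {"analytics": 1500, "web": 500}

print(tenants_over_limit(instances, limits_gb))
```

A report like this only detects overuse after the fact; limiting at boot time would still need support in Nova itself.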

Does anyone have a solution they've already built to handle this scenario? We have a few ideas already for things we could do, but maybe somebody's already come up with something. (Social engineering on our user base by occasionally destroying a random RAID 0 to remind people of their unsafety, while tempting, is probably not a viable candidate solution.)

_______________________________________________
OpenStack-operators mailing list
OpenStack-operators at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators

