[openstack-hpc] How are you using OpenStack for your HPC workflow? (We're looking at Ironic!)

John Hearns hearnsj at googlemail.com
Wed Jan 14 11:37:06 UTC 2015


Tim Bell wrote:

- Accounting - The cloud accounting systems are based on vCPUs. There is no
concept in OpenStack of a relative performance measure, so you could be
allocated a VM on 3-year-old hardware or on the latest and the vCPU metric
is the same. In the high throughput use cases, there should be a relative
unit that scales with the amount of work that can be done. With over 20
different hardware configurations running (competitive public procurement
cycles over 4 years), we can't define 100 flavors with different accounting
rates and expect our users to guess which ones have capacity.

This is quite interesting. I guess that CERN do not use the VUP (VAX Unit
of Performance) any more!
Could you not use some sort of virtual currency, translating the time
occupied on a server multiplied by the HEPSpec rating of that server into a
virtual amount of money or units?
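
A minimal sketch of that kind of virtual currency, assuming each hardware
generation has a known per-core HEPSpec rating (the generation names and
ratings below are invented, purely for illustration):

# Illustrative only: charge in "HEPSpec-hours" rather than raw vCPU-hours, so
# time on old hardware is worth fewer units than time on new hardware.
HEPSPEC_PER_CORE = {
    "2011-generation": 9.0,     # invented ratings, not real benchmark figures
    "2013-generation": 11.5,
    "2015-generation": 14.0,
}

def accounting_units(vcpus, wall_hours, generation):
    """Weighted units = vCPUs x hours x per-core HEPSpec rating."""
    return vcpus * wall_hours * HEPSPEC_PER_CORE[generation]

# The same 4-vCPU, 100-hour allocation costs a different amount per generation:
for gen in sorted(HEPSPEC_PER_CORE):
    print(gen, accounting_units(4, 100, gen))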

On 14 January 2015 at 06:24, Tim Bell <Tim.Bell at cern.ch> wrote:

>
> At CERN, we have taken two approaches for allocating compute resources on
> a single cloud:
>
> - Run classic high throughput computing using a batch system (LSF in our
> case) on virtualised resources. There are some losses from the memory
> needs of the hypervisor (so you have a slot or so less per machine) and
> some local I/O impacts, but other than that there are major benefits in
> being able to automate recovery/recycling of VMs, such as for security or
> hardware issues.
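
Purely as an illustration of that recovery/recycling idea, a rough sketch
using the openstacksdk Python library; the image, flavor and network IDs are
placeholders, and the batch-system drain step is site-specific and only
hinted at in a comment:

import openstack

# Placeholders -- replace with real IDs for your cloud.
IMAGE_ID = "batch-worker-image-id"
FLAVOR_ID = "batch-worker-flavor-id"
NETWORK_ID = "batch-network-id"

conn = openstack.connect(cloud="mycloud")   # credentials come from clouds.yaml

# Find batch workers that have gone into ERROR and recycle them.
for server in conn.compute.servers(status="ERROR"):
    if not server.name.startswith("batchworker-"):
        continue
    # Site-specific step (not shown): drain the node in the batch system first.
    conn.compute.delete_server(server)
    replacement = conn.compute.create_server(
        name=server.name,
        image_id=IMAGE_ID,
        flavor_id=FLAVOR_ID,
        networks=[{"uuid": NETWORK_ID}],
    )
    conn.compute.wait_for_server(replacement)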
>
> - Run cloud services à la Amazon. Here users have one of the standard
> images, such as CentOS, or their own if they wish, with cloud-init.
>
> Compute resources at CERN are allocated using a pledge system, i.e. the
> experiments request and justify their needs for the year, resources are
> purchased and then allocated out according to these pledges. There is no
> charging as such.
>
> The biggest challenges we've faced are:
>
> - Elasticity of cloud - the perception from Amazon is that you can scale
> up and down. Within a standard on-premise private cloud, the elasticity can
> only come from removing other work (since we aim to use the resources to
> the full). We use short-queue, opportunistic batch work to fill in, draining
> and re-instantiating the high throughput computing batch workers to
> accommodate some elasticity, but it is limited. Spot market functionality
> would be interesting but we've not seen anything like it in OpenStack yet.
>
> - Scheduling - We have more work to do than there are resources. The cloud
> itself has no 'queue' or fair share. The experiment workflows thus have to
> place their workload into the cloud within their quota and maintain a queue
> of work on their side. INFN are working on some enhancements to OpenStack
> with Blazar or Nova queues which are worth following.
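
A minimal sketch of that "queue of work on their side" pattern, where
workers are only launched while the project stays inside its quota; the
quota figure, naming convention, job list and launch_worker() helper are all
invented placeholders:

import openstack

conn = openstack.connect(cloud="mycloud")

MAX_WORKERS = 250          # how many worker VMs the quota allows (placeholder)
WORKER_PREFIX = "pilot-"   # naming convention for the experiment's worker VMs

# Placeholders for the experiment's own queue and pilot-submission machinery.
pending_jobs = ["job-%d" % i for i in range(500)]

def launch_worker(index):
    """Placeholder: boot one worker VM via conn.compute.create_server(...)."""

# Count the workers already running in the cloud, then launch only enough new
# ones to stay inside the quota; the rest of the backlog waits in the
# experiment-side queue.
running = sum(1 for s in conn.compute.servers()
              if s.name.startswith(WORKER_PREFIX))
for i in range(min(len(pending_jobs), MAX_WORKERS - running)):
    launch_worker(running + i)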
>
> - Quota - Given the fixed resources, the quota controls are vital to avoid
> overcommitting. Currently, quotas are flat so the cloud administrators are
> asked to adjust the quotas to balance the user priorities within their
> overall pledges. The developments in Nested Projects coming along with Kilo
> will be a major help here and we're working with BARC to deliver this in
> Nova so an experiment resource co-ordinator can be given the power to
> manage quotas for their subgroups.
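
A rough sketch of what that delegation could look like once nested projects
are available, using openstacksdk; the experiment and subgroup names and the
core counts are invented, and whether quotas actually cascade down the
project tree depends on the nested-quota work mentioned above:

import openstack

conn = openstack.connect(cloud="mycloud")

# Hypothetical parent project representing one experiment's overall pledge.
experiment = conn.identity.find_project("experiment-a")

# Carve the pledge up between subgroups as child projects, each with its own
# compute quota (core counts are invented; instances roughly assume 8-core VMs).
for name, cores in [("experiment-a-prod", 1500), ("experiment-a-analysis", 500)]:
    child = conn.identity.create_project(name=name, parent_id=experiment.id)
    conn.set_compute_quotas(child.id, cores=cores, instances=cores // 8)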
>
> - VM Lifecycle - We have around 200 arrivals and departures a month.
> Without a credit card, it would be easy for their compute resources to
> remain running after they have left. We have an automated engine which
> ensures that VMs of departing staff are quiesced and deleted following a
> standard lifecycle.
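
Purely as an illustration, a trimmed-down sketch of such an engine with
openstacksdk; how departed users are identified (a hard-coded set here) and
the grace period before deletion are the site-specific parts:

import openstack

conn = openstack.connect(cloud="mycloud")

# Placeholder: in reality this would come from the HR / identity feed.
departed_user_ids = {"user-id-1", "user-id-2"}

# Walk every server in the cloud and quiesce anything owned by a departed user.
for server in conn.compute.servers(all_projects=True):
    if server.user_id in departed_user_ids:
        conn.compute.stop_server(server)
        # After a grace period (not modelled here) the VM would be deleted:
        # conn.compute.delete_server(server)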
>
> - Accounting - The cloud accounting systems are based on vCPUs. There is
> no concept in OpenStack of a relative performance measure, so you could be
> allocated a VM on 3-year-old hardware or on the latest and the vCPU metric
> is the same. In the high throughput use cases, there should be a relative
> unit that scales with the amount of work that can be done. With over 20
> different hardware configurations running (competitive public procurement
> cycles over 4 years), we can't define 100 flavors with different accounting
> rates and expect our users to guess which ones have capacity.
>
> We're starting to have a look at bare metal and containers too. Using
> OpenStack as an overall resource allocation system to ensure all compute
> usage is accounted and managed is of great interest. The highlighted items
> above will remain, but hopefully we can work with others in the community
> to address them (as we are with nested projects).
>
> Tim
>
>
>
>
> > -----Original Message-----
> > From: Jonathon A Anderson [mailto:Jonathon.Anderson at Colorado.EDU]
> > Sent: 13 January 2015 20:14
> > To: openstack-hpc at lists.openstack.org
> > Subject: [openstack-hpc] How are you using OpenStack for your HPC
> > workflow? (We're looking at Ironic!)
> >
> > Hi, everybody!
> >
> > We at CU-Boulder Research Computing are looking at using OpenStack Ironic
> > for our HPC cluster(s). I’ve got a sysadmin background, so my initial goal
> > is to simply use OpenStack to deploy our traditional queueing system
> > (Slurm), and basically use OpenStack (w/Ironic) as a replacement for
> > Cobbler, xcat, perceus, hand-rolled PXE, or whatever other sysadmin-focused
> > PXE-based node provisioning solution one might use. That said, once we have
> > OpenStack (and have some competency with it), we’d presumably be well
> > positioned to offer actual cloud-y virtualization to our users, or even
> > offer baremetal instances for direct use by some of our (presumably most
> > trusted) users.
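
For what it's worth, a minimal sketch of that provisioning pattern, assuming
nodes are already enrolled in Ironic and exposed through a baremetal flavor,
and that cloud-init joins each node to the Slurm cluster; every name and ID
below is a placeholder:

import base64
import openstack

conn = openstack.connect(cloud="mycloud")

# Placeholder IDs: a flavor mapped to Ironic baremetal nodes, a compute-node
# image, and the provisioning network.
FLAVOR_ID = "baremetal-compute-flavor-id"
IMAGE_ID = "slurm-compute-image-id"
NETWORK_ID = "provisioning-network-id"

# Illustrative cloud-init payload that starts slurmd on first boot.
user_data = """#cloud-config
runcmd:
  - systemctl enable --now slurmd
"""

# Boot four bare metal "instances" that come up as Slurm compute nodes.
for i in range(4):
    server = conn.compute.create_server(
        name="slurm-node-%02d" % i,
        image_id=IMAGE_ID,
        flavor_id=FLAVOR_ID,
        networks=[{"uuid": NETWORK_ID}],
        user_data=base64.b64encode(user_data.encode()).decode(),
    )
    conn.compute.wait_for_server(server)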
> >
> > But how about the rest of you? Are you using OpenStack in your HPC
> > workloads today? Are you doing “traditional” virtualization? Are you using
> > the nova-compute baremetal driver? Are you using Ironic? Something even
> > more exotic?
> >
> > And, most importantly, how has it worked for you?
> >
> > I’ve finally got some nodes to start experimenting with, so I hope to have
> > some real hands-on experience with Ironic soon. (So far, I’ve only been
> > able to do traditional virtualization on some workstations.)
> >
> > ~jonathon
> >