[openstack-dev] [nova] Boston Forum session recap - instance/volume affinity for HPC

Matt Riedemann mriedemos at gmail.com
Thu May 18 23:53:20 UTC 2017


The etherpad for this session is here [1]. The session was about ways 
to achieve co-location or affinity between VMs and volumes for 
high-performance computing workloads, and was spurred by an older dev 
list discussion (linked in the etherpad).

The session quickly grew into side discussions, and it became apparent 
that at a high level we were talking about complicated solutions in 
search of a problem. That is also touched on a bit in the post-session 
dev ML thread [2].

The base use case is that a user wants their server instance and 
volume located as close to each other as possible, ideally on the same 
compute host.

We talked about ways to model a sort of "distance" attribute between 
resource providers in an aggregate relationship (in the placement sense 
of 'aggregate', not compute host aggregates in nova). This distance or 
nearness idea led to the question of how you define distance in a 
cloud, i.e. does 'near' mean the same host, rack, or data center in a 
particular cloud? How are these values defined - would they be custom 
per cloud, and if so, how is that discoverable and interoperable for an 
end API user? It was noted that flavors aren't really interoperable 
either, at least not by name. Jay Pipes has an older spec [3] about 
generic scheduling which could replace server groups, so this could 
potentially fall under that effort.

There are also private cloud biases in this discussion, i.e. things 
people are willing to tolerate or expose to their users because they 
are running a private cloud. Those same approaches don't all work in a 
public cloud; e.g. mapping availability zones one-to-one to hosts, so 
that cinder-volume and nova-compute run on the same host, falls apart 
when you have hundreds of thousands of hosts.
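
For concreteness, here is a minimal sketch of that one-to-one AZ 
mapping pattern; the host and AZ names below are hypothetical. The 
cinder-volume service on each host gets pinned to a per-host storage 
AZ in cinder.conf, and nova exposes the same host as an AZ with a 
matching name via a host aggregate:

  # cinder.conf on host compute-01: pin this host's cinder-volume
  # service to its own per-host availability zone
  [DEFAULT]
  storage_availability_zone = az-compute-01

  # nova side: expose the same host as an AZ with a matching name
  $ openstack aggregate create --zone az-compute-01 agg-compute-01
  $ openstack aggregate add host agg-compute-01 compute-01

Doing that per host is workable at private cloud scale, but clearly 
not at hundreds of thousands of hosts.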

Then there are other questions about whether and how people have 
already solved this using things like flavors with extra specs, host 
aggregates, and the AggregateInstanceExtraSpecsFilter, or setting 
[cinder]cross_az_attach=False in nova.conf on certain hosts. For 
example, set up host aggregates with nova-compute and cinder-volume 
running on the same host, define flavors with extra specs that match 
the host aggregate metadata, and then charge more for those flavors as 
your HPC type; a sketch of that setup follows below. Or, can we just 
say: use Ironic?
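
As a rough sketch of that workaround (the aggregate, flavor, and 
metadata names are hypothetical, and this assumes 
AggregateInstanceExtraSpecsFilter is enabled in the scheduler's filter 
list):

  # create a host aggregate for hosts running both nova-compute and
  # cinder-volume, and tag it with metadata the scheduler can match
  $ openstack aggregate create --property hpc=true hpc-hosts
  $ openstack aggregate add host hpc-hosts compute-01

  # define a flavor whose scoped extra spec matches the aggregate
  # metadata; AggregateInstanceExtraSpecsFilter then restricts this
  # flavor to hosts in that aggregate
  $ openstack flavor create --vcpus 8 --ram 16384 --disk 100 \
      --property aggregate_instance_extra_specs:hpc=true hpc.large

  # nova.conf on those hosts: refuse to attach volumes from another
  # AZ so the instance and its volumes stay co-located
  [cinder]
  cross_az_attach = False

That gives operators a workaround today, but it is all 
deployment-side plumbing rather than an end-user story.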

It's clear that we don't have a good end-user story for this 
requirement, so I think the next steps are going to involve working 
with the public cloud working group [4] and/or the product working 
group [5] (hopefully those two groups can work together here) on 
defining the actual use cases and what the end-user experience looks 
like.

[1] https://etherpad.openstack.org/p/BOS-forum-compute-instance-volume-affinity-hpc
[2] http://lists.openstack.org/pipermail/openstack-dev/2017-May/116694.html
[3] https://review.openstack.org/#/c/183837/
[4] https://wiki.openstack.org/wiki/PublicCloudWorkingGroup
[5] https://wiki.openstack.org/wiki/ProductTeam

-- 

Thanks,

Matt


