AW: Customization of nova-scheduler

Sean Mooney smooney at redhat.com
Tue Jun 1 20:55:43 UTC 2021


On Mon, 2021-05-31 at 17:21 +0100, Stephen Finucane wrote:
> On Mon, 2021-05-31 at 13:44 +0200, levonmelikbekjan at yahoo.de wrote:
> > Hello Stephen,
> > 
> > I am a student from Germany who is currently working on his bachelor thesis. My job is to build a cloud solution for my university with Openstack. The functionality should include the prioritization of users. So that you can imagine exactly how the whole thing should work, I would like to give you an example.
> > 
> > Two cases should be solved!
> > 
> > Case 1: A user A with a low priority uses a VM from Openstack with half performance of the available host. Then user B comes in with a high priority and needs the full performance of the host for his VM. When creating the VM of user B, the VM of user A should be deleted because there is not enough compute power for user B. The VM of user B is successfully created.
> > 
> > Case 2: A user A with a low priority uses a VM with half the performance of the available host, then user B comes in with a high priority and needs half of the performance of the host for his VM. When creating the VM of user B, user A should not be deleted, since enough computing power is available for both users.
> > 
One thing to keep in mind is that end users are not allowed to know the capacity of the cloud in terms of the number of hosts, the resources on a
host, or which host their VM is placed on. So as a user, the concept of "a low priority uses a VM from Openstack with half performance of the
available host" is not something that you can express architecturally in nova.
Flavors define the size of VMs in absolute terms, i.e. 4GB of RAM, not relative ones like "50% of the host".
We have a three-layer scheduling process that starts with a query to the placement service for a set of quantitative resource classes and
qualitative traits. That produces a set of allocation candidates against a series of hosts that could fit the instance. We then filter those hosts
using Python filters, which are boolean functions that either pass the host or reject it. Finally, after filtering, we weigh the remaining hosts
and select one to boot the VM.

Once you have completed a step in this process you can no longer go back to a previous step, and a host can never be re-added for consideration
after it has been eliminated by placement or a filter. As a result, if you get to the end of the available hosts and there are none that can fit
your VM, we cannot delete a VM and start again without redoing all the work and possibly racing with concurrent API requests.
This is why this is a hard problem without an external service that can rebalance existing workloads and free up capacity.
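
As a rough illustration of the three stages described above, here is a minimal, self-contained Python sketch. It does not use nova or placement
code: the Host and Request classes, the function names, the example filter and the example weigher are all made up for this illustration. The
point is only that each stage narrows the host list, and that an empty result at the end leaves no way to go back.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    # Hypothetical stand-in for a compute host as the scheduler sees it.
    name: str
    free_vcpus: int
    free_ram_mb: int
    traits: set = field(default_factory=set)


@dataclass
class Request:
    # Hypothetical stand-in for what a flavor asks for, in absolute terms.
    vcpus: int
    ram_mb: int
    required_traits: set = field(default_factory=set)


def placement_candidates(hosts, req):
    """Stage 1: keep hosts with enough resources and all required traits."""
    return [
        h for h in hosts
        if h.free_vcpus >= req.vcpus
        and h.free_ram_mb >= req.ram_mb
        and req.required_traits <= h.traits
    ]


def run_filters(hosts, req, filters):
    """Stage 2: every filter is a boolean function; one rejection is final."""
    return [h for h in hosts if all(f(h, req) for f in filters)]


def pick_host(hosts, weighers):
    """Stage 3: sum the weigher scores and take the best host, if any."""
    if not hosts:
        return None  # nothing left, and no way back to stage 1 or 2
    return max(hosts, key=lambda h: sum(w(h) for w in weighers))


def schedule(hosts, req, filters, weighers):
    candidates = placement_candidates(hosts, req)
    passed = run_filters(candidates, req, filters)
    return pick_host(passed, weighers)


hosts = [
    Host("compute1", free_vcpus=2, free_ram_mb=4096),
    Host("compute2", free_vcpus=8, free_ram_mb=16384, traits={"HW_CPU_X86_AVX2"}),
    Host("compute3", free_vcpus=16, free_ram_mb=32768, traits={"HW_CPU_X86_AVX2"}),
]


def not_compute3(host, req):
    # Example filter: reject one host by name.
    return host.name != "compute3"


def most_free_ram(host):
    # Example weigher: prefer hosts with more free RAM.
    return host.free_ram_mb


small = Request(vcpus=4, ram_mb=8192, required_traits={"HW_CPU_X86_AVX2"})
chosen = schedule(hosts, small, [not_compute3], [most_free_ram])

huge = Request(vcpus=64, ram_mb=8192)
nothing = schedule(hosts, huge, [not_compute3], [most_free_ram])
```

In this sketch the small request lands on compute2, while the huge request returns None because stage 1 already produced no candidates. A
filter or weigher running later never sees the hosts that were dropped, which is why preemption cannot be bolted on at that point.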



> > These cases should work for unlimited users. In order to optimize the whole thing, I would like to write a function that precisely calculates all performance components to determine whether enough resources are available for the VM of the high priority user.
> 
> What you're describing is commonly referred to as "preemptible" or "spot"
> instances. This topic has a long, complicated history in nova and has yet to be
> implemented. Searching for "preemptible instances openstack" should yield you
> lots of discussion on the topic along with a few proof-of-concept approaches
> using external services or out-of-tree modifications to nova.
> 
> > I’m new to Openstack, but I’ve already implemented cloud projects with Microsoft Azure and have solid programming skills. Can you give me a hint where and how I can start?
> 
> As hinted above, this is likely to be a very difficult project given the fraught
> history of the idea. I don't want to dissuade you from this work but you should
> be aware of what you're getting into from the start. If you're serious about
> pursuing this, I suggest you first do some research on prior art. As noted
> above, there is lots of information on the internet about this. With this
> research done, you'll need to decide whether this is something you want to
> approach within nova itself, via out-of-tree extensions or via a third party
> project. If you're opting for integration with nova, then you'll need to think
> long and hard about how you would design such a system and start working on a
> spec (a design document) outlining your proposed solution. Details on how to
> write a spec are discussed at [1]. The only extension points nova offers today
> are scheduler filters and weighers so your options for an out-of-tree extension
> approach will be limited. A third party project will arguably be the easiest
> approach but you will be restricted to talking to nova's REST APIs which may
> limit the design somewhat. This Blazar spec [2] could give you some ideas on
> this approach (assuming it was never actually implemented, though it may well
> have been).
> 
> > My university gave me three compute hosts and one control host to implement this solution for the bachelor thesis. I’m currently setting up Openstack and all the services on the control host all by myself to understand all the functionality (sorry for not using Packstack) 😉. All my hosts have CentOS 7 and the minimum deployment which I configure is Train.
> > 
> > My idea is to work with nova schedulers, because they seem to be interesting for my case. I've found a whole infrastructure description of the provisioning of an instance in Openstack https://docs.openstack.org/operations-guide/de/_images/provision-an-instance.png.  
> > 
> > The nova scheduler https://docs.openstack.org/operations-guide/ops-customize-compute.html is the first component, where it is possible to implement functions via Python and the Compute API https://docs.openstack.org/api-ref/compute/?expanded=show-details-of-specific-api-version-detail,list-servers-detail to check for active VMs and probably delete them if needed before a successful request for an instantiation can be made. 
> > 
> > What do you guys think about it? Does it seem like a good starting point for you or is it the wrong approach? 
> 
> This could potentially work, but I suspect there will be serious performance
> implications with this, particularly at scale. Scheduler filters are
> historically used for simple things like "find me a group of hosts that have
> this metadata attribute I set on my image". Making API calls sounds like
> something that would take significant time and therefore slow down the schedule
> process. You'd also have to decide what your heuristic for deciding which VM(s)
> to delete would be, since there's nothing obvious in nova that you could use.
> You could use something as simple as filter extra specs or something as
> complicated as an external service.
Yes, implementing preemption in the scheduler as a filter was discussed in the past and discounted because of the performance implications
Stephen hinted at. In tree we currently do not allow filters to make any API or DB queries. That approach also will not work today, since you
would have to re-execute the query to the placement service after deleting an instance when you run out of capacity and restart the filtering,
which a filter cannot do, as I noted above.

The most recent specs in this area were https://review.opendev.org/c/openstack/nova-specs/+/438640 for the integrated approach and
https://review.opendev.org/c/openstack/nova-specs/+/554212/12 which proposed adding a pending state for use with a standalone service:

https://gitlab.cern.ch/ttsiouts/ReaperServicePrototype

There are a number of presentations on this from CERN/StackHPC:
https://www.stackhpc.com/scientific-sig-at-the-dublin-ptg.html
http://openstack-in-production.blogspot.com/2018/02/maximizing-resource-utilization-with.html
https://openlab.cern/sites/openlab.web.cern.ch/files/2018-07/Containers_on_Baremetal_and_Preemptible_VMs_at_CERN_and_SKA.pdf
https://indico.cern.ch/event/739089/sessions/282073/attachments/1689073/2717151/ASDF_preemptible.pdf


The current state is that rebuilding from cell0 is not supported, the pending state was never added, and the reaper service was never upstreamed.

Work in this area has now moved to the Blazar project, as Stephen noted in [2]
https://specs.openstack.org/openstack/blazar-specs/specs/ussuri/blazar-preemptible-instances.html
but I don't think it has made much progress: https://review.opendev.org/q/topic:%22preemptibles%22+(status:open%20OR%20status:merged)

Nova previously had a pluggable scheduler that would have allowed you to reimplement the scheduler entirely from scratch, but we removed that
capability in the last year or two. At this point the only viable approach that will not take multiple upstream cycles is really to use an
external service.
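
To make the external-service idea a little more concrete, here is a minimal sketch of the decision such a service would have to make for the
two cases in the original question. It is only an assumption of how one might model it: the Instance class, the preemptible flag and
select_victims() are hypothetical, and a real service would list and delete servers through nova's REST API rather than work on local objects.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    # Hypothetical model of a running VM; "preemptible" marks low priority.
    uuid: str
    vcpus: int
    preemptible: bool


def select_victims(total_vcpus, instances, requested_vcpus):
    """Pick preemptible instances to delete so that the request fits.

    Returns [] if the request already fits, or None if it cannot fit even
    after deleting every preemptible instance on the host.
    """
    used = sum(i.vcpus for i in instances)
    shortfall = requested_vcpus - (total_vcpus - used)
    if shortfall <= 0:
        return []
    victims = []
    # Largest first, which keeps the number of deleted instances small.
    for inst in sorted(instances, key=lambda i: i.vcpus, reverse=True):
        if not inst.preemptible:
            continue
        victims.append(inst)
        shortfall -= inst.vcpus
        if shortfall <= 0:
            return victims
    return None


user_a_vm = Instance("vm-a", vcpus=4, preemptible=True)

# Case 1: user B needs the whole 8 vCPU host, so user A's VM must go.
case1 = select_victims(8, [user_a_vm], requested_vcpus=8)

# Case 2: user B needs only half the host, so nothing is deleted.
case2 = select_victims(8, [user_a_vm], requested_vcpus=4)

# If the existing VM is not preemptible, the request simply cannot fit.
case3 = select_victims(8, [Instance("vm-c", 4, False)], requested_vcpus=8)
```

This only counts vCPUs on a single host. A real service would also have to consider RAM, disk and other resources, and would have to cope
with other API requests arriving while it is freeing capacity.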

> 
> This should be lots to get you started. Once again, do make sure you're aware of
> what you're getting yourself into before you start. This could get complicated
> very quickly :)

Yes, anything other than adding the pending state to nova will be very complex due to the placement interaction.
You would really need to implement a fallback query mechanism in the scheduler itself.
Anything after the call to placement is already too late. You might be able to reuse consumer types to make some allocations
preemptible and have a prefilter decide if an allocation should be a normal nova consumer or a preemptible consumer based on
a flavor extra spec: https://docs.openstack.org/placement/train/specs/train/approved/2005473-support-consumer-types.html
This would still require the pending state and an external reaper service to free the capacity to be clean, but it's a possible direction.
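
The decision such a prefilter would make could look roughly like the sketch below. Everything named here is an assumption for illustration:
the extra spec key, the two consumer type names and the function itself are hypothetical and do not exist in nova or placement.

```python
# Hypothetical names, chosen only for this illustration.
EXTRA_SPEC_KEY = "preemptible"
NORMAL_CONSUMER = "INSTANCE"
PREEMPTIBLE_CONSUMER = "PREEMPTIBLE"


def consumer_type_for(extra_specs):
    """Map a flavor's extra specs to the consumer type for its allocation."""
    value = str(extra_specs.get(EXTRA_SPEC_KEY, "false")).strip().lower()
    if value in ("true", "1", "yes"):
        return PREEMPTIBLE_CONSUMER
    return NORMAL_CONSUMER


low_priority = consumer_type_for({"preemptible": "true"})
high_priority = consumer_type_for({"hw:cpu_policy": "dedicated"})
```

An external reaper could then look only at allocations of the preemptible type when it needs to free capacity.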


> 
> Cheers,
> Stephen
> 
> > I'm very happy to have found you!!! 
> > 
> > Thank you really much for your time!
> 
> 
> [1] https://specs.openstack.org/openstack/nova-specs/readme.html
> [2] https://specs.openstack.org/openstack/blazar-specs/specs/ussuri/blazar-preemptible-instances.html
> 
> > Best regards
> > Levon
> > 
> > -----Original Message-----
> > From: Stephen Finucane <stephenfin at redhat.com> 
> > Sent: Monday, 31 May 2021 12:34
> > To: Levon Melikbekjan <levonmelikbekjan at yahoo.de>; openstack at lists.openstack.org
> > Subject: Re: Customization of nova-scheduler
> > 
> > On Wed, 2021-05-26 at 22:46 +0200, Levon Melikbekjan wrote:
> > > Hello Openstack team,
> > > 
> > > is it possible to customize the nova-scheduler via Python? If yes, how? 
> > 
> > Yes, you can provide your own filters and weighers. This is documented at [1].
> > 
> > Hope this helps,
> > Stephen
> > 
> > [1] https://docs.openstack.org/nova/latest/user/filter-scheduler#writing-your-own-filter
> > 
> > > 
> > > Best regards
> > > Levon
> > > 
> > 
> > 
> 
> 
> 




