[nova] Scheduler Optimiser

Alvaro Soto alsotoes at gmail.com
Mon Jul 3 20:08:27 UTC 2023


Nice idea, but my main question is: how do you plan to beat the
schedulers implemented currently?

I have been doing a little research myself into techniques that try to
beat random resource-allocation schedulers.

Can you share more about your research and/or implementation idea?

Cheers.

---
Alvaro Soto.

Note: My work hours may not be your work hours. Please do not feel the need
to respond during a time that is not convenient for you.
----------------------------------------------------------
Great people talk about ideas,
ordinary people talk about things,
small people talk... about other people.

On Mon, Jul 3, 2023, 2:00 PM Dominik Danelski <ddanelski at cloudferro.com>
wrote:

>
> Hello,
>
>
> I would like to introduce you to the tool developed under the working
> title "Scheduler Optimiser". It is meant to test the effectiveness of
> different Scheduler configurations, both weights and filters, on a
> given list of VM orders and in a semi-realistic infrastructure.
>
> My company - CloudFerro - has been developing it in-house for the last
> few months and foresees publishing the project as FOSS once it reaches
> the MVP stage. To make the final result more useful to the community
> and speed up the development (and release), I humbly ask for your
> expertise:
> Are you aware of previous similar efforts? Do you notice some flaws in
> the current approach? What, in your opinion, are more important aspects
> of the infrastructure behaviour, and what can be relatively safely
> ignored in terms of the effect on Scheduler results/allocation?
>
>
> Project objectives:
>
>   * Use Devstack (or another OpenStack deployer) with a real Scheduler
>     to replay a list of compute VM orders, either real from one's
>     infrastructure or artificially created.
>   * Assess the effectiveness of the scheduling in various terms like:
>     "How many machines of a given type can still be allocated at the
>     moment?" using plug-in "success meters". In a strict sense, the
>     project does not simulate THE Scheduler but interacts with it.
>   * Use fake-virt to emulate huge architectures on a relatively tiny
>     test bench.
>   * Require as few changes to Devstack's code as possible, and ideally
>     none that could not be included in the upstream repository. The
>     usage should be as simple as: 1. Install Devstack. 2. Configure
>     Devstack's cluster with its infrastructure information like flavours
>     and hosts. 3. Configure the Scheduler for a new test case. 4. Replay
>     VM orders. 5. Repeat steps 3 and 4 to find better Scheduler settings.
>   * Facilitate creating a minimal required setup of the test bench. Not
>     by replacing standard Devstack scripts, but mainly through tooling
>     for quick rebuilding data like flavours, infrastructure state, and
>     other factors relevant to the simulation.
>
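Step 3 above, configuring the Scheduler for a new test case, comes down to editing nova.conf between runs. As a rough illustration, a candidate configuration to evaluate might adjust the standard `[filter_scheduler]` options; the multiplier values below are arbitrary examples, not recommendations:

```ini
# nova.conf - one candidate Scheduler configuration to evaluate
[filter_scheduler]
# Filters decide which hosts are eligible at all.
enabled_filters = ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter
# Weighers rank the remaining hosts; multipliers shift their influence.
ram_weight_multiplier = 2.0
cpu_weight_multiplier = 1.0
# Penalise hosts that recently failed to build a VM.
build_failure_weight_multiplier = 1000000.0
```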
>
> Outside of the scope:
>
>   * Running continuous analysis on the production environment, even if
>     some plug-ins could be extracted for this purpose.
>   * Retaining information about users and projects when replaying orders.
>   * (Probably / low priority) replaying actions other than VM
>     creation/deletion, as they form a minority of operations and
>     ignoring them should not have a noticeable effect on the comparison
>     experiments.
>
>
> Current state:
>
>     Implemented:
>
>   * Recreating flavours from JSON file exported via OpenStack CLI.
>   * Replaying a list of orders in the form of (creation_date,
>     termination_date, resource_id (optional), flavor_id) with basic
>     flavour properties like VCPU, RAM, and DISK GB. The orders are
>     replayed consecutively.
>   * A plug-in success-rater mechanism that runs rater classes (each
>     returning a quantified success measure) after every VM add/delete
>     action and retains their intermediate history and "total success";
>     how the total is defined is implementation-dependent. The first
>     classes interact with Placement, answering questions like: "How
>     many VMs of flavour x (with basic parameters for now) can fit in
>     the cluster?" or "How many hosts are empty?"
>
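The replay loop and rater plug-ins described above could be sketched roughly as follows. All names here are hypothetical: the real tool interacts with an actual Scheduler and Placement, while this self-contained sketch stands in a `place` callback for the Scheduler and a plain dict for the cluster state:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Order:
    creation_date: int            # simplified integer timestamps
    termination_date: int
    flavor_id: str
    resource_id: Optional[str] = None


class SuccessRater:
    """Plug-in base class: rate() returns a quantified success measure."""

    def __init__(self):
        self.history = []         # intermediate measures, one per action

    def rate(self, cluster):
        raise NotImplementedError

    def record(self, cluster):
        self.history.append(self.rate(cluster))

    def total(self):
        # "Total success" is implementation-dependent; default: last value.
        return self.history[-1]


class EmptyHostsRater(SuccessRater):
    """Success meter: how many hosts are completely empty?"""

    def rate(self, cluster):
        return sum(1 for vms in cluster.values() if not vms)


def replay(orders, cluster, raters, place):
    """Replay orders consecutively; run every rater after each action.

    `place` stands in for the real Scheduler: it picks a host for an order.
    `cluster` maps host name -> list of order indices running there.
    """
    # Turn (creation, termination) pairs into one event stream, replayed
    # in order; absolute times only matter for the ordering of events.
    events = []
    for i, order in enumerate(orders):
        events.append((order.creation_date, "create", i))
        events.append((order.termination_date, "delete", i))
    for _time, action, i in sorted(events):
        if action == "create":
            cluster[place(cluster, orders[i])].append(i)
        else:
            for vms in cluster.values():
                if i in vms:
                    vms.remove(i)
                    break
        for rater in raters:
            rater.record(cluster)


# Tiny demonstration: one order replayed on a two-host cluster.
cluster = {"host1": [], "host2": []}
rater = EmptyHostsRater()
replay([Order(0, 2, "m1.small")], cluster, [rater],
       place=lambda c, o: "host1")
print(rater.history)   # [1, 2]: one empty host after create, two after delete
```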
>
> Missing:
>
>   * Recreating hosts; note the fake-virt remark from "Risks and
>     Challenges".
>   * Tools facilitating Scheduler configuration.
>   * Creating VMs with more parameters like VGPU, traits, and aggregates.
>   * (Lower priority) saving the intermediate state of the cluster
>     during the simulation (i.e. allocations) so it can be analysed
>     without rerunning the experiment. Currently, only the quantified
>     meters are saved.
>   * Failing gracefully and saving all information in case of resource
>     depletion: this is close to completion; only handling one exception
>     type in the upper layers is still needed.
>   * More success meters.
>
>
> Risks and Challenges:
>
>   * Currently, the tool replays actions one by one: it waits for each
>     creation and deletion to complete before running the success raters
>     and taking the next order. Thus, the order of actions matters, but
>     not their absolute time or temporal density. This might skip some
>     side effects of a realistic execution.
>   * Similarly to the above, fake-virt provides simple classes that will
>     not reproduce some behaviours of real-world hypervisors. For
>     example, the real Scheduler explicitly avoids hosts that recently
>     failed to allocate a VM, but fake-virt will most likely not mock
>     such behaviour.
>   * Fake-virt should reproduce a realistically diverse infrastructure
>     instead of x copies of the same flavour. This might be the only,
>     but a very important, change to the OpenStack codebase. If
>     successful, it could benefit other projects and tests as well.
>
>
> Even though the list of missing features may seem long, the most
> important parts of the program are already in place, so we hope to
> finish the MVP development in a relatively short time. We are going to
> publish it as FOSS in either case, but, as mentioned, your observations
> would be very much welcome at this stage. I am also open to answering
> more questions about the project.
>
>
> Kind regards
>
> Dominik Danelski
>
>

