[openstack-dev] [Congress] Re: Placement and Scheduling via Policy
Tim Hinrichs
thinrichs at vmware.com
Mon Jan 5 20:12:59 UTC 2015
Hi all,
Happy new year! (I added the openstack mailing list back into the thread.) Responses inline.
On Jan 5, 2015, at 7:40 AM, <ruby.krishnaswamy at orange.com> wrote:
Hello,
Happy New Year to all !
@Ramki: yes, please do send the original word document.
@Tim from your mail: “My gut tells me that the placement-solver should basically say ‘I enforce policies having to do with the schema nova:location.’ This way the Congress policy engine knows to give it policies relevant to nova:location (placement). If we do that, I believe we can carve off the right sub theory.”
=> (for my understanding) Do you mean: The “solver” should take over derivation of ‘tables’ whose body contains an atom of one of the following?
o nova:location table,
o A table defined by other rules using nova:location (transitively)
o A builtin using nova:location
=> So such (head) rules will get compiled into code to invoke the solver?
In Congress the ‘error’ table is special. So conceptually yes—all errors that can be defined (transitively) in terms of nova:location and builtins like less-than, greater-than, etc. There is definitely a technical question to be answered here—what fragment of Datalog can we send the solver? Can the solver handle negation? Which builtins can it handle? But I’d suggest we start with a slightly simpler case and work our way up from there. For example, here’s an anti-affinity policy.
// anti-affinity
error(vm1) :-
    affinity_group(vm1, grp1),
    affinity_group(vm2, grp2),
    grp1 != grp2,
    nova:location(vm1, loc1),
    nova:location(vm2, loc2),
    loc1 == loc2
This error is defined using nova:location and the builtins == and !=. But it also uses the affinity_group table, which would need to be communicated in some way to the solver.
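One way to picture "all errors defined (transitively) in terms of nova:location" is as reachability over the rule dependency graph. The following is a minimal Python sketch, not Congress code: the rule representation (a head table name plus the list of table names in its body), the sample table names, and the function `tables_depending_on` are all made up for illustration.

```python
def tables_depending_on(rules, target):
    """Return the head tables defined, directly or transitively, via `target`.

    `rules` is a list of (head_table, [body_tables]) pairs.
    """
    dependent = set()
    changed = True
    while changed:  # iterate to a fixpoint
        changed = False
        for head, body in rules:
            if head in dependent:
                continue
            if any(t == target or t in dependent for t in body):
                dependent.add(head)
                changed = True
    return dependent


# Hypothetical policy: table names only, arguments omitted.
rules = [
    ("same_host", ["nova:location", "nova:location"]),
    ("error", ["affinity_group", "affinity_group", "same_host"]),
    ("error", ["neutron:port", "unsecured_port"]),
    ("unsecured_port", ["neutron:port"]),
]

placement_tables = tables_depending_on(rules, "nova:location")
# Rules a placement engine would be handed: those whose body touches
# nova:location or a table derived from it.
placement_rules = [
    (head, body)
    for head, body in rules
    if any(t == "nova:location" or t in placement_tables for t in body)
]
print(sorted(placement_tables))
print(placement_rules)
```

Note that the second `error` rule is left with the general-purpose engine because nothing in its body depends on placement. Negation and builtins would need extra care that this sketch does not attempt.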
Q: How to set the optimization goal?
What does “theory” correspond to?
@Tim: (for understanding): If we take only the anti-affinity “policy” enforcement, how can this be programmed as a policy using only Datalog?
=> Place VM1 on hosts without breaching anti-affinity policy (for simplicity I am assuming that this is case of proactive placement without needing migration)
=> ha_safe_assignments(VM1, host) :- nova:location(vm, host), not_same_ha_group(vm, VM1)
I think I answered this above (by happenstance, actually :)).
Then ha_safe_assignments will give a table of possible (feasible) assignments?
In terms of outputs...
If we’re integrating with the nova scheduler project, then we may not need to worry about this since once it’s given the right policy, it knows how to construct the appropriate linear program and utilize the outputs of that program to do VM placement. I imagine they could enforce this policy proactively.
If we’re not using the Nova scheduler project, then we need to write an algorithm that takes the Datalog policy as input, constructs the linear program, and then utilizes the output to do VM placement. As a first cut I’d suggest focusing on reactive enforcement since Ramki found the Nova API to migrate a VM.
=> In the case where an “optimal” (according to the goal policy) assignment is required
=> While we are talking of invoking solvers (LP, constraints, etc.), one may also want to implement some greedy algorithm to find the best assignment?
Do we want to take this into consideration?
I think we want to focus on getting correct answers first and worry about optimization second.
@Prabhakar: Did you mean that we need to first work out the following:
- INPUT : Goal to solve
How to set the goal (e.g. maximize server occupancy) through Datalog?
Then generate code (for goals and constraints below) for something like PuLP?
- INPUT : Constraints to solver
How to identify solver variables. E.g. for the ha assignment, we will want to set the following constraint:
Only one VM of a given group should be assigned to a host.
e.g. if Xij indicates that VM_i is assigned to host_j, and Gin indicates that VM_i belongs to group_n, then we will want, for each host j and each group n, the sum over i in 1…max_vm of Xij*Gin to be at most 1.
This is useful for me to see how you would create the matrix. The question is now how we write the algorithm that analyzes the Datalog above, identifies the solver variables, and generates the constraint. Once we’ve done that, we would also want to be careful to ensure that if there are already VMs deployed, we try to fix a policy violation with a small number of migrations.
- OUTPUT: matrix/table of assignment
Will be “materialized” to a
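The group constraint discussed in this exchange can be checked mechanically once X and G are written down as 0/1 matrices. Below is a minimal Python sketch, assuming X[i][j] = 1 when VM i is on host j and G[i][n] = 1 when VM i is in group n, and reading "only one VM of a given group per host" as "at most one". The sample data and function names are illustrative only.

```python
def group_host_counts(X, G):
    """counts[j][n] = sum over i of X[i][j] * G[i][n]."""
    n_vms, n_hosts, n_groups = len(X), len(X[0]), len(G[0])
    return [
        [sum(X[i][j] * G[i][n] for i in range(n_vms)) for n in range(n_groups)]
        for j in range(n_hosts)
    ]


def violations(X, G):
    """(host, group) pairs where a host holds more than one VM of a group."""
    counts = group_host_counts(X, G)
    return [
        (j, n)
        for j, row in enumerate(counts)
        for n, count in enumerate(row)
        if count > 1
    ]


# 3 VMs, 2 hosts, 1 group; VMs 0 and 1 belong to group 0.
G = [[1], [1], [0]]
X_bad = [[1, 0], [1, 0], [0, 1]]  # VMs 0 and 1 share host 0
X_ok = [[1, 0], [0, 1], [0, 1]]   # VM 1 moved to host 1

print(violations(X_bad, G))
print(violations(X_ok, G))
```

A real translation would hand the same sums to a solver as linear constraints over the X variables rather than checking a fixed assignment.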
BTW: Can we clarify: “communicates with other domain specific policy engines (such as solver)”
- Is a solver considered as a domain specific policy engine? Or is a solver a mechanism/tool (providing algorithms) to calculate the values of variables such that some configured “policies” are not violated and that some configured optimization goal is satisfied?
I’d say…
“solver”: a tool/algorithm that assigns values to variables to achieve some goal (like a linear solver, SAT solver, SMT solver)
“policy engine”: a datacenter service that monitors/enforces/audits a given policy
- Instead: can the solver be considered as ‘extending’ the evaluation strategy used by the Datalog engine (by invoking a solver module)?
Conceptually, I’d say that “communicating with other domain specific policy engines” implies performing some analysis on the Datalog policy (i.e. not just evaluation) and invoking an API call that sets policy in the other engine. This could be completely hidden from the policy writer—it’s just an implementation detail for how Congress monitors/enforces/audits policy.
“extending the evaluation strategy used by the Datalog engine” sounds more like we’re adding a new builtin to the policy language (like less-than, greater-than). This seems to require the policy-writer to learn about the extension and utilize it appropriately.
Tim
Regards
Ruby
From: Ramki Krishnan [mailto:ramk at Brocade.com]
Sent: Friday, December 19, 2014 16:42
To: Prabhakar Kudva; KRISHNASWAMY Ruby IMT/OLPS
Cc: Gokul B Kandiraju; Norival Figueira; Tim Hinrichs
Subject: RE: [Congress] Re: Placement and Scheduling via Policy
Makes sense Prabhakar.
We could capture Ruby’s additional use cases in the following document.
https://datatracker.ietf.org/doc/draft-krishnan-nfvrg-policy-based-rm-nfviaas/?include_text=1
Ruby – I would be glad to send you the original word document for the same so that you can make your edits.
We could kick off with an information model across congress and domain specific policy engines which could lead to a data model/protocol. Looks like Nova will be our first victim -☺. This could be another IETF draft.
Thanks,
Ramki
From: Prabhakar Kudva [mailto:kudva at us.ibm.com]
Sent: Friday, December 19, 2014 8:31 PM
To: Ramki Krishnan
Cc: Gokul B Kandiraju; Norival Figueira; ruby.krishnaswamy at orange.com; Tim Hinrichs
Subject: RE: [Congress] Re: Placement and Scheduling via Policy
Since this is an example of how Congress communicates with other domain specific policy engines
(such as solver) perhaps we could spend time defining the exact apis and mechanisms of how we
would generally go about it. That is we could standardize how a Congress policy gets translated
to domain-specific ones in general? The path from the use case Ruby described can be a driver
for such an exercise?
Prabhakar
From: Ramki Krishnan <ramk at Brocade.com>
To: "ruby.krishnaswamy at orange.com" <ruby.krishnaswamy at orange.com>, "Tim Hinrichs" <thinrichs at vmware.com>, Prabhakar Kudva/Watson/IBM at IBMUS
Cc: Gokul B Kandiraju/Watson/IBM at IBMUS, Norival Figueira <nfigueir at Brocade.com>
Date: 12/19/2014 07:38 AM
Subject: RE: [Congress] Re: Placement and Scheduling via Policy
________________________________
Hi Ruby,
Please find inline.
Thanks,
Ramki
From: ruby.krishnaswamy at orange.com [mailto:ruby.krishnaswamy at orange.com]
Sent: Friday, December 19, 2014 1:22 PM
To: Ramki Krishnan; Tim Hinrichs; Prabhakar Kudva
Cc: Gokul B Kandiraju; Norival Figueira
Subject: RE: [Congress] Re: Placement and Scheduling via Policy
Hello Ramki
“Ramki: I believe nova-scheduler (or for that matter individual sub-system placement/scheduling engines) should still exist. The new placement/scheduling engine which works across multiple sub-systems will over-ride the individual sub-system placement/scheduling engine as needed. These architectural aspects are captured in the following IETF draft”
OK, but I think that we should discuss further on what the “nova-scheduler” will be built on in the light of Congress policy engine.
Along the lines of this draft https://datatracker.ietf.org/doc/draft-norival-nfvrg-nfv-policy-arch/?include_text=1, it makes sense that the sub-systems (compute node) also rely on a policy engine. Taking your example (Platinum clients to be given platinum treatment):
=> Currently with Nova-scheduler, this lower policy will be implemented using the Host Aggregates mechanism.
=> But in going towards a convergent ‘policy’ architecture, it may make sense if subsystems also employ the same framework?
Completely agree. As you mentioned, we should avoid point solutions like the host aggregates mechanism in Nova. I think this aspect is definitely worth bringing up in our mailing list discussions.
Regards
Ruby
From: Ramki Krishnan [mailto:ramk at Brocade.com]
Sent: Friday, December 19, 2014 08:40
To: KRISHNASWAMY Ruby IMT/OLPS; Tim Hinrichs; Prabhakar Kudva
Cc: Gokul B Kandiraju; Norival Figueira
Subject: RE: [Congress] Re: Placement and Scheduling via Policy
Hi Ruby, Tim, All,
Great discussions. Please find my comments inline. I will be offline until next year, so happy new year to all in advance.
Thanks,
Ramki
From: ruby.krishnaswamy at orange.com [mailto:ruby.krishnaswamy at orange.com]
Sent: Wednesday, December 17, 2014 12:28 PM
To: Tim Hinrichs; Prabhakar Kudva
Cc: Ramki Krishnan; Gokul B Kandiraju
Subject: RE: [Congress] Re: Placement and Scheduling via Policy
Hi Tim & All
@Tim: I did not reply to openstack-dev. Do you think we could have an openstack list specific for “congress” to which anybody may subscribe?
Ramki: I think we should include NFV and Congress in the openstack-dev list.
1) Enforcement:
By this we mean “how will the actions computed by the policy engine be executed by the concerned OpenStack functional module”.
In this case, it is better to first work this out for a “simpler” case, e.g. your running example concerning the network/groups.
Note: some actions concern only some data base (e.g. insert the user within some group).
2) From Prabhakar’s mail
“Enforcement. That is with a large number of constraints in place for placement and
scheduling, how does the policy engine communicate and enforce the placement
constraints to nova scheduler.”
Nova scheduler (current): It assigns VMs to servers based on the policy set by the administrator (through filters and host aggregates).
The administrator also configures a scheduling heuristic (implemented as a driver), for example “round-robin” driver.
Then the computed assignment is sent back to the requestor (API server) that interacts with nova-compute to provision the VM.
The current nova-scheduler has another function: It updates the allocation status of each compute node on the DB (through another indirection called nova-conductor)
So it is correct to re-interpret your statement as follows:
- What is the entity with which the policy engine interacts for either proactive or reactive placement management?
- How will the output from the policy engine (for example the placement matrix) be communicated back?
o Proactive: this gives the mapping of VM to host
o Reactive: this gives the new mapping of running VMs to hosts
- How, starting from the placement matrix, will the correct migration plan be executed? (for the reactive case)
3) Currently openstack does not have “automated management of reactive placement”: Hence if the policy engine is used for reactive placement, then there is a need for another “orchestrator” that can interpret the new proposed placement configuration (mapping of VM to servers) and execute the reconfiguration workflow.
Ramki: For arriving at the new proposed placement configuration, the policy engine needs to know the current resource mapping of all the sub-systems involved. Some ideas below on division of labor – this definitely needs more discussion.
- Orchestrator (openstack heat ?) – proactive placement
- Congress – reactive placement
- Resource manager (typically distributed across individual sub-systems such as nova, neutron …) – global read/write access
For a proof of concept, the simple approach discussed in the IETF draft https://datatracker.ietf.org/doc/draft-krishnan-nfvrg-policy-based-rm-nfviaas/?include_text=1 should suffice I guess.
4) So with a policy-based “placement engine” that is integrated with external solvers, then this engine will replace nova-scheduler?
Could we converge on this?
Ramki: I believe nova-scheduler (or for that matter individual sub-system placement/scheduling engines) should still exist. The new placement/scheduling engine which works across multiple sub-systems will over-ride the individual sub-system placement/scheduling engine as needed. These architectural aspects are captured in the following IETF draft
https://datatracker.ietf.org/doc/draft-norival-nfvrg-nfv-policy-arch/?include_text=1 (copying Norival for the same).
Regards
Ruby
From: Tim Hinrichs [mailto:thinrichs at vmware.com]
Sent: Tuesday, December 16, 2014 19:25
To: Prabhakar Kudva
Cc: KRISHNASWAMY Ruby IMT/OLPS; Ramki Krishnan (ramk at Brocade.com); Gokul B Kandiraju; openstack-dev
Subject: [Congress] Re: Placement and Scheduling via Policy
[Adding openstack-dev to this thread. For those of you just joining… We started kicking around ideas for how we might integrate a special-purpose VM placement engine into Congress.]
Kudva: responses inline.
On Dec 16, 2014, at 6:25 AM, Prabhakar Kudva <kudva at us.ibm.com> wrote:
Hi,
I am very interested in this.
So, it looks like there are two parts to this:
1. Policy analysis when there are a significant mix of logical and builtin predicates (i.e.,
runtime should identify a solution space when there are arithmetic operators). This will
require linear programming/ILP type solvers. There might be a need to have a function
in runtime.py that specifically deals with this (Tim?)
I think it’s right that we expect there to be a mix of builtins and standard predicates. But what we’re considering here is having the linear solver be treated as if it were a domain-specific policy engine. So that solver wouldn’t be embedded into the runtime.py necessarily. Rather, we’d delegate part of the policy to that domain-specific policy engine.
2. Enforcement. That is with a large number of constraints in place for placement and
scheduling, how does the policy engine communicate and enforce the placement
constraints to nova scheduler.
I would imagine that we could delegate either enforcement or monitoring or both. Eventually we want enforcement here, but monitoring could be useful too.
And yes you’re asking the right questions. I was trying to break the problem down into pieces in my bullet (1) below. But I think there is significant overlap in the questions we need to answer whether we’re delegating monitoring or enforcement.
Both of these require some form of mathematical analysis.
Would be happy and interested to discuss more on these lines.
Maybe take a look at how I tried to break down the problem into separate questions in bullet (1) below and see if that makes sense.
Tim
Prabhakar
From: Tim Hinrichs <thinrichs at vmware.com>
To: "ruby.krishnaswamy at orange.com" <ruby.krishnaswamy at orange.com>
Cc: "Ramki Krishnan (ramk at Brocade.com)" <ramk at Brocade.com>, Gokul B Kandiraju/Watson/IBM at IBMUS, Prabhakar Kudva/Watson/IBM at IBMUS
Date: 12/15/2014 12:09 PM
Subject: Re: Placement and Scheduling via Policy
________________________________
[Adding Prabhakar and Gokul, in case they are interested.]
1) Ruby, thinking about the solver as taking 1 matrix of [vm, server] and returning another matrix helps me understand what we’re talking about—thanks. I think you’re right that once we move from placement to optimization problems in general we’ll need to figure out how to deal with actions. But if it’s a placement-specific policy engine, then we can build VM-migration into it.
It seems to me that the only part left is figuring out how to take an arbitrary policy, carve off the placement-relevant portion, and create the inputs the solver needs to generate that new matrix. Some thoughts...
- My gut tells me that the placement-solver should basically say “I enforce policies having to do with the schema nova:location.” This way the Congress policy engine knows to give it policies relevant to nova:location (placement). If we do that, I believe we can carve off the right sub theory.
- That leaves taking a Datalog policy where we know nova:location is important and converting it to the input language required by a linear solver. We need to remember that the Datalog rules may reference tables from other services like Neutron, Ceilometer, etc. I think the key will be figuring out what class of policies we can actually do that for reliably. Cool—a concrete question.
2) We can definitely wait until January on this. I’ll be out of touch starting Friday too; it seems we all get back early January, which seems like the right time to resume our discussions. We have some concrete questions to answer, which was what I was hoping to accomplish before we all went on holiday.
Happy Holidays!
Tim
On Dec 15, 2014, at 5:53 AM, <ruby.krishnaswamy at orange.com> wrote:
Hi Tim
“Questions:
1) Is there any more data the solver needs? Seems like it needs something about CPU-load for each VM.
2) Which solver should we be using? What does the linear program that we feed it look like? How do we translate the results of the linear solver into a collection of ‘migrate_VM’ API calls?”
Question (2) seems to me the first to address, in particular:
“how to prepare the input (variables, constraints, goal) and invoke the solver”
=> We need rules that represent constraints to give the solver (e.g. a technical constraint that a VM should not be assigned to more than one server, or that no more than the maximum resource (cpu / mem …) of a server can be assigned).
“how to translate the results of the linear solver into a collection of API calls”:
=> The output from the “solver” will give the new placement plan (respecting the constraints in input)?
o E.g. a table of [vm, server, true/false]
=> Then this depends on how “action” is going to be implemented in Congress (whether an external solver is used or not)
o Is the action presented as the “final” DB rows that the system must produce as a result of the actions?
o E.g. if current vm table is [vm3, host4] and the recomputed row says [vm3, host6], then the action is to move vm3 to host6?
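The row comparison described in this question can be sketched as a simple diff between the current and recomputed placement tables. This is only an illustration in Python: the dictionary representation and the function name `migration_actions` are assumptions, and no Nova or Congress API is called.

```python
def migration_actions(current, proposed):
    """Return (vm, source_host, destination_host) for each VM whose host changes."""
    actions = []
    for vm, new_host in sorted(proposed.items()):
        old_host = current.get(vm)
        # Only running VMs that actually change host need a migration.
        if old_host is not None and old_host != new_host:
            actions.append((vm, old_host, new_host))
    return actions


current = {"vm1": "host4", "vm3": "host4"}
proposed = {"vm1": "host4", "vm3": "host6"}
print(migration_actions(current, proposed))
```

Each resulting tuple would then have to be turned into whatever migration call the enforcement layer exposes. Ordering the moves so that intermediate states stay within capacity is a separate problem that this sketch ignores.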
“how will the solver be invoked”?
=> When will the optimization call be invoked?
=> Is it “batched”, e.g. periodically invoke Congress to compute new assignments?
Which solver to use:
http://www.coin-or.org/projects/ and http://www.coin-or.org/projects/PuLP.xml
I think it may be useful to pass through an interface (e.g. LP modeler to generate LP files in standard formats accepted by prevalent solvers)
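To make the "LP file" idea concrete, here is a minimal Python sketch that writes the two technical constraints mentioned earlier in this message (each VM on exactly one server, server capacity not exceeded) in the CPLEX LP text format, which many solvers can read. The variable naming scheme, the sample loads, and the "minimize the number of servers in use" objective are assumptions made for the example. No solver is invoked.

```python
def placement_lp(vm_load, server_capacity):
    """Build LP-format text for a small VM placement problem.

    x_<vm>_<server> = 1 if the VM is placed on the server;
    y_<server> = 1 if the server hosts at least one VM.
    """
    vms, servers = sorted(vm_load), sorted(server_capacity)
    lines = ["Minimize", " obj: " + " + ".join(f"y_{s}" for s in servers)]
    lines.append("Subject To")
    for v in vms:
        # Each VM is assigned to exactly one server.
        terms = " + ".join(f"x_{v}_{s}" for s in servers)
        lines.append(f" one_server_{v}: {terms} = 1")
    for s in servers:
        # Total load placed on a server stays within its capacity.
        terms = " + ".join(f"{vm_load[v]} x_{v}_{s}" for v in vms)
        lines.append(f" capacity_{s}: {terms} <= {server_capacity[s]}")
        for v in vms:
            # A server counts as used as soon as any VM is placed on it.
            lines.append(f" used_{v}_{s}: x_{v}_{s} - y_{s} <= 0")
    binaries = [f"x_{v}_{s}" for v in vms for s in servers]
    binaries += [f"y_{s}" for s in servers]
    lines.append("Binary")
    lines.append(" " + " ".join(binaries))
    lines.append("End")
    return "\n".join(lines)


lp_text = placement_lp({"vm1": 2, "vm2": 3}, {"s1": 4, "s2": 4})
print(lp_text)
```

Going through a text format like this keeps the policy-to-program translation independent of any particular solver, which seems to be the point of the suggestion above.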
The mathematical program:
We can (Orange) contribute to writing down in an informal way the program for this precise use case, if this can wait until January.
Perhaps the objective could be to “minimize the number of servers whose usage is less than 50%”, since the original policy “Not more than 1 server of type1 to have a load under 50%” need not necessarily have a solution.
This may help to derive the “mappings” from Congress (rules to program equations, intermediary tables to program variables)?
For “migration” use case: it may be useful to add some constraint representing cost of migration, such that the solver computes the new assignment plan such that the maximum migration cost is not exceeded. To start with, perhaps number of migrations?
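As a rough illustration of combining the "minimize under-used servers" objective with a cap on the number of migrations, here is a brute-force Python sketch for tiny instances. It is not a linear program and would not scale. The loads, the rule that an empty server does not count as under-used, and the function name `best_plan` are all assumptions of the example.

```python
from itertools import product


def best_plan(vm_load, capacity, current, max_migrations):
    """Exhaustively pick the plan with the fewest under-used servers.

    A server counts as under-used when it hosts at least one VM and its
    load is below 50% of its capacity. Ties are broken by the number of
    migrations. Returns ((under_used, migrations), plan) or None.
    """
    vms, servers = sorted(vm_load), sorted(capacity)
    best = None
    for hosts in product(servers, repeat=len(vms)):
        plan = dict(zip(vms, hosts))
        load = {s: 0.0 for s in servers}
        for vm, s in plan.items():
            load[s] += vm_load[vm]
        if any(load[s] > capacity[s] for s in servers):
            continue  # capacity constraint violated
        migrations = sum(1 for vm in vms if plan[vm] != current[vm])
        if migrations > max_migrations:
            continue  # migration budget exceeded
        under_used = sum(1 for s in servers if 0 < load[s] < 0.5 * capacity[s])
        key = (under_used, migrations)
        if best is None or key < best[0]:
            best = (key, plan)
    return best


vm_load = {"vm_a": 25.0, "vm_b1": 12.5, "vm_b2": 12.5, "vm_b3": 12.5}
capacity = {"server-a": 100.0, "server-b": 100.0}
current = {"vm_a": "server-a", "vm_b1": "server-b",
           "vm_b2": "server-b", "vm_b3": "server-b"}

print(best_plan(vm_load, capacity, current, max_migrations=1))
print(best_plan(vm_load, capacity, current, max_migrations=0))
```

With a budget of one migration the search empties server-a by moving its single VM. With a budget of zero it can only report the current, violating placement, which also illustrates why the objective is stated as a minimization rather than a hard constraint.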
I will be away from the end of the week until 5th January. I will also discuss with colleagues to see how we can formalize contribution (congress+nfv poc).
Rgds
Ruby
From: Tim Hinrichs [mailto:thinrichs at vmware.com]
Sent: Friday, December 12, 2014 19:41
To: KRISHNASWAMY Ruby IMT/OLPS
Cc: Ramki Krishnan (ramk at Brocade.com)
Subject: Re: Placement and Scheduling via Policy
There’s a ton of good stuff here!
So if we took Ramki’s initial use case and combined it with Ruby’s HA constraint, we’d have something like the following policy.
// anti-affinity
error(server, VM1, VM2) :-
    same_ha_group(VM1, VM2),
    nova:location(VM1, server),
    nova:location(VM2, server)

// server-utilization
error(server) :-
    type1_server(server),
    ceilometer:average_utilization(server, "cpu-util", avg),
    avg < 50
As a start, this seems plenty complex to me. anti-affinity is great b/c it DOES NOT require a sophisticated solver; server-utilization is great because it DOES require a linear solver.
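To show what these two rules compute over concrete tables, here is a small Python sketch that evaluates them directly (pure monitoring, no solver). The in-memory table representation, the sample rows, and the function names are invented for the example. Congress itself evaluates such rules with its Datalog engine.

```python
def anti_affinity_errors(location, same_ha_group):
    """error(server, VM1, VM2): two VMs of one HA group share a server."""
    return sorted(
        (location[vm1], vm1, vm2)
        for vm1, vm2 in same_ha_group
        if vm1 in location and vm2 in location
        and location[vm1] == location[vm2]
    )


def utilization_errors(type1_servers, average_cpu_util):
    """error(server): a type-1 server whose average CPU utilization is below 50."""
    return sorted(
        s for s in type1_servers
        if s in average_cpu_util and average_cpu_util[s] < 50
    )


# nova:location(vm, server)
location = {"vm1": "server-a", "vm2": "server-a", "vm3": "server-b"}
# same_ha_group(vm1, vm2)
same_ha_group = [("vm1", "vm2")]
# type1_server(server)
type1_servers = {"server-a", "server-b"}
# ceilometer:average_utilization(server, "cpu-util", avg)
average_cpu_util = {"server-a": 62.5, "server-b": 37.5}

print(anti_affinity_errors(location, same_ha_group))
print(utilization_errors(type1_servers, average_cpu_util))
```

Detecting the violations is the easy half. Deciding which migrations remove both kinds of error at once is where a solver comes in.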
Data the solver needs:
- Ceilometer: cpu-utilization for all the servers
- Nova: data as to where each VM is located
- Policy: high-availability groups
Questions:
1) Is there any more data the solver needs? Seems like it needs something about CPU-load for each VM.
2) Which solver should we be using? What does the linear program that we feed it look like? How do we translate the results of the linear solver into a collection of ‘migrate_VM’ API calls?
Maybe another few emails and then we set up a phone call.
Tim
On Dec 11, 2014, at 1:33 AM, <ruby.krishnaswamy at orange.com> wrote:
Hello
A) First a small extension to the use case that Ramki proposes
- Add high availability constraint.
- Assuming server-a and server-b are of same size and same failure model.
[Later: Assumption of identical failure rates can be loosened.
Instead of considering only servers as failure domains, can introduce other failure domains ==> not just an anti-affinity policy but a calculation from 99,99.. requirement to VM placements, e.g.
]
- For an exemplary maximum usage scenario, 53 physical servers could be under peak utilization (100%), 1 server (server-a) could be under partial utilization (50%) with 2 instances of type large.3 and 1 instance of type large.2, and 1 server (server-b) could be under partial utilization (37.5%) with 3 instances of type large.2.
Call VM.one.large2 as the large2 VM in server-a
Call VM.two.large2 as one of the large2 VM in server-b
- VM.one.large2 and VM.two.large2
- When one of the large.3 instances mapped to server-a is deleted from physical server type 1, Policy 1 will be violated, since the overall utilization of server-a falls to 37.5%.
- Various new placement(s) are described below
VM.two.large2 must not be moved. Moving VM.two.large2 breaks the anti-affinity constraint.
error(server1, VM1, VM2) :-
    node(VM1, server1),
    node(VM2, server2),
    same_ha_group(VM1, VM2),
    equal(server1, server2)
1) New placement 1: Move 2 instances of large.2 to server-a. Overall
utilization of server-a - 50%. Overall utilization of server-b -
12.5%.
2) New placement 2: Move 1 instance of large.3 to server-b. Overall
utilization of server-a - 0%. Overall utilization of server-b -
62.5%.
3) New placement 3: Move 3 instances of large.2 to server-a. Overall
utilization of server-a - 62.5%. Overall utilization of server-b -
0%.
New placements 2 and 3 could be considered optimal, since they
achieve maximal bin packing and open up the door for turning off
server-a or server-b and maximizing energy efficiency.
But new placement 3 breaks client policy.
BTW: what happens if a given situation does not allow the policy violation to be removed?
B) Ramki’s original use case can itself be extended:
Adding additional constraints to the previous use case due to cases such as:
- Server heterogeneity
- CPU “pinning”
- “VM groups” (and allocation
- Application interference
- Refining on the statement “instantaneous energy consumption can be approximately measured using an overall utilization metric, which is a combination of CPU utilization, memory usage, I/O usage, and network usage”
Let me know if this will interest you. Some (e.g. application interference) will need some time, e.g. benchmarking/profiling to classify VMs.
C) New placement plan execution
- In Ramki’s original use case, violation is detected at events such as VM delete.
While certainly this by itself is sufficiently complex, we may need to consider other triggering cases (periodic or when multiple VMs are deleted/added)
- In this case, it may not be sufficient to compute the new placement plan that brings the system to a configuration that does not break policy, but also add other goals
D) Let me know if a use case such as placing “video conferencing servers” (geographically distributed clients) would suit you (multi site scenario)
=> Or is it too premature?
Ruby
From: Tim Hinrichs [mailto:thinrichs at vmware.com]
Sent: Wednesday, December 10, 2014 19:44
To: KRISHNASWAMY Ruby IMT/OLPS
Cc: Ramki Krishnan (ramk at Brocade.com)
Subject: Re: Placement and Scheduling via Policy
Hi Ruby,
Whatever information you think is important for the use case is good. Section 3 from one of the docs Ramki sent you covers his use case.
https://datatracker.ietf.org/doc/draft-krishnan-nfvrg-policy-based-rm-nfviaas/?include_text=1
From my point of view, the key things for the use case are…
- The placement policy (i.e. the conditions under which VMs require migration).
- A description of how we want to compute what specific migrations should be performed (a sketch of (1) the information that we need about current placements, policy violations, etc., and (2) what systems/algorithms/etc. can utilize that input to figure out what migrations to perform).
I think we want to focus on the end-user/customer experience (write a policy, and watch the VMs move around to obey that policy in response to environment changes) and then work out the details of how to implement that experience. That’s why I didn’t include things like delays, asynchronous/synchronous, architecture, applications, etc. in my 2 bullets above.
Tim
On Dec 10, 2014, at 8:55 AM, <ruby.krishnaswamy at orange.com> wrote:
Hi Ramki, Tim
By a “format” for describing use cases, I meant to ask what sets of information to provide, for example,
- what granularity in description of use case?
- a specific placement policy (and perhaps citing reasons for needing such policy)?
- Specific applications
- Requirements on the placement manager itself (delay, …)?
o Architecture as well
- Specific services from the placement manager (using Congress), such as,
o Violation detection (load, security, …)
- Adapting (e.g. context-aware) of policies used
In any case I will read the documents that Ramki has sent to not resend similar things.
Regards
Ruby
From: Ramki Krishnan [mailto:ramk at Brocade.com]
Sent: Wednesday, December 10, 2014 16:59
To: Tim Hinrichs; KRISHNASWAMY Ruby IMT/OLPS
Cc: Norival Figueira; Pierre Ettori; Alex Yip; dilikris at in.ibm.com
Subject: RE: Placement and Scheduling via Policy
Hi Tim,
This sounds like a plan. It would be great if you could add the links below to the Congress wiki. I am all for discussing this in the openstack-dev mailing list and at this point this discussion is completely open.
IRTF NFVRG Research Group: https://trac.tools.ietf.org/group/irtf/trac/wiki/nfvrg
IRTF NFVRG draft on NFVIaaS placement/scheduling (includes system analysis for the PoC we are thinking): https://datatracker.ietf.org/doc/draft-krishnan-nfvrg-policy-based-rm-nfviaas/?include_text=1
IRTF NFVRG draft on Policy Architecture and Framework (looking forward to your comments and thoughts): https://datatracker.ietf.org/doc/draft-norival-nfvrg-nfv-policy-arch/?include_text=1
Hi Ruby,
Looking forward to your use cases.
Thanks,
Ramki
_________________________________________________________________________________________________________________________
Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.
This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.