[openstack-dev] [Congress][Delegation] Initial workflow design
Tim Hinrichs
thinrichs at vmware.com
Mon Mar 2 17:28:49 UTC 2015
Inline.
On Feb 26, 2015, at 3:32 PM, Ramki Krishnan <ramk at Brocade.com> wrote:
1)
Ruby: One of the issues highlighted in OpenStack (the scheduler) and also elsewhere (e.g. Google's Omega scheduler) is the cost of reading “host utilization” state from the database (the nova:host table), the cost of DB updates, and the overhead of keeping in-memory state up to date.
=> This is expensive, and the current nova-scheduler does face this issue (many blogs/discussions).
While the first goal is a PoC, this will likely become a concern in terms of adoption.
Tim: So you’re saying we won’t have fresh enough data to make policy decisions? If the data changes so frequently that we can’t get an accurate view, then I’m guessing we shouldn’t be migrating based on that data anyway. Could you point me to some of these discussions?
Ramki: We have to keep in mind that VM migration could be an expensive operation depending on the size of the VM and various other factors; such an operation cannot be performed frequently.
2)
From the document: As soon as the subscription occurs, the DSE sends the VM-placement engine the current contents of those tables, and when these tables change, the DSE informs the VM-placement engine in the form of diffs (aka deltas or updates).
Ramki: Is the criteria for table change programmable? This would be useful to generate significant change events based on application needs.
Not as of now. We’ve kicked around the idea of changing a subscription from an entire table to an arbitrary slice of a table (expressed via Datalog). That functionality will be necessary for dealing with large datasources like Ceilometer. But we don’t have the design fleshed out or the people to build it.
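To make the subscription mechanism concrete, here is a minimal sketch (not actual DSE code; the function name and table contents are invented for illustration) of how a publisher could compute the diffs it pushes after the initial full-table publication:

```python
# Sketch: how a DSE publisher might compute the diffs ("deltas") it
# sends to a subscribed VM-placement engine.  Table rows are modeled
# as tuples; a diff is the pair (rows inserted, rows deleted).

def table_diff(old_rows, new_rows):
    """Return (to_insert, to_delete) between two table snapshots."""
    old, new = set(old_rows), set(new_rows)
    return new - old, old - new

# Two successive snapshots of a hypothetical nova:host slice.
snapshot1 = {("host1", 16000), ("host2", 32000)}
snapshot2 = {("host1", 16000), ("host3", 64000)}

inserted, deleted = table_diff(snapshot1, snapshot2)
```

The initial publication is just the special case where the old snapshot is empty, so every current row arrives as an insert.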
Tim
Thanks,
Ramki
From: Tim Hinrichs [mailto:thinrichs at vmware.com]
Sent: Thursday, February 26, 2015 10:17 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Congress][Delegation] Initial workflow design
Inline.
From: "ruby.krishnaswamy at orange.com" <ruby.krishnaswamy at orange.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Wednesday, February 25, 2015 at 8:53 AM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Congress][Delegation] Initial workflow design
Hi Tim, All,
1) Step 3: The VM-placement engine is also a “Datalog engine”. Right?
When policies are delegated:
When are policies inserted? Once the VM-placement engine has registered itself, are all relevant policies given to it?
“In our example, this would mean the domain-specific policy engine executes the following API call over the DSE”
=> “domain-agnostic” …
Done.
2) Step 4:
Ok
But finally: if Congress will likely “delegate”
Not sure what you’re suggesting here.
3) Step 5: Compilation of subpolicy to LP in VM-placement engine
For the PoC, it is likely that the LP program (in PuLP or some other modeling language) is *not* completely generated by the compiler/translator.
=> Right?
Where does the rest of the program originate? I’m not saying the entire LP program is generated from the Datalog constraints; some of it is generated by the solver independent of the Datalog. In the text, I gave the example of defining hMemUse[j].
You also indicate that some category of constraints must come from elsewhere (“the LP solver doesn’t know what the relationship between assign[i][j], hMemUse[j], and vMemUse[i] actually is, so the VM-placement engine must also include constraints”).
These constraints must be “explicitly” written? (e.g. max_ram_allocation, etc., which are constraints used in the solver-scheduler package).
The VM-placement engine does 2 things: (i) translates Datalog to LP and (ii) generates additional LP constraints. (Both work items could leverage any constraints that are built into a specific solver, e.g. the solver-scheduler. The point is that there are 2 distinct, conceptual origins of the LP constraints: those that represent the Datalog and those that codify the domain.)
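As an illustration of those two origins (all names and numbers here are hypothetical, and the constraints are written as plain Python predicates rather than real LP rows): the policy constraint comes from the Datalog (memory usage below 75% of capacity), while the domain constraints encode facts the LP solver cannot know, e.g. that hMemUse[j] is induced by assign[i][j] and vMemUse[i]:

```python
# Hypothetical data: per-VM memory demand and per-host capacity.
vMemUse = {"vm1": 4000, "vm2": 6000}
hMemCap = {"host1": 16000, "host2": 8000}

def hMemUse(assign, host):
    # Domain constraint: a host's memory use is the sum of the memory
    # of the VMs assigned to it (the relationship the solver can't guess).
    return sum(vMemUse[vm] for vm in vMemUse if assign[(vm, host)])

def satisfies_policy(assign):
    # Policy constraint translated from the Datalog: usage < 75% of capacity.
    return all(hMemUse(assign, h) < 0.75 * hMemCap[h] for h in hMemCap)

def valid_assignment(assign):
    # Domain constraint: every VM is placed on exactly one host.
    return all(sum(assign[(vm, h)] for h in hMemCap) == 1 for vm in vMemUse)

# Both VMs on host1: 10000 < 12000, so the policy holds.
good = {("vm1", "host1"): 1, ("vm2", "host1"): 1,
        ("vm1", "host2"): 0, ("vm2", "host2"): 0}
# Both VMs on host2: 10000 >= 6000, so the policy is violated.
bad  = {("vm1", "host1"): 0, ("vm2", "host1"): 0,
        ("vm1", "host2"): 1, ("vm2", "host2"): 1}
```

In a real engine these predicates would instead be emitted as linear rows for the solver, but the split between “from the Datalog” and “from the domain” is the same.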
So what “parts” will be generated:
Cost function:
Constraint from policy: memory usage < 75%
Then the rest should be “filled in”?
Could we convene on an intermediary “modeling language”?
@Yathi: do you think we could use something like AMPL? Is this proprietary?
A detail: the example “Y[host1] = hMemUse[host1] > 0.75 * hMemCap[host1]”
=> needs to be changed to a linear form (if mi – Mi > 0 then Yi = 1 else Yi = 0), so something like (mi – Mi) < 100 yi
Each domain-specific solver can do whatever it wants, so it’s not clear to me what the value of choosing a modeling language actually is—unless we want to build a library of common functionality that makes the construction of domain-specific engine (wrappers) easier. I’d prefer to spend our energy understanding whether the proposed workflow/interface works for a couple of different domain-specific policy engines OR to flesh this one out and build it.
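For what it's worth, the linearization Ruby suggests is the standard big-M trick. Assuming M is an upper bound on any host's possible memory use, the single linear constraint hMemUse[j] - 0.75*hMemCap[j] <= M*Y[j] forces Y[j] = 1 whenever the threshold is exceeded, and minimizing the sum of the Y's in the objective drives each Y[j] back to 0 otherwise. A quick numeric check of that logic (numbers hypothetical):

```python
# Big-M linearization of the indicator
#   Y[j] = (hMemUse[j] > 0.75 * hMemCap[j]).

def bigM_constraint_holds(h_mem_use, h_mem_cap, y, big_m):
    """True iff hMemUse - 0.75*hMemCap <= M*Y holds for one host."""
    return h_mem_use - 0.75 * h_mem_cap <= big_m * y

M = 16000  # safe upper bound on any host's memory use

# Host over threshold: use 13000 with cap 16000 (threshold 12000).
# Y = 0 violates the constraint, so a solver must set Y = 1.
y0_over = bigM_constraint_holds(13000, 16000, y=0, big_m=M)
y1_over = bigM_constraint_holds(13000, 16000, y=1, big_m=M)

# Host under threshold: Y = 0 is feasible, and minimizing sum(Y)
# in the objective selects it.
y0_under = bigM_constraint_holds(10000, 16000, y=0, big_m=M)
```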
4) Step 6: This is completely internal to the VM-placement engine (and we could say this is “transparent” to Congress)
We should allow configuration of a solver (this itself could be a policy ☺ )
How to invoke the solver API ?
The domain-specific placement engine could send the actions out to the DSE (action_handler: data)?
I had always envisioned the solver being just a library of code—not an entity that sits on the DSE itself.
3) Step 7 : Perform the migrations (according to the assignments computed in the step 6)
This part invokes OpenStack API (to perform migrations).
We may suppose that there are services implementing “action handlers”?
They can listen on the DSE and execute the actions.
That interface is supposed to exist by the Kilo release. I’ll check up on the progress.
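A hypothetical sketch of such an action handler (all names are invented for illustration; the real Kilo-era Congress and Nova interfaces may differ, and a real handler would call something like novaclient's servers.live_migrate):

```python
# Sketch: a service that consumes (vm, src, dst) migration actions
# published on the DSE and executes them via a Nova client.

class MigrationHandler:
    """Executes migration actions received over the DSE."""

    def __init__(self, nova_client):
        self.nova = nova_client
        self.log = []  # record of executed migrations

    def handle(self, action):
        vm, src, dst = action
        self.nova.live_migrate(vm, dst)  # perform the migration
        self.log.append((vm, src, dst))

class FakeNova:
    """Stand-in for a Nova client, used here only to exercise the sketch."""
    def __init__(self):
        self.calls = []
    def live_migrate(self, vm, dst):
        self.calls.append((vm, dst))

nova = FakeNova()
handler = MigrationHandler(nova)
handler.handle(("vm1", "host2", "host1"))
```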
5) Nova tables to use
Policy warning(id) :-
nova:host(id, name, service, zone, memory_capacity),
legacy:special_zone(zone),
ceilometer:statistics(id, "memory", avg, count, duration,
durstart, durend, max, min, period,
perstart, perend, sum, unit),
avg > 0.75 * memory_capacity
I believe that ceilometer gives usage of VMs and not hosts. The host table (ComputeNode table) should give current used capacity.
Good to know.
6) One of the issues highlighted in OpenStack (the scheduler) and also elsewhere (e.g. Google's Omega scheduler) is the cost of reading “host utilization” state from the database (the nova:host table), the cost of DB updates, and the overhead of keeping in-memory state up to date.
=> This is expensive, and the current nova-scheduler does face this issue (many blogs/discussions).
While the first goal is a PoC, this will likely become a concern in terms of adoption.
So you’re saying we won’t have fresh enough data to make policy decisions? If the data changes so frequently that we can’t get an accurate view, then I’m guessing we shouldn’t be migrating based on that data anyway.
Could you point me to some of these discussions?
7) While in this document you have changed the “example” policy, could we drill down into the set of policies for the PoC (e.g. the server under-utilization one?)
=> As a reference
Sure. The only reason I chose this policy is that it doesn’t have aggregation. I’m guessing we’ll want to avoid aggregation for the PoC because we don’t yet have it in Congress, and it complicates the problem of translating Datalog to LP substantially.
Tim
Ruby
From: Yathiraj Udupi (yudupi) [mailto:yudupi at cisco.com]
Sent: Tuesday, February 24, 2015 8:01 PM
To: OpenStack Development Mailing List (not for usage questions); Tim Hinrichs
Cc: Debo Dutta (dedutta)
Subject: Re: [openstack-dev] [Congress][Delegation] Initial workflow design
Hi Tim,
Thanks for your updated doc on delegation from Congress to a domain-specific policy engine; in this case, you are planning to build an LP-based VM-placement engine as the domain-specific policy engine.
I agree your main goal is to first get the delegation interface sorted out. That will make it easy for external services (like the Solver-Scheduler) to integrate with the delegation model.
From the Solver-Scheduler point of view, we would actually like to start working on a PoC that integrates Congress and the Solver-Scheduler.
We believe that, rather than deferring this effort to the long term, early integration now would add value to both the Solver-Scheduler effort and the Congress effort: most of the LP solver work for VM placement is readily available in the Solver-Scheduler, and we just need to spend some time thinking about translating your domain-agnostic policy into constraints that the Solver-Scheduler can use.
I will definitely need your help with the Congress interfaces, and I hope you will share your early interfaces for delegation so I can start the integration effort from the Solver-Scheduler side.
I will reach out to you to get some initial help for integration w.r.t. Congress, and also keep you posted about the progress from our side.
Thanks,
Yathi.
On 2/23/15, 11:28 AM, "Tim Hinrichs" <thinrichs at vmware.com> wrote:
Hi all,
I made a heavy editing pass of the Delegation google doc, incorporating many of your comments and my latest investigations into VM-placement. I left the old stuff in place at the end of the doc and put the new stuff at the top.
My goal was to propose an end-to-end workflow for a PoC that we could put together quickly to help us explore the delegation interface. We should iterate on this design until we have something that we think is workable. And by all means pipe up if you think we need a totally different starting point to begin the iteration.
(BTW I'm thinking of the integration with solver-scheduler as a long-term solution to VM-placement, once we get the delegation interface sorted out.)
https://docs.google.com/document/d/1ksDilJYXV-5AXWON8PLMedDKr9NpS8VbT0jIy_MIEtI/edit#
Tim