[placement][ptg] Enabling other projects to continue with placement or get started
From the etherpad [1]
* blazar
* cinder
* cyborg
* ironic
* neutron

Who else?

This is a bit of a catch-many topic. Despite being birthed in Nova, Placement is designed to be useful to lots of different services.

There's already some time defined at the PTG to talk about the interaction of Ironic, Blazar, and Placement.

What are the issues with that?

What are the issues other services are experiencing with Placement? Preventing people from using Placement?

What services are using Placement and the team doesn't know about it?

[1] https://etherpad.openstack.org/p/placement-ptg-train

--
Chris Dent ٩◔̯◔۶ https://anticdent.org/
freenode: cdent tw: @anticdent
On 4/8/19 6:16 PM, Chris Dent wrote:
From the etherpad [1]
* blazar * cinder * cyborg * ironic * neutron
Who else?
This is a bit of a catch-many topic. Despite being birthed in Nova, Placement is designed to be useful to lots of different services.
There's already some time defined at the PTG to talk about the interaction of Ironic, Blazar, and Placement.
What are the issues with that?
From ironic perspective there is no issue, but there is a critical question to decide: when Ironic+Placement is used, which of them acts as the final authority? If Ironic, then we need to teach Placement to talk to its Allocation API when allocating a bare metal node. If Placement, then we need to support Allocation API talking to Placement. I suspect the latter is saner, but I'd like to hear more opinions. In both cases we'll need something that syncs nodes from Ironic to Placement when there is no Compute to do it.
What are the issues other services are experiencing with Placement? Preventing people from using Placement?
What services are using Placement and the team doesn't know about it?
On 04/09/2019 12:51 PM, Dmitry Tantsur wrote:
On 4/8/19 6:16 PM, Chris Dent wrote:
From the etherpad [1]
* blazar * cinder * cyborg * ironic * neutron
Who else?
This is a bit of a catch-many topic. Despite being birthed in Nova, Placement is designed to be useful to lots of different services.
There's already some time defined at the PTG to talk about the interaction of Ironic, Blazar, and Placement.
What are the issues with that?
From ironic perspective there is no issue, but there is a critical question to decide: when Ironic+Placement is used, which of them acts as the final authority? If Ironic, then we need to teach Placement to talk to its Allocation API when allocating a bare metal node. If Placement, then we need to support Allocation API talking to Placement. I suspect the latter is saner, but I'd like to hear more opinions.
Ironic (scheduler?) would request candidates from the placement service using the GET /allocation_candidates API. Ironic (scheduler?) would then claim the resources on a provider (a baremetal node) by calling the POST /allocations API.
In both cases we'll need something that syncs nodes from Ironic to Placement when there is no Compute to do it.
Yep, this is absolutely correct. My advice: don't bother copying any code from the nova-compute resource tracker. It's horrible. Best, -jay
What are the issues other services are experiencing with Placement? Preventing people from using Placement?
What services are using Placement and the team doesn't know about it?
On 4/9/19 7:20 PM, Jay Pipes wrote:
On 04/09/2019 12:51 PM, Dmitry Tantsur wrote:
On 4/8/19 6:16 PM, Chris Dent wrote:
From the etherpad [1]
* blazar * cinder * cyborg * ironic * neutron
Who else?
This is a bit of a catch-many topic. Despite being birthed in Nova, Placement is designed to be useful to lots of different services.
There's already some time defined at the PTG to talk about the interaction of Ironic, Blazar, and Placement.
What are the issues with that?
From ironic perspective there is no issue, but there is a critical question to decide: when Ironic+Placement is used, which of them acts as the final authority? If Ironic, then we need to teach Placement to talk to its Allocation API when allocating a bare metal node. If Placement, then we need to support Allocation API talking to Placement. I suspect the latter is saner, but I'd like to hear more opinions.
Ironic (scheduler?) would request candidates from the placement service using the GET /allocation_candidates API. Ironic (scheduler?) would then claim the resources on a provider (a baremetal node) by calling the POST /allocations API.
Okay, this matches my expectation. My concern will be with Blazar and reservations. If reservations happen through Placement only, how will ironic know about them? I guess we need to teach Blazar to talk to Ironic, which in turn will talk to Placement.
In both cases we'll need something that syncs nodes from Ironic to Placement when there is no Compute to do it.
Yep, this is absolutely correct. My advice: don't bother copying any code from the nova-compute resource tracker. It's horrible.
Noted :)
Best, -jay
What are the issues other services are experiencing with Placement? Preventing people from using Placement?
What services are using Placement and the team doesn't know about it?
On Wed, 10 Apr 2019, Dmitry Tantsur wrote:
On 4/9/19 7:20 PM, Jay Pipes wrote:
Ironic (scheduler?) would request candidates from the placement service using the GET /allocation_candidates API. Ironic (scheduler?) would then claim the resources on a provider (a baremetal node) by calling the POST /allocations API.
Okay, this matches my expectation.
One of the reasons I wrote my etcd-compute [1] compute thing was to demo how simple the inventory writing, candidate selection (scheduling), and allocation writing can be if you a) assume that placement is where the truth lives, b) represent everything that matters in the scheduling decisions in placement, c) only care about stuff that really matters (so (b) can be straightforward).

It does what Jay describes in his paragraph above and that is pretty much the basic model for using placement.

(Except, to be pedantic, for a single consumer the PUT /allocations/{consumer_uuid} API would be the normal one for "claiming". POST is for manipulating multiple allocations in one (atomic) request.)
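To make the basic model concrete, here is a minimal sketch of the two requests involved, built as plain data without touching the network. The helper names are hypothetical, the body shape assumes a Placement microversion of 1.28 or later (where consumer_generation is part of the payload), and the CUSTOM_ normalization of the resource class mirrors what I understand Nova's ironic driver to do; check all of it against the Placement API reference before relying on it.

```python
from urllib.parse import urlencode


def custom_resource_class(name):
    """Normalize an Ironic resource class, e.g. 'baremetal-large'.

    Assumption: upper-case, non-alphanumerics become '_', prefix 'CUSTOM_'.
    """
    cleaned = "".join(c if c.isalnum() else "_" for c in name.upper())
    return "CUSTOM_" + cleaned


def candidates_url(resource_class, traits=()):
    """Build the path and query string for GET /allocation_candidates."""
    params = {"resources": f"{custom_resource_class(resource_class)}:1"}
    if traits:
        params["required"] = ",".join(traits)
    # Keep ':' and ',' readable instead of percent-encoding them.
    return "/allocation_candidates?" + urlencode(params, safe=":,")


def claim_request(consumer_uuid, provider_uuid, resource_class,
                  project_id, user_id):
    """Build the path and JSON body for PUT /allocations/{consumer_uuid}."""
    body = {
        "allocations": {
            provider_uuid: {
                "resources": {custom_resource_class(resource_class): 1},
            },
        },
        "project_id": project_id,
        "user_id": user_id,
        # None (JSON null) is assumed to mean "this consumer is new".
        "consumer_generation": None,
    }
    return f"/allocations/{consumer_uuid}", body
```

A caller would send the first request, pick one provider from the returned candidates, and then send the second request for that provider; a conflict response on the PUT would mean someone else claimed the node first and the caller should try the next candidate.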
My concern will be with Blazar and reservations. If reservations happen through Placement only, how will ironic know about them? I guess we need to teach Blazar to talk to Ironic, which in turn will talk to Placement.
One option would be that Blazar talks to placement, making some resources unavailable. Ironic doesn't specifically need to know that they are unavailable; it would rather be the case that they are not present in scheduling results.

[1] https://github.com/cdent/etcd-compute

--
Chris Dent ٩◔̯◔۶ https://anticdent.org/
freenode: cdent tw: @anticdent
On 4/10/19 12:11 PM, Chris Dent wrote:
On Wed, 10 Apr 2019, Dmitry Tantsur wrote:
On 4/9/19 7:20 PM, Jay Pipes wrote:
Ironic (scheduler?) would request candidates from the placement service using the GET /allocation_candidates API. Ironic (scheduler?) would then claim the resources on a provider (a baremetal node) by calling the POST /allocations API.
Okay, this matches my expectation.
One of the reasons I wrote my etcd-compute [1] compute thing was to demo how simple the inventory writing, candidate selection (scheduling), and allocation writing can be if you a) assume that placement is where the truth lives, b) represent everything that matters in the scheduling decisions in placement, c) only care about stuff that really matters (so (b) can be straightforward).
It does what Jay describes in his paragraph above and that is pretty much the basic model for using placement.
(Except, to be pedantic, for a single consumer the PUT /allocations/{consumer_uuid} API would be the normal one for "claiming". POST is for manipulating multiple allocations in one (atomic) request.)
My concern will be with Blazar and reservations. If reservations happen through Placement only, how will ironic know about them? I guess we need to teach Blazar to talk to Ironic, which in turn will talk to Placement.
One option would be that Blazar talks to placement, making some resources unavailable. Ironic doesn't specifically need to know that they are unavailable; it would rather be the case that they are not present in scheduling results.
We would rather have ironic aware that some nodes are reserved for internal bookkeeping.
On Wed, 10 Apr 2019, Dmitry Tantsur wrote:
We would rather have ironic aware that some nodes are reserved for internal bookkeeping.
Could you query placement for that "bookkeeping"? That's what placement does. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent
On 4/10/19 1:25 PM, Chris Dent wrote:
On Wed, 10 Apr 2019, Dmitry Tantsur wrote:
We would rather have ironic aware that some nodes are reserved for internal bookkeeping.
Could you query placement for that "bookkeeping"? That's what placement does.
It means replacing a short database query with a call to a remote resource with all the associated performance and reliability penalties. Also if someone creates an allocation in Placement directly, we won't have an associated ironic allocation, which is fine, but may be confusing for users. E.g. they see a free node in ironic, but it's actually occupied in Placement.
On Wed, 10 Apr 2019, Dmitry Tantsur wrote:
On 4/10/19 1:25 PM, Chris Dent wrote:
On Wed, 10 Apr 2019, Dmitry Tantsur wrote:
We would rather have ironic aware that some nodes are reserved for internal bookkeeping.
Could you query placement for that "bookkeeping"? That's what placement does.
It means replacing a short database query with a call to a remote resource with all the associated performance and reliability penalties. Also if someone creates an allocation in Placement directly, we won't have an associated ironic allocation, which is fine, but may be confusing for users. E.g. they see a free node in ironic, but it's actually occupied in Placement.
This gets at some of the general concepts of "how best to use placement" which are still emerging and evolving and one of the reasons why I think these threads are useful: We can spend some time reflecting on that kind of big topic without feeling like it is wasting precious PTG time.

From what I've been able to discern with the various ways I've used placement, it works most easily and effectively when you allow it to be the single source of truth for inventory and use of that inventory [1]. This suggests that if you want to optionally use placement with ironic, then having placement be one of several possible backends to the ironic bookkeeping system might be worth considering.

That said, I'm not sure placement is expensive or risky enough to not just use (solely) it. Yes, it is another http service and database but it is super lightweight; easy to install, scale and manage. If the http service is multi-homed, then there are pretty much the same network partitioning issues (for talking to the database) with or without a placement. You could easily choose to co-locate the placement service (web and database) on the same host(s) as Ironic.

Note, I'm not trying to twist your arm or anything here, just present options. I recognize that achieving a standalone ironic and then going and adding a placement to it feels like a step backwards. It might help to think of placement as a library that happens to be packaged as an HTTP microservice.

[1] Which, if you're going to have N less than many sources of truth, it may as well be _one_ to avoid synchronization problems. Before placement came along I thought (and sometimes still do) that having a thing which _represents_ a view of the truth (about inventory) as placement does is overkill if it is possible for things themselves to represent their own truth, and be event/pubsub driven about their willingness to accept a workload (instead of being told by some authority).

--
Chris Dent ٩◔̯◔۶ https://anticdent.org/
freenode: cdent tw: @anticdent
On 4/10/19 1:51 PM, Chris Dent wrote:
On Wed, 10 Apr 2019, Dmitry Tantsur wrote:
On 4/10/19 1:25 PM, Chris Dent wrote:
On Wed, 10 Apr 2019, Dmitry Tantsur wrote:
We would rather have ironic aware that some nodes are reserved for internal bookkeeping.
Could you query placement for that "bookkeeping"? That's what placement does.
It means replacing a short database query with a call to a remote resource with all the associated performance and reliability penalties. Also if someone creates an allocation in Placement directly, we won't have an associated ironic allocation, which is fine, but may be confusing for users. E.g. they see a free node in ironic, but it's actually occupied in Placement.
This gets at some of the general concepts of "how best to use placement" which are still emerging and evolving and one of the reasons why I think these threads are useful: We can spend some time reflecting on that kind of big topic without feeling like it is wasting precious PTG time.
++ and enable more people to participate as well.
From what I've been able to discern with the various ways I've used placement, it works most easily and effectively when you allow it to be the single source of truth for inventory and use of that inventory [1]. This suggests that if you want to optionally use placement with ironic, then having placement be one of several possible backends to the ironic bookkeeping system might be worth considering.
I'm totally okay with it, but it will need many internal changes to ironic, including calling Placement for some popular API calls. I'll try to explain the background a bit better now.

Ironic has a notion of instance_uuid. It used to mean "Nova server ID", but is now used more freely. We don't impose any restrictions on its value, except that it must be a UUID. An important property of instance_uuid is that it enables cooperative locking. It is implemented by making it possible to assign and unset instance_uuid, but not change it. So if you're doing

  $ openstack baremetal node set --instance-uuid <UUID> <node>

and it succeeds, you know that the <node> is now "reserved" for the allocation, instance or whatever is designated by <UUID>. Fairly primitive, but it works. The challenge is that we must make sure it works no matter what we invent, because it's a critical part of the Bare Metal API contract.

Now, instance_uuid is not a scheduler. We've built an API that can find a node by resource_class and traits:

  $ openstack baremetal allocation create --resource-class baremetal-large --trait CUSTOM_OPENCV

You could argue that we should have gone with Placement instead, and you would probably be right. But see below about resistance to adding and maintaining new services, and blah-blah. Also the Placement split was in its early phases, and a solution was long overdue. My thought was that we could optionally integrate with Placement as a backend instead.

Anyway, this is a bit of a side discussion. What I'm worried about is compatibility with the old case of

  $ openstack baremetal node set --instance-uuid <UUID> <node>

and listing free nodes with

  $ openstack baremetal node list --unassociated

(which ends up filtering by instance_uuid=NULL). Our allocation API handles it by setting node.instance_uuid = allocation.uuid in the same atomic fashion. Thus, it can co-operate with older code, as well as Nova (which also does this trick, although after going to Placement first).

If we tell users to go through Placement for reservation, we need to provide synchronization between Ironic and Placement somehow, so that

  $ openstack baremetal node list --unassociated

does not yield unpredictable results and

  $ openstack baremetal node set --instance-uuid <UUID> <node>

ends up checking for a Placement allocation for this node.
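For readers who have not seen the instance_uuid contract before, the "assign and unset, but never change" rule can be illustrated with a small in-memory sketch. This is not Ironic's implementation (Ironic enforces the rule in its API and database layers); the class and exception names are made up for the example, and the sketch rejects any assignment while a value is present because the thread does not say whether re-assigning the same value is accepted.

```python
import threading


class NodeAlreadyAssociated(Exception):
    """Raised when a node already carries an instance_uuid (hypothetical name)."""


class Node:
    def __init__(self, name):
        self.name = name
        self.instance_uuid = None
        self._lock = threading.Lock()

    def set_instance_uuid(self, new_uuid):
        # Check and assign under one lock so two callers cannot both succeed.
        with self._lock:
            if self.instance_uuid is not None:
                raise NodeAlreadyAssociated(
                    f"{self.name} is held by {self.instance_uuid}")
            self.instance_uuid = new_uuid

    def unset_instance_uuid(self):
        # Releasing is always allowed; the node becomes free again.
        with self._lock:
            self.instance_uuid = None


def unassociated(nodes):
    """Rough equivalent of filtering by instance_uuid=NULL."""
    return [n for n in nodes if n.instance_uuid is None]
```

The point of the sketch is the synchronization worry: if a claim is recorded only in Placement, nothing sets instance_uuid here, so a listing like `unassociated()` would still report the node as free.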
That said, I'm not sure placement is expensive or risky enough to not just use (solely) it. Yes, it is another http service and database but it is super lightweight; easy to install, scale and manage. If the http service is multi-homed, then there are pretty much the same network partitioning issues (for talking to the database) with or without a placement. You could easily choose to co-locate the placement service (web and database) on the same host(s) as Ironic.
Resistance to add new services to a standalone deployment can be borderline religious at times :(
Note, I'm not trying to twist your arm or anything here, just present options. I recognize that achieving a standalone ironic and then going and adding a placement to it feels like a step backwards. It might help to think of placement as a library that happens to be packaged as an HTTP microservice.
[1] Which, if you're going to have N less than many sources of truth, it may as well be _one_ to avoid synchronization problems. Before placement came along I thought (and sometimes still do) that having a thing which _represents_ a view of the truth (about inventory) as placement does is overkill if it is possible for things themselves to represent their own truth, and be event/pubsub driven about their willingness to accept a workload (instead of being told by some authority).
On 04/10/2019 05:47 AM, Dmitry Tantsur wrote:
On 4/9/19 7:20 PM, Jay Pipes wrote:
On 04/09/2019 12:51 PM, Dmitry Tantsur wrote:
From ironic perspective there is no issue, but there is a critical question to decide: when Ironic+Placement is used, which of them acts as the final authority? If Ironic, then we need to teach Placement to talk to its Allocation API when allocating a bare metal node. If Placement, then we need to support Allocation API talking to Placement. I suspect the latter is saner, but I'd like to hear more opinions.
Ironic (scheduler?) would request candidates from the placement service using the GET /allocation_candidates API. Ironic (scheduler?) would then claim the resources on a provider (a baremetal node) by calling the POST /allocations API.
Okay, this matches my expectation.
My concern will be with Blazar and reservations. If reservations happen through Placement only, how will ironic know about them? I guess we need to teach Blazar to talk to Ironic, which in turn will talk to Placement.
Hmm. So, here's the problem: placement has no concept of time. [1]

Placement only knows about one period of time: now. Placement doesn't have any concept of an allocation or an inventory existing at some point in the future or in the past.

Therefore, Blazar must unfortunately keep all of the temporal state about reservations in its own data store. So, Ironic would actually have to talk to Blazar to create a reservation of some amount of resources and Blazar will need to call Placement to manage the state of resource inventory and allocations over time; as reservations are activated, Blazar will either create or swap allocation records in Placement to consume the Ironic resources for a tenant that made the reservation.

Best,
-jay

[1] this was a mistake for which I take full responsibility.
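As an illustration of what "temporal state in its own data store" means in practice, here is a minimal sketch of the kind of bookkeeping a reservation service has to do itself because Placement only answers for "now". The Lease type and function names are invented for this example and are not Blazar's data model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Lease:
    node: str
    owner: str
    start: datetime
    end: datetime  # exclusive: the node is free again at this instant


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share any instant."""
    return a_start < b_end and b_start < a_end


def can_reserve(leases, node, start, end):
    """A future reservation is possible if no existing lease on the node overlaps."""
    return not any(
        lease.node == node and overlaps(lease.start, lease.end, start, end)
        for lease in leases
    )


def active_leases(leases, now):
    """Leases that should currently be backed by an allocation in Placement."""
    return [lease for lease in leases if lease.start <= now < lease.end]
```

A periodic task could compare `active_leases()` against what is currently allocated in Placement and create or remove allocations as leases start and end.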
On 4/10/19 12:14 PM, Jay Pipes wrote:
On 04/10/2019 05:47 AM, Dmitry Tantsur wrote:
On 4/9/19 7:20 PM, Jay Pipes wrote:
On 04/09/2019 12:51 PM, Dmitry Tantsur wrote:
From ironic perspective there is no issue, but there is a critical question to decide: when Ironic+Placement is used, which of them acts as the final authority? If Ironic, then we need to teach Placement to talk to its Allocation API when allocating a bare metal node. If Placement, then we need to support Allocation API talking to Placement. I suspect the latter is saner, but I'd like to hear more opinions.
Ironic (scheduler?) would request candidates from the placement service using the GET /allocation_candidates API. Ironic (scheduler?) would then claim the resources on a provider (a baremetal node) by calling the POST /allocations API.
Okay, this matches my expectation.
My concern will be with Blazar and reservations. If reservations happen through Placement only, how will ironic know about them? I guess we need to teach Blazar to talk to Ironic, which in turn will talk to Placement.
Hmm. So, here's the problem: placement has no concept of time. [1]
Placement only knows about one period of time: now. Placement doesn't have any concept of an allocation or an inventory existing at some point in the future or in the past.
Therefore, Blazar must unfortunately keep all of the temporal state about reservations in its own data store. So, Ironic would actually have to talk to Blazar to create a reservation of some amount of resources and Blazar will need to call Placement to manage the state of resource inventory and allocations over time; as reservations are activated, Blazar will either create or swap allocation records in Placement to consume the Ironic resources for a tenant that made the reservation.
I imagined it a bit differently. I assumed that

1. a user (operator?) talks to Blazar to create a bare metal node reservation.
1.1. some kind of quota is checked? does Blazar have any notion of quota?
2. Blazar talks to Ironic to create an allocation (yes, we also call it allocation, feel free to blame me)
3. Ironic gets a reservation from Placement and updates its internal records
4. Ironic returns an allocation to Blazar
5. Blazar returns a reservation to a user

And when a reservation in Blazar expires or is returned:

6. Blazar talks to Ironic to undeploy the node
6.1. Ironic deletes its allocation automatically
6.2. And thus notifies Placement to remove its allocation

Am I missing something?
Best, -jay
[1] this was a mistake for which I take full responsibility.
On Wed, 10 Apr 2019 at 12:16, Dmitry Tantsur <dtantsur@redhat.com> wrote:
On 4/10/19 12:14 PM, Jay Pipes wrote:
On 04/10/2019 05:47 AM, Dmitry Tantsur wrote:
On 4/9/19 7:20 PM, Jay Pipes wrote:
On 04/09/2019 12:51 PM, Dmitry Tantsur wrote:
From ironic perspective there is no issue, but there is a critical question to decide: when Ironic+Placement is used, which of them acts as the final authority? If Ironic, then we need to teach Placement to talk to its Allocation API when allocating a bare metal node. If Placement, then we need to support Allocation API talking to Placement. I suspect the latter is saner, but I'd like to hear more opinions.
Ironic (scheduler?) would request candidates from the placement service using the GET /allocation_candidates API. Ironic (scheduler?) would then claim the resources on a provider (a baremetal node) by calling the POST /allocations API.
Okay, this matches my expectation.
My concern will be with Blazar and reservations. If reservations happen through Placement only, how will ironic know about them? I guess we need to teach Blazar to talk to Ironic, which in turn will talk to Placement.
Hmm. So, here's the problem: placement has no concept of time. [1]
Placement only knows about one period of time: now. Placement doesn't have any concept of an allocation or an inventory existing at some point in the future or in the past.
Therefore, Blazar must unfortunately keep all of the temporal state about reservations in its own data store. So, Ironic would actually have to talk to Blazar to create a reservation of some amount of resources and Blazar will need to call Placement to manage the state of resource inventory and allocations over time; as reservations are activated, Blazar will either create or swap allocation records in Placement to consume the Ironic resources for a tenant that made the reservation.
I imagined it a bit differently. I assumed that 1. a user (operator?) talks to Blazar to create a bare metal node reservation. 1.1. some kind of quota is checked? does Blazar have any notion of quota? 2. Blazar talks to Ironic to create an allocation (yes, we also call it allocation, feel free to blame me) 3. Ironic gets a reservation from Placement and updates its internal records 4. Ironic returns an allocation to Blazar 5. Blazar returns a reservation to a user
And when a reservation in Blazar expires or is returned: 6. Blazar talks to Ironic to undeploy the node 6.1. Ironic deletes its allocation automatically 6.2. And thus notifies Placement to remove its allocation
Am I missing something?
If we follow the current model used with Nova and now also with Neutron, Blazar would be directly aware of all Ironic nodes (we store a copy of the essential resource information in the Blazar database). It would use it to make its own resource allocation decisions for future reservations. On reservation start, it could call Ironic to create an allocation on the node with `candidate_nodes` and provide it to the reservation's owner for deployment.

However, is there a way to prevent anyone but the reservation's owner from using the corresponding Ironic allocation? From looking at the API doc, I couldn't really see any enforcement of Ironic allocations being done.

I am not sure how placement fits into this. If Ironic starts using placement for its allocation API, as it is mentioned in https://storyboard.openstack.org/#!/story/2004341, we would figure out a way for Blazar to put a hold on reservable nodes, so that users allocating directly with Ironic could only request from a non-reservable pool of nodes. Blazar would create Ironic allocation on behalf of users as described above. The question of allocation enforcement is also critical in this scenario.

PS: Blazar doesn't handle any quota yet. It's a major flaw so we'll need to tackle it soon. I am hoping we'll be able to leverage unified limits for this.
On Apr 12, 2019, at 12:02 PM, Pierre Riteau <pierre@stackhpc.com> wrote:
I am not sure how placement fits into this. If Ironic starts using placement for its allocation API, as it is mentioned in https://storyboard.openstack.org/#!/story/2004341, we would figure out a way for Blazar to put a hold on reservable nodes, so that users allocating directly with Ironic could only request from a non-reservable pool of nodes. Blazar would create Ironic allocation on behalf of users as described above. The question of allocation enforcement is also critical in this scenario.
As was mentioned earlier in this thread, placement does not have a concept of time, except for “right now”. In other words, you cannot say “allocate this resource to this consumer next Tuesday at 1400UTC, and then delete that allocation after 48 hours”.

At an earlier PTG (Dublin?) we spoke of different ways around this. One was creating an aggregate for Blazar-controlled resources, and then somehow forbidding other services from using those resources. Another was allocating the resources to Blazar so that no one else could use them, and then Blazar would use a two-step process (release the resource; allocate it to the consumer) when the actual usage begins. Neither is ideal, but they do get around the time-agnostic nature of placement.

-- Ed Leafe
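The second workaround could be combined with the point made earlier in this thread that POST /allocations manipulates several consumers in one atomic request, which would avoid the window between "release" and "allocate". The sketch below only builds the request body; the function name is hypothetical, and the body shape (a dict keyed by consumer UUID, an empty "allocations" dict to clear a consumer, consumer_generation from microversion 1.28) is my understanding of the Placement API and should be verified against its reference documentation.

```python
def swap_body(hold_consumer, hold_generation, tenant_consumer,
              provider_uuid, resource_class, project_id, user_id):
    """Build a POST /allocations body that moves a node from a hold to a tenant.

    hold_consumer:   consumer UUID the reservation service used to hold the node
    hold_generation: generation last seen for that consumer (assumed required)
    tenant_consumer: consumer UUID that should own the node once the lease starts
    """
    return {
        # Emptying the hold releases the node ...
        hold_consumer: {
            "allocations": {},
            "project_id": project_id,
            "user_id": user_id,
            "consumer_generation": hold_generation,
        },
        # ... and the tenant claims it in the same request.
        tenant_consumer: {
            "allocations": {
                provider_uuid: {"resources": {resource_class: 1}},
            },
            "project_id": project_id,
            "user_id": user_id,
            "consumer_generation": None,  # assumed: null for a new consumer
        },
    }
```

If the atomic form turns out not to be usable, the same two halves sent as separate requests give the two-step process described above, with the caveat that another consumer could grab the node in between.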
On 4/12/19 7:02 PM, Pierre Riteau wrote:
On Wed, 10 Apr 2019 at 12:16, Dmitry Tantsur <dtantsur@redhat.com> wrote:
On 4/10/19 12:14 PM, Jay Pipes wrote:
On 04/10/2019 05:47 AM, Dmitry Tantsur wrote:
On 4/9/19 7:20 PM, Jay Pipes wrote:
On 04/09/2019 12:51 PM, Dmitry Tantsur wrote:
From ironic perspective there is no issue, but there is a critical question to decide: when Ironic+Placement is used, which of them acts as the final authority? If Ironic, then we need to teach Placement to talk to its Allocation API when allocating a bare metal node. If Placement, then we need to support Allocation API talking to Placement. I suspect the latter is saner, but I'd like to hear more opinions.
Ironic (scheduler?) would request candidates from the placement service using the GET /allocation_candidates API. Ironic (scheduler?) would then claim the resources on a provider (a baremetal node) by calling the POST /allocations API.
Okay, this matches my expectation.
My concern will be with Blazar and reservations. If reservations happen through Placement only, how will ironic know about them? I guess we need to teach Blazar to talk to Ironic, which in turn will talk to Placement.
Hmm. So, here's the problem: placement has no concept of time. [1]
Placement only knows about one period of time: now. Placement doesn't have any concept of an allocation or an inventory existing at some point in the future or in the past.
Therefore, Blazar must unfortunately keep all of the temporal state about reservations in its own data store. So, Ironic would actually have to talk to Blazar to create a reservation of some amount of resources and Blazar will need to call Placement to manage the state of resource inventory and allocations over time; as reservations are activated, Blazar will either create or swap allocation records in Placement to consume the Ironic resources for a tenant that made the reservation.
I imagined it a bit differently. I assumed that 1. a user (operator?) talks to Blazar to create a bare metal node reservation. 1.1. some kind of quota is checked? does Blazar have any notion of quota? 2. Blazar talks to Ironic to create an allocation (yes, we also call it allocation, feel free to blame me) 3. Ironic gets a reservation from Placement and updates its internal records 4. Ironic returns an allocation to Blazar 5. Blazar returns a reservation to a user
And when a reservation in Blazar expires or is returned: 6. Blazar talks to Ironic to undeploy the node 6.1. Ironic deletes its allocation automatically 6.2. And thus notifies Placement to remove its allocation
Am I missing something?
If we follow the current model used with Nova and now also with Neutron, Blazar would be directly aware of all Ironic nodes (we store a copy of the essential resource information in the Blazar database). It would use it to make its own resource allocation decisions for future reservations. On reservation start, it could call Ironic to create an allocation on the node with `candidate_nodes` and provide it to the reservation's owner for deployment.
However, is there a way to prevent anyone but the reservation's owner from using the corresponding Ironic allocation? From looking at the API doc, I couldn't really see any enforcement of Ironic allocations being done.
There is no enforcement currently, it's cooperative. We're going to discuss Ironic multi-tenancy at the Forum https://www.openstack.org/summit/denver-2019/summit-schedule/events/23668/ir... and later at the PTG, and this is probably going to be the key question. One way we could solve it is to create a new Deployment API in ironic (we have been thinking about it for a long time), and create more fine-grained policies for who can use it and how. For example, disallow changing nodes, allow the deployment API, but only for previously allocated nodes. This will need some notion of an owner for an allocation, or a way to dynamically change the owner field of a node.
I am not sure how placement fits into this. If Ironic starts using placement for its allocation API, as it is mentioned in https://storyboard.openstack.org/#!/story/2004341, we would figure out a way for Blazar to put a hold on reservable nodes, so that users allocating directly with Ironic could only request from a non-reservable pool of nodes. Blazar would create Ironic allocation on behalf of users as described above. The question of allocation enforcement is also critical in this scenario.
We probably need to work on a story of integrating the "owner" field with the allocation API. This is something I'm hoping to define and implement in Train with your input.
PS: Blazar doesn't handle any quota yet. It's a major flaw so we'll need to tackle it soon. I am hoping we'll be able to leverage unified limits for this.
On 04/10/2019 05:47 AM, Dmitry Tantsur wrote:
On 4/9/19 7:20 PM, Jay Pipes wrote:
On 04/09/2019 12:51 PM, Dmitry Tantsur wrote:
From ironic perspective there is no issue, but there is a critical question to decide: when Ironic+Placement is used, which of them acts as the final authority? If Ironic, then we need to teach Placement to talk to its Allocation API when allocating a bare metal node. If Placement, then we need to support Allocation API talking to Placement. I suspect the latter is saner, but I'd like to hear more opinions.
Ironic (scheduler?) would request candidates from the placement service using the GET /allocation_candidates API. Ironic (scheduler?) would then claim the resources on a provider (a baremetal node) by calling the POST /allocations API.
Okay, this matches my expectation.
My concern will be with Blazar and reservations. If reservations happen through Placement only, how will ironic know about them? I guess we need to teach Blazar to talk to Ironic, which in turn will talk to Placement.
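The two-step flow quoted above (ask for candidates, then claim one) can be sketched roughly as follows. This is only an illustration that builds the request URL and body without sending anything; the resource class CUSTOM_BAREMETAL_GOLD, the UUIDs, and the canned response are hypothetical, and the body layout (keyed by consumer UUID, with a consumer_generation field) assumes a reasonably recent Placement microversion.

```python
from urllib.parse import urlencode

# Hypothetical provider UUID and canned response, standing in for what
# GET /allocation_candidates might return for one bare metal node.
NODE_UUID = "11111111-2222-3333-4444-555555555555"
SAMPLE_RESPONSE = {
    "allocation_requests": [
        {"allocations": {NODE_UUID: {"resources": {"CUSTOM_BAREMETAL_GOLD": 1}}}}
    ],
    "provider_summaries": {},
}


def candidates_url(base_url, resource_class, amount=1):
    """Build the GET /allocation_candidates URL for one resource class."""
    query = urlencode({"resources": f"{resource_class}:{amount}"})
    return f"{base_url}/allocation_candidates?{query}"


def claim_body(candidate, consumer_uuid, project_id, user_id):
    """Build a POST /allocations body that claims one candidate.

    The body is keyed by consumer UUID. consumer_generation=None marks a
    consumer that does not exist yet (assumes a microversion that has
    consumer generations).
    """
    return {
        consumer_uuid: {
            "allocations": candidate["allocations"],
            "project_id": project_id,
            "user_id": user_id,
            "consumer_generation": None,
        }
    }


if __name__ == "__main__":
    print(candidates_url("http://placement.example", "CUSTOM_BAREMETAL_GOLD"))
    first = SAMPLE_RESPONSE["allocation_requests"][0]
    print(claim_body(first, "consumer-1", "project-1", "user-1"))
```

Whichever service ends up as the final authority, the claim step is the point where a concurrent request for the same node would be rejected, which is why the question of who calls it matters.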
Hmm. So, here's the problem: placement has no concept of time. [1]

Placement only knows about one period of time: now. Placement doesn't have any concept of an allocation or an inventory existing at some point in the future or in the past.

Just to play devil's advocate... what about changing/adding this? What if Placement did support an inventory having different states depending on time frame requested?

In my mind this would enable a more ideal division of responsibility:

* Placement manages the availability of resources and maintains the single source of truth for inventory at a given time.
* Blazar uses Placement as its default inventory backend. Blazar's main role now is business logic around quota and handling allocation/deallocation when a lease starts/ends.
    * But, Blazar could optionally use a different inventory backend, to allow standalone use (?)
* Ironic uses Placement as its default inventory backend.
    * But, Ironic could optionally also manage its own inventory, to allow standalone use (?)

To further tease out the relationships here, we should think about what makes the most sense for baremetal reservations done via Blazar. Should Blazar always go to Ironic for this, ignoring Nova entirely? Or should it go through Nova if Nova is being used? I believe Blazar still will always have to go through Nova for instance reservations at minimum.

Keep in mind that Blazar is designed to integrate with arbitrary external services; currently it has integrations with Neutron (for provisioning Floating IPs as part of a lease), and it could support any number of other resources, like bandwidth on an uplink.

Having learned more about Placement's design as a result of these threads, I'm excited about how it could make some things cleaner if it truly could handle the generic inventory management problem that advanced reservations pose.

Therefore, Blazar must unfortunately keep all of the temporal state about reservations in its own data store.

So, Ironic would actually have to talk to Blazar to create a reservation of some amount of resources and Blazar will need to call Placement to manage the state of resource inventory and allocations over time; as reservations are activated, Blazar will either create or swap allocation records in Placement to consume the Ironic resources for a tenant that made the reservation.

Best,
-jay

[1] this was a mistake for which I take full responsibility.

Cheers,
/Jason
My apologies for the late response, Jason. Comments inline.

On 04/10/2019 11:24 AM, Jason Anderson wrote:
On 04/10/2019 05:47 AM, Dmitry Tantsur wrote:
> On 4/9/19 7:20 PM, Jay Pipes wrote:
>> On 04/09/2019 12:51 PM, Dmitry Tantsur wrote:
>>> From ironic perspective there is no issue, but there is a critical
>>> question to decide: when Ironic+Placement is used, which of them acts
>>> as the final authority? If Ironic, then we need to teach Placement to
>>> talk to its Allocation API when allocating a bare metal node. If
>>> Placement, then we need to support Allocation API talking to
>>> Placement. I suspect the latter is saner, but I'd like to hear more
>>> opinions.
>>
>> Ironic (scheduler?) would request candidates from the placement
>> service using the GET /allocation_candidates API. Ironic (scheduler?)
>> would then claim the resources on a provider (a baremetal node) by
>> calling the POST /allocations API.
>
> Okay, this matches my expectation.
>
> My concern will be with Blazar and reservations. If reservations happen
> through Placement only, how will ironic know about them? I guess we need
> to teach Blazar to talk to Ironic, which in turn will talk to Placement.
Hmm. So, here's the problem: placement has no concept of time. [1]
Placement only knows about one period of time: now. Placement doesn't have any concept of an allocation or an inventory existing at some point in the future or in the past.
Just to play devil's advocate... what about changing/adding this? What if Placement did support an inventory having different states depending on time frame requested?
After 3+ years with the placement modeling, I've come to realize it was a fundamental mistake to not include a temporal aspect to both the inventories and allocations table schemas. While I would *not* support a schema that had different "states" for an inventory depending on the time frame requested, I *do* think that adding a claim_time and release_time column to the allocations table and a start_time and end_time column to the inventories table would allow Placement to fulfill a simple reservation system using the same transactional logic it currently uses.
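To make the temporal idea a little more concrete, here is a minimal sketch of the capacity check that claim_time and release_time columns would make possible. It is not Placement code and no such schema exists today; the Allocation class and function names are hypothetical, the inventory's own start_time/end_time window is left out for brevity, and time windows are treated as half-open intervals.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Allocation:
    """Hypothetical allocation row with the two proposed time columns."""
    used: int
    claim_time: datetime
    release_time: datetime


def peak_usage(allocations, start, end):
    """Highest concurrent usage during the half-open window [start, end)."""
    events = []
    for alloc in allocations:
        # Clip each allocation to the requested window.
        lo = max(alloc.claim_time, start)
        hi = min(alloc.release_time, end)
        if lo < hi:
            events.append((lo, alloc.used))
            events.append((hi, -alloc.used))
    # At the same instant, releases (negative) sort before claims (positive),
    # so back-to-back allocations do not count as overlapping.
    events.sort()
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def can_claim(total, allocations, amount, start, end):
    """True if `amount` more units fit under `total` for the whole window."""
    return peak_usage(allocations, start, end) + amount <= total


if __name__ == "__main__":
    existing = [Allocation(1, datetime(2019, 5, 1, 10), datetime(2019, 5, 1, 12))]
    print(can_claim(1, existing, 1, datetime(2019, 5, 1, 12), datetime(2019, 5, 1, 13)))
    print(can_claim(1, existing, 1, datetime(2019, 5, 1, 11), datetime(2019, 5, 1, 13)))
```

A request with no time window would reduce to the special case where start is "now" and end is unbounded, which is one way the existing non-temporal behaviour could be kept as the default.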
In my mind this would enable a more ideal division of responsibility:
* Placement manages the availability of resources and maintains the single source of truth for inventory at a given time.
++
* Blazar uses Placement as its default inventory backend. Blazar's main role now is business logic around quota and handling allocation/deallocation when a lease starts/ends.
Yes on Blazar handling the release of resources when the lease ends.

No on Blazar handling the acquisition of resources when the lease starts (that would fundamentally be accomplished by Placement if Placement had a temporal dimension to its allocations and inventories table schemas).

No on Blazar handling quota. Quota is a giant pain in the behind, frankly. Trust me, you want no part of it ;) No matter how many "dimensions" of quota slicing and dicing are made available, operators will always want to add yet another dimension. If it's not quota "classes", then it's different quotas per region, then different quotas per AZ, then different quotas per aggregate, and on and on.

Never mind the whole "we confused quotas with rate-limiting" and "here is a type of quota that is not consistently measurable" problems...

Anyway, my advice would be leave quotas alone if you can :)

    o But, Blazar could optionally use a different inventory backend, to allow standalone use (?)
Not sure why you'd want to do this. But, as Dima remarked in another sub-thread of this conversation, the question about "which things should a standalone service depend on" is a religious debate. (and a debate I no longer have the energy to participate in)
* Ironic uses Placement as its default inventory backend.
    o But, Ironic could optionally also manage its own inventory, to allow standalone use (?)
To further tease out the relationships here, we should think about what makes the most sense for baremetal reservations done via Blazar. Should Blazar always go to Ironic for this, ignoring Nova entirely? Or should it go through Nova if Nova is being used? I believe Blazar still will always have to go through Nova for instance reservations at minimum.
Certainly Blazar will have to go through Nova *in its current implementation*, since Blazar currently relies on host aggregates and special aggregate and flavor metadata to "reserve" compute nodes.
Keep in mind that Blazar is designed to integrate with arbitrary external services; currently it has integrations with Neutron (for provisioning Floating IPs as part of a lease), and it could support any number of other resources, like bandwidth on an uplink.
The flexibility for close integration with arbitrary services often comes with a high price: complexity and potential code rot.
Having learned more about Placement's design as a result of these threads, I'm excited about how it could make some things cleaner if it truly could handle the generic inventory management problem that advanced reservations pose.
If you will be in Denver, I'm happy to outline some ideas I had that would pave a way for adding a temporal dimension to Placement's database schema. I won't be able to implement these ideas, but I'm happy to share them with you if you're interested.

Best,
-jay
Therefore, Blazar must unfortunately keep all of the temporal state about reservations in its own data store. So, Ironic would actually have to talk to Blazar to create a reservation of some amount of resources and Blazar will need to call Placement to manage the state of resource inventory and allocations over time; as reservations are activated, Blazar will either create or swap allocation records in Placement to consume the Ironic resources for a tenant that made the reservation.
Best,
-jay

[1] this was a mistake for which I take full responsibility.
Cheers, /Jason
Sorry for the late response.

To be on the same page, this is how Blazar works today for instance reservation:

1. The user makes a reservation with a start and end time via the Blazar API.
2. Blazar looks up /its own DB/ to pick a host on which to schedule the instance, using /its own scheduler/.
3. When the reservation time starts:
   1. Blazar gives an inventory of that reservation's resource class to a child resource provider of the compute node resource provider Blazar has picked.
   2. Blazar creates a flavor that requests/consumes that reservation resource class and exposes it to the user who made the reservation.
4. The user boots an instance with that flavor, and placement schedules the instance to the compute node Blazar picked, since that compute node is the only resource provider that has inventory of the reservation class (in its child resource provider).

Since I'm not familiar with Ironic, I'm not sure what the pain point (technical blocker) would be in using the same mechanism for the Ironic driver.

The pain point for Blazar itself so far is the second step of the above sequence: Blazar's mini scheduler doesn't consider traits or even the overcommit ratio in placement :(

Jay's idea, I think, would let Blazar offload that second step to Placement and rely on it entirely, solving those pain points. That sounds attractive to me, but on the other hand, I also don't want every user to go through the new "temporal" search path, since it is useless for users who don't care about time.

So the point, IMO, is whether we can (or really should?) skip the temporal handling when it is not needed, and whether we can centralize the time-related code in the implementation, since developers don't want to always have to think about time.

On 2019/04/22 12:07, Jay Pipes wrote:
-- Tetsuro Nakamura <nakamura.tetsuro@lab.ntt.co.jp> NTT Network Service Systems Laboratories TEL:0422 59 6914(National)/+81 422 59 6914(International) 3-9-11, Midori-Cho Musashino-Shi, Tokyo 180-8585 Japan
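Step 3 of the instance reservation sequence Tetsuro describes (inventory on a child resource provider plus a matching flavor) can be sketched as the payloads involved. This is a rough illustration and not Blazar's actual code: the helper names and the CUSTOM_RESERVATION_ naming scheme are assumptions, and nothing here talks to a real service.

```python
def reservation_resource_class(reservation_id):
    """Derive a custom resource class name from a reservation ID.

    Custom resource classes must start with CUSTOM_ and use only
    upper-case letters, digits and underscores, so dashes are replaced.
    The RESERVATION_ naming scheme is an assumption for illustration.
    """
    return "CUSTOM_RESERVATION_" + reservation_id.upper().replace("-", "_")


def inventory_body(resource_class, provider_generation, total=1):
    """Body for PUT /resource_providers/{uuid}/inventories on the child provider."""
    return {
        "resource_provider_generation": provider_generation,
        "inventories": {resource_class: {"total": total}},
    }


def flavor_extra_specs(resource_class, amount=1):
    """Flavor extra specs that make an instance consume the reservation class."""
    return {f"resources:{resource_class}": str(amount)}


if __name__ == "__main__":
    rc = reservation_resource_class("3f2a9c1e-0b7d-4c55-9a41-6d2e8f0b1c77")
    print(inventory_body(rc, provider_generation=0))
    print(flavor_extra_specs(rc))
```

Because only the picked host's child provider reports inventory of that class, a boot request with the flavor can only land there, which is the property the message relies on in step 4.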
Chris,

Sorry for the slow reply here.

The team has talked about this a little bit in our team meetings. We had previously talked to the placement team about how it could benefit Cinder and I think we had reached the conclusion that there wasn't really any benefit that Cinder could get from placement.

I think, however, the open item is whether Placement can benefit from Cinder if we were to make available volume and storage backend information to Placement. If so we would need to understand the work involved.

It might be worth planning some cross project time at the PTG just to sync up on where things are at. Let me know if you are interested in doing this.

Thanks!
Jay

On 4/8/2019 11:16 AM, Chris Dent wrote:
From the etherpad [1]
* blazar * cinder * cyborg * ironic * neutron
Who else?
This is a bit of a catch-many topic. Despite being birthed in Nova, Placement is designed to be useful to lots of different services.
There's already some time defined at the PTG to talk about the interaction of Ironic, Blazar, and Placement.
What are the issues with that?
What are the issues other services are experiencing with Placement? Preventing people from using Placement?
What services are using Placement and the team doesn't know about it?
On 4/26/2019 8:50 AM, Jay Bryant wrote:
The team has talked about this a little bit in our team meetings. We had previously talked to the placement team about how it could benefit Cinder and I think we had reached the conclusion that there wasn't really any benefit that Cinder could get from placement.
I think, however, the open item is if Placement can benefit from Cinder if we were to make available volume and storage backend information to Placement. If so we would need to understand the work involved.
It might be worth planning some cross project time at the PTG just to sync up on where things are at. Let me know if you are interested in doing this.
Modeling AZ affinity in a central location (placement) between compute nodes and volumes would likely benefit the wonky [cinder]/cross_az_attach and related config options in cinder. We have a class of bugs in nova when that is enforced (cross_az_attach=False), which may be useful for HPC and Edge workloads, but isn't tested or supported very well at this time.

Granted, it might be as simple as reporting volumes (or their backend pool) as a resource provider and then putting that provider and the compute node provider in a resource provider aggregate (sort of like how we model [but don't yet use] shared DISK_GB resources). My thinking is if you had that modeled and nova is configured with cross_az_attach=False, and a server is created with some pre-existing volumes, the nova scheduler translates that to a request to placement for compute nodes only in the aggregate with whatever storage backend is providing those volume resources (the same AZ essentially).

But this is probably low priority and arguably re-inventing an already somewhat broken wheel. Would have to think about how doing this with placement would be superior to what we have today.

--
Thanks,
Matt
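As a rough sketch of the translation described here, the scheduler could add one member_of filter per aggregate that the server's pre-existing volumes belong to. The function name and the aggregate IDs are hypothetical, and the sketch assumes a Placement microversion in which repeated member_of parameters are ANDed together.

```python
from urllib.parse import urlencode


def scheduler_query(resources, volume_aggregates):
    """Build an /allocation_candidates query string restricted by aggregate.

    resources: mapping of resource class to amount for the server.
    volume_aggregates: aggregate UUIDs of the backends holding the
        server's pre-existing volumes (duplicates are collapsed).
    """
    requested = ",".join(f"{rc}:{amount}" for rc, amount in sorted(resources.items()))
    params = [("resources", requested)]
    # One member_of per distinct aggregate: a compute node must share an
    # aggregate with every volume backend to remain a candidate.
    for aggregate in sorted(set(volume_aggregates)):
        params.append(("member_of", aggregate))
    return urlencode(params)


if __name__ == "__main__":
    print(scheduler_query({"VCPU": 1, "MEMORY_MB": 512}, ["agg-1", "agg-1"]))
```

If the volumes sit in two different aggregates, the query carries two member_of filters and would normally return no candidates, which matches the cross_az_attach=False intent of refusing such a server.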
participants (9)
- Chris Dent
- Dmitry Tantsur
- Ed Leafe
- Jason Anderson
- Jay Bryant
- Jay Pipes
- Matt Riedemann
- Pierre Riteau
- Tetsuro Nakamura