[nova] instance breaks the affinity/anti-affinity of server group with force_hosts or force_nodes
Hi all,
My OpenStack all-in-one environment was set up with devstack. I created a server group with the anti-affinity policy. I then created an instance in this server group successfully with command [1]. A second instance was also created successfully in this server group, this time with force_host, using command [2]. But when I did not specify force_host, the instance failed to create [3].
I think the second instance broke the anti-affinity policy of the server group. So my question is whether this result is by design, or whether it is an issue that needs to be discussed. : )
[1] nova boot <instance-name> --flavor <flavor> --image <image> --availability-zone <az> --security-groups <sec-group> --nic net-id=<net-id> --hint group=<server-group>
[2] nova boot <instance-name> --flavor <flavor> --image <image> --availability-zone <az>:<host> --security-groups <sec-group> --nic net-id=<net-id> --hint group=<server-group>
[3] nova boot <instance-name> --flavor <flavor> --image <image> --availability-zone <az> --security-groups <sec-group> --nic net-id=<net-id> --hint group=<server-group>
Best Regards
Boxiang
On 3/14/2019 12:29 AM, Boxiang Zhu wrote:
My OpenStack all-in-one environment was set up with devstack. I created a server group with the anti-affinity policy. I then created an instance in this server group successfully with command [1]. A second instance was also created successfully in this server group, this time with force_host, using command [2]. But when I did not specify force_host, the instance failed to create [3].
I think the second instance broke the anti-affinity policy of the server group.
So my question is whether this result is by design, or whether it is an issue that needs to be discussed. : )
I assume [2] works because force_hosts is an administrative override during server create. Did you create [2] after [1], or concurrently?
It does seem odd that [2] would pass the ServerGroupAntiAffinityFilter even with the forced host. It seems that should be a 409 or 400 type of case. The behavior with [3] is what I would expect, but I haven't dug into the code. Maybe the forced_host is overriding the requested server group?
-- Thanks, Matt
In my all-in-one OpenStack, I created the first instance [1], then the second [2], and finally the third [3], one by one. Instances [1] and [2] are active and instance [3] is in error. But I only have one host, and instances [1] and [2] both use the anti-affinity server group.
In the current code [4], if force_hosts or force_nodes is specified, the scheduler ignores the filters. I think the scheduler should evaluate the filters even when force_hosts or force_nodes is specified. For that, I have pushed a patch to add a config option [5].
[1] nova boot <instance-name> --flavor <flavor> --image <image> --availability-zone <az> --security-groups <sec-group> --nic net-id=<net-id> --hint group=<server-group>
[2] nova boot <instance-name> --flavor <flavor> --image <image> --availability-zone <az>:<host> --security-groups <sec-group> --nic net-id=<net-id> --hint group=<server-group>
[3] nova boot <instance-name> --flavor <flavor> --image <image> --availability-zone <az> --security-groups <sec-group> --nic net-id=<net-id> --hint group=<server-group>
[4] https://github.com/openstack/nova/blob/master/nova/scheduler/host_manager.py...
[5] https://review.openstack.org/#/c/641908/
Best Regards
Boxiang
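The behavior described in this message can be illustrated with a small, self-contained sketch. This is a simplified model and not nova's actual code: the `Host` class, `anti_affinity_filter`, and `get_filtered_hosts` below are hypothetical names chosen for illustration, and the only assumption taken from the thread is that forced hosts are returned without consulting the filters.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    # Server groups that already have a member on this host.
    groups: set = field(default_factory=set)


def anti_affinity_filter(host, request):
    """Pass only hosts that do not already run a member of the group."""
    return request.get("group") not in host.groups


def get_filtered_hosts(hosts, filters, request, force_hosts=None):
    """Return the candidate hosts for a request (toy model)."""
    if force_hosts:
        # Forced hosts short-circuit: the filters are never consulted.
        return [h for h in hosts if h.name in force_hosts]
    return [h for h in hosts if all(f(h, request) for f in filters)]


# One host that already runs instance [1] from group "g1".
hosts = [Host("node-1", groups={"g1"})]
request = {"group": "g1"}

forced = get_filtered_hosts(hosts, [anti_affinity_filter], request,
                            force_hosts=["node-1"])
unforced = get_filtered_hosts(hosts, [anti_affinity_filter], request)

print([h.name for h in forced])    # ['node-1']: anti-affinity is violated
print([h.name for h in unforced])  # []: no valid host, as with command [3]
```

In this model the forced request lands on the only host even though a group member already runs there, while the unforced request finds no valid host.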
On 3/14/2019 9:04 AM, Boxiang Zhu wrote:
In the current code [4], if force_hosts or force_nodes is specified, the scheduler ignores the filters.
Oh right I always forget that force_hosts/nodes is analogous to the force parameter on live migrate and evacuate APIs which is used to bypass the scheduler and just use the requested host. Note that we removed the force parameters for the live migration and evacuate APIs in 2.68.
I think the scheduler should evaluate the filters even when force_hosts or force_nodes is specified.
Sylvain has talked about making force_hosts/nodes more like cold/live migration and evacuate, where a host is provided but not forced by bypassing the scheduler. I think that would be a good addition, but not really with a configuration option: generally we don't want config-driven API behavior since it's not interoperable. This is a murky area though, since force hosts/nodes is an admin API parameter by default policy, so it's not interoperable by default anyway. A couple of options to avoid a config option:
1. Add a new parameter (or couple of parameters) to the server create API which would deprecate the weird az:host:node format for forcing a host/node and, if used, would run the requested destination through the scheduler filters. This would be like how cold migrate with a target host works today. If users wanted to continue forcing the host and bypassing the scheduler, they could still use an older microversion with the az:host:node format.
2. At the very least, rather than a config option, add a policy rule to control whether or not az:host:node (force host/node) bypasses the scheduler filters, so some users with certain roles can do that but not others. This is not an ideal option though, and I'd prefer option 1.
-- Thanks, Matt
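The difference between forcing a host and merely requesting one (option 1) can be sketched as follows. This is an illustrative toy, not a proposed nova implementation: `select_host`, its `force` flag, and the local `NoValidHost` exception are hypothetical names used only for this example.

```python
class NoValidHost(Exception):
    """Local stand-in for a scheduling failure; not imported from nova."""


def anti_affinity(host, request):
    """Pass only hosts without an existing member of the requested group."""
    return request["group"] not in host["groups"]


def select_host(hosts, filters, request, requested_host=None, force=False):
    # Narrow the candidates to the requested host, if one was given.
    candidates = [h for h in hosts
                  if requested_host is None or h["name"] == requested_host]
    if not force:
        # Option 1: the requested host must still pass every filter.
        candidates = [h for h in candidates
                      if all(f(h, request) for f in filters)]
    if not candidates:
        raise NoValidHost(f"no host satisfies request {request}")
    return candidates[0]["name"]


hosts = [
    {"name": "node-1", "groups": {"g1"}},  # already runs a member of g1
    {"name": "node-2", "groups": set()},
]
request = {"group": "g1"}

# Requested and valid: accepted.
print(select_host(hosts, [anti_affinity], request, "node-2"))

# Requested but violating anti-affinity: rejected instead of silently placed.
try:
    select_host(hosts, [anti_affinity], request, "node-1")
except NoValidHost as exc:
    print("rejected:", exc)

# Forced: the filters are skipped and the policy is broken.
print(select_host(hosts, [anti_affinity], request, "node-1", force=True))
```

Under these assumptions, the requested-destination path gives the caller a clear failure rather than a server group whose policy no longer holds.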
On 3/14/2019 7:19 AM, Matt Riedemann wrote:
I think the scheduler should evaluate the filters even when force_hosts or force_nodes is specified.
Sylvain has talked about making force_hosts/nodes more like cold/live migration and evacuate, where a host is provided but not forced by bypassing the scheduler. I think that would be a good addition, but not really with a configuration option: generally we don't want config-driven API behavior since it's not interoperable. This is a murky area though, since force hosts/nodes is an admin API parameter by default policy, so it's not interoperable by default anyway. A couple of options to avoid a config option:
1. Add a new parameter (or couple of parameters) to the server create API which would deprecate the weird az:host:node format for forcing a host/node and if used, would run the requested destination through the scheduler filters. This would be like how cold migrate with a target host works today. If users wanted to continue forcing the host and bypass the scheduler, they could still use an older microversion with the az:host:node format.
2. At the very least, rather than a config option, add a policy rule to control whether or not az:host:node (force host/node) bypasses the scheduler filters, so some users with certain roles can do that but not others. This is not an ideal option though and I'd prefer option 1.
Another alternative to force_hosts/nodes during server create is to use the JsonFilter and provide a 'query' scheduler hint where you filter on a specific host/node, e.g.:
openstack server create --hint query='["=","$host","target.host.com"]' ...
Granted, the JsonFilter is not enabled by default and not very well tested, so we recommend against using it [1], but it is an existing alternative if you want to specify a host and still run through the filters.
[1] https://review.openstack.org/#/c/647796/
-- Thanks, Matt
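To make the idea of a JSON query hint concrete, here is a minimal sketch of how such a query could be evaluated against a host's attributes. It is a simplified illustration and not nova's JsonFilter implementation: the `evaluate` function, the small operator subset, and the convention that `$`-prefixed strings are looked up on a `host_state` dict are assumptions made for this example.

```python
import json


def evaluate(query, host_state):
    """Evaluate a nested [op, arg, ...] query against a host_state dict."""
    if not isinstance(query, list):
        # Strings starting with "$" are looked up on the host state.
        if isinstance(query, str) and query.startswith("$"):
            return host_state.get(query[1:])
        return query
    op, *args = query
    values = [evaluate(arg, host_state) for arg in args]
    if op == "=":
        return all(v == values[0] for v in values[1:])
    if op == "and":
        return all(values)
    if op == "or":
        return any(values)
    if op == "not":
        return not values[0]
    raise ValueError(f"unsupported operator: {op}")


hint = '["=", "$host", "target.host.com"]'
query = json.loads(hint)

print(evaluate(query, {"host": "target.host.com"}))  # True
print(evaluate(query, {"host": "other.host.com"}))   # False
```

Because a filter of this kind only narrows the candidate list, the remaining filters (including the server group ones) would still be applied to the matching host.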
Hi Matt,
Wow, that is wonderful. It's my first time using the JsonFilter. Following your suggestion and the doc [1], I added the JsonFilter to the `filter_scheduler` section of `nova.conf`. Then I used a command like:
openstack server create <name> ... --hint query='["=","$host","openstack-node-1"]'
Great: nova evaluates all the scheduler filters, and when it reaches the JsonFilter, it keeps only the host specified in the query and removes the others. :P
BTW, as you mentioned in the earlier email, the az:host:node format is an admin API. Maybe we can add a microversion to the boot API so that the filters are evaluated even with forced_host/forced_node. I have posted a draft spec to nova-specs [2]. If anyone has time, reviews and comments are welcome. : )
[1] https://docs.openstack.org/nova/latest/user/filter-scheduler.html
[2] https://review.openstack.org/#/c/645458/
Best Regards
Boxiang
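Since the query hint mixes JSON double quotes with a `$`-prefixed name that a shell would otherwise expand, it can be convenient to generate the argument rather than type it by hand. The following is a small sketch using only the Python standard library; the helper name `query_hint` is hypothetical.

```python
import json
import shlex


def query_hint(host):
    """Return a shell-quoted `query=...` value for a JsonFilter host match."""
    query = json.dumps(["=", "$host", host], separators=(",", ":"))
    # Single-quoting keeps the shell from expanding $host.
    return shlex.quote(f"query={query}")


print("openstack server create <name> ... --hint",
      query_hint("openstack-node-1"))
```

The printed hint is equivalent to the one in the command above; the shell strips the single quotes and passes the JSON through unchanged.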
On 3/28/2019 10:07 PM, Boxiang Zhu wrote:
BTW, as you mentioned in the earlier email, the az:host:node format is an admin API. Maybe we can add a microversion to the boot API so that the filters are evaluated even with forced_host/forced_node. I have posted a draft spec to nova-specs [2]. If anyone has time, reviews and comments are welcome. : )
[1] https://docs.openstack.org/nova/latest/user/filter-scheduler.html
[2] https://review.openstack.org/#/c/645458/
Just an update on this: I'm +2 on the proposed spec now. I'd suggest we get some more cores on board with the spec before it's approved, though, so that more than just two people (me and whoever approves) are aware of the proposal.
-- Thanks, Matt
participants (2): Boxiang Zhu, Matt Riedemann