As per the ironic docs for configuring nova, we’ve had that flag disabled. Right now we’re running with a single ironic-conductor, so I don’t think the hash-ring behavior should affect things?
On Mon, 2024-04-15 at 14:53 +0000, Michael Sherman wrote:

The hash ring behavior is not related to ironic conductors; it's related to the peer_list option in the nova-compute service config.

Originally the ironic driver only supported a single nova-compute service for the entire ironic deployment. At that time it was possible to map all of your ironic nodes, via that single compute service, to a host aggregate/AZ. In newton https://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/ir... support for running multiple nova-compute agents with the ironic virt driver was added. That introduced the concept of a hash ring that balanced compute nodes between compute services at runtime based on the up state of the compute services. With that change it became impossible to reliably manage ironic compute nodes with host aggregates, as the driver would unconditionally balance across all ironic compute services.

With https://specs.openstack.org/openstack/nova-specs/specs/stein/implemented/iro... the ironic driver was enhanced with awareness of ironic conductor groups. This introduced a peer_list option and a partition_key. In principle it was possible to create a host aggregate that mapped to a conductor group by including all hosts listed in the peer_list in the host aggregate, i.e. if you had partition_key=conductor_group_1 and peer_list=ironic-1,ironic-2 and you created a host aggregate containing ironic-1 and ironic-2, that could work. However it is not tested or supported in general: without carefully configuring the ironic driver, the hash ring can violate the constraints and move compute nodes between aggregates by balancing them to compute services not listed in the host aggregate.

In antelope/bobcat we deprecated the hash ring mechanism and introduced a new HA model and a new ironic sharding mechanism; this was finally implemented in caracal 2024.1 https://specs.openstack.org/openstack/nova-specs/specs/2024.1/implemented/ir... With this deployment topology it is now guaranteed that nova will not rebalance compute nodes between compute services, which means you can once again statically map compute services (and the ironic shards they manage) to a host aggregate without worrying that the ironic driver will violate the aggregate expectations.
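For concreteness, the stein-era conductor-group mapping described above would look roughly like this; the group and host names are taken from the example, and as noted this pattern is not tested or supported in general:

    # nova.conf on both ironic-1 and ironic-2
    [ironic]
    partition_key = conductor_group_1
    peer_list = ironic-1,ironic-2

    # a host aggregate containing exactly the peer_list members
    openstack aggregate create conductor-group-1-agg
    openstack aggregate add host conductor-group-1-agg ironic-1
    openstack aggregate add host conductor-group-1-agg ironic-2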
I’m very surprised to hear that nova aggregates are not supported with ironic; I couldn’t find that indicated anywhere in the docs. We’ve been using this configuration (Ironic + aggregates) since Rocky, and the Blazar project’s support for ironic depends on host aggregates.
We did have this documented at one point, but I agree it's not something that is widely known, and what makes matters worse is that it almost works in some cases.

What people often don't realise is that the host aggregate API https://docs.openstack.org/api-ref/compute/#host-aggregates-os-aggregates is written in terms of compute services, not compute nodes. So when trying to use it with ironic, they expect to be able to add individual ironic servers to a host aggregate https://docs.openstack.org/api-ref/compute/#add-host but they can only add the compute services. That means you can't use the tenant isolation filter to isolate a subset of ironic nodes for a given tenant.

I'm not sure how blazar was trying to use aggregates with ironic, but I suspect their integration was incomplete, if not fundamentally broken, by the limitations of how aggregates function when used with ironic. If blazar is using placement aggregates rather than nova host aggregates that might change things, but there is no nova API to request that an instance be created in a placement aggregate. Each ironic node has its own resource provider in placement, and placement aggregates work at the resource-provider level; that means you can create an aggregate of ironic nodes in placement. While that is nice, since you can't use that aggregate in a nova API request, it is really only useful if blazar is going directly to ironic.
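To illustrate the compute-service vs compute-node distinction, a sketch using hypothetical names and UUIDs (the placement commands come from the osc-placement plugin):

    # nova host aggregates accept compute service hosts only:
    openstack aggregate add host my-agg ironic-compute-1
    # there is no way to add an individual ironic node here

    # placement aggregates operate per resource provider, and each
    # ironic node has its own resource provider:
    openstack resource provider aggregate set \
        --aggregate $AGG_UUID --generation $RP_GENERATION $NODE_RP_UUID

But as above, nova has no API to target that placement aggregate in a server create request.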
Best,
-Mike Sherman
On 4/15/24, 6:13 AM, "smooney@redhat.com" <smooney@redhat.com> wrote:
On Sat, 2024-04-13 at 12:52 +0000, Michael Sherman wrote:
Hi all,
We’re running into an issue where, at two sites with 150-250 ironic nodes on a single conductor and nova-compute instance, we’ve started to get “no hosts available” errors from the nova scheduler.
We’re using the blazar-nova filter to match on hosts in specifically tagged aggregates. After adding some debug logs, I found that the “host_state” object passed to the filter seems to have out-of-date aggregate information.

So the first thing to be aware of is that host aggregates are not supported with the ironic virt driver. Until the caracal release the ironic virt driver used a hash ring to balance compute nodes between compute services, which, among other things, broke host aggregates. From a nova project perspective, using host aggregates with ironic compute services is unsupported.
In caracal they might now work when ironic sharding is used. Host aggregates map compute services, not compute nodes, to an aggregate, so when using shards you can map a given shard to a host aggregate by mapping the compute service for that shard to the aggregate.
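As a sketch of that caracal topology (the shard and host names are illustrative, and the shard field requires a sufficiently recent ironic API and client):

    # assign ironic nodes to a shard
    openstack baremetal node set --shard shard-a $NODE_UUID

    # nova.conf on the single compute service that owns the shard
    [ironic]
    shard = shard-a

    # statically map the shard via its compute service
    openstack aggregate create shard-a-agg
    openstack aggregate add host shard-a-agg ironic-compute-a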
Specifically, if I query the system with “openstack aggregate show …” or “openstack allocation candidate list”, I see the correct aggregate for the nodes in question, but the contents of “host_state” reflect a previous state.
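(For reference, the kinds of queries being compared look like this; the aggregate name and resource class are placeholders, and allocation candidate list is provided by the osc-placement plugin:)

    openstack aggregate show blazar-reservable
    openstack allocation candidate list --resource CUSTOM_BAREMETAL=1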
This “staleness” does not seem to correct itself over time, but it is resolved by restarting the nova-scheduler process (actually restarting the kolla docker container, but with the same effect). However, the issue returns over the course of a couple of hours.
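(On a kolla-ansible deployment with default container naming, that restart is presumably just:)

    docker restart nova_scheduler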
This is likely caused by a combination of the hash ring and the host cache. Again, your current topology is unsupported, as we do not officially support using host aggregates with ironic nodes. With that said, you could try disabling the caching by setting https://docs.openstack.org/nova/latest/configuration/config.html#filter_sche... on the scheduler and all compute services.
That may or may not work depending on why the staleness is happening, but my guess is that the "stale" compute nodes have been rebalanced by the hash ring.
The other way to work around this might be to ensure you do not use peer_list and have exactly one compute service per conductor group. Again, I'm not sure if that will work, because you're trying to use a feature (host aggregates) that is not supported by the ironic virt driver, but it might mitigate the incompatibility in older releases.
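A sketch of that workaround, assuming exactly one nova-compute per conductor group (the group name is illustrative):

    # nova.conf on the single compute service for this conductor group
    [ironic]
    partition_key = conductor_group_1
    # peer_list deliberately unset: with one service per group there are
    # no peers for the hash ring to rebalance nodes onto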
We haven’t increased the number of nodes, or otherwise changed the hardware, so I’m not sure what could have triggered this issue.
Any advice on further debugging steps would be greatly appreciated. Thank you!
--
Michael Sherman
Infrastructure Lead – Chameleon
Computer Science, University of Chicago
MCS, Argonne National Lab