On Mon, Feb 14, 2022 at 9:40 PM Laurent Dumont <laurentfdumont@gmail.com> wrote:From what I understand of baremetal nodes, they will show up as hypervisors from the Nova perspective.Can you try "openstack hypervisor list"From the doc+1. This is a good idea. This will tell us if Nova is at least syncing with Ironic. If it can't push the information to placement, that is obviously going to cause issues.Each bare metal node becomes a separate hypervisor in Nova. The hypervisor host name always matches the associated node UUID.
On Mon, Feb 14, 2022 at 10:03 AM Lokendra Rathour <lokendrarathour@gmail.com> wrote:Hi Julia,Thanks once again. we got your point and understood the issue, but we still are facing the same issue on our TRIPLEO Train HA Setup, even if the settings are done as per your recommendations.The error that we are seeing is again "No valid host was found"So this error is a bit of a generic catch error indicating it just doesn't know how to schedule the node. But the next error you mentioned *is* telling in that a node can't be scheduled if placement is not working.[trim]On further debugging, we found that in the nova-scheduler logs :
2022-02-14 12:58:22.830 7 WARNING keystoneauth.discover [-] Failed to contact the endpoint at http://172.16.2.224:8778/placement for discovery. Fallback to using that endpoint as the base url.
2022-02-14 12:58:23.438 7 WARNING keystoneauth.discover [req-ad5801e4-efd7-4159-a601-68e72c0d651f - - - - -] Failed to contact the endpoint at http://172.16.2.224:8778/placement for discovery. Fallback to using that endpoint as the base url.where 172.16.2.224 is the internal IP.going by document :it is given as below for commands:(overcloud) [root@overcloud-controller-0 ~]# endpoint=http://172.16.2.224:8778/placement(overcloud) [root@overcloud-controller-0 ~]# token=$(openstack token issue -f value -c id)(overcloud) [root@overcloud-controller-0 ~]# curl -sH "X-Auth-Token: $token" $endpoint/resource_providers/<node id> | jq .inventoriesnullresult is the same even if we run the curl command on public endpoint.Please advice.So this sounds like you have placement either not operating or incorrectly configured somehow. I am not a placement expert, but I don't think a node_id is used for resource providers.Hopefully a placement expert can chime in here. That being said, the note about the service failing to connect to the endpoint for discovery is somewhat telling. You *should* be able to curl the root of the API, without a token, and discover a basic JSON document response with information which is used for API discovery. If that is not working, then there may be several things occuring. I would check to make sure the container(s) running placement are operating, not logging any errors, and responding properly. If they are responding if directly queried, then I wonder if there is something going on with load balancing. Possibly consider connecting to placement's port directly instead of through any sort of load balancing such as what is provided by haproxy. I think placement's log indicates the port it starts on, so that would hopefully help. It's configuration should also share the information as well.