Hi there,

I like the idea, but historically Nova has steered away from giving more detail on why an instance failed to schedule, in order to avoid leaking information about the cloud.

I agree that it’s one of the more painful errors, but I see the purpose behind masking it from the user in an environment where the user is not the operator. 

It would be good to hear from other devs, or to see whether this could be an admin-level thing.

Thanks
Mohammed

On Wed, Aug 25, 2021 at 9:53 AM Brito, Hugo Nicodemos <Hugo.Brito@windriver.com> wrote:
Hi,

In a prototype, we have improved Nova's scheduling error messages.
This helps both developers and end users better understand the
scheduler problems that occur on creation of an instance.

When a scheduler error happens during instance creation with upstream
Nova, we get the following message on the Overview tab
(Horizon dashboard): "No valid host was found." This doesn't give us
enough information about what really happened, so our solution was to
add more details on the instance's overview page, e.g.:

The **Fault:Message** attribute provides a summary of why each host
cannot satisfy the instance’s resource requirements, e.g. for
controller-0, it indicates “No valid host was found. Not enough host
cell CPUs to fit instance cell” (where a cell is a NUMA node or socket).

The **Fault:Details** attribute provides even more detail for each
individual host. For example, it shows that the instance “required” 2
CPU cores and shows the “actual” CPU cores available on each “numa”
node: “actual:0, numa:1” and “actual:1, numa:0”.

These details are also available through the OpenStack CLI, in the
_fault_ attribute:

- openstack server show <instance>  

With that in mind, we'd like to know if you are open to considering
such a change. We are willing to submit a spec and upstream the
implementation.

Regards,
- nicodemos
--
Mohammed Naser
VEXXHOST, Inc.