You're right; I was thinking AZ but my fingers typed region.

The problem started Friday afternoon without anything being changed, AFAIK. It seems to have fixed itself over the weekend. If it starts happening again, I'll enable debugging. Thanks for your advice!
On Monday, August 29, 2022, 04:54:50 AM EDT, Pierre Riteau <pierre@stackhpc.com> wrote:


Hello,

First, a point about terminology: you use the term "region", but I think you meant an availability zone. Regions are something entirely different in OpenStack.

I would suggest the following:

- first, enable debug logging in Nova. You can do so in Kolla by setting nova_logging_debug to true and reconfiguring nova. As you can see in the code, there are several log statements at the debug level which would help you understand why candidate hosts are rejected by this filter: https://opendev.org/openstack/nova/src/branch/stable/train/nova/scheduler/filters/aggregate_multitenancy_isolation.py
- second, maybe check if you have other aggregates that are set to the "open" AZ and would have the filter_tenant_id property on them?
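
To illustrate why the second check matters, here is a minimal Python sketch of the idea behind the filter. It is a simplified model and not Nova's actual code: the function names, the data layout (one metadata dict per aggregate) and the tenant IDs are all hypothetical, and treating every key that starts with filter_tenant_id alike is an assumption of this model. The point it shows is that the decision is made per host, across every aggregate the host belongs to, so an "open" aggregate with no properties does not protect a host that is also a member of a restricted aggregate.

```python
from typing import Dict, List, Set


def tenant_restrictions(host_aggregates: List[Dict[str, str]]) -> Set[str]:
    """Collect tenant IDs from every aggregate the host is a member of.

    Each item in host_aggregates is the metadata dict of one aggregate.
    Assumption of this model: any key starting with "filter_tenant_id"
    holds a comma-separated list of allowed tenant IDs.
    """
    allowed: Set[str] = set()
    for metadata in host_aggregates:
        for key, value in metadata.items():
            if key.startswith("filter_tenant_id"):
                allowed.update(t.strip() for t in value.split(",") if t.strip())
    return allowed


def host_passes(host_aggregates: List[Dict[str, str]], project_id: str) -> bool:
    """Return True if the host is usable by project_id in this model."""
    allowed = tenant_restrictions(host_aggregates)
    if not allowed:
        # No aggregate of this host restricts tenants: the host is open.
        return True
    return project_id in allowed


# Hypothetical data: the "open" aggregate has no properties at all.
open_aggregate: Dict[str, str] = {}
# Hypothetical second aggregate that also contains the same host.
other_aggregate = {"filter_tenant_id": "tenant-a,tenant-b"}

# Host only in the "open" aggregate: any project passes.
print(host_passes([open_aggregate], "tenant-x"))
# Same host also in a restricted aggregate: other projects are rejected.
print(host_passes([open_aggregate, other_aggregate], "tenant-x"))
```

In this model, the second call rejects the host even though the "open" aggregate itself carries no filter_tenant_id, which is the situation the second suggestion is meant to rule out.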

On Fri, 26 Aug 2022 at 21:51, Albert Braden <ozzzo@yahoo.com> wrote:
We're running Kolla Train, and we use the AggregateMultiTenancyIsolation filter for some aggregates by setting filter_tenant_id. Today customers reported build failures when they try to build VMs in a non-filtered region. I am able to reproduce the issue:

os server create --image <image> --flavor medium --network private --availability-zone open alberttest1

| 5dd44105-2045-4d53-be43-5f521ddb420b | alberttest1 | ERROR  |          | <image> | medium |

2022-08-26 18:39:38.977 30 INFO nova.filters [req-342d065a-cd47-4edf-bc4b-3f84b34ab97c 25b53bdb96fb5f9f6e7331d7e03eee0a12c45746a9e8b978858b2140a5275a09 fdcf1553db504c8f82a2b54851a4c262 - 8793b235debf49e6aba6bd1e2bf65360 8793b235debf49e6aba6bd1e2bf65360] Filtering removed all hosts for the request with instance ID '5dd44105-2045-4d53-be43-5f521ddb420b'. Filter results: ['ComputeFilter: (start: 50, end: 50)', 'RetryFilter: (start: 50, end: 50)', 'AggregateNumInstancesFilter: (start: 50, end: 50)', 'AvailabilityZoneFilter: (start: 50, end: 6)', 'AggregateInstanceExtraSpecsFilter: (start: 6, end: 6)', 'ImagePropertiesFilter: (start: 6, end: 6)', 'ServerGroupAntiAffinityFilter: (start: 6, end: 6)', 'ServerGroupAffinityFilter: (start: 6, end: 6)', 'AggregateMultiTenancyIsolation: (start: 6, end: 0)']

Region "open" does not have any properties specified, so the AggregateMultiTenancyIsolation filter should not be active.

qde3:admin]$ os aggregate show open|grep properties
| properties        |                 

This is what we would see if it had the filter active:

:qde3:admin]$ os aggregate show closed|grep properties
| properties        | filter_tenant_id='1c41e088b35f4b438023d081a6f70292,3e9727aaf03e4459a176c28dbdb3965e,f9b4b7dc8c614bb09d66657afc3b21cd,121a5da3dd0b489986908bee7eea61ae,d580ccc4b07e478a9efc2d71acf04cc1,107e14eeda01400988e58f5aac8b2772', closed='true'                                           

What could be causing this filter to remove hosts when we haven't set filter_tenant_id for that aggregate?