On 5/6/2019 10:49 AM, Chris Dent wrote:
Still, nova might want to fix this placement data inconsistency. I guess the new placement microversion will allow updating the consumer type of an allocation.
Yeah, I think this has to be updated from Nova. I (and I imagine others) would like to avoid making the type field optional in the API. So maybe default the value to something like "incomplete" or "unknown" and then let nova correct this naturally for instances on host startup and for migrations on complete/revert. Ideally nova will be one of the users that wants to depend on the type string, so we should use our knowledge of which is which to get existing allocations updated, letting us depend on the type value later.
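To make that concrete, here's a rough sketch of what the healing could look like from the nova side. Hypothetical and untested: the microversion value and the "consumer_type" key in the PUT payload are assumptions about an API that doesn't exist yet; the GET/PUT /allocations/{consumer_uuid} request/response shapes otherwise follow microversion 1.28.

# Hypothetical sketch: re-PUT an allocation with an explicit consumer
# type, assuming a future placement microversion that accepts a
# "consumer_type" key in PUT /allocations/{consumer_uuid}.
import requests

PLACEMENT = "http://placement.example.com"  # assumed endpoint
HEADERS = {
    "X-Auth-Token": "...",  # keystone token goes here
    # Hypothetical future microversion that adds consumer types.
    "OpenStack-API-Version": "placement 1.XX",
}

def heal_consumer_type(consumer_uuid, consumer_type):
    """Rewrite an allocation so its type goes from the 'unknown'
    default to e.g. 'INSTANCE' or 'MIGRATION'."""
    url = "%s/allocations/%s" % (PLACEMENT, consumer_uuid)
    current = requests.get(url, headers=HEADERS).json()
    payload = {
        # PUT only accepts "resources" per provider, so strip the
        # "generation" key that GET returns alongside it.
        "allocations": {
            rp: {"resources": alloc["resources"]}
            for rp, alloc in current["allocations"].items()
        },
        "project_id": current["project_id"],
        "user_id": current["user_id"],
        # consumer_generation guards against racing writers.
        "consumer_generation": current["consumer_generation"],
        "consumer_type": consumer_type,  # the hypothetical new field
    }
    requests.put(url, json=payload, headers=HEADERS).raise_for_status()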
Ah, okay, good. If something like "unknown" is workable I think that's much, much better than defaulting to instance. Thanks.
Yup, I agree with everything said from a nova perspective.

Our public cloud operators were just asking about leaked allocations and whether there was tooling to report and clean that kind of stuff up. I explained we have the heal_allocations CLI, but that's only going to create allocations for *instances*, and only if those instances aren't deleted; we don't have anything in nova that deals with detection and cleanup of leaked allocations, sort of like what this tooling does [1] but I think is different.

So I was thinking about how we could write something in nova that reads the allocations from placement and checks whether there is anything in there that doesn't match what we have for instances or migrations, i.e. the server was deleted but for whatever reason an allocation was leaked. To determine which allocations are nova-specific today we'd have to guess based on the resource classes being used, namely VCPU and/or MEMORY_MB, but it of course gets more complicated once we start adding support for nested allocations and such. So consumer type will help here, but we need it in more than just the GET /usages API, I think.

If I were writing that kind of report/cleanup tool today, I'd probably want a GET /allocations API, but that might be too heavy (it would definitely require paging support, I think). I could probably get by with using GET /resource_providers/{uuid}/allocations for each compute node we have in nova, but again that starts to get complicated with nested providers (what if the allocations are for VGPU?).

Anyway, from a "it's better to have something than nothing at all" perspective, it's probably easiest to just start with the easy thing: ask placement for allocations on all compute node providers, cross-check those consumers against what's in nova, and if we find allocations that don't have a matching migration or instance we could optionally delete them. A rough sketch of that idea is appended below.

[1] https://github.com/larsks/os-placement-tools

--
Thanks,
Matt
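For what it's worth, a rough sketch of that cross-check (hypothetical and untested: the sets of known instance/migration UUIDs are stand-ins for nova DB lookups, and only documented placement endpoints as of microversion 1.28 are used):

# Hypothetical sketch of a leaked-allocation report/cleanup pass:
# walk each compute node provider, list its allocations, and flag
# consumers that match neither an instance nor a migration in nova.
import requests

PLACEMENT = "http://placement.example.com"  # assumed endpoint
HEADERS = {
    "X-Auth-Token": "...",  # keystone token goes here
    "OpenStack-API-Version": "placement 1.28",
}

def provider_consumers(rp_uuid):
    """Return the consumer UUIDs allocated against one provider,
    via GET /resource_providers/{uuid}/allocations."""
    url = "%s/resource_providers/%s/allocations" % (PLACEMENT, rp_uuid)
    resp = requests.get(url, headers=HEADERS)
    resp.raise_for_status()
    # Response body: {"allocations": {consumer_uuid: {...}}, ...}
    return set(resp.json()["allocations"])

def find_leaks(compute_node_uuids, known_instances, known_migrations,
               delete=False):
    """Cross-check placement consumers against nova's instances and
    migrations; optionally delete consumers nova knows nothing about.
    known_instances/known_migrations are stand-ins for DB lookups."""
    leaked = set()
    for rp_uuid in compute_node_uuids:
        for consumer in provider_consumers(rp_uuid):
            if consumer in known_instances or consumer in known_migrations:
                continue
            leaked.add(consumer)
            if delete:
                # DELETE /allocations/{consumer_uuid} removes all of
                # the consumer's allocations, including any on child
                # providers (e.g. VGPU) we did not walk directly.
                requests.delete(
                    "%s/allocations/%s" % (PLACEMENT, consumer),
                    headers=HEADERS).raise_for_status()
    return leaked

Note this walk only catches consumers that have at least one allocation against the compute node root provider; as noted above, nested-only allocations (e.g. VGPU on a child provider) would slip through, which is exactly where having a consumer type would really help.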