[nova] Mempage fun
Stephen Finucane
sfinucan at redhat.com
Mon Jan 7 17:32:32 UTC 2019
We've been looking at a patch that landed some months ago and have
spotted some issues:
https://review.openstack.org/#/c/532168
In summary, that patch is intended to make the memory check for
instances pagesize-aware. The logic it introduces looks something
like this:
  If the instance requests a specific pagesize
    (#1) Check if each host cell can provide enough memory of the
         pagesize requested for each instance cell
  Otherwise
    If the host has hugepages
      (#2) Check if each host cell can provide enough memory of the
           smallest pagesize available on the host for each
           instance cell
    Otherwise
      (#3) Check if each host cell can provide enough memory for
           each instance cell, ignoring pagesizes
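
For anyone who prefers code, the three cases above can be sketched
roughly as follows. This is only an illustration, not the code from the
patch: HostCell, InstanceCell and cell_fits are hypothetical names, and
"the host has hugepages" is assumed here to mean "the host cell reports
any pagesize larger than 4k".

```python
from dataclasses import dataclass
from typing import Dict, Optional

SMALL_PAGE_KB = 4  # assumption: regular pages are 4 KiB


@dataclass
class HostCell:
    """Hypothetical host NUMA cell: free memory (KiB) keyed by pagesize (KiB)."""
    free_kb_by_pagesize: Dict[int, int]

    @property
    def total_free_kb(self) -> int:
        return sum(self.free_kb_by_pagesize.values())


@dataclass
class InstanceCell:
    """Hypothetical instance NUMA cell; pagesize_kb is None if not requested."""
    memory_kb: int
    pagesize_kb: Optional[int] = None


def cell_fits(host: HostCell, inst: InstanceCell) -> bool:
    if inst.pagesize_kb is not None:
        # Case (#1): enough free memory of the requested pagesize?
        free = host.free_kb_by_pagesize.get(inst.pagesize_kb, 0)
        return free >= inst.memory_kb
    if any(size > SMALL_PAGE_KB for size in host.free_kb_by_pagesize):
        # Case (#2): host has hugepages, so check the smallest pagesize only.
        smallest = min(host.free_kb_by_pagesize)
        return host.free_kb_by_pagesize[smallest] >= inst.memory_kb
    # Case (#3): no hugepages on the host, ignore pagesizes entirely.
    return host.total_free_kb >= inst.memory_kb
```

With a host cell of HostCell({4: 1000, 2048: 4096}), an instance cell
asking for 2000 KiB without a pagesize falls into case (#2) and is
rejected, because only the 1000 KiB of 4k memory is considered.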
This also has the side-effect of allowing instances with hugepages and
instances with a NUMA topology but no hugepages to co-exist on the same
host, because the latter will now be aware of hugepages and won't
consume them. However, there are a couple of issues with this:
1. It breaks overcommit for instances without a pagesize request
running on hosts with different pagesizes. This is because we don't
allow overcommit for hugepages, but case (#2) above means we are now
reusing the same functions previously used for actual hugepage
checks to check for regular 4k pages
2. It doesn't fix the issue when non-NUMA instances exist on the same
host as NUMA instances with hugepages. The non-NUMA instances don't
run through any of the code above, meaning they're still not
pagesize aware
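
To make issue (1) concrete, here is a toy comparison of the two kinds
of check. The function names are made up for illustration and the
numbers are arbitrary; the only real name is the ram_allocation_ratio
option, and how it is applied here is a simplification.

```python
def fits_with_overcommit(total_kb, used_kb, request_kb, ram_allocation_ratio):
    # Case (#3)-style check: capacity is scaled by the overcommit ratio.
    return used_kb + request_kb <= total_kb * ram_allocation_ratio


def fits_as_pages(total_kb, used_kb, request_kb):
    # Hugepage-style check, as reused by case (#2): no overcommit at all.
    return used_kb + request_kb <= total_kb


GIB_KB = 1024 * 1024

# A host cell with 16 GiB of 4k memory, 14 GiB of it already claimed,
# and an instance cell asking for another 4 GiB.
total, used, request = 16 * GIB_KB, 14 * GIB_KB, 4 * GIB_KB

allowed_before = fits_with_overcommit(total, used, request, 1.5)
allowed_after = fits_as_pages(total, used, request)
```

Here allowed_before is True (18 GiB fits in 16 GiB * 1.5) but
allowed_after is False, which is the regression described in (1): the
same request is rejected as soon as the host also has hugepages.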
We could probably fix issue (1) by modifying those hugepage functions
we're using to allow overcommit via a flag that we pass for case (#2).
We can mitigate issue (2) by advising operators to split hosts into
aggregates for 'hw:mem_page_size' set or unset (in addition to
'hw:cpu_policy' set to dedicated or shared/unset). I need to check but
I think this may be the case in some docs (sean-k-mooney said Intel
used to do this. I don't know about Red Hat's docs or upstream). In
addition, we did actually call that out in the original spec:
https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact
However, if we're doing that for non-NUMA instances, one would have to
question why the patch is necessary/acceptable for NUMA instances. For
what it's worth, a longer-term fix would be to start tracking hugepages in a
non-NUMA aware way too but that's a lot more work and doesn't fix the
issue now.
As such, my question is this: should we look at fixing issue (1) and
documenting issue (2), or should we revert the thing wholesale until we
work on a solution that could e.g. let us track hugepages via placement
and resolve issue (2) too?
Thoughts?
Stephen