[nova] Mempage fun
Balázs Gibizer
balazs.gibizer at ericsson.com
Tue Jan 8 08:54:39 UTC 2019
On Mon, Jan 7, 2019 at 6:32 PM, Stephen Finucane <sfinucan at redhat.com>
wrote:
> We've been looking at a patch that landed some months ago and have
> spotted some issues:
>
> https://review.openstack.org/#/c/532168
>
> In summary, that patch is intended to make the memory check for
> instances aware of memory pagesizes. The logic it introduces looks
> something like this:
>
> If the instance requests a specific pagesize
>     (#1) Check if each host cell can provide enough memory of the
>          pagesize requested for each instance cell
> Otherwise
>     If the host has hugepages
>         (#2) Check if each host cell can provide enough memory of
>              the smallest pagesize available on the host for each
>              instance cell
>     Otherwise
>         (#3) Check if each host cell can provide enough memory for
>              each instance cell, ignoring pagesizes
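>
> A rough, self-contained sketch of that check in Python (the class and
> helper names here are illustrative only, not the actual nova code):
>
>     from dataclasses import dataclass, field
>
>     @dataclass
>     class MemPages:
>         size_kb: int
>         total: int      # pages of this size on the host cell
>         used: int = 0
>
>     @dataclass
>     class HostCell:
>         memory_mb: int
>         memory_usage_mb: int = 0
>         mempages: list = field(default_factory=list)
>
>     def _fits_pagesize(host_cell, request_mb, pagesize_kb):
>         # Strict, hugepage-style check: no overcommit applied.
>         for pages in host_cell.mempages:
>             if pages.size_kb == pagesize_kb:
>                 avail_kb = (pages.total - pages.used) * pages.size_kb
>                 return request_mb * 1024 <= avail_kb
>         return False
>
>     def check_host_cell(host_cell, request_mb,
>                         requested_pagesize_kb=None):
>         if requested_pagesize_kb:
>             # (#1) the instance asked for an explicit pagesize
>             return _fits_pagesize(host_cell, request_mb,
>                                   requested_pagesize_kb)
>         if host_cell.mempages:
>             # (#2) no pagesize requested, but the host tracks
>             # pagesizes: check against the smallest one (e.g. 4K)
>             smallest = min(p.size_kb for p in host_cell.mempages)
>             return _fits_pagesize(host_cell, request_mb, smallest)
>         # (#3) plain, pagesize-unaware memory check
>         return (request_mb <=
>                 host_cell.memory_mb - host_cell.memory_usage_mb)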
>
> This also has the side-effect of allowing instances with hugepages
> and instances with a NUMA topology but no hugepages to co-exist on
> the same host, because the latter will now be aware of hugepages and
> won't consume them. However, there are a couple of issues with this:
>
> 1. It breaks overcommit for instances without a pagesize request
>    running on hosts with different pagesizes (a toy illustration
>    follows this list). This is because we don't allow overcommit for
>    hugepages, but case (#2) above means we are now reusing the same
>    functions previously used for actual hugepage checks to check for
>    regular 4k pages.
> 2. It doesn't fix the issue when non-NUMA instances exist on the
>    same host as NUMA instances with hugepages. The non-NUMA
>    instances don't run through any of the code above, meaning
>    they're still not pagesize aware.
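>
> To make issue (1) concrete with the toy model above (again purely
> illustrative, not the real nova functions): a 2048 MB host cell with
> ram_allocation_ratio=1.5 used to accept a 3000 MB instance that had
> no pagesize request, but routing that request through the strict
> check in case (#2) now rejects it because the ratio is never applied:
>
>     ratio = 1.5
>     cell = HostCell(memory_mb=2048,
>                     mempages=[MemPages(size_kb=4, total=2048 * 256)])
>
>     # Old behaviour (#3): overcommit allowed via the allocation ratio.
>     print(3000 <= cell.memory_mb * ratio - cell.memory_usage_mb)  # True
>
>     # New behaviour (#2): strict check against free 4K pages, no ratio.
>     print(check_host_cell(cell, 3000))                            # False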
>
> We could probably fix issue (1) by modifying those hugepage functions
> we're using to allow overcommit via a flag that we pass for case (#2)
> (a sketch of that follows below). We can mitigate issue (2) by
> advising operators to split hosts into aggregates with
> 'hw:mem_page_size' set or unset (in addition to 'hw:cpu_policy' set
> to dedicated or shared/unset). I need to check, but I think this may
> already be the case in some docs (sean-k-mooney said Intel used to do
> this; I don't know about Red Hat's docs or upstream). In addition, we
> did actually call that out in the original spec:
>
> https://specs.openstack.org/openstack/nova-specs/specs/juno/approved/virt-driver-large-pages.html#other-deployer-impact
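>
> A sketch of what that flag could look like, still in terms of the toy
> model above rather than the real nova functions (and ignoring how the
> ratio would actually be plumbed through):
>
>     def _fits_pagesize_v2(host_cell, request_mb, pagesize_kb,
>                           allow_overcommit=False, ratio=1.0):
>         for pages in host_cell.mempages:
>             if pages.size_kb == pagesize_kb:
>                 avail_kb = (pages.total - pages.used) * pages.size_kb
>                 if allow_overcommit:
>                     avail_kb = int(avail_kb * ratio)
>                 return request_mb * 1024 <= avail_kb
>         return False
>
>     # Case (#1), an explicit hugepage request: overcommit stays
>     # forbidden. Case (#2), the implicit smallest-pagesize check:
>     # pass allow_overcommit=True so hosts that only have 4K pages
>     # keep honouring ram_allocation_ratio.
>     print(_fits_pagesize_v2(cell, 3000, 4,
>                             allow_overcommit=True, ratio=1.5))  # True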
>
> However, if we're doing that for non-NUMA instances, one would have
> to question why the patch is necessary/acceptable for NUMA instances.
> For what it's worth, a longer-term fix would be to start tracking
> hugepages in a non-NUMA-aware way too, but that's a lot more work and
> doesn't fix the issue now.
>
> As such, my question is this: should we look at fixing issue (1) and
> documenting issue (2), or should we revert the thing wholesale until
> we work on a solution that could e.g. let us track hugepages via
> placement and resolve issue (2) too?
If you feel that fixing (1) is pretty simple, then I suggest doing
that and documenting the limitation of (2) while we think about a
proper solution.
gibi
>
> Thoughts?
> Stephen
>
>