[openstack-dev] [nova] placement/resource providers update 19

Matt Riedemann mriedemos at gmail.com
Mon Apr 17 19:49:52 UTC 2017


On 4/14/2017 5:18 AM, Chris Dent wrote:
>
> Here's the 19th placement and resource providers update. As usual,
> if I've left anything out, please follow up with that information.
>
> # What Matters Most
>
> Discussion continues on the spec for claims being done during
> scheduling. Getting this worked out and implemented is a high
> priority. Links below.
>
> # What's Changed
>
> The routes and handlers for adding and manipulating traits in the
> placement API have now merged. This opens the door for starting to
> report traits for compute-nodes and other resource providers and
> filtering based on those traits (note that the added code does not
> support that filtering; what's been added is the interface to CRUD
> traits).
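
For anyone who hasn't poked at the new interface yet, here is a rough
sketch of what the CRUD calls look like. Nothing is sent over the wire:
the code only builds request descriptions. The helper names are made up
for illustration, and the URL shapes and microversion 1.6 are my reading
of the placement API, so check them against the api-ref before relying
on them.

```python
import json

# Assumption: traits arrived in placement microversion 1.6.
PLACEMENT_VERSION = "placement 1.6"


def trait_request(method, path, body=None):
    """Build (but do not send) a placement request description."""
    headers = {
        "OpenStack-API-Version": PLACEMENT_VERSION,
        "Accept": "application/json",
    }
    data = None
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body, sort_keys=True)
    return {"method": method, "path": path, "headers": headers, "body": data}


def create_trait(name):
    # Create: PUT with no body; the trait name lives in the URL.
    return trait_request("PUT", f"/traits/{name}")


def list_traits():
    # Read: list every trait known to placement.
    return trait_request("GET", "/traits")


def delete_trait(name):
    # Delete: remove a trait by name.
    return trait_request("DELETE", f"/traits/{name}")


def set_provider_traits(rp_uuid, traits, generation):
    # Update: replace the full set of traits on one resource provider.
    body = {
        "traits": sorted(traits),
        "resource_provider_generation": generation,
    }
    return trait_request("PUT", f"/resource_providers/{rp_uuid}/traits", body)


if __name__ == "__main__":
    for req in (create_trait("CUSTOM_GOLD"), list_traits()):
        print(req["method"], req["path"])
```

The provider generation in the last call is what lets placement reject a
write that raced with another update to the same resource provider.
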
>
> The spec for associating user and project information with
> allocations and for being able to view usages based on those
> characteristics has been merged. We had to go a few rounds as we
> were so excited about this idea we missed some critical bits:
>
>
> http://specs.openstack.org/openstack/nova-specs/specs/pike/approved/placement-project-user.html
>
>
> The placement-api-ref gate job that checks the docs is now linking
> the output. Here's a sample (which may have expired by the time you
> are reading this message):
>
>
> http://docs-draft.openstack.org/98/456198/1/check/gate-placement-api-ref-nv/deee665//placement-api-ref/build/html/

Cool, this looks nice.

>
>
> More about docs below.
>
> # Help Wanted
>
> Areas where volunteers are needed.
>
> * General attention to bugs tagged placement:
>      https://bugs.launchpad.net/nova/+bugs?field.tag=placement
>
> * Helping to create api documentation for placement (see the Docs
>      section below).
>
> * Helping to create and evaluate functional tests of the resource
>      tracker and the ways in which it and nova-scheduler use the
>      reporting client. For some info see
>      https://etherpad.openstack.org/p/nova-placement-functional
>      and talk to edleafe.
>
> * Performance testing. If you have access to some nodes, some basic
>     benchmarking and profiling would be very useful. See the
>     performance section below.
>
> # Main Themes
>
> ## Traits
>
> The main API is in place; there's one patch left for a new command
> to sync the os-traits library into the database:
>
>      https://review.openstack.org/#/c/450125/
>
> There is a stack of changes to the os-traits library to add more traits
> and also automate creating symbols associated with the trait
> strings:
>
>      https://review.openstack.org/#/c/448282/
>
> ## Ironic/Custom Resource Classes
>
> There's a blueprint for "custom resource classes in flavors" that
> describes the stuff that will actually make use of custom resource
> classes:
>
>
> https://blueprints.launchpad.net/nova/+spec/custom-resource-classes-in-flavors

Due to the OSIC thing, Jay Pipes is going to pick this up now.

>
>
> The spec has merged, but the implementation has not yet started.
>
> Over in Ironic some functional and integration tests have started:
>
>      https://review.openstack.org/#/c/443628/
>
> ## Claims in the Scheduler
>
> Progress has been made on the spec for claims in the scheduler:
>
>      https://review.openstack.org/#/c/437424/
>
> Continued eyes and brains required. The current state is that more
> detail is desired on why some particular design choices are being
> made.

As of today's scheduler subteam meeting, the most important open
decision is whether conductor or the scheduler does the retries when a
claim fails due to a conflict. Related to the need for performance
testing at scale, it would really help to gather some data on both
approaches.

Retrying in nova-conductor means going back through the scheduler to
pull all of the hosts fresh and filter/weigh them again, which would be
more accurate but could be inefficient. Retrying in the filter scheduler
would be more efficient since we already have the hosts in an ordered
list and don't need to refresh, but that list could be stale by then.

Maybe we can stub some scale testing with fake compute nodes and the
fake compute driver for this. Having a 'test' in tree doesn't make a lot
of sense though, as it doesn't really pass or fail; it's just there to
compare alternatives. I was wondering if we could re-use some of
Yingxin's performance scale testing work that he presented at the Newton
summit [1]. Alex said Yingxin is working on Ceph now, but the tooling
should be on GitHub somewhere.
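
To make the comparison concrete, here is a toy sketch of the two retry
strategies. It is not nova code: FakeCloud, the rival-claim model and
both function names are invented for illustration, and the "cloud" is
just a dict of free capacity per fake host. A rival claim sneaks in
right before each of our claim attempts (while rivals remain) and drains
the host with the most free capacity, which is what forces a conflict.

```python
from dataclasses import dataclass


class ClaimConflict(Exception):
    """Raised when real capacity no longer satisfies a claim."""


@dataclass
class FakeCloud:
    """Hypothetical stand-in for placement plus fake compute nodes."""
    free: dict          # host name -> free capacity units
    rivals: int = 0     # competing claims still to be injected
    refreshes: int = 0  # number of full host refreshes performed
    attempts: int = 0   # number of claim attempts made

    def snapshot(self):
        # A full refresh: copy the current truth for filtering/weighing.
        self.refreshes += 1
        return dict(self.free)

    def claim(self, host, amount):
        self.attempts += 1
        if self.rivals > 0:
            # A competing claim drains the host with the most free space.
            self.rivals -= 1
            self.free[max(self.free, key=self.free.get)] = 0
        if self.free[host] < amount:
            raise ClaimConflict(host)
        self.free[host] -= amount


def order_hosts(view, amount):
    # Filter hosts that fit, then weigh by most free capacity first.
    fits = [h for h, free in view.items() if free >= amount]
    return sorted(fits, key=lambda h: view[h], reverse=True)


def retry_in_conductor(cloud, amount, max_attempts=5):
    # Every retry goes back for a fresh, re-weighed host list.
    for _ in range(max_attempts):
        candidates = order_hosts(cloud.snapshot(), amount)
        if not candidates:
            return None
        try:
            cloud.claim(candidates[0], amount)
            return candidates[0]
        except ClaimConflict:
            continue
    return None


def retry_in_scheduler(cloud, amount, max_attempts=5):
    # One refresh, then walk the (possibly stale) ordered list.
    candidates = order_hosts(cloud.snapshot(), amount)
    for host in candidates[:max_attempts]:
        try:
            cloud.claim(host, amount)
            return host
        except ClaimConflict:
            continue
    return None


def run(strategy):
    cloud = FakeCloud(
        free={"node1": 8, "node2": 6, "node3": 4, "node4": 2}, rivals=2)
    host = strategy(cloud, amount=2)
    return host, cloud.refreshes, cloud.attempts


if __name__ == "__main__":
    for strategy in (retry_in_conductor, retry_in_scheduler):
        print(strategy.__name__, run(strategy))
```

In this toy setup both strategies make the same number of claim
attempts; the conductor version pays for a full refresh on each attempt
while the scheduler version refreshes once. The model says nothing about
how expensive a real refresh is or how often a stale list picks a worse
host, which is exactly the data we'd need from scale testing.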

>
> Thinking about this stuff has also revealed some places where it's
> possible for allocations to become wrong or orphaned:
>
>      https://bugs.launchpad.net/nova/+bug/1679750
>      https://bugs.launchpad.net/nova/+bug/1661312
>
> ## Shared Resource Providers
>
> https://blueprints.launchpad.net/nova/+spec/shared-resources-pike
>
> Progress on this will continue once traits and claims have moved forward.
>
> ## Nested Resource Providers
>
> On hold while attention is given to traits and claims. There's a
> stack of code waiting until all of that settles:
>
>
> https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/nested-resource-providers
>
>
> ## Docs
>
> https://review.openstack.org/#/q/topic:cd/placement-api-ref
>
> Several reviews are in progress for documenting the placement API.
> This is likely going to take quite a few iterations as we work out
> the patterns and tooling. But it's great to see the progress and
> when looking at the draft rendered docs it makes placement feel like
> a real thing™.
>
> Find me (cdent) or Andrey (avolkov) if you want to help out or have
> other questions.
>
> ## Performance
>
> We're aware that there are some redundancies in the resource tracker
> that we'd like to clean up
>
>
> http://lists.openstack.org/pipermail/openstack-dev/2017-January/110953.html
>
> but it's also the case that we've done no performance testing on the
> placement service itself.
>
> We ought to do some testing to make sure there aren't unexpected
> performance drains.
>
> # Other Code/Specs
>
> * https://review.openstack.org/#/c/448791/
>     Idempotent PUT for resource classes. This is something that was
>     discovered while evaluating some resource tracker code.
>
>     Once this merges a change to the report client can be made
>     to use it.
>
> * https://bugs.launchpad.net/nova/+bug/1632852
>     Cache headers not produced by placement API. This was assigned to
>     several different people over time, but I'm not sure if there is
>     any active code.
>
> * https://etherpad.openstack.org/p/placement-newton-leftovers
>     There's still some lingering stuff on here, some of which is
>     mentioned elsewhere in this message, but not all.
>
> * https://review.openstack.org/#/c/456717/
>     There's effort afoot over in devstack to use a combination of
>     apache2, mod_proxy_uwsgi, and uwsgi itself to run the services.
>     The review above is to the placement part of that. This allows
>     the placement api to be managed by systemd, not occupy a port,
>     and have some reasonable log handling.
>
> * https://review.openstack.org/#/q/project:openstack/osc-placement
>     Work has started on an osc-plugin that can provide a command
>     line interface to the placement API.
>
> # End
>
> Thanks for reading.
>

[1] https://docs.google.com/presentation/d/1UG1HkEWyxPVMXseLwJ44ZDm-ek_MPc4M65H8EiwZnWs/edit?ts=571fcdd5#slide=id.g12d2cf15cd_2_90

-- 

Thanks,

Matt


