[openstack-dev] [Nova] Cells conversation starter

Andrew Laski andrew.laski at rackspace.com
Wed Oct 22 18:55:04 UTC 2014


On 10/22/2014 12:24 AM, Tom Fifield wrote:
> On 22/10/14 03:07, Andrew Laski wrote:
>> On 10/21/2014 04:31 AM, Nikola Đipanov wrote:
>>> On 10/20/2014 08:00 PM, Andrew Laski wrote:
>>>> One of the big goals for the Kilo cycle by users and developers of the
>>>> cells functionality within Nova is to get it to a point where it can be
>>>> considered a first class citizen of Nova.  Ultimately I think this comes
>>>> down to getting it tested by default in Nova jobs, and making it easy
>>>> for developers to work with.  But there's a lot of work to get there.
>>>> In order to raise awareness of this effort, and get the conversation
>>>> started on a few things, I've summarized a little bit about cells and
>>>> this effort below.
>>>>
>>>>
>>>> Goals:
>>>>
>>>> Testing of a single cell setup in the gate.
>>>> Feature parity.
>>>> Make cells the default implementation.  Developers write code once and
>>>> it works for cells.
>>>>
>>>> Ultimately the goal is to improve maintainability of a large feature
>>>> within the Nova code base.
>>>>
>>> Thanks for the write-up Andrew! Some thoughts/questions below. Looking
>>> forward to the discussion on some of these topics, and would be happy to
>>> review the code once we get to that point.
>>>
>>>> Feature gaps:
>>>>
>>>> Host aggregates
>>>> Security groups
>>>> Server groups
>>>>
>>>>
>>>> Shortcomings:
>>>>
>>>> Flavor syncing
>>>>       This needs to be addressed now.
>>>>
>>>> Cells scheduling/rescheduling
>>>> Instances can not currently move between cells
>>>>       These two won't affect the default one cell setup so they will be
>>>> addressed later.
>>>>
>>>>
>>>> What does cells do:
>>>>
>>>> Schedule an instance to a cell based on flavor slots available.
>>>> Proxy API requests to the proper cell.
>>>> Keep a copy of instance data at the global level for quick retrieval.
>>>> Sync data up from a child cell to keep the global level up to date.
>>>>
>>>>
>>>> Simplifying assumptions:
>>>>
>>>> Cells will be treated as a two level tree structure.
>>>>
>>> Are we thinking of making this official by removing the code that
>>> allows cells to form an actual tree of depth N? I'm not sure removal
>>> would be a win in itself, although the generality does complicate the
>>> RPC/Messaging/State code a bit; but if it's not being used, why keep it
>>> around, nice generalization though it is?
>> My preference would be to remove that code since I don't envision anyone
>> writing tests to ensure that functionality works and/or doesn't
>> regress.  But there's the challenge of not knowing if anyone is actually
>> relying on that behavior.  So initially I'm not creating a specific work
>> item to remove it.  But I think it needs to be made clear that it's not
>> officially supported and may get removed unless a case is made for
>> keeping it and work is put into testing it.
> While I agree that N is a bit interesting, I have seen N=3 in production
>
> [central API]-->[state/region1]-->[state/region DC1]
>                                 \->[state/region DC2]
>                -->[state/region2 DC]
>                -->[state/region3 DC]
>                -->[state/region4 DC]

I would be curious to hear any information about how this is working 
out.  Does everything that works for N=2 work when N=3?  Were there fixes 
needed to make this work?  Why do it this way rather 
than bring [state/region DC1] and [state/region DC2] up a level?


>
>
>>>> Plan:
>>>>
>>>> Fix flavor breakage in child cell which causes boot tests to fail.
>>>> Currently the libvirt driver needs flavor.extra_specs, which is not
>>>> synced to the child cell.  Some options are to sync flavor and extra
>>>> specs to child cell db, or pass full data with the request.
>>>> https://review.openstack.org/#/c/126620/1 offers a means of passing full
>>>> data with the request.
>>>>
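To make the "pass full data with the request" option above concrete, it
could look something like the sketch below. All names and fields here are
illustrative, not actual Nova objects or RPC payloads:

```python
# Illustrative sketch: embed the full flavor (including extra_specs) in the
# boot request sent to the child cell, instead of relying on the child
# cell's database having a synced copy of the flavor.

def build_boot_request(instance_uuid, flavor):
    """Serialize everything the child cell's virt driver will need."""
    return {
        "instance_uuid": instance_uuid,
        "flavor": {
            "flavorid": flavor["flavorid"],
            "memory_mb": flavor["memory_mb"],
            "vcpus": flavor["vcpus"],
            "root_gb": flavor["root_gb"],
            # Carried inline so the libvirt driver in the child cell does
            # not need a local flavor lookup that could miss extra_specs.
            "extra_specs": dict(flavor.get("extra_specs", {})),
        },
    }
```

The trade-off versus syncing flavors to the child cell db is larger request
payloads in exchange for never having stale extra_specs at boot time.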
>>>> Determine proper switches to turn off Tempest tests for features that
>>>> don't work with the goal of getting a voting job.  Once this is in place
>>>> we can move towards feature parity and work on internal refactorings.
>>>>
>>>> Work towards adding parity for host aggregates, security groups, and
>>>> server groups.  They should be made to work in a single cell setup, but
>>>> the solution should not preclude them from being used in multiple
>>>> cells.  There needs to be some discussion as to whether a host aggregate
>>>> or server group is a global concept or per cell concept.
>>>>
>>> Have there been any previous discussions on this topic? If so I'd really
>>> like to read up on those to make sure I understand the pros and cons
>>> before the summit session.
>> The only discussion I'm aware of is some comments on
>> https://review.openstack.org/#/c/59101/ , though they mention a
>> discussion at the Utah mid-cycle.
>>
>> The main con I'm aware of for defining these as global concepts is that
>> there is no rescheduling capability in the cells scheduler.  So if a
>> build is sent to a cell with a host aggregate that can't fit that
>> instance, the build will fail even though there may be space in that host
>> aggregate from a global perspective.  That should be somewhat
>> straightforward to address though.
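That rescheduling fix could be as simple as trying candidate cells in
order, sketched below with illustrative names rather than the real cells
scheduler interfaces:

```python
# Rough sketch: if a build fails in the chosen cell, fall back to the next
# candidate cell (e.g. another cell hosting the same host aggregate) rather
# than failing the whole request.

def schedule_with_retry(cells, try_build):
    """Try each candidate cell in order until one accepts the build."""
    errors = []
    for cell in cells:
        try:
            return try_build(cell)
        except Exception as exc:  # a real impl would catch a specific error
            errors.append((cell, exc))
    raise RuntimeError("build failed in all candidate cells: %r" % errors)
```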
>>
>> I think it makes sense to define these as global concepts.  But these
>> are features that aren't used with cells yet so I haven't put a lot of
>> thought into potential arguments or cases for doing this one way or
>> another.
>>
>>
>>>> Work towards merging compute/api.py and compute/cells_api.py so that
>>>> developers only need to make changes/additions in one place.  The goal
>>>> is for as much as possible to be hidden by the RPC layer, which will
>>>> determine whether a call goes to a compute/conductor/cell.
>>>>
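On the "hidden by the RPC layer" point, the routing could be sketched
roughly as below; the class and client shapes are made up for illustration,
not the real nova.rpc interfaces:

```python
# Sketch of a single API entry point whose RPC client picks the right
# target (local conductor/compute path vs. a cell's message bus) per call.

class ComputeRPCRouter:
    def __init__(self, local_client, cell_clients):
        self._local = local_client   # direct compute/conductor path
        self._cells = cell_clients   # mapping: cell name -> cell client

    def cast(self, method, instance, **kwargs):
        """Route a call based on where the instance lives."""
        cell = instance.get("cell_name")
        client = self._cells[cell] if cell else self._local
        return client(method, instance=instance, **kwargs)
```

With something like this in place, compute/api.py code paths would not need
to know whether they are running in a cells deployment at all.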
>>>> For syncing data between cells, look at using objects to handle the
>>>> logic of writing data to the cell/parent and then syncing the data to
>>>> the other.
>>>>
>>> Some of that work has been done already, although in a somewhat ad-hoc
>>> fashion. Were you thinking of extending objects to support this natively
>>> (whatever that means), or do we continue to inline the code in the
>>> existing object methods?
>> I would prefer to have some native support for this.  In general data is
>> considered authoritative at the global level or the cell level.  For
>> example, instance data is synced down from the global level to a
>> cell (except for a few fields which are synced up) but a migration would
>> be synced up.  I could imagine decorators that would specify how data
>> should be synced and handle that as transparently as possible.
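The decorator idea might look something like the sketch below, with made-up
names rather than real nova.objects code:

```python
# Sketch: declare the sync direction on an object's save method.
# direction="up" means the authoritative write happens in the child cell
# and is pushed to the global level; "down" is the reverse.
import functools

def cells_sync(direction):
    """Mark an object method as needing a sync after the local write."""
    assert direction in ("up", "down")

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(self, *args, **kwargs):
            result = fn(self, *args, **kwargs)
            # After the authoritative write, queue a sync to the other side.
            self.queue_sync(direction)
            return result
        wrapper.sync_direction = direction
        return wrapper
    return decorator

class Migration:
    """Stand-in for an object whose writes originate in the child cell."""
    def __init__(self):
        self.synced = []

    def queue_sync(self, direction):
        self.synced.append(direction)

    @cells_sync("up")
    def save(self):
        return "saved"
```

The point is that the sync policy lives in one declarative spot per object
method instead of being inlined ad hoc into each method body.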
>>
>>>> A potential migration scenario is to consider a non-cells setup to be a
>>>> child cell, and converting to cells will mean setting up a parent cell
>>>> and linking them.  There are periodic tasks in place to sync data up
>>>> from a child already, but a manual kick off mechanism will need to be
>>>> added.
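The manual kick-off mentioned there could be as simple as a one-shot walk
over the child cell's instances, sketched here with illustrative names
rather than the real periodic-task code:

```python
# Sketch: one-shot sync of every instance from a newly-linked child cell up
# to its new parent, reusing whatever per-instance push path the periodic
# sync task already uses.

def kick_off_full_sync(child_db, push_to_parent):
    """Push every instance in the child cell to the parent; return count."""
    count = 0
    for instance in child_db.iter_instances():
        push_to_parent(instance)  # same path as the periodic sync task
        count += 1
    return count
```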
>>>>
>>>>
>>>> Future plans:
>>>>
>>>> Something that has been considered, but is out of scope for now, is that
>>>> the parent/api cell doesn't need the same data model as the child cell.
>>>> Since the majority of what it does is act as a cache for API requests,
>>>> it does not need all the data that a cell needs and what data it does
>>>> need could be stored in a form that's optimized for reads.
>>>>
>>>>
>>>> Thoughts?
>>>>
>>>> _______________________________________________
>>>> OpenStack-dev mailing list
>>>> OpenStack-dev at lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



