[openstack-dev] [heat] [magnum] Subjects to discuss during the summit

Spyros Trigazis strigazi at gmail.com
Mon Oct 10 15:54:11 UTC 2016


Hi Sergey,

I have seen the session; I wanted to add more details to
start the discussion earlier and to be better prepared.

Thanks,
Spyros


On 10 October 2016 at 17:36, Sergey Kraynev <skraynev at mirantis.com> wrote:

> Hi Spyros,
>
> AFAIK we already have a special session slot related to your topic.
> So thank you for providing all the items here.
> Rabi, can we add a link to this mail to the etherpad? (it will save us
> time during the session :) )
>
> On 10 October 2016 at 18:11, Spyros Trigazis <strigazi at gmail.com> wrote:
>
>> Hi heat and magnum.
>>
>> Apart from the scalability issues that have been observed, I'd like to
>> add a few more subjects to discuss during the summit.
>>
>> 1. One nested stack per node and linear scaling of cluster creation
>> time.
>>
>> 1.1
>> For large stacks, the creation of all nested stacks scales linearly. We
>> haven't run any tests using the convergence-engine.
>>
>> 1.2
>> For large stacks (1000 nodes), the final call to heat to fetch the
>> IPs of all nodes takes 3 to 4 minutes. In heat, the stack has status
>> CREATE_COMPLETE, but magnum's state is only updated when this long
>> final call is done. Can we do better? Maybe fetch only the master IPs
>> or get the IPs in chunks.
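>>
>> As a rough illustration (not existing magnum code), fetching just one
>> named output could look like the following, assuming the deployed heat
>> and python-heatclient expose the stack outputs API, that the cluster
>> stack exports a 'kube_masters' output, and that a keystone session
>> already exists:
>>
>>   from heatclient import client as heat_client
>>
>>   # 'session' is an existing keystoneauth1 session (assumption).
>>   heat = heat_client.Client('1', session=session)
>>
>>   def get_master_ips(stack_id):
>>       # Resolves only this one output (GET /stacks/{id}/outputs/...),
>>       # instead of resolving the addresses of every node in the stack.
>>       out = heat.stacks.output_show(stack_id, 'kube_masters')
>>       return out['output']['output_value']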
>>
>> 1.3
>> After the stack create API call to heat, magnum's conductor
>> busy-waits on heat with one thread per cluster. (If the magnum
>> conductor restarts, we lose that thread and we can't update the status
>> in magnum.) We should investigate better ways to sync the status
>> between magnum and heat.
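>>
>> For example, a restart-safe alternative could be a single periodic
>> task in the conductor that reconciles all in-progress clusters against
>> heat, instead of one busy-waiting thread per cluster. This is only a
>> sketch; objects.Cluster, heat_client and the status values stand in
>> for the real magnum code:
>>
>>   from oslo_service import periodic_task
>>
>>   class ClusterStatusSync(periodic_task.PeriodicTasks):
>>
>>       @periodic_task.periodic_task(spacing=60, run_immediately=True)
>>       def sync_cluster_status(self, context):
>>           # No state lives in a per-cluster thread, so a conductor
>>           # restart does not lose track of in-progress clusters.
>>           clusters = objects.Cluster.list(
>>               context, filters={'status': 'CREATE_IN_PROGRESS'})
>>           for cluster in clusters:
>>               stack = heat_client.stacks.get(cluster.stack_id)
>>               if stack.stack_status != cluster.status:
>>                   cluster.status = stack.stack_status
>>                   cluster.save()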
>>
>> 2. Next generation magnum clusters
>>
>> A need that comes up frequently in magnum is heterogeneous clusters.
>> * We want to be able to create clusters on different hardware (e.g. spawn
>>   VMs on nodes with SSDs and nodes without SSDs, or with special
>>   hardware such as FPGAs or GPUs available only on some nodes of the
>>   cluster)
>> * Spawn clusters across different AZs
>>
>> I'll briefly describe our plan here; for further information we have a
>> detailed spec under review. [1]
>>
>> To address this issue we introduce the node-group concept in magnum.
>> Each node-group will correspond to a different heat stack. The master
>> nodes can be organized in one or more stacks, as can the worker nodes.
>>
>> We are investigating how to implement this feature and are considering
>> the following: at the moment, we have three template files (cluster,
>> master and node) and all three together create one stack. The new
>> generation of clusters will have a cluster stack containing the
>> resources of the cluster template, specifically networks, lbaas,
>> floating IPs etc. The outputs of this stack would then be passed as
>> input to create the master node stack(s) and the worker node
>> stack(s).
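>>
>> To make the idea concrete, the flow could look roughly like the
>> following python-heatclient calls. This is only an illustration, not
>> the proposed implementation: 'heat' is a heatclient instance, and the
>> stack names, output keys and node_groups structure are made up for the
>> example:
>>
>>   # 1. Create the cluster-level stack (networks, lbaas, floating IPs).
>>   infra = heat.stacks.create(stack_name='cluster-infra',
>>                              template=cluster_template,
>>                              parameters=cluster_params)
>>
>>   # 2. Once it is CREATE_COMPLETE, read its outputs.
>>   stack = heat.stacks.get(infra['stack']['id'])
>>   outputs = {o['output_key']: o['output_value']
>>              for o in stack.outputs}
>>
>>   # 3. Create one stack per node-group (masters, workers-ssd,
>>   #    workers-gpu, ...), feeding the outputs in as parameters.
>>   for group in node_groups:
>>       params = dict(group.params,
>>                     fixed_network=outputs['fixed_network'],
>>                     api_lb_id=outputs['api_lb_id'])
>>       heat.stacks.create(stack_name='cluster-%s' % group.name,
>>                          template=group.template,
>>                          parameters=params)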
>>
>> 3. Use of heat-agent
>>
>> A missing feature in magnum is lifecycle operations. For restarting
>> services and for COE upgrades (upgrading docker, kubernetes and mesos)
>> we are considering using the heat-agent. Another option is to create a
>> magnum agent or a daemon, like trove does.
>>
>> 3.1
>> For restart, a few systemctl restart or service restart commands will
>> be issued. [2]
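>>
>> As a minimal sketch, a script delivered through the heat-agent could
>> do something like the following; the exact unit names are examples and
>> depend on the COE and driver:
>>
>>   import subprocess
>>
>>   # Restart the container runtime and the COE services on this node.
>>   for svc in ('docker', 'kube-apiserver', 'kube-scheduler'):
>>       subprocess.check_call(['systemctl', 'restart', svc])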
>>
>> 3.2
>> For upgrades there are four scenarios:
>> 1. Upgrade a service which runs in a container. In this case, a small
>>    script that runs in each node is sufficient. No VM reboot required.
>> 2. For an ubuntu-based image or similar that requires a package upgrade,
>>    a similar small script is sufficient too. No VM reboot required.
>> 3. For our fedora atomic images, we need to perform a rebase on the
>>    rpm-ostree file system, which requires a reboot (a rough sketch
>>    follows after this list).
>> 4. Finally, a thought under investigation is replacing the nodes one
>>    by one using a different image, e.g. upgrading from fedora 24 to 25
>>    with new versions of all packages in a new qcow2 image. How could
>>    we update the stack for this?
>>
>> Scenarios 1 and 2 can be done by upgrading all worker nodes at once or
>> one by one. Scenarios 3 and 4 should be done one by one.
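>>
>> As a rough sketch of scenario 3, the per-node step could be something
>> like the following; the ostree ref is only an example and would come
>> from the cluster/image configuration:
>>
>>   import subprocess
>>
>>   # Example ref for fedora atomic 25 (assumption, not a fixed value).
>>   NEW_REF = 'fedora-atomic:fedora-atomic/25/x86_64/docker-host'
>>
>>   # Rebase the rpm-ostree file system, then reboot into the new tree.
>>   subprocess.check_call(['rpm-ostree', 'rebase', NEW_REF])
>>   subprocess.check_call(['systemctl', 'reboot'])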
>>
>> I'm drafting a spec about upgrades; it should be ready by Wednesday.
>>
>> Cheers,
>> Spyros
>>
>> [1] https://review.openstack.org/#/c/352734/
>> [2] https://review.openstack.org/#/c/368981/
>>
>>
>>
>
>
> --
> Regards,
> Sergey.
>
>
>