[openstack-dev] [heat] [magnum] Subjects to discuss during the summit

Spyros Trigazis strigazi at gmail.com
Mon Oct 10 15:11:28 UTC 2016


Hi heat and magnum.

Apart from the scalability issues that have been observed, I'd like to
add a few more subjects to discuss during the summit.

1. One nested stack per node and linear scale of cluster creation
time.

1.1
For large stacks, the creation time of all the nested stacks scales
linearly. We haven't run any tests using the convergence engine.

1.2
For large stacks (1000 nodes), the final call to heat to fetch the
IPs of all nodes takes 3 to 4 minutes. In heat, the stack has status
CREATE_COMPLETE, but magnum's state is only updated once this long
final call is done. Can we do better? Maybe fetch only the master IPs,
or fetch the IPs in chunks.
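The chunked approach could look like the sketch below. This is only an
illustration, not magnum code: fetch_node_ips stands in for a smaller
heat call that returns outputs for a subset of nodes, and the IP values
are fabricated.

```python
def chunks(seq, size):
    """Yield successive slices of `seq` with at most `size` items each."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def fetch_node_ips(node_ids):
    """Stand-in for a per-batch heat output lookup (assumed API)."""
    def ip_for(name):
        idx = int(name.split("-")[1])
        return "10.0.%d.%d" % (idx // 256, idx % 256)
    return {n: ip_for(n) for n in node_ids}

def collect_ips(node_ids, chunk_size=100):
    """Gather all node IPs with one smaller heat call per chunk, so
    magnum could update its state incrementally between calls."""
    ips = {}
    for batch in chunks(node_ids, chunk_size):
        ips.update(fetch_node_ips(batch))
    return ips

nodes = ["node-%d" % i for i in range(1000)]
ips = collect_ips(nodes)
```

Each batch completes quickly, so a failure or timeout only costs one
chunk rather than the whole 3-4 minute call.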

1.3
After the stack-create API call to heat, magnum's conductor busy-waits
on heat with one thread per cluster. (If the magnum conductor
restarts, we lose that thread and can't update the status in magnum.)
We should investigate better ways to sync the status between magnum
and heat.
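One direction would be to persist each observed status instead of
keeping it only in the polling thread, so a restarted conductor can
resume from the last stored state. The sketch below simulates this
with a dict standing in for the magnum database; get_stack_status is a
stand-in for the heat API call, not a real client method.

```python
STATUS_STORE = {}  # stand-in for the magnum database


def save_status(cluster_id, status):
    """Persist the latest observed stack status for a cluster."""
    STATUS_STORE[cluster_id] = status


def poll_stack(get_stack_status, cluster_id, attempts=10):
    """Poll heat until the stack reaches a terminal state, persisting
    every observation; safe to re-enter after a conductor restart."""
    for _ in range(attempts):
        status = get_stack_status(cluster_id)
        save_status(cluster_id, status)  # survives a conductor restart
        if status in ("CREATE_COMPLETE", "CREATE_FAILED"):
            return status
    return STATUS_STORE[cluster_id]


# Simulated heat responses: in progress twice, then complete.
responses = iter(["CREATE_IN_PROGRESS", "CREATE_IN_PROGRESS",
                  "CREATE_COMPLETE"])
final = poll_stack(lambda _cid: next(responses), "cluster-1")
```

Because the state lives in the store rather than the thread, any
conductor can pick up polling for any cluster after a restart.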

2. Next generation magnum clusters

A need that comes up frequently in magnum is heterogeneous clusters.
* We want to be able to create clusters on different hardware, e.g.
  spawn VMs on nodes with SSDs and nodes without SSDs, or use special
  hardware (FPGAs, GPUs) available only on some nodes of the cluster.
* We want to spawn clusters across different AZs.

I'll describe our plan briefly here; for further information we have a
detailed spec under review. [1]

To address this issue we introduce the node-group concept in magnum.
Each node-group will correspond to a different heat stack. The master
nodes can be organized in one or more stacks, as can the worker nodes.

We are investigating how to implement this feature, and are
considering the following: at the moment we have three template files
(cluster, master and node), and together they create one stack. The
next generation of clusters will have a cluster stack containing the
resources of the cluster template, specifically networks, LBaaS,
floating IPs, etc. The outputs of this stack would then be passed as
inputs to create the master node stack(s) and the worker node
stack(s).
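The composition described above could be sketched as follows. All
names here are illustrative stand-ins, not magnum's actual API: the
cluster stack's outputs become input parameters for each node-group
stack.

```python
def create_cluster_stack():
    """Stand-in for heat creating the shared infrastructure stack and
    returning its outputs (networks, load balancer, etc.)."""
    return {"network_id": "net-123", "lb_pool_id": "pool-456"}


def create_nodegroup_stack(name, count, cluster_outputs):
    """Stand-in for creating one master/worker node-group stack, wired
    to the cluster by passing the cluster outputs as parameters."""
    params = dict(cluster_outputs, node_count=count)
    return {"stack_name": name, "parameters": params}


cluster_out = create_cluster_stack()
masters = create_nodegroup_stack("masters", 3, cluster_out)
workers_ssd = create_nodegroup_stack("workers-ssd", 10, cluster_out)
workers_gpu = create_nodegroup_stack("workers-gpu", 4, cluster_out)
```

Since each node-group is its own stack with its own parameters, the
SSD and GPU groups can use different flavors or AZs while sharing the
same network and load balancer.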

3. Use of heat-agent

A missing feature in magnum is lifecycle operations. For restarting
services and for COE upgrades (upgrading docker, kubernetes and
mesos), we are considering using the heat-agent. Another option is to
create a magnum agent, or a daemon like trove's.

3.1
For restarts, a few systemctl restart or service restart commands will
be issued. [2]
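A rough sketch of what an agent-driven restart could look like. The
per-COE service lists and the run_cmd hook are assumptions for
illustration, not the commands magnum actually ships:

```python
# Illustrative per-COE restart commands (assumed, not magnum's real list).
RESTART_COMMANDS = {
    "kubernetes": ["systemctl restart kubelet",
                   "systemctl restart kube-proxy"],
    "swarm": ["systemctl restart docker"],
    "mesos": ["systemctl restart mesos-slave"],
}


def restart_services(coe, run_cmd):
    """Issue the restart commands for one COE; run_cmd is a stand-in
    for executing on the node, e.g. via the heat-agent."""
    ran = []
    for cmd in RESTART_COMMANDS[coe]:
        run_cmd(cmd)  # e.g. subprocess.check_call(cmd.split())
        ran.append(cmd)
    return ran


ran = restart_services("kubernetes", lambda cmd: None)  # dry run
```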

3.2
For upgrades there are three scenarios:
1. Upgrade a service which runs in a container. In this case, a small
   script that runs on each node is sufficient. No VM reboot required.
2. For an ubuntu-based image (or similar) that requires a package
   upgrade, a similar small script is sufficient too. No VM reboot
   required.
3. For our fedora atomic images, we need to perform a rebase on the
   rpm-ostree file system, which requires a reboot.
4. Finally, a thought under investigation is replacing the nodes one
   by one using a different image, e.g. upgrading from fedora 24 to 25
   with new versions of all packages in a new qcow2 image. How could
   we update the stack for this?

Scenarios 1. and 2. can be done by upgrading all worker nodes at once
or one by one. Scenarios 3. and 4. should be done one by one.
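The one-by-one constraint for scenarios 3. and 4. could look like the
sketch below; upgrade_node and node_ready are hypothetical stand-ins
for the real rebase/reboot (or node replacement) and health check:

```python
def rolling_upgrade(nodes, upgrade_node, node_ready):
    """Upgrade nodes strictly one at a time: rebase/replace a node,
    wait for it to come back, and only then touch the next one."""
    done = []
    for node in nodes:
        upgrade_node(node)  # e.g. rpm-ostree rebase + reboot, or replace VM
        if not node_ready(node):
            raise RuntimeError("node %s did not come back, aborting" % node)
        done.append(node)
    return done


# Dry run with no-op stand-ins for the upgrade and the health check.
upgraded = rolling_upgrade(["n0", "n1", "n2"],
                           upgrade_node=lambda n: None,
                           node_ready=lambda n: True)
```

Aborting on the first node that fails to rejoin keeps the blast radius
to a single node, which is the point of doing 3. and 4. one by one.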

I'm drafting a spec about upgrades; it should be ready by Wednesday.

Cheers,
Spyros

[1] https://review.openstack.org/#/c/352734/
[2] https://review.openstack.org/#/c/368981/
