[openstack-dev] [senlin] Mitaka summit meetup - a summary
tengqim at linux.vnet.ibm.com
Thu Nov 5 03:28:57 UTC 2015
Thanks for joining the senlin meetup last week at the Tokyo summit. We
know some of you were not able to make it for various reasons. Here is a
summary of the things we discussed during the meetup and the preliminary
conclusions we reached. Please feel free to reply to this email or find
the team on the #senlin channel if you have questions/suggestions.
- Senlin will focus more on two things during the Mitaka cycle: 1)
stability regarding API and engine; 2) Heat resource type support.
- Senlin engine won't do "convergence" as suggested by some people;
however, the engine should be responsible for managing the lifecycles
of the objects it creates on behalf of users.
- Team will revise the APIs according to the recent guidelines from
api-wg and make the first version released as stable as possible.
Before having a versioning scheme in place, we won't bump the API
versions in ad-hoc ways.
- Senlin will NOT introduce complicated monitoring mechanisms into the
engine, although we will strive to support cluster/node status checks.
We opt to rely on external monitoring services and leave the choice of
service to users.
- We will continue working with TOSCA team to polish policy definitions.
- We will document guidelines on how policy decisions are passed from
one policy to another.
- We are interested in building baremetal clusters, but we will keep it
in the pipeline unless there are: 1) real requests, and 2) resources to
get it done.
- As part of the API stabilization effort, we will generalize the
concept of 'webhook' into 'receiver'.
Long Version (TL;DR)
* Stability vs. Features
We had some feature requests like managing container clusters, doing
smart scheduling, running scripts on a cluster of servers, supporting
clusters of non-compute resources... etc. These are all good ideas.
However, Senlin is not aiming to become a service of everything. We have
to refrain from the temptation of too wide a scope. There are millions
of things we can do, but the first priority at this stage is about
stability. Making it usable and stable before adding fancy features was
the consensus we reached during the meetup. We will stick to that during
the Mitaka cycle.
* Heat Resource Type Support
The team had a discussion with the Heat team during a design summit
slot. The basic vision remained the same: let senlin do autoscaling and
deprecate Heat autoscaling when senlin is stable. There are quite a few
details to be figured out. The first thing we would do is to land senlin
cluster, node and profile resource types in Heat and build an
auto-scaling end-to-end solution comparable to the existing one. Then
the two teams will decide how to make the transition smooth for both
developers and users.
* Convergence or Not
There were suggestions to define 'desired' state and 'observed' state
for clusters and have senlin engine do the convergence. After some
closer examination of the use case, we decided not to do it. The
'desired' state of a node is obvious (i.e. ACTIVE). The 'desired' state
of a cluster is a little bit vague. It boils down to whether we would
allow 'partial success' when creating a cluster of 1,000 nodes. Failures
are unavoidable, thus something we have to live with. However, we are
very cautious about making decisions for users. Say we have 90% of the
nodes ACTIVE in a cluster: should we label the cluster as 'ERROR',
'WARNING', or just 'ACTIVE'? We tend to leave this decision to users,
who are smart people too. To avoid putting too much burden on users, we
will add some defaults that can be set by operators.
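To make the 'partial success' question concrete, here is a minimal
sketch of how a cluster status could be derived from node statuses. It
is not senlin code: the function name, the threshold keys and the
default values are all assumptions made up for illustration, standing in
for the operator-settable defaults mentioned above.

```python
# Hypothetical operator defaults (names and values are assumptions):
# fraction of ACTIVE nodes needed for each cluster status.
DEFAULT_THRESHOLDS = {"active": 1.0, "warning": 0.8}


def cluster_status(node_statuses, thresholds=None):
    """Derive a cluster status from the fraction of ACTIVE nodes.

    `thresholds` lets a user override the operator defaults.
    """
    limits = dict(DEFAULT_THRESHOLDS)
    limits.update(thresholds or {})
    if not node_statuses:
        # An empty cluster has no failed nodes; treating it as ACTIVE
        # is itself an assumption of this sketch.
        return "ACTIVE"
    active = sum(1 for status in node_statuses if status == "ACTIVE")
    ratio = active / len(node_statuses)
    if ratio >= limits["active"]:
        return "ACTIVE"
    if ratio >= limits["warning"]:
        return "WARNING"
    return "ERROR"
```

With 9 of 10 nodes ACTIVE, the defaults in this sketch yield 'WARNING',
while a user who accepts partial success could pass
thresholds={"active": 0.9} and get 'ACTIVE' for the same cluster.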
There are cases where senlin engine creates objects when enforcing a
policy, e.g. the load-balancing policy. The engine should do a good job
managing the status of those objects.
* API Design
Senlin already has an API design which is documented. Before doing a
version 1.0 release, we need to further hammer on it. Most of these
revisions would be related to guidelines from api-wg. For example, the
following changes are expected to land during Mitaka:
- return 202 instead of 200 for asynchronous operations
- better align with the proposed change to 'action' APIs
- sorting keys and directions
- returning 400 or 404 for resources not found
- add location headers where appropriate
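As an illustration of the first and last items in the list above, the
following sketch builds the response for an asynchronous request as a
202 with a Location header pointing at something the client can poll.
The function name, the "/v1/actions/<id>" path and the response body are
assumptions for this example only, not the actual senlin API.

```python
def accepted_response(action_id, base_url="/v1"):
    """Build a (status, headers, body) triple for an async request.

    202 tells the client the request was accepted but not completed;
    the Location header (path layout is an assumption) says where to
    check on progress.
    """
    location = "{}/actions/{}".format(base_url, action_id)
    headers = {"Location": location}
    body = {"action": action_id}
    return 202, headers, body
```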
Another change to the current API will be about webhooks. We got
suggestions about receiving notifications from channels other than
webhooks, e.g. message queues or external monitoring services. To avoid
disruptive changes to the APIs in the future, we decided to generalize
the webhook APIs to 'receivers'. This is important work even if webhook
is the only type of receiver we support. We don't want to see webhook
APIs provided and soon replaced/deprecated.
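One way to picture the generalization is a common base type with webhook
as its first concrete subtype. This is only a sketch of the idea: the
class names, attributes and methods below are hypothetical and do not
describe the receiver API senlin will actually expose.

```python
import abc


class Receiver(abc.ABC):
    """Hypothetical base type: maps an external signal to an action."""

    def __init__(self, cluster_id, action):
        self.cluster_id = cluster_id
        self.action = action

    @abc.abstractmethod
    def channel(self):
        """Describe where this receiver listens for signals."""

    def trigger(self, params=None):
        # Whatever the channel, a signal ends up as the same request:
        # run `action` on `cluster_id` with optional inputs.
        return {
            "cluster": self.cluster_id,
            "action": self.action,
            "inputs": dict(params or {}),
        }


class WebhookReceiver(Receiver):
    """The only receiver type in this sketch; others could follow."""

    def __init__(self, cluster_id, action, url):
        super().__init__(cluster_id, action)
        self.url = url

    def channel(self):
        return {"type": "webhook", "url": self.url}
```

A message-queue receiver could later be added as another subclass
without changing how callers create or trigger receivers.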
* Relying on External Monitoring
There used to be some interest in polling the status of cluster nodes
so that the engine would know whether nodes are healthy or not. This
idea was rejected during the meetup for several reasons: too much
overhead on the backend services; still no guarantee of getting the
latest status of resources; scalability concerns; etc. We have decided
to rely on other monitoring/alarming services to provide status updates
for health reporting.
When users send a GET request for a node resource, we will allow them to
get the latest resource status. By default, we may just return the
'cached' status in our own database.
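The cached-by-default behaviour could look roughly like the sketch
below. The function name, the `refresh` flag and the `fetch_live`
callback are assumptions for illustration; how a user would actually
ask for the latest status is not decided here.

```python
def get_node_status(node_id, cache, fetch_live=None, refresh=False):
    """Return a node's status, served from the cache by default.

    `cache` stands in for our own database. `fetch_live` is a
    hypothetical callable that asks the backend service for the
    current status; it is only called when `refresh` is requested.
    """
    if refresh and fetch_live is not None:
        cache[node_id] = fetch_live(node_id)
    return cache[node_id]
```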
* Collaboration with TOSCA
The team has been engaged in the development of TOSCA policy definitions
for a few months. We will continue this collaboration to make sure
senlin's policy definition is well aligned with the standard so that a
translator can easily translate a TOSCA policy definition into the
senlin version. We will also feed suggestions back to the standard team.
* More Documentation
Senlin already has some documentation for users and developers. The APIs
are documented using WADL as well. Going forward, we will need to
provide more for developers on policy development. For example, there
will be a chain of policies that will be checked in sequence when an
action is performed. We need a more explicit protocol for policies to
exchange data. A policy has to document the inputs it can consume and
the outputs it will generate.
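To show what such a protocol might look like, here is a small sketch in
which each policy declares the keys it consumes and produces, and the
chain is checked in sequence. The class names, the `consumes` and
`produces` attributes and the sample policies are all hypothetical; this
is not the existing senlin policy interface.

```python
class Policy:
    """Hypothetical base: a policy documents its inputs and outputs."""
    consumes = ()
    produces = ()

    def check(self, data):
        raise NotImplementedError


class ScalingPolicy(Policy):
    produces = ("count",)

    def check(self, data):
        return {"count": 2}


class PlacementPolicy(Policy):
    consumes = ("count",)
    produces = ("zones",)

    def check(self, data):
        # Pick a zone for each node the scaling policy asked for.
        return {"zones": ["az1"] * data["count"]}


def run_chain(policies, data=None):
    """Check policies in sequence, passing decisions along the chain."""
    data = dict(data or {})
    for policy in policies:
        name = type(policy).__name__
        missing = [key for key in policy.consumes if key not in data]
        if missing:
            raise ValueError("%s is missing inputs: %s" % (name, missing))
        outputs = policy.check(data)
        undeclared = sorted(set(outputs) - set(policy.produces))
        if undeclared:
            raise ValueError("%s has undeclared outputs: %s"
                             % (name, undeclared))
        data.update(outputs)
    return data
```

Running the placement policy before the scaling policy raises an error
in this sketch, which is the kind of ordering problem an explicit
protocol would surface early.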
We need to keep an eye on the recent proposal to rewrite API docs in a
different format. That will hopefully get done during Mitaka cycle as
Guys, please fill in things I missed and bomb us with questions or