[Openstack] Fwd: [trove] - Discussion on Clustering and Replication API

Daniel Salinas imsplitbit at gmail.com
Tue Aug 27 13:26:20 UTC 2013


Forwarding to the ML. Sorry, somehow this got sent directly to Auston.

---------- Forwarded message ----------
From: Daniel Salinas <imsplitbit at gmail.com>
Date: Fri, Aug 23, 2013 at 3:18 PM
Subject: Re: [Openstack] [trove] - Discussion on Clustering and Replication
API
To: "McReynolds, Auston" <amcreynolds at ebay.com>


Auston,

The wiki page you're looking through was one approach we discussed but
ultimately decided against.  The current proposal is here:

https://wiki.openstack.org/wiki/Trove-Replication-And-Clustering-API-Using-Instances

It is a bit of a misnomer to call it "using instances" since it uses both
/instance and /cluster, but the consensus during design discussions was
that we were going to have a clustering api which provides a control
structure for clustering/replication, not instances. For example, adding
an instance to a cluster would be a PUT on /cluster/{id}, not
/cluster/{id}/nodes.
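
As a rough sketch, a request along these lines would add an instance to a
cluster (the body shape below is purely illustrative; the exact payload
hasn't been settled):

PUT /cluster/{id}
{
    "add":{
        "instanceId":"{instance-id-to-add}"
    }
}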

We also discussed removing actions altogether, so there won't be a promote
action but rather an explicit path for promoting an instance within a
cluster. That path *should* be /cluster/{id}/promote, with a body
containing the id of the instance being promoted.
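
For illustration (only the path has been agreed on; the field name here is
hypothetical):

POST /cluster/{id}/promote
{
    "instanceId":"{instance-id-to-promote}"
}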

If you get a chance to read through the above link and still have
questions, feel free to let me know.

I will try to spend some time looking through your discussion points and
get back to you asap.


On Wed, Aug 21, 2013 at 7:46 PM, McReynolds, Auston <amcreynolds at ebay.com> wrote:

> Blueprint:
>
> https://wiki.openstack.org/wiki/Trove-Replication-And-Clustering-API
>
> Questions:
>
> * Today, /instance/{instance_id}/action is the single endpoint for all
> actions on an instance (where the action is parsed from the payload).
> I see in the newly proposed /clusters api that there's
> /clusters/{cluster_id}/restart, etc. Is this a purposeful move from
> "field of a resource" to sub-resources? If so, is there a plan to
> retrofit the /instance api?
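>
> For example, today's action style vs. the proposed sub-resource style
> (restart shown; the first form is today's /instance api):
>
> POST /instance/{instance_id}/action
> {
>   "restart":{}
> }
>
> POST /clusters/{cluster_id}/restart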
>
> * For "Promote a Slave Node to Master", where is the request
> indicating the promote action (explicitly or implicitly)? I don't see
> it in the uri or the payload.
>
> * "Create Replication Set" is a POST to /clusters, but "Add Node" is a
> PUT to /clusters/{cluster_id}/nodes. This seems inconsistent given
> both are essentially doing the same thing: adding nodes to a cluster.
> What's the reasoning behind the divergence?
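>
> Side by side (bodies elided):
>
> POST /clusters                       (create replication set)
> PUT  /clusters/{cluster_id}/nodes    (add node)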
>
> * What is the expected result of a resize action request on
> /instance/{instance_id} for an instance that's a part of a cluster
> (meaning the request could have alternatively been executed against
> /cluster/{cluster_id}/nodes/{node_id})? Will it return an error?
> Redirect the request to the /clusters internals?
>
> Discussion:
>
> Although it's common and often advised that the same flavor be used
> for every node in a cluster, there are many situations in which you'd
> purposefully buck the tradition. One example would be choosing a
> beefier flavor for a slave to support ad-hoc queries from a tertiary
> web application (analytics, monitoring, etc.).
>
> Therefore,
>
> {
>   "cluster":{
>     "nodes":3,
>     "flavorRef":"https://service/v1.0/1234/flavors/1",
>     "name":"replication_set_1",
>     "volume":{
>       "size":2
>     },
>     "clusterConfig":{
>       "type":"https://service/v1.0/1234/clustertypes/1234"
>     }
>   }
> }
>
> is not quite expressive enough. One "out" is that you could force the
> user to resize the slave(s) after the cluster has been completely
> provisioned, but that seems a bit egregious.
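>
> With today's api, that "out" means one call per slave, e.g.:
>
> POST /instance/{instance_id}/action
> {
>   "resize":{
>     "flavorRef":"https://service/v1.0/1234/flavors/3"
>   }
> }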
>
> Something like the following seems to fit the bill:
>
> {
>   "cluster":{
>     "clusterConfig":{
>       "type":"https://service/v1.0/1234/clustertypes/1234"
>     },
>     "nodes":[
>     {
>       "flavorRef":"https://service/v1.0/1234/flavors/1",
>       "volume":{
>         "size":2
>       }
>     },
>     {
>       "flavorRef":"https://service/v1.0/1234/flavors/3",
>       "volume":{
>         "size":2
>       }
>     }]
>   }
> }
>
> but which node is arbitrarily elected the master if the clusterConfig
> is set to MySQL Master/Slave? When region awareness is supported in
> Trove, how would you pin a specifically configured node to its
> earmarked region/datacenter? What will the names of the nodes of the
> cluster be?
>
> {
>   "cluster":{
>     "clusterConfig":{
>       "type":"https://service/v1.0/1234/clustertypes/1234"
>     },
>     "nodes":[
>     {
>       "name":"usecase-master",
>       "flavorRef":"https://service/v1.0/1234/flavors/1",
>       "volume":{
>         "size":2
>       },
>       "region": "us-west",
>       "nodeConfig": {
>         "type": "master"
>       }
>     },
>     {
>       "name":"usecase-slave-us-east"
>       "flavorRef":"https://service/v1.0/1234/flavors/3",
>       "volume":{
>         "size":2
>       },
>       "region": "us-east",
>       "nodeConfig": {
>         "type": "slave"
>       }
>     },
>     {
>       "name":"usecase-slave-eu-de"
>       "flavorRef":"https://service/v1.0/1234/flavors/3",
>       "volume":{
>         "size":2
>       },
>       "region": "eu-de",
>       "nodeConfig": {
>         "type": "slave"
>       }
>     }]
>   }
> }
>
> This works decently enough, but it assumes a simple master/slave
> architecture. What about MySQL multi-master with replication?
> See /doc/refman/5.5/en/mysql-cluster-replication-multi-master.html.
> Now, a 'slaveof' or 'primary'/'parent' field is necessary to be more
> specific (either that, or nesting of JSON to indicate relationships).
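>
> As a sketch (field names hypothetical), a circular multi-master pair
> would need something like:
>
> "nodes":[
> {
>   "name":"usecase-master-1",
>   "nodeConfig":{
>     "type":"master",
>     "slaveof":"usecase-master-2"
>   }
> },
> {
>   "name":"usecase-master-2",
>   "nodeConfig":{
>     "type":"master",
>     "slaveof":"usecase-master-1"
>   }
> }]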
>
> From above, it's clear that a "nodeConfig" of sorts is needed to
> indicate whether the node is a slave or master, and to whom. Thus far,
> an RDBMS has been assumed, but consider other offerings in the space:
> How will you designate if the node is a seed in the case of Cassandra?
> The endpoint snitch for a Cassandra node? The cluster name for
> Cassandra or the replica-set for Mongo? Whether a slave should be
> daisy-chained to another slave or attached directly to the master in
> the case of Redis?
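>
> For example (every field below is hypothetical), "nodeConfig" quickly
> becomes a grab-bag of per-datastore settings:
>
> "nodeConfig":{
>   "type":"seed",
>   "snitch":"GossipingPropertyFileSnitch",
>   "replicaSet":"rs0",
>   "slaveof":"usecase-node-1"
> }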
>
> Preventing service type specifics from bleeding into what should be as
> generic a schema as possible is paramount. Unfortunately, as you can
> see, "nodeConfig" starts to become an amalgamation of fields that are
> only applicable in certain situations, making documentation, codegen
> for clients, and ease of use a bit challenging. Fast-forward to when
> editable parameter groups become a priority (a.k.a. being able to set
> name-value-pairs in the service type's CONF). If users/customers
> demand the ability to set things like buffer-pool-size while
> provisioning, these fields would likely be placed in "nodeConfig",
> making the situation worse.
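>
> e.g. (structure hypothetical):
>
> "nodeConfig":{
>   "type":"slave",
>   "parameters":{
>     "innodb_buffer_pool_size":"2G"
>   }
> }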
>
> Here's an attempt with a slightly different approach:
> https://gist.github.com/amcr/96c59a333b72ec973c3a
>
> From there, you could build a convenience /cluster api to facilitate
> multi-node deployments (vs. building and associating node by node), or
> wait for Heat integration.
>
> Both approaches have their strengths, so I'm convinced it's the
> blending of the two that will result in what we're all looking for.
>
> Thoughts?
>
> Cheers,
> amc