<div dir="ltr">Forwarding to ML. Sorry somehow this got sent directly to Auston.<br><div><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Daniel Salinas</b> <span dir="ltr"><<a href="mailto:imsplitbit@gmail.com">imsplitbit@gmail.com</a>></span><br>
Date: Fri, Aug 23, 2013 at 3:18 PM<br>Subject: Re: [Openstack] [trove] - Discussion on Clustering and Replication API<br>To: "McReynolds, Auston" <<a href="mailto:amcreynolds@ebay.com">amcreynolds@ebay.com</a>><br>
<br><br><div dir="ltr"><div>Auston,<br><br>The wiki page you're looking through was one approach we
discussed but ultimately decided against. The current proposal is here:<br><br><a href="https://wiki.openstack.org/wiki/Trove-Replication-And-Clustering-API-Using-Instances" target="_blank">https://wiki.openstack.org/wiki/Trove-Replication-And-Clustering-API-Using-Instances</a><br>
<br></div><div>It is a bit of a misnomer to call it "using instances" since it uses both /instance and /cluster but the consensus during design discussions was that we were going to have a clustering api which provides a control structure for clustering/replication, not instances. For example, adding an instance to a cluster would be a PUT on /cluster/{id}, not /cluster/{id}/nodes.<br>
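
For illustration only (the payload shape and field names below are
hypothetical and still up for discussion):

    PUT /cluster/{id}
    {
        "add":{
            "instanceRef":"https://service/v1.0/1234/instances/5678"
        }
    }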

Also, we discussed removing actions altogether, so we won't have a promote
action but rather an explicit path for promoting an instance within a
cluster. That path *should* be /cluster/{id}/promote, with a body
containing the id of the instance being promoted.
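
Again, purely a sketch (the verb and body format are not finalized, and
the instance id is made up):

    POST /cluster/{id}/promote
    {
        "promote":{
            "instanceId":"5678"
        }
    }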

If you get a chance to read through the above link and still have
questions, feel free to let me know.

I will try to spend some time looking through your discussion points and
get back to you asap.


On Wed, Aug 21, 2013 at 7:46 PM, McReynolds, Auston
<amcreynolds@ebay.com> wrote:

Blueprint:

https://wiki.openstack.org/wiki/Trove-Replication-And-Clustering-API

Questions:

* Today, /instance/{instance_id}/action is the single endpoint for all
actions on an instance (where the action is parsed from the payload).
I see in the newly proposed /clusters api that there's
/clusters/{cluster_id}/restart, etc. Is this a purposeful move from
"field of a resource" to sub-resources? If so, is there a plan to
retrofit the /instance api? (A sketch contrasting the two styles
follows these questions.)

* For "Promote a Slave Node to Master", where is the request
indicating the promote action (explicitly or implicitly)? I don't see
it in the uri or the payload.

* "Create Replication Set" is a POST to /clusters, but "Add Node" is a
PUT to /clusters/{cluster_id}/nodes. This seems inconsistent given
both are essentially doing the same thing: adding nodes to a cluster.
What's the reasoning behind the divergence?

* What is the expected result of a resize action request on
/instance/{instance_id} for an instance that's a part of a cluster
(meaning the request could have alternatively been executed against
/cluster/{cluster_id}/nodes/{node_id})? Will it return an error?
Redirect the request to the /clusters internals?
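
To make the first question concrete, here is the same restart expressed
in both styles (payloads are illustrative only, not from the blueprint):

    # today: the action is a field of the payload
    POST /instance/{instance_id}/action
    {
        "restart":{}
    }

    # proposed: the action is a sub-resource
    POST /clusters/{cluster_id}/restart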

Discussion:

Although it's common and often advised that the same flavor be used
for every node in a cluster, there are many situations in which you'd
purposefully buck tradition. One example would be choosing a beefier
flavor for a slave to support ad-hoc queries from a tertiary web
application (analytics, monitoring, etc.).

Therefore,

{
    "cluster":{
        "nodes":3,
        "flavorRef":"https://service/v1.0/1234/flavors/1",
        "name":"replication_set_1",
        "volume":{
            "size":2
        },
        "clusterConfig":{
            "type":"https://service/v1.0/1234/clustertypes/1234"
        }
    }
}

is not quite expressive enough. One "out" is that you could force the
user to resize the slave(s) after the cluster has been completely
provisioned, but that seems a bit egregious.
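
For reference, that workaround amounts to a follow-up call per slave once
the cluster is active, along the lines of today's resize action (ids and
flavor here are illustrative):

    POST /instance/{instance_id}/action
    {
        "resize":{
            "flavorRef":"https://service/v1.0/1234/flavors/3"
        }
    }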

Something like the following seems to fit the bill:

{
    "cluster":{
        "clusterConfig":{
            "type":"https://service/v1.0/1234/clustertypes/1234"
        },
        "nodes":[
            {
                "flavorRef":"https://service/v1.0/1234/flavors/1",
                "volume":{
                    "size":2
                }
            },
            {
                "flavorRef":"https://service/v1.0/1234/flavors/3",
                "volume":{
                    "size":2
                }
            }
        ]
    }
}

but which node is arbitrarily elected the master if the clusterConfig
is set to MySQL Master/Slave? When region awareness is supported in
Trove, how would you pin a specifically configured node to its
earmarked region/datacenter? What will the names of the nodes of the
cluster be?

{
    "cluster":{
        "clusterConfig":{
            "type":"https://service/v1.0/1234/clustertypes/1234"
        },
        "nodes":[
            {
                "name":"usecase-master",
                "flavorRef":"https://service/v1.0/1234/flavors/1",
                "volume":{
                    "size":2
                },
                "region":"us-west",
                "nodeConfig":{
                    "type":"master"
                }
            },
            {
                "name":"usecase-slave-us-east",
                "flavorRef":"https://service/v1.0/1234/flavors/3",
                "volume":{
                    "size":2
                },
                "region":"us-east",
                "nodeConfig":{
                    "type":"slave"
                }
            },
            {
                "name":"usecase-slave-eu-de",
                "flavorRef":"https://service/v1.0/1234/flavors/3",
                "volume":{
                    "size":2
                },
                "region":"eu-de",
                "nodeConfig":{
                    "type":"slave"
                }
            }
        ]
    }
}

This works decently enough, but it assumes a simple master/slave
architecture. What about MySQL multi-master with replication?
See /doc/refman/5.5/en/mysql-cluster-replication-multi-master.html.
Now, a 'slaveof' or 'primary'/'parent' field is necessary to be more
specific (either that, or nesting of JSON to indicate relationships).
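
For example, circular multi-master might be expressed with a 'slaveof'
field, where each master replicates from the other (all field names here
are illustrative):

    "nodes":[
        {
            "name":"master-a",
            "nodeConfig":{
                "type":"master",
                "slaveof":"master-b"
            }
        },
        {
            "name":"master-b",
            "nodeConfig":{
                "type":"master",
                "slaveof":"master-a"
            }
        }
    ]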

From the above, it's clear that a "nodeConfig" of sorts is needed to
indicate whether the node is a slave or a master, and to whom. Thus far,
an RDBMS has been assumed, but consider other offerings in the space:
How will you designate whether a node is a seed in the case of Cassandra?
The endpoint snitch for a Cassandra node? The cluster name for
Cassandra or the replica-set for Mongo? Whether a slave should be
daisy-chained to another slave or attached directly to the master in
the case of Redis?

Preventing service-type specifics from bleeding into what should be as
generic a schema as possible is paramount. Unfortunately, "nodeConfig",
as you can see, starts to become an amalgamation of fields that are only
applicable in certain situations, making documentation, codegen for
clients, and ease of use a bit challenging. Fast-forward to when editable
parameter groups become a priority (a.k.a. being able to set
name-value pairs in the service type's CONF). If users/customers
demand the ability to set things like buffer-pool-size while
provisioning, those fields would likely be placed in "nodeConfig",
making the situation worse.
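
At that point, "nodeConfig" could easily devolve into something like the
following, where every field is hypothetical and most apply to only one
service type:

    "nodeConfig":{
        "type":"slave",
        "slaveof":"usecase-master",
        "seed":false,
        "endpointSnitch":"RackInferringSnitch",
        "replicaSet":"rs0",
        "buffer-pool-size":"2G"
    }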

Here's an attempt with a slightly different approach:
https://gist.github.com/amcr/96c59a333b72ec973c3a

From there, you could build a convenience /cluster api to facilitate
multi-node deployments (vs. building and associating node by node), or
wait for Heat integration.

Both approaches have their strengths, so I'm convinced it's the
blending of the two that will result in what we're all looking for.

Thoughts?

Cheers,
amc

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack