<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Nov 17, 2014 at 1:06 PM, Joshua Harlow <span dir="ltr">&lt;<a href="mailto:harlowja@outlook.com" target="_blank">harlowja@outlook.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">Hi guys,<br>

<br>

A question recently came up about how we can test tooz better against redis. I think this question is also relevant for ceilometer (and other users of redis), and more generally for the whole of openstack, since the larger system is what people actually run (I hope not everyone just runs devstack on a single node and stops there, ha).<br></blockquote><div><br></div><div><a href="https://review.openstack.org/#/c/106043/23">https://review.openstack.org/#/c/106043/23</a></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<br>

The basic issue is that redis (or zookeeper) can be (and typically is) set up as a multi-node deployment (for example redis + sentinel, zookeeper in a multi-node configuration, or the newly released redis clustering...). Our testing infrastructure, though, seems to be set up to run only the basics of tests (which isn't bad, but does have its limits). This got me thinking about what would be needed to actually test these multi-node configurations (redis in sentinel mode, or redis in clustering mode) in a realistic manner that exercises 'common' failure patterns (net splits, for example).<br>

<br>

I guess we can split it up into 3 or 4 (or more) questions.<br>

<br>

1. How do we get a multi-node configuration (of say redis) set up in the first place, configured so that all nodes are running and sentinel (for example) is running as expected?<br>

2. How do we then inject failures into this setup to ensure that the applications and clients built on top of those systems reliably handle these types of injected failures (something like <a href="https://github.com/aphyr/jepsen">https://github.com/aphyr/jepsen</a> or similar?).<br>

3. How do we analyze those results (for when #2 doesn't turn out to work as expected) in a meaningful manner, so that we can then turn those experiments into more reliable software?<br>

<br>

Anyone else have any interesting ideas for this?<br>

<br>

-Josh<br>

<br>

_______________________________________________<br>

OpenStack-dev mailing list<br>

OpenStack-dev@lists.openstack.org<br>

<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

</blockquote></div><br></div></div>