<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 26, 2015 at 8:18 AM, Matt Riedemann <span dir="ltr"><<a href="mailto:mriedem@linux.vnet.ibm.com" target="_blank">mriedem@linux.vnet.ibm.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span><br>
<br>
On 8/26/2015 3:21 AM, Timofei Durakov wrote:<br>
</span><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span>
Hello,<br>
<br>
Here is the situation: nova has a live-migration feature but doesn't have<br>
a CI job to cover it with functional tests, only<br>
gate-tempest-dsvm-multinode-full (non-voting, btw), which covers<br>
block-migration only.<br>
The problem here is that live-migration can behave differently depending<br>
on how the instance was booted (volume-backed/ephemeral), how the environment is<br>
configured (whether a shared instance directory (NFS, for example) or RBD is used<br>
to store the ephemeral disk), or whether the user has neither and is<br>
going to use the --block-migrate flag. To claim that we have reliable<br>
live-migration in nova, we should check it at least on envs with RBD or<br>
NFS, as these are more popular than envs without shared storage at all.<br>
Here are the steps for that:<br>
<br></span>
 1. make  gate-tempest-dsvm-multinode-full voting, as it looks OK for<br>
    block-migration testing purposes;<br></blockquote></blockquote><div><br></div><div>When we are ready to make multinode voting we should remove the equivalent single node job.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
</blockquote>
<br>
If it's been stable for a while then I'd be OK with making it voting on nova changes. I agree it's important to have at least *something* that gates on multi-node testing for nova since we seem to break this a few times per release.<br></blockquote><div><br></div><div>Last I checked it isn't as stable as single node yet: <a href="http://jogo.github.io/gate/multinode" target="_blank">http://jogo.github.io/gate/multinode</a> [0].  The data going into graphite is a bit noisy so this may be a red herring, but at the very least it needs to be investigated. When I was last looking into this there were at least two known bugs:</div><div><br></div><div><a href="https://bugs.launchpad.net/nova/+bug/1445569">https://bugs.launchpad.net/nova/+bug/1445569 <br></a></div><div><a href="https://bugs.launchpad.net/nova/+bug/1462305">https://bugs.launchpad.net/nova/+bug/1462305<br></a></div><div><br></div><div><br></div><div>[0] <a href="http://graphite.openstack.org/graph/?from=-36hours&height=500&until=now&width=800&bgcolor=ffffff&fgcolor=000000&yMax=100&yMin=0&target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.check.job.gate-tempest-dsvm-full.FAILURE,sum(stats.zuul.pipeline.check.job.gate-tempest-dsvm-full.%7BSUCCESS,FAILURE%7D)),%275hours%27),%20%27gate-tempest-dsvm-full%27),%27orange%27)&target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.check.job.gate-tempest-dsvm-multinode-full.FAILURE,sum(stats.zuul.pipeline.check.job.gate-tempest-dsvm-multinode-full.%7BSUCCESS,FAILURE%7D)),%275hours%27),%20%27gate-tempest-dsvm-multinode-full%27),%27brown%27)&title=Check%20Failure%20Rates%20(36%20hours)&_t=0.48646087432280183" 
target="_blank">http://graphite.openstack.org/graph/?from=-36hours&height=500&until=now&width=800&bgcolor=ffffff&fgcolor=000000&yMax=100&yMin=0&target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.check.job.gate-tempest-dsvm-full.FAILURE,sum(stats.zuul.pipeline.check.job.gate-tempest-dsvm-full.{SUCCESS,FAILURE})),%275hours%27),%20%27gate-tempest-dsvm-full%27),%27orange%27)&target=color(alias(movingAverage(asPercent(stats.zuul.pipeline.check.job.gate-tempest-dsvm-multinode-full.FAILURE,sum(stats.zuul.pipeline.check.job.gate-tempest-dsvm-multinode-full.{SUCCESS,FAILURE})),%275hours%27),%20%27gate-tempest-dsvm-multinode-full%27),%27brown%27)&title=Check%20Failure%20Rates%20(36%20hours)&_t=0.48646087432280183</a></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
 2. contribute to tempest to cover volume-backed instances live-migration;<br>
</blockquote>
<br>
jogo has had a patch up for this for a while:<br>
<br>
<a href="https://review.openstack.org/#/c/165233/" rel="noreferrer" target="_blank">https://review.openstack.org/#/c/165233/</a><br>
<br>
Since he's not full time on OpenStack anymore, I assume some help there in picking up the change would be appreciated.<br></blockquote><div><br></div><div>yes please</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
 3. make another job with rbd for storing ephemerals, it also requires<br>
    changing tempest config;<br>
</blockquote>
<br>
We already have a voting ceph job for nova - can we turn that into a multi-node testing job and run live migration with shared storage using that? </blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
 4. make job with nfs for ephemerals.<br>
</blockquote>
<br>
Can't we use a multi-node ceph job (#3) for this?<br>
<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><span>
<br>
These steps should help us to improve the current situation with<br>
live-migration.<br>
<br>
--<br>
Timofey.<br>
<br>
<br>
<br></span>
__________________________________________________________________________<br>
OpenStack Development Mailing List (not for usage questions)<br>
Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" rel="noreferrer" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br><span><font color="#888888">
</font></span></blockquote><span><font color="#888888">
<br>
-- <br>
<br>
Thanks,<br>
<br>
Matt Riedemann<br>
<br>
<br>
</font></span></blockquote></div><br></div></div>