[openstack-dev] [nova] Migration progress

Zhenyu Zheng zhengzhenyulixi at gmail.com
Fri Feb 5 02:52:06 UTC 2016


I think we can add a config option for this and give it a theoretically
sound default value. We can also add help text that tells the user how an
inappropriate value for this option will affect performance.
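
To make the idea concrete, here is a rough sketch in plain Python. It is not
the code under review and it does not use Nova's config machinery: the names
ProgressReporter, DEFAULT_INTERVAL and the save callback are hypothetical,
and the 5 second default is only the value suggested in the quoted discussion
below. It shows a configurable interval combined with skipping writes whose
rounded value has not changed.

```python
# Hypothetical default, taken from the 5 second value suggested in the thread.
DEFAULT_INTERVAL = 5  # seconds


class ProgressReporter:
    """Throttle progress writes (illustrative sketch, not Nova code).

    A value is passed to `save` only if at least `interval` seconds have
    passed since the last write AND the rounded value has changed.
    """

    def __init__(self, save, interval=DEFAULT_INTERVAL):
        if interval < 1:
            # A bad value here would flood RPC and the DB, so reject it early.
            raise ValueError("interval must be at least 1 second")
        self._save = save            # stand-in for the RPC/DB write
        self._interval = interval
        self._last_time = None       # time of the last successful write
        self._last_value = None      # last value actually written

    def report(self, now, percent):
        """Return True if the value was written, False if it was skipped."""
        value = round(percent)
        due = (self._last_time is None
               or now - self._last_time >= self._interval)
        if not due or value == self._last_value:
            return False
        self._save(value)
        self._last_time = now
        self._last_value = value
        return True
```

With 10 parallel migrations, a 2 second interval means up to 5 writes per
second in total, while a 5 second interval means up to 2 per second, before
any unchanged values are skipped.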



On Wed, Feb 3, 2016 at 7:45 PM, Daniel P. Berrange <berrange at redhat.com>
wrote:

> On Wed, Feb 03, 2016 at 11:27:16AM +0000, Paul Carlton wrote:
> > On 03/02/16 10:49, Daniel P. Berrange wrote:
> > >On Wed, Feb 03, 2016 at 10:44:36AM +0000, Daniel P. Berrange wrote:
> > >>On Wed, Feb 03, 2016 at 10:37:24AM +0000, Koniszewski, Pawel wrote:
> > >>>Hello everyone,
> > >>>
> > >>>At yesterday's live migration meeting we had concerns that the interval
> > >>>for writing migration progress to the database is too short.
> > >>>
> > >>>Information about migration progress will be stored in the database and
> > >>>exposed through the API (/servers/<uuid>/migrations/<id>). In the current
> > >>>proposal [1] migration progress will be updated every 2 seconds. That
> > >>>basically means that every 2 seconds an RPC call will go from compute
> > >>>to conductor to write migration data to the database. In the case of
> > >>>parallel live migrations each migration will report progress by itself.
> > >>>
> > >>>Isn't a 2 second interval too short for updates if the information is
> > >>>exposed through the API and it requires an RPC and DB call to actually
> > >>>save it in the DB?
> > >>>
> > >>>Our default configuration allows only 1 concurrent live migration [2],
> > >>>but it might vary between different deployments and use cases as it is
> > >>>configurable. Someone might want to trigger 10 (or even more) parallel
> > >>>live migrations and each might take even a day to finish in the case of
> > >>>block migration. Also, if the deployment is big enough, rabbitmq might
> > >>>be fully loaded. I'm not sure whether updating each migration every
> > >>>2 seconds makes sense in this case. On the other hand, it might be hard
> > >>>to notice quickly enough that a migration is stuck if we increase this
> > >>>interval...
> > >>Do we have any actual data showing that this is a real problem? I have a
> > >>pretty hard time believing that a database update of a single field every
> > >>2 seconds is going to be what pushes Nova over the edge into a performance
> > >>collapse, even if there are 20 migrations running in parallel, when you
> > >>compare it to the amount of DB queries & updates done across other areas
> > >>of the code for pretty much every single API call and background job.
> > >Also note that progress is rounded to the nearest integer. So even if the
> > >migration runs all day, there is a maximum of 100 possible changes in
> > >value for the progress field, so most of the updates should turn into
> > >no-ops at the database level.
> > >
> > >Regards,
> > >Daniel
> > I agree with Daniel, these rpc and db access ops are a tiny percentage
> > of the overall load on rabbit and mysql and properly configured these
> > subsystems should have no issues with this workload.
> >
> > One correction: unless I'm misreading it, the existing
> > _live_migration_monitor code updates the progress field of the instance
> > record every 5 seconds.  However, this value can go up and down, so
> > an infinite number of updates is possible?
>
> Oh yes, you are in fact correct. Technically you could have an unbounded
> number of updates if migration goes backwards. Some mitigation against
> this is if we see progress going backwards we'll actually abort the
> migration if it gets stuck for too long. We'll also be progressively
> increasing the permitted downtime. So except in pathological scenarios
> I think the number of updates should still be relatively small.
>
> > However, the issue raised here is not with the existing implementation
> > but with the proposed change
> > https://review.openstack.org/#/c/258813/5/nova/virt/libvirt/driver.py
> > This adds a save() operation on the migration object every 2 seconds.
>
> Ok, that is more heavyweight since it is recording the raw byte values
> and so it is guaranteed to do a database update pretty much every time.
> It still shouldn't be too unreasonable a load though. FWIW I think
> it is worth being consistent in the update frequency between the
> progress value & the migration object save, so switching to every
> 5 seconds probably makes more sense, so we know both objects are
> reflecting the same point in time.
>
> Regards,
> Daniel
> --
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|