[openstack-dev] [nova] Migration progress

Paul Carlton paul.carlton2 at hpe.com
Wed Feb 3 11:27:16 UTC 2016


On 03/02/16 10:49, Daniel P. Berrange wrote:
> On Wed, Feb 03, 2016 at 10:44:36AM +0000, Daniel P. Berrange wrote:
>> On Wed, Feb 03, 2016 at 10:37:24AM +0000, Koniszewski, Pawel wrote:
>>> Hello everyone,
>>>
>>> At yesterday's live migration meeting, concerns were raised that the interval
>>> for writing migration progress to the database is too short.
>>>
>>> Information about migration progress will be stored in the database and
>>> exposed through the API (/servers/<uuid>/migrations/<id>). In the current
>>> proposal [1], migration progress will be updated every 2 seconds. This
>>> means that every 2 seconds an RPC call will go from compute to conductor
>>> to write migration data to the database. With parallel live migrations,
>>> each migration will report its own progress.
>>>
>>> Isn't a 2-second interval too short for updates, given that the information
>>> is exposed through the API and each update requires an RPC call and a DB
>>> write to save it?
>>>
>>> Our default configuration allows only 1 concurrent live migration [2],
>>> but this is configurable and may vary between deployments and use cases.
>>> Someone might want to trigger 10 (or even more) parallel live migrations,
>>> and each might take as long as a day to finish in the case of block
>>> migration. Also, if the deployment is big enough, rabbitmq might already be
>>> fully loaded. I'm not sure whether updating each migration every 2 seconds
>>> makes sense in this case. On the other hand, if we increase this interval
>>> it might be hard to notice quickly enough that a migration is stuck...
>> Do we have any actual data showing that this is a real problem? I have a
>> pretty hard time believing that a database update of a single field every
>> 2 seconds is going to be what pushes Nova over the edge into a performance
>> collapse, even if there are 20 migrations running in parallel, when you
>> compare it to the number of DB queries & updates done across other areas of
>> the code for pretty much every single API call and background job.
> Also note that progress is rounded to the nearest integer. So even if the
> migration runs all day, there is a maximum of 100 possible changes in value
> for the progress field, so most of the updates should turn into no-ops at
> the database level.
>
> Regards,
> Daniel
I agree with Daniel. These RPC and DB access operations are a tiny percentage
of the overall load on rabbit and mysql, and if properly configured these
subsystems should have no issues with this workload.

One correction: unless I'm misreading it, the existing
_live_migration_monitor code updates the progress field of the instance
record every 5 seconds.  However, this value can go down as well as up, so
an unbounded number of updates may be possible.
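
To make the rounding point concrete, here is a minimal Python sketch. It is
not Nova code: count_writes and the sample values are hypothetical, and it
assumes a write only happens when the rounded value differs from the last
one written. A monotonic run stays within the 0..100 bound, while a value
that oscillates keeps generating writes.

```python
def count_writes(samples):
    """Count DB writes if we only write when the rounded progress changes.

    `samples` is a hypothetical sequence of raw progress percentages,
    one per polling interval.
    """
    writes = 0
    last_written = None
    for raw in samples:
        value = int(round(raw))
        if value != last_written:
            writes += 1
            last_written = value
    return writes


# Monotonic progress from 0% to 100% in 0.5% steps: bounded by the
# number of distinct integer values.
monotonic = [i * 0.5 for i in range(201)]

# Progress that flips between 40% and 41%: every sample is a change.
oscillating = [40.0, 41.0] * 100

monotonic_writes = count_writes(monotonic)
oscillating_writes = count_writes(oscillating)
```

So the bound Daniel describes holds only while progress never decreases.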

However, the issue raised here is not with the existing implementation
but with the proposed change:
https://review.openstack.org/#/c/258813/5/nova/virt/libvirt/driver.py
This adds a save() operation on the migration object every 2 seconds.
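
If the write rate did turn out to matter, one possible mitigation can be
sketched as below. This is only an illustration under stated assumptions,
not what the review implements: ThrottledProgressWriter, its save callback
and the injectable clock are all hypothetical names. It skips the save when
the rounded value is unchanged or when the previous save was less than
min_interval seconds ago.

```python
import time


class ThrottledProgressWriter:
    """Hypothetical helper: save progress only when it changed and
    enough time has passed since the previous save."""

    def __init__(self, save, min_interval=2.0, clock=time.monotonic):
        self._save = save                # callable taking the integer progress
        self._min_interval = min_interval
        self._clock = clock              # injectable so it can be tested
        self._last_value = None
        self._last_time = None

    def report(self, progress):
        """Return True if a save was issued, False if it was skipped."""
        value = int(round(progress))
        now = self._clock()
        if value == self._last_value:
            return False                 # unchanged: no point in writing
        if (self._last_time is not None
                and now - self._last_time < self._min_interval):
            return False                 # changed, but too soon after last save
        self._save(value)
        self._last_value = value
        self._last_time = now
        return True


# Usage with a fake clock and a list standing in for the database.
saved = []
fake_now = [0.0]
writer = ThrottledProgressWriter(saved.append, min_interval=2.0,
                                 clock=lambda: fake_now[0])
first = writer.report(10.2)      # first report is always saved
second = writer.report(10.4)     # still rounds to 10, skipped
fake_now[0] = 1.0
third = writer.report(11.0)      # changed, but only 1s elapsed, skipped
fake_now[0] = 2.0
fourth = writer.report(11.0)     # changed and 2s elapsed, saved
```

The trade-off is the one Pawel mentions: a longer min_interval means a
stuck migration is noticed later.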

Paul Carlton
Software Engineer
Cloud Services
Hewlett Packard Enterprise