[Openstack-operators] MessagingTimeout in block live-migration due to long image fetch operation
mriedemos at gmail.com
Fri Dec 1 23:05:42 UTC 2017
On 11/28/2017 9:13 AM, Gustavo Randich wrote:
> (running Mitaka)
> When doing block live-migration, if the image / backing file is not
> present at destination host, sometimes pre-live migration fails after 60
> seconds as shown below. Retrying the migration to the same destination
> host succeeds.
> It seems that an rpc_response_timeout of 60 seconds is not enough for
> this scenario, in which fetching the image takes 90 seconds. We'd
> rather not increase rpc_response_timeout to, say, 120 seconds only for
> this reason (because for other kinds of errors we prefer to fail fast).
> Given that migrations are usually long, shouldn't this operation be
> under the scope of a configurable timeout such as
> live_migration_progress_timeout or live_migration_completion_timeout
> which overrides the default rpc timeout?
I think we've talked about adding a config option or somehow doing rpc
timeouts differently for operations that we know are prone to timeouts,
so I don't think people would be against a config option for this. I
know there is at least one place in nova where we specify an rpc
response timeout which is not the default.
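For illustration, here is a rough sketch of what a per-call override could look like. It assumes a client object that behaves like an oslo.messaging RPCClient, i.e. one that exposes prepare(timeout=...) and returns a context with a call() method. The helper name call_with_timeout and the PRE_LIVE_MIGRATION_TIMEOUT setting are hypothetical and are not existing nova code or options.

```python
# Sketch only. `client` is assumed to behave like an oslo.messaging RPCClient:
# prepare(timeout=...) returns a per-call context that has a call() method.
# PRE_LIVE_MIGRATION_TIMEOUT is a hypothetical setting, not a real nova option.

PRE_LIVE_MIGRATION_TIMEOUT = 120  # seconds; illustrative value only


def call_with_timeout(client, ctxt, method, timeout=None, **kwargs):
    """Invoke an RPC method, overriding the reply timeout for this call only."""
    if timeout is not None:
        # Only this call waits longer; other calls keep the client's default.
        cctxt = client.prepare(timeout=timeout)
    else:
        # No override requested: use whatever timeout the client defaults to.
        cctxt = client.prepare()
    return cctxt.call(ctxt, method, **kwargs)
```

With something like this, only the slow pre-live-migration call would wait longer, while every other RPC would still fail fast on the default rpc_response_timeout.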