[ops][nova][victoria] Migrate cross CPU?

DHilsbos at performair.com
Mon Apr 19 02:51:19 UTC 2021


Sean;

Thank you; your suggestion led me to a problem with ssh.  I was a little surprised by this, since live migration works.

I reviewed:
https://docs.openstack.org/nova/victoria/admin/ssh-configuration.html#cli-os-migrate-cfg-ssh
and found that I had a problem with the authorized keys file.  I took care of that, and it still didn't work.

Here's what came out of the nova compute log:
2021-04-18 19:24:27.201 10808 ERROR oslo_messaging.rpc.server [req-225e7beb-f186-4235-abce-efcf4924d505 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Exception during message handling: nova.exception.ResizeError: Resize error: not able to execute ssh command: Unexpected error while running command.
Command: ssh -o BatchMode=yes 10.0.128.20 mkdir -p /var/lib/nova/instances/64229d87-4cbb-44d1-ba8a-5fe63c9c40f3
Exit code: 255
Stdout: ''
Stderr: 'Host key verification failed.\r\n'

When I do su - nova on the origin server, as per the above, then ssh to the receiving server, I get this:
Load key "/etc/nova/migration/identity": invalid format

/etc/nova/migration/identity isn't mentioned anywhere in the documentation above.

I tried:
cat id_rsa > /etc/nova/migration/identity
and
cat id_rsa.pub >> /etc/nova/migration/authorized_keys

This used the keys copied per the documentation above; still no go, with the same 'Host key verification failed.\r\n' result.
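
One way to narrow down the 'invalid format' message is to sanity-check the identity file itself. The sketch below is only a heuristic under stated assumptions: check_identity_file is a made-up helper name (not part of nova or OpenSSH), and it looks at just three things that can stop a key from loading: the file is readable, it has no Windows line endings, and it begins with a private-key header rather than a public key. It does not prove the key is valid. Note that 'Host key verification failed' is a separate complaint from ssh: it concerns the receiving host's key in known_hosts, not the identity file.

```shell
#!/bin/sh
# Hypothetical helper (not part of nova or OpenSSH): heuristic checks on a
# private key file such as /etc/nova/migration/identity.
check_identity_file() {
    key_file=$1

    # The user running ssh (nova, in this thread) must be able to read it.
    if [ ! -r "$key_file" ]; then
        echo "not readable: $key_file"
        return 1
    fi

    # Carriage returns from a Windows editor or copy/paste can break parsing.
    if grep -q "$(printf '\r')" "$key_file"; then
        echo "contains CR line endings: $key_file"
        return 1
    fi

    # A private key starts with a "-----BEGIN ... PRIVATE KEY-----" line;
    # a public key (ssh-rsa AAAA...) placed here will not load.
    case "$(head -n 1 "$key_file")" in
        "-----BEGIN "*"PRIVATE KEY-----") ;;
        *) echo "no private key header: $key_file"; return 1 ;;
    esac

    echo "looks plausible: $key_file"
}

# Example (run as the nova user on the origin server):
#   check_identity_file /etc/nova/migration/identity
```

A stronger test is to let OpenSSH parse the file: "ssh-keygen -y -f /etc/nova/migration/identity" prints the matching public key when the private key loads, and an error when it does not.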

What am I missing?

Thank you,

Dominic L. Hilsbos, MBA 
Director – Information Technology 
Perform Air International Inc.
DHilsbos at PerformAir.com 
www.PerformAir.com

-----Original Message-----
From: Sean Mooney [mailto:smooney at redhat.com] 
Sent: Friday, April 16, 2021 9:58 AM
To: Dominic Hilsbos; openstack-discuss at lists.openstack.org
Subject: Re: [ops][nova][victoria] Migrate cross CPU?

Hmm, OK. The best way to debug this is to list the server events and get
the request ID for the migration.
It may be req-ff109e53-74e0-40de-8ec7-29aff600b5f7 based on the logs you
posted, but you should see more info
in the API, conductor, and compute logs for that request ID.

Given the state has not changed, I suspect it failed rather early.

It's possible that you are experiencing an issue with the rabbitmq service
and RPC calls are being lost, but
I would not expect to see logs related to this in the scheduler while the
VM is still in the SHUTOFF status.

Can you run "openstack server event list 64229d87-4cbb-44d1-ba8a-5fe63c9c40f3",
then get the most recent resize event's request ID and see if there are
any other logs?
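
The log search described above can be sketched as a small shell helper. This is a minimal sketch, not a nova tool: find_request is a made-up name, the /var/log/nova directory is taken from the tail commands quoted later in this thread, and any log file names beyond nova-conductor.log and nova-scheduler.log are assumptions that may differ per distribution.

```shell
#!/bin/sh
# Hypothetical helper: print every line that mentions a request id in the
# given log files.
find_request() {
    req_id=$1
    shift
    # -H prefixes each match with its file name;
    # -F treats the request id as a literal string, not a regex.
    grep -HF -- "$req_id" "$@"
}

# Example, using the request id from the scheduler log quoted in this thread:
#   find_request req-ff109e53-74e0-40de-8ec7-29aff600b5f7 \
#       /var/log/nova/nova-conductor.log /var/log/nova/nova-scheduler.log
```

The compute log lives on the compute hosts rather than the controller, so the same search would need to be repeated there.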

Regards,
Sean.

(Note: I think it will be listed as a resize, not a migrate, since
internally migrate is implemented as a resize to the same flavour.)

On 16/04/2021 17:04, DHilsbos at performair.com wrote:
> Sean;
>
> Thank you very much for your response.  I wasn't aware of the state change to resize_verify, that's useful.
>
> Unfortunately, at present, the state change is not occurring.
>
> Here's a series of commands, with output:
>
> #openstack server show 64229d87-4cbb-44d1-ba8a-5fe63c9c40f3
> +-------------------------------------+----------------------------------------------------------+
> | Field                               | Value                                                    |
> +-------------------------------------+----------------------------------------------------------+
> | OS-DCF:diskConfig                   | MANUAL                                                   |
> | OS-EXT-AZ:availability_zone         | az-elcom-1                                               |
> | OS-EXT-SRV-ATTR:host                | s700030.463.os.mcgown.enterprises                        |
> | OS-EXT-SRV-ATTR:hypervisor_hostname | s700030.463.os.mcgown.enterprises                        |
> | OS-EXT-SRV-ATTR:instance_name       | instance-00000037                                        |
> | OS-EXT-STS:power_state              | Shutdown                                                 |
> | OS-EXT-STS:task_state               | None                                                     |
> | OS-EXT-STS:vm_state                 | stopped                                                  |
> | OS-SRV-USG:launched_at              | 2021-03-06T04:36:07.000000                               |
> | OS-SRV-USG:terminated_at            | None                                                     |
> | accessIPv4                          |                                                          |
> | accessIPv6                          |                                                          |
> | addresses                           | it-network=10.255.127.208, 10.0.160.35                   |
> | config_drive                        |                                                          |
> | created                             | 2021-03-06T04:35:51Z                                     |
> | flavor                              | m4.large (8)                                             |
> | hostId                              | 174a83351ac674a25a2bf5131b931fc7a9e16be48b62f37925a66676 |
> | id                                  | 64229d87-4cbb-44d1-ba8a-5fe63c9c40f3                     |
> | image                               | N/A (booted from volume)                                 |
> | key_name                            | None                                                     |
> | name                                | Java Dev                                                 |
> | project_id                          | 10dfdfadb7374ea1ba37bee1435d87ad                         |
> | properties                          |                                                          |
> | security_groups                     | name='allow-ping'                                        |
> |                                     | name='allow-ssh'                                         |
> |                                     | name='default'                                           |
> | status                              | SHUTOFF                                                  |
> | updated                             | 2021-04-16T15:52:07Z                                     |
> | user_id                             | 69b73ea8f55c46a99021e77ebf70b62a                         |
> | volumes_attached                    | id='ae69c924-60e5-431e-9572-c41a153e720b'                |
> +-------------------------------------+----------------------------------------------------------+
> #openstack server migrate --host s700066.463.os.mcgown.enterprises --os-compute-api-version 2.56 64229d87-4cbb-44d1-ba8a-5fe63c9c40f3
> #openstack server show 64229d87-4cbb-44d1-ba8a-5fe63c9c40f3
> +-------------------------------------+----------------------------------------------------------+
> | Field                               | Value                                                    |
> +-------------------------------------+----------------------------------------------------------+
> | OS-DCF:diskConfig                   | MANUAL                                                   |
> | OS-EXT-AZ:availability_zone         | az-elcom-1                                               |
> | OS-EXT-SRV-ATTR:host                | s700030.463.os.mcgown.enterprises                        |
> | OS-EXT-SRV-ATTR:hypervisor_hostname | s700030.463.os.mcgown.enterprises                        |
> | OS-EXT-SRV-ATTR:instance_name       | instance-00000037                                        |
> | OS-EXT-STS:power_state              | Shutdown                                                 |
> | OS-EXT-STS:task_state               | None                                                     |
> | OS-EXT-STS:vm_state                 | stopped                                                  |
> | OS-SRV-USG:launched_at              | 2021-03-06T04:36:07.000000                               |
> | OS-SRV-USG:terminated_at            | None                                                     |
> | accessIPv4                          |                                                          |
> | accessIPv6                          |                                                          |
> | addresses                           | it-network=10.255.127.208, 10.0.160.35                   |
> | config_drive                        |                                                          |
> | created                             | 2021-03-06T04:35:51Z                                     |
> | flavor                              | m4.large (8)                                             |
> | hostId                              | 174a83351ac674a25a2bf5131b931fc7a9e16be48b62f37925a66676 |
> | id                                  | 64229d87-4cbb-44d1-ba8a-5fe63c9c40f3                     |
> | image                               | N/A (booted from volume)                                 |
> | key_name                            | None                                                     |
> | name                                | Java Dev                                                 |
> | project_id                          | 10dfdfadb7374ea1ba37bee1435d87ad                         |
> | properties                          |                                                          |
> | security_groups                     | name='allow-ping'                                        |
> |                                     | name='allow-ssh'                                         |
> |                                     | name='default'                                           |
> | status                              | SHUTOFF                                                  |
> | updated                             | 2021-04-16T15:53:32Z                                     |
> | user_id                             | 69b73ea8f55c46a99021e77ebf70b62a                         |
> | volumes_attached                    | id='ae69c924-60e5-431e-9572-c41a153e720b'                |
> +-------------------------------------+----------------------------------------------------------+
> #tail /var/log/nova/nova-conductor.log
> #tail /var/log/nova/nova-scheduler.log
> 2021-04-16 08:53:24.870 3773 INFO nova.scheduler.host_manager [req-ff109e53-74e0-40de-8ec7-29aff600b5f7 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Host filter only checking host s700066.463.os.mcgown.enterprises and node s700066.463.os.mcgown.enterprises
> 2021-04-16 08:53:24.871 3773 INFO nova.scheduler.host_manager [req-ff109e53-74e0-40de-8ec7-29aff600b5f7 d7c514813e5d4fe6815f5f59e8e35f2f a008ad02d16f436a9e320882ca497055 - default default] Host filter ignoring hosts:
>
> Both Cinder volume storage, and ephemeral storage are being handled by Ceph.
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director – Information Technology
> Perform Air International Inc.
> DHilsbos at PerformAir.com
> www.PerformAir.com
>
>
> -----Original Message-----
> From: Sean Mooney [mailto:smooney at redhat.com]
> Sent: Friday, April 16, 2021 6:28 AM
> To: openstack-discuss at lists.openstack.org
> Subject: Re: [ops][nova][victoria] Migrate cross CPU?
>
>
>
> On 15/04/2021 19:05, DHilsbos at performair.com wrote:
>> All;
>>
>> I seem to have generated another issue for myself...
>>
>> I built our Victoria cloud initially on Intel Atom servers.  We recently received the first of our AMD Epyc (7002 series) servers, which are intended to take over the Nova Compute responsibilities.
>>
>> I've had success in the past doing live migrates, but live migrating from one of the Atom servers to the new server fails, with an error indicating CPU compatibility problems.  Ok, I can understand that.
>>
>> My problem is that I don't seem to understand the openstack server migrate command (non-live).  It doesn't seem to do anything, whether the instance is Running or Shut Down.  I can't find errors in the logs from the API / conductor / scheduler host.
>>
>> I also can't find an option to pass to the openstack server start command which requests a specific host.
>>
>> Can I get these existing instances moved from the Atom servers to the Epyc server(s), or do I need to recreate them to do this?
> You should be able to cold migrate them using the migrate command, but
> that should put the servers into resize_verify, and then you need
> to confirm the migration to complete it. We will not clean up the VM on
> the source node until you do that last step.
>
>> Thank you,
>>
>> Dominic L. Hilsbos, MBA
>> Director - Information Technology
>> Perform Air International Inc.
>> DHilsbos at PerformAir.com
>> www.PerformAir.com
>>
>>
>>
>


