On Thu, Sep 12, 2024 at 1:44 AM <collinl@churchofjesuschrist.org> wrote:
Amit, thanks for the quick response.

Not sure what you mean by "BFV" ?

BFV means Boot from Volume, but in your case you booted from an image.

After your comments, I was able to make a little progress.  I hadn't looked at the logs on the compute nodes prior to that; I was only looking on the controller.  I saw an error related to the compute node trying to make a connection to the database.  I made that fix in the nova.conf on the compute nodes, and now when I spin up a new instance, the volume does correctly show "in-use" and that it is attached to the instance :-)

However, I am still getting some errors that I don't know what to do with.  On the controller, I am seeing a bunch of these:
/var/log/nova/nova-metadata-api.log:2024-09-11 09:59:51.273 560137 ERROR nova.api.metadata.handler [None req-799be5d5-5a00-4eb9-a90b-8561f73949ab - - - - - -] Failed to get metadata for instance id: f3a05784-7404-4319-b878-d40538bf152d: keystoneauth1.exceptions.http.Unauthorized: The request you have made requires authentication. (HTTP 401) (Request-ID: req-c3aea86b-3bdf-409c-8862-27c98d98aaaa)

This comes from keystone: the operator user is not authorized to get the instance info. Yet you were able to create an instance with this user, which is surprising.

I haven't tried packstack, but the installer should take care of this for the admin user, unless you are creating a new tenant and all resources inside it.

 
And on the compute node, I see this:
nova-compute.log:2024-09-11 09:56:38.628 45664 WARNING nova.compute.manager [req-af634eb4-cf6a-4461-be14-fe4b4f0b6f87 req-a83bf6fc-ea9a-42e7-b841-8a46a4168eef 46f271442f53434dbf6b35719b8fbc2d a938e696c70f4469884ab4d5f5dcf2ac - - default default] [instance: f3a05784-7404-4319-b878-d40538bf152d] Received unexpected event network-vif-plugged-eba53055-f08d-423d-8ace-474e0af51b0c for instance with vm_state active and task_state None.

When trying to create a snapshot, it gets further, but still reports an error.  The message is "create snapshot:Snapshot failed to create."
Here is what I see for the volume and volume snapshot output:
[root@l21651 56522c30c45d0a9afe5818208e55a58b(keystone_admin)]# openstack volume list
+--------------------------------------+------+--------+------+-----------------------------------+
| ID                                   | Name | Status | Size | Attached to                       |
+--------------------------------------+------+--------+------+-----------------------------------+
| 3cf3c7e3-dc43-4537-a3e7-8b20a4d35927 |      | in-use |    1 | Attached to snaptest on /dev/vda  |
+--------------------------------------+------+--------+------+-----------------------------------+
[root@l21651 56522c30c45d0a9afe5818208e55a58b(keystone_admin)]# openstack volume snapshot list
+--------------------------------------+--------------------------------+-------------+--------+------+
| ID                                   | Name                           | Description | Status | Size |
+--------------------------------------+--------------------------------+-------------+--------+------+
| a0afabe0-158e-4858-86e6-7f9682cb4433 | snapshot for snaptest_snapshot |             | error  |    1 |
+--------------------------------------+--------------------------------+-------------+--------+------+

Not sure I believe that, though: if I look in the NFS directory, there does appear to be a snapshot created:
-rw-rw-rw-. 1  107  107 1.0G Sep 11 10:04 volume-3cf3c7e3-dc43-4537-a3e7-8b20a4d35927
-rw-rw-rw-. 1 root root 193K Sep 11 10:16 volume-3cf3c7e3-dc43-4537-a3e7-8b20a4d35927.a0afabe0-158e-4858-86e6-7f9682cb4433

And, it appears to be the right kind of file:
[root@l21651 56522c30c45d0a9afe5818208e55a58b(keystone_admin)]# file volume-3cf3c7e3-dc43-4537-a3e7-8b20a4d35927.a0afabe0-158e-4858-86e6-7f9682cb4433
volume-3cf3c7e3-dc43-4537-a3e7-8b20a4d35927.a0afabe0-158e-4858-86e6-7f9682cb4433: QEMU QCOW2 Image (v3), has backing file (path volume-3cf3c7e3-dc43-4537-a3e7-8b20a4d35927), 1073741824 bytes

Here is another error that may be related to this issue?
/var/log/cinder/volume.log:2024-09-11 11:38:51.743 1196213 ERROR cinder.volume.drivers.remotefs [req-e8eab520-53bf-405c-b2c7-3be5204ecf1c req-b0d1d655-1567-46aa-8809-ebc62760e0e6 956bef8f3ca34d9bbb5b3bb8876890fd b3001e31df2947e78c5fda1e0d8479b0 - - - -] Call to Nova to create snapshot failed: keystoneauth1.exceptions.http.BadRequest: Expecting to find domain in project. The server could not comply with the request since it is either malformed or otherwise incorrect. The client is assumed to be in error. (HTTP 400) (Request-ID: req-279a7830-5c12-4997-9560-857c30742a63)   

How do I add a domain to the project?


On a positive note, when trying the migration, I did find/fix an error in my sudoers config, and now I am able to live migrate, so it is just the snapshot error that I am trying to fix currently.


So it seems like it's only an identity issue.
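
On the "Expecting to find domain in project" error: keystone v3 returns this when a request names a project without saying which domain it belongs to, so it is usually a missing option in a service config rather than something to add to the project itself. Here is a minimal sketch for checking that, assuming cinder takes its credentials for calling Nova from a [nova] section in cinder.conf (that section name, and the helper missing_domain_options, are assumptions for illustration and not confirmed for your deployment). The option names are the standard keystoneauth ones.

```python
import configparser


def missing_domain_options(conf_text, section="nova"):
    """Return the domain options that appear to be missing from `section`.

    Hypothetical helper: it only inspects the text of a config file and
    does not talk to keystone.
    """
    # interpolation=None so '%' in passwords is not treated specially;
    # strict=False tolerates duplicate keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if not parser.has_section(section):
        return ["[%s] section" % section]
    opts = parser[section]
    missing = []
    for name in ("user_domain", "project_domain"):
        # Either the *_name or the *_id form satisfies keystone v3.
        if not (opts.get(name + "_name") or opts.get(name + "_id")):
            missing.append(name + "_name")
    return missing


if __name__ == "__main__":
    # Path assumed; adjust to where your cinder.conf lives.
    with open("/etc/cinder/cinder.conf") as fh:
        print(missing_domain_options(fh.read()))
```

If this reports user_domain_name or project_domain_name as missing, setting them (to Default on a stock install, which is again an assumption) and restarting the cinder volume service would be the first thing to try.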

So, now I would go like this:
1- see here first: https://docs.openstack.org/newton/install-guide-obs/keystone-users.html
     - fix anything that is wrong or pending
2- create a VM
3- create a snapshot using the command
     openstack server image create
     # this creates a snapshot image of the server in glance, from which you can create the next server.

4- from the errors you are getting, it seems the keystone docs will fix it, but if it still fails:
     4.1 look at what really happened in the image-create request; you can try --debug with the above command

     4.2 or look it up via the req-id: openstack server event list <server-id>
          4.2.1 take the req-id of the create image action; it looks like `req-uuid`
          4.2.2 grep for this req-id in the nova logs
          4.2.3 to understand it better, start with the nova-api logs, then nova-conductor, then nova-compute.
          4.2.4 look for the time at which it called the glance API.
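
Steps 4.2.2 to 4.2.4 can be sketched in a few lines of Python, if that is easier than running grep on each file. This is only an illustration: trace_request is a made-up helper, the log file names under /var/log/nova are assumed, and it relies on each log line starting with a 23-character timestamp like the ones in your excerpts.

```python
from pathlib import Path


def trace_request(req_id, log_paths):
    """Collect every line mentioning req_id, ordered by timestamp.

    Returns a list of (timestamp, log file name, full line) tuples.
    """
    hits = []
    for path in log_paths:
        with open(path, errors="replace") as fh:
            for line in fh:
                if req_id in line:
                    # "2024-09-11 09:56:38.628" is 23 characters long.
                    hits.append((line[:23], Path(path).name, line.rstrip("\n")))
    # The timestamps sort correctly as plain strings.
    hits.sort(key=lambda hit: hit[0])
    return hits


if __name__ == "__main__":
    # Assumed paths; use the req-id from `openstack server event list`.
    logs = [
        "/var/log/nova/nova-api.log",
        "/var/log/nova/nova-conductor.log",
        "/var/log/nova/nova-compute.log",
    ]
    for stamp, name, line in trace_request("req-<uuid>", logs):
        print(name, line)
```

The conductor and compute logs usually live on different hosts, so you may need to copy them to one place first. Reading the merged output top to bottom gives the order in which the services handled the request.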

At that point, you will know the flow of requests and why it failed.

If things are still the same, also tell us the commands you tried.

Regards