Hi Sean,

You’re right, the issue isn’t the scheduler after all. I wasn’t fully aware of how the scheduler and conductor work with each other. From the logs, I can see that it does pick a compute node as the selected host:

2020-06-05 17:50:54.742 520 DEBUG nova.conductor.manager [req-323fe2d5-4ab5-4d3f-a923-deb7ce9da9f9 0ec0114b913646338f16f1ce6457da3a e9dd8de2dad64f3c99078f26330aefb9 - default default] [instance: e1d20902-660e-4359-bb56-2482734b656f] Selected host: compute2.staging.planethoster.net; Selected node: compute2.staging.planethoster.net; Alternates: [(u'compute3.staging.planethoster.net', u'compute3.staging.planethoster.net'), (u'compute1.staging.planethoster.net', u'compute1.staging.planethoster.net')] schedule_and_build_instances /usr/lib/python2.7/site-packages/nova/conductor/manager.py:1371

It then proceeds to block device mapping:

2020-06-05 17:50:54.750 520 DEBUG nova.conductor.manager [req-323fe2d5-4ab5-4d3f-a923-deb7ce9da9f9 0ec0114b913646338f16f1ce6457da3a e9dd8de2dad64f3c99078f26330aefb9 - default default] [instance: e1d20902-660e-4359-bb56-2482734b656f] block_device_mapping [BlockDeviceMapping(attachment_id=<?>,boot_index=0,connection_info=None,created_at=<?>,delete_on_termination=False,deleted=<?>,deleted_at=<?>,destination_type='volume',device_name=None,device_type=None,disk_bus=None,guest_format=None,id=<?>,image_id='364b1fe6-6025-4ca5-8d7d-76fbd71074cb',instance=<?>,instance_uuid=<?>,no_device=False,snapshot_id=None,source_type='image',tag=None,updated_at=<?>,uuid=<?>,volume_id=None,volume_size=20)] _create_block_device_mapping /usr/lib/python2.7/site-packages/nova/conductor/manager.py:1169

However, the status of the VM in the database stays at "scheduling" and the conductor doesn’t do anything else, as if it were waiting for something that never comes. So, would that mean that the scheduler and placement actually do their job, but the process gets stuck in cinder?
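To convince myself that cinder really is the next step, I pulled apart the fields in that block_device_mapping log entry with a small sketch (the parsing here is just a rough key=value scrape of the repr above, not how nova stores the object):

```python
import re

# Fields copied from the conductor's block_device_mapping log entry above
# (trimmed to the ones that matter for the diagnosis).
bdm_repr = ("BlockDeviceMapping(boot_index=0,destination_type='volume',"
            "image_id='364b1fe6-6025-4ca5-8d7d-76fbd71074cb',"
            "source_type='image',volume_id=None,volume_size=20)")

# Rough parse: pull out key=value pairs, stripping quotes from string values.
fields = dict(re.findall(r"(\w+)=('[^']*'|\w+)", bdm_repr))
fields = {k: v.strip("'") for k, v in fields.items()}

# source_type='image' + destination_type='volume' with volume_id=None means
# this is a boot-from-volume request: a 20 GB volume still has to be created
# from the image via cinder before the build can progress, which fits the
# theory that the process is stuck waiting on cinder.
boot_from_volume = (fields["source_type"] == "image"
                    and fields["destination_type"] == "volume"
                    and fields["volume_id"] == "None")
print(boot_from_volume)  # True
```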
I was under the impression this was a nova issue, because if I shut down my containers with the nova services and boot the same nova services with the same configuration locally, I have no issue whatsoever.

Jean-Philippe Méthot
Senior Openstack system administrator
Administrateur système Openstack sénior
PlanetHoster inc.
4414-4416 Louis B Mayer
Laval, QC, H7P 0G1, Canada
TEL : +1.514.802.1644 - Poste : 2644
FAX : +1.514.612.0678
CA/US : 1.855.774.4678
FR : 01 76 60 41 43
UK : 0808 189 0423
On 5 June 2020 at 13:26, Sean Mooney <smooney@redhat.com> wrote:
On Fri, 2020-06-05 at 12:33 -0400, Jean-Philippe Méthot wrote:
Hi,
I’ve been building my own Docker images as a means both to learn Docker and to see if we can make our own images and run them in production. I’ve figured out how to make most services run fairly well. However, an issue remains with nova-scheduler and I can’t seem to figure out what’s going on.
Essentially, when I try to create a VM, it loops forever in a scheduling state, and when I try to delete a VM, it loops forever in a deleting state.
The scheduler is not involved in deleting a VM, so this more or less rules out the scheduler as the root cause. I would guess the issue lies somewhere between the API and the conductor.
I’ve narrowed down the culprit to nova-scheduler.

Can you explain why you think it’s the nova-scheduler?

As far as I know, nothing appears in the debug logs of my containerized nova-scheduler whenever I do any kind of action, which forces me to believe that nova-scheduler is not receiving any command.

Did you confirm that the conductor was receiving the build request and calling the scheduler?
From what I’ve always understood, nova-scheduler works through RPC and RabbitMQ. The fact that this nova-scheduler connects to RabbitMQ without issue makes me believe that something else is missing from my container configuration.
Does nova-scheduler listen on a network port?

No, the scheduler only communicates with the conductor via the RPC bus.

Does it listen on a socket?

No.

Is there any way that nova-scheduler could ignore requests sent to it?

Only if it is not listening on the correct exchange.
I would first check that the API sends an RPC to the conductor and validate that the conductor started the build request. If you see output in the conductor log related to your API queries, then you can check the logs to see if it called the scheduler.
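One way to follow a single build request across the api, conductor and scheduler logs is to filter every service's log for the same req- ID. A minimal sketch (the request ID is taken from the conductor log quoted earlier in the thread; the log file paths in the usage comment are assumptions for a typical deployment and may differ in a container):

```python
# Trace one build request across service logs by its request ID.
REQ_ID = "req-323fe2d5-4ab5-4d3f-a923-deb7ce9da9f9"

def trace_request(lines, req_id=REQ_ID):
    """Return only the log lines that belong to the given request."""
    return [line for line in lines if req_id in line]

# Usage sketch: feed each service's log in turn, e.g.
# for path in ("/var/log/nova/nova-api.log",
#              "/var/log/nova/nova-conductor.log",
#              "/var/log/nova/nova-scheduler.log"):
#     with open(path) as f:
#         for hit in trace_request(f):
#             print(path, hit, sep=": ")
```

If the request ID shows up in the api and conductor logs but never in the scheduler log, that points at the RPC path between conductor and scheduler rather than the scheduler itself.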