[Openstack] unexpected distribution of compute instances in queens
Hi,

I am deploying OpenStack with 3 compute nodes, but I am seeing an abnormal distribution of instances: instances are only deployed on one specific compute node and are not distributed among the other compute nodes.

This is my nova.conf from the compute node (Jinja2-based template):

[DEFAULT]
osapi_compute_listen = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}
metadata_listen = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}
enabled_apis = osapi_compute,metadata
transport_url = rabbit://openstack:{{ rabbitmq_pw }}@{{ controller1_ip_man }}:5672,openstack:{{ rabbitmq_pw }}@{{ controller2_ip_man }}:5672,openstack:{{ rabbitmq_pw }}@{{ controller3_ip_man }}:5672
my_ip = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}
use_neutron = True
firewall_driver = nova.virt.firewall.NoopFirewallDriver

[api]
auth_strategy = keystone

[api_database]
connection = mysql+pymysql://nova:{{ nova_dbpw }}@{{ vip }}/nova_api

[barbican]

[cache]
backend=oslo_cache.memcache_pool
enabled=true
memcache_servers={{ controller1_ip_man }}:11211,{{ controller2_ip_man }}:11211,{{ controller3_ip_man }}:11211

[cells]

[cinder]
os_region_name = RegionOne

[compute]
[conductor]
[console]
[consoleauth]
[cors]
[crypto]

[database]
connection = mysql+pymysql://nova:{{ nova_dbpw }}@{{ vip }}/nova

[devices]
[ephemeral_storage_encryption]
[filter_scheduler]

[glance]
api_servers = http://{{ vip }}:9292

[guestfs]
[healthcheck]
[hyperv]
[ironic]
[key_manager]
[keystone]

[keystone_authtoken]
auth_url = http://{{ vip }}:5000/v3
memcached_servers = {{ controller1_ip_man }}:11211,{{ controller2_ip_man }}:11211,{{ controller3_ip_man }}:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = nova
password = {{ nova_pw }}

[libvirt]
[matchmaker_redis]
[metrics]
[mks]

[neutron]
url = http://{{ vip }}:9696
auth_url = http://{{ vip }}:35357
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = neutron
password = {{ neutron_pw }}
service_metadata_proxy = true
metadata_proxy_shared_secret = {{ metadata_secret }}

[notifications]
[osapi_v21]

[oslo_concurrency]
lock_path = /var/lib/nova/tmp

[oslo_messaging_amqp]
[oslo_messaging_kafka]
[oslo_messaging_notifications]
[oslo_messaging_rabbit]
[oslo_messaging_zmq]
[oslo_middleware]
[oslo_policy]
[pci]

[placement]
os_region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://{{ vip }}:5000/v3
username = placement
password = {{ placement_pw }}

[quota]
[rdp]
[remote_debug]

[scheduler]
discover_hosts_in_cells_interval = 300

[serial_console]
[service_user]
[spice]
[upgrade_levels]
[vault]
[vendordata_dynamic_auth]
[vmware]

[vnc]
enabled = true
keymap=en-us
novncproxy_base_url = https://{{ vip }}:6080/vnc_auto.html
novncproxy_host = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}

[workarounds]
[wsgi]
[xenserver]
[xvp]

[placement_database]
connection=mysql+pymysql://nova:{{ nova_dbpw }}@{{ vip }}/nova_placement

What is the problem? I have looked at openstack-nova-scheduler on the controller node, but it is running well with only this warning:

nova-scheduler[19255]: /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py:332: NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported

The result I want is that instances are distributed across all compute nodes. Thank you.
-- *Regards,* *Zufar Dhiyaulhaq*
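For anyone reproducing this, a quick way to confirm the skew is to compare instances against hypervisors (a sketch, assuming admin credentials are loaded; --long adds the Host column):

openstack server list --all-projects --long   # shows which host each instance runs on
openstack hypervisor list                     # lists the registered compute nodes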
On Mon, 2018-11-26 at 17:45 +0700, Zufar Dhiyaulhaq wrote:
Hi,
I am deploying OpenStack with 3 compute nodes, but I am seeing an abnormal distribution of instances: instances are only deployed on one specific compute node and are not distributed among the other compute nodes.
This is my nova.conf from the compute node (Jinja2-based template):
Hi, the default behavior of nova used to be to spread, not pack, and I believe it still is. The default behavior with placement, however, is closer to packing behavior, as allocation candidates are returned in an undefined but deterministic order. On a busy cloud this does not strictly pack instances, but on a quiet cloud it effectively does.

You can try enabling randomisation of the allocation candidates by setting this config option to true in the nova.conf of the scheduler: https://docs.openstack.org/nova/latest/configuration/config.html#placement.r...

On that note, can you provide the nova.conf used by the scheduler instead of the compute node's nova.conf? If you have not overridden any of the nova defaults, the RAM and CPU weighers should spread instances within the allocation candidates returned by placement.
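For reference, the option Sean refers to lives in the [placement] section of nova.conf; a minimal sketch (the same setting appears later in this thread):

[placement]
# shuffle equally valid allocation candidates instead of returning
# them in a fixed, deterministic order
randomize_allocation_candidates = true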
What is the problem? I have looked at openstack-nova-scheduler on the controller node, but it is running well with only this warning:
nova-scheduler[19255]: /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py:332: NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported
The result I want is that instances are distributed across all compute nodes. Thank you.
On 11/26/2018 10:13 AM, Sean Mooney wrote:
Hi, the default behavior of nova used to be to spread, not pack, and I believe it still is. The default behavior with placement, however, is closer to packing behavior, as allocation candidates are returned in an undefined but deterministic order.
On a busy cloud this does not strictly pack instances, but on a quiet cloud it effectively does.
You can try enabling randomisation of the allocation candidates by setting this config option to true in the nova.conf of the scheduler: https://docs.openstack.org/nova/latest/configuration/config.html#placement.r...
On that note, can you provide the nova.conf used by the scheduler instead of the compute node's nova.conf? If you have not overridden any of the nova defaults, the RAM and CPU weighers should spread instances within the allocation candidates returned by placement.
Or simply the other hosts are being filtered out, either because they aren't reporting into placement or some other filter is removing them. You should be able to see which hosts are being filtered if you enable debug logging in the nova-scheduler process (via the "debug" configuration option). -- Thanks, Matt
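A sketch of what Matt describes (the log path /var/log/nova/nova-scheduler.log is an assumption and varies by deployment):

[DEFAULT]
# verbose logging; at debug level the scheduler records how many hosts
# each filter returns for a request
debug = true

# after restarting nova-scheduler, boot an instance and inspect the filter results
grep -i "returned" /var/log/nova/nova-scheduler.log | tail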
Hi Smooney,

Thank you for your help. I am trying to enable randomization but it is not working; the instances I create still land on the same node. Below is my nova configuration (with randomization added per your suggestion) from the master node (Jinja2-based template):

[DEFAULT]
enabled_apis = osapi_compute,metadata
transport_url = rabbit://openstack:{{ rabbitmq_pw }}@{{ controller1_ip_man }}:5672,openstack:{{ rabbitmq_pw }}@{{ controller2_ip_man }}:5672,openstack:{{ rabbitmq_pw }}@{{ controller3_ip_man }}:5672
my_ip = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}
use_neutron = True
firewall_driver = nova.virt.firewall.NoopFirewallDriver

[api]
auth_strategy = keystone

[api_database]
[barbican]

[cache]
backend=oslo_cache.memcache_pool
enabled=true
memcache_servers={{ controller1_ip_man }}:11211,{{ controller2_ip_man }}:11211,{{ controller3_ip_man }}:11211

[cells]
[cinder]
[compute]
[conductor]
[console]
[consoleauth]
[cors]
[crypto]
[database]
[devices]
[ephemeral_storage_encryption]
[filter_scheduler]

[glance]
api_servers = http://{{ vip }}:9292

[guestfs]
[healthcheck]
[hyperv]
[ironic]
[key_manager]
[keystone]

[keystone_authtoken]
auth_url = http://{{ vip }}:5000/v3
memcached_servers = {{ controller1_ip_man }}:11211,{{ controller2_ip_man }}:11211,{{ controller3_ip_man }}:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = nova
password = {{ nova_pw }}

[libvirt]
virt_type = kvm

[matchmaker_redis]
[metrics]
[mks]

[neutron]
url = http://{{ vip }}:9696
auth_url = http://{{ vip }}:35357
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = neutron
password = {{ neutron_pw }}

[notifications]
[osapi_v21]

[oslo_concurrency]
lock_path = /var/lib/nova/tmp

[oslo_messaging_amqp]
[oslo_messaging_kafka]
[oslo_messaging_notifications]
[oslo_messaging_rabbit]
[oslo_messaging_zmq]
[oslo_middleware]
[oslo_policy]
[pci]

[placement]
os_region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://{{ vip }}:5000/v3
username = placement
password = {{ placement_pw }}

[quota]
[rdp]
[remote_debug]
[scheduler]
[serial_console]
[service_user]
[spice]
[upgrade_levels]
[vault]
[vendordata_dynamic_auth]
[vmware]

[vnc]
enabled = True
keymap=en-us
server_listen = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}
server_proxyclient_address = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}
novncproxy_base_url = https://{{ vip }}:6080/vnc_auto.html

[workarounds]
[wsgi]
[xenserver]
[xvp]

Thank you,

Best Regards,
Zufar Dhiyaulhaq

On Mon, Nov 26, 2018 at 11:13 PM Sean Mooney <smooney@redhat.com> wrote:
On Mon, 2018-11-26 at 17:45 +0700, Zufar Dhiyaulhaq wrote:
Hi,
I am deploying OpenStack with 3 compute nodes, but I am seeing an abnormal distribution of instances: instances are only deployed on one specific compute node and are not distributed among the other compute nodes.
This is my nova.conf from the compute node (Jinja2-based template):
Hi, the default behavior of nova used to be to spread, not pack, and I believe it still is. The default behavior with placement, however, is closer to packing behavior, as allocation candidates are returned in an undefined but deterministic order.
On a busy cloud this does not strictly pack instances, but on a quiet cloud it effectively does.
You can try enabling randomisation of the allocation candidates by setting this config option to true in the nova.conf of the scheduler:
https://docs.openstack.org/nova/latest/configuration/config.html#placement.r...
On that note, can you provide the nova.conf used by the scheduler instead of the compute node's nova.conf? If you have not overridden any of the nova defaults, the RAM and CPU weighers should spread instances within the allocation candidates returned by placement.
What is the problem? I have looked at openstack-nova-scheduler on the controller node, but it is running well with only this warning:
nova-scheduler[19255]: /usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py:332: NotSupportedWarning: Configuration option(s) ['use_tpool'] not supported
The result I want is that instances are distributed across all compute nodes. Thank you.
Hi Smooney,

Sorry for the last reply; I attached the wrong configuration file. This is my nova configuration (with randomization added per your suggestion) from the master node (Jinja2-based template):

[DEFAULT]
osapi_compute_listen = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}
metadata_listen = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}
enabled_apis = osapi_compute,metadata
transport_url = rabbit://openstack:{{ rabbitmq_pw }}@{{ controller1_ip_man }}:5672,openstack:{{ rabbitmq_pw }}@{{ controller2_ip_man }}:5672,openstack:{{ rabbitmq_pw }}@{{ controller3_ip_man }}:5672
my_ip = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}
use_neutron = True
firewall_driver = nova.virt.firewall.NoopFirewallDriver

[api]
auth_strategy = keystone

[api_database]
connection = mysql+pymysql://nova:{{ nova_dbpw }}@{{ vip }}/nova_api

[barbican]

[cache]
backend=oslo_cache.memcache_pool
enabled=true
memcache_servers={{ controller1_ip_man }}:11211,{{ controller2_ip_man }}:11211,{{ controller3_ip_man }}:11211

[cells]

[cinder]
os_region_name = RegionOne

[compute]
[conductor]
[console]
[consoleauth]
[cors]
[crypto]

[database]
connection = mysql+pymysql://nova:{{ nova_dbpw }}@{{ vip }}/nova

[devices]
[ephemeral_storage_encryption]
[filter_scheduler]

[glance]
api_servers = http://{{ vip }}:9292

[guestfs]
[healthcheck]
[hyperv]
[ironic]
[key_manager]
[keystone]

[keystone_authtoken]
auth_url = http://{{ vip }}:5000/v3
memcached_servers = {{ controller1_ip_man }}:11211,{{ controller2_ip_man }}:11211,{{ controller3_ip_man }}:11211
auth_type = password
project_domain_name = default
user_domain_name = default
project_name = service
username = nova
password = {{ nova_pw }}

[libvirt]
[matchmaker_redis]
[metrics]
[mks]

[neutron]
url = http://{{ vip }}:9696
auth_url = http://{{ vip }}:35357
auth_type = password
project_domain_name = default
user_domain_name = default
region_name = RegionOne
project_name = service
username = neutron
password = {{ neutron_pw }}
service_metadata_proxy = true
metadata_proxy_shared_secret = {{ metadata_secret }}

[notifications]
[osapi_v21]

[oslo_concurrency]
lock_path = /var/lib/nova/tmp

[oslo_messaging_amqp]
[oslo_messaging_kafka]
[oslo_messaging_notifications]
[oslo_messaging_rabbit]
[oslo_messaging_zmq]
[oslo_middleware]
[oslo_policy]
[pci]

[placement]
os_region_name = RegionOne
project_domain_name = Default
project_name = service
auth_type = password
user_domain_name = Default
auth_url = http://{{ vip }}:5000/v3
username = placement
password = {{ placement_pw }}
randomize_allocation_candidates = true

[quota]
[rdp]
[remote_debug]

[scheduler]
discover_hosts_in_cells_interval = 300

[serial_console]
[service_user]
[spice]
[upgrade_levels]
[vault]
[vendordata_dynamic_auth]
[vmware]

[vnc]
enabled = true
keymap=en-us
novncproxy_base_url = https://{{ vip }}:6080/vnc_auto.html
novncproxy_host = {{ hostvars[inventory_hostname]['ansible_ens3f1']['ipv4']['address'] }}

[workarounds]
[wsgi]
[xenserver]
[xvp]

[placement_database]
connection=mysql+pymysql://nova:{{ nova_dbpw }}@{{ vip }}/nova_placement

Thank you

Best Regards,
Zufar Dhiyaulhaq

On Tue, Nov 27, 2018 at 4:55 PM Zufar Dhiyaulhaq <zufardhiyaulhaq@gmail.com> wrote:
Hi Smooney,
Thank you for your help. I am trying to enable randomization but it is not working; the instances I create still land on the same node. Below is my nova configuration (with randomization added per your suggestion) from the master node (Jinja2-based template):
Thank you,
Best Regards, Zufar Dhiyaulhaq
Hi,

Thank you. I was able to fix this issue by adding this configuration to the nova configuration file on the controller node:

driver=filter_scheduler

Best Regards
Zufar Dhiyaulhaq

On Tue, Nov 27, 2018 at 5:01 PM Zufar Dhiyaulhaq <zufardhiyaulhaq@gmail.com> wrote:
Hi Smooney, sorry for the last reply; I attached the wrong configuration file. This is my nova configuration (with randomization added per your suggestion) from the master node (Jinja2-based template):
Thank you
Best Regards, Zufar Dhiyaulhaq
On 11/28/2018 02:50 AM, Zufar Dhiyaulhaq wrote:
Hi,
Thank you. I was able to fix this issue by adding this configuration to the nova configuration file on the controller node.
driver=filter_scheduler
That's the default:

https://docs.openstack.org/ocata/config-reference/compute/config-options.htm...

So that was definitely not the solution to your problem. My guess is that Sean's suggestion to randomize the allocation candidates fixed your issue.

Best,
-jay
I'm seeing a similar issue in Queens deployed via tripleo. Two x86 compute nodes and one ppc64le node and host aggregates for virtual instances and baremetal (x86) instances. Baremetal on x86 is working fine. All VMs get deployed to compute-0. I can live migrate VMs to compute-1 and all is well, but I tire of being the 'meatspace scheduler'. I've looked at the nova.conf in the various nova-xxx containers on the controllers, but I have failed to discern the root of this issue. Anyone have a suggestion? -- MC
On 11/30/2018 02:53 AM, Mike Carden wrote:
I'm seeing a similar issue in Queens deployed via tripleo.
Two x86 compute nodes and one ppc64le node and host aggregates for virtual instances and baremetal (x86) instances. Baremetal on x86 is working fine.
All VMs get deployed to compute-0. I can live migrate VMs to compute-1 and all is well, but I tire of being the 'meatspace scheduler'.
LOL, I love that term and will have to remember to use it in the future.
I've looked at the nova.conf in the various nova-xxx containers on the controllers, but I have failed to discern the root of this issue.
Have you set the placement_randomize_allocation_candidates CONF option and are still seeing the packing behaviour? Best, -jay
Have you set the placement_randomize_allocation_candidates CONF option and are still seeing the packing behaviour?
No I haven't. Where would be the place to do that? In a nova.conf somewhere that the nova-scheduler containers on the controller hosts could pick it up? Just about to deploy for realz with about forty x86 compute nodes, so it would be really nice to sort this first. :) -- MC
On 11/30/2018 05:52 PM, Mike Carden wrote:
Have you set the placement_randomize_allocation_candidates CONF option and are still seeing the packing behaviour?
No I haven't. Where would be the place to do that? In a nova.conf somewhere that the nova-scheduler containers on the controller hosts could pick it up?
Just about to deploy for realz with about forty x86 compute nodes, so it would be really nice to sort this first. :)
Presuming you are deploying Rocky or Queens, it goes in the nova.conf file under the [placement] section:

randomize_allocation_candidates = true

The nova.conf file should be the one used by nova-scheduler.

Best,
-jay
Presuming you are deploying Rocky or Queens,
Yep, it's Queens.
It goes in the nova.conf file under the [placement] section:
randomize_allocation_candidates = true
In triple-o land it seems like the config may need to be somewhere like nova-scheduler.yaml and laid down via a re-deploy. Or something. The nova_scheduler runs in a container on a 'controller' host. -- MC
Having found the nice docs at:
https://docs.openstack.org/tripleo-docs/latest/install/containers_deployment...

I have divined that I can ssh to each controller node and:

sudo docker exec -u root nova_scheduler crudini --set /etc/nova/nova.conf placement randomize_allocation_candidates true
sudo docker kill -s SIGHUP nova_scheduler

...and indeed the /etc/nova/nova.conf in each nova_scheduler container is updated accordingly.

Unfortunately, instances are all still launched on compute-0.

--
MC
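To double-check that the running container actually carries the new value, something like this should work (crudini --get mirrors the --set used above):

sudo docker exec nova_scheduler crudini --get /etc/nova/nova.conf placement randomize_allocation_candidates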
On Mon, Dec 3, 2018 at 7:06 PM Mike Carden <mike.carden@gmail.com> wrote:
Having found the nice docs at: https://docs.openstack.org/tripleo-docs/latest/install/containers_deployment...
I have divined that I can ssh to each controller node and:

sudo docker exec -u root nova_scheduler crudini --set /etc/nova/nova.conf placement randomize_allocation_candidates true
sudo docker kill -s SIGHUP nova_scheduler
FYI protip, you can add the following to a custom environment file to configure this value (as we don't expose the config by default):

parameter_defaults:
  ControllerExtraConfig:
    nova::config::nova_config:
      placement/randomize_allocation_candidates:
        value: true

And then do a deployment. This will persist it and ensure future scaling/management updates won't remove this configuration.
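A sketch of how that environment file might be applied (the file names here are only placeholders; the real deploy command should include every environment file already used by the deployment):

# scheduler-tuning.yaml contains the parameter_defaults snippet above
openstack overcloud deploy --templates \
  -e existing-environment-files.yaml \
  -e scheduler-tuning.yaml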
...and indeed the /etc/nova/nova.conf in each nova_scheduler container is updated accordingly.
Unfortunately, instances are all still launched on compute-0.
-- MC
On Tue, Dec 4, 2018 at 2:46 PM Alex Schultz <aschultz@redhat.com> wrote:
parameter_defaults:
  ControllerExtraConfig:
    nova::config::nova_config:
      placement/randomize_allocation_candidates:
        value: true
Thanks for that Alex. I'll roll that into our next over-the-top deploy update. I won't hold my breath for it actually getting our scheduling sorted out though, since it made no difference when I manually updated all three controllers with that config. -- MC
Started a patch to include that option in puppet-nova as well, based on this thread, which perhaps can help the TripleO-based world as well.

https://review.openstack.org/#/c/621593/

Best regards

On 12/04/2018 04:56 AM, Alex Schultz wrote:
On Mon, Dec 3, 2018 at 7:06 PM Mike Carden <mike.carden@gmail.com> wrote:
Having found the nice docs at: https://docs.openstack.org/tripleo-docs/latest/install/containers_deployment...
I have divined that I can ssh to each controller node and:

sudo docker exec -u root nova_scheduler crudini --set /etc/nova/nova.conf placement randomize_allocation_candidates true
sudo docker kill -s SIGHUP nova_scheduler
FYI protip, you can add the following to a custom environment file to configure this value (as we don't expose the config by default)
parameter_defaults:
  ControllerExtraConfig:
    nova::config::nova_config:
      placement/randomize_allocation_candidates:
        value: true
And then do a deployment. This will persist it and ensure future scaling/management updates won't remove this configuration.
...and indeed the /etc/nova/nova.conf in each nova_scheduler container is updated accordingly.
Unfortunately, instances are all still launched on compute-0.
-- MC
On Tue, 4 Dec 2018, Mike Carden wrote:
Having found the nice docs at: https://docs.openstack.org/tripleo-docs/latest/install/containers_deployment...
I have divined that I can ssh to each controller node and:

sudo docker exec -u root nova_scheduler crudini --set /etc/nova/nova.conf placement randomize_allocation_candidates true
sudo docker kill -s SIGHUP nova_scheduler
...and indeed the /etc/nova/nova.conf in each nova_scheduler container is updated accordingly.
Unfortunately, instances are all still launched on compute-0.
Sorry this has been such a pain for you. There are a couple of issues/other things to try:

* The 'randomize_allocation_candidates' config setting is used by the placement-api process (probably called nova-placement-api in queens), not the nova-scheduler process, so you need to update the config (in the placement section) for the former and restart it.

* If that still doesn't fix it then it would be helpful to see the logs from both the placement-api and nova-scheduler process from around the time you try to launch some instances, as that will help show if there's some other factor at play that is changing the number of available target hosts, causing attempts on the other two hosts to not land.

--
Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent
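Applied to the containerized setup discussed above, that would look roughly like the earlier nova_scheduler commands, but aimed at the placement container (assuming it is named nova_placement, as Mike mentions below, and that a plain container restart is acceptable):

sudo docker exec -u root nova_placement crudini --set /etc/nova/nova.conf placement randomize_allocation_candidates true
sudo docker restart nova_placement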
On Tue, Dec 4, 2018 at 9:58 PM Chris Dent <cdent+os@anticdent.org> wrote:
* The 'randomize_allocation_candidates' config setting is used by the placement-api process (probably called nova-placement-api in queens), not the nova-scheduler process, so you need to update the config (in the placement section) for the former and restart it.
Thanks Chris. I tried the same thing in the nova.conf of the nova_placement containers and still no joy. A check on a fresh deploy of Queens with just a couple of x86 compute nodes proves that it can work without randomize_allocation_candidates being set to True. Out of the box we get an even distribution of VMs across compute nodes. It seems that somewhere along the path of adding Ironic and some baremetal nodes and host aggregates and a PPC64LE node, the scheduling goes awry. Back to the drawing board, and the logs. -- MC
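One way to check whether the aggregates or a disabled service are narrowing the candidate hosts (a sketch; the aggregate name is a placeholder for whatever the deployment uses):

openstack compute service list --service nova-compute   # all compute services up and enabled?
openstack aggregate list                                 # which aggregates exist
openstack aggregate show <virtual-instance-aggregate>    # are all x86 compute hosts members?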
On Wed, 5 Dec 2018, Mike Carden wrote:
On Tue, Dec 4, 2018 at 9:58 PM Chris Dent <cdent+os@anticdent.org> wrote:
* The 'randomize_allocation_candidates' config setting is used by the placement-api process (probably called nova-placement-api in queens), not the nova-scheduler process, so you need to update the config (in the placement section) for the former and restart it.
I tried the same thing in the nova.conf of the nova_placement containers and still no joy.
Darn.
A check on a fresh deploy of Queens with just a couple of x86 compute nodes proves that it can work without randomize_allocation_candidates being set to True. Out of the box we get an even distribution of VMs across compute nodes. It seems that somewhere along the path of adding Ironic and some baremetal nodes and host aggregates and a PPC64LE node, the scheduling goes awry.
Yeah, this sort of stuff is why I was hoping we could see some of your logs, to figure out which of those things was the haymaker. If you figure it out, please post about it. -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent tw: @anticdent
On Tue, 2018-12-04 at 21:35 +0000, Chris Dent wrote:
On Wed, 5 Dec 2018, Mike Carden wrote:
On Tue, Dec 4, 2018 at 9:58 PM Chris Dent <cdent+os@anticdent.org> wrote:
* The 'randomize_allocation_candidates' config setting is used by the placement-api process (probably called nova-placement-api in queens), not the nova-scheduler process, so you need to update the config (in the placement section) for the former and restart it.
I tried the same thing in the nova.conf of the nova_placement containers and still no joy.
Darn.
A check on a fresh deploy of Queens with just a couple of x86 compute nodes proves that it can work without randomize_allocation_candidates being set to True. Out of the box we get an even distribution of VMs across compute nodes. It seems that somewhere along the path of adding Ironic and some baremetal nodes and host aggregates and a PPC64LE node, the scheduling goes awry.
Yeah, this sort of stuff is why I was hoping we could see some of your logs, to figure out which of those things was the haymaker.

So one thing that came up recently downstream was a discussion around the BuildFailureWeigher https://docs.openstack.org/nova/rocky/user/filter-scheduler.html#weights and the build_failure_weight_multiplier https://docs.openstack.org/nova/latest/configuration/config.html#filter_sche... I wonder if failed builds could be leading to the packing behavior.
It would explain why initially it is fine, but over time, as hosts accumulate a significant build-failure weight, the cluster transitions from an even spread to packing behavior.
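If that weigher turns out to be the cause, its influence can be tuned down in the scheduler's nova.conf (a sketch; 0.0 switches the penalty off entirely):

[filter_scheduler]
# the default multiplier is very large (1000000.0), so a handful of build
# failures can effectively exclude a host from weighing; 0.0 removes the penalty
build_failure_weight_multiplier = 0.0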
If you figure it out, please post about it.
On Wed, Dec 5, 2018 at 8:54 AM Sean Mooney <smooney@redhat.com> wrote:
so one thing that came up recently downstream was a discusion around the BuildFailureWeigher
Now there's an interesting idea. The compute node that's not being scheduled *hasn't* had build failures, but we had a lot of build failures in Ironic for a while due to the published RHEL7.6 qcow2 image having a wee typo in its grub conf. Those failures *shouldn't* influence non-ironic scheduling I'd have thought. Hmmm. -- MC
Just an update on this. I have found a 'fix' that is more like a workaround in that I still don't know what causes the problem. So, to start with, all VMs are being spawned on a single node and while I can live migrate to another node, the scheduler never sends VMs there. To fix this, I disable the hypervisor on the node that is getting all the VMs, then I spawn a bunch of VMs. They go to the node that still has its hypervisor enabled. Then I re-enable the disabled hypervisor and spawn a bunch of VMs. Now they get evenly split across the two nodes. Fixed. -- MC
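For reference, the disable/enable dance can be done from the CLI (a sketch; compute-0 is the host name used in this thread):

# take compute-0 out of scheduling
openstack compute service set --disable compute-0 nova-compute
# ... boot instances, they now land elsewhere ...
# bring compute-0 back
openstack compute service set --enable compute-0 nova-compute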
Hi all,

I am facing this issue again. I tried to add this configuration, but instances still go to compute1:

[scheduler]
driver = filter_scheduler
host_manager = host_manager

[filter_scheduler]
available_filters=nova.scheduler.filters.all_filters
enabled_filters=RetryFilter,AvailabilityZoneFilter,RamFilter,DiskFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,CoreFilter
use_baremetal_filters=False
weight_classes=nova.scheduler.weights.all_weighers

[placement]
randomize_allocation_candidates = true

Thank you.

Best Regards,
Zufar Dhiyaulhaq

On Tue, Dec 4, 2018 at 3:55 AM Mike Carden <mike.carden@gmail.com> wrote:
Presuming you are deploying Rocky or Queens,
Yep, it's Queens.
It goes in the nova.conf file under the [placement] section:
randomize_allocation_candidates = true
In triple-o land it seems like the config may need to be somewhere like nova-scheduler.yaml and laid down via a re-deploy.
Or something.
The nova_scheduler runs in a container on a 'controller' host.
-- MC
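When this recurs, it is worth confirming that every compute node is visible to both nova and placement (a sketch; the resource provider commands assume the osc-placement client plugin is installed):

openstack compute service list --service nova-compute        # services up and enabled
openstack resource provider list                              # one provider per compute node
openstack resource provider inventory list <provider-uuid>    # VCPU/MEMORY_MB/DISK_GB recorded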
participants (9)
- Alex Schultz
- Chris Dent
- Jay Pipes
- Matt Riedemann
- Mike Carden
- Sean Mooney
- Tobias Urdin
- Zufar Dhiyaulhaq
- Zufar Dhiyaulhaq