[nova][scheduler] scheduler spawns to the same compute node only
Nicolas Ghirlanda
nicolas.ghirlanda at everyware.ch
Mon Apr 15 15:36:08 UTC 2019
Hi all,
we have a kolla-ansible deployed "Queens" release of OpenStack with 8
compute nodes and an external Percona XtraDB Cluster (with read-write
split via haproxy).
New VMs are currently always scheduled to the same compute node,
even though manual live migration to the other compute nodes works fine.
We're not sure what the issue is, but perhaps someone can spot it from
our config:
# nova.conf scheduler config
default_availability_zone = az1
...
[filter_scheduler]
available_filters = nova.scheduler.filters.all_filters
enabled_filters = RetryFilter, AvailabilityZoneFilter,
    ComputeCapabilitiesFilter, ImagePropertiesFilter,
    ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter,
    AggregateInstanceExtraSpecsFilter, AggregateMultiTenancyIsolation,
    DifferentHostFilter, RamFilter, SameHostFilter, NUMATopologyFilter
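To make it easier to compare the effective filter list against other deployments, here is a minimal Python sketch that parses the excerpt above and prints the enabled filters in order. It only uses the standard-library configparser, not nova's own config loading, and it assumes the continuation lines are indented (or on a single line) in the real nova.conf; the helper name enabled_filters() is made up for this example.

```python
import configparser

# Excerpt of the [filter_scheduler] section as posted above. Continuation
# lines are indented here so that an INI parser treats them as one value
# (assumption: the real nova.conf has them on one line or indented).
NOVA_CONF_EXCERPT = """
[filter_scheduler]
available_filters = nova.scheduler.filters.all_filters
enabled_filters = RetryFilter, AvailabilityZoneFilter,
    ComputeCapabilitiesFilter, ImagePropertiesFilter,
    ServerGroupAntiAffinityFilter, ServerGroupAffinityFilter,
    AggregateInstanceExtraSpecsFilter, AggregateMultiTenancyIsolation,
    DifferentHostFilter, RamFilter, SameHostFilter, NUMATopologyFilter
"""


def enabled_filters(conf_text):
    """Return the enabled scheduler filter names in configured order."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("filter_scheduler", "enabled_filters")
    # Multi-line values come back joined with newlines; split on commas
    # and strip surrounding whitespace from each name.
    return [name.strip() for name in raw.split(",") if name.strip()]


filters = enabled_filters(NOVA_CONF_EXCERPT)
print(len(filters), "filters enabled:")
for name in filters:
    print(" -", name)
```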
The database is an external Percona XtraDB Cluster (version 5.7.24) with
haproxy for read-write splitting (currently only one write node).
We do see MySQL errors in the nova-scheduler.log on the write DB node
when an instance is created:
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db [-]
Unexpected error while reporting service status: OperationalError:
(pymysql.err.OperationalError) (1213, u'WSREP detected deadlock/conflict
and aborted the transaction. Try restarting the transaction')
(Background on this error at: http://sqlalche.me/e/e3q8)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db Traceback
(most recent call last):
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py",
line 91, in _report_state
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
service.service_ref.save()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_versionedobjects/base.py",
line 226, in wrapper
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db return
fn(self, *args, **kwargs)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/objects/service.py",
line 397, in save
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db db_service
= db.service_update(self._context, self.id, updates)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/db/api.py",
line 183, in service_update
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db return
IMPL.service_update(context, service_id, values)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/api.py",
line 154, in wrapper
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
ectxt.value = e.inner_exc
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py",
line 220, in __exit__
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self.force_reraise()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py",
line 196, in force_reraise
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
six.reraise(self.type_, self.value, self.tb)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/api.py",
line 142, in wrapper
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db return
f(*args, **kwargs)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py",
line 227, in wrapped
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db return
f(context, *args, **kwargs)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self.gen.next()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py",
line 1043, in _transaction_scope
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db yield resource
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self.gen.next()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py",
line 653, in _session
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self.session.rollback()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py",
line 220, in __exit__
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self.force_reraise()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_utils/excutils.py",
line 196, in force_reraise
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
six.reraise(self.type_, self.value, self.tb)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py",
line 650, in _session
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self._end_session_transaction(self.session)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/oslo_db/sqlalchemy/enginefacade.py",
line 678, in _end_session_transaction
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
session.commit()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py",
line 943, in commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self.transaction.commit()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py",
line 471, in commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db t[1].commit()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
line 1643, in commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self._do_commit()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
line 1674, in _do_commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self.connection._commit_impl()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
line 726, in _commit_impl
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self._handle_dbapi_exception(e, None, None, None, None)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
line 1409, in _handle_dbapi_exception
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
util.raise_from_cause(newraise, exc_info)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/util/compat.py",
line 265, in raise_from_cause
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
reraise(type(exception), exception, tb=exc_tb, cause=cause)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/engine/base.py",
line 724, in _commit_impl
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self.engine.dialect.do_commit(self.connection)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/sqlalchemy/dialects/mysql/base.py",
line 1765, in do_commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
dbapi_connection.commit()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/pymysql/connections.py",
line 422, in commit
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
self._read_ok_packet()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/pymysql/connections.py",
line 396, in _read_ok_packet
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db pkt =
self._read_packet()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/pymysql/connections.py",
line 683, in _read_packet
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
packet.check_error()
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/pymysql/protocol.py",
line 220, in check_error
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
err.raise_mysql_exception(self._data)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db File
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/pymysql/err.py",
line 109, in raise_mysql_exception
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db raise
errorclass(errno, errval)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
OperationalError: (pymysql.err.OperationalError) (1213, u'WSREP detected
deadlock/conflict and aborted the transaction. Try restarting the
transaction') (Background on this error at: http://sqlalche.me/e/e3q8)
2019-04-15 16:52:10.016 24 ERROR nova.servicegroup.drivers.db
2019-04-15 16:52:20.020 24 INFO nova.servicegroup.drivers.db [-]
Recovered from being unable to report status.
The deadlock message is quite strange, as we have haproxy configured so
that all write requests are handled by one node.
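For context, the error text asks the client to restart the transaction, and the log above shows the next status report succeeding ten seconds later. What such a restart amounts to can be sketched in plain Python as below. This is only an illustration of the retry idea and not nova's or oslo.db's actual code; DeadlockError, retry_on_deadlock() and report_state() are hypothetical names standing in for the driver error with code 1213 and the periodic service status update.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error with MySQL code 1213."""


def retry_on_deadlock(func, max_retries=3, delay=0.0):
    """Call func(); if it raises DeadlockError, run it again.

    Gives up and re-raises after max_retries additional attempts.
    """
    attempt = 0
    while True:
        try:
            return func()
        except DeadlockError:
            if attempt >= max_retries:
                raise
            attempt += 1
            time.sleep(delay)


# Simulated status update: fails twice with a conflict, then commits.
calls = {"count": 0}


def report_state():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise DeadlockError("WSREP detected deadlock/conflict")
    return "committed"


result = retry_on_deadlock(report_state)
print(result, "after", calls["count"], "attempts")
```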
There are NO errors in mysqld.log WHILE creating an instance, but from
time to time we see aborted connections from nova:
2019-04-15T14:22:36.232108Z 30616972 [Note] Aborted connection 30616972
to db: 'nova' user: 'nova' host: '10.x.y.z' (Got an error reading
communication packets)
As I said, all instances are allocated to the same compute node.
nova-compute.log doesn't show an error while creating the instance.
Besides that, we also see messages like the following from
nova.scheduler.host_manager on all other nodes (but those messages are
_not_ triggered when an instance is spawned!):
2019-04-15 16:28:47.771 22 INFO nova.scheduler.host_manager
[req-f92e340e-a88a-44a0-8cad-588390c25bc2 - - - - -] The instance sync
for host 'xxx' did not match. Re-created its InstanceList.
I don't know whether this is relevant, but somehow our (currently single)
AZ is listed several times:
# openstack availability zone list
+------------+-------------+
| Zone Name  | Zone Status |
+------------+-------------+
| internal   | available   |
| az1        | available   |
| az1        | available   |
| az1        | available   |
| az1        | available   |
+------------+-------------+
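To put a number on the duplication, a small Python sketch that tallies the zone names from the table above. It simply parses the pasted CLI output as text; the helper name zone_counts() is made up for this example and nothing here queries the API.

```python
from collections import Counter

# Output of `openstack availability zone list` as posted above.
AZ_TABLE = """
+------------+-------------+
| Zone Name  | Zone Status |
+------------+-------------+
| internal   | available   |
| az1        | available   |
| az1        | available   |
| az1        | available   |
| az1        | available   |
+------------+-------------+
"""


def zone_counts(table_text):
    """Count how often each zone name appears in the CLI table."""
    counts = Counter()
    for line in table_text.splitlines():
        if not line.startswith("|"):
            continue  # skip border lines and blanks
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if cells[0] == "Zone Name":
            continue  # skip the header row
        counts[cells[0]] += 1
    return counts


counts = zone_counts(AZ_TABLE)
for zone, n in counts.items():
    print(zone, n)
```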
Could that be related somehow?
Thanks for any consideration and support!
kind regards
Nicolas