Hi,

We are running a 5-node Galera cluster with HAProxy in front. HAProxy is installed on the same node as cinder, and its configuration sends all reads and writes to the first node. From what I can tell, the galera and MariaDB RPMs did not come from the RDO repository.

/etc/cinder/cinder.conf:

[database]
connection = mysql+pymysql://cinder:xxxxxxxx@127.0.0.1/cinder

/etc/haproxy/haproxy.conf:

listen galera 127.0.0.1:3306
    maxconn 10000
    mode tcp
    option tcpka
    option tcplog
    option mysql-check user haproxy
    server db1 10.252.173.54:3306 check maxconn 10000
    server db2 10.252.173.55:3306 check backup maxconn 10000
    server db3 10.252.173.56:3306 check backup maxconn 10000
    server db4 10.252.173.57:3306 check backup maxconn 10000
    server db5 10.252.173.58:3306 check backup maxconn 10000

Name        : haproxy
Version     : 1.5.18
Release     : 7.el7
Architecture: x86_64
Install Date: Wed 09 Jan 2019 07:09:01 PM GMT
Group       : System Environment/Daemons
Size        : 2689838
License     : GPLv2+
Signature   : RSA/SHA256, Wed 25 Apr 2018 11:04:31 AM GMT, Key ID 24c6a8a7f4a80eb5
Source RPM  : haproxy-1.5.18-7.el7.src.rpm
Build Date  : Wed 11 Apr 2018 04:28:42 AM GMT
Build Host  : x86-01.bsys.centos.org
Relocations : (not relocatable)
Packager    : CentOS BuildSystem <http://bugs.centos.org>
Vendor      : CentOS
URL         : http://www.haproxy.org/
Summary     : TCP/HTTP proxy and load balancer for high availability environments

Name        : galera
Version     : 25.3.20
Release     : 1.rhel7.el7.centos
Architecture: x86_64
Install Date: Wed 09 Jan 2019 07:07:52 PM GMT
Group       : System Environment/Libraries
Size        : 36383325
License     : GPL-2.0
Signature   : DSA/SHA1, Tue 02 May 2017 04:20:52 PM GMT, Key ID cbcb082a1bb943db
Source RPM  : galera-25.3.20-1.rhel7.el7.centos.src.rpm
Build Date  : Thu 27 Apr 2017 12:58:55 PM GMT
Build Host  : centos70-x86-64
Relocations : (not relocatable)
Packager    : Codership Oy
Vendor      : Codership Oy
URL         : http://www.codership.com/
Summary     : Galera: a synchronous multi-master wsrep provider (replication engine)

Name        : MariaDB-server
Version     : 10.3.2
Release     : 1.el7.centos
Architecture: x86_64
Install Date: Wed 09 Jan 2019 07:08:11 PM GMT
Group       : Applications/Databases
Size        : 511538370
License     : GPLv2
Signature   : DSA/SHA1, Sat 07 Oct 2017 05:51:08 PM GMT, Key ID cbcb082a1bb943db
Source RPM  : MariaDB-server-10.3.2-1.el7.centos.src.rpm
Build Date  : Fri 06 Oct 2017 01:51:16 PM GMT
Build Host  : centos70-x86-64
Relocations : (not relocatable)
Vendor      : MariaDB Foundation
URL         : http://mariadb.org
Summary     : MariaDB: a very fast and robust SQL database server

Thanks

On Mon, Jan 14, 2019 at 8:23 AM Mike Bayer <mike_mp@zzzcomputing.com> wrote:
On Mon, Jan 14, 2019 at 7:38 AM Gorka Eguileor <geguileo@redhat.com> wrote:
On 11/01, Brandon Caulder wrote:
Hi,
The steps were...

- purge
- shutdown cinder-scheduler, cinder-api
- upgrade software
- restart cinder-volume
Hi,
You should not restart cinder volume services before doing the DB sync, otherwise the Cinder service is likely to fail.
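For reference, the ordering I would expect is roughly the following (just a sketch; the systemd unit names below are the usual RDO ones and may differ in your deployment):

  systemctl stop openstack-cinder-api openstack-cinder-scheduler openstack-cinder-volume
  # upgrade the cinder packages here
  cinder-manage db sync
  systemctl start openstack-cinder-volume openstack-cinder-scheduler openstack-cinder-api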
- sync (upgrade fails and stops at v114)
- sync again (db upgrades to v117)
- restart cinder-volume
- stacktrace observed in volume.log
At this point this could be a DB issue:
https://bugs.mysql.com/bug.php?id=67926
https://jira.mariadb.org/browse/MDEV-10558
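It might also be worth checking the usual suspects for a connection dropped mid-query on a long UPDATE (server-side timeouts and packet limits, plus any proxy in front of the DB); just a guess on my part, e.g.:

  mysql -u root -p -e "SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet';"
  mysql -u root -p -e "SHOW GLOBAL VARIABLES LIKE 'wait_timeout';"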
That's a scary issue. Can the reporter please list what MySQL / MariaDB version is running, and whether this is Galera/HA or single node?
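For example, something along these lines (assuming direct access to one of the DB nodes) would tell us:

  mysql -u root -p -e "SELECT VERSION();"
  mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"

If it is Galera, wsrep_cluster_size should report the number of nodes in the cluster.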
Cheers, Gorka.
Thanks
On Fri, Jan 11, 2019 at 7:23 AM Gorka Eguileor <geguileo@redhat.com> wrote:
On 10/01, Brandon Caulder wrote:
Hi Iain,
There are 424 rows in volumes, which drops down to 185 after running cinder-manage db purge 1. Restarting the volume service after the upgrade and running sync again does not remediate the problem, although running db sync a second time does bump the version up to 117. The following appears in the volume.log...
Hi,
If I understand correctly, the steps were:

- Run DB sync --> Fail
- Run DB purge
- Restart volume services
- See the log error
- Run DB sync --> version proceeds to 117
If that is the case, could you restart the services again, now that the migration has been moved to version 117?
If the cinder-volume service is able to restart, please run the online data migrations with the service running.
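That would be (a sketch; I believe it also accepts --max_count to process rows in batches, and it should be repeated until it reports nothing left to migrate):

  cinder-manage db online_data_migrations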
Cheers, Gorka.
Thanks
On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell <iain.macdonnell@oracle.com> wrote:

Different issue, I believe (DB sync vs. online migrations) - it just happens that both pertain to shared targets.
Brandon, might you have a very large number of rows in your volumes table? Have you been purging soft-deleted rows?
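(Assuming the standard soft-delete columns, something along these lines would show the scale, and cinder-manage can purge rows soft-deleted more than N days ago:)

  mysql -u root -p -e "SELECT COUNT(*) FROM cinder.volumes WHERE deleted = 0;"
  mysql -u root -p -e "SELECT COUNT(*) FROM cinder.volumes WHERE deleted != 0;"
  cinder-manage db purge 1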
~iain
On 1/10/19 11:01 AM, Jay Bryant wrote:
> Brandon,
>
> I am thinking you are hitting this bug:
>
> I think you can work around it by retrying the migration with the
> volume service running. You may, however, want to check with Iain
> MacDonnell as he has been looking at this for a while.
>
> Thanks!
> Jay
>
> On 1/10/2019 12:34 PM, Brandon Caulder wrote:
>> Hi,
>>
>> I am receiving the following error when performing an offline upgrade
>> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to
>> openstack-cinder-1:12.0.3-1.el7.
>>
>> # cinder-manage db version
>> 105
>>
>> # cinder-manage --debug db sync
>> Error during database migration: (pymysql.err.OperationalError) (2013,
>> 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes
>> SET shared_targets=%(shared_targets)s'] [parameters:
>> {'shared_targets': 1}]
>>
>> # cinder-manage db version
>> 114
>>
>> The db version does not upgrade to queens version 117. Any help would
>> be appreciated.
>>
>> Thank you
>