[cinder] db sync error upgrading from pike to queens
Brandon Caulder
kbcaulder at gmail.com
Mon Jan 14 16:43:56 UTC 2019
Hi,
We are running a 5-node Galera cluster with HAProxy in front. HAProxy is
installed on the same node as cinder, and its configuration sends all reads
and writes to the first node (the other four are marked as backup). From
what I can tell, the galera and MariaDB rpms did not come from the RDO
repository.

/etc/cinder/cinder.conf

[database]
connection = mysql+pymysql://cinder:xxxxxxxx@127.0.0.1/cinder

/etc/haproxy/haproxy.conf

listen galera 127.0.0.1:3306
    maxconn 10000
    mode tcp
    option tcpka
    option tcplog
    option mysql-check user haproxy
    server db1 10.252.173.54:3306 check maxconn 10000
    server db2 10.252.173.55:3306 check backup maxconn 10000
    server db3 10.252.173.56:3306 check backup maxconn 10000
    server db4 10.252.173.57:3306 check backup maxconn 10000
    server db5 10.252.173.58:3306 check backup maxconn 10000
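
To illustrate why only the first node sees traffic: HAProxy uses servers
flagged "backup" only when no non-backup server is available, so db1 is the
only active backend in the section above. A minimal Python sketch that
classifies those server lines (the split_backends helper is made up for
illustration and is not part of HAProxy or any other tool):

```python
# Sketch only: classify the `server` lines of the `listen galera` section
# into active and backup backends. The text is copied from the config above.
HAPROXY_SERVERS = """\
server db1 10.252.173.54:3306 check maxconn 10000
server db2 10.252.173.55:3306 check backup maxconn 10000
server db3 10.252.173.56:3306 check backup maxconn 10000
server db4 10.252.173.57:3306 check backup maxconn 10000
server db5 10.252.173.58:3306 check backup maxconn 10000
"""


def split_backends(config_text):
    """Return (active, backup) lists of server names (hypothetical helper)."""
    active, backup = [], []
    for line in config_text.splitlines():
        fields = line.split()
        # A server line looks like: server <name> <address> [options...]
        if len(fields) < 3 or fields[0] != "server":
            continue
        name, options = fields[1], fields[3:]
        if "backup" in options:
            backup.append(name)
        else:
            active.append(name)
    return active, backup


active, backup = split_backends(HAPROXY_SERVERS)
print("active:", active)
print("backup:", backup)
```

With this configuration the db sync and the cinder services all talk to db1
unless it fails its health check.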
Name : haproxy
Version : 1.5.18
Release : 7.el7
Architecture: x86_64
Install Date: Wed 09 Jan 2019 07:09:01 PM GMT
Group : System Environment/Daemons
Size : 2689838
License : GPLv2+
Signature : RSA/SHA256, Wed 25 Apr 2018 11:04:31 AM GMT, Key ID 24c6a8a7f4a80eb5
Source RPM : haproxy-1.5.18-7.el7.src.rpm
Build Date : Wed 11 Apr 2018 04:28:42 AM GMT
Build Host : x86-01.bsys.centos.org
Relocations : (not relocatable)
Packager : CentOS BuildSystem <http://bugs.centos.org>
Vendor : CentOS
URL : http://www.haproxy.org/
Summary : TCP/HTTP proxy and load balancer for high availability environments

Name : galera
Version : 25.3.20
Release : 1.rhel7.el7.centos
Architecture: x86_64
Install Date: Wed 09 Jan 2019 07:07:52 PM GMT
Group : System Environment/Libraries
Size : 36383325
License : GPL-2.0
Signature : DSA/SHA1, Tue 02 May 2017 04:20:52 PM GMT, Key ID cbcb082a1bb943db
Source RPM : galera-25.3.20-1.rhel7.el7.centos.src.rpm
Build Date : Thu 27 Apr 2017 12:58:55 PM GMT
Build Host : centos70-x86-64
Relocations : (not relocatable)
Packager : Codership Oy
Vendor : Codership Oy
URL : http://www.codership.com/
Summary : Galera: a synchronous multi-master wsrep provider (replication engine)

Name : MariaDB-server
Version : 10.3.2
Release : 1.el7.centos
Architecture: x86_64
Install Date: Wed 09 Jan 2019 07:08:11 PM GMT
Group : Applications/Databases
Size : 511538370
License : GPLv2
Signature : DSA/SHA1, Sat 07 Oct 2017 05:51:08 PM GMT, Key ID cbcb082a1bb943db
Source RPM : MariaDB-server-10.3.2-1.el7.centos.src.rpm
Build Date : Fri 06 Oct 2017 01:51:16 PM GMT
Build Host : centos70-x86-64
Relocations : (not relocatable)
Vendor : MariaDB Foundation
URL : http://mariadb.org
Summary : MariaDB: a very fast and robust SQL database server
Thanks
On Mon, Jan 14, 2019 at 8:23 AM Mike Bayer <mike_mp at zzzcomputing.com> wrote:
> On Mon, Jan 14, 2019 at 7:38 AM Gorka Eguileor <geguileo at redhat.com> wrote:
> >
> > On 11/01, Brandon Caulder wrote:
> > > Hi,
> > >
> > > The steps were...
> > > - purge
> > > - shutdown cinder-scheduler, cinder-api
> > > - upgrade software
> > > - restart cinder-volume
> >
> > Hi,
> >
> > You should not restart cinder volume services before doing the DB sync,
> > otherwise the Cinder service is likely to fail.
> >
> > > - sync (upgrade fails and stops at v114)
> > > - sync again (db upgrades to v117)
> > > - restart cinder-volume
> > > - stacktrace observed in volume.log
> > >
> >
> > At this point this could be a DB issue:
> >
> > https://bugs.mysql.com/bug.php?id=67926
> > https://jira.mariadb.org/browse/MDEV-10558
>
> that's a scary issue, can the reporter please list what MySQL /
> MariaDB version is running and if this is Galera/HA or single node?
>
>
> >
> > Cheers,
> > Gorka.
> >
> > > Thanks
> > >
> > > On Fri, Jan 11, 2019 at 7:23 AM Gorka Eguileor <geguileo at redhat.com> wrote:
> > >
> > > > On 10/01, Brandon Caulder wrote:
> > > > > Hi Iain,
> > > > >
> > > > > There are 424 rows in volumes which drops down to 185 after running
> > > > > cinder-manage db purge 1. Restarting the volume service after package
> > > > > upgrade and running sync again does not remediate the problem, although
> > > > > running db sync a second time does bump the version up to 117, the
> > > > > following appears in the volume.log...
> > > > >
> > > > > http://paste.openstack.org/show/Gfbe94mSAqAzAp4Ycwlz/
> > > > >
> > > >
> > > > Hi,
> > > >
> > > > If I understand correctly the steps were:
> > > >
> > > > - Run DB sync --> Fail
> > > > - Run DB purge
> > > > - Restart volume services
> > > > - See the log error
> > > > - Run DB sync --> version proceeds to 117
> > > >
> > > > If that is the case, could you restart the services again now that the
> > > > migration has been moved to version 117?
> > > >
> > > > If the cinder-volume service is able to restart please run the online
> > > > data migrations with the service running.
> > > >
> > > > Cheers,
> > > > Gorka.
> > > >
> > > >
> > > > > Thanks
> > > > >
> > > > > On Thu, Jan 10, 2019 at 11:15 AM iain MacDonnell <iain.macdonnell at oracle.com> wrote:
> > > > >
> > > > > >
> > > > > > Different issue, I believe (DB sync vs. online migrations) - it just
> > > > > > happens that both pertain to shared targets.
> > > > > >
> > > > > > Brandon, might you have a very large number of rows in your volumes
> > > > > > table? Have you been purging soft-deleted rows?
> > > > > >
> > > > > > ~iain
> > > > > >
> > > > > >
> > > > > > On 1/10/19 11:01 AM, Jay Bryant wrote:
> > > > > > > Brandon,
> > > > > > >
> > > > > > > I am thinking you are hitting this bug:
> > > > > > >
> > > > > > > https://bugs.launchpad.net/cinder/+bug/1806156
> > > > > > >
> > > > > > >
> > > > > > > I think you can work around it by retrying the migration with the volume
> > > > > > > service running. You may, however, want to check with Iain MacDonnell
> > > > > > > as he has been looking at this for a while.
> > > > > > >
> > > > > > > Thanks!
> > > > > > > Jay
> > > > > > >
> > > > > > >
> > > > > > > On 1/10/2019 12:34 PM, Brandon Caulder wrote:
> > > > > > >> Hi,
> > > > > > >>
> > > > > > >> I am receiving the following error when performing an offline upgrade
> > > > > > >> of cinder from RDO openstack-cinder-1:11.1.0-1.el7 to
> > > > > > >> openstack-cinder-1:12.0.3-1.el7.
> > > > > > >>
> > > > > > >> # cinder-manage db version
> > > > > > >> 105
> > > > > > >>
> > > > > > >> # cinder-manage --debug db sync
> > > > > > >> Error during database migration: (pymysql.err.OperationalError) (2013,
> > > > > > >> 'Lost connection to MySQL server during query') [SQL: u'UPDATE volumes
> > > > > > >> SET shared_targets=%(shared_targets)s'] [parameters:
> > > > > > >> {'shared_targets': 1}]
> > > > > > >>
> > > > > > >> # cinder-manage db version
> > > > > > >> 114
> > > > > > >>
> > > > > > >> The db version does not upgrade to queens version 117. Any help would
> > > > > > >> be appreciated.
> > > > > > >>
> > > > > > >> Thank you
> > > > > > >
> > > > > >
> > > > > >
> > > >
> >
>
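
For reference, the ordering suggested in the quoted replies (DB sync before
any cinder-volume restart, online data migrations only once the service is
running again) could be sketched roughly as below. This is an illustration
under assumptions, not a tested procedure: the systemd unit names, the yum
command, the 1-day purge age, and stopping cinder-volume along with the
other services are all assumptions, and upgrade_plan is a made-up helper
that only builds the command list without running anything.

```python
# Sketch only: build (but do not execute) an ordered command list for an
# offline Cinder upgrade. Unit names and the package command are assumptions.
SERVICES = [
    "openstack-cinder-api",
    "openstack-cinder-scheduler",
    "openstack-cinder-volume",
]


def upgrade_plan(purge_age_days=1):
    """Return the commands in the order they would be run (hypothetical)."""
    plan = []
    # Purge soft-deleted rows first so the schema migration touches fewer rows.
    plan.append(["cinder-manage", "db", "purge", str(purge_age_days)])
    # Stop every cinder service before touching packages or the schema.
    plan += [["systemctl", "stop", unit] for unit in SERVICES]
    plan.append(["yum", "-y", "upgrade", "openstack-cinder"])
    # Schema migration runs while all services are still down.
    plan.append(["cinder-manage", "db", "sync"])
    # Only now bring the services back.
    plan += [["systemctl", "start", unit] for unit in SERVICES]
    # Online data migrations run with the services up.
    plan.append(["cinder-manage", "db", "online_data_migrations"])
    return plan


for command in upgrade_plan():
    print(" ".join(command))
```

The point of the ordering is that no cinder-volume start appears before the
db sync step, which is where my original sequence differed.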