[openstack-dev] [Fuel-dev] [Openstack-dev] New RA for Galera

Sergii Golovatiuk sgolovatiuk at mirantis.com
Mon Jun 2 13:17:04 UTC 2014


Hi crew,

Thank you for starting this topic. I've already done the research and
started a blueprint. Since we changed our blueprint strategy, I wrote it in
rst format and added it to the Gerrit workflow. Feel free to participate.

https://review.openstack.org/#/c/97191/
http://docs-draft.openstack.org/91/97191/8/check/gate-fuel-specs-docs/a1e7c72/doc/build/html/specs/5.1/pacemaker-galera-resource-agent.html

It's still a draft, as some discussions and tests are still going on.

> The PoC RA gets the GTID from SQL (SHOW STATUS LIKE 'wsrep_last_committed')
> if MySQL is running; otherwise the RA starts mysqld with --wsrep-recover. I
> skipped grastate.dat because in all my tests this file had commit_id set to
> -1.

Percona's way is more robust, as it can restore the state even when the CIB
has become corrupted and the whole cluster has gone down (power outage).
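
To make that fallback order concrete, here is a minimal shell sketch. It is
not code from either RA: the function names are hypothetical, and it assumes
the usual grastate.dat layout ("uuid:" and "seqno:" lines) and that the output
of mysqld --wsrep-recover, with its "WSREP: Recovered position: <uuid>:<seqno>"
line, has been captured in a log file that you pass in. A seqno of -1 in
grastate.dat is expected while mysqld is running or after an unclean shutdown,
which would explain what Bartosz saw in his tests.

```shell
# Hypothetical helpers for illustration only.

# Print "uuid:seqno" from a grastate.dat file; fail if a field is missing.
gtid_from_grastate() {
    local file="$1" uuid seqno
    [ -r "$file" ] || return 1
    uuid=$(awk '$1 == "uuid:" { print $2 }' "$file")
    seqno=$(awk '$1 == "seqno:" { print $2 }' "$file")
    [ -n "$uuid" ] && [ -n "$seqno" ] || return 1
    printf '%s:%s\n' "$uuid" "$seqno"
}

# Print the last "Recovered position" GTID found in a --wsrep-recover log.
gtid_from_recover_log() {
    local log="$1" gtid
    [ -r "$log" ] || return 1
    gtid=$(sed -n 's/.*WSREP: Recovered position:[[:space:]]*//p' "$log" | tail -n 1)
    [ -n "$gtid" ] || return 1
    printf '%s\n' "$gtid"
}

# Use grastate.dat unless its seqno is -1, then fall back to the recovery log.
saved_gtid() {
    local grastate="$1" log="$2" gtid
    if gtid=$(gtid_from_grastate "$grastate") && [ "${gtid##*:}" != "-1" ]; then
        printf '%s\n' "$gtid"
        return 0
    fi
    gtid_from_recover_log "$log"
}
```

Both sources are plain files on the node's own disk, so this path does not
depend on the CIB at all.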

~Sergii




On Mon, Jun 2, 2014 at 3:09 PM, Bartosz Kupidura <bkupidura at mirantis.com>
wrote:

> Vladimir,
>
>
> On 2 Jun 2014, at 13:49, Vladimir Kuklin <vkuklin at mirantis.com> wrote:
>
> > Bartosz, if you look at what the Percona guys are doing, you will see here:
> > https://github.com/percona/percona-pacemaker-agents/blob/new_pxc_ra/agents/pxc_resource_agent#L516
> > that they first try to use MySQL and then get the GTID from grastate.dat.
> > Also, I am wondering whether you are using cluster-wide attributes instead
> > of node attributes. If you use node-scoped attributes, then shadow/commit
> > commands should not affect anything.
>
> The PoC RA gets the GTID from SQL (SHOW STATUS LIKE 'wsrep_last_committed')
> if MySQL is running; otherwise the RA starts mysqld with --wsrep-recover. I
> skipped grastate.dat because in all my tests this file had commit_id set to
> -1.
>
> In the PoC I use only node attributes (crm_attribute --node $HOSTNAME
> --lifetime forever --name gtid --update $GTID).
>
> >
> >
> > On Mon, Jun 2, 2014 at 2:34 PM, Bogdan Dobrelya <bdobrelia at mirantis.com>
> wrote:
> > On 05/29/2014 02:06 PM, Bartosz Kupidura wrote:
> > > Hello,
> > >
> > >
> > > On 29 May 2014, at 12:09, Vladimir Kuklin <vkuklin at mirantis.com> wrote:
> > >
> > >> Maybe the problem is that you are using lifetime crm attributes instead
> > >> of 'reboot' ones. We use shadow/commit because we need transactional
> > >> behaviour in some cases. If you turn crm_shadow off, you will run into
> > >> problems with multi-state resources and location/colocation/order
> > >> constraints, so we need to find a way to make commits transactional.
> > >> There are two ways:
> > >> 1) rewrite the corosync providers to use the crm_diff command and apply
> > >> its diff instead of a shadow commit, which can sometimes swallow cluster
> > >> attributes
> > >
> > > In the PoC I removed all cs_commit/cs_shadow calls, and everything
> > > appears to be working. But as you say, this can lead to problems with
> > > more complicated deployments.
> > > This needs to be verified.
> > >
> > >> 2) store 'reboot' attributes instead of lifetime ones
> > >
> > > I tested with both --lifetime forever and --lifetime reboot. It makes no
> > > difference: cs_commit/cs_shadow fails either way.
> > >
> > > Moreover, we need a method to store the GTID permanently (to support a
> > > whole-cluster reboot).
> >
> > Please note, GTID could always be fetched from the
> > /var/lib/mysql/grastate.dat at the galera node
> >
> > > If we want to stick to cs_commit/cs_shadow, we need a method other than
> > > crm_attribute to store the GTID.
> >
> > We could use a modified ocf::pacemaker:SysInfo resource. We could put the
> > GTID there and use it in a similar way to what I did for the fencing PoC[0]
> > (for free space monitoring)
> >
> > [0]
> >
> https://github.com/bogdando/fuel-library-1/blob/ha_fencing_WIP/deployment/puppet/cluster/manifests/fencing_primitives.pp#L41-L70
> >
> > >
> > >>
> > >>
> > >>
> > >> On Thu, May 29, 2014 at 12:42 PM, Bogdan Dobrelya <
> bdobrelia at mirantis.com> wrote:
> > >> On 05/27/14 16:44, Bartosz Kupidura wrote:
> > >>> Hello,
> > >>> Responses inline.
> > >>>
> > >>>
> > >>> On 27 May 2014, at 15:12, Vladimir Kuklin <vkuklin at mirantis.com> wrote:
> > >>>
> > >>>> Hi, Bartosz
> > >>>>
> > >>>> First of all, we are using openstack-dev for such discussions.
> > >>>>
> > >>>> Second, there is also Percona's RA for Percona XtraDB Cluster, which
> > >>>> looks pretty similar, although it is written in Perl. Maybe we could
> > >>>> derive something useful from it.
> > >>>>
> > >>>> Next, if you are working on this stuff, let's make it as open to the
> > >>>> community as possible. There is a blueprint for the Galera OCF script:
> > >>>> https://blueprints.launchpad.net/fuel/+spec/reliable-galera-ocf-script.
> > >>>> It would be awesome if you wrote down the specification and sent a
> > >>>> change request with the newer galera ocf code to fuel-library gerrit.
> > >>>
> > >>> Sure, I will update this blueprint.
> > >>> Change request in fuel-library:
> https://review.openstack.org/#/c/95764/
> > >>
> > >> That is a really nice catch, Bartosz, thank you. I believe we should
> > >> review the new OCF script thoroughly and consider omitting
> > >> cs_commits/cs_shadows as well. What would be the downsides?
> > >>
> > >>>
> > >>>>
> > >>>> Speaking of the crm_attribute stuff: I am very surprised that you are
> > >>>> saying that node attributes are altered by crm shadow commit. We are
> > >>>> using a similar approach in our scripts and have never faced this issue.
> > >>>
> > >>> This is probably because you update crm_attribute very rarely. With my
> > >>> approach the GTID attribute is updated every 60s on every node (3
> > >>> updates in 60s in a standard HA setup).
> > >>>
> > >>> You can try updating any attribute in a loop while deploying the cluster
> > >>> to trigger the failure with the corosync diff.
> > >>
> > >> It sounds reasonable and we should verify it.
> > >> I've updated the statuses for related bugs and attached them to the
> > >> aforementioned blueprint as well:
> > >> https://bugs.launchpad.net/fuel/+bug/1283062/comments/7
> > >> https://bugs.launchpad.net/fuel/+bug/1281592/comments/6
> > >>
> > >>
> > >>>
> > >>>>
> > >>>> Corosync 2.x support is in our roadmap, but we are not sure that we
> will use Corosync 2.x earlier than 6.x release series start.
> > >>>
> > >>> Yeah, moreover corosync CMAP is not synced between cluster nodes (or
> > >>> maybe I'm doing something wrong?). So we need another solution for this...
> > >>>
> > >>
> > >> We should use CMAN for Corosync 1.x, perhaps.
> > >>
> > >>>>
> > >>>>
> > >>>> On Tue, May 27, 2014 at 3:08 PM, Bartosz Kupidura <
> bkupidura at mirantis.com> wrote:
> > >>>> Hello guys!
> > >>>> I would like to start a discussion on a new resource agent for
> > >>>> galera/pacemaker.
> > >>>>
> > >>>> Main features:
> > >>>> * Support cluster bootstrap
> > >>>> * Support rebooting any node in the cluster
> > >>>> * Support rebooting the whole cluster
> > >>>> * To determine which node has the latest DB version, we should use the
> > >>>> galera GTID (Global Transaction ID)
> > >>>> * The node with the latest GTID becomes the galera PC (primary
> > >>>> component) in case of reelection
> > >>>> * The administrator can manually set a node as PC
> > >>>> GTID:
> > >>>> * get the GTID from mysqld --wsrep-recover or the SQL query SHOW STATUS
> > >>>> LIKE 'wsrep_local_state_uuid'
> > >>>> * store the GTID as a crm_attribute for the node (crm_attribute --node
> > >>>> $HOSTNAME --lifetime $LIFETIME --name gtid --update $GTID)
> > >>>> * on every monitor/stop/start action, update the GTID for the given node
> > >>>> * the GTID can have 3 formats:
> > >>>>  - XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX:123 - standard
> > >>>> cluster-id:commit-id
> > >>>>  - XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX:-1 - standard for a
> > >>>> non-initialized cluster, 00000000-0000-0000-0000-000000000000:-1
> > >>>>  - XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX:INF - commit-id manually
> > >>>> set to INF, which forces the RA to create a new cluster with the master
> > >>>> on the given node
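
Inline, on the three GTID formats listed above: a small sketch of how an RA
could tell them apart before acting on the attribute. The function name and
the labels it prints are hypothetical, not taken from the PoC; the only
assumption is the cluster-id:commit-id shape described in the list.

```shell
# Hypothetical helper: classify a GTID string as
# "valid", "uninitialized", "forced" or "invalid".
gtid_kind() {
    local gtid="$1" uuid seqno
    uuid=${gtid%:*}      # everything before the last colon
    seqno=${gtid##*:}    # everything after the last colon

    # The cluster-id must look like a UUID (8-4-4-4-12 hex digits).
    if ! printf '%s\n' "$uuid" |
        grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'; then
        echo invalid
        return 1
    fi

    case "$seqno" in
        INF)          echo forced ;;          # administrator override
        -1)           echo uninitialized ;;   # no usable commit-id
        ''|*[!0-9]*)  echo invalid; return 1 ;;
        *)            echo valid ;;           # ordinary commit-id
    esac
}
```

For example, gtid_kind "$GTID" could be called right after the attribute is
read back, so that a malformed value is rejected instead of compared.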
> > >>>>
> > >>>> Check if reelection of the PC is needed:
> > >>>> * (the node is located in a partition with quorum OR we have only 1
> > >>>> node configured in the cluster) AND the galera resource is not running
> > >>>> on any node
> > >>>> * the GTID is manually set to INF on the given node
> > >>>>
> > >>>> Check if the given node is the PC:
> > >>>> * it has the highest GTID in the cluster; in case we have more than one
> > >>>> node with the "highest" GTID, we use CRC32 to choose the proper PC.
> > >>>> * the GTID is manually set to INF
> > >>>> * in case the node with the highest GTID does not come back after a
> > >>>> cluster reboot (for example after a disk failure), the administrator
> > >>>> should set the GTID to INF on another node
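
Inline, on the PC check above: a rough sketch of the selection rule, under
stated assumptions. The function name and the "node gtid" input format are
hypothetical. The thread does not say what the CRC32 is computed over, so
this assumes the node name, and it uses the POSIX cksum CRC as a stand-in
for whichever CRC32 the RA really uses. It also assumes all nodes share one
cluster-id and compares commit-ids only.

```shell
# Hypothetical helper: read "node uuid:seqno" lines on stdin and print the
# node that should become PC. Highest commit-id wins, INF beats any number,
# and ties are broken by the checksum of the node name.
pick_pc() {
    local node gtid seqno crc
    while read -r node gtid; do
        [ -n "$node" ] && [ -n "$gtid" ] || continue
        seqno=${gtid##*:}
        # Map INF to a value larger than any realistic commit-id.
        [ "$seqno" = "INF" ] && seqno=9223372036854775807
        crc=$(printf '%s' "$node" | cksum | cut -d ' ' -f 1)
        printf '%s %s %s\n' "$seqno" "$crc" "$node"
    done | sort -k1,1n -k2,2n | tail -n 1 | cut -d ' ' -f 3
}
```

Since every node evaluates the same inputs the same way, each node should
reach the same answer without extra coordination, which is the point of
using a checksum as the tie-breaker.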
> > >>>>
> > >>>> I have an almost-ready RA: http://zynzel.spof.pl/mysql-wss
> > >>>>
> > >>>> Tested with vanilla centos galera/pacemaker/corosync - OK
> > >>>> Tested with Fuel 4.1 - Fail
> > >>>>
> > >>>>
> > >>>> Fuel 4.1 with that RA will not deploy correctly, because we use
> > >>>> crm_attribute to store the GTID, and in the manifests we use
> > >>>> cs_shadow/cs_commit for every pacemaker resource.
> > >>>> This leads to a cs_commit problem, because the shadow copy and the
> > >>>> running configuration differ (the running config is changed by the RA):
> > >>>> "Could not commit shadow instance [..] to the CIB: Application of
> > >>>> an update diff failed"
> > >>>>
> > >>>> To solve this we can go 2 ways:
> > >>>> 1) don't use cs_commit/cs_shadow in manifests
> > >>>> 2) store the GTID some other way than crm_attribute
> > >>>>
> > >>>> IMHO 2) is better (less invasive), and we can store the GTID in corosync
> > >>>> CMAP (http://www.polarhome.com/service/man/generic.php?qf=corosync-cmapctl),
> > >>>> but this requires corosync 2.X
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> Yours Faithfully,
> > >>>> Vladimir Kuklin,
> > >>>> Fuel Library Tech Lead,
> > >>>> Mirantis, Inc.
> > >>>> +7 (495) 640-49-04
> > >>>> +7 (926) 702-39-68
> > >>>> Skype kuklinvv
> > >>>> 45bk3, Vorontsovskaya Str.
> > >>>> Moscow, Russia,
> > >>>> www.mirantis.com
> > >>>> www.mirantis.ru
> > >>>> vkuklin at mirantis.com
> > >>>
> > >>>
> > >>
> > >>
> > >> --
> > >> Best regards,
> > >> Bogdan Dobrelya,
> > >> Skype #bogdando_at_yahoo.com
> > >> Irc #bogdando
> > >>
> > >>
> > >>
> > >
> > >
> > >
> >
> >
>
>