[openstack-dev] [Fuel][MySQL][DLM][Oslo][DB][Trove][Galera][operators] Multi-master writes look OK, OCF RA and more things

Mike Bayer mbayer at redhat.com
Sat Apr 30 13:19:51 UTC 2016



On 04/30/2016 02:57 AM, bdobrelia at mirantis.com wrote:
> Hi Roman.
> That's interesting, although it's hard to believe (there is no slave lag in
> Galera multi-master). I can only suggest that we create another Jepsen
> test to verify exactly the scenario you describe, as well as other
> OpenStack-specific patterns.
>

There is definitely slave lag in Galera, and it can be controlled using
the wsrep_causal_reads flag.

A demonstration script, whose results I have confirmed separately using
Python scripts, is at:

https://www.percona.com/blog/2013/03/03/investigating-replication-latency-in-percona-xtradb-cluster/
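
To make the lag concrete, here is a rough sketch of the behaviour in plain
Python. It is a toy in-memory model, not Galera code: the Node and Cluster
classes and the sync_wait parameter are hypothetical names invented for this
illustration, and the only assumption carried over from Galera is that a
commit is applied on the writing node right away while the other nodes queue
the writeset and apply it later. A read with sync_wait=True drains that queue
first, which is roughly what a causal read buys you.

```python
from collections import deque


class Node:
    """Toy node: applied rows plus a queue of received-but-unapplied writesets."""

    def __init__(self):
        self.rows = set()
        self.pending = deque()

    def apply_one(self):
        # Apply the oldest queued writeset, if there is one.
        if self.pending:
            self.rows.add(self.pending.popleft())

    def read(self, key, sync_wait=False):
        # With sync_wait, catch up on everything queued before answering.
        if sync_wait:
            while self.pending:
                self.apply_one()
        return key in self.rows


class Cluster:
    def __init__(self, size):
        self.nodes = [Node() for _ in range(size)]

    def commit(self, writer, key):
        # The writing node applies immediately; the others only queue the change.
        for index, node in enumerate(self.nodes):
            if index == writer:
                node.rows.add(key)
            else:
                node.pending.append(key)


cluster = Cluster(2)
cluster.commit(writer=0, key="floating-ip")

# Plain read on the other node, before it has applied the writeset.
stale = cluster.nodes[1].read("floating-ip")
# Same read, but waiting for the node to catch up first.
causal = cluster.nodes[1].read("floating-ip", sync_wait=True)

print("read without waiting:", stale)
print("read with waiting:", causal)
```

The first read misses the row even though the commit has already succeeded on
node 0; the second one sees it, at the cost of waiting for the apply queue.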



> Regards,
> Bogdan.
>
> *From:* Roman Podoliaka <mailto:rpodolyaka at mirantis.com>
> *Sent:* Friday, 29 April 2016 21:04
> *To:* OpenStack Development Mailing List (not for usage questions)
> <mailto:openstack-dev at lists.openstack.org>
> *Cc:* openstack-operators at lists.openstack.org
> <mailto:openstack-operators at lists.openstack.org>
>
> Hi Bogdan,
>
> Thank you for sharing this! I'll need to familiarize myself with this
> Jepsen thing, but overall it looks interesting.
>
> As it turns out, we already run Galera in multi-writer mode in Fuel
> unintentionally. When the active MySQL node goes down, HAProxy starts
> opening connections to a backup. When the active node comes back up,
> HAProxy starts opening connections to the original MySQL node again,
> but OpenStack services may still hold connections to the backup in
> their connection pools. So now you may have connections to multiple
> MySQL nodes at the same time, which is exactly what you wanted to
> avoid by using active/backup in the HAProxy configuration.
>
> ^ this actually leads to an interesting issue [1], where the DB state
> committed on one node is not immediately available on another one.
> Replication lag can be controlled via session variables [2], but that
> does not always help. E.g. in [1] Nova first goes to Neutron to create
> a new floating IP, gets 201 (and Neutron actually *commits* the DB
> transaction), and then makes another REST API request to get a list of
> floating IPs by address. The latter can be served by another
> neutron-server, connected to another Galera node, which does not have
> the latest state applied yet due to 'slave lag', so the list may come
> back empty. Unfortunately, 'wsrep_sync_wait' can't help here, as these
> are two different REST API requests, potentially served by two
> different neutron-server instances.
>
> Basically, you'd need to *always* wait for the latest state to be
> applied before executing any queries, which Galera is trying to avoid
> for performance reasons.
>
> Thanks,
> Roman
>
> [1] https://bugs.launchpad.net/fuel/+bug/1529937
> [2]
> http://galeracluster.com/2015/06/achieving-read-after-write-semantics-with-galera/
>
> On Fri, Apr 22, 2016 at 10:42 AM, Bogdan Dobrelya
> <bdobrelia at mirantis.com> wrote:
>  > [crossposting to openstack-operators at lists.openstack.org]
>  >
>  > Hello.
>  > I wrote this paper [0] to demonstrate how we can leverage the
>  > Jepsen framework in a QA/CI/CD pipeline for OpenStack projects like Oslo
>  > (DB) or Trove, Tooz DLM, and perhaps any integration projects that
>  > rely on distributed systems. Although not all tests are finished yet,
>  > the results are already quite visible, so I'd better share early for
>  > review, discussion and comments.
>  >
>  > I have done similar tests for the RabbitMQ OCF RA clusterers as well,
>  > although I have yet to write a report.
>  >
>  > PS. I'm sorry for the many tags I placed in the topic header; should I
>  > have used just "all" :) ? Have a nice weekend and take care!
>  >
>  > [0] https://goo.gl/VHyIIE
>  >
>  > --
>  > Best regards,
>  > Bogdan Dobrelya,
>  > Irc #bogdando
>  >
>  >
>  >
>  >
> __________________________________________________________________________
>  > OpenStack Development Mailing List (not for usage questions)
>  > Unsubscribe:
> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>  > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


