[all][tc][Edge][FEMDC][tripleo][akraino][starlingx] Chronicles Of A Causal Consistency

Bogdan Dobrelya bdobreli at redhat.com
Wed Dec 5 11:08:56 UTC 2018


Background.
Fact 0. Edge MVP reference architectures are limited to a single control 
plane that uses a central/global data backend, in the usual sense of 
boring Cloud computing.
Fact 1. Edge clouds in the Fog computing world are WAN-distributed. Far 
and middle-level tiers may communicate with their control planes over 
high-latency (~50/100 ms or more) connections.
Fact 2. Post-MVP phases [0] of future reference architectures for Edge 
imply high autonomy of edge sites (aka cloudlets [1][2]), which means 
having multiple control planes that always serve CRUD operations locally 
and replicate shared state asynchronously, only when "uplinks" are 
available, if available at all.
Fact 3. Distributed Compute Node in the post-MVP phases represents a 
multi-tiered star topology with middle-layer control planes aggregating 
thousands of computes at far edge sites and serving CRUD operations for 
those locally and fully autonomously from the upper aggregation edge 
layers [3]. Those in turn might aggregate tens of thousands of computes 
via tens/hundreds of such middle layers. And finally, there may be a 
central site, or a few, that want some data and metrics from all of the 
aggregation edge layers under their control, or that push deployment 
configuration downhill through all of the layers.
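
To make Fact 2 a bit more tangible, here is a minimal Python sketch of 
an edge site that always serves writes locally and ships them upstream 
only when the uplink happens to be available. Everything in it (the 
EdgeSite class, the outbox queue, the uplink_up flag) is a hypothetical 
illustration, not an existing OpenStack or Kubernetes API.

```python
from collections import deque


class EdgeSite:
    """Hypothetical edge control plane: serves CRUD locally, replicates later."""

    def __init__(self, name):
        self.name = name
        self.store = {}          # local state, always writable
        self.outbox = deque()    # updates not yet shipped upstream

    def put(self, key, value):
        # The write completes locally, whether or not the uplink is up.
        self.store[key] = value
        self.outbox.append((key, value))

    def get(self, key):
        return self.store.get(key)

    def sync(self, upstream, uplink_up):
        # Ship queued updates only when the uplink is available.
        if not uplink_up:
            return 0
        shipped = 0
        while self.outbox:
            key, value = self.outbox.popleft()
            # Namespace by site name so sites do not overwrite each other.
            upstream.store[(self.name, key)] = value
            shipped += 1
        return shipped


far = EdgeSite("far-edge-1")
middle = EdgeSite("aggregation-1")
far.put("vm-1", "ACTIVE")           # succeeds with no uplink at all
far.sync(middle, uplink_up=False)   # nothing leaves the site
far.sync(middle, uplink_up=True)    # queued update reaches the middle layer
```

Note that the sketch says nothing about the order in which updates from 
different sites become visible upstream, which is exactly where the 
choice of a consistency model comes in.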

Reality check.
That said, the given facts 1-3 contradict the strongly consistent data 
backends supported in today's OpenStack (oslo.db), and in Kubernetes as 
well. That means that neither of the two IaaS/PaaS solutions is ready 
for the future post-MVP phases of Edge as of yet. That also means that 
both will need a new data backend, with weaker consistency, to pass the 
future reality check. If you're interested in formal proofs of that 
claim, please see the sources [4][5][6][7][8]. A [tl;dr] of those:

a) It is known that causal consistency is the best fit for high-latency, 
high-scale clusters with highly dynamic membership
b) "it is significantly harder to implement causal consistency than 
eventual consistency. This explains the fact why there is not even a 
single commercial database system that uses causal consistency" [6]
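
For those who have not met causal consistency before, below is a minimal 
Python sketch of the textbook vector-clock approach to causal delivery: 
a replica buffers a remote write until every write it causally depends 
on has been applied locally. This is not the algorithm of any of the 
papers [4][5][6][7][8]; the CausalReplica class and its message format 
are hypothetical, and conflict resolution for concurrent writes to the 
same key is left out on purpose.

```python
class CausalReplica:
    """Hypothetical replica that applies remote writes in causal order."""

    def __init__(self, name, peers):
        self.name = name
        self.clock = {p: 0 for p in peers}  # vector clock, one counter per replica
        self.store = {}
        self.pending = []                   # remote writes waiting on dependencies

    def write(self, key, value):
        # A local write is applied at once and stamped with the updated clock.
        self.clock[self.name] += 1
        self.store[key] = value
        return {"origin": self.name, "clock": dict(self.clock),
                "key": key, "value": value}

    def _deliverable(self, msg):
        origin, ts = msg["origin"], msg["clock"]
        # It must be the next write from its origin, and everything else
        # it has seen must already be applied here.
        return (ts[origin] == self.clock[origin] + 1 and
                all(ts[p] <= self.clock[p] for p in ts if p != origin))

    def receive(self, msg):
        self.pending.append(msg)
        progress = True
        while progress:                     # one delivery may unblock others
            progress = False
            for m in list(self.pending):
                if self._deliverable(m):
                    self.store[m["key"]] = m["value"]
                    self.clock[m["origin"]] = m["clock"][m["origin"]]
                    self.pending.remove(m)
                    progress = True


sites = ["a", "b", "c"]
a, b, c = (CausalReplica(n, sites) for n in sites)
m1 = a.write("net-1", "created")
b.receive(m1)
m2 = b.write("port-1", "attached to net-1")  # causally depends on m1
c.receive(m2)   # arrives first over a slow WAN: buffered, not applied
c.receive(m1)   # now both are applied, in causal order
```

Replica c never exposes the port before the network it is attached to, 
even though the messages arrived out of order. Plain eventual 
consistency gives no such guarantee.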

Challenge accepted!
What can we, as the OpenStack community, perhaps joined by the 
Kubernetes/OSF/CNCF communities, do for the bright Edge future to make 
things pass that reality check?
It's time to start thinking about it early, before we face the post-MVP 
phases for Edge, IMO. That is also being discussed in the neighbouring 
topic [9], and I'm also trying to position it as a challenge in that 
very high-level draft paper [10]. As for potential steps on the way to 
implementing/adopting such a causal data backend, in OpenStack at least, 
we should start looking into papers like [4][5][6][7][8] (or even [11], 
why not have a FS for that?), and probably more of them, as a 
"theoretical background".

[0] https://wiki.openstack.org/w/index.php?title=OpenStack_Edge_Discussions_Dublin_PTG#Features_2
[1] https://github.com/State-of-the-Edge/glossary/blob/master/edge-glossary.md#cloudlet
[2] https://en.wikipedia.org/wiki/Cloudlet
[3] https://github.com/State-of-the-Edge/glossary/blob/master/edge-glossary.md#aggregation-edge-layer
[4] http://www.bailis.org/papers/bolton-sigmod2013.pdf
[5] http://www.cs.princeton.edu/~wlloyd/papers/eiger-nsdi13.pdf
[6] https://www.ronpub.com/OJDB_2015v2i1n02_Elbushra.pdf
[7] http://www.cs.cornell.edu/lorenzo/papers/cac-tr.pdf
[8] https://www.cs.cmu.edu/~dga/papers/cops-sosp2011.pdf
[9] http://lists.openstack.org/pipermail/openstack-discuss/2018-December/000492.html
[10] https://github.com/bogdando/papers-ieee/blob/master/ICFC-2019/LaTeX/position_paper_1570506394.pdf
[11] http://rainbowfs.lip6.fr/data/RainbowFS-2016-04-12.pdf

-- 
Best regards,
Bogdan Dobrelya,
Irc #bogdando


