[openstack-dev] [swift] Exploring the feasibility of a dependency approach

John Dickinson me at not.mn
Tue Jun 7 21:30:54 UTC 2016

Below is the entirety of an email thread between myself, Thierry, and Flavio. It goes into detail about Swift's design and the feasibility and potential impact of a "split repos" scenario.

I'm posting this with permission as an FYI, not to reraise discussion.


Forwarded message:

> From: John Dickinson <me at not.mn>
> To: Thierry Carrez <thierry at openstack.org>
> Cc: Flavio Percoco <flaper87 at gmail.com>
> Subject: Re: Exploring the feasibility of a dependency approach
> Date: Wed, 01 Jun 2016 13:58:21 -0700
> On 30 May 2016, at 2:48, Thierry Carrez wrote:
>> John Dickinson wrote:
>>> Responses inline.
>> Thank you for taking the time to write this, it's really helpful (to me at least). I have a few additional comments/questions to make sure I fully understand.
>>>> [...]
>>>> 1. How much sense would a Swift API / Swift engine split make today ?
>>>> [...]
>>> It doesn't make much sense to try to split Swift into an API part and
>>> an engine part because the things the API handles are inextricably
>>> linked to the storage engine itself. In other words, the API handlers
>>> are the implementation of the engine.
>>> Since the API is handling the actual resources that are exposed (ie
>>> the data itself), it also has to handle the "engine" pieces like the
>>> consistency model (when is something "durable"), placement (where
>>> should something go), failure handling (what if hardware in the
>>> cluster isn't available), and durability schemes (replicas, erasure
>>> coding, etc).
>> Right, so knowledge of the data placement algorithm (or the durability constraints) is pervasive across the Swift nodes. The proxy server is, in a way, as low-level as the storage server.
>>> The "engine" in Swift has two logical parts. One part is responsible
>>> for taking a request, making a canonical persistent "key" for it,
>>> handing the data to the storage media, and ensuring that the media has
>>> durably stored the data. The other part is responsible for handling a
>>> client request, finding data in the cluster, and coordinating all
>>> responses from the stuff in the first part.
>>> We call the first part "storage servers" and the second part "proxy
>>> servers". There are three different kinds of storage servers in Swift:
>>> account, container, and object, and each also has several background
>>> daemon processes associated with them. For the rest of this email, I'll
>>> refer to a proxy server and storage servers (or specific account,
>>> container, or object servers).
>>> The proxy server and the storage servers are pluggable. The proxy
>>> server and the storage servers support 3rd party WSGI middleware. The
>>> proxy server has been extended many times in the ecosystem with a lot
>>> of really cool functionality:
>>>   * Swift as an origin server for CDNs
>>>   * Storlets, which allow executable code stored as objects to
>>>     mutate requests and responses
>>>   * Image thumbnails (eg for wikimedia)
>>>   * Genome sequence format conversions, so data coming out of a
>>>     gene sequencer can go directly to swift and be usable by other
>>>     apps in the workflow
>>>   * Media server timestamp to byte offset translator (eg for CrunchyRoll)
>>>   * Caching systems
>>>   * Metadata indexing
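To illustrate the middleware extension point described above, here is a minimal sketch of a generic WSGI middleware in Python. It relies only on the standard WSGI calling convention; the class name, the `X-Example-Plugin` header, and the helper functions are hypothetical and are not taken from Swift's code.

```python
class ExampleHeaderMiddleware:
    """Hypothetical middleware that tags every response with an extra header."""

    def __init__(self, app, header_value="example"):
        self.app = app                    # the next WSGI app in the pipeline
        self.header_value = header_value

    def __call__(self, environ, start_response):
        def tagging_start_response(status, headers, exc_info=None):
            # Append our header, then hand off to the real start_response.
            headers = list(headers) + [("X-Example-Plugin", self.header_value)]
            return start_response(status, headers, exc_info)

        return self.app(environ, tagging_start_response)


def demo_app(environ, start_response):
    """Stand-in for the wrapped server; always returns a fixed body."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"object data"]


def call_wsgi(app, path="/v1/acct/cont/obj"):
    """Invoke a WSGI app once and return (status, headers dict, body)."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body
```

Wrapping `demo_app` as `ExampleHeaderMiddleware(demo_app, "demo")` and passing the result to `call_wsgi` returns the original status and body plus the extra header. Stacking several such wrappers is the general shape of a middleware pipeline.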
>>> The object server also supports different implementations for how it
>>> talks to durable media. The in-repo version has a memory-only
>>> implementation and a generic filesystem implementation. Third-party
>>> implementations support different storage media like Kinetic drives.
>>> If there were to be special optimizations for flash media, this is
>>> where it would go. Inside of the object server, this is abstracted as
>>> a "DiskFile", and extending it is a supported use case for Swift.
>>> The DiskFile is how other full-featured storage systems have plugged
>>> in to Swift. For example, the SwiftOnFile project implements a
>>> DiskFile that handles talking to a distributed filesystem instead of a
>>> local filesystem. This is used for putting Swift on GlusterFS or on
>>> NetApp. It's the same pattern that's used for swift-on-ceph and all of
>>> the other swift-on-* implementations out there. My previous email had
>>> more examples of these.
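The pluggable-backend idea described above can be sketched roughly as follows. This is not Swift's actual DiskFile API: `ObjectBackend`, `MemoryBackend`, `FilesystemBackend`, `ObjectServer`, and `object_key` are hypothetical names, and the SHA-256 key derivation is an assumption chosen only to keep the example self-contained.

```python
import hashlib
import os
import tempfile
from abc import ABC, abstractmethod


def object_key(account, container, obj):
    """Derive a flat, canonical key for an object (assumed scheme)."""
    name = "/".join((account, container, obj)).encode("utf-8")
    return hashlib.sha256(name).hexdigest()


class ObjectBackend(ABC):
    """Hypothetical interface between an object server and its media."""

    @abstractmethod
    def put(self, key, data): ...

    @abstractmethod
    def get(self, key): ...


class MemoryBackend(ObjectBackend):
    """Memory-only backend: nothing survives a restart."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


class FilesystemBackend(ObjectBackend):
    """Generic filesystem backend: one file per object under a root dir."""

    def __init__(self, root):
        self.root = root

    def put(self, key, data):
        # Write to a temp file, fsync, then rename, so a reader never
        # sees a partially written object.
        fd, tmp_path = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, os.path.join(self.root, key))

    def get(self, key):
        with open(os.path.join(self.root, key), "rb") as f:
            return f.read()


class ObjectServer:
    """Request handling stays the same whichever backend is plugged in."""

    def __init__(self, backend):
        self.backend = backend

    def store(self, account, container, obj, data):
        self.backend.put(object_key(account, container, obj), data)

    def fetch(self, account, container, obj):
        return self.backend.get(object_key(account, container, obj))
```

Under these assumptions, `ObjectServer(MemoryBackend())` and `ObjectServer(FilesystemBackend(some_dir))` behave identically from the caller's point of view, which is the property that lets a different storage technology be swapped in behind the same server.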
>> The complaint I heard about DiskFile abstractions is that you pile up two data distribution algorithms: in the case of Ceph, for example, you use the Swift rings only to hand off data distribution to CRUSH at the end, which is like twice the complexity compared to what you actually need. So it's great for Kinetic drives, but not so much for alternate data distribution mechanisms. Is that a fair or a partial complaint ?
> If you run Swift on top of a different system that itself provides durable storage, you'll end up with a lot of complexity and cost for little benefit. This isn't a Swift-specific complaint. If you were to run GlusterFS on an Isilon or HDFS on NetApp volumes, you'd have the same issue. You're layering two different systems that are trying to do the same thing; either you live with the overhead or you end up disabling most of the functionality in one of the two systems. Incidentally, this is similar to the reasons it is inadvisable to run Swift on RAID volumes or on SAN-provisioned volumes.
> You're right. The abstractions in Swift are great for allowing different sorts of local media technology to be used, abstracted, and pooled. It works great for traditional magnetic media, flash, and even stuff like optical or tape storage.
> However, you can still run Swift on whatever and still pass the DefCore definition for OpenStack object storage capabilities.
>>> It looks something like this:
>>> <client> <-> [plugins]<proxy> --- [plugins]<storage>[DiskFile] <-> <media>
>>> So if there were to be any split of Swift, it would be either in front
>>> of the proxy or after the storage server. If it were before the proxy
>>> server, you'd be left with nothing in OpenStack. If you split after
>>> the storage server, you'd end up with exactly what Swift is today in
>>> OpenStack.
>> The way I saw it (possibly) working was to separate the swift-aware part of the proxy into driver code for the swift object storage engine. Something like:
>> <cli> <--> [plug]<proxy>[swiftdriver] -- [plug]<storage>[DF] <-> <media>
>> that would let you plug a completely different object storage engine below the proxy, like
>> <cli> <--> [plug]<proxy>[radosdriver] -- RADOS <-> <media>
>> The [plug]<proxy> *and its various drivers* would live in OpenStack, while the [plug]<storage>[DF] part would be a dependency.
>> That said, I understand from what you're saying it's clearly not the way the Proxy server is written today, and it's also clearly not the way Swift was designed. And since so much of the Swift model lives in the proxy, the two components ("driver" and "engine") would have to evolve together in a way that a split community would hurt.
> That's right. It's not the way the proxy or storage servers are written today, nor is it the way those individual system components (or Swift overall) are designed.
>>>> 2. What drawbacks do you see to a dependency approach ?
>>>> [...]
>>> You're right. I do not support organizing things so that Go code is an
>>> external dependency. All of what's below is assuming we throw away the
>>> answers from above and explores what would happen if the Go code were
>>> to be an external dependency.
>>> The major technical drawbacks have been discussed above. The parts of
>>> the code we're talking about reimplementing in Golang are directly
>>> tied in with the rest of the API implementation. The majority of the
>>> drawbacks to making any Golang code an external dependency are related
>>> to the community impact of that decision.
>>> The Golang bits that would be separated do not constitute a complete
>>> thing on their own, and the parts that are staying as Python would not
>>> be complete without the Golang bits.
>>> We've got a very narrow scope for the Golang code we're planning on
>>> merging. Specifically, our first goal with the Golang implementation
>>> is to speed up replication. To do that, we're looking at Golang code
>>> for the object server process and the object replication daemon (and
>>> probably the object reconstructor daemon for erasure code
>>> functionality shortly thereafter). We're not currently looking at
>>> replacing any other part of the system. We're not rewriting the Python
>>> implementations of the other object daemons. We're not rewriting the
>>> proxy server. We're not rewriting the account or container servers.
>>> Right now we're just looking at speeding up object replication pieces.
>>> That's all that's in the current scope.
>>> If the Golang object server and Golang object replicator processes
>>> were an external dependency, then the rest of the Swift code that
>>> would remain under the OpenStack repo would be incomplete. Yet the
>>> goal would be that the exact same people would write and review code
>>> for that repo too. However, as we all know, many organizations
>>> participate in OpenStack because it is OpenStack. If we carved off
>>> some functionality and moved it to an external dependency, we'd be
>>> splitting the community and probably preventing some core reviewers
>>> from even having their employer's permission to contribute to the
>>> external dependency.
>>> In fact, one of the reasons for integrating the Golang code into
>>> Swift's master branch is to re-unify the community. The current
>>> Golang work has been done by a small subset of the community who have
>>> found themselves working alone and exclusively on the Golang work.
>>> Merging the Golang code into master and having one implementation of
>>> functionality lets us again work together as a whole community.
>>> Causing a community split in the Swift project is exactly the wrong
>>> choice.
>>> One way to think about possible impact is to imagine what life would
>>> be like today had we already done this. If the object server had been
>>> made an external dependency a long time ago, various major features
>>> that have been implemented in Swift over the past few years would have
>>> been very difficult, if not impossible, to do. Features like storage
>>> policies, erasure codes, and encrypted data have all impacted both the
>>> proxy and storage layers in Swift and the communication between them.
>>> Since these parts of Swift are together in the same codebase, it means
>>> we can update both client and server sides of any communication
>>> atomically and make rapid progress. If they were in different
>>> codebases, progress would be much slower and more difficult.
>>> If the object server were an external dependency, other work that we
>>> want to do in the future would be much more difficult. We've discussed
>>> better ways to optimize the on-disk layout, new/different protocols
>>> for internal cluster communication, and various internal feedback
>>> mechanisms to allow smarter management of data. All of these require
>>> changes to both the proxy and to the object server. If they are in the
>>> same repo we can easily update both (in a safe way accounting for
>>> migrations, of course) with one commit and one release.
>>> Overall, if the proposed Golang code were in an external dependency,
>>> we'd have two incomplete projects, a bifurcated contributor community,
>>> and much more difficulty in developing improvements to the system.
>> That makes a lot of sense (avoiding fragmentation inside the Swift community). It seems hard to justify reducing fragmentation risk at OpenStack-level by actively fragmenting a specific so-far-coherent community. But at the same time, introducing golang in Swift will create two (or three) camps anyway: those who grok one language or the other, and those who can be efficient code reviewers for both. You seem confident that this internal fragmentation along language lines won't hurt Swift ?
> I definitely think introducing golang into Swift will impact the community, but I don't think it will be bad. We'll have new people come in to the community who only know Go. We'll have Pythonistas who don't (yet?) know Go. And as time progresses, I expect that most people will become familiar with both.
> But this isn't too different than the situation today. We have people who are really comfortable with updating Swift's ring code and people who are best at dealing with the client SDK. Some are great at getting the crypto work done, while others are good at understanding erasure codes.
> I am confident that multiple languages in Swift will not negatively impact the project. What keeps us unified is a common purpose and goal and not a particular implementation language.
>> Another suggestion that was made on the thread was to develop the key pieces which actually need native performance (or a different I/O behavior) as golang packages and integrate them as Python modules with gopy. Such an approach would keep the external dependency at a minimum and reduce community fragmentation within Swift (be it around language familiarity or in vs. out-of OpenStack "officialness").
>> I understand that Hummingbird is a full server / background process rewrite in golang, so this is a different approach to what you currently have almost ready to merge. But would that alternate approach have been possible ? From Sam's post I infer that it's not just a single operation that needs more performance, but the object server process itself in its juggling between various requests blocked on I/O, so you need the whole process / event loop running in golang (rather than just single atomic operations). Did I get that right ?
>> -- 
>> Thierry Carrez (ttx)