[openstack-dev] Why doesn't Swift cache object data?

John Dickinson me at not.mn
Thu Jun 25 23:01:41 UTC 2015

You're right. Caching object data is one way to really speed up reads of content that is stored in Swift and accessed frequently. Oftentimes, deployers use existing tools like squid, varnish, or a CDN to do that.

But that still leaves the question "why don't we cache the object data in Swift?". The short answer is that we sacrifice potential end-user time-to-first-byte gains in favor of reliability in the overall system.

The longer answer requires a bit more explanation of how Swift is put together.

Think of Swift as having 3 layers: the proxy, which accepts API requests and coordinates data with the storage servers; the storage servers, which persist data; and the actual storage media (i.e. hard drives). Each hard drive that is plugged into a storage server is formatted with a local filesystem (e.g. XFS). Swift lays out the data across the filesystems in the cluster such that one object ends up being one file per replica on a filesystem somewhere (that's for replication; erasure codes are slightly different).
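
To make the "one file per replica" layout concrete, here is a toy sketch in Python. It is not Swift's ring or its real on-disk path format: the function name place_replicas, the drive names, and the /srv/node/<drive>/objects/ path shape are all made up for illustration, and the placement rule (hash the name, then take consecutive drives) is a deliberately simple stand-in.

```python
import hashlib


def place_replicas(object_path, drives, replicas=3):
    """Return one hypothetical file path per replica, each on a distinct drive.

    Toy placement only: hash the object's API path and take `replicas`
    consecutive drives starting at the hashed position.
    """
    if replicas > len(drives):
        raise ValueError("need at least as many drives as replicas")
    digest = hashlib.sha256(object_path.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(drives)
    chosen = [drives[(start + i) % len(drives)] for i in range(replicas)]
    # Each replica is an ordinary file on that drive's local filesystem.
    return [f"/srv/node/{drive}/objects/{digest}.data" for drive in chosen]


if __name__ == "__main__":
    for path in place_replicas("/AUTH_test/photos/cat.jpg",
                               ["sda", "sdb", "sdc", "sdd", "sde"]):
        print(path)
```

The point is only that every object turns into several plain files, so the number of files (and therefore inodes) in the cluster grows with the object count times the replica count.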

As a cluster grows to many hard drives and fills up with more data, each of those individual filesystems on the drives in the cluster gets more and more data on it. And since the data is stored in a local filesystem, that means that there is filesystem overhead there too (inodes, dentries, etc). The reality is that it's not too hard to have more inode entries on a moderately full 6TB drive than available memory on the system. This means that the system page cache can be entirely flushed by simply walking the data. And Swift walks the drives to make sure the data is correct and durably stored in the cluster.
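
A back-of-the-envelope calculation shows how quickly this adds up. Every input below is an assumption picked for illustration, not a measurement from a real cluster: a 6TB drive that is half full, an average object size of 100 KB, and roughly 1 KiB of memory per cached inode plus dentry. The helper name metadata_footprint is made up as well.

```python
def metadata_footprint(drive_bytes, fill_ratio, avg_object_bytes,
                       bytes_per_cached_inode):
    """Estimate file count and memory needed to keep all inodes cached.

    Returns (number_of_files, memory_in_GiB). All inputs are assumptions
    supplied by the caller.
    """
    files = int(drive_bytes * fill_ratio / avg_object_bytes)
    gib = files * bytes_per_cached_inode / 2**30
    return files, gib


if __name__ == "__main__":
    # Assumed: 6 TB drive, 50% full, 100 KB objects, ~1 KiB per cached inode.
    files, gib = metadata_footprint(6 * 10**12, 0.5, 100 * 10**3, 1024)
    print(f"{files:,} files, about {gib:.1f} GiB of metadata per drive")
```

Under those assumptions a single drive holds 30 million files and would want somewhere around 28 to 29 GiB just for metadata. Multiply that by the number of drives in a storage server and it's easy to see how the metadata alone can exceed the memory in the box.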

Since the health of the data and recovery from hardware failure are directly dependent on Swift's ability to walk the data, it's important that we prioritize filesystem metadata over object data in the page cache. Therefore we fadvise() the system to drop the object data from the cache as soon as we're done reading object data off the drive. That keeps more space available in the storage nodes' memory for the filesystem metadata, which in turn helps keep the cluster happy.
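
As a minimal sketch of that idea (not Swift's actual code), here is how a reader could stream a file and then tell the kernel it no longer needs the cached pages. It uses Python's os.posix_fadvise with POSIX_FADV_DONTNEED, which is available on Linux; the function name read_and_drop_cache and the chunk size are made up for the example. The advice is only a hint, so the kernel is free to ignore it.

```python
import os


def read_and_drop_cache(path, chunk_size=65536):
    """Yield the file's contents in chunks, then advise the kernel to drop
    the file's data from the page cache."""
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            while True:
                chunk = os.read(fd, chunk_size)
                if not chunk:
                    break
                yield chunk
        finally:
            # offset=0 with length=0 covers the whole file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

A caller would consume it like any other iterator, e.g. `for chunk in read_and_drop_cache("/path/to/object.data"): ...`, and the hint is issued once the read finishes or the iterator is closed.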

The other option would be to cache the data on the proxy server. And sure, I guess we could do that, but it would end up meaning that we (the Swift devs) would implement some kind of cache system, and at that point, why not use an existing one that's specialized for that purpose and does a great job (e.g. varnish/squid)?

So that's why we don't cache object data in Swift. Great question!


> On Jun 25, 2015, at 1:05 PM, 杨苏立 Yang Su Li <yangsuli at gmail.com> wrote:
> Hi,
> I have noticed that even though account/container information is cached using memcached in Swift, it doesn't cache any actual object data.
> Could someone enlighten me what's the consideration behind this decision? Because it seems like it might be useful...
> Thanks a lot
> Suli
> --
> Suli Yang
> Department of Physics
> University of Wisconsin Madison
> 4257 Chamberlin Hall
> Madison WI 53703
