[Openstack] Writes are faster than reads in Swift

Jay Pipes jaypipes at gmail.com
Thu Dec 15 16:43:30 UTC 2011

On Wed, Dec 14, 2011 at 5:38 PM, Ewan Mellor <Ewan.Mellor at eu.citrix.com> wrote:
>> Or static disk image files!
> Only if you've got enough RAM on the storage worker node to cache the entire disk image.  Otherwise it's just going to get evicted straight away.

> The case where you've got so few, small, disk images that you can cache them all in RAM must be pretty rare.  And if it's not rare in your use-case, I reckon giving that RAM to your Glance nodes and doing the caching there would be just as good if not better.

There's no reason it couldn't be both. Generally, the Glance API
servers are not installed on the same servers as the Swift object
servers, so caching in both places isn't redundant.

My point is that Swift was not designed for (large) static files
with read-only workloads, and of course that is exactly the use case
for Glance with a Swift backend (where a single container contains few
objects, but they tend to be large). It would be great to be able to
control the disk-flushing behaviour in Swift to account for these
alternate workloads. Certain use cases, such as starting up a lot of
servers across lots of Nova compute nodes from a base image stored in
Swift (read through Glance or not through Glance, it doesn't matter
here...), would benefit from not throwing away the disk blocks that
will be read heavily, over and over again.
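
For what it's worth, the kernel already exposes per-file hints via
posix_fadvise(2), so the switch I'm asking for could boil down to choosing
between POSIX_FADV_DONTNEED (drop the blocks after streaming, the write-once
archival case) and POSIX_FADV_WILLNEED (keep/prefetch them, the hot base-image
case). A rough sketch in Python -- the cache_hot flag is a made-up name for
the hypothetical per-workload switch, not anything Swift actually has today:

```python
import os

def finish_streaming(fd, length, cache_hot=False):
    """Hint the kernel about an object's pages after streaming it.

    cache_hot is a hypothetical per-workload switch: when False we ask
    the kernel to evict the file's pages (fine for write-once archival
    objects); when True we ask it to keep/prefetch them (what you want
    for a base image that many compute nodes will read repeatedly).
    """
    if not hasattr(os, "posix_fadvise"):
        return  # not available on this platform; leave the cache alone
    if cache_hot:
        # Prefetch and keep the pages resident.
        os.posix_fadvise(fd, 0, length, os.POSIX_FADV_WILLNEED)
    else:
        # Tell the kernel we won't need these pages again.
        os.posix_fadvise(fd, 0, length, os.POSIX_FADV_DONTNEED)
```

An object server streams objects in chunks, so in practice the hint would be
issued per chunk or once at close; the point is only that the primitive
exists and is cheap to call.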

Even on my home desktop machine (nowhere near the type of server I
see typically provisioned for Swift), I've got a good deal of buffer
cache available:
jpipes at uberbox:~$ free
             total       used       free     shared    buffers     cached
Mem:      24730152    7997080   16733072          0    2488360    3482836
-/+ buffers/cache:    2025884   22704268
Swap:     25149436          0   25149436

We can simulate a situation where a 2GB image file is available for
streaming through a Swift object server by creating a 2GB file and
cat'ing it to /dev/null. The timing results below show the time taken
to stream the image when the image's disk blocks are in the disk
buffer cache:
jpipes at uberbox:~$ dd if=/dev/zero of=fakeimage bs=1M count=2000
2000+0 records in
2000+0 records out
2097152000 bytes (2.1 GB) copied, 1.25832 s, 1.7 GB/s

jpipes at uberbox:~$ free
             total       used       free     shared    buffers     cached
Mem:      24730152    3821108   20909044          0       9752    2220208
-/+ buffers/cache:    1591148   23139004
Swap:     25149436          0   25149436

jpipes at uberbox:~$ time cat fakeimage > /dev/null

real	0m0.346s
user	0m0.000s
sys	0m0.340s

Now, we simulate the dropping of the disk buffer cache (echo 3 drops
the page cache plus the dentry and inode caches, and needs root):

jpipes at uberbox:~$ echo 3 | sudo tee /proc/sys/vm/drop_caches

And try streaming the image again:

jpipes at uberbox:~$ time cat fakeimage > /dev/null

real	0m8.813s
user	0m0.012s
sys	0m1.360s
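
The same warm-vs-cold gap can be demonstrated without root, and without
flushing every cache on the machine, by evicting just the one file with
POSIX_FADV_DONTNEED. A small Python sketch (file size scaled down from the
2GB used above, so the absolute numbers will differ):

```python
import os
import tempfile
import time

def evict(path):
    """Drop this one file's pages from the page cache (Linux)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # dirty pages can't be dropped, so flush them first
        if hasattr(os, "posix_fadvise"):
            # length 0 means "to the end of the file"
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)

def time_read(path, chunk=1 << 20):
    """Stream the file to nowhere, like `cat file > /dev/null`."""
    start = time.monotonic()
    with open(path, "rb") as f:
        while f.read(chunk):
            pass
    return time.monotonic() - start

if __name__ == "__main__":
    path = os.path.join(tempfile.gettempdir(), "fakeimage")
    with open(path, "wb") as f:      # 64MB stands in for the 2GB file above
        f.write(b"\0" * (64 << 20))
    warm = time_read(path)           # pages just written, likely still cached
    evict(path)
    cold = time_read(path)           # forced back to the disk
    print("warm %.3fs  cold %.3fs" % (warm, cold))
    os.remove(path)
```

This is essentially what a per-object cache-control switch in Swift would be
deciding: whether or not to issue that DONTNEED after serving the object.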

0.346s vs. 8.813s is a pretty massive difference and one that I think
deserves at least a switch to control the behaviour in Swift.

