[openstack-dev] the performance degradation of swift PUT
wuse.kalrey at gmail.com
Sat Aug 3 17:34:00 UTC 2013
I'm new to Swift. I ran some benchmarks against Swift last week and the results were disappointing.
When I PUT a large number of small files (4KB) under high concurrency, PUT performance degrades.
The PUT rate reaches 2000/s at the beginning, but it drops to 600/s after one minute. It finally settles at about 100/s, and some errors such as '503' occur. But after I flushed all the disks in the cluster, it went back up to 2000/s.
I also ran a GET benchmark in the same environment, and it works very well (5000/s).
Here is some information that may be useful:
1 proxy-node : 128GB-ram / CPU 16core / 1Gb NIC*1
5 Storage-nodes : each for 128GB-ram / CPU 16core / 2TB*4 / 1Gb NIC*1.
concurrency = 200
object_size = 4096
num_objects = 2000000
num_containers = 200
I have traced the code of the PUT operation to find out what causes the performance degradation while putting objects. Some of the code in ObjectController::PUT (swift/obj/server.py) takes a long time:
> for chunk in iter(lambda: reader(self.network_chunk_size), ''):
      start_time = time.time()
>     upload_size += len(chunk)
>     if time.time() > upload_expiration:
>         return HTTPRequestTimeout(request=request)
>     while chunk:
>         written = os.write(fd, chunk)
>         chunk = chunk[written:]
Each call to 'lambda: reader' takes 600ms on average, and each 'sleep()' call takes 500ms. 'fsync' also takes a lot of time when the file is finally flushed to disk; I have already removed it, just for testing. I think these times are far too long.
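Per-stage timings like these can be collected with a loop of the following shape. This is only a standalone sketch in Python 3, not Swift's code: 'timed_put' and its parameters are hypothetical names, and it times a plain reader callable rather than a real WSGI request body.

```python
import io
import os
import tempfile
import time


def timed_put(reader, path, chunk_size=65536, do_fsync=True):
    """Copy chunks from reader(size) into path, timing each stage.

    Hypothetical helper for illustration, not part of Swift.
    Returns (bytes_written, timings), where timings maps
    'read', 'write' and 'fsync' to accumulated seconds.
    """
    timings = {'read': 0.0, 'write': 0.0, 'fsync': 0.0}
    upload_size = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while True:
            # Time how long we wait for the next chunk from the client.
            start = time.time()
            chunk = reader(chunk_size)
            timings['read'] += time.time() - start
            if not chunk:
                break
            upload_size += len(chunk)
            # Time the write loop; os.write may write only part of a chunk.
            start = time.time()
            while chunk:
                written = os.write(fd, chunk)
                chunk = chunk[written:]
            timings['write'] += time.time() - start
        if do_fsync:
            # Time the final flush to disk separately.
            start = time.time()
            os.fsync(fd)
            timings['fsync'] += time.time() - start
    finally:
        os.close(fd)
    return upload_size, timings


if __name__ == '__main__':
    body = io.BytesIO(b'x' * 4096)  # one 4KB object, as in the benchmark
    with tempfile.TemporaryDirectory() as tmp:
        size, timings = timed_put(body.read, os.path.join(tmp, 'obj.data'))
    print(size, timings)
```

Keeping the read, write and fsync timers separate shows whether the time goes into waiting for data from the proxy or into the local disk.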
I monitored the cluster's resources while putting objects. Bandwidth usage was very low and the CPU load was very light.
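As a rough sanity check on the bandwidth figure, the object payload at the observed PUT rates can be computed from the settings above. This sketch counts payload bytes only; it ignores HTTP headers and any replication traffic, so real traffic is higher than these numbers.

```python
OBJECT_SIZE = 4096      # bytes, from the benchmark settings
NUM_OBJECTS = 2000000   # from the benchmark settings
NIC_MBIT = 1000         # one 1Gb NIC per node


def payload_mbit_per_s(puts_per_s, object_size=OBJECT_SIZE):
    """Object payload rate in Mbit/s (payload only, no headers or replicas)."""
    return puts_per_s * object_size * 8 / 1e6


for rate in (2000, 600, 100):
    mbit = payload_mbit_per_s(rate)
    print('%5d PUT/s -> %7.3f Mbit/s (%.1f%% of one NIC)'
          % (rate, mbit, 100.0 * mbit / NIC_MBIT))

print('total payload: %.3f GB' % (NUM_OBJECTS * OBJECT_SIZE / 1e9))
```

Even at the peak rate the payload is a small fraction of a 1Gb link, which fits the low bandwidth usage I see.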
I have tried setting vfs_cache_pressure to a low value, but it does not seem to help.
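To rule out the setting simply not being applied, the current value can be read back on each storage node. A minimal sketch, assuming a Linux node where the sysctl is exposed at /proc/sys/vm/vfs_cache_pressure; 'read_sysctl_int' is a hypothetical helper name.

```python
VFS_CACHE_PRESSURE = '/proc/sys/vm/vfs_cache_pressure'


def read_sysctl_int(path=VFS_CACHE_PRESSURE):
    """Return the integer stored in a /proc/sys file, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Missing file (non-Linux host), no permission, or unexpected content.
        return None


if __name__ == '__main__':
    print('vm.vfs_cache_pressure = %r' % read_sysctl_int())
```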
Does anyone have advice on how to track down the problem?