[Openstack] Study of Swift performance degradation during drive failure
John Dickinson
me at not.mn
Thu Sep 6 16:01:55 UTC 2018
On 5 Sep 2018, at 22:08, Sameer Kulkarni wrote:
> Hi All,
>
> We are trying to understand and study how Swift handles drive
> failures. From the book we have learnt that a drive failure triggers
> replication by default, whereas a node failure doesn't. We are trying
> to study the performance impact of this replication on the handoff
> nodes.
>
> If, during the replication of an entire partition P to one of the
> handoff nodes N1, an object is uploaded with one of its 3 replicas
> destined for node N1, is one operation going to have a higher
> priority? I.e., does a normal upload operation take priority over the
> replication that is in progress, or does it wait for the replication
> to complete?
>
> Also, in the above scenario I do not believe the user experiences
> much performance degradation, as the proxy server would have received
> a quorum of successful responses from the other 2 nodes. This brings
> us to our next question: what would be the simplest way to quantify
> the performance degradation due to a drive failure (maybe multiple)
> on a Swift setup, using as few drives as possible?
>
> Any help or pointers would be appreciated.
>
> Thank you.
Some very short answers:

No, Swift does not automatically prioritize one type of operation over
another, although there are config settings that operators may adjust to
balance background tasks and client requests. I would love for Swift to
be able to do this, and we're slowly working towards that goal with a
few ongoing pieces of work.

There is likely no simple way to quantify performance degradation due to
hardware failure. That's the "fun" of distributed systems. It depends
too much on specifics of the hardware, the current workload, and the
particular characteristics of the failure. I cannot give you a general
answer. Normally deployers will run benchmarks against their cluster
under different circumstances to measure actual impact of expected
failure modes.
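
As a rough illustration of that kind of benchmark, here is a minimal Python sketch that times a repeated operation and compares two runs, for example one against a healthy cluster and one with a drive unmounted. It is not part of Swift or of any Swift tooling: every function name below is hypothetical, and fake_upload is only a placeholder that you would replace with a real client request (such as an object PUT) against your own cluster.

```python
import statistics
import time


def measure_latencies(operation, count):
    """Call operation() count times; return each call's duration in seconds."""
    durations = []
    for _ in range(count):
        start = time.perf_counter()
        operation()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the median and the nearest-rank 95th percentile of a run."""
    ordered = sorted(durations)
    # Nearest-rank percentile: rank = ceil(0.95 * n), in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return {"median": statistics.median(ordered), "p95": ordered[rank - 1]}


def relative_change(baseline, degraded):
    """Fractional change of each statistic relative to the baseline run."""
    return {
        key: (degraded[key] - baseline[key]) / baseline[key]
        for key in baseline
    }


def fake_upload():
    # Hypothetical placeholder: swap in a real request to your cluster here.
    time.sleep(0.001)


if __name__ == "__main__":
    # Run once in each cluster state you care about, then compare.
    healthy = summarize(measure_latencies(fake_upload, 20))
    failed_drive = summarize(measure_latencies(fake_upload, 20))
    print("healthy:", healthy)
    print("failed drive:", failed_drive)
    print("relative change:", relative_change(healthy, failed_drive))
```

The numbers only mean something if both runs use the same workload, object sizes, and concurrency, and if the degraded run covers the period while replication to the handoff nodes is still in progress.
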
--John