[openstack-dev] [tc] [all] OpenStack moving both too fast and too slow at the same time
chris.friesen at windriver.com
Fri May 5 22:45:40 UTC 2017
On 05/05/2017 02:04 PM, John Griffith wrote:
> On Fri, May 5, 2017 at 11:24 AM, Chris Friesen <chris.friesen at windriver.com> wrote:
> > On 05/05/2017 10:48 AM, Chris Dent wrote:
> > > Would it be accurate to say, then, that from your perspective the
> > > tendency of OpenStack to adopt new projects willy nilly contributes
> > > to the sense of features winning out over deployment, configuration
> > > and usability issues?
> > Personally I don't care about the new projects...if I'm not using them I can
> > ignore them, and if I am using them then I'll pay attention to them.
> > But within existing established projects there are some odd gaps.
> > Like nova hasn't implemented cold-migration or resize (or live-migration) of
> > an instance with LVM local storage if you're using libvirt.
> > Image properties get validated, but not flavor extra-specs or instance metadata.
> > Cinder theoretically supports LVM/iSCSI, but if you actually try to use it
> > for anything stressful it falls over.
> Oh really?
> I'd love some detail on this. What falls over?
It's been a while since I looked at it, but the main issue was that with LIO as
the iSCSI server there is no automatic traffic shaping/QoS between guests, or
between guests and the host. (There's no iSCSI server process to assign to a
cgroup, for example.)
The throttling in IOPS/Bps is better than nothing, but doesn't really help when
you don't necessarily know what your total IOPS/bandwidth actually is or how
many volumes could get created.
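To illustrate why a fixed per-volume cap is hard to pick, here is a rough sketch of the arithmetic. The function names and all the numbers are hypothetical, chosen only for illustration; nothing here is a Cinder API.

```python
def worst_case_demand(per_volume_iops_cap, volume_count):
    """Total IOPS the backend must absorb if every volume hits its cap."""
    return per_volume_iops_cap * volume_count


def is_oversubscribed(per_volume_iops_cap, volume_count, backend_iops):
    """True when the capped volumes can together exceed what the disks deliver."""
    return worst_case_demand(per_volume_iops_cap, volume_count) > backend_iops


# Hypothetical numbers: a 500 IOPS cap per volume on a backend that can
# sustain roughly 5000 IOPS in total.
fits = is_oversubscribed(500, 10, 5000)      # 10 volumes: 5000 <= 5000
too_many = is_oversubscribed(500, 11, 5000)  # 11 volumes: 5500 > 5000
```

The cap only protects the backend while the volume count stays below a limit that depends on a total you may not know, which is the gap described above.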
So you have one or more guests that are hammering on the disk as fast as they
can, combined with disks on the cinder server that maybe aren't as fast as they
should be, and it ended up slowing down all the other guests. And if the host
is using the same physical disks for things like glance downloads or image
conversion, then a badly-behaved guest can cause performance issues on the host
as well due to IO congestion. And if they fill up the host caches they can even
affect writes to other unrelated devices.
So yes, it wasn't the ideal hardware for the purpose, and there are some tuning
knobs, but in an ideal world we'd be able to reserve some amount/percentage of
bandwidth/IOPS for the host and have the rest shared equally between all active
iSCSI sessions (or unequally via a share allocation if desired).
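As a minimal sketch of the allocation I have in mind (not anything LIO or Cinder provides today; the function name, parameters and session names are all made up for illustration):

```python
def allocate_bandwidth(total, host_reserve_fraction, session_shares):
    """Split `total` between a host reservation and the active iSCSI sessions.

    `session_shares` maps a session name to a relative weight. Equal weights
    give an equal split; unequal weights give a proportional split.
    Returns (host_allocation, {session_name: allocation}).
    """
    if not 0.0 <= host_reserve_fraction <= 1.0:
        raise ValueError("host_reserve_fraction must be between 0 and 1")
    if any(weight < 0 for weight in session_shares.values()):
        raise ValueError("session weights must be non-negative")

    host = total * host_reserve_fraction
    remaining = total - host

    weight_sum = sum(session_shares.values())
    if weight_sum == 0:
        # No active sessions (or all weights zero): nothing to hand out.
        return host, {name: 0.0 for name in session_shares}

    sessions = {
        name: remaining * weight / weight_sum
        for name, weight in session_shares.items()
    }
    return host, sessions


# Hypothetical example: 1000 units of bandwidth, 25% held back for the host,
# and one session given twice the share of the other.
host, sessions = allocate_bandwidth(1000, 0.25, {"vol-a": 1, "vol-b": 2})
```

The point is that the split would be recomputed as sessions come and go, so nobody has to guess a static per-volume limit up front.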