[Openstack-operators] Lets talk capacity monitoring

George Shuklin george.shuklin at gmail.com
Thu Jan 15 23:08:56 UTC 2015

On 01/15/2015 06:43 PM, Jesse Keating wrote:
> We have a need to better manage the various openstack capacities 
> across our numerous clouds. We want to be able to detect when capacity 
> of one system or another is approaching the point where it would be a 
> good idea to arrange to increase that capacity. Be it volume space, 
> VCPU capability, object storage space, etc...
> What systems are you folks using to monitor and react to such things?

In our case we use standard metrics (Ganglia) and monitoring
(Shinken). I have thought about 'capacity planning', but the problem is
that you cannot separate payload from wasted resources. For example,
when a snapshot is created, it eats space on the compute node (in some
configurations) beyond the flavor limits. When an instance boots, _base
is used too (and if the instance boots from a big snapshot, it uses more
space in _base than in /instances). CPU can be heavily used by many
host-internal processes, and memory is shared with management software
(which can be greedy too). IO can be overspent on snapshots and booting.
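
To make the payload-versus-overhead point concrete, here is a minimal
sketch. The function name and the numbers are hypothetical, not taken
from our setup. It compares the disk that flavors account for on one
compute host with what is actually used there; the difference is the
space eaten by things like _base and snapshots.

```python
def disk_overhead_gb(flavor_disk_gb, used_gb):
    """Return disk used on a host beyond what the flavors account for.

    flavor_disk_gb: disk sizes (GB) promised by the flavors of the
                    instances on this host (hypothetical input).
    used_gb:        space actually used on the instances filesystem.
    """
    # Anything above the sum of flavor disks is overhead
    # (e.g. _base images, snapshots being created).
    # Thin-provisioned disks may use less than promised, so clamp at 0.
    return max(0.0, used_gb - sum(flavor_disk_gb))


# Assumed example: three instances with 20, 20 and 40 GB flavors,
# while the host reports 95.5 GB actually used.
overhead = disk_overhead_gb([20, 20, 40], 95.5)
print(overhead)
```

A per-host number like this only shows that overhead exists. It does
not say which part came from _base and which from snapshots.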

So we use cumulative graphs for free space, CPU usage, and memory
usage. This does not cover the flavor/aggregate/pinning-to-host-by-metadata
cases, but overall it gives some feeling for the available free resources.
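
As a rough sketch of what such a cumulative view boils down to: the
HostSample fields, host names, and values below are assumptions for
illustration, not the actual Ganglia metric names. The per-host free
resources are simply summed (or averaged, for CPU) across all compute
hosts.

```python
from dataclasses import dataclass


@dataclass
class HostSample:
    """One metrics sample from a compute host (hypothetical fields)."""
    name: str
    disk_free_gb: float
    mem_free_gb: float
    cpu_idle_pct: float


def cumulative_free(samples):
    """Aggregate free resources across all hosts.

    Disk and memory are summed; CPU idle is averaged, since a
    percentage cannot be meaningfully added across hosts.
    """
    count = len(samples)
    return {
        "disk_free_gb": sum(s.disk_free_gb for s in samples),
        "mem_free_gb": sum(s.mem_free_gb for s in samples),
        "cpu_idle_pct_avg": (
            sum(s.cpu_idle_pct for s in samples) / count if count else 0.0
        ),
    }


# Assumed example data for two hosts.
samples = [
    HostSample("compute1", disk_free_gb=300.0, mem_free_gb=64.0, cpu_idle_pct=80.0),
    HostSample("compute2", disk_free_gb=100.0, mem_free_gb=32.0, cpu_idle_pct=40.0),
]
totals = cumulative_free(samples)
print(totals)
```

The totals hide fragmentation: plenty of free space overall does not
mean any single host can fit a large flavor, which is the same
limitation as with the graphs.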
