[Openstack-operators] Lets talk capacity monitoring
Mathieu Gagné
mgagne at iweb.com
Thu Jan 15 17:25:59 UTC 2015
On 2015-01-15 11:43 AM, Jesse Keating wrote:
> We have a need to better manage the various openstack capacities across
> our numerous clouds. We want to be able to detect when capacity of one
> system or another is approaching the point where it would be a good idea
> to arrange to increase that capacity. Be it volume space, VCPU
> capability, object storage space, etc...
>
> What systems are you folks using to monitor and react to such things?
>
Thanks for bringing up the subject, Jesse.
You are not the only one facing this challenge; I am too.
I added the subject to the midcycle ops meetup (Capacity
planning/monitoring) which I hope to be able to attend:
https://etherpad.openstack.org/p/PHL-ops-meetup
We are using host aggregates and have a complex combination of them
(imagine a Venn diagram).
What we do is retrieve all:
- hypervisor stats
- host aggregates
From there, we compute resource usage (vcpus, ram, disk) in any given
host aggregate.
This part is very challenging as we have to partially reimplement
nova-scheduler logic to determine if a given hypervisor has different
resource allocation ratios based on host aggregate attributes.
The result is a table with resource usage percentages (and absolute
numbers) for each host aggregate (and each combination of aggregates).
Unfortunately, I can't share this first tool yet: my coworker integrated
it very tightly with our internal monitoring tool, and it wouldn't work
outside of it. No promises, but I'll try to find time to extract it and
share it with you guys.
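To illustrate the per-aggregate computation described above, here is a
minimal sketch. It is not the internal tool. It assumes hypervisors and
aggregates have already been fetched and flattened into plain dicts (the
"host", "hosts", "metadata" and "name" keys are a hypothetical layout),
that the default ratios are 16.0 for CPU and 1.5 for RAM, and that when
several aggregates override a ratio for the same host, the lowest value
applies. Check those assumptions against your own scheduler configuration.

```python
# Assumed defaults; real deployments set these in nova.conf.
DEFAULT_RATIOS = {"vcpus": 16.0, "memory_mb": 1.5}
# Aggregate metadata key that overrides each default ratio.
RATIO_KEYS = {
    "vcpus": "cpu_allocation_ratio",
    "memory_mb": "ram_allocation_ratio",
}


def effective_ratio(host, resource, aggregates):
    """Return the allocation ratio for a host, honoring aggregate metadata."""
    key = RATIO_KEYS[resource]
    overrides = [
        float(agg["metadata"][key])
        for agg in aggregates
        if host in agg["hosts"] and key in agg["metadata"]
    ]
    # Assumption: with several overrides, the most conservative one wins.
    return min(overrides) if overrides else DEFAULT_RATIOS[resource]


def aggregate_usage(hypervisors, aggregates):
    """Map (aggregate name, resource) to (used, capacity, percent used)."""
    report = {}
    for agg in aggregates:
        for resource in DEFAULT_RATIOS:
            used = capacity = 0.0
            for hv in hypervisors:
                if hv["host"] not in agg["hosts"]:
                    continue
                ratio = effective_ratio(hv["host"], resource, aggregates)
                used += hv[resource + "_used"]
                capacity += hv[resource] * ratio
            percent = 100.0 * used / capacity if capacity else 0.0
            report[(agg["name"], resource)] = (used, capacity, round(percent, 1))
    return report
```

Note that a host's ratio is resolved across all of its aggregates, not
only the one being reported on, which is where the overlapping
memberships make the numbers tricky.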
We also coded a very primitive tool which takes a flavor name and
computes the available "slots" on each hypervisor (regardless of host
aggregate memberships):
https://gist.github.com/mgagne/bc54c3434a119246a88d
This tool is not actively used in our monitoring because of the
limitation mentioned above: we would again have to partially reimplement
nova-scheduler logic to determine whether a given flavor can be spawned
on a given hypervisor, and filter that hypervisor out of the output if
it can't accept the flavor. Furthermore, it does not take into account
resource allocation ratios based on host aggregates.
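For readers who want the idea without opening the gist, the slot count
can be sketched as below. This is an illustration, not the code from the
gist: the function name is made up, and the hypervisor and flavor are
assumed to be plain dicts carrying the usual hypervisor stats fields
(vcpus, vcpus_used, free_ram_mb, free_disk_gb) and flavor fields (vcpus,
ram, disk). Like the tool described above, it ignores allocation ratios
and aggregate membership.

```python
def flavor_slots(hypervisor, flavor):
    """Return how many instances of the flavor fit in the free resources."""
    pairs = [
        (hypervisor["vcpus"] - hypervisor["vcpus_used"], flavor["vcpus"]),
        (hypervisor["free_ram_mb"], flavor["ram"]),
        (hypervisor["free_disk_gb"], flavor["disk"]),
    ]
    # A zero requirement (for example disk=0) does not constrain the count.
    counts = [free // need for free, need in pairs if need > 0]
    # Free values can be negative on overcommitted hosts; clamp to zero.
    return max(0, min(counts)) if counts else 0
```

The scarcest resource decides the result, so a hypervisor with plenty of
free RAM but little free disk still reports few slots.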
Hopefully, other people will join in and share their tools so we can all
improve our OpenStack operations experience.
--
Mathieu