[Openstack-operators] [Scale][Performance] <nodes_with_someting> / compute_nodes ratio experience
Rochelle Grober
rochelle.grober at huawei.com
Thu Nov 19 23:36:17 UTC 2015
Sorry this doesn't thread properly, but cut and pasted out of the digest...
> As providing OpenStack community with understandable recommendations
> and instructions on performant OpenStack cloud deployments is part of
> Performance Team mission, I'm kindly asking you to share your
> experience on safe cloud deployment ratio between various types of
> nodes you're having right now and the possible issues you observed (as
> an example: discussed GoDaddy's cloud is having 3 conductor boxes vs
> 250 computes in the cell, and there was an opinion that it's simply
> not enough and that's it).
That was my opinion, and it was based on an apparently incorrect assumption that they had a lot of things coming and going on their cloud. I think they've demonstrated at this point that (other issues aside) three is enough for them, given their environment, workload, and configuration.
This information is great for building rules of thumb, so to speak. GoDaddy has an example configuration that is adequate for cloud architectures with low-frequency construct/destruct (a low rate of VM create/destroy). This provides a lower bound and might be representative of a lot of enterprise cloud deployments.
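To put a number on that data point, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and the only inputs are the figures quoted above (3 conductors, 250 computes in the cell). It says nothing about whether the same ratio holds for busier clouds.

```python
def computes_per_conductor(compute_nodes: int, conductor_nodes: int) -> float:
    """Return how many compute nodes each conductor node serves on average."""
    if conductor_nodes <= 0:
        raise ValueError("conductor_nodes must be positive")
    return compute_nodes / conductor_nodes


# Figures quoted in this thread for the GoDaddy cell.
ratio = computes_per_conductor(compute_nodes=250, conductor_nodes=3)
print(f"{ratio:.1f} computes per conductor")
```

Treat the result as one observed point for a low-churn cloud, not a recommendation.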
The problem with coming up with any sort of metric that will apply to everyone is that it's highly variable. If you have 250 compute nodes and never create or destroy any instances, you'll be able to get away with *many* fewer conductors than if you have a very active cloud. Similarly, during a live upgrade (or following any upgrade where we do some online migration of data), your conductor load will be higher than normal. Of course, 4-core and 96-core conductor nodes aren't equal either.
And here we have another rule of thumb, but no numbers put to it yet. If you have a low frequency construct/destruct cloud model, you will need to temporarily increase your number of conductors by {x amount OR x%} when performing OpenStack live upgrades.
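Once someone has a real value for x%, the rule could be written down roughly like this. This is only a sketch: the function and parameter names are hypothetical, and the 50 in the example call is a placeholder to show the arithmetic, not a measured or recommended uplift.

```python
def conductors_during_upgrade(baseline_conductors: int, uplift_percent: int) -> int:
    """Return the temporary conductor count for a live upgrade.

    uplift_percent is the still-unknown "x%" from the rule of thumb;
    the result is rounded up so the uplift is never silently dropped.
    """
    if baseline_conductors < 0 or uplift_percent < 0:
        raise ValueError("inputs must be non-negative")
    scaled = baseline_conductors * (100 + uplift_percent)
    return (scaled + 99) // 100  # integer ceiling of scaled / 100


# Placeholder uplift of 50%, applied to the 3 conductors mentioned above.
print(conductors_during_upgrade(baseline_conductors=3, uplift_percent=50))
```

Integer arithmetic is used on purpose so that the rounding does not depend on floating-point error.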
So, by all means, we should gather information on what people are doing successfully, but keep in mind that it depends *a lot* on what sort of workloads the cloud is supporting.
Right, but we can start applying fuzzy logic (the human kind, not machine) and get a better understanding of working configurations and *why* they work, then start examining where the transition states between configurations are. You need data before you can create information ;-)
--Rocky
--Dan