[openstack-dev] [savanna] scalable architecture
matt at redhat.com
Thu Jul 25 04:05:41 UTC 2013
On 07/23/2013 12:32 PM, Sergey Lukjanov wrote:
> Hi everyone,
> We’ve started working on upgrading Savanna architecture in version
> 0.3 to make it horizontally scalable.
> Most of the information is in the wiki page -
> Additionally there are several blueprints created for this activity -
> We are looking for comments / questions / suggestions.
Some comments on "Why not provision agents to Hadoop cluster's to
provision all other stuff?"
Re problems with scaling agents for launching large clusters - launching
large clusters may be resource intensive, and those resources must be
provided by someone. They're either going to be provided by a) the
hardware running the savanna infrastructure or b) the instance hardware
provided to the tenant. If they are provided by (a) then the cost of
launching the cluster is incurred by all users of savanna. If (b) then
the cost is incurred by the user trying to launch the large cluster. It
is true that some instance recommendations may be necessary, e.g. if you
want to run a 500 instance cluster then your head node should be large
(vs medium or small). That sizing decision needs to happen for (a) or
(b) because enough virtual resources must be present to maintain the
large cluster after it is launched. There are accounting and isolation
benefits to (b).
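A rough sketch of what such an instance recommendation could look like,
in Python. The function name, the flavor names and the 50-instance
cut-off are made up for illustration and are not Savanna
recommendations; only the "500 instances means a large head node"
example comes from the paragraph above.

```python
# Hypothetical sizing helper; the thresholds are illustrative assumptions,
# not values taken from Savanna or its plugins.
def recommend_head_node_flavor(instance_count):
    """Return a suggested head node flavor for a cluster of the given size."""
    if instance_count < 1:
        raise ValueError("a cluster needs at least one instance")
    if instance_count >= 500:  # the 500 instance example: use a large head node
        return "large"
    if instance_count >= 50:   # assumed cut-off, for illustration only
        return "medium"
    return "small"
```

The same decision has to be made under (a) or (b); under (b) the
resulting cost simply lands on the tenant launching the cluster.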
Re problems migrating agents while cluster is scaling - will you expand
on this point?
Re unexpected resource consumers - during launch, maybe, but during
execution the agent should be a minimal consumer of resources. sshd may
also be an unexpected resource consumer.
Re security vulnerability - the agents should only communicate within
the instance network, primarily w/ the head node. The head node can
relay information to the savanna infrastructure outside the instances in
the same way savanna-api gets information now. So there should be no
difference in vulnerability assessment.
Re support multiple distros - yes, but I'd argue this is at most a small
incremental complexity on what already exists today w/ properly creating
savanna plugin compatible instances.
Concretely, the architecture of using instance resources for
provisioning is no different than spinning an instance w/ ambari and
then telling that instance to provision the rest of the cluster and
report back status.
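A minimal sketch of that pattern, in Python: the head instance
provisions the workers using its own resources and returns a status
report that can be relayed to the savanna infrastructure. All names
here are hypothetical and are not Savanna or Ambari APIs.

```python
# Illustrative only: function and argument names are hypothetical,
# not Savanna or Ambari APIs.
def provision_cluster(workers, provision):
    """Run on the head instance: provision each worker and build a status report."""
    report = {}
    for worker in workers:
        try:
            provision(worker)  # consumes the tenant's instance resources
            report[worker] = "ready"
        except Exception as exc:  # record the failure, keep going with the rest
            report[worker] = "failed: %s" % exc
    return report


def fake_provision(worker):
    # Stand-in for real provisioning work, used only for this demo.
    if worker == "worker-2":
        raise RuntimeError("disk full")


status = provision_cluster(["worker-1", "worker-2"], fake_provision)
print(status)
```

Only the final report has to leave the instance network, which is why
the vulnerability assessment should not change.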
Re metrics - wherever you gather Hz (# req per sec, # queries per sec,
etc), also gather standard summary statistics (mean, median, std dev,
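
For the three statistics named above, a small sketch using Python's
standard statistics module. The function name and the sample values are
assumptions for illustration; the samples stand for per-second rates
collected wherever Hz is gathered.

```python
import statistics


def summarize(samples):
    """Return mean, median and sample standard deviation of rate samples."""
    # statistics.stdev needs at least two samples.
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "std_dev": statistics.stdev(samples),
    }


# Hypothetical requests-per-second readings.
summary = summarize([10, 20, 30])
print(summary)
```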