[Openstack] Grizzly - Planning Help

Jonathan Proulx jon at jonproulx.com
Thu Sep 5 14:46:18 UTC 2013


Hi Clint,

I run an OpenStack cloud for academic research as well (over here:
https://tig.csail.mit.edu/wiki/TIG/OpenStack ).  I started on Essex just
over a year ago, moved to Folsom just after it came out, and most recently
moved to Grizzly last month, including a switch from nova-network to
quantum/neutron.

There are definitely many valid ways to approach things here, so saying
what is right or wrong in planning is difficult, but I'll comment from my
experience as best I can below.

On Wed, Sep 4, 2013 at 11:10 PM, Clint Dilks <clintd at waikato.ac.nz> wrote:

> Hi,
>
> I have been asked to setup a Grizzly Instance for academic research.  This
> project will evolve as it goes, so I don't have a clear set of
> requirements, my initial plan is to try installing using Neutron configured
> as "Single Flat Network"
> http://docs.openstack.org/trunk/openstack-network/admin/content/app_demo_flat.html
> using the Open vSwitch Plugin.
>
> The equipment I have is 3 nodes (2 NICs per node), with a switch for a
> physically isolated node subnet and access to other switches for the rest
> of our network.  The nodes have fast CPUs with a decent number of cores
> and more RAM than we should need.
>

I doubt very much that it's more RAM than you'll need; instances are
addictive, and I've found RAM to be the most limiting factor.
Overcommitting CPUs is easy and usually pretty safe (depending on
workload); with RAM you can cheat a bit, but not nearly so much.
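
To make that concrete, here is a rough back-of-the-envelope sketch (not an
official formula) of how the scheduler's cpu_allocation_ratio and
ram_allocation_ratio settings in nova.conf turn into instances per node.
The hardware numbers and ratios below are placeholders, not
recommendations:

    # Back-of-the-envelope capacity math.  nova.conf's cpu_allocation_ratio
    # and ram_allocation_ratio control how far the scheduler overcommits;
    # all values here are illustrative only.
    physical_cores = 16             # per compute node (placeholder)
    physical_ram_mb = 96 * 1024     # per compute node (placeholder)

    cpu_allocation_ratio = 8.0      # CPUs tolerate heavy overcommit
    ram_allocation_ratio = 1.0      # RAM barely does before guests swap

    flavor_vcpus = 2
    flavor_ram_mb = 4096

    by_cpu = int(physical_cores * cpu_allocation_ratio / flavor_vcpus)
    by_ram = int(physical_ram_mb * ram_allocation_ratio / flavor_ram_mb)

    print("CPU-limited instances per node: %d" % by_cpu)
    print("RAM-limited instances per node: %d" % by_ram)  # usually the real limit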


> So currently I am picturing a setup like this
>
> Network
>
> subnet 1 - openstack management
> subnet 2 - openstack data
> subnet 3 - Public Network
>
> Nodes
>
> A  (Controller + Storage + Compute)
>    Keystone
>    Glance
>    Horizon
>    Neutron
>    Compute
>    Cinder
>    Shared Storage/NFS
>    Swift storage
>    Swift Proxy
>    (any other needed services)
>
> B (Network + Compute)
>    Neutron
>    Compute
>    Swift Storage
>
> C (Network + Compute)
>    Neutron
>    Compute
>    Swift Storage
>
> So my questions
>
> 1. Does anything seem fundamentally broken with this approach?
>

That should work.

My setup is a little different: I don't use Swift, my controller node does
everything except compute, and my (60) compute nodes do only compute.  All
the network bits run on the controller node (Quantum OVS, using GRE for
client networks, which aren't used much, and VLANs for provider networks,
which are what is primarily used).  Systems are all running Ubuntu 12.04
LTS with Grizzly packages from the Ubuntu Cloud Archive and managed using
the puppetlabs-openstack Puppet modules.  I did need to apply a number of
patches to get the networking services to scale up decently (they scale
out fine, but ...), mostly from
http://blog.gridcentric.com/bid/318277/Boosting-OpenStack-s-Parallel-Performance
but I don't think you'll need those.
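
For reference, the handful of [OVS] plugin settings behind a GRE-tenant /
VLAN-provider layout like that look roughly like the sketch below.  This
is illustrative only: the option names follow the Grizzly OVS plugin
(ovs_quantum_plugin.ini), but the bridge names, address, and ranges are
placeholders, and in practice puppet writes the file rather than a script:

    # Illustrative only: key ovs_quantum_plugin.ini [OVS] options for a
    # GRE-tenant / VLAN-provider setup.  All values are placeholders.
    ovs_settings = {
        'tenant_network_type': 'gre',         # self-service nets ride GRE tunnels
        'enable_tunneling': 'True',
        'tunnel_id_ranges': '1:1000',
        'local_ip': '10.0.0.11',              # this node's data-network address
        'network_vlan_ranges': 'physnet1',    # admin-created provider VLANs
        'bridge_mappings': 'physnet1:br-ex',  # physical net -> OVS bridge
        'integration_bridge': 'br-int',
    }

    # Print the section as it would appear in the ini file.
    print('[OVS]')
    for key, value in sorted(ovs_settings.items()):
        print('%s = %s' % (key, value))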


> 2.  Is there anything else that I haven't mentioned that I should be
> thinking about before making a start?
>

While most documentation shows an RFC1918 "private" network NAT'ed to a
"public" network with routeable IPs, it is perfectly possible to connect
instances directly to a public network so they both get a "real" IP and
know internally what it is (rather than seeing only the RFC1918 address).
This also has the advantage that traffic is direct and not bottlenecked
through a quantum-l3-agent node.  It does require that you have sufficient
public IPs, which many academic institutions, but few commercial
companies, have.
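
If you go that route, creating the shared flat provider network boils down
to something like the following minimal sketch, using the Grizzly-era
python-quantumclient.  The credentials, the "physnet1" label, and the
203.0.113.0/24 range are placeholders for your real addressing:

    # Minimal sketch: a shared flat provider network so instances land
    # directly on a routable subnet.  All names, credentials, and addresses
    # are placeholders.
    from quantumclient.v2_0 import client

    qc = client.Client(username='admin',
                       password='secret',
                       tenant_name='admin',
                       auth_url='http://controller:5000/v2.0/')

    net = qc.create_network({'network': {
        'name': 'public',
        'shared': True,                          # all tenants may plug in
        'provider:network_type': 'flat',
        'provider:physical_network': 'physnet1',
    }})['network']

    qc.create_subnet({'subnet': {
        'network_id': net['id'],
        'ip_version': 4,
        'cidr': '203.0.113.0/24',                # stand-in for real public space
        'allocation_pools': [{'start': '203.0.113.10',
                              'end': '203.0.113.200'}],
        'gateway_ip': '203.0.113.1',
        'enable_dhcp': True,
    }})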

You should definitely be deploying this using some configuration
management system (Puppet, Chef, Juju, something).  It makes the initial
deployment much easier, since you can typically rely on "reasonable"
defaults in the packages, modules, or cookbooks.  More importantly, it
makes the deployment repeatable, so when this takes off and you need
dozens of compute nodes it's all magical and zero new work.  Perhaps you
are already planning this...


> 3. Do people see any advantages for a use case like ours sticking with
> nova-network, or using an alternate plugin with Quantum such as Linux
> Bridge Plugin?
>

Short rant-free version: go with Neutron and OVS.

LinuxBridge is a little simpler but gives you fewer options later.
Changing out the way you do networking is a huge pain (trust me, I just
did it), so I'd recommend suffering through the Quantum/Neutron OVS stuff
for new deployments.

Nova-network is much, much easier to set up, and I've found it much more
stable (due to its simplicity) than the quantum/neutron bits; given
retroactive dictatorial powers, I would not have made quantum/neutron the
default network service until at least Havana, possibly Icehouse.  For
existing deployments based on nova-network I'd strongly discourage moving
to Neutron unless you have an immediate need for the more advanced
features.

For new deploys, though, Neutron is the only way to go.  If you deploy a
new cloud with nova-network you're only setting yourself up for a very
painful transition later.  No matter what magic the network wizards come
up with, replacing the way you do networking is going to be painful and
disruptive; I can't imagine it otherwise.


> 4. Is it possible / practical to merge the management and data networks?
>

Yes, especially at your scale.  In practice you can use only one network
for everything.  Multiple networks get you traffic separation, which can
help with throughput by keeping different classes of traffic on different
physical interfaces, and even when they share physical media it can help
with logical isolation for security (filtering rules, for example).

In a small research deployment you probably need to worry less about these
issues than Rackspace or the like.

I'm actually running sort of like this now, though that's more a side
effect than a plan, with everything on a single public IP network.  For
better hygiene I do plan to migrate the OpenStack servers to a different
network than the one the instances are on now (and I have several more
existing networks that I'll be making available to specific projects).

> 5. Currently the isolated switch is 1G; is this likely to be a
> significant bottleneck to getting a small number of VMs running?
>

Not a problem.  I only recently put 10G interfaces in my controller node.
For a year it was serving Glance images to 60 compute nodes, and some of
my users like to start hundreds of instances at a time.  We ran just under
500k instances in that year with an average time to boot of 2 minutes; not
stellar, but not bad (and there was very little variation in timing due to
load).


> Thanks for your time, and any insight you are willing to share.
>

Also, when you start building it, the #openstack IRC channel can be a
life saver when you're stuck, especially for things that turn out to be
"oh, you need to set this config variable" or "run this command", which
seems to be most of the things I trip over... the devil, as they say, is
in the details.

Good Luck,
-Jon