[openstack-dev] [baremetal][quantum][nova] bare metal host allocation, mac addresses, vlans, security
Robert Collins
robertc at robertcollins.net
Thu Nov 15 09:45:57 UTC 2012
So, having had a good look around the bare metal code, I think there
are some significant gaps with respect to bare metal and networking.
I'm seeking advice on how to tackle them, how quantum interacts (and
whether it is ready to handle this; what the best place to do the work
is), and additional use cases. Please excuse the length of this mail
:). I fully expect to be told 'turn it into N blueprints' - but
discussing it first will help me make sure I've analysed it right.
To start with, background: we're looking into building something using
nova baremetal, so I want to understand both the short term pragmatic
options, and the long term 'right way' things. VLAN stuff turns up
towards the end, with my sketch of the code changes needed.
As I see it there are two dimensions for deployers, giving four
interesting scenarios:
- use bare metal to deploy nodes that are themselves openstack
compute/object storage/block storage/etc hosts
- offer baremetal nodes as a flavour type to clients of the cloud
The interesting scenarios are then:
A - just a cloud with baremetal node flavours
B - just a regular vm only cloud deployed by a baremetal cloud (2
clouds in play)
C - a cloud with both baremetal flavours and vm compute hosts deployed
on those flavours, but with clients not having access to the baremetal nodes
D - a cloud with both baremetal flavours, vm compute hosts deployed on
those flavours and clients of the cloud also using baremetal nodes.
There is another dimension, which is whether SDN is in use - existing
infrastructure often isn't SDN ready, so I'd like to make sure we
design for both w/o and with SDN.
Without SDN, any host will have access to whatever the network
environment is configured to permit, so in analysing how these
scenarios play out, we need to ensure that we configure instances to
match the network(s) available. With SDN (e.g. openflow switch
hardware) it should be a simple matter, because it is approximately as
flexible as openvswitch is. The cases below, then, are examining the
no-SDN implications.
So, case A:
- As long as the cloud control plane - its own rabbit, MySQL etc -
is appropriately firewalled off from the baremetal tenants, we should
be fine.
Case B:
- As the baremetal cloud is separate from the client hosting cloud,
this is also pretty straightforward. We've no particular
complications to worry about.
Case C:
- As long as the baremetal flavours are private and limited to the
operators only, this becomes the same as B: clients are always
virtualised, the network for baremetal is only accessed by operator
sysadmin controlled instances.
Case D:
- *If* the baremetal nodes that tenants are allowed to use can be
separated from the baremetal nodes that are used to deploy the cloud,
then this devolves to separate analysis of B and C.
- There is no current link between flavor and baremetal node, nor any
other way of constraining things, so we need to examine the
implications.
- If a client were to deploy instances onto a baremetal node plugged
into the control network for the cloud, even if they were allocated
addresses from a differently firewalled range, they could probably
locally configure addresses from the control network and get past
firewalls, and from there attack rabbit or MySQL or whatever.
- -> So to implement case D [without SDN] we need to add a link
between bare metal node and flavor, or some other way of ensuring that
nodes where the network topology would let them act as a member of the
control plane, are only able to be deployed to by operators.
So that's the first point - host allocation: we need a way to limit
which physical nodes are usable by which tenants, or we must separate
out the bare metal cloud used for deploying hosts.
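
To illustrate what a flavor-to-node link might look like, here's a
minimal sketch of a filter-style check. The 'baremetal:node_group'
extra_specs key and the node 'group' attribute are made up for the
example - nothing like them exists in nova baremetal today - the point
is only the shape of the check:

  # Hypothetical sketch: restrict which baremetal nodes a flavor may use.
  def node_allowed_for_flavor(node, flavor):
      """True if this baremetal node may be deployed to with this flavor."""
      wanted = flavor.get('extra_specs', {}).get('baremetal:node_group')
      if wanted is None:
          # The flavor places no constraint, so only unrestricted nodes
          # match; nodes wired into the control-plane network (which carry
          # a group) are never handed to it.
          return node.get('group') is None
      return node.get('group') == wanted

  # An operator-only flavour tied to the control-plane group of nodes:
  control_flavor = {'name': 'bm.control',
                    'extra_specs': {'baremetal:node_group': 'control-plane'}}
  tenant_flavor = {'name': 'bm.small', 'extra_specs': {}}
  node = {'id': 7, 'group': 'control-plane'}

  assert node_allowed_for_flavor(node, control_flavor)
  assert not node_allowed_for_flavor(node, tenant_flavor)

Whether that lives in the scheduler, in the baremetal driver, or as a
property on the node record is exactly the sort of thing I'd like input on.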
MAC addresses
=============
Currently, baremetal has an issue -
https://bugs.launchpad.net/nova/+bug/1078154 - where MAC addresses are
allocated by nova, but they can only be imperfectly delivered on bare
metal nodes. I'd like to fix this by allowing bare metal nodes to
retain their own MAC addresses, but to populate the virtual interfaces
table correctly (which I think we should so queries work properly) we
still need a VIF with a MAC address. One way of doing this would be to
ignore the VIF MAC when deploying to a node. That won't work with
anything looking for the MAC - e.g. dnsmasq / Quantum DHCP agent. A
related problem is making sure that (without SDN) any tenant network
being requested is physically available for that physical node
(because we can't dynamically create networks for baremetal without
SDN, and running trunk mode VLAN access is thoroughly unsafe for an
untrusted tenant). A better way, AFAICT, is to pass more data down
from the compute manager through to allocate_for_instance, stating
that a particular MAC, or MACs, have to be used - and equally that
only a fixed number of interfaces are available for allocation. I
think the following data is sufficient: a dict mapping MAC address ->
[network, network, ...], where
{}/None would indicate no interface constraints exist,
{foo: []} would indicate that there is one physical interface, with no
network constraints in place for it (e.g. it's a flat network
layout, or SDN is in use).
{foo: [192.168.0/24, 192.168.1/24], bar: [192.168.3/24]} would
indicate 2 networks are available on the layer 2 broadcast domain the
interface foo is connected to, and one on the one bar is connected to.
The allocation code would then error if the user-requested networks
cannot be assigned to the available interfaces. We need to record in
the baremetal db a list of those networks - and because it's a separate
DB, we can't use referential integrity to the nova DB which contains
the network definitions, but I think this is tolerable, as with
quantum we'd be making API calls anyway rather than querying the DB
directly.
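
To make the cross-check concrete, here is a rough Python sketch of the
allocation-time validation described above; the function and exception
names (check_interface_constraints, AllocationError) are invented for
illustration, not existing nova code:

  class AllocationError(Exception):
      pass

  def check_interface_constraints(available, requested_networks):
      """Map each requested network onto a physical MAC, or raise.

      available: dict of MAC -> list of networks reachable on that port's
                 L2 domain; {} or None means no interface constraints
                 exist, and an empty list means the port is unconstrained.
      requested_networks: the networks the instance asked for.
      """
      if not available:
          return {}  # nothing to enforce
      assignment = {}
      free_macs = dict(available)
      for net in requested_networks:
          # Greedy first-fit; enough to illustrate the error behaviour.
          mac = next((m for m, nets in free_macs.items()
                      if not nets or net in nets), None)
          if mac is None:
              raise AllocationError("network %s cannot be assigned to any "
                                    "remaining physical interface" % net)
          assignment[net] = mac
          del free_macs[mac]
      return assignment

  # The example from above: two networks on foo's broadcast domain,
  # one on bar's.
  available = {'foo': ['192.168.0/24', '192.168.1/24'],
               'bar': ['192.168.3/24']}
  print(check_interface_constraints(available,
                                    ['192.168.1/24', '192.168.3/24']))
  # {'192.168.1/24': 'foo', '192.168.3/24': 'bar'}

A real implementation would probably want proper matching rather than
first-fit, but this shows the intended failure mode.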
I haven't climbed through the Quantum code yet. I presume there is a
similar place in Quantum to do this (or perhaps such granularity
already exists).
VLANs
======
With baremetal, a common deployment technique is to use VLANs to segregate
out management / control plane traffic from tenant traffic. For case
B/C/D above where we are installing openstack nodes as baremetal
images, we need to be able to configure the network configuration
*including* those VLANs. This seems pretty straightforward to me: we
need to extend the concept I sketched above for describing what's
available on a given node to also describe the VLANs that are being
offered on the port each MAC address is on. Then extend the cross
checking logic to also look for VLANs to satisfy the request. The goal
is to be able to run dnsmasq / the quantum DHCP agent somewhere,
serving onto each VLAN, so that the baremetal nodes can DHCP on each
VLAN logical interface.
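
To make that extension concrete, here's a sketch (structure and helper
name invented for illustration) of how the per-MAC description might
also carry the VLAN IDs trunked to each port, plus the extra
availability check:

  # Each port now advertises the networks reachable untagged and the
  # 802.1q VLAN IDs the switch offers on it. Field names are illustrative.
  ports = {
      'aa:bb:cc:dd:ee:01': {'networks': ['192.168.0/24'],
                            'vlans': [100, 200]},
      'aa:bb:cc:dd:ee:02': {'networks': [], 'vlans': [300]},
  }

  def port_offering_vlan(ports, vlan_id):
      """Return the MAC of a port that offers the requested VLAN, or None."""
      for mac, desc in ports.items():
          if vlan_id in desc.get('vlans', ()):
              return mac
      return None

  # Cross-checking would refuse VLAN 400 here, but satisfy VLAN 200 on
  # the first port, so dnsmasq / the quantum DHCP agent can answer DHCP
  # on that VLAN's logical interface.
  assert port_offering_vlan(ports, 200) == 'aa:bb:cc:dd:ee:01'
  assert port_offering_vlan(ports, 400) is None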
Security
=======
I touched on security aspects a bit above, but there is another thing
to call out - without SDN, giving end users access to run baremetal
nodes is very risky unless the network switches perform filtering to
prevent tenants spoofing control plane nodes and doing DHCP or TFTP
serving.
Secondly, if a node is being deployed to by a different tenant, we
probably need to wipe the contents of the hard disk before handing it
over to the new tenant. Are there any other things we need to do to
prevent information leaks between tenants? We can document (again,
w/out SDN) that it's up to tenants to encrypt API calls and other
traffic on the network, to deal with the incomplete isolation that
non-SDN environments will probably have.
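
On the disk wiping point, the minimum I have in mind is something along
these lines - a naive zero-fill sketch, run from the deploy environment;
a real tool would probably prefer ATA secure erase where the hardware
supports it:

  # Naive sketch: zero an entire block device before a node is handed to
  # a new tenant. Destructive by design; only for a node being recycled.
  import os

  def zero_fill(device, block_size=4 * 1024 * 1024):
      """Overwrite a block device with zeros until it is full."""
      zeros = b'\0' * block_size
      with open(device, 'wb', buffering=0) as dev:
          try:
              while True:
                  written = dev.write(zeros)
                  if not written or written < len(zeros):
                      break  # short write: hit the end of the device
          except OSError:
              pass  # some kernels report end-of-device as ENOSPC instead
          os.fsync(dev.fileno())

  # e.g. zero_fill('/dev/sda')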
-Rob
--
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Cloud Services