[openstack-dev] [Neutron][IPAM] Arbitrary JSON blobs in ipam db tables

Shraddha Pandhe spandhe.openstack at gmail.com
Mon Nov 9 20:39:19 UTC 2015


Hi Carl,

Please find my reply inline.


On Mon, Nov 9, 2015 at 9:49 AM, Carl Baldwin <carl at ecbaldwin.net> wrote:

> On Fri, Nov 6, 2015 at 2:59 PM, Shraddha Pandhe <
> spandhe.openstack at gmail.com> wrote:
>>
>> We have a similar requirement where we want to pick a network that's
>> accessible in the rack that the VM belongs to. We have L3 top-of-rack, so
>> the network is confined to the rack. Right now, we are achieving this by
>> naming the physical network in a certain way, but that's not going to scale.
>>
>> We also want to be able to make scheduling decisions based on IP
>> availability. So we need to know the rack <-> network mapping. We can't
>> embed all factors in a name; it would be impossible to make scheduling
>> decisions by parsing names and comparing them. GoDaddy has also been doing
>> something similar [1], [2].
>>
>
> This is precisely the use case that the large deployers team (LDT) has
> brought to Neutron [1].  In fact, GoDaddy has been at the forefront of that
> request.  We've had discussions about this since just after Vancouver on
> the ML.  I've put up several specs to address it [2] and I'm working on
> another revision of it.  My take on it is that Neutron needs a model for a
> layer 3 network (IpNetwork) which would group the rack networks.  The
> IpNetwork would be visible to the end user and there will be a network <->
> host mapping.  I am still aiming to have working code for this in Mitaka.
> I discussed this with the LDT in Tokyo and they seemed to agree.  We had a
> session on this in the Neutron design track [3][4] though that discussion
> didn't produce anything actionable.
>

That's great. The L3 network model is definitely one of our most
important requirements. All our go-forward deployments are going to be L3,
so this is a big deal for us.


> Solving this problem at the IPAM level has come up in discussion but I
> don't have any references for that.  It is something that I'm still
> considering but I haven't worked out all of the details for how this can
> work in a portable way.  Could you describe how you imagine this flow
> would work from a user's perspective?  Specifically, when a user wants to
> boot a VM, what precise API calls would be made to achieve this on your
> network, and where would the IPAM data come into play?
>

Here's what the flow looks like to me.

1. The user sends a boot request as usual. The user need not know any of the
network and subnet information beforehand; all they do is send a boot
request.

2. The scheduler will pick a node in an L3 rack. The way we map nodes <->
racks is as follows:
    a. For VMs, we store rack_id in nova.conf on the compute nodes.
    b. For Ironic nodes, we currently have static IP allocation, so we
practically know which IP we want to assign. When we move to dynamic
allocation, we would probably use the 'chassis' or 'driver_info' fields to
store the rack id.

3. Nova compute will try to pick a network ID for this instance.  At this
point, it needs to know which networks (or subnets) are available in this
rack. Based on that, it will pick a network ID and send a port-creation
request to Neutron. At Yahoo, to avoid some back-and-forth, we send a fake
network_id and let the plugin do all the work.

4. We need some information associated with the network/subnet that tells
us which rack it belongs to. Right now, for VMs, we have that information
embedded in the physnet name, but we would like to move away from that. If
subnets had a column for this - e.g. a tag - it would solve our problem.
Ideally, we would like a 'rack id' column, or a new 'racks' table that maps
racks to subnets, or something similar (a rough schema sketch follows after
point 5). We are open to different ideas that work for everyone. This is
where IPAM can help.

5. We have another requirement where we want to store multiple gateway
addresses for a subnet, just like name servers (also covered in the sketch
below).
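
To make points 4 and 5 a bit more concrete, here is a very rough sketch of
the kind of schema I have in mind. It is purely illustrative - the table and
column names ('racks', 'rack_subnets', 'subnet_gateways') are made up for
this email and are not an existing Neutron model or a finished proposal:

# Illustrative sketch only (hypothetical names, not a proposed Neutron schema):
# a 'racks' table, a mapping of racks to the subnets confined to them, and a
# table that allows more than one gateway address per subnet.
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Rack(Base):
    """One row per L3 top-of-rack domain (hypothetical)."""
    __tablename__ = 'racks'
    id = sa.Column(sa.String(36), primary_key=True)
    name = sa.Column(sa.String(255))


class RackSubnet(Base):
    """Maps a rack to the subnets that are confined to it (hypothetical)."""
    __tablename__ = 'rack_subnets'
    rack_id = sa.Column(sa.String(36), sa.ForeignKey('racks.id'),
                        primary_key=True)
    subnet_id = sa.Column(sa.String(36), primary_key=True)  # Neutron subnet UUID


class SubnetGateway(Base):
    """Allows multiple gateway addresses per subnet (hypothetical)."""
    __tablename__ = 'subnet_gateways'
    subnet_id = sa.Column(sa.String(36), primary_key=True)
    gateway_ip = sa.Column(sa.String(64), primary_key=True)

With something like this, "find the subnets in rack R" and "list the
gateways for subnet S" become straightforward queries, and nothing has to be
parsed out of physnet names anymore.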


We also have a requirement to make scheduling decisions based on IP
availability: we want to allocate multiple IPs - say X - to a host. The
flow in that case would be:

1. The user sends a boot request with --num-ips X.
    The network/subnet-level complexities need not be exposed to the user.
For a better experience, all we want our users to tell us is the number of
IPs they want.

2. When the scheduler tries to find an appropriate host in the L3 racks, we
want it to find a rack that can satisfy this IP requirement. So the
scheduler will basically say, "give me all racks that have >X IPs
available". If we had a 'racks' table in IPAM, that would help (a rough
sketch of such a check follows after this list).
    Once the scheduler gets a rack, it will apply the remaining filters to
narrow down to one host and call nova-compute. The IP count will be
propagated from the scheduler to nova-compute.


3. Nova compute will call Neutron, passing along the node details and the
IP count. The Neutron IPAM driver will then look at the node details, query
the database to find a network in that rack, and allocate X IPs from the
subnet.
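
To illustrate the "give me all racks that have >X IPs available" part, here
is a rough, hypothetical sketch of what the scheduler-side check could look
like. It is not wired into Nova's real filter framework; 'rack_id' on the
host state and 'free_ips_per_rack' in the filter properties are assumptions
made up for this example:

# Hypothetical sketch of a rack IP-capacity check for the scheduler. The
# rack_id -> free-IP-count mapping would have to come from the IPAM backend
# (e.g. the hypothetical 'racks' table sketched above) before filtering.


class RackIpCapacityFilter(object):
    """Pass only hosts whose rack has at least the requested number of IPs."""

    def host_passes(self, host_state, filter_properties):
        needed = filter_properties.get('num_ips', 1)
        rack_id = getattr(host_state, 'rack_id', None)
        free_ips = filter_properties.get('free_ips_per_rack', {})
        return free_ips.get(rack_id, 0) >= needed


# Example: two racks, the user asked for --num-ips 4.
class FakeHost(object):
    def __init__(self, rack_id):
        self.rack_id = rack_id

props = {'num_ips': 4, 'free_ips_per_rack': {'rack-1': 2, 'rack-2': 10}}
f = RackIpCapacityFilter()
print([h.rack_id for h in (FakeHost('rack-1'), FakeHost('rack-2'))
       if f.host_passes(h, props)])  # -> ['rack-2']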



> Carl
>
> [1] https://bugs.launchpad.net/neutron/+bug/1458890
> [2] https://review.openstack.org/#/c/225384/
> [3] https://etherpad.openstack.org/p/mitaka-neutron-next-network-model
> [4] https://www.openstack.org/summit/tokyo-2015/schedule/design-summit
>