[openstack-dev] [oslo.db] [CC neutron] CIDR overlap functionality and constraints

Carl Baldwin carl at ecbaldwin.net
Thu Jul 21 18:43:57 UTC 2016


On Tue, Jul 19, 2016 at 7:40 AM, Mike Bayer <mbayer at redhat.com> wrote:

> Oslo.db devs :
>
> We've developed a system by which CIDR math, such as that of detecting
> region overlaps, can be performed on a MySQL database within queries [1]
> [2].   This feature makes use of a custom stored function I helped to
> produce which provides functionality similar to that which Postgresql
> provides built in [3].   SQLite also supports a simple way to add CIDR math
> functions as well which I've demonstrated at [4].
>
> Note that I use the term "function" and not "procedure" to stress that
> this is not a "stored procedure" in the traditional sense of performing
> complex business logic and persistence operations - this CIDR function
> performs a calculation that is not at all specific to Openstack, and is
> provided already by other databases as a built-in, and nothing else.
>
> The rationale for network-math logic being performed in the relational
> database is so that SQL like SELECT, UPDATE, and INSERT can make use of
> CIDR overlaps and other network math, such as to locate records that
> correspond to network ranges in some way and of course to provide guards
> and constraints, like that of concurrent UPDATE statements against
> conflicting ranges as well as being able to produce INSERT constraints for
> similar reasons.   Both MySQL and Postgresql have support for network
> number functions, Postgresql just has a lot more.
>
> The INSERT constraint problem is also addressed by our patch and makes use
> of an INSERT trigger on MySQL [5], but on Postgresql we use a GIST index
> which has been shown to be more reliable under concurrent use than a
> trigger on this backend [6].
>
> Not surprisingly, there's a lot of verbosity to both the production of the
> MySQL CIDR overlap function and the corresponding trigger and constraint,
> as well as the fact that to support the addition of these functions /
> constraints at both the Alembic migration level as well as that of the
> model level (because we would like metadata.create_all() to work), they are
> currently stated twice within this patch within their full verbosity.
> This is sub-optimal, and while the patch here makes use of an Alembic
> recipe [7] to aid in the maintenance of special DDL constructs, it's adding
> lots of burden to the Neutron codebase that could be better stated
> elsewhere.
>
> The general verbosity and unfamiliarity of these well known SQL features
> is understandably being met with trepidation.  I've identified that this
> trepidation is likely rooted in the fact that unlike the many other
> elaborate SQL features we use like ALTER TABLE, savepoints, subqueries,
> SELECT FOR UPDATE, isolation levels, etc. etc., there is no warm and fuzzy
> abstraction layer here that is both greatly reducing the amount of explicit
> code needed to produce and upgrade the feature, as well as indicating that
> "someone else" will fix this system when it has problems.
>
> Rather than hobbling the entire Openstack ecosystem to using a small
> subset of what our relational databases are capable of, I'd like to propose
> that preferably somewhere in oslo.db, or elsewhere, we begin providing the
> foundation for the use of SQL features that are rooted in mechanisms such
> as triggers and small use of stored functions, and more specifically begin
> to produce network-math SQL features as the public API, starting with this
> one.
>

Mike,

This is pretty cool, I'll admit. I enjoyed looking through and learning
about some modern capabilities in Postgres. The thing is, I can only think
of one area in Neutron's API which would benefit from this. That is subnet
pools. Specifically, these operations could benefit:

- Create a subnet from a subnet pool.
    - It would be helpful for the database to check overlap with other
subnets already allocated from the same pool. Now, we have to do a select
to check for overlap and then an insert later. Obviously, we've had to work
out a way to avoid races during the time between select and update.

- Adding a subnet pool to an address scope or updating a subnet pool
already under an address scope. These operations require that all of the
various subnet pools not have any mutually overlapping IP space.

These operations have been working pretty well. As I recall, it was tricky
to get them right to avoid races, but I think they're working correctly
now. There is one more operation that we are looking to enable in Newton
that could also benefit.

- Adopting a subnet in to an exist subnet pool.

None of these operations are expected to be very contentious and
performance hasn't really been a concern yet. If it were a big concern, I'd
be very interested in the GiST index solution because, as I understand it,
detecting overlap without that capability requires a linear search through
the existing records. But, GiST index capability isn't ubiquitous which
makes it difficult to get excited about for practical purposes. I do have
an academic interest in it. Computational geometry used to be a hobby of
mine when I worked on tools for physical design of microchips. I've been
telling people for years that I thought it'd be cool if databases had some
facility for indexing potentially overlapping ranges in one or more
dimensions. This looks like some pretty cool stuff.

Can you think of any other operations in Neutron -- or elsewhere in
OpenStack -- which will benefit from these new functions? I'll be honest.
Without some compelling benefit, it may be very difficult to swallow the
pill of dealing with special case code in each kind of DB for this
capability. But, if it is abstracted sufficiently by oslo db, it might be
worth looking at down the road. The changes to adopt such functionality
shouldn't be too difficult.

I don't think that regular IP allocation benefits but I could be missing
something. We always allocate discrete IP addresses. B-tree based indexes
and constraints work fine (probably better that a GiST index) for these.

Actually, the thing that has been a pain in IP allocation is knowing IP
availability (i.e. what IP addresses on a subnet are available for
allocation). What makes this difficult is that IP availability is derived
from the subnet's allocation pools and the existing discrete allocations
like this (in set notation):

  available_ips = allocation_pools - existing_allocations

Until recently, we actually stored all three of these in the database.
Obviously, we have to store allocations because those are the IP addresses
assigned to ports. We also have to store the pools because that is a value
provided to us in the API and it actually can't be derived from
available_ips and existing_allocations because existing_allocations can
include IPs outside the pools.

  allocation_pools == available_ips + current_allocations (this is not
necessarily true. In fact, it is seldom true. Think venn diagram.)

For a long time -- since before I started working on Neutron -- we also
stored available_ips as a set of IP ranges hoping that it would be more
convenient and perform better when allocating IP addresses. After a few
years of experience dealing with this, I found that it was just not worth
it. The operation to allocate one IP in isolation *might* have performed a
little bit better in some cases but what I found was that this table made
IP allocation very contentious. During the Newton cycle, we've eliminated
the need for this table by computing availability on the fly and we avoid
contention by randomly selecting from a window of the next available
addresses.

I guess this reply has gotten a little off topic. Sorry for that.

Carl Baldwin

[1]
> https://review.openstack.org/gitweb?p=openstack/neutron.git;a=blob;f=neutron/db/migration/alembic_migrations/versions/newton/expand/5bbf1e0b1774_add_stored_procedure_and_trigger_for_.py;h=8af394d319d119f57b224d391c844c0a87178856;hb=90f46e235672d3917015e5c49aa0513fb1de7ba9#l36
>
> [2] https://review.openstack.org/#/c/314054/
>
> [3] https://www.postgresql.org/docs/9.1/static/functions-net.html
>
> [4] https://gist.github.com/zzzeek/a3bccad40610b9b69803531cc71a79b1
>
> [5]
> https://review.openstack.org/gitweb?p=openstack/neutron.git;a=blob;f=neutron/db/migration/alembic_migrations/versions/newton/expand/5bbf1e0b1774_add_stored_procedure_and_trigger_for_.py;h=8af394d319d119f57b224d391c844c0a87178856;hb=90f46e235672d3917015e5c49aa0513fb1de7ba9#l92
>
> [6]
> https://review.openstack.org/gitweb?p=openstack/neutron.git;a=blob;f=neutron/db/migration/alembic_migrations/versions/newton/expand/5bbf1e0b1774_add_stored_procedure_and_trigger_for_.py;h=8af394d319d119f57b224d391c844c0a87178856;hb=90f46e235672d3917015e5c49aa0513fb1de7ba9#l116
>
> [7]
> http://alembic.zzzcomputing.com/en/latest/cookbook.html#replaceable-objects
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20160721/436e12de/attachment.html>


More information about the OpenStack-dev mailing list