[openstack-dev] [Nova][Quantum] Move quantum port creation to nova-api

Aaron Rosen arosen at nicira.com
Mon May 20 19:46:31 UTC 2013


Hi Jun,

On Mon, May 20, 2013 at 12:07 PM, Jun Cheol Park
<jun.park.earth at gmail.com> wrote:

> I'm not sure it is good to keep discussing in this thread the design
> flaw alleged by a few people, including Mike and me, and recognized by a
> few others, including Joshua. However, since I got a response from Aaron,
> I would like to follow up.
>
> On Mon, May 20, 2013 at 10:39 AM, Aaron Rosen <arosen at nicira.com> wrote:
>
>>
>>
>>
>> On Thu, May 16, 2013 at 3:05 PM, Jun Cheol Park
>> <jun.park.earth at gmail.com> wrote:
>>
>>> Aaron,
>>>
>>> >@Mike - I think we'd still want to leave nova-compute to create the tap
>>> interfaces and stick external-ids on them, though.
>>>
>>> Sorry, I don't get this. Why do we need to leave nova-compute to create
>>> tap interfaces? That behavior has been a serious problem and a design flaw
>>> in dealing with ports, as Mike and I presented in Portland (Title: Using
>>> OpenStack In A Traditional Hosting Environment).
>>>
>>> All,
>>>
>>> Please, let me share the problems that we ran into due to such a design
>>> flaw between nova-compute and quantum-agent.
>>>
>>> 1. Using external-ids as an implicit triggering mechanism for deploying
>>> OVS flows (we used the OVS plugin on hosts) causes inconsistencies between
>>> the quantum DB and the actual OVS tap interfaces on hosts. For example,
>>> even when the necessary OVS flows have not been set up, or have failed for
>>> whatever reason (messaging system unstable, quantum-server down, etc.),
>>> nova-compute unknowingly declares that a VM is in the "active" state as
>>> long as it successfully creates the OVS taps and sets up external-ids.
>>> But the VM has no actual network connectivity until somebody (here,
>>> quantum-agent) deploys the desired OVS flows. At that point it is very
>>> hard to track down what went wrong, because nova list shows the VM as
>>> "active." This kind of inconsistency happens a lot because a quantum API
>>> (provided by quantum-server, e.g., create_port()) only manages the quantum
>>> DB and does not deal with the actual network objects (e.g., OVS taps on
>>> hosts). In this design, there is no way to verify the actual state of the
>>> target network objects.
>>>
>>
>> Sure. The same thing happens, though, if you boot a machine and plug it
>> into a switch where the physical switch isn't configured. In my opinion
>> quantum's job is to handle the programming of the network from the tap
>> interface downwards (not actually creating the interfaces, as those are
>> part of the server). The tap interfaces are currently created when the
>> instance is started by the hypervisor (kvm, etc.). Changing this so that
>> quantum creates the tap interfaces seems to me like it would make things
>> more complicated, as we would then be adding another ordering dependency.
>>
>
> Aaron, I'm really trying to understand what you described here, but
> unfortunately failing. So please help us. You have already mentioned
> several times in this thread that you think the task of creating OVS taps
> needs to be done by nova-compute, but I don't think you have explained why
> or what the benefits of doing so are. More importantly, I think we should
> know what your suggestion is, then. How can we resolve the inconsistency
> problem I described while keeping the functionality of creating OVS taps
> in nova-compute (as you think is the right thing to do)?
>
>
Sure. I think the confusion here is that the status nova returns only
reflects whether the VM has booted. One would need to query quantum to see
whether the port has actually been wired up or not (a minimal sketch of
such a query follows the quote below). As for why the creation of the tap
interface should be done in nova-compute rather than in a quantum agent:
the interface is attached to the VM (i.e. it belongs to compute). Today the
role quantum plays is wiring up the network that the tap interface connects
into. I like how Salvatore put it earlier in this thread (quoted below):

"About whether Quantum should take responsibility of plugging network
cards, this was one of my doubts in the early Quantum days. At the end of
the day the 'real-world analogy' helped me understand that when installing
a server the compute guy would not call the networking guy for plugging the
NICs on the MB and plugging the cables in the patch panel, but would
probably do that by himself. <snip>".


>>>   Q. What if a quantum API really dealt with network objects (e.g., OVS
>>> taps), instead of only updating the quantum DB?
>>>   A. Nova-compute could then call a truly abstracted quantum API to
>>> create a real port (or an OVS tap interface) on a target host, and then
>>> wait for the response to see whether the OVS tap interface was really
>>> created on the host. This way, nova-compute can make sure what is going
>>> on before proceeding with the rest of the tasks for creating a new VM.
>>> When there are tasks that need to be taken care of regarding ports, such
>>> as QoS (as Henry mentioned), quotas (which this thread started from),
>>> etc., nova-compute can then decide what the next step should be (at the
>>> very least, it would not blindly say that the VM is active).
>>>
>>> 2. Another example of a side effect of the taps being created by
>>> nova-compute: when a host is rebooted, we expect all the VMs to be
>>> restarted automatically. However, that does not work. Here is why. When
>>> nova-compute restarts, it expects to see libvirtd running; otherwise,
>>> nova-compute immediately stops. So we have to start libvirtd before
>>> nova-compute. Now, when libvirtd starts, it expects all the OVS taps to
>>> exist so that it can successfully start all the VMs that are supposed to
>>> use them. However, at this point nova-compute, which would create the OVS
>>> taps, has not started yet, so libvirtd fails to restart the VMs because no
>>> taps are found. I ended up adding "restart libvirtd" to rc.local so that
>>> libvirtd retries starting the VMs after nova-compute has created the OVS
>>> taps.
>>>
>>
>> This sounds like a bug to me. I'll play around with trying to reproduce
>> this later.  Feel free to create a launchpad bug.
>>
>
> As I explained, I don't think this problem is simply a bug. I would say
> it is more of a design issue, because it stems from a circular dependency
> between libvirtd and nova-compute with respect to the automated restart of
> VMs.
>

Sorry, I'm not 100% sure I understand this issue. Would having nova-compute
retry the connection to libvirtd, rather than stopping, solve the issue
here?
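
Something along these lines is what I have in mind (just a rough sketch
assuming the libvirt python bindings, not an actual patch):

    # Sketch only: retry the libvirt connection at startup instead of
    # exiting, so the service start order on host reboot no longer matters.
    import time
    import libvirt

    def wait_for_libvirt(uri='qemu:///system', retries=30, delay=2):
        for attempt in range(retries):
            try:
                return libvirt.open(uri)
            except libvirt.libvirtError:
                time.sleep(delay)
        raise RuntimeError('libvirtd did not become available')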

>
>
>>
>>>  Q. Again, what if quantum-agent itself were able to deal with actual
>>> ports without relying on nova-compute at all?
>>>  A. We could start quantum-agent, which would create all the necessary
>>> OVS taps in its own way, and then restart libvirtd, which would start all
>>> the VMs with the created OVS taps. This is a good example of how to make
>>> quantum truly independent of nova-compute, without any dependency on
>>> external-ids.
>>>
>>
>> I don't think this is the right approach. Answered above.
>>
>
> As I asked above, again: why? Is there any reason other than that my
> suggestion appears to be complex?
>
> Regards,
>
> -Jun
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>

