[openstack-dev] [Neutron][kuryr] network control plane (libkv role)

Taku Fukushima tfukushima at midokura.com
Fri Nov 6 09:24:14 UTC 2015


Hi Vikas,

> Ideally libnetwork should not be able to sync network state with driver
> capability as "local". If that is the case, what is the purpose of having
> "capability" feature. How drivers(those having their own control plane)
> will be able to "mute" libnetwork. This will result in two "source of
> truth" in that case.

It might be the case with multiple Consul servers in a single datacenter or
in multiple datacenters. But with the stable Docker 1.9.0 I'm seeing the
behaviour (network state synced even though the capability is "local"),
although it's likely a bug.

Let me share the videos I recorded. I prepared hostA (192.168.11.14) and
hostB (192.168.11.18). Due to bad synchronization there's an orphan, the
network "test", on hostB; please ignore it. To reproduce the following
cases, I'd strongly recommend cleaning up the networks you created, and so
on, so as not to break the synchronization. In my experience the
coordination with Consul is very fragile at the moment. If something goes
wrong and you don't have important data in Consul, removing the files under
/tmp/consul and starting over from scratch might solve your problem.
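
The reset described above could be scripted roughly as follows. This is only
a sketch: reset_consul_data is a hypothetical helper name, it assumes the
Consul agent has already been stopped, and it deletes everything under the
directory it is given, so it only makes sense when nothing important is
stored in Consul.

```shell
#!/bin/sh
# Hypothetical helper: empty a Consul data directory so the agent can start
# from scratch. Assumes the agent is stopped and the data is disposable.
reset_consul_data() {
    dir="${1:?usage: reset_consul_data DIR}"
    # Refuse to touch anything that is not an existing directory.
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Remove the contents (including dotfiles) but keep the directory itself.
    find "$dir" -mindepth 1 -delete
}

# Example (not run here):
#   reset_consul_data /tmp/consul
```

After that the agent can be started again with the same command line as
before.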

Mohammad (banix) also tried it and I heard he successfully got the network
state synchronized with Consul as well.

A. Single Consul agent
First, I tried with a single Consul agent. This covers the case where
multiple Docker daemons are coordinated through a single Consul server in a
single datacenter.

https://drive.google.com/file/d/0BwURaz1ic-5tUDJ1NFBJU1Bjc00/view?usp=sharing

I launched a Consul agent as a server on hostA, which has 192.168.11.14.

  hostA$ consul agent -server -client=192.168.11.14 -bootstrap \
      -data-dir /tmp/consul -node=agent-one -bind=192.168.11.14
  hostA$ cat /etc/default/docker | grep ^DOCKER
  DOCKER_OPTS="-D -H unix:///var/run/docker.sock -H :2376 --cluster-store=consul://192.168.11.14:8500 --cluster-advertise=192.168.11.14:2376"

Then I configured another host, hostB, which has 192.168.11.18, as follows.

  hostB$ cat /etc/default/docker | grep ^DOCKER
  DOCKER_OPTS="-D -H unix:///var/run/docker.sock -H :2376 --cluster-store=consul://192.168.11.14:8500 --cluster-advertise=192.168.11.18:2376"

The capability of Kuryr is set to "local" and we still see the created
network on both hosts.
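
To check this without eyeballing the output on each host, something like the
following could be used. It is only a sketch: has_network is a hypothetical
helper, and it assumes the network name is the second column of the
"docker network ls" output (NETWORK ID, NAME, DRIVER).

```shell
#!/bin/sh
# Hypothetical helper: read `docker network ls` output on stdin and succeed
# if a network with the given name is listed.
# The header row is skipped; NAME is assumed to be the second field of
# every data row.
has_network() {
    awk -v name="$1" 'NR > 1 && $2 == name { found = 1 } END { exit !found }'
}

# Example (not run here), using the TCP socket the daemons listen on:
#   docker -H 192.168.11.14:2376 network ls | has_network test && echo "hostA: yes"
#   docker -H 192.168.11.18:2376 network ls | has_network test && echo "hostB: yes"
```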

B. Multiple Consul agents, the server and the client
Second, I added another Consul agent as a client on hostB and let it join
the Consul server on hostA. This covers the case where multiple Docker
daemons are coordinated through a Consul server and a Consul client in a
single datacenter.

https://drive.google.com/file/d/0BwURaz1ic-5tNTFtR3ZXRDZmM0k/view?usp=sharing

  hostB$ consul agent -client=192.168.11.18 -data-dir /tmp/consul \
      -node=agent-two -bind=192.168.11.18 -join=192.168.11.14
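
Before pointing Docker at the new agent, it can help to confirm that
agent-two has joined the cluster. A minimal sketch, assuming the Consul HTTP
API is reachable on port 8500 of the client address (as the --cluster-store
URLs in this mail imply); node_registered is a hypothetical helper that reads
the /v1/catalog/nodes response on stdin:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the JSON on stdin lists a node with the
# given name. This is a plain pattern match, not a real JSON parser.
node_registered() {
    grep -Eq "\"Node\"[[:space:]]*:[[:space:]]*\"$1\""
}

# Example (not run here):
#   curl -s http://192.168.11.18:8500/v1/catalog/nodes | node_registered agent-two
```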

Then I modified the configuration of the Docker daemon on hostB to point to
the newly added Consul agent on hostB.

  hostB$ cat /etc/default/docker | grep ^DOCKER
  DOCKER_OPTS="-D -H unix:///var/run/docker.sock -H :2376 --cluster-store=consul://192.168.11.18:8500 --cluster-advertise=192.168.11.18:2376"

To reflect the configuration change I restarted the Docker daemon. Then I
created a new network "multi" and it got synchronized on both hosts. The
capability was still set to "local" but both hosts saw the same network.
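
For reference, the capability in question is what the remote driver returns
from /NetworkDriver.GetCapabilities, a JSON body such as {"Scope": "local"}
according to the libnetwork remote driver document. A sketch for reading it
back from a running driver, where scope_of is a hypothetical helper and
DRIVER_URL is an assumption that has to be set to wherever the Kuryr driver
listens:

```shell
#!/bin/sh
# Hypothetical helper: extract the "Scope" value from a GetCapabilities
# response on stdin. Pattern-based, not a full JSON parser; prints nothing
# if no "Scope" key is present.
scope_of() {
    sed -n 's/.*"Scope"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example (not run here; DRIVER_URL is an assumption):
#   curl -s -X POST "$DRIVER_URL/NetworkDriver.GetCapabilities" | scope_of
```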

I may be doing something wrong or misunderstanding things. Please let me
know if that's the case. Also, I haven't tested multiple Consul servers
reaching consensus with Raft, nor Consul servers across multiple
datacenters, but they're supposed to work.

https://www.consul.io/docs/internals/architecture.html

Best regards,
Taku Fukushima

On Fri, Nov 6, 2015 at 5:48 PM, Vikas Choudhary <choudharyvikas16 at gmail.com>
wrote:

> @Taku,
>
> Ideally libnetwork should not be able to sync network state with driver
> capability as "local". If that is the case, what is the purpose of having
> "capability" feature. How drivers(those having their own control plane)
> will be able to "mute" libnetwork. This will result in two "source of
> truth" in that case.
>
> Thoughts?
>
>
> -Vikas
>
> On Fri, Nov 6, 2015 at 9:19 AM, Vikas Choudhary <
> choudharyvikas16 at gmail.com> wrote:
>
>> @Taku,
>>
>> Please have a look at this discussion. It is all about local and global
>> scope:
>> https://github.com/docker/libnetwork/issues/486
>>
>>
>> Plus, I used the same docker options as you mentioned. The fact that it
>> was working for networks created with the overlay driver makes me think it
>> was not a configuration issue. Only networks created with kuryr were not
>> getting synced.
>>
>>
>> Thanks
>> Vikas Choudhary
>>
>> On Fri, Nov 6, 2015 at 8:07 AM, Taku Fukushima <tfukushima at midokura.com>
>> wrote:
>>
>>> Hi Vikas,
>>>
>>> I thought the "capability" affected the propagation of the network state
>>> across nodes as well. However, in my environment, where I tried Consul and
>>> ZooKeeper, I observed a new network created in a host is displayed on
>>> another host when I hit "sudo docker network ls" even if I set the
>>> capability to "local", which is the current default. So I'm just wondering
>>> what this capability means. The spec doesn't say much about it.
>>>
>>>
>>> https://github.com/docker/libnetwork/blob/8d03e80f21c2f21a792efbd49509f487da0d89cc/docs/remote.md#set-capability
>>>
>>> I saw your bug report saying that the network state propagation didn't
>>> happen properly. I also experienced the issue and I'd say it is a
>>> configuration issue. Please try the following option. I put it in
>>> /etc/default/docker and manage the docker daemon through the "service"
>>> command.
>>>
>>> DOCKER_OPTS="-D -H unix:///var/run/docker.sock -H :2376
>>> --cluster-store=consul://192.168.11.14:8500 --cluster-advertise=
>>> 192.168.11.18:2376"
>>>
>>> The network is the only user-facing entity in libnetwork for now, since
>>> the concept of the "service" was abandoned in the stable Docker 1.9.0
>>> release, and it's shared by libnetwork through libkv across multiple hosts.
>>> Endpoint information is stored as a part of the network information, as you
>>> documented in the devref, and the network is all we need so far.
>>>
>>>
>>> https://github.com/openstack/kuryr/blob/d1f4272d6b6339686a7e002f8af93320f5430e43/doc/source/devref/libnetwork_remote_driver_design.rst#libnetwork-user-workflow-with-kuryr-as-remote-network-driver---host-networking
>>>
>>> Regarding changing the capability to "global", it totally makes sense
>>> and we should change it, even though the networks would be shared among
>>> multiple hosts anyway.
>>>
>>> Best regards,
>>> Taku Fukushima
>>>
>>>
>>> On Thu, Nov 5, 2015 at 8:39 PM, Vikas Choudhary <
>>> choudharyvikas16 at gmail.com> wrote:
>>>
>>>> Thanks Toni.
>>>> On 5 Nov 2015 16:02, "Antoni Segura Puimedon" <
>>>> toni+openstackml at midokura.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, Nov 5, 2015 at 10:47 AM, Vikas Choudhary <
>>>>> choudharyvikas16 at gmail.com> wrote:
>>>>>
>>>>>> ++ [Neutron] tag
>>>>>>
>>>>>>
>>>>>> On Thu, Nov 5, 2015 at 10:40 AM, Vikas Choudhary <
>>>>>> choudharyvikas16 at gmail.com> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> By network control plane i specifically mean here sharing network
>>>>>>> state across docker daemons sitting on different hosts/nova_vms in
>>>>>>> multi-host networking.
>>>>>>>
>>>>>>> libnetwork provides flexibility where vendors have a choice between
>>>>>>> network control plane to be handled by libnetwork(libkv) or remote driver
>>>>>>> itself OOB. Vendor can choose to "mute" libnetwork/libkv by advertising
>>>>>>> remote driver capability as "local".
>>>>>>>
>>>>>>> "local" is our current default "capability" configuration in kuryr.
>>>>>>>
>>>>>>> I have following queries:
>>>>>>> 1. Does it mean Kuryr is taking responsibility of sharing network
>>>>>>> state across docker daemons? If yes, network created on one docker host
>>>>>>> should be visible in "docker network ls" on other hosts. To achieve this, I
>>>>>>> guess kuryr driver will need help of some distributed data-store like
>>>>>>> consul etc. so that kuryr driver on other hosts could create network in
>>>>>>> docker on other hosts. Is this correct?
>>>>>>>
>>>>>>> 2. Why can't we set the default scope to "Global" and let libkv do the
>>>>>>> network state sync work?
>>>>>>>
>>>>>>> Thoughts?
>>>>>>>
>>>>>>
>>>>> Hi Vikas,
>>>>>
>>>>> Thanks for raising this. As part of the current work on enabling
>>>>> multi-node we should be moving the default to 'global'.
>>>>>
>>>>>
>>>>>>
>>>>>>> Regards
>>>>>>> -Vikas Choudhary
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> __________________________________________________________________________
>>>>>> OpenStack Development Mailing List (not for usage questions)
>>>>>> Unsubscribe:
>>>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>
>
>