<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Mar 23, 2020 at 11:48 AM &lt;<a href="mailto:mdulko@redhat.com">mdulko@redhat.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Sun, 2020-03-22 at 13:28 -0400, Hongbin Lu wrote:<br>

> Hi all,<br>

> <br>

> As we approach the end of the Ussuri cycle, I would like to take this<br>

> chance to introduce a new feature the Zun team implemented in this<br>

> cycle: CRI integration [1].<br>

> <br>

> As many of you know, Zun is the OpenStack container service. It<br>

> provides an API for users to create and manage application containers<br>

> in an OpenStack cloud. The main concepts in Zun are "Container" and<br>

> "Capsule". A container is a single container, while a capsule is a<br>

> group of co-located and co-scheduled containers (basically the same<br>

> as a k8s pod).<br>

> <br>

> The "Container" concept is probably the more widely used one. People can<br>

> use the /containers API endpoint to create and manage a single container.<br>

> Under the hood, a container is a Docker container on a compute node.<br>

> What is special is that each Docker container is given a Neutron port,<br>

> so the container is connected to a tenant network in Neutron.<br>

> Kuryr-libnetwork is the Docker network plugin we use to perform the<br>

> Neutron port binding, which basically connects the container to the<br>

> virtual switch managed by Neutron.<br>

> <br>

> As mentioned before, the concept of "Capsule" in Zun is basically the<br>

> same as a pod in k8s. We introduced this concept mainly for k8s<br>

> integration. Roughly speaking, the Zun-k8s integration is achieved by<br>

> (i) registering a special node in k8s, (ii) watching the k8s API for<br>

> pods being scheduled to this node and (iii) invoking Zun's /capsules<br>

> API endpoint to create a capsule for each incoming pod. Steps (i) and<br>

> (ii) are done by a CNCF sandbox project called Virtual Kubelet [2].<br>

> Step (iii) is achieved by providing an OpenStack provider [3] for<br>

> Virtual Kubelet. The special node registered by Virtual Kubelet is<br>

> called a virtual node because the node doesn't physically exist. Pods<br>

> scheduled to the virtual node are basically offloaded from the<br>

> current k8s cluster and eventually land on an external platform such<br>

> as an OpenStack cloud.<br>

> <br>

> At a high level, what is offered to end-users is a "serverless<br>

> kubernetes pod" [4]. This term basically means the ability to run<br>

> pods on demand without planning the capacity (i.e. nodes) upfront. An<br>

> example of that is AWS EKS on Fargate [5]. In comparison, the<br>

> traditional approach is to create an entire k8s cluster upfront in<br>

> order to run the workload. Let's take a simple example. Suppose you<br>

> want to run a pod. The traditional approach is to provision a k8s<br>

> cluster with a worker node and then run the pod on the worker node. In<br>

> contrast, the "serverless" approach is to create a k8s cluster<br>

> without any worker node, and the pod is offloaded to a cloud provider<br>

> that provisions the pods at runtime. This approach works well for<br>

> applications with fluctuating workloads, for which it is hard to<br>

> provision a cluster of the right size. Furthermore, from the cloud<br>

> provider's perspective, if all tenant users offload their pods to<br>

> the cloud, the cloud provider might be able to pack the workload<br>

> better (i.e. with fewer physical nodes), thus saving cost.<br>

> <br>

> Under the hood, a capsule is a pod sandbox with one or more containers<br>

> in a CRI runtime (e.g. containerd). Compared to Docker, a CRI runtime<br>

> has better support for the pod concept, so we chose it to implement<br>

> capsules. A caveat is that CRI requires a CNI plugin for<br>

> networking, so we needed to implement a CNI plugin for Zun (called<br>

> zun-cni). The role of the CNI plugin is similar to that of the<br>

> kuryr-libnetwork plugin we use for Docker, except that it implements a<br>

> different networking model (CNI). I summarize it below:<br>

<br>

Hi,<br>

<br>

I noticed that Zun's CNI plugin [1] is basically a simplified version<br>

of kuryr-kubernetes code. While it's totally fine you've copied that, I<br>

wonder what modifications have been made to make it suit Zun? Is there a<br>

chance to converge this to make Zun use kuryr-kubernetes directly so<br>

that we won't develop two versions of that code in parallel?<br></blockquote><div><br></div><div>Right. I also investigated the possibility of reusing the kuryr-kubernetes codebase. Some code is definitely common to the two projects. If we can move the common code to a library (e.g. kuryr-lib), Zun should be able to consume it directly. In particular, I am interested in directly consuming the CNI binding code (kuryr_kubernetes/cni/binding/) and the VIF versioned objects (kuryr_kubernetes/objects).</div><div><br></div><div>Most of the kuryr-kubernetes code is coupled with the "list-and-watch" logic against the k8s API, and Zun is not able to reuse those parts of the code. However, I do advocate moving all the common code to kuryr-lib so Zun can reuse it wherever appropriate.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Thanks,<br>

Michał<br>

<br>

[1] <a href="https://github.com/openstack/zun/tree/master/zun/cni" rel="noreferrer" target="_blank">https://github.com/openstack/zun/tree/master/zun/cni</a><br>

<br>

> +--------------+------------------------+---------------+<br>

> | Concept      | Container              | Capsule (Pod) |<br>

> +--------------+------------------------+---------------+<br>

> | API endpoint | /containers            | /capsules     |<br>

> | Engine       | Docker                 | CRI runtime   |<br>

> | Network      | kuryr-libnetwork (CNM) | zun-cni (CNI) |<br>

> +--------------+------------------------+---------------+<br>

> <br>

> Typically, a CRI runtime works well with Kata Container, which<br>

> provides hypervisor-based isolation between neighboring containers on<br>

> the same node. As a result, it is safe to consolidate pods from<br>

> different tenants onto a single node, which increases resource<br>

> utilization. For deployment, a typical stack looks like this:<br>

> <br>

> +----------------------------------------------+<br>

> | k8s control plane                            |<br>

> +----------------------------------------------+<br>

> | Virtual Kubelet (OpenStack provider)         |<br>

> +----------------------------------------------+<br>

> | OpenStack control plane (Zun, Neutron, etc.) |<br>

> +----------------------------------------------+<br>

> | OpenStack data plane                         |<br>

> | (Zun compute agent, Neutron OVS agent, etc.) |<br>

> +----------------------------------------------+<br>

> | Containerd (with CRI plugin)                 |<br>

> +----------------------------------------------+<br>

> | Kata Container                               |<br>

> +----------------------------------------------+<br>

> <br>

> In this stack, if a user creates a deployment or pod in k8s, the k8s<br>

> scheduler will schedule the pod to the virtual node registered by<br>

> Virtual Kubelet. Virtual Kubelet will pick up the pod and let the<br>

> configured cloud provider handle it. The cloud provider invokes the<br>

> Zun API to create a capsule. Upon receiving the API request to create<br>

> a capsule, the Zun scheduler will schedule the capsule to a compute node.<br>

> The Zun compute agent on that node will provision the capsule using a<br>

> CRI runtime (containerd in this example). Zun communicates with the<br>

> CRI runtime via gRPC over a Unix socket. The CRI runtime will first<br>

> create the pod with Kata Container, which realizes the pod as a<br>

> lightweight VM (runc is an alternative).<br>

> Furthermore, the CRI runtime will use a CNI plugin, which is the<br>

> zun-cni binary, to set up the network. The zun-cni binary is a thin<br>

> executable that dispatches the CNI command to a daemon service called<br>

> zun-cni-daemon. The communication is via HTTP within localhost. The<br>

> zun-cni-daemon will look up the Neutron port information from the DB<br>

> and perform the port binding.<br>

> <br>
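
> The dispatch step described above (a thin CNI executable forwarding the<br>

> command to a local daemon over HTTP) could be sketched roughly as below.<br>

> This is only an illustration, not the actual zun-cni code: the daemon<br>

> address, the URL paths and the payload layout are assumptions. Only the<br>

> CNI_* environment variables and the JSON config on stdin come from the<br>

> CNI specification.<br>

<pre>
```python
import json
import os
import sys
import urllib.request

# Hypothetical address of the local daemon; the real host, port and URL
# paths used by zun-cni-daemon are not given in this thread.
DAEMON_URL = "http://127.0.0.1:9036"

# Environment variables that the CNI specification passes to a plugin.
CNI_ENV_VARS = ("CNI_COMMAND", "CNI_CONTAINERID", "CNI_NETNS",
                "CNI_IFNAME", "CNI_ARGS", "CNI_PATH")


def build_request(environ, stdin_text):
    """Collect the CNI parameters into one JSON-serializable dict."""
    params = {name: environ.get(name, "") for name in CNI_ENV_VARS}
    # The runtime writes the network configuration as JSON on stdin.
    params["config"] = json.loads(stdin_text) if stdin_text.strip() else {}
    return params


def dispatch(params, base_url=DAEMON_URL):
    """POST the parameters to the daemon and return its response body."""
    # Assumed routing: one URL path per CNI command (ADD, DEL, ...).
    url = "{}/{}".format(base_url, params["CNI_COMMAND"].lower())
    request = urllib.request.Request(
        url,
        data=json.dumps(params).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def main():
    """Entry point the container runtime would execute for each CNI call."""
    params = build_request(os.environ, sys.stdin.read())
    # A CNI plugin reports its result to the runtime on stdout.
    sys.stdout.write(dispatch(params))
```
</pre>

> In this sketch the executable keeps no state of its own. Anything that<br>

> needs the DB or Neutron stays in the long-running daemon.<br>

> <br>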

> In conclusion, starting from Ussuri, Zun adds support for<br>

> CRI-compatible runtimes. Zun uses a CRI runtime to realize the concept<br>

> of a pod. Using this feature together with Virtual Kubelet and Kata<br>

> Container, we can offer a "serverless kubernetes pod" service that is<br>

> comparable to AWS EKS on Fargate.<br>

> <br>

> [1] <a href="https://blueprints.launchpad.net/zun/+spec/add-support-cri-runtime" rel="noreferrer" target="_blank">https://blueprints.launchpad.net/zun/+spec/add-support-cri-runtime</a><br>

> [2] <a href="https://github.com/virtual-kubelet/virtual-kubelet" rel="noreferrer" target="_blank">https://github.com/virtual-kubelet/virtual-kubelet</a><br>

> [3] <a href="https://github.com/virtual-kubelet/openstack-zun" rel="noreferrer" target="_blank">https://github.com/virtual-kubelet/openstack-zun</a><br>

> [4] <a href="https://aws.amazon.com/about-aws/whats-new/2019/12/run-serverless-kubernetes-pods-using-amazon-eks-and-aws-fargate/" rel="noreferrer" target="_blank">https://aws.amazon.com/about-aws/whats-new/2019/12/run-serverless-kubernetes-pods-using-amazon-eks-and-aws-fargate/</a><br>

> [5] <a href="https://aws.amazon.com/blogs/aws/amazon-eks-on-aws-fargate-now-generally-available/" rel="noreferrer" target="_blank">https://aws.amazon.com/blogs/aws/amazon-eks-on-aws-fargate-now-generally-available/</a><br>

<br>

<br>

</blockquote></div></div>