[Magnum] Virtual PTG planning

feilong feilong at catalyst.net.nz
Sun Dec 1 20:43:43 UTC 2019


But the good thing is I still have the tab open on my laptop, so here
you are:


  Magnum Ussuri Virtual PTG


Two days:

  * Thursday   28 November 0900-1100 UTC

  * Wednesday   4 December 0900-1100 UTC


    Attendees:

    flwang
    brtknr
    strigazi


    Topics:


      - Containerized master

  * - https://drive.google.com/open?id=10Qx-BQwv-JSXSSDTdFOQI_PnU2tL4JM9

  * - How customisable is the master from the user's point of
    view? Lingxian has done a
    PoC https://github.com/lingxiankong/magnum/tree/k8s-on-k8s just FYI;
    it does need more work, but it should give you an idea.

  * - Catalyst Cloud cares about this feature as a public cloud
    provider; how about CERN and StackHPC?

  * - It looks interesting, but at the moment we won't benefit very much
    from such a drastic change. What about a new driver? See this
    link https://github.com/lingxiankong/magnum/tree/k8s-on-k8s. It will
    be a new driver, and users can choose which kind of cluster to
    create by using a different driver/template.

  * go for it

  * - Is there demand for this feature from you as a cloud provider, or
    is this mirroring the GCP model? Not just mirroring; this feature
    can dramatically reduce the cluster cost, since users won't have to
    pay for the master nodes. Does this allow a master to be shared
    between multiple clusters? No, there will be a root cluster managed
    by the cloud provider, and users' cluster master nodes will run on
    the root cluster as pods (a rough sketch of the idea follows this
    list). So yes :) it is consolidation of resources. No, it's
    different; it's not a master being shared. Still disagree, but OK,
    it is a detail of the description. Right, we don't have to go too
    far in this session. From our POV, there is a greater demand for
    master resize. With this approach, master resizing will be easy. Do
    you have a diagram or just code? Download
    this https://drive.google.com/open?id=10Qx-BQwv-JSXSSDTdFOQI_PnU2tL4JM9.
    We are not opposed to this, it's a cool idea, but we would like to
    see it in action. Was there a live demo at the summit? Yes. Cool,
    I'll wait for the videos...

  * There are things missing, like the master IP. I (love? irony) that
    it is on Ubuntu, so fully incompatible with what we have now. It's
    just a PoC; we can switch to FC30, that's not a problem I
    think. Lingxian did this with Ubuntu just because he's more familiar
    with Ubuntu and Ansible; that doesn't mean we will go for this
    eventually.

  * - Catalyst Cloud will slowly and carefully upstream this.

  * I cannot guarantee we will be on board. That is why we have
    drivers. Totally understood.
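
A minimal sketch of the idea, purely as an illustration (this is not
Lingxian's PoC and not an agreed Magnum driver; the namespace, image and
flags below are all hypothetical): each tenant cluster's control-plane
components run as ordinary Deployments in a per-cluster namespace on the
operator-managed root cluster, created here with the kubernetes Python
client.

    from kubernetes import client, config

    config.load_kube_config()  # kubeconfig of the operator-managed root cluster
    apps = client.AppsV1Api()

    namespace = "tenant-cluster-abc123"  # hypothetical per-tenant-cluster namespace

    # kube-apiserver for one tenant cluster, running as a pod on the root cluster.
    apiserver = client.V1Deployment(
        metadata=client.V1ObjectMeta(name="kube-apiserver", namespace=namespace),
        spec=client.V1DeploymentSpec(
            replicas=1,
            selector=client.V1LabelSelector(match_labels={"component": "kube-apiserver"}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"component": "kube-apiserver"}),
                spec=client.V1PodSpec(containers=[
                    client.V1Container(
                        name="kube-apiserver",
                        image="k8s.gcr.io/kube-apiserver:v1.16.3",  # illustrative version
                        command=["kube-apiserver",
                                 "--etcd-servers=https://etcd.%s.svc:2379" % namespace,
                                 "--service-cluster-ip-range=10.254.0.0/16"],
                    )]),
            ),
        ),
    )
    apps.create_namespaced_deployment(namespace=namespace, body=apiserver)

Resizing or rebuilding such a master then means editing a Deployment
rather than touching Nova VMs, which is where the "master resize becomes
easy" argument above comes from.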


      - Cinder CSI - remove in-tree Cinder

      Similar work in
kubespray: https://github.com/kubernetes-sigs/kubespray/pull/5184/files
      This is a small patch, no? Is anyone depending on Cinder? Let's
take a patch. Catalyst Cloud is using Cinder; I think CERN and StackHPC
are not, right?
      Yes, looks like it from the kubespray PR.
      Caveat: for the moment, only Cinder v3 is supported by the CSI
driver. Not sure if this is a blocker? Is this the block-storage
endpoint? For the master branch, I don't think this is a blocker. OK, so
I will propose a patch for CSI support. Yes. I am also happy to work on
the patch next week. Sold, it's all yours. That was easy... +1
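
For reference, a minimal sketch of what the change means on the
Kubernetes side (assuming the cloud-provider-openstack Cinder CSI driver
is already deployed in the cluster; the class name and parameters below
are illustrative): storage classes stop using the in-tree
kubernetes.io/cinder provisioner and point at the CSI driver instead.

    from kubernetes import client, config

    config.load_kube_config()
    storage_api = client.StorageV1Api()

    # The in-tree provisioner was "kubernetes.io/cinder"; the CSI driver
    # registers itself as "cinder.csi.openstack.org".
    csi_class = client.V1StorageClass(
        metadata=client.V1ObjectMeta(name="csi-cinder"),  # illustrative name
        provisioner="cinder.csi.openstack.org",
        parameters={"availability": "nova"},              # example AZ parameter
        reclaim_policy="Delete",
    )
    storage_api.create_storage_class(body=csi_class)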


      - All addons in a helm chart

    - Traefik - this one should be easy. We need an owner for this. CERN
can take this. Wonderful.
    - What else?
    - autoscaler
    - OCCM
    - keystone webhook
    - kubernetes dashboard
    - flannel
    - calico
    - Isn't kustomize the cool kid on the block these days? There are no
community-ready recipes for this.
    As for Helm, personally I'd like to see a refactoring so that we can
remove those shell scripts
from https://github.com/openstack/magnum/tree/master/magnum/drivers/common/templates/kubernetes/helm Do
you think we can drop those scripts by keeping only one and using a loop
to install the charts? It won't be a loop, it will be one umbrella
chart. I don't care TBH, but I don't want to see us using Helm and still
maintaining that many bash scripts, which makes it hard for me to see
the value of switching to Helm compared with the old way and its tons of
bash scripts. One chart to rule them all. That would be fantastic; will
CERN take this? ;)
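
Purely to illustrate the "one umbrella chart" idea (not an agreed
layout; the chart name, versions, repository URL and paths below are
hypothetical): a single parent chart lists every addon as a subchart
dependency, so one helm call replaces the per-addon shell scripts.

    import os
    import subprocess
    import yaml

    addons = ["traefik", "cluster-autoscaler",
              "openstack-cloud-controller-manager", "k8s-keystone-auth",
              "kubernetes-dashboard"]

    # Hypothetical umbrella chart: each addon is a subchart that can be
    # enabled/configured from a single values file.
    umbrella = {
        "apiVersion": "v2",       # Helm 3 style: dependencies live in Chart.yaml
        "name": "magnum-addons",  # hypothetical chart name
        "version": "0.1.0",
        "dependencies": [
            {"name": a, "version": "1.0.0",                   # placeholder versions
             "repository": "https://example.org/charts",      # hypothetical repo
             "condition": a + ".enabled"}
            for a in addons
        ],
    }

    os.makedirs("magnum-addons", exist_ok=True)
    with open("magnum-addons/Chart.yaml", "w") as f:
        yaml.safe_dump(umbrella, f)

    # One release to rule them all, replacing the per-addon bash scripts.
    subprocess.run(["helm", "upgrade", "--install", "magnum-addons",
                    "./magnum-addons", "--namespace", "kube-system",
                    "--values", "addons-values.yaml"], check=True)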


      - Support for containerd?

      https://review.opendev.org/#/c/695210/ CERN has this. I'm excited
about this! It will allow us to run Kata containers. +1 +100 I'm talking
about Kata tonight in London, anyone fancy
coming? https://www.meetup.com/OpenInfra-London/events/266230014/?rv=wm1&_xtd=gatlbWFpbF9jbGlja9oAJDExYzZhNjcwLTRjMWItNDc3OC05NTM2LWU2MDhiMTQxZWY2MQ&_af=event&_af_eid=266230014 Buy
me a flight ticket from Wellington to London please - sure, I can email
you my bank account details later.
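
For the record, this is roughly how it would surface to users (hedged:
the label name assumes the patch keeps a container_runtime label, and
the template name, image and flavor here are made up):

    import subprocess

    # Create a cluster template whose nodes use containerd instead of Docker.
    subprocess.run(["openstack", "coe", "cluster", "template", "create",
                    "k8s-containerd",                 # hypothetical template name
                    "--image", "fedora-coreos-31",    # hypothetical Glance image
                    "--external-network", "public",
                    "--flavor", "m1.medium",
                    "--coe", "kubernetes",
                    "--labels", "container_runtime=containerd"], check=True)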


      - Master resize

      Another bonus we may get from this is dropping the discovery
service dependency (what is this?). If we want to support master
resizing, that probably means we will use a token to bootstrap the etcd
cluster, and that way we can drop the dependency on
https://discovery.etcd.io (a generic bootstrap sketch follows below).
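
For context, the difference is roughly the following (a generic etcd
illustration, not Magnum's actual heat template): with static,
token-based bootstrap the member list is rendered locally, so there is
no --discovery flag and no call to discovery.etcd.io, and resizing the
masters just means re-rendering these flags.

    def etcd_static_bootstrap_args(member_name, peer_urls, token="magnum-etcd"):
        """Build etcd flags for static (token-based) bootstrap.

        peer_urls maps member name -> peer URL, e.g.
        {"master-0": "https://10.0.0.10:2380",
         "master-1": "https://10.0.0.11:2380"}.
        """
        initial_cluster = ",".join("%s=%s" % (n, u) for n, u in peer_urls.items())
        return [
            "--name", member_name,
            "--initial-advertise-peer-urls", peer_urls[member_name],
            "--listen-peer-urls", peer_urls[member_name],
            "--initial-cluster", initial_cluster,
            "--initial-cluster-token", token,   # distinguishes this cluster's members
            "--initial-cluster-state", "new",
        ]
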
      Isn't this irrelevant given the containerized master? Yes and no.
With the containerized master, master resizing would be easy, but we
still want to allow users to resize VM-based masters. Catalyst needs
this? We do have customers who have asked for this. +1. I would say
public cloud customers are pickier than private cloud users. You guys
can argue about that :) They are not. Try having customers who can ask
for anything and know it is free. LOL.
      After discussing the containerized master, we (CERN) will take a
look into rebuilding the master and moving the master to the OPS
tenant. Not sure how far we will take the resizing. Into the OPS tenant,
but still as VMs? Yes. OK. Then what's the benefit? Users don't touch or
see the VMs; OPS have access to them and can do interventions. The
change is small: "don't create the stack in the user's tenant, but in
OPS". OK, I can see the point from CERN's PoV. You don't benefit from
this? Not much, because users would still need to pay (I don't think you
saw what I said) for the VMs, and it makes billing harder. The master
will be owned by the OPS project, so how is the user being charged? But
the master nodes will still occupy resources, no? The containerized
master cluster is not occupying resources? It is, but far less than
several c2r4 VMs. How come? If you need 4 cores in VMs, you still need
cores in the k8s cluster; overcommit the VMs if needed. But as pods, we
can use a small 'flavor' for master nodes and have more of them for
HA. I don't see it, sorry; maybe it is your billing. From the resources
perspective it is almost the same. Yes, you raised a good point, I will
calculate it again. That said, there is still value in the containerized
master.


      - CLI

      At the moment, `openstack coe cluster list` as an admin shows all
clusters from all projects. When you try to do `openstack coe cluster
config project_from_another_tenant`, you hit a 404. This does not seem
right. This is 100% not an upstream bug.

      (os) [bkunwar at cloudadmin000 ~]$ openstack coe cluster config
k8s-radarad-2 --dir ~/.kube/ --force
      Cluster 20b15ecd-9748-433a-b52c-09c2bbf7f603 could not be found (HTTP
404) (Request-ID: req-d9eaddf5-ef46-449a-b622-c6da7e26ecf3)

Ah, OK, that makes sense; it is a feature. You can change the policy or
add the admin to all projects. Yes, by default, as admin, you shouldn't
get the right to access anyone else's cluster (say, a customer's
cluster). OK, understood.


      - Replace heapster with metrics-server. Isn't this already done?

      https://github.com/kubernetes-retired/heapster/blob/master/docs/deprecation.md Isn't
this done? Heapster is running, but metrics-server is what the `kubectl
top` command uses. Just drop heapster? Yep, we should drop heapster and
probably install metrics-server by default. Thoughts? OK, +2. I will
take this.
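
A quick sketch of what metrics-server provides (assumes a working
kubeconfig and metrics-server already running in the cluster): it serves
the metrics.k8s.io API that `kubectl top` reads, which is easy to query
with the kubernetes Python client.

    from kubernetes import client, config

    config.load_kube_config()
    metrics_api = client.CustomObjectsApi()

    # The same data `kubectl top nodes` shows, straight from metrics-server.
    node_metrics = metrics_api.list_cluster_custom_object(
        group="metrics.k8s.io", version="v1beta1", plural="nodes")

    for item in node_metrics["items"]:
        usage = item["usage"]
        print(item["metadata"]["name"], usage["cpu"], usage["memory"])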


      - Magnum UI

     Catalyst Cloud has just made some big improvements to the Magnum UI
and we would like to contribute them back. I will add you guys as
reviewers. Just a heads-up.
     Cool! I have just figured out how to look at the magnum-ui plugin
on a dev Horizon, so it will be fun to put that to use.


      - Worker node upgrade by replacement

  * A new "upgrade in progress" or "reconciling" cluster status

  * Use the resize API to drop nodes from one nodegroup to another
    (one-by-one / batch); see the sketch after this list.

  * Do this from inside the cluster, with a daemon/pod running on the
    master, so you can drain first. That would be great. We could extend
    the magnum-auto-healer into a magnum-controller.

  * Maybe a new controller first, then merge them if needed? Also works for me.

  * Would this work for upgrade from Atomic to CoreOS?

  * Not initially, though it could be possible. But when I started to
    support both Atomic and CoreOS in the same driver, we (who? I think
    me, you and feilong, it was on IRC) changed our minds.

  * I am in two minds about this - on one hand, clusters are
    disposable... they are not pets. Every elaborate service I know of
    on k8s never upgrades, only replaces: many small clusters. This is
    the most stable pattern. You just need to figure out the proxy in
    front of the cluster.

  * Haven't tried this, but if a new nodegroup has a different image
    specified and it uses the fedora-coreos label instead of
    fedora-atomic, what would happen? It could work. Cool, this should
    be good enough.
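
A minimal sketch of the drain-then-resize flow discussed above (the
cluster, nodegroup and node names are hypothetical, batching and error
handling are left out, and the exact CLI flags should be checked against
your client version):

    import subprocess

    CLUSTER = "k8s-cluster-1"           # hypothetical cluster name
    NODEGROUP = "default-worker"        # hypothetical nodegroup name
    NODE_NAME = "k8s-cluster-1-node-0"  # k8s node to be replaced
    SERVER_UUID = "<nova-server-uuid>"  # Nova UUID of that node
    NEW_NODE_COUNT = "2"                # nodegroup size after removal

    # 1. Drain the node so workloads move off before it disappears.
    subprocess.run(["kubectl", "drain", NODE_NAME, "--ignore-daemonsets"],
                   check=True)

    # 2. Ask Magnum to shrink the nodegroup, removing exactly this server.
    subprocess.run(["openstack", "coe", "cluster", "resize",
                    "--nodegroup", NODEGROUP,
                    "--nodes-to-remove", SERVER_UUID,
                    CLUSTER, NEW_NODE_COUNT], check=True)

    # A replacement worker would then be added to the target nodegroup with
    # another resize call.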


      - Removing unused drivers and labels

   - e.g. kube_version... what is this for? Nothing but the values shown
in cluster show.
   - Do we still need a CoreOS driver? The Fedora Ironic driver? Anyone
using the Mesos and DC/OS drivers? We don't use swarm_fedora_atomic_v1;
v2 for us. Does v2 still work? We can call for owners/maintainers for
each inactive driver, and if no maintainer volunteers we can move them
into "contrib"; thoughts? Moving them to contrib would not lighten the
code base; I was thinking of culling them. Can't remember the last time
anyone asked a question about these drivers. +1? I don't know. We can
revisit this one later; it seems we don't have a quick conclusion. 👍
   


    - ACTIONS

  * - *Containerized master* - Catalyst

  * - *Cinder CSI - remove in-tree Cinder* - StackHPC, do this with helm
    from the start ✅

  * - *All addons in a helm chart* - CERN will look at this, StackHPC
    can help with... also bump up chart versions and check compatibility.

  * - *Master resize* - Catalyst, as part of the containerized
    solution? I will start with a spec, maybe separate

  * - *Worker node replacement* - CERN

  * - *Magnum UI* - Catalyst

  * - *Containerd* - CERN

  * - *Drop Heapster* - Catalyst

On 2/12/19 9:41 AM, Feilong Wang wrote:
>
> I can't load it, the page is always in Loading status.
>
>
> On 2/12/19 2:49 AM, Spyros Trigazis wrote:
>> Hello,
>>
>> the etherpad is broken for me. I think an emoji did it. I have seen
>> that in the past.
>> Infra team resurrected it.
>>
>> Is it working for you? 
>>
>> Cheers,
>> Spyros
>>
>> On Mon, Nov 25, 2019 at 9:48 PM Feilong Wang <feilong at catalyst.net.nz
>> <mailto:feilong at catalyst.net.nz>> wrote:
>>
>>     Hi team,
>>
>>     After discussed with other team members, the virtual PTG is
>>     schedule on:
>>
>>     1st Session:  28th Nov 9:00AM-11:00AM UTC
>>
>>     2nd Session: 4th Dec 9:00AM-11:00AM UTC
>>
>>     Please add your topics on
>>     https://etherpad.openstack.org/p/magnum-ussuri-virtual-ptg-planning
>>     Thanks.
>>
>>
>>     On 19/11/19 10:46 AM, Feilong Wang wrote:
>>     > Hi team,
>>     >
>>     > As we discussed on last weekly team meeting, we'd like to have
>>     a virtual
>>     > PTG before the Xmas holiday to plan our work for the U release. The
>>     > general idea is extending our current weekly meeting time from
>>     1 hour to
>>     > 2 hours and having 2 sessions with total 4 hours. My current
>>     proposal is
>>     > as below, please reply if you have question or comments. Thanks.
>>     >
>>     > Pre discussion/Ideas collection:   20th Nov  9:00AM-10:00AM UTC
>>     >
>>     > 1st Session:  27th Nov 9:00AM-11:00AM UTC
>>     >
>>     > 2nd Session: 4th Dec 9:00AM-11:00AM UTC
>>     >
>>     >
>>     -- 
>>     Cheers & Best regards,
>>     Feilong Wang (王飞龙)
>>     Head of R&D
>>     Catalyst Cloud - Cloud Native New Zealand
>>     --------------------------------------------------------------------------
>>     Tel: +64-48032246
>>     Email: flwang at catalyst.net.nz <mailto:flwang at catalyst.net.nz>
>>     Level 6, Catalyst House, 150 Willis Street, Wellington
>>     --------------------------------------------------------------------------
>>
>>
>>
> -- 
> Cheers & Best regards,
> Feilong Wang (王飞龙)
> Head of R&D
> Catalyst Cloud - Cloud Native New Zealand
> --------------------------------------------------------------------------
> Tel: +64-48032246
> Email: flwang at catalyst.net.nz
> Level 6, Catalyst House, 150 Willis Street, Wellington
> -------------------------------------------------------------------------- 

-- 
Cheers & Best regards,
Feilong Wang (王飞龙)
------------------------------------------------------
Senior Cloud Software Engineer
Tel: +64-48032246
Email: flwang at catalyst.net.nz
Catalyst IT Limited
Level 6, Catalyst House, 150 Willis Street, Wellington
------------------------------------------------------ 
