[openstack-dev] [magnum] supported OS images and magnum spawn failures for Swarm and Kubernetes

Tobias Urdin tobias.urdin at binero.se
Thu Aug 23 12:46:43 UTC 2018


Thanks for all of your help everyone,

I've been busy with other things but was able to pick up where I left off
regarding Magnum.
After fixing some issues I have been able to provision a working
Kubernetes cluster.

I'm still having issues getting Docker Swarm working. I've tried
both Docker and flannel as the networking layer, but
neither works. After investigating, the issue seems to be that
etcd.service is not installed (the unit file doesn't exist), so the master
doesn't work. The minion swarm node is provisioned but cannot join the
cluster because there is no etcd.

Anybody seen this issue before? I've been digging through all cloud-init 
logs and cannot see anything that would cause this.
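
For anyone who wants to confirm the missing unit file on their own master
node, here is a minimal sketch. It only looks for the file in the standard
systemd unit directories; the helper name find_unit_files is made up for
this example and is not part of Magnum, and the image in use may keep
units somewhere else.

```python
from pathlib import Path

# Standard systemd unit search directories. This list is an assumption
# about the image; adjust it if the distribution uses other locations.
UNIT_DIRS = (
    "/etc/systemd/system",
    "/run/systemd/system",
    "/usr/lib/systemd/system",
    "/lib/systemd/system",
)


def find_unit_files(unit_name, unit_dirs=UNIT_DIRS):
    """Return the paths under unit_dirs where unit_name exists as a file."""
    return [
        str(Path(directory) / unit_name)
        for directory in unit_dirs
        if (Path(directory) / unit_name).is_file()
    ]


if __name__ == "__main__":
    found = find_unit_files("etcd.service")
    if found:
        print("etcd.service found at:", ", ".join(found))
    else:
        print("etcd.service not found in any searched directory")
```

An empty result only says the file is absent from those directories; it
does not say which provisioning step was supposed to install it.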

I also have another, separate issue: when provisioning using the
magnum-ui in Horizon and selecting Ubuntu with Mesos, I get the error
"The Parameter (nodes_affinity_policy) was not provided". The
nodes_affinity_policy option does have a default value in magnum.conf, so I'm starting
to think this might be an issue with the magnum-ui dashboard.
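
To double-check what magnum.conf actually sets for this option, a small
sketch like the following could be used. It searches every section instead
of assuming which one holds the option; find_option is a hypothetical
helper, and it reads the file with Python's configparser rather than
oslo.config, so it only sees values written explicitly in the file, not
defaults defined in code.

```python
import configparser


def find_option(conf_path, option):
    """Return {section: value} for every section of conf_path that sets option."""
    parser = configparser.ConfigParser(
        interpolation=None,                # treat '%' characters literally
        strict=False,                      # tolerate repeated sections/options
        default_section="__no_default__",  # keep [DEFAULT] as an ordinary section
    )
    with open(conf_path, encoding="utf-8") as handle:
        parser.read_file(handle)
    return {
        section: parser.get(section, option)
        for section in parser.sections()
        if parser.has_option(section, option)
    }


# Example (the path is an assumption about the deployment):
# print(find_option("/etc/magnum/magnum.conf", "nodes_affinity_policy"))
```

An empty result would mean the value is not set in the file at all, which
is worth knowing before blaming the dashboard.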

Best regards
Tobias

On 08/04/2018 06:24 PM, Joe Topjian wrote:
> We recently deployed Magnum and I've been making my way through 
> getting both Swarm and Kubernetes running. I also ran into some 
> initial issues. These notes may or may not help, but thought I'd share 
> them in case:
>
> * We're using Barbican for SSL. I have not tried with the internal 
> x509keypair.
>
> * I was only able to get things running with Fedora Atomic 27, 
> specifically the version used in the Magnum docs: 
> https://docs.openstack.org/magnum/latest/install/launch-instance.html
>
> Anything beyond that wouldn't even boot in my cloud. I haven't dug 
> into this.
>
> * Kubernetes requires a Cluster Template to have a label of 
> cert_manager_api=true set in order for the cluster to fully come up 
> (at least, it didn't work for me until I set this).
>
> As far as troubleshooting methods go, check the cloud-init logs on the 
> individual instances to see if any of the "parts" have failed to run. 
> Manually re-run the parts on the command-line to get a better idea of 
> why they failed. Review the actual script, figure out the variable 
> interpolation and how it relates to the Cluster Template being used.
>
> Eventually I was able to get clusters running with the stock 
> driver/templates, but wanted to tune them in order to better fit in 
> our cloud, so I've "forked" them. This is in no way a slight against 
> the existing drivers/templates nor do I recommend doing this until you 
> reach a point where the stock drivers won't meet your needs. But I 
> mention it because it's possible to do and it's not terribly hard. 
> This is still a work-in-progress and a bit hacky:
>
> https://github.com/cybera/magnum-templates
>
> Hope that helps,
> Joe
>
> On Fri, Aug 3, 2018 at 6:46 AM, Tobias Urdin <tobias.urdin at binero.se> wrote:
>
>     Hello,
>
>     I'm testing around with Magnum and have so far only had issues.
>     I've tried deploying Docker Swarm (on Fedora Atomic 27, Fedora
>     Atomic 28) and Kubernetes (on Fedora Atomic 27) and haven't been
>     able to get it working.
>
>     Running Queens, is there any information about supported images?
>     Is Magnum maintained to support Fedora Atomic still?
>     What is in charge of populating the certificates inside the
>     instances? This seems to be the root of all the issues. I'm
>     not using Barbican but the x509keypair driver;
>     is that the reason?
>
>     Perhaps I missed some documentation that x509keypair does not
>     support what I'm trying to do?
>
>     I've seen the following issues:
>
>     Docker:
>     * Master does not start and listen on TCP because of certificate
>     issues
>     dockerd-current[1909]: Could not load X509 key pair (cert:
>     "/etc/docker/server.crt", key: "/etc/docker/server.key")
>
>     * Node does not start with:
>     Dependency failed for Docker Application Container Engine.
>     docker.service: Job docker.service/start failed with result
>     'dependency'.
>
>     Kubernetes:
>     * Master etcd does not start because /run/etcd does not exist
>     ** When that is created it fails to start because of certificate
>     2018-08-03 12:41:16.554257 C | etcdmain: open
>     /etc/etcd/certs/server.crt: no such file or directory
>
>     * Master kube-apiserver does not start because of certificate
>     unable to load server certificate: open
>     /etc/kubernetes/certs/server.crt: no such file or directory
>
>     * Master heat script just sleeps forever waiting for port 8080 to
>     become available (kube-apiserver) so it can never kubectl apply
>     the final steps.
>
>     * Node does not even start and times out when Heat deploys it,
>     probably because master never finishes
>
>     Any help is appreciated; perhaps I've missed something crucial.
>     I've not tested Kubernetes on CoreOS yet.
>
>     Best regards
>     Tobias
>

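Regarding the quoted report that the master's Heat script sleeps forever
waiting for port 8080: when debugging by hand, a bounded wait makes the
failure visible instead of hanging. The sketch below is not Magnum's
script; wait_for_port and its timeout and interval parameters are
hypothetical names chosen for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed immediately; we only care that it opens.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example: give kube-apiserver two minutes to start listening.
# if not wait_for_port("127.0.0.1", 8080, timeout=120):
#     print("port 8080 never opened; check the kube-apiserver logs")
```

A False result here points back at whatever keeps kube-apiserver from
starting, such as the missing certificate files mentioned in the quoted
message.
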