On Mon, 2019-07-15 at 11:25 -0700, Dan Sneddon wrote:
This is my main question about this proposal. When TripleO was in its infancy, there wasn't a mechanism to create Neutron ports separately from the server, so we created a Nova Server resource that specified which network the port was on (originally there was only one port created, now we create additional ports in Neutron). This can be seen in the puppet/<role>-role.yaml file, for example:
resources:
  Controller:
    type: OS::TripleO::ControllerServer
    deletion_policy: {get_param: ServerDeletionPolicy}
    metadata:
      os-collect-config:
        command: {get_param: ConfigCommand}
        splay: {get_param: ConfigCollectSplay}
    properties:
      [...]
      networks:
        - if:
          - ctlplane_fixed_ip_set
          - network: ctlplane
            subnet: {get_param: ControllerControlPlaneSubnet}
            fixed_ip:
              yaql:
                expression: $.data.where(not isEmpty($)).first()
                data:
                  - get_param: [ControllerIPs, 'ctlplane', {get_param: NodeIndex}]
          - network: ctlplane
            subnet: {get_param: ControllerControlPlaneSubnet}
This has the side-effect that the ports are created by Nova calling Neutron rather than by Heat calling Neutron for port creation. We have maintained this mechanism even in the latest versions of THT for backwards compatibility. This would all be easier if we were creating the Neutron ctlplane port and then assigning it to the server, but that breaks backwards-compatibility.
This is indeed an issue that both nova-less and N=1 need to find a solution for. As soon as the Nova server resources are removed from a stack, the server and ctlplane port will be deleted. We lose track of which IP was assigned to which server at that point. I believe the plan in nova-less is to use the "protected" flag for Ironic nodes to ensure the baremetal node is not unprovisioned (destroyed), so the overcloud node will keep running. This, however, doesn't solve the problem of the ctlplane port being deleted. We need to ensure that the port is either not deleted, or that a new port is immediately created using the same IP address as before. If we don't, we will very likely have duplicate IP issues on the next scale-out.
How would the creation of the ctlplane port be handled in this proposal? If metalsmith is creating the ctlplane port, do we still need a separate Server resource for every node? If so, I imagine it would have a much smaller stack than what we currently create for each server. If not, would metalsmith create a port on the ctlplane as part of the provisioning steps, and then pass this port back? We still need to be able to support fixed IPs for ctlplane ports, so we need to be able to pass a specific IP to metalsmith.
The way nova-less works is that "openstack overcloud node provision" calls metalsmith to create a port and deploy the server. Once done, the data for the servers is placed in a heat environment file defining the 'DeployedServerPortMap' parameter etc., so that the already existing pre-deployed-server workflow[1] can be utilized. Using fixed IPs for ctlplane ports is possible with nova-less, but the interface to do so has changed, see [2].

[1] https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/d...
[2] https://specs.openstack.org/openstack/tripleo-specs/specs/stein/nova-less-de...
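
For illustration, an environment file carrying 'DeployedServerPortMap' for a single node might look roughly like the sketch below. This is not the exact output of "openstack overcloud node provision". The hostname (controller-0), the IP address, the CIDR, and the template path in the resource_registry entry are all assumed values, and the layout follows the pre-deployed-server workflow as I understand it.

```yaml
# Sketch only: hostname, addresses and template path are assumptions.
resource_registry:
  # Map the ctlplane port resource to the deployed-server port template,
  # so Heat takes the port data from DeployedServerPortMap instead of
  # having Nova create the port.
  OS::TripleO::DeployedServer::ControlPlanePort: /usr/share/openstack-tripleo-heat-templates/deployed-server/deployed-neutron-port.yaml

parameter_defaults:
  DeployedServerPortMap:
    # Key is "<hostname>-ctlplane" (hypothetical hostname here)
    controller-0-ctlplane:
      fixed_ips:
        - ip_address: 192.168.24.10
      subnets:
        - cidr: 192.168.24.0/24
      network:
        tags:
          - 192.168.24.0/24
```

Since the IP is recorded per hostname in this file, the IP-to-server mapping no longer depends on a Nova server resource existing in the stack.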