[openstack-dev] [octavia][upgrades] upgrade loadbalancer to new amphora image
Doug Wiegley
dougwig at parksidesoftware.com
Thu Jun 30 16:53:21 UTC 2016
> On Jun 30, 2016, at 7:15 AM, Ihar Hrachyshka <ihrachys at redhat.com> wrote:
>
>>
>> On 30 Jun 2016, at 01:16, Brandon Logan <brandon.logan at rackspace.com> wrote:
>>
>> Hi Ihar, thanks for starting this discussion. Comments in-line.
>>
>> After writing my comments in line, I might now realize that you're just
>> talking about documenting a way for a user to do this, and not have
>> Octavia handle it at all. If that's the case I apologize for my reading
>> comprehension, but I'll keep my comments in case I'm wrong. My brain is
>> not working well today, sorry :(
>
> Right. All the mechanisms needed to apply the approach are already in place in both Octavia and Neutron as of Mitaka. The question is mostly whether the team behind the project would endorse the alternative approach, in addition to whatever the implementation already does for failovers, by giving it space in the official docs. I don’t suggest that this approach be the only one documented, or that the octavia team needs to implement anything. [That said, it may be wise to look at providing some smart scripts on top of the neutron/octavia API that would implement the approach without putting the burden of multiple API calls onto users.]
I don’t have a problem documenting it, but I also wouldn’t personally want to recommend it.
We’re adding a layer of NAT, which has performance and HA implications of its own.
We’re adding FIPs, when the neutron advice for “simple nova-net like deployment” is provider nets and linuxbridge, which don’t support them.
Thanks,
doug
>
>>
>> Thanks,
>> Brandon
>>
>> On Wed, 2016-06-29 at 18:14 +0200, Ihar Hrachyshka wrote:
>>> Hi all,
>>>
>>> I was looking lately at upgrades for octavia images. This includes using new images for new loadbalancers, as well as for existing balancers.
>>>
>>> For the first problem, the amp_image_tag option that I added in Mitaka seems to do the job: all new balancers are created with the latest image that is tagged properly.
>>>
>>> As for balancers that already exist, the only way to get them to use a new image is to trigger an instance failure, which should rebuild the failed nova instance using the new image. AFAIU the failover process is not currently automated: the user has to set the corresponding port to DOWN and wait for the failover to be detected. I’ve heard there are plans to introduce a specific command to trigger a quick failover, which would streamline the process and shorten it, because the failover would be detected and processed immediately instead of waiting for the keepalived failure mode to occur. Is it on the horizon? Patches to review?
>>
>> Not that I know of and with all the work slated for Newton, I'm 99% sure
>> it won't be done in Newton. Perhaps Ocata.
>
> I see. Do we maybe want to provide a smart script that would help to trigger a failover with the neutron API? [detect the port id, set it to DOWN, …]
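
A minimal sketch of the kind of helper script suggested above: find the neutron port(s) attached to an amphora's nova instance and mark them administratively down. It assumes that "set it to DOWN" means clearing admin_state_up through the neutron port-update call, and that the relevant port is the one whose device_id is the amphora instance; the thread confirms neither. The function names and the update_port callable are hypothetical stand-ins for whatever neutron client is in use.

```python
from typing import Callable, Iterable, List, Mapping


def port_down_body() -> dict:
    """Request body for PUT /v2.0/ports/{port_id} that sets the port admin-down."""
    return {"port": {"admin_state_up": False}}


def find_instance_port_ids(ports: Iterable[Mapping], instance_id: str) -> List[str]:
    """Return the ids of all ports whose device_id is the given nova instance.

    Assumption: the port to take down is the one bound to the amphora instance.
    """
    return [p["id"] for p in ports if p.get("device_id") == instance_id]


def trigger_failover(
    ports: Iterable[Mapping],
    instance_id: str,
    update_port: Callable[[str, dict], None],
) -> List[str]:
    """Detect the port ids for an amphora instance and set each one to DOWN.

    update_port(port_id, body) is a hypothetical callable wrapping the
    neutron port-update request.
    """
    port_ids = find_instance_port_ids(ports, instance_id)
    if not port_ids:
        raise LookupError(f"no port found for instance {instance_id}")
    for port_id in port_ids:
        update_port(port_id, port_down_body())
    return port_ids
```

Passing the port list and the update call in as arguments keeps the sketch independent of any particular client library; failover detection itself is still left to Octavia.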
>
>>>
>>> While the approach seems rather promising and may be applicable for some environments, I have several concerns about the failover approach that we may want to address.
>>>
>>> 1. HA assumption. The approach assumes another node is running and available to serve requests while the instance is rebuilding. For non-HA amphoras that is not the case, so the image upgrade process incurs significant downtime.
>>>
>>> 2. Even if we have HA, the balancer cluster is degraded to a single node while the instance is rebuilding.
>>>
>>> 3. (minor) During the upgrade phase, instances that belong to the same HA amphora may run different versions of the image.
>>>
>>> What’s the alternative?
>>>
>>> One idea I was running with for some time is moving the upgrade complexity one level up. Instead of making Octavia aware of upgrade intricacies, allow it to do its job (load balance), while using a neutron floating IP resource to flip the switch from an old image to a new one. Let me elaborate.
>> I'm not sure I like the idea of tying this to floating IP as there are
>> deployers who do not use floating IPs. Then again, we are currently
>> depending on allowed address pairs which is also an extension, but I
>> suspect it's probably deployed in more places. I have no proof of this
>> though.
>
> I guess you already deduced that, but just for the sake of completeness: no, I don’t suggest that octavia ties its backend to FIPs. I merely suggest documenting the proposed approach as ‘yet another way of doing it’, at least until we tackle the first two concerns raised.
>
>>>
>>> Let’s say we have a load balancer LB1 that is running Image1. In this scenario, we assume that access to the LB1 VIP is proxied through a floating IP (FIP) that points to the LB1 VIP. Now, the operator has uploaded a new Image2 to the glance registry and tagged it for octavia usage. The user now wants to migrate the load balancer function to the new image. To achieve this, the user follows these steps:
>>>
>>> 1. create an independent clone of LB1 (let’s call it LB2) that has the exact same attributes (members) as LB1.
>>> 2. once LB2 is up and ready to process requests incoming to its VIP, redirect the FIP to the LB2 VIP.
>>> 3. now all new flows are immediately redirected to the LB2 VIP, with no downtime (for new flows) due to the atomic nature of the FIP update on the backend (we use iptables-save/iptables-restore to update FIP rules on the router).
>> Will this sever any existing connections? Is there a way to drain
>> connections? Or is that already done?
>
> Not sure. Hopefully conntrack entries still apply until you shut down the node or close all current sessions. I don’t know of a way to detect whether there are active sessions running. The safe fallback would be to give the old load balancer enough time for any connections to die (a day?) before deprovisioning it.
>
>>> 4. since LB1 is no longer handling any flows, we can deprovision it. LB2 is now the only balancer handling members.
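
The four steps above can be sketched as a small orchestration routine. This is only an illustration under stated assumptions: clone_lb, is_ready, update_fip and delete_old_lb are hypothetical callables wrapping the relevant neutron/octavia API calls, and drain_seconds models the "give connections time to die" fallback mentioned in the reply above. The only API detail relied on is the neutron floating IP update body, which re-associates a FIP by setting port_id.

```python
import time
from typing import Callable


def fip_retarget_body(vip_port_id: str) -> dict:
    """Request body for PUT /v2.0/floatingips/{fip_id} pointing the FIP at a port."""
    return {"floatingip": {"port_id": vip_port_id}}


def switch_over(
    clone_lb: Callable[[], str],          # hypothetical: creates LB2, returns its VIP port id
    is_ready: Callable[[str], bool],      # hypothetical: True once LB2 can serve traffic
    update_fip: Callable[[dict], None],   # hypothetical: sends the FIP update request
    delete_old_lb: Callable[[], None],    # hypothetical: deprovisions LB1
    drain_seconds: float,
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    # Step 1: clone LB1 into LB2 (same members and attributes).
    new_vip_port_id = clone_lb()

    # Step 2: wait until LB2 is ready; give up without touching the FIP on timeout.
    waited = 0.0
    while not is_ready(new_vip_port_id):
        if waited >= timeout_seconds:
            raise TimeoutError("clone not ready; FIP still points at the old balancer")
        sleep(poll_seconds)
        waited += poll_seconds

    # Step 3: flip the FIP so that new flows reach LB2.
    update_fip(fip_retarget_body(new_vip_port_id))

    # Step 4: let existing flows on LB1 drain, then deprovision it.
    sleep(drain_seconds)
    delete_old_lb()
    return new_vip_port_id
```

If the clone never becomes ready, the routine raises before the FIP is touched, so LB1 keeps serving traffic.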
>>>
>>> With that approach, 1) we provide consistent downtime expectations regardless of the amphora architecture chosen (HA or not); 2) we flip the switch when the clone is up and ready, so there is no degraded state for the balancer function; 3) all instances in an HA amphora run the same image.
>>>
>>> Of course, it won’t eliminate downtime for existing flows that may already be handled by the balancer function. That’s a limitation that I believe is shared by all approaches currently on the table.
>>>
>>> As a side note, the approach would work for other lbaas drivers, like namespaces, e.g. in case we want to update haproxy.
>>>
>>> Several questions in regards to the topic:
>>>
>>> 1. are there any drawbacks with the approach? can we consider it an alternative way of doing image upgrades that could find its way into official documentation?
>>
>> Echoing my comment above, being tightly coupled with floating IPs is a
>> drawback.
>>
>> Another way would be to make use of the allowed address pairs:
>> 1) spin up a clone of the amp cluster for a loadbalancer but don't bring
>> up the VIP IP Interface and don't start keepalived (or just prevent
>> garping)
>> 2) update the allowed address pairs for the clones to accept the vip IP
>> 3) bring the VIP IP interface up and start keepalived (or do a garp)
>> 4) stop keepalived on the old cluster, take the interface down
>> 5) deprovision old cluster.
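
Step 2 of the list above (updating the allowed address pairs on the clone's ports) can be sketched as follows. The helper name is hypothetical. The body shape follows the neutron allowed-address-pairs extension, where a port update replaces the whole list, so the existing pairs are carried over rather than dropped.

```python
from typing import Iterable, Mapping


def allowed_address_pairs_body(existing_pairs: Iterable[Mapping], vip_address: str) -> dict:
    """Request body for PUT /v2.0/ports/{port_id} that lets the port accept the VIP.

    existing_pairs is the port's current allowed_address_pairs list; it is
    copied so the caller's data is not modified.
    """
    pairs = [dict(p) for p in existing_pairs]
    # Only add the VIP if it is not already allowed on this port.
    if not any(p.get("ip_address") == vip_address for p in pairs):
        pairs.append({"ip_address": vip_address})
    return {"port": {"allowed_address_pairs": pairs}}
```

The remaining steps (bringing interfaces up or down, starting or stopping keepalived) happen inside the amphorae and are not covered by this sketch.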
>>
>> I feel bad things can happen between 3 and 4 though. This is just a
>> thought to play around with, I'm sure I'm not realizing some minute
>> details that may cause this to not work. Plus, it's a bit more involved
>> than the FIP solution you proposed.
>
> I think there is benefit in discussing how to make upgrades more atomic. Pairs are indeed something to consider, as they would allow us to proceed without introducing port replug in neutron.
>
> Anyway, that’s a lot more involved than either the FIP or the failover approach, and would take a lot of time to plan properly.
>
>>>
>>> 2. if the answer is yes, then how can I contribute the piece? should I sync with some other doc related work that I know is currently ongoing in the team?
>>>
>>> Ihar
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>