[openstack-dev] Re: [Quantum][LBaaS] Final Questions on Load Balancing APIs

Salvatore Orlando sorlando at nicira.com
Sat Dec 8 10:48:01 UTC 2012


Thanks for your replies.
More comments inline.

Salvatore

On 8 December 2012 04:46, Leon Cui <lcui at vmware.com> wrote:
> Hi Salvatore and Youcef,
>
> I added more comments inline…
>
>
>
> Thanks
>
> Leon
>
> From: Youcef Laribi [mailto:Youcef.Laribi at eu.citrix.com]
> Sent: December 7, 2012 18:31
> To: OpenStack Development Mailing List
> Subject: Re: [openstack-dev] [Quantum][LBaaS] Final Questions on Load Balancing
> APIs
>
>
>
> Hi Salvatore,
>
>
>
> Added the [LBaaS] tag to the subject line...
>
>
>
> 1) Pool Status
>
>
>
> Does a pool have a status of its own, or is the status of the pool a
> function of the state of its members? For example, is it possible that the
> pool is in ERROR state while all of its members are ACTIVE?
>
>
>
> Yes, usually the pool’s status is determined by that of its members. But
> there are cases where you try to update a property on the pool itself
> (like health monitors) and this is refused by the driver (for whatever
> reason); the pool will then be in ERROR status, which has nothing to do
> with the status of its members. In any case, this is a driver issue: if,
> for whatever reason, the driver returns an error for an operation on the
> pool, we should set the pool to this state in the DB.
>
>
>
> [Leon] I suggest that we decouple pool and member status from a management
> perspective. This is different from health status, where the pool's
> health_status is an aggregation of its members' health_status.
>
>

Thanks for the clarification. I am fine with having a separate status
on the pool; I just wanted to understand its semantics.

>
> 2) PENDING CREATE, PENDING UPDATE, and PENDING DELETE
>
>
>
> I am not discussing the rationale behind these states; however, one would
> expect that the plugin, via a driver, would perform the transition from a
> PENDING to a definite state. Should the currently proposed DB support class
> include methods for handling the 'state machine' for LB resources?
>
> Also, I cannot really understand PENDING DELETE in the way it's currently
> implemented, as we set the status to this value and then delete the resource
> from the DB immediately after. Is that intended behaviour?
>
>
>
> I'll let Leon answer, as I haven’t reviewed the code.
>
> [Leon] The current LBaaS DB class will provide a method to update the
> resource state (from pending to done). The plugin is responsible for moving
> the status according to its interaction with the driver. For instance, in
> the create_vip call, the plugin first writes the DB record with the
> pending_create state, then calls the driver to do the real configuration
> (which could be either sync or async), and finally updates the resource to
> created or error.

That kind of makes sense; this means that at some point driver calls
will appear in the code that I've been looking at.
Frankly, my opinion is that the status management code without the
relevant driver calls probably does not make a lot of sense.
Hence, we can either:
- postpone status management until drivers are available (and in the
meantime assume, for instance, that 'status' is UP in the current
module, which you can also regard as a no-op plugin);
- keep this class simple with no status management and subclass it
with another module which does status management (and possibly device
management, but that's another story);
- expose methods for managing this 'state machine' and let the driver
handle it, which might be the case if we end up in a situation in
which the behaviour across drivers is not exactly the same.

>
>
>
> It’s the same for the delete operation. I need to update the code a bit so
> that a delete call first updates the resource status to pending_delete, then
> calls the driver to do the real action, and finally removes the record from
> the DB.
>

Yes, I understand. It's just that I do not see those driver calls at
the moment, so that's been a bit confusing for me!
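
To make the flow concrete, here is a rough sketch of both the create and
delete paths described above. Everything in it is illustrative: the class,
helper and driver names are hypothetical, not existing Quantum APIs.

class FakeDriver(object):
    def create_vip(self, vip):
        pass  # real device configuration would happen here (sync or async)

    def delete_vip(self, vip_id):
        pass  # real device-side cleanup would happen here


class LbPluginSketch(object):
    def __init__(self):
        self.driver = FakeDriver()
        self._vips = {}  # stand-in for the real DB records

    def create_vip(self, context, vip):
        vip = dict(vip, status='PENDING_CREATE')
        self._vips[vip['id']] = vip      # 1) write the record as PENDING_CREATE
        try:
            self.driver.create_vip(vip)  # 2) driver does the real work
            vip['status'] = 'ACTIVE'     # 3a) success
        except Exception:
            vip['status'] = 'ERROR'      # 3b) failure
        return vip

    def delete_vip(self, context, vip_id):
        self._vips[vip_id]['status'] = 'PENDING_DELETE'  # 1) mark first
        self.driver.delete_vip(vip_id)                   # 2) driver cleanup
        del self._vips[vip_id]           # 3) remove from the DB only afterwards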

>
>
> 3) Operating on 'transient' or error resources
>
>
>
> I see that operations on resources which are in a PENDING or ERROR state
> will raise a StateInvalid error. More than a year ago we had the same
> discussion for Quantum core resources, and we concluded that API operations
> at the end of the day were just altering a DB model.
>
> This should be allowed regardless of the operational status of a resource.
> For instance, if a resource is in ERROR it might make sense for a user to set
> it administratively down; another example might be setting the LB method on
> a pool while the pool itself is in PENDING CREATE state. The subsequent
> change in the LB method will eventually be picked up by the driver and the
> underlying configuration updated. So why ask the user to poll on the
> resource (or subscribe to notifications that we do not have yet anyway)?
>
>
>
> The problem with accepting updates while the object is in a PENDING_* state
> is that we have to queue these operations on the driver while the first
> operation is still pending. And after this, if there is an error, it is
> difficult to convey to the user which of the last N operations has failed
> (or maybe several have). It is a simpler model to do one update at a time
> and wait until it either succeeds or fails before making another update.

The same discussion for core services happened over a year ago. We
ended up with a different model, as behaviour across plugins might be
different.
It seems there has already been plenty of discussion on this topic;
I'll therefore refrain from re-opening that can of worms.

Still, there is some concern about not having consistent behaviour across
the API, but we should probably defer this discussion and ensure
progress is made now.

>
>
>
> [Leon] As Youcef said, this is simpler for the plugin, since we don't need
> to queue pending operations. But I'd say that we should follow the common
> practice across all Quantum plugins. How do other plugins handle this?

Basically the core plugin (the only other one available at the
moment) just defines a set of possible 'states'.
The db_plugin (the equivalent of which is Leon's DB class for load
balancing) does not manage operational status at all; that is left to
the plugin.
Changes to the data model are always allowed, and no plugin has a
concept of operation queueing.
Some plugins rely on agents reacting to notifications and then
fetching updated records from the DB to perform the actual
configuration; other plugins act as a proxy, either REST or RPC.
In the former case there is no need to worry about concurrent
operations, whereas in the latter case we do not worry (or care,
depending on the point of view) about concurrent calls on the same
resource.
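
Schematically, the two styles might look like this; all class and
attribute names here are illustrative assumptions, not actual plugin code.

class AgentBasedPlugin(object):
    """Writes the DB, then notifies; agents later fetch the updated record
    and apply the configuration, so concurrent API calls simply coalesce
    into whatever the newest DB state is."""

    def __init__(self, notifier):
        self.notifier = notifier
        self._pools = {}  # stand-in for the real DB records

    def update_pool(self, context, pool_id, pool):
        self._pools[pool_id] = dict(pool)
        self.notifier.pool_updated(context, self._pools[pool_id])  # async fan-out
        return self._pools[pool_id]


class ProxyPlugin(object):
    """Forwards each call synchronously (REST or RPC) to a backend;
    concurrent calls on the same resource are not coordinated."""

    def __init__(self, backend_client):
        self.backend_client = backend_client
        self._pools = {}

    def update_pool(self, context, pool_id, pool):
        self._pools[pool_id] = dict(pool)
        self.backend_client.update_pool(self._pools[pool_id])  # blocking call
        return self._pools[pool_id]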

Also, I saw that the status is updated within a transaction that finishes
only when the operation is complete. This means that concurrent API calls
won't see the updated status (e.g. PENDING CREATE), and I guess they
won't receive the StateInvalid error either. I haven't tested it, but this
is something worth investigating.
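
To illustrate the concern, here is a sketch using the usual
session.begin() pattern; the function signatures are made up for the
example and this is not the code under review.

def create_vip_status_invisible(context, driver, vip):
    # Problematic pattern: the driver call runs inside the same transaction
    # that writes PENDING_CREATE, so concurrent API calls cannot observe
    # that status (and may never hit StateInvalid) until everything commits.
    with context.session.begin(subtransactions=True):
        vip.status = 'PENDING_CREATE'
        driver.create_vip(vip)
        vip.status = 'ACTIVE'

def create_vip_status_visible(context, driver, vip):
    # Alternative: commit PENDING_CREATE first, call the driver outside the
    # transaction, then record the final status in a second transaction.
    with context.session.begin(subtransactions=True):
        vip.status = 'PENDING_CREATE'
    driver.create_vip(vip)
    with context.session.begin(subtransactions=True):
        vip.status = 'ACTIVE'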

>
>
>
> 4) Association between pool and network.
>
>
>
> This association makes the assumption that all members are on the
> same network. If this is correct, I am wondering whether this assumption
> holds in real basic use cases. A network in Quantum represents a
> single L2 broadcast domain. Load balancers, being much higher up in the
> stack, should be able to add members from different networks. At the
> beginning I thought this was going to be addressed by having n pools for
> each VIP, but it seems this is not the case. Even for basic use cases, I
> think this might be a little limiting. Do you think there might be a case
> for doing load balancing with members across multiple networks starting
> from v1.0?
>
>
>
> Yes, in the current model all members of the pool are on the same L2/L3
> network. This supports the primary case of load-balancing a web tier or an
> app tier, where all VMs are on the same network. Supporting load balancing
> where members are spread across several Quantum networks in 1.0 would also
> complicate wiring the LB devices to several networks for each VIP. I think
> once we get the current model working, we can aim to support this in a
> future release.
>

I am fine with load balancing only on a single network at the moment.
However, how would this work if you have several subnets on your
selected network?
Will the load balancer automatically get access to all these subnets,
or do you need a specific subnet?

>
>
> [Leon] I tend to agree with Youcef. The most important thing is to deliver
> the first release even if the usage scenario is limited; then we can build
> on top of it.
>
>
>
> Also, you probably want to consider having a subnet_id instead of a
> network_id. This is because a network might have multiple subnets, and a
> load balancer driver will likely need an IP address on this subnet, as it
> will be used as the source IP when communicating with the members of the
> pool.
>
>
>
> Yes, you are right: network_id in the LBaaS API means an “L3” network ID,
> but for terminology consistency we should change this to Quantum’s
> “subnet_id” (assuming that a “subnet_id” is a unique ID across networks).
> The same thing would apply to the VIP’s “network_id”.
>
>
>
> [Leon] Youcef, do you mean we should change all “network_id” attributes in
> the current LBaaS API resources to subnet_id? (See the sketch below.)
>
>
>
> Thanks for your code review and feedback!
>
>
>
> Youcef
>
>
>
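
On Leon's last question, the rename would look roughly like this in the
resource definitions; the attribute flags shown here are illustrative
assumptions, not the final API.

RESOURCE_ATTRIBUTE_MAP = {
    'vips': {
        # 'network_id' replaced by 'subnet_id'
        'subnet_id': {'allow_post': True, 'allow_put': False,
                      'validate': {'type:uuid': None},
                      'is_visible': True},
        # ... remaining attributes unchanged ...
    },
    'pools': {
        'subnet_id': {'allow_post': True, 'allow_put': False,
                      'validate': {'type:uuid': None},
                      'is_visible': True},
        # ... remaining attributes unchanged ...
    },
}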
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
