[openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements
Jorge Miramontes
jorge.miramontes at RACKSPACE.COM
Wed Oct 22 13:51:18 UTC 2014
Hey Stephen (and Robert),
For real-time usage I was thinking something similar to what you are proposing. Using logs for this would be overkill IMO so your suggestions were what I was thinking of starting with.
As far as storing logs is concerned I was definitely thinking of offloading these onto separate storage devices. Robert, I totally hear you on the scalability part as our current LBaaS setup generates TB of request logs. I'll start planning out a spec and then I'll let everyone chime in there. I just wanted to get a general feel for the ideas I had mentioned. I'll also bring it up in today's meeting.
Cheers,
--Jorge
From: Stephen Balukoff <sbalukoff at bluebox.net>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Wednesday, October 22, 2014 4:04 AM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements
Hi Jorge!
Welcome back, eh! You've been missed.
Anyway, I just wanted to say that your proposal sounds great to me, and it's good to finally be closer to having concrete requirements for logging, eh. Once this discussion is nearing a conclusion, could you write up the specifics of logging into a specification proposal document?
Regarding the discussion itself: I think we can ignore UDP for now, as there doesn't seem to be high demand for it, and it certainly won't be supported in v 0.5 of Octavia (and maybe not in v1 or v2 either, unless we see real demand).
Regarding the 'real-time usage' information: I have some ideas regarding getting this from a combination of iptables and / or the haproxy stats interface. Were you thinking something different that involves on-the-fly analysis of the logs or something? (I tend to find that logs are great for non-real time data, but can often be lacking if you need, say, a gauge like 'currently open connections' or something.)
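To make the haproxy stats idea a bit more concrete, here is a rough Python sketch of pulling a 'currently open connections' gauge out of the CSV that the stats interface emits. The column names (pxname, svname, scur) are the ones haproxy uses in its CSV stats output; the sample data, the proxy and server names, and the function name current_sessions are made up for illustration, and how the CSV is fetched from the amphora is left out entirely.

```python
import csv
import io

# Made-up sample of haproxy CSV stats output, truncated to the first
# few columns. Real output has many more columns per row.
SAMPLE_STATS = """\
# pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout
web,FRONTEND,,,12,40,2000,1500,734003,9820011
web,node1,0,0,5,20,,700,350000,4900000
web,node2,0,0,7,22,,800,384003,4920011
web,BACKEND,0,0,12,40,200,1500,734003,9820011
"""


def current_sessions(stats_csv):
    """Return {(proxy, server): current session count} from stats CSV.

    Hypothetical helper: only the 'scur' (current sessions) column is
    read; empty values are treated as zero.
    """
    # The header line starts with "# "; strip it so DictReader sees
    # plain column names.
    text = stats_csv.lstrip("# ")
    reader = csv.DictReader(io.StringIO(text))
    return {
        (row["pxname"], row["svname"]): int(row["scur"] or 0)
        for row in reader
    }


gauges = current_sessions(SAMPLE_STATS)
```

A gauge like this reflects the moment the stats were read, which is the kind of value that is awkward to reconstruct from logs after the fact.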
One other thing: If there's a chance we'll be storing logs on the amphorae themselves, then we need to have log rotation as part of the configuration here. It would be silly to have an amphora failure just because its ephemeral disk fills up, eh.
Stephen
On Wed, Oct 15, 2014 at 4:03 PM, Jorge Miramontes <jorge.miramontes at rackspace.com> wrote:
Hey Octavia folks!
First off, yes, I'm still alive and kicking. :)
I'd like to start a conversation on usage requirements and have a few
suggestions. I advocate that, since we will be using TCP and HTTP/HTTPS
based protocols, we inherently enable connection logging for load
balancers for several reasons:
1) We can use these logs as the raw and granular data needed to track
usage. With logs, the operator has flexibility as to what usage metrics
they want to bill against. For example, bandwidth is easy to track and can
even be split into header and body data so that the provider can choose if
they want to bill on header data or not. Also, the provider can determine
if they will bill their customers for failed requests that were the fault
of the provider themselves. These are just a few examples; the point is
the flexible nature of logs.
2) Creating billable usage from logs is easy compared to other options
like polling. For example, in our current LBaaS iteration at Rackspace we
bill partly on "average concurrent connections". This is based on polling
and is not as accurate as it possibly can be. It's very close, but it
doesn't get more accurate than the logs themselves. Furthermore, polling
is more complex and uses up resources on the polling cadence.
3) Enabling logs for all load balancers can be used for debugging, support
and audit purposes. While the customer may or may not want their logs
uploaded to Swift, operators and their support teams can still use this
data to help customers out with billing and setup issues. Auditing will
also be easier with raw logs.
4) Enabling logs for all load balancers will help mitigate uncertainty in
terms of capacity planning. Imagine if every customer suddenly enabled
logs when logging had never been turned on before. This could produce a spike in
resource utilization that will be hard to manage. Enabling logs from the
start means we are certain as to what to plan for other than the nature of
the customer's traffic pattern.
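As a rough illustration of point 2, here is a minimal Python sketch of deriving "average concurrent connections" from per-connection log records instead of polling. It assumes each log line has already been reduced to a (start time, duration) pair in seconds; that record shape, the function name, and the sample values are assumptions for illustration, not an agreed format.

```python
def average_concurrent_connections(connections, window_start, window_end):
    """Average number of open connections over [window_start, window_end).

    connections: iterable of (start_seconds, duration_seconds) pairs,
    a hypothetical reduction of one log line per connection.
    """
    window = window_end - window_start
    if window <= 0:
        raise ValueError("window_end must be after window_start")

    busy_seconds = 0.0
    for start, duration in connections:
        end = start + duration
        # Only count the part of the connection inside the window.
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > 0:
            busy_seconds += overlap

    # Total connection-seconds divided by window length gives the
    # time-weighted average concurrency.
    return busy_seconds / window


# Three made-up connections inside a 60 second billing window.
sample = [(0, 30), (10, 20), (50, 20)]
average = average_concurrent_connections(sample, 0, 60)
```

Because every connection contributes its exact overlap with the window, the result does not depend on a polling cadence.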
Some Cons I can think of (please add more as I think the pros outweigh the
cons):
1) If we ever add UDP based protocols then this model won't work. < 1% of
our load balancers at Rackspace are UDP based so we are not looking at
using this protocol for Octavia. I'm more of a fan of building a really
good TCP/HTTP/HTTPS based load balancer because UDP load balancing solves
a different problem. For me different problem == different product.
2) I'm assuming HAProxy. Thus, if we choose another technology for the
amphora then this model may break.
Also, and more generally speaking, I have categorized usage into three
categories:
1) Tracking usage - this is usage that will be used by operators and
support teams to gain insight into what load balancers are doing in an
attempt to monitor potential issues.
2) Billable usage - this is usage that is a subset of tracking usage used
to bill customers.
3) Real-time usage - this is usage that should be exposed via the API so
that customers can make decisions that affect their configuration (ex.
"Based off of the number of connections my web heads can handle when
should I add another node to my pool?").
These are my preliminary thoughts, and I'd love to gain insight into what
the community thinks. I have built about 3 usage collection systems thus
far (1 with Brandon) and have learned a lot. Some basic rules I have
discovered with collecting usage are:
1) Always collect granular usage as it "paints a picture" of what actually
happened. Massaged/un-granular usage == lost information.
2) Never imply, always be explicit. Implications usually stem from bad
assumptions.
Last but not least, we need to store every user and system load balancer
event such as creation, updates, suspension and deletion so that we may
bill on things like uptime and serve our customers better by knowing what
happened and when.
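To show why the event history matters for billing, here is a minimal Python sketch that computes billable uptime from a list of stored events. The event names ("create", "update", "suspend", "unsuspend", "delete"), the (timestamp, kind) record shape, and the function name are assumptions for illustration; in particular "unsuspend" is a hypothetical event added so a suspended load balancer can become active again.

```python
def uptime_seconds(events, period_end):
    """Seconds a load balancer was active, given its event history.

    events: iterable of (timestamp_seconds, kind) pairs, a hypothetical
    record shape. period_end closes any interval that is still open.
    """
    active_since = None
    total = 0
    for timestamp, kind in sorted(events):
        if kind in ("create", "unsuspend") and active_since is None:
            active_since = timestamp
        elif kind in ("suspend", "delete") and active_since is not None:
            total += timestamp - active_since
            active_since = None
        # "update" events do not change whether the load balancer is active.

    if active_since is not None:
        total += period_end - active_since
    return total


# Made-up history: active 0-200, suspended 200-300, active 300-500.
history = [
    (0, "create"),
    (100, "update"),
    (200, "suspend"),
    (300, "unsuspend"),
    (500, "delete"),
]
billable = uptime_seconds(history, period_end=600)
```

Without the suspend and delete events there would be no way to tell the 400 active seconds apart from the full 600 second period.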
Cheers,
--Jorge
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
--
Stephen Balukoff
Blue Box Group, LLC
(800)613-4305 x807