[openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

Jorge Miramontes jorge.miramontes at RACKSPACE.COM
Wed Nov 5 18:25:43 UTC 2014


Thanks German,

It looks like the conversation is going towards using the HAProxy stats interface and/or iptables. I just wanted to explore logging a bit. That said, can you and Stephen share your thoughts on how we might implement that approach? I'd like to get a spec out soon because I believe metric gathering can be worked on in parallel with the rest of the project. In fact, I was hoping to get my hands dirty on this one and contribute some code, but a strategy and spec are needed first before I can start that ;)

Cheers,
--Jorge

From: "Eichberger, German" <german.eichberger at hp.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Wednesday, November 5, 2014 3:50 AM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

Hi Jorge,

I am still not convinced that we need to use logging for usage metrics. We can also use the haproxy stats interface (which the haproxy team is willing to improve based on our input) and/or iptables, as Stephen suggested. That said, this probably needs more exploration.
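For reference, pulling counters from the haproxy stats interface is cheap; here's a minimal sketch assuming the amphora exposes haproxy's UNIX stats socket (the socket path is a placeholder, not a decided layout):

```python
import csv
import io
import socket


def fetch_stats(sock_path="/var/run/haproxy.sock"):
    """Send 'show stat' to haproxy's stats socket and return the raw CSV."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.connect(sock_path)
    s.sendall(b"show stat\n")
    chunks = []
    while True:
        data = s.recv(4096)
        if not data:
            break
        chunks.append(data)
    s.close()
    return b"".join(chunks).decode("ascii")


def parse_stats(csv_text):
    """Parse haproxy 'show stat' output (the header line starts with '# ')."""
    text = csv_text.lstrip("# ")
    return list(csv.DictReader(io.StringIO(text)))
```

Each parsed row carries per-frontend/backend counters such as `scur` (current sessions) and `stot` (total sessions), which is exactly the kind of gauge data that polling logs can't give you.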

From an HP perspective, the full logs on the load balancer are mostly interesting to the user of the load balancer – we only care about aggregates for our metering. That said, we would be happy to just move them on demand to a place the user can access.

Thanks,
German


From: Jorge Miramontes [mailto:jorge.miramontes at RACKSPACE.COM]
Sent: Tuesday, November 04, 2014 8:20 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

Hi Susanne,

Thanks for the reply. As Angus pointed out, the one big item that needs to be addressed with this method is the network I/O of raw logs. One idea to mitigate this concern is to store the data locally at the operator-configured granularity, process it, and THEN send it to Ceilometer, etc. If we can't engineer a way to deal with the high network I/O that will inevitably occur, we may have to move towards a polling approach. Thoughts?
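To make the "store locally, process, then send" idea concrete, here is a rough sketch of the local aggregation step (the record fields and the hourly granularity are placeholders, not anything we've agreed on):

```python
from collections import defaultdict


def aggregate(records, granularity=3600):
    """Roll raw per-request records up into per-load-balancer buckets.

    Each record is assumed to be a dict with 'lb_id', 'timestamp' (epoch
    seconds) and 'bytes_out'. One aggregate per (lb_id, interval) comes
    out, which is what we'd push to Ceilometer instead of the raw stream.
    """
    buckets = defaultdict(lambda: {"bytes_out": 0, "requests": 0})
    for rec in records:
        # Snap the timestamp to the start of its interval.
        interval = int(rec["timestamp"]) // granularity * granularity
        bucket = buckets[(rec["lb_id"], interval)]
        bucket["bytes_out"] += rec["bytes_out"]
        bucket["requests"] += 1
    return dict(buckets)
```

The point of the sketch is the I/O math: shipping one aggregate per load balancer per hour is orders of magnitude less traffic than shipping every request log line.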

Cheers,
--Jorge

From: Susanne Balle <sleipnir012 at gmail.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Tuesday, November 4, 2014 11:10 AM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

Jorge

I understand your use cases around capturing of metrics, etc.

Today we mine the logs for usage information on our Hadoop cluster. In the future we'll capture all the metrics via Ceilometer.

IMHO the amphorae should have an interface that allows the logs to be moved to various backends such as Elasticsearch, Hadoop HDFS, Swift, etc., as well as, by default (but with the option to disable it), Ceilometer. Ceilometer is the de facto metering service for OpenStack, so we need to support it. We would like the integration with Ceilometer to be based on notifications. I believe German sent a reference to that in another email. The pre-processing will need to be optional and the amount of data aggregation configurable.
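A pluggable interface along those lines might look like the following sketch. All class and method names here are invented for illustration; the spec would pin down the real ones:

```python
import abc


class LogSink(abc.ABC):
    """Backend-agnostic destination for batches of amphora logs."""

    @abc.abstractmethod
    def ship(self, lb_id, log_lines):
        """Deliver a batch of log lines for one load balancer."""


class SwiftSink(LogSink):
    def ship(self, lb_id, log_lines):
        # Would upload the batch as an object to a per-tenant container.
        raise NotImplementedError


class CeilometerSink(LogSink):
    def ship(self, lb_id, log_lines):
        # Would emit an aggregated notification, not the raw lines.
        raise NotImplementedError


class Dispatcher:
    """Fans each batch out to every enabled sink.

    Per the discussion, Ceilometer would be enabled by default but
    disableable, and other sinks are operator choices.
    """

    def __init__(self, sinks):
        self.sinks = list(sinks)

    def dispatch(self, lb_id, log_lines):
        for sink in self.sinks:
            sink.ship(lb_id, log_lines)
```

The design point is that the amphora/controller only ever talks to the abstract interface, so adding an HDFS or Elasticsearch sink is a new plugin rather than a change to the core.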

What you describe below is, to me, usage gathering/metering. The billing is independent, since companies with private clouds might not want to bill but still need usage reports for capacity planning, etc. Billing/charging is just putting a monetary value on the various forms of usage.

I agree with all points.

> - Capture logs in a scalable way (i.e. capture logs and put them on a
> separate scalable store somewhere so that it doesn't affect the amphora).

> - Every X amount of time (every hour, for example) process the logs and
> send them on their merry way to Ceilometer or whatever service an operator
> will be using for billing purposes.

"Keep the logs": This is what we would use log forwarding to either Swift or Elastic Search, etc.

>- Keep logs for some configurable amount of time. This could be anything
> from indefinitely to not at all. Rackspace is planning on keeping them for
> a certain period of time for the following reasons:

It looks like we are in agreement, so I am not sure why it sounded like we were in disagreement on IRC. It sounded like you were talking about something else when you brought up real-time processing. If we are just talking about moving the logs to your Hadoop cluster, or any backend, in a scalable way, then we agree.

Susanne


On Thu, Oct 23, 2014 at 6:30 PM, Jorge Miramontes <jorge.miramontes at rackspace.com> wrote:
Hey German/Susanne,

To continue our conversation from our IRC meeting, could you all provide
more insight into your usage requirements? Also, I'd like to clarify a few
points related to using logging.

I am advocating that logs be used for multiple purposes, including
billing. Billing requirements are different than connection logging
requirements. However, connection logging is a very accurate mechanism to
capture billable metrics and thus is related. My vision for this is
something like the following:

- Capture logs in a scalable way (i.e. capture logs and put them on a
separate scalable store somewhere so that it doesn't affect the amphora).
- Every X amount of time (every hour, for example) process the logs and
send them on their merry way to Ceilometer or whatever service an operator
will be using for billing purposes.
- Keep logs for some configurable amount of time. This could be anything
from indefinitely to not at all. Rackspace is planning on keeping them for
a certain period of time for the following reasons:

        A) We have connection logging as a planned feature. If a customer turns
on the connection logging feature for their load balancer it will already
have a history. One important aspect of this is that customers (at least
ours) tend to turn on logging after they realize they need it (usually
after a tragic lb event). By already capturing the logs I'm sure customers
will be extremely happy to see that there are already X days worth of logs
they can immediately sift through.
        B) Operators and their support teams can leverage logs when providing
service to their customers. This is huge for finding issues and resolving
them quickly.
        C) Albeit a minor point, building support for logs from the get-go
mitigates capacity management uncertainty. My example earlier was the
extreme case of every customer turning on logging at the same time. While
unlikely, I would hate to manage that!
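As a strawman for the "process the logs" step above, here is how billable bytes might be pulled out of HAProxy's HTTP log lines. The log layout and the skip-5xx policy are illustrative assumptions, not a proposal we've settled on:

```python
import re

# Matches the front of an assumed HAProxy HTTP log line; only the
# status code and bytes_read fields matter for this sketch.
LOG_RE = re.compile(
    r"(?P<client>\S+) \[(?P<accept_date>[^\]]+)\] (?P<frontend>\S+) "
    r"(?P<backend>\S+) (?P<timers>[\d/+-]+) (?P<status>\d{3}) (?P<bytes>\d+)"
)


def billable_bytes(lines, bill_server_errors=False):
    """Sum bytes_read over a batch, optionally skipping 5xx responses.

    Skipping 5xx is one way an operator could choose not to bill
    customers for failures on the provider side, as discussed above.
    """
    total = 0
    for line in lines:
        m = LOG_RE.search(line)
        if not m:
            continue  # not a request log line; ignore
        if not bill_server_errors and m.group("status").startswith("5"):
            continue
        total += int(m.group("bytes"))
    return total
```

This is the flexibility argument in miniature: the same raw lines can yield bytes, request counts, error rates, or header-vs-body splits depending on what the operator decides to bill.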

I agree that there are other ways to capture billing metrics but, from my
experience, those tend to be more complex than what I am advocating and
without the added benefits listed above. An understanding of HP's desires
on this matter will hopefully get this to a point where we can start
working on a spec.

Cheers,
--Jorge

P.S. Real-time stats is a different beast and I envision there being an
API call that returns "real-time" data such as this ==>
http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#9.


From: "Eichberger, German" <german.eichberger at hp.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Wednesday, October 22, 2014 2:41 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements


>Hi Jorge,
>
>Good discussion so far + glad to have you back
>J
>
>I am not a big fan of using logs for billing information since ultimately
>(at least at HP) we need to pump it into Ceilometer. So I am envisioning
>either the amphorae (via a proxy) pumping it straight into that system, or
>collecting it on the controller and pumping it from there.
>
>Allowing/enabling logging creates some requirements on the hardware,
>mainly that it can handle the I/O coming from logging. Some operators
>might choose to hook up very cheap, low-performance disks which might not
>be able to deal with the log traffic. So I would suggest some rate
>limiting on the log output to help with that.
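For what it's worth, the syslog layer can already do this; a hedged sketch using rsyslog's imuxsock rate limiting (directives available in rsyslog 5.7+; the numbers are placeholders an operator would tune to their disks):

```
# /etc/rsyslog.conf on the amphora -- illustrative values only
$ModLoad imuxsock
$SystemLogRateLimitInterval 5       # measurement window, in seconds
$SystemLogRateLimitBurst 10000      # messages allowed per window
```

Messages beyond the burst within a window are dropped with a single summary line, which bounds disk I/O at the cost of losing log lines under overload.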
>
>
>Thanks,
>German
>
>From: Jorge Miramontes [mailto:jorge.miramontes at RACKSPACE.COM]
>Sent: Wednesday, October 22, 2014 6:51 AM
>To: OpenStack Development Mailing List (not for usage questions)
>Subject: Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements
>
>Hey Stephen (and Robert),
>
>For real-time usage I was thinking something similar to what you are
>proposing. Using logs for this would be overkill IMO, so your suggestions
>were what I was thinking of starting with.
>
>As far as storing logs is concerned, I was definitely thinking of
>offloading these onto separate storage devices. Robert, I totally hear
>you on the scalability part, as our current LBaaS setup generates TBs of
>request logs. I'll start planning out a spec and then I'll let everyone
>chime in there. I just wanted to get a general feel for the ideas I had
>mentioned. I'll also bring it up in today's meeting.
>
>Cheers,
>
>--Jorge
>
>From: Stephen Balukoff <sbalukoff at bluebox.net>
>Reply-To: "OpenStack Development Mailing List (not for usage questions)"
><openstack-dev at lists.openstack.org>
>Date: Wednesday, October 22, 2014 4:04 AM
>To: "OpenStack Development Mailing List (not for usage questions)"
><openstack-dev at lists.openstack.org>
>Subject: Re: [openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements
>
>>Hi Jorge!
>>
>>Welcome back, eh! You've been missed.
>>
>>Anyway, I just wanted to say that your proposal sounds great to me, and
>>it's good to finally be closer to having concrete requirements for
>>logging, eh. Once this discussion is nearing a conclusion, could you
>>write up the specifics of logging into a specification proposal document?
>>
>>Regarding the discussion itself: I think we can ignore UDP for now, as
>>there doesn't seem to be high demand for it, and it certainly won't be
>>supported in v0.5 of Octavia (and maybe not in v1 or v2 either, unless we
>>see real demand).
>>
>>Regarding the 'real-time usage' information: I have some ideas regarding
>>getting this from a combination of iptables and/or the haproxy stats
>>interface. Were you thinking of something different that involves
>>on-the-fly analysis of the logs or something? (I tend to find that logs
>>are great for non-real-time data, but can often be lacking if you need,
>>say, a gauge like 'currently open connections' or something.)
>>
>>One other thing: If there's a chance we'll be storing logs on the
>>amphorae themselves, then we need to have log rotation as part of the
>>configuration here. It would be silly to have an amphora failure just
>>because its ephemeral disk fills up, eh.
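Agreed that rotation belongs in the base config; a logrotate sketch covering the failure mode Stephen describes (the path and retention counts are assumptions, not a decided layout):

```
# /etc/logrotate.d/haproxy on the amphora -- illustrative
/var/log/haproxy.log {
    daily
    rotate 3          # keep little on-box; long-term copies live off the amphora
    compress
    missingok
    notifempty
    sharedscripts
    postrotate
        # reopen the log file; assumes haproxy logs via the local syslog daemon
        /bin/kill -HUP `cat /var/run/rsyslogd.pid 2>/dev/null` 2>/dev/null || true
    endscript
}
```

A small `rotate` count keeps the ephemeral disk bounded while the forwarding pipeline remains the durable copy.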
>>
>>Stephen
>>
>>On Wed, Oct 15, 2014 at 4:03 PM, Jorge Miramontes
>><jorge.miramontes at rackspace.com> wrote:
>>Hey Octavia folks!
>>
>>
>>First off, yes, I'm still alive and kicking. :)
>>
>>I'd like to start a conversation on usage requirements and have a few
>>suggestions. I advocate that, since we will be using TCP and HTTP/HTTPS
>>based protocols, we inherently enable connection logging for load
>>balancers for several reasons:
>>
>>1) We can use these logs as the raw and granular data needed to track
>>usage. With logs, the operator has flexibility as to what usage metrics
>>they want to bill against. For example, bandwidth is easy to track and
>>can
>>even be split into header and body data so that the provider can choose
>>if
>>they want to bill on header data or not. Also, the provider can determine
>>if they will bill their customers for failed requests that were the fault
>>of the provider themselves. These are just a few examples; the point is
>>the flexible nature of logs.
>>
>>2) Creating billable usage from logs is easy compared to other options
>>like polling. For example, in our current LBaaS iteration at Rackspace we
>>bill partly on "average concurrent connections". This is based on polling
>>and is not as accurate as it possibly can be. It's very close, but it
>>doesn't get more accurate than the logs themselves. Furthermore, polling
>>is more complex and consumes resources on each polling cycle.
>>
>>3) Enabling logs for all load balancers can be used for debugging,
>>support
>>and audit purposes. While the customer may or may not want their logs
>>uploaded to swift, operators and their support teams can still use this
>>data to help customers out with billing and setup issues. Auditing will
>>also be easier with raw logs.
>>
>>4) Enabling logs for all load balancers will help mitigate uncertainty in
>>terms of capacity planning. Imagine if every customer suddenly enabled
>>logs without it ever being turned on. This could produce a spike in
>>resource utilization that will be hard to manage. Enabling logs from the
>>start means we are certain as to what to plan for other than the nature
>>of
>>the customer's traffic pattern.
>>
>>Some Cons I can think of (please add more as I think the pros outweigh
>>the
>>cons):
>>
>>1) If we ever add UDP-based protocols then this model won't work. Less
>>than 1% of our load balancers at Rackspace are UDP-based, so we are not
>>looking at using this protocol for Octavia. I'm more of a fan of building a really
>>good TCP/HTTP/HTTPS based load balancer because UDP load balancing solves
>>a different problem. For me different problem == different product.
>>
>>2) I'm assuming HAProxy. Thus, if we choose another technology for the
>>amphora then this model may break.
>>
>>
>>Also, and more generally speaking, I have categorized usage into three
>>categories:
>>
>>1) Tracking usage - this is usage that will be used by operators and
>>support teams to gain insight into what load balancers are doing in an
>>attempt to monitor potential issues.
>>2) Billable usage - this is usage that is a subset of tracking usage used
>>to bill customers.
>>3) Real-time usage - this is usage that should be exposed via the API so
>>that customers can make decisions that affect their configuration (ex.
>>"Based off of the number of connections my web heads can handle when
>>should I add another node to my pool?").
>>
>>These are my preliminary thoughts, and I'd love to gain insight into what
>>the community thinks. I have built about 3 usage collection systems thus
>>far (1 with Brandon) and have learned a lot. Some basic rules I have
>>discovered with collecting usage are:
>>
>>1) Always collect granular usage as it "paints a picture" of what
>>actually
>>happened. Massaged/un-granular usage == lost information.
>>2) Never imply, always be explicit. Implications usually stem from bad
>>assumptions.
>>
>>
>>Last but not least, we need to store every user and system load balancer
>>event such as creation, updates, suspension and deletion so that we may
>>bill on things like uptime and serve our customers better by knowing what
>>happened and when.
>>
>>
>>Cheers,
>>--Jorge
>>
>>
>>_______________________________________________
>>OpenStack-dev mailing list
>>OpenStack-dev at lists.openstack.org
>>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>--
>>
>>Stephen Balukoff
>>Blue Box Group, LLC
>>(800)613-4305 x807


