[openstack-dev] [Neutron][LBaaS][Octavia] Usage Requirements

Angus Lees gus at inodes.org
Wed Oct 29 00:25:51 UTC 2014


On Tue, 28 Oct 2014 04:42:27 PM Jorge Miramontes wrote:
> Thanks for the reply Angus,
> 
> DDoS attacks are definitely a concern we are trying to address here. My
> assumptions are based on a solution that is engineered for this type of
> thing. Are you more concerned with network I/O during a DoS attack or
> storing the logs? Under the idea I had, I wanted to make the amount of
> time logs are stored for configurable so that the operator can choose
> whether they want the logs after processing or not. The network I/O of
> pumping logs out is a concern of mine, however.

My primary concern was the generated network I/O, and the write bandwidth to 
storage media implied by that (not so much the accumulated volume of data).

We're in an era where 10Gb/s networking is now common for serving/loadbalancer
infrastructure, and as far as I can see network bandwidth is climbing more
steeply than storage I/O, so it's only going to get worse.  10Gb/s of
short-lived connections is a *lot* to try to write to reliable storage
somewhere and later analyse.
It's a useful option for some users, but it would be a shame to have to limit 
loadbalancer throughput by the logging infrastructure just because we didn't 
have an alternative available.
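
To put rough numbers on that, here is a back-of-envelope sketch rather than a
measurement.  It scales Robert's figures quoted further down (about 500
requests per second adding up to 40GB per day) to a saturated 10Gb/s link.
The 10 kB average transfer per request is purely my assumption, as are the
function names and the reading of "GB" as 10^9 bytes; plug in your own values.

```python
SECONDS_PER_DAY = 86_400


def bytes_per_logged_request(requests_per_sec, stored_bytes_per_day):
    """Average storage consumed per logged request."""
    return stored_bytes_per_day / (requests_per_sec * SECONDS_PER_DAY)


def log_write_rate(link_bits_per_sec, avg_bytes_per_request, log_bytes_per_request):
    """Bytes/sec of log data produced if the link is saturated by requests."""
    requests_per_sec = (link_bits_per_sec / 8) / avg_bytes_per_request
    return requests_per_sec * log_bytes_per_request


# Figures from the thread: 500 req/s -> 40GB/day (GB taken as 10**9 bytes).
per_request = bytes_per_logged_request(500, 40e9)

# ASSUMPTION: each short-lived request moves about 10 kB over the wire.
write_rate = log_write_rate(10e9, 10_000, per_request)

print(f"log bytes per request: {per_request:.0f}")
print(f"log write rate: {write_rate / 1e6:.1f} MB/s")
print(f"log volume per day: {write_rate * SECONDS_PER_DAY / 1e12:.1f} TB")
```

Under those assumptions the logging path has to sustain on the order of
100 MB/s of writes per loadbalancer, and smaller requests make it worse.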

I think you're right, that we don't have an obviously-correct choice here.  I 
think we need to expose both cheap sampling/polling of counters and more 
detailed logging of connections matching patterns (and indeed actual packet 
capture would be nice too).  Someone could then choose to base their billing 
on either datasource depending on their own accuracy-vs-cost-of-collection 
tradeoffs.  I don't see that either approach is going to be sufficiently 
universal to obsolete the other :(
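
As a rough illustration of the cheap counter-polling side, here is a minimal
sketch of turning periodic readings of a cumulative byte counter into a
billable total.  The class and method names are hypothetical, and the
reset handling (treating a counter that goes backwards as a restart from
zero) is my assumption, not something any particular loadbalancer promises.

```python
class UsageAccumulator:
    """Accumulate billable bytes from a cumulative counter polled periodically."""

    def __init__(self):
        self.last = None   # previous counter reading, None until first poll
        self.total = 0     # billable bytes accumulated so far

    def observe(self, counter_value):
        """Record one poll and return the bytes attributed to this interval."""
        if self.last is None:
            delta = 0                          # first poll only sets a baseline
        elif counter_value >= self.last:
            delta = counter_value - self.last  # normal monotonic growth
        else:
            delta = counter_value              # ASSUMPTION: counter reset to zero
        self.last = counter_value
        self.total += delta
        return delta


# Hypothetical readings; the drop from 9000 to 2000 simulates a restart.
readings = [1_000, 5_000, 9_000, 2_000, 6_000]
acc = UsageAccumulator()
deltas = [acc.observe(r) for r in readings]
print(deltas, acc.total)
```

The work per poll is constant no matter how many connections passed through
in the interval, which is the property that matters under attack.  The cost
is that anything happening between two polls (a spike, or traffic lost across
a reset) is invisible, which is exactly the accuracy tradeoff above.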

Also: UDP.  Most providers are all about HTTP now, but there are still some
people who need to bill for UDP, SIP, VPN, etc. traffic.

 - Gus

> Sampling seems like the go-to solution for gathering usage but I was
> looking for something different as sampling can get messy and can be
> inaccurate for certain metrics. Depending on the sampling rate, this
> solution has the potential to miss spikes in traffic if you are gathering
> gauge metrics such as active connections/sessions. Using logs would be
> 100% accurate in this case. Also, I'm assuming LBaaS will have events so
> combining sampling with events (CREATE, UPDATE, SUSPEND, DELETE, etc.)
> gets complicated. Combining logs with events is arguably less complicated
> as the granularity of logs is high. Due to this granularity, one can split
> the logs based on the event times cleanly. Since sampling will have a
> fixed cadence you will have to perform a "manual" sample at the time of
> the event (i.e. add complexity).
> 
> At the end of the day there is no free lunch so more insight is
> appreciated. Thanks for the feedback.
> 
> Cheers,
> --Jorge
> 
> On 10/27/14 6:55 PM, "Angus Lees" <gus at inodes.org> wrote:
> >On Wed, 22 Oct 2014 11:29:27 AM Robert van Leeuwen wrote:
> >> > I'd like to start a conversation on usage requirements and have a few
> >> > suggestions. I advocate that, since we will be using TCP and
> >> > HTTP/HTTPS based protocols, we inherently enable connection logging
> >> > for load balancers for several reasons:
> >>
> >> Just request from the operator side of things:
> >> Please think about the scalability when storing all logs.
> >>
> >> e.g. we are currently logging http requests to one load balanced
> >> application (that would be a fit for LBAAS) It is about 500 requests per
> >> second, which adds up to 40GB per day (in elasticsearch.) Please make
> >> sure whatever solution is chosen it can cope with machines doing 1000s
> >> of requests per second...
> >
> >And to take this further, what happens during DoS attack (either syn flood
> >or full connections)?  How do we ensure that we don't lose our logging
> >system and/or amplify the DoS attack?
> >
> >One solution is sampling, with a tunable knob for the sampling rate -
> >perhaps tunable per-vip.  This still increases linearly with attack
> >traffic, unless you use time-based sampling (1-every-N-seconds rather than
> >1-every-N-packets).
> >
> >One of the advantages of (eg) polling the number of current sessions is
> >that the cost of that monitoring is essentially fixed regardless of the
> >number of connections passing through.  Numerous other metrics (rate of
> >new connections, etc) also have this property and could presumably be used
> >for accurate billing - without amplifying attacks.
> >
> >I think we should be careful about whether we want logging or metrics for
> >more accurate billing.  Both are useful, but full logging is only really
> >required for ad-hoc debugging (important! but different).
> 
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


