[Openstack] [Metering] schema and counter definitions

Doug Hellmann doug.hellmann at dreamhost.com
Tue May 1 15:49:58 UTC 2012


On Tue, May 1, 2012 at 10:38 AM, Nick Barcet <nick.barcet at canonical.com> wrote:

> On 05/01/2012 02:23 AM, Loic Dachary wrote:
> > On 04/30/2012 11:39 PM, Doug Hellmann wrote:
> >>
> >>
> >> On Mon, Apr 30, 2012 at 3:43 PM, Loic Dachary <loic at enovance.com> wrote:
> >>
> >>     On 04/30/2012 08:03 PM, Doug Hellmann wrote:
> >>>
> >>>
> >>>     On Mon, Apr 30, 2012 at 11:43 AM, Loic Dachary <loic at enovance.com> wrote:
> >>>
> >>>         On 04/30/2012 03:49 PM, Doug Hellmann wrote:
> >>>>
> >>>>
> >>>>         On Mon, Apr 30, 2012 at 6:46 AM, Loic Dachary
> >>>>         <loic at enovance.com> wrote:
> >>>>
> >>>>             On 04/30/2012 12:15 PM, Loic Dachary wrote:
> >>>>             > We could start a discussion from the content of the
> >>>>             following sections:
> >>>>             >
> >>>>             > http://wiki.openstack.org/EfficientMetering#Counters
> >>>>             I think the rationale of the counter aggregation needs
> >>>>             to be explained. My understanding is that the metering
> >>>>             system will be able to deliver the following
> >>>>             information: 10 floating IPv4 addresses were allocated
> >>>>             to the tenant during three months and were leased from
> >>>>             provider NNN. From this, the billing system could add a
> >>>>             line to the invoice: 10 IPv4, $N each = $10xN because
> >>>>             it has been configured to invoice each IPv4 leased from
> >>>>             provider NNN for $N.
> >>>>
> >>>>             It is not the purpose of the metering system to display
> >>>>             each IPv4 used, therefore it only exposes the aggregated
> >>>>             information. The counters define how the information
> >>>>             should be aggregated. If the idea was to expose each
> >>>>             resource usage individually, defining counters would be
> >>>>             meaningless as they would duplicate the activity log
> >>>>             from each OpenStack component.
> >>>>
> >>>>             What do you think?
> >>>>
> >>>>
> >>>>         At DreamHost we are going to want to show each individual
> >>>>         resource (the IPv4 address, the instance, etc.) along with
> >>>>         the charge information. Having the metering system aggregate
> >>>>         that data will make it difficult/impossible to present the
> >>>>         bill summary and detail views that we want. It would be much
> >>>>         more useful for us if it tracked the usage details for each
> >>>>         resource, and let us aggregate the data ourselves.
> >>>>
> >>>>         If other vendors want to show the data differently, perhaps
> >>>>         we should provide separate APIs for retrieving the detailed
> >>>>         and aggregate data.
> >>>>
> >>>>         Doug
> >>>>
> >>>         Hi,
> >>>
> >>>         For the record, here is the unfinished conversation we had
> >>>         on IRC:
> >>>
> >>>         (04:29:06 PM) dhellmann: dachary, did you see my reply about
> >>>         counter definitions on the list today?
> >>>         (04:39:05 PM) dachary: It means some counters must not be
> >>>         aggregated. Only the amount associated with it is, but there
> >>>         is one counter per IP.
> >>>         (04:55:01 PM) dachary: dhellmann: what about this: the id of
> >>>         the resource controls the aggregation of all counters: if it
> >>>         is missing, all resources of the same kind and their measures
> >>>         are aggregated. Otherwise only the measures are aggregated.
> >>>
> >>>         http://wiki.openstack.org/EfficientMetering?action=diff&rev2=40&rev1=39
> >>>         (04:55:58 PM) dachary: it makes me a little uncomfortable to
> >>>         define such an "ad-hoc" grouping
> >>>         (04:56:53 PM) dachary: i.e. you actually control the
> >>>         aggregation by choosing which value to put in the id column
> >>>         (05:05:38 PM) ***dachary reading
> >>>         http://www.ogf.org/documents/GFD.98.pdf
> >>>         (05:05:54 PM) dachary: I feel like we're trying to resolve a
> >>>         non-problem here
> >>>         (05:08:42 PM) dachary: values need to be aggregated. The raw
> >>>         input is a full description of the resource and a value
> >>>         (gauge). The question is how to control the aggregation in a
> >>>         reasonably flexible way.
> >>>         (05:11:34 PM) dachary: The definition of a counter could
> >>>         probably be described as: the id of a resource and code to
> >>>         fill each column associated with it.
> >>>
> >>>         I tried to append the following, but the wiki kept failing.
> >>>
> >>>         I propose that the counters be defined by a function instead
> >>>         of being fixed. That helps address the issue of aggregating
> >>>         the bandwidth associated with a given IP into a single
> >>>         counter.
> >>>
> >>>         Alternate idea:
> >>>          * a counter is defined by
> >>>           * a name (o1, n2, etc.) that uniquely identifies the
> >>>         nature of the measure (outbound internet transit, amount of
> >>>         RAM, etc.)
> >>>           * the component in which it can be found (nova, swift, etc.)
> >>>          * and by columns, each one set to the result of
> >>>         aggregate(find(record), record), where
> >>>           * find() looks for the existing row, selected by the
> >>>         unique key (maybe the name and the resource id)
> >>>           * record is a detailed description of the metering event to
> >>>         be aggregated
> >>>         (http://wiki.openstack.org/SystemUsageData#compute.instance.exists:)
> >>>           * the aggregate() function returns the updated row. By
> >>>         default it just adds (+=) the counter value to the one in
> >>>         the old row returned by find()
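
To make that alternate idea concrete, here is a minimal sketch of the
find()/aggregate() pairing, assuming an in-memory counter table keyed by
(counter name, resource id); the field names and storage are illustrative
only, not a proposed schema:

    # Hypothetical sketch only; a real implementation would use a
    # database table rather than a dict.
    counters = {}  # (counter_name, resource_id) -> aggregated row

    def find(record):
        """Return the existing row for this counter/resource pair, if any."""
        key = (record["counter_name"], record["resource_id"])
        return counters.get(key)

    def aggregate(row, record):
        """Default aggregation: add the new value to the stored counter."""
        if row is None:
            row = {"counter_name": record["counter_name"],
                   "resource_id": record["resource_id"],
                   "value": 0}
        row["value"] += record["value"]
        return row

    def handle_event(record):
        key = (record["counter_name"], record["resource_id"])
        counters[key] = aggregate(find(record), record)

    # Two bandwidth samples for the same IP collapse into one counter row.
    handle_event({"counter_name": "o1", "resource_id": "10.0.0.5", "value": 120})
    handle_event({"counter_name": "o1", "resource_id": "10.0.0.5", "value": 80})
    print(counters[("o1", "10.0.0.5")]["value"])  # 200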
> >>>
> >>>
> >>>     Would we want aggregation to occur within the database where we
> >>>     are collecting events, or should that move somewhere else?
> >>     I assume the events collected by the metering agents will all be
> >>     archived for auditing (or re-building the database)
> >>
> >>     http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44
> >>
> >>     Therefore the aggregation should occur when the database is
> >>     updated to account for a new event.
> >>
> >>     Does this make sense? I may have misunderstood part of your
> >>     question.
> >>
> >>
> >> I guess what I don't understand is why the aggregated data is written
> >> back to the metering database at all. If it's in the same database, it
> >> seems like it should be in a different "table" (or equivalent) so the
> >> original data is left alone.
> > In my view the events are not stored in a database, they are merely
> > appended to a log file. The database is built from the events with
> > aggregated data. I now understand that you (and Joshua Harlow) think
> > it's better to not aggregate the data and let the billing system do this
> > job.
>
> My intent when writing the blueprint was that each event would be
> recorded atomically in the database, as that is the only way to verify
> that we have not missed any. Aggregation should be done at the external
> API level when the request is for the sum of a given counter.
>

That matches what I was thinking. The "log file" that Loic mentioned would
in fact be a database that can handle a lot of writes. We could use some
sort of simple file format, but since we're going to have to read and parse
the log anyway, we might as well use a tool that makes that easy.

Aggregation could happen either in a metering API based on the query, or an
external app could retrieve a large dataset and manage the aggregation
itself.
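
As a rough illustration of the second option, an external app could pull
the raw per-resource records and do its own summing; the event fields and
values below are hypothetical, since neither the API nor the schema is
settled yet:

    from collections import defaultdict

    # Hypothetical raw events as a billing app might receive them from
    # the metering API; none of these field names are agreed on.
    events = [
        {"tenant": "t1", "counter": "ram_mb", "resource": "inst-1", "value": 2048},
        {"tenant": "t1", "counter": "ram_mb", "resource": "inst-1", "value": 2048},
        {"tenant": "t1", "counter": "ram_mb", "resource": "inst-2", "value": 4096},
    ]

    # Per-resource totals for a detailed bill view...
    per_resource = defaultdict(int)
    for e in events:
        per_resource[(e["tenant"], e["counter"], e["resource"])] += e["value"]

    # ...and per-tenant totals for the summary line, from the same raw data.
    per_tenant = defaultdict(int)
    for e in events:
        per_tenant[(e["tenant"], e["counter"])] += e["value"]

    print(dict(per_resource))
    print(dict(per_tenant))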


> What I missed in the blueprint, and what seems to be appearing clearly
> now, is that an event needs to be able to carry the "object-reference"
> for which it was collected; that looks highly necessary given the
> messages in this thread. A metering event would essentially be defined
> by (who, what, which) instead of a simple (who, what). As a consequence
> we would need to extend the DB schema to add this [which/object
> reference], and make sure that we carry it as well when we work on the
> message API format definition.
>
> How does this sound?
>

That sounds right. A lot of these sorts of issues can probably be fixed by being
careful about how we define the measurements. For example, I may want to be
able to show a customer the network bandwidth used per server, not just per
network. If we measure the bandwidth consumed by each VIF, the aggregation
code can take care of summarizing by network (because we know where the VIF
is) and/or server (because we know which server has the VIF).
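
For instance, a small sketch of that roll-up, with made-up (who, what,
which) samples per VIF and the VIF-to-server / VIF-to-network
relationships we would need to know (or record) in order to aggregate:

    from collections import defaultdict

    # Hypothetical per-VIF bandwidth samples: (who, what, which, value).
    samples = [
        {"who": "tenant-a", "what": "net_bytes_out", "which": "vif-1", "value": 500},
        {"who": "tenant-a", "what": "net_bytes_out", "which": "vif-2", "value": 300},
    ]
    vif_to_server = {"vif-1": "server-1", "vif-2": "server-1"}
    vif_to_network = {"vif-1": "net-public", "vif-2": "net-private"}

    by_server = defaultdict(int)
    by_network = defaultdict(int)
    for s in samples:
        by_server[vif_to_server[s["which"]]] += s["value"]
        by_network[vif_to_network[s["which"]]] += s["value"]

    print(dict(by_server))   # {'server-1': 800}
    print(dict(by_network))  # {'net-public': 500, 'net-private': 300}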

We may need to record more detail than a simple "which," though, because it
may be possible to change some information relevant for calculating the
billing rate later. For example, a tenant can resize an instance, which
would usually cause a change in the billing rate. Some of the relationships
might change, too (Is it possible to move a VIF between networks?).

At first I thought this might require separate table definitions per
resource type (instance, network, etc.), but re-reading the table of
counters in EfficientMetering I guess this is handled by measuring things
like CPU, RAM, and block storage as separate counters? So a single event
for creating a new instance might result in several records being written
to the database, with the "which" set to the instance identifier. The data
could then be presented as a unified "resource usage" report for that
server.

I think that works, but it may make the job of calculating the bill harder.
We are planning to follow the model of specifying rates per size, so we
would have to figure out which combination of CPU, RAM, and root volume
storage matches up with a given size to determine the rate.
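
For example (purely invented sizes and prices, just to show the lookup
problem), the bill calculator would have to recombine the separate
counters into a size before it could find a rate:

    # Hypothetical rate table keyed by (vcpus, ram_mb, root_gb) in $/hour.
    RATES = {
        (1, 2048, 20): 0.05,   # "small"
        (2, 4096, 40): 0.10,   # "medium"
    }

    def hourly_rate(vcpus, ram_mb, root_gb):
        """Map a combination of per-resource counters back to a size's rate."""
        try:
            return RATES[(vcpus, ram_mb, root_gb)]
        except KeyError:
            raise ValueError("no size matches %r" % ((vcpus, ram_mb, root_gb),))

    print(hourly_rate(2, 4096, 40))  # 0.10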

Another piece I've been thinking about is handling boundary conditions when
resource create and delete events don't both fall inside a billing cycle
(or within the granularity of the metering system). That shouldn't be part
of logging the events, necessarily, but it could be a reusable component
that feeds into producing the aggregated data (either through the API, or
as a way of processing the results returned by the API).
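
A sketch of the kind of helper I mean, with the window boundaries standing
in for whatever granularity the billing cycle ends up using:

    from datetime import datetime

    def billable_seconds(created, deleted, window_start, window_end):
        """Clip a resource's lifetime to a single billing window.

        `deleted` may be None for a resource that still exists at the
        end of the window.
        """
        start = max(created, window_start)
        end = min(deleted or window_end, window_end)
        return max((end - start).total_seconds(), 0)

    # Instance created before the window opened and not yet deleted:
    # it is billable for the whole window.
    print(billable_seconds(datetime(2012, 4, 20), None,
                           datetime(2012, 5, 1), datetime(2012, 6, 1)))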

> >> Maybe it's time to start focusing these discussions on user stories?
> >>
> > I agree. Would you like to go first?
>

These are "things that might happen" use cases rather than "user stories,"
but let's see where they take us:

1. User creates an instance, waits some period of time, then terminates it.
 - Vary the period of time to allow the events to both fall within the
metering granularity window, to overlap an entire window, or to start in
one window and end in another.
 - The same variations for "billing cycle" instead of "metering granularity
window."
2. User creates an instance, waits some period of time, then resizes it.
 - Vary the period of time as above.
 - Do we need variations for resizing up and down?
3. User creates an instance but it fails to create properly (provider
issue).
4. User creates an instance but it fails to boot after creation (bad image).
5. User creates volume storage, adds it to an existing instance, waits a
period of time, then deletes the volume.
 - Vary the period of time as above.
6. User creates volume storage, adds it to an existing instance, waits a
period of time, then terminates the instance. (I'm not sure what happens
to the volume in that case; maybe it still exists?)

A provider-related story might be:

1. As a provider, I can query the metering API to determine the activity
for a tenant within a given period of time.

Although that's pretty vague. :-)

Doug