<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    On 05/01/2012 05:49 PM, Doug Hellmann wrote:

    <blockquote

cite="mid:CADb+p3TCUtJX9b+Lvsjm98akmVMZ44eho6tVfQ8fdfCq8UxhFA@mail.gmail.com"

      type="cite">

      <div class="gmail_extra"><br>

        <br>

        <div class="gmail_quote">On Tue, May 1, 2012 at 10:38 AM, Nick

          Barcet <span dir="ltr"><<a moz-do-not-send="true"

              href="mailto:nick.barcet@canonical.com" target="_blank">nick.barcet@canonical.com</a>></span>

          wrote:<br>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div class="im">On 05/01/2012 02:23 AM, Loic Dachary wrote:<br>

              > On 04/30/2012 11:39 PM, Doug Hellmann wrote:<br>

              >><br>

              >><br>

              >> On Mon, Apr 30, 2012 at 3:43 PM, Loic Dachary

              <<a moz-do-not-send="true"

                href="mailto:loic@enovance.com">loic@enovance.com</a><br>

            </div>

            <div class="im">>> <mailto:<a

                moz-do-not-send="true" href="mailto:loic@enovance.com">loic@enovance.com</a>>>

              wrote:<br>

              >><br>

              >>     On 04/30/2012 08:03 PM, Doug Hellmann wrote:<br>

              >>><br>

              >>><br>

              >>>     On Mon, Apr 30, 2012 at 11:43 AM, Loic

              Dachary <<a moz-do-not-send="true"

                href="mailto:loic@enovance.com">loic@enovance.com</a><br>

            </div>

            <div class="im">>>>     <mailto:<a

                moz-do-not-send="true" href="mailto:loic@enovance.com">loic@enovance.com</a>>>

              wrote:<br>

              >>><br>

              >>>         On 04/30/2012 03:49 PM, Doug Hellmann

              wrote:<br>

              >>>><br>

              >>>><br>

              >>>>         On Mon, Apr 30, 2012 at 6:46 AM,

              Loic Dachary<br>

            </div>

            <div>

              <div class="h5">>>>>         <<a

                  moz-do-not-send="true" href="mailto:loic@enovance.com">loic@enovance.com</a>

                <mailto:<a moz-do-not-send="true"

                  href="mailto:loic@enovance.com">loic@enovance.com</a>>>

                wrote:<br>

                >>>><br>

                >>>>             On 04/30/2012 12:15 PM,

                Loic Dachary wrote:<br>

                >>>>             > We could start a

                discussion from the content of the<br>

                >>>>             following sections:<br>

                >>>>             ><br>

                >>>>             > <a

                  moz-do-not-send="true"

                  href="http://wiki.openstack.org/EfficientMetering#Counters"

                  target="_blank">http://wiki.openstack.org/EfficientMetering#Counters</a><br>

                >>>>             I think the rationale of

                the counter aggregation needs<br>

                >>>>             to be explained. My

                understanding is that the metering<br>

                >>>>             system will be able to

                deliver the following<br>

                >>>>             information: 10 floating

                IPv4 addresses were allocated<br>

                >>>>             to the tenant during three

                months and were leased from<br>

                >>>>             provider NNN. From this,

                the billing system could add a<br>

                >>>>             line to the invoice : 10

                IPv4, $N each = $10xN because<br>

                >>>>             it has been configured to

                invoice each IPv4 leased from<br>

                >>>>             provider NNN for $N.<br>

                >>>><br>

                >>>>             It is not the purpose of

                the metering system to display<br>

                >>>>             each IPv4 used, therefore

                it only exposes the aggregated<br>

                >>>>             information. The counters

                define how the information<br>

                >>>>             should be aggregated. If

                the idea was to expose each<br>

                >>>>             resource usage

                individually, defining counters would be<br>

                >>>>             meaningless as they would

                duplicate the activity log<br>

                >>>>             from each OpenStack

                component.<br>

                >>>><br>

                >>>>             What do you think ?<br>

                >>>><br>

                >>>><br>

                >>>>         At DreamHost we are going to

                want to show each individual<br>

                >>>>         resource (the IPv4 address, the

                instance, etc.) along with<br>

                >>>>         the charge information. Having

                the metering system aggregate<br>

                >>>>         that data will make it

                difficult/impossible to present the<br>

                >>>>         bill summary and detail views

                that we want. It would be much<br>

                >>>>         more useful for us if it

                tracked the usage details for each<br>

                >>>>         resource, and let us aggregate

                the data ourselves.<br>

                >>>><br>

                >>>>         If other vendors want to show

                the data differently, perhaps<br>

                >>>>         we should provide separate APIs

                for retrieving the detailed<br>

                >>>>         and aggregate data.<br>

                >>>><br>

                >>>>         Doug<br>

                >>>><br>

                >>>         Hi,<br>

                >>><br>

                >>>         For the record, here is the

                unfinished conversation we had on IRC<br>

                >>><br>

                >>>         (04:29:06 PM) dhellmann: dachary,

                did you see my reply about<br>

                >>>         counter definitions on the list

                today?<br>

                >>>         (04:39:05 PM) dachary: It means

                some counters must not be<br>

                >>>         aggregated. Only the amount

                associated with it is but there<br>

                >>>         is one counter per IP.<br>

                >>>         (04:55:01 PM) dachary: dhellmann:

                what about this :the id of<br>

                >>>         the ressource controls the

                aggregation of all counters : if it<br>

                >>>         is missing, all resources of the

                same kind and their measures<br>

                >>>         are aggregated. Otherwise only the

                measures are aggregated.<br>

                >>>         <a moz-do-not-send="true"

href="http://wiki.openstack.org/EfficientMetering?action=diff&rev2=40&rev1=39"

                  target="_blank">http://wiki.openstack.org/EfficientMetering?action=diff&rev2=40&rev1=39</a><br>


                >>>         (04:55:58 PM) dachary: it makes me

                a little uncomfortable to<br>

                >>>         define such an "ad-hoc" grouping<br>

                >>>         (04:56:53 PM) dachary: i.e. you

                actuall control the<br>

                >>>         aggregation by chosing which value

                to put in the id column<br>

                >>>         (04:58:43 PM) dachary:

                s/actuall/actually/<br>

                >>>         (05:05:38 PM) ***dachary reading<br>

                >>>         <a moz-do-not-send="true"

                  href="http://www.ogf.org/documents/GFD.98.pdf"

                  target="_blank">http://www.ogf.org/documents/GFD.98.pdf</a><br>

                >>>         (05:05:54 PM) dachary: I feel like

                we're trying to resolve a<br>

                >>>         non problem here<br>

                >>>         (05:08:42 PM) dachary: values need

                to be aggregated. The raw<br>

                >>>         input is a full description of the

                resource and a value (<br>

                >>>         gauge ). The question is how to

                control the aggregation in a<br>

                >>>         reasonably flexible way.<br>

                >>>         (05:11:34 PM) dachary: The

                definition of a counter could<br>

                >>>         probably be described as : the id

                of a resource and code to<br>

                >>>         fill each column associated with

                it.<br>

                >>><br>

                >>>         I tried to append the following,

                but the wiki kept failing.<br>

                >>><br>

                >>>         Propose that the counters are

                defined by a function instead<br>

                >>>         of being fixed. That helps

                addressing the issue of<br>

                >>>         aggregating the bandwidth

                associated to a given IP into a<br>

                >>>         single counter.<br>

                >>><br>

                >>>         Alternate idea :<br>

                >>>          * a counter is defined by<br>

                >>>           * a name ( o1, n2, etc. ) that

                uniquely identifies the<br>

                >>>         nature of the measure ( outbound

                internet transit, amount of<br>

                >>>         RAM, etc. )<br>

                >>>           * the component in which it can

                be found ( nova, swift etc.)<br>

                >>>          * and by columns, each one is set

                with the result of<br>

                >>>         aggregate(find(record),record)

                where<br>

                >>>           * find() looks for the existing

                column as found by<br>

                >>>         selecting with the unique key (

                maybe the name and the<br>

                >>>         resource id )<br>

                >>>           * record is a detailed

                description of the metering event to<br>

                >>>         be aggregated (<br>

                >>>         <a moz-do-not-send="true"

                  href="http://wiki.openstack.org/SystemUsageData#compute.instance.exists"

                  target="_blank">http://wiki.openstack.org/SystemUsageData#compute.instance.exists</a>:<br>

                >>>         )<br>

                >>>           * the aggregate() function

                returns the updated row. By<br>

                >>>         default it just += the counter

                value with the old row<br>

                >>>         returned by find()<br>

                >>><br>

                >>><br>

                >>>     Would we want aggregation to occur

                within the database where we<br>

                >>>     are collecting events, or should that

                move somewhere else?<br>

                >>     I assume the events collected by the

                metering agents will all be<br>

                >>     archived for auditing (or re-building the

                database)<br>

                >>     <a moz-do-not-send="true"

href="http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44"

                  target="_blank">http://wiki.openstack.org/EfficientMetering?action=diff&rev2=45&rev1=44</a><br>


                >><br>

                >>     Therefore the aggregation should occur when

                the database is<br>

                >>     updated to account for a new event.<br>

                >><br>

                >>     Does this make sense ? I may have

                misunderstood part of your question.<br>

                >><br>

                >><br>

                >> I guess what I don't understand is why the

                aggregated data is written<br>

                >> back to the metering database at all. If it's

                in the same database, it<br>

                >> seems like it should be in a different "table"

                (or equivalent) so the<br>

                >> original data is left alone.<br>

                > In my view the events are not stored in a database,

                they are merely<br>

                > appended to a log file. The database is built from

                the events with<br>

                > aggregated data. I now understand that you (and

                Joshua Harlow) think<br>

                > it's better to not aggregate the data and let the

                billing system do this<br>

                > job.<br>

                <br>

              </div>

            </div>

            My intent when writing the blueprint was that each event

            would be<br>

            recorded atomically in the database, as it is the only way

            to control<br>

            that we have not missed any. Aggregation should be done at

            the external<br>

            API level if the request is to get the sum of a given

            counter.<br>

          </blockquote>

          <div><br>

          </div>

          <div>That matches what I was thinking. The "log file" that

            Loic mentioned would in fact be a database that can handle a

            lot of writes. We could use some sort of simple file format,

            but since we're going to have to read and parse the log

            anyway, we might as well use a tool that makes that easy.</div>

          <div><br>

          </div>

          <div>Aggregation could happen either in a metering API based

            on the query, or an external app could retrieve a large

            dataset and manage the aggregation itself.</div>

          <div> </div>

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            What I missed in the blueprint and seems to be appearing

            clearly now, is<br>

            that an event needs to be able to carry the

            "object-reference" for which<br>

            it was collected, and this would seem highly necessary

            looking at the<br>

            messages in this thread. A metering event would essentially

            be defined<br>

            by (who, what, which) instead of a simple (who, what).  As a

            consequence<br>

            we would need to extend the DB schema to add this

            [which/object<br>

            reference], and make sure that we carry it as well when we

            work on<br>

            the message API format definition.<br>

            <br>

            How does this sound?<br>

          </blockquote>

          <div><br>

          </div>

          <div>I think so. A lot of these sorts of issues can probably

            be fixed by being careful about how we define the

            measurements. For example, I may want to be able to show a

            customer the network bandwidth used per server, not just per

            network. If we measure the bandwidth consumed by each VIF,

            the aggregation code can take care of summarizing by network

            (because we know where the VIF is) and/or server (because we

            know which server has the VIF).</div>

          <div><br>

          </div>

          <div>We may need to record more detail than a simple "which,"

            though, because it may be possible to change some

            information relevant for calculating the billing rate later.

            For example, a tenant can resize an instance, which would

            usually cause a change in the billing rate. Some of the

            relationships might change, too (Is it possible to move a

            VIF between networks?). </div>

          <div><br>

          </div>

          <div>At first I thought this might require separate table

            definitions per resource type (instance, network, etc.) but

            re-reading the table of counters in EfficientMetering I

            guess this is handled by measuring things like CPU, RAM, and

            block storage as separate counters? So a single event for

            creating a new instance might result in several records

            being written to the database, with the "which" set to the

            instance identifier. The data could then be presented as a

            unified "resource usage" report for that server.</div>

          <div><br>

          </div>

          <div>I think that works, but it may make the job of

            calculating the bill harder. We are planning to follow the

            model of specifying rates per size, so we would have to

            figure out which combination of CPU, RAM, and root volume

            storage matches up with a given size to determine the rate.</div>

        </div>

      </div>

    </blockquote>

    The counters and storage description currently in the blueprint are

    easily extensible. Adding a new counter does not require a

    modification to a database format. This is good. But this
    simplicity makes it more difficult to link the counters together:

    it would be much easier if the values related to an instance were in

    a table with the instance id as a key. <br>

    <br>

    The approach of <a class="moz-txt-link-freetext" href="http://wiki.openstack.org/SystemUsageData">http://wiki.openstack.org/SystemUsageData</a> is to

    fully describe the instance on each

    <a class="moz-txt-link-freetext" href="http://wiki.openstack.org/SystemUsageData#compute.instance.exists:">http://wiki.openstack.org/SystemUsageData#compute.instance.exists:</a>

    event and they could be stored in a database table, almost as

    described. Aggregating the content of the table would then be more

    difficult because the structure of the table is specific to the

    resource and the sum() API function would need to be implemented for

    each resource type instead of relying on the unified format being

    presented in the blueprint. <br>

    <br>
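    Just to illustrate the point about the unified format (a minimal
    sketch, not a proposal for the actual schema): if every record has
    the same shape, a single sum() works for any counter and any
    resource type. The field names (who, what, which, value), the
    counter names and the sample values below are made up for the
    example.<br>
    <pre>
```python
from collections import namedtuple

# Hypothetical unified record: tenant (who), counter name (what),
# resource id (which) and the measured value.
Record = namedtuple("Record", ["who", "what", "which", "value"])


def sum_counter(records, who, what, which=None):
    """Sum one counter for a tenant, optionally restricted to one resource."""
    return sum(
        r.value
        for r in records
        if r.who == who and r.what == what and (which is None or r.which == which)
    )


# Made-up sample data; only the instance ID 84f84ea84 appears in this thread.
records = [
    Record("tenant-a", "net_out_bytes", "84f84ea84", 8192),
    Record("tenant-a", "net_out_bytes", "instance-2", 4096),
    Record("tenant-a", "disk_io_bytes", "84f84ea84", 80),
    Record("tenant-b", "net_out_bytes", "instance-3", 2048),
]

total_for_tenant = sum_counter(records, "tenant-a", "net_out_bytes")
total_for_instance = sum_counter(records, "tenant-a", "net_out_bytes", "84f84ea84")
```
</pre>
    With one table per resource type, a function like this would have
    to be written once per table.<br>
    <br>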

    To keep the simplicity of the counters descriptions, we could add a

    description of the resource to the database. The instance ID

    84f84ea84 could be described as name=myinstance, flavor=m1.large

    etc. This instance description would be valid for a period of time,

    starting when the instance was created or when it is modified (name,

    flavor etc.).<br>

    <br>
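    For illustration only, here is a minimal sketch of such a
    time-bounded description. The class, the lookup function and the
    resize to m1.xlarge at t=100 are hypothetical; only the instance
    ID, the name and the m1.large flavor come from the example above.<br>
    <pre>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceDescription:
    """Attributes of one resource, valid from valid_from until valid_until."""
    resource_id: str
    attributes: dict
    valid_from: int                    # time of creation or modification
    valid_until: Optional[int] = None  # None means "still current"


def description_at(descriptions, resource_id, when):
    """Return the description of resource_id in effect at `when`, or None."""
    for d in descriptions:
        if d.resource_id != resource_id:
            continue
        started = when >= d.valid_from
        not_ended = d.valid_until is None or d.valid_until > when
        if started and not_ended:
            return d
    return None


# Hypothetical history: the instance is created at t=0 and resized at t=100.
history = [
    ResourceDescription("84f84ea84", {"name": "myinstance", "flavor": "m1.large"}, 0, 100),
    ResourceDescription("84f84ea84", {"name": "myinstance", "flavor": "m1.xlarge"}, 100),
]

flavor_before_resize = description_at(history, "84f84ea84", 50).attributes["flavor"]
flavor_after_resize = description_at(history, "84f84ea84", 150).attributes["flavor"]
```
</pre>
    The counters stay as they are; the billing system would look up the
    description in effect at the time of each measure to find the rate.<br>
    <br>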

    Yet another approach would be to use the Usage Record Format

    Recommendation from <a class="moz-txt-link-freetext" href="http://www.ogf.org/documents/GFD.98.pdf">http://www.ogf.org/documents/GFD.98.pdf</a> . The

    messages collected from nova, swift etc. would be translated into this
    format. The implementation may not be too complex if it translates
    well into a document (as defined by MongoDB) or a JSON object. The
    resulting structure would be more complex than the current counter
    definition, but the structure definition would probably change less
    over time because it is more mature than any structure we could
    come up with if we started thinking about it today. However, I'm not
    sure that it matches our requirements because it was written in the
    context of a grid (i.e. it's more about distributed computing than
    cloud). We could, for instance, ignore the parts related to "jobs".
    We could also advocate for the addition of a "Differentiated
    Property" (chapter 12) to account for Disk I/O in addition to
    Disk Usage. <br>

    <br>

    <blockquote

cite="mid:CADb+p3TCUtJX9b+Lvsjm98akmVMZ44eho6tVfQ8fdfCq8UxhFA@mail.gmail.com"

      type="cite">

      <div class="gmail_extra">

        <div class="gmail_quote">

          <div>

            <div><br class="Apple-interchange-newline">

              Another piece I've been thinking about is handling

              boundary conditions when resource create and delete events

              don't both fall inside a billing cycle (or within the

              granularity of the metering system). That shouldn't be

              part of logging the events, necessarily, but it could be a

              reusable component that feeds into producing the

              aggregated data (either through the API, or as a way of

              processing the results returned by the API).</div>

          </div>

        </div>

      </div>

    </blockquote>

    Could you explain more about this? I'm assuming you refer to, for

    instance, the following situation:<br>

    <br>

    a) T : Public IP is allocated <br>

    b) T+10 : event reports 100 bytes sent since T<br>

    c) T+20 : event reports 500 bytes sent since T<br>

    d) T+30 : end billing cycle<br>

    e) T+40 : event reports 1000 bytes sent since T<br>

    <br>

    When the billing cycle ends, we should charge 50% of the 500 bytes
    transferred between T+20 and T+40, hence 250 bytes for that interval. <br>

    <br>
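    To make the arithmetic explicit, here is a small sketch that
    estimates the cumulative counter at the end of the billing cycle by
    linear interpolation between the two surrounding reports. Spreading
    the traffic evenly between reports is my assumption, as is the
    (0, 0) sample at T (nothing sent before the IP is allocated); the
    function name is hypothetical.<br>
    <pre>
```python
def interpolate_counter(samples, at):
    """Estimate a cumulative counter at time `at` by linear interpolation.

    `samples` is a list of (time, cumulative_bytes) pairs sorted by time.
    """
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t1 >= at >= t0:
            return v0 + (v1 - v0) * (at - t0) / (t1 - t0)
    raise ValueError("time is outside the sampled range")


# Offsets from T and cumulative bytes sent, taken from the situation above.
samples = [(0, 0), (10, 100), (20, 500), (40, 1000)]
cycle_end = 30

billed_at_cycle_end = interpolate_counter(samples, cycle_end)
share_of_last_interval = billed_at_cycle_end - 500
```
</pre>
    Under that assumption the cycle ending at T+30 is charged 750 bytes
    in total: the 500 bytes reported at T+20 plus 250 bytes, i.e. half
    of what was transferred between T+20 and T+40. The other 250 bytes
    go to the next cycle.<br>
    <br>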

    Your answer will allow me to expand on the "things that might

    happen" below.<br>

    <br>

    Cheers :-)<br>

    <blockquote

cite="mid:CADb+p3TCUtJX9b+Lvsjm98akmVMZ44eho6tVfQ8fdfCq8UxhFA@mail.gmail.com"

      type="cite">

      <div class="gmail_extra">

        <div class="gmail_quote">

          <blockquote class="gmail_quote" style="margin:0 0 0

            .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div class="im HOEnZb">

              >> Maybe it's time to start focusing these

              discussions on user stories?<br>

              >><br>

              > I agree. Would you like to go first ?<br>

            </div>

          </blockquote>

          <div><br>

          </div>

          <div>These are "things that might happen" use cases rather

            than "user stories," but let's see where they take us:</div>

          <div><br>

          </div>

          <div>1. User creates an instance, waits some period of time,

            then terminates it.</div>

          <div> - Vary the period of time to allow the events to both

            fall within the metering granularity window, to overlap an

            entire window, to start in one window and end in another.</div>

          <div> - The same variations for "billing cycle" instead of

            "metering granularity window."</div>

          <div>2. User creates an instance, waits some period of time,

            then resizes it.</div>

          <div> - Vary the period of time as above.</div>

          <div> - Do we need variations for resizing up and down?</div>

          <div>3. User creates an instance but it fails to create

            properly (provider issue).</div>

          <div>4. User creates an instance but it fails to boot after

            creation (bad image).</div>

          <div>5. User creates volume storage, adds it to an existing

            instance, waits a period of time, then deletes the volume.</div>

          <div> - Vary the period of time as above.</div>

          <div>6. User creates volume storage, adds it to an existing

            instance, waits a period of time, then terminates the

            instance (I'm not sure what happens to the volume in that

            case, maybe it still exists?)</div>

          <div><br>

          </div>

          <div>A provider-related story might be:</div>

          <div><br>

          </div>

          <div>1. As a provider, I can query the metering API to

            determine the activity for a tenant within a given period of

            time.</div>

          <div><br>

          </div>

          <div>

            Although that's pretty vague. :-)</div>

          <div><br>

          </div>

          <div>Doug</div>

          <div><br>

          </div>

        </div>

      </div>

      <br>


      <br>

      <pre wrap="">_______________________________________________

Mailing list: <a class="moz-txt-link-freetext" href="https://launchpad.net/~openstack">https://launchpad.net/~openstack</a>

Post to     : <a class="moz-txt-link-abbreviated" href="mailto:openstack@lists.launchpad.net">openstack@lists.launchpad.net</a>

Unsubscribe : <a class="moz-txt-link-freetext" href="https://launchpad.net/~openstack">https://launchpad.net/~openstack</a>

More help   : <a class="moz-txt-link-freetext" href="https://help.launchpad.net/ListHelp">https://help.launchpad.net/ListHelp</a>

</pre>

    </blockquote>

    <br>

    <br>

    <pre class="moz-signature" cols="3000">-- 

Loïc Dachary         Chief Research Officer

// eNovance labs   <a class="moz-txt-link-freetext" href="http://labs.enovance.com">http://labs.enovance.com</a>

// ✉ <a class="moz-txt-link-abbreviated" href="mailto:loic@enovance.com">loic@enovance.com</a>  ☎ +33 1 49 70 99 82

</pre>

  </body>

</html>