[openstack-dev] Image without host-id

Santosh Kumar Santosh8.Kumar at aricent.com
Wed Nov 20 04:43:04 UTC 2013


Hi Experts,

I am following the Havana guide for creating a three-node setup.

Everything has been installed and configured.

However, I am not able to create a network for the VMs. Because of that, when I create a VM it comes up without a host-id (the VM does get launched).

nova network-create is not working for me. Any pointers on this would be appreciated.
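In case it helps to see what I am trying to achieve, here is a minimal sketch using python-neutronclient (the credentials, endpoint and names below are placeholders, not my real configuration):

from neutronclient.v2_0 import client

# Placeholder credentials/endpoint - substitute the real environment.
neutron = client.Client(username='admin',
                        password='ADMIN_PASS',
                        tenant_name='admin',
                        auth_url='http://controller:5000/v2.0')

# Create a tenant network plus an IPv4 subnet for the VMs to plug into.
net = neutron.create_network({'network': {'name': 'demo-net'}})
net_id = net['network']['id']
neutron.create_subnet({'subnet': {'network_id': net_id,
                                  'name': 'demo-subnet',
                                  'ip_version': 4,
                                  'cidr': '10.0.0.0/24'}})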

Regards
Santosh


-----Original Message-----
From: openstack-dev-request at lists.openstack.org [mailto:openstack-dev-request at lists.openstack.org]
Sent: Wednesday, November 20, 2013 4:27 AM
To: openstack-dev at lists.openstack.org
Subject: OpenStack-dev Digest, Vol 19, Issue 55

Send OpenStack-dev mailing list submissions to
        openstack-dev at lists.openstack.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
or, via email, send a message with subject or body 'help' to
        openstack-dev-request at lists.openstack.org

You can reach the person managing the list at
        openstack-dev-owner at lists.openstack.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of OpenStack-dev digest..."


Today's Topics:

   1. Re: Reg : Security groups implementation using openflows in
      quantum ovs plugin (Amir Sadoughi)
   2. Re: [Nova] Does Nova really need an SQL database?
      (Caitlin Bestler)
   3. Re: [Neutron] - Vendor specific erros (Salvatore Orlando)
   4. Re: Introducing the new OpenStack service for Containers
      (James Bottomley)
   5. Re: [Heat] Continue discussing multi-region       orchestration
      (Clint Byrum)
   6. Re: [Swift] Metadata Search API (Paula Ta-Shma)
   7. Re: [infra] How to determine patch set load for a given
      project (Ilya Shakhat)
   8. Re: [Heat] rough draft of Heat autoscaling API
      (Christopher Armstrong)
   9. Re: [Swift] Metadata Search API (Jay Pipes)
  10. Re: [Nova] Does Nova really need an SQL database? (Joshua Harlow)
  11. Re: [Neutron] Race condition between DB layer and plugin
      back-end implementation (Joshua Harlow)
  12. Re: Introducing the new OpenStack service for     Containers
      (Sam Alba)
  13. Re: [Nova] Does Nova really need an SQL database? (Clint Byrum)
  14. Re: [Neutron] Race condition between DB layer and plugin
      back-end implementation (Joshua Harlow)
  15. Re: Introducing the new OpenStack service for     Containers
      (Eric Windisch)
  16. Re: [Nova] Does Nova really need an SQL database? (Joshua Harlow)
  17. Re: [Nova] [Ironic] [TripleO] scheduling flow with        Ironic?
      (Devananda van der Veen)
  18. RFC: Potential to increase min required libvirt   version to
      0.9.8 ? (Daniel P. Berrange)
  19. Re: Introducing the new OpenStack service for Containers
      (James Bottomley)
  20. Re: Introducing the new OpenStack service for Containers
      (Daniel P. Berrange)
  21. Re: [Heat] rough draft of Heat autoscaling API (Steven Dake)
  22. Re: [Nova] Does Nova really need an SQL database? (Chris Friesen)
  23. [Heat] Version Negotiation Middleware Accept Header issue
      (Fuente, Pablo A)
  24. Re: [Nova] Does Nova really need an SQL database? (Chris Friesen)
  25. Re: [Nova] Does Nova really need an SQL database? (Clint Byrum)
  26. Re: Introducing the new OpenStack service for     Containers
      (Rick Jones)
  27. Re: [Nova] Does Nova really need an SQL database? (Chris Friesen)
  28. Re: [Neutron][LBaaS] SSL Termination write-up (Eugene Nikanorov)
  29. Re: Introducing the new OpenStack service for     Containers
      (Russell Bryant)
  30. Re: [Heat] HOT software configuration refined after design
      summit discussions (Steve Baker)
  31. Re: [Ironic][Ceilometer] get IPMI data for        ceilometer
      (Devananda van der Veen)
  32. Re: [Heat] HOT software configuration refined after       design
      summit discussions (Clint Byrum)
  33. Re: [Heat] HOT software configuration refined after design
      summit discussions (Steve Baker)
  34. Re: [Heat] HOT software configuration refined after       design
      summit discussions (Clint Byrum)
  35. Re: [Heat] Continue discussing multi-region       orchestration
      (Zane Bitter)
  36. [TripleO] Easier way of trying TripleO (James Slagle)
  37. [Nova] Icehouse roadmap status (Russell Bryant)
  38. Re: [Horizon] PTL election (Thierry Carrez)
  39. Re: [Horizon] PTL election (Matthias Runge)
  40. Re: [Neutron] Race condition between DB layer and plugin
      back-end implementation (Salvatore Orlando)
  41. Re: [Heat] Version Negotiation Middleware Accept Header issue
      (Angus Salkeld)
  42. Re: [Heat] rough draft of Heat autoscaling API (Zane Bitter)
  43. Re: [Neutron] Race condition between DB layer and plugin
      back-end implementation (Joshua Harlow)


----------------------------------------------------------------------

Message: 1
Date: Tue, 19 Nov 2013 17:44:05 +0000
From: Amir Sadoughi <amir.sadoughi at RACKSPACE.COM>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] Reg : Security groups implementation
        using openflows in quantum ovs plugin
Message-ID: <3508E6AC-1A85-453C-8A68-54B5F2B18E23 at rackspace.com>
Content-Type: text/plain; charset="windows-1252"

Yes, my work has been on ML2 with neutron-openvswitch-agent.  I'm interested to see what Jun Park has. I might have something ready before he is available again, but would like to collaborate regardless.

Amir


On Nov 19, 2013, at 3:31 AM, Kanthi P <pavuluri.kanthi at gmail.com<mailto:pavuluri.kanthi at gmail.com>> wrote:

Hi All,

Thanks for the response!
Amir, Mike: Is your implementation being done according to the ML2 plugin?

Regards,
Kanthi


On Tue, Nov 19, 2013 at 1:43 AM, Mike Wilson <geekinutah at gmail.com<mailto:geekinutah at gmail.com>> wrote:
Hi Kanthi,

Just to reiterate what Kyle said, we do have an internal implementation using flows that looks very similar to security groups. Jun Park was the guy that wrote this and is looking to get it upstreamed. I think he'll be back in the office late next week. I'll point him to this thread when he's back.

-Mike


On Mon, Nov 18, 2013 at 3:39 PM, Kyle Mestery (kmestery) <kmestery at cisco.com<mailto:kmestery at cisco.com>> wrote:
On Nov 18, 2013, at 4:26 PM, Kanthi P <pavuluri.kanthi at gmail.com<mailto:pavuluri.kanthi at gmail.com>> wrote:
> Hi All,
>
> We are planning to implement quantum security groups using openflows for ovs plugin instead of iptables which is the case now.
>
> Doing so we can avoid the extra linux bridge which is connected between the vnet device and the ovs bridge, which is given as a work around since ovs bridge is not compatible with iptables.
>
> We are planning to create a blueprint and work on it. Could you please share your views on this
>
Hi Kanthi:

Overall, this idea is interesting and removing those extra bridges would certainly be nice. Some people at Bluehost gave a talk at the Summit [1] in which they explained they have done something similar, you may want to reach out to them since they have code for this internally already.

The OVS plugin is in feature freeze during Icehouse, and will be deprecated in favor of ML2 [2] at the end of Icehouse. I would advise you to retarget your work at ML2 when running with the OVS agent instead. The Neutron team will not accept new features into the OVS plugin anymore.

Thanks,
Kyle

[1] http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/towards-truly-open-and-commoditized-software-defined-networks-in-openstack
[2] https://wiki.openstack.org/wiki/Neutron/ML2

> Thanks,
> Kanthi
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org<mailto:OpenStack-dev at lists.openstack.org>
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org<mailto:OpenStack-dev at lists.openstack.org>
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org<mailto:OpenStack-dev at lists.openstack.org>
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org<mailto:OpenStack-dev at lists.openstack.org>
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20131119/96e892c6/attachment-0001.html>

------------------------------

Message: 2
Date: Tue, 19 Nov 2013 09:46:58 -0800
From: Caitlin Bestler <caitlin.bestler at nexenta.com>
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [Nova] Does Nova really need an SQL
        database?
Message-ID: <528BA412.3040801 at nexenta.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 11/18/2013 11:35 AM, Mike Spreitzer wrote:
> There were some concerns expressed at the summit about scheduler
> scalability in Nova, and a little recollection of Boris' proposal to
> keep the needed state in memory.  I also heard one guy say that he
> thinks Nova does not really need a general SQL database, that a NOSQL
> database with a bit of denormalization and/or client-maintained
> secondary indices could suffice.  Has that sort of thing been considered
> before?  What is the community's level of interest in exploring that?
>
> Thanks,
> Mike
>

How the data is stored is not the central question. The real issue is
how the data is normalized and distributed.

Data that is designed to be distributed deals with temporary
inconsistencies and only worries about eventual consistency.
Once you have that you can store the data in Objects, or in
a distributed database.

If you define your data so that you need global synchronization then
you will always be fighting scaling issues.





------------------------------

Message: 3
Date: Tue, 19 Nov 2013 18:50:03 +0100
From: Salvatore Orlando <sorlando at nicira.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Neutron] - Vendor specific erros
Message-ID:
        <CAGR=i3hAyCPadaEmnqqcY34ntYf_08NULAroa5X0KObY+G6LrA at mail.gmail.com>
Content-Type: text/plain; charset="windows-1252"

Thanks Avishay.

I think the status description error was introduced with this aim.
Whether vendor-specific error descriptions can make sense to a tenant,
that's a good question.
Personally, I feel that as a tenant that information would not be very
useful to me, as I would not be able to do any debugging or maintenance on the
appliance where the error was generated; on the other hand, as a deployer I
might find that message very useful, but I would probably look for it in
the logs rather than in API responses; furthermore, as a deployer I might
find it more convenient not to provide tenants with any detail about the
particular driver being used.

On this note however, this is just my personal opinion. I'm sure there are
plenty of valid use cases for giving tenants vendor-specific error messages.

Salvatore


On 19 November 2013 13:00, Avishay Balderman <AvishayB at radware.com> wrote:

>  Hi Salvatore
>
> I think you are mixing up the state machine (ACTIVE, PENDING_XYZ, etc.)
>  and the status description
>
> All I want to do is to write a vendor specific error message when the
> state is ERROR.
>
> I DO NOT want to touch the state machine.
>
>
>
> See: https://bugs.launchpad.net/neutron/+bug/1248423
>
>
>
> Thanks
>
>
>
> Avishay
>
>
>
> *From:* Salvatore Orlando [mailto:sorlando at nicira.com]
> *Sent:* Thursday, November 14, 2013 1:15 PM
> *To:* OpenStack Development Mailing List (not for usage questions)
> *Subject:* Re: [openstack-dev] [Neutron] - Vendor specific erros
>
>
>
> In general, an error state makes sense.
>
> I think you might want to send more details about how this state plugs
> into the load balancer state machine, but I reckon it is a generally
> non-recoverable state which could be reached by any other state; in that
> case it would be a generic enough case which might be supported by all
> drivers.
>
>
>
> It is good to point out, however, that driver-specific state transitions, in
> my opinion, are to be avoided; applications using the Neutron API will become
> non-portable, or at least users of the Neutron API would need to be aware
> that an entity might have a different state machine from driver to driver,
> which I reckon would be bad enough for a developer to decide to switch over
> to Cloudstack or AWS APIs!
>
>
>
> Salvatore
>
>
>
> PS: On the last point I am obviously joking, but not so much.
>
>
>
>
>
> On 12 November 2013 08:00, Avishay Balderman <AvishayB at radware.com> wrote:
>
>
>
> Hi
>
> Some of the DB entities in the LBaaS domain inherit from
> HasStatusDescription<https://github.com/openstack/neutron/blob/master/neutron/db/models_v2.py#L40>
>
> With this we can set the entity status (ACTIVE, PENDING_CREATE,etc) and a
> description for the status.
>
> There are flows in the Radware LBaaS driver where the driver needs to set
> the entity status to ERROR, and it is able to set the description of the
> error; the description is Radware-specific.
>
> My question is:  Does it make sense to do that?
>
> After all, the tenant is aware of the fact that he works against a Radware
> load balancer - the tenant selected Radware as the LBaaS provider in the
> UI.
>
> Any reason not to do that?
>
>
>
> This is a generic issue/question and does not relate  to a specific plugin
> or driver.
>
>
>
> Thanks
>
>
>
> Avishay
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20131119/fe539ef0/attachment-0001.html>

------------------------------

Message: 4
Date: Tue, 19 Nov 2013 10:02:45 -0800
From: James Bottomley <James.Bottomley at HansenPartnership.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] Introducing the new OpenStack service for
        Containers
Message-ID:
        <1384884165.18698.50.camel at dabdike.int.hansenpartnership.com>
Content-Type: text/plain; charset="ISO-8859-15"

On Mon, 2013-11-18 at 14:28 -0800, Stuart Fox wrote:
> Hey all
>
> Not having been at the summit (maybe the next one), could somebody
> give a really short explanation as to why it needs to be a separate
> service?
> It sounds like it should fit within the Nova area. It is, after all,
> just another hypervisor type, or so it seems.

I can take a stab at this:  Firstly, a container is *not* a hypervisor.
Hypervisor based virtualisation is done at the hardware level (so with
hypervisors you boot a second kernel on top of the virtual hardware),
container based virtualisation is done at the OS (kernel) level (so all
containers share the same kernel ... and sometimes even huge chunks of
the OS). With recent advances in the Linux Kernel, we can make a
container behave like a hypervisor (full OS/IaaS virtualisation), but
quite a bit of the utility of containers is that they can do much more
than hypervisors, so they shouldn't be constrained by hypervisor APIs
(which are effectively virtual hardware APIs).

It is possible to extend the Nova APIs to control containers more fully,
but there was resistance to doing this on the grounds that it's
expanding the scope of Nova, hence the new project.

James





------------------------------

Message: 5
Date: Tue, 19 Nov 2013 10:03:47 -0800
From: Clint Byrum <clint at fewbar.com>
To: openstack-dev <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Heat] Continue discussing multi-region
        orchestration
Message-ID: <1384883024-sup-1571 at clint-HP>
Content-Type: text/plain; charset=UTF-8

Excerpts from Zane Bitter's message of 2013-11-15 12:41:53 -0800:
> Good news, everyone! I have created the missing whiteboard diagram that
> we all needed at the design summit:
>
> https://wiki.openstack.org/wiki/Heat/Blueprints/Multi_Region_Support_for_Heat/The_Missing_Diagram
>
> I've documented 5 possibilities. (1) is the current implementation,
> which we agree we want to get away from. I strongly favour (2) for the
> reasons listed. I don't think (3) has many friends. (4) seems to be
> popular despite the obvious availability problem and doubts that it is
> even feasible. Finally, I can save us all some time by simply stating
> that I will -2 on sight any attempt to implement (5).
>
> When we're discussing this, please mention explicitly the number of the
> model you are talking about at any given time.
>
> If you have a suggestion for a different model, make your own diagram!
> jk, you can sketch it or something for me and I'll see if I can add it.

Thanks for putting this together Zane. I just now got around to looking
closely.

Option 2 is good. I'd love for option 1 to be made automatic by making
the client smarter, but parsing templates in the client will require
some deep thought before we decide it is a good idea.

I'd like to consider a 2a, which just has the same Heat engines the user
is talking to being used to do the orchestration in whatever region
they are in. I think that is actually the intention of the diagram,
but it looks like there is a "special" one that talks to the engines
that actually do the work.

Option 2 may actually morph into 3: if users don't like the nested stack
requirement for 2, we can do the work to basically make the engine create
a nested stack per region. So that makes 2 a stronger choice for a first
implementation.

4 has an unstated pro, which is that attack surface is reduced. This
makes more sense when you consider the TripleO case where you may want
the undercloud (hardware cloud) to orchestrate things existing in the
overcloud (vm cloud) but you don't want the overcloud administrators to
be able to control your entire stack.

Given CAP theorem, option 5, the global orchestrator, would be doable
with not much change as long as partition tolerance were the bit we gave
up. We would just have to have a cross-region RPC bus and database. Of
course, since regions are most likely to be partitioned, that is not
really a good choice. Trading partition tolerance for consistency lands
us in the complexity black hole. Trading out availability makes it no
better than option 4.



------------------------------

Message: 6
Date: Tue, 19 Nov 2013 20:08:19 +0200
From: Paula Ta-Shma <PAULA at il.ibm.com>
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [Swift] Metadata Search API
Message-ID:
        <OFE1BBCB03.C13B2EFC-ONC2257C28.00632CB7-C2257C28.0063A3CD at il.ibm.com>
Content-Type: text/plain; charset=US-ASCII


> My apologies, I'm apparently coming into this quite late :) Would you
> mind sharing a link to the HP proposal? I wasn't at the summit
> unfortunately and am playing a bit of catch up.

Hi Jay,

Take a look at https://wiki.openstack.org/wiki/MetadataSearch
There are links there to the REST API proposed by HP and the slides
presented at the summit.

regards
Paula





------------------------------

Message: 7
Date: Tue, 19 Nov 2013 22:12:30 +0400
From: Ilya Shakhat <ishakhat at mirantis.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [infra] How to determine patch set load
        for a given project
Message-ID:
        <CAMzOD1KtUxGHgs__WnFmibpxtyxyKA3z1nnyCV6qqbrXCPAO0Q at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Matt,

As an option, you may estimate the load using Stackalytics data on the number of
commits -
http://stackalytics.com/?release=icehouse&metric=commits&project_type=all&module=sqlalchemy-migrate
The number of commits is certainly less than the number of patches, but for the
sqlalchemy-migrate project a multiplier of 2 will give a good estimate.
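
Also, if the raw numbers behind the graphite graphs mentioned below are easier to work with than the rendered images, the render API can return JSON. A rough sketch (the metric path is only a guess at the stats_counts > zuul > job naming, so adjust it to whatever you find while browsing the tree):

import requests

# Hypothetical metric path - browse graphite.openstack.org to find the real one.
metric = 'stats_counts.zuul.job.gate-sqlalchemy-migrate-python27.*'

resp = requests.get('http://graphite.openstack.org/render',
                    params={'target': metric,
                            'from': '-2weeks',
                            'format': 'json'})
for series in resp.json():
    total = sum(value for value, _ts in series['datapoints'] if value)
    print(series['target'], total)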

Thanks,
Ilya


2013/11/19 Matt Riedemann <mriedem at linux.vnet.ibm.com>

> We have a team working on getting CI setup for DB2 10.5 in
> sqlalchemy-migrate and they were asking me if there was a way to calculate
> the patch load through that project.
>
> I asked around in the infra IRC channel and Jeremy Stanley pointed out
> that there might be something available in http://graphite.openstack.org/ by looking for the project's test stats.
>
> I found that if you expand stats_counts > zuul > job and then search for
> your project (sqlalchemy-migrate in this case), you can find the jobs and
> their graphs for load. In my case I care about stats for
> gate-sqlalchemy-migrate-python27.
>
> I'm having a little trouble interpreting the data though. From looking at
> what's out there for review now, there is one new patch created on 11/19
> and the last new one before that was on 11/15. I see spikes in the graph
> around 11/15, 11/18 and 11/19, but I'm not sure what the 11/18 spike is
> from?
>
> --
>
> Thanks,
>
> Matt Riedemann
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20131119/a01c8ce4/attachment-0001.html>

------------------------------

Message: 8
Date: Tue, 19 Nov 2013 12:14:24 -0600
From: Christopher Armstrong <chris.armstrong at rackspace.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Heat] rough draft of Heat autoscaling
        API
Message-ID:
        <CAPkRfUSz4jtyTxDVN6zVVfqU_U2cQcjizMqwt0VbW+yv5HqG+Q at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

On Mon, Nov 18, 2013 at 5:57 AM, Zane Bitter <zbitter at redhat.com> wrote:

> On 16/11/13 11:15, Angus Salkeld wrote:
>
>> On 15/11/13 08:46 -0600, Christopher Armstrong wrote:
>>
>>> On Fri, Nov 15, 2013 at 3:57 AM, Zane Bitter <zbitter at redhat.com> wrote:
>>>
>>>  On 15/11/13 02:48, Christopher Armstrong wrote:
>>>>
>>>>  On Thu, Nov 14, 2013 at 5:40 PM, Angus Salkeld <asalkeld at redhat.com
>>>>> <mailto:asalkeld at redhat.com>> wrote:
>>>>>
>>>>>     On 14/11/13 10:19 -0600, Christopher Armstrong wrote:
>>>>>
>>>>>         http://docs.heatautoscale.__apiary.io/
>>>>>
>>>>>         <http://docs.heatautoscale.apiary.io/>
>>>>>
>>>>>         I've thrown together a rough sketch of the proposed API for
>>>>>         autoscaling.
>>>>>         It's written in API-Blueprint format (which is a simple subset
>>>>>         of Markdown)
>>>>>         and provides schemas for inputs and outputs using JSON-Schema.
>>>>>         The source
>>>>>         document is currently at
>>>>>         https://github.com/radix/heat/__raw/as-api-spike/
>>>>> autoscaling.__apibp
>>>>>
>>>>>
>>>>> <https://github.com/radix/heat/raw/as-api-spike/autoscaling.apibp
>>>>> >
>>>>>
>>>>>
>>>>>         Things we still need to figure out:
>>>>>
>>>>>         - how to scope projects/domains. put them in the URL? get them
>>>>>         from the
>>>>>         token?
>>>>>         - how webhooks are done (though this shouldn't affect the API
>>>>>         too much;
>>>>>         they're basically just opaque)
>>>>>
>>>>>         Please read and comment :)
>>>>>
>>>>>
>>>>>     Hi Christopher
>>>>>
>>>>>     In the group create object you have 'resources'.
>>>>>     Can you explain what you expect in there? I thought we talked at
>>>>>     summit about have a unit of scaling as a nested stack.
>>>>>
>>>>>     The thinking here was:
>>>>>     - this makes the new config stuff easier to scale (config get
>>>>> applied
>>>>>        per scaling stack)
>>>>>
>>>>>     - you can potentially place notification resources in the scaling
>>>>>        stack (think marconi message resource - on-create it sends a
>>>>>        message)
>>>>>
>>>>>     - no need for a launchconfig
>>>>>     - you can place a LoadbalancerMember resource in the scaling stack
>>>>>        that triggers the loadbalancer to add/remove it from the lb.
>>>>>
>>>>>
>>>>>     I guess what I am saying is I'd expect an api to a nested stack.
>>>>>
>>>>>
>>>>> Well, what I'm thinking now is that instead of "resources" (a
>>>>> mapping of
>>>>> resources), just have "resource", which can be the template definition
>>>>> for a single resource. This would then allow the user to specify a
>>>>> Stack
>>>>> resource if they want to provide multiple resources. How does that
>>>>> sound?
>>>>>
>>>>>
>>>> My thought was this (digging into the implementation here a bit):
>>>>
>>>> - Basically, the autoscaling code works as it does now: creates a
>>>> template
>>>> containing OS::Nova::Server resources (changed from AWS::EC2::Instance),
>>>> with the properties obtained from the LaunchConfig, and creates a
>>>> stack in
>>>> Heat.
>>>> - LaunchConfig can now contain any properties you like (I'm not 100%
>>>> sure
>>>> about this one*).
>>>> - The user optionally supplies a template. If the template is
>>>> supplied, it
>>>> is passed to Heat and set in the environment as the provider for the
>>>> OS::Nova::Server resource.
>>>>
>>>>
>>>>  I don't like the idea of binding to OS::Nova::Server specifically for
>>> autoscaling. I'd rather have the ability to scale *any* resource,
>>> including
>>> nested stacks or custom resources. It seems like jumping through hoops to
>>>
>>
>> big +1 here, autoscaling should not even know what it is scaling, just
>> some resource. solum might want to scale all sorts of non-server
>> resources (and other users).
>>
>
> I'm surprised by the negative reaction to what I suggested, which is a
> completely standard use of provider templates. Allowing a user-defined
> stack of resources to stand in for an unrelated resource type is the entire
> point of providers. Everyone says that it's a great feature, but if you try
> to use it for something they call it a "hack". Strange.
>

To clarify this position (which I already did in IRC), replacing one
concrete resource with another that means something in a completely
different domain is a hack -- say, replacing "server" with "group of
related resources". However, replacing OS::Nova::Server with something
which still does something very much like creating a server is reasonable
-- e.g., using a different API like one for creating containers or using a
different cloud provider's API.


>
> So, allow me to make a slight modification to my proposal:
>
> - The autoscaling service manages a template containing
> OS::Heat::ScaledResource resources. This is an imaginary resource type that
> is not backed by a plugin in Heat.
> - If no template is supplied by the user, the environment declares another
> resource plugin as the provider for OS::Heat::ScaledResource (by default it
> would be OS::Nova::Server, but this should probably be configurable by the
> deployer... so if you had a region full of Docker containers and no Nova
> servers, you could set it to OS::Docker::Container or something).
> - If a provider template is supplied by the user, it would be specified as
> the provider in the environment file.
>
> This, I hope, demonstrates that autoscaling needs no knowledge whatsoever
> about what it is scaling to use this approach.
>
>
It'd be interesting to see some examples, I think. I'll provide some
examples of my proposals, with the following caveats:

- I'm assuming a separation of launch configuration from scaling group, as
you proposed -- I don't really have a problem with this.
- I'm also writing these examples with the plural "resources" parameter,
which there has been some bikeshedding around - I believe the structure can
be the same whether we go with singular, plural, or even
whole-template-as-a-string.

# trivial example: scaling a single server

POST /launch_configs

{
    "name": "my-launch-config",
    "resources": {
        "my-server": {
            "type": "OS::Nova::Server",
            "properties": {
                "image": "my-image",
                "flavor": "my-flavor", # etc...
            }
        }
    }
}

POST /groups

{
    "name": "group-name",
    "launch_config": "my-launch-config",
    "min_size": 0,
    "max_size": 0,
}

(and then, the user would continue on to create a policy that scales the
group, etc)

# complex example: scaling a server with an attached volume

POST /launch_configs

{
    "name": "my-launch-config",
    "resources": {
        "my-volume": {
            "type": "OS::Cinder::Volume",
            "properties": {
                # volume properties...
            }
        },
        "my-server": {
            "type": "OS::Nova::Server",
            "properties": {
                "image": "my-image",
                "flavor": "my-flavor", # etc...
            }
        },
        "my-volume-attachment": {
            "type": "OS::Cinder::VolumeAttachment",
            "properties": {
                "volume_id": {"get_resource": "my-volume"},
                "instance_uuid": {"get_resource": "my-server"},
                "mountpoint": "/mnt/volume"
            }
        }
    }
}

(and so on, creating the group and policies in the same way).


Can you please provide an example of your proposal for the same use cases?
Please indicate how you'd specify the custom properties for each resource
and how you specify the provider template in the API.

--
IRC: radix
Christopher Armstrong
Rackspace
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20131119/1fc4191f/attachment-0001.html>

------------------------------

Message: 9
Date: Tue, 19 Nov 2013 13:22:14 -0500
From: Jay Pipes <jaypipes at gmail.com>
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [Swift] Metadata Search API
Message-ID: <528BAC56.9090801 at gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 11/19/2013 01:08 PM, Paula Ta-Shma wrote:
>
>> My apologies, I'm apparently coming into this quite late :) Would you
>> mind sharing a link to the HP proposal? I wasn't at the summit
>> unfortunately and am playing a bit of catch up.
>
> Hi Jay,
>
> Take a look at https://wiki.openstack.org/wiki/MetadataSearch
> There are links there to the REST API proposed by HP and the slides
> presented at the summit.

Thank you, Paula!

-jay




------------------------------

Message: 10
Date: Tue, 19 Nov 2013 18:27:52 +0000
From: Joshua Harlow <harlowja at yahoo-inc.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>, Chris Friesen
        <chris.friesen at windriver.com>
Subject: Re: [openstack-dev] [Nova] Does Nova really need an SQL
        database?
Message-ID: <CEB0ECBE.4D584%harlowja at yahoo-inc.com>
Content-Type: text/plain; charset="us-ascii"

Personally I would prefer #3 from the below. #2 I think will still have to
deal with consistency issues, just switching away from a DB doesn't make
magical ponies and unicorns appear (in-fact it can potentially make the
problem worse if its done incorrectly - and its pretty easy to get it
wrong IMHO). #1 could also work, but then u hit a vertical scaling limit
(works if u paid Oracle for their DB or IBM for DB2 I suppose). I prefer
#2 since I think it is honestly needed under all solutions.
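
To make #3 a bit more concrete, here is a rough sketch of the claim-then-retry idea. Everything below (the db helper, the column names, the trivial host picker) is invented purely for illustration and is not actual nova code:

MAX_ATTEMPTS = 5

def pick_best_host(hosts, request):
    # Trivial stand-in for the real filter/weigher pipeline.
    return max(hosts, key=lambda h: h['free_ram_mb'])

def schedule_with_reservation(db, request, hosts):
    for _ in range(MAX_ATTEMPTS):
        # 1) Tentative decision based on a possibly stale view.
        host = pick_best_host(hosts, request)
        # 2) Conditional update: only claim the resources if the free RAM
        #    the decision was based on is still what the DB row says
        #    (a compare-and-swap on the host record).
        claimed = db.claim_resources(host_id=host['id'],
                                     expected_free_ram=host['free_ram_mb'],
                                     ram_requested=request['ram_mb'])
        if claimed:
            return host
        # 3) Someone got there first - refresh the view and try again.
        hosts = db.get_host_states()
    raise Exception('no valid host found after %d attempts' % MAX_ATTEMPTS)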

On 11/19/13 9:29 AM, "Chris Friesen" <chris.friesen at windriver.com> wrote:

>On 11/18/2013 06:47 PM, Joshua Harlow wrote:
>> An idea related to this, what would need to be done to make the DB have
>> the exact state that a compute node is going through (and therefore the
>> scheduler would not make unreliable/racey decisions, even when there are
>> multiple schedulers). It's not like we are dealing with a system which
>> can not know the exact state (as long as the compute nodes are connected
>> to the network, and a network partition does not occur).
>
>How would you synchronize the various schedulers with each other?
>Suppose you have multiple scheduler nodes all trying to boot multiple
>instances each.
>
>Even if, at the start of the process, each scheduler has a perfect
>view of the system, each scheduler would need to have a view of what
>every other scheduler is doing in order to not make racy decisions.
>
>I see a few options:
>
>1) Push scheduling down into the database itself.  Implement scheduler
>filters as SQL queries or stored procedures.
>
>2) Get rid of the DB for scheduling.  It looks like people are working
>on this: https://blueprints.launchpad.net/nova/+spec/no-db-scheduler
>
>3) Do multi-stage scheduling.  Do a "tentative" schedule, then try and
>update the DB to reserve all the necessary resources.  If that fails,
>someone got there ahead of you so try again with the new data.
>
>Chris
>
>_______________________________________________
>OpenStack-dev mailing list
>OpenStack-dev at lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




------------------------------

Message: 11
Date: Tue, 19 Nov 2013 18:33:02 +0000
From: Joshua Harlow <harlowja at yahoo-inc.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>, Isaku Yamahata
        <isaku.yamahata at gmail.com>
Cc: Robert Kukura <rkukura at redhat.com>
Subject: Re: [openstack-dev] [Neutron] Race condition between DB layer
        and plugin back-end implementation
Message-ID: <CEB0EDB9.4D58F%harlowja at yahoo-inc.com>
Content-Type: text/plain; charset="Windows-1252"

If you start adding these states you might really want to consider the
following work that is going on in other projects.

It surely appears that everyone is starting to hit the same problem (and
joining efforts would produce a more beneficial result).

Relevant icehouse etherpads:
- https://etherpad.openstack.org/p/CinderTaskFlowFSM
- https://etherpad.openstack.org/p/icehouse-oslo-service-synchronization

And of course my obvious plug for taskflow (which is designed to be a
useful library to help in all these usages).

- https://wiki.openstack.org/wiki/TaskFlow

The states u just mentioned start to line-up with
https://wiki.openstack.org/wiki/TaskFlow/States_of_Task_and_Flow

If this sounds like a useful way to go (joining efforts) then lets see how
we can make it possible.

IRC: #openstack-state-management is where I am usually at.
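
For a flavour of the model, here is a tiny illustrative sketch of the task/revert split applied to the port-create case quoted below (the taskflow API details are from memory and may not match the current release exactly, so treat it as pseudocode and check the wiki above):

from taskflow import engines
from taskflow import task
from taskflow.patterns import linear_flow

class CreatePortDb(task.Task):
    def execute(self, port):
        print('record port %s in the DB' % port)

    def revert(self, port, **kwargs):
        # Called automatically if a later task in the flow fails.
        print('remove port %s from the DB' % port)

class CreatePortBackend(task.Task):
    def execute(self, port):
        print('tell the back-end controller about port %s' % port)

flow = linear_flow.Flow('create-port').add(CreatePortDb(),
                                           CreatePortBackend())
engines.run(flow, store={'port': 'port-1234'})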

On 11/19/13 3:57 AM, "Isaku Yamahata" <isaku.yamahata at gmail.com> wrote:

>On Mon, Nov 18, 2013 at 03:55:49PM -0500,
>Robert Kukura <rkukura at redhat.com> wrote:
>
>> On 11/18/2013 03:25 PM, Edgar Magana wrote:
>> > Developers,
>> >
>> > This topic has been discussed before but I do not remember if we have
>>a
>> > good solution or not.
>>
>> The ML2 plugin addresses this by calling each MechanismDriver twice. The
>> create_network_precommit() method is called as part of the DB
>> transaction, and the create_network_postcommit() method is called after
>> the transaction has been committed. Interactions with devices or
>> controllers are done in the postcommit methods. If the postcommit method
>> raises an exception, the plugin deletes that partially-created resource
>> and returns the exception to the client. You might consider a similar
>> approach in your plugin.
>
>Splitting the work into two phases, pre/post, is a good approach.
>But there still remains a race window.
>Once the transaction is committed, the result is visible to the outside.
>So concurrent requests to the same resource will be racy.
>There is a window after pre_xxx_yyy() and before post_xxx_yyy() where
>other requests can be handled.
>
>The state machine needs to be enhanced, I think. (plugins need
>modification)
>For example, adding more states like pending_{create, delete, update}.
>Also we would like to consider serializing between operations on ports
>and subnets, or between operations on subnets and networks, depending on
>performance requirements.
>(Or carefully audit complex status change. i.e.
>changing port during subnet/network update/deletion.)
>
>I think it would be useful to establish reference locking policy
>for ML2 plugin for SDN controllers.
>Thoughts or comments? If this is considered useful and acceptable,
>I'm willing to help.
>
>thanks,
>Isaku Yamahata
>
>> -Bob
>>
>> > Basically, if concurrent API calls are sent to Neutron, all of them
>>are
>> > sent to the plug-in level where two actions have to be made:
>> >
>> > 1. DB transaction - not just for data persistence but also to collect
>>the
>> > information needed for the next action
>> > 2. Plug-in back-end implementation - in our case this is a call to the
>>python
>> > library that consequently calls PLUMgrid REST GW (soon SAL)
>> >
>> > For instance:
>> >
>> > def create_port(self, context, port):
>> >         with context.session.begin(subtransactions=True):
>> >             # Plugin DB - Port Create and Return port
>> >             port_db = super(NeutronPluginPLUMgridV2,
>> > self).create_port(context,
>> >
>> port)
>> >             device_id = port_db["device_id"]
>> >             if port_db["device_owner"] == "network:router_gateway":
>> >                 router_db = self._get_router(context, device_id)
>> >             else:
>> >                 router_db = None
>> >             try:
>> >                 LOG.debug(_("PLUMgrid Library: create_port() called"))
>> > # Back-end implementation
>> >                 self._plumlib.create_port(port_db, router_db)
>> >             except Exception:
>> >             ...
>> >
>> > The way we have implemented this at the plugin level in Havana (even in
>> > Grizzly) is that both actions are wrapped in the same "transaction"
>>which
>> > automatically rolls back any operation done to its original state,
>> > mostly protecting the DB from having any inconsistent state or left
>>over
>> > data if the back-end part fails.
>> > The problem that we are experiencing is that when concurrent calls to the
>> > same API are sent, the operations at the plug-in back-end take
>> > long enough to make the next concurrent API call get stuck at the
>>DB
>> > transaction level, which creates a hung state for the Neutron Server
>>to
>> > the point that all concurrent API calls will fail.
>> >
>> > This can be fixed if we include some "locking" system such as calling:
>> >
>> > from neutron.common import utils
>> > ...
>> >
>> > @utils.synchronized('any-name', external=True)
>> > def create_port(self, context, port):
>> > ...
>> >
>> > Obviously, this will create a serialization of all concurrent calls,
>> > which will end up causing really bad performance. Does anyone
>>have a
>> > better solution?
>> >
>> > Thanks,
>> >
>> > Edgar
>> >
>> >
>> > _______________________________________________
>> > OpenStack-dev mailing list
>> > OpenStack-dev at lists.openstack.org
>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> >
>>
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>--
>Isaku Yamahata <isaku.yamahata at gmail.com>
>
>_______________________________________________
>OpenStack-dev mailing list
>OpenStack-dev at lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




------------------------------

Message: 12
Date: Tue, 19 Nov 2013 10:34:31 -0800
From: Sam Alba <sam.alba at gmail.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] Introducing the new OpenStack service for
        Containers
Message-ID:
        <CAJv9qeRgroJuhKe-M81PtmEUkUpMgfEDpDaeaVpHBxPWVgDPig at mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

On Tue, Nov 19, 2013 at 6:45 AM, Chuck Short <chuck.short at canonical.com> wrote:
> Hi
>
> I am excited to see containers getting such traction in the openstack
> project.
>
>
> On Mon, Nov 18, 2013 at 7:30 PM, Russell Bryant <rbryant at redhat.com> wrote:
>>
>> On 11/18/2013 06:30 PM, Dan Smith wrote:
>> >> Not having been at the summit (maybe the next one), could somebody
>> >> give a really short explanation as to why it needs to be a separate
>> >> service? It sounds like it should fit within the Nova area. It is,
>> >> after all, just another hypervisor type, or so it seems.
>> >
>> > But it's not just another hypervisor. If all you want from your
>> > containers is lightweight VMs, then nova is a reasonable place to put
>> > that (and it's there right now). If, however, you want to expose the
>> > complex and flexible attributes of a container, such as being able to
>> > overlap filesystems, have fine-grained control over what is shared with
>> > the host OS, look at the processes within a container, etc, then nova
>> > ends up needing quite a bit of change to support that.
>> >
>> > I think the overwhelming majority of folks in the room, after discussing
>> > it, agreed that Nova is infrastructure and containers is more of a
>> > platform thing. Making it a separate service lets us define a mechanism
>> > to manage these that makes much more sense than treating them like VMs.
>> > Using Nova to deploy VMs that run this service is the right approach,
>> > IMHO. Clayton put it very well, I think:
>> >
>> >   If the thing you want to deploy has a kernel, then you need Nova. If
>> >   your thing runs on a kernel, you want $new_service_name.
>> >
>> > I agree.
>> >
>> > Note that this is just another service under the compute project (or
>> > program, or whatever the correct terminology is this week).
>>
>> The Compute program is correct.  That is established terminology as
>> defined by the TC in the last cycle.
>>
>> > So while
>> > distinct from Nova in terms of code, development should be tightly
>> > integrated until (and if at some point) it doesn't make sense.
>>
>> And it may share a whole bunch of the code.
>>
>> Another way to put this:  The API requirements people have for
>> containers include a number of features considered outside of the
>> current scope of Nova (short version: Nova's scope stops before going
>> *inside* the servers it creates, except file injection, which we plan to
>> remove anyway).  That presents a problem.  A new service is one possible
>> solution.
>>
>> My view of the outcome of the session was not "it *will* be a new
>> service".  Instead, it was, "we *think* it should be a new service, but
>> let's do some more investigation to decide for sure".
>>
>> The action item from the session was to go off and come up with a
>> proposal for what a new service would look like.  In particular, we
>> needed a proposal for what the API would look like.  With that in hand,
>> we need to come back and ask the question again of whether a new service
>> is the right answer.
>>
>> I see 3 possible solutions here:
>>
>> 1) Expand the scope of Nova to include all of the things people want to
>> be able to do with containers.
>>
>> This is my least favorite option.  Nova is already really big.  We've
>> worked to split things out (Networking, Block Storage, Images) to keep
>> it under control.  I don't think a significant increase in scope is a
>> smart move for Nova's future.
>>
>
> This is my least favorite option. Like a lot of other responses already, I
> see a lot of code duplication between Nova and the new container
> project. This doesn't just include the scheduler but also things like the
> config drive, etc.

Can we dig into this option? Honestly, I'd be glad to find a way to
avoid reimplementing everything again (a new compute service with
Keystone, Glance, Horizon integration, etc...). But I do understand
the limitation of changing Nova to improve containers support.

Can someone bring more details (maybe in the spec etherpad, in a new
section) about this 3rd option?

Since the API (in the front) and the virt API (in the back) have to be
different, I hardly see how we can reuse most of Nova's code.

>>
>> 2) Declare containers as explicitly out of scope and start a new project
>> with its own API.
>>
>> That is what is being proposed here.
>>
>> 3) Some middle ground that is a variation of #2.  Consider Ironic.  The
>> idea is that Nova's API will still be used for basic provisioning, which
>> Nova will implement by talking to Ironic.  However, there are a lot of
>> baremetal management things that don't fit in Nova at all, and those
>> only exist in Ironic's API.
>
>
> This is my preferred choice  as well. If we could leverage the existing nova
> API and extend it to include containers and features that users who use
> containers in their existing production environments want.
>>
>>
>> I wanted to mention this option for completeness, but I don't actually
>> think it's the right choice here.  With Ironic you have a physical
>> resource (managed by Ironic), and then instances of an image running on
>> these physical resources (managed by Nova).
>>
>> With containers, there's a similar line.  You have instances of
>> containers (managed either by Nova or the new service) running on
>> servers (managed by Nova).  I think there is a good line for separating
>> concerns, with a container service on top of Nova.
>>
>>
>> Let's ask ourselves:  How much overlap is there between the current
>> compute API and a proposed containers API?  Effectively, what's the
>> diff?  How much do we expect this diff to change in the coming years?
>>
>> The current diff demonstrates a significant clash with the current scope
>> of Nova.  I also expect a lot of innovation around containers in the
>> next few years, which will result in wanting to do new cool things in
>> the API.  I feel that all of this justifies a new API service to best
>> position ourselves for the long term.

--
@sam_alba



------------------------------

Message: 13
Date: Tue, 19 Nov 2013 10:35:27 -0800
From: Clint Byrum <clint at fewbar.com>
To: openstack-dev <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Nova] Does Nova really need an SQL
        database?
Message-ID: <1384885883-sup-6253 at clint-HP>
Content-Type: text/plain; charset=UTF-8

Excerpts from Chris Friesen's message of 2013-11-19 09:29:00 -0800:
> On 11/18/2013 06:47 PM, Joshua Harlow wrote:
> > An idea related to this, what would need to be done to make the DB have
> > the exact state that a compute node is going through (and therefore the
> > scheduler would not make unreliable/racey decisions, even when there are
> > multiple schedulers). It's not like we are dealing with a system which
> > can not know the exact state (as long as the compute nodes are connected
> > to the network, and a network partition does not occur).
>
> How would you synchronize the various schedulers with each other?
> Suppose you have multiple scheduler nodes all trying to boot multiple
> instances each.
>
> Even if, at the start of the process, each scheduler has a perfect
> view of the system, each scheduler would need to have a view of what
> every other scheduler is doing in order to not make racy decisions.
>

Your question assumes they need to be "in sync" at a granular level.

Each scheduler process can own a different set of resources. If they
each grab instance requests in a round-robin fashion, then they will
fill their resources up in a relatively well balanced way until one
scheduler's resources are exhausted. At that time it should bow out of
taking new instances. If it can't fit a request in, it should kick the
request out for retry on another scheduler.

In this way, they only need to be in sync in that they need a way to
agree on who owns which resources. A distributed hash table that gets
refreshed whenever schedulers come and go would be fine for that.
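
As a strawman, the sort of thing I mean by a distributed hash table of
ownership (purely illustrative code, not anything that exists in nova today):

import bisect
import hashlib

def _hash(value):
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class OwnershipRing(object):
    # Each scheduler owns the compute hosts that hash to it on a ring.
    def __init__(self, schedulers, replicas=100):
        self._ring = []
        for sched in schedulers:
            for i in range(replicas):
                self._ring.append((_hash('%s-%d' % (sched, i)), sched))
        self._ring.sort()
        self._keys = [key for key, _ in self._ring]

    def owner_of(self, host):
        idx = bisect.bisect(self._keys, _hash(host)) % len(self._ring)
        return self._ring[idx][1]

ring = OwnershipRing(['scheduler-1', 'scheduler-2', 'scheduler-3'])
print(ring.owner_of('compute-42'))
# When schedulers come or go, rebuild the ring; only roughly 1/N of the
# hosts change owners, so the refresh stays cheap.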



------------------------------

Message: 14
Date: Tue, 19 Nov 2013 18:42:11 +0000
From: Joshua Harlow <harlowja at yahoo-inc.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>, Joshua Harlow
        <harlowja at yahoo-inc.com>, Isaku Yamahata <isaku.yamahata at gmail.com>
Cc: Robert Kukura <rkukura at redhat.com>
Subject: Re: [openstack-dev] [Neutron] Race condition between DB layer
        and plugin back-end implementation
Message-ID: <CEB0F0AA.4D5A0%harlowja at yahoo-inc.com>
Content-Type: text/plain; charset="iso-8859-2"

And also of course, nearly forgot a similar situation/review in heat.

https://review.openstack.org/#/c/49440/

Except theres was/is dealing with stack locking (a heat concept).

On 11/19/13 10:33 AM, "Joshua Harlow" <harlowja at yahoo-inc.com> wrote:

>If you start adding these states you might really want to consider the
>following work that is going on in other projects.
>
>It surely appears that everyone is starting to hit the same problem (and
>joining efforts would produce a more beneficial result).
>
>Relevant icehouse etherpads:
>- https://etherpad.openstack.org/p/CinderTaskFlowFSM
>- https://etherpad.openstack.org/p/icehouse-oslo-service-synchronization
>
>And of course my obvious plug for taskflow (which is designed to be a
>useful library to help in all these usages).
>
>- https://wiki.openstack.org/wiki/TaskFlow
>
>The states u just mentioned start to line-up with
>https://wiki.openstack.org/wiki/TaskFlow/States_of_Task_and_Flow
>
>If this sounds like a useful way to go (joining efforts) then lets see how
>we can make it possible.
>
>IRC: #openstack-state-management is where I am usually at.
>
>On 11/19/13 3:57 AM, "Isaku Yamahata" <isaku.yamahata at gmail.com> wrote:
>
>>On Mon, Nov 18, 2013 at 03:55:49PM -0500,
>>Robert Kukura <rkukura at redhat.com> wrote:
>>
>>> On 11/18/2013 03:25 PM, Edgar Magana wrote:
>>> > Developers,
>>> >
>>> > This topic has been discussed before but I do not remember if we have
>>>a
>>> > good solution or not.
>>>
>>> The ML2 plugin addresses this by calling each MechanismDriver twice.
>>>The
>>> create_network_precommit() method is called as part of the DB
>>> transaction, and the create_network_postcommit() method is called after
>>> the transaction has been committed. Interactions with devices or
>>> controllers are done in the postcommit methods. If the postcommit
>>>method
>>> raises an exception, the plugin deletes that partially-created resource
>>> and returns the exception to the client. You might consider a similar
>>> approach in your plugin.
>>
>>Splitting works into two phase, pre/post, is good approach.
>>But there still remains race window.
>>Once the transaction is committed, the result is visible to outside.
>>So the concurrent request to same resource will be racy.
>>There is a window after pre_xxx_yyy before post_xxx_yyy() where
>>other requests can be handled.
>>
>>The state machine needs to be enhanced, I think. (plugins need
>>modification)
>>For example, adding more states like pending_{create, delete, update}.
>>Also we would like to consider serializing between operation of ports
>>and subnets. or between operation of subnets and network depending on
>>performance requirement.
>>(Or carefully audit complex status change. i.e.
>>changing port during subnet/network update/deletion.)
>>
>>I think it would be useful to establish reference locking policy
>>for ML2 plugin for SDN controllers.
>>Thoughts or comments? If this is considered useful and acceptable,
>>I'm willing to help.
>>
>>thanks,
>>Isaku Yamahata
>>
>>> -Bob
>>>
>>> > Basically, if concurrent API calls are sent to Neutron, all of them
>>>are
>>> > sent to the plug-in level where two actions have to be made:
>>> >
>>> > 1. DB transaction - not just for data persistence but also to collect
>>>the
>>> > information needed for the next action
>>> > 2. Plug-in back-end implementation - in our case this is a call to the
>>>python
>>> > library that consequently calls PLUMgrid REST GW (soon SAL)
>>> >
>>> > For instance:
>>> >
>>> > def create_port(self, context, port):
>>> >         with context.session.begin(subtransactions=True):
>>> >             # Plugin DB - Port Create and Return port
>>> >             port_db = super(NeutronPluginPLUMgridV2,
>>> > self).create_port(context,
>>> >
>>> port)
>>> >             device_id = port_db["device_id"]
>>> >             if port_db["device_owner"] == "network:router_gateway":
>>> >                 router_db = self._get_router(context, device_id)
>>> >             else:
>>> >                 router_db = None
>>> >             try:
>>> >                 LOG.debug(_("PLUMgrid Library: create_port()
>>>called"))
>>> > # Back-end implementation
>>> >                 self._plumlib.create_port(port_db, router_db)
>>> >             except Exception:
>>> >             ...
>>> >
>>> > The way we have implemented at the plugin-level in Havana (even in
>>> > Grizzly) is that both action are wrapped in the same "transaction"
>>>which
>>> > automatically rolls back any operation done to its original state
>>> > protecting mostly the DB of having any inconsistency state or left
>>>over
>>> > data if the back-end part fails.
>>> > The problem that we are experiencing is when concurrent calls to the
>>> > same API are sent, the number of operation at the plug-in back-end
>>>are
>>> > long enough to make the next concurrent API call to get stuck at the
>>>DB
>>> > transaction level, which creates a hung state for the Neutron Server
>>>to
>>> > the point that all concurrent API calls will fail.
>>> >
>>> > This can be fixed if we include some "locking" system such as
>>>calling:
>>> >
>>> > from neutron.common import utils
>>> > ...
>>> >
>>> > @utils.synchronized('any-name', external=True)
>>> > def create_port(self, context, port):
>>> > ...
>>> >
>>> > Obviously, this will create a serialization of all concurrent calls
>>> > which will ends up in having a really bad performance. Does anyone
>>>has a
>>> > better solution?
>>> >
>>> > Thanks,
>>> >
>>> > Edgar
>>> >
>>> >
>>> > _______________________________________________
>>> > OpenStack-dev mailing list
>>> > OpenStack-dev at lists.openstack.org
>>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>> >
>>>
>>>
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> OpenStack-dev at lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>--
>>Isaku Yamahata <isaku.yamahata at gmail.com>
>>
>>_______________________________________________
>>OpenStack-dev mailing list
>>OpenStack-dev at lists.openstack.org
>>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>_______________________________________________
>OpenStack-dev mailing list
>OpenStack-dev at lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




------------------------------

Message: 15
Date: Tue, 19 Nov 2013 13:46:21 -0500
From: Eric Windisch <eric at cloudscaling.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] Introducing the new OpenStack service for
        Containers
Message-ID:
        <CALnZm-Ys8PORYcM8Th=4kHp-Sw0c_MUH6=cj-3bkDS=Szg9ydQ at mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

On Tue, Nov 19, 2013 at 1:02 PM, James Bottomley
<James.Bottomley at hansenpartnership.com> wrote:
> On Mon, 2013-11-18 at 14:28 -0800, Stuart Fox wrote:
>> Hey all
>>
>> Not having been at the summit (maybe the next one), could somebody
>> give a really short explanation as to why it needs to be a separate
>> service?
>> It sounds like it should fit within the Nova area. It is, after all,
>> just another hypervisor type, or so it seems.
>
> I can take a stab at this:  Firstly, a container is *not* a hypervisor.
> Hypervisor based virtualisation is done at the hardware level (so with
> hypervisors you boot a second kernel on top of the virtual hardware),
> container based virtualisation is done at the OS (kernel) level (so all
> containers share the same kernel ... and sometimes even huge chunks of
> the OS). With recent advances in the Linux Kernel, we can make a
> container behave like a hypervisor (full OS/IaaS virtualisation), but
> quite a bit of the utility of containers is that they can do much more
> than hypervisors, so they shouldn't be constrained by hypervisor APIs
> (which are effectively virtual hardware APIs).
>
> It is possible to extend the Nova APIs to control containers more fully,
>>> > but there was resistance to doing this on the grounds that it's
> expanding the scope of Nova, hence the new project.

It might be worth noting that it was also brought up that
hypervisor-based virtualization can offer a number of features that
bridge some of these gaps, but are not supported in, nor may ever be
supported in Nova.

For example, Daniel brings up an interesting point with the
libvirt-sandbox feature. This is one of those features that bridges
some of the gaps. There are also solutions, however brittle, for
introspection that work on hypervisor-driven VMs. It is not clear what
the scope or desire for these features might be, how they might be
sufficiently abstracted between hypervisors and guest OSes, nor how
these would fit into any of the existing or planned compute API
buckets.

Having a separate service for managing containers draws a thick line
in the sand that will somewhat stiffen innovation around
hypervisor-based virtualization. That isn't necessarily a bad thing,
it will help maintain stability in the project. However, the choice
and the implications shouldn't be ignored.

--
Regards,
Eric Windisch



------------------------------

Message: 16
Date: Tue, 19 Nov 2013 18:50:01 +0000
From: Joshua Harlow <harlowja at yahoo-inc.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>, Chris Friesen
        <chris.friesen at windriver.com>
Subject: Re: [openstack-dev] [Nova] Does Nova really need an SQL
        database?
Message-ID: <CEB0F2BE.4D5A7%harlowja at yahoo-inc.com>
Content-Type: text/plain; charset="us-ascii"

Sorry, that last line should have read "I prefer #3" (not #2). Keyboard failure ;)

On 11/19/13 10:27 AM, "Joshua Harlow" <harlowja at yahoo-inc.com> wrote:

>Personally I would prefer #3 from the below. #2 I think will still have to
>deal with consistency issues; just switching away from a DB doesn't make
>magical ponies and unicorns appear (in fact it can potentially make the
>problem worse if it's done incorrectly - and it's pretty easy to get it
>wrong IMHO). #1 could also work, but then you hit a vertical scaling limit
>(works if you paid Oracle for their DB or IBM for DB2, I suppose). I prefer
>#2 since I think it is honestly needed under all solutions.
>
>On 11/19/13 9:29 AM, "Chris Friesen" <chris.friesen at windriver.com> wrote:
>
>>On 11/18/2013 06:47 PM, Joshua Harlow wrote:
>>> An idea related to this, what would need to be done to make the DB have
>>> the exact state that a compute node is going through (and therefore the
>>> scheduler would not make unreliable/racey decisions, even when there
>>>are
>>> multiple schedulers). It's not like we are dealing with a system which
>>> can not know the exact state (as long as the compute nodes are
>>>connected
>>> to the network, and a network partition does not occur).
>>
>>How would you synchronize the various schedulers with each other?
>>Suppose you have multiple scheduler nodes all trying to boot multiple
>>instances each.
>>
>>Even if, at the start of the process, each scheduler has a perfect
>>view of the system, each scheduler would need to have a view of what
>>every other scheduler is doing in order to not make racy decisions.
>>
>>I see a few options:
>>
>>1) Push scheduling down into the database itself.  Implement scheduler
>>filters as SQL queries or stored procedures.
>>
>>2) Get rid of the DB for scheduling.  It looks like people are working
>>on this: https://blueprints.launchpad.net/nova/+spec/no-db-scheduler
>>
>>3) Do multi-stage scheduling.  Do a "tentative" schedule, then try and
>>update the DB to reserve all the necessary resources.  If that fails,
>>someone got there ahead of you so try again with the new data.
>>
>>Chris
>>
>>_______________________________________________
>>OpenStack-dev mailing list
>>OpenStack-dev at lists.openstack.org
>>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
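To make option 3 above concrete, here is a rough in-memory sketch of the
"tentative pick, then atomic reservation, then retry" idea. It is illustrative
only, not Nova code; in a real deployment the reserve step would be a single
conditional UPDATE against the database rather than a dict mutation:

hosts = {'host-a': {'free_ram_mb': 4096, 'version': 0},
         'host-b': {'free_ram_mb': 8192, 'version': 0}}


def reserve(host, ram_mb, seen_version):
    # Stand-in for "UPDATE ... SET free_ram_mb = free_ram_mb - :ram,
    # version = version + 1 WHERE host = :host AND version = :seen_version";
    # it fails if another scheduler changed the row since we last read it.
    record = hosts[host]
    if record['version'] != seen_version or record['free_ram_mb'] < ram_mb:
        return False
    record['free_ram_mb'] -= ram_mb
    record['version'] += 1
    return True


def schedule(ram_mb, retries=3):
    for _ in range(retries):
        # Tentative decision based on the current (possibly stale) view.
        host, record = max(hosts.items(), key=lambda kv: kv[1]['free_ram_mb'])
        if reserve(host, ram_mb, record['version']):
            return host
    raise RuntimeError('could not reserve resources, retries exhausted')


print(schedule(2048))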




------------------------------

Message: 17
Date: Tue, 19 Nov 2013 11:00:50 -0800
From: Devananda van der Veen <devananda.vdv at gmail.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Nova] [Ironic] [TripleO] scheduling flow
        with    Ironic?
Message-ID:
        <CAExZKEpdPwMKD5X+pifDGCxc7k27uOokhN2=CbVdw+zN6iu_pA at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

On Wed, Nov 13, 2013 at 10:11 PM, Alex Glikson <GLIKSON at il.ibm.com> wrote:

> Thanks, I understand the Nova scheduler part. One of the gaps there is
> related to the blueprint we are working on [1]. I was wondering
> regarding the role of Ironic, and the exact interaction between the user,
> Nova and Ironic.
>

The interaction from the point of "nova boot" onwards will be the same --
nova maintains a list of available (host, node) resources, the scheduler
picks one according to the request, dispatches the work to n-cond / n-cpu,
which in turn calls down to various methods in the nova/virt/driver API.
The implementation of the ironic driver is a wrapper around
python-ironicclient library, which will make calls out to the ironic API
service, which in turn performs the necessary work.
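Schematically, the driver end of that call path looks something like the
sketch below. The class shape follows the nova.virt.driver interface (spawn()
is a real driver method), but the client calls are hypothetical placeholders,
not the actual python-ironicclient API:

class IronicDriver(object):
    """Nova virt driver that delegates the real work to the Ironic service."""

    def __init__(self, ironic_client):
        # Stands in for a python-ironicclient client object.
        self.client = ironic_client

    def spawn(self, context, instance, image_meta, *args, **kwargs):
        # n-cpu calls spawn(); the driver just forwards the request to the
        # Ironic API, which performs the actual provisioning of the node.
        node = self.client.find_node_for(instance)   # hypothetical call
        self.client.provision(node, image_meta)      # hypothetical call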

Where the interaction is different is around the management of physical
machines; eg, enrolling them with Ironic, temporarily marking a machine as
unavailable while doing maintenance on it, and other sorts of things we
haven't actually written the code for yet.


> In particular, initially I thought that Ironic is going to have its own
> scheduler, resolving some of the issues and complexity within Nova (which
> could focus on VM management, maybe even getting rid of hosts versus nodes,
> etc).


I'm not sure how putting a scheduler in Ironic would solve this problem at
all.

Conversely, I don't think there's any need for the whole (host, node)
thing. Chris Behrens and I talked at the last summit about a possible
rewrite to nova-conductor that would remove the need for this distinction
entirely. I would love to see Nova just track node, and I think this can
work for typical hypervisors (kvm, xen, ...) as well.


> But it seems that Ironic aims to stay at the level of virt driver API.. It
> is a bit unclear to me what is the desired architecture going forward -
> e.g., if the idea is to standardize virt driver APIs but keep the
> scheduling centralized,


AFAIK, the nova.virt.driver API is the standard that all the virt drivers
are written to. Unless you're referring to libvirt's API, in which case, I
don't understand the question.


> maybe we should take the rest of virt drivers into separate projects as
> well, and extend Nova to schedule beyond just compute (if it is already
> doing so for virt + bare-metal).


Why would Nova need to schedule anything besides compute resources? In this
context, Ironic is merely providing a different type of compute resource,
and Nova is still scheduling compute workloads. That this hypervisor driver
has different scheduling characteristics (eg, flavor == node resource;
extra_specs:cpu_arch == node arch; and so on) than other hypervisor drivers
doesn't mean it's not still a compute resource.


> Alternatively, each of them could have its own scheduler (like the
> approach we took when splitting out cinder, for example) - and then someone
> on top (e.g., Heat) would need to do the cross-project logic. Taking
> different architectural approaches in different cases confuses me a bit.
>

Yes, well, Cinder is a different type of resource (block storage).


HTH,
-Deva

------------------------------

Message: 18
Date: Tue, 19 Nov 2013 19:02:44 +0000
From: "Daniel P. Berrange" <berrange at redhat.com>
To: openstack-dev at lists.openstack.org
Subject: [openstack-dev] RFC: Potential to increase min required
        libvirt version to 0.9.8 ?
Message-ID: <20131119190244.GD23854 at redhat.com>
Content-Type: text/plain; charset=utf-8

Currently the Nova libvirt driver is declaring that it wants a minimum
of libvirt 0.9.6.

For cases where we use features newer than this, we have to do conditional
logic to ensure we operate correctly on old libvirt. We don't want to keep
adding conditionals forever since they complicate the code and thus impose
an ongoing dev burden.
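For illustration, the conditionals in question boil down to gating feature
use on the version libvirt reports at runtime. A sketch (not the actual Nova
code), relying only on virConnect.getLibVersion() and libvirt's
major * 1000000 + minor * 1000 + micro version encoding:

MIN_LIBVIRT_VERSION = (0, 9, 8)


def has_min_version(conn, minimum=MIN_LIBVIRT_VERSION):
    """Return True if the connected libvirt is at least `minimum`."""
    major, minor, micro = minimum
    required = major * 1000000 + minor * 1000 + micro
    return conn.getLibVersion() >= required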

I think it is wise to re-evaluate the min required libvirt version at the
start of each dev cycle. To do this we need to understand what versions
are available in the OS distributions we care about supporting for the
Icehouse release.

To this end I've created a wiki page covering Debian, Fedora, OpenSUSE,
RHEL and Ubuntu

  https://wiki.openstack.org/wiki/LibvirtDistroSupportMatrix

Considering those distros it looks like we can potentially increase the
min required libvirt to 0.9.8 for Icehouse.

If there are other distros I've missed which expect to support deployment
of Icehouse please add them to this list. Hopefully there won't be any
with libvirt software older than Ubuntu 12.04 LTS....


The reason I'm asking this now, is that we're working to make the libvirt
python module a separate tar.gz that can build with multiple libvirt
versions, and I need to decide how ancient a libvirt we should support
for it.

Regards.
Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



------------------------------

Message: 19
Date: Tue, 19 Nov 2013 11:07:47 -0800
From: James Bottomley <James.Bottomley at HansenPartnership.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] Introducing the new OpenStack service for
        Containers
Message-ID:
        <1384888067.17335.9.camel at dabdike.int.hansenpartnership.com>
Content-Type: text/plain; charset="ISO-8859-15"

On Tue, 2013-11-19 at 13:46 -0500, Eric Windisch wrote:
> On Tue, Nov 19, 2013 at 1:02 PM, James Bottomley
> <James.Bottomley at hansenpartnership.com> wrote:
> > On Mon, 2013-11-18 at 14:28 -0800, Stuart Fox wrote:
> >> Hey all
> >>
> >> Not having been at the summit (maybe the next one), could somebody
> >> give a really short explanation as to why it needs to be a separate
> >> service?
> >> It sounds like it should fit within the Nova area. It is, after all,
> >> just another hypervisor type, or so it seems.
> >
> > I can take a stab at this:  Firstly, a container is *not* a hypervisor.
> > Hypervisor based virtualisation is done at the hardware level (so with
> > hypervisors you boot a second kernel on top of the virtual hardware),
> > container based virtualisation is done at the OS (kernel) level (so all
> > containers share the same kernel ... and sometimes even huge chunks of
> > the OS). With recent advances in the Linux Kernel, we can make a
> > container behave like a hypervisor (full OS/IaaS virtualisation), but
> > quite a bit of the utility of containers is that they can do much more
> > than hypervisors, so they shouldn't be constrained by hypervisor APIs
> > (which are effectively virtual hardware APIs).
> >
> > It is possible to extend the Nova APIs to control containers more fully,
> > but there was resistance to doing this on the grounds that it's
> > expanding the scope of Nova, hence the new project.
>
> It might be worth noting that it was also brought up that
> hypervisor-based virtualization can offer a number of features that
> bridge some of these gaps, but are not supported in, nor may ever be
> supported in Nova.
>
> For example, Daniel brings up an interesting point with the
> libvirt-sandbox feature. This is one of those features that bridges
> some of the gaps. There are also solutions, however brittle, for
> introspection that work on hypervisor-driven VMs. It is not clear what
> the scope or desire for these features might be, how they might be
> sufficiently abstracted between hypervisors and guest OSes, nor how
> these would fit into any of the existing or planned compute API
> buckets.

It's certainly possible, but some of them are possible in the same way
as it's possible to get a square peg into a round hole by beating the
corners flat with a sledge hammer ... it works, but it's much less
hassle just to use a round peg because it actually fits the job.

> Having a separate service for managing containers draws a thick line
> in the sand that will somewhat stiffen innovation around
> hypervisor-based virtualization. That isn't necessarily a bad thing,
> it will help maintain stability in the project. However, the choice
> and the implications shouldn't be ignored.

How about this: we get the container API agreed and we use a driver
model like Nova (we have to anyway, since there are about four different
container technologies interested in this), then we see if someone wants
to do a hypervisor driver emulating the features.

James





------------------------------

Message: 20
Date: Tue, 19 Nov 2013 19:09:23 +0000
From: "Daniel P. Berrange" <berrange at redhat.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] Introducing the new OpenStack service for
        Containers
Message-ID: <20131119190923.GE23854 at redhat.com>
Content-Type: text/plain; charset=utf-8

On Tue, Nov 19, 2013 at 10:02:45AM -0800, James Bottomley wrote:
> On Mon, 2013-11-18 at 14:28 -0800, Stuart Fox wrote:
> > Hey all
> >
> > Not having been at the summit (maybe the next one), could somebody
> > give a really short explanation as to why it needs to be a separate
> > service?
> > It sounds like it should fit within the Nova area. It is, after all,
> > just another hypervisor type, or so it seems.
>
> I can take a stab at this:  Firstly, a container is *not* a hypervisor.
> Hypervisor based virtualisation is done at the hardware level (so with
> hypervisors you boot a second kernel on top of the virtual hardware),
> container based virtualisation is done at the OS (kernel) level (so all
> containers share the same kernel ... and sometimes even huge chunks of
> the OS). With recent advances in the Linux Kernel, we can make a
> container behave like a hypervisor (full OS/IaaS virtualisation), but
> quite a bit of the utility of containers is that they can do much more
> than hypervisors, so they shouldn't be constrained by hypervisor APIs
> (which are effectively virtual hardware APIs).
>
> It is possible to extend the Nova APIs to control containers more fully,
> but there was resistance to doing this on the grounds that it's
> expanding the scope of Nova, hence the new project.

You're focusing on the low level technical differences between containers
and hypervisors here, which IMHO is not the right comparison to be making.
Once you look at the high level use cases many of the referenced container
virt apps are trying to address things don't look anywhere near as clear
cut. As mentioned elsewhere libvirt-sandbox which provides a application
sandboxing toolkit is able to leverage either LXC or KVM virt to achieve
its high level goals. Based on my understanding of Docker, I believe it
would actually be possible to run Docker images under KVM without much
difficulty. There will certainly be some setups that aren't possible
to do with hypervisors, but I don't think those will be in the majority,
nor require starting again from scratch, throwing out Nova.

Daniel
--
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



------------------------------

Message: 21
Date: Tue, 19 Nov 2013 12:20:48 -0700
From: Steven Dake <sdake at redhat.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Heat] rough draft of Heat autoscaling
        API
Message-ID: <528BBA10.2010102 at redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed


On 11/17/2013 01:57 PM, Steve Baker wrote:
> On 11/15/2013 05:19 AM, Christopher Armstrong wrote:
>> http://docs.heatautoscale.apiary.io/
>>
>> I've thrown together a rough sketch of the proposed API for
>> autoscaling. It's written in API-Blueprint format (which is a simple
>> subset of Markdown) and provides schemas for inputs and outputs using
>> JSON-Schema. The source document is currently
>> at https://github.com/radix/heat/raw/as-api-spike/autoscaling.apibp
>>
> Apologies if I'm about to re-litigate an old argument, but...
>
> At summit we discussed creating a new endpoint (and new pythonclient)
> for autoscaling. Instead I think the autoscaling API could just be added
> to the existing heat-api endpoint.
>
> Arguments for just making auto scaling part of heat api include:
> * Significantly less development, packaging and deployment configuration
> of not creating a heat-autoscaling-api and python-autoscalingclient
> * Autoscaling is orchestration (for some definition of orchestration) so
> belongs in the orchestration service endpoint
> * The autoscaling API includes heat template snippets, so a heat service
> is a required dependency for deployers anyway
> * End-users are still free to use the autoscaling portion of the heat
> API without necessarily being aware of (or directly using) heat
> templates and stacks
> * It seems acceptable for single endpoints to manage many resources (eg,
> the increasingly disparate list of resources available via the neutron API)
>
> Arguments for making a new auto scaling api include:
> * Autoscaling is not orchestration (for some narrower definition of
> orchestration)
> * Autoscaling implementation will be handled by something other than
> heat engine (I have assumed the opposite)
> (no doubt this list will be added to in this thread)
A separate process can be autoscaled independently of heat-api which is
a big plus architecturally.

They really do different things, and separating their concerns at the
process level is a good goal.

I prefer a separate process for these reasons.

Regards
-steve

> What do you think?
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




------------------------------

Message: 22
Date: Tue, 19 Nov 2013 13:37:02 -0600
From: Chris Friesen <chris.friesen at windriver.com>
To: <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Nova] Does Nova really need an SQL
        database?
Message-ID: <528BBDDE.70709 at windriver.com>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed

On 11/19/2013 12:35 PM, Clint Byrum wrote:

> Each scheduler process can own a different set of resources. If they
> each grab instance requests in a round-robin fashion, then they will
> fill their resources up in a relatively well balanced way until one
> scheduler's resources are exhausted. At that time it should bow out of
> taking new instances. If it can't fit a request in, it should kick the
> request out for retry on another scheduler.
>
> In this way, they only need to be in sync in that they need a way to
> agree on who owns which resources. A distributed hash table that gets
> refreshed whenever schedulers come and go would be fine for that.

That has some potential, but at high occupancy you could end up refusing
to schedule something because no one scheduler has sufficient resources
even if the cluster as a whole does.

This gets worse once you start factoring in things like heat and
instance groups that will want to schedule whole sets of resources
(instances, IP addresses, network links, cinder volumes, etc.) at once
with constraints on where they can be placed relative to each other.

Chris
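As a toy sketch of the "agree on who owns which resources" idea quoted above:
hash each compute host onto a ring of schedulers, so that ownership only moves
when schedulers join or leave. Purely illustrative, with made-up names:

import bisect
import hashlib


def _hash(key):
    return int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)


class HashRing(object):
    def __init__(self, schedulers):
        self._ring = sorted((_hash(s), s) for s in schedulers)
        self._keys = [k for k, _ in self._ring]

    def owner(self, compute_host):
        # The scheduler whose position on the ring follows the host's hash
        # owns that host's resources.
        index = bisect.bisect(self._keys, _hash(compute_host)) % len(self._ring)
        return self._ring[index][1]


ring = HashRing(['scheduler-1', 'scheduler-2', 'scheduler-3'])
print(ring.owner('compute-042'))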




------------------------------

Message: 23
Date: Tue, 19 Nov 2013 19:40:21 +0000
From: "Fuente, Pablo A" <pablo.a.fuente at intel.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: [openstack-dev] [Heat] Version Negotiation Middleware Accept
        Header issue
Message-ID: <1384890019.3541.12.camel at pafuent-mobl4>
Content-Type: text/plain; charset="utf-8"

Hi,
        I noticed that the Accept HTTP Header checked in the Version
Negotiation Middleware by Heat is the same MIME type used by Glance
("application/vnd.openstack.images-")
        Is this OK, or should it be something like
"application/vnd.openstack.orchestration-"?
        If this is the case I would proceed to file a bug.

Thanks.
Pablo.
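For reference, the negotiation in question amounts to matching the Accept
header against a service-specific vendor MIME prefix and pulling the version
out of the suffix. A minimal sketch (not the actual Heat middleware), using
the orchestration prefix suggested above:

ORCHESTRATION_MIME_PREFIX = 'application/vnd.openstack.orchestration-'


def negotiate_version(accept_header):
    if accept_header.startswith(ORCHESTRATION_MIME_PREFIX):
        return accept_header[len(ORCHESTRATION_MIME_PREFIX):]   # e.g. 'v1.0'
    return None   # fall back to the default version


print(negotiate_version('application/vnd.openstack.orchestration-v1.0'))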

------------------------------

Message: 24
Date: Tue, 19 Nov 2013 13:46:25 -0600
From: Chris Friesen <chris.friesen at windriver.com>
To: Joshua Harlow <harlowja at yahoo-inc.com>, "OpenStack Development
        Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Nova] Does Nova really need an SQL
        database?
Message-ID: <528BC011.1000501 at windriver.com>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed

On 11/19/2013 12:27 PM, Joshua Harlow wrote:
> Personally I would prefer #3 from the below. #2 I think will still have to
> deal with consistency issues; just switching away from a DB doesn't make
> magical ponies and unicorns appear (in fact it can potentially make the
> problem worse if it's done incorrectly - and it's pretty easy to get it
> wrong IMHO). #1 could also work, but then you hit a vertical scaling limit
> (works if you paid Oracle for their DB or IBM for DB2, I suppose). I prefer
> #3 since I think it is honestly needed under all solutions.

Personally I think we need a combination of #3 (resource reservation)
with something else to speed up scheduling.

We have multiple filters that currently loop over all the compute nodes,
gathering a bunch of data from the DB and then ignoring most of that
data while doing some simple logic in python.

There is really no need for the bulk of the resource information to be
stored in the DB.  The compute nodes could broadcast their current state
to all scheduler nodes, and the scheduler nodes could reserve resources
directly from the compute nodes (triggering an update of all the other
scheduler nodes).

Failing that, it should be possible to push at least some of the
filtering down into the DB itself. Stuff like ramfilter or cpufilter
would be trivial (and fast) as an SQL query.

Chris
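As an illustration of that last point, the RAM filter pushed down into the
database would be a single query along these lines. Table and column names
here are only meant to be suggestive, and the sketch assumes a DB-API cursor
with pyformat parameters:

RAM_FILTER_QUERY = """
    SELECT hypervisor_hostname
      FROM compute_nodes
     WHERE free_ram_mb >= %(requested_ram_mb)s
"""


def hosts_with_enough_ram(cursor, requested_ram_mb):
    # The database does the filtering; Python only sees the hosts that fit,
    # instead of looping over every compute node and discarding most rows.
    cursor.execute(RAM_FILTER_QUERY, {'requested_ram_mb': requested_ram_mb})
    return [row[0] for row in cursor.fetchall()]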



------------------------------

Message: 25
Date: Tue, 19 Nov 2013 11:51:23 -0800
From: Clint Byrum <clint at fewbar.com>
To: openstack-dev <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Nova] Does Nova really need an SQL
        database?
Message-ID: <1384890442-sup-3738 at clint-HP>
Content-Type: text/plain; charset=UTF-8

Excerpts from Chris Friesen's message of 2013-11-19 11:37:02 -0800:
> On 11/19/2013 12:35 PM, Clint Byrum wrote:
>
> > Each scheduler process can own a different set of resources. If they
> > each grab instance requests in a round-robin fashion, then they will
> > fill their resources up in a relatively well balanced way until one
> > scheduler's resources are exhausted. At that time it should bow out of
> > taking new instances. If it can't fit a request in, it should kick the
> > request out for retry on another scheduler.
> >
> > In this way, they only need to be in sync in that they need a way to
> > agree on who owns which resources. A distributed hash table that gets
> > refreshed whenever schedulers come and go would be fine for that.
>
> That has some potential, but at high occupancy you could end up refusing
> to schedule something because no one scheduler has sufficient resources
> even if the cluster as a whole does.
>

I'm not sure what you mean here. What resource spans multiple compute
hosts?

> This gets worse once you start factoring in things like heat and
> instance groups that will want to schedule whole sets of resources
> (instances, IP addresses, network links, cinder volumes, etc.) at once
> with constraints on where they can be placed relative to each other.
>

Actually that is rather simple. Such requests have to be serialized
into a work-flow. So if you say "give me 2 instances in 2 different
locations" then you allocate 1 instance, and then another one with
'not_in_location(1)' as a condition.
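A minimal sketch of that serialized work-flow, with toy data rather than real
scheduler code: place the first instance anywhere, then place the second with
a "not in the same location" condition.

hosts = {'host-1': 'az-east', 'host-2': 'az-east', 'host-3': 'az-west'}


def place(excluded_locations=()):
    for host, location in sorted(hosts.items()):
        if location not in excluded_locations:
            return host, location
    raise RuntimeError('no host satisfies the placement constraint')


first_host, first_location = place()
second_host, _ = place(excluded_locations=(first_location,))  # not_in_location(1)
print(first_host, second_host)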



------------------------------

Message: 26
Date: Tue, 19 Nov 2013 12:08:43 -0800
From: Rick Jones <rick.jones2 at hp.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] Introducing the new OpenStack service for
        Containers
Message-ID: <528BC54B.9060801 at hp.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 11/19/2013 10:02 AM, James Bottomley wrote:
> It is possible to extend the Nova APIs to control containers more fully,
> but there was resistance to doing this on the grounds that it's
> expanding the scope of Nova, hence the new project.

How well received would another CLI/API to learn be among the end-users?

rick jones



------------------------------

Message: 27
Date: Tue, 19 Nov 2013 14:18:16 -0600
From: Chris Friesen <chris.friesen at windriver.com>
To: <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Nova] Does Nova really need an SQL
        database?
Message-ID: <528BC788.8050207 at windriver.com>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed

On 11/19/2013 01:51 PM, Clint Byrum wrote:
> Excerpts from Chris Friesen's message of 2013-11-19 11:37:02 -0800:
>> On 11/19/2013 12:35 PM, Clint Byrum wrote:
>>
>>> Each scheduler process can own a different set of resources. If they
>>> each grab instance requests in a round-robin fashion, then they will
>>> fill their resources up in a relatively well balanced way until one
>>> scheduler's resources are exhausted. At that time it should bow out of
>>> taking new instances. If it can't fit a request in, it should kick the
>>> request out for retry on another scheduler.
>>>
>>> In this way, they only need to be in sync in that they need a way to
>>> agree on who owns which resources. A distributed hash table that gets
>>> refreshed whenever schedulers come and go would be fine for that.
>>
>> That has some potential, but at high occupancy you could end up refusing
>> to schedule something because no one scheduler has sufficient resources
>> even if the cluster as a whole does.
>>
>
> I'm not sure what you mean here. What resource spans multiple compute
> hosts?

Imagine the cluster is running close to full occupancy, with each scheduler
having room for 40 more instances.  Now I come along and issue a single
request to boot 50 instances.  The cluster has room for that, but none
of the schedulers do.

>> This gets worse once you start factoring in things like heat and
>> instance groups that will want to schedule whole sets of resources
>> (instances, IP addresses, network links, cinder volumes, etc.) at once
>> with constraints on where they can be placed relative to each other.

> Actually that is rather simple. Such requests have to be serialized
> into a work-flow. So if you say "give me 2 instances in 2 different
> locations" then you allocate 1 instance, and then another one with
> 'not_in_location(1)' as a condition.

Actually, you don't want to serialize it, you want to hand the whole set
of resource requests and constraints to the scheduler all at once.

If you do them one at a time, then early decisions made with
less-than-complete knowledge can result in later scheduling requests
failing due to being unable to meet constraints, even if there are
actually sufficient resources in the cluster.

The "VM ensembles" document at
https://docs.google.com/document/d/1bAMtkaIFn4ZSMqqsXjs_riXofuRvApa--qo4UTwsmhw/edit?pli=1
has a good example of how one-at-a-time scheduling can cause spurious
failures.

And if you're handing the whole set of requests to a scheduler all at
once, then you want the scheduler to have access to as many resources as
possible so that it has the highest likelihood of being able to satisfy
the request given the constraints.

Chris



------------------------------

Message: 28
Date: Wed, 20 Nov 2013 00:28:49 +0400
From: Eugene Nikanorov <enikanorov at mirantis.com>
To: Vijay Venkatachalam <Vijay.Venkatachalam at citrix.com>
Cc: "openstack-dev at lists.openstack.org"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Neutron][LBaaS] SSL Termination write-up
Message-ID:
        <CAJfiwOTjs-qdVMJY5DC0K4d_NF19=mhE4LGWCN8P+r25-C5szQ at mail.gmail.com>
Content-Type: text/plain; charset="windows-1252"

Hi Vijay,

Thanks for working on this. As was discussed at the summit, the immediate
solution seems to be passing certificates via transient fields in the Vip
object, which will avoid the need for certificate management (incl. storing
them).
If certificate management is concerned then I agree that it needs to be a
separate specialized service.

> My thought was to have an independent certificate resource with the VIP
> uuid as one of the properties. The VIP is already created and will help
> to identify the driver/device. The VIP property can be deprecated in the
> long term.
I think that is a little bit forward-looking at this moment, and also that
certificates are not specific to load balancing.
Transient fields could help here too: the client could pass a certificate id
and the driver of the corresponding device would communicate with the
external service to fetch the corresponding certificate.

Thanks,
Eugene.


On Tue, Nov 19, 2013 at 5:48 PM, Vijay Venkatachalam <
Vijay.Venkatachalam at citrix.com> wrote:

>  Hi Sam, Eugene, & Avishay, etal,
>
>
>
>                 Today I spent some time to create a write-up for SSL
> Termination, not exactly a design doc. Please share your comments!
>
>
>
>
> https://docs.google.com/document/d/1tFOrIa10lKr0xQyLVGsVfXr29NQBq2nYTvMkMJ_inbo/edit
>
>
>
> Would like comments/discussion especially on the following note:
>
>
>
> SSL Termination requires certificate management. The ideal way is to
> handle this via an independent IAM service. This would take time to
> implement, so the thought was to add the certificate details in the VIP
> resource and send them directly to the device. Basically, don't store the
> certificate key in the DB, thereby avoiding the security concerns of
> maintaining certificates in the controller.
>
>
>
> I would expect the certificates to become an independent resource in
> future thereby causing backward compatibility issues.
>
>
>
> Any ideas how to achieve this?
>
>
>
> My thought was to have an independent certificate resource with the VIP
> uuid as one of the properties. The VIP is already created and will help to
> identify the driver/device. The VIP property can be deprecated in the long
> term.
>
>  Thanks,
>
> Vijay V.
>

------------------------------

Message: 29
Date: Tue, 19 Nov 2013 15:34:46 -0500
From: Russell Bryant <rbryant at redhat.com>
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] Introducing the new OpenStack service for
        Containers
Message-ID: <528BCB66.5030706 at redhat.com>
Content-Type: text/plain; charset=ISO-8859-1

On 11/19/2013 03:08 PM, Rick Jones wrote:
> On 11/19/2013 10:02 AM, James Bottomley wrote:
>> It is possible to extend the Nova APIs to control containers more fully,
>> but there was resistance to doing this on the grounds that it's
>> expanding the scope of Nova, hence the new project.
>
> How well received would another CLI/API to learn be among the end-users?

It depends.  Is it mostly a duplication of the compute API, but just
slightly different enough to be annoying?  Or is it something significantly
different enough that it suits their use cases much better?

That's the important question to answer here.  How different is it, and
does the difference justify a split?

The opinions so far vary widely.  Those interested in pushing this
are going to go off and work on a more detailed API proposal.  I think
we should revisit this discussion when they have something for us to
evaluate against.

Even if this ends up staying in Nova, the API work is still useful.  It
would help us toward defining a container specific extension to the
compute API.

--
Russell Bryant



------------------------------

Message: 30
Date: Wed, 20 Nov 2013 09:40:54 +1300
From: Steve Baker <sbaker at redhat.com>
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [Heat] HOT software configuration refined
        after design summit discussions
Message-ID: <528BCCD6.6050002 at redhat.com>
Content-Type: text/plain; charset=ISO-8859-1

On 11/19/2013 08:37 PM, Thomas Spatzier wrote:
> Steve Baker <sbaker at redhat.com> wrote on 18.11.2013 21:52:04:
>> From: Steve Baker <sbaker at redhat.com>
>> To: openstack-dev at lists.openstack.org,
>> Date: 18.11.2013 21:54
>> Subject: Re: [openstack-dev] [Heat] HOT software configuration
>> refined after design summit discussions
>>
>> On 11/19/2013 02:22 AM, Thomas Spatzier wrote:
>>> Hi all,
>>>
>>> I have reworked the wiki page [1] I created last week to reflect
>>> discussions we had on the mail list and in IRC. From ML discussions
> last
>>> week it looked like we were all basically on the same page (with some
>>> details to be worked out), and I hope the new draft eliminates some
>>> confusion that the original draft had.
>>>
>>> [1]
> https://wiki.openstack.org/wiki/Heat/Blueprints/hot-software-config-WIP
>> Thanks Thomas, this looks really good. I've actually started on a POC
>> which maps to this model.
> Good to hear that, Steve :-)
> Now that we are converging, should we consolidate the various wiki pages
> and just have one? E.g. copy the complete contents of
> hot-software-config-WIP to your original hot-software-config, or deprecate
> all others and make hot-software-config-WIP the master?
Lets just bless hot-software-config-WIP and add to it as we flesh out
the implementation.

>> I've used different semantics which you may actually prefer some of,
>> please comment below.
>>
>> Resource types:
>> SoftwareConfig -> SoftwareConfig (yay!)
>> SoftwareDeployment -> SoftwareApplier - less typing, less mouth-fatigue
> I'm ok with SoftwareApplier. If we don't hear objections, I can change it
> in the wiki.
>
>> SoftwareConfig properties:
>> parameters -> inputs - just because parameters is overloaded already.
> Makes sense.
>
>> Although if the CM tool has their own semantics for inputs then that
>> should be used in that SoftwareConfig resource implementation instead.
>> outputs -> outputs
>>
>> SoftwareApplier properties:
>> software_config -> apply_config - because there will sometimes be a
>> corresponding remove_config
> Makes sense, and the remove_config thought is a very good point!
>
>> server -> server
>> parameters -> input_values - to match the 'inputs' schema property in
>> SoftwareConfig
> Agree on input_values.
>
>> Other comments on hot-software-config-WIP:
>>
>> Regarding apply_config/remove_config, if a SoftwareApplier resource is
>> deleted it should trigger any remove_config and wait for the server to
>> acknowledge when that is complete. This allows for any
>> evacuation/deregistering workloads to be executed.
>>
>> I'm unclear yet what the SoftwareConfig 'role' is for, unless the role
>> specifies the contract for a given inputs and outputs schema? How would
>> this be documented or enforced? I'm inclined to leave it out for now.
> So about 'role', as I stated in the wiki, my thinking was that there will
> be different SoftwareConfig and SoftwareApplier implementations per CM tool
> (more on that below), since all CM tools will probably have their specific
> metadata and runtime implementation. So in my example I was using Chef, and
> 'role' is just a Chef concept, i.e. you take a cookbook and configure a
> specific Chef role on a server.
OK, its Chef specific; I'm fine with that.
>> It should be possible to write a SoftwareConfig type for a new CM tool
>> as a provider template. This has some nice implications for deployers
>> and users.
> I think provider templates are a good thing to have clean componentization
> for re-use. However, I think it still would be good to allow users to
> define their SoftwareConfigs inline in a template for simple use cases. I
> heard that requirement in several posts on the ML last week.
> The question is whether we can live with a single implementation of
> SoftwareConfig and SoftwareApplier then (see also below).
Yes, a provider template would encapsulate some base SoftwareConfig
resource type, but users would be free to use this type inline in their
template too.
>> My hope is that there will not need to be a different SoftwareApplier
>> type written for each CM tool. But maybe there will be one for each
>> delivery mechanism. The first implementation will use metadata polling
>> and signals, another might use Marconi. Bootstrapping an image to
>> consume a given CM tool and applied configuration data is something that
>> we need to do, but we can make it beyond the scope of this particular
>> proposal.
> I was thinking about a single implementation, too. However, I cannot really
> imagine how one single implementation could handle both the different
> metadata of different CM tools, and different runtime implementation. I
> think we would want to support at least a handful of the most favorite
> tools, but cannot see at the moment how to cover them all in one
> implementation. My thought was that there could be a super-class for common
> behavior, and then plugins with specific behavior for each tool.
>
> Anyway, all of that needs to be verified, so working on PoC patches is
> definitely the right thing to do. For example, if we work on implementation
> for two CM tools (e.g. Chef and simple scripts), we can probably see if one
> common implementation is possible or not.
> Someone from our team is going to write a provider for Chef to try things
> out. I think that can be aligned nicely with your work.
I think there needs to be a CM-tool-specific agent delivered to the server
which os-collect-config invokes. This agent will transform the config
data (input values, CM script, CM-specific specialness) into a CM tool
invocation.

How to define and deliver this agent is the challenge. Some options are:
1) install it as part of the image customization/bootstrapping (golden
images or cloud-init)
2) define a (mustache?) template in the SoftwareConfig which
os-collect-config transforms into the agent script, which
os-collect-config then executes
3) a CM tool specific implementation of SoftwareApplier builds and
delivers a complete agent to os-collect-config which executes it

I may be leaning towards 3) at the moment. Hopefully any agent can be
generated with a sufficiently sophisticated base SoftwareApplier type,
plus maybe some richer intrinsic functions.
>> The POC I'm working on is actually backed by a REST API which does dumb
>> (but structured) storage of SoftwareConfig and SoftwareApplier entities.
>> This has some interesting implications for managing SoftwareConfig
>> resources outside the context of the stack which uses them, but lets not
>> worry too much about that *yet*.
> Sounds good. We are also defining some blueprints to break down the overall
> software config topic. We plan to share them later this week, and then we
> can consolidate with your plans and see how we can best join forces.
>
>
At this point it would be very helpful to spec out how specific CM tools
are invoked with given inputs, script, and CM tool specific options.

Maybe if you start with shell scripts, cfn-init and chef then we can all
contribute other CM tools like os-config-applier, puppet, ansible,
saltstack.

Hopefully by then my POC will at least be able to create resources, if
not deliver some data to servers.
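To make the renamed pieces discussed above a bit more concrete, here is
roughly what a config/applier pair would carry. It is written as Python data
purely as a stand-in for the eventual HOT snippet; the property names follow
the renaming agreed above, while everything else (the 'config' script, the
resource names) is illustrative guesswork:

# Stand-in for a HOT snippet, expressed as Python data for illustration only.
software_config = {
    'type': 'OS::Heat::SoftwareConfig',
    'properties': {
        'inputs': {'db_host': {'type': 'string'}},    # was "parameters"
        'outputs': {'result': {'type': 'string'}},
        'config': '#!/bin/sh\necho "configuring against $db_host"\n',
    },
}

software_applier = {
    'type': 'OS::Heat::SoftwareApplier',              # was "SoftwareDeployment"
    'properties': {
        'apply_config': {'get_resource': 'software_config'},  # was "software_config"
        'server': {'get_resource': 'my_server'},
        'input_values': {'db_host': '192.0.2.10'},    # was "parameters"
    },
}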



------------------------------

Message: 31
Date: Tue, 19 Nov 2013 12:45:49 -0800
From: Devananda van der Veen <devananda.vdv at gmail.com>
To: Ladislav Smola <lsmola at redhat.com>
Cc: "openstack-dev at lists.openstack.org"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Ironic][Ceilometer] get IPMI data for
        ceilometer
Message-ID:
        <CAExZKEqad-vUYgsE0cbAZyzwFynJgf9iqBy5t=xWMgvPK97pkw at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

On Mon, Nov 18, 2013 at 10:35 AM, Ladislav Smola <lsmola at redhat.com> wrote:

>  Hello. I have a couple of additional questions.
>
> 1. What about IPMI data that we want to get by polling, e.g. temperatures,
>    etc.? Will Ironic be polling this kind of data and sending it directly
>    to the collector (or agent)? Not sure if this belongs in Ironic. It
>    would have to support some pluggable architecture for vendor-specific
>    pollsters, like Ceilometer does.
>
>
If there is a fixed set of information (eg, temp, fan speed, etc) that
ceilometer will want, let's make a list of that and add a driver interface
within Ironic to abstract the collection of that information from physical
nodes. Then, each driver will be able to implement it as necessary for that
vendor. Eg., an iLO driver may poll its nodes differently than a generic
IPMI driver, but the resulting data exported to Ceilometer should have the
same structure.

I don't think we should, at least right now, support pluggable pollsters on
the Ceilometer->Ironic side. Let's start with a small set of data that
Ironic exposes, make it pluggable internally for different types of
hardware, and iterate if necessary.
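As a sketch of what such a driver interface could look like (names here are
illustrative, not Ironic code), see below; the point is that the set of
exported fields is fixed while each hardware driver is free to collect them
however it needs:

class SensorDataInterface(object):
    """Hypothetical per-driver hook for exporting a fixed set of sensor data."""

    def get_sensor_data(self, node):
        """Return e.g. {'temperature_c': ..., 'fan_speed_rpm': ...} for a node."""
        raise NotImplementedError()


class GenericIPMISensors(SensorDataInterface):
    def get_sensor_data(self, node):
        # A real driver would query the node's BMC over IPMI here; an iLO
        # driver could collect it differently, but returns the same structure.
        return {'temperature_c': 42, 'fan_speed_rpm': 3000}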


> 2. I've seen in the etherpad that the SNMP agent(pollster) will be also
> part of the Ironic(running next to conductor). Is it true?
>     Or that will be placed in Ceilometer central agent?
>

An SNMP agent doesn't fit within the scope of Ironic, as far as I see, so
this would need to be implemented by Ceilometer.

As far as where the SNMP agent would need to run, it should be on the same
host(s) as ironic-conductor so that it has access to the management network
(the physically-separate network for hardware management, IPMI, etc). We
should keep the number of applications with direct access to that network
to a minimum, however, so a thin agent that collects and forwards the SNMP
data to the central agent would be preferable, in my opinion.


Regards,
Devananda



>
>
> Thanks for response.
> Ladislav
>
>
>
> On 11/18/2013 06:25 PM, Devananda van der Veen wrote:
>
> Hi Lianhao Lu,
>
>  I briefly summarized my recollection of that session in this blueprint:
>
>  https://blueprints.launchpad.net/ironic/+spec/add-ceilometer-agent
>
>  I've responded to your questions inline as well.
>
>
> On Sun, Nov 17, 2013 at 10:24 PM, Lu, Lianhao <lianhao.lu at intel.com>wrote:
>
>> Hi stackers,
>>
>> During the summit session Expose hardware sensor (IPMI) data
>> https://etherpad.openstack.org/p/icehouse-summit-ceilometer-hardware-sensors,
>> it was proposed to deploy a ceilometer agent next to the ironic conductor
>> to the get the ipmi data. Here I'd like to ask some questions to figure out
>> what's the current missing pieces in ironic and ceilometer for that
>> proposal.
>>
>> 1. Just double check, ironic won't provide API to get IPMI data, right?
>>
>
>  Correct. This was generally felt to be unnecessary.
>
>>
>> 2. If deploying a ceilometer agent next to the ironic conductor, how does
>> the agent talk to the conductor? Through rpc?
>>
>
>  My understanding is that ironic-conductor will emit messages to the
> ceilometer agent, and the communication is one-way. These could be
> triggered by a periodic task, or by some other event within Ironic, such as
> a change in the power state of a node.
>
>
>>
>> 3. Does the current ironic conductor have rpc_method to support getting
>> generic ipmi data, i.e. let the rpc_method caller specifying arbitrary
>> netfn/command to get any type of ipmi data?
>>
>
>  No, and as far as I understand, it doesn't need one.
>
>
>>
>> 4. I believe the ironic conductor uses some kind of node_id to associate
>> the bmc with its credentials, right? If so, how can the ceilometer agent
>> get those node_ids to ask the ironic conductor to poll the ipmi data? And
>> how can the ceilometer agent extract meaningful information from that
>> node_id to set those fields in the ceilometer Sample (e.g. resource_id,
>> project_id, user_id, etc.) to identify which physical node the ipmi data is
>> coming from?
>>
>
>  This question perhaps requires a longer answer.
>
>  Ironic references physical machines (nodes) internally with an integer
> node_id and externally with a standard uuid. When a Nova instance is
> created, it will be associated with a node; that node will have a reference
> to the nova instance_uuid which is exposed in our API, and can be passed to
> Ceilometer's agent. I believe that nova instance_uuid will enable
> ceilometer to detect the project, user, etc.
>
>  Should Ironic emit messages regarding nodes which are not provisioned?
> Physical machines that don't have a tenant instance on them are not
> associated to any project, user, tenant, quota, etc, so I suspect that we
> shouldn't notify about them. It would be like tracking the unused disks in
> a SAN.
>
>  Regards,
> Devananda
>
>
>

------------------------------

Message: 32
Date: Tue, 19 Nov 2013 12:50:48 -0800
From: Clint Byrum <clint at fewbar.com>
To: openstack-dev <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Heat] HOT software configuration refined
        after   design summit discussions
Message-ID: <1384890870-sup-6012 at clint-HP>
Content-Type: text/plain; charset=UTF-8

Excerpts from Steve Baker's message of 2013-11-18 12:52:04 -0800:
>
> Regarding apply_config/remove_config, if a SoftwareApplier resource is
> deleted it should trigger any remove_config and wait for the server to
> acknowledge when that is complete. This allows for any
> evacuation/deregistering workloads to be executed.
>

I'm a little worried about the road that leads us down. Most configuration
software defines forward progress only. Meaning, if you want something
not there, you don't remove it from your assertions, you assert that it
is not there.

The reason this is different than the way we operate with resources is
that resources are all under Heat's direct control via well defined
APIs. In-instance things, however, will be indirectly controlled. So I
feel like focusing on a "diff" mechanism for user-deployed tools may be
unnecessary and might confuse. I'd much rather have a "converge"
mechanism for the users to focus on.



------------------------------

Message: 33
Date: Wed, 20 Nov 2013 10:06:21 +1300
From: Steve Baker <sbaker at redhat.com>
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [Heat] HOT software configuration refined
        after design summit discussions
Message-ID: <528BD2CD.3040904 at redhat.com>
Content-Type: text/plain; charset=ISO-8859-1

On 11/20/2013 09:50 AM, Clint Byrum wrote:
> Excerpts from Steve Baker's message of 2013-11-18 12:52:04 -0800:
>> Regarding apply_config/remove_config, if a SoftwareApplier resource is
>> deleted it should trigger any remove_config and wait for the server to
>> acknowledge when that is complete. This allows for any
>> evacuation/deregistering workloads to be executed.
>>
> I'm a little worried about the road that leads us down. Most configuration
> software defines forward progress only. Meaning, if you want something
> not there, you don't remove it from your assertions, you assert that it
> is not there.
>
> The reason this is different than the way we operate with resources is
> that resources are all under Heat's direct control via well defined
> APIs. In-instance things, however, will be indirectly controlled. So I
> feel like focusing on a "diff" mechanism for user-deployed tools may be
> unnecessary and might confuse. I'd much rather have a "converge"
> mechanism for the users to focus on.
>
>
A specific use-case I'm trying to address here is tripleo doing an
update-replace on a nova compute node. The remove_config contains the
workload to evacuate VMs and signal heat when the node is ready to be
shut down. This is more involved than just "uninstall the things".

Could you outline in some more detail how you think this could be done?



------------------------------

Message: 34
Date: Tue, 19 Nov 2013 13:28:31 -0800
From: Clint Byrum <clint at fewbar.com>
To: openstack-dev <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Heat] HOT software configuration refined
        after   design summit discussions
Message-ID: <1384896002-sup-7668 at clint-HP>
Content-Type: text/plain; charset=UTF-8

Excerpts from Steve Baker's message of 2013-11-19 13:06:21 -0800:
> On 11/20/2013 09:50 AM, Clint Byrum wrote:
> > Excerpts from Steve Baker's message of 2013-11-18 12:52:04 -0800:
> >> Regarding apply_config/remove_config, if a SoftwareApplier resource is
> >> deleted it should trigger any remove_config and wait for the server to
> >> acknowledge when that is complete. This allows for any
> >> evacuation/deregistering workloads to be executed.
> >>
> > I'm a little worried about the road that leads us down. Most configuration
> > software defines forward progress only. Meaning, if you want something
> > not there, you don't remove it from your assertions, you assert that it
> > is not there.
> >
> > The reason this is different than the way we operate with resources is
> > that resources are all under Heat's direct control via well defined
> > APIs. In-instance things, however, will be indirectly controlled. So I
> > feel like focusing on a "diff" mechanism for user-deployed tools may be
> > unnecessary and might confuse. I'd much rather have a "converge"
> > mechanism for the users to focus on.
> >
> >
> A specific use-case I'm trying to address here is tripleo doing an
> update-replace on a nova compute node. The remove_config contains the
> workload to evacuate VMs and signal heat when the node is ready to be
> shut down. This is more involved than just "uninstall the things".
>
> Could you outline in some more detail how you think this could be done?
>

So for that we would not remove the software configuration for the
nova-compute, we would assert that the machine needs vms evacuated.
We want evacuation to be something we explicitly do, not a side effect
of deleting things. Perhaps having delete hooks for starting delete
work-flows is right, but it set off a red flag for me so I want to make
sure we think it through.

Also IIRC, evacuation is not necessarily an in-instance thing. It looks
more like the weird thing we've been talking about lately which is
"how do we orchestrate tenant API's":

https://etherpad.openstack.org/p/orchestrate-tenant-apis



------------------------------

Message: 35
Date: Tue, 19 Nov 2013 22:38:23 +0100
From: Zane Bitter <zbitter at redhat.com>
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [Heat] Continue discussing multi-region
        orchestration
Message-ID: <528BDA4F.8090605 at redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed

On 19/11/13 19:03, Clint Byrum wrote:
> Excerpts from Zane Bitter's message of 2013-11-15 12:41:53 -0800:
>> Good news, everyone! I have created the missing whiteboard diagram that
>> we all needed at the design summit:
>>
>> https://wiki.openstack.org/wiki/Heat/Blueprints/Multi_Region_Support_for_Heat/The_Missing_Diagram
>>
>> I've documented 5 possibilities. (1) is the current implementation,
>> which we agree we want to get away from. I strongly favour (2) for the
>> reasons listed. I don't think (3) has many friends. (4) seems to be
>> popular despite the obvious availability problem and doubts that it is
>> even feasible. Finally, I can save us all some time by simply stating
>> that I will -2 on sight any attempt to implement (5).
>>
>> When we're discussing this, please mention explicitly the number of the
>> model you are talking about at any given time.
>>
>> If you have a suggestion for a different model, make your own diagram!
>> jk, you can sketch it or something for me and I'll see if I can add it.
>
> Thanks for putting this together Zane. I just now got around to looking
> closely.
>
> Option 2 is good. I'd love for option 1 to be made automatic by making
> the client smarter, but parsing templates in the client will require
> some deep thought before we decide it is a good idea.
>
> I'd like to consider a 2a, which just has the same Heat engines the user
> is talking to being used to do the orchestration in whatever region
> they are in. I think that is actually the intention of the diagram,
> but it looks like there is a "special" one that talks to the engines
> that actually do the work.

Yes, I think you're saying the same thing as the diagram was intended to
convey. (Each orange box is meant to be a Heat engine.)

So the user talks to the Heat endpoint in a region of their choice
(doesn't matter which, they're all the same). When an OS::Heat::Stack
resource has Region and/or Endpoint specified, it will use
python-heatclient to connect to the appropriate engine for that nested
stack.

> 2 may morph into 3 actually, if users don't like the nested stack
> requirement for 2, we can do the work to basically make the engine create
> a nested stack per region. So that makes 2 a stronger choice for first
> implementation.

Yeah, if you run your own standalone Heat engine, you've effectively
created 3 with no code changes from 2 afaict. I wouldn't recommend that
operators deploy this way though.

> 4 has an unstated pro, which is that attack surface is reduced. This
> makes more sense when you consider the TripleO case where you may want
> the undercloud (hardware cloud) to orchestrate things existing in the
> overcloud (vm cloud) but you don't want the overcloud administrators to
> be able to control your entire stack.
>
> Given CAP theorem, option 5, the global orchestrator, would be doable
> with not much change as long as partition tolerance were the bit we gave

Well, in the real world partitions _do_ happen, so the choice is to give
up consistency, availability or both.

> up. We would just have to have a cross-region RPC bus and database. Of
> course, since regions are most likely to be partitioned, that is not
> really a good choice. Trading partition tolerance for consistency lands
> us in the complexity black hole. Trading out availability makes it no
> better than option 4.

Yep, exactly.

cheers,
Zane.



------------------------------

Message: 36
Date: Tue, 19 Nov 2013 16:40:40 -0500
From: James Slagle <james.slagle at gmail.com>
To: openstack-dev at lists.openstack.org
Subject: [openstack-dev] [TripleO] Easier way of trying TripleO
Message-ID: <20131119214039.GA9377 at gmail.com>
Content-Type: text/plain; charset=us-ascii

I'd like to propose an idea around a simplified and complementary version of
devtest that makes it easier for someone to get started and try TripleO.

The goal is to get people using TripleO as a way to experience the
deployment of OpenStack, not necessarily as a way to get a usable
OpenStack cloud itself.

To that end, we could:

1) Provide an undercloud vm image so that you could effectively skip the entire
   seed setup.
2) Provide pre-built downloadable images for the overcloud and deployment
   kernel and ramdisk.
3) Instructions on how to use these images to deploy a running
   overcloud.

Images could be provided for Ubuntu and Fedora, since both those work fairly
well today.

The instructions would look something like:

1) Download all the images.
2) Perform initial host setup.  This would be much smaller than what is
   required for devtest and off the top of my head would mostly be:
   - openvswitch bridge setup
   - libvirt configuration
   - ssh configuration (for the baremetal virtual power driver)
3) Start the undercloud vm.  It would need to be bootstrapped with an initial
   static json file for the heat metadata, same as the seed works today.
4) Any last mile manual configuration, such as nova.conf edits for the virtual
   power driver user.
5) Use tuskar+horizon (running on the undercloud) to deploy the overcloud.
6) Overcloud configuration (I don't see this being much different from what is
   there today).

All the openstack clients, heat templates, etc., are on the undercloud vm, and
that's where they're used from, as opposed to from the host (which results in
less to install/configure on the host).

We could also provide instructions on how to configure the undercloud vm to
provision baremetal.  I assume this would be possible, given the correct
bridged networking setup.

It could make sense to use an all-in-one overcloud for this as well, given that
the aim is simplification.

Obviously, this approach implies some image management on the community's part,
and I think we'd document and use all the existing tools (dib, elements) to
build images, etc.

Thoughts on this approach?

--
-- James Slagle
--



------------------------------

Message: 37
Date: Tue, 19 Nov 2013 16:45:11 -0500
From: Russell Bryant <rbryant at redhat.com>
To: OpenStack Development Mailing List
        <openstack-dev at lists.openstack.org>
Subject: [openstack-dev] [Nova] Icehouse roadmap status
Message-ID: <528BDBE7.3000407 at redhat.com>
Content-Type: text/plain; charset=ISO-8859-1

Greetings,

We've made a lot of progress on reviewing blueprints over the last week
and have the icehouse-1 [1] list in reasonable shape.  Note that
icehouse-1 development must be completed two weeks from today, and that
time frame includes a major holiday in the US.  Please adjust plans and
expectations accordingly.

Here are the goals we need to be aiming for over this next week:

1) File blueprints!  We talked about a *lot* of stuff in the design
summit that is still not reflected in blueprints.  Please get those
blueprints filed and targeted to the appropriate icehouse milestone ASAP!

2) nova-core reviewers should consider committing to reviewing
blueprints they care most about.  That will give us a better picture of
which blueprints are most likely to get reviewed in time to be merged.

The process here is to just add a note to the blueprint whiteboard
indicating yourself as a review sponsor.  For example:

    "nova-core sponsors: russellb, alaski"

Once a blueprint has at least two sponsors, its priority can be raised
above Low.

3) nova-drivers: We need to keep up the pace on blueprint reviews.
There's still a lot to review in the overall icehouse list [2].  It's a
lot of work right now, early in the cycle while there's a lot of
planning going on.  It will start to taper off in coming weeks.


Please let me know if you have any questions.

Thanks everyone!


[1] https://launchpad.net/nova/+milestone/icehouse-1
[2] https://blueprints.launchpad.net/nova/icehouse

--
Russell Bryant



------------------------------

Message: 38
Date: Tue, 19 Nov 2013 23:02:49 +0100
From: Thierry Carrez <thierry at openstack.org>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Horizon] PTL election
Message-ID: <528BE009.60202 at openstack.org>
Content-Type: text/plain; charset=ISO-8859-1

Thierry Carrez wrote:
> Thierry Carrez wrote:
>> The two candidates who nominated themselves in time for this election are:
>>
>> * David Lyle
>> * Matthias Runge
>>
>> The election will be set up tomorrow, and will stay open for voting for
>> a week.

The poll is now closed, and the winner is David Lyle!

You can see results at:

http://www.cs.cornell.edu/w8/~andru/cgi-perl/civs/results.pl?id=E_cd16dd051e519ef2

Congrats, David!

--
Thierry Carrez (ttx)



------------------------------

Message: 39
Date: Tue, 19 Nov 2013 23:08:39 +0100
From: Matthias Runge <mrunge at redhat.com>
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [Horizon] PTL election
Message-ID: <528BE167.1090706 at redhat.com>
Content-Type: text/plain; charset=ISO-8859-1

On 11/19/2013 11:02 PM, Thierry Carrez wrote:

>
> The poll is now closed, and the winner is David Lyle !
>
David, well done and honestly deserved!

Matthias




------------------------------

Message: 40
Date: Tue, 19 Nov 2013 23:22:46 +0100
From: Salvatore Orlando <sorlando at nicira.com>
To: "OpenStack Development Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Cc: Robert Kukura <rkukura at redhat.com>, Isaku Yamahata
        <isaku.yamahata at gmail.com>
Subject: Re: [openstack-dev] [Neutron] Race condition between DB layer
        and plugin back-end implementation
Message-ID:
        <CAGR=i3gjqDAdBVOHHCvwYjKJhXGNhMb+7fTuXdzixBpkBytBZQ at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

For what it's worth, we considered this aspect from the perspective of
the Neutron plugin my team maintains (NVP) during the past release cycle.

The synchronous model that most plugins with a controller on the backend
currently implement is simple and convenient, but has some flaws:

- reliability: the current approach where the plugin orchestrates the
backend is not really optimal when it comes to ensuring your running
configuration (backend/control plane) is in sync with your desired
configuration (neutron/mgmt plane); moreover, in some cases, due to neutron
internals, API calls to the backend are wrapped in a transaction too,
leading to very long SQL transactions, which are quite dangerous indeed. It
is not easy to recover from a failure due to an eventlet thread deadlocking
with a mysql transaction, where by 'recover' I mean ensuring neutron and
backend state are in sync.

- maintainability: since handling rollback in case of failures on the
backend and/or the db is cumbersome, this often leads to spaghetti code
which is very hard to maintain regardless of the effort (ok, I agree here
that this also depends on how good the devs are - most of the guys in my
team are very good, but unfortunately they have me too...).

- performance & scalability:
    -  roundtrips to the backend take a non-negligible toll on the duration
of an API call, whereas most Neutron API calls should probably just
terminate at the DB just like a nova boot call does not wait for the VM to
be ACTIVE to return.
    - we need to keep some operations serialized in order to avoid the
mentioned race issues

For this reason we're progressively moving toward a change in the NVP
plugin with a series of patches under this umbrella-blueprint [1].

To answer the issues mentioned by Isaku, we've been looking at a task
management library with an efficient and reliable set of abstractions for
ensuring operations are properly ordered, thus avoiding those races (I agree
with the observation on the pre/post commit solution).
We are currently looking at using celery [2] rather than taskflow, mostly
because we already have expertise in using it in our applications, and it has
very easy abstractions for workflow design, as well as for handling task
failures.
That said, I think we're still open to switching to taskflow should we become
aware of a very good reason for using it.

Regards,
Salvatore

[1]
https://blueprints.launchpad.net/neutron/+spec/nvp-async-backend-communication
[2] http://docs.celeryproject.org/en/master/index.html
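
To illustrate the asynchronous model being described, here is a purely
hypothetical sketch using celery's @task decorator and .delay(); this is not
the NVP plugin code, and the helpers, status store and broker URL are
placeholders:

  # Hypothetical sketch of the pattern described above, not actual plugin code.
  # The API call commits the DB record and returns; a celery worker then pushes
  # the configuration to the backend and records the outcome, retrying on error.
  from celery import Celery

  app = Celery('plugin_tasks', broker='amqp://guest@localhost//')

  PORT_STATUS = {}  # stand-in for the plugin's status column

  def push_port_to_backend(port_id):
      # A real plugin would call its backend REST client here.
      pass

  @app.task(bind=True, max_retries=3)
  def sync_port_to_backend(self, port_id):
      try:
          push_port_to_backend(port_id)
          PORT_STATUS[port_id] = 'ACTIVE'
      except Exception as exc:
          PORT_STATUS[port_id] = 'ERROR'
          raise self.retry(exc=exc)

  # In create_port(): commit the DB transaction first, then enqueue:
  #     sync_port_to_backend.delay(port_db['id'])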



On 19 November 2013 19:42, Joshua Harlow <harlowja at yahoo-inc.com> wrote:

> And also of course, nearly forgot a similar situation/review in heat.
>
> https://review.openstack.org/#/c/49440/
>
> Except that one was/is dealing with stack locking (a Heat concept).
>
> On 11/19/13 10:33 AM, "Joshua Harlow" <harlowja at yahoo-inc.com> wrote:
>
> >If you start adding these states you might really want to consider the
> >following work that is going on in other projects.
> >
> >It surely appears that everyone is starting to hit the same problem (and
> >joining efforts would produce a more beneficial result).
> >
> >Relevant icehouse etherpads:
> >- https://etherpad.openstack.org/p/CinderTaskFlowFSM
> >- https://etherpad.openstack.org/p/icehouse-oslo-service-synchronization
> >
> >And of course my obvious plug for taskflow (which is designed to be a
> >useful library to help in all these usages).
> >
> >- https://wiki.openstack.org/wiki/TaskFlow
> >
> >The states u just mentioned start to line-up with
> >https://wiki.openstack.org/wiki/TaskFlow/States_of_Task_and_Flow
> >
> >If this sounds like a useful way to go (joining efforts) then lets see how
> >we can make it possible.
> >
> >IRC: #openstack-state-management is where I am usually at.
> >
> >On 11/19/13 3:57 AM, "Isaku Yamahata" <isaku.yamahata at gmail.com> wrote:
> >
> >>On Mon, Nov 18, 2013 at 03:55:49PM -0500,
> >>Robert Kukura <rkukura at redhat.com> wrote:
> >>
> >>> On 11/18/2013 03:25 PM, Edgar Magana wrote:
> >>> > Developers,
> >>> >
> >>> > This topic has been discussed before but I do not remember if we have
> >>>a
> >>> > good solution or not.
> >>>
> >>> The ML2 plugin addresses this by calling each MechanismDriver twice.
> >>>The
> >>> create_network_precommit() method is called as part of the DB
> >>> transaction, and the create_network_postcommit() method is called after
> >>> the transaction has been committed. Interactions with devices or
> >>> controllers are done in the postcommit methods. If the postcommit
> >>>method
> >>> raises an exception, the plugin deletes that partially-created resource
> >>> and returns the exception to the client. You might consider a similar
> >>> approach in your plugin.
> >>
> >>Splitting the work into two phases, pre/post, is a good approach.
> >>But there still remains a race window.
> >>Once the transaction is committed, the result is visible to the outside,
> >>so concurrent requests to the same resource will be racy.
> >>There is a window after pre_xxx_yyy() and before post_xxx_yyy() where
> >>other requests can be handled.
> >>
> >>The state machine needs to be enhanced, I think (plugins need
> >>modification); for example, adding more states like pending_{create,
> >>delete, update}.
> >>Also, we would like to consider serializing operations on ports and
> >>subnets, or operations on subnets and networks, depending on
> >>performance requirements.
> >>(Or carefully audit complex status changes, i.e.
> >>changing a port during subnet/network update/deletion.)
> >>
> >>I think it would be useful to establish reference locking policy
> >>for ML2 plugin for SDN controllers.
> >>Thoughts or comments? If this is considered useful and acceptable,
> >>I'm willing to help.
> >>
> >>thanks,
> >>Isaku Yamahata
> >>
> >>> -Bob
> >>>
> >>> > Basically, if concurrent API calls are sent to Neutron, all of them are
> >>> > sent to the plug-in level where two actions have to be made:
> >>> >
> >>> > 1. DB transaction - not just for data persistence, but also to collect
> >>> > the information needed for the next action
> >>> > 2. Plug-in back-end implementation - in our case a call to the Python
> >>> > library that consequently calls the PLUMgrid REST GW (soon SAL)
> >>> >
> >>> > For instance:
> >>> >
> >>> > def create_port(self, context, port):
> >>> >     with context.session.begin(subtransactions=True):
> >>> >         # Plugin DB - Port Create and Return port
> >>> >         port_db = super(NeutronPluginPLUMgridV2,
> >>> >                         self).create_port(context, port)
> >>> >         device_id = port_db["device_id"]
> >>> >         if port_db["device_owner"] == "network:router_gateway":
> >>> >             router_db = self._get_router(context, device_id)
> >>> >         else:
> >>> >             router_db = None
> >>> >         try:
> >>> >             LOG.debug(_("PLUMgrid Library: create_port() called"))
> >>> >             # Back-end implementation
> >>> >             self._plumlib.create_port(port_db, router_db)
> >>> >         except Exception:
> >>> >             ...
> >>> >
> >>> > The way we have implemented this at the plugin level in Havana (and even
> >>> > in Grizzly) is that both actions are wrapped in the same "transaction",
> >>> > which automatically rolls back any operation done, back to its original
> >>> > state, mostly protecting the DB from being left in an inconsistent state
> >>> > or with leftover data if the back-end part fails.
> >>> > The problem that we are experiencing is that when concurrent calls to the
> >>> > same API are sent, the operations at the plug-in back-end take long
> >>> > enough to make the next concurrent API call get stuck at the DB
> >>> > transaction level, which creates a hung state for the Neutron server to
> >>> > the point that all concurrent API calls will fail.
> >>> >
> >>> > This can be fixed if we include some "locking" system such as calling:
> >>> >
> >>> > from neutron.common import utils
> >>> > ...
> >>> >
> >>> > @utils.synchronized('any-name', external=True)
> >>> > def create_port(self, context, port):
> >>> > ...
> >>> >
> >>> > Obviously, this will serialize all concurrent calls, which will end up
> >>> > in really bad performance. Does anyone have a better solution?
> >>> >
> >>> > Thanks,
> >>> >
> >>> > Edgar
> >>> >
> >>> >
> >>> > _______________________________________________
> >>> > OpenStack-dev mailing list
> >>> > OpenStack-dev at lists.openstack.org
> >>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >>> >
> >>>
> >>>
> >>> _______________________________________________
> >>> OpenStack-dev mailing list
> >>> OpenStack-dev at lists.openstack.org
> >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >>
> >>--
> >>Isaku Yamahata <isaku.yamahata at gmail.com>
> >>
> >>_______________________________________________
> >>OpenStack-dev mailing list
> >>OpenStack-dev at lists.openstack.org
> >>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
> >
> >_______________________________________________
> >OpenStack-dev mailing list
> >OpenStack-dev at lists.openstack.org
> >http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20131119/134e2b31/attachment-0001.html>

------------------------------

Message: 41
Date: Wed, 20 Nov 2013 09:25:51 +1100
From: Angus Salkeld <asalkeld at redhat.com>
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [Heat] Version Negotiation Middleware
        Accept Header issue
Message-ID: <20131119222551.GA911 at redhat.com>
Content-Type: text/plain; charset=us-ascii; format=flowed

On 19/11/13 19:40 +0000, Fuente, Pablo A wrote:
>Hi,
>       I noticed that the Accept HTTP Header checked in the Version
>Negotiation Middleware by Heat is the same MIME type used by Glance
>("application/vnd.openstack.images-")
>       Is this OK, or should it be something like
>"application/vnd.openstack.orchestration-"?
>       If this is the case I would proceed to file a bug.

Yeah, that looks wrong; I posted a fix here:
https://review.openstack.org/57338

-Angus
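
For readers following along, the gist of the check being fixed is roughly
this (an illustrative sketch based on the description above, not the actual
middleware source; the function and constant names are made up):

  # Illustrative sketch only -- the point of the fix is simply that the MIME
  # prefix inspected should name orchestration, not Glance's images type.
  ACCEPT_PREFIX = 'application/vnd.openstack.orchestration-v'

  def negotiate_version(accept_header, default_version='1.0'):
      # e.g. 'application/vnd.openstack.orchestration-v1.0+json' -> '1.0'
      if accept_header.startswith(ACCEPT_PREFIX):
          return accept_header[len(ACCEPT_PREFIX):].split('+', 1)[0]
      return default_version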

>
>Thanks.
>Pablo.
>_______________________________________________
>OpenStack-dev mailing list
>OpenStack-dev at lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



------------------------------

Message: 42
Date: Tue, 19 Nov 2013 23:27:11 +0100
From: Zane Bitter <zbitter at redhat.com>
To: openstack-dev at lists.openstack.org
Subject: Re: [openstack-dev] [Heat] rough draft of Heat autoscaling
        API
Message-ID: <528BE5BF.5080203 at redhat.com>
Content-Type: text/plain; charset=UTF-8; format=flowed

On 19/11/13 19:14, Christopher Armstrong wrote:
> On Mon, Nov 18, 2013 at 5:57 AM, Zane Bitter <zbitter at redhat.com
> <mailto:zbitter at redhat.com>> wrote:
>
>     On 16/11/13 11:15, Angus Salkeld wrote:
>
>         On 15/11/13 08:46 -0600, Christopher Armstrong wrote:
>
>             On Fri, Nov 15, 2013 at 3:57 AM, Zane Bitter
>             <zbitter at redhat.com <mailto:zbitter at redhat.com>> wrote:
>
>                 On 15/11/13 02:48, Christopher Armstrong wrote:
>
>                     On Thu, Nov 14, 2013 at 5:40 PM, Angus Salkeld
>                     <asalkeld at redhat.com <mailto:asalkeld at redhat.com>
>                     <mailto:asalkeld at redhat.com
>                     <mailto:asalkeld at redhat.com>>> wrote:
>
>                          On 14/11/13 10:19 -0600, Christopher Armstrong
>                     wrote:
>
>                     http://docs.heatautoscale.apiary.io/
>
>                              I've thrown together a rough sketch of the
>                     proposed API for
>                              autoscaling.
>                              It's written in API-Blueprint format (which
>                     is a simple subset
>                              of Markdown)
>                              and provides schemas for inputs and outputs
>                     using JSON-Schema.
>                              The source
>                              document is currently at
>                     https://github.com/radix/heat/raw/as-api-spike/autoscaling.apibp
>
>
>                              Things we still need to figure out:
>
>                              - how to scope projects/domains. put them
>                     in the URL? get them
>                              from the
>                              token?
>                              - how webhooks are done (though this
>                     shouldn't affect the API
>                              too much;
>                              they're basically just opaque)
>
>                              Please read and comment :)
>
>
>                          Hi Christopher
>
>                          In the group create object you have 'resources'.
>                          Can you explain what you expect in there? I
>                     thought we talked at
>                          summit about have a unit of scaling as a nested
>                     stack.
>
>                          The thinking here was:
>                          - this makes the new config stuff easier to
>                            scale (config gets applied per scaling stack)
>
>                          - you can potentially place notification
>                            resources in the scaling stack (think marconi
>                            message resource - on-create it sends a message)
>
>                          - no need for a launchconfig
>                          - you can place a LoadbalancerMember resource
>                            in the scaling stack that triggers the
>                            loadbalancer to add/remove it from the lb.
>
>
>                          I guess what I am saying is I'd expect an api
>                     to a nested stack.
>
>
>                     Well, what I'm thinking now is that instead of
>                     "resources" (a
>                     mapping of
>                     resources), just have "resource", which can be the
>                     template definition
>                     for a single resource. This would then allow the
>                     user to specify a
>                     Stack
>                     resource if they want to provide multiple resources.
>                     How does that
>                     sound?
>
>
>                 My thought was this (digging into the implementation
>                 here a bit):
>
>                 - Basically, the autoscaling code works as it does now:
>                 creates a
>                 template
>                 containing OS::Nova::Server resources (changed from
>                 AWS::EC2::Instance),
>                 with the properties obtained from the LaunchConfig, and
>                 creates a
>                 stack in
>                 Heat.
>                 - LaunchConfig can now contain any properties you like
>                 (I'm not 100%
>                 sure
>                 about this one*).
>                 - The user optionally supplies a template. If the
>                 template is
>                 supplied, it
>                 is passed to Heat and set in the environment as the
>                 provider for the
>                 OS::Nova::Server resource.
>
>
>             I don't like the idea of binding to OS::Nova::Server
>             specifically for
>             autoscaling. I'd rather have the ability to scale *any*
>             resource,
>             including
>             nested stacks or custom resources. It seems like jumping
>             through hoops to
>
>
>         big +1 here, autoscaling should not even know what it is
>         scaling, just
>         some resource. solum might want to scale all sorts of non-server
>         resources (and other users).
>
>
>     I'm surprised by the negative reaction to what I suggested, which is
>     a completely standard use of provider templates. Allowing a
>     user-defined stack of resources to stand in for an unrelated
>     resource type is the entire point of providers. Everyone says that
>     it's a great feature, but if you try to use it for something they
>     call it a "hack". Strange.
>
>
> To clarify this position (which I already did in IRC), replacing one
> concrete resource with another that means something in a completely
> different domain is a hack -- say, replacing "server" with "group of
> related resources". However, replacing OS::Nova::Server with something
> which still does something very much like creating a server is
> reasonable -- e.g., using a different API like one for creating
> containers or using a different cloud provider's API.

Sure, but at the end of the day it's just a name that is used internally
and which a user would struggle to even find referenced anywhere (I
think if they look at the resources created by the autoscaling template
it *might* show up). The name is completely immaterial to the idea, as
demonstrated below where I did a straight string substitution (1 line in
the environment) for a better name and nothing changed.
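
For concreteness, a sketch of what such a one-line provider mapping could
look like, assuming Heat's resource_registry environment syntax and
python-heatclient's stacks.create; the names and URL are illustrative, and
this is not the actual example referred to above:

  # Illustrative sketch only.  The environment's resource_registry maps the
  # placeholder type to whatever should actually be scaled; swapping the name
  # is literally a one-line change in this mapping.
  environment = {
      'resource_registry': {
          # Default: scaled units are plain Nova servers...
          'OS::Heat::ScaledResource': 'OS::Nova::Server',
          # ...or point it at a provider template instead (hypothetical URL):
          # 'OS::Heat::ScaledResource': 'http://example.com/scaled_unit.yaml',
      }
  }

  # The autoscaling service would pass this when creating the scaling stack,
  # e.g. with python-heatclient (names are placeholders):
  # heat.stacks.create(stack_name='my-group', template=scaling_template,
  #                    environment=environment, parameters={})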

>     So, allow me to make a slight modification to my proposal:
>
>     - The autoscaling service manages a template containing
>     OS::Heat::ScaledResource resources. This is an imaginary resource
>     type that is not backed by a plugin in Heat.
>     - If no template is supplied by the user, the environment declares
>     another resource plugin as the provider for OS::Heat::ScaledResource
>     (by default it would be OS::Nova::Server, but this should probably
>     be configurable by the deployer... so if you had a region full of
>     Docker containers and no Nova servers, you could set it to
>     OS::Docker::Container or something).
>     - If a provider template is supplied by the user, it would be
>     specified as the provider in the environment file.
>
>     This, I hope, demonstrates that autoscaling needs no knowledge
>     whatsoever about what it is scaling to use this approach.
>
>
> It'd be interesting to see some examples, I think. I'll provide some
> examples of my proposals, with the following caveats:

Excellent idea, thanks :)

> - I'm assuming a separation of launch configuration from scaling group,
> as you proposed -- I don't really have a problem with this.
> - I'm also writing these examples with the plural "resources" parameter,
> which there has been some bikeshedding around - I believe the structure
> can be the same whether we go with singular, plural, or even
> whole-template-as-a-string.
>
> # trivial example: scaling a single server
>
> POST /launch_configs
>
> {
>      "name": "my-launch-config",
>      "resources": {
>          "my-server": {
>              "type": "OS::Nova::Server",
>              "properties": {
>                  "image": "my-image",
>                  "flavor": "my-flavor", # etc...
>              }
>          }
>      }
> }

This case would be simpler with my proposal, assuming we allow a default:

  POST /launch_configs

  {
       "name": "my-launch-config",
       "parameters": {
           "image": "my-image",
           "flavor": "my-flavor", # etc...
       }
  }

If we don't allow a default it might be something more like:


  POST /launch_configs

  {
       "name": "my-launch-config",
       "parameters": {
           "image": "my-image",
           "flavor": "my-flavor", # etc...
       },
       "provider_template_uri":
"http://heat.example.com/<tenant_id>/resources_types/OS::Nova::Server/template"
  }


> POST /groups
>
> {
>      "name": "group-name",
>      "launch_config": "my-launch-config",
>      "min_size": 0,
>      "max_size": 0,
> }

This would be the same.

>
> (and then, the user would continue on to create a policy that scales the
> group, etc)
>
> # complex example: scaling a server with an attached volume
>
> POST /launch_configs
>
> {
>      "name": "my-launch-config",
>      "resources": {
>          "my-volume": {
>              "type": "OS::Cinder::Volume",
>              "properties": {
>                  # volume properties...
>              }
>          },
>          "my-server": {
>              "type": "OS::Nova::Server",
>              "properties": {
>                  "image": "my-image",
>                  "flavor": "my-flavor", # etc...
>              }
>          },
>          "my-volume-attachment": {
>              "type": "OS::Cinder::VolumeAttachment",
>              "properties": {
>                  "volume_id": {"get_resource": "my-volume"},
>                  "instance_uuid": {"get_resource": "my-server"},
>                  "mountpoint": "/mnt/volume"
>              }
>          }
>      }
> }

This appears slightly more complex on the surface; I'll explain why in a
second.

  POST /launch_configs

  {
       "name": "my-launch-config",
       "parameters": {
           "image": "my-image",
           "flavor": "my-flavor", # etc...
       },
       "provider_template": {
           "hot_format_version": "some random date",
           "parameters" {
               "image_name": {
                   "type": "string"
               },
               "flavor": {
                   "type": "string"
               } # &c. ...
           },
           "resources" {
               "my-volume": {
                   "type": "OS::Cinder::Volume",
                   "properties": {
                       # volume properties...
                   }
               },
               "my-server": {
                   "type": "OS::Nova::Server",
                   "properties": {
                       "image": {"get_param": "image_name"},
                       "flavor": {"get_param": "flavor"}, # etc...
                  }
               },
               "my-volume-attachment": {
                   "type": "OS::Cinder::VolumeAttachment",
                   "properties": {
                       "volume_id": {"get_resource": "my-volume"},
                       "instance_uuid": {"get_resource": "my-server"},
                       "mountpoint": "/mnt/volume"
                   }
               }
           },
           "outputs" {
                "public_ip_address": {
                    "Value": {"get_attr": ["my-server",
"public_ip_address"]} # &c. ...
           }
       }
  }

(BTW the template could just as easily be included in the group rather
than the launch config. If we put it here we can validate the parameters
though.)

There are a number of advantages to including the whole template, rather
than a resource snippet:
  - Templates are versioned!
  - Templates accept parameters
  - Templates can provide outputs - we'll need these when we go to do
notifications (e.g. to load balancers).

The obvious downside is there's a lot of fiddly stuff to include in the
template (hooking up the parameters and outputs), but this is almost
entirely mitigated by the fact that the user can get a template, ready
built with the server hooked up, from the API by hitting
/resource_types/OS::Nova::Server/template and just edit in the Volume
and VolumeAttachment. (For a different example, they could of course
begin with a different resource type - the launch config accepts any
keys for parameters.) To the extent that this encourages people to write
templates where the outputs are actually supplied, it will help reduce
the number of people complaining their load balancers aren't forwarding
any traffic because they didn't surface the IP addresses.
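
A sketch of that workflow against the resource-type template endpoint
mentioned above (the Heat endpoint, token and generated resource name are
illustrative placeholders, not the exact API output):

  # Illustrative sketch only: fetch the ready-built provider template for
  # OS::Nova::Server, then edit in the Volume and VolumeAttachment resources.
  import json
  import requests

  HEAT = 'http://heat.example.com/v1/<tenant_id>'          # placeholder
  HEADERS = {'X-Auth-Token': '<token>',                     # placeholder
             'Accept': 'application/json'}

  resp = requests.get(HEAT + '/resource_types/OS::Nova::Server/template',
                      headers=HEADERS)
  template = resp.json()

  # 'server' stands in for whatever name the generated template gives the
  # server resource; the section key depends on the template format returned.
  section = 'resources' if 'resources' in template else 'Resources'
  template.setdefault(section, {}).update({
      'my-volume': {'type': 'OS::Cinder::Volume',
                    'properties': {'size': 1}},
      'my-volume-attachment': {
          'type': 'OS::Cinder::VolumeAttachment',
          'properties': {'volume_id': {'get_resource': 'my-volume'},
                         'instance_uuid': {'get_resource': 'server'},
                         'mountpoint': '/mnt/volume'}},
  })

  print(json.dumps(template, indent=2))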

>
> (and so on, creating the group and policies in the same way).

ditto.

> Can you please provide an example of your proposal for the same use
> cases? Please indicate how you'd specify the custom properties for each
> resource and how you specify the provider template in the API.

As you can see, it's not really different, just an implementation
strategy where all the edge cases have already been worked out, and all
the parts already exist.

cheers,
Zane.



------------------------------

Message: 43
Date: Tue, 19 Nov 2013 22:56:34 +0000
From: Joshua Harlow <harlowja at yahoo-inc.com>
To: Salvatore Orlando <sorlando at nicira.com>, "OpenStack Development
        Mailing List (not for usage questions)"
        <openstack-dev at lists.openstack.org>
Cc: Robert Kukura <rkukura at redhat.com>, Isaku Yamahata
        <isaku.yamahata at gmail.com>
Subject: Re: [openstack-dev] [Neutron] Race condition between DB layer
        and plugin back-end implementation
Message-ID: <CEB128F9.4D630%harlowja at yahoo-inc.com>
Content-Type: text/plain; charset="cp1250"

Can you explain a little how using celery achieves workflow reliability and avoids races (or mitigates spaghetti code)?

To me, celery acts as a way to distribute tasks, but it does not deal with actually forming an easily understandable way of knowing that a piece of code you design is actually going to go through the various state transitions (or states & workflows) that you expect (that is a higher-level mechanism you can build on top of a distribution system). So this means that NVP (or Neutron, or others) must maintain an orchestration/engine layer on top of celery to add the additional code that 'drives' celery to accomplish a given workflow in a reliable manner.

This starts to sound pretty similar to what taskflow is doing: not a direct competitor to a distributed task queue such as celery, but a higher-level mechanism that adds these benefits, since they are needed anyway.

To me these benefits currently are (the list may grow in the future):

1. A way to define a workflow (in a way that is not tied to celery, since celery's '@task' decorator ties you to celery's internal implementation).
     - This includes ongoing work to determine how to easily define a state machine in a way that is relevant to cinder (and other projects).
2. A way to keep track of the state that the workflow goes through (this brings along resumption and progress information when you track at the right level).
3. A way to execute that workflow reliably (potentially using celery, rpc, local threads, or other future hotness).
     - This becomes important when you ask yourself how you plan on testing celery in the gate/Jenkins/CI.
4. A way to guarantee that the workflow upon failure is *automatically* resumed by some other entity.

More details @ http://www.slideshare.net/harlowja/taskflow-27820295
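
For comparison, a minimal taskflow sketch of points 1-3 above (based on the
public taskflow docs; the task and flow names are made up):

  # Minimal taskflow sketch: declare tasks, compose them into a flow, and run
  # it with an engine that tracks state and reverts completed work on failure.
  import taskflow.engines
  from taskflow.patterns import linear_flow
  from taskflow import task

  class AllocateIP(task.Task):
      def execute(self):
          print('allocating IP')
          return '10.0.0.5'

      def revert(self, *args, **kwargs):
          # Called automatically if a later task in the flow fails.
          print('releasing IP')

  class ConfigureBackend(task.Task):
      def execute(self, ip):
          # 'ip' is wired in from AllocateIP's result by the engine.
          print('pushing %s to the backend' % ip)

  flow = linear_flow.Flow('create-port').add(
      AllocateIP(provides='ip'),
      ConfigureBackend(),
  )

  # The engine walks the flow through its states (PENDING -> RUNNING -> ...);
  # with a persistence backend configured, an interrupted flow can be resumed.
  taskflow.engines.load(flow).run()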


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20131119/f4289601/attachment.html>

------------------------------

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


End of OpenStack-dev Digest, Vol 19, Issue 55
*********************************************






