Open Stack

Fri Mar 21 09:11:36 UTC 2014

On Fri, Mar 21, 2014 at 4:28 AM, Zane Bitter <zbitter at redhat.com> wrote:

> On 19/03/14 17:13, Stan Lagun wrote:
>
>>
>>
>>
>> On Wed, Mar 19, 2014 at 9:18 PM, Zane Bitter <zbitter at redhat.com
>> <mailto:zbitter at redhat.com>> wrote:
>>
>>     On 19/03/14 05:00, Stan Lagun wrote:
>>
>>         Steven,
>>
>>         Agree with your opinion on HOT expansion. I see that inclusion of
>>         imperative workflows and ALM would require a major Heat redesign
>>         and would probably be impossible without losing compatibility
>>         with previous HOT syntax. It would blur Heat's mission, confuse
>>         current users and raise a lot of questions about what should and
>>         what should not be in Heat. That's why we chose to build a system
>>         on top of Heat rather than expanding HOT.
>>
>>
>>     +1, I agree (as we have discussed before) that it would be a mistake
>>     to shoehorn workflow stuff into Heat. I do think we should implement
>>     the hooks I mentioned at the start of this thread to allow tighter
>>     integration between Heat and a workflow engine (i.e. Mistral).
>>
>>     So building a system on top of Heat is good. Building it on top of
>>     Mistral as well would also be good, and that was part of the
>>     feedback from the TC.
>>
>>     To me, building on top means building on top of the languages (which
>>     users will have to invest a lot of work in learning) as well, rather
>>     than having a completely different language and only using the
>>     underlying implementation(s).
>>
>>
>> 1. Murano application developers (publishers) are using Heat directly.
>> It is not that Murano merely executes some HOT templates under the
>> hood; HOT templates are an essential part of the application definition.
>> Developers still write HOT templates in HOT syntax.
>>
>> 2. Please understand me correctly here. It is not that we wanted to
>> develop another language just for fun or to stand out from the community.
>> It is not that we wrote a DSL and then started to think about how we were
>> going to use it and prove to others that it is needed. We completely
>> understand all concerns regarding a new language. The only reason is that
>> we had a very concrete list of problems and use cases that we wanted to
>> address in Murano. We did investigate using HOT, Mistral, JavaScript, Lua
>> and BPEL and we found overwhelming obstacles with each of those
>> approaches. I don't pretend that having our own DSL is good. Just that
>> not having it is much worse. I'm also not a HOT expert as you are and
>> thus can be (partially) wrong about HOT and not aware of some of its
>> power features. If so, as technical guys we will quickly come to a
>> consensus.
>>
>>
>>
>>         Now I would like to clarify why we have chosen an imperative
>>         approach with a DSL.
>>
>>         You see the DSL as an alternative to HOT, but it is not. The DSL
>>         is an alternative to Python-encoded resources in Heat
>>         (heat/engine/resources/*.py). Imagine what Heat would look like if
>>         you let untrusted users upload Python plugins to the Heat engine
>>         and load them on the fly. Heat resources are written in Python,
>>         which is an imperative language. MuranoPL is imperative for the
>>         same reason.
>>
>>
>>     We had this exact problem in Heat, and we decided to solve it
>>     with... HOT:
>>     http://lists.openstack.org/pipermail/openstack-dev/2013-April/007989.html
>>
>>     If I may be so bold as to quote myself:
>>
>>     "...no cloud operator in the world - not even your friendly local IT
>>     department - is going to let users upload Python code to run
>>     in-memory in their orchestration engine along with all of the other
>>     users' code.
>>
>>     "If only there were some sort of language for defining OpenStack
>>     services that could be safely executed by users...
>>
>>     "Of course that's exactly what we're all about on this project :).
>>     So my proposal is to allow users to define their own resource types
>>     using a Heat template."
>>
>>
>>
>> Excellent quotes! I can sign under each of them. No one would ever allow
>> uploading Python code, so we haven't used Python (and that is not the only
>> reason not to use it).
>>
>
> There's a fine line between a domain-specific programming language (as
> opposed to a domain-specific modelling language like HOT) and a
> general-purpose programming language. The reason no-one would run untrusted
> Python code is that Python is a programming language, not that it is
> general-purpose.

1. "The line between general-purpose languages and domain-specific
languages is not always sharp, as a language may have specialized features
for a particular domain but be applicable more broadly, or conversely may
in principle be capable of broad application but in practice used primarily
for a specific domain." This is a quote from Wikipedia. PostScript, BPEL and
XSLT are Turing-complete and still domain-specific. And I've just seen
Tetris written in sed. MuranoPL is a DSL in the same sense: although it
has constructs that are usual for general-purpose languages, it also has
concepts that are specific to the Murano domain, and in general it was not
designed to be practical for general-purpose programming.

2. The reason no-one would run untrusted Python code is that Python is not
secure. People have no problem running untrusted JavaScript code in their
browsers. I do agree that declarative languages like HOT look more secure
than MuranoPL, and that it may be a problem to convince people that
MuranoPL is secure.

3. Although HOT looks more secure on the surface, it is not necessarily so
in reality. There is a Python class behind each entry in the resources
section of a HOT template. That Python code is run with root privileges and
is not guaranteed to be safe. People make mistakes, forget to validate
parameters, make incorrect assumptions etc. Even if the code is proven to be
secure, every single commit can introduce a security breach, and no testing
system can be relied on to detect this.
MuranoPL is 100% interpreted and only a very limited set of APIs is written
in Python. Everything else is written in MuranoPL itself. So once you know
that this small core is secure, everything else is guaranteed to be secure
too. There is no need to change the interpreter or that small set of APIs
often, so it is much easier to protect such a small core than an entire
system. This is similar to JavaScript: investment in JS-engine security
makes it unnecessary to hack-proof every single JS library.
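
To make the "small trusted core" idea concrete, here is a minimal sketch in
Python. It is NOT MuranoPL and does not use any Murano API: ToyInterpreter,
TRUSTED_API and BudgetExceeded are hypothetical names invented for this
illustration. It only shows that when user programs are pure data run by an
interpreter, the operator can whitelist operations and cap the step count in
one place.

```python
class BudgetExceeded(Exception):
    """Raised when a program uses more steps than the operator allows."""


class ToyInterpreter:
    """Hypothetical toy interpreter: programs are lists of (name, args)."""

    def __init__(self, api, max_steps):
        self.api = dict(api)        # the only code a user program can reach
        self.max_steps = max_steps  # metering: hard cap on executed steps
        self.steps = 0

    def run(self, program):
        results = []
        for name, args in program:
            self.steps += 1
            if self.steps > self.max_steps:
                raise BudgetExceeded(f"step limit of {self.max_steps} exceeded")
            if name not in self.api:
                # No path to files, sockets or imports unless whitelisted.
                raise PermissionError(f"operation {name!r} is not exposed")
            results.append(self.api[name](*args))
        return results


# The small core that has to be audited (assumed operations, for illustration).
TRUSTED_API = {
    "add": lambda a, b: a + b,
    "upper": lambda s: s.upper(),
}

interp = ToyInterpreter(TRUSTED_API, max_steps=3)
print(interp.run([("add", (1, 2)), ("upper", ("vm",))]))  # prints [3, 'VM']
```

A program that names an operation outside TRUSTED_API, or that runs past
max_steps, is rejected by the interpreter itself, so only the interpreter
and the whitelisted functions need a security review.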

>
>
>  "If only there were some sort of language for defining OpenStack
>> services that could be safely executed by users..."  - we created such
>> language.
>>
>
> And we found we already had one, that we successfully re-used for the
> purpose :)
>
>
>  It is exactly what you wanted to have then, and now that it
>> exists you say it is useless. How ironic.
>>
>
> I'm not saying it's useless; I'm saying that it's *too* powerful for the
> job. It's clear that you guys have done amazing work and I don't want to
> criticise it, especially after having become aware of it only very late,
> but I also have serious reservations as to whether this is the right
> solution for OpenStack.
>
>
>  It may be conceited to
>> assume that MuranoPL is 100% safe against every kind of attack that
>> exists. It is not. But I know how to make it so, and I know it is doable
>> because it was designed for that. MuranoPL code is completely isolated from
>> underlying operating system and its resources (sockets, files etc.). It
>> is not translated to Python or any other language and thus cannot be a
>> subject to code injection attacks. It is 100% interpreted and absolutely
>> everything about the interpreter can be customized and controlled. You
>> can meter everything, have ACL for every single aspect, you can limit
>> MuranoPL execution time, number of executed instructions. Almost
>> anything is possible. With Python even if you make it somehow 100%
>> secure it would be hard to convince people it is. But there is no such
>> problem with YAML DSL :)
>>
>
> This is all good news. I still worry that you have a very large footprint
> to secure.
>
>
>  The only thing I cannot agree with (but I hope you prove me wrong) is
>> that HOT is a solution for the problem. Surely you can do a lot of
>> things in HOT. But there is an even bigger list of things that you can't.
>> Otherwise why are Heat resources written in Python and not in HOT?
>>
>
> Because there have to be some base types that you can use as building
> blocks. In the case of Heat, those base types are the set of things that
> you can create by talking to OpenStack APIs authenticated with the user's
> token. In the case of Mistral, I would expect it to be the set of actions
> that you can take by talking to OpenStack APIs authenticated with the
> user's token. And in the case of Murano, I would expect it to be the union
> of those two.
>
>
>  Can you implement auto-scaling or load-balancer using HOT alone without
>> Python code?
>>
>
> We can and do implement the load-balancer using a Heat template :)
>
> You could certainly implement autoscaling yourself, but you'd have to spin
> up a Compute server to run it on as part of the template. Which is sort of
> my point.
>
>
>  And what if the application author (the guy who has absolutely
>> no control over the cloud where it is going to be deployed, and is not
>> even aware of its existence) needs his app to be autoscaled using
>> Microsoft stack technologies?
>>
>
> I'm not sure what this means, and I feel like we're getting off-topic.
>
>
>  What if cloud operator wants  to have additional control
>> on scaling strategy so that it can be bound to his billing system?
>>
>
> The operator can install whatever plugins they want.

They can, but that is a bad solution. The reason is that plugins can
introduce additional resource types but cannot modify existing code.
Most of the time cloud operators need to customize the logic of existing
resources for their needs rather than rewrite it from scratch. And they want
their changes to be transparent to end-users. Imagine that a cloud operator
needs to get permission from his proprietary quota management system for
each VM spawned. If he created a custom MyInstance resource type,
end-users could bypass it by using the standard Instance resource rather
than the custom one. Patching existing Python code is not good either,
because then the operator needs to maintain his private fork of Heat and has
trouble with CD, upgrades to newer versions etc.
Besides, the plugin system is not secure because plugins run with the
privileges of the Heat engine, and while I may trust Heat developers (the
community), I do not necessarily trust a 3rd-party proprietary plugin.

>
>
>  What
>> if he wants auto-scaling to be based on input from his existing
>> Nagios infrastructure rather than Ceilometer?
>>
>
> This is supported already in autoscaling. Ceilometer just hits a URL for
> an alarm, but you don't have to configure it this way. Anything can hit the
> URL.
>
> And this is a good example for our general approach - we provide a way
> that works using built-in OpenStack services and a hook that allows you to
> customise it with your own service, running on your own machine (whether
> that be an actual machine or an OpenStack Compute server). What we *don't*
> do is provide a way to upload your own code that we then execute for you as
> some sort of secondary Compute service.

1. Anything can hit the URL, but it is the auto-scaling resource that creates
Ceilometer alarms. What should I do to make it create Nagios alarms, for
example?
2. Your approach has its pros and cons. I do acknowledge and respect the
strengths of such a decision. But it has its limitations.

>
>
>  One can imagine any number
>> of use cases that just cannot be addressed by HOT without HOT becoming
>> another MuranoPL. So the only option you are left with for such cases is
>> writing a custom Python plugin. Now I don't want to reimplement the whole
>> auto-scaling logic in Python just because I need some customization. But
>> if I just patch existing Heat code then I would have to maintain my
>> custom Heat fork and have all sorts of problems when the next version of
>> OpenStack arrives. There is a risk of introducing security breaches with
>> my new code that would be executed with root account permissions.
>>
>
> I certainly hope *nobody* is running as root.

Even if it is not exactly the root account, it doesn't really matter,
because Heat can be installed on a dedicated server. Python plugins may in
theory log private data, ignore tenant isolation, modify the database in the
wrong manner, open network connections, be used for DoS attacks and much
more without a real root account.

>
>
>  The
>> changes you would make to Heat would affect all tenants and may break
>> existing resources. And making an application depend on the presence of
>> some custom plugin makes it not portable across different clouds.
>> My point here is that HOT is not sufficient for coding new resource
>> types that are not a superposition of existing resources. How can you do
>> that with a language that cannot even do the simplest arithmetic
>> operations?
>>
>
> Everything is a combination of existing resources, because the set of
> existing resources is the set of things which the operator provides
> as-a-Service. The set of things that the operator provides as a service
> plus the set of things that you can implement yourself on your own server
> (virtual or not) covers the entire universe of things. What you appear to
> be suggesting is that OpenStack must provide *Everything*-as-a-Service by
> allowing users to write their own services and have the operator execute
> them as-a-Service. This would be a breathtakingly ambitious undertaking,
> and I don't mean that in a good way.
>

1. By existing resources I mean resource types that are available in Heat.
If I need to talk to Marconi during deployment but there is no Marconi
plugin available yet in my Heat installation, or I need to use the latest
feature introduced by yesterday's commit to Nova, I'm in trouble.

2. If you can implement something on user-land resources you can do the
same with Murano. Murano does not force you to do it on the server side.

3. Not everything can be done from VMs. There are use cases where you need
to access the cloud operator's proprietary services and even hardware
components; these operations either cannot be done from user-land or need
to be done prior to VM spawn.
You can argue that if doing something from a VM is not secure, then why is
it secure to do it on the server? The reason is that on the server all
operations can be controlled and scoped by the cloud operator. For example,
the operator may say that service XYZ can be accessed only from MuranoPL
classes signed by the operator. Others can inherit and use those classes but
can talk to that service only in the way the cloud operator made possible.
This cannot always be enforced on the VM side.
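
The "classes signed by the operator" idea can be sketched roughly as below.
This is only an illustration in Python under my own assumptions, not how
MuranoPL implements it: ClassDef, sign, call_service_xyz and OPERATOR_KEY are
hypothetical names, and an HMAC over the class source stands in for whatever
signing scheme an operator would really use.

```python
import hashlib
import hmac

OPERATOR_KEY = b"operator-secret"  # hypothetical key known only to the operator


def sign(source):
    """Return the operator's signature (hex HMAC-SHA256) for class source text."""
    return hmac.new(OPERATOR_KEY, source.encode("utf-8"), hashlib.sha256).hexdigest()


class ClassDef:
    """Toy stand-in for a DSL class: name, source, optional parent and signature."""

    def __init__(self, name, source, parent=None, signature=None):
        self.name = name
        self.source = source
        self.parent = parent
        self.signature = signature

    def is_operator_signed(self):
        return self.signature is not None and hmac.compare_digest(
            self.signature, sign(self.source))

    def ancestry(self):
        """Return this class followed by all of its ancestors."""
        chain, node = [], self
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain


def call_service_xyz(caller, defined_in):
    """Allow the call only from a method defined in an operator-signed class
    that the calling class is or inherits from."""
    if not any(defined_in is cls for cls in caller.ancestry()):
        raise PermissionError(f"{caller.name} does not inherit {defined_in.name}")
    if not defined_in.is_operator_signed():
        raise PermissionError(f"{defined_in.name} is not operator-signed")
    return f"XYZ called for {caller.name} via {defined_in.name}"


base_src = "class QuotaClient: ..."
base = ClassDef("QuotaClient", base_src, signature=sign(base_src))
app = ClassDef("MyApp", "class MyApp(QuotaClient): ...", parent=base)
print(call_service_xyz(app, base))
```

Here MyApp reaches the service only through the method it inherits from the
signed QuotaClient; a call made from MyApp's own unsigned code is refused on
the server, which is the kind of check that is hard to enforce inside a VM.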

>
>      Which we did:
>>     https://blueprints.launchpad.net/heat/+spec/provider-resource
>>
>>
>>         We want application authors to be able to express application
>>         deployment and maintenance logic of any complexity. This may
>>         involve communication with 3rd party REST services (APIs of
>>         applications being deployed, external services like DNS server
>>         API, application licensing server API, billing systems, some
>>         hardware component APIs etc) and internal OpenStack services like
>>         Trove, Sahara, Marconi and others, including those that are not
>>         incubated yet and those to come in the future. You cannot have
>>         such things in HOT, and when you are required to, you need to
>>         develop a custom resource in Python. Dependence on custom plugins
>>         is not good for Murano because they cannot be uploaded by end
>>         users; thus an end user cannot write an application definition
>>         that can be imported to and run on any cloud, and needs to
>>         convince the cloud administrator to install his Python plugin
>>         (something that is unimaginable in real life).
>>
>>
>>     Shouldn't Mistral be able to do all of those same things too?
>>
>>
>> Yes it should. In an ideal world. Assembly language can do everything
>> Python can do but you do not expect people to use assembly unless they
>> have to. The same is true for Mistral - although it can possibly be
>> improved to do all of the above, you need to be a real genius to write
>> something like auto-scaling in Mistral DSL.
>>
>
> I'm still confused by this autoscaling example. We can all agree that
> clearly you would be insane to write your own autoscaling implementation
> (whatever this means) in any language when your cloud provider already has
> a highly-customisable one like Heat. But if you were to write your own the
> way to do it would be in a programming language of your choice running on a
> server you control. I still don't see how this has anything to do with
> Mistral.
>
> You listed above a list of features that, I submit, are required for both
> Mistral and Murano. You are proposing to implement them twice, rather than
> once.

1. As for autoscaling, last time I checked (it may have been fixed since
then) Heat's LoadBalancer spawned HAProxy on a Fedora VM with a hardcoded
image name and a hardcoded nested stack template. This is not what I would
call a highly-customizable solution. It is hard to imagine an autoscaling
implementation generic enough to work with all possible health-monitoring
systems and load-balancers and be as useful for scaling RabbitMQ, MongoDB
and MS SQL Server clusters as it is for web farms. Surely you can have your
own implementation on user-land resources, but then why did you choose to
make it a Heat resource and not a sample HOT template in an extras
repository? Besides, we both want Heat and Murano to be really useful and
not just Chef/Puppet bootstrappers :)

2. I will do my best to avoid implementing them twice. But the fact that
Mistral is capable of doing something doesn't mean that using Mistral is the
optimal solution for Murano, although it can be. I can send messages
using oslo.messaging or directly via RabbitMQ/QPID and still there is
Marconi. I can install Hadoop myself using Heat/Puppet but still there is
Savanna. I admit there is an overlap between Mistral and Murano, and we are
working with the Mistral team to eliminate it. But Mistral was designed for
other use cases and just cannot help Murano right now (and it may stay that
way if the Mistral developers so decide).

>
>  We are working closely with Mistral team and maybe one day we will be able
>> to lay all MuranoPL tasks on Mistral's shoulders and consume Mistral DSL
>> instead of our own, but this is not going to happen soon.
>>
>
> When the TC said "Murano is slightly too far up the stack at this point to
> meet the "measured progression of openstack as a whole" requirement", IMO
> one of the major things they meant was that you're inventing your own
> workflow thing, leading to duplication of effort between this and Workflow
> as a Service. (And Mistral folks are in turn doing the same thing by not
> using the same workflow library, taskflow, as the rest of OpenStack.)
> Candidly, I'm surprised that this is not the #1 thing on your priority list
> because IMO it's the #1 thing that will delay getting the project incubated.
>
>
>  Mistral DSL is
>> good for other use cases and its developers may not like the idea of
>> having a Turing-complete language in Mistral.
>>
>
> They sound very wise ;)

You know that a de facto standard DSL exists for workflow definitions. It
is called BPEL, and it is Turing-complete and as expressive as MuranoPL.
There are alternative standards like BPMN and YAWL, and AFAICS they are all
Turing-complete. So what makes you feel that the Mistral DSL doesn't need
to be?

>
>      Talking to existing OpenStack services doesn't seem hard - you could
>>     write plugins for those, or save time (and save the user learning
>>     another language) by using python-openstackclient and its syntax.
>>
>>     For everything else you have a small number of generic operations -
>>     e.g. post to a Marconi queue (ReST API calls to untrusted services
>>     are problematic from a security perspective) - and allow the user to
>>     handle them in their own code in a language of their choice running
>>     either on their own machine or on a sandboxed, metered Compute
>>     server, rather than in a custom Turing-complete DSL running
>>     unmetered on the Murano server.
>>
>>
>> This is usually true. Usually, not always. I can easily see how workflow
>> may need to talk to some cloud operator proprietary services that are
>> not accessible from VMs for security reasons. Workflow may need to
>>
>
> I said for everything *else*. For things the operator specifically wants
> you to have access to (e.g. some proprietary service they run), they would
> have the option of installing a native command plugin to handle it. For
> everything *else* you drop a message in a Marconi queue and let the user
> handle it.
>
>
>  acquire some licenses before instantiating VMs. It may require to talk
>>
>
> Where is the license server running? Can you stick a process on it to
> listen for Marconi messages and generate license requests to the actual
> server?

I believe that anything is possible in real life. It can be on the
application vendor's site, in the organization's license pool, or on another
cloud. Sometimes you can, sometimes you cannot.
BTW, how does Heat handle software for which the user has to explicitly
agree to the license terms before it can be deployed?

>
>  to cloud operator's billing system or quotas management system or
>> identity management system. There are many things that cannot be done
>> from VMs or need to be done before VM is spawned.
>>
>> Calling external REST APIs is secure as long as you control the process
>> (can limit amount of traffic, request rate, timeouts, ACLs etc). This is
>> possible with MuranoPL.
>> And once again - everything that is done on server-side can be metered
>> and constrained.
>>
>
> It's only secure in a very narrow sense. You're effectively allowing
> people to use your servers to make any request they like, and that's going
> to be highly prone to confused-deputy-type problems. The only completely
> safe URLs to fetch are ones that are in the Keystone catalog, passing the
> token obtained from the user. (We break this rule in Heat BTW, and we
> should stop. It's hard once you have users though, and even harder when we
> can't depend on Marconi yet.)
>

You break it in Heat for good reason. It can be avoided there, but I'm not
sure this will make Heat users happier. My point is that it may not be as
good as one would want, but there are cases when it is worth it or even
absolutely necessary. Once again, I am not saying that this is the right way
to do it in Murano. It is just possible, under many restrictions and tight
control by the cloud provider, for the cases where you really need it.
Nothing more.

>
> cheers,
> Zane.
>
>          Because DSL is a way to write custom resources (in Heat's
>>         terminology) it has to be Turing-complete and have all the
>>         characteristics of a general-purpose language. It also has to have
>>         domain-specific features because we cannot expect that DSL users
>>         would be as skilled as Heat developers and could write such
>>         resources without knowledge of the hosting engine's architecture
>>         and internals.
>>
>>         HOT DSL is declarative because all the imperative stuff is
>>         hardcoded into the Heat engine. Thus all that is left for HOT is
>>         to define the "state of the world" - the desired outcome. That is
>>         analogous to the Object Model in Murano (see [1]). It is the
>>         Object Model that can be compared to HOT, not the DSL. As you can
>>         see, it is not more complex than HOT. The Object Model is what the
>>         end-user produces in Murano. And he doesn't even need to write it,
>>         because it can be composed in the UI.
>>
>>         Now because DSL provides not only a way to write sandboxed
>>         isolated code but also a lot of declarations (classes, properties,
>>         parameters, inheritance and contracts) that are mostly not present
>>         in Python, we don't need Parameters or Output sections in the
>>         Object Model because all of this can be inferred from the resource
>>         (class) DSL declarations. Another consequence is that most of the
>>         things that can be written wrong in HOT can be verified on the
>>         client side by validating classes' contracts, without trying to
>>         deploy the stack and then debugging through the error log.
>>
>>
>>     This is being worked on:
>>     https://blueprints.launchpad.net/heat/+spec/param-constraints
>>
>>
>> I agree that contracts can become part of HOT. Note that MuranoPL
>> contracts are orders of magnitude more powerful than parameter
>> constraints. This doesn't necessarily mean that HOT needs such powerful
>> capabilities. I'll be happy to collaborate on this in another
>> thread/place.
>>
>>
>>
>>         Because all resources' attribute types and their constraints are
>>         known in advance (note that a resource attribute may be a
>>         reference to another resource with constraints on that reference
>>         like "I want any (regular, Galera etc) MySQL implementation") the
>>         UI knows how to correctly compose the environment and can point
>>         out your mistakes at design time. This is similar to how
>>         statically typed languages like C++/Java can do a lot of
>>         validation at compile time rather than at runtime as in Python.
>>
>>         Personally I would love to see many of these features in HOT. What
>>         is your vision on this? Which of the things mentioned above can be
>>         contributed to Heat? We definitely would like to integrate more
>>         with HOT and eliminate all duplication between the projects. I
>>         think that Murano and Heat are complementary products that can
>>         effectively coexist. Murano provides access to all HOT features
>>         and relies on Heat for most of its activities. I believe that we
>>         need to find an optimal way to integrate Heat, Murano, Mistral,
>>         Solum, Heater, TOSCA, do some integration between ex-Thermal and
>>         Murano Dashboard, be united regarding Glance usage for metadata
>>         and so on.
>>
>>
>>     +1
>>
>>     To me that implies that Murano should be a relatively thin wrapper
>>     that ties together HOT and Mistral's DSL.
>>
>>         We are okay with throwing MuranoPL out if the issues
>>         it solves would be addressed by HOT.
>>
>>         If you have a vision of how HOT can address the same domain that
>>         MuranoPL does, or any plans for such features in upcoming Heat
>>         releases, I would ask you to share it.
>>
>>         [1]
>>         https://wiki.openstack.org/wiki/Murano/DSL/Blueprint#Object_model
>>
>>
>

-- 
Sincerely yours
Stanislav (Stan) Lagun
Senior Developer
Mirantis
35b/3, Vorontsovskaya St.
Moscow, Russia
Skype: stanlagun
www.mirantis.com
slagun at mirantis.com

[openstack-dev] [Murano][Heat] MuranoPL questions?
