[openstack-dev] [Mistral] Crack at a "Real life" workflow

Joshua Harlow harlowja at yahoo-inc.com
Sat Mar 8 17:05:37 UTC 2014


Thanks for the detailed explanation!

I did something similar to what Google did for Python a long time ago for PHP (when Rasmus, the PHP creator, worked at Y!), so I know it's not simple. The part of me that likes doing languages thinks all this MuranoPL stuff is neat. But at the same time it worries me, since providing users with an avenue to run code (or a DSL that has similar concepts) scares me :P Especially if it's running in Python. I noted you don't provide access to lower-level Python functions (besides the green thread one), so that's great. It will just be a tightrope walk to keep that DSL free of issues when it enables all those features.

Are there any pointers to the execution control that it is doing? Is there an external process time-limiting the user code that is running (or is it ulimits)? With raw loop access, as it appears the DSL provides, it's not exactly hard to lock up a whole process (which will lock up any other green threads too)... How can you calculate ahead of time how much memory an object will use (Python doesn't exactly provide this, afaik) in order to impose memory limits? Are deployers of Murano expected to isolate their Murano deployments to avoid security problems (and resource contention)?

As for Lua and others (I also know the JavaScript engine built into Java, since some of this was used for something similar at Y!), those languages have been designed to run in very minimal modes (no access to sockets or files by default...) and their runtimes have been built from the ground up to have the features needed to control execution time, memory and resource access, so it might not really be as hard there as you would think (although it's still not easy, of course). Python, afaik, was never really built to have these features, to run in very minimal modes without access to things like sockets or files... (although Google apparently hacked enough of them in for usage in App Engine).

Anyways maybe we should/can continue this on another thread :-)

Sent from my really tiny device...

On Mar 8, 2014, at 2:35 AM, "Stan Lagun" <slagun at mirantis.com> wrote:

This may not be the proper thread to discuss MuranoPL, but since I suggested that Mistral borrow some parts of it, let me answer here.

Q: Why not use some existing language?

A:
 1. There are not many languages that can be embedded and securely used to run code in shared environments. And it cannot be Python itself (unless you're Google :) ). So most of the Murano audience would have to learn a new language anyway (because they typically use Python and shell for their job, and we cannot use either of them).

 2. MuranoPL is a DSL (Domain-Specific Language), not a general-purpose language. The specific part about MuranoPL is object composition. Let me explain.
In regular languages like Python you control the entire application composition: it is your program that instantiates objects and assigns those objects to attributes of other objects. You know (typically from documentation) what object type/value is expected in a particular attribute or method argument. It is you who decides which 3rd-party libraries to use in your project and maps your requests to their APIs. You can validate your input/attributes/parameters at any time, and most of the time you don't even need to do this, as it is you who is the caller and you trust your code. Besides, you can rely on your unit tests.

This is everything Murano is not. In Murano, the AppCatalog is a (hopefully) huge collection of classes written by different people (there is no central app that glues everything together). It is the end user who does object composition (binds objects together) and sets their attributes in the UI dashboard. MuranoPL code is not running at that moment and thus has no control over what is done in the dashboard. That's why MuranoPL has contracts that are much stricter than Java's static typing. With contracts you validate even the most complex data structures, value types and constraints. Contracts are how you know that one object expects another instance of a particular type. That allows constructing a valid object model in the UI without DSL code execution for every single UI click. It also allows dynamic construction of UI forms: you know in advance what type of value is expected for each attribute and how to validate it.
This is just one (but not the only) example of a domain-specific feature that you will not have in Lua or JavaScript. Mistral would probably have other unique domain-specific features.

 3. In MuranoPL everything can be customized. You control the list of global/built-in functions and can have your own customized versions of them. You can have your own domain-specific operators in expressions or modify the standard ones (you can even make 2+2=5). You control all the program APIs, so that no I/O is possible except what is explicitly provided to the DSL. You control the heap: you can hibernate the entire program at any time and resume it on another server from that place. You can have your own block constructs. You would not have such a level of control with any other language.

 4. Domain-specific is usually better than general-purpose.
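
To make the contract idea from point 2 more concrete, here is a minimal sketch in plain Python. It is not MuranoPL and not Murano's actual contract engine: the contract notation, the check() function and the sample instance_contract are all hypothetical, invented for illustration. The only thing taken from the text above is the idea that a declarative contract can validate nested data structures, value types and constraints before any DSL code runs.

```python
# Hypothetical contract notation, invented for this sketch only:
#   a type             -> value must be an instance of that type
#   a predicate        -> value must satisfy the callable
#   a tuple            -> every contract in the tuple must hold
#   a dict             -> value must be a mapping with exactly these keys
#   a one-element list -> value must be a list whose items match the element


def check(contract, value, path="$"):
    """Return a list of violations; an empty list means the value is valid."""
    if isinstance(contract, tuple):
        # Stop at the first failing part so predicates only see valid types.
        for part in contract:
            errors = check(part, value, path)
            if errors:
                return errors
        return []
    if isinstance(contract, dict):
        if not isinstance(value, dict):
            return [f"{path}: expected mapping"]
        errors = []
        for key, sub in contract.items():
            if key not in value:
                errors.append(f"{path}.{key}: missing")
            else:
                errors.extend(check(sub, value[key], f"{path}.{key}"))
        for key in value:
            if key not in contract:
                errors.append(f"{path}.{key}: unexpected")
        return errors
    if isinstance(contract, list):
        if not isinstance(value, list):
            return [f"{path}: expected list"]
        errors = []
        for i, item in enumerate(value):
            errors.extend(check(contract[0], item, f"{path}[{i}]"))
        return errors
    if isinstance(contract, type):
        # bool is a subclass of int in Python, so reject it for non-bool types.
        is_stray_bool = isinstance(value, bool) and contract is not bool
        if isinstance(value, contract) and not is_stray_bool:
            return []
        return [f"{path}: expected {contract.__name__}"]
    if callable(contract):
        return [] if contract(value) else [f"{path}: constraint failed"]
    raise TypeError(f"unsupported contract at {path}: {contract!r}")


# A made-up contract for an "instance"-like object and two candidate models,
# as a UI might assemble them from user input.
instance_contract = {
    "name": str,
    "flavor": str,
    "cpus": (int, lambda n: 1 <= n <= 64),
    "networks": [{"id": str}],
}
good = {"name": "web-1", "flavor": "m1.small", "cpus": 2,
        "networks": [{"id": "net-a"}]}
bad = {"name": "web-1", "flavor": 7, "cpus": 128,
       "networks": [{"id": 5}, "oops"]}

print(check(instance_contract, good))
print(check(instance_contract, bad))
```

Because the contract is plain data, a dashboard could in principle walk the same structure to decide which form fields to render, which is the point made above about building UI forms without executing DSL code.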


Q: Why YAML?

A:
 1. There are 2 possible alternatives: use some existing data format (XML, JSON, YAML) or create your own. And your own format for your own language means your own parser. Writing a parser for a language of that complexity is a difficult task that would take months of work. Of the standard formats, YAML is clearly the most readable one.

 2. There are a lot of XML languages. There is BPEL, an XML language for workflows. So there is nothing unusual in having a programming language encoded in a data serialization format. So why not have YAML for the same purpose, considering it is more readable than XML?

 3. MuranoPL is not only code but also declarations - classes, properties, inheritance, methods, contracts etc. YAML is really good for declarations

 4. As for the code, there is not much difference from having your own parser. Sure, you need to prepend a dash (-) to each code line, but in Java/C you append a semicolon to each line and everyone is okay with that. In Python you write var = value, in MuranoPL $var: value. You use a colon for block constructs in Python and in MuranoPL. As you can see, there are fewer differences from normal Python code than one might expect from not using a custom parser. Once you start writing in MuranoPL you soon forget that this is YAML and treat it like a normal language.

 5. Because of YAML, all the JSON/YAML data structures that look so nice and readable in YAML become first-class citizens in MuranoPL. You can easily embed a Heat template or a REST API JSON body in DSL code. This is even better than in Python.

 6. Anyone can take a MuranoPL class and see what properties it exposes and what methods it has, as all you need is a standard YAML parser.

 7. YAML is still a data serialization format. So you can work with DSL code as with regular data: store it in a database, insert code lines and methods, convert it to other formats etc.

 8. You can customize many things at the YAML parser level before the code reaches the DSL. The DSL does not deal directly with YAML, only with the deserialized version of it. Thus you can control where those YAMLs are located. You can store them in Glance, a database or the file system, or generate them on the fly without the DSL even noticing.

 9. With YAML you have less to explain (people already know it) and more tooling and IDE support.
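
Points 6 and 7 can be illustrated with a small Python sketch. It assumes the code has already been deserialized, so it works on plain lists and dicts and uses only the standard json module. The method_body contents and both helper functions are hypothetical; the only syntax taken from the text above is the `$var: value` assignment and the one-dash-per-line list form.

```python
import json

# Deserialized form of a tiny, made-up method body: a list of one-key
# mappings, which is what a YAML parser returns for lines such as
# "- $count: 3". The "Return" statement name is an assumption.
method_body = [
    {"$count": 3},
    {"$name": "web"},
    {"Return": "$name"},
]


def insert_statement(body, index, statement):
    """Return a new body with `statement` inserted at `index`."""
    return body[:index] + [statement] + body[index:]


def assigned_variables(body):
    """List the variables a body assigns, using plain data inspection."""
    return [key for stmt in body for key in stmt if key.startswith("$")]


# Insert a code line without any parser or AST library.
patched = insert_statement(method_body, 2, {"$zone": "nova"})

# Convert to another format and back, e.g. for storage in a database.
as_json = json.dumps(patched)
restored = json.loads(as_json)

print(assigned_variables(restored))
```

No language-specific tooling is involved here: inspecting, patching and storing the code uses the same operations as any other structured data.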


Q: Wouldn't a DSL of that power be subject to DoS attacks, resource starvation attacks etc.?

A: This is a really good question, because nobody has asked it yet :) And the answer is NO. Here is why:

With MuranoPL you have complete control over everything. You can have a time quota for DSL code (remember the 30-second quota for PHP scripts in a typical web setup?). Or you can limit DSL scripts to some reasonable number of executed instructions. Or count raw CPU time for the script. You can make all loop constructs have an upper limit on iterations. You can do the same with YAQL functions. You can enforce timeouts on all I/O operations. MuranoPL uses green threads, and you can have a lot of them (unlike regular threads), and it is possible to limit the total number of green threads used for DSL execution. It is even possible to limit memory usage. DSL code cannot allocate operating-system-level resources like file handles or TCP sockets. And there is a garbage collector, as the DSL is interpreted on top of Python.

How easy would it be to do the same with JavaScript/Lua/whatever?
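
Two of the limits described above (a cap on executed instructions and an upper bound on loop iterations) can be sketched in a few lines of Python. This is a toy interpreter written for illustration, not MuranoPL's implementation: the statement format ({"op": ...}), the Interpreter class and the QuotaExceeded exception are all hypothetical. Time quotas, memory limits and I/O timeouts are not covered here.

```python
class QuotaExceeded(Exception):
    """Raised when a script uses more than its allotted budget."""


class Interpreter:
    def __init__(self, max_steps=1000, max_loop_iterations=100):
        self.max_steps = max_steps
        self.max_loop_iterations = max_loop_iterations
        self.steps = 0
        self.variables = {}

    def _tick(self):
        # Every executed statement costs one step, including nested ones.
        self.steps += 1
        if self.steps > self.max_steps:
            raise QuotaExceeded(f"more than {self.max_steps} statements executed")

    def run(self, body):
        for statement in body:
            self._tick()
            op = statement["op"]
            if op == "set":
                self.variables[statement["name"]] = statement["value"]
            elif op == "add":
                self.variables[statement["name"]] += statement["value"]
            elif op == "repeat":
                times = statement["times"]
                # The loop bound is checked before the loop starts.
                if times > self.max_loop_iterations:
                    raise QuotaExceeded(f"loop of {times} iterations not allowed")
                for _ in range(times):
                    self.run(statement["body"])
            else:
                raise ValueError(f"unknown op: {op!r}")
        return self.variables


ok_program = [
    {"op": "set", "name": "total", "value": 0},
    {"op": "repeat", "times": 5,
     "body": [{"op": "add", "name": "total", "value": 2}]},
]

# Each loop is within bounds, but the nested loops together exceed the budget.
greedy_program = [
    {"op": "set", "name": "total", "value": 0},
    {"op": "repeat", "times": 100, "body": [
        {"op": "repeat", "times": 100,
         "body": [{"op": "add", "name": "total", "value": 1}]},
    ]},
]

print(Interpreter().run(ok_program))
try:
    Interpreter().run(greedy_program)
except QuotaExceeded as exc:
    print("stopped:", exc)
```

The sketch works only because the interpreter owns every loop and every statement dispatch, so there is a single place to count. Code running directly on the host interpreter gives no such hook, which is the concern raised earlier in the thread about running arbitrary Python.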







On Sat, Mar 8, 2014 at 7:05 AM, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
What are the benefits of MuranoPL over an already established language?

I noted the following and don't quite understand why they are needed (reinventing a language):

- https://wiki.openstack.org/wiki/Murano/DSL/Blueprint#Extends
- https://wiki.openstack.org/wiki/Murano/DSL/Blueprint#Block_constructs
Q: Where are those looping constructs executed? How hard is it to DoS Murano servers (by submitting jobs that loop forever)? What execution limits are imposed? I noted that the parallel construct actually exposes the number of green threads (isn't this an avenue for resource starvation?).
- https://wiki.openstack.org/wiki/Murano/DSL/Blueprint#Object_model

IMHO, something just doesn't seem right when the above is created; fitting a language into YAML seems about as awkward as creating a language in XML (XQuery [1], for example). Why was this really preferred over just Python or something simpler, for example Lua or JavaScript, that already has language/object constructs built in and has runtimes whose security domain you can control? (Python is not a good choice to run arbitrary code in; find out how much energy Google put into Python + App Engine and you'll see what it takes.)

[1] http://en.wikipedia.org/wiki/XQuery#Examples

From: Stan Lagun <slagun at mirantis.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Friday, March 7, 2014 at 9:36 AM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Mistral] Crack at a "Real life" workflow

Hello everyone!

Actually it is possible to construct a YAML-based DSL that has all the constructs of a regular OOP language like Python and at the same time is safe enough to be used for executing untrusted code on a shared server.

Take a look at Murano DSL.
For example, this code defines the class "Instance": https://github.com/istalker2/MuranoDsl/blob/master/meta/com.mirantis.murano.services.Instance/manifest.yaml
The part that may be useful for Mistral is under Workflow key.
Here is some doc on the language: https://wiki.openstack.org/wiki/Murano/DSL/Blueprint
Technically you can code any workflow that you can in Python using such a language (just don't look at all the OOP-related stuff), and it will look very similar to Python but be safe, as you can only call APIs that are explicitly provided for the DSL.

Hope this might be helpful for Mistral



On Fri, Mar 7, 2014 at 10:38 AM, Dmitri Zimine <dz at stackstorm.com> wrote:
I just moved the sample to Git; let's leverage git review for specific comments on the syntax.

https://github.com/dzimine/mistral-workflows/commit/d8c4a8c845e9ca49f6ea94362cef60489f2a46a3

DZ>

On Mar 6, 2014, at 10:36 PM, Dmitri Zimine <dz at stackstorm.com> wrote:

Folks, thanks for the input!

@Joe:

Hopefully Renat covered the differences. Yet I am interested in how the same workflow can be expressed as Salt state(s) or Ansible playbooks. Can you (or someone else who knows them well) take a stab?


@Joshua
I am still new to Mistral and learning, but I think it _is_ relevant to taskflow. Should we meet, and you help me catch up? Thanks!

@Sandy:
Aaahr, I used the "D" word?!  :) I keep arguing that a YAML workflow representation doesn't make a DSL.

And YES to the object model first to define the workflow, with YAML/JSON/PythonDSL/what-else as a syntax to build it. We are having these discussions on another thread and reviews.

> Basically, in order to make a grammar expressive enough to work across a
> web interface, we essentially end up writing a crappy language. Instead,
> we should focus on the callback hooks to something higher level to deal
> with these issues. Mistral should just say "I'm done this task, what
> should I do next?" and the callback service can make decisions on where
> in the graph to go next.

There must be some misunderstanding. Mistral _does_ follow the AWS / BPEL engines approach: it does both "I'm done this task, what should I do next?" (executor) and "callback service" (engine that coordinates the flow and keeps the state), like the decider and activity workers in AWS Simple Workflow.

Engine maintains the state. Executors run tasks. Object model describes workflow as a graph of tasks with transitions, conditions, etc. YAML is one way to define a workflow. Nothing controversial :)
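
The split described above can be sketched in plain Python. This is not Mistral's code or API: the Task and Engine classes, the execute() function and the three sample tasks are hypothetical, and everything runs in-process, whereas a real engine and its executors would be separate services. The sketch only shows the division of labour: the object model is a graph of tasks with transitions, the executor runs one task and reports back, and the engine keeps the state and decides what comes next.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    """One node of the workflow graph, with its outgoing transitions."""
    name: str
    action: Callable[[dict], None]
    on_success: Optional[str] = None
    on_error: Optional[str] = None


def execute(task, context):
    """Executor: run a single task and report the outcome; keeps no state."""
    try:
        task.action(context)
        return "success"
    except Exception:
        return "error"


@dataclass
class Engine:
    """Engine: owns the workflow state and picks the next task."""
    tasks: Dict[str, Task]
    history: List[tuple] = field(default_factory=list)

    def run(self, start, context):
        # Note: a cyclic graph would loop forever; a real engine needs limits.
        current = start
        while current is not None:
            task = self.tasks[current]
            outcome = execute(task, context)  # "I'm done, what next?"
            self.history.append((task.name, outcome))
            current = task.on_success if outcome == "success" else task.on_error
        return self.history


# Made-up actions loosely modelled on a node-maintenance workflow.
def disable(ctx):
    ctx["disabled"] = True


def migrate(ctx):
    if not ctx.get("target"):
        raise RuntimeError("no target host")
    ctx["migrated"] = True


def notify(ctx):
    ctx["notified"] = True


workflow = {
    "disable": Task("disable", disable, on_success="migrate", on_error="notify"),
    "migrate": Task("migrate", migrate, on_error="notify"),
    "notify": Task("notify", notify),
}

print(Engine(workflow).run("disable", {"target": "node-2"}))
print(Engine(workflow).run("disable", {}))
```

The workflow dict is the object model; YAML, JSON or a Python DSL would just be different syntaxes for producing it.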

@all:

Whether one writes Python code or uses YAML depends on the user. There are good arguments for YAML. But if it's crappy, it loses. We want to see how it feels to write it. To me, mixed feelings so far, but promising. What do you guys think?

Comments welcome here:
https://github.com/dzimine/mistral-workflows/commit/d8c4a8c845e9ca49f6ea94362cef60489f2a46a3


DZ>


On Mar 6, 2014, at 10:41 AM, Sandy Walsh <sandy.walsh at rackspace.com> wrote:



On 03/06/2014 02:16 PM, Renat Akhmerov wrote:
IMO, it looks not bad (sorry, I’m biased too) even now. Keep in mind this is not the final version, we keep making it more expressive and concise.

As for killer object model it’s not 100% clear what you mean. As always, devil in the details. This is a web service with all the consequences. I assume what you call “object model” here is nothing else but a python binding for the web service which we’re also working on. Custom python logic you mentioned will also be possible to easily integrate. Like I said, it’s still a pilot stage of the project.

Yeah, the REST aspect is where the "tricky" part comes in :)

Basically, in order to make a grammar expressive enough to work across a
web interface, we essentially end up writing a crappy language. Instead,
we should focus on the callback hooks to something higher level to deal
with these issues. Mistral should just say "I'm done this task, what
should I do next?" and the callback service can make decisions on where
in the graph to go next.

Likewise with things like sending emails from the backend. Mistral
should just call a webhook and let the receiver deal with "active
states" as they choose.

Which is why modelling this stuff in code is usually always better and
why I'd lean towards the TaskFlow approach to the problem. They're
tackling this from a library perspective first and then (possibly)
turning it into a service. Just seems like a better fit. It's also the
approach taken by Amazon Simple Workflow and many BPEL engines.

-S


Renat Akhmerov
@ Mirantis Inc.



On 06 Mar 2014, at 22:26, Joshua Harlow <harlowja at yahoo-inc.com> wrote:

That sounds a little similar to what taskflow is trying to do (I am of course biased).

I agree with letting the native language implement the basics (expressions, assignment...) and then building the "domain" on top of that. Just seems more natural IMHO, and is similar to what LINQ (in C#) has done.

My 3 cents.

Sent from my really tiny device...

On Mar 6, 2014, at 5:33 AM, "Sandy Walsh" <sandy.walsh at RACKSPACE.COM> wrote:

DSLs are tricky beasts. On one hand I like giving a tool to
non-developers so they can do their jobs, but I always cringe when the
DSL reinvents the wheel for basic stuff (compound assignment
expressions, conditionals, etc).

YAML isn't really a DSL per se, in the sense that it has no language
constructs. As compared to a Ruby-based DSL (for example) where you
still have Ruby under the hood for the basic stuff and extensions to the
language for the domain-specific stuff.

Honestly, I'd like to see a killer object model for defining these
workflows as a first step. What would a python-based equivalent of that
real-world workflow look like? Then we can ask ourselves, does the DSL
make this better or worse? Would we need to expose things like email
handlers, or leave that to the general python libraries?

$0.02

-S



On 03/05/2014 10:50 PM, Dmitri Zimine wrote:
Folks,

I took a crack at using our DSL to build a real-world workflow.
Just to see how it feels to write it. And how it compares with
alternative tools.

This one automates a page from OpenStack operation
guide: http://docs.openstack.org/trunk/openstack-ops/content/maintenance.html#planned_maintenance_compute_node

Here it is https://gist.github.com/dzimine/9380941
or here http://paste.openstack.org/show/72741/

I have a bunch of comments, implicit assumptions, and questions which
came to mind while writing it. Want your and other people's opinions on it.

But gist and paste don't let you annotate lines!!! :(

Maybe we can put it on the review board, even with no intention to
check in,  to use for discussion?

Any interest?

DZ>


_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Sincerely yours
Stanislav (Stan) Lagun
Senior Developer
Mirantis
35b/3, Vorontsovskaya St.
Moscow, Russia
Skype: stanlagun
www.mirantis.com
slagun at mirantis.com


