[openstack-dev] [puppet][Fuel] Using Native Ruby Client for Openstack Providers

Sofer Athlan-Guyot sathlang at redhat.com
Tue Oct 20 13:56:59 UTC 2015

Gilles Dubreuil <gilles at redhat.com> writes:

> On 14/10/15 17:15, Gilles Dubreuil wrote:
>> On 14/10/15 10:36, Colleen Murphy wrote:
>>> On Tue, Oct 13, 2015 at 6:13 AM, Vladimir Kuklin <vkuklin at mirantis.com> wrote:
>>>     Puppetmaster and Fuelers,
>>>     Last week I mentioned that I would like to bring the theme of using
>>>     native ruby OpenStack client and use it within the providers.
>>>     Emilien told me that I had already been late and the decision was
>>>     made that puppet-openstack decided to not work with Aviator based on
>>>     [0]. I went through this thread and did not find any unresolvable
>>>     issues with using Aviator in comparison with potential benefits it
>>>     could have brought up.
>>>     What I saw actually was like that:
>>>     * Pros
>>>     1) It is a native ruby client
>>>     2) We can import it in puppet and use all the power of Ruby
>>>     3) We will not need to have a lot of forks/execs for puppet 
>>>     4) You are relying on negotiated and structured output provided by
>>>     API (JSON) instead of introducing workarounds for client output like [1]
>>>     * Cons
>>>     1) Aviator is not actively supported 
>>>     2) Aviator does not track all the upstream OpenStack features while
>>>     native OpenStack client does support them
>>>     3) Ruby community is not really interested in OpenStack (this one is
>>>     arguable, I think)
>>>     * Proposed solution
>>>     While I completely understand all the cons against using Aviator
>>>     right now, I see that Pros above are essential enough to change our
>>>     mind and invest our own resources into creating really good
>>>     OpenStack binding in Ruby.
>>>     Some are saying that there is not so big involvement of Ruby into
>>>     OpenStack. But we are actually working with Puppet/Ruby and are
>>>     involved in the community. So why should we not own this ourselves
>>>     and lead by example here?
>>>     I understand that many of you do already have a lot of things on
>>>     their plate and cannot or would not want to support things like
>>>     additional library when native OpenStack client is working
>>>     reasonably well for you. But if I propose the following scheme to
>>>     get support of native Ruby client for OpenStack:
>>>     1) we (community) have these resources (speaking of the company I am
>>>     working for, we at Mirantis have a set of guys who could be very
>>>     interested in working on Ruby client for OpenStack)
>>>     2) we gradually improve Aviator code base up to the level that it
>>>     eliminates issues that are mentioned in  'Cons' section
>>>     3) we introduce additional set of providers and allow users and
>>>     operators to pick whichever they want
>>>     4) we leave OpenStackClient default one
>>>     Would you support it and allow such code to be merged into upstream
>>>     puppet-openstack modules?
>>>     [0] https://groups.google.com/a/puppetlabs.com/forum/#!searchin/puppet-openstack/aviator$20openstackclient/puppet-openstack/GJwDHNAFVYw/ayN4cdg3EW0J
>>>     [1] https://github.com/openstack/puppet-swift/blob/master/lib/puppet/provider/swift_ring_builder.rb#L21-L86
>>>     -- 
>>>     Yours Faithfully,
>>>     Vladimir Kuklin,
>>>     Fuel Library Tech Lead,
>>>     Mirantis, Inc.
>>>     +7 (495) 640-49-04
>>>     +7 (926) 702-39-68
>>>     Skype kuklinvv
>>>     35bk3, Vorontsovskaya Str.
>>>     Moscow, Russia,
>>>     www.mirantis.com
>>>     www.mirantis.ru
>>>     vkuklin at mirantis.com
>>> The scale-tipping reason we went with python-openstackclient over the
>>> Aviator library was that at the same time we were trying to switch, we
>>> were also developing keystone v3 features and we could only get those
>>> features from python-openstackclient.
>>> For the first two pros you listed, I'm not convinced they're really
>>> pros. Puppet types and providers are actually extremely well-suited to
>>> shelling out to command-line clients. There are existing, documented
>>> puppet APIs to do it and we get automatic debug output with it. Nearly
>>> every existing type and provider does this. It is not well-suited to
>>> call out to other non-standard ruby libraries because they need to be
>>> added as a dependency somehow, and doing this is not well-established in
>>> puppet. There are basically two options to do this:
>>>  1) Add a gem as a package resource and make sure the package resource
>>> is called before any of the openstack resources. I could see this
>>> working as an opt-in thing, but not as a default, for the same reason we
>>> don't require our users to install pip libraries - there are fewer
>>> security guarantees from pypi and rubygems than from distro packages,
>>> plus corporate infrastructure may not allow pulling packages from these
>>> types of sources. (I don't see this policy documented anywhere, this was
>>> just something that's been instilled in me since I started working on
>>> this team. If we want to revisit it, formalize it, or drop it, we can
>>> start a separate thread for that.)
>>> 2) Vendor the gem as a module. This was how we tried it before, and you
>>> can see aimonb/aviator <https://github.com/aimonb/puppet_aviator> for
>>> this. The problems I see with this are that we have to keep the module
>>> in sync with the gem, and there is no way to keep proper attribution in
>>> the module for work that was really done in the gem.
>>> I am not convinced that the fork/execs are really causing that much
>>> performance issues, though I've heard from some people that
>>> python-openstackclient itself is sort of slow.
>> That sounds contradictory.
>> Adding to previous comments on this matter:
>> Using a native Ruby library (A) to call any Openstack service's API:
>> Puppet Agent (Ruby) <=> Net::HTTP (Ruby) <=> Openstack API (HTTP/REST)
>> Is not comparable with a system call to OSC (B):
>> Puppet Agent (Ruby) <=> Spawn (System) <=> OpenStackClient (Python) <=>
>> Openstack service API (Python)
>> In the second case, what we're currently using, the spawn is obviously
>> bad, although having OSC natively calling Openstack APIs is good.
>> As said previously, approach A makes effectively more sense.
>> Especially from an architecture viewpoint, Puppet providers must be
>> dealing with API not CLI.
>> So do we really need to see metrics comparing A vs B?
>> Fuel seems to have a case already anyway.
>> But before we jump on a looking glass, structurally, I think, there is
>> no doubt.
>>> The puppet agent itself
>>> has a lot of overhead that causes it to be slow, this is why puppetlabs
>>> is planning to rewrite it in C++.
>> Besides the need to optimize Puppet agent, C++ is not going to change
>> anything about the need to call Openstack APIs somehow.
>> BTW, if it was just me, I would rather use a functional language of the
>> past and the future, such as Elixir! ;)
>>> For pro #4, the example you gave isn't really valid because the swift
>>> module isn't using openstackclient, and openstackclient has the ability
>>> to output data in CSV or JSON. A valid and related complaint would be
>>> that openstackclient sometimes spits out ugly irrelevant warnings that
>>> we have to parse around, since puppet is terrible at splitting stdout
>>> from stderr.
>>> I think a major issue we had the first time we tried to do this was that
>>> we started doing work to rewrite these things without actually knowing
>>> if it would be better. Instead of using that as a precedent for doing
>>> the same thing again, I would like to take that experience and do this
>>> right by actually gathering data: what is wrong with how we're doing it
>>> now, what numbers and metrics show that things are bad? Can we create a
>>> proof-of-concept that demonstrates that we can really improve these numbers?
>>> If we decide that using a ruby library over the CLI is really a
>>> measurably better approach, I would recommend we get in contact with
>>> Mark Maglana, the author of Aviator, and get the library moved back over
>>> to openstack rather than GitHub, and set up a core reviewer team with
>>> multiple members from multiple companies. We should also work with the
>>> OpenStack-SDK team to see if we can get this moved under their project
>>> or if they have suggestions for us.
>> So if a native Ruby library makes sense, which one? [1]
>> Before contacting Aviator's author(s) to move it or to clone it
>> under the Openstack umbrella (under the name Astronaut!), we must answer
>> the question: Do we really need Aviator?
>> Aviator is a wrapper of Faraday [2], which itself is wrapping Net::HTTP.
>> Faraday is great for API consumers like us because it allows:
>> 1. To use different adapters
>> ----------------------------
>> The default adapter is Net::HTTP.
>> Other adapters, such as Typhoeus, which can perform many requests in
>> parallel, could be an advantage, for example when scaling up.
>> A previously mentioned use case is deploying hundreds of users,
>> projects and roles.
>> Although from the puppet agent, to be able to kick off such parallel
>> creations, we might have to code specifically towards it.
>> Sure, this need might fade away with Keystone Federation and 3rd party
>> identity managers, but it still remains an issue when Keystone is used
>> as the identity service.
>> 2. The middleware paradigm
>> --------------------------
>> Openstack Puppet is a *big* API client consumer.
>> Look here for more details [3]
>> This feature would be very beneficial for Openstack Puppet.
>> So, what are the advantages of Aviator?
>> Besides the Logger, Aviator doesn't seem to be using much of the
>> middleware feature.
>> Sure the Describer makes things look good and I tend to agree with
>> Aviator approach from an OO viewpoint.
>> But that's not enough to hold us with Aviator.
>> Please add here if you find anything else.
>> Don't get me wrong, I think Aviator is a great idea but because of its
>> state and our needs we might just use Faraday and add our own middleware
>> wrappers. Those can simply be on a per puppet module basis.
>> That makes sense from an architecture angle and should give us room for
>> scalability.
>> [1] http://www.slideshare.net/HiroshiNakamura/rubyhttp-clients-comparison
>> [2] https://github.com/lostisland/faraday
>> [3] http://mislav.uniqpath.com/2011/07/faraday-advanced-http/
> I spent a bit of time on this subject because it always made sense to
> me, thanks to Vladimir for bringing it back to the table.
> I actually think we don't really need Faraday at all and we could easily
> go with Net::HTTP only.
> Because, the 2 main advantages I could see from Faraday, are the HTTP
> error propagation and the middleware.
> But Net::HTTP provides those HTTP errors and we don't really need
> advanced middleware features for HTTP response processing.
> Actually the values returned by Openstack APIs are easy to process, from
> JSON structures they get transformed directly in Hashes.

But you can have middleware for requests as well.  One interesting
middleware I can think of is for authentication: having all the
authentication logic wrapped in a single middleware would be nice.
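
To make the idea concrete, here is a rough plain-Ruby sketch of such a
request middleware.  It is not Faraday's actual API, only the same
"wrap the next callable" shape; TokenAuth, the fetch_token callable and
the env hash layout are names I made up for the illustration.  The only
OpenStack-specific part is the X-Auth-Token header.

```ruby
# Plain-Ruby sketch of a request middleware; NOT Faraday's real API.
# TokenAuth, fetch_token and the env hash layout are hypothetical.
class TokenAuth
  attr_reader :fetches

  # app:         next callable in the stack (anything responding to #call)
  # fetch_token: callable that authenticates and returns a token string
  def initialize(app, fetch_token)
    @app = app
    @fetch_token = fetch_token
    @token = nil
    @fetches = 0
  end

  def call(env)
    # Authenticate only on the first request, then reuse the cached token.
    if @token.nil?
      @token = @fetch_token.call
      @fetches += 1
    end
    headers = env.fetch(:headers, {}).merge('X-Auth-Token' => @token)
    @app.call(env.merge(headers: headers))
  end
end

# Stand-in for the HTTP adapter: it just returns the request it was given.
adapter = ->(env) { env }

# Stand-in for the Keystone call; a real one would POST credentials.
stack = TokenAuth.new(adapter, -> { 'dummy-token' })

3.times { |i| stack.call(path: "/v3/projects/#{i}") }
puts stack.fetches
```

The three requests share one token fetch, so the providers would never
have to deal with authentication themselves.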

You can also have different backends.  That could be an interesting way
to offer backward compatibility.  Maybe we could define an openstack cli
backend that would take exactly the same arguments as the current one
(daydreaming, I think ...)

Anyway, I still think that Faraday should be the way to go.  The
argument for fewer dependencies may be compelling, but it can be
discussed.  I think the client would be distributed as a gem and then
packaged anyway, so adding Faraday to the mix wouldn't be that bad.

> Anyway in order to get some of those real data I thought about comparing
> efficiency between the OSC and the native approaches.
> So I tested the execution time of a manifest creating 1000 tenants using
> either approach.
> The results are actually beyond expectation, it's *enormous* (I couldn't
> believe it when I ran it), it's almost 70 times faster:
> - 1000 tenants created using OSC:
> real   7m54.582s
> user   6m27.764s
> sys    0m55.741s
> - 1000 tenants created using native Ruby lib (Net::HTTP):
> real   0m40.079s
> user    0m8.908s
> sys     0m0.799s
> The Ruby native library can scale better!
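
For reference, the ratios behind those timings can be checked with a few
lines of Ruby (the seconds helper is only mine, for the illustration).
Wall-clock time comes out at roughly 12x; the figure close to 70x is the
ratio of the sys times.

```ruby
# Convert the "XmY.YYYs" format printed by `time` into seconds.
def seconds(stamp)
  minutes, secs = stamp.match(/\A(\d+)m([\d.]+)s\z/).captures
  minutes.to_i * 60 + secs.to_f
end

# Timings copied from the two runs quoted above.
osc    = { real: '7m54.582s', user: '6m27.764s', sys: '0m55.741s' }
native = { real: '0m40.079s', user: '0m8.908s',  sys: '0m0.799s' }

# Print how many times slower the OSC run is for each kind of time.
osc.each_key do |kind|
  ratio = seconds(osc[kind]) / seconds(native[kind])
  puts format('%-4s %.1fx', kind, ratio)
end
```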

I think that most of the difference here is because the cli is forced to
do authentication for each request.  I did a simple test on a vm:

   sysdig 'proc.name=openstack' -c proc_exec_time &

   ruby -e 'puts "\nservice list"*200' | openstack --os-username beaker-civ3 --os-password secret --os-project-name servicesv3 --os-user-domain-name service_domain --os-project-domain-name service_domain --os-auth-url --os-identity-api-version 3

gives 7.21s,


    eval `ruby -e 'puts "openstack --os-username beaker-civ3 --os-password secret --os-project-name servicesv3 --os-user-domain-name service_domain --os-project-domain-name service_domain --os-auth-url --os-identity-api-version 3 service list;"*200'`

gave ~800ms*200 = 160s
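
Spelling out the arithmetic for the two forms (the ~800ms per invocation
is an approximate figure, so the ratio is approximate too; the method
name is only for the illustration):

```ruby
# Ratio between N separate CLI invocations and one process fed N commands.
def speedup(commands, per_invocation_s, single_process_s)
  (commands * per_invocation_s) / single_process_s
end

# 200 commands, ~0.8s per separate invocation, 7.21s in a single process.
puts speedup(200, 0.8, 7.21).round(1)
```

That is where a factor of about 22 comes from.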

The main difference is due to keystone authentication.  One way to solve
this is by caching authentication.  I don't think you can pass a token
to the openstack cli but that would be nice.

Another way would be to use the first form and drive the openstack cli
from Ruby with PTY and expect.  It doesn't look very exciting, but it
would improve performance by an order of magnitude (22x in my example)
without changing the interface at all for the user (and without changing
much at all, actually).
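
A minimal sketch of the first form driven from Ruby.  To keep it short
it uses Open3 with a plain pipe rather than PTY/expect, so the output is
only read once the process has exited; reading results command by
command would need the PTY/expect variant.  run_batch is a made-up name,
and the default command line is only a placeholder (credentials and
options would have to be added).

```ruby
require 'open3'

# Run many sub-commands through ONE cli process by feeding them on stdin.
# `cli` is the command line as an array; the default is a placeholder.
def run_batch(commands, cli: ['openstack'])
  stdin_data = commands.map { |c| "#{c}\n" }.join
  output, status = Open3.capture2(*cli, stdin_data: stdin_data)
  raise "#{cli.first} exited with #{status.exitstatus}" unless status.success?
  output
end

# Demo with `cat` standing in for the cli, so it runs without a cloud.
puts run_batch(['service list'] * 3, cli: ['cat'])
```

The process start-up and authentication cost is then paid once for the
whole batch instead of once per command.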

Moving from the cli to a dedicated client may add support for richer
error codes, as this is currently tightly bound to what the cli has to
offer.  But maybe the cli could be improved instead of making a brand
new client.

I read a mention of fog.io.  It's a very nice framework; it has
supported openstack since 2011 and is very hackable!  The problem is
that it supports a lot of other things too.  So in the spirit of keeping
the requirements small, I think it should be excluded from the choice
(unfortunately, as it's the framework I'm most familiar with).

So I have mixed feelings on this one.  On one side I would be very
excited to participate in such a development; on the other side we may
reinvent the wheel without having explored simpler solutions:
 - improve the openstack cli to support tokens (or the "server" mode
   Richard talked about) or some other authentication caching mechanism;
 - have an option in the openstack cli to return the exact server
   error (code and message);
 - maybe drive the cli using PTY/expect to gain speed without
   changing anything in the interface.

Sofer Athlan-Guyot
