<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">On 03/22/2016 09:15 AM, Flavio Percoco

      wrote:<br>

    </div>

    <blockquote cite="mid:20160322131513.GH15004@redhat.com" type="cite">On

      21/03/16 21:43 -0400, Adam Young wrote:

      <br>

      <blockquote type="cite">I had a good discussion with the Nova

        folks in IRC today.

        <br>

        <br>

        My goal was to understand what could talk to what, and the short

        answer, according to dansmith, was:

        <br>

        <br>

        " any node in nova land has to be able to talk to the queue for

        any other one for the most part: compute->compute,

        compute->conductor, conductor->compute,

        api->everything. There might be a few exceptions, but not

        worth it, IMHO, in the current architecture."

        <br>

        <br>

        Longer conversation is here:

        <br>

<a class="moz-txt-link-freetext" href="http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2016-03-21.log.html#t2016-03-21T17:54:27">http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2016-03-21.log.html#t2016-03-21T17:54:27</a>

        <br>

        <br>

        Right now, the message queue is a nightmare.  All sorts of

        sensitive information flows over the message queue: Tokens

        (including admin) are the most obvious.  Every piece of audit

        data. All notifications and all control messages.

        <br>

        <br>

        Before we continue down the path of "anything can talk to

        anything" can we please map out what needs to talk to what, and

        why?  Many of the use cases seem to be based on something that

        should be kicked off by the conductor, such as "migrate, resize,

        live-migrate" and it sounds like there are plans to make that

        happen.

        <br>

        <br>

        So, let's assume we can get to the point where, if node 1 needs

        to talk to node 2, it will do so only via the conductor.  With

        that in place, we can put an access control rule in place:

        <br>

      </blockquote>

      <br>

      I don't think this is going to scale well. Eventually, this will

      require

      <br>

      evolving the conductor to some sort of message scheduler, which is

      pretty much

      <br>

      what the message bus is supposed to do.

      <br>

    </blockquote>

    <br>

    I'll limit this to what happens with Rabbit and QPID (AMQP1.0) and

    leave 0mq out of it for now.  I'll use rabbit as shorthand for both

    these, but the rules are the same for qpid.<br>

    <br>

    <br>

    <br>

    For, say, a migrate operation, the call goes to API, controller, and

    eventually down to one of the compute nodes.  Source? Target?  I

    don't know the code well enough to say, but let's say it is the

    source.  It sends an RPC message to the target node.  The message

    goes to  the central broker right now, and then back down to the

    target node.  Meanwhile, the source node has set up a reply queue

    and that queue name has gone into the message.  The target machine

    responds  by getting a reference to the response queue and sends a

    message.  This message goes up to the broker, and then down to the

    source node.<br>
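    <br>
    A minimal sketch of that round trip, assuming a toy in-memory broker
    rather than a real AMQP client.  The Broker class and the helper
    names are hypothetical and only illustrate how the reply queue name
    travels inside the request:<br>

```python
import uuid
from collections import defaultdict, deque


class Broker:
    """Toy in-memory stand-in for the central broker (not a real AMQP client)."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def publish(self, queue_name, message):
        self.queues[queue_name].append(message)

    def consume(self, queue_name):
        return self.queues[queue_name].popleft()


def send_request(broker, source, target, payload):
    # The source creates a private reply queue and embeds its name in the request.
    reply_queue = f"reply-{source}-{uuid.uuid4().hex}"
    broker.publish(target, {"payload": payload, "reply_to": reply_queue})
    return reply_queue


def handle_request(broker, target):
    # The target reads its own queue and answers on whatever queue the message names.
    request = broker.consume(target)
    result = f"{target} handled {request['payload']}"
    broker.publish(request["reply_to"], {"result": result})


broker = Broker()
reply_queue = send_request(broker, "node-1", "node-2", "migrate")
handle_request(broker, "node-2")
reply = broker.consume(reply_queue)
```

    Note that the target replies to whatever queue name the message
    carries; nothing in this flow checks who wrote that name.<br>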

    <br>

    A man in the middle could sit there and also read off the queue. It

    could modify a message, with its own response queue, and happily

    transfer things back and forth.<br>

    <br>

    So, we have the HMAC proposal, which then puts crypto and key

    distribution all over the place.  Yes, it would guard against a MITM

    attack, but the cost in complexity and processor time is high.<br>

    <br>

    <br>

    Rabbit does not have a very flexible ACL scheme: basically, a RegEx

    per Rabbit user.  However, we could easily spin up a new queue for

    direct node to node communication that did meet an ACL regex.  For

    example, if we said that the regex was that the node could only

    read/write queues that have its name in them, to make a request and

    response queue between node-1 and node-2 we could create two queues:<br>

    <br>

    <br>

    node-1-node-2<br>

    node-1-node-2-&lt;uuid&gt;-reply<br>

    <br>

    <br>

    So, instead of a single queue request, there are two.  And conductor

    could tell the target node: start listening on this queue.<br>
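    <br>
    As a rough sketch of that naming rule, assuming dash-separated queue
    names and a per-node regex that requires the node's own name as a
    whole component.  The helper names are hypothetical, and the pattern
    is checked here with Python's re module; an equivalent broker-side
    pattern would need to be anchored and tested against the broker's
    own regex engine:<br>

```python
import re
import uuid


def node_acl_pattern(node):
    # The node name must appear as a whole dash-separated component,
    # so that node-1 does not also match node-10.
    return re.compile(rf"(?:.*-)?{re.escape(node)}(?:-.*)?")


def pair_queues(source, target):
    # One request queue and one per-conversation reply queue.
    request_queue = f"{source}-{target}"
    reply_queue = f"{source}-{target}-{uuid.uuid4()}-reply"
    return request_queue, reply_queue


def allowed(node, queue_name):
    return node_acl_pattern(node).fullmatch(queue_name) is not None


request_queue, reply_queue = pair_queues("node-1", "node-2")
```

    With this, both node-1 and node-2 match the two queues, while a
    third node such as node-3 matches neither.<br>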

    <br>

    <br>

    Or, we could pass the message through the conductor.  The request

    message goes from node-1 to conductor,  where conductor validates

    the business logic of the message, then puts it into the message

    queue for node-2.  Responses can then go directly back from node-2

    to node-1 the way they do now.<br>
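    <br>
    A minimal sketch of the conductor acting as a relay, assuming a toy
    in-memory set of queues.  The allow-list, the message fields, and
    the conductor_relay function are all hypothetical; the real business
    logic checks would live in Nova:<br>

```python
from collections import defaultdict, deque

# Hypothetical allow-list of operations the conductor will relay.
ALLOWED_OPERATIONS = {"migrate", "resize", "live-migrate"}

# Toy stand-in for the broker's queues, keyed by target node name.
queues = defaultdict(deque)


def conductor_relay(message):
    """Validate a node-to-node request, then place it on the target's queue."""
    if message["operation"] not in ALLOWED_OPERATIONS:
        raise PermissionError(f"operation not allowed: {message['operation']}")
    if message["source"] == message["target"]:
        raise ValueError("source and target must differ")
    queues[message["target"]].append(message)


conductor_relay({"source": "node-1", "target": "node-2", "operation": "migrate"})
```

    In this arrangement only the conductor needs write access to the
    per-node request queues.<br>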

    <br>

    OR...we could set up a direct socket between the two nodes, with the

    socket set up info going over the broker.  OR we could use a web

    server,  OR send it over SNMP.  Or SMTP, OR TFTP.  There are many

    ways to get the messages from node to node.<br>

    <br>

    If  we are going to use the message broker to do this, we should at

    least make it possible to secure it, even if it is not the default

    approach.<br>

    <br>

    It might be possible to use a broker specific technology to optimize

    this, but I am not a Rabbit expert.  Maybe there is some way of

    filtering messages?<br>

    <br>

    <br>

    <blockquote cite="mid:20160322131513.GH15004@redhat.com" type="cite">

      <br>

      <blockquote type="cite">1.  Compute nodes can only read from the

        queue compute.&lt;name&gt;-novacompute-&lt;index&gt;.localdomain

        <br>

        2.  Compute nodes can only write to response queues in the RPC

        vhost

        <br>

        3.  Compute nodes can only write to notification queues in the

        notification vhost.

        <br>

        <br>

        I know that with AMQP, we should be able to identify the writer

        of a message.  This means that each compute node should have its

        own user.  I have identified how to do that for Rabbit and

        QPid.  I assume for 0mq it would make sense to use ZAP

        (<a class="moz-txt-link-freetext" href="http://rfc.zeromq.org/spec:27">http://rfc.zeromq.org/spec:27</a>) but I'd rather the 0mq

        maintainers chime in here.

        <br>

        <br>

      </blockquote>

      <br>

      NOTE: Gentle reminder that qpidd has been removed from

      oslo.messaging.

      <br>

    </blockquote>

    <br>

    Yes, but Qpid Proton speaks AMQP 1.0, and I did a proof of concept with

    it last summer.  It supports encryption and authentication over

    GSSAPI and is, I think, the best option for securing messaging in an

    OpenStack deployment at the moment.<br>

    <br>

    <blockquote cite="mid:20160322131513.GH15004@redhat.com" type="cite">

      <br>

      I think you can configure rabbit, amqp1 and other technologies to

      do what you're

      <br>

      suggesting here without much trouble. TBH, I'm not sure how many

      changes would

      <br>

      be required in Nova (or even oslo.messaging) but I'd dare to say

      none are

      <br>

      required.

      <br>

      <br>

      <blockquote type="cite">I think it is safe (and sane) to have the

        same user on the compute node communicate with  Neutron, Nova,

        and Ceilometer.  This will avoid a false sense of security: if

        one is compromised, they are all going to be compromised.  Plan

        accordingly.

        <br>

        <br>

        Beyond that, we should have message broker users for each of the

        components that is a client of the broker.

        <br>

        <br>

        Applications that run on top of the cloud, and that do not get

        presence on the compute nodes, should have their own VHost.  I

        see Sahara on my Tripleo deploy, but I assume there are others. 

        Either they completely get their own vhost, or the apps should

        share one separate from the RPC/Notification vhosts we currently

        have.  Even Heat might fall into this category.

        <br>

        <br>

        Note that those application users can be allowed to read from

        the notification queues if necessary.  They just should not be

        using the same vhost for their own traffic.

        <br>

        <br>

        Please tell me if/where I am blindingly wrong in my analysis.

        <br>

        <br>

      </blockquote>

      <br>

      I guess my question is: Have you identified things that need to be

      changed in

      <br>

      any of the projects for this to be possible? Or is it a pure

      deployment

      <br>

      recommendation/decision?

      <br>

    </blockquote>

    <br>

    There are certainly deployment changes we need to make that help. 

    And we can likely make it such that the compute nodes can only read

    from their own appropriate queues.  However, without changing the

    queue naming scheme, I can't see how to control who can write to

    where.  Right now, it's a free-for-all.<br>
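    <br>
    A rough sketch of what per-node permissions could look like,
    assuming RabbitMQ's three per-user regexes (configure, write, read)
    and the compute.&lt;hostname&gt; queue naming quoted earlier.  The
    function name and the example hostname are hypothetical, and the
    configure pattern is an assumption about who declares the queues:<br>

```python
import re


def compute_node_permissions(hostname):
    # RabbitMQ grants each user three regexes per vhost: configure, write, read.
    own_queue = "^compute\\." + re.escape(hostname) + "$"
    return {
        "configure": "^$",  # assumption: matches nothing, queues declared elsewhere
        "write": ".*",      # cannot be narrowed with the current queue names
        "read": own_queue,  # only this node's own compute queue
    }


perms = compute_node_permissions("overcloud-novacompute-0.localdomain")
```

    The read side narrows cleanly to one queue per node; the write side
    stays wide open, which is the gap described above.<br>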

    <br>

    <blockquote cite="mid:20160322131513.GH15004@redhat.com" type="cite">

      <br>

      I'd argue that any change (assuming changes are required) is

      likely to happen

      <br>

      in specific projects (Nova, Neutron, etc) and that once this

      scenario is

      <br>

      supported, it'll remain a deployment choice to follow it or not.

      If I want my

      <br>

      undercloud services to use a single vhost and a single user, I

      must be able to

      <br>

      do that. The proposal in this email complicates deployments

      significantly,

      <br>

      despite it making sense from a security standpoint.

      <br>

    </blockquote>

    So, nothing I am saying is preventing that.  OTOH, there is

    insufficient support from the RPC approach to do a more secure ACL.<br>

    <br>

    <br>

    <blockquote cite="mid:20160322131513.GH15004@redhat.com" type="cite">

      <br>

      One more thing. Depending on the messaging technology, having

      different virtual

      <br>

      hosts may have an impact on the performance when running under

      huge loads given

      <br>

      the fact that the data will be partitioned differently and,

      therefore,

      <br>

      written/read differently. I don't have good data at hand about

      this, sorry.

      <br>

    </blockquote>

    <br>

    So, I think that performance can be optimized many ways, including

    having multiple Brokers involved in a deployment.  I've seen

    architecture diagrams to that effect, but have not had to put it

    into production myself.<br>

    <br>

    <blockquote cite="mid:20160322131513.GH15004@redhat.com" type="cite">

      <br>

      Flavio

      <br>

      <br>

      <br>


      <br>

      <pre wrap="">__________________________________________________________________________

OpenStack Development Mailing List (not for usage questions)

Unsubscribe: <a class="moz-txt-link-abbreviated" href="mailto:OpenStack-dev-request@lists.openstack.org?subject:unsubscribe">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a>

<a class="moz-txt-link-freetext" href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a>

</pre>

    </blockquote>

    <br>

  </body>

</html>