[openstack-dev] [nova] readout from Philly Operators Meetup

Kevin Benton blak111 at gmail.com
Thu Mar 12 08:25:47 UTC 2015

>The biggest disconnect in the model seems to be that Neutron assumes
you want self service networking. Most of these deploys don't. Or even
more importantly, they live in an organization where that is never
going to be an option.

>Neutron provider networks is close, except it doesn't provide for
floating IP / NAT.

Why don't shared networks work in these cases? The workflow here would be
that there is an admin tenant responsible for creating the networks and
setting up the Neutron router, floating IP pools, etc. Tenants would then
attach their VMs to the shared networks.
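
As a rough sketch of that workflow (an illustration, not a tested
deployment), these are the request bodies an admin tenant and a regular
tenant might send to the Neutron v2.0 API. The helper function names and
the example network/router names are hypothetical; the attribute names
(shared, router:external, external_gateway_info, floating_network_id) are
the ones the Neutron API uses. Nothing is sent over the network here; the
code only builds the JSON payloads.

```python
import json


def shared_network_body(name):
    """Admin: body for POST /v2.0/networks, visible to every tenant."""
    return {"network": {"name": name, "shared": True, "admin_state_up": True}}


def external_network_body(name):
    """Admin: body for POST /v2.0/networks, used as the floating IP pool."""
    return {"network": {"name": name, "router:external": True}}


def router_body(name, external_network_id):
    """Admin: body for POST /v2.0/routers with a gateway on the external net."""
    return {
        "router": {
            "name": name,
            "external_gateway_info": {"network_id": external_network_id},
        }
    }


def tenant_port_body(shared_network_id):
    """Tenant: body for POST /v2.0/ports on the admin-owned shared network."""
    return {"port": {"network_id": shared_network_id}}


def floating_ip_body(external_network_id, port_id):
    """Tenant: body for POST /v2.0/floatingips bound to the tenant's port."""
    return {
        "floatingip": {
            "floating_network_id": external_network_id,
            "port_id": port_id,
        }
    }


if __name__ == "__main__":
    # Placeholder IDs; a real deployment would use the UUIDs Neutron returns.
    steps = [
        shared_network_body("shared-net"),
        external_network_body("public"),
        router_body("admin-router", "EXTERNAL_NET_ID"),
        tenant_port_body("SHARED_NET_ID"),
        floating_ip_body("EXTERNAL_NET_ID", "TENANT_PORT_ID"),
    ]
    for body in steps:
        print(json.dumps(body, sort_keys=True))
```

The first three bodies are admin-only steps; the last two are all a tenant
would need to do, which is the point of the question above.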

On Wed, Mar 11, 2015 at 5:59 AM, Sean Dague <sean at dague.net> wrote:

> The last couple of days I was at the Operators Meetup acting as Nova
> rep for the meeting. All the sessions were quite nicely recorded to
> etherpads here - https://etherpad.openstack.org/p/PHL-ops-meetup
> There was both a specific Nova session -
> https://etherpad.openstack.org/p/PHL-ops-nova-feedback as well as a
> bunch of relevant pieces of information in other sessions.
> This is an attempt at a summary; anyone else who was in attendance,
> please feel free to correct me if I'm interpreting something
> incorrectly. There was a lot of content there, so this is in no way a
> comprehensive list, just the highlights that I think make the most
> sense for the Nova team.
> =========================
>  Nova Network -> Neutron
> =========================
> This remains listed as the #1 issue from the Operator Community on
> their burning issues list
> (https://etherpad.openstack.org/p/PHL-ops-burning-issues L18). During
> the tags conversation we straw polled the audience
> (https://etherpad.openstack.org/p/PHL-ops-tags L45) and about 75% of
> attendees were over on neutron already. However, those on Nova Network
> were disproportionately the largest clusters and longest standing
> OpenStack users.
> Of those on nova-network about 1/2 had no interest in being on
> Neutron (https://etherpad.openstack.org/p/PHL-ops-nova-feedback
> L24). Some of the primary reasons were the following:
> - Complexity concerns - neutron has a lot more moving parts
> - Performance concerns - nova multihost means there is very little
>   between guests and the fabric, which is really important for the HPC
>   workload use case for OpenStack.
> - Don't want OVS - ovs adds additional complexity, and performance
>   concerns. Many large sites are moving off ovs back to linux bridge
>   with neutron because they are hitting OVS scaling limits (especially
>   if on UDP) - (https://etherpad.openstack.org/p/PHL-ops-OVS L142)
> The biggest disconnect in the model seems to be that Neutron assumes
> you want self service networking. Most of these deploys don't. Or even
> more importantly, they live in an organization where that is never
> going to be an option.
> Neutron provider networks is close, except it doesn't provide for
> floating IP / NAT.
> Going forward: I think the gap analysis probably needs to be revisited
> with some of the vocal large deployers. I think we assumed the
> functional parity gap was closed with DVR, but it's not clear that in its
> current format it actually meets the n-net multihost users' needs.
> ===================
>  EC2 going forward
> ===================
> Having a sustainable EC2 is of high interest to the operator
> community. Many large deploys have some users that were using AWS
> prior to using OpenStack, or currently are using both. They have
> preexisting tooling for that.
> There didn't seem to be any objection to the approach of an external
> proxy service for this function -
> (https://etherpad.openstack.org/p/PHL-ops-nova-feedback L111). Mostly
> the question is timing, and the fact that no one has validated the
> stackforge project. The fact that we landed everything people need to
> run this in Kilo is good, as these production deploys will be able to
> test it for their users when they upgrade.
> ============================
>  Burning Nova Features/Bugs
> ============================
> Hierarchical Projects Quotas
> ----------------------------
> Hugely desired feature by the operator community
> (https://etherpad.openstack.org/p/PHL-ops-nova-feedback L116). Missed
> Kilo. This made everyone sad.
> Action: we should queue this up as early Liberty priority item.
> Out of sync Quotas
> ------------------
> https://etherpad.openstack.org/p/PHL-ops-nova-feedback L63
> The quotas code is quite racy (this is kind of a known issue if you look at
> the bug tracker). It was actually marked as a top soft spot during
> last fall's bug triage -
> http://lists.openstack.org/pipermail/openstack-dev/2014-September/046517.html
> There is an operator proposed spec for an approach here -
> https://review.openstack.org/#/c/161782/
> Action: we should make a solution here a top priority for enhanced
> testing and fixing in Liberty. Addressing this would remove a lot of
> pain from ops.
> Reporting on Scheduler Fails
> ----------------------------
> Apparently, some time recently, we stopped logging scheduler fails
> above DEBUG, and that behavior snuck back into Juno as well
> (https://etherpad.openstack.org/p/PHL-ops-nova-feedback L78). This
> has made tracking down the root cause of failures far more difficult.
> Action: this should hopefully be a quick fix we can get in for Kilo
> and backport.
> =============================
>  Additional Interesting Bits
> =============================
> Rabbit
> ------
> There was a whole session on Rabbit -
> https://etherpad.openstack.org/p/PHL-ops-rabbit-queue
> Rabbit is a top operational concern for most large sites. Almost all
> sites have a "restart everything that talks to rabbit" script because
> during Rabbit HA operations queues tend to blackhole.
> All other queue systems OpenStack supports are worse than Rabbit (from
> experience in that room).
> oslo.messaging < 1.6.0 was a significant regression in dependability
> from the incubator code. It now seems to be getting better, but there are
> still a lot of issues. (L112)
> Operators *really* want the concept in
> https://review.openstack.org/#/c/146047/ landed. (I asked them to
> provide such feedback in gerrit).
> Nova Rolling Upgrades
> ---------------------
> Most people really like the concept, but we couldn't find anyone that had
> used it yet because Neutron doesn't support it, so they had to do big
> bang upgrades anyway.
> Galera Upstream Testing
> -----------------------
> The majority of deploys run with Galera MySQL. There was a question
> about whether or not we could get that into upstream testing pipeline
> as that's the common case.
>         -Sean
> --
> Sean Dague
> http://dague.net

Kevin Benton