[openstack-dev] [Nova][Neutron] Linuxbridge as the default in DevStack [was: Status of the nova-network to Neutron migration work]
Monty Taylor
mordred at inaugust.com
Sat Apr 18 16:42:04 UTC 2015
On 04/18/2015 10:44 AM, Fox, Kevin M wrote:
> Replying inline.
>
>> -----Original Message----- From: Monty Taylor
>> [mailto:mordred at inaugust.com] Sent: Friday, April 17, 2015 7:53 PM
>> To: openstack-dev at lists.openstack.org Subject: Re: [openstack-dev]
>> [Nova][Neutron] Linuxbridge as the default in DevStack [was: Status
>> of the nova-network to Neutron migration work]
>>
>> On 04/17/2015 06:48 PM, Rochelle Grober wrote:
>>> I know the DevStack issue seems to be solved, but I had to
>>> respond.....inline
>>>
>>> From: Fox, Kevin M [mailto:Kevin.Fox at pnnl.gov] Sent: Friday,
>>> April 17, 2015 12:28 To: OpenStack Development Mailing List (not
>>> for usage questions) Subject: Re: [openstack-dev] [Nova][Neutron]
>>> Linuxbridge as the default in DevStack [was: Status of the
>>> nova-network to Neutron migration work]
>>>
>>> No, the complaints from ops I have heard even internally, which
>>> I think are being echoed here, are "I understand how linux bridge
>>> works, I don't understand openvswitch" and "I don't want to be
>>> bothered to learn to debug openvswitch because I don't think we
>>> need it".
>>>
>>> If linux bridge had feature parity with openvswitch, or if the
>>> users truly didn't need the extra features provided by
>>> openvswitch/NaaS, then it would be a reasonable argument. I still
>>> assert though, that linux bridge won't get feature parity with
>>> openvswitch and the extra features are actually critical to users
>>> (DVR/NaaS), so it's worth switching to openvswitch and learning
>>> how to debug it. Linux Bridge is a nonsolution at this point.
>>
>> I'm sorry, but with all due respect - I believe that sounds very
>> much like sticking fingers in ears and not paying attention to the
>> very real needs of users.
>
> No, when you have complex software, with multiple classes of users,
> it is almost impossible to please all your users, in every way.
> Sometimes, you must make hard decisions to make one user's experience
> a little less good for the benefit of the whole community. /me
> channels Spock here...
>
> If it makes an Op's life a little harder, but for every Op that has
> to learn how to debug openvswitch, 100 users don't have to deal with
> the difference between nova-network and neutron APIs and software
> built on top of OpenStack that only works with one of them, I think
> that's worth the tradeoff. It's unfortunate, but necessary. Ops have
> to learn new things all the time. It's in the job description.
Absolutely, I think that it's impossible to please everyone all the time
and I certainly don't think we should avoid ovs because it might be hard
to learn.

What I'm saying is that the assumption that every cloud wants SDN and
Floating IPs and that that should be our default and only world is
dangerous - because it's a very advanced way of running and totally not
necessary for most users' apps.

If OVS _is_ used, a default behavior of "connect a VM to a network"
would be no different to an end user than if a provider network were
used. However, if a cloud supports the extra functionality of SDN
features or of floating IPs, those are opt-in features - and it's
already well covered by the "does my cloud have this conceptual feature"
question.

So what I'm saying is, asserting that we MUST have OVS because the most
complex case needs it and we don't want our users to be confused is a
fallacy.
> I currently Operate 3 different OpenStack clouds, so I'm not just
> trying to push work on others and not myself. I paid the learning
> curve cost.
>
>> Let me tell you some non-features I encounter currently:
>>
>> - Needing Floating IPs to get a public address
>>
>> This is touted as "the right way to do it" - but it's actually a
>> terrible experience for a user. The clouds I have access to that
>> just give me a direct DHCP address are much more useful.
>
> Another case of short term pain for long term gain.
>
> It's nice to be able to not use them, up until you realize you needed
> it, don't have it, and it's too late to deal with it.
>
> IP addresses are stateful creatures. You attach DNS entries to them.
> Some users contact them directly. They are a window into your
> machine. The cloud is all about scaling. If you can't just move an IP
> from VM to VM, you force them to become pets. Without them, you're
> operating much like in the virtualization days before cloud.
>
>> In fact, we should delete floating IPs - they are a non-feature
>> that makes life harder. Literally no user of a cloud has ever wanted
>> them, although we've learned to deal with them.
>
> Terrible idea.
>
> The first time I moved a floating IP from vm1 to vm2 to do a rolling
> update that took under a second, it paid off. And my users benefited.
> Or the times I deleted VMs and launched new VMs in their place, and
> no data was lost and no one noticed.
>
> Cloud is a very different way to do things and if you don't
> understand it well, can be confused with traditional virtualization.
> It too, is worth the learning curve to understand how to do things
> the Cloud Way. You don't know you want it, until you go through the
> learning curve and understand why they really make sense. To keep
> state where it belongs, out of the vm.
>
> This is the heart of the issue we're discussing. People are wanting to
> force the cloud software to function not as a cloud, but more like
> what they are familiar with. But that's a bad idea. You gut the very
> features that make Cloud awesome.
I'm currently running a control plane of a bunch of pets and some cattle
along with a dynamic pool of nodes that creates and deletes ten thousand
VMs per day. These systems span multiple clouds.

Of those resources, exactly 1 of them has in the last 3 years had a
situation where a floating IP would have mitigated an end user problem.

So yes, I'm being extreme by saying delete them. There ARE some edge
cases where using them is helpful. I would love to be able to make the
advanced decision that I'd like a floating IP on my box IN ADDITION to
the IP that came with the box. That's pretty cool.

There are also plenty of cases where _requiring_ their use is actually
harmful. Running behind a NAT and not knowing your IP address is broken.
You do it when you have to - not as a general model.

(try running Kerberos on a NATed machine, for instance. It gets rather
cranky)
>> - SDN
>>
>> I understand this is important for people, so let's keep it around
>> - but having software routers essentially means that it's a scaling
>> bottleneck. In the cloud Infra uses that has SDN, we have to create
>> multiple software routers to handle the scaling issues. On the
>> other hand, direct routing / linuxbridge does NOT have this
>> problem, because the network packets are routed directly.
>
> Only if you gut the network stack by making it flat and making NaaS
> optional.
The internet routes packets in a non-SDN model just fine, thanks.
> But for NaaS, the OpenVSwitch backend does support scaled routing.
> It's called DVR. The linux bridge agent does not. And at the current
> rate of development, the Linux Bridge is likely not to. If NaaS
> really is critical, then OpenVswitch over linux bridge pays dividends.
Yes. If NaaS is really critical. I argue that NaaS is only interesting
to a small subset of our users and should be seen as an advanced feature.
>> We should not delete SDN like we should delete floating IPs,
>> because there are real users who have real use cases and SDN helps
>> them. However, it should be an opt-in feature for a user that is an
>> add on.
>
> Then app developers can't rely on it being there, and users can't
> have as much software readily available to launch in their tenant,
> weakening the ecosystem.
Straw man. Let's get "I want a VM that has a working network" to be a
consistent thing. Once you have that, checking to see if your provider
offers SDN as an advanced add-in is fine. If it doesn't, you probably
won't run an SDN-needing app there. Thing is - the number of apps that
NEED SDN is dwarfed by the number of apps that do not, in fact, need SDN.
>> vexxhost is getting this right right now - you automatically get a
>> DHCP'd direct routed IP on each VM you provision, but if you decide
>> you need fancy, you can opt in to create a private network.
>
> Then how do you deal with a hypervisor dying and DNS records
> pointing to that IP? You encourage the VM to be a pet. This seems
> fine until it happens. Then it hurts.
The idea that pet VMs can be avoided is misguided. They exist. In fact,
OpenStack is AMAZING at running them. If you want a world of nothing but
cattle, go write 12-factor apps and do cloudfoundry or something.

The reality of the situation is that many apps cannot be 12-factor - and
in fact 12-factor is little more than a marketing buzzword to make some
folks seem like visionaries.

Pets are fine. They are here to stay. And I will fight to the last ounce
of my breath for their support in OpenStack.

They are, in fact, probably the main differentiator we have vs. an
Amazon or a Google. If you want pure cattle, you're going to wind up at
AWS or GCE because the price-point is going to be unmatchable because
they're underwriting things.

However, what we can offer our users is choice. We can offer our users
clouds that work the way they want them to - rather than clouds that
operate the way Jeff Bezos says - and we can offer them clouds that
adapt to their workloads, rather than telling our users that they are
wrong and that they need to rewrite all of their applications to fit the
pre-conceived model of how apps "should" be written.
>>
>> - DVR
>>
>> I'm an end user. I do not care about this at all. DVR is only
>> important if you have bought in to software routers. It's a
>> solution to a problem that would go away if things worked like
>> networks.
>
> I'm a cloud user too. I don't directly care about it either. Other
> than needing to ensure when I use NaaS, it scales. The how is
> irrelevant to me. If it's done with a Cisco neutron plugin, that's
> fine. I don't have to care. The thing that sucks is going from cloud
> to cloud, and having to write two sets of templates, one for
> nova-network and one for neutron since the APIs are different. The
> user is being forced to care. This is bad.
Yes. Totally agree. I hate it that I have to spend a giant amount of
effort on one of my clouds to get a working network to my VMs when on
the other cloud I get a VM that can talk to the network.

Guess which one I think should be the default behavior?
>>> :/ So is keeping nova-network around forever. :/ But other than
>>> requiring some more training for ops folks, I think Neutron can
>>> suit the rest of the use cases these days nova-network provided
>>> over neutron. The sooner we can put the nova-network issue to
>>> bed, the better off the ecosystem will be. It will take a couple
>>> of years for the ecosystem to settle out to deprecating it, since
>>> a lot of clouds take years to upgrade and finally put the issue
>>> to bed. Let's do that sooner rather than later so a couple of
>>> years from now, we're done. :/
>>
>> I'm about to deploy a cloud, I'm going to run neutron, and I'm not
>> going to run openvswitch because I do not need it. I will run the
>> equiv of flatdhcp.
>
> It's good that you're using neutron. It's unfortunate for the
> community that this fracturing is occurring.
>
> App developers have at least 3 targets. Nova-network, neutron with
> flat network, neutron with tenant networks. It's a lot of effort to
> write and debug one template, let alone 3. :/ Still, I'd prefer 2
> over 3 any day. :/
>
>> If neutron doesn't have it, I will write it, because it's that
>> important that it exist.
>
> So be it. One of the great things about open source is you can do
> whatever you want.
>
> Oddly, this flexibility is also its Achilles heel. In the application
> space, it's a great thing. In an Operating System, it tends to hurt
> the flexibility of the things built on top. This is why Linux
> ultimately won over the BSD's. Linux stayed relatively fork free,
> while the BSD's are quite divergent. The lack of divergence helped
> Linux app developers.
>
> Another example, take cellphones. Linux was early to that party, and
> lost, since so many different linux implementations targeted the
> phone space with different APIs. Everyone followed their
> self-interest, and the ecosystem on top never materialized since
> there were too many ways to do everything and nothing worked the same.
>
> Google takes the same Linux kernel, puts a bit of userspace on top,
> calls it android and encourages an app ecosystem on top, and bam. The
> OS becomes the number one phone OS in terms of users. They do stuff
> to try and minimize forks and divergent functionality, and the whole
> ecosystem benefits from it. I'm not saying everything Google has done
> there has been good, but the general idea of app ecosystem
> encouragement is good.
>
> OpenStack is an operating system and needs to encourage a wealth of
> users/apps on top of it. The main way to do that is to make sure your
> abstractions are clean enough that the stuff under the hood doesn't
> matter to the cloud user/app developer. But with
> nova-network/neutron, it does. Same with FlatDHCP. :/
I would agree with you if you weren't trying to shove a broken model
down my throat as the "default".

I want a VM on the internet to behave like a VM on the internet and I
should not have to care about the details of the SDN or lack thereof
behind it.

If the SDN story looks, to the end user, like something other than
booting a VM on the internet that knows how to talk to things on the
internet - it means that the UI being presented to the user to boot VMs
on the internet is broken.

Defaults should be the thing that is most commonly desired. Extra work
is fine to get advanced things - although _one_ model to do advanced
things is also preferable to 20.

So what I keep trying to say is that there should be an easy and sane
way for me to get a non-NATed VM connected to a network that can route
packets and I should not need to know anything about the advanced
network options available to me, because as a person who just wants a VM
that can talk to a network, I'm the default case.

If I want a NAT, if I want a cloud-level security group, if I want a
tenant network that is private between my various hosts - those are all
add-ons - they are extra complexity - they are not the default and
should not be.

Now - I'm pretty sure that we can get to a place where the UI can
support a single model and a single set of operations for a user that
are easily understandable.

But we're not going to get there by ignoring the needs of the people who
want to boot VMs that can talk to the network by default. If we're only
focusing on the people who want to do fancy network things, and not
serving the needs of the people who want to do simple network things -
then all we're doing is trading one set of limitations for a second set
of limitations, and we're switching which set of people we're excluding
from the party.
>>
>> If you take that ability away from me, you will be removing working
>> features and replacing them with things that make my user experience
>> worse.
>>
>> Let's not do that. Let's listen to the people who are using this
>> thing as end users. Let's understand their experience and
>> frustration. And let's not chase pie-in-the-sky theory of how it
>> "should" work in the face of what a ton of people are asking and
>> even begging for. FlatDHCP is perfect for the 80% case. The extra
>> complexity of the additional things if you don't actually need them
>> is irresponsible.
>
> I think we're at a philosophical impasse here. Unfortunately I don't
> think we're going to agree. And that's ok. That's the beauty of open
> source. :)
Indeed! I fully support you in disagreeing with me.
> Thanks, Kevin
>
>>
>>>
>>> [Rockyg] Kevin, the problem is that the extra features *aren't*
>>> critical to the deployers and/or users of many openstack
>>> deployments. And since they are not critical, the deployers
>>> won't *move* to using neutron that requires them to learn all
>>> this new "stuff" that they don't need. By not providing a
>>> simple path to a flatDHCP implementation, you will get existing
>>> users refusing to upgrade rather than take a bunch of extraneous
>>> stuff from Neutron because the OpenStack project deprecated
>>> "their network." So, likely two things will happen: 1) the
>>> deployments that are already out there configured with
>>> nova-network and flatDHCP will stop upgrading with the last
>>> nova-network release and 2) if there isn't a simple equivalent
>>> by then in neutron or some other openstack project, someone will
>>> fork to keep the flatDHCP solution moving forward.
>>>
>>> You can lead a devops to pizza, but you can't make it eat
>>> soylent green pizza. And that's how you lose some of the
>>> community and perhaps spur either Neutron's or OpenStack's
>>> successor open source project(s).
>>>
>>> KISS is still in effect. It seems Neutron is abstracting away
>>> the current network complexities for developers and endusers at
>>> the expense of tossing it all on the shoulders of the
>>> deployer/admins. Until you abstract some of that complexity out
>>> of the deployment path, either through good coding, useful
>>> templates, configuration and management tools, etc., you're going
>>> to continue to get pushback from the devops and they will
>>> continue to claim parity doesn't exist *for them*.
>>>
>>> Something I learned a while ago - the sysadmins control the
>>> system and stick with minor changes and/or single system by
>>> system upgrades until they are either tempted with something
>>> shiny/fun/cool/sexy/powerful or coerced by management to change.
>>> Until you can demonstrate a *benefit* to them to move to the
>>> neutron paradigm for their flatDHCP network, you won't get them
>>> to move. They'll take a learning ramp-up, for either less work or
>>> better control, but they won't take it for more work.
>>>
>>> --Rocky
>>>
>>> ________________________________ From: Kevin Benton
>>> [blak111 at gmail.com] Sent: Friday, April 17, 2015 11:49 AM To:
>>> OpenStack Development Mailing List (not for usage questions)
>>> Subject: Re: [openstack-dev] [Nova][Neutron] Linuxbridge as the
>>> default in DevStack [was: Status of the nova-network to Neutron
>>> migration work]
>>>
>>> I definitely understand that. But what is the major complaint
>>> from operators? I understood that quote to imply it was around
>>> Neutron's model of self-service networking.
>>>
>>> If the main reason the remaining Nova-net operators don't want to
>>> use Neutron is due to the fact that they don't want to deal with
>>> the Neutron API, swapping some implementation defaults isn't
>>> really going to get us anywhere on that front.
>>>
>>> It's an important distinction because it determines what
>>> actionable items we can take (e.g. what Salvatore mentioned in
>>> his email about defaults). Does that make sense?
>>>
>>> On Fri, Apr 17, 2015 at 11:33 AM, Jeremy Stanley
>>> <fungi at yuggoth.org> wrote:
>>> On 2015-04-17 10:55:19 -0700 (-0700), Kevin Benton wrote:
>>>> I understand. What I'm saying is that switching to Linux bridge
>>>> will not change the networking model to 'just connect
>>>> everything to a simple flat network'. All of the complaints
>>>> about self-service networking will still hold.
>>>
>>> And conversely, swapping simple bridge interfaces for something
>>> else still means problems are harder to debug, whether or not
>>> you're stuck with self-service networking features you're not
>>> using.
>>> -- Jeremy Stanley
>>>
>>>
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe:
>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>>
>>>
>>> -- Kevin Benton