Open Stack

Thu Apr 16 01:50:49 UTC 2015

On 16 April 2015 at 11:59, Sean Dague <sean at dague.net> wrote:

> I think the completeness statement here is as follows:
>
> 1. For OpenStack to scale to the small end, we need to be able to
> overlap services, otherwise you are telling people they basically have
> to start with a full rack of hardware to get 1 worker.

I don't see how that follows: you can run N venvs on one machine. It's
basically a lighter-weight container than containers, just without all
the process and resource isolation.
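
To make the "N venvs on one machine" point concrete, here is a minimal
sketch using the standard-library venv module. The service names and
the create_service_venvs helper are illustrative assumptions, not
anything devstack provides.

```python
import tempfile
import venv
from pathlib import Path


def create_service_venvs(root, services):
    """Create one virtualenv per service under root and return their paths."""
    paths = {}
    for name in services:
        env_dir = Path(root) / name
        # with_pip=False keeps the sketch fast and offline; a real deployment
        # would want pip available inside each environment.
        venv.create(env_dir, with_pip=False)
        paths[name] = env_dir
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        envs = create_service_venvs(root, ["nova", "neutron", "cinder"])
        for name, env_dir in envs.items():
            # Each environment has its own pyvenv.cfg and site-packages.
            print(name, (env_dir / "pyvenv.cfg").exists())
```

Each environment gets its own site-packages, so the services can pin
different library versions, but they still share the kernel, process
table and resource limits of the host.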

> 2. The Linux Distributors (who are a big part of our community) install
> everything at a system level. Python has no facility for having multiple
> versions of libraries installed at the system level, largely because
> virtualenvs were created (which solve the non system application level
> problem really well).

Actually, the statement about Python having no facility isn't true -
there are eggs and a mechanism to get a specific version of a
dependency into a process. It's not all that widely used, largely
because of the actual thing that's missing: Python *doesn't* have the
ability to load multiple versions of a package into one process. So
once you ask for testtools==1.0.0, that process has only
testtools-1.0.0 in the singleton sys.modules['testtools']. The
import machinery is defined as having the side effect of changing
global state, so this is sufficiently tricky to tackle that no one
has, even with importlib etc being around now. NB: 'vendoring' does
something in this space by shifting an import to a different location,
but it's fragile.
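
The singleton behaviour is easy to demonstrate. The following is a
small sketch, assuming a throwaway package called 'fakelib' (a made-up
name) written to two temporary directories with different versions;
the second import does not load the second copy.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def write_fake_package(base, dirname, version):
    """Write a one-file package named 'fakelib' under base/dirname."""
    pkg_dir = Path(base) / dirname / "fakelib"
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text(f"__version__ = {version!r}\n")
    return str(pkg_dir.parent)


def load_two_versions():
    """Try to import two versions of 'fakelib'; report what each import saw."""
    with tempfile.TemporaryDirectory() as base:
        old_path = write_fake_package(base, "v1", "1.0.0")
        new_path = write_fake_package(base, "v2", "2.0.0")
        saved_path = list(sys.path)
        try:
            sys.path.insert(0, old_path)
            importlib.invalidate_caches()
            first = importlib.import_module("fakelib")
            # Put the newer copy ahead on the path and import again.
            sys.path.insert(0, new_path)
            importlib.invalidate_caches()
            second = importlib.import_module("fakelib")
            return first.__version__, second.__version__, first is second
        finally:
            # Undo the global state changes made by the imports.
            sys.path[:] = saved_path
            sys.modules.pop("fakelib", None)


if __name__ == "__main__":
    print(load_two_versions())
```

Both imports return the same module object from sys.modules, so the
process only ever sees version 1.0.0, whatever sys.path says later.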

> 3. The alternative of putting a venv inside of every service violates
> the principle of single location security update.

So does copying code around, but we've officially adopted that as our
approach-until-things-are-mature... and in fact we're talking cluster
software, so there is (except at small scale) absolutely no
expectation of single-location security updates: we know and expect to
have to update N==machines locations, and making that M>N is a small
matter of automation, be that docker/lxc/venvs.

I think the arguments about devstack and ease of hacking, plus our
deployer community specifically requesting it, are sufficient. Whether
they need to request it or not, they have :).

> Note: there is an aside about why we might *not* want to do this....
>
> That being said, if you deploy at a system level your upgrade unit is
> now 1 full box, instead of 1 service, because you can't do library
> isolation between old and new. A worker might have neutron, cinder, and
> nova agents. Only the Nova agents support rolling upgrade (cinder is
> working hard on this, I don't think neutron has visited this yet). So
> real rolling upgrade is sacrificed on this altar of install everything
> at a system level.

Yup. And most deployment frameworks want to scale by service, not by
box, which makes genuine containers super interesting....

>> Concretely, devstack should be doing one pip install run, and in
>> stable branches that needs to look something like:
>>
>> $ pip install -r known-good-list $path_to_nova $path_to_neutron ....
>
> Also remember we need to -e install a bunch of directories, not sure if
> that makes things easier or harder.

There's a particular bit of ugly in pip where directories have to be
resolved to packages, but if we use egg fragment names we can tell pip
the name and avoid that. What that lookup does is cause somewhat later
binding of some requirements calls - I'm not sure it would be a
problem, but if it is, it's fairly straightforward to address.
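
As a rough sketch of what a single run with named editable installs
could look like, the helper below only builds the argument list and
does not run pip. The function names and the /opt/stack paths are
hypothetical, and the exact URL-plus-#egg= form pip accepts has varied
between releases, so treat that part as an assumption.

```python
from pathlib import Path


def editable_requirement(project_dir, name):
    """Return a file URL for project_dir with an egg fragment naming it."""
    url = Path(project_dir).resolve().as_uri()
    return f"{url}#egg={name}"


def build_pip_command(known_good_list, projects):
    """Build one 'pip install' invocation covering every project.

    projects is a list of (name, directory) pairs.
    """
    cmd = ["pip", "install", "-r", str(known_good_list)]
    for name, project_dir in projects:
        cmd += ["-e", editable_requirement(project_dir, name)]
    return cmd


if __name__ == "__main__":
    print(build_pip_command(
        "known-good-list",
        [("nova", "/opt/stack/nova"), ("neutron", "/opt/stack/neutron")],
    ))
```

Because every directory carries its name up front, nothing has to be
inspected on disk just to learn which package it provides.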

> So, one of the problems with that is often what we need is buried pretty
> deep like -
> https://github.com/openstack-dev/devstack/blob/master/lib/nova_plugins/functions-libvirt#L39

Both apt and rpm will perform much faster given a single invocation
than 10 or 20 little ones. There's a chunk of redundant work in dpkg
itself for instance that can be avoided by a single call. So we might
want to do that for that reason alone. pip doesn't currently do much
global big-O work, so it shouldn't be affected, but once we do start
considering already-installed-requirements, then it will start to have
the same issue.
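
A minimal sketch of the batching idea follows: collect what each
component asks for, de-duplicate, and emit one command. The component
and package names are made up for illustration, and the sketch only
builds the command rather than running it.

```python
def merge_package_requests(requests):
    """Merge per-component package lists into one ordered, unique list.

    requests is a list of (component, packages) pairs.
    """
    seen = set()
    merged = []
    for _component, packages in requests:
        for pkg in packages:
            if pkg not in seen:
                seen.add(pkg)
                merged.append(pkg)
    return merged


def build_apt_command(requests):
    """Return a single apt-get invocation for everything requested."""
    return ["apt-get", "install", "-y"] + merge_package_requests(requests)


if __name__ == "__main__":
    print(build_apt_command([
        ("nova", ["qemu-kvm", "libvirt-bin"]),
        ("nova-libvirt-plugin", ["libvirt-bin", "python-libvirt"]),
    ]))
```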

> If devstack needs to uplift everything into an install the world phase,
> I think we're basically on the verge of creating OpenStack Package
> Manager so that we can specify this all declaratively (after processing
> a bunch of conditional feature logic). Which, is a thing someone could
> do (not it), but if we really get there we're probably just better off
> thinking about reviving building debs so that we can have the package
> manager globally keep track of all the requirements across multiple
> invocations.

mmm, I don't see that - to me a package manager has a lot more to do
with dependencies and distribution - I'd expect an opm to know about
things like 'nova requires the optional feature X to work with dvr, or
Y to work with ovn'. And where to get the tarballs or git repos from
given just 'openstack/nova'. Refactoring devstack into a bunch of pure
code - calculate the needed binaries and python bits - and a bunch of
impure code - do the installs, write the configs, start the services -
seems straightforward, good for maintenance, and a pretty low bar.
Maybe even migrate some of it to Python at the same time :).
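
The pure/impure split might be sketched like this. The SERVICE_NEEDS
table, the package names and the paths are hypothetical placeholders,
not devstack's real data; the runner is injectable so the planning can
be tested without installing anything.

```python
import subprocess

# Hypothetical per-service needs, for illustration only.
SERVICE_NEEDS = {
    "nova": {
        "binaries": ["qemu-kvm", "libvirt-bin"],
        "python": ["/opt/stack/nova"],
    },
    "neutron": {
        "binaries": ["openvswitch-switch"],
        "python": ["/opt/stack/neutron"],
    },
}


def plan_install(enabled_services, needs=SERVICE_NEEDS):
    """Pure: compute what to install without touching the system."""
    binaries, python_bits = set(), set()
    for service in enabled_services:
        binaries.update(needs[service]["binaries"])
        python_bits.update(needs[service]["python"])
    return {"binaries": sorted(binaries), "python": sorted(python_bits)}


def apply_plan(plan, run=subprocess.check_call):
    """Impure: perform the installs, one invocation per tool."""
    if plan["binaries"]:
        run(["apt-get", "install", "-y"] + plan["binaries"])
    if plan["python"]:
        run(["pip", "install"] + plan["python"])


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    apply_plan(plan_install(["nova", "neutron"]), run=print)
```

Only apply_plan has side effects, so the conditional feature logic
lives in code that can be unit tested on any machine.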

> You are right, https://github.com/pypa/pip/issues/988 "Pip needs a
> dependency resolver" ends up being a problem for us regardless. And
> based on reading through it again tonight it seems pretty clear that a
> number of other complicated system level python programs are running
> headlong into the same issues.

Yup, and it's not super hard, just tedious to get right.
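
To give a feel for the core of the problem, here is a toy backtracking
resolver over a made-up in-memory index. It is a sketch only and not
how pip works: real resolvers also have to deal with version
specifiers, extras, markers and fetching metadata, which is where the
tedium comes in. Versions are compared as plain strings here, which is
a deliberate simplification.

```python
# Toy index: package -> version -> {dependency: set of acceptable versions}.
# Entirely made up for illustration.
INDEX = {
    "app": {"1.0": {"liba": {"1.0", "2.0"}, "libb": {"1.0"}}},
    "liba": {"2.0": {"libc": {"2.0"}}, "1.0": {"libc": {"1.0"}}},
    "libb": {"1.0": {"libc": {"1.0"}}},
    "libc": {"2.0": {}, "1.0": {}},
}


def resolve(requirements, index, chosen=None):
    """Return {package: version} satisfying all requirements, or None.

    requirements is a list of (package, allowed_versions) pairs.
    """
    chosen = dict(chosen or {})
    if not requirements:
        return chosen
    (name, allowed), rest = requirements[0], requirements[1:]
    if name in chosen:
        # Already pinned: the pin must satisfy this requirement too.
        if chosen[name] not in allowed:
            return None
        return resolve(rest, index, chosen)
    # Try the newest candidate first and backtrack on conflict.
    for version in sorted(index[name], reverse=True):
        if version not in allowed:
            continue
        deps = list(index[name][version].items())
        result = resolve(rest + deps, index, {**chosen, name: version})
        if result is not None:
            return result
    return None


if __name__ == "__main__":
    print(resolve([("app", {"1.0"})], INDEX))
```

In this index the newest liba needs a libc that libb cannot accept, so
the resolver has to back off to the older liba to find a consistent
set.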

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud

[openstack-dev] [all][pbr] splitting our deployment vs install dependencies
