[openstack-dev] [TripleO] our update story: can people live with it?

Robert Collins robertc at robertcollins.net
Thu Jan 23 22:48:32 UTC 2014


On 23 January 2014 09:27, Keith Basil <kbasil at redhat.com> wrote:

>>> 3) Deploy specific application-level updates via packages or tarballs (only selected applications/packages get deployed).
>>
>> ++. FWIW, #3 happens a heck of a lot more often than #1 or #2 in CD
>> environments, so this level of optimization will be frequently used.
>> And, as I've said before, optimizing for frequently-used scenarios is
>> worth spending the time on. Optimizing for infrequently-occurring
>> things... not so much. :)
>
> I don't understand the aversion to using existing, well-known tools to handle this.

If you're referring to operating system packages, I laid out the
overall case point by point in this talk at LCA -
http://mirror.linux.org.au/linux.conf.au/2014/Friday/111-Diskimage-builder_deep_dive_into_a_machine_compiler_-_Robert_Collins.mp4
- I'm not sure why the video is not linked in the programme
(https://lca2014.linux.org.au/programme/schedule/friday) yet.

> A hybrid model (blending 2 and 3, above) would, I think, work best here:
> TripleO lays down a baseline image, and the cloud operator employs a well-known
> and supported configuration tool for any small diffs.

In doing that they would sacrifice all testing of the 'small diffs' -
which is a great way to end up running a combination that doesn't
work.

> The operator would then be empowered to make the call on any major upgrade that
> would adversely impact the infrastructure (and ultimately the users/apps).  He/she
> could say: this is a major release, let's deploy the image.
>
> Something logically like this seems reasonable:
>
>         if (system_change > 10%) {
>           use TripleO;
>         } else {
>           use Existing_Config_Management;
>         }
>
> It seems disruptive to force compute (or other) nodes to reboot on trivial updates.

We won't be doing that :) - though a reboot-on-every-change model is
much simpler to reason about, so I believe some users will want it.

> If we are to get further enterprise adoption of OpenStack, this seems like a huge
> blocker.  It will be a very hard sell to get traditional IT folk to buy into
> this approach:
>
>         "Wait, *every* time I have to make a system change, I need to reboot my
>          entire cloud?"
>
> Elastic cloud concepts are already trying enough for the enterprise.

Sure, if you frame it like that it would be a scary and unreasonable
concept - they would quite reasonably ask 'what about all my running
VMs?' and other such questions :). It would also be a massive
exaggeration, as we *already* *only* do reboots on software changes,
not on last-mile configuration changes.
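
To make that distinction concrete, here is a rough sketch of the
dispatch logic (handle_change, deploy_image and apply_config are
hypothetical names for illustration, not our actual tooling):

    def handle_change(node, change):
        # A software change ships as a whole new image, which means a
        # reboot; a last-mile configuration change (passwords,
        # endpoints, and so on) is applied in place with no reboot.
        if change.is_software:
            deploy_image(node, change.image)   # hypothetical: reboot into new image
        else:
            apply_config(node, change.values)  # hypothetical: applied live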

With rolling upgrades there would be no end-user-visible downtime in
the *un*optimised case. The optimised case merely deploys faster.
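
A rolling upgrade is, in rough outline, just this loop (a sketch
only - every helper named here is a hypothetical placeholder, not our
implementation):

    def rolling_upgrade(nodes, new_image):
        # Upgrade one node at a time so the cloud as a whole stays up
        # and end users never lose their running VMs.
        for node in nodes:
            disable_scheduling(node)       # stop placing new VMs here
            live_migrate_vms_off(node)     # move running VMs elsewhere
            deploy_image(node, new_image)  # rebuild/reboot this node only
            wait_until_healthy(node)
            enable_scheduling(node)        # return the node to the pool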

Finally, remember that the undercloud machines have a small number of
files (~170K), which takes 28 seconds to stat cold, so rsyncing a
single changed file should take ~30 seconds. That's comparable to (if
not better than :)) the time to run yum or apt.
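
To illustrate, the optimised push could be as simple as this sketch
(sync_image_tree is a hypothetical helper; the paths and ssh user are
assumptions, and a real run would exclude volatile per-node state):

    import subprocess

    def sync_image_tree(host, src="/var/lib/images/current/", dest="/"):
        # rsync stats the whole tree (~28s cold for ~170K files) and
        # then transfers only the files that actually changed, so a
        # one-file update lands in roughly 30 seconds.
        subprocess.check_call([
            "rsync", "-aHx", "--delete",
            src, "root@%s:%s" % (host, dest),
        ])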

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


