[Openstack-operators] [openstack][nova] Several questions/experiences about _base directory on a big production environment

Joe Topjian joe at topjian.net
Thu Apr 3 03:28:54 UTC 2014


Is it Ceph live migration that you don't think is mature for production or
live migration in general? If the latter, I'd like to understand why you
feel that way.

Looping back to Alejandro's original message: I share his pain of _base
issues. It's happened to me before and it sucks.

We use shared storage for a production cloud of ours. The cloud has a 24x7
SLA and shared storage with live migration helps us achieve that. It's not
a silver bullet, but it has saved us so many hours of work.

The remove_unused_base_images option is stable and works. I still disagree
with the default value being "true", but I can vouch that it has worked
without harm for the past year in an environment where it previously shot
me in the foot.

With that option enabled, you should not have to go into _base at all. The
only work we do in _base is manual audits and the rare cleanup when the
database is inconsistent with what's actually hosted.
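
A minimal sketch of what such a manual audit could look like, assuming the
instance disks are qcow2 files whose backing file can be read with
`qemu-img info --output=json`. The function names are hypothetical and this
is not how nova's own image cache manager works. With a shared _base, the
list of backing files would have to be gathered from every compute node
before anything is treated as unused.

```python
import json
import os
import subprocess


def backing_file_of(disk_path):
    """Return the backing file recorded in a disk image, or None.

    Assumes `qemu-img` is installed. Newer QEMU versions may need extra
    flags to inspect a disk that a running guest has open.
    """
    out = subprocess.check_output(
        ["qemu-img", "info", "--output=json", disk_path])
    return json.loads(out).get("backing-filename")


def unreferenced_base_files(base_dir, backing_files):
    """List files in base_dir that no instance disk names as backing file."""
    referenced = {os.path.realpath(p) for p in backing_files if p}
    orphans = []
    for name in sorted(os.listdir(base_dir)):
        path = os.path.join(base_dir, name)
        if os.path.isfile(path) and os.path.realpath(path) not in referenced:
            orphans.append(path)
    return orphans
```

The result is only a list of candidates to look at by hand; nothing is
deleted.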

To mitigate potential _base issues, we just try to be as careful as
possible -- measure 5 times before cutting. Our standard procedure is to
move the files we plan on removing to a temporary directory and wait a few
days to see if any users raise an alarm.
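
The move-then-wait procedure above can be sketched roughly as follows. The
function names, the quarantine directory and the retention period are all
assumptions for illustration, not part of any nova tooling. Keeping the
quarantine directory on the same filesystem as _base should make the move
a plain rename, so putting a file back is quick if someone does raise an
alarm.

```python
import os
import shutil
import time


def quarantine(paths, quarantine_dir):
    """Move files into quarantine_dir instead of deleting them."""
    os.makedirs(quarantine_dir, exist_ok=True)
    moved = []
    for path in paths:
        target = os.path.join(quarantine_dir, os.path.basename(path))
        if os.path.exists(target):
            raise FileExistsError(target)
        shutil.move(path, target)
        os.utime(target, None)  # record the quarantine time as the mtime
        moved.append(target)
    return moved


def purge_expired(quarantine_dir, max_age_days, now=None):
    """Delete quarantined files whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for name in sorted(os.listdir(quarantine_dir)):
        path = os.path.join(quarantine_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(path)
    return removed
```

Only `purge_expired` actually deletes anything, and only after the waiting
period has passed.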

Diego has a great point about not using qemu backing files: if your backend
storage implements deduplication and/or compression, you should see the
same savings as what _base is trying to achieve.

We're in the process of building a new public cloud and made the decision
to not implement shared storage. I have a queue of blog posts that I'd love
to write, and the reasoning behind this decision is one of them. Very
briefly, the decision was based on the SLA that the public cloud will have
combined with our feeling that "cattle" instances are more acceptable to
the average end-user nowadays.

That's not to say that I'm "done" with shared storage. IMO, it all depends
on the environment. One great thing about OpenStack is that it can be
tailored to work in so many different environments.



On Wed, Apr 2, 2014 at 5:48 PM, matt <matt at nycresistor.com> wrote:

> there's shared storage on a centralized network filesystem... then there's
> shared storage on a distributed network filesystem.  thus the age old
> openafs vs nfs war is reborn.
>
> i'd check out ceph block device for live migration... but saying that...
> live migration has not achieved a maturity level that i'd even consider
> trying it in production.
>
> -matt
>
>
> On Wed, Apr 2, 2014 at 7:40 PM, Chris Friesen <chris.friesen at windriver.com> wrote:
>
>> So if you're recommending not using shared storage, what's your answer to
>> people asking for live-migration?  (Given that block migration is supposed
>> to be going away.)
>>
>> Chris
>>
>>
>> On 04/02/2014 05:08 PM, George Shuklin wrote:
>>
>>> Every time anyone starts to consolidate resources (shared storage,
>>> virtual chassis for routers, etc.), they consolidate all failures into
>>> one. One failure and every consolidated system joins the festival.
>>>
>>> Then they start to increase the fault tolerance of the consolidated
>>> system, raising the administrative bar to the sky, requesting more and
>>> more hardware for the clustering, requesting enterprise-grade ("no one
>>> was fired for buying enterprise <bullshit-brand-name-here>"). As a
>>> result, the consolidated system works with the same MTBF as the
>>> non-consolidated one, saving "costs" compared to an even more
>>> enterprise-grade super-solution with the cost of a few percent of a
>>> country's GDP, while actually costing more than the non-consolidated
>>> solution.
>>>
>>> Failure on x86 is ALWAYS an option. The processor cannot repeat
>>> instructions, there is no comparator between parallel processors, and
>>> so on. Compare that to mainframes. So, if failure is an option, reduce
>>> the importance of that failure: its scope.
>>>
>>> If one of 1k hosts goes down for three hours, that is sad. But it is
>>> much, much, much better than having a central system, which every one
>>> of the 1k hosts depends on, go down for just 11 seconds (3h*3600/1000).
>>>
>>> So the answer is simple: do not aggregate. Put _base on slower drives if
>>> you want to save costs, but do not consolidate failures.
>>>
>>> On 04/02/2014 09:04 PM, Alejandro Comisario wrote:
>>>
>>>> Hi guys ...
>>>> We have a pretty big OpenStack environment and we use a shared NFS to
>>>> populate the backing file directory (the famous _base directory located
>>>> at /var/lib/nova/instances/_base). Due to a human error, the backing
>>>> file used by thousands of guests was deleted, causing these guests to
>>>> go to a read-only filesystem in a second.
>>>>
>>>> Until that moment we were convinced that the _base directory should be
>>>> a shared NFS because:
>>>>
>>>> * spawning a new AMI makes it visible to the whole cloud, so instances
>>>> take almost no time to boot regardless of the nova region
>>>> * it eases the glance workload
>>>> * it is the easiest to manage: no need to replicate files constantly,
>>>> and no extra internal bandwidth usage
>>>>
>>>> But after this really big issue, and after what it took us to recover
>>>> from it, we started thinking about how to protect against this kind of
>>>> "single point of failure".
>>>> Our first approach these days was to make the NFS share read-only,
>>>> making it impossible for computes (and humans) to write to that
>>>> directory, and giving write permission to just one compute, which is
>>>> responsible for spawning an instance from a new AMI and writing the
>>>> file to the directory. Still, the storage remains the SPOF.
>>>>
>>>> So, we are considering the possibility of keeping the used backing
>>>> files LOCAL on every compute (+1K hosts) to reduce the chances of
>>>> failure to the minimum, obviously with a parallel discussion about
>>>> what technology to use to keep data replicated among computes when a
>>>> new AMI is launched, launch times, performance concerns on compute
>>>> nodes that have to store backing files locally, etc.
>>>>
>>>> This made me realize I have a huge community behind OpenStack, so I
>>>> wanted to hear from it:
>>>>
>>>> * what are your thoughts about what happened / what we are thinking
>>>> right now?
>>>> * how do other users manage the backing file (_base) directory, given
>>>> all these considerations, on big OpenStack deployments?
>>>>
>>>> I will be thrilled to read other users' experiences and thoughts.
>>>>
>>>> As always, best.
>>>> Alejandro
>>>>
>>>> _______________________________________________
>>>> OpenStack-operators mailing list
>>>> OpenStack-operators at lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>>>