<div dir="ltr"><div>Hi Alejandro,</div><div><br></div><div>In our case, although we use shared storage for volumes and application data, we keep the VMs' backing files (_base) on local disks.</div><div><br></div><div>To mitigate the space and performance issues, we adopted the following measures, of which standardizing images and minimizing the number of AMIs are very important:</div>


<div><br></div><div>* Minimizing the number of AMIs by standardizing common software into a golden image (which also improves deployment speed); we use only two standard, versioned AMIs. Only the latest version is used and cached. Any additional software installation is scripted and prepackaged in repositories.</div>


<div>* Cleaning old/unused backing files in the _base directory using the configurable nova-compute periodic task (new in Folsom) plus a custom script that cleans unresized/unsuffixed backing files. The latter is not necessary in Grizzly.</div>


<div>* Reserving a percentage of unallocatable space on each compute host specifically for _base files</div><div>* Using the glance-cache tools to cache images locally and avoid network usage at instance launch time; we cache only golden images</div>


<div>* Making better use of local disk space with RAID 0: instances are disposable, so we run lots of instances and get lots of redundancy at the app level</div><div><br></div><div>Cheers,</div><div><br></div><div><br></div><div>
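<div>P.S. To illustrate the reserved-space measure above, here is a rough Python sketch of the arithmetic only. The function names and the 20% figure are made up for illustration; they are not our actual tooling or values, and how the reservation is enforced on the host is left out.</div>
<pre>
```python
import shutil


def allocatable_bytes(total_bytes, base_reserve_pct):
    """Space left for instance disks after reserving a share for _base files.

    base_reserve_pct is a hypothetical tuning knob (percent of the disk),
    not a real nova option.
    """
    if base_reserve_pct >= 100 or 0 > base_reserve_pct:
        raise ValueError("base_reserve_pct must be at least 0 and below 100")
    reserved = int(total_bytes * base_reserve_pct / 100)
    return total_bytes - reserved


def host_allocatable_bytes(path, base_reserve_pct):
    """Same calculation, using the size of the filesystem that holds `path`."""
    return allocatable_bytes(shutil.disk_usage(path).total, base_reserve_pct)
```
</pre>
<div>For example, <code>allocatable_bytes(1000, 20)</code> leaves 800 bytes for instance disks and keeps 200 for _base.</div>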

<br></div><div>--</div><div>Gustavo Randich</div><div>Devop</div><div>Despegar.com</div><div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Apr 3, 2014 at 6:41 PM, Alejandro Comisario <span dir="ltr"><<a href="mailto:alejandro.comisario@mercadolibre.com" target="_blank">alejandro.comisario@mercadolibre.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I would love to hear from people who keep _base locally on the<br>

compute nodes instead of on shared storage: upsides, downsides, experiences,<br>

and comments.<br>

<br>

Keeping the base files on the same SATA disks where the VMs run seems<br>

like a big concern when decoupling _base from shared storage.<br>

<br>

best regards.<br>

<span class="HOEnZb"><font color="#888888">Alejandro<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

On Thu, Apr 3, 2014 at 11:19 AM, Alejandro Comisario<br>

<<a href="mailto:alejandro.comisario@mercadolibre.com">alejandro.comisario@mercadolibre.com</a>> wrote:<br>

><br>

> Thanks to everyone for the prompt responses!<br>

> It's clear that _base on NFS is not 100% the way to go when thinking<br>

> about avoiding disasters.<br>

> So, I believe it's good to start talking about not using _base backing<br>

> files and, IF using _base, about impressions and concerns regarding having<br>

> these files locally on the compute nodes, on the same disks where the VMs are<br>

> running (in our case, SATA disks).<br>

><br>

> That kind of discussion is, I think, the most relevant one.<br>

> What are the experiences of running the backing files locally on the<br>

> same compute node where the VMs are running?<br>

><br>

> best<br>

> Alejandro Comisario<br>

><br>

> On Thu, Apr 3, 2014 at 12:28 AM, Joe Topjian <<a href="mailto:joe@topjian.net">joe@topjian.net</a>> wrote:<br>

> > Is it Ceph live migration that you don't think is mature for production or<br>

> > live migration in general? If the latter, I'd like to understand why you<br>

> > feel that way.<br>

> ><br>

> > Looping back to Alejandro's original message: I share his pain of _base<br>

> > issues. It's happened to me before and it sucks.<br>

> ><br>

> > We use shared storage for a production cloud of ours. The cloud has a 24x7<br>

> > SLA and shared storage with live migration helps us achieve that. It's not a<br>

> > silver bullet, but it has saved us so many hours of work.<br>

> ><br>

> > The remove_unused_base_images option is stable and works. I still disagree<br>

> > with the default value being "true", but I can vouch that it has worked<br>

> > without harm for the past year in an environment where it previously shot me<br>

> > in the foot.<br>

> ><br>

> > With that option enabled, you should not have to go into _base at all. Any<br>

> > work that we do in _base is manual audits and the rare time when the<br>

> > database might be inconsistent with what's really hosted.<br>

> ><br>

> > To mitigate against potential _base issues, we just try to be as careful as<br>

> > possible -- measure 5 times before cutting. Our standard procedure is to<br>

> > move the files we plan on removing to a temporary directory and wait a few<br>

> > days to see if any users raise an alarm.<br>

> ><br>

> > Diego has a great point about not using qemu backing files: if your backend<br>

> > storage implements deduplication and/or compression, you should see the same<br>

> > savings as what _base is trying to achieve.<br>

> ><br>

> > We're in the process of building a new public cloud and made the decision to<br>

> > not implement shared storage. I have a queue of blog posts that I'd love to<br>

> > write and the thoughts behind this decision is one of them. Very briefly,<br>

> > the decision was based on the SLA that the public cloud will have combined<br>

> > with our feeling that "cattle" instances are more acceptable to the average<br>

> > end-user nowadays.<br>

> ><br>

> > That's not to say that I'm "done" with shared storage. IMO, it all depends<br>

> > on the environment. One great thing about OpenStack is that it can be<br>

> > tailored to work in so many different environments.<br>

> ><br>

> ><br>

> ><br>

> > On Wed, Apr 2, 2014 at 5:48 PM, matt <<a href="mailto:matt@nycresistor.com">matt@nycresistor.com</a>> wrote:<br>

> >><br>

> >> there's shared storage on a centralized network filesystem... then there's<br>

> >> shared storage on a distributed network filesystem.  thus the age old<br>

> >> openafs vs nfs war is reborn.<br>

> >><br>

> >> i'd check out ceph block device for live migration... but saying that...<br>

> >> live migration has not achieved a maturity level that i'd even consider<br>

> >> trying it in production.<br>

> >><br>

> >> -matt<br>

> >><br>

> >><br>

> >> On Wed, Apr 2, 2014 at 7:40 PM, Chris Friesen<br>

> >> <<a href="mailto:chris.friesen@windriver.com">chris.friesen@windriver.com</a>> wrote:<br>

> >>><br>

> >>> So if you're recommending not using shared storage, what's your answer to<br>

> >>> people asking for live-migration?  (Given that block migration is supposed<br>

> >>> to be going away.)<br>

> >>><br>

> >>> Chris<br>

> >>><br>

> >>><br>

> >>> On 04/02/2014 05:08 PM, George Shuklin wrote:<br>

> >>>><br>

> >>>> Every time anyone starts to consolidate resources (shared storage,<br>

> >>>> virtual chassis for routers, etc.), they consolidate all failures into one.<br>

> >>>> One failure, and every consolidated system joins the festival.<br>

> >>>><br>

> >>>> Then they start to increase the fault tolerance of the consolidated system,<br>

> >>>> raising the administrative bar to the sky, requesting more and more<br>

> >>>> hardware for the clustering, requesting enterprise-grade, "no one was<br>

> >>>> fired for buying enterprise <bullshit-brand-name-here>". As a result,<br>

> >>>> the consolidated system works with the same MTBF as a non-consolidated one, saving<br>

> >>>> "costs" compared to an even more enterprise-grade super-solution costing<br>

> >>>> a few percent of a country's GDP, while actually costing more than<br>

> >>>> the non-consolidated solution.<br>

> >>>><br>

> >>>> Failure on x86 is ALWAYS an option. The processor cannot repeat instructions,<br>

> >>>> there is no comparator between parallel processors, and so on. Compare that to<br>

> >>>> mainframes. So, if failure is an option, reduce the importance<br>

> >>>> of that failure: its scope.<br>

> >>>><br>

> >>>> If one of 1k hosts goes down for three hours, that is sad. But it is much,<br>

> >>>> much, much better than the central system that all 1k hosts depend on going<br>

> >>>> down for just 11 seconds (3h*3600/1000).<br>

> >>>><br>

> >>>> So the answer is simple: do not aggregate. Put _base on slower drives if you<br>

> >>>> want to save costs, but do not consolidate failures.<br>

> >>>><br>

> >>>> On 04/02/2014 09:04 PM, Alejandro Comisario wrote:<br>

> >>>>><br>

> >>>>> Hi guys ...<br>

> >>>>> We have a pretty big OpenStack environment and we use a shared NFS to<br>

> >>>>> populate the backing file directory (the famous _base directory located<br>

> >>>>> at /var/lib/nova/instances/_base). Due to a human error, the backing<br>

> >>>>> file used by thousands of guests was deleted, causing these guests to<br>

> >>>>> switch to a read-only filesystem in a second.<br>

> >>>>><br>

> >>>>> Until that moment we were convinced that the _base directory belonged on a<br>

> >>>>> shared NFS because:<br>

> >>>>><br>

> >>>>> * a newly spawned AMI is immediately visible to the whole cloud, so<br>

> >>>>> instances take no time to boot regardless of the nova region<br>

> >>>>> * it eases the glance workload<br>

> >>>>> * it is the easiest to manage: no need to replicate files constantly, and<br>

> >>>>> no extra internal bandwidth usage<br>

> >>>>><br>

> >>>>> But after this really big issue, and after what it took us to recover<br>

> >>>>> from it, we were thinking about how to protect against this kind of<br>

> >>>>> "single point of failure".<br>

> >>>>> Our first approach these days was to make the NFS share read-only, making it<br>

> >>>>> impossible for computes (and humans) to write to that directory, and<br>

> >>>>> giving write permission to just one compute, which is responsible for spawning<br>

> >>>>> an instance from a new AMI and writing the file to the directory. Still,<br>

> >>>>> the storage remains the SPOF.<br>

> >>>>><br>

> >>>>> So, we are considering keeping the backing files in use<br>

> >>>>> LOCAL on every compute (+1K hosts) to reduce the chances of failure to<br>

> >>>>> the minimum, obviously with a parallel discussion about what technology to<br>

> >>>>> use to keep data replicated among computes when a new AMI is launched,<br>

> >>>>> launch times, the performance impact on compute nodes that have to store<br>

> >>>>> backing files locally, etc.<br>

> >>>>><br>

> >>>>> This made me realize I have a huge community behind OpenStack, so I<br>

> >>>>> wanted to hear from it:<br>

> >>>>><br>

> >>>>> * what are your thoughts about what happened / what we are thinking<br>

> >>>>> right now?<br>

> >>>>> * how do other users manage the backing file (_base) directory,<br>

> >>>>> given all these considerations, on big OpenStack deployments?<br>

> >>>>><br>

> >>>>> I will be thrilled to read other users' experiences and thoughts.<br>

> >>>>><br>

> >>>>> As always, best.<br>

> >>>>> Alejandro<br>

> >>>>><br>

> >>>>> _______________________________________________<br>

> >>>>> OpenStack-operators mailing list<br>

> >>>>> <a href="mailto:OpenStack-operators@lists.openstack.org">OpenStack-operators@lists.openstack.org</a><br>

> >>>>> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a><br>


<br>

_______________________________________________<br>

Mailing list: <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack</a><br>

Post to     : <a href="mailto:openstack@lists.openstack.org">openstack@lists.openstack.org</a><br>

Unsubscribe : <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack</a><br>

</div></div></blockquote></div><br></div>