[Openstack-operators] /var/lib/nova/instances fs filled up corrupting my Linux instances

Joe Topjian joe.topjian at cybera.ca
Thu Mar 14 14:51:35 UTC 2013


On Thu, Mar 14, 2013 at 9:32 AM, Michael Still <mikal at stillhq.com> wrote:

> On Thu, Mar 14, 2013 at 10:23 AM, Joe Topjian <joe.topjian at cybera.ca>
> wrote:
>
> > The scenario that I ran into had these conditions:
> >
> > 1. Using shared storage
> > 2. All instances (1 or more) of a certain image or snapshot are running
> > on one compute node
> > 3. That compute node is taken down for 10 minutes or so (reboot,
> > maintenance, etc.)
> > 4. That compute node is unable to mark its _base files as being in use
> > since it's offline
> > 5. Other compute nodes see that those _base files are not in use and
> > delete them
> > 6. The compute node comes back online and the image/snapshot in question
> > is now broken
> >
> > Has that scenario been accounted for or fixed?
>
> I'll check. What is the launchpad bug id?


https://bugs.launchpad.net/nova/+bug/1126375

Admittedly, the bug report does not explain the scenario in detail, but I
noted "No matter how many precautions are taken, some scenarios will still
slip by." which I still firmly believe. My intention was to push for the
cleanup to be turned off by default before discussing the possible ways it
wouldn't work as expected. I felt that if I simply described the scenario,
that single scenario would be accounted for, but no thought would go into
the other ways it could happen (I feel this is what happened with the
NeCTAR incident).
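
For illustration only, here is a minimal Python sketch of the scenario
quoted above. It is a toy model and not nova's actual cleanup code: every
name in it (ComputeNode, cleanup_unused, broken_instances, the image and
instance names) is hypothetical, and it ignores any minimum-age checks a
real cleanup task might apply before deleting a file.

```python
# Toy model of the shared-storage race described above.
# This is NOT nova's code; all names here are hypothetical.

class ComputeNode:
    def __init__(self, name, instances):
        self.name = name
        self.online = True
        # Maps instance name -> the _base file backing that instance.
        self.instances = dict(instances)

    def base_files_in_use(self):
        # An offline node cannot mark anything as being in use.
        if not self.online:
            return set()
        return set(self.instances.values())


def cleanup_unused(shared_base, nodes):
    """Delete every _base file that no reachable node reports as in use."""
    in_use = set()
    for node in nodes:
        in_use |= node.base_files_in_use()
    removed = shared_base - in_use
    shared_base.difference_update(removed)  # mutate the shared store in place
    return removed


def broken_instances(shared_base, node):
    """Instances on a node whose backing _base file no longer exists."""
    return sorted(
        inst for inst, base in node.instances.items()
        if base not in shared_base
    )


# Shared storage holds two _base files; all users of img-a live on c1.
shared_base = {"img-a", "img-b"}
node1 = ComputeNode("c1", {"vm1": "img-a", "vm2": "img-a"})
node2 = ComputeNode("c2", {"vm3": "img-b"})

node1.online = False                                   # step 3: c1 goes down
removed = cleanup_unused(shared_base, [node1, node2])  # step 5: cleanup runs
node1.online = True                                    # step 6: c1 comes back
broken = broken_instances(shared_base, node1)

print("removed:", removed)
print("broken instances on c1:", broken)
```

In this model the cleanup is only as safe as the set of nodes that can
report in, which is why I would rather see it off unless an operator
turns it on.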

I fully admit to being difficult with this, but it's something I believe
strongly in. I have never run into another service or package that has a
task enabled by default which deletes (rather than archives or recycles)
data. I am all for these types of cleanup tasks, but feel they must be
opt-in.


>
> Michael
>



-- 
Joe Topjian
Systems Administrator
Cybera Inc.

www.cybera.ca

Cybera is a not-for-profit organization that works to spur and support
innovation, for the economic benefit of Alberta, through the use
of cyberinfrastructure.