[openstack-dev] [Fuel] Diagnostic snapshot generation is broken due to lack of disk space

Artem Panchenko apanchenko at mirantis.com
Mon Jan 25 16:09:51 UTC 2016


Guys,

I'd like to draw your attention to the fact that we need not only to fix
the snapshot generation issue, but also to prevent the unexpected service
failures it causes (see details in a duplicate [0] of the original bug
[1]). These would be a challenge for inexperienced users: for example, a
user won't be able to authenticate in the GUI or CLI some time after
snapshot generation has started. I hit this issue on our bare-metal lab
(10 slaves) every time an environment has been running for more than 2 days.

Using links to copy files doesn't help in this case, because the tarball
is still saved on the /var partition. Also, working around this issue
takes a lot of steps: shrink the 'os-varlog' volume (because it's the
biggest one [2]) to increase unallocated disk space, resize its FS,
create a new volume, create an FS on it, mount it at
/var/www/nailgun/dump, and update fstab. Not an easy way to make the
"Generate Diagnostic Snapshot" button work, right?
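
Before touching any volumes, it may help to confirm where the dump
directory actually lives and how much room each location has. A minimal
sketch, assuming Python 3.6+ is available on the master node; the helper
names are hypothetical and the two paths are simply the ones mentioned in
this thread:

```python
import os
import shutil


def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same mounted filesystem."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def free_gib(path):
    """Free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # Paths taken from this thread; adjust them for your master node.
    dump_dir = "/var/www/nailgun/dump"
    log_dir = "/var/log"
    for path in (dump_dir, log_dir):
        if os.path.exists(path):
            print(f"{path}: {free_gib(path):.1f} GiB free")
    if os.path.exists(dump_dir) and os.path.exists(log_dir):
        print("same filesystem:", same_filesystem(dump_dir, log_dir))
```

If the two paths report different filesystems, the tarball cannot benefit
from the free space on the log volume, whatever kind of link is used while
collecting files.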

So, if we are going to address this diagnostic snapshot issue in 8.0, I 
want to remind you about option b) :)

> b) Make the snapshot location share the diskspace of /var/log?
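
Whichever location is chosen, the pre-flight check Oleg proposed in the
quoted thread (verify the size first, refuse if space is insufficient)
could look roughly like the sketch below. This is not shotgun code: the
function names and the 1 GiB reserve are assumptions made for
illustration, and the estimate ignores compression, so it is only a
conservative upper bound.

```python
import os
import shutil
import stat


def estimate_size(paths):
    """Sum the apparent size, in bytes, of all regular files under `paths`."""
    total = 0
    for root_path in paths:
        if os.path.isfile(root_path):
            total += os.stat(root_path).st_size
            continue
        for dirpath, _dirnames, filenames in os.walk(root_path):
            for name in filenames:
                full = os.path.join(dirpath, name)
                try:
                    st = os.lstat(full)  # lstat: do not follow symlinks
                except OSError:
                    continue  # file vanished or is unreadable; skip it
                if stat.S_ISREG(st.st_mode):
                    total += st.st_size
    return total


def can_create_snapshot(sources, dest_dir, reserve_bytes=1 << 30):
    """Return (ok, needed, free) for a snapshot of `sources` in `dest_dir`.

    `ok` is False when the estimated size plus a safety reserve does not
    fit into the free space of the filesystem holding `dest_dir`.
    """
    needed = estimate_size(sources)
    free = shutil.disk_usage(dest_dir).free
    return needed + reserve_bytes <= free, needed, free
```

A caller would run can_create_snapshot(["/var/log"], "/var/www/nailgun/dump")
and report an error to the user instead of starting the dump when the first
element of the result is False.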

Thanks!

[0] https://bugs.launchpad.net/fuel/+bug/1530131
[1] https://bugs.launchpad.net/fuel/+bug/1529182
[2] http://paste.openstack.org/show/484895/

On 18.01.16 13:20, Maciej Kwiek wrote:
> Igor: It seems that fqdn -> ipaddr will indeed be resolved. Please 
> share your feedback in review: https://review.openstack.org/#/c/266964/3
>
> On Fri, Jan 15, 2016 at 4:25 PM, Igor Kalnitsky 
> <ikalnitsky at mirantis.com> wrote:
>
>     Sheena -
>
>     What do you mean by *targeted*? Shotgun's designed to be a *targeted*
>     solution. If someone wants more *precise* targets - it's easy to
>     specify them in Nailgun's settings.yaml.
>
>     - Igor
>
>     On Fri, Jan 15, 2016 at 5:02 PM, Sheena Gregson
>     <sgregson at mirantis.com> wrote:
>     > I’ve also seen the request multiple times to be able to provide more
>     > targeted snapshots which might also (partially) solve this problem as
>     > it would require significantly less disk space to grab logs from a
>     > subset of nodes for a specific window of time, instead of the more
>     > robust grab-all solution we have now.
>     >
>     >
>     >
>     > From: Maciej Kwiek [mailto:mkwiek at mirantis.com]
>     > Sent: Thursday, January 14, 2016 5:59 AM
>     > To: OpenStack Development Mailing List (not for usage questions)
>     > <openstack-dev at lists.openstack.org>
>     > Subject: Re: [openstack-dev] [Fuel] Diagnostic snapshot generation is
>     > broken due to lack of disk space
>     >
>     >
>     >
>     > Igor,
>     >
>     >
>     >
>     > I will investigate this, thanks!
>     >
>     >
>     >
>     > Artem,
>     >
>     >
>     >
>     > I guess that if we have an untrusted user on master node, he could
>     > just put something he wants to be in the snapshot in /var/log without
>     > having to time the attack carefully with tar execution.
>     >
>     >
>     >
>     > I want to use links for directories, this saves me the trouble of
>     > creating hardlinks for every single file in the directory. Although
>     > with how exclusion is currently implemented it can cause deleting log
>     > files from original directories, need to check this out.
>     >
>     >
>     >
>     > About your PS: whole /var/log on master node (not in container) is
>     > currently downloaded, I think we shouldn't change this as we plan to
>     > drop containers in 9.0.
>     >
>     >
>     >
>     > Cheers,
>     >
>     > Maciej
>     >
>     >
>     >
>     > On Thu, Jan 14, 2016 at 12:32 PM, Artem Panchenko
>     > <apanchenko at mirantis.com> wrote:
>     >
>     > Hi,
>     >
>     > using symlinks is a bit dangerous, here is a quote from the manual
>     > you mentioned [0]:
>     >
>     >> The `--dereference' option is unsafe if an untrusted user can modify
>     >> directories while tar is running.
>     >
>     > Hard links usage is much safer, because you can't use them for
>     > directories. But at the same time implementation in shotgun would be
>     > more complicated than with symlinks.
>     >
>     > Anyway, in order to determine what linking to use we need to decide
>     > where (/var/log or another partition) diagnostic snapshot will be
>     > stored.
>     >
>     > p.s.
>     >
>     >> This doesn't really give us much right now, because most of the logs
>     >> are fetched from master node via ssh due to shotgun being run in
>     >> mcollective container
>     >
>     >
>     >
>     > AFAIK '/var/log/docker-logs/' is available from mcollective container
>     > and mounted to /var/log/:
>     >
>     > [root at fuel-lab-cz5557 ~]# dockerctl shell mcollective mount -l | grep os-varlog
>     > /dev/mapper/os-varlog on /var/log type ext4 (rw,relatime,stripe=128,data=ordered)
>     >
>     > From my experience '/var/log/docker-logs/remote' folder is most
>     > 'heavy' thing in snapshot.
>     >
>     > [0] http://www.gnu.org/software/tar/manual/html_node/dereference.html
>     >
>     > Thanks!
>     >
>     >
>     >
>     > On 14.01.16 13:00, Igor Kalnitsky wrote:
>     >
>     >> I took a glance on Maciej's patch and it adds a switch to tar command
>     >> to make it follow symbolic links
>     >
>     > Yeah, that should work. Except one thing - we previously had fqdn ->
>     > ipaddr links in snapshots. So now they will be resolved into full
>     > copy?
>     >
>     >> I meant that symlinks also give us the benefit of not using additional
>     >> space (just as hardlinks do) while being able to link to files from
>     >> different filesystems.
>     >
>     > I'm sorry, I got you wrong. :)
>     >
>     >
>     >
>     > - Igor
>     >
>     >
>     >
>     > On Thu, Jan 14, 2016 at 12:34 PM, Maciej Kwiek
>     > <mkwiek at mirantis.com> wrote:
>     >
>     > Igor,
>     >
>     >
>     >
>     > I meant that symlinks also give us the benefit of not using additional
>     > space (just as hardlinks do) while being able to link to files from
>     > different filesystems.
>     >
>     > Also, as Bartłomiej pointed out the `h` switch for tar should do the
>     > trick [1].
>     >
>     >
>     >
>     > Cheers,
>     >
>     > Maciej
>     >
>     >
>     >
>     > [1] http://www.gnu.org/software/tar/manual/html_node/dereference.html
>     >
>     >
>     >
>     > On Thu, Jan 14, 2016 at 11:22 AM, Bartlomiej Piotrowski
>     > <bpiotrowski at mirantis.com> wrote:
>     >
>     > Igor,
>     >
>     >
>     >
>     > I took a glance on Maciej's patch and it adds a switch to tar command
>     > to make it follow symbolic links, so it looks good to me.
>     >
>     >
>     >
>     > Bartłomiej
>     >
>     >
>     >
>     > On Thu, Jan 14, 2016 at 10:39 AM, Igor Kalnitsky
>     > <ikalnitsky at mirantis.com> wrote:
>     >
>     > Hey Maciej -
>     >
>     >> About hardlinks - wouldn't it be better to use symlinks?
>     >> This way we don't occupy more space than necessary
>     >
>     > AFAIK, hardlinks won't occupy much space. They are the links, after
>     > all. :)
>     >
>     > As for symlinks, I'm afraid shotgun (and fabric underneath) won't
>     > resolve them and the links will get into the snapshot as is. That means
>     > if the content they point to is not in the snapshot, they are simply
>     > useless. Needs to be checked, though.
>     >
>     >
>     >
>     > - Igor
>     >
>     >
>     >
>     > On Thu, Jan 14, 2016 at 10:31 AM, Maciej Kwiek
>     > <mkwiek at mirantis.com> wrote:
>     >
>     > Thanks for your insight guys!
>     >
>     >
>     >
>     > I agree with Oleg, I will see what I can do to make this work this
>     > way.
>     >
>     > About hardlinks - wouldn't it be better to use symlinks? This way we
>     > don't occupy more space than necessary, and we can link to files and
>     > directories that are in other block device than /var. Please see [1]
>     > review for a proposed change that introduces symlinks.
>     >
>     > This doesn't really give us much right now, because most of the logs
>     > are fetched from master node via ssh due to shotgun being run in
>     > mcollective container, but it's something! When we remove containers,
>     > this will prove more useful.
>     >
>     >
>     >
>     > Regards,
>     >
>     > Maciej Kwiek
>     >
>     >
>     >
>     > [1] https://review.openstack.org/#/c/266964/
>     >
>     >
>     >
>     > On Tue, Jan 12, 2016 at 1:51 PM, Oleg Gelbukh
>     > <ogelbukh at mirantis.com> wrote:
>     >
>     > I think we need to find a way to:
>     >
>     >
>     >
>     > 1) verify the size of snapshot without actually making it and compare
>     > to the available disk space beforehand.
>     >
>     > 2) refuse to create snapshot if space is insufficient and notify user
>     > (otherwise it breaks Admin node as we have seen)
>     >
>     > 3) provide a way to prioritize elements of the snapshot and exclude
>     > them based on the priorities or user choice.
>     >
>     >
>     >
>     > This will allow for better and safer UX with the snapshot.
>     >
>     >
>     >
>     > --
>     >
>     > Best regards,
>     >
>     > Oleg Gelbukh
>     >
>     >
>     >
>     > On Tue, Jan 12, 2016 at 1:47 PM, Maciej Kwiek
>     > <mkwiek at mirantis.com> wrote:
>     >
>     > Hi!
>     >
>     >
>     >
>     > I need some advice on how to tackle this issue. There is a bug [1]
>     > describing the problem with creating a diagnostic snapshot. The issue
>     > is that /var/log has 100GB available, while /var (where diagnostic
>     > snapshot is being generated - /var/www/nailgun/dump/fuel-snapshot
>     > according to [2]) has 10GB available, so dumping the logs can be an
>     > issue when logs size exceeds free space in /var.
>     >
>     > There are several things we could do, but I am unsure on which course
>     > to take. Should we
>     >
>     > a) Allocate more disk space for /var/www (or for whole /var)?
>     >
>     > b) Make the snapshot location share the diskspace of /var/log?
>     >
>     > c) Something else? What?
>     >
>     >
>     >
>     > Please share your thoughts on this.
>     >
>     >
>     >
>     > Cheers,
>     >
>     > Maciej Kwiek
>     >
>     >
>     >
>     > [1] https://bugs.launchpad.net/fuel/+bug/1529182
>     >
>     > [2] https://github.com/openstack/fuel-web/blob/2855a9ba925c146b4802ab3cd2185f1dce2d8a6a/nailgun/nailgun/settings.yaml#L717
>     >
>     >
>     >
>     >
>     >
>     >
>     >
>     >
>     __________________________________________________________________________
>     >
>     > OpenStack Development Mailing List (not for usage questions)
>     >
>     > Unsubscribe:
>     >
>     > OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>     <http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
>     >
>     > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>     >
>     > --
>     >
>     > Artem Panchenko
>     >
>     > QA Engineer

-- 
Artem Panchenko
QA Engineer
