[openstack-dev] [Fuel] Diagnostic snapshot generation is broken due to lack of disk space

Maciej Kwiek mkwiek at mirantis.com
Tue Jan 26 09:27:49 UTC 2016


Artem,

[1] and [2] are changes that do exactly this - snapshot is created in
/var/log. Please review these changes if you haven't already.

Cheers,
Maciej

[1] https://review.openstack.org/#/c/270823/
[2] https://review.openstack.org/#/c/271179/

On Mon, Jan 25, 2016 at 5:09 PM, Artem Panchenko <apanchenko at mirantis.com>
wrote:

> Guys,
>
> I want to draw your attention to the fact that we need not only to fix the
> snapshot generation issue, but also to prevent the unexpected service
> failures it causes (see details in a duplicate [0] of the original [1] bug),
> which would be a challenge for inexperienced users (for example, they won't
> be able to authenticate in the GUI or CLI for some time after snapshot
> generation is started). I hit this issue on our bare-metal lab (10 slaves)
> each time I have an environment that has been running for more than 2 days.
>
> Using links for file copying doesn't help in this case, because the tarball
> is still saved on the /var partition. Also, if I want to work around this
> issue, I have to perform a lot of actions: shrink the 'os-varlog' volume
> (because it's the biggest [2] one) in order to increase unallocated disk
> space, resize its FS, create a new volume, create an FS, mount it to
> /var/www/nailgun/dump and update fstab. Not an easy way to make the
> "Generate Diagnostic Snapshot" button work, right?
>
> So, if we are going to address this diagnostic snapshot issue in 8.0, I
> want to remind you about option b) :)
>
> b) Make the snapshot location share the diskspace of /var/log?
>
>
> Thanks!
>
> [0] https://bugs.launchpad.net/fuel/+bug/1530131
> [1] https://bugs.launchpad.net/fuel/+bug/1529182
> [2] http://paste.openstack.org/show/484895/
>
>
> On 18.01.16 13:20, Maciej Kwiek wrote:
>
> Igor: It seems that fqdn -> ipaddr will indeed be resolved. Please share
> your feedback in review:
> https://review.openstack.org/#/c/266964/3
>
> On Fri, Jan 15, 2016 at 4:25 PM, Igor Kalnitsky <ikalnitsky at mirantis.com>
> wrote:
>
>> Sheena -
>>
>> What do you mean by *targeted*? Shotgun's designed to be a *targeted*
>> solution. If someone wants more *precise* targets - it's easy to
>> specify them in Nailgun's settings.yaml.
>>
>> - Igor
>>
>> On Fri, Jan 15, 2016 at 5:02 PM, Sheena Gregson <sgregson at mirantis.com>
>> wrote:
>> > I’ve also seen the request multiple times to be able to provide more
>> > targeted snapshots which might also (partially) solve this problem as it
>> > would require significantly less disk space to grab logs from a subset of
>> > nodes for a specific window of time, instead of the more robust grab-all
>> > solution we have now.
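
The targeted-snapshot idea above could be sketched as a filter over log files by node and modification time. This is a minimal Python sketch and not shotgun's code: the function name `select_logs`, the one-directory-per-node layout under `log_root`, and the epoch-second time window are all assumptions made for illustration.

```python
import os


def select_logs(log_root, nodes, since, until):
    """Yield log files under log_root/<node>/ whose mtime is in [since, until].

    The per-node directory layout is an assumption of this sketch;
    since and until are epoch seconds.
    """
    for node in nodes:
        node_dir = os.path.join(log_root, node)
        # os.walk silently yields nothing if node_dir does not exist.
        for root, _dirs, files in os.walk(node_dir):
            for name in sorted(files):
                path = os.path.join(root, name)
                try:
                    mtime = os.path.getmtime(path)
                except OSError:
                    # The file was rotated or removed while we were walking.
                    continue
                if since <= mtime <= until:
                    yield path
```

Only the selected paths would then need to be archived, which is where the disk-space saving would come from.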
>> >
>> >
>> >
>> > From: Maciej Kwiek [mailto:mkwiek at mirantis.com]
>> > Sent: Thursday, January 14, 2016 5:59 AM
>> > To: OpenStack Development Mailing List (not for usage questions)
>> > <openstack-dev at lists.openstack.org>
>> > Subject: Re: [openstack-dev] [Fuel] Diagnostic snapshot generation is
>> > broken due to lack of disk space
>> >
>> >
>> >
>> > Igor,
>> >
>> >
>> >
>> > I will investigate this, thanks!
>> >
>> >
>> >
>> > Artem,
>> >
>> >
>> >
>> > I guess that if we have an untrusted user on the master node, he could just
>> > put whatever he wants to be in the snapshot in /var/log, without having to
>> > time the attack carefully with the tar execution.
>> >
>> > I want to use links for directories; this saves me the trouble of creating
>> > hardlinks for every single file in the directory. However, given how
>> > exclusion is currently implemented, this could cause log files to be deleted
>> > from the original directories. I need to check this.
>> >
>> > About your PS: the whole /var/log on the master node (not in a container) is
>> > currently downloaded. I think we shouldn't change this, as we plan to drop
>> > containers in 9.0.
>> >
>> >
>> >
>> > Cheers,
>> >
>> > Maciej
>> >
>> >
>> >
>> > On Thu, Jan 14, 2016 at 12:32 PM, Artem Panchenko <apanchenko at mirantis.com>
>> > wrote:
>> >
>> > Hi,
>> >
>> > using symlinks is a bit dangerous; here is a quote from the manual page
>> > you mentioned [0]:
>> >
>> >> The `--dereference' option is unsafe if an untrusted user can modify
>> >> directories while tar is running.
>> >
>> > Using hard links is much safer, because you can't use them for directories.
>> > But at the same time, the implementation in shotgun would be more
>> > complicated than with symlinks.
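
The difference between the two link types discussed here can be seen in a small Python sketch. It is illustrative only and unrelated to shotgun's implementation; the function name `link_demo` and the file names are made up, and it assumes a POSIX filesystem such as the ones on the Fuel master node.

```python
import os
import tempfile


def link_demo():
    """Create a hard link and a symlink to one file and report what each is."""
    workdir = tempfile.mkdtemp()
    original = os.path.join(workdir, "app.log")
    with open(original, "w") as f:
        f.write("log line\n")

    hard = os.path.join(workdir, "hard.log")
    soft = os.path.join(workdir, "soft.log")
    os.link(original, hard)     # second name for the same inode, no data copied
    os.symlink(original, soft)  # separate tiny file that stores the target path

    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink
    soft_is_link = os.path.islink(soft)

    # Hard-linking a directory is refused by the operating system.
    try:
        os.link(workdir, os.path.join(workdir, "dirlink"))
        dir_hardlink_allowed = True
    except OSError:
        dir_hardlink_allowed = False

    return same_inode, link_count, soft_is_link, dir_hardlink_allowed
```

A hard link also has to live on the same filesystem as its target, which is why it cannot bridge /var and /var/log when they are separate volumes, while a symlink can.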
>> >
>> > Anyway, in order to determine which kind of link to use, we need to decide
>> > where (/var/log or another partition) the diagnostic snapshot will be stored.
>> >
>> > p.s.
>> >
>> >> This doesn't really give us much right now, because most of the logs are
>> >> fetched from master node via ssh due to shotgun being run in mcollective
>> >> container
>> >
>> >
>> >
>> > AFAIK '/var/log/docker-logs/' is available from the mcollective container
>> > and is mounted to /var/log/:
>> >
>> > [root at fuel-lab-cz5557 ~]# dockerctl shell mcollective mount -l | grep os-varlog
>> > /dev/mapper/os-varlog on /var/log type ext4 (rw,relatime,stripe=128,data=ordered)
>> >
>> > From my experience, the '/var/log/docker-logs/remote' folder is the
>> > 'heaviest' thing in the snapshot.
>> >
>> > [0] http://www.gnu.org/software/tar/manual/html_node/dereference.html
>> >
>> > Thanks!
>> >
>> >
>> >
>> > On 14.01.16 13:00, Igor Kalnitsky wrote:
>> >
>> > > I took a glance on Maciej's patch and it adds a switch to tar command
>> > > to make it follow symbolic links
>> >
>> > Yeah, that should work. Except one thing - we previously had fqdn ->
>> > ipaddr links in snapshots. So now they will be resolved into full
>> > copies?
>> >
>> > > I meant that symlinks also give us the benefit of not using additional
>> > > space (just as hardlinks do) while being able to link to files from
>> > > different filesystems.
>> >
>> > I'm sorry, I got you wrong. :)
>> >
>> >
>> >
>> > - Igor
>> >
>> >
>> >
>> > On Thu, Jan 14, 2016 at 12:34 PM, Maciej Kwiek <mkwiek at mirantis.com> wrote:
>> >
>> > Igor,
>> >
>> >
>> >
>> > I meant that symlinks also give us the benefit of not using additional space
>> > (just as hardlinks do) while being able to link to files from different
>> > filesystems.
>> >
>> > Also, as Bartłomiej pointed out, the `h` switch for tar should do the trick
>> > [1].
>> >
>> >
>> >
>> > Cheers,
>> >
>> > Maciej
>> >
>> >
>> >
>> > [1] http://www.gnu.org/software/tar/manual/html_node/dereference.html
>> >
>> >
>> >
>> > On Thu, Jan 14, 2016 at 11:22 AM, Bartlomiej Piotrowski
>> > <bpiotrowski at mirantis.com> wrote:
>> >
>> > Igor,
>> >
>> >
>> >
>> > I took a glance at Maciej's patch and it adds a switch to the tar command
>> > to make it follow symbolic links, so it looks good to me.
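
To illustrate what following symbolic links changes in the resulting archive, here is a small sketch using Python's standard `tarfile` module, whose `dereference` option plays the same role as tar's `-h` switch. It is not the patch under review; the function name `archive_with_links`, the directory names, and the archive name are made up for the example.

```python
import os
import tarfile
import tempfile


def archive_with_links(dereference):
    """Archive a staging dir that holds one symlink; return (is_symlink, size)."""
    workdir = tempfile.mkdtemp()
    real_dir = os.path.join(workdir, "real")
    stage_dir = os.path.join(workdir, "stage")
    os.mkdir(real_dir)
    os.mkdir(stage_dir)
    with open(os.path.join(real_dir, "node.log"), "w") as f:
        f.write("hello\n")
    # The staging area holds only a link, so it costs no extra disk space.
    os.symlink(os.path.join(real_dir, "node.log"),
               os.path.join(stage_dir, "node.log"))

    archive = os.path.join(workdir, "snapshot.tar.gz")
    with tarfile.open(archive, "w:gz", dereference=dereference) as tar:
        tar.add(stage_dir, arcname="snapshot")

    # Inspect how the linked file was stored in the archive.
    with tarfile.open(archive) as tar:
        member = tar.getmember("snapshot/node.log")
        return member.issym(), member.size
```

Without dereferencing, the archive stores the link itself with no content; with it, the archive stores a regular file carrying the target's bytes. That is also why the fqdn -> ipaddr links mentioned in this thread would turn into full copies.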
>> >
>> >
>> >
>> > Bartłomiej
>> >
>> >
>> >
>> > On Thu, Jan 14, 2016 at 10:39 AM, Igor Kalnitsky <ikalnitsky at mirantis.com>
>> > wrote:
>> >
>> > Hey Maciej -
>> >
>> >
>> >
>> > > About hardlinks - wouldn't it be better to use symlinks?
>> > > This way we don't occupy more space than necessary
>> >
>> > AFAIK, hardlinks won't occupy much space. They are the links, after all. :)
>> >
>> > As for symlinks, I'm afraid shotgun (and fabric underneath) won't
>> > resolve them, and the links will get into the snapshot as is. That means
>> > that if the content they point to is not in the snapshot, they are
>> > simply useless. Needs to be checked, though.
>> >
>> >
>> >
>> > - Igor
>> >
>> >
>> >
>> > On Thu, Jan 14, 2016 at 10:31 AM, Maciej Kwiek <mkwiek at mirantis.com>
>> > wrote:
>> >
>> > Thanks for your insight guys!
>> >
>> >
>> >
>> > I agree with Oleg, I will see what I can do to make this work this way.
>> >
>> >
>> >
>> > About hardlinks - wouldn't it be better to use symlinks? This way we don't
>> > occupy more space than necessary, and we can link to files and directories
>> > that are on a block device other than /var. Please see review [1] for a
>> > proposed change that introduces symlinks.
>> >
>> > This doesn't really give us much right now, because most of the logs are
>> > fetched from the master node via ssh due to shotgun being run in the
>> > mcollective container, but it's something! When we remove containers, this
>> > will prove more useful.
>> >
>> >
>> >
>> > Regards,
>> >
>> > Maciej Kwiek
>> >
>> >
>> >
>> > [1] https://review.openstack.org/#/c/266964/
>> >
>> >
>> >
>> > On Tue, Jan 12, 2016 at 1:51 PM, Oleg Gelbukh <ogelbukh at mirantis.com>
>> > wrote:
>> >
>> > I think we need to find a way to:
>> >
>> >
>> >
>> > 1) verify the size of the snapshot without actually making it and compare
>> > it to the available disk space beforehand.
>> > 2) refuse to create the snapshot if space is insufficient and notify the user
>> > (otherwise it breaks the Admin node, as we have seen)
>> > 3) provide a way to prioritize elements of the snapshot and exclude them
>> > based on the priorities or user choice.
>> >
>> >
>> >
>> > This will allow for better and safer UX with the snapshot.
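
Points 1) and 2) above could look roughly like the following Python sketch. It is not Nailgun or shotgun code: the names `estimate_size` and `can_create_snapshot` and the `reserve_bytes` parameter are hypothetical, and the estimate is the uncompressed size, so it is a conservative upper bound for a compressed tarball.

```python
import os
import shutil


def estimate_size(paths):
    """Sum the sizes of regular files under the given paths (uncompressed)."""
    total = 0
    for path in paths:
        if os.path.isfile(path):
            total += os.path.getsize(path)
            continue
        for root, _dirs, files in os.walk(path):
            for name in files:
                full = os.path.join(root, name)
                # Symlinks are skipped so that linked data is not counted here.
                if os.path.islink(full):
                    continue
                try:
                    total += os.path.getsize(full)
                except OSError:
                    # Logs may be rotated away while we are walking.
                    pass
    return total


def can_create_snapshot(sources, target_dir, reserve_bytes=0):
    """Return (ok, needed, free) for writing a snapshot of sources to target_dir."""
    needed = estimate_size(sources) + reserve_bytes
    free = shutil.disk_usage(target_dir).free
    return needed <= free, needed, free
```

A caller would refuse to start the dump and report `needed` and `free` to the user when `ok` is false, for example with `/var/log` as the source and `/var/www/nailgun/dump` as the target directory.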
>> >
>> >
>> >
>> > --
>> >
>> > Best regards,
>> >
>> > Oleg Gelbukh
>> >
>> >
>> >
>> > On Tue, Jan 12, 2016 at 1:47 PM, Maciej Kwiek <mkwiek at mirantis.com>
>> > wrote:
>> >
>> > Hi!
>> >
>> >
>> >
>> > I need some advice on how to tackle this issue. There is a bug [1]
>> > describing the problem with creating a diagnostic snapshot. The issue is
>> > that /var/log has 100GB available, while /var (where the diagnostic snapshot
>> > is being generated - /var/www/nailgun/dump/fuel-snapshot according to [2])
>> > has 10GB available, so dumping the logs can be an issue when the size of the
>> > logs exceeds the free space in /var.
>> >
>> > There are several things we could do, but I am unsure which course to
>> > take. Should we
>> >
>> > a) Allocate more disk space for /var/www (or for whole /var)?
>> > b) Make the snapshot location share the diskspace of /var/log?
>> > c) Something else? What?
>> >
>> >
>> >
>> > Please share your thoughts on this.
>> >
>> >
>> >
>> > Cheers,
>> >
>> > Maciej Kwiek
>> >
>> >
>> >
>> > [1] https://bugs.launchpad.net/fuel/+bug/1529182
>> >
>> > [2] https://github.com/openstack/fuel-web/blob/2855a9ba925c146b4802ab3cd2185f1dce2d8a6a/nailgun/nailgun/settings.yaml#L717
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> __________________________________________________________________________
>> >
>> > OpenStack Development Mailing List (not for usage questions)
>> >
>> > Unsubscribe:
>> >
>> > OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> >
>> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> >
>> >
>> > --
>> >
>> > Artem Panchenko
>> >
>> > QA Engineer
>> >
>
> --
> Artem Panchenko
> QA Engineer