[openstack-dev] [Fuel] Diagnostic snapshot generation is broken due to lack of disk space

Sheena Gregson sgregson at mirantis.com
Fri Jan 15 15:02:54 UTC 2016


I’ve also seen multiple requests for more targeted snapshots, which might
also (partially) solve this problem: grabbing logs from a subset of nodes
for a specific window of time would require significantly less disk space
than the more robust grab-all solution we have now.
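
A minimal sketch of what such a targeted selection could look like, assuming
logs live in one directory per node and that file modification time is a good
enough proxy for the time window. The function name, its arguments, and the
directory layout are hypothetical and are not taken from shotgun.

```python
import os


def select_logs(root, nodes, start, end):
    """Return log files under root/<node>/ modified within [start, end].

    root, nodes and the per-node layout are assumptions made for this
    sketch; start and end are Unix timestamps.
    """
    selected = []
    for node in nodes:
        node_dir = os.path.join(root, node)
        if not os.path.isdir(node_dir):
            continue  # this node has no log directory, skip it
        for dirpath, _dirnames, filenames in os.walk(node_dir):
            for name in filenames:
                path = os.path.join(dirpath, name)
                mtime = os.stat(path).st_mtime
                if start <= mtime <= end:
                    selected.append(path)
    return sorted(selected)
```

Only the returned paths would then be handed to the archiving step, so the
space needed scales with the selection rather than with the whole of /var/log.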



*From:* Maciej Kwiek [mailto:mkwiek at mirantis.com]
*Sent:* Thursday, January 14, 2016 5:59 AM
*To:* OpenStack Development Mailing List (not for usage questions) <
openstack-dev at lists.openstack.org>
*Subject:* Re: [openstack-dev] [Fuel] Diagnostic snapshot generation is
broken due to lack of disk space



Igor,



I will investigate this, thanks!



Artem,



I guess that if we have an untrusted user on the master node, they could
just put whatever they want to be in the snapshot in /var/log without
having to time the attack carefully with tar execution.



I want to use links for directories, since this saves me the trouble of
creating hardlinks for every single file in the directory. However, with
how exclusion is currently implemented, this could cause log files to be
deleted from the original directories; I need to check this.
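
For comparison, a minimal sketch of the per-file hardlink alternative,
assuming Python 3 and that source and destination are on the same filesystem.
The helper names are hypothetical. shutil.copytree recreates the directories
and calls os.link for each file, so removing an excluded file from the copy
only drops one name and leaves the original in place. Deleting through a
symlinked directory would instead remove the original file.

```python
import os
import shutil


def hardlink_tree(src, dst):
    """Mirror src into dst with real directories and hard-linked files.

    dst must not exist yet and must be on the same filesystem as src,
    otherwise os.link raises OSError. Hypothetical helper for this sketch.
    """
    shutil.copytree(src, dst, copy_function=os.link)
    return dst


def remove_excluded(tree, suffix):
    """Unlink every file in tree whose name ends with suffix."""
    for dirpath, _dirnames, filenames in os.walk(tree):
        for name in filenames:
            if name.endswith(suffix):
                os.unlink(os.path.join(dirpath, name))
```

Note that this only protects against deletion: writing to or truncating a
hard-linked file still changes the original, because both names share the
same data.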



About your PS: the whole /var/log on the master node (not in the container)
is currently downloaded. I think we shouldn't change this, as we plan to
drop containers in 9.0.



Cheers,

Maciej



On Thu, Jan 14, 2016 at 12:32 PM, Artem Panchenko <apanchenko at mirantis.com>
wrote:

Hi,

using symlinks is a bit dangerous; here is a quote from the manual you
mentioned [0]:

> The `--dereference' option is unsafe if an untrusted user can modify
> directories while tar is running.

Using hard links is much safer, because you can't use them for directories.
But at the same time, the implementation in shotgun would be more
complicated than with symlinks.

Anyway, in order to determine which kind of link to use, we need to decide
where (/var/log or another partition) the diagnostic snapshot will be stored.
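
A minimal sketch of that decision, assuming the snapshot directory already
exists. The function names are hypothetical. Comparing st_dev is only a
heuristic: os.link can still fail (for example across bind mounts of the same
filesystem), so real code should also handle OSError from the link call.

```python
import os


def same_filesystem(path_a, path_b):
    """True if both paths report the same device ID."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def link_kind(source, target_dir):
    """Return 'hardlink' or 'symlink' for linking source into target_dir."""
    if os.path.isdir(source):
        return "symlink"  # directories cannot be hard-linked
    if same_filesystem(source, target_dir):
        return "hardlink"
    return "symlink"  # hard links cannot cross filesystems
```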

p.s.

>This doesn't really give us much right now, because most of the logs are fetched from master node via ssh due to shotgun being run in mcollective container



AFAIK '/var/log/docker-logs/' is available from mcollective container and
mounted to /var/log/:

[root@fuel-lab-cz5557 ~]# dockerctl shell mcollective mount -l | grep os-varlog
/dev/mapper/os-varlog on /var/log type ext4 (rw,relatime,stripe=128,data=ordered)

From my experience, the '/var/log/docker-logs/remote' folder is the
heaviest thing in the snapshot.

[0] http://www.gnu.org/software/tar/manual/html_node/dereference.html

Thanks!



On 14.01.16 13:00, Igor Kalnitsky wrote:

> I took a glance on Maciej's patch and it adds a switch to tar command
> to make it follow symbolic links

Yeah, that should work. Except one thing - we previously had fqdn ->
ipaddr links in snapshots. So now they will be resolved into full
copies?

> I meant that symlinks also give us the benefit of not using additional
> space (just as hardlinks do) while being able to link to files from
> different filesystems.

I'm sorry, I got you wrong. :)



- Igor



On Thu, Jan 14, 2016 at 12:34 PM, Maciej Kwiek <mkwiek at mirantis.com>
wrote:

Igor,



I meant that symlinks also give us the benefit of not using additional space
(just as hardlinks do) while being able to link to files from different
filesystems.

Also, as Bartłomiej pointed out, the `h` switch for tar should do the trick
[1].
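
To illustrate the difference the `h` (--dereference) switch makes, here is a
minimal sketch using Python's tarfile module, whose `dereference` argument has
the same meaning. This is not the shotgun patch itself; the function names and
the gzip compression are assumptions made for the example.

```python
import os
import tarfile


def make_archive(archive_path, source_dir, follow_links):
    """Pack source_dir; follow_links=True mirrors tar's -h/--dereference."""
    with tarfile.open(archive_path, "w:gz", dereference=follow_links) as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))


def member_types(archive_path):
    """Map each archive member name to 'symlink', 'dir' or 'file'."""
    types = {}
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            if member.issym():
                types[member.name] = "symlink"
            elif member.isdir():
                types[member.name] = "dir"
            else:
                types[member.name] = "file"
    return types
```

Without dereferencing, a symlink is stored as a link entry and is useless in
the snapshot unless its target is archived too. With dereferencing, the
target's content is stored under the link's name, so every link becomes a full
copy inside the archive.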



Cheers,

Maciej



[1] http://www.gnu.org/software/tar/manual/html_node/dereference.html



On Thu, Jan 14, 2016 at 11:22 AM, Bartlomiej Piotrowski
<bpiotrowski at mirantis.com> wrote:

Igor,



I took a glance at Maciej's patch and it adds a switch to the tar command to
make it follow symbolic links, so it looks good to me.



Bartłomiej



On Thu, Jan 14, 2016 at 10:39 AM, Igor Kalnitsky
<ikalnitsky at mirantis.com> wrote:

Hey Maciej -



> About hardlinks - wouldn't it be better to use symlinks?
> This way we don't occupy more space than necessary

AFAIK, hardlinks won't occupy much space. They are links, after all.
:)

As for symlinks, I'm afraid shotgun (and fabric underneath) won't
resolve them, and the links will get into the snapshot as is. That means
that if the content they point to is not in the snapshot, they are
simply useless. Needs to be checked, though.



- Igor



On Thu, Jan 14, 2016 at 10:31 AM, Maciej Kwiek <mkwiek at mirantis.com>
wrote:

Thanks for your insight guys!



I agree with Oleg, I will see what I can do to make this work this way.



About hardlinks - wouldn't it be better to use symlinks? This way we don't
occupy more space than necessary, and we can link to files and directories
that are on a block device other than /var. Please see [1] review for a
proposed change that introduces symlinks.

This doesn't really give us much right now, because most of the logs are
fetched from master node via ssh due to shotgun being run in mcollective
container, but it's something! When we remove containers, this will prove
more useful.



Regards,

Maciej Kwiek



[1] https://review.openstack.org/#/c/266964/



On Tue, Jan 12, 2016 at 1:51 PM, Oleg Gelbukh <ogelbukh at mirantis.com>
wrote:

I think we need to find a way to:

1) verify the size of the snapshot without actually making it, and
   compare it to the available disk space beforehand;
2) refuse to create the snapshot if space is insufficient, and notify
   the user (otherwise it breaks the Admin node, as we have seen);
3) provide a way to prioritize elements of the snapshot and exclude
   them based on the priorities or user choice.

This will allow for better and safer UX with the snapshot.
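
Points 1) and 2) could be sketched roughly as follows, assuming Python 3 and
using the uncompressed size of the source trees as a pessimistic upper bound
for the snapshot size. The function names and the safety margin are
hypothetical and are not part of shotgun or nailgun.

```python
import os
import shutil


def tree_size(path):
    """Sum the sizes of all files below path, in bytes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # file vanished (e.g. log rotation) while walking
    return total


def check_space(sources, target_dir, margin=1.1):
    """Raise RuntimeError if the sources would not fit into target_dir.

    margin is an assumed safety factor on top of the estimated size.
    """
    needed = int(sum(tree_size(p) for p in sources) * margin)
    free = shutil.disk_usage(target_dir).free
    if needed > free:
        raise RuntimeError(
            "snapshot needs about %d bytes, only %d free in %s"
            % (needed, free, target_dir)
        )
    return needed, free
```

Calling check_space before any data is copied lets the caller report the
problem to the user instead of filling up /var on the Admin node.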



--

Best regards,

Oleg Gelbukh



On Tue, Jan 12, 2016 at 1:47 PM, Maciej Kwiek <mkwiek at mirantis.com>
wrote:

Hi!



I need some advice on how to tackle this issue. There is a bug [1]
describing the problem with creating a diagnostic snapshot. The issue is
that /var/log has 100GB available, while /var (where the diagnostic
snapshot is being generated - /var/www/nailgun/dump/fuel-snapshot
according to [2]) has 10GB available, so dumping the logs can be an
issue when the log size exceeds the free space in /var.

There are several things we could do, but I am unsure which course to
take. Should we

a) Allocate more disk space for /var/www (or for whole /var)?
b) Make the snapshot location share the diskspace of /var/log?
c) Something else? What?



Please share your thoughts on this.



Cheers,

Maciej Kwiek



[1] https://bugs.launchpad.net/fuel/+bug/1529182

[2] https://github.com/openstack/fuel-web/blob/2855a9ba925c146b4802ab3cd2185f1dce2d8a6a/nailgun/nailgun/settings.yaml#L717







__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev










-- 

Artem Panchenko

QA Engineer

