<html><head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
</head><body bgcolor="#FFFFFF" text="#000000">Ian, thanks for digging in
and helping sort out some of these issues! <br>
<br>
<blockquote style="border: 0px none;"
cite="mid:4b684314-bdba-8ead-6354-3984b7610705@redhat.com" type="cite">
<div style="margin:30px 25px 10px 25px;" class="__pbConvHr"><div
style="width:100%;border-top:2px solid #EDF1F4;padding-top:10px;"> <div
style="display:inline-block;white-space:nowrap;vertical-align:middle;width:49%;">
<a moz-do-not-send="true" href="mailto:iwienand@redhat.com"
style="color:#485664
!important;padding-right:6px;font-weight:500;text-decoration:none
!important;">Ian Wienand</a></div> <div
style="display:inline-block;white-space:nowrap;vertical-align:middle;width:48%;text-align:
right;"> <font color="#909AA4"><span style="padding-left:6px">April
4, 2018 at 11:04 PM</span></font></div> </div></div>
<div style="color:#909AA4;margin-left:24px;margin-right:24px;"
__pbrmquotes="true" class="__pbConvBody"><div><!----><br>We've long had
problems with this host and I've looked at it before<br>[1]. It often
drops out.<br><br>It seems there's enough interest we should dive a bit
deeper. Here's<br>what I've found out:<br><br>askbot<br>------<br><br>The
askbot site itself seems under control, except for an unbounded<br>session
log file. Proposed [2]<br><br> root@ask:/srv# du -hs *<br> 2.0G
askbot-site<br> 579M dist<br><br>overall<br>-------<br><br>The major
consumer is /var; where we've got<br><br> 3.9G log<br> 5.9G backups<br>
 9.4G lib<br><br>backups<br>-------<br><br>The backups seem under control
at least; we're rotating them out and we<br>keep 10, and the size is
pretty consistently ~600MB:<br><br> root@ask:/var/backups/pgsql_backups#
ls -lh<br> total 5.9G<br> -rw-r--r-- 1 root root 599M Apr 5 00:03
askbotdb.sql.gz<br> -rw-r--r-- 1 root root 598M Apr 4 00:03
askbotdb.sql.gz.1<br> ...<br><br>We could reduce the backup rotations to
just one if we like -- the<br>server is backed up nightly via bup, so
at any point we can get<br>previous dumps from there. bup should
de-duplicate everything, but<br>still, it's probably not necessary.<br><br>The
db directory was sitting at ~9gb<br><br> root@ask:/var/lib/postgresql#
du -hs<br> 8.9G .<br><br>AFAICT, it seems like the autovacuum is running
OK on the busy tables<br><br> askbotdb=# select relname,last_vacuum,
last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables
where last_autovacuum is not NULL;<br> relname | last_vacuum |
last_autovacuum | last_analyze |
last_autoanalyze <br>
------------------+-------------+-------------------------------+-------------------------------+-------------------------------<br>
django_session | | 2018-04-02 17:29:48.329915+00 |
2018-04-05 02:18:39.300126+00 | 2018-04-05 00:11:23.456602+00<br>
askbot_badgedata | | 2018-04-04 07:19:21.357461+00 |
| 2018-04-04 07:18:16.201376+00<br>
askbot_thread | | 2018-04-04 16:24:45.124492+00 |
| 2018-04-04 20:32:25.845164+00<br> auth_message
| | 2018-04-04 12:29:24.273651+00 | 2018-04-05
02:18:07.633781+00 | 2018-04-04 21:26:38.178586+00<br> djkombu_message
| | 2018-04-05 02:11:50.186631+00 |
| 2018-04-05 02:14:45.22926+00<br><br>Out of interest I did run a
manual<br><br> su - postgres -c "vacuumdb --all --full --analyze"<br><br>That
reclaimed about 3G<br><br> root@ask:/var/lib/postgresql# du -hs<br> 8.9G
.<br> (after)<br> 5.8G <br><br>I installed pg_activity and watched for a
while; nothing seemed to be<br>really stressing it.<br><br>Ergo, I'm
not sure if there's much to do in the db layers.<br><br>logs<br>----<br><br>This
leaves the logs<br><br> 1.1G jetty<br> 2.9G apache2<br><br>The jetty
logs are cleaned regularly. I think they could be made more<br>quiet,
but they seem to be bounded.<br><br>Apache logs are rotated but never
cleaned up. Surely logs from 2015<br>aren't useful. Proposed [3]<br><br>Random
offline<br>--------------<br><br>[4] is an example of a user reporting
the site was offline. Looking<br>at the logs, it seems that puppet
found httpd not running at 07:14 and<br>restarted it:<br><br> Apr 4
07:14:40 ask puppet-user[20737]: (Scope(Class[Postgresql::Server]))
Passing "version" to postgresql::server is deprecated; please use
postgresql::globals instead.<br> Apr 4 07:14:42 ask puppet-user[20737]:
Compiled catalog for ask.openstack.org in environment production in
4.59 seconds<br> Apr 4 07:14:44 ask crontab[20987]: (root) LIST (root)<br>
Apr 4 07:14:49 ask puppet-user[20737]:
(/Stage[main]/Httpd/Service[httpd]/ensure) ensure changed 'stopped' to
'running'<br> Apr 4 07:14:54 ask puppet-user[20737]: Finished catalog
run in 10.43 seconds<br><br>Which explains why it seemed OK when I
looked. Checking the<br>apache logs we have:<br><br> [Wed Apr 04
07:01:08.144746 2018] [:error] [pid 12491:tid 140439253419776] [remote
176.233.126.142:43414] mod_wsgi (pid=12491): Exception occurred
processing WSGI script '/srv/askbot-site/config/django.wsgi'.<br> [Wed
Apr 04 07:01:08.144870 2018] [:error] [pid 12491:tid 140439253419776]
[remote 176.233.126.142:43414] IOError: failed to write data<br> ...
more until ...<br> [Wed Apr 04 07:15:58.270180 2018] [:error] [pid
17060:tid 140439253419776] [remote 176.233.126.142:43414] mod_wsgi
(pid=17060): Exception occurred processing WSGI script
'/srv/askbot-site/config/django.wsgi'.<br> [Wed Apr 04 07:15:58.270303
2018] [:error] [pid 17060:tid 140439253419776] [remote
176.233.126.142:43414] IOError: failed to write data<br><br>and the
restart logged<br><br> [Wed Apr 04 07:14:48.912626 2018] [core:warn]
[pid 21247:tid 140439370192768] AH00098: pid file
/var/run/apache2/apache2.pid overwritten -- Unclean shutdown of previous
Apache run?<br> [Wed Apr 04 07:14:48.913548 2018] [mpm_event:notice]
[pid 21247:tid 140439370192768] AH00489: Apache/2.4.7 (Ubuntu)
OpenSSL/1.0.1f mod_wsgi/3.4 Python/2.7.6 configured -- resuming normal
operations<br> [Wed Apr 04 07:14:48.913583 2018] [core:notice] [pid
21247:tid 140439370192768] AH00094: Command line: '/usr/sbin/apache2'<br>
[Wed Apr 04 14:59:55.408060 2018] [mpm_event:error] [pid 21247:tid
140439370192768] AH00485: scoreboard is full, not at MaxRequestWorkers<br><br>This
does not appear to be disk-space related; see the cacti graphs<br>for
that period that show the disk is full-ish, but not full [5].<br><br>What
caused the I/O errors? dmesg has nothing in it since 30/Mar.<br>kern.log
is empty.<br><br>Server<br>------<br><br>Most importantly, this server
wants a Xenial upgrade. At the very<br>least, the apache there is known to
handle the "scoreboard is full" issue<br>better.<br><br>We should ensure
that we use a bigger instance; it's using up some<br>swap<br><br>
postgres@ask:~$ free -h<br> total used free
shared buffers cached<br> Mem: 3.9G 3.6G
269M 136M 11M 819M<br> -/+ buffers/cache: 2.8G
1.1G<br> Swap: 3.8G 259M 3.6G<br><br>tl;dr<br>-----<br><br>I
don't think there's anything run-away bad going on, but the server<br>is
undersized and needs a system update.<br><br>Since I've got this far
with it, over the next few days I'll see where<br>we are with the puppet
for a Xenial upgrade and see if we can't get a<br>migration underway.<br><br>Thanks,<br><br>-i<br><br>[1]
<a class="moz-txt-link-freetext" href="https://review.openstack.org/406670">https://review.openstack.org/406670</a><br>[2]
<a class="moz-txt-link-freetext" href="https://review.openstack.org/558977">https://review.openstack.org/558977</a><br>[3]
<a class="moz-txt-link-freetext" href="https://review.openstack.org/558985">https://review.openstack.org/558985</a><br>[4]
<a class="moz-txt-link-freetext" href="http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2018-04-04.log.html#t2018-04-04T07:11:22">http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2018-04-04.log.html#t2018-04-04T07:11:22</a><br>[5]
<a class="moz-txt-link-freetext" href="http://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=2547&rra_id=0&view_type=tree&graph_start=1522859103&graph_end=1522879839">http://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=2547&rra_id=0&view_type=tree&graph_start=1522859103&graph_end=1522879839</a><br><br>__________________________________________________________________________<br>OpenStack
Development Mailing List (not for usage questions)<br>Unsubscribe:
<a class="moz-txt-link-abbreviated" href="mailto:OpenStack-dev-request@lists.openstack.org?subject:unsubscribe">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br><a class="moz-txt-link-freetext" href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br></div></div>
<div style="margin:30px 25px 10px 25px;" class="__pbConvHr"><div
style="width:100%;border-top:2px solid #EDF1F4;padding-top:10px;"> <div
style="display:inline-block;white-space:nowrap;vertical-align:middle;width:49%;">
<a moz-do-not-send="true" href="mailto:pabelanger@redhat.com"
style="color:#485664
!important;padding-right:6px;font-weight:500;text-decoration:none
!important;">Paul Belanger</a></div> <div
style="display:inline-block;white-space:nowrap;vertical-align:middle;width:48%;text-align:
right;"> <font color="#909AA4"><span style="padding-left:6px">April
4, 2018 at 5:30 PM</span></font></div> </div></div>
<div style="color:#909AA4;margin-left:24px;margin-right:24px;"
__pbrmquotes="true" class="__pbConvBody"><div><!----><br>We also have a
2nd issue where the ask.o.o server doesn't appear to be large<br>enough
any more to handle the traffic. A few times over the last few weeks
we've<br>had outages due to the HDD being full.<br><br>We likely need to
reduce the number of days we retain database backups / http<br>logs or
look to attach a volume to increase storage.<br><br>Paul<br></div></div>
<div style="margin:30px 25px 10px 25px;" class="__pbConvHr"><div
style="width:100%;border-top:2px solid #EDF1F4;padding-top:10px;"> <div
style="display:inline-block;white-space:nowrap;vertical-align:middle;width:49%;">
<a moz-do-not-send="true" href="mailto:jimmy@openstack.org"
style="color:#485664
!important;padding-right:6px;font-weight:500;text-decoration:none
!important;">Jimmy McArthur</a></div> <div
style="display:inline-block;white-space:nowrap;vertical-align:middle;width:48%;text-align:
right;"> <font color="#909AA4"><span style="padding-left:6px">April
4, 2018 at 4:26 PM</span></font></div> </div></div>
<div style="color:#909AA4;margin-left:24px;margin-right:24px;"
__pbrmquotes="true" class="__pbConvBody">
Hi everyone!<br>
<br>
We have a very robust and vibrant community at <a moz-do-not-send="true"
href="https://ask.openstack.org/">ask.openstack.org</a>. There are
literally dozens of posts a day. However, many of them don't receive
knowledgeable answers. I'm really worried about this becoming a vacuum
where potential community members get frustrated and don't realize how
to get more involved with the community. <br>
<br>
I'm looking for thoughts/ideas/feelings about this tool as well as
potential admin volunteers to help us manage the constant influx of
technical and not-so-technical questions around OpenStack. <br>
<br>
For those of you already contributing there, Thank You! For those that
are interested in becoming a moderator (instant AUC status!) or have
some additional ideas around fostering this community, please respond. <br>
<br>
Looking forward to your thoughts :)<br>
<br>
Thanks!<br>
Jimmy <br>
irc: jamesmcarthur<br>
</div>
</blockquote>
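<br>
On the Apache log cleanup: here is a minimal dry-run sketch of the idea. It is not what [3] actually implements, which I have not reproduced here. The stale_logs helper, the /var/log/apache2 path and the 90-day cutoff are all assumptions for illustration. It only lists files whose mtime is older than the cutoff and deletes nothing.<br>
<pre>
```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def stale_logs(log_dir, max_age_days, now=None):
    """Return files in log_dir last modified more than max_age_days ago, oldest first.

    Hypothetical helper: it reports candidates only and never removes anything.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    candidates = []
    for path in Path(log_dir).iterdir():
        # Skip subdirectories; only regular files are considered.
        if not path.is_file():
            continue
        mtime = path.stat().st_mtime
        if mtime < cutoff:
            candidates.append((mtime, path))
    # Sort by modification time so the oldest files come first.
    return [path for _, path in sorted(candidates)]


# Example call (path and cutoff are assumptions, not taken from the host):
#   for path in stale_logs("/var/log/apache2", 90):
#       print(path)
```
</pre>
Reviewing the list before anything is removed seems safer than deleting directly, given that logs going back to 2015 are still on disk.<br>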
<br>
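On the "failed to write data" errors: both excerpts quoted above show the same remote address, so it may be worth counting those lines per client before looking further. A rough sketch, assuming the Apache 2.4 error log format shown in the thread and IPv4 remotes only; write_failures_by_remote is a made-up name.<br>
<pre>
```python
import re
from collections import Counter

# Matches the "[remote ADDR:PORT]" field of an error log line (IPv4 only).
REMOTE_RE = re.compile(r"\[remote (\d{1,3}(?:\.\d{1,3}){3}):\d+\]")


def write_failures_by_remote(lines):
    """Count 'IOError: failed to write data' lines per remote address."""
    counts = Counter()
    for line in lines:
        if "IOError: failed to write data" not in line:
            continue
        match = REMOTE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Two of the lines quoted in the thread, plus one unrelated error line.
sample = [
    "[Wed Apr 04 07:01:08.144870 2018] [:error] [pid 12491:tid 140439253419776] "
    "[remote 176.233.126.142:43414] IOError: failed to write data",
    "[Wed Apr 04 07:15:58.270303 2018] [:error] [pid 17060:tid 140439253419776] "
    "[remote 176.233.126.142:43414] IOError: failed to write data",
    "[Wed Apr 04 14:59:55.408060 2018] [mpm_event:error] [pid 21247:tid "
    "140439370192768] AH00485: scoreboard is full, not at MaxRequestWorkers",
]
print(write_failures_by_remote(sample).most_common(5))
```
</pre>
Running it over the full error log would show whether one client accounts for most of the failures or whether they are spread across many.<br>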
</body></html>