<html><head>

<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

</head><body bgcolor="#FFFFFF" text="#000000">Ian, thanks for digging in

 and helping sort out some of these issues! <br>
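<br>
For anyone following along, the Apache log cleanup discussed in Ian's message below could be sketched roughly like this. It is only an illustration under my own assumptions, not the change proposed in the linked reviews: the function names are hypothetical, and matching rotated files by a prefix such as "access.log" is a guess at how the files are named on the host.<br>
<pre>
```python
import os

SECONDS_PER_DAY = 86400


def select_expired(entries, now, max_age_days):
    """Return the names whose modification time is older than max_age_days.

    entries: iterable of (name, mtime_in_epoch_seconds) pairs.
    now: current time in epoch seconds.
    """
    cutoff = now - max_age_days * SECONDS_PER_DAY
    # Keep only entries last modified before the cutoff; sort for stable output.
    return sorted(name for name, mtime in entries if cutoff > mtime)


def scan_rotated_logs(directory, prefix):
    """Collect (path, mtime) pairs for rotated copies of a log file.

    Rotated copies are assumed to be named prefix + "." + suffix,
    for example access.log.1 or access.log.2.gz; the live file is skipped.
    """
    found = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.startswith(prefix + "."):
            found.append((entry.path, entry.stat().st_mtime))
    return found
```
</pre>
The selection rule is kept separate from the directory scan so it can be checked with made-up timestamps first. A dry run would pass the result of scan_rotated_logs for the Apache log directory, the current time, and a retention period (say 30 days, again an assumption) to select_expired and only print the result.<br>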

<span>


</span><br>

<blockquote style="border: 0px none;" 

cite="mid:4b684314-bdba-8ead-6354-3984b7610705@redhat.com" type="cite">

  <div style="margin:30px 25px 10px 25px;" class="__pbConvHr"><div 

style="width:100%;border-top:2px solid #EDF1F4;padding-top:10px;">   <div

 
style="display:inline-block;white-space:nowrap;vertical-align:middle;width:49%;">

        <a moz-do-not-send="true" href="mailto:iwienand@redhat.com" 

style="color:#485664 

!important;padding-right:6px;font-weight:500;text-decoration:none 

!important;">Ian Wienand</a></div>   <div 

style="display:inline-block;white-space:nowrap;vertical-align:middle;width:48%;text-align:

 right;">     <font color="#909AA4"><span style="padding-left:6px">April

 4, 2018 at 11:04 PM</span></font></div>    </div></div>

  <div style="color:#909AA4;margin-left:24px;margin-right:24px;" 

__pbrmquotes="true" class="__pbConvBody"><div><!----><br>We've long had 

problems with this host and I've looked at it before<br>[1].  It often 

drops out.<br><br>It seems there's enough interest we should dive a bit 

deeper.  Here's<br>what I've found out:<br><br>askbot<br>------<br><br>The

 askbot site itself seems under control, except for an unbounded<br>session

 log file.  Proposed [2]<br><br> root@ask:/srv# du -hs *<br> 2.0G     

askbot-site<br> 579M      dist<br><br>overall<br>-------<br><br>The major 

consumer is /var, where we've got<br><br> 3.9G      log<br> 5.9G      backups<br> 

9.4G    lib<br><br>backups<br>-------<br><br>The backups seem under control 

at least; we're rotating them out and we<br>keep 10, and the size is 

pretty consistently around 600MB:<br><br> root@ask:/var/backups/pgsql_backups# 

ls -lh<br> total 5.9G<br> -rw-r--r-- 1 root root 599M Apr  5 00:03 

askbotdb.sql.gz<br> -rw-r--r-- 1 root root 598M Apr  4 00:03 

askbotdb.sql.gz.1<br> ...<br><br>We could reduce the backup rotations to

 just one if we like -- the<br>server is backed up nightly via bup, so 

at any point we can get<br>previous dumps from there.  bup should 

de-duplicate everything, but<br>still, it's probably not necessary.<br><br>The

 db directory was sitting at ~9gb<br><br> root@ask:/var/lib/postgresql# 

du -hs<br> 8.9G   .<br><br>AFAICT, it seems like the autovacuum is running

 OK on the busy tables<br><br> askbotdb=# select relname,last_vacuum, 

last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables

 where last_autovacuum is not NULL;<br>      relname      | last_vacuum |

        last_autovacuum        |         last_analyze          |       

last_autoanalyze        <br> 

------------------+-------------+-------------------------------+-------------------------------+-------------------------------<br>

  django_session   |             | 2018-04-02 17:29:48.329915+00 | 

2018-04-05 02:18:39.300126+00 | 2018-04-05 00:11:23.456602+00<br>  

askbot_badgedata |             | 2018-04-04 07:19:21.357461+00 |        

                       | 2018-04-04 07:18:16.201376+00<br>  

askbot_thread    |             | 2018-04-04 16:24:45.124492+00 |        

                       | 2018-04-04 20:32:25.845164+00<br>  auth_message

     |             | 2018-04-04 12:29:24.273651+00 | 2018-04-05 

02:18:07.633781+00 | 2018-04-04 21:26:38.178586+00<br>  djkombu_message 

 |             | 2018-04-05 02:11:50.186631+00 |                        

       | 2018-04-05 02:14:45.22926+00<br><br>Out of interest I did run a

 manual<br><br> su - postgres -c "vacuumdb --all --full --analyze"<br><br>That

 reclaimed about 3GB<br><br> root@ask:/var/lib/postgresql# du -hs<br> 8.9G

        .<br> (after)<br> 5.8G      <br><br>I installed pg_activity and watched for a

 while; nothing seemed to be<br>really stressing it.<br><br>Ergo, I'm 

not sure if there's much to do in the db layers.<br><br>logs<br>----<br><br>This

 leaves the logs<br><br> 1.1G       jetty<br> 2.9G    apache2<br><br>The jetty 

logs are cleaned regularly.  I think they could be made<br>quieter, 

but they seem to be bounded.<br><br>Apache logs are rotated but never 

cleaned up.  Surely logs from 2015<br>aren't useful.  Proposed [3]<br><br>Random

 offline<br>--------------<br><br>[4] is an example of a user reporting 

the site was offline.  Looking<br>at the logs, it seems that puppet 

found httpd not running at 07:14 and<br>restarted it:<br><br> Apr  4 

07:14:40 ask puppet-user[20737]: (Scope(Class[Postgresql::Server])) 

Passing "version" to postgresql::server is deprecated; please use 

postgresql::globals instead.<br> Apr  4 07:14:42 ask puppet-user[20737]:

 Compiled catalog for ask.openstack.org in environment production in 

4.59 seconds<br> Apr  4 07:14:44 ask crontab[20987]: (root) LIST (root)<br>

 Apr  4 07:14:49 ask puppet-user[20737]: 

(/Stage[main]/Httpd/Service[httpd]/ensure) ensure changed 'stopped' to 

'running'<br> Apr  4 07:14:54 ask puppet-user[20737]: Finished catalog 

run in 10.43 seconds<br><br>Which explains why it seemed OK when I first 

looked.  Checking the<br>apache logs we have:<br><br> [Wed Apr 04 

07:01:08.144746 2018] [:error] [pid 12491:tid 140439253419776] [remote 

176.233.126.142:43414] mod_wsgi (pid=12491): Exception occurred 

processing WSGI script '/srv/askbot-site/config/django.wsgi'.<br> [Wed 

Apr 04 07:01:08.144870 2018] [:error] [pid 12491:tid 140439253419776] 

[remote 176.233.126.142:43414] IOError: failed to write data<br> ... 

more until ...<br> [Wed Apr 04 07:15:58.270180 2018] [:error] [pid 

17060:tid 140439253419776] [remote 176.233.126.142:43414] mod_wsgi 

(pid=17060): Exception occurred processing WSGI script 

'/srv/askbot-site/config/django.wsgi'.<br> [Wed Apr 04 07:15:58.270303 

2018] [:error] [pid 17060:tid 140439253419776] [remote 

176.233.126.142:43414] IOError: failed to write data<br><br>and the 

restart logged<br><br> [Wed Apr 04 07:14:48.912626 2018] [core:warn] 

[pid 21247:tid 140439370192768] AH00098: pid file 

/var/run/apache2/apache2.pid overwritten -- Unclean shutdown of previous

 Apache run?<br> [Wed Apr 04 07:14:48.913548 2018] [mpm_event:notice] 

[pid 21247:tid 140439370192768] AH00489: Apache/2.4.7 (Ubuntu) 

OpenSSL/1.0.1f mod_wsgi/3.4 Python/2.7.6 configured -- resuming normal 

operations<br> [Wed Apr 04 07:14:48.913583 2018] [core:notice] [pid 

21247:tid 140439370192768] AH00094: Command line: '/usr/sbin/apache2'<br>

 [Wed Apr 04 14:59:55.408060 2018] [mpm_event:error] [pid 21247:tid 

140439370192768] AH00485: scoreboard is full, not at MaxRequestWorkers<br><br>This

 does not appear to be disk-space related; see the cacti graphs<br>for 

that period that show the disk is full-ish, but not full [5].<br><br>What

 caused the I/O errors?  dmesg has nothing in it since 30/Mar.<br>kern.log

 is empty.<br><br>Server<br>------<br><br>Most importantly, this server 

wants a Xenial upgrade.  At the very<br>least, the Apache shipped there is known to 

handle the "scoreboard is full" issue<br>better.<br><br>We should ensure

 that we use a bigger instance; it's using up some<br>swap<br><br> 

postgres@ask:~$ free -h<br>              total       used       free    

 shared    buffers     cached<br> Mem:          3.9G       3.6G       

269M       136M        11M       819M<br> -/+ buffers/cache:       2.8G 

      1.1G<br> Swap:         3.8G       259M       3.6G<br><br>tl;dr<br>-----<br><br>I

 don't think there's anything run-away bad going on, but the server<br>is

 undersized and needs a system update.<br><br>Since I've got this far 

with it, over the next few days I'll see where<br>we are with the puppet

 for a Xenial upgrade and see if we can't get a<br>migration underway.<br><br>Thanks,<br><br>-i<br><br>[1]

 <a class="moz-txt-link-freetext" href="https://review.openstack.org/406670">https://review.openstack.org/406670</a><br>[2] 

<a class="moz-txt-link-freetext" href="https://review.openstack.org/558977">https://review.openstack.org/558977</a><br>[3] 

<a class="moz-txt-link-freetext" href="https://review.openstack.org/558985">https://review.openstack.org/558985</a><br>[4] 

<a class="moz-txt-link-freetext" href="http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2018-04-04.log.html#t2018-04-04T07:11:22">http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2018-04-04.log.html#t2018-04-04T07:11:22</a><br>[5]

 
<a class="moz-txt-link-freetext" href="http://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=2547&rra_id=0&view_type=tree&graph_start=1522859103&graph_end=1522879839">http://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=2547&rra_id=0&view_type=tree&graph_start=1522859103&graph_end=1522879839</a><br><br>__________________________________________________________________________<br>OpenStack

 Development Mailing List (not for usage questions)<br>Unsubscribe: 

<a class="moz-txt-link-abbreviated" href="mailto:OpenStack-dev-request@lists.openstack.org?subject:unsubscribe">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br><a class="moz-txt-link-freetext" href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br></div></div>

  <div style="margin:30px 25px 10px 25px;" class="__pbConvHr"><div 

style="width:100%;border-top:2px solid #EDF1F4;padding-top:10px;">   <div

 
style="display:inline-block;white-space:nowrap;vertical-align:middle;width:49%;">

        <a moz-do-not-send="true" href="mailto:pabelanger@redhat.com" 

style="color:#485664 

!important;padding-right:6px;font-weight:500;text-decoration:none 

!important;">Paul Belanger</a></div>   <div 

style="display:inline-block;white-space:nowrap;vertical-align:middle;width:48%;text-align:

 right;">     <font color="#909AA4"><span style="padding-left:6px">April

 4, 2018 at 5:30 PM</span></font></div>    </div></div>

  <div style="color:#909AA4;margin-left:24px;margin-right:24px;" 

__pbrmquotes="true" class="__pbConvBody"><div><!----><br>We also have a 

2nd issue where the ask.o.o server doesn't appear to be large<br>enough 

any more to handle the traffic. A few times over the last few weeks 

we've<br>had outages due to the HDD being full.<br><br>We likely need to

 reduce the number of days we retain database backups / http<br>logs or 

look to attach a volume to increase storage.<br><br>Paul<br></div></div>

  <div style="margin:30px 25px 10px 25px;" class="__pbConvHr"><div 

style="width:100%;border-top:2px solid #EDF1F4;padding-top:10px;">   <div

 
style="display:inline-block;white-space:nowrap;vertical-align:middle;width:49%;">

        <a moz-do-not-send="true" href="mailto:jimmy@openstack.org" 

style="color:#485664 

!important;padding-right:6px;font-weight:500;text-decoration:none 

!important;">Jimmy McArthur</a></div>   <div 

style="display:inline-block;white-space:nowrap;vertical-align:middle;width:48%;text-align:

 right;">     <font color="#909AA4"><span style="padding-left:6px">April

 4, 2018 at 4:26 PM</span></font></div>    </div></div>

  <div style="color:#909AA4;margin-left:24px;margin-right:24px;" 

__pbrmquotes="true" class="__pbConvBody">

Hi everyone!<br>

  <br>

We have a very robust and vibrant community at <a moz-do-not-send="true"

 href="https://ask.openstack.org/">ask.openstack.org</a>.  There are 

literally dozens of posts a day. However, many of them don't receive 

knowledgeable answers.  I'm really worried about this becoming a vacuum 

where potential community members get frustrated and don't realize how 

to get more involved with the community.  <br>

  <br>

I'm looking for thoughts/ideas/feelings about this tool as well as 

potential admin volunteers to help us manage the constant influx of 

technical and not-so-technical questions around OpenStack.  <br>

  <br>

For those of you already contributing there, thank you!  For those who 

are interested in becoming a moderator (instant AUC status!) or have 

some additional ideas around fostering this community, please respond.  <br>

  <br>

Looking forward to your thoughts :)<br>

  <br>

Thanks!<br>

Jimmy <br>

irc: jamesmcarthur<br>

</div>

</blockquote>

<br>

</body></html>