Hi Marc,

Database settings

10k DB connections consume up to 21 GB of memory, which is only about 10% of the server's memory in our environment, so there is no OOM risk.

RabbitMQ

The maximum number of RabbitMQ connections is 20000, a value obtained by testing in the 3000-node environment.

Thanks
Alex

From: Marc Schoechlin [mailto:ms@256bit.org]
Sent: 17 October 2024 20:02
To: openstack-discuss@lists.openstack.org
Subject: Re: [largescale-sig]scaling story

Hello Alex,

thanks for the writeup!

A few comments and questions from me:

Database settings

It seems dangerous to me to set 'max_connections' so high (max 64K if you only connect to one IP).

Every database connection requires resources on the operating system side at the very least. As far as I know, MariaDB allocates additional memory per active thread (e.g. sort_buffer_size <https://mariadb.com/docs/server/ref/mdb/system-variables/sort_buffer_size/>, join_buffer_size <https://mariadb.com/docs/server/ref/mdb/system-variables/join_buffer_size/>). In extreme situations, this can cause MariaDB either to run out of memory due to CGroup resource limits (with 10k connections this is ~21GB for threads alone when you use the MariaDB defaults), or to grow its memory requirement so large that the OOM killer is activated on the node. Not a good thing for a database, especially when the OOM killer terminates the database with a SIGKILL.

Furthermore, other limitations in the setup (IO hardware limits, ulimits or other configuration parameters) could mean that many thousands of connections are active in parallel but are slowed down or even blocked/starved as a result. Ultimately, this will at least have a negative impact on response times, but it can also cause more serious problems that could, for example, block the server's operation. These horror scenarios only occur in extreme situations, but it is precisely in these situations that these settings are particularly dangerous in my opinion.

RabbitMQ

I have also wondered what the limit for RabbitMQ is and whether there are potential difficulties here.

As far as I know, the maximum number is determined automatically, for example by the ulimit or the Erlang port limit that applies to the process (see also https://www.rabbitmq.com/docs/networking#tuning-for-large-number-of-connections).

What was your initial limit?

In my setup, there are already a lot:

$ docker exec -ti rabbitmq /bin/bash -c 'pgrep beam.smp|xargs -I PID grep -H "Max open files" /proc/PID/limits'
/proc/22/limits:Max open files            1048576              1048576              files
$ docker exec -ti rabbitmq rabbitmqctl eval 'erlang:system_info(port_limit).'
65536

Almost everything about the maximum possible connections in TCP can be used here for internal file system access.

What I would like to know: How many connections can it handle in times of very high load? Do you have monitoring data from the situation you resolved?
The RabbitMQ documentation mentioned above describes some interesting approaches - it probably makes sense to discuss this in more detail in a dedicated mail thread.

Regards
Marc
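
For anyone who wants to re-check the numbers exchanged in this thread, here is a minimal sketch of the arithmetic. It only reuses figures quoted in the mails (21G for 10k DB connections, the 10% share, the open-files limit of 1048576, the Erlang port limit of 65536 and the tested maximum of 20000). The helper names are made up for illustration, "21G" is read as GiB, and the idea that the lower of the two RabbitMQ limits bounds the connection count is an assumption, not a measured result.

```python
# Back-of-the-envelope check of the figures quoted in this thread.
# The helper names are illustrative only; "21G" is assumed to mean GiB.

GIB = 1024 ** 3
MIB = 1024 ** 2


def per_connection_bytes(total_bytes: float, connections: int) -> float:
    """Average memory attributed to a single connection."""
    return total_bytes / connections


def implied_server_memory(total_bytes: float, share: float) -> float:
    """Server memory implied when total_bytes makes up `share` of it."""
    return total_bytes / share


def connection_ceiling(open_files_limit: int, erlang_port_limit: int) -> int:
    """Assumption: the lower of the two limits bounds the connection count."""
    return min(open_files_limit, erlang_port_limit)


db_total = 21 * GIB
per_conn_mib = per_connection_bytes(db_total, 10_000) / MIB
server_gib = implied_server_memory(db_total, 0.10) / GIB
ceiling = connection_ceiling(1_048_576, 65_536)

print(f"{per_conn_mib:.2f} MiB per DB connection")
print(f"{server_gib:.0f} GiB of server memory implied by the 10% share")
print(f"RabbitMQ ceiling {ceiling}; tested maximum of 20000 fits: {20_000 <= ceiling}")
```

The sketch says nothing about behaviour under load. It only shows that the quoted figures are consistent with each other: roughly 2 MiB per DB connection, and a tested RabbitMQ maximum well below the limits reported from the container.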