[Openstack-operators] Problems running large numbers of instances

Linux Datacenter linuxdatacenter at gmail.com
Mon Jan 16 13:04:09 UTC 2012


Hi Paul,

Thanks for the follow up ;-)

I've been sitting on this problem for a couple of hours now. Here's what I
determined:

Changing the sqlalchemy-related flags in
/usr/lib/python2.7/dist-packages/nova/flags.py did not help:

DEFINE_integer('sql_pool_timeout', 30,
               'seconds to wait for connection from pool before erroring')
DEFINE_integer('sql_min_pool_size', 30,
               'minimum number of SQL connections to pool')
DEFINE_integer('sql_max_pool_size', 30,
               'maximum number of SQL connections to pool')


So I ended up doing what you suggested: I wrote a small wrapper around
euca2ools, which takes the same arguments and launches the required number
of VMs one by one in a loop:

http://paste.openstack.org/show/4312/

Here's how you launch it (exactly like euca-run-instances):

./local-run-instances.py -n10 -k <key> ami-XXXXXXXX
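
For readers who cannot reach the paste, here is a minimal sketch of what such a wrapper could look like. It is not the script from the paste: the function names split_count and run_one_by_one are hypothetical, and it assumes only that euca-run-instances accepts -n/--instance-count and passes every other argument through unchanged.

```python
#!/usr/bin/env python
"""Sketch: launch N instances one at a time instead of one "-n N" request."""
import subprocess
import sys
import time


def split_count(args):
    """Return (count, remaining_args) with any instance-count option removed."""
    count, rest = 1, []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg in ("-n", "--instance-count"):
            # Separate form: "-n 10"
            count = int(args[i + 1])
            i += 2
        elif arg.startswith("-n") and arg[2:].isdigit():
            # Attached form: "-n10"
            count = int(arg[2:])
            i += 1
        else:
            rest.append(arg)
            i += 1
    return count, rest


def run_one_by_one(args, runner=subprocess.call, delay=0.0):
    """Call euca-run-instances once per instance; return the failure count."""
    count, rest = split_count(args)
    failures = 0
    for _ in range(count):
        cmd = ["euca-run-instances", "-n", "1"] + rest
        if runner(cmd) != 0:
            failures += 1
        if delay:
            # Optional pause between launches (an assumption, not a known fix).
            time.sleep(delay)
    return failures


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: %s [-n COUNT] <euca-run-instances arguments>" % sys.argv[0])
    else:
        sys.exit(1 if run_one_by_one(sys.argv[1:]) else 0)
```

Each request then asks for a single instance, so no one API call has to allocate dozens of fixed IPs at once. The runner parameter exists only so the loop can be exercised without a cloud behind it.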

Cheers,


On 14 January 2012 00:37, Paul Guth <gunther at cloudscaling.com> wrote:

> We've run into the same problem and have filed a bug but haven't fixed it
> yet:
>
> https://bugs.launchpad.net/nova/+bug/907125
>
> For now I'd recommend not using "euca-run-instances -n" or using a smaller
> number.  Not exactly a fix....  :)
>
> --paul
>
> On Fri, Jan 13, 2012 at 12:57 AM, Linux Datacenter <
> linuxdatacenter at gmail.com> wrote:
>
>> Hi,
>>
>> When I try to run a large number of instances in a single run, like:
>> euca-run-instances -n <some_large_number>
>>
>> where some_large_number is something like 30-50
>>
>> I hit a bug which I guess is related to sqlalchemy:
>>
>> in nova-api.log:
>> nova.api): TRACE: DetachedInstanceError: Parent instance <FixedIp at
>> 0x6d9fdd0> is not bound to a Session; lazy load operation of attribute
>> 'network' cannot proceed
>>
>> in nova-network.log:
>> (nova.rpc): TRACE: TimeoutError: QueuePool limit of size 10 overflow 10
>> reached, connection timed out, timeout 30
>>
>> euca-describe-instances:
>> Unknownerror
>>
>> Eventually, when I run euca-describe-instances, some of the instances are
>> stuck in "pending" state and never run.
>>
>>
>> I also tried to use cloud-run-instances, but it ends up with the same
>> error:
>>
>> Traceback (most recent call last):
>>   File "/usr/bin/cloud-run-instances", line 654, in <module>
>>     main()
>>   File "/usr/bin/cloud-run-instances", line 466, in main
>>     raise Exception('%s failed' % cmd)
>> Exception: euca-run-instances failed
>>
>>
>> Does anyone have a remedy for this?
>>
>>
>> --
>> check out my blog on linux clusters:
>> -- linuxdatacenter.blogspot.com --
>>
>> _______________________________________________
>> Openstack-operators mailing list
>> Openstack-operators at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>>
>
>
> --
> Paul Guth
> Cloudscaling <http://www.cloudscaling.com/> Technical Operations
> skype: pguth66
> phone: +1 408 647 5128
>
>
>


-- 
check out my blog on linux clusters:
-- linuxdatacenter.blogspot.com --