[nova][ironic][ptg] Resource tracker scaling issues

Jim Rollenhagen jim at jimrollenhagen.com
Tue Nov 12 16:44:47 UTC 2019


On Tue, Nov 12, 2019 at 11:38 AM Belmiro Moreira <
moreira.belmiro.email.lists at gmail.com> wrote:

> Dan Smith just pointed me to the conductor groups that were added in Stein.
>
> https://specs.openstack.org/openstack/nova-specs/specs/stein/implemented/ironic-conductor-groups.html
> This is an interesting way to partition the deployment, and it's much better
> than the plain multiple nova-computes setup.
>

Just a note, they aren't mutually exclusive. You can run multiple
nova-computes to manage a single conductor group, whether for HA or because
you're using groups for some other construct (cells, racks, halls, network
zones, etc) which you want to shard further.
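
For reference, a minimal nova.conf sketch of what that can look like, assuming
the [ironic] partition_key / peer_list options described in the Stein spec
linked above (option names and defaults may vary by release, so treat this as
illustrative only):

    [ironic]
    # Only manage ironic nodes whose conductor_group matches this value.
    partition_key = rack-42
    # Hostnames of the other nova-computes serving the same partition,
    # so they can take over each other's nodes for HA.
    peer_list = ironic-compute-01,ironic-compute-02

Leaving partition_key unset should keep the old behaviour of considering
every node, per the spec.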

// jim


> Thanks,
> Belmiro
> CERN
>
> On Tue, Nov 12, 2019 at 5:06 PM Belmiro Moreira <
> moreira.belmiro.email.lists at gmail.com> wrote:
>
>> Hi,
>> Using several cells for the Ironic deployment would be great; however, it
>> doesn't work with the current architecture.
>> The nova ironic driver gets all the nodes available in Ironic, which means
>> that if we have several cells, all of them will report the same nodes!
>> The other possibility is a dedicated Ironic instance per cell, but that
>> makes a large deployment very hard to manage.
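
(For illustration, and very much simplified compared to the real driver code:
every nova-compute running the ironic driver asks the one Ironic API for all
nodes, so two cells pointed at the same Ironic would both report the full
inventory. Roughly:)

    # Simplified illustration, not the actual nova ironic driver code.
    import openstack  # openstacksdk

    conn = openstack.connect(cloud='my-ironic-cloud')  # hypothetical cloud name
    all_nodes = list(conn.baremetal.nodes())           # unfiltered: every node
    # A nova-compute in *each* cell would end up reporting all of these.
    print('%d nodes visible to this compute' % len(all_nodes))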
>>
>> What we are trying to do is shard the ironic nodes across several
>> nova-computes.
>> A nova/ironic deployment supports several nova-computes, and it would be
>> great if the resource tracker (RT) node cycle were sharded between them.
>>
>> In any case, this will also require speeding up the big lock.
>> It would be great if a single nova-compute could handle more than 500
>> ironic nodes. For our use case that means 15k / 500 = 30 nova-computes.
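
(For illustration only, a rough sketch of the sharding idea: pick a stable
owner for every ironic node from the list of running nova-computes, so each
RT cycle only walks its own slice. The names below are made up, and a naive
modulo is used just to show the idea:)

    # Naive sketch of sharding ironic nodes across nova-computes.
    import hashlib

    def owning_compute(node_uuid, compute_hosts):
        """Deterministically map a node to one compute host."""
        digest = int(hashlib.sha256(node_uuid.encode()).hexdigest(), 16)
        hosts = sorted(compute_hosts)
        return hosts[digest % len(hosts)]

    computes = ['ironic-compute-%02d' % i for i in range(30)]   # 15k / 500 = 30
    node = '1be26c0b-03f2-4d2e-ae87-c02d7f33c123'                # example UUID
    print(owning_compute(node, computes))

As far as I know the ironic driver already distributes nodes along these lines
with a consistent hash ring (which limits reshuffling when a compute joins or
leaves); the remaining pain point is the serial RT loop inside each compute.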
>>
>> Belmiro
>> CERN
>>
>>
>>
>> On Mon, Nov 11, 2019 at 9:13 PM Matt Riedemann <mriedemos at gmail.com>
>> wrote:
>>
>>> On 11/11/2019 7:03 AM, Chris Dent wrote:
>>> > Or using
>>> > separate processes? For the ironic and vsphere contexts, increased
>>> > CPU usage by the nova-compute process does not impact the
>>> > workload resources, so parallelization is likely a good option.
>>>
>>> I don't know how much it would help - someone would have to actually
>>> test it out and get metrics - but one easy win might just be using a
>>> thread or process executor pool here [1] so that N compute nodes could
>>> be processed through the update_available_resource periodic task
>>> concurrently, maybe $ncpu or some factor thereof. By default make it
>>> serialized for backward compatibility and non-ironic deployments. Making
>>> that too highly concurrent could have negative impacts on other things
>>> running on that host, like the neutron agent, or potentially storming
>>> conductor/rabbit with a ton of DB requests from that compute.
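
(To make that concrete, a minimal sketch of the idea, not actual nova code;
the helper name and default below are made up, and the real per-node loop
lives in the periodic task at [1]:)

    # Sketch: fan the per-node RT update out to a bounded executor pool.
    # max_workers=1 preserves today's serialized behaviour.
    import futurist  # futurist is already in nova's requirements

    def _update_nodes(context, rt, nodenames, max_workers=1):
        executor = futurist.GreenThreadPoolExecutor(max_workers=max_workers)
        try:
            futures = [executor.submit(rt.update_available_resource, context, n)
                       for n in nodenames]
            for fut in futures:
                fut.result()   # surface any per-node exception
        finally:
            executor.shutdown(wait=True)

A futurist.ThreadPoolExecutor or ProcessPoolExecutor would slot in the same
way if green threads don't buy enough, which is Chris's separate-processes
point above.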
>>>
>>> That doesn't help with the scenario where the big
>>> COMPUTE_RESOURCE_SEMAPHORE lock is held by the periodic task while
>>> spawning, moving, or deleting an instance that also needs access to the
>>> big lock to update the resource tracker, but baby steps (if any steps in
>>> this area of the code) would be my recommendation.
>>>
>>> [1]
>>>
>>> https://github.com/openstack/nova/blob/20.0.0/nova/compute/manager.py#L8629
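
(For context on that big lock, the pattern in the resource tracker is roughly
the following; this is a simplification, not the real code:)

    # Roughly the shape of nova/compute/resource_tracker.py (simplified):
    # every entry point serializes on one semaphore, so a long periodic
    # sweep over thousands of ironic nodes delays claims for spawn/move/delete.
    from oslo_concurrency import lockutils

    COMPUTE_RESOURCE_SEMAPHORE = 'compute_resources'

    class ResourceTracker(object):

        @lockutils.synchronized(COMPUTE_RESOURCE_SEMAPHORE)
        def update_available_resource(self, context, nodename):
            ...  # the periodic task ends up here, once per node

        @lockutils.synchronized(COMPUTE_RESOURCE_SEMAPHORE)
        def instance_claim(self, context, instance, nodename, allocations):
            ...  # spawning an instance needs the same lock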
>>>
>>> --
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>>>

