[openstack-dev] Scheduler proposal

Joshua Harlow harlowja at outlook.com
Thu Oct 8 15:38:57 UTC 2015


Joshua Harlow wrote:
> On Thu, 8 Oct 2015 10:43:01 -0400
> Monty Taylor <mordred at inaugust.com>  wrote:
>
>> On 10/08/2015 09:01 AM, Thierry Carrez wrote:
>>> Maish Saidel-Keesing wrote:
>>>> Operational overhead has a cost - maintaining 3 different database
>>>> tools, backing them up, providing HA, etc. has operational cost.
>>>>
>>>> This is not to say that this cannot be overcome, but it should be
>>>> taken into consideration.
>>>>
>>>> And *if* they can be consolidated into an agreed solution across
>>>> the whole of OpenStack - that would be highly beneficial (IMHO).
>>> Agreed, and that ties into the similar discussion we recently had
>>> about picking a common DLM. Ideally we'd only add *one* general
>>> dependency and use it for locks / leader election / syncing status
>>> around.
>>>
>> ++
>>
>> All of the proposed DLM tools can fill this space successfully. There
>> is definitely not a need for multiple.
>
> On this point, and just thinking out loud: if we consider saving
> compute_node information into, say, a node in the chosen DLM backend
> (for example a znode in zookeeper[1]), this information would be
> updated periodically by that compute_node *itself* (it could, say,
> contain information about what VMs are running on it, what their
> utilization is, and so on).
>
> For example the following layout could be used:
>
> /nova/compute_nodes/<hypervisor-hostname>
>
> The <hypervisor-hostname> znode's data could be:
>
> {
>      vms: [],
>      memory_free: XYZ,
>      cpu_usage: ABC,
>      memory_used: MNO,
>      ...
> }
>
> Now if we imagine each/all schedulers having watches
> on /nova/compute_nodes/ ([2] consul and etcd have equivalent concepts
> afaik), then when a compute_node updates that information a push
> notification (the watch being triggered) will be sent to the
> scheduler(s), and the scheduler(s) could then update a local in-memory
> cache of the data about all the hypervisors that can be selected from
> for scheduling. This avoids reading a large set of data in the first
> place (besides an initial read-once on startup to fetch the initial
> list and set up the watches); in effect it's a push-notification
> model. Then when scheduling a VM -> hypervisor there isn't any need to
> query anything but the local in-memory representation that the
> scheduler is maintaining (and updating as watches are triggered)...
>
> So this is why I was wondering about what capabilities of cassandra
> are being used here; because the above, I think, are unique
> capabilities of DLM-like systems (zookeeper, consul, etcd) that could
> be advantageous here...
>
> [1]
> https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#sc_zkDataModel_znodes
>
> [2]
> https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#ch_zkWatches
>
>
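To make the watch-based scheme above concrete, here's a minimal sketch 
of both sides using the kazoo zookeeper client. The path layout, update 
period, and function names are illustrative guesses on my part, not 
actual nova code:

# A minimal sketch of both sides, assuming the kazoo zookeeper client;
# the layout, update period, and names are illustrative, not nova code.
import json
import socket
import time

from kazoo.client import KazooClient

NODE_ROOT = '/nova/compute_nodes'


def publish_compute_state(zk):
    """Compute-node side: periodically write our state to our znode."""
    path = '%s/%s' % (NODE_ROOT, socket.gethostname())
    zk.ensure_path(NODE_ROOT)
    if not zk.exists(path):
        zk.create(path, b'{}')
    while True:
        # A real implementation would gather these from the hypervisor.
        state = {'vms': [], 'memory_free': 0,
                 'cpu_usage': 0.0, 'memory_used': 0}
        zk.set(path, json.dumps(state).encode('utf-8'))
        time.sleep(60)  # update period: an arbitrary tunable


def watch_compute_state(zk):
    """Scheduler side: keep an in-memory cache updated via watches."""
    cache = {}
    watched = set()

    def make_data_watcher(hostname):
        def on_data(data, stat):
            if data:
                cache[hostname] = json.loads(data.decode('utf-8'))
        return on_data

    @zk.ChildrenWatch(NODE_ROOT)
    def on_children(hostnames):
        # Fires on startup and whenever a node appears or disappears.
        for gone in set(cache) - set(hostnames):
            del cache[gone]
        for hostname in set(hostnames) - watched:
            watched.add(hostname)
            zk.DataWatch('%s/%s' % (NODE_ROOT, hostname),
                         make_data_watcher(hostname))

    return cache


zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()
cache = watch_compute_state(zk)
# Scheduling decisions now read only from 'cache'; no DB round-trips.

Nothing here is zookeeper-specific in spirit; consul's watches and 
etcd's watch API could back the same shape.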

And here's a final super-awesomeness:

Use the very existence of that znode + its information (perhaps using 
ephemeral znodes or an equivalent) to determine whether a hypervisor is 
'alive' or 'dead', thus removing the need for the periodic writes to, 
and queries against, the nova database that currently determine whether 
a hypervisor's nova-compute service is alive or dead (with reads via 
https://github.com/openstack/nova/blob/master/nova/servicegroup/drivers/db.py#L33 
and other similar code scattered around nova)...
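
A rough sketch of how that could look (again assuming kazoo and the 
hypothetical /nova/compute_nodes layout from above; the is_up helper is 
mine, not nova's):

# Sketch only: liveness from ephemeral znodes, assuming kazoo and the
# hypothetical /nova/compute_nodes layout above.
import socket

from kazoo.client import KazooClient

zk = KazooClient(hosts='127.0.0.1:2181')
zk.start()

path = '/nova/compute_nodes/%s' % socket.gethostname()

# Compute side: an ephemeral znode is removed by zookeeper itself when
# this client's session dies, so registration doubles as a heartbeat.
# The periodic zk.set() state updates from the earlier sketch still
# work, since ephemeral znodes can be updated like any other.
zk.ensure_path('/nova/compute_nodes')
zk.create(path, b'{}', ephemeral=True)


# Servicegroup/scheduler side: 'alive' simply means the znode exists
# (and a ChildrenWatch on the parent pushes join/leave events for free).
def is_up(hostname):
    return zk.exists('/nova/compute_nodes/%s' % hostname) is not None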
