[openstack-dev] [Fuel] It is impossible to queue UpdateDnsmasqTask

Aleksandr Didenko adidenko at mirantis.com
Fri Jul 8 10:39:12 UTC 2016


Hi,

Well, we have only one DHCP server, and it serves multiple clusters. Actions
on any of those clusters may affect the DHCP server configuration, so
queueing the tasks that change that configuration seems like a reasonable
way to fix the problem. Options 2 and 3 are therefore much better than
1 or 4.
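
To illustrate the queueing idea, here is a minimal sketch in Python. It is
not nailgun or astute code: the function name, the (action, network) tuples
and the in-memory set standing in for the DHCP server configuration are all
assumptions made up for the example. The point is only that a second update
is queued behind the first one instead of being rejected.

```python
import queue
import threading


def run_serialized(updates):
    """Apply (action, network) updates one at a time, in submission order."""
    admin_networks = set()   # hypothetical stand-in for the shared DHCP config
    applied = []             # order in which updates were actually applied
    tasks = queue.Queue()

    def worker():
        # A single consumer guarantees that updates never run concurrently.
        while True:
            task = tasks.get()
            if task is None:          # sentinel: no more work
                break
            action, network = task
            if action == "add":
                admin_networks.add(network)
            else:                     # treat anything else as "remove"
                admin_networks.discard(network)
            applied.append(task)

    thread = threading.Thread(target=worker)
    thread.start()
    for update in updates:
        tasks.put(update)             # queued instead of rejected
    tasks.put(None)
    thread.join()
    return sorted(admin_networks), applied
```

Because there is a single consumer and the queue is FIFO, the updates are
applied in exactly the order they were submitted, whichever cluster they
came from.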

Regards,
Alex

On Thu, Jul 7, 2016 at 10:59 AM, Georgy Kibardin <gkibardin at mirantis.com>
wrote:

> Continuing to talk to myself :)
>
> Option 4 implies that we generate a puppet manifest containing the set of
> admin networks instead of writing that set into hiera. So we get a two-step
> task which first generates a manifest with a unique name and then calls
> puppet to apply it.
> However, there's still a problem with this approach. When we delete
> environments A and B almost simultaneously (with A going slightly earlier),
> astute acknowledges two UpdateDnsmasqTask tasks for execution, but it
> cannot guarantee that puppet for env A's UpdateDnsmasqTask will be executed
> before the one for env B. This would lead to an incorrect list of admin
> networks at the end of environment deletion.
>
> So option 4 just doesn't work.
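
The ordering problem described above can be shown with a small simulation.
This is only a sketch under assumptions: the network names and manifest
contents are hypothetical, and each puppet apply is modelled as overwriting
the whole admin network list with whatever the manifest contained when it
was generated.

```python
def final_networks(apply_order):
    """Return the admin networks left after applying manifests in this order."""
    # Hypothetical manifests, generated in the order A then B.
    manifests = {
        "A": ["admin_default", "net_B"],  # generated while env B still existed
        "B": ["admin_default"],           # generated after both were deleted
    }
    live = ["admin_default", "net_A", "net_B"]
    for env in apply_order:
        live = list(manifests[env])       # each apply replaces the whole list
    return live


in_order = final_networks(["A", "B"])      # puppet runs in generation order
out_of_order = final_networks(["B", "A"])  # puppet runs in the opposite order
```

With the manifests applied in generation order only the default admin
network remains, which is the expected result. With the order swapped, the
stale manifest for env A wins and net_B is left in the list even though
env B has been deleted.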
>
> On Wed, Jul 6, 2016 at 11:41 AM, Georgy Kibardin <gkibardin at mirantis.com>
> wrote:
>
>> Bulat suggests going with option 4. He suggests merging all actions of
>> UpdateDnsmasqTask into one puppet task. There are three actions: syncing
>> the admin network list to hiera, updating the dhcp ranges, and cobbler
>> sync. The problem I see with this approach is that the current
>> implementation does not provide for passing any additional data to
>> "puppet apply". Cobbler sync seems to be a reasonable part of updating
>> the dhcp ranges config.
>>
>> Best,
>> Georgy
>>
>> On Thu, Jun 16, 2016 at 7:25 PM, Georgy Kibardin <gkibardin at mirantis.com>
>> wrote:
>>
>>> Hi All,
>>>
>>> Currently we can run only one instance of UpdateDnsmasqTask at a time.
>>> An attempt to run a second one causes an exception. At the very least,
>>> this behaviour can leave a cluster stuck forever in the "removing" state
>>> (reproduced in https://bugs.launchpad.net/fuel/+bug/1544493) or just
>>> produce an incomprehensible "task already running" message. So we need
>>> to address the problem somehow. I see the following ways to fix it:
>>>
>>> 1. Just put the cluster into the "error" state, which would allow the
>>> user to remove it later.
>>>    pros: simple and fixes the problem at hand (#1544493)
>>>    cons: it would be hard to detect the "come again later" situation;
>>> quite lame behavior: why don't you "come again later" yourself?
>>>
>>> 2. Implement generic queueing in nailgun.
>>>    pros: quite simple
>>>    cons: it doesn't look like nailgun's responsibility
>>>
>>> 3. Implement generic queueing in astute.
>>>    pros: this behaviour makes sense for astute.
>>>    cons: the implementation would be quite complex; we need to
>>> synchronize execution between separate worker processes.
>>>
>>> 4. Split the task so that each part works with a particular cluster.
>>>    pros: we don't extend our execution model
>>>    cons: nontrivial implementation; no guarantee that we are always able
>>> to split master node tasks on a per-cluster basis.
>>>
>>> Best,
>>> Georgy
>>>
>>
>>
>