[openstack-dev] [ironic] Hardware console processes in multi-conductor environment
Yuriy Zveryanskyy
yzveryanskyy at mirantis.com
Fri Mar 10 15:42:49 UTC 2017
Hi all.
Hardware node consoles have a specific limitation: only a limited number
of concurrent console sessions (often just 1) can be established.
There are some issues (described below) caused by the conflict between
distributed ironic conductor services and local console processes.
This affects only the case of local console processes, currently
shellinabox and socat for example.
There are some possible "global" solutions:
1) Pluggable internal task API [1], currently rejected by the community;
2) Non-pluggable internal task API that uses an external service (the
necessary service does not currently exist in OpenStack);
3) Custom distributed process management based on ssh access
between ironic conductor hosts (looks like a hack);
4) New console interface drivers which implement task management
internally (like "k8s_shellinabox", "k8s_socat").
Partial solutions (some of them proposed below) are also possible.
In a multi-conductor environment, an ironic conductor process can
die or be stopped/blocked (removed), or be started/restarted (added).
Possible cases:
1) Conductor removed
a) Gracefully stopped. Some daemon processes, like shellinabox
for consoles, can continue to run. This issue can be fixed now
as a separate bug.
b) Died/killed. Daemon processes can continue to run. This issue can
be fixed only by distributed task management ("global" solutions above).
c) The whole host with the conductor died. No fix needed.
2) Conductor added/restarted
The new conductor tries to start processes for enabled consoles, but currently
the processes on the conductor hosts that worked with these nodes before are
not stopped [2]. I see two possible solutions for this issue:
1) An "untakeover" periodic task for stopping console processes.
For this solution we should not stop non-local consoles.
2) Do not stop the process on the old conductor. Use redefined RPC routing
(based on the conductor that started the console, saved in the DB) on the
API side for set console, and wait for the console to be stopped via the API.
This routing should also ignore dead conductors.
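
To make the two options above more concrete, here is a rough sketch of the
decision logic only. It is not ironic code: the Node class, the
console_conductor field (the "saved into DB" conductor), the local_console
flag and both function names are hypothetical and exist only for
illustration. How a conductor is judged alive is also left as an assumption
(a plain set of host names is passed in).

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Node:
    """Hypothetical, simplified node record (not ironic's real model)."""
    uuid: str
    console_enabled: bool
    # Hypothetical DB field: host of the conductor that started the console.
    console_conductor: Optional[str]
    # Host that the hash ring currently maps this node to.
    ring_conductor: str
    # True for drivers that spawn a local process (shellinabox, socat).
    local_console: bool = True


def should_untakeover(node: Node, my_host: str) -> bool:
    """Option 1: should the periodic task on my_host stop its console process?

    Only local console processes started by this host are candidates, and
    only when the node is no longer mapped to this host.
    """
    return (
        node.console_enabled
        and node.local_console
        and node.console_conductor == my_host
        and node.ring_conductor != my_host
    )


def route_console_rpc(node: Node, alive_hosts: Set[str]) -> str:
    """Option 2: pick the host that should receive a set-console RPC.

    Prefer the conductor that started the console; if none is saved or it
    is dead, fall back to the host chosen by the hash ring.
    """
    saved = node.console_conductor
    if saved is not None and saved in alive_hosts:
        return saved
    return node.ring_conductor
```

For example, for a node whose console was started on "cond-a" and which the
hash ring now maps to "cond-b", option 1 would stop the process on "cond-a",
while option 2 would keep routing set-console calls to "cond-a" for as long
as that host is alive.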
If you have any ideas, please leave comments.
[1] https://review.openstack.org/#/c/431605/
[2] https://bugs.launchpad.net/ironic/+bug/1632192
Yuriy Zveryanskyy