Trove Instance fails in build stage

Lingxian Kong anlin.kong at gmail.com
Fri Feb 19 19:59:26 UTC 2021


(adding openstack-discuss mail address)

Hi Abhinav,

Typically, there are two ports attached to a Trove instance. One is the mgmt
port, which is meant to communicate with rabbitmq but can also carry other
mgmt traffic (e.g. docker image pull) after routing table or DNS
customization, depending on your business requirements. The other is the
user port, mainly for database access. You can find a more detailed
description here:
https://docs.openstack.org/trove/latest/admin/run_trove_in_production.html#management-network
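
Since the mgmt port has to reach rabbitmq, a quick reachability test from
inside the guest instance can help narrow things down. Below is a minimal
sketch, not part of Trove: the helper name can_reach is made up for
illustration, and the host and port you pass in are assumptions that should
be taken from the transport settings in your own guest agent configuration
(5672 is only the default AMQP port).

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds in time.

    Illustrative helper only: it checks plain TCP reachability and does
    not perform any AMQP handshake or authentication.
    """
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the context manager closes the socket again on exit.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable networks
        # and name resolution failures.
        return False
```

For example, running can_reach("<rabbitmq address>", 5672) in the guest and
getting False would point at routing or security group problems on the
management network rather than at the image itself.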

There is only 1 Nova VM for each Trove instance.

Yes, you are using the right image if you deployed Ussuri Trove.

---
Lingxian Kong
Senior Cloud Engineer (Catalyst Cloud)
Trove PTL (OpenStack)
OpenStack Cloud Provider Co-Lead (Kubernetes)


On Fri, Feb 19, 2021 at 6:25 PM Abhinav Tyagi <abhinav31796 at gmail.com>
wrote:

> Hi Lingxian,
>
> Thanks for the timely help.
>
> We checked the logs and found that the trove guest agent instance is not
> able to connect to the rabbitmq container. We are using OpenStack-Ansible for
> deployment. It's a 3 master setup, but we deployed trove manually on
> one physical compute node.
>
> Could it be because the management network of trove and that of the
> containers are not the same? Should the trove guest instance have 2 NICs?
> When we create a trove database instance, are 2 instances launched? Are we
> using the right image?
>
> Regards,
> Abhinav Tyagi
>
> On Fri, Feb 19, 2021 at 3:08 AM Lingxian Kong <anlin.kong at gmail.com>
> wrote:
>
>> Hi Abhinav,
>>
>> Please check the trove guest agent log according to this guide
>> https://docs.openstack.org/trove/latest/admin/troubleshooting.html.
>>
>> ---
>> Lingxian Kong
>> Senior Cloud Engineer (Catalyst Cloud)
>> Trove PTL (OpenStack)
>> OpenStack Cloud Provider Co-Lead (Kubernetes)
>>
>>
>> On Fri, Feb 19, 2021 at 9:20 AM Abhinav Tyagi <abhinav31796 at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Please help us with using trove. We are using the
>>> https://tarballs.opendev.org/openstack/trove/images/trove-ussuri-mysql-ubuntu-xenial.qcow2
>>> image.
>>> We have a multi-master OpenStack setup.
>>>
>>> These are the logs:
>>>
>>> Feb 18 19:03:30 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:03:30.535 1274599 INFO trove.taskmanager.models [-] Waiting
>>> for instance 5fe896f6-dc7b-424a-87c2-64f207f422a2 up and running with
>>> timeout 900s
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.940 1274599 ERROR oslo.service.loopingcall [-] Dynamic
>>> backoff interval looping call
>>> 'trove.common.utils.build_polling_task.<locals>.poll_and_check' failed:
>>> oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after
>>> 893.31 seconds
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.940 1274599 ERROR oslo.service.loopingcall Traceback
>>> (most recent call last):
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.940 1274599 ERROR oslo.service.loopingcall   File
>>> "/usr/lib/python3.6/site-packages/oslo_service/loopingcall.py", line 154,
>>> in _run_loop
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.940 1274599 ERROR oslo.service.loopingcall     idle =
>>> idle_for_func(result, self._elapsed(watch))
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.940 1274599 ERROR oslo.service.loopingcall   File
>>> "/usr/lib/python3.6/site-packages/oslo_service/loopingcall.py", line 351,
>>> in _idle_for
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.940 1274599 ERROR oslo.service.loopingcall     %
>>> self._error_time)
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.940 1274599 ERROR oslo.service.loopingcall
>>> oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after
>>> 893.31 seconds
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.940 1274599 ERROR oslo.service.loopingcall
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.941 1274599 ERROR trove.taskmanager.models [-] Failed
>>> to create instance 5fe896f6-dc7b-424a-87c2-64f207f422a2, error: Polling
>>> request timed out..: trove.common.exception.PollTimeOut: Polling request
>>> timed out.
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.960 1274599 ERROR trove.taskmanager.models [-] Service
>>> status: ERROR, service error description: guestagent error:
>>> trove.common.exception.PollTimeOut: Polling request timed out.
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.970 1274599 DEBUG trove.db.models [-] Saving
>>> DBInstance: {'_sa_instance_state': <sqlalchemy.orm.state.InstanceState
>>> object at 0x7fb5447bd668>, 'tenant_id': '19aa2578cb444b828549b394eed8ebfb',
>>> 'hostname': None, 'shard_id': None, 'server_status': None,
>>> 'compute_instance_id': 'b928d7e4-6706-48d2-9fb9-c75e19b9765c', 'type':
>>> None, 'deleted': 0, 'id': '5fe896f6-dc7b-424a-87c2-64f207f422a2',
>>> 'task_id': 91, 'region_id': 'Bangalore', 'deleted_at': None,
>>> 'task_description': 'Build error: guestagent timeout.', 'encrypted_key':
>>> '***', 'datastore_version_id': 'b4bbc01c-e785-4944-8a76-6c154c19ddff',
>>> 'task_start_time': None, 'created': datetime.datetime(2021, 2, 18, 13, 33,
>>> 24), 'configuration_id': None, 'volume_id':
>>> '68accc42-34ae-4ac3-93fc-1efb9fed2af9', 'slave_of_id': None, 'updated':
>>> datetime.datetime(2021, 2, 18, 13, 48, 33, 969535), 'flavor_id':
>>> 'bbf4bad4-e258-45b4-8860-e5a51c86ed6e', 'name': 'trove_instance_15',
>>> 'cluster_id': None, 'volume_size': 10, 'errors': {}} save
>>> /usr/lib/python3.6/site-packages/trove/db/models.py:65
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.981 1274599 ERROR trove.taskmanager.models [-] Trove
>>> instance status: ERROR, Trove instance status description: Build error:
>>> guestagent timeout.: trove.common.exception.PollTimeOut: Polling request
>>> timed out.
>>> Feb 18 19:18:33 bngoscompn01.comviva.com trove-taskmanager[1274599]:
>>> 2021-02-18 19:18:33.994 1274599 DEBUG trove.db.models [-] Saving
>>> DBInstanceFault: {'_sa_instance_state': <sqlalchemy.orm.state.InstanceState
>>> object at 0x7fb5448688d0>, 'id': 'f0983744-4cfe-4863-b9ae-6fb33cc8a0d2',
>>> 'created': datetime.datetime(2021, 2, 18, 13, 48, 33, 994216), 'deleted':
>>> False, 'instance_id': '5fe896f6-dc7b-424a-87c2-64f207f422a2', 'message':
>>> 'Polling request timed out.', 'details': 'Traceback (most recent call
>>> last):\n  File "/usr/lib/python3.6/site-packages/trove/common/utils.py",
>>> line 207, in wait_for_task\n    return polling_task.wait()\n  File
>>> "/usr/local/lib/python3.6/site-packages/eventlet/event.py", line 125, in
>>> wait\n    result = hub.switch()\n  File
>>> "/usr/local/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 309, in
>>> switch\n    return self.greenlet.switch()\n  File
>>> "/usr/lib/python3.6/site-packages/oslo_service/loopingcall.py", line 154,
>>> in _run_loop\n    idle = idle_for_func(result, self._elapsed(watch))\n
>>>  File "/usr/lib/python3.6/site-packages/oslo_service/loopingcall.py", line
>>> 351, in _idle_for\n    %
>>> self._error_time)\noslo_service.loopingcall.LoopingCallTimeOut:\n
>>>  Looping call timed out after 893.31 seconds\n\nDuring handling of the
>>> above exception, another exception occurred:\n\nTraceback (most recent call
>>> last):\n  File
>>> "/usr/lib/python3.6/site-packages/trove/taskmanager/models.py", line 431,
>>> in wait_for_instance\n    time_out=timeout)\n  File
>>> "/usr/lib/python3.6/site-packages/trove/common/utils.py", line 222, in
>>> poll_until\n    return wait_for_task(task)\n  File
>>> "/usr/lib/python3.6/site-packages/trove/common/utils.py", line 209, in
>>> wait_for_task\n    raise
>>> exception.PollTimeOut\ntrove.common.exception.PollTimeOut: Polling request
>>> timed out.\n', 'errors': {}, 'updated': datetime.datetime(2021, 2, 18, 13,
>>> 48, 33, 994288)} save /usr/lib/python3.6/site-packages/trove/db/models.py:65
>>>
>>>
>>>
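
For readers of the quoted log: the taskmanager waits up to 900s for the
instance to come up, and the wait ends in PollTimeOut when nothing reports
back in time. The following is a generic sketch of that kind of
poll-with-deadline loop. It is not Trove's or oslo.service's actual code;
the names poll_until and PollTimeOut are borrowed from the traceback purely
for illustration, and the check callable and the sleep_time default are
assumptions.

```python
import time


class PollTimeOut(Exception):
    """Illustrative stand-in for the exception named in the log."""


def poll_until(check, sleep_time=3.0, time_out=900.0):
    """Call check() repeatedly until it returns a truthy value.

    Raises PollTimeOut if the deadline passes first. Generic sketch only,
    not the implementation used by Trove.
    """
    deadline = time.monotonic() + time_out
    while not check():
        if time.monotonic() >= deadline:
            raise PollTimeOut("Polling request timed out.")
        time.sleep(sleep_time)
```

In this picture, a guest agent that cannot reach rabbitmq never makes the
check succeed, so the loop simply runs until the deadline, which would be
consistent with the "guestagent timeout" build error in the log.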