26 Sep 2024, 4:34 p.m.
Hi OpenStack community,

I've run into a problem where a worker can lose some of our jobs. For example, in the screenshot below, SIGTERM was sent a few seconds after the worker received a job.

[cid:image001.png@01DB1013.D648A630]

After that, there were no further log messages related to this job. Later, our client complained that a load balancer had been stuck in PENDING_UPDATE for several days, and we started investigating.

Our MySQL database (persistent storage) is clean, but in Redis I can see several jobs without a TTL, and I think they are related to the "lost" jobs.

[cid:image002.png@01DB1014.1DFF1BD0]

Is this situation expected? Could it be related to https://github.com/openstack/octavia/blob/master/octavia/common/base_taskflo...

Let's discuss it!
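
For anyone who wants to check their own deployment for the same symptom, here is a minimal sketch of how keys without a TTL could be listed. It assumes a redis-py style client (anything exposing `scan_iter` and `ttl`). The function name `keys_without_ttl` and the default pattern `"*"` are illustrative only; I have not confirmed the key prefix the jobboard actually uses, so the pattern would need to be narrowed by hand.

```python
def keys_without_ttl(client, pattern="*"):
    """Return the keys matching `pattern` that exist but never expire.

    `client` is assumed to behave like redis.Redis: `scan_iter(match=...)`
    yields key names and `ttl(name)` returns the remaining lifetime in seconds.
    """
    persistent = []
    for key in client.scan_iter(match=pattern):
        # Redis TTL semantics: -1 means the key has no expiry,
        # -2 means the key no longer exists.
        if client.ttl(key) == -1:
            persistent.append(key)
    return persistent


# Hypothetical usage against a real server (host and port are placeholders):
#   import redis
#   client = redis.Redis(host="localhost", port=6379)
#   for key in keys_without_ttl(client):
#       print(key)
```

`scan_iter` is used rather than `KEYS` so that the check does not block a production Redis instance while it walks the keyspace.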