Hi,

Looking at the taskflow code, it seems that those last_modified entries may not always be deleted:
https://github.com/openstack/taskflow/blob/master/taskflow/jobs/backends/imp...
I think it's something that could be improved, but it doesn't indicate a potential bug there.

Which Octavia release do you use?

I don't see the octavia_jobboard:listings hash in your output. It is used to keep all the current jobs in taskflow. When a job is posted, an element is added:
https://github.com/openstack/taskflow/blob/master/taskflow/jobs/backends/imp...
When the conductor is started in Octavia (for instance when the worker restarts after a crash/kill), it fetches all the elements of this hash to schedule the jobs:
https://github.com/openstack/taskflow/blob/master/taskflow/jobs/backends/imp...

Are there any suspicious backtraces in the Octavia worker, health manager, or housekeeping logs?

Greg

On Thu, Sep 26, 2024 at 3:15 PM Payne Max <yardalgedal@gmail.com> wrote:
Hi, OpenStack community,
I’ve run into a problem where some of our jobs can get lost by a worker. For example, in the screenshot, SIGTERM was sent a few seconds after a worker received a job.
After that, there were no new log messages related to this job. Later, our client complained that an LB had been stuck in PENDING_UPDATE for several days, and we started an investigation.
Our MySQL (persistent storage) is clean, but in our Redis I can see several jobs without a TTL, and I think they are related to the «lost» jobs.
Is this expected? Could it be related to https://github.com/openstack/octavia/blob/master/octavia/common/base_taskflow.py#L209-L211? Let’s discuss it!
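
For anyone trying to reproduce this, the Redis state discussed in this thread (the octavia_jobboard:listings hash and the keys without a TTL) could be checked with something like the sketch below. It is only an illustration: the function name and the `octavia_jobboard*` key pattern are assumptions, not taken from taskflow or Octavia, and the client is expected to offer `hkeys`, `scan_iter` and `ttl` the way a redis-py client does.

```python
LISTINGS_KEY = "octavia_jobboard:listings"  # hash name mentioned in the reply
KEY_PATTERN = "octavia_jobboard*"  # assumption: jobboard keys share this prefix


def _text(value):
    """Return str for both bytes and str values coming back from Redis."""
    return value.decode() if isinstance(value, bytes) else value


def inspect_jobboard(client, pattern=KEY_PATTERN, listings_key=LISTINGS_KEY):
    """Hypothetical helper: list posted jobs and keys that never expire.

    `client` is any object with hkeys(), scan_iter() and ttl(),
    for example a redis-py client.
    """
    # Fields of the listings hash: one entry per job currently posted.
    listed = sorted(_text(field) for field in client.hkeys(listings_key))

    # TTL returns -1 when a key exists but has no expiry set.
    no_ttl = sorted(
        _text(key)
        for key in client.scan_iter(match=pattern)
        if _text(key) != listings_key and client.ttl(key) == -1
    )
    return {"listed_jobs": listed, "keys_without_ttl": no_ttl}
```

With redis-py installed, this might be called as `inspect_jobboard(redis.Redis(host="localhost"))`, where the host is a placeholder. Keys that show up under `keys_without_ttl` while `listed_jobs` has no matching entry would be candidates for the leftover entries described above.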