<div dir="ltr"><div style="font-family:arial,sans-serif;font-size:13px">I've been looking into a problem we're having locally with the network info cache in nova not updating when floating IP associations are changed through quantum. It looks like we need to tell nova about configuration changes quantum is making, and I wanted to work out the best approach on the list before I start submitting blueprints or patches.</div>
<div style="font-family:arial,sans-serif;font-size:13px"><br></div><div style="font-family:arial,sans-serif;font-size:13px"><div>The symptom is that, after a floating IP is associated or disassociated, the change is not reflected correctly in the output of "nova list" or in the horizon view for instances. When the change is made through the quantum API, there is almost always a long delay before horizon sees it. Using the nova API to add/remove the floating IP updates the cache immediately. We could tell our users this, but Horizon uses the quantum API, so it wouldn't matter in a lot of cases.</div>
<div><br></div><div><div><div>The healer task in the compute manager runs every 60 seconds, and updates one instance at a time. After it cycles through the entire set of instances, it reloads the definitions from the DB and starts the cycle over. The comments on the commit that introduced this behavior (08fa534a0, from Feb 2012) imply that the author assumes nova knows about all changes, but this is a case where it doesn't.</div>
<div><br></div><div>Because it is only updating one instance at a time, and the instances don't come in any particular order, there's no way to predict how long it will take to update a given instance. The delay will be between 1 and N minutes, where N is the number of instances on that host (possibly up to 2*N if the instance is created while the healer is working through its cached list of instances). We have a fairly large value for N in our configuration, so we're seeing delays as long as 30-45 minutes even with just a few instances running in a test cluster. During that time the instances work, and the quantum API knows which instance has a floating IP, but the user can't tell via horizon or the nova API.</div>
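<div><br></div><div>To make the arithmetic concrete, here is a rough Python model of the healer cycle as I understand it. This is not nova code: the function names and the shuffle standing in for "no particular order" are my own assumptions, and it only measures the delay from the start of a cycle.</div>

```python
import random

HEAL_INTERVAL_MINUTES = 1  # the healer runs every 60 seconds


def minutes_until_healed(instances, target, rng):
    """Return how many minutes pass before `target` is refreshed.

    Simplified model (an assumption, not nova's implementation): the healer
    snapshots the host's instances, refreshes one per run in no particular
    order, and reloads the snapshot only once it is exhausted.
    """
    queue = list(instances)
    rng.shuffle(queue)  # stand-in for "no particular order"
    for runs, instance_id in enumerate(queue, start=1):
        if instance_id == target:
            return runs * HEAL_INTERVAL_MINUTES
    raise ValueError("target is not on this host")


def worst_case_minutes(n_instances, created_mid_cycle=False):
    """Upper bound on the delay for a host with `n_instances` instances."""
    # An instance created after the snapshot was taken waits out the current
    # cycle and, at worst, a full pass through the next snapshot as well.
    cycles = 2 if created_mid_cycle else 1
    return cycles * n_instances * HEAL_INTERVAL_MINUTES
```

<div>Under this model a host with 40 instances gives a worst case of 40 minutes for an existing instance, and 80 minutes for one created mid-cycle.</div>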
<div><br></div><div>It seems like we need to have quantum tell nova when a floating IP association changes so that nova can update the cache for the instance right away. There is a conductor API for updating the cache, but that's part of nova's internals so I don't think we want to use it directly. The two other ideas I had were to have nova listen for appropriate notifications from quantum, or to add an admin API extension to let quantum modify the cache from the outside using the REST client (either by sending new contents, or by telling nova to prioritize the update).</div>
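<div><br></div><div>As a sketch of the "prioritize the update" variant, the healer's work list could grow a priority lane that a notification handler (or an admin API call) feeds. Everything here is hypothetical: the class and function names, the event type strings, and the assumption that the payload already carries an instance ID are mine, not existing nova or quantum interfaces.</div>

```python
from collections import deque

# Assumed event type names; the real quantum notification names may differ.
FLOATING_IP_EVENTS = frozenset(["floatingip.update.end", "floatingip.delete.end"])


class HealerQueue(object):
    """Toy model of the healer's work list with a priority lane."""

    def __init__(self, load_instances):
        self._load_instances = load_instances  # callable returning instance IDs
        self._pending = deque()  # the normal one-per-run cycle
        self._urgent = deque()   # instances whose cache is known to be stale

    def prioritize(self, instance_id):
        """Ask for `instance_id` to be refreshed on the next healer run."""
        if instance_id not in self._urgent:
            self._urgent.append(instance_id)

    def next_instance(self):
        """Return the instance the next healer run should refresh, or None."""
        if self._urgent:
            return self._urgent.popleft()
        if not self._pending:
            # Cycle finished: reload the instance list and start over.
            self._pending.extend(self._load_instances())
        return self._pending.popleft() if self._pending else None


def handle_notification(queue, event):
    """Route a floating IP notification to the priority lane.

    Returns True if the event caused an instance to be prioritized.
    """
    if event.get("event_type") not in FLOATING_IP_EVENTS:
        return False
    # Hypothetical payload key; mapping a port to its instance is skipped here.
    instance_id = event.get("payload", {}).get("instance_id")
    if instance_id is None:
        return False
    queue.prioritize(instance_id)
    return True
```

<div>With something like this the affected instance would be refreshed on the next healer run, within about a minute, while the regular cycle continues unchanged for everything else.</div>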
</div><div><br></div></div><div>Has anyone else encountered this issue? Does either of these suggested approaches make sense?</div><div><br></div><div style>Doug</div><div style><br></div></div></div>