[openstack-dev] [nova] [quantum] approaches for having quantum update nova's net info cache
Vishvananda Ishaya
vishvananda at gmail.com
Tue Jun 4 18:11:49 UTC 2013
On Jun 4, 2013, at 8:38 AM, Mike Wilson <geekinutah at gmail.com> wrote:
> Doug,
>
> I'm glad you've brought this up. We had a similar issue to your own initially. I'm not sure our solution is the best one, but it is at least a springboard for discussion. As far as we could tell, instance_nw_info is populated and cached because it is costly to generate the nw_info structure. We were seeing 5 separate calls to the quantum API to generate that structure whenever get_instance_nw_info was called. We are on Folsom, but I don't think the behavior has changed. What we ended up doing was implementing an instance_nw_info API in quantum that basically does the work of all 5 calls in a single MySQL query and returns the structure that nova wants to put together. We also patched nova to always call the quantum API instead of fetching a cache from the db. This solved two problems:
>
> 1. No stale caches
> 2. Reduced the number of calls going to the quantum API by 80%
>
> This may not be the prettiest solution, but without it there is no way we would be able to scale quantum across a large number of nodes. Also the stale cache problem really affects usability. Maybe there is a better reason for having that sitting in the database. I'm sure people more knowledgeable than myself can chime in on this.
This seems like a reasonable approach to me, but I worry about nova list when there are a large number of instances. Perhaps a bulk get_nw_info request with a list of instance_uuids would work?
Vish
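
A minimal sketch of what such a bulk request could look like, assuming a single lookup keyed by a list of instance UUIDs. The function name bulk_get_nw_info, the PORTS table, and its fields are hypothetical stand-ins for illustration, not an existing quantum or nova API:

```python
# Hypothetical in-memory stand-in for quantum's port data; the real
# schema and API are not shown in this thread.
PORTS = [
    {"device_id": "uuid-1", "port_id": "p1",
     "fixed_ip": "10.0.0.3", "floating_ip": "172.24.4.10"},
    {"device_id": "uuid-2", "port_id": "p2",
     "fixed_ip": "10.0.0.4", "floating_ip": None},
    {"device_id": "uuid-3", "port_id": "p3",
     "fixed_ip": "10.0.0.5", "floating_ip": None},
]


def bulk_get_nw_info(instance_uuids):
    """Return {instance_uuid: [port dicts]} for every requested instance.

    One pass over the port data serves the whole list, so "nova list"
    would issue a single request instead of one per instance.
    """
    wanted = set(instance_uuids)
    # Instances with no ports still get an (empty) entry.
    result = {uuid: [] for uuid in instance_uuids}
    for port in PORTS:
        if port["device_id"] in wanted:
            result[port["device_id"]].append(port)
    return result
```

Called with the UUIDs that "nova list" is about to display, this returns one mapping for the whole page of instances, and the caller can fill in addresses without further round trips.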
>
> This is pretty close to what we have in production:
>
> https://github.com/JunPark/quantum/commit/3dd7f677d952eedee99c7e3f2eed6389fcb9d324#quantum/api/v2/attributes.py
>
> -Mike Wilson
>
>
>
> On Tue, Jun 4, 2013 at 8:43 AM, Doug Hellmann <doug.hellmann at dreamhost.com> wrote:
> I've been looking into a problem we're having locally with the network info cache in nova not updating when floating IP associations are changed through quantum. It looks like we need to tell nova about configuration changes quantum is making, and I wanted to work out the best approach on the list before I start submitting blueprints or patches.
>
> The symptoms of the problem are that the floating IPs for instances are not reflected correctly in the output of "nova list" or in the horizon view for instances (after being associated or disassociated). There is almost always a long delay between a floating IP operation and when horizon will see it, when using the quantum API. Using the nova API to add/remove the floating IP updates the cache immediately. We could tell our users this, but Horizon uses the quantum API, so it wouldn't matter in a lot of cases.
>
> The healer task in the compute manager runs every 60 seconds, and updates one instance at a time. After it cycles through the entire set of instances, it reloads the definitions from the DB and starts the cycle over. The comments on the commit that introduced this behavior (08fa534a0, from Feb 2012) imply that the author assumes nova knows about all changes, but this is a case where it doesn't.
>
> Because it is only updating one instance at a time, and the instances don't come in any particular order, there's no way to predict how long it will take to update a given instance. The delay will be between 1 and N minutes, where N is the number of instances on that host (possibly up to 2*N if the instance is created while the healer is working through its cached list of instances). We have a fairly large value for N in our configuration, so we're seeing delays as long as 30-45 minutes even with just a few instances running in a test cluster. During that time the instances work, and the quantum API knows which instance has a floating IP, but the user can't tell via horizon or the nova API.
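
The delay bounds described above can be worked through with a short calculation. This is only a sketch of the arithmetic in the message (one instance refreshed per healer run); the function name and its parameters are made up for illustration:

```python
def healer_delay_bounds(num_instances, interval_seconds=60):
    """Return (best, worst, worst_if_created_mid_pass) delays in minutes.

    Assumes the healer refreshes exactly one instance per run and runs
    every interval_seconds, as described in the message.
    """
    minutes_per_run = interval_seconds / 60.0
    best = minutes_per_run                    # instance is next in line
    worst = num_instances * minutes_per_run   # instance is last in the pass
    # An instance created mid-pass may wait for the current pass to end
    # and then for most of the following pass.
    worst_new = 2 * num_instances * minutes_per_run
    return best, worst, worst_new


print(healer_delay_bounds(10))  # (1.0, 10.0, 20.0)
```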
>
> It seems like we need to have quantum tell nova when a floating IP association changes so that nova can update the cache for the instance right away. There is a conductor API for updating the cache, but that's part of nova's internals so I don't think we want to use it directly. The two other ideas I had were to have nova listen for appropriate notifications from quantum, or to add an admin API extension to let quantum modify the cache from the outside using the REST client (either by sending new contents, or by telling nova to prioritize the update).
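
A rough sketch of the notification-listener idea, assuming quantum emits an event that names the affected instance. The NetInfoCache class, the "floatingip." event type prefix, and the "instance_uuid" payload key are all assumptions for illustration, not confirmed quantum or nova interfaces:

```python
class NetInfoCache:
    """Hypothetical stand-in for nova's per-instance network info cache."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: instance_uuid -> fresh nw_info
        self._cache = {}

    def get(self, instance_uuid):
        # Serve the cached value, fetching only on first access.
        if instance_uuid not in self._cache:
            self._cache[instance_uuid] = self._fetch(instance_uuid)
        return self._cache[instance_uuid]

    def refresh(self, instance_uuid):
        # Replace the cached value with freshly fetched data.
        self._cache[instance_uuid] = self._fetch(instance_uuid)


def handle_notification(cache, event):
    """Refresh one instance's cache entry when a floating IP event arrives.

    Returns True if a refresh happened. The event type prefix and the
    payload key are assumptions, not a documented notification format.
    """
    if event.get("event_type", "").startswith("floatingip."):
        instance_uuid = event.get("payload", {}).get("instance_uuid")
        if instance_uuid is not None:
            cache.refresh(instance_uuid)
            return True
    return False
```

With something along these lines, the affected instance is refreshed as soon as the event is handled, and the periodic healer remains as a fallback for missed notifications.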
>
> Has anyone else encountered this issue? Does either of these suggested approaches make sense?
>
> Doug
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev