[openstack-dev] [nova] [quantum] approaches for having quantum update nova's net info cache

Mike Wilson geekinutah at gmail.com
Tue Jun 4 15:38:02 UTC 2013


Doug,

I'm glad you've brought this up. We initially had an issue similar to yours.
I'm not sure our solution is the best one, but it is at least a
springboard for discussion. As far as we could tell, instance_nw_info is
populated and cached because the nw_info structure is costly to generate.
We were seeing 5 separate calls to the quantum API to generate
that structure whenever get_instance_nw_info was called. We are on Folsom,
but I don't think the behavior has changed. What we ended up doing was
implementing an instance_nw_info API in quantum that basically does the work
of all 5 calls in a single mysql query and returns the structure that nova
wants to put together. We also patched nova to always call the quantum API
instead of fetching a cached copy from the db. This solved two problems:

1. No stale caches
2. Reduced the number of quantum API calls per nw_info lookup by 80% (from 5 to 1)

This may not be the prettiest solution, but without it there is no way we
would be able to scale quantum across a large number of nodes. Also, the
stale cache problem really affects usability. Maybe there is a good
reason for keeping the cache in the database that we are missing. I'm sure
people more knowledgeable than I am can chime in on this.

The following commit is pretty close to what we have in production:

https://github.com/JunPark/quantum/commit/3dd7f677d952eedee99c7e3f2eed6389fcb9d324#quantum/api/v2/attributes.py
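
For readers who don't want to dig through the diff, here is a toy sketch of
the aggregation idea. It is not the code from the commit above: the schema,
the table and column names, and the get_instance_nw_info helper are all
hypothetical, and sqlite3 stands in for mysql so the example runs on its own.
The five resources joined here are only illustrative and are not a claim about
which five calls nova actually makes.

```python
import sqlite3

# Hypothetical, heavily simplified schema; the real quantum tables differ.
SCHEMA = """
CREATE TABLE networks (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE subnets (id TEXT PRIMARY KEY, network_id TEXT, cidr TEXT);
CREATE TABLE ports (id TEXT PRIMARY KEY, network_id TEXT,
                    device_id TEXT, mac_address TEXT);
CREATE TABLE ipallocations (port_id TEXT, subnet_id TEXT, ip_address TEXT);
CREATE TABLE floatingips (fixed_port_id TEXT, floating_ip_address TEXT);
"""

# One joined query instead of one lookup per resource type.
QUERY = """
SELECT p.id, p.mac_address, n.name, s.cidr, a.ip_address,
       f.floating_ip_address
FROM ports p
JOIN networks n ON n.id = p.network_id
JOIN ipallocations a ON a.port_id = p.id
JOIN subnets s ON s.id = a.subnet_id
LEFT JOIN floatingips f ON f.fixed_port_id = p.id
WHERE p.device_id = ?
"""

COLUMNS = ("port_id", "mac", "network", "cidr", "fixed_ip", "floating_ip")


def get_instance_nw_info(conn, instance_id):
    """Return one dict per fixed IP of the instance, using a single query."""
    rows = conn.execute(QUERY, (instance_id,)).fetchall()
    return [dict(zip(COLUMNS, row)) for row in rows]


def demo():
    """Build a tiny in-memory database and look up one instance."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO networks VALUES ('net1', 'private')")
    conn.execute("INSERT INTO subnets VALUES ('sub1', 'net1', '10.0.0.0/24')")
    conn.execute(
        "INSERT INTO ports VALUES ('port1', 'net1', 'vm1', 'fa:16:3e:00:00:01')"
    )
    conn.execute("INSERT INTO ipallocations VALUES ('port1', 'sub1', '10.0.0.5')")
    conn.execute("INSERT INTO floatingips VALUES ('port1', '203.0.113.5')")
    return get_instance_nw_info(conn, "vm1")
```

The LEFT JOIN keeps ports that have no floating IP; for those rows the
floating_ip field is simply None.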

-Mike Wilson



On Tue, Jun 4, 2013 at 8:43 AM, Doug Hellmann
<doug.hellmann at dreamhost.com> wrote:

> I've been looking into a problem we're having locally with the network
> info cache in nova not updating when floating IP associations are changed
> through quantum. It looks like we need to tell nova about configuration
> changes quantum is making, and I wanted to work out the best approach on
> the list before I start submitting blueprints or patches.
>
> The symptoms of the problem are that the floating IPs for instances are
> not reflected correctly in the output of "nova list" or in the horizon view
> for instances (after being associated or disassociated). There is almost
> always a long delay between a floating IP operation and when horizon will
> see it, when using the quantum API. Using the nova API to add/remove the
> floating IP updates the cache immediately. We could tell our users this,
> but Horizon uses the quantum API, so it wouldn't matter in a lot of cases.
>
> The healer task in the compute manager runs every 60 seconds, and updates
> one instance at a time. After it cycles through the entire set of
> instances, it reloads the definitions from the DB and starts the cycle
> over. The comments on the commit that introduced this behavior (08fa534a0,
> from Feb 2012) imply that the author assumes nova knows about all changes,
> but this is a case where it doesn't.
>
> Because it is only updating one instance at a time, and the instances
> don't come in any particular order, there's no way to predict how long it
> will take to update a given instance. The delay will be between 1 and N
> minutes, where N is the number of instances on that host (possibly up to
> 2*N if the instance is created while the healer is working through its cached
> list of instances). We have a fairly large value for N in our
> configuration, so we're seeing delays as long as 30-45 minutes even with
> just a few instances running in a test cluster. During that time the
> instances work, and the quantum API knows which instance has a floating IP,
> but the user can't tell via horizon or the nova API.
>
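
A rough sketch of the timing described above, under simplifying assumptions:
the healer refreshes exactly one instance every 60 seconds, walks its cached
list in an arbitrary (here shuffled) order, and the floating IP change happens
just as a new pass begins. The function names are made up for illustration and
are not nova code; the 2*N case for instances created mid-pass is not modeled.

```python
import random

HEAL_INTERVAL_SECONDS = 60  # interval quoted in the description above


def worst_case_delay_minutes(num_instances):
    """Upper bound for one full pass: the target is refreshed last."""
    return num_instances * HEAL_INTERVAL_SECONDS / 60


def simulate_delay_minutes(num_instances, target, rng):
    """Minutes until `target` is refreshed in one pass over a shuffled list."""
    order = list(range(num_instances))
    rng.shuffle(order)  # instances come in no particular order
    # One instance is refreshed per tick, so position k waits k + 1 ticks.
    ticks = order.index(target) + 1
    return ticks * HEAL_INTERVAL_SECONDS / 60


def average_delay_minutes(num_instances, runs, seed=0):
    """Average simulated delay for instance 0 over `runs` shuffled passes."""
    rng = random.Random(seed)
    total = sum(simulate_delay_minutes(num_instances, 0, rng) for _ in range(runs))
    return total / runs
```

With N instances on a host the simulated delay always falls between 1 and N
minutes, which matches the range given in the paragraph above.
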
> It seems like we need to have quantum tell nova when a floating IP
> association changes so that nova can update the cache for the instance
> right away. There is a conductor API for updating the cache, but that's
> part of nova's internals so I don't think we want to use it directly. The
> two other ideas I had were to have nova listen for appropriate
> notifications from quantum, or to add an admin API extension to let quantum
> modify the cache from the outside using the REST client (either by sending
> new contents, or by telling nova to prioritize the update).
>
> Has anyone else encountered this issue? Does either of these suggested
> approaches make sense?
>
> Doug