<div dir="ltr">I have been recently investigating reports of slowness for list responses in the Neutron API.<div>This was first reported in [1], and then recently was observed with both the ML2 and the NSX plugins.</div><div>
The root cause of this issue is that a policy engine check is performed for every attribute of every resource returned in a response.</div><div>When a tenant grows to a large number of ports, or when the API is called with admin credentials and no filters, this can become a non-negligible scale issue.</div>
<div>This issue is mostly due to three factors:</div><div>1) A log statement printing a line in the log for every attribute for which no policy criterion is defined; this has been addressed by [2]</div><div>2) The fact that for every check Neutron currently verifies whether cached policy rules are still valid [3]</div>
<div>3) The fact that Neutron performs a great many policy checks where it should not</div><div><br></div><div>Despite the improvements in [2] and [3] (mostly [2]), for a list operation Neutron still spends on post-plugin operations (i.e. policy checks) about 50% of the time it spends in the plugin.</div>
<div>Solving this problem is not difficult, but it might require changes which are worth discussing on the mailing list.</div><div>Up to the Havana release, policy checks were performed in the plugin; this basically made responses dependent on the plugin implementation and was terrible for API compatibility and portability; we took care of that with [4], which moved all policy checks to the API layer. However, for one thing that we fixed, another was broken (*)</div>
<div><br></div><div>The API layer for list responses puts every item through policy checks to see which should not be visible to the user at all, which is fine.</div><div>However it also puts every attribute through a policy check to exclude those which should not be visible to the user, such as provider attributes for regular users.</div>
<div>Doing this for every resource might make sense if an attribute should be visible or not according to the data in the resource itself.</div><div>For instance a policy that shows port binding attributes could be defined for all the ports whose name is "ernest".</div>
<div>This might look like great flexibility, but does it make any sense at all?</div><div>Does it make sense that an API list operation returns one set of attributes for some items and a different one for others?</div><div>
I think not.</div><div><br></div><div>For this reason I think we should make what is technically a simple change: use policy checks to determine the list of attributes to show only once per list response, and then re-use that list for the whole response.</div>
<div>The limitation here is that we should not have 'attribute-level' policies (**) which rely on the resource value.</div><div>I think this limitation is fair. If you like the approach I have some code here: <a href="http://paste.openstack.org/show/75371/">http://paste.openstack.org/show/75371/</a></div>
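<div>To make the difference concrete, here is a minimal, self-contained sketch (not the code in the paste above, and not Neutron's real policy API). The function attribute_visible is a hypothetical stand-in for the policy engine, and the "provider:" / "binding:" prefixes are only an assumption used to mimic admin-only attributes. It compares checking every attribute of every item with deciding the visible attributes once per response.</div>
<pre>
```python
# Hypothetical stand-in for the policy engine: returns True when the caller
# may see the given attribute. This is NOT Neutron's real policy API.
def attribute_visible(context, action, attr_name):
    if attr_name.startswith("provider:") or attr_name.startswith("binding:"):
        return context.get("is_admin", False)
    return True


def filter_per_item(context, action, items):
    """Behaviour described above: one check per attribute of every item."""
    checks = 0
    result = []
    for item in items:
        visible = {}
        for name, value in item.items():
            checks += 1
            if attribute_visible(context, action, name):
                visible[name] = value
        result.append(visible)
    return result, checks


def filter_once(context, action, items, attr_names):
    """Proposed behaviour: decide the visible attributes once, then reuse."""
    allowed = [n for n in attr_names if attribute_visible(context, action, n)]
    checks = len(attr_names)
    result = [{n: item[n] for n in allowed if n in item} for item in items]
    return result, checks


# Illustrative data only: 1000 ports with one admin-only attribute each.
ports = [
    {"id": str(i), "name": "port%d" % i, "binding:vif_type": "ovs"}
    for i in range(1000)
]
user = {"is_admin": False}
attrs = ["id", "name", "binding:vif_type"]

per_item, per_item_checks = filter_per_item(user, "get_port", ports)
once, once_checks = filter_once(user, "get_port", ports, attrs)
```
</pre>
<div>Both functions return the same response in this sketch, because no attribute-level rule depends on the resource value; the number of checks drops from one per attribute per item to one per attribute.</div>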
<div><br></div><div>And this leads me to the second part of the discussion I'd like to start.</div><div>The policy engine currently allows me to start a Neutron server where, for instance, port bindings are visible to admins only, and another Neutron server where any user can see them.</div>
<div>This makes the API not really portable, as people programming against the Neutron API might encounter unexpected behaviours.</div><div>To address this, one solution would be to 'hardcode' attributes' access rights into the extension definitions. This way port bindings will always be admin_only regardless of which Neutron endpoint one is accessing.</div>
<div>However, there are two drawbacks to this approach:</div><div>1 - This could break existing deployments which tweaked the default policy, so the upgrade should be carefully planned</div><div>2 - This will make API definitions dependent on entries in policy.json. If a policy definition states that an attribute is admin_only, one will also have to ensure that such a policy is defined in policy.json.</div>
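<div>As a rough illustration of the second drawback, the sketch below assumes a hypothetical "admin_only" key in an extension's attribute map (no such key is claimed to exist in Neutron today) and a dictionary standing in for the parsed policy.json. The helper missing_policy_entries is likewise hypothetical; it only shows the kind of consistency check that hardcoded access rights would require.</div>
<pre>
```python
# Hypothetical attribute map in the style of an extension definition.
# The "admin_only" key is an assumption made purely for this sketch.
EXTENDED_ATTRIBUTES = {
    "ports": {
        "binding:vif_type": {"is_visible": True, "admin_only": True},
        "binding:host_id": {"is_visible": True, "admin_only": True},
    },
    "networks": {
        "provider:network_type": {"is_visible": True, "admin_only": True},
    },
}

# Stand-in for the parsed contents of policy.json (illustrative values only).
POLICY_RULES = {
    "admin_only": "rule:context_is_admin",
    "get_port:binding:vif_type": "rule:admin_only",
    "get_network:provider:network_type": "rule:admin_only",
}


def missing_policy_entries(attr_map, rules):
    """Return the get_* policy names that admin_only attributes would need."""
    missing = []
    for collection, attrs in sorted(attr_map.items()):
        resource = collection[:-1]  # naive singular form: "ports" to "port"
        for name, spec in sorted(attrs.items()):
            key = "get_%s:%s" % (resource, name)
            if spec.get("admin_only") and key not in rules:
                missing.append(key)
    return missing


missing = missing_policy_entries(EXTENDED_ATTRIBUTES, POLICY_RULES)
```
</pre>
<div>With the illustrative data above, the only attribute flagged is the one whose get_port entry was left out of the rules, which is exactly the kind of mismatch an operator would have to watch for.</div>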
<div><br></div><div>Thanks for reading this email,</div><div>Salvatore</div><div><br></div><div>[1] <a href="https://bugs.launchpad.net/neutron/+bug/1236704">https://bugs.launchpad.net/neutron/+bug/1236704</a></div><div>[2] <a href="https://bugs.launchpad.net/neutron/+bug/1302467">https://bugs.launchpad.net/neutron/+bug/1302467</a></div>
<div>[3] <a href="https://bugs.launchpad.net/neutron/+bug/1302611">https://bugs.launchpad.net/neutron/+bug/1302611</a></div><div>[4] <a href="https://wiki.openstack.org/wiki/Neutron/Make-authz-orthogonal">https://wiki.openstack.org/wiki/Neutron/Make-authz-orthogonal</a><br>
<div>(*) typical behaviour of software 'fixed' by Salvatore</div></div><div>(**) policies such as get_network:provider:network_type.</div></div>