[Openstack-operators] [neutron] [os-vif] VF overcommitting and performance in SR-IOV

Maciej Kucia maciej at kucia.net
Mon Jan 22 23:47:34 UTC 2018


Thank you for the reply. I am interested in SR-IOV, and PCI whitelisting is
certainly involved (an example of the kind of whitelist I mean is below).
I suspect that OpenStack itself can handle those numbers of devices,
especially in telco applications where not much scheduling takes place.
The feedback I am getting is from sysadmins who work on network
virtualization, but so far it looks like a rumor without any data behind it.
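
For reference, the whitelisting I have in mind is along these lines (only a
sketch; the vendor/product IDs and the physical network name are
placeholders, not my actual configuration):

    # nova.conf on the compute node -- whitelist every VF that matches the
    # given vendor/product ID (placeholder values)
    pci_passthrough_whitelist = { "vendor_id": "8086", "product_id": "154c", "physical_network": "physnet2" }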

The question is whether the performance penalty from the SR-IOV drivers, or
from PCI itself, is negligible. Should a cloud admin configure the maximum
number of VFs for flexibility, or should the VF count be managed manually and
balanced per application? A rough sketch of the manual approach is below.
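
If manual balancing is the better approach, I was thinking of something along
these lines (a rough sketch only, assuming the PF driver exposes
sriov_numvfs/sriov_totalvfs in sysfs; the interface names and VF counts are
made-up placeholders):

    #!/usr/bin/env python3
    # Sketch: enable only as many VFs per PF as the applications need,
    # instead of always enabling the hardware maximum.
    from pathlib import Path

    # Hypothetical per-application plan: PF name -> number of VFs to enable.
    VF_PLAN = {
        "ens1f0": 8,    # e.g. a VNF that only needs a few VFs
        "ens1f1": 32,   # e.g. a PF shared by many guests
    }

    def set_numvfs(pf: str, count: int) -> None:
        dev = Path("/sys/class/net") / pf / "device"
        total = int((dev / "sriov_totalvfs").read_text())
        if count > total:
            raise ValueError(f"{pf}: requested {count} VFs, HW supports {total}")
        numvfs = dev / "sriov_numvfs"
        # Most drivers require writing 0 before changing to a new VF count.
        numvfs.write_text("0")
        numvfs.write_text(str(count))

    if __name__ == "__main__":
        for pf, count in VF_PLAN.items():
            set_numvfs(pf, count)
            print(f"{pf}: enabled {count} VFs")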

Regards,
Maciej


>
> 2018-01-22 18:38 GMT+01:00 Jay Pipes <jaypipes at gmail.com>:
>
>> On 01/22/2018 11:36 AM, Maciej Kucia wrote:
>>
>>> Hi!
>>>
>>> Is there any noticeable performance penalty when using multiple virtual
>>> functions?
>>>
>>> For simplicity I am enabling all available virtual functions in my NICs.
>>>
>>
>> I presume by the above you are referring to setting your
>> pci_passthrough_whitelist on your compute nodes to whitelist all VFs on a
>> particular PF's PCI address domain/bus?
>>
>>> Sometimes the application uses only a few of them. I am using Intel and
>>> Mellanox NICs.
>>>
>>> I do not see any performance drop but I am getting feedback that this
>>> might not be the best approach.
>>>
>>
>> Who is giving you this feedback?
>>
>> The only issue with enabling (potentially 254 or more) VFs for each PF is
>> that each VF will end up as a record in the pci_devices table in the Nova
>> cell database. Multiply 254 or more times the number of PFs times the
>> number of compute nodes in your deployment and you can get a large number
>> of records that need to be stored. That said, the pci_devices table is well
>> indexed and even if you had 1M or more records in the table, the access of
>> a few hundred of those records when the resource tracker does a
>> PciDeviceList.get_by_compute_node() [1] will still be quite fast.
>>
>> Best,
>> -jay
>>
>> [1] https://github.com/openstack/nova/blob/stable/pike/nova/compute/resource_tracker.py#L572
>> and then
>> https://github.com/openstack/nova/blob/stable/pike/nova/pci/manager.py#L71
>>
>>> Any recommendations?
>>>
>>> Thanks,
>>> Maciej
>>>