<div dir="ltr">Appreciate the feedback. It seems the conclusion is that, in general, one can safely enable a large number of VFs, with the exception of some hardware configurations that might require reducing the number of VFs due to BIOS limitations.<div><br></div><div>Thanks & Regards,<br>Maciej</div></div><div class="gmail_extra"><br><div class="gmail_quote">2018-01-23 3:39 GMT+01:00 Blair Bethwaite <span dir="ltr"><<a href="mailto:blair.bethwaite@gmail.com" target="_blank">blair.bethwaite@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">This is starting to veer into magic territory for my level of<br>
understanding so beware... but I believe there are (or could be<br>
depending on your exact hardware) PCI config space considerations.<br>
IIUC each SRIOV VF will have its own PCI BAR. Depending on the window<br>
size required (which may be determined by other hardware features such<br>
as flow-steering), you can potentially hit compatibility issues with<br>
your server BIOS not supporting mapping of addresses which surpass<br>
4GB. This can then result in the device hanging on initialisation (at<br>
server boot) and effectively bricking the box until the device is<br>
removed.<br>
<br>
We have seen this first hand on a Dell R730 with Mellanox ConnectX-4<br>
card (there are several other Dell 13G platforms with the same BIOS<br>
chipsets). We were explicitly increasing the PCI BAR size for the<br>
device (not upping the number of VFs) in relation to a memory<br>
exhaustion issue when running MPI collective communications on hosts<br>
with 28+ cores. We only had 16 (or maybe 32, I forget) VFs configured<br>
in the firmware.<br>
<br>
At the end of that support case (which resulted in a replacement NIC),<br>
the support engineer's summary included:<br>
"""<br>
-When a BIOS limits the BAR to be contained in the 4GB address space -<br>
it is a BIOS limitation.<br>
Unfortunately, there is no way to tell - Some BIOS implementations use<br>
proprietary heuristics to decide when to map a specific BAR below 4GB.<br>
<br>
-When SR-IOV is enabled, and num-vfs is high, the corresponding VF BAR<br>
can be huge.<br>
In this case, the BIOS may exhaust the ~2GB address space that it has<br>
available below 4GB.<br>
In this case, the BIOS may hang – and the server won’t boot.<br>
"""<br>
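<br>
To put rough numbers on the summary above, here is a minimal Python sketch (not taken from the support case) that multiplies a per-VF BAR size by the number of VFs and compares the result with the ~2GB window mentioned in the quote. The function names and the example BAR sizes are assumptions for illustration only; real per-VF BAR sizes depend on the NIC and its firmware settings, and other devices share the same window, so the real headroom is smaller.<br>
<pre>
```python
# Hypothetical helper names and example sizes. Only the "~2GB below 4GB"
# figure comes from the quoted support summary.
MIB = 1024 * 1024
GIB = 1024 * MIB

LOW_MMIO_WINDOW = 2 * GIB  # assumed size of the window below 4GB


def total_vf_bar_bytes(num_vfs, per_vf_bar_bytes):
    """Address space needed if every VF gets one BAR of the given size."""
    return num_vfs * per_vf_bar_bytes


def fits_below_4g(num_vfs, per_vf_bar_bytes, window=LOW_MMIO_WINDOW):
    """True if the combined VF BARs alone fit in the assumed window."""
    return window >= total_vf_bar_bytes(num_vfs, per_vf_bar_bytes)


if __name__ == "__main__":
    # (number of VFs, per-VF BAR size in MiB): illustrative values only
    for num_vfs, bar_mib in [(16, 8), (127, 8), (127, 32)]:
        total = total_vf_bar_bytes(num_vfs, bar_mib * MIB)
        verdict = "fits" if fits_below_4g(num_vfs, bar_mib * MIB) else "does not fit"
        print(num_vfs, "VFs x", bar_mib, "MiB =", total // MIB, "MiB:", verdict)
```
</pre>
With these assumed sizes, 16 VFs at 8 MiB each need 128 MiB, while 127 VFs at 32 MiB each need 4064 MiB, which cannot fit in a 2 GiB window. Either a high VF count or a larger BAR size can push the total over the limit.<br>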
<br>
At the very least you should ask your hardware vendors some very<br>
specific questions before doing anything that might change your PCI<br>
BAR sizes.<br>
<br>
Cheers,<br>
<div class="HOEnZb"><div class="h5"><br>
On 23 January 2018 at 11:44, Pedro Sousa <<a href="mailto:pgsousa@gmail.com">pgsousa@gmail.com</a>> wrote:<br>
> Hi,<br>
><br>
> I have SR-IOV in production at some customers with the maximum number of VFs and<br>
> didn't notice any performance issues.<br>
><br>
> My understanding is that of course you will have a performance penalty if you<br>
> consume all those VFs, because you're dividing the bandwidth across them,<br>
> but other than that, if they are just there doing nothing, you won't notice anything.<br>
><br>
> But I'm just talking from my experience :)<br>
><br>
> Regards,<br>
> Pedro Sousa<br>
><br>
> On Mon, Jan 22, 2018 at 11:47 PM, Maciej Kucia <<a href="mailto:maciej@kucia.net">maciej@kucia.net</a>> wrote:<br>
>><br>
>> Thank you for the reply. I am interested in SR-IOV and pci whitelisting is<br>
>> certainly involved.<br>
>> I suspect that OpenStack itself can handle those numbers of devices,<br>
>> especially in telco applications where not much scheduling is being done.<br>
>> The feedback I am getting is from sysadmins who work on network<br>
>> virtualization but I think this is just a rumor without any proof.<br>
>><br>
>> The question is whether the performance penalty from the SR-IOV drivers or PCI itself<br>
>> is negligible. Should a cloud admin configure the maximum number of VFs for<br>
>> flexibility, or should it be manually managed and balanced depending on<br>
>> the application?<br>
>><br>
>> Regards,<br>
>> Maciej<br>
>><br>
>>><br>
>>><br>
>>> 2018-01-22 18:38 GMT+01:00 Jay Pipes <<a href="mailto:jaypipes@gmail.com">jaypipes@gmail.com</a>>:<br>
>>>><br>
>>>> On 01/22/2018 11:36 AM, Maciej Kucia wrote:<br>
>>>>><br>
>>>>> Hi!<br>
>>>>><br>
>>>>> Is there any noticeable performance penalty when using multiple virtual<br>
>>>>> functions?<br>
>>>>><br>
>>>>> For simplicity I am enabling all available virtual functions in my<br>
>>>>> NICs.<br>
>>>><br>
>>>><br>
>>>> I presume by the above you are referring to setting your<br>
>>>> pci_passthrough_whitelist on your compute nodes to whitelist all VFs on a<br>
>>>> particular PF's PCI address domain/bus?<br>
>>>><br>
>>>>> Sometimes the application uses only a few of them. I am using Intel and<br>
>>>>> Mellanox.<br>
>>>>><br>
>>>>> I do not see any performance drop but I am getting feedback that this<br>
>>>>> might not be the best approach.<br>
>>>><br>
>>>><br>
>>>> Who is giving you this feedback?<br>
>>>><br>
>>>> The only issue with enabling (potentially 254 or more) VFs for each PF<br>
>>>> is that each VF will end up as a record in the pci_devices table in the Nova<br>
>>>> cell database. Multiply 254 or more times the number of PFs times the number<br>
>>>> of compute nodes in your deployment and you can get a large number of<br>
>>>> records that need to be stored. That said, the pci_devices table is well<br>
>>>> indexed and even if you had 1M or more records in the table, the access of a<br>
>>>> few hundred of those records when the resource tracker does a<br>
>>>> PciDeviceList.get_by_compute_<wbr>node() [1] will still be quite fast.<br>
>>>><br>
>>>> Best,<br>
>>>> -jay<br>
>>>><br>
>>>> [1]<br>
>>>> <a href="https://github.com/openstack/nova/blob/stable/pike/nova/compute/resource_tracker.py#L572" rel="noreferrer" target="_blank">https://github.com/openstack/<wbr>nova/blob/stable/pike/nova/<wbr>compute/resource_tracker.py#<wbr>L572</a><br>
>>>> and then<br>
>>>><br>
>>>> <a href="https://github.com/openstack/nova/blob/stable/pike/nova/pci/manager.py#L71" rel="noreferrer" target="_blank">https://github.com/openstack/<wbr>nova/blob/stable/pike/nova/<wbr>pci/manager.py#L71</a><br>
>>>><br>
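<br>
The row-count arithmetic in the reply above can be sketched in a few lines of Python. This is only an illustration of the multiplication described there; the function names and the example deployment sizes (2 PFs per node, 100 compute nodes) are assumptions, and only the 254 VFs per PF and the 1M-record figure come from the message.<br>
<pre>
```python
def pci_device_rows(vfs_per_pf, pfs_per_node, compute_nodes):
    """Estimated VF records: VFs per PF x PFs per node x compute nodes."""
    return vfs_per_pf * pfs_per_node * compute_nodes


def nodes_to_reach(target_rows, vfs_per_pf, pfs_per_node):
    """Smallest node count whose VF records reach target_rows (ceiling division)."""
    rows_per_node = vfs_per_pf * pfs_per_node
    return -(-target_rows // rows_per_node)


if __name__ == "__main__":
    # Hypothetical deployment: 254 VFs per PF, 2 PFs per node, 100 nodes.
    print(pci_device_rows(254, 2, 100))
    # Number of such nodes needed before the table holds 1M VF records.
    print(nodes_to_reach(1_000_000, 254, 2))
```
</pre>
Under these assumptions a 100-node deployment yields 50800 VF records, and it would take 1969 such nodes to reach the 1M records mentioned above.<br>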
>>>>> Any recommendations?<br>
>>>>><br>
>>>>> Thanks,<br>
>>>>> Maciej<br>
>>>>><br>
>>>>><br>
>>>>> ______________________________<wbr>_________________<br>
>>>>> OpenStack-operators mailing list<br>
>>>>> <a href="mailto:OpenStack-operators@lists.openstack.org">OpenStack-operators@lists.<wbr>openstack.org</a><br>
>>>>> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators" rel="noreferrer" target="_blank">http://lists.openstack.org/<wbr>cgi-bin/mailman/listinfo/<wbr>openstack-operators</a><br>
>>>>><br>
>>>><br>
>>><br>
>>><br>
>><br>
>><br>
>><br>
><br>
><br>
><br>
<br>
<br>
<br>
</div></div><span class="HOEnZb"><font color="#888888">--<br>
Cheers,<br>
~Blairo<br>
</font></span></blockquote></div><br></div>