On 5/8/2019 2:31 PM, Eric Fried wrote:
Sundar-
I have a set of compute hosts, each with several NICs of type T. Each NIC has a set of PFs: PF1, PF2, .... Each PF is a resource provider, and each has a separate custom RC: CUSTOM_RC_PF1, CUSTOM_RC_PF2, ... . The VFs are inventories of the associated PF's RC. Provider networks etc. are traits on that PF.
It would be weird for the inventories to be called PF* if they're inventories of VF.

I am focusing mainly on the concepts for now, not on the names.

But mainly: why the custom resource classes?

This is as elaborate an example as I could cook up. IRL, we may need some custom RC, but maybe not one for each PF type.

The way "resourceless RP" + "same_subtree" is designed to work is best explained if I model your use case with standard resource classes instead:

    CN
     |
     +---NIC1 (trait: I_AM_A_NIC)
     |    |
     |    +-----PF1_1 (trait: CUSTOM_PHYSNET1, inventory: VF=4)
     |    |
     |    +-----PF1_2 (trait: CUSTOM_PHYSNET2, inventory: VF=4)
     |
     +---NIC2 (trait: I_AM_A_NIC)
          |
          +-----PF2_1 (trait: CUSTOM_PHYSNET1, inventory: VF=4)
          |
          +-----PF2_2 (trait: CUSTOM_PHYSNET2, inventory: VF=4)
Now if I say:
    ?resources_T1=VF:1
    &required_T1=CUSTOM_PHYSNET1
    &resources_T2=VF:1
    &required_T2=CUSTOM_PHYSNET2
    &required_T3=I_AM_A_NIC
    &same_subtree=','.join([suffix for suffix in suffixes if suffix.startswith('_T')])
        (i.e. '_T1,_T2,_T3')
...then I'll get two candidates:
    - {PF1_1: VF=1, PF1_2: VF=1}  <== i.e. both from NIC1
    - {PF2_1: VF=1, PF2_2: VF=1}  <== i.e. both from NIC2
...and no candidates where one VF is from each NIC.
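To make the filtering concrete, here is a minimal sketch of the example in plain Python. It is not placement code: the tree, the helper names, and the brute-force search are all made up for illustration, and it assumes same_subtree means "one of the chosen providers is an ancestor (or self) of all the others".

```python
from itertools import product

# Toy provider tree from the example: child -> parent (None marks the root).
PARENT = {
    "CN": None,
    "NIC1": "CN", "NIC2": "CN",
    "PF1_1": "NIC1", "PF1_2": "NIC1",
    "PF2_1": "NIC2", "PF2_2": "NIC2",
}
TRAITS = {
    "NIC1": {"I_AM_A_NIC"}, "NIC2": {"I_AM_A_NIC"},
    "PF1_1": {"CUSTOM_PHYSNET1"}, "PF1_2": {"CUSTOM_PHYSNET2"},
    "PF2_1": {"CUSTOM_PHYSNET1"}, "PF2_2": {"CUSTOM_PHYSNET2"},
}
INVENTORY = {pf: {"VF": 4} for pf in ("PF1_1", "PF1_2", "PF2_1", "PF2_2")}

# One entry per request group suffix: (resources, required traits).
GROUPS = {
    "_T1": ({"VF": 1}, {"CUSTOM_PHYSNET1"}),
    "_T2": ({"VF": 1}, {"CUSTOM_PHYSNET2"}),
    "_T3": ({}, {"I_AM_A_NIC"}),  # resourceless group: traits only
}


def ancestors_and_self(rp):
    """Walk from rp up to the root, returning every provider on the way."""
    chain = []
    while rp is not None:
        chain.append(rp)
        rp = PARENT[rp]
    return chain


def satisfies(rp, resources, required):
    """True if rp has enough inventory and carries all required traits."""
    inv = INVENTORY.get(rp, {})
    enough = all(inv.get(rc, 0) >= amount for rc, amount in resources.items())
    return enough and required <= TRAITS.get(rp, set())


def in_same_subtree(rps):
    """Assumed semantics: some chosen provider is an ancestor-or-self of all."""
    return any(all(root in ancestors_and_self(rp) for rp in rps) for root in rps)


def candidates(groups, subtree_suffixes):
    """Try every provider combination and keep those passing same_subtree."""
    suffixes = sorted(groups)
    options = [[rp for rp in PARENT if satisfies(rp, *groups[s])] for s in suffixes]
    found = []
    for combo in product(*options):
        chosen = dict(zip(suffixes, combo))
        picked = [chosen[s] for s in subtree_suffixes]
        if not picked or in_same_subtree(picked):
            # Report only providers that actually contribute resources.
            found.append({rp: groups[s][0] for s, rp in chosen.items() if groups[s][0]})
    return found


# Same expression as in the request above.
subtree_value = ",".join(s for s in GROUPS if s.startswith("_T"))
result = candidates(GROUPS, subtree_value.split(","))
for candidate in result:
    print(candidate)
```

The NIC provider chosen for _T3 contributes no resources, so it never shows up in a candidate; its only job in this sketch is to act as the common subtree root that ties the two PFs together.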
IIUC this is how you wanted it.
Yes.

The examples in the storyboard [1] for NUMA affinity use group numbers. If that were recast to use named groups, and we wanted NUMA affinity apart from device colocation, would that not require a different name than T? In short, if you want to express 2 different affinities/groupings, perhaps we need to use a name with 2 parts, and use 2 different same_subtree clauses. Just pointing out the implications.

BTW, I noticed there is a standard RC for NIC VFs [2].

[1] https://storyboard.openstack.org/#!/story/2005575
[2] https://github.com/openstack/os-resource-classes/blob/master/os_resource_cla...
==============
With the custom resource classes, I'm having a hard time understanding the model. How unique are the _PF$N bits? Do they repeat (a) from one NIC to the next? (b) From one host to the next? (c) Never?
The only thing that begins to make sense is (a), because (b) and (c) would lead to skittles. So assuming (a), the model would look something like:

Yes, (a) is what I had in mind.

    CN
     |
     +---NIC1 (trait: I_AM_A_NIC)
     |    |
     |    +-----PF1_1 (trait: CUSTOM_PHYSNET1, inventory: CUSTOM_PF1_VF=4)
     |    |
     |    +-----PF1_2 (trait: CUSTOM_PHYSNET2, inventory: CUSTOM_PF2_VF=4)
     |
     +---NIC2 (trait: I_AM_A_NIC)
          |
          +-----PF2_1 (trait: CUSTOM_PHYSNET1, inventory: CUSTOM_PF1_VF=4)
          |
          +-----PF2_2 (trait: CUSTOM_PHYSNET2, inventory: CUSTOM_PF2_VF=4)
Now you could get the same result with (essentially) the same request as above:
    ?resources_T1=CUSTOM_PF1_VF:1
    &required_T1=CUSTOM_PHYSNET1
    &resources_T2=CUSTOM_PF2_VF:1
    &required_T2=CUSTOM_PHYSNET2
    &required_T3=I_AM_A_NIC
    &same_subtree=','.join([suffix for suffix in suffixes if suffix.startswith('_T')])
        (i.e. '_T1,_T2,_T3')
==>
    - {PF1_1: CUSTOM_PF1_VF=1, PF1_2: CUSTOM_PF2_VF=1}
    - {PF2_1: CUSTOM_PF1_VF=1, PF2_2: CUSTOM_PF2_VF=1}
...except that in this model, PF$N corresponds to PHYSNET$N, so you wouldn't actually need the required_T$N=CUSTOM_PHYSNET$N to get the same result:
    ?resources_T1=CUSTOM_PF1_VF:1
    &resources_T2=CUSTOM_PF2_VF:1
    &required_T3=I_AM_A_NIC
    &same_subtree=','.join([suffix for suffix in suffixes if suffix.startswith('_T')])
        (i.e. '_T1,_T2,_T3')
...because you're effectively encoding the physnet into the RC. Which is not good IMO.
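A small sketch of why the trait becomes redundant in the custom-RC model. This is plain Python over a hypothetical flat table of the four PFs, not placement code, and it deliberately ignores same_subtree: it only compares which PFs match a single group with and without the physnet trait.

```python
# Hypothetical flat view of the four PFs: name -> (physnet trait, inventory RC).
CUSTOM_MODEL = {
    "PF1_1": ("CUSTOM_PHYSNET1", "CUSTOM_PF1_VF"),
    "PF1_2": ("CUSTOM_PHYSNET2", "CUSTOM_PF2_VF"),
    "PF2_1": ("CUSTOM_PHYSNET1", "CUSTOM_PF1_VF"),
    "PF2_2": ("CUSTOM_PHYSNET2", "CUSTOM_PF2_VF"),
}
# Same PFs, but every inventory uses the standard-style class name VF.
STANDARD_MODEL = {pf: (physnet, "VF") for pf, (physnet, _) in CUSTOM_MODEL.items()}


def matching(pfs, rc, trait=None):
    """PFs that hold inventory of rc and, if a trait is given, carry it."""
    return sorted(
        name
        for name, (physnet, inv_rc) in pfs.items()
        if inv_rc == rc and trait in (None, physnet)
    )


# Custom RCs: the trait filter changes nothing, the RC already implies it.
custom_with = matching(CUSTOM_MODEL, "CUSTOM_PF1_VF", "CUSTOM_PHYSNET1")
custom_without = matching(CUSTOM_MODEL, "CUSTOM_PF1_VF")

# Standard RC: the trait is what narrows the match to one physnet.
standard_with = matching(STANDARD_MODEL, "VF", "CUSTOM_PHYSNET1")
standard_without = matching(STANDARD_MODEL, "VF")

print(custom_with, custom_without)
print(standard_with, standard_without)
```

With the custom classes both lookups return the same two PFs, which is the sense in which the physnet is encoded into the RC; with the single VF class, dropping the trait matches all four PFs.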
But either way...
Do I have to create a 'resourceless RP' for the NIC that contains the individual PF RPs as child nodes?
...if you want to be able to request this kind of affinity, then yes, you do (unless there's some consumable resource on the NIC, in which case it's not resourceless, but the spirit is the same). This is exactly what these features are being designed for.
Great. Thank you very much for the detailed reply.

Regards,
Sundar

Thanks,
efried