[openstack-dev] [nova][PCI] one use case make the flavor/extra-info based solution to be right choice

Robert Li (baoli) baoli at cisco.com
Thu Mar 20 13:50:47 UTC 2014


Hi Yongli,

I'm very glad that you brought this up and revived our discussion on PCI
passthrough and its application to networking. The use case you brought up
is:

           user wants a FASTER NIC from INTEL to join a virtual network.

By FASTER, I guess that you mean that the user is allowed to select a
particular vNIC card. Therefore, the above statement can be translated
into the following requests for a PCI device:
        . Intel vNIC
        . 1G or 10G or ?
        . network to join

First of all, I'm not sure that a user in a cloud environment would care
about the vendor or card type. 1G or 10G doesn't have anything to do with
the bandwidth a user would get. But I guess a cloud provider may have an
incentive to do so for other reasons, and may want to provide its users
with such a choice. In any case, let's assume it's a valid use case.

With the initial PCI group proposal, we have one tag, and you can tag each
Intel device with its group name, for example "Intel_1G_phy1" or
"Intel_10G_phy1". When requesting a particular device, the user can say
pci_group="Intel_1G_phy1" or pci_group="Intel_10G_phy1", or, if the user
doesn't care whether it is 1G or 10G, pci_group="Intel_1G_phy1" OR
"Intel_10G_phy1".
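
The single-tag request above can be sketched in a few lines of Python. This
is only an illustration of the idea, not nova code: the PciDevice class, the
match_groups helper, the PCI addresses and the "Cisco_10G_phy1" group are
all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PciDevice:
    address: str    # PCI address on the compute node (illustrative values)
    pci_group: str  # the single tag assigned by the admin


def match_groups(devices, wanted_groups):
    """Return the devices whose group tag is one of the requested groups."""
    wanted = set(wanted_groups)
    return [d for d in devices if d.pci_group in wanted]


devices = [
    PciDevice("0000:06:00.1", "Intel_1G_phy1"),
    PciDevice("0000:06:00.2", "Intel_10G_phy1"),
    PciDevice("0000:07:00.1", "Cisco_10G_phy1"),  # hypothetical third group
]

# "1G or 10G, I don't care": an OR over two group names.
either = match_groups(devices, ["Intel_1G_phy1", "Intel_10G_phy1"])
```

Here the OR is simply a membership test on a set of group names, so the
request stays a lookup on one tag.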

I would also think that it's possible to have two tags on a networking
device with the above use case in mind: a group tag, and a network tag.
For example, a device can be tagged with pci_group="Intel_1G",
network="phy1". When requesting a networking device, the network tag can
be derived from the nic that's being requested.
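
A rough sketch of the two-tag variant, again with made-up names: the device
dictionaries, the select_devices helper and the "physical_network" key on
the requested NIC are assumptions for illustration, not an existing nova or
neutron interface.

```python
def select_devices(devices, pci_group, requested_nic):
    """Pick devices matching the group tag and the network of the NIC."""
    # The user supplies only the group tag; the network tag is derived
    # from the NIC being requested (hypothetical "physical_network" key).
    network = requested_nic["physical_network"]
    return [
        d for d in devices
        if d["pci_group"] == pci_group and d["network"] == network
    ]


devices = [
    {"address": "0000:06:00.1", "pci_group": "Intel_1G", "network": "phy1"},
    {"address": "0000:06:00.2", "pci_group": "Intel_1G", "network": "phy2"},
    {"address": "0000:07:00.1", "pci_group": "Intel_10G", "network": "phy1"},
]

nic = {"physical_network": "phy1"}
chosen = select_devices(devices, "Intel_1G", nic)
```

With two tags the group name no longer has to encode the physical network,
which is what lets the user ask for "Intel_1G" without knowing about "phy1".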

As you can see, an admin defines the devices once on the compute nodes,
and doesn't need to do anything on top of that. It's simple and easy to
use.

My initial comments on the flavor/extra-info based solution were about PCI
stats management and scheduling. Your latest patch seems to have answered
some of my original questions. However, your implementation seems to
deviate from (or, I should say, to have clarified) the original proposal
https://docs.google.com/document/d/1vadqmurlnlvZ5bv3BlUbFeXRS_wh-dsgi5plSjimWjU/edit,
which doesn't provide a detailed explanation of those.

Here, let me quote the comment I provided on this patch
https://review.openstack.org/#/c/63267/:

'''
I'd like to draw an analogy with a database table. Assume a device table
with columns for device properties (such as product_id, etc.), designated
as P, and extra attributes, designated as E. So it would look something
like T-columns = (P1, P2, ..., E1, E2, ...).
A pci_flavor_attrs is a subset of T-columns. With that, the entire device
table will be REDUCED to a smaller stats pool table. For example, if
pci_flavor_attrs is (P1, P2, E1, E2), then the stats pool table will look
like: S-columns = (P1, P2, E1, E2, COUNT). In the worst case, S-columns =
T-columns, although a well educated admin wouldn't do that.
Therefore, requesting a PCI device is like doing a DB search on the stats
pool table, and the search criteria are based on a combination of the
S-columns (for example, by way of a nova flavor).
The admin can decide to define any extra attributes, and devices may be
tagged with different extra attributes. It's possible that many extra
attributes are defined, but some devices may be tagged with only one.
However, all the extra attributes have to have corresponding columns in
the stats pool table.
I can see there are many ways to use such an interface. It also means it
could easily lead to misuse. An admin may define a lot of attributes, and
later find that they are not enough based on how he used them; adding new
attributes or deleting attributes may not be a fun thing at all (due to
the fixed pci_flavor_attrs configuration), let alone doing that in a
working cloud.
'''
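
To make the table analogy concrete, here is a minimal Python sketch of the
reduction from the device table to a stats pool table, and of a request as
a search over the pools. It follows the analogy only; the function names,
the column values and the "count" key are assumptions for illustration and
are not taken from the patch.

```python
from collections import Counter


def build_stats_pools(devices, pci_flavor_attrs):
    """Reduce the device table to pools keyed on pci_flavor_attrs."""
    # Devices that agree on every chosen column collapse into one row.
    # A device that lacks a column gets None there.
    counts = Counter(
        tuple(dev.get(attr) for attr in pci_flavor_attrs) for dev in devices
    )
    return [
        dict(zip(pci_flavor_attrs, key), count=n) for key, n in counts.items()
    ]


def request(pools, spec):
    """Search the pools: every requested column must match exactly."""
    return [
        pool for pool in pools
        if pool["count"] > 0
        and all(pool.get(col) == val for col, val in spec.items())
    ]


# Device table with T-columns (P1, P2, E1, E2); the values are made up.
devices = [
    {"P1": "intel", "P2": "1G", "E1": "phy1", "E2": "rack1"},
    {"P1": "intel", "P2": "1G", "E1": "phy1", "E2": "rack2"},
    {"P1": "intel", "P2": "10G", "E1": "phy1", "E2": "rack1"},
]

# S-columns = (P1, P2, E1, COUNT): E2 is dropped, so two rows merge.
pools = build_stats_pools(devices, ("P1", "P2", "E1"))
matches = request(pools, {"P1": "intel", "P2": "1G"})
```

Note that a request can only name columns that were kept in
pci_flavor_attrs: once E2 is dropped from the pools, it can no longer be
searched on without rebuilding them.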

Imagine a cloud that supports PCI passthrough on various classes of PCI
cards (by class, I mean the Linux PCI device class). Examples are video,
crypto, networking, storage, etc. The pci_flavor_attrs needs to be defined
on EVERY node, and has to accommodate attributes from ALL of these classes
of cards. However, an attribute for one class of cards may not be
applicable to other classes of cards. Yet the stats groups are keyed on
pci_flavor_attrs, and PCI flavors can be defined with any attributes from
pci_flavor_attrs. Thus, it really lacks a level of abstraction that
clearly defines the usage and semantics. It's up to a well educated admin
to use it properly, and it's not easy to manage. Therefore, I believe it
requires further work.
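
A small sketch of the mixed-class concern, under the assumption that a
device simply has no value for attributes that do not apply to its class.
The attribute names (device_class, physical_network, crypto_algorithm) and
the pool_key helper are invented for the example.

```python
def pool_key(device, pci_flavor_attrs):
    """Key a device on the one global attribute list shared by all nodes."""
    return tuple(device.get(attr) for attr in pci_flavor_attrs)


# One list has to cover every class of card in the cloud.
pci_flavor_attrs = ("device_class", "physical_network", "crypto_algorithm")

nic = {"device_class": "network", "physical_network": "phy1"}
crypto = {"device_class": "crypto", "crypto_algorithm": "aes"}

nic_key = pool_key(nic, pci_flavor_attrs)
crypto_key = pool_key(crypto, pci_flavor_attrs)
```

Each key carries an empty slot for the attribute that belongs to the other
class, and nothing in the structure itself says which attributes are
meaningful for which class of card.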

I think that practical use cases would really help us find the right
solution, and provide the optimal interface to the admin/user. So let's
keep the discussion going.

thanks,
Robert

On 3/20/14 4:22 AM, "yongli he" <yongli.he at intel.com> wrote:

>Hi, all
>
>Because of Juno, the PCI discussion remains open, on the group-based vs.
>the flavor/extra-information based solution. There is a use case which
>the group-based solution cannot support well.
>
>Please consider this, and choose the flavor/extra-information based
>solution.
>
>
>Problems with groups:
>
>I: They expose many details of the underlying grouping to the user, who is
>then burdened with dealing with those things. And in an OS system, the
>group names might be messy (refer to II).
>--------------------------
>II: The group-based solution cannot support well such a simple use case:
>
>user wants a faster NIC from Intel to join a virtual network.
>
>Suppose the physical network name the tenant uses is "phy1". Then the
>'group' style solution won't meet such a simple use case. Reasons:
>
>1) The group name must be 'phy1'; otherwise, neutron can't fill the PCI
>request, since neutron has only the physical network name for this.
>(Suppose phy1 doesn't bother the user; if it does, the user will see
>groups like "intel_phy1", "ciscio_v1_phy1", ...)
>
>2) Because there is only one property in the PCI stats pool, the user then
>loses the chance to choose the version or model of the PCI device, so the
>user cannot request a simple thing like the "intel-NIC" or "1G_NIC".
>
>
>Regards
>Yongli He



