[openstack-dev] The PCI support blueprint

Ian Wells ijw.ubuntu at cack.org.uk
Mon Jul 22 16:49:17 UTC 2013

Per the last summit, there are many interested parties waiting on PCI
support.  Boris (who unfortunately wasn't there) jumped in with an
implementation before the rest of us could get a blueprint up, but I
suspect he's been stretched rather thinly and progress has been much
slower than I was hoping it would be.  There are many willing hands
happy to take this work on; perhaps it's time we did, so that we can
get something in before Havana.

I'm sure we could use a better scheduler.  I don't think that actually
affects most of the implementation of passthrough and I don't think we
should tie the two together.  "The perfect is the enemy of the good."

And as far as the quantity of data passed back - we've discussed
before that it would be nice (for visibility purposes) to be able to
see an itemised list of all of the allocated and unallocated PCI
resources in the database.  There could be quite a lot per host (256
per card x say 10 cards depending on your hardware).  But passing that
itemised list back is somewhat of a luxury - in practice, what you
need for scheduling is merely a list of categories of card (those
pools where any one of the PCI cards in the pool would do) and counts.
The compute node should be choosing a card from the pool in any case.
The scheduler need only find a machine with cards available.
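As a rough illustration of the pool-and-count idea (the field names here are hypothetical, not Nova's actual schema), the per-host summary the scheduler needs could be as small as this:

```python
from collections import Counter


def summarize_pci_pools(devices):
    """Collapse an itemised PCI device list into (pool, count) pairs.

    Each device is a dict carrying vendor_id/product_id; two devices with
    the same pair are interchangeable for scheduling purposes, so the
    scheduler only needs the count per pool, not the full list.
    """
    return dict(Counter((d["vendor_id"], d["product_id"]) for d in devices))


def host_can_satisfy(pools, request):
    """Check whether a host's pools cover a request.

    request maps a pool key (vendor_id, product_id) to the number of
    devices needed.
    """
    return all(pools.get(key, 0) >= n for key, n in request.items())
```

With 256 VFs per card and 10 cards, the itemised list is 2560 entries, but the pooled summary is at most a handful of (pool, count) pairs per host.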

I'm not totally convinced that passing back the itemised list is
necessarily a problem, but in any case we can make the decision one
way or the other, take on the risk if we like, and get the code
written - if it turns out not to be scalable then we can fix *that* in
the next release, but at least we'll have something to play with in
the meantime.  Delaying the whole thing to I is just silly.

On 22 July 2013 17:34, Jiang, Yunhong <yunhong.jiang at intel.com> wrote:
> As for the scalability issue, boris, are you talking about the VF number issue, i.e. A physical PCI devices can at most have 256 virtual functions?
> I think we have discussed this before. We should have the compute node manage VFs at the PF level, so that VFs belonging to the same PF have only one entry in the DB, with a field indicating the number of free VFs. Thus there will be no scalability issue, because the number of PCI slots is limited.
> We didn't implement this mechanism in the current patch set because we agreed to make it an enhancement. If it's really a concern, please raise it and we will enhance our resource tracker for this. That's not a complex task.
> Thanks
> --jyh
>> -----Original Message-----
>> From: Russell Bryant [mailto:rbryant at redhat.com]
>> Sent: Monday, July 22, 2013 8:22 AM
>> To: Jiang, Yunhong
>> Cc: boris at pavlovic.me; openstack-dev at lists.openstack.org
>> Subject: Re: The PCI support blueprint
>> On 07/22/2013 11:17 AM, Jiang, Yunhong wrote:
>> > Hi, Boris
>> >     I'm surprised that you want to postpone the PCI support
>> (https://blueprints.launchpad.net/nova/+spec/pci-passthrough-base) to I
>> release. You and our team have been working on this for a long time, and
>> the patches have been reviewed over several rounds, and we have been waiting
>> for your DB layer patch for two weeks without any update.
>> >
>> >     Can you give more reason why it's pushed to I release? If you are out
>> of bandwidth, we are sure to take it and push it to H release!
>> >
>> >     Is it because you want to base your DB layer on your 'A simple way to
>> improve nova scheduler'? That really does not make sense to me. Firstly,
>> that proposal is still under early discussion and has already drawn several
>> differing opinions; secondly, PCI support is far more than the DB layer: it includes
>> the resource tracker, scheduler filters, libvirt support enhancements, etc. Even if
>> we change the scheduler that way after the I release, we need only
>> change the DB layer, and I don't think that's a big effort!
>> Boris mentioned scalability concerns with the current approach on IRC.
>> I'd like more detail.
>> In general, if we can see a reasonable path to upgrade what we have now
>> to make it better in the future, then we don't need to block it because
>> of that.  If the current approach will result in a large upgrade impact
>> to users to be able to make it better, that would be a reason to hold
>> off.  It also depends on how serious the scalability concerns are.
>> --
>> Russell Bryant
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
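A minimal sketch of the per-PF counting Yunhong describes above, keeping one record per physical function rather than one row per virtual function (the names and structure are illustrative only, not Nova's actual DB schema or resource tracker API):

```python
class PFRecord:
    """One record per physical function, tracking a free-VF count.

    This replaces up to 256 per-VF rows with a single row per PF,
    which is what keeps the DB size bounded by the number of PCI
    slots rather than the number of VFs.
    """

    def __init__(self, pf_address, total_vfs):
        self.pf_address = pf_address   # e.g. "0000:03:00.0"
        self.total_vfs = total_vfs
        self.free_vfs = total_vfs

    def allocate(self, n=1):
        """Claim n VFs; the compute node picks the concrete VFs later."""
        if self.free_vfs < n:
            raise ValueError("not enough free VFs on %s" % self.pf_address)
        self.free_vfs -= n

    def release(self, n=1):
        """Return n VFs to the pool, never exceeding the total."""
        self.free_vfs = min(self.total_vfs, self.free_vfs + n)
```

Under this scheme the scheduler only compares requested counts against `free_vfs`, and the itemised-list scalability concern disappears at the DB layer.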