[openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal

Nadathur, Sundar sundar.nadathur at intel.com
Wed Aug 1 19:39:57 UTC 2018


Hi Eric,
     Please see my responses inline. On an unrelated note, thanks for 
the pointer to the GPU spec 
(https://review.openstack.org/#/c/579359/10/doc/source/specs/rocky/device-passthrough.rst). 
I will review that.

On 7/31/2018 10:42 AM, Eric Fried wrote:
> Sundar-
>
>>    * Cyborg drivers deal with device-specific aspects, including
>>      discovery/enumeration of devices and handling the Device Half of the
>>      attach (preparing devices/accelerators for attach to an instance,
>>      post-attach cleanup (if any) after successful attach, releasing
>>      device/accelerator resources on instance termination or failed
>>      attach, etc.)
>>    * os-acc plugins deal with hypervisor/system/architecture-specific
>>      aspects, including handling the Instance Half of the attach (e.g.
>>      for libvirt with PCI, preparing the XML snippet to be included in
>>      the domain XML).
> This sounds well and good, but discovery/enumeration will also be
> hypervisor/system/architecture-specific. So...
Fair enough. We had discussed that too. The Cyborg drivers can also 
invoke REST APIs, etc., for Power systems.
>> Thus, the drivers and plugins are expected to be complementary. For
>> example, for 2 devices of types T1 and T2, there shall be 2 separate
>> Cyborg drivers. Further, we would have separate plugins for, say,
>> x86+KVM systems and Power systems. We could then have four different
>> deployments -- T1 on x86+KVM, T2 on x86+KVM, T1 on Power, T2 on Power --
>> by suitable combinations of the drivers and plugins.
> ...the discovery/enumeration code for T1 on x86+KVM (lsdev? lspci?
> walking the /dev file system?) will be totally different from the
> discovery/enumeration code for T1 on Power
> (pypowervm.wrappers.ManagedSystem.get(adapter)).
>
> I don't mind saying "drivers do the device side; plugins do the instance
> side" but I don't see getting around the fact that both "sides" will
> need to have platform-specific code
Agreed. So, we could say:
- The plugins handle the instance half. They are hypervisor-specific and 
platform-specific. (The term 'platform' subsumes both the architecture 
(Power, x86) and the server/system type.) They are invoked by os-acc.
- The drivers handle the device half: device discovery/enumeration and 
anything not explicitly assigned to plugins. They contain 
device-specific and platform-specific code. They are invoked by the 
Cyborg agent and by os-acc. A rough sketch of the two interfaces is below.
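
To make the split concrete, here is a sketch of what the two interfaces 
could look like. The class and method names (CyborgDriver, OsAccPlugin, 
prepare_device(), plug(), etc.) are purely illustrative, not settled API:

    import abc

    class CyborgDriver(abc.ABC):
        """Device half: device-specific and platform-specific code."""

        @abc.abstractmethod
        def discover(self):
            """Enumerate the devices/accelerators on this host."""

        @abc.abstractmethod
        def prepare_device(self, accelerator, instance_info):
            """Ready a device for attach; return a device-side handle."""

        @abc.abstractmethod
        def release_device(self, accelerator):
            """Clean up on instance termination or failed attach."""

    class OsAccPlugin(abc.ABC):
        """Instance half: hypervisor-specific and platform-specific code."""

        @abc.abstractmethod
        def plug(self, van, instance_info):
            """Attach one VAN to the instance (e.g. emit the libvirt XML)."""

        @abc.abstractmethod
        def unplug(self, van, instance_info):
            """Detach a VAN from the instance."""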

Are you OK with the workflow in 
https://docs.google.com/drawings/d/1cX06edia_Pr7P5nOB08VsSMsgznyrz4Yy2u8nb596sU/edit?usp=sharing ?
>> One secondary detail to note is that Nova compute calls os-acc per
>> instance for all accelerators for that instance, not once for each
>> accelerator.
> You mean for getVAN()?
Yes. BTW, I renamed it to prepareVANs() (or prepareVAN()), because it is 
not just a query, as the name getVAN implies, but has side effects. A 
hypothetical sketch of its signature is below.
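
For illustration, the per-instance entry point could look like this; the 
names (prepare_vans, VAN, driver_lookup) and argument shapes are 
hypothetical, not agreed API:

    from collections import namedtuple

    # Hypothetical VAN container; the real object shape is still TBD.
    VAN = namedtuple('VAN', ['request', 'device_handle'])

    def prepare_vans(instance_info, accel_requests, driver_lookup):
        """Called once per instance by Nova compute, with all accelerator
        requests for that instance (not once per accelerator).

        Not a pure query: it prepares devices through the Cyborg drivers
        (side effects) and returns the VANs to be plugged one at a time.
        """
        vans = []
        for req in accel_requests:
            driver = driver_lookup(req)  # e.g. keyed on device type
            handle = driver.prepare_device(req, instance_info)
            vans.append(VAN(request=req, device_handle=handle))
        return vans
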
> Because AFAIK, os_vif.plug(list_of_vif_objects,
> InstanceInfo) is *not* how nova uses os-vif for plugging.

Yes, os-acc will invoke plug() once per VAN. IIUC, Nova calls 
Neutron once per instance for all networks, as seen in this call 
sequence in nova/compute/manager.py:

_build_and_run_instance() --> _build_resources() -->
    _build_networks_for_instance() --> _allocate_network()

_allocate_network() actually takes a list of requested_networks and 
handles all networks for an instance [1].

Chasing this further down:

_allocate_network() --> _allocate_network_async()
    --> self.network_api.allocate_for_instance()
    == nova/network/rpcapi.py::allocate_for_instance()

So even the RPC out of Nova seems to take a list of networks [2].

[1] 
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1529
[2] 
https://github.com/openstack/nova/blob/master/nova/network/rpcapi.py#L163
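
To illustrate the analogy, os-acc could mirror that per-instance pattern 
while still invoking plug() once per VAN, roughly along these lines 
(illustrative only; the plugin and VAN objects are the hypothetical ones 
sketched above):

    def attach_accelerators(plugin, instance_info, vans):
        """Plug each VAN individually; best-effort rollback on failure."""
        plugged = []
        try:
            for van in vans:
                plugin.plug(van, instance_info)
                plugged.append(van)
        except Exception:
            for van in plugged:
                plugin.unplug(van, instance_info)
            raise
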
> Thanks,
> Eric
Regards,
Sundar


