[cyborg][nova][neutron]Summaries of Smartnic support integration

Sean Mooney smooney at redhat.com
Tue Jun 2 13:46:38 UTC 2020


On Sun, 2020-05-31 at 15:53 +0000, yumeng bao wrote:
> Hi Sean and Lajos,
> 
> Thank you so much for your quick response, good suggestions and feedback!
> 
> @Sean Mooney
> > if we want to support cyborg/smartnic integration we should add a new
> > device-profile extension that introduces the ability
> > for a non admin user to specify a cyborg device profile name as a new
> > attribute on the port.
> 
> +1, agreed. Cyborg likes this suggestion! It makes it clearer that this field is for device profile usage.
> The reason we first thought of using binding:profile is that it requires the smallest number of
> changes in both nova and neutron. But considering the non-admin issue, the violation of the one-way
> communication of binding:profile, and the possible security risk of breaking nova (which we surely don't want to
> do), we prefer giving up binding:profile and finding a better place to put the new device-profile extension.
> 
> > the neutron server could then either retrieve the request groups from
> > cyborg and pass them as part of the port resource
> > request using the mechanism added for minimum bandwidth or it can leave
> > that to nova to manage.
> > 
> > i would kind of prefer neutron to do this but both could work.
> 
> 
> Yes, the neutron server can also do that, but given that we already landed the code for retrieving request groups
> from cyborg in nova, can we reuse this process in nova and add a new step in create_resource_requests to create
> the accelerator resource request from the port info?

the advantage of neutron doing it is that it can merge the cyborg resource requests with any other resource requests
for the port. if nova does it, it needs slightly different logic than the existing code we have today.
the existing code would make the cyborg resource requests a separate placement group. we need them to be merged with
the port request group. the current nova code also only supports one device profile per instance, so whatever
approach we take, we need to ensure that device profiles can coexist in both the flavor and multiple ports and that
the resource requests are grouped correctly in each case.
> 
> I would really appreciate it if this change could land in nova, as I see these advantages:
> 1) This keeps accelerator request group handling in one place, which makes the integration clear and simple: nova
> controls the main management flow, neutron handles network backend integration, and cyborg handles accelerator
> management.
> 2) Another good thing: this will dispel Lajos's concerns about port-resource-request!
i'm not really sure what that concern is. port-resource-request was created initially for qos minimum bandwidth support
but ideally it is a mechanism for communicating any placement resource requirements to nova.
my proposal was that neutron would retrieve the device profile resource requests (and cache them), then append those
requests to the other port-resource-requests so that they will be included in the port's request group.
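
as a rough illustration of that merge, here is a minimal python sketch. it assumes the port resource request is a dict
with "resources" and "required" keys (the shape used for minimum bandwidth) and that a device profile exposes a list of
groups using "resources:<CLASS>" and "trait:<TRAIT>" keys. the function name and the sample values are hypothetical
and are not taken from neutron or cyborg code.

```python
from copy import deepcopy


def merge_device_profile_into_port_request(port_request, profile_groups):
    """Fold device profile groups into a single port resource request.

    port_request: dict with "resources" (class -> int) and "required" (traits).
    profile_groups: list of dicts keyed by "resources:<CLASS>" / "trait:<TRAIT>".
    Both shapes are assumptions made for this sketch.
    """
    merged = deepcopy(port_request)  # leave the caller's dict untouched
    resources = merged.setdefault("resources", {})
    required = merged.setdefault("required", [])
    for group in profile_groups:
        for key, value in group.items():
            kind, _, name = key.partition(":")
            if kind == "resources":
                # Sum amounts so the port ends up with one request group.
                resources[name] = resources.get(name, 0) + int(value)
            elif kind == "trait" and value == "required":
                if name not in required:
                    required.append(name)
            # Any other key is ignored in this sketch.
    return merged


# Hypothetical sample data.
port_request = {
    "resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000},
    "required": ["CUSTOM_PHYSNET_PHYSNET0"],
}
profile_groups = [{"resources:FPGA": "1", "trait:CUSTOM_SMARTNIC": "required"}]

merged = merge_device_profile_into_port_request(port_request, profile_groups)
print(merged)
```

the point of the sketch is only that the accelerator resources land in the same group as the bandwidth resources,
rather than in a separate placement group.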
> 3) As presented in the proposal (pages 3 and 5 of the slides)[0], please don't worry! This will be a tiny change in nova.
> Cyborg would greatly appreciate it if this change could land in nova, as it saves much effort in cyborg-neutron
> integration.
> 
> 
> @Lajos Katona:
> > Port-resource-request (see: https://docs.openstack.org/api-ref/network/v2/index.html#port-resource-request)
> > is a read-only (and admin-only) field of ports, which is filled
> > based on the agent heartbeats. So now there is no polling of agents or
> > similar. Adding extra "overload" to this mechanism, like polling cyborg or
> > similar, looks like something outside the original design to me, not to speak of the performance issues of adding
there is no need for any polling of cyborg.
the device-profile in the cyborg api is immutable.
you cannot specify the uuid when creating it and the name is the unique constraint.
so even if someone were to delete and recreate the device profile with the same name, the uuid would not be the same.

The first time a device profile is added to a port, the neutron server can look up the device profile once and cache it.
so ideally the neutron server could cache the response of "cyborg profile show", e.g. the list of resource group
requests for the profile, using the name and uuid. the uuid is only useful to catch the ABA problem of people
creating, deleting and recreating the profile with the same name.

i should note that if you do delete and recreate the device profile it's highly likely to break nova, so that should not
be done. this is because nova is relying on the fact that cyborg's api says this cannot change, so we are not storing the
version the vm was booted with and are relying on cyborg not to change it. neutron can make the same assumption that a
device profile definition will not change.
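
a minimal sketch of that lookup-once-and-cache idea, assuming a hypothetical fetch callable that returns a dict with
"uuid" and "groups" for a given profile name. none of these names come from neutron or cyborg; the fake backend below
only stands in for the cyborg api.

```python
class ProfileRecreatedError(Exception):
    """Raised when a profile name now maps to a different uuid."""


class DeviceProfileCache:
    def __init__(self, fetch_profile):
        # fetch_profile is a hypothetical callable: name -> {"uuid", "groups"}
        self._fetch_profile = fetch_profile
        self._profiles = {}

    def get(self, name):
        """Return the profile for name, fetching it only on first use."""
        if name not in self._profiles:
            self._profiles[name] = self._fetch_profile(name)
        return self._profiles[name]

    def verify(self, name):
        """Re-fetch once and compare uuids to catch delete-and-recreate."""
        cached = self.get(name)
        current = self._fetch_profile(name)
        if current["uuid"] != cached["uuid"]:
            raise ProfileRecreatedError(name)
        return cached


# Fake backend standing in for the cyborg api (assumption for the demo).
backend = {"smartnic-dp": {"uuid": "uuid-1", "groups": [{"resources:FPGA": "1"}]}}
calls = []


def fake_fetch(name):
    calls.append(name)
    return dict(backend[name])


cache = DeviceProfileCache(fake_fetch)
cache.get("smartnic-dp")
cache.get("smartnic-dp")  # served from the cache, no second fetch
print(len(calls))  # 1
```

since the profile is immutable, get() never needs to refresh. verify() is only there for the same-name, different-uuid
case, and it costs one extra call when used.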


> >  
> >    - API requests towards cyborg (or anything else) to every port GET
> >    operation
> >    - store cyborg related information in neutron db which was fetched from
> >    cyborg (periodically I suppose) to make neutron able to fill
> >    port-resource-request.
> 
> 
> As mentioned above, if the request group handling can land in nova, we don't need to be concerned about API requests
> towards cyborg or cyborg info being merged into port-resource-request.
this would still have to be done in nova instead. really this is not something nova should have to special case. i'm
not saying it can't, but ideally neutron would leverage the existing port-resource-request feature.
> Another question, just out of curiosity. In my understanding (please correct me if I'm wrong), neutron doesn't
> need to poll cyborg periodically if neutron fills port-resource-request; it can just fetch it once when the port
> request happens.
correct, no need to poll. it just needs to fetch it when the profile is added to the port, and it can be cached safely.
> because neutron can expect that the cyborg device_profile (which provides resource request info for the nova
> scheduler) doesn't change very often,
it's immutable in the cyborg api, so neutron can expect that for a given name it should never change, but it could
detect a change by looking at both the name and uuid.
> it is like a flavor for accelerators, and only admins can create/delete them.
yes, only admins can create and delete them, and it does not support update.
i think it's invalid to delete a device profile if it's currently in use by any neutron port or nova instance.
it's certainly invalid, or should be, to delete it if there is an arq using the device-profile.
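
to make that rule concrete, here is a small sketch of an in-use check. it assumes arqs carry a "device_profile_name"
field and that ports carry the proposed device profile attribute under a hypothetical "device_profile" key. the
function names and sample records are made up for illustration and do not describe existing cyborg behaviour.

```python
def profiles_in_use(arqs, ports):
    """Collect device profile names referenced by any arq or port."""
    names = {arq["device_profile_name"] for arq in arqs}
    # "device_profile" is a hypothetical port attribute for this sketch.
    names.update(p["device_profile"] for p in ports if p.get("device_profile"))
    return names


def can_delete_profile(name, arqs, ports):
    """A profile may only be deleted when nothing references it."""
    return name not in profiles_in_use(arqs, ports)


# Hypothetical sample records.
arqs = [{"uuid": "arq-1", "device_profile_name": "smartnic-dp"}]
ports = [{"id": "port-1", "device_profile": "fpga-dp"}, {"id": "port-2"}]

print(can_delete_profile("smartnic-dp", arqs, ports))
print(can_delete_profile("unused-dp", arqs, ports))
```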
> 
> [0]pre-PTG slides update: https://docs.qq.com/slide/DVkxSUlRnVGxnUFR3
> 
> Regards,
> Yumeng
> 
> 
> 
> 
> On Friday, May 29, 2020, 3:21:08 PM GMT+8, Lajos Katona <katonalala at gmail.com> wrote: 
> 
> 
> 
> 
> 
> Hi,
> 
> Port-resource-request (see: https://docs.openstack.org/api-ref/network/v2/index.html#port-resource-request ) is a
> read-only (and admin-only) field of ports, which is filled
> based on the agent heartbeats. So now there is no polling of agents or similar. Adding extra "overload" to this
> mechanism, like polling cyborg or similar,
> looks like something outside the original design to me, not to speak of the performance issues of adding
>     * API requests towards cyborg (or anything else) to every port GET operation
>     * store cyborg related information in neutron db which was fetched from cyborg (periodically I suppose) to make
> neutron able to fill port-resource-request.
> Regards
> Lajos
> 
> Sean Mooney <smooney at redhat.com> wrote (on Thu, May 28, 2020, 16:13):
> > On Thu, 2020-05-28 at 20:50 +0800, yumeng bao wrote:
> > > 
> > > Hi all,
> > > 
> > > 
> > > In the cyborg pre-PTG meeting held last week[0], shaohe from Intel introduced SmartNIC support integration, and
> > > we've reached some initial agreements:
> > > 
> > > The workflow for a user to create a server with a network accelerator (the accelerator is managed by Cyborg) is:
> > > 
> > >    1. create a port with the accelerator request specified in the binding_profile field
> > >   NOTE: Putting the accelerator request (device_profile) into binding_profile is one possible solution,
> > > implemented in our POC.
> > 
> > the binding profile field is not really intended for this.
> > 
> > https://github.com/openstack/neutron-lib/blob/master/neutron_lib/api/definitions/portbindings.py#L31-L34
> > it's intended to pass info from nova to neutron but not the other way around.
> > it was originally introduced so that nova could pass info to the neutron plugin,
> > specifically the sriov pci address. it was not intended for two-way communication to present info from neutron
> > to nova.
> > 
> > we kind of broke that with the trusted vf feature, but since that was intended to be admin only, as it's a security
> > risk in a multi-tenant cloud, it's a slightly different case.
> > i think we should avoid using the binding profile for passing info from neutron to nova and keep it for its original
> > use of passing info from the virt driver to the network backend.
> > 
> > 
> > >         Another possible solution, adding a new attribute to the port object for cyborg-specific use instead of
> > > using binding_profile, was discussed at the Shanghai Summit[1].
> > > This needs to be checked with the neutron team: which would the neutron team suggest?
> > 
> > from a nova perspective i would prefer if this was a new extension.
> > the binding profile is admin only by default so it's not really a good way to request features be enabled.
> > you can use neutron rbac policies to alter that i believe, but in general i don't think we should advocate for non
> > admins to be able to modify the binding profile as they can break nova, e.g. by modifying the pci address.
> > if we want to support cyborg/smartnic integration we should add a new device-profile extension that introduces the
> > ability for a non admin user to specify a cyborg device profile name as a new attribute on the port.
> > 
> > the neutron server could then either retrieve the request groups from cyborg and pass them as part of the port
> > resource request using the mechanism added for minimum bandwidth or it can leave that to nova to manage.
> > 
> > i would kind of prefer neutron to do this but both could work.
> > > 
> > >    2. create a server with the created port
> > > 
> > > Cyborg-nova-neutron integration workflow can be found on page 3 of the slide[2] presented in pre-PTG.
> > > 
> > > We also recorded the introduction! Please find the pre-PTG meeting video recordings in [3] and [4]; they are the
> > > same, just for different region access.
> > > 
> > > 
> > > [0]http://lists.openstack.org/pipermail/openstack-discuss/2020-May/014987.html
> > > [1]https://etherpad.opendev.org/p/Shanghai-Neutron-Cyborg-xproj
> > > [2]pre-PTG slides:https://docs.qq.com/slide/DVm5Jakx5ZlJXY3lw
> > > [3]pre-PTG video recording on YouTube:https://www.youtube.com/watch?v=IN4haOK7sQg&feature=youtu.be
> > > [4]pre-PTG video recording on Youku:
> > > http://v.youku.com/v_show/id_XNDY5MDA4NjM2NA==.html?x&sharefrom=iphone&sharekey=51459cbd599407990dd09940061b374d4
> > > 
> > > Regards,
> > > Yumeng
> > > 
> > 
> > 
> > 
> 
> 



