[cyborg][nova][neutron]Summaries of Smartnic support integration

Sean Mooney smooney at redhat.com
Wed Jun 3 05:31:29 UTC 2020


On Wed, 2020-06-03 at 00:27 +0000, Feng, Shaohe wrote:
> Yes, we should make sure that device profiles can coexist in both the flavor and multiple ports, and that the resource
> requests are grouped correctly in each case.
> We also need to ensure that whatever approach we take lets accelerator requests (ARQs) from both the flavor and
> multiple ports coexist.
> For example, we may want to delete a port, so we should be able to distinguish ARQs that come from the flavor from
> those that come from ports.
> 
> So which approach? Write the ARQ into the port's "binding:profile", or bind the ARQ to the port UUID instead of the
> instance UUID? Or some other approach?

i kind of like the idea of using the neutron port UUID for the ARQ as it makes it very explicit which resources are mapped
to the port. that said, while i would like neutron to retrieve the port requests, i think it makes sense for nova to create
and bind the ARQs, so if we can have multiple ARQs with the same consumer UUID (e.g. the vm) that would work too.
in the long term, however, i think the neutron port UUID will be simpler. e.g. if we support attach and detach of
smartnic interfaces in the future, then not having to find the specific ARQ corresponding to a port to delete, and
instead just using the port UUID, will be nice.
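
a rough sketch of why keying the ARQ on the port UUID is simpler. this is plain python with made-up names (Arq, consumer_uuid, arqs_for_consumer and detach_port are illustrative only, not the cyborg api or object model). the assumption is that a flavor ARQ uses the instance UUID as its consumer and a port ARQ uses the port UUID.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Arq:
    """Hypothetical, simplified stand-in for a cyborg accelerator request."""
    uuid: str
    device_profile_name: str
    # Assumption: instance UUID for a flavor ARQ, port UUID for a port ARQ.
    consumer_uuid: str


def arqs_for_consumer(arqs: List[Arq], consumer_uuid: str) -> List[Arq]:
    """Return every ARQ bound to the given consumer UUID."""
    return [a for a in arqs if a.consumer_uuid == consumer_uuid]


def detach_port(arqs: List[Arq], port_uuid: str) -> List[Arq]:
    """Return the ARQs that remain after dropping those owned by the port."""
    return [a for a in arqs if a.consumer_uuid != port_uuid]


arqs = [
    Arq("arq-1", "fpga-dp", "vm-1"),      # requested via the flavor
    Arq("arq-2", "smartnic-dp", "port-1"),
    Arq("arq-3", "smartnic-dp", "port-2"),
]
remaining = detach_port(arqs, "port-1")
```

if every ARQ used the instance UUID as its consumer instead, detach_port would need some extra field to work out which ARQ belongs to which port.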
> 
> 
> 
> Regards
> Shaohe
> 
> -----Original Message-----
> From: Sean Mooney <smooney at redhat.com> 
> Sent: June 2, 2020 21:47
> To: yumeng bao <yumeng_bao at yahoo.com>; Lajos Katona <katonalala at gmail.com>
> Cc: openstack maillist <openstack-discuss at lists.openstack.org>; Feng, Shaohe <shaohe.feng at intel.com>
> Subject: Re: [cyborg][nova][neutron]Summaries of Smartnic support integration
> 
> On Sun, 2020-05-31 at 15:53 +0000, yumeng bao wrote:
> > Hi Sean and Lajos,
> > 
> > Thank you so much for your quick response, good suggestions and feedback!
> > 
> > @Sean Mooney
> > > if we want to support cyborg/smartnic integration we should add a
> > > new device-profile extension that introduces the ability for a
> > > non-admin user to specify a cyborg device profile name as a new
> > > attribute on the port.
> > 
> > +1, agreed. Cyborg likes this suggestion! It makes it clearer that this field is for device profile usage.
> > The reason we first thought of using binding:profile is that it requires
> > the smallest number of changes in both nova and neutron. But considering
> > the non-admin issue, the violation of the one-way communication of
> > binding:profile, and the possible security risk of breaking nova (which we
> > surely don't want to do), we prefer giving up binding:profile and finding a better place to put the new
> > device-profile extension.
> > 
> > > the neutron server could then either retrieve the request groups from
> > > cyborg and pass them as part of the port resource request using the
> > > mechanism added for minimum bandwidth or it can leave that to nova to
> > > manage.
> > > 
> > > i would kind of prefer neutron to do this but both could work.
> > 
> > 
> > Yes, the neutron server can also do that, but given that we have
> > already landed the code for retrieving request groups from cyborg in
> > nova, can we reuse this process in nova and add a new step in create_resource_requests to create the accelerator
> > resource request from the port info?
> 
> the advantage of neutron doing it is that it can merge the cyborg resource requests with any other resource requests
> for the port. if nova does it, it needs slightly different logic than the existing code we have today.
> the existing code would make the cyborg resource requests a separate placement group. we need them to be merged with
> the port request group. the current nova code also only supports one device profile per instance, so whatever
> approach we take, we need to ensure that device profiles can coexist in both the flavor and multiple ports and
> that the resource requests are grouped correctly in each case.
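
a minimal sketch of what "merged with the port request group" could look like. the dict shapes below only loosely follow the port resource_request ("resources" plus "required" traits) and the device profile group syntax ("resources:X" / "trait:Y"), so treat them, the function name merge_into_port_request, and the example trait and resource names as assumptions, not the actual neutron or nova code. folding several device profile groups into one port group is also a simplification.

```python
from typing import Dict, List


def merge_into_port_request(port_request: dict,
                            dp_groups: List[Dict[str, str]]) -> dict:
    """Fold device profile groups into a copy of the port's request group."""
    merged = {
        "resources": dict(port_request.get("resources", {})),
        "required": list(port_request.get("required", [])),
    }
    for group in dp_groups:
        for key, value in group.items():
            prefix, _, name = key.partition(":")
            if prefix == "resources":
                # Sum amounts if the same resource class is requested twice.
                current = merged["resources"].get(name, 0)
                merged["resources"][name] = current + int(value)
            elif prefix == "trait" and value == "required":
                if name not in merged["required"]:
                    merged["required"].append(name)
    return merged


# Example port request as it might look for minimum bandwidth.
port_request = {
    "resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000},
    "required": ["CUSTOM_PHYSNET_PHYSNET0"],
}
# Example device profile groups (hypothetical profile contents).
dp_groups = [{"resources:FPGA": "1", "trait:CUSTOM_FPGA_INTEL": "required"}]

merged = merge_into_port_request(port_request, dp_groups)
```

the point is that the scheduler then sees a single group per port, so the bandwidth and the accelerator have to be satisfied together rather than as two unrelated groups.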
> > 
> > I would really appreciate it if this change could land in nova, as I see these advantages:
> > 1) It keeps the handling of accelerator request groups in one place,
> > which makes the integration clear and simple: nova controls the main
> > management flow, neutron handles network backend integration, and cyborg handles accelerator management.
> > 2) Another good thing: this will dispel Lajos's concerns about port-resource-request!
> 
> i'm not really sure what that concern is. port-resource-request was initially created for qos minimum bandwidth
> support but ideally it is a mechanism for communicating any placement resource requirements to nova.
> my proposal was that neutron would retrieve the device profile resource requests (and cache them), then append those
> requests to the other port-resource-requests so that they will be included in the port's request group.
> > 3) As presented in the proposal (pages 3 and 5 of the slides)[0], please don't worry! This will be a tiny change in
> > nova.
> > Cyborg would really appreciate it if this change could land in nova, as
> > it saves a lot of effort in the cyborg-neutron integration.
> > 
> > 
> > @Lajos Katona:
> > > Port-resource-request
> > > (see: https://docs.openstack.org/api-ref/network/v2/index.html#port-resource-request)
> > > is a read-only (and admin-only) field of ports,
> > > which is filled based on the agent heartbeats. So now there is no
> > > polling of agents or similar. Adding extra "overload" to this
> > > mechanism, like polling cyborg or similar, looks like something outside the
> > > original design to me, not to speak of the performance issues of
> > > adding
> 
> there is no need for any polling of cyborg.
> the device-profile in the cyborg api is immutable. you cannot specify the uuid when creating it, and the name is
> the unique constraint.
> so even if someone were to delete and recreate the device profile with the same name, the uuid would not be the same.
> 
> the first time a device profile is added to a port, the neutron server can look up the device profile once and cache
> it.
> so ideally the neutron server could cache the response of "cyborg profile show", e.g. the listing of the resource
> group requests for the profile, using the name and uuid. the uuid is only useful to catch the aba problem of people
> creating, deleting and recreating the profile with the same name.
> 
> i should note that if you do delete and recreate the device profile it's highly likely to break nova, so that should not
> be done. this is because nova is relying on the fact that cyborg's api says this cannot change, so we are not storing the
> version the vm was booted with and are relying on cyborg to not change it. neutron can make the same assumption that a
> device profile definition will not change.
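
to make the caching idea concrete, here is a small sketch. DeviceProfileCache, its fetch callable and the profile dict shape are all hypothetical (no real cyborg client call is used); the only things taken from the above are "look it up once by name" and "use the uuid to spot a delete and recreate under the same name".

```python
from typing import Callable, Dict


class DeviceProfileCache:
    """Cache device profiles by name; the uuid is kept to detect recreation."""

    def __init__(self, fetch: Callable[[str], dict]):
        # fetch is an assumed callable returning
        # {"name": ..., "uuid": ..., "groups": [...]} for a profile name.
        self._fetch = fetch
        self._by_name: Dict[str, dict] = {}

    def get(self, name: str) -> dict:
        """Fetch the profile on first use only; later calls hit the cache."""
        if name not in self._by_name:
            self._by_name[name] = self._fetch(name)
        return self._by_name[name]

    def is_stale(self, name: str, current_uuid: str) -> bool:
        """True if a cached profile of this name now has a different uuid."""
        cached = self._by_name.get(name)
        return cached is not None and cached["uuid"] != current_uuid


calls = []


def fake_fetch(name: str) -> dict:
    # Stand-in for the one-off lookup against cyborg.
    calls.append(name)
    return {"name": name, "uuid": "uuid-1", "groups": [{"resources:FPGA": "1"}]}


cache = DeviceProfileCache(fake_fetch)
first = cache.get("smartnic-dp")
second = cache.get("smartnic-dp")
```

no polling is involved: the lookup happens once when the profile is first attached to a port, and is_stale is only there as a cheap guard against the aba case.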
> 
> 
> > >  
> > >    - API requests towards cyborg (or anything else) to every port GET
> > >    operation
> > >    - store cyborg related information in neutron db which was fetched from
> > >    cyborg (periodically I suppose) to make neutron able to fill
> > >    port-resource-request.
> > 
> > 
> > As mentioned above, if the request group handling can land in nova, we don't need to
> > worry about API requests towards cyborg or about merging cyborg info into port-resource-request.
> 
> this would still have to be done in nova instead. really this is not something nova should have to special case,
> i'm not saying it can't but ideally neutron would leverage the existing port-resource-request feature.
> > Another question, just out of curiosity. In my understanding (please correct
> > me if I'm wrong), neutron doesn't need to poll cyborg periodically if neutron fills
> > port-resource-request; it just fetches it once when the port request happens.
> 
> correct, no need to poll. it just needs to fetch it when the profile is added to the port and it can be cached safely.
> > because neutron can expect that the cyborg device_profile (which provides
> > resource request info for the nova scheduler) doesn't change very often,
> 
> it's immutable in the cyborg api, so it can expect that for a given name it should never change, but it could detect a
> change by looking at both the name and uuid.
> > it is the flavor of accelerator, and only admin can create/delete them.
> 
> yes, only admins can create and delete them, and it does not support update.
> i think it's invalid to delete a device profile if it's currently in use by any neutron port or nova instance.
> it's certainly invalid, or should be, to delete it if there is an ARQ using the device-profile.
> > 
> > [0]pre-PTG slides update: https://docs.qq.com/slide/DVkxSUlRnVGxnUFR3
> > 
> > Regards,
> > Yumeng
> > 
> > 
> > 
> > 
> > On Friday, May 29, 2020, 3:21:08 PM GMT+8, Lajos Katona <katonalala at gmail.com> wrote: 
> > 
> > 
> > 
> > 
> > 
> > Hi,
> > 
> > Port-resource-request (see:
> > https://docs.openstack.org/api-ref/network/v2/index.html#port-resource-request )
> > is a read-only (and admin-only) field of ports, which is
> > filled based on the agent heartbeats. So now there is no polling of
> > agents or similar. Adding extra "overload" to this mechanism, like polling cyborg or similar, looks like something
> > outside the original design to me, not to speak of the performance issues of adding
> >     * API requests towards cyborg (or anything else) to every port GET operation
> >     * storing cyborg related information in the neutron db, fetched from cyborg
> >       (periodically I suppose), to make neutron able to fill port-resource-request.
> > Regards
> > Lajos
> > 
> > Sean Mooney <smooney at redhat.com> wrote (on Thu, May 28, 2020, 16:13):
> > > On Thu, 2020-05-28 at 20:50 +0800, yumeng bao wrote:
> > > > 
> > > > Hi all,
> > > > 
> > > > 
> > > > In the cyborg pre-PTG meeting held last week[0], shaohe from Intel
> > > > introduced SmartNIC support integration, and we've reached some
> > > > initial agreements:
> > > > 
> > > > The workflow for a user to create a server with a network accelerator (the accelerator is managed by Cyborg) is:
> > > > 
> > > >    1. create a port with the accelerator request specified in the binding_profile field
> > > >   NOTE: Putting the accelerator request (device_profile) into
> > > > binding_profile is one possible solution implemented in our POC.
> > > 
> > > the binding profile field is not really intended for this.
> > > 
> > > https://github.com/openstack/neutron-lib/blob/master/neutron_lib/api/definitions/portbindings.py#L31-L34
> > > it's intended to pass info from nova to neutron but not the other way around.
> > > it was originally introduced so that nova could pass info to the
> > > neutron plugin, specifically the sriov pci address. it was not
> > > intended for two-way communication to present info from neutron to nova.
> > > 
> > > we kind of broke that with the trusted vf feature, but since that
> > > was intended to be admin only, as it's a security risk in a
> > > multi-tenant cloud, it's a slightly different case.
> > > i think we should avoid using the binding profile for passing info
> > > from neutron to nova and keep it for its original use of passing info from the virt driver to the network backend.
> > > 
> > > 
> > > >         Another possible solution, adding a new attribute to the port
> > > > object for cyborg-specific use instead of using binding_profile, was discussed at the Shanghai Summit[1].
> > > > This needs to be checked with the neutron team: which would the neutron team suggest?
> > > 
> > > from a nova perspective i would prefer if this was a new extension.
> > > the binding profile is admin only by default so it's not really a good way to request features be enabled.
> > > you can use neutron rbac policies to alter that i believe, but in
> > > general i don't think we should advocate for non-admins to be able to
> > > modify the binding profile as they can break nova, e.g. by modifying the pci address.
> > > if we want to support cyborg/smartnic integration we should add a
> > > new device-profile extension that introduces the ability for a
> > > non-admin user to specify a cyborg device profile name as a new attribute on the port.
> > > 
> > > the neutron server could then either retrieve the request groups from
> > > cyborg and pass them as part of the port resource request using the
> > > mechanism added for minimum bandwidth or it can leave that to nova to manage.
> > > 
> > > i would kind of prefer neutron to do this but both could work.
> > > > 
> > > >    2. create a server with the port created
> > > > 
> > > > The cyborg-nova-neutron integration workflow can be found on page 3 of the slides[2] presented at the pre-PTG.
> > > > 
> > > > We also recorded the introduction! Please find the pre-PTG
> > > > meeting video recordings in [3] and [4]; they are the same, just for
> > > > access from different regions.
> > > > 
> > > > 
> > > > [0]http://lists.openstack.org/pipermail/openstack-discuss/2020-May/014987.html
> > > > [1]https://etherpad.opendev.org/p/Shanghai-Neutron-Cyborg-xproj
> > > > [2]pre-PTG slides: https://docs.qq.com/slide/DVm5Jakx5ZlJXY3lw
> > > > [3]pre-PTG video recording on Youtube: https://www.youtube.com/watch?v=IN4haOK7sQg&feature=youtu.be
> > > > [4]pre-PTG video recording on Youku: http://v.youku.com/v_show/id_XNDY5MDA4NjM2NA==.html?x&sharefrom=iphone&sharekey=51459cbd599407990dd09940061b374d4
> > > > 
> > > > Regards,
> > > > Yumeng
> > > > 
> > > 
> > > 
> > > 
> > 
> > 
> 
> 



