[Openstack] Can any user add or delete OpenStack Swift middleware?

Kota TSUYUZAKI tsuyuzaki.kota at lab.ntt.co.jp
Tue Oct 2 06:27:46 UTC 2018


Hi Qiao,

Please see my inline responses below.

(2018/09/27 0:58), Qiao Kang wrote:
> Kota,
> 
> Sorry for the late response, see more below:
> 
> On Fri, Sep 21, 2018 at 2:59 AM Kota TSUYUZAKI
> <tsuyuzaki.kota at lab.ntt.co.jp> wrote:
>>
>> Hi Qiao,
>>
>>> Thanks! I'm interested and would like to join, as well as contribute!
>>>
>>
>> One example is how multi-read works, around [1]: the storlets middleware makes a subrequest against the backend Swift, then
>> attaches the result as an input to the application in the Docker container by passing a readable file descriptor [2][3][4]*.
>> Once all the preparation for the invocation is done, the input descriptors are readable in the storlet app as InputFile objects.
>>
>> * First, the runtime prepares the extra source stub at [2], then creates a set of pipes for each source to communicate with the app
>> inside the Docker container [3]; finally, the runtime module reads the extra data from the Swift GET and flushes all buffers into the descriptor [4].
>>
>> 1: https://github.com/openstack/storlets/blob/master/storlets/swift_middleware/handlers/proxy.py#L294-L305
>> 2: https://github.com/openstack/storlets/blob/master/storlets/gateway/gateways/docker/runtime.py#L571-L581
>> 3: https://github.com/openstack/storlets/blob/master/storlets/gateway/gateways/docker/runtime.py#L665-L666
>> 4: https://github.com/openstack/storlets/blob/master/storlets/gateway/gateways/docker/runtime.py#L833-L840
>>
>> Following that mechanism, IMO, what we can do to enable multi-out is:
>>
>> - add the capability to create PUT subrequests in the swift_middleware module (and define a new API header for it)
>> - create the extra writable fds for communication in the storlets runtime (the storlets daemon may need to change as well)
>> - pass all data from the writable fds to the PUT subrequests as their input
>>
>>
>> If you have a better idea than mine, it's always welcome tho :)
> 
> I think your approach is clear and straightforward. One quick question:
>> - create the extra writable fds for communication in the storlets runtime (the storlets daemon may need to change as well)
> So the Storlet app will write to those fds? Are these fds temporary
> and need to be destroyed after PUT requests in step 3?

Good question! I don't think we need new code to destroy the descriptors.
Note that there are a couple of file descriptors that have to be closed successfully.

One is inside the container; it is closed [1] by the storlet daemon agent that invokes the storlet app.
The other is the descriptor that the swift middleware layer reads from. It is passed to a StorletResponse
object, and the response object can then be closed by the WSGI middleware [2][3]. Basically, the WSGI middleware
is responsible for closing the application response iterator, so we expect the iterator to be closed somewhere in the WSGI middleware
pipeline, and the close then propagates to the read fd.

If you find any unintended fd leak, please feel free to report it as a bug; we have already fought leaking descriptors before :)

1: https://github.com/openstack/storlets/blob/master/storlets/agent/daemon/server.py#L211-L212
2: https://github.com/openstack/storlets/blob/master/storlets/gateway/gateways/docker/runtime.py#L845
3: https://github.com/openstack/storlets/blob/master/storlets/gateway/common/stob.py#L119-L123
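
To illustrate the close propagation, here is a minimal sketch in plain Python. It is not the Storlets code: FdResponseIter, serve() and demo() are hypothetical names made up for this mail, and serve() only imitates the usual WSGI behaviour of calling close() on the response iterable once the body has been consumed.

```python
import os


class FdResponseIter:
    """Hypothetical stand-in for a response body iterator backed by a pipe fd."""

    def __init__(self, read_fd, chunk_size=65536):
        self.read_fd = read_fd
        self.chunk_size = chunk_size
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        # os.read returns b"" once every writer has closed its end of the pipe.
        chunk = os.read(self.read_fd, self.chunk_size)
        if not chunk:
            raise StopIteration
        return chunk

    def close(self):
        # Closing the iterator is what finally releases the read fd.
        if not self.closed:
            os.close(self.read_fd)
            self.closed = True


def serve(app_iter):
    """Imitates a WSGI layer: consume the body, then always call close()."""
    try:
        return b"".join(app_iter)
    finally:
        close = getattr(app_iter, "close", None)
        if close is not None:
            close()


def demo():
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"storlet output")
    os.close(write_fd)  # the writer side (the app) closes its own end
    body_iter = FdResponseIter(read_fd)
    body = serve(body_iter)
    return body, body_iter.closed
```

Calling demo() returns the body together with a flag telling whether the read fd was closed, so no extra cleanup code is needed as long as the iterator's close() is reached.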


> 
>>
>>> Another potential use case: imagine I want to compress objects upon
>>> PUTs using two different algorithms X and Y, and use the future
>>> 'multi-write' feature to store three objects upon any single PUT
>>> (original copy, X-compressed copy and Y-compressed copy). I can
>>> install two Storlets which implement X and Y respectively. However,
>>> seems Storlets engine can only invoke one per PUT, so this is still
>>> not feasible. Is that correct?
>>>
>>
>> It sounds interesting. As you say, yes, only one Storlet application can be invoked per PUT.
>> On the other hand, Storlets is capable of running several applications, as you want.
>> One idea that uses this capability is to develop an application like:
>>
>> - The Storlet app starts several threads, each with its own output descriptor
>> - The Storlet app reads the input stream, then pushes the data to the threads
>> - Each thread does its own work (one does X compression, the other does Y compression),
>>   then writes its own result to its output descriptor
>>
>> It might work for your use case.
> 
> Sounds great, I guess it should work as well.
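
A rough sketch of that threaded "tee" idea in plain Python follows. It does not use the Storlets API: tee_compress and _worker are hypothetical names, ordinary file-like objects stand in for the storlet input and output descriptors, and zlib and bz2 stand in for the X and Y algorithms.

```python
import bz2
import io
import queue
import threading
import zlib


def _worker(compressor, chunks, out):
    # Compress every chunk from the queue until the None sentinel arrives.
    while True:
        chunk = chunks.get()
        if chunk is None:
            break
        out.write(compressor.compress(chunk))
    out.write(compressor.flush())


def tee_compress(in_stream, outputs, chunk_size=65536):
    """Feed one input stream to several (compressor, writable) pairs."""
    queues = []
    threads = []
    for compressor, out in outputs:
        q = queue.Queue(maxsize=8)  # bounded, so a slow worker applies backpressure
        t = threading.Thread(target=_worker, args=(compressor, q, out))
        t.start()
        queues.append(q)
        threads.append(t)

    # Read the input once and push each chunk to every worker.
    while True:
        chunk = in_stream.read(chunk_size)
        if not chunk:
            break
        for q in queues:
            q.put(chunk)

    # Tell the workers that the input is finished, then wait for them.
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
```

For example, tee_compress(io.BytesIO(data), [(zlib.compressobj(), out_x), (bz2.BZ2Compressor(), out_y)]) fills out_x and out_y with the two compressed copies while reading the input only once.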
> 
> I'm also concerned with "performance isolation" in Storlets. For
> instance, is it possible for a user to launch several very
> heavy-loaded Storlets apps to consume lots of CPU/memory resources to
> affect other users? Does Storlets do performance/resource isolation?
> 
Currently, we don't have an option to limit such resources, but the good news is that storlet apps
run in a Docker container, so the resources can be limited via cgroups.

The current implementation of the docker command that creates the sandbox is around [4], so I guess that expanding its args
and adding configuration to set the resource limits might work.

4: https://github.com/openstack/storlets/blob/master/scripts/restart_docker_container.c#L68
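
As a rough sketch of that idea (in Python for brevity, while the script at [4] is C): the function name and its parameters below are hypothetical and are not existing Storlets configuration options. Only --memory and --cpus are real "docker run" options, and they are enforced through cgroups.

```python
def build_docker_run_args(image, name, memory=None, cpus=None):
    """Build a 'docker run' argument list with optional resource limits.

    memory and cpus are hypothetical configuration values; when set they
    map to the docker run options --memory and --cpus.
    """
    args = ["docker", "run", "--name", name]
    if memory is not None:
        args += ["--memory", str(memory)]  # e.g. "512m"
    if cpus is not None:
        args += ["--cpus", str(cpus)]  # e.g. 1.5
    args.append(image)
    return args
```

With such a helper, an operator-side setting like memory="512m" would simply add two more arguments to the command line that starts the sandbox.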


Thanks,
Kota

> Thanks,
> Qiao
> 
>>
>> Thanks,
>> Kota
>>
>>
>> (2018/09/19 5:52), Qiao Kang wrote:
>>> Dear Kota,
>>>
>>> On Mon, Sep 17, 2018 at 11:43 PM Kota TSUYUZAKI
>>> <tsuyuzaki.kota at lab.ntt.co.jp> wrote:
>>>>
>>>> Hi Qiao,
>>>>
>>>>> I know Storlets can provide user-defined computation functionalities,
>>>>> but I guess some capabilities can only be achieved using middleware.
>>>>> For example, a user may want such a feature: upon each PUT request, it
>>>>> creates a compressed copy of the object and stores both the original
>>>>> copy and compressed copy. It's feasible using middleware but I don't
>>>>> think Storlets provide such capability.
>>>>
>>>> Interesting. Exactly, writing to multiple objects for one PUT request is currently not supported, but as with other middlewares, we could add that capability to Storlets if you prefer.
>>>> Right now only multi-read (i.e. GET from multiple sources) is available, and I think we would be able to extend the logic to PUT requests too. IIRC, back in the day we had a discussion about this sort of
>>>> multi-out use case, and I'm sure the data structures inside Storlets were designed to allow that expansion. At that time we called them "Tee" applications on Storlets; I could not find the
>>>> historical discussion logs about how to implement it tho, sorry. I believe that would be a use case for Storlets if you prefer the flexibility of user-defined applications over operator-defined
>>>> Swift middleware.
>>>>
>>>> An example of multi-read (GET from multiple sources) is here:
>>>> https://github.com/openstack/storlets/blob/master/tests/functional/python/test_multiinput_storlet.py
>>>>
>>>> And if you would like to try implementing multi-write, please join us; I'm happy to help you anytime.
>>>>
>>>
>>> Thanks! I'm interested and would like to join, as well as contribute!
>>>
>>> Another potential use case: imagine I want to compress objects upon
>>> PUTs using two different algorithms X and Y, and use the future
>>> 'multi-write' feature to store three objects upon any single PUT
>>> (original copy, X-compressed copy and Y-compressed copy). I can
>>> install two Storlets which implement X and Y respectively. However,
>>> seems Storlets engine can only invoke one per PUT, so this is still
>>> not feasible. Is that correct?
>>>
>>>>
>>>>> Another example is that a user may want to install a Swift3-like
>>>>> middleware to provide APIs to a 3rd party, but she doesn't want other
>>>>> users to see this middleware.
>>>>>
>>>>
>>>> If the definition can be made by operators, one possible solution is to prepare different proxy-server endpoints for different users, i.e. one user uses a proxy without s3api,
>>>> while the others use a different proxy-server endpoint that has s3api in its pipeline.
>>>>
>>>> Or, it sounds kind of like the defaulter middleware [1], though I don't think turning middlewares on/off is in its scope for now.
>>>>
>>>> 1: https://review.openstack.org/#/c/342857/
>>>
>>> I see, thanks for pointing out the defaulter project!
>>>
>>> Best,
>>> Qiao
>>>
>>>>
>>>> Best,
>>>> Kota
>>>>
>>>> (2018/09/18 11:34), Qiao Kang wrote:
>>>>> Kota,
>>>>>
>>>>> Thanks for your reply, very helpful!
>>>>>
>>>>> I know Storlets can provide user-defined computation functionalities,
>>>>> but I guess some capabilities can only be achieved using middleware.
>>>>> For example, a user may want such a feature: upon each PUT request, it
>>>>> creates a compressed copy of the object and stores both the original
>>>>> copy and compressed copy. It's feasible using middleware but I don't
>>>>> think Storlets provide such capability.
>>>>>
>>>>> Another example is that a user may want to install a Swift3-like
>>>>> middleware to provide APIs to a 3rd party, but she doesn't want other
>>>>> users to see this middleware.
>>>>>
>>>>> Regards,
>>>>> Qiao
>>>>>
>>>>> On Mon, Sep 17, 2018 at 9:19 PM Kota TSUYUZAKI
>>>>> <tsuyuzaki.kota at lab.ntt.co.jp> wrote:
>>>>>>
>>>>>> With Storlets, users can create their own applications that run like a Swift middleware. The application (currently Python and Java are supported as languages, but the
>>>>>> apps can call any binaries in the workspace) can be uploaded as a Swift object, and users can then invoke it with just an extra header that specifies the app.
>>>>>>
>>>>>> To fit your own use case, we may have to consider evolving or integrating the system for you, but I believe Storlets could be a choice for you.
>>>>>>
>>>>>> For details, the Storlets documentation is here:
>>>>>>
>>>>>> Top Level Index: https://docs.openstack.org/storlets/latest/index.html
>>>>>> System Overview: https://docs.openstack.org/storlets/latest/storlet_engine_overview.html
>>>>>> APIs: https://docs.openstack.org/storlets/latest/api/overview_api.html
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Kota
>>>>>>
>>>>>> (2018/09/17 8:59), John Dickinson wrote:
>>>>>>> You may be interested in Storlets. It's another OpenStack project, maintained by a Swift core reviewer, that provides this sort of user-defined middleware functionality.
>>>>>>>
>>>>>>> You can also ask about it in #openstack-swift
>>>>>>>
>>>>>>> --John
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 16 Sep 2018, at 9:25, Qiao Kang wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm wondering whether Swift allows any user (not the administrator) to
>>>>>>>> specify which middleware she/he wants her/his data objects to go through.
>>>>>>>> For instance, Alice wants to install a middleware but doesn't want Bob to
>>>>>>>> use it, where Alice and Bob are two accounts in a single Swift cluster.
>>>>>>>>
>>>>>>>> Or maybe all middlewares are pre-installed globally and cannot be
>>>>>>>> customized on a per-account basis?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Qiao
>>>>>>>> _______________________________________________
>>>>>>>> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>>>>>> Post to     : openstack at lists.openstack.org
>>>>>>>> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> ----------------------------------------------------------
>>>>>> Kota Tsuyuzaki(露﨑 浩太)  <tsuyuzaki.kota at lab.ntt.co.jp>
>>>>>> NTT Software Innovation Center
>>>>>> Distributed Computing Technology Project
>>>>>> Phone  0422-59-2837
>>>>>> Fax    0422-59-2965
>>>>>> -----------------------------------------------------------
>>>>>>
>>>>>>


-- 
----------------------------------------------------------
Kota Tsuyuzaki(露﨑 浩太)  <tsuyuzaki.kota at lab.ntt.co.jp>
NTT Software Innovation Center
Distributed Computing Technology Project
Phone  0422-59-2837
Fax    0422-59-2965
-----------------------------------------------------------



