[large-scale][cinder] backend_native_threads_pool_size option with rbd backend
Hey all,
Is there any recommendation on the number of threads to use with the RBD backend (option backend_native_threads_pool_size)? The documentation says that the default is 20 and that it should be increased, especially for the RBD driver, but up to which value?
Is anyone tuning this parameter in their OpenStack deployments?
If so, maybe we can add some recommendations about it to the OpenStack large-scale documentation?
Cheers,
Arnaud.
Hi Arnaud,
We discussed this in last week's Cinder meeting. Unfortunately we haven't tested it thoroughly, so we don't have any performance numbers to share.
What we can explain is why RBD requires a higher number of native threads. The RBD driver calls C code that could block green threads, and hence block the main operation. For that reason, all of the RBD calls that execute operations are wrapped to run in native threads. The value of backend_native_threads_pool_size for RBD can therefore be set according to the number of operations we want to perform concurrently.
Thanks and regards,
Rajat Dhasmana
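Rajat's point about blocking calls being offloaded to a fixed-size pool of native threads can be illustrated with a small sketch. This is not Cinder's code: it uses the standard library's ThreadPoolExecutor as a stand-in for the native thread pool, and time.sleep as a stand-in for a blocking RBD call. The function name and parameters are hypothetical. It only shows that at most pool_size calls run at once and the rest wait in a queue.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_blocking_calls(pool_size, num_calls, call_seconds=0.05):
    """Run num_calls blocking calls on pool_size native threads.

    Returns (peak_in_flight, elapsed_seconds). Hypothetical helper,
    not part of Cinder.
    """
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def blocking_call():
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(call_seconds)  # stand-in for a blocking C library call
        with lock:
            in_flight -= 1

    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(blocking_call) for _ in range(num_calls)]
        for future in futures:
            future.result()  # re-raise any exception from the worker
    return peak, time.monotonic() - start


# Eight calls on two threads: never more than two run concurrently,
# so the remaining calls have to wait for a free thread.
peak, elapsed = run_blocking_calls(pool_size=2, num_calls=8)
```

With a larger pool_size the same calls finish in fewer "waves", which is the effect a higher backend_native_threads_pool_size is expected to have when many RBD operations run concurrently.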
Hey,
Thanks for your answer! OK, I understand the why ;) also because we hit some issues on our deployment.
So we increased the number of threads to 100, but we also enabled deferred deletion (keeping in mind the quota usage downsides that it brings). We also disabled the periodic task that computes usage and now use the less precise values from the database.
First question here: do you think we are on the right path?
One thing we are not yet sure about is how to correctly calculate the number of threads to use. Should we do basic math with the number of deletions per minute? Or should we take the number of volumes in the backend into account? Something in the middle?
Thanks!
Arnaud
On 05/07, Arnaud wrote:
> Hey,
> Thanks for your answer! OK, I understand the why ;) also because we hit some issues on our deployment. So we increased the number of threads to 100, but we also enabled deferred deletion (keeping in mind the quota usage downsides that it brings).
Hi Arnaud,
Deferred deletion should reduce the number of required native threads, since delete calls will complete faster.
> We also disabled the periodic task that computes usage and now use the less precise values from the database.
Are you referring to the 'rbd_exclusive_cinder_pool' configuration option? Because that should already have the optimum default (True).
> First question here: do you think we are on the right path?
> One thing we are not yet sure about is how to correctly calculate the number of threads to use. Should we do basic math with the number of deletions per minute? Or should we take the number of volumes in the backend into account? Something in the middle?
Native threads on the RBD driver are not only used for deletion, they are used for *all* RBD calls.
We haven't defined any particular method to calculate the optimum number of threads on a system, but I can think of 2 possible avenues to explore:
- Performance testing: Run a set of tests with a high number of concurrent requests and different operations, and see how Cinder performs. I wouldn't bother with individual attach and detach to VM operations, because those are no-ops on the Cinder side; creating volumes from images, with either different images or the cache disabled, would be better.
- Reporting native thread usage: To really know whether the number of native threads is sufficient, you could modify the Cinder volume manager (and possibly also eventlet.tpool) to gather statistics on the number of used/free native threads and the number of queued requests that are waiting for a native thread to pick them up.
Cheers,
Gorka.
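The second avenue (reporting used/free native threads and queued requests) could be prototyped along the following lines. This is only a sketch under stated assumptions: it wraps the standard library's ThreadPoolExecutor rather than eventlet.tpool or the Cinder volume manager, and the InstrumentedPool class and its stats() keys are hypothetical names chosen for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class InstrumentedPool:
    """Fixed-size thread pool that reports used/free threads and queue depth.

    Hypothetical helper for illustration; not an eventlet or Cinder API.
    """

    def __init__(self, size):
        self.size = size
        self._pool = ThreadPoolExecutor(max_workers=size)
        self._lock = threading.Lock()
        self._pending = 0  # submitted and not yet finished
        self._running = 0  # currently executing on a native thread

    def submit(self, func, *args, **kwargs):
        with self._lock:
            self._pending += 1
        return self._pool.submit(self._run, func, *args, **kwargs)

    def _run(self, func, *args, **kwargs):
        with self._lock:
            self._running += 1
        try:
            return func(*args, **kwargs)
        finally:
            with self._lock:
                self._running -= 1
                self._pending -= 1

    def stats(self):
        """Return a snapshot of thread usage and waiting requests."""
        with self._lock:
            return {
                "used": self._running,
                "free": self.size - self._running,
                "queued": self._pending - self._running,
            }

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

Logging stats() periodically would show whether "queued" stays above zero for long periods, which would suggest the pool is too small for the workload.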
Yes, I was talking about rbd_exclusive_cinder_pool. It was False by default on our side because we are still running the Cinder Stein release :(
Thank you for the answer about the calculation methods! Do you mind if I copy and paste your answer into the large-scale documentation ([1])?
Cheers,
Arnaud.
[1] https://docs.openstack.org/large-scale/
On 06/07, Arnaud Morin wrote:
> Yes, I was talking about rbd_exclusive_cinder_pool. It was False by default on our side because we are still running the Cinder Stein release :(
> Thank you for the answer about the calculation methods! Do you mind if I copy and paste your answer into the large-scale documentation ([1])?
Feel free to use it any way you want. :-)
Hi Arnaud,
I just realized you should also be able to use Guru Meditation Reports [1] to get the native threads that are executing at a given time.
Cinder uses multiple processes, one for the parent and one for each individual backend, so the PID that should be used to send the signal is not the parent's. We can get the GMR in the logs for all backends with:
  $ ps -C cinder-volume -o pid --no-headers | tail -n +2 | xargs sudo kill -SIGUSR2
Then go to the "Threads" section and see how many native threads there are.
Cheers,
Gorka.
[1]: https://docs.openstack.org/nova/queens/reference/gmr.html
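For anyone who prefers to script this, the shell pipeline from Gorka's message can be mirrored in Python roughly as follows. This is a sketch, not a tested operational tool: the function names are hypothetical, it relies on the same assumption as `tail -n +2` (that the first PID listed by ps is the cinder-volume parent), and sending the signal needs sufficient privileges (the shell version uses sudo).

```python
import os
import signal
import subprocess


def backend_pids(ps_output):
    """Parse `ps -o pid --no-headers` output and skip the first PID.

    Mirrors `tail -n +2`: the first PID listed is assumed to be the
    parent process, which should not receive the signal.
    """
    pids = [int(token) for token in ps_output.split()]
    return pids[1:]


def request_gmr(process_name="cinder-volume"):
    """Send SIGUSR2 to every backend process and return their PIDs."""
    result = subprocess.run(
        ["ps", "-C", process_name, "-o", "pid", "--no-headers"],
        capture_output=True,
        text=True,
        check=False,  # ps exits non-zero when no process matches
    )
    pids = backend_pids(result.stdout)
    for pid in pids:
        os.kill(pid, signal.SIGUSR2)  # asks the process for a GMR
    return pids
```

Calling request_gmr() on a volume node would then be equivalent to running the one-liner, after which the reports can be looked up in the logs as described in the message.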
participants (4): Arnaud, Arnaud Morin, Gorka Eguileor, Rajat Dhasmana