[Keystone] Performance degradation
Hi,

Over the last few weeks we have been doing performance tests on the Stein version and have seen a notable performance degradation. Further investigation showed that Keystone seems to be the bottleneck.

An example of this is running a Rally test that creates a keystone tenant with users. We see that in Queens this takes 20 seconds to complete the whole iteration whilst Stein takes 30 seconds.

We have done the following tests:
1. Vanilla devstack Queens vs Stein
2. In our Stein version we have swapped out the Keystone Stein container with the Keystone Queens container and the numbers are considerably better too (please note the same keystone configuration is used with regards to processes/threads etc.)

Are any folks familiar with what may cause this?
Are there any performance improvement suggestions or hints?

Thanks
Gary
Hi,

Over the last few weeks we have been doing performance tests on the Stein version and have seen a notable performance degradation. Further investigation showed that Keystone seems to be the bottleneck.

An example of this is running a Rally test that creates a keystone tenant with users. We see that in Queens this takes 20 seconds to complete the whole iteration whilst Stein takes 30 seconds (please see the attached file).

We have done the following tests:
1. Vanilla devstack Queens vs Stein
2. In our Stein version we have swapped out the Keystone Stein container with the Keystone Queens container and the numbers are considerably better too (please note the same keystone configuration is used with regards to processes/threads etc.)

Are any folks familiar with what may cause this?
Are there any performance improvement suggestions or hints?

Thanks
Gary
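For readers who want the reported numbers in relative terms, here is a minimal sketch of the arithmetic. The 20 and 30 second figures come from the message above; the function and variable names are made up for illustration.

```python
def relative_slowdown(baseline_s: float, candidate_s: float) -> float:
    """Return the fractional increase in duration relative to the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline duration must be positive")
    return (candidate_s - baseline_s) / baseline_s


# Per-iteration durations reported in the thread (seconds).
queens_s = 20.0
stein_s = 30.0

slowdown = relative_slowdown(queens_s, stein_s)
# Iterations completed per unit time on Stein, relative to Queens.
throughput_ratio = queens_s / stein_s

print(f"Stein iteration takes {slowdown:.0%} longer than Queens")
print(f"Stein completes {throughput_ratio:.2f}x the iterations per unit time")
```

So the reported change is a 50% longer iteration, or roughly two thirds of the Queens throughput for this one scenario.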
Hi Gary,

A couple of things come to mind right off the bat that I want to ensure are addressed:

1) Is caching enabled, and is the memcache server in fact accessible from the keystone process? Keystone is developed assuming you have caching enabled (we will be examining, during Train, making this a requirement instead of "strongly encouraged").

2) Did you swap from UUID to Fernet between the deployments? Unfortunately, Fernet is slower than UUID. We opted for Fernet and took the performance hit in light of the significantly better token management/maintenance for long-running clouds. The UUID token provider was removed as of Rocky.

3) I am curious about the scenario you have built for Rally: is this a real-world-ish scenario that you are doing on a regular basis? I want to be clear we are troubleshooting real-world(ish) scenarios and not synthetic problems that only occur in test scenarios; there are options to streamline test scenarios separately from real-world use cases. Tell me more about the Rally scenario. I also want to point out that I've been unable to get any real information from the attached files (they seem to be broken).

4) Is this running under mod_wsgi? uwsgi? Is there a lot of other process space contention if under mod_wsgi in Apache?

(There isn't a lot of information about your configuration in the posed question.) Configuration information, deployment in the container, etc. helps us understand what is going on.

Thanks!

On Sun, Jun 16, 2019 at 1:48 AM Gary Kotton <gkotton@vmware.com> wrote:
Hi,
Over the last few weeks we have been doing performance tests on the Stein version and have seen a notable performance degradation. Further investigation showed that Keystone seems to be the bottleneck.
An example of this is running a Rally test that creates a keystone tenant with users. We see that in Queens this takes 20 seconds to complete the whole iteration whilst Stein takes 30 seconds.
We have done the following tests:
1. Vanilla devstack Queens vs Stein
2. In our Stein version we have swapped out the Keystone Stein container with the Keystone Queens container and the numbers are considerably better too (please note the same keystone configuration is used with regards to processes/threads etc.)
Are any folks familiar with what may cause this?
Are there any performance improvement suggestions or hints?
Thanks
Gary
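One way to answer the first question in the reply above (is memcache reachable from the keystone process?) is a quick probe run from inside the keystone container. The sketch below is only an illustration: it uses the memcached text protocol's `version` command over a plain socket, and the host and port are placeholders that should be replaced with whatever the deployment's `memcache_servers` setting points at. It shows that the server answers, not that keystone is actually using the cache.

```python
import socket
from typing import Optional


def memcached_version(host: str, port: int = 11211,
                      timeout: float = 2.0) -> Optional[str]:
    """Return the memcached version string, or None if no valid reply arrives."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # The memcached text protocol answers "version" with
            # a single line of the form "VERSION <string>".
            conn.sendall(b"version\r\n")
            reply = conn.recv(1024)
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return None
    text = reply.decode("ascii", errors="replace").strip()
    prefix = "VERSION "
    return text[len(prefix):] if text.startswith(prefix) else None


if __name__ == "__main__":
    # Placeholder address (assumption): use the host:port from keystone.conf.
    print(memcached_version("127.0.0.1", 11211))
```

A `None` result from inside the container, when the same probe succeeds from the host, would point at a networking problem rather than at keystone itself.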
Please see the comments inline below. Please note that we took the Queens code (with the same configuration as for Stein) and that produced far better results.

From: Morgan Fainberg <morgan.fainberg@gmail.com>
Date: Sunday, 16 June 2019 at 18:44
To: Gary Kotton <gkotton@vmware.com>
Cc: "openstack-discuss@lists.openstack.org" <openstack-discuss@lists.openstack.org>
Subject: Re: [Keystone] Performance degradation

Hi Gary,

A couple of things come to mind right off the bat that I want to ensure are addressed:

1) Is caching enabled, and is the memcache server in fact accessible from the keystone process? Keystone is developed assuming you have caching enabled (we will be examining, during Train, making this a requirement instead of "strongly encouraged").

[Gary] Yes, caching is enabled and memcache is accessible.

2) Did you swap from UUID to Fernet between the deployments? Unfortunately, Fernet is slower than UUID. We opted for Fernet and took the performance hit in light of the significantly better token management/maintenance for long-running clouds. The UUID token provider was removed as of Rocky.

[Gary] We are using Fernet. Please note that in Queens we are also using Fernet.

3) I am curious about the scenario you have built for Rally: is this a real-world-ish scenario that you are doing on a regular basis? I want to be clear we are troubleshooting real-world(ish) scenarios and not synthetic problems that only occur in test scenarios; there are options to streamline test scenarios separately from real-world use cases. Tell me more about the Rally scenario. I also want to point out that I've been unable to get any real information from the attached files (they seem to be broken).

[Gary] I am getting information from the team on this and will get back to you.

4) Is this running under mod_wsgi? uwsgi? Is there a lot of other process space contention if under mod_wsgi in Apache?

(There isn't a lot of information about your configuration in the posed question.) Configuration information, deployment in the container, etc. helps us understand what is going on.

[Gary] It is mod_wsgi. This is containerized. There is nothing else running in the same container.

Thanks!
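Since the comparison rests on both containers running with the same keystone configuration, it can help to confirm that mechanically. The following is a rough sketch, not a keystone tool: it parses two `keystone.conf` texts with Python's standard `configparser` and lists options whose values differ. The sample contents are hypothetical and do not come from the thread; `[cache] enabled`, `backend`, `memcache_servers` and `[token] provider` are used only as familiar option names.

```python
import configparser


def load_options(text: str) -> dict:
    """Parse INI-style text into a {(section, option): value} mapping."""
    parser = configparser.ConfigParser(
        interpolation=None,            # keep '%' characters literal
        strict=False,                  # tolerate repeated options (last wins)
        default_section="__unused__",  # treat [DEFAULT] as an ordinary section
    )
    parser.read_string(text)
    return {
        (section, option): value
        for section in parser.sections()
        for option, value in parser.items(section)
    }


def diff_options(old_text: str, new_text: str) -> dict:
    """Return {(section, option): (old_value, new_value)} for differing options."""
    old, new = load_options(old_text), load_options(new_text)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(old.keys() | new.keys())
        if old.get(key) != new.get(key)
    }


# Hypothetical sample configs, for illustration only.
QUEENS_SAMPLE = """
[cache]
enabled = true
backend = dogpile.cache.memcached
memcache_servers = 127.0.0.1:11211

[token]
provider = fernet
"""

STEIN_SAMPLE = """
[cache]
backend = dogpile.cache.memcached
memcache_servers = 127.0.0.1:11211

[token]
provider = fernet
"""

for (section, option), (old, new) in diff_options(QUEENS_SAMPLE, STEIN_SAMPLE).items():
    print(f"[{section}] {option}: queens={old!r} stein={new!r}")
```

An empty diff only shows that the files match; defaults that changed between releases would not appear here, so this complements rather than replaces checking the release notes.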
participants (2)
- Gary Kotton
- Morgan Fainberg