[qa][ptg][nova][cinder][keystone][neutron][glance][swift][placement] How to make integrated-gate testing (tempest-full) more stable and fast
Current integrated-gate jobs (tempest-full) are not very stable because of various bugs, especially timeouts. We tried to improve this by moving the slow tests into the separate tempest-slow job, but the situation has not improved much.

We talked about ideas to make the jobs more stable and faster for projects, especially when a failure is not related to the project under test. We are planning to split the integrated-gate template (only the tempest-full job as a first step) per related services.

Idea:
- Run only dependent service tests on each project gate.
- Tempest gate will keep running all the services' tests as the integrated gate at a centralized place, without any change in the current job.
- Each project can run the templates mentioned below.
- All templates below will be defined and maintained by the QA team.

I would like to hear from each of the 6 services that run integrated-gate jobs.

1. "Integrated-gate-networking" (job to run on neutron gate)
Tests to run in this template: neutron APIs, nova APIs, keystone APIs? All scenario tests currently running in tempest-full, in the same way (meaning non-slow and in serial).
Improvement for neutron gate: exclude the cinder API tests, glance API tests, swift API tests.

2. "Integrated-gate-storage" (job to run on cinder gate, glance gate)
Tests to run in this template: Cinder APIs, Glance APIs, Swift APIs, Nova APIs, and all scenario tests currently running in tempest-full, in the same way (meaning non-slow and in serial).
Improvement for cinder, glance gate: exclude the neutron API tests, Keystone API tests.

3. "Integrated-gate-object-storage" (job to run on swift gate)
Tests to run in this template: Cinder APIs, Glance APIs, Swift APIs, and all scenario tests currently running in tempest-full, in the same way (meaning non-slow and in serial).
Improvement for swift gate: exclude the neutron API tests, Keystone API tests, Nova API tests.
Note: swift does not run integrated-gate as of now.

4. "Integrated-gate-compute" (job to run on Nova gate)
Tests to run in this template: Nova APIs, Cinder APIs, Glance APIs?, neutron APIs, and all scenario tests currently running in tempest-full, in the same way (meaning non-slow and in serial).
Improvement for Nova gate: exclude the swift API tests (not running in the current job, but they might in the future), Keystone API tests.

5. "Integrated-gate-identity" (job to run on keystone gate)
Tests to run in this template: all. As all projects use keystone, we might need to run all tests as they run in integrated-gate today. But is keystone being used differently by all services? If not, is it enough to run only a single service's tests, say Nova or neutron?

6. "Integrated-gate-placement" (job to run on placement gate)
Tests to run in this template: Nova API tests, Neutron API tests + scenario tests + any new service that depends on placement APIs.
Improvement for placement gate: exclude the glance API tests, cinder API tests, swift API tests, keystone API tests.

Thoughts on this approach?

The important point is that we must not lose integrated-testing coverage per project. So I would like each project's view on whether we are missing any dependency (through the proposed test removals) in the templates proposed above.

- https://etherpad.openstack.org/p/qa-train-ptg

-gmann
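The per-template exclusions proposed above can be sketched as a small test-ID filter. This is only an illustration, not the QA team's implementation: the template names and excluded services come from the proposal, while the dictionary layout, the function names, and the sample test IDs are assumptions made for the example. It also assumes Tempest API tests live under `tempest.api.<package>` (compute, volume, image, object_storage, network, identity).

```python
import re

# Tempest API test package for each service (assumed layout: tempest.api.<package>).
API_PACKAGES = {
    "nova": "compute",
    "cinder": "volume",
    "glance": "image",
    "swift": "object_storage",
    "neutron": "network",
    "keystone": "identity",
}

# Services whose API tests each proposed template would exclude.
# Template names are from the proposal; this mapping is illustrative only.
EXCLUDED = {
    "integrated-gate-networking": ["cinder", "glance", "swift"],
    "integrated-gate-storage": ["neutron", "keystone"],
    "integrated-gate-object-storage": ["neutron", "keystone", "nova"],
    "integrated-gate-compute": ["swift", "keystone"],
    "integrated-gate-identity": [],  # proposal: run everything
    "integrated-gate-placement": ["glance", "cinder", "swift", "keystone"],
}


def exclusion_regex(template):
    """Return a compiled regex matching excluded API test IDs, or None."""
    services = EXCLUDED[template]
    if not services:
        return None
    packages = "|".join(sorted(API_PACKAGES[s] for s in services))
    return re.compile(r"^tempest\.api\.(?:%s)\." % packages)


def select_tests(template, test_ids):
    """Keep only the test IDs that the given template would still run."""
    pattern = exclusion_regex(template)
    if pattern is None:
        return list(test_ids)
    return [t for t in test_ids if not pattern.match(t)]


# Hypothetical sample IDs, used only to exercise the filter.
SAMPLE_IDS = [
    "tempest.api.network.test_networks",
    "tempest.api.volume.test_volumes_get",
    "tempest.api.identity.v3.test_tokens",
    "tempest.scenario.test_network_basic_ops",
]

if __name__ == "__main__":
    for name in sorted(EXCLUDED):
        print(name, select_tests(name, SAMPLE_IDS))
```

Scenario tests pass through untouched in this sketch, matching the proposal that all scenario tests keep running in each template.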
+1 On Sun, May 5, 2019 at 3:18 PM Ghanshyam Mann <gmann@ghanshyammann.com> wrote:
I think this is a really great approach. +1 Nate On Sun, May 05, 2019 at 02:18:08AM -0500, Ghanshyam Mann wrote:
Yes, I also like this approach On Mon, May 6, 2019 at 6:48 AM Nate Johnston <nate.johnston@redhat.com> wrote:
On Sun, May 5, 2019 at 12:19 AM Ghanshyam Mann <gmann@ghanshyammann.com> wrote:
For the "Integrated-gate-identity", I have a slight worry that we might lose some coverage with this change. I am unsure of how varied the use of Keystone is outside of the KeystoneMiddleware (i.e. token validation) consumption that all services perform, Heat (not part of the integrated gate) and its usage of Trusts, and some newer emerging uses such as "look up limit data" (potentially in Train, would be covered by Nova).

Worst case, we could run all the integrated tests for Keystone changes (at least initially) until we have higher confidence, and minimize the tests once we have a clearer audit of how the services use Keystone. The changes would speed up/minimize the usage for the other services directly, and Keystone can follow down the line.

I want to be as close to 100% sure as possible that we're not going to suddenly break everyone because of some change we land. Keystone, fortunately and unfortunately, sits below most other services in an OpenStack deployment and is relied on heavily throughout almost every single request.

--Morgan
---- On Tue, 07 May 2019 07:06:23 +0900 Morgan Fainberg <morgan.fainberg@gmail.com> wrote ----
Thanks Morgan. That was what we were worried about during the PTG discussion. I agree with your point about not losing coverage and first getting to know how Keystone is being used by each service. Let's keep running all the service tests on the keystone gate for now; later we can shorten the test run once the usage is clearer. -gmann
---- On Mon, 27 May 2019 18:43:35 +0900 Ghanshyam Mann <gmann@ghanshyammann.com> wrote ----
We can disable the ssh validation for "Integrated-gate-identity", which keystone does not need to care about. This can save the keystone gate from ssh timeout failures. -gmann
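As a rough illustration of what disabling ssh validation could look like, the sketch below rewrites a Tempest configuration so that `run_validation` in the `[validation]` section is `False`. Treating that option as the "ssh validation" switch meant here is an assumption, and the helper name and sample config text are hypothetical; the actual job change may be made differently (for example through job variables rather than by editing the file).

```python
import configparser
import io


def disable_ssh_validation(conf_text):
    """Return conf_text with [validation] run_validation set to False.

    Assumes the option that controls ssh validation is
    [validation] run_validation; adjust if the job uses another knob.
    """
    # interpolation=None so literal '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("validation"):
        parser.add_section("validation")
    parser.set("validation", "run_validation", "False")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical minimal config, used only to exercise the helper.
SAMPLE_CONF = """\
[DEFAULT]
debug = True

[validation]
run_validation = True
"""

if __name__ == "__main__":
    print(disable_ssh_validation(SAMPLE_CONF))
```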
---- On Tue, 28 May 2019 22:21:44 +0900 Ghanshyam Mann <gmann@ghanshyammann.com> wrote ----
I have prepared the new templates for integrated-gate testing [1] and tested them in a DNM (do-not-merge) patch [2]. The new jobs take about 20 minutes less (except the compute one), but the main benefit is that this will improve the stability of the gate. Once they are merged, I will propose patches to replace the templates on the project gates. NOTE: Along with the API tests, I have also blacklisted the non-dependent scenario tests. [1] https://review.opendev.org/#/q/topic:refactor-integrated-gate-testing+(statu...) [2] https://review.opendev.org/#/c/669313/ -gmann
Thoughts on this approach?
The important point is that we must not lose integrated-testing coverage for any project. So I would like each project's view on whether any dependency is missing (through the proposed test removals) from the templates proposed above.
- https://etherpad.openstack.org/p/qa-train-ptg
-gmann
On 5/5/19 12:18 AM, Ghanshyam Mann wrote:
Current integrated-gate jobs (tempest-full) are not very stable because of various bugs, especially timeouts. We tried to improve this by moving the slow tests into the separate tempest-slow job, but the situation has not improved much.
We talked about ideas to make it more stable and faster for projects, especially when a failure is not related to the project under test. We are planning to split the integrated-gate template (only the tempest-full job as a first step) per related services.
Idea:
- Run only dependent service tests on project gate.
I love this plan already.
- Tempest gate will keep running all the services' tests as the integrated gate at a centralized place, without any change to the current job.
- Each project can run the below-mentioned template.
- All templates below will be defined and maintained by the QA team.
My biggest regret is that I couldn't figure out how to do this myself. Much thanks to the QA team!
I would like to hear from each of the 6 services that run integrated-gate jobs
3. "Integrated-gate-object-storage" (job to run on swift gate) Tests to run in this template: Cinder APIs, Glance APIs, Swift APIs, and all scenario tests currently running in tempest-full, in the same way (non-slow and in serial). Improvement for swift gate: excludes the Neutron API tests, Keystone API tests, and Nova API tests.
This sounds great. My only question is why Cinder tests are still included, but I trust that it's there for a reason and I'm just revealing my own ignorance of Swift's consumers, however removed.
Note: swift does not run integrated-gate as of now.
Correct, and for all the reasons that you're seeking to address. Some eight months ago I'd gotten tired of seeing spurious failures that had nothing to do with Swift, and I was hard pressed to find an instance where the tempest tests caught a regression or behavior change that wasn't already caught by Swift's own functional tests. In short, the signal-to-noise ratio for those particular tests was low enough that a failure only told me "you should leave a recheck comment," so I proposed https://review.opendev.org/#/c/601813/ . There was also a side benefit of having our longest-running job change from legacy-tempest-dsvm-neutron-full (at 90-100 minutes) to swift-probetests-centos-7 (at ~30 minutes), tightening developer feedback loops. It sounds like this proposal addresses both concerns: by reducing the scope of tests to what might actually exercise the Swift API (if indirectly), the signal-to-noise ratio should be much better and the wall-clock time will be reduced.
Thoughts on this approach?
The important point is we must not lose the coverage of integrated testing per project. So I would like to get each project view if we are missing any dependency (proposed tests removal) in above proposed templates.
As far as Swift is aware, these dependencies seem accurate; at any rate, *we* don't use anything other than Keystone, even by way of another API. Further, Swift does not use particularly esoteric Keystone APIs; I would be OK with integrated-gate-identity not exercising Swift's API with the assumption that some other (or indeed, almost *any* other) service would likely exercise the parts that we care about.
- https://etherpad.openstack.org/p/qa-train-ptg
-gmann
While I'm all for limiting the scope Tempest is targeting for each patch to save time and our precious infra resources, I have a feeling that we might end up missing something here. Honestly, I'm not sure what that something would be, and maybe it's me thinking about the scopes the wrong way around. For example:
4. "Integrated-gate-compute" (job to run on Nova gate)
I'm not exactly sure what any given Nova patch would be able to break in Cinder, Glance or Neutron, or, on number 2, what Swift depends on from Glance and Cinder that we could break when we introduce a change.
Shouldn't we be asking "Which projects consume service X?" and targeting those Tempest tests? From the Glance perspective this would be (from the core projects) Glance, Cinder and Nova; Cinder is probably interested in Cinder, Glance and Nova (anyone else consuming Cinder?), etc.
I'd like to propose an approach where we define these jobs and run them in check to start with, and let gate run the full suites until we figure out whether we are catching something in gate that we did not catch in check. Once we have reached the understanding that we have sufficient coverage, we can go ahead and switch gate to those jobs as well. This approach would give us the benefit where the impact is highest until we are confident we got the coverage right. I think the biggest issue is that, for the transition period, _everyone_ needs to understand that gate might catch something check did not, and a simple "recheck" might not be sufficient when tempest succeeded in check but failed in gate.
Best,
Erno "jokke_" Kuvaja
---- On Thu, 16 May 2019 20:48:30 +0900 Erno Kuvaja <ekuvaja@redhat.com> wrote ----
While I'm all for limiting the scope Tempest is targeting for each patch to save time and our precious infra resources, I have a feeling that we might end up missing something here. Honestly, I'm not sure what that something would be, and maybe it's me thinking about the scopes the wrong way around. For example:
4. "Integrated-gate-compute" (job to run on Nova gate)
I'm not exactly sure what any given Nova patch would be able to break in Cinder, Glance or Neutron, or, on number 2, what Swift depends on from Glance and Cinder that we could break when we introduce a change.
There can be various scenarios where these services are cross-dependent, and it is difficult to judge the isolation among them. For example, the multi-attach feature depends on both Nova and Cinder to work correctly; a change on either side can break it.
Shouldn't we be looking "What projects are consuming service X and target those Tempest tests"? In Glance perspective this would be (from core projects) Glance, Cinder, Nova; Cinder probably interested about Cinder, Glance and Nova (anyone else consuming Cinder?) etc.
I agree with your point about optimizing the testing further based on consumers only. But there are a few cross-service calls between consumer and consumed services. For example, Nova and Cinder call back to each other for the swap volume feature. To be honest, I want the broadest possible coverage, with cross-testing of both consumer and consumed services. It may be possible to optimize further, but that carries the risk of losing some coverage and introducing a regression. That risk is the more dangerous one, and we should avoid it until we are very clear about service isolation.
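The difference between the "consumers only" view and the broader "consumer plus consumed" view can be sketched with a small call graph. The edges below are assumptions for illustration, loosely based on what is said in this thread (Nova and Cinder call each other, Cinder can use Swift as a backend, Cinder and Nova consume Glance); they are not an authoritative dependency map, and the function names are invented for the example.

```python
# Hypothetical "X calls Y" edges, for illustration only.
CALLS = {
    "nova": {"cinder", "glance", "neutron"},
    "cinder": {"nova", "glance", "swift"},
}


def consumers_only(service, calls):
    """Narrow view: the service plus the projects that call it."""
    consumers = {s for s, targets in calls.items() if service in targets}
    return {service} | consumers


def consumers_and_consumed(service, calls):
    """Broad view: also include the projects the service itself calls."""
    return consumers_only(service, calls) | calls.get(service, set())


for name in ("glance", "nova"):
    print(name, sorted(consumers_only(name, CALLS)),
          sorted(consumers_and_consumed(name, CALLS)))
```

With these made-up edges the two views agree for Glance (Glance, Cinder, Nova) but differ for Nova: the narrow view keeps only Nova and Cinder tests, while the broad view also keeps Glance and Neutron tests, which is the extra coverage being argued for here.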
I'd like to propose approach where we define these jobs and run them in check for the start and let gate run full suites until we figure out are we catching something in gate we did not catch in check and once the understanding has been reached that we have sufficient coverage, we can go ahead and swap gate using those jobs as well. This approach would give us the benefit where the impact is highest until we are confident we got the coverage right. I think biggest issue is that for the transition period _everyone_ needs to understand that gate might catch something check did not and simple "recheck" might not be sufficient when tempest succeeded in check but failed in gate.
I like your idea of trying this experimentally before the actual migration, but I am worried about how to do that. There are two challenges here:
1. Any job in the gate pipeline has to run in the check pipeline first. Replacing integrated-gate with integrated-gate-* in the check pipeline only would need an exception to that process.
2. How do we get the matrix of the failure gap between the check and gate pipelines due to this change? The OpenStack health dashboard does not collect check pipeline data.
-gmann
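For the second challenge, if per-change job results for both pipelines could be collected from somewhere, the failure gap itself would be simple to compute. The sketch below assumes such records exist; the `Run` record, the job names, and the sample data are placeholders invented for this example, and no existing dashboard or API is implied.

```python
from collections import namedtuple

# Hypothetical record of one job run on one change.
Run = namedtuple("Run", ["change", "pipeline", "job", "result"])


def failure_gap(runs, check_job, gate_job):
    """Changes where check_job passed in check but gate_job failed in gate."""
    check_ok = {
        r.change for r in runs
        if r.pipeline == "check" and r.job == check_job
        and r.result == "SUCCESS"
    }
    gate_failed = {
        r.change for r in runs
        if r.pipeline == "gate" and r.job == gate_job
        and r.result == "FAILURE"
    }
    return sorted(check_ok & gate_failed)


# Made-up sample data; the job names are placeholders.
runs = [
    Run(101, "check", "reduced-tempest-job", "SUCCESS"),
    Run(101, "gate", "tempest-full", "FAILURE"),
    Run(102, "check", "reduced-tempest-job", "SUCCESS"),
    Run(102, "gate", "tempest-full", "SUCCESS"),
    Run(103, "check", "reduced-tempest-job", "FAILURE"),
]
print(failure_gap(runs, "reduced-tempest-job", "tempest-full"))
```

Each change in the result would still need a manual look, since a gate failure can be an unrelated intermittent failure rather than missing coverage in the reduced job.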
Best, Erno "jokke_" Kuvaja
---- On Tue, 07 May 2019 08:25:11 +0900 Tim Burke <tim@swiftstack.com> wrote ----
On 5/5/19 12:18 AM, Ghanshyam Mann wrote:
3. "Integrated-gate-object-storage" (job to run on swift gate) Tests to run in this template: Cinder APIs, Glance APIs, Swift APIs, and all scenario tests currently running in tempest-full, in the same way (non-slow and in serial). Improvement for swift gate: excludes the Neutron API tests, Keystone API tests, and Nova API tests.
This sounds great. My only question is why Cinder tests are still included, but I trust that it's there for a reason and I'm just revealing my own ignorance of Swift's consumers, however removed.
As Cinder uses Swift as one of its backends, I think it is worth running the Cinder tests on the swift gate. But honestly, I am aiming for the broadest possible coverage so that we do not lose any coverage among dependent services. We can always optimize this template further later, once we are very clear about the isolation of services.
Note: swift does not run integrated-gate as of now.
Yeah, Kota also raised this with the QA team. We suggested waiting for the integrated-gate to stabilize (this mailing thread); after that, swift can add the integrated-gate-* template.
Correct, and for all the reasons that you're seeking to address. Some eight months ago I'd gotten tired of seeing spurious failures that had nothing to do with Swift, and I was hard pressed to find an instance where the tempest tests caught a regression or behavior change that wasn't already caught by Swift's own functional tests. In short, the signal-to-noise ratio for those particular tests was low enough that a failure only told me "you should leave a recheck comment," so I proposed https://review.opendev.org/#/c/601813/ . There was also a side benefit of having our longest-running job change from legacy-tempest-dsvm-neutron-full (at 90-100 minutes) to swift-probetests-centos-7 (at ~30 minutes), tightening developer feedback loops. It sounds like this proposal addresses both concerns: by reducing the scope of tests to what might actually exercise the Swift API (if indirectly), the signal-to-noise ratio should be much better and the wall-clock time will be reduced.
True, many other project gates face a similar problem. Let's see how much this idea can improve integrated-gate testing. Thanks for the confirmation from the Swift side. -gmann
participants (7)
- Erno Kuvaja
- Ghanshyam Mann
- LIU Yulong
- Miguel Lavalle
- Morgan Fainberg
- Nate Johnston
- Tim Burke