Re: Re: [all][tc][horizon] A web tool which helps administrators in managing openstack clusters
Hello everyone,
Thanks for your attention and advice on this project. We have read each reply thoroughly, and the good news is that we can answer some of the questions raised.
As Lingxian Kong said:
I have a few questions/suggestions:
1. It'd be great and gain more attractions if you could provide a demo about how "openstack-admin" looks like
2. What OpenStack services has "openstack-admin" already integrated? Is it easy to integrate with others?
Our answers:
1. We have deployed openstack-admin <http://218.205.220.13:9384/login> on a mini OpenStack cluster as a demo; any access is welcome.
- Username: openstack-admin
- Password: @Dem0
2. Since openstack-admin gets the information it needs by querying the SQL database, it’s fairly easy to integrate with all OpenStack services.
- As the demo shows, openstack-admin has integrated *nova* (almost all GET and part of POST), *cinder* (GET and snapshot creation), *neutron* (subnets and ports), *keystone* (projects) and *glance* (images). That’s all we need in our own working environment.
- If we need to integrate more services into openstack-admin (e.g. adding a create-instance button or integrating with swift), that would not be a complex task either.
As Adrian Turjak said:
The first major issue is that you connect to the databases of the services directly. That's a major issue, both for long term compatibility, and security. The APIs should always be the main point of contact and the ONLY contract that the services have to maintain. By connecting to the database directly you are now relying on a data structure that can, and likely will change, and any security and sanity checking on filters and queries is now handled on your layer rather than the application itself. Not only that, but your dashboard also now needs passwords for all the databases, and by the sounds of it all the message queues.
And as Mohammed Naser said:
While I agree with you that querying database is much faster, this introduces two issues that I imagine for users:
- Dashboards generally having direct access via SQL is a scary thing from an operators perspective
- also, it will make maintaining the project quite hard because I don't think any projects expose a *stable* database API.
Well, we’re not surprised that our querying approach has been challenged, since it does sound unsafe. However, we have made some efforts to address the problems that have been raised:
- We use an ORM library to build all queries, which ensures that only the instructions we have specified in the backend (i.e. select, order by, where and other harmless query instructions) can be executed, protecting our databases from dangerous attacks like SQL injection. All sanity and security checks are completed automatically by those library functions.
- All instructions that *may* change the database (e.g. start, shutoff, migrate) are executed by calling the standard OpenStack API; only pure GET instructions are implemented by querying the databases directly. We have wrapped each API call in a go func() { ... }() to avoid blocking on the extremely long call duration. The results of the API calls are sent back to the frontend asynchronously over a websocket.
- Passwords for the databases and message queues (and many other kinds of information) are stored in a config file that is loaded by openstack-admin. Simply by modifying this file, we can stay consistent with any changes to the SQL databases and MQs.
I hope my explanation is clear enough, and we’re willing to address any other issues that remain.
Cheers,
Douglas Zhang
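The asynchronous pattern described in the message above (wrapping each API call in `go func() { ... }()` and delivering the result to the frontend later) might look roughly like the following. This is only a minimal sketch under stated assumptions: `Result`, `callAsync` and `slowAPICall` are hypothetical names, not taken from openstack-admin, and a buffered channel stands in for the websocket connection.

```go
package main

import (
	"fmt"
	"time"
)

// Result carries the outcome of one asynchronous API call.
// Hypothetical type, not taken from openstack-admin.
type Result struct {
	Action string
	Err    error
}

// callAsync runs call in its own goroutine and delivers the outcome on
// results, so the caller (for example an HTTP handler) returns immediately.
func callAsync(action string, call func() error, results chan<- Result) {
	go func() {
		results <- Result{Action: action, Err: call()}
	}()
}

// slowAPICall stands in for a long-running OpenStack API request,
// such as a migration. It only sleeps and reports success.
func slowAPICall() error {
	time.Sleep(50 * time.Millisecond)
	return nil
}

func main() {
	// The buffered channel is a stand-in for the websocket to the frontend.
	results := make(chan Result, 1)

	callAsync("migrate", slowAPICall, results)
	fmt.Println("request accepted") // the caller is not blocked by the API call

	// A real service would forward this result over the websocket instead.
	r := <-results
	fmt.Printf("%s finished, err=%v\n", r.Action, r.Err)
}
```

Buffering the channel means the goroutine can finish even if nobody is reading yet; a real service would also need to handle a frontend that has disconnected.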
Keep in mind Keystone's database has what is considered privileged information in it, notably user passwords (bcrypt or scrypt hashed) and user credentials (encrypted). Even with hashing, it is never recommended to expose these values externally. An example I give is: do you consider the password hashes in your shadow file secure enough to publish publicly? (The answer should be an emphatic "no".)
Keystone also contains, in many deployments, PII (personally identifiable information). While this is not explicitly part of Keystone, nor recommended to store in Keystone, there could be other legal ramifications to exposing this data en masse, especially if the data would have been protected via the API.
I highly recommend, with a security hat on, not connecting to and interacting with Keystone's database directly for this reason. It is possible, even with an ORM, that someone will decide to develop a mechanism to pull user-related information, or that there may be an exposure that can leak arbitrary data from within the DB.
I will also echo concerns that you will have a hard time keeping up across versions with the various database schema changes. For example, between Stein and Train, Keystone will have added resource options that are intended to communicate immutability for some resources. These are loaded behind the scenes with a join and translated to something usable via code. The referencing keys are minimalist and may be a simple ID or a 4-letter code instead of the full option name. I am sure Keystone is not the only service that has conventions for data in the database that do not translate to something useful without being run through the API code.
-- Morgan
On Aug 25, 2019, at 03:28, Douglas Zhang <lychzhz@gmail.com> wrote:
[...]
Douglas Zhang wrote:
[...] As Adrian Turjak said:
The first major issue is that you connect to the databases of the services directly. That's a major issue, both for long term compatibility, and security. The APIs should always be the main point of contact and the ONLY contract that the services have to maintain. By connecting to the database directly you are now relying on a data structure that can, and likely will change, and any security and sanity checking on filters and queries is now handled on your layer rather than the application itself. Not only that, but your dashboard also now needs passwords for all the databases, and by the sounds of it all the message queues.
And as Mohammed Naser said:
While I agree with you that querying database is much faster, this introduces two issues that I imagine for users: [...]
* also, it will make maintaining the project quite hard because I don't think any projects expose a /stable/ database API.
Well, we’re not surprised that our querying approach would be challenged since it does sound unsafe. However, we have made some efforts to solve problems which have been posed: [...]
I hope my explanation is clear enough, and we’re willing to solve other possible issues existing.
Thanks for the replies!
As I highlighted above, you did not address the main issue raised by Adrian and Mohammed, which is that the database schema in OpenStack services is not stable. Our project teams only commit to one public API, and that is the REST one.
Querying the database directly is definitely faster (both in original coding and query performance), but you incur enormous technical debt by taking this shortcut. *Someone* will have to care about keeping openstack-admin queries and projects' database schemas in sync forever after. That means projects either need to commit to a stable database API in addition to a stable REST API, *or* openstack-admin maintainers will have to keep up with any database schema change in any future version of any component they interact with.
At this point in the history of OpenStack, IMHO we need to care more about long-term sustainability with a limited number of maintainers than about speed. There are definitely optimizations that can be made to make the slowest queries faster, without incurring massive technical debt that will have to be repaid by maintainers forever after.
It's definitely less fun and rewarding than writing a superfast new database-connected dashboard, but I'd argue that it is where development resources should be spent today...
-- Thierry Carrez (ttx)
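To make the contrast with direct database access concrete, here is a minimal Go sketch of reading the server list through the Compute REST API (`GET /servers/detail` with an `X-Auth-Token` header) rather than through SQL. It is an illustration only, not code from openstack-admin: `listServers`, `server` and `fakeCompute` are hypothetical names, the token value is a placeholder, and a local test server replaces a real Nova endpoint so the example runs on its own.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// server mirrors two of the fields returned for each entry in a
// Compute API server listing.
type server struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// listServers fetches servers through the REST API instead of the database.
// endpoint is the compute endpoint URL and token a Keystone token
// (both supplied by the caller; obtaining them is out of scope here).
func listServers(client *http.Client, endpoint, token string) ([]server, error) {
	req, err := http.NewRequest(http.MethodGet, endpoint+"/servers/detail", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-Auth-Token", token)

	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}

	var body struct {
		Servers []server `json:"servers"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, err
	}
	return body.Servers, nil
}

// fakeCompute is a local stand-in for a compute endpoint. It rejects
// requests without a token and otherwise returns one canned server.
func fakeCompute() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-Auth-Token") == "" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, `{"servers":[{"id":"abc","name":"vm-1"}]}`)
	}))
}

func main() {
	ts := fakeCompute()
	defer ts.Close()

	servers, err := listServers(ts.Client(), ts.URL, "example-token")
	if err != nil {
		panic(err)
	}
	for _, s := range servers {
		fmt.Println(s.ID, s.Name)
	}
}
```

The client depends only on the documented response shape, so schema changes inside the service's database do not affect it, and access control stays with the service rather than with the dashboard.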
Even if we agreed that direct DB access was OK (it isn't), it is worth adding that this tool will not be backwards compatible in any way. It will always be linked to the current release DB schema. It wouldn't handle mixed-version clouds, or anything even a little out of sync, without a lot of extra complexity.
Horizon, as awful as it is at times, is great in that it is backwards compatible VERY far, and in fact we have historically run Horizon close to master while lagging behind in other services at various versions.
On 27/08/19 9:04 PM, Thierry Carrez wrote:
[...]
participants (4)
- Adrian Turjak
- Douglas Zhang
- Morgan Fainberg
- Thierry Carrez