[nova][tripleo][rpm-packaging][kolla][puppet][debian][osa] Nova enforces that no DB credentials are allowed for the nova-compute service

Sean Mooney smooney at redhat.com
Mon Nov 23 18:21:22 UTC 2020


On Mon, 2020-11-23 at 18:34 +0100, Thomas Goirand wrote:
> Hi Sean,
> 
> Thanks for your post.
> 
> On 11/23/20 2:32 PM, Sean Mooney wrote:
> > nova need to enforce it as we use the absense or present of the db creads to know
> > if common code is running in the compute agent or in controller services
> 
> I'm a bit shocked to read what I've read in this thread, about the
> double-guess that Nova is doing.
> 
> The way the Nova team has been describing, this really looks like hacks
> to hide the internals of Nova, and what the team is asking, is more
> band-aid for it. Probably things have been done this way to make it
> easier for the users, but at this point, it feels like we shouldn't
> attempt to hide facts anymore, and try to have everything explicit,
> which is going the opposite way of what you describe.
> 
> Why don't we go the other way around, and get things like a
> superconductor=true configuration directive, for example?
this would break upgrades for everyone one of our upgrade constratins
is you should never have to update your config as part of an upgrade.
where config changes are needed we always give you at least 1 cycle to update
your configs as part of a seperate maintance window.
it would also force all deployment tools and packages to chagne as we would have
new portentially incompatible options in config files.

spcificlly packagers would have to ship different nova config file for supper conductors and cell conductors for that usecase.


> 
> On 11/23/20 2:32 PM, Sean Mooney wrote:
> > it is a bug to have the db cred in the set fo configs passed to
> > nova-comptue and it has been for years.
> 
> In such case, detect that it's the nova-compute agent that's running,
> detect that it has access to db creds, and either:
> - display a big warning (I would prefer that)
> - display an error and quit (maybe bad in the case of an all-in-one setup)
> 
> This is orthogonal to the fact that Nova code is doing a hack (which
> should be fixed) to check which daemon is currently running in what mode.
it is not a hack, checking the binary name would be a hack as that is common code
that shoudl be independnt of what name you give the file on disk.


this was the where it would have failed from rocky on
https://github.com/openstack/nova/blob/dc93e3b510f53d5b2198c8edd22528f0c899617e/nova/compute/rpcapi.py#L416-L421
if you used upgrade level auto but set the api db creds in teh config of the compute agent
https://github.com/openstack/nova/blob/dc93e3b510f53d5b2198c8edd22528f0c899617e/nova/compute/rpcapi.py#L441-L461

if the api db options are set we get the min servie verion across all cells via a direct api call

    service_version = service_obj.get_minimum_version_all_cells(
                    context.get_admin_context(), ['nova-compute'])

and if they are not we do an indirect cal via the cell conductor to get the min compute service version in the local cell
not that it is a remoteable method https://github.com/openstack/nova/blob/dc93e3b510f53d5b2198c8edd22528f0c899617e/nova/objects/service.py#L516-L519
so when invoked form the compute agent it is converted to an rpc to the conductor to execute the lookup.
get_minimum_version_all_cells is not remoteable since calling it form a  compute node would result in an upcall to iterate over all the top level
cells
https://github.com/openstack/nova/blob/dc93e3b510f53d5b2198c8edd22528f0c899617e/nova/objects/service.py#L522

it was part of moving to cell v2 https://github.com/openstack/nova/commit/d779b33d72d6ef41651ce7b93fe982f121bae2d7
resolving https://bugs.launchpad.net/nova/+bug/1807044

the reason for the new failure is 
that we now uncontionally check the min compute node version using the same logic that we used before.
https://github.com/openstack/nova/blob/dc93e3b510f53d5b2198c8edd22528f0c899617e/nova/utils.py#L1057-L1087

  if CONF.api_database.connection is not None:
        scope = 'system'
        current_service_version = service.get_minimum_version_all_cells(
            ctxt, ['nova-compute'])
    else:
        scope = 'cell'
        # We in a cell so target our query to the current cell only
        current_service_version = service.Service.get_minimum_version(
            ctxt, 'nova-compute')

again using the presne of the db configurtion as a sentinal for is this code allowed to query the db directly.

the patch that is under review to convert this to a warning just catches the excption for teh blocked direct
db acess and logs the warning then execute the else branch

https://review.opendev.org/c/openstack/nova/+/762175/1/nova/utils.py#1073


it is not a hack to rely on something that we have depened on for 2 years and we have stated for many more
namely that the compute  agent may not have the db options set.


> 
> On 11/23/20 2:32 PM, Sean Mooney wrote:
> > we could make this just a ERROR log without the hard fail but that
> > would still not change the fact there is a bug in packages or
> > deployment tools that should be fixed.
> 
> Probably. But that shouldn't be upstream author's business on how things
> are deployed. IMO, in the case of an all-in-one, nova-compute should
> continue to work and just ignore the db params, and at worse display a
> huge warning on the logs.
well there i dissagree the warning/error message is the bare minium we should do 
and at most we should allow ti to contiue executing and optimally it should stop the service sicne we know
the config is invalid.
> 
> With the light of this thread, my opinion now has shifted to *not* have
> special files for the db credential, to give Nova a chance to tell the
> users what to do if nova-compute detects a mistake.
> 
> If we push the creds in /etc/nova/nova-db.conf, it wont be loaded by
> nova-compute, and it wont be able to warn the user that the file
> shouldn't be there on a compute node.
> 
that is true we cant warn for a bug in the deployment too or in your packages but that is not something 
nova should have to do.
we can warn when you invoke us with incorreect import but we should not have to check for other mistakes
you have made.
>  Checking for the file existence
> only would be wrong (because it could have empty values and just be
> there ... just because it's there! :) ).
> 
> Hoping sharing my view is constructive and adding value to the thread,
> Cheers,

it is and just bacuase i dont agree with your view point does not mean i dont want to hear you express
it as it chanlagnes me and other to constantly revialuate our postion based on the different persepctive shared :)
> 
> Thomas Goirand (zigo)
> 





More information about the openstack-discuss mailing list