Hi thank you for the feedback.

first of all we can wrap
> https://github.com/openstack/nova/blob/7399728e8937c31b7e92ed8fa7f9ce860c9a3b9f/nova/compute/manager.py#L2584-L2587
> in an if that check if dynmic vendor data is enabled by checking len(CONF.api.vendordata_dynamic_targets) > 1
> https://docs.openstack.org/nova/latest/configuration/config.html#api.vendordata_dynamic_targets

> second instead of doing
> {'boot_roles': ','.join(context.roles)}
>
> we can add a boot_roles_count key and use that to add pagination by adding one boot_role_<index> key per role
> that will allow you to have arbitary number of roles with one key (boot_role_0) per role
> and its very easy to detect and fall back to the boot_roles key if boot_roles_count is not presnet.

That makes sense to me. Let me update the commit.

thank you,
Masahito

-----Original Message-----
From: <smooney@redhat.com>
To: "Masahito Muroi"<masahito.muroi@lycorp.co.jp>; <openstack-discuss@lists.openstack.org>;
Cc:
Sent: 2024/08/13(火) 18:57 (GMT+09:00)
Subject: Re: [Nova][Keystone] Instance boot fail if user has many roles

On Tue, 2024-08-13 at 10:47 +0100, smooney@redhat.com wrote:
> On Tue, 2024-08-13 at 17:39 +0900, Masahito Muroi wrote:
> > Hi Nova and Keystone folks,
> >
> > I'd like to get your feedback for the bug[1] and the commit[2]. We use Keystone as company internal role management
> > system and hit the issue frequently.
> >
> > The commit is to change instance_system_metadata table schema, but it could be too big change only for the long role
> > list.
> changing the instnace_system_metadata schema form nvarchar 255 to text is not really an option.
>
> it woudl be a very large data migration for exsting user as you would be moving the sting data form bign stored inline
> in the table to being stored out of line as seperate files on disk
> that woudl signifcalty impact the performance so that approch is not viable IMO
> >
> > Another idea I can imagine is to introduce "store_boot_roles" boolean configuration to store the boot_roles or not.
> > IIUC, the boot_roles is used only by a specific vendor data in the metadata api.
> that is also not really a good option as this data is used form muplile location and reuqiring you sett the same
> config option on the compute conductor scheduler and api the same is very error prone.
>
> >
> > If there is another good value to store from Keystone side, I'm happy to use the value instead of role name, too.
> im not actully sure why we are storign the roles at all in the instance_system_metadata.
>
> the roles are not an atribute of the vm they are an atirbute of the user token used to create the vm
> as such i woudl argure we likely shoudl not be storing them at all since that data could be stale
> if we ever read it again later. since we cant rely on it to be current we shoudl not store it.
>
> however im not sure why we orginally stored this so woe woudl need to investiagte that.
ah it  was added for dynmic vendor metadata
https://github.com/openstack/nova/commit/6d8b58dc6f1cbda8d664b3487674f87049491c74
we cant just drop that without deprecation but there are a few things we can do.

first of all we can wrap
https://github.com/openstack/nova/blob/7399728e8937c31b7e92ed8fa7f9ce860c9a3b9f/nova/compute/manager.py#L2584-L2587
in an if that check if dynmic vendor data is enabled by checking len(CONF.api.vendordata_dynamic_targets) > 1
https://docs.openstack.org/nova/latest/configuration/config.html#api.vendordata_dynamic_targets

second instead of doing
{'boot_roles': ','.join(context.roles)}

we can add a boot_roles_count key and use that to add pagination by adding one boot_role_<index> key per role
that will allow you to have arbitary number of roles with one key (boot_role_0) per role
and its very easy to detect and fall back to the boot_roles key if boot_roles_count is not presnet.

that minimise the upgrade impact

we coudl also do an online data migration when the instace.system_metadata is next read or witten too.
that will avoid the large database hit on upgrade by spreadign the migration across new usage of the upgraded
code path.

> >
> > 1. https://bugs.launchpad.net/nova/+bug/2075100
> > 2. https://protect2.fireeye.com/v1/url?k=31323334-501d0a38-3133420b-454441504e31-45bfb03284bc255b&q=1&e=57e40200-28ef-405e-b7e7-e6df13c155e5&u=https%3A%2F%2Freview.opendev.org%2Fc%2Fopenstack%2Fnova%2F%2B%2F925163
>
> >
> > thank you,
> > Masahito
>