[openstack-dev] [ironic] The scenario for rolling upgrades of Ironic

Tan, Lin lin.tan at intel.com
Wed Oct 14 08:44:08 UTC 2015


Hi guys,

I am looking at https://bugs.launchpad.net/ironic/+bug/1502903, which is related to rolling upgrades, and at Jim's patch https://review.openstack.org/#/c/234450
I have a concern, or rather a question, about how Ironic should do rolling upgrades. I may be mistaken, but I would like to discuss it here and get some feedback.

I have manually done a rolling upgrade of a private OpenStack cloud before. There are three main tasks in an upgrade:
1. upgrade the service code.
2. change the configuration.
3. upgrade the DB schema, which is the most difficult and time-consuming part.

The current rolling upgrade (or live upgrade) solutions depend heavily on upgrading the different services in place, one by one, while making sure that new service A can still communicate with old service B.
The ideal case is that after we upgrade one of the services, the others still work without breaking.
This can be done by using versioned objects and RPC versioning. For example, new nova-api and new nova-conductor can talk to old nova-compute.
In the case of the Nova services, it was suggested to follow the steps below:
1. expand DB schema
2. pin RPC versions and object versions at the current release
3. upgrade all nova-conductor servers because it will talk with DB
4. upgrade all nova services on controller nodes like nova-api
5. upgrade all nova-compute nodes
6. unpin RPC versions
7. shrink DB schema.
This is perfect for Nova, because it has many nova-compute nodes and only a few nova-conductor and nova-api nodes. It is not necessary to upgrade all nova-compute services at the same time, which would be time-consuming.
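
To illustrate what "pinning" means in steps 2 and 6, here is a minimal sketch in plain Python. It is not the actual Nova or oslo.messaging code; the function names (parse_version, can_send, backport) and the field names are hypothetical and chosen only for illustration. The idea is that an upgraded service refuses to send anything newer than the pinned version, and strips fields the old service does not know about.

```python
def parse_version(version):
    """Turn a 'major.minor' string into a comparable tuple of ints."""
    major, minor = version.split(".")
    return int(major), int(minor)


def can_send(message_version, pinned_version):
    """Return True if a message at message_version is allowed under the pin.

    Assumption for this sketch: compatible means the same major version
    and a minor version no higher than the pinned one.
    """
    msg_major, msg_minor = parse_version(message_version)
    pin_major, pin_minor = parse_version(pinned_version)
    return msg_major == pin_major and msg_minor <= pin_minor


def backport(obj_fields, known_fields):
    """Drop fields that the old (pinned) object version does not know."""
    return {k: v for k, v in obj_fields.items() if k in known_fields}


if __name__ == "__main__":
    # New service pinned at 1.31: it must not use the 1.32 call yet.
    print(can_send("1.30", "1.31"))
    print(can_send("1.32", "1.31"))
    print(backport({"uuid": "abc", "new_field": 1}, {"uuid"}))
```

Once every service has been upgraded (step 5), the pin is removed (step 6) and the newer message and object versions can be used.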

For Ironic, we only have ir-conductor and ir-api. So the question is: should we upgrade ir-conductor or ir-api first?
In my opinion, the ideal case is that old and new ir-conductors can coexist, which means we should upgrade ir-api to the latest version first. But that is impossible at the moment, because ir-conductor talks to the DB directly and we only have one DB schema. That is a big difference between Ironic and Nova: we are missing a layer like nova-conductor.
The second case is to upgrade the ir-conductors first. That means that if we upgrade the DB schema, we have to upgrade all ir-conductors at once. During the upgrade, we could not provide the Ironic service at all.

So I would suggest stopping all Ironic services, upgrading ir-api first, and then upgrading the ir-conductors one by one, enabling only the ir-conductors that have finished the upgrade. Alternatively, we could upgrade ir-api and all ir-conductors at once, although that sounds a little bit stupid.
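
The first suggestion can be sketched as a small simulation. This is only a toy model under my own assumptions, not Ironic code: the Conductor class, the host names and the version strings are all hypothetical. It shows that capacity comes back gradually, because each conductor is enabled as soon as it has been upgraded.

```python
from dataclasses import dataclass


@dataclass
class Conductor:
    host: str
    version: str
    enabled: bool = False


def upgrade_one_by_one(conductors, target_version):
    """Yield the hosts that are enabled after each conductor is upgraded."""
    # Stop everything first, so no old conductor touches the new schema.
    for conductor in conductors:
        conductor.enabled = False
    # Upgrade one conductor at a time and enable it right away.
    for conductor in conductors:
        conductor.version = target_version
        conductor.enabled = True
        yield [c.host for c in conductors if c.enabled]


if __name__ == "__main__":
    hosts = [Conductor("cond-1", "old", True), Conductor("cond-2", "old", True)]
    for enabled in upgrade_one_by_one(hosts, "new"):
        print(enabled)
```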

What do you guys think?


Best Regards,

Tan



