[openstack-dev] [Magnum] API service won't work if conductor down?
王华
wanghua.humble at gmail.com
Mon Feb 29 01:38:09 UTC 2016
Hi Corey,
What are the steps for a zero downtime upgrade? Could you explain them in
detail, or point me to any docs?
Best Regards
Wanghua
On Sat, Feb 27, 2016 at 9:26 PM, Corey O'Brien <coreypobrien at gmail.com>
wrote:
> We discussed this at the midcycle as a part of zero downtime upgrades. In
> order to be able to do Magnum upgrades, we talked about directing all DB
> access through a service that understands the schema for both the current
> and previous versions of the database. Moving direct DB access into the API
> will make it harder to do zero downtime upgrades in the future. I think we
> should leave things the way they are today.
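
To illustrate the point about a single service that understands both schema versions, here is a minimal sketch. It is not Magnum code: the row layout, the `labels` column, and the `normalize_bay_row` helper are hypothetical, chosen only to show how one DB-facing layer could accept rows written under either the previous or the current schema.

```python
def normalize_bay_row(row):
    """Return a bay dict in the current shape from an old- or new-schema row.

    Hypothetical example: assume the new schema adds a "labels" column
    that rows written under the previous schema do not have.
    """
    bay = {"uuid": row["uuid"], "name": row["name"]}
    # Rows from the previous schema lack "labels"; fall back to an empty dict.
    bay["labels"] = dict(row.get("labels") or {})
    return bay


old_row = {"uuid": "b1", "name": "bay-old"}
new_row = {"uuid": "b2", "name": "bay-new", "labels": {"env": "test"}}
bays = [normalize_bay_row(r) for r in (old_row, new_row)]
```

If only one service performs this translation, callers such as the API never need to know which schema version a row came from.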
>
> Corey
>
> On Fri, Feb 26, 2016, 22:08 王华 <wanghua.humble at gmail.com> wrote:
>
>> Hi all,
>>
>> I want to allow magnum-api to access the DB directly, so that magnum-api can
>> keep working even if magnum-conductor is down.
>>
>> Best Regards,
>> Wanghua
>>
>> On Thu, Feb 4, 2016 at 3:42 PM, 王华 <wanghua.humble at gmail.com> wrote:
>>
>>> I think we should allow magnum-api to access the DB directly, like nova-api.
>>>
>>> As described in [1], nova may have many compute nodes, and upgrading them
>>> may take anywhere from an hour to a month. But the number of magnum-api and
>>> magnum-conductor instances is limited, so upgrading them is fast. They don't
>>> benefit from that method. We should treat them like the control services
>>> in nova and upgrade them together.
>>>
>>> In this step, you will upgrade everything but the compute nodes. This
>>> means nova-api, nova-scheduler, nova-conductor, nova-consoleauth,
>>> nova-network, and nova-cert. In reality, this needs to be done fairly
>>> atomically. So, shut down all of the affected services, roll the new code,
>>> and start them back up. This will result in some downtime for your API, but
>>> in reality, it should be easy to quickly perform the swap. In later
>>> releases, we’ll reduce the pain felt here by eliminating the need for the
>>> control services to go together.
>>>
>>> [1]
>>> http://www.danplanet.com/blog/2015/06/26/upgrading-nova-to-kilo-with-minimal-downtime/
>>>
>>>
>>> On Thu, Feb 4, 2016 at 4:59 AM, Hongbin Lu <hongbin.lu at huawei.com>
>>> wrote:
>>>
>>>> I can clarify Eli’s question further.
>>>>
>>>>
>>>>
>>>> 1) Is it by design that we don't allow magnum-api to access the DB
>>>> directly?
>>>>
>>>> Yes, that is by design. magnum-api used to be allowed to access the DB
>>>> directly. Since the indirection API patch landed [1], magnum-api has used
>>>> magnum-conductor as a proxy to access the DB. According to input from the
>>>> oslo team, this design allows operators to take down either magnum-api or
>>>> magnum-conductor to upgrade. This is not the same as nova-api, because
>>>> nova-api, nova-scheduler, and nova-conductor are assumed to be shut down
>>>> together as an atomic unit.
>>>>
>>>>
>>>>
>>>> I think we should make our own decision here. If we can pair magnum-api
>>>> with magnum-conductor as a unit, we can remove the indirection API and
>>>> allow both binaries to access the DB. This could mitigate the potential
>>>> performance bottleneck of the message queue. On the other hand, if we stay
>>>> with the current design, magnum-api and magnum-conductor can scale
>>>> independently. Thoughts?
>>>>
>>>>
>>>>
>>>> [1] https://review.openstack.org/#/c/184791/
>>>>
>>>>
>>>>
>>>> Best regards,
>>>>
>>>> Hongbin
>>>>
>>>>
>>>>
>>>> *From:* Kumari, Madhuri [mailto:madhuri.kumari at intel.com]
>>>> *Sent:* February-03-16 10:57 AM
>>>> *To:* OpenStack Development Mailing List (not for usage questions)
>>>> *Subject:* Re: [openstack-dev] [Magnum] API service won't work if
>>>> conductor down?
>>>>
>>>>
>>>>
>>>> Corey, the one you are talking about has changed to coe-service-*.
>>>>
>>>>
>>>>
>>>> Eli, IMO we should display a proper error message. The m-api service should
>>>> only have read permission.
>>>>
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Madhuri
>>>>
>>>>
>>>>
>>>> *From:* Corey O'Brien [mailto:coreypobrien at gmail.com
>>>> <coreypobrien at gmail.com>]
>>>> *Sent:* Wednesday, February 3, 2016 6:50 PM
>>>> *To:* OpenStack Development Mailing List (not for usage questions) <
>>>> openstack-dev at lists.openstack.org>
>>>> *Subject:* Re: [openstack-dev] [Magnum] API service won't work if
>>>> conductor down?
>>>>
>>>>
>>>>
>>>> The service-* commands aren't related to the magnum services (e.g.
>>>> magnum-conductor). The service-* commands are for services on the bay that
>>>> the user creates and deletes.
>>>>
>>>>
>>>>
>>>> Corey
>>>>
>>>>
>>>>
>>>> On Wed, Feb 3, 2016 at 2:25 AM Eli Qiao <liyong.qiao at intel.com> wrote:
>>>>
>>>> Hi,
>>>> When I try to run magnum service-list to list all services (it seems we
>>>> currently only have the m-cond service), if m-cond is down (which means
>>>> there is no conductor at all), the API won't respond and will return a
>>>> timeout error.
>>>>
>>>> taget at taget-ThinkStation-P300:~/devstack$ magnum service-list
>>>> ERROR: Timed out waiting for a reply to message ID
>>>> fd1e9529f60f42bf8db903bbf75bbade (HTTP 500)
>>>>
>>>> I debugged further and compared with nova service-list: nova does give a
>>>> response and reports that the conductor is down.
>>>>
>>>> Digging deeper, I found this in the magnum-api startup code:
>>>>
>>>>
>>>>     # Enable object backporting via the conductor
>>>>     base.MagnumObject.indirection_api = base.MagnumObjectIndirectionAPI()
>>>>
>>>> so in the magnum_service API code,
>>>>
>>>>     return objects.MagnumService.list(context, limit, marker,
>>>>                                       sort_key, sort_dir)
>>>>
>>>> has to go through magnum-conductor to access the DB. With no
>>>> magnum-conductor running at all, we get a 500 error.
>>>> (nova-api doesn't set indirection_api, so nova-api can access the DB.)
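
The behaviour described above can be sketched in a few lines. This is a simplified model, not the oslo.versionedobjects or Magnum implementation: `Service`, `FakeIndirectionAPI`, `ConductorTimeout`, and `FAKE_DB` are hypothetical stand-ins. It only shows why setting `indirection_api` makes every object call depend on a live conductor, while leaving it unset lets the same call reach the DB directly.

```python
FAKE_DB = [{"binary": "magnum-conductor", "state": "down"}]


class ConductorTimeout(Exception):
    """Stand-in for the RPC timeout raised when no conductor replies."""


class FakeIndirectionAPI:
    """Hypothetical proxy that forwards object calls to a conductor."""

    def __init__(self, conductor_alive):
        self.conductor_alive = conductor_alive

    def call(self, cls, method_name):
        if not self.conductor_alive:
            # Nobody consumes the request from the queue, so the caller times out.
            raise ConductorTimeout("Timed out waiting for a reply")
        # A live conductor runs the DB-facing method on the caller's behalf.
        return getattr(cls, method_name)()


class Service:
    # None means "talk to the DB directly" (the nova-api case in this thread).
    indirection_api = None

    @classmethod
    def _list_from_db(cls):
        return [dict(row) for row in FAKE_DB]

    @classmethod
    def list_services(cls):
        if cls.indirection_api is not None:
            return cls.indirection_api.call(cls, "_list_from_db")
        return cls._list_from_db()


# With the indirection API set and no conductor, the read fails.
Service.indirection_api = FakeIndirectionAPI(conductor_alive=False)
try:
    Service.list_services()
except ConductorTimeout as exc:
    print("API call failed:", exc)
```

Setting `Service.indirection_api = None` in this sketch corresponds to commenting out the assignment in magnum-api: reads then succeed without a conductor.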
>>>>
>>>> My questions are:
>>>>
>>>> 1) Is it by design that we don't allow magnum-api to access the DB
>>>> directly?
>>>> 2) If 1) is by design, then `magnum service-list` won't work, and the
>>>> error message should be improved to something like "magnum service is
>>>> down, please check that magnum-conductor is alive".
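
One way the improved error message in 2) could look is sketched below. This is an illustration under assumptions, not a patch against Magnum: `MessagingTimeout` is a local stand-in for the timeout the RPC layer raises, and `ServiceUnavailable`, its 503 status code, and `list_services_or_explain` are hypothetical names.

```python
class MessagingTimeout(Exception):
    """Stand-in for the timeout raised when no RPC reply arrives."""


class ServiceUnavailable(Exception):
    """Hypothetical API error; 503 is an assumed choice of status code."""

    code = 503


def list_services_or_explain(list_fn):
    """Call list_fn and turn an RPC timeout into an actionable error."""
    try:
        return list_fn()
    except MessagingTimeout as exc:
        raise ServiceUnavailable(
            "magnum-conductor did not respond; "
            "please check that magnum-conductor is running"
        ) from exc


def conductor_is_down():
    # Simulates the API call made while no conductor is running.
    raise MessagingTimeout("Timed out waiting for a reply")


try:
    list_services_or_explain(conductor_is_down)
except ServiceUnavailable as exc:
    print("HTTP %d: %s" % (exc.code, exc))
```

Chaining with `from exc` keeps the original timeout available for logs while the user sees the clearer message.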
>>>>
>>>> What do you think?
>>>>
>>>> P.S. I tested commenting out this line:
>>>>     # base.MagnumObject.indirection_api = base.MagnumObjectIndirectionAPI()
>>>> magnum-api then responds but fails to create a bay, which means the API
>>>> service has read access but cannot write at all (since all DB writes
>>>> happen in the conductor layer).
>>>>
>>>> --
>>>>
>>>> Best Regards, Eli(Li Yong)Qiao
>>>>
>>>> Intel OTC China
>>>>
>>>>
>>>> __________________________________________________________________________
>>>> OpenStack Development Mailing List (not for usage questions)
>>>> Unsubscribe:
>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>
>