[Openstack] [nova] Database does not delete PCI info after device is removed from host and nova.conf

Eddie Yen missile0407 at gmail.com
Mon Jul 10 00:36:42 UTC 2017


Hi there,

Is the information above already enough, or do you need additional items?

Thanks,
Eddie.

2017-07-07 10:49 GMT+08:00 Eddie Yen <missile0407 at gmail.com>:

> Sorry,
>
> Here is a new nova-compute log, taken after removing "1002:68c8" and
> restarting nova-compute.
> http://paste.openstack.org/show/qUCOX09jyeMydoYHc8Oz/
>
> 2017-07-07 10:37 GMT+08:00 Eddie Yen <missile0407 at gmail.com>:
>
>> Hi Jay,
>>
>> Below are a few logs and some information you may want to check.
>>
>>
>>
>> I wrote the GPU information into nova.conf like this:
>>
>> pci_passthrough_whitelist = [{ "product_id":"0ff3", "vendor_id":"10de" }, { "product_id":"68c8", "vendor_id":"1002" }]
>>
>> pci_alias = [{ "product_id":"0ff3", "vendor_id":"10de", "device_type":"type-PCI", "name":"k420" }, { "product_id":"68c8", "vendor_id":"1002", "device_type":"type-PCI", "name":"v4800" }]
>>
>> Then restart the services.
>>
>> nova-compute log after inserting the new GPU device info into nova.conf and
>> restarting the service:
>> http://paste.openstack.org/show/z015rYGXaxYhVoafKdbx/
>>
>> Strangely, the log shows that the resource tracker only collects information
>> for the newly added GPU, not the old one.
>>
>>
>> But if I perform some actions on the instance that contains the old GPU, the
>> tracker picks up both GPUs.
>> http://paste.openstack.org/show/614658/
>>
>> The Nova database shows correct information for both GPUs:
>> http://paste.openstack.org/show/8JS0i6BMitjeBVRJTkRo/
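>>
>> (Roughly the query I ran against the nova database to double-check; the
>> compute_node_id value "1" is just an example from my setup:)
>>
>>     SELECT address, vendor_id, product_id, status, deleted
>>     FROM pci_devices
>>     WHERE compute_node_id = 1;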
>>
>>
>>
>> Now I removed device "1002:68c8" from nova.conf and from the compute node,
>> and restarted the services.
>>
>> pci_passthrough_whitelist and pci_alias now only keep the "10de:0ff3" GPU
>> info.
>>
>> pci_passthrough_whitelist = { "product_id":"0ff3", "vendor_id":"10de" }
>>
>> pci_alias = { "product_id":"0ff3", "vendor_id":"10de", "device_type":"type-PCI", "name":"k420" }
>>
>> The nova-compute log shows the resource tracker reporting that the node only
>> has the "10de:0ff3" PCI resource:
>> http://paste.openstack.org/show/VjLinsipne5nM8o0TYcJ/
>>
>> But in the Nova database, "1002:68c8" still exists and stays in "Available"
>> status, even though its "deleted" value is non-zero:
>> http://paste.openstack.org/show/SnJ8AzJYD6wCo7jslIc2/
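>>
>> (The check I used for that record, more or less; add a compute_node_id
>> filter as well if more than one host carries this card:)
>>
>>     SELECT address, vendor_id, product_id, status, deleted, deleted_at
>>     FROM pci_devices
>>     WHERE vendor_id = '1002' AND product_id = '68c8';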
>>
>>
>> Many thanks,
>> Eddie.
>>
>> 2017-07-07 9:05 GMT+08:00 Eddie Yen <missile0407 at gmail.com>:
>>
>>> Uh wait,
>>>
>>> Is it possible that it still shows "Available" because a PCI device still
>>> exists at the same address?
>>>
>>> Because when I removed the GPU card, I replaced it with an SFP+ network card
>>> in the same slot.
>>> So when I run lspci, the SFP+ card shows up at the same address.
>>>
>>> But it still doesn't make sense, because these two cards definitely don't
>>> have the same VID:PID, and I set the information by VID:PID in nova.conf.
>>>
>>>
>>> I'll try to reproduce this issue and post a log to this list.
>>>
>>> Thanks,
>>>
>>> 2017-07-07 9:01 GMT+08:00 Jay Pipes <jaypipes at gmail.com>:
>>>
>>>> Hmm, very odd indeed. Any way you can save the nova-compute logs from
>>>> when you removed the GPU and restarted the nova-compute service and paste
>>>> those logs to paste.openstack.org? Would be useful in tracking down
>>>> this buggy behaviour...
>>>>
>>>> Best,
>>>> -jay
>>>>
>>>> On 07/06/2017 08:54 PM, Eddie Yen wrote:
>>>>
>>>>> Hi Jay,
>>>>>
>>>>> The status of the "removed" GPU still shows as "Available" in the
>>>>> pci_devices table.
>>>>>
>>>>> 2017-07-07 8:34 GMT+08:00 Jay Pipes <jaypipes at gmail.com>:
>>>>>
>>>>>
>>>>>     Hi again, Eddie :) Answer inline...
>>>>>
>>>>>     On 07/06/2017 08:14 PM, Eddie Yen wrote:
>>>>>
>>>>>         Hi everyone,
>>>>>
>>>>>         I'm using the OpenStack Mitaka version (deployed from Fuel 9.2).
>>>>>
>>>>>         At present, I have two GPU cards of different models installed.
>>>>>
>>>>>         I wrote their information into pci_alias and
>>>>>         pci_passthrough_whitelist in nova.conf on the Controller and the
>>>>>         Compute node (the node with the GPUs installed),
>>>>>         then restarted nova-api, nova-scheduler, and nova-compute.
>>>>>
>>>>>         When I checked the database, both GPUs were registered in the
>>>>>         pci_devices table.
>>>>>
>>>>>         Now I removed one of the GPUs from the compute node, removed its
>>>>>         information from nova.conf, and restarted the services.
>>>>>
>>>>>         But when I check the database again, the information for the
>>>>>         removed card still exists in the pci_devices table.
>>>>>
>>>>>         What can I do to fix this problem?
>>>>>
>>>>>
>>>>>     So, when you removed the GPU from the compute node and restarted
>>>>> the
>>>>>     nova-compute service, it *should* have noticed you had removed the
>>>>>     GPU and marked that PCI device as deleted. At least, according to
>>>>>     this code in the PCI manager:
>>>>>
>>>>>     https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L168-L183
>>>>>
>>>>>     Question for you: what is the value of the status field in the
>>>>>     pci_devices table for the GPU that you removed?
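>>>>>
>>>>>     Something along these lines should show it (just a sketch against the
>>>>>     nova database; substitute your card's vendor and product IDs):
>>>>>
>>>>>         SELECT address, status, deleted
>>>>>         FROM pci_devices
>>>>>         WHERE vendor_id = '<vendor_id>' AND product_id = '<product_id>';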
>>>>>
>>>>>     Best,
>>>>>     -jay
>>>>>
>>>>>     p.s. If you really want to get rid of that device, simply remove
>>>>>     that record from the pci_devices table. But, again, it *should* be
>>>>>     removed automatically...
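>>>>>
>>>>>     For example, something like this against the nova database (back it
>>>>>     up first, and add a compute_node_id filter if other hosts carry the
>>>>>     same model):
>>>>>
>>>>>         DELETE FROM pci_devices
>>>>>         WHERE vendor_id = '<vendor_id>' AND product_id = '<product_id>';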
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>
>