[Openstack] [nova] Database not delete PCI info after device is removed from host and nova.conf

Eddie Yen missile0407 at gmail.com
Fri Jul 7 01:05:45 UTC 2017


Uh, wait.

Is it possible that it still shows as available because a PCI device still
exists at the same address?

When I removed the GPU card, I replaced it with an SFP+ network card in
the same slot.
So when I run lspci, the SFP+ card shows up at the same address.

But it still doesn't make sense, because these two cards definitely don't
have the same VID:PID.
And I specified the devices by VID:PID in nova.conf.
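
To make the expectation concrete, here is a minimal Python sketch of the
check described above. It is a simplified model, not nova's actual code:
the function names, the dictionary layout, and the example PCI address and
IDs are all assumptions chosen for illustration. A database row is treated
as stale when no whitelisted device (matched by VID:PID) is present at its
address, even if some other card now occupies that address.

```python
# Hypothetical, simplified model of the expected behaviour; NOT nova's code.
# All names, addresses, and IDs below are illustrative placeholders.

def matches_whitelist(dev, whitelist):
    """Return True if the device's vendor/product pair is whitelisted."""
    return any(
        dev["vendor_id"] == entry["vendor_id"]
        and dev["product_id"] == entry["product_id"]
        for entry in whitelist
    )


def stale_addresses(db_rows, host_devices, whitelist):
    """Return addresses tracked in the DB that no whitelisted host device occupies."""
    live = {
        dev["address"]
        for dev in host_devices
        if matches_whitelist(dev, whitelist)
    }
    return sorted(row["address"] for row in db_rows if row["address"] not in live)


# The whitelist and the DB row describe the GPU that was removed.
whitelist = [{"vendor_id": "10de", "product_id": "13f2"}]
db_rows = [{"address": "0000:03:00.0", "vendor_id": "10de", "product_id": "13f2"}]

# The host now reports a different card (different VID:PID) in the same slot.
host_devices = [{"address": "0000:03:00.0", "vendor_id": "8086", "product_id": "10fb"}]

print(stale_addresses(db_rows, host_devices, whitelist))
```

Under these assumptions the replaced slot is reported as stale, because the
card now sitting at that address does not match the whitelisted VID:PID.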


I'll try to reproduce this issue and post a log to this list.

Thanks,

2017-07-07 9:01 GMT+08:00 Jay Pipes <jaypipes at gmail.com>:

> Hmm, very odd indeed. Any way you can save the nova-compute logs from when
> you removed the GPU and restarted the nova-compute service and paste those
> logs to paste.openstack.org? Would be useful in tracking down this buggy
> behaviour...
>
> Best,
> -jay
>
> On 07/06/2017 08:54 PM, Eddie Yen wrote:
>
>> Hi Jay,
>>
>> The status of the "removed" GPU still shows as "Available" in the
>> pci_devices table.
>>
>> 2017-07-07 8:34 GMT+08:00 Jay Pipes <jaypipes at gmail.com>:
>>
>>
>>     Hi again, Eddie :) Answer inline...
>>
>>     On 07/06/2017 08:14 PM, Eddie Yen wrote:
>>
>>         Hi everyone,
>>
>>         I'm using OpenStack Mitaka (deployed from Fuel 9.2).
>>
>>         At present, I have two different models of GPU card installed.
>>
>>         I wrote their information into pci_alias and
>>         pci_passthrough_whitelist in nova.conf on the Controller and
>>         Compute nodes (the Compute node is the one with the GPUs installed),
>>         then restarted nova-api, nova-scheduler, and nova-compute.
>>
>>         When I checked the database, both GPUs were registered in the
>>         pci_devices table.
>>
>>         Now I have removed one of the GPUs from the compute node, removed
>>         its information from nova.conf, and restarted the services.
>>
>>         But when I check the database again, the information for the
>>         removed card still exists in the pci_devices table.
>>
>>         What can I do to fix this problem?
>>
>>
>>     So, when you removed the GPU from the compute node and restarted the
>>     nova-compute service, it *should* have noticed you had removed the
>>     GPU and marked that PCI device as deleted. At least, according to
>>     this code in the PCI manager:
>>
>>     https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L168-L183
>>
>>     Question for you: what is the value of the status field in the
>>     pci_devices table for the GPU that you removed?
>>
>>     Best,
>>     -jay
>>
>>     p.s. If you really want to get rid of that device, simply remove
>>     that record from the pci_devices table. But, again, it *should* be
>>     removed automatically...