<div dir="ltr">Hi there,<div><br></div><div>Is the information I sent enough, or do you need anything else?</div><div><br></div><div>Thanks,</div><div>Eddie.</div></div><div class="gmail_extra"><br><div class="gmail_quote">2017-07-07 10:49 GMT+08:00 Eddie Yen <span dir="ltr"><<a href="mailto:missile0407@gmail.com" target="_blank">missile0407@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Sorry,<div><br></div><div>Here is the updated <span style="font-size:14px">nova-compute log after removing "1002:68c8" and restarting nova-compute.</span></div><div><a href="http://paste.openstack.org/show/qUCOX09jyeMydoYHc8Oz/" target="_blank">http://paste.openstack.org/<wbr>show/qUCOX09jyeMydoYHc8Oz/</a><br></div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">2017-07-07 10:37 GMT+08:00 Eddie Yen <span dir="ltr"><<a href="mailto:missile0407@gmail.com" target="_blank">missile0407@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Jay,<div><br></div><div>Below are a few logs and some information you may want to check.</div><div><br></div><div><br></div><div><br></div><div>I wrote the GPU information into nova.conf like this.</div><div>







<p class="m_-3499010537563185850m_2622623848353942441gmail-p1"><font face="monospace, monospace"><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">pci_passthrough_whitelist = [{ </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"product_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"0ff3"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"vendor_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"10de"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1"> }, { </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"product_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"68c8"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"vendor_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"1002"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1"> }]</span></font></p>
<p class="m_-3499010537563185850m_2622623848353942441gmail-p1"><font face="monospace, monospace"><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">pci_alias = [{ </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"product_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"0ff3"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"vendor_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"10de"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"device_type"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"type-PCI"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"name"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"k420"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1"> }, { </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"product_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"68c8"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"vendor_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"1002"</span><span 
class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"device_type"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"type-PCI"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"name"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"v4800"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1"> }]</span></font></p></div><div><br></div><div>Then I restarted the services.</div><div><br></div><div>nova-compute log after adding the new GPU device info to nova.conf and restarting the service:</div><div><a href="http://paste.openstack.org/show/z015rYGXaxYhVoafKdbx/" target="_blank">http://paste.openstack.org/sho<wbr>w/z015rYGXaxYhVoafKdbx/</a><br></div><div><br></div><div>Strangely, the log shows that the resource tracker only collects information about the newly added GPU, not the old one.</div><div><br></div><div><br></div><div>But if I perform some action on the instance that uses the old GPU, the tracker picks up both GPUs.</div><div><a href="http://paste.openstack.org/show/614658/" target="_blank">http://paste.openstack.org/sho<wbr>w/614658/</a><br></div><div><br></div><div>The Nova database shows correct information for both GPUs:</div><div><a href="http://paste.openstack.org/show/8JS0i6BMitjeBVRJTkRo/" target="_blank">http://paste.openstack.org/sho<wbr>w/8JS0i6BMitjeBVRJTkRo/</a><br></div><div><br></div><div><br></div><div><br></div><div>Next, I removed ID "1002:68c8" from nova.conf, removed the card from the compute node, and restarted the services. 
</div><div><br></div><div>pci_passthrough_whitelist and pci_alias now keep only the "10de:0ff3" GPU info.</div><div><br></div><div><p class="m_-3499010537563185850m_2622623848353942441gmail-p1"><font face="monospace, monospace"><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">pci_passthrough_whitelist = { </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"product_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"0ff3"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"vendor_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"10de"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1"> }</span></font></p><p class="m_-3499010537563185850m_2622623848353942441gmail-p1"><font face="monospace, monospace"><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">pci_alias = { </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"product_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"0ff3"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"vendor_id"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"10de"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"device_type"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span 
class="m_-3499010537563185850m_2622623848353942441gmail-s2">"type-PCI"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">, </span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"name"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1">:</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s2">"k420"</span><span class="m_-3499010537563185850m_2622623848353942441gmail-s1"> }</span></font></p></div><div><br></div><div>The nova-compute log shows that the resource tracker reports only the "10de:0ff3" PCI resource on the node:</div><div><a href="http://paste.openstack.org/show/VjLinsipne5nM8o0TYcJ/" target="_blank">http://paste.openstack.org/sho<wbr>w/VjLinsipne5nM8o0TYcJ/</a><br></div><div><br></div><div>But in the Nova database, "1002:68c8" still exists and remains in "Available" status, even though its "deleted" value is not zero.</div><div><a href="http://paste.openstack.org/show/SnJ8AzJYD6wCo7jslIc2/" target="_blank">http://paste.openstack.org/sho<wbr>w/SnJ8AzJYD6wCo7jslIc2/</a><br></div><div><br></div><div><br></div><div>Many thanks,</div><div>Eddie.</div></div><div class="m_-3499010537563185850HOEnZb"><div class="m_-3499010537563185850h5"><div class="gmail_extra"><br><div class="gmail_quote">2017-07-07 9:05 GMT+08:00 Eddie Yen <span dir="ltr"><<a href="mailto:missile0407@gmail.com" target="_blank">missile0407@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Uh wait,<div><br></div><div>Is it possible that it still shows as available because a PCI device still exists at the same address?</div><div><br></div><div>When I removed the GPU card, I replaced it with an SFP+ network card in the same slot.</div><div>So when I run lspci, the SFP+ card shows up at the same address.</div><div><br></div><div>But that still doesn't make sense, because the two cards definitely don't have the same VID:PID.</div><div>And I specified the devices by VID:PID in 
nova.conf</div><div><br></div><div><br></div><div>I'll try to reproduce this issue and post a log to this list.</div><div><br></div><div>Thanks,</div></div><div class="m_-3499010537563185850m_2622623848353942441HOEnZb"><div class="m_-3499010537563185850m_2622623848353942441h5"><div class="gmail_extra"><br><div class="gmail_quote">2017-07-07 9:01 GMT+08:00 Jay Pipes <span dir="ltr"><<a href="mailto:jaypipes@gmail.com" target="_blank">jaypipes@gmail.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hmm, very odd indeed. Any way you can save the nova-compute logs from when you removed the GPU and restarted the nova-compute service and paste those logs to <a href="http://paste.openstack.org" rel="noreferrer" target="_blank">paste.openstack.org</a>? Would be useful in tracking down this buggy behaviour...<br>
<br>
Best,<br>
-jay<span><br>
<br>
On 07/06/2017 08:54 PM, Eddie Yen wrote:<br>
</span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
Hi Jay,<br>
<br>
The status of the "removed" GPU still shows as "Available" in the pci_devices table.<br>
<br></span>
2017-07-07 8:34 GMT+08:00 Jay Pipes <<a href="mailto:jaypipes@gmail.com" target="_blank">jaypipes@gmail.com</a>>:<div><div class="m_-3499010537563185850m_2622623848353942441m_-8267789714198706726h5"><br>
<br>
    Hi again, Eddie :) Answer inline...<br>
<br>
    On 07/06/2017 08:14 PM, Eddie Yen wrote:<br>
<br>
        Hi everyone,<br>
<br>
        I'm using OpenStack Mitaka (deployed with Fuel 9.2).<br>
<br>
        I currently have two different models of GPU card installed.<br>
<br>
        I wrote their information into pci_alias and<br>
        pci_passthrough_whitelist in nova.conf on the Controller and the Compute<br>
        node (the one with the GPUs installed),<br>
        then restarted nova-api, nova-scheduler, and nova-compute.<br>
<br>
        When I checked the database, both GPUs were registered in the<br>
        pci_devices table.<br>
<br>
        Then I removed one of the GPUs from the compute node, removed its<br>
        information from nova.conf, and restarted the services.<br>
<br>
        But when I checked the database again, the record for the removed card<br>
        still existed in the pci_devices table.<br>
<br>
        How can I fix this problem?<br>
<br>
<br>
    So, when you removed the GPU from the compute node and restarted the<br>
    nova-compute service, it *should* have noticed you had removed the<br>
    GPU and marked that PCI device as deleted. At least, according to<br>
    this code in the PCI manager:<br>
<br>
    <a href="https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L168-L183" rel="noreferrer" target="_blank">https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L168-L183</a><br>
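<br>
    As a rough illustration of the behaviour described above (this is not the actual Nova code), the sketch below reconciles tracked device records against the devices the host currently reports. The TrackedDevice class, the reconcile function, the status strings, the example addresses, and the assumption that devices are matched by PCI address are all hypothetical simplifications.<br>

```python
from dataclasses import dataclass


@dataclass
class TrackedDevice:
    """Hypothetical stand-in for one tracked PCI device record."""
    address: str
    vendor_id: str
    product_id: str
    status: str = "available"


def reconcile(tracked, reported_addresses):
    """Mark tracked devices that the host no longer reports as removed.

    Assumption: devices are matched by PCI address, and devices that are
    in use (any status other than "available") are left untouched.
    """
    reported = set(reported_addresses)
    for dev in tracked:
        if dev.address not in reported and dev.status == "available":
            dev.status = "removed"
    return tracked


# Two GPUs were tracked; the addresses here are made up for illustration.
devices = [
    TrackedDevice("0000:01:00.0", "10de", "0ff3"),
    TrackedDevice("0000:02:00.0", "1002", "68c8"),
]

# After the second card is pulled, only the first address is still reported.
reconcile(devices, ["0000:01:00.0"])
print([(d.address, d.status) for d in devices])
```

    Under these assumptions, a record only stays "available" if its address is still among the reported devices, which is the expectation the question below is trying to check.<br>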
<br>
    Question for you: what is the value of the status field in the<br>
    pci_devices table for the GPU that you removed?<br>
<br>
    Best,<br>
    -jay<br>
<br>
    p.s. If you really want to get rid of that device, simply remove<br>
    that record from the pci_devices table. But, again, it *should* be<br>
    removed automatically...<br>
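<br>
    A minimal sketch of both steps (checking the status field, then removing the stale record), using an in-memory SQLite table as a stand-in. The real Nova database is a separate server with many more columns in pci_devices; the reduced schema, the example addresses, and the helper names find_devices and delete_devices are assumptions for illustration only. Back up the real database before deleting anything from it.<br>

```python
import sqlite3

# In-memory stand-in for the Nova database (assumption: reduced schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pci_devices ("
    " id INTEGER PRIMARY KEY, address TEXT, vendor_id TEXT,"
    " product_id TEXT, status TEXT, deleted INTEGER)"
)
conn.executemany(
    "INSERT INTO pci_devices (address, vendor_id, product_id, status, deleted)"
    " VALUES (?, ?, ?, ?, ?)",
    [
        ("0000:01:00.0", "10de", "0ff3", "available", 0),
        ("0000:02:00.0", "1002", "68c8", "available", 0),
    ],
)
conn.commit()


def find_devices(conn, vendor_id, product_id):
    """Return (id, address, status, deleted) rows for one VID:PID pair."""
    return conn.execute(
        "SELECT id, address, status, deleted FROM pci_devices"
        " WHERE vendor_id = ? AND product_id = ?",
        (vendor_id, product_id),
    ).fetchall()


def delete_devices(conn, vendor_id, product_id):
    """Delete all rows for one VID:PID pair and return how many were removed."""
    cur = conn.execute(
        "DELETE FROM pci_devices WHERE vendor_id = ? AND product_id = ?",
        (vendor_id, product_id),
    )
    conn.commit()
    return cur.rowcount


# Inspect the record for the removed GPU, then drop it.
print(find_devices(conn, "1002", "68c8"))
print(delete_devices(conn, "1002", "68c8"))
```

    Parameterised queries keep the VID:PID values out of the SQL text, and returning the row count makes it easy to confirm that only the intended record was removed.<br>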
<br>
    _______________________________________________<br>
    Mailing list: <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack" rel="noreferrer" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack</a><br>
    Post to     : <a href="mailto:openstack@lists.openstack.org" target="_blank">openstack@lists.openstack.org</a><br></div></div>
    Unsubscribe : <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack" rel="noreferrer" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack</a><br>
<br>
<br>
</blockquote>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>