<div class="zcontentRow"><div><div class="zhistoryRow" style="display:block"><div id="zwriteHistoryContainer"><div class="control-group zhistoryPanel"><div class="zhistoryHeader" style="padding: 8px; background-color: #F5F6F8;"><div>Hi all,<br></div></div><div class="zhistoryContent"><div><br>I was looking over the NovaClusterDataModelCollector code today and <br>trying to learn more about how Watcher builds the nova CDM (and when) <br>and got digging into this change from Stein [1] where I noted what <br>appear to be several issues. I'd like to enumerate a few of those issues <br>here and then figure out how to proceed.<br><br>1. In general, a lot of this code for building the compute node model is <br>based on at least using the 2.53 microversion (Pike) in nova where the <br>hypervisor.id is a UUID - this is actually necessary for a multi-cell <br>environment like CERN. The nova_client.api_version config option already <br>defaults to 2.56 which was in Queens. I'm not sure what the <br>compatibility matrix looks like for Watcher, but would it be possible <br>for us to say that Watcher requires nova at least at Queens level API <br>(so nova_client.api_version >= 2.60), and add a release note and a <br>"watcher-status upgrade check" if necessary? This might make things a <br>bit cleaner in the nova CDM code to know we can rely on a given minimum <br>version.<br><p>[licanwei]: We set the default nova API version to 2.56, </p><p>but it would be better to add a release note.</p><p><br></p>2. I had a question about when the nova CDM gets built now [2]. It looks <br>like the nova CDM only gets built when there is an audit? But I thought <br>the CDM was supposed to get built on start of the decision-engine <br>service and then refreshed every hour (by default) on a periodic task or <br>as notifications are processed that change the model. Does this mean the <br>nova CDM is rebuilt fresh whenever there is an audit even if the audit <br>is not scoped? 
If so, isn't that potentially inefficient (and an <br>unnecessary load on the compute API every time an audit runs)?<br><p>[licanwei]: Yes, the CDM is built when the first audit is created,</p><p>and it is not rebuilt if the next new audit has the same scope.</p><p><br></p>3. The host_aggregates and availability_zone compute audit scopes don't <br>appear to be documented in the docs or the API reference, just the spec <br>[3]. Should I open a docs bug about what the supported audit scopes are <br>and how they work (it looks like the host_aggregates scope works for <br>aggregate ids or names and availability_zone scope works for AZ names)?<br><p>[licanwei]: There is an example in the CLI command 'watcher help create audittemplate',</p><p>and it's a good idea to document these.</p><p><br></p>4. There are a couple of issues with how the unscoped compute nodes are <br>retrieved from nova [4].<br><br>a) With microversion 2.33 there is a server-side configurable limit <br>applied when listing hypervisors (defaults to 1000). In a large cloud <br>this could be a problem since the Watcher client-side code is not paging.<br><br>b) The code is listing hypervisors with details, but then throwing away <br>those details to just get the hypervisor_hostname, then iterating over <br>each of those node names and getting the details per hypervisor again. I <br>see why this is done because of the scoped vs unscoped cases, but we <br>could still optimize this I think (we might need some changes to <br>python-novaclient for this though, which should be easy enough to add).<br><p>[licanwei]: Yes, if novaclient can make some changes, we can optimize the code.</p><p><br></p>5. For each server on a node, we get the details of the server in <br>separate API calls to nova [5]. 
Why can't we just do a GET <br>/servers/detail and filter on "host" or "node" so it's a single API call <br>to nova per hypervisor?<br><p>[licanwei]: This also depends on novaclient.</p><p><br></p>I'm happy to work on any of this but if there are any reasons things <br>need to be done this way please let me know before I get started. Also, <br>how would the core team like these kinds of improvements tracked? With bugs?<br><p>[licanwei]: Contributions to improve Watcher are welcome. Whether they are tracked as bugs or in some other way is not important.</p><p><br></p>[1] https://review.opendev.org/#/c/640585/<br>[2] <br>https://review.opendev.org/#/c/640585/10/watcher/decision_engine/model/collector/nova.py@181<br>[3] <br>https://specs.openstack.org/openstack/watcher-specs/specs/stein/implemented/scope-for-watcher-datamodel.html<br>[4] <br>https://review.opendev.org/#/c/640585/10/watcher/decision_engine/model/collector/nova.py@257<br>[5] <br>https://review.opendev.org/#/c/640585/10/watcher/decision_engine/model/collector/nova.py@399<br><br>-- <br><br>Thanks,<br><br>Matt<br></div></div></div></div></div></div><p><br></p></div>
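<p>To illustrate the paging concern in item 4a, here is a minimal sketch of a client-side loop that keeps requesting pages until the server returns an empty one, so a server-side cap (such as the 1000 default mentioned above) cannot silently truncate the result. Everything named here is hypothetical and not taken from the Watcher code: <code>iter_all_pages</code>, the <code>list_page(marker, limit)</code> callable and the in-memory fake lister are illustration only. In real code the callable would presumably wrap a marker/limit hypervisor listing in novaclient, which is an assumption rather than something confirmed in this thread.</p>

```python
from typing import Callable, Iterator, List, Optional


def iter_all_pages(
    list_page: Callable[[Optional[str], int], List[dict]],
    page_size: int = 1000,
) -> Iterator[dict]:
    """Yield every item from a marker/limit paginated listing.

    list_page is a hypothetical callable taking (marker, limit) and
    returning one page of dicts that each carry an "id" key.
    """
    marker = None
    while True:
        page = list_page(marker, page_size)
        # Stop only on an empty page: the server may return fewer items
        # than requested because of its own configured maximum, so a
        # short page does not prove we reached the end.
        if not page:
            return
        yield from page
        marker = page[-1]["id"]


def make_fake_lister(items: List[dict], server_max: int):
    """Hypothetical stand-in for the compute API with a server-side cap."""
    ids = [item["id"] for item in items]

    def list_page(marker: Optional[str], limit: int) -> List[dict]:
        start = 0 if marker is None else ids.index(marker) + 1
        return items[start:start + min(limit, server_max)]

    return list_page


# 25 fake hypervisors, but the fake server never returns more than 10.
hypervisors = [
    {"id": f"uuid-{n}", "hypervisor_hostname": f"node{n}"} for n in range(25)
]
lister = make_fake_lister(hypervisors, server_max=10)
names = [h["hypervisor_hostname"] for h in iter_all_pages(lister)]
```

<p>The loop costs one extra request at the end (the empty page) in exchange for not having to know the server's configured limit.</p>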