<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Feb 21, 2019 at 6:47 PM Graham Hayes <<a href="mailto:gr@ham.ie">gr@ham.ie</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
<br>
On 21/02/2019 17:28, Sylvain Bauza wrote:<br>
> <br>
> <br>
> On Thu, Feb 21, 2019 at 6:14 PM Graham Hayes <<a href="mailto:gr@ham.ie" target="_blank">gr@ham.ie</a><br>
> <mailto:<a href="mailto:gr@ham.ie" target="_blank">gr@ham.ie</a>>> wrote:<br>
> <br>
<br>
<snip><br>
<br>
> <br>
> > * If you had a magic wand and could inspire and make a single<br>
> > sweeping architectural or software change across the services,<br>
> > what would it be? For now, ignore legacy or upgrade concerns.<br>
> > What role should the TC have in inspiring and driving such<br>
> > changes?<br>
> <br>
> 1: A single agent on each compute node that allows plugins to do<br>
>     all the work required (Nova / Neutron / Vitrage / Watcher / etc.)<br>
> <br>
> 2: Remove RMQ where it makes sense - e.g. nova-api -> nova-compute<br>
>     over something like HTTP(S) would be a good fit.<br>
> <br>
> 3: Unified error codes, with a central registry - but at the very least,<br>
>     each time we raise an error and it gets returned, a user can see<br>
>     where in the code base it failed, e.g. a header that has<br>
>     OS-ERROR-COMPUTE-3142, which means someone can google for<br>
>     something more informative than "the VM failed scheduling".<br>
> <br>
> 4: OpenTracing support in all projects.<br>
> <br>
> 5: Possibly something with pub/sub where each project can listen for<br>
>     events, instead of each project building something ad hoc the way<br>
>     designate did with notifications.<br>
> <br>
> <br>
> That's the exact reason why I tried to avoid answering about the<br>
> architectural changes I'd like to see done. When I read the lines<br>
> above, I'm far from any consensus on those.<br>
> To answer 1. and 2. with my Nova developer's hat on, I'd just say that<br>
> we invented Cells v2 and Placement.<br>
<br>
Sure - this was if *I* had a magic wand - I have a completely different<br>
viewpoint from others. No community ever really reaches full agreement.<br>
<br></blockquote><div><br></div><div>Fair point, we work by consensus, not full agreement. It's always good to keep that distinction in mind.</div>
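<div><br></div><div>As an aside, to make #3 above a bit more concrete, here is a rough sketch of the kind of thing I imagine - the registry, the header name and the error number are all invented for illustration, nothing like this exists today:</div><div><br></div><div><pre>
# Hypothetical sketch of a unified error-code registry: map a raised
# exception to a stable, searchable code and expose it in a header.
# The registry contents, code numbers and header name are made up.
ERROR_REGISTRY = {
    "NoValidHost": "OS-ERROR-COMPUTE-3142",
}

def error_header_for(exc):
    """Return a (header, value) pair telling the user where the failure came from."""
    code = ERROR_REGISTRY.get(type(exc).__name__, "OS-ERROR-UNKNOWN-0000")
    return ("OpenStack-Error-Code", code)
</pre></div><div>That way a user who sees OS-ERROR-COMPUTE-3142 has something to google, rather than just "the VM failed scheduling".</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">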
From a TC perspective we have to look at these things from an<br>
overall view. My suggestions above were for *all* projects, specifically<br>
for #2 - I used a well-known pattern as an example, but it can apply to<br>
Trove talking to DB instances, Octavia to LBaaS nodes (they already do<br>
this, and it is a good pattern), Zun, possibly Magnum (this is not an<br>
exhaustive list, and may not suit all listed projects - I am naming them<br>
off the top of my head).<br>
<br></blockquote><div><br></div><div>I'd be interested in discussing the use cases that require such major architectural splits.<br></div><div>The main reason Cells v2 was implemented was to address the MQ/DB scalability issues of deployments with 1000+ compute nodes. The Edge use case came later, so it wasn't the main driver for the change.</div><div>If the projects you mention have the same footprint at scale, then yeah, I'm supportive of any redesign discussion that comes up.</div><div><br></div><div>That said, before stepping into major redesigns, I'd wonder: could the inter-service communication be improved simply by reducing the payload?</div>
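<div><br></div><div>Just to illustrate what I mean by reducing payload - a rough sketch only, the field names and payload shape below are made up for the example:</div><div><br></div><div><pre>
# Illustration only: instead of shipping a whole object over the wire on
# every RPC call or notification, send just the fields the remote service
# actually consumes. The field names here are invented for the example.
def slim_payload(instance, wanted=("uuid", "host", "vm_state", "task_state")):
    """Return a copy of the payload containing only the consumed fields."""
    return {key: instance[key] for key in wanted if key in instance}
</pre></div><div>That kind of trimming is the sort of incremental improvement I'd want to explore before a full transport redesign.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">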
From what I understand there was even talk of doing it for Nova, so that<br>
a central control plane could manage remote edge compute nodes without<br>
having to keep an RMQ connection alive across the WAN, but I am not sure<br>
where that got to.<br>
<br></blockquote><div><br></div><div>That's a separate use case (Edge), which wasn't the initial reason we started implementing Cells v2. I haven't heard any request from the Edge WG during the PTGs about changing our messaging interface because of $WAN, but I'm open to ideas.</div><div><br></div><div>-Sylvain</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> To be clear, the redesign didn't come from any source other than our<br>
> users, who were complaining about scale. IMHO, if we really want some<br>
> committee driving us on feature requests, it should be the UC and<br>
> not the TC.<br>
<br>
It should be a combination - the UC and TC should be communicating about<br>
these requests - the UC for the feedback, and the TC to see how they fit<br>
with the TC's vision for the direction of OpenStack.<br>
<br>
> Whatever it is, at the end of the day, we're all paid by our sponsors.<br>
> That means any architectural redesign always hits the reality wall<br>
> where you need to convince your respective Product Managers of the great<br>
> benefit of the redesign. Maybe I'm too pragmatic, but I remember so many<br>
> discussions we had about redesigns that I now feel we just need hands,<br>
> not ideas.<br>
<br>
I fully agree, and it has been an issue in the community for as long as<br>
I can remember. That doesn't mean we should stop pushing the project<br>
forward. We have already moved the needle with the cycle goals, so we<br>
can influence what features are added to projects. Let's continue to do<br>
so.<br>
<br>
<br>
<snip><br>
<br>
</blockquote></div></div>