<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:10381575;
mso-list-type:hybrid;
mso-list-template-ids:466025270 67698703 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1
{mso-list-id:442727578;
mso-list-type:hybrid;
mso-list-template-ids:-1837349448 67698703 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l1:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l2
{mso-list-id:632910924;
mso-list-type:hybrid;
mso-list-template-ids:-1252485242 -1056528264 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l2:level1
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:38.25pt;
text-indent:-.25in;}
@list l2:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:74.25pt;
text-indent:-.25in;}
@list l2:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
margin-left:110.25pt;
text-indent:-9.0pt;}
@list l2:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:146.25pt;
text-indent:-.25in;}
@list l2:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:182.25pt;
text-indent:-.25in;}
@list l2:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
margin-left:218.25pt;
text-indent:-9.0pt;}
@list l2:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:254.25pt;
text-indent:-.25in;}
@list l2:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:290.25pt;
text-indent:-.25in;}
@list l2:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
margin-left:326.25pt;
text-indent:-9.0pt;}
@list l3
{mso-list-id:923954029;
mso-list-type:hybrid;
mso-list-template-ids:486294500 -408756484 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l3:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;
mso-ascii-font-family:Calibri;
mso-fareast-font-family:Calibri;
mso-hansi-font-family:Calibri;
mso-bidi-font-family:"Times New Roman";}
@list l3:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l3:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l3:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l3:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l3:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l3:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l3:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l3:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l4
{mso-list-id:1305086676;
mso-list-type:hybrid;
mso-list-template-ids:1674846744 1813439348 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l4:level1
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:.75in;
text-indent:-.25in;}
@list l4:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:1.25in;
text-indent:-.25in;}
@list l4:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
margin-left:1.75in;
text-indent:-9.0pt;}
@list l4:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:2.25in;
text-indent:-.25in;}
@list l4:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:2.75in;
text-indent:-.25in;}
@list l4:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
margin-left:3.25in;
text-indent:-9.0pt;}
@list l4:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:3.75in;
text-indent:-.25in;}
@list l4:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:4.25in;
text-indent:-.25in;}
@list l4:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
margin-left:4.75in;
text-indent:-9.0pt;}
@list l5
{mso-list-id:1621642566;
mso-list-type:hybrid;
mso-list-template-ids:-994005626 67698713 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l5:level1
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l5:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l5:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l6
{mso-list-id:1957786357;
mso-list-type:hybrid;
mso-list-template-ids:356315710 452615970 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l6:level1
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:.75in;
text-indent:-.25in;}
@list l6:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:1.25in;
text-indent:-.25in;}
@list l6:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
margin-left:1.75in;
text-indent:-9.0pt;}
@list l6:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:2.25in;
text-indent:-.25in;}
@list l6:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:2.75in;
text-indent:-.25in;}
@list l6:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
margin-left:3.25in;
text-indent:-9.0pt;}
@list l6:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:3.75in;
text-indent:-.25in;}
@list l6:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:4.25in;
text-indent:-.25in;}
@list l6:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
margin-left:4.75in;
text-indent:-9.0pt;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Hi Zane,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">At this stage our implementation (as mentioned in
<a href="https://wiki.openstack.org/wiki/Heat/ConvergenceDesign">wiki</a>) achieves your design goals.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-.25in;mso-list:l3 level1 lfo1"><![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">1.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">In case of a parallel update, our implementation adjusts graph according to new template and waits for dispatched resource tasks to complete.<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-.25in;mso-list:l3 level1 lfo1"><![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">2.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Reason for basing our PoC on Heat code:<o:p></o:p></span></p>
<p class="MsoListParagraph" style="margin-left:.75in;text-indent:-.25in;mso-list:l4 level1 lfo2">
<![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">a.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">To solve contention processing parent resource by all dependent resources in parallel.<o:p></o:p></span></p>
<p class="MsoListParagraph" style="margin-left:.75in;text-indent:-.25in;mso-list:l4 level1 lfo2">
<![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">b.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">To avoid porting issue from PoC to HeatBase. (just to be aware of potential issues asap)<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-.25in;mso-list:l3 level1 lfo1"><![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">3.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Resource timeout would be helpful, but I guess its resource specific and has to come from template and default values from plugins.<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-.25in;mso-list:l3 level1 lfo1"><![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">4.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">We see resource notification aggregation and processing next level of resources without contention and with minimal DB usage as the problem area.
We are working on the following approaches in <b>parallel.</b><o:p></o:p></span></p>
<p class="MsoListParagraph" style="margin-left:.75in;text-indent:-.25in;mso-list:l6 level1 lfo5">
<![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">a.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Use a Queue per stack to serialize notification.<o:p></o:p></span></p>
<p class="MsoListParagraph" style="margin-left:.75in;text-indent:-.25in;mso-list:l6 level1 lfo5">
<![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">b.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Get parent ProcessLog (ResourceID, EngineID) and initiate convergence upon first child notification. Subsequent children who fail to get parent resource
lock will directly send message to waiting parent task (topic=stack_id.parent_resource_id)<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:.5in"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Based on performance/feedback we can select either or a mashed version.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Advantages:<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-.25in;mso-list:l0 level1 lfo6"><![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">1.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Failed Resource tasks can be re-initiated after ProcessLog table lookup.<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-.25in;mso-list:l0 level1 lfo6"><![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">2.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">One worker == one resource.<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-.25in;mso-list:l0 level1 lfo6"><![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">3.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Supports concurrent updates<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-.25in;mso-list:l0 level1 lfo6"><![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">4.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Delete == update with empty stack<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-.25in;mso-list:l0 level1 lfo6"><![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">5.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Rollback == update to previous know good/completed stack.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Disadvantages:<o:p></o:p></span></p>
<p class="MsoListParagraph" style="text-indent:-.25in;mso-list:l1 level1 lfo7"><![if !supportLists]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><span style="mso-list:Ignore">1.<span style="font:7.0pt "Times New Roman"">
</span></span></span><![endif]><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Still holds stackLock (WIP to remove with ProcessLog)<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal" style="text-autospace:none"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Completely understand your concern on reviewing our code, since commits are numerous and there is change of course at places. Our
start commit is [c1b3eb22f7ab6ea60b095f88982247dd249139bf] though this might not help
</span><span style="font-size:11.0pt;font-family:Wingdings;color:#1F497D">J</span><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">
<o:p></o:p></span></p>
<p class="MsoNormal" style="text-autospace:none"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Your Thoughts.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Happy Thanksgiving.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Vishnu.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal" style="margin-left:.5in"><b><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">From:</span></b><span style="font-size:11.0pt;font-family:"Calibri","sans-serif""> Angus Salkeld [mailto:asalkeld@mirantis.com]
<br>
<b>Sent:</b> Thursday, November 27, 2014 9:46 AM<br>
<b>To:</b> OpenStack Development Mailing List (not for usage questions)<br>
<b>Subject:</b> Re: [openstack-dev] [Heat] Convergence proof-of-concept showdown<o:p></o:p></span></p>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
<div>
<div>
<div>
<p class="MsoNormal" style="margin-left:.5in">On Thu, Nov 27, 2014 at 12:20 PM, Zane Bitter <<a href="mailto:zbitter@redhat.com" target="_blank">zbitter@redhat.com</a>> wrote:<o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal" style="mso-margin-top-alt:0in;margin-right:0in;margin-bottom:12.0pt;margin-left:47.55pt">
A bunch of us have spent the last few weeks working independently on proof of concept designs for the convergence architecture. I think those efforts have now reached a sufficient level of maturity that we should start working together on synthesising them
into a plan that everyone can forge ahead with. As a starting point I'm going to summarise my take on the three efforts; hopefully the authors of the other two will weigh in to give us their perspective.<br>
<br>
<br>
Zane's Proposal<br>
===============<br>
<br>
<a href="https://github.com/zaneb/heat-convergence-prototype/tree/distributed-graph" target="_blank">https://github.com/zaneb/heat-convergence-prototype/tree/distributed-graph</a><br>
<br>
I implemented this as a simulator of the algorithm rather than using the Heat codebase itself in order to be able to iterate rapidly on the design, and indeed I have changed my mind many, many times in the process of implementing it. Its notable departure from
a realistic simulation is that it runs only one operation at a time - essentially giving up the ability to detect race conditions in exchange for a completely deterministic test framework. You just have to imagine where the locks need to be. Incidentally,
the test framework is designed so that it can easily be ported to the actual Heat code base as functional tests so that the same scenarios could be used without modification, allowing us to have confidence that the eventual implementation is a faithful replication
of the simulation (which can be rapidly experimented on, adjusted and tested when we inevitably run into implementation issues).<br>
<br>
This is a complete implementation of Phase 1 (i.e. using existing resource plugins), including update-during-update, resource clean-up, replace on update and rollback; with tests.<br>
<br>
Some of the design goals which were successfully incorporated:<br>
- Minimise changes to Heat (it's essentially a distributed version of the existing algorithm), and in particular to the database<br>
- Work with the existing plugin API<br>
- Limit total DB access for Resource/Stack to O(n) in the number of resources<br>
- Limit overall DB access to O(m) in the number of edges<br>
- Limit lock contention to only those operations actually contending (i.e. no global locks)<br>
- Each worker task deals with only one resource<br>
- Only read resource attributes once<br>
<br>
<span style="color:#1F497D"><o:p></o:p></span></p>
<p class="MsoNormal" style="mso-margin-top-alt:0in;margin-right:0in;margin-bottom:12.0pt;margin-left:.5in">
<br>
Open questions:<br>
- What do we do when we encounter a resource that is in progress from a previous update while doing a subsequent update? Obviously we don't want to interrupt it, as it will likely be left in an unknown state. Making a replacement is one obvious answer, but
in many cases there could be serious down-sides to that. How long should we wait before trying it? What if it's still in progress because the engine processing the resource already died?<o:p></o:p></p>
</blockquote>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Also, how do we implement resource level timeouts in general?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal" style="margin-left:.5in"><br>
Michał's Proposal<br>
=================<br>
<br>
<a href="https://github.com/inc0/heat-convergence-prototype/tree/iterative" target="_blank">https://github.com/inc0/heat-convergence-prototype/tree/iterative</a><br>
<br>
Note that a version modified by me to use the same test scenario format (but not the same scenarios) is here:<br>
<br>
<a href="https://github.com/zaneb/heat-convergence-prototype/tree/iterative-adapted" target="_blank">https://github.com/zaneb/heat-convergence-prototype/tree/iterative-adapted</a><br>
<br>
This is based on my simulation framework after a fashion, but with everything implemented synchronously and a lot of handwaving about how the actual implementation could be distributed. The central premise is that at each step of the algorithm, the entire graph
is examined for tasks that can be performed next, and those are then started. Once all are complete (it's synchronous, remember), the next step is run. Keen observers will be asking how we know when it is time to run the next step in a distributed version
of this algorithm, where it will be run and what to do about resources that are in an intermediate state at that time. All of these questions remain unanswered.<o:p></o:p></p>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Yes, I was struggling to figure out how it could manage an IN_PROGRESS state as it's stateless. So you end up treading on the other action's toes.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Assuming we use the resource's state (IN_PROGRESS) you could get around that. Then you kick off a converge when ever an action completes (if there is nothing new to be<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">done then do nothing).<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal" style="margin-left:.5in"><br>
A non-exhaustive list of concerns I have:<br>
- Replace on update is not implemented yet<br>
- AFAIK rollback is not implemented yet<br>
- The simulation doesn't actually implement the proposed architecture<br>
- This approach is punishingly heavy on the database - O(n^2) or worse<o:p></o:p></p>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">Yes, re-reading the state of all resources when ever run a new converge is worrying, but I think Michal had some ideas to minimize this.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal" style="margin-left:.5in">- A lot of phase 2 is mixed in with phase 1 here, making it difficult to evaluate which changes need to be made first and whether this approach works with existing plugins<br>
- The code is not really based on how Heat works at the moment, so there would be either a major redesign required or lots of radical changes in Heat or both<br>
<br>
I think there's a fair chance that given another 3-4 weeks to work on this, all of these issues and others could probably be resolved. The question for me at this point is not so much "if" but "why".<br>
<br>
Michał believes that this approach will make Phase 2 easier to implement, which is a valid reason to consider it. However, I'm not aware of any particular issues that my approach would cause in implementing phase 2 (note that I have barely looked into it at
all though). In fact, I very much want Phase 2 to be entirely encapsulated by the Resource class, so that the plugin type (legacy vs. convergence-enabled) is transparent to the rest of the system. Only in this way can we be sure that we'll be able to maintain
support for legacy plugins. So a phase 1 that mixes in aspects of phase 2 is actually a bad thing in my view.<br>
<br>
I really appreciate the effort that has gone into this already, but in the absence of specific problems with building phase 2 on top of another approach that are solved by this one, I'm ready to call this a distraction.<o:p></o:p></p>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">In it's defence, I like the simplicity of it. The concepts and code are easy to understand - tho' part of this is doesn't implement all the stuff on your list yet.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in"> <o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal" style="margin-left:.5in"><br>
<br>
Anant & Friends' Proposal<br>
=========================<br>
<br>
First off, I have found this very difficult to review properly since the code is not separate from the huge mass of Heat code and nor is the commit history in the form that patch submissions would take (but rather includes backtracking and iteration on the
design). As a result, most of the information here has been gleaned from discussions about the code rather than direct review. I have repeatedly suggested that this proof of concept work should be done using the simulator framework instead, unfortunately so
far to no avail.<br>
<br>
The last we heard on the mailing list about this, resource clean-up had not yet been implemented. That was a major concern because that is the more difficult half of the algorithm. Since then there have been a lot more commits, but it's not yet clear whether
resource clean-up, update-during-update, replace-on-update and rollback have been implemented, though it is clear that at least some progress has been made on most or all of them. Perhaps someone can give us an update.<o:p></o:p></p>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:.5in"><br>
<a href="https://github.com/anantpatil/heat-convergence-poc">https://github.com/anantpatil/heat-convergence-poc</a><br>
<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal" style="margin-left:.5in"><br>
AIUI this code also mixes phase 2 with phase 1, which is a concern. For me the highest priority for phase 1 is to be sure that it works with existing plugins. Not only because we need to continue to support them, but because converting all of our existing 'integration-y'
unit tests to functional tests that operate in a distributed system is virtually impossible in the time frame we have available. So the existing test code needs to stick around, and the existing stack create/update/delete mechanisms need to remain in place
until such time as we have equivalent functional test coverage to begin eliminating existing unit tests. (We'll also, of course, need to have unit tests for the individual elements of the new distributed workflow, functional tests to confirm that the distributed
workflow works in principle as a whole - the scenarios from the simulator can help with _part_ of this - and, not least, an algorithm that is as similar as possible to the current one so that our existing tests remain at least somewhat representative and don't
require too many major changes themselves.)<br>
<br>
Speaking of tests, I gathered that this branch included tests, but I don't know to what extent there are automated end-to-end functional tests of the algorithm?<br>
<br>
From what I can gather, the approach seems broadly similar to the one I eventually settled on also. The major difference appears to be in how we merge two or more streams of execution (i.e. when one resource depends on two or more others). In my approach, the
dependencies are stored in the resources and each joining of streams creates a database row to track it, which is easily locked with contention on the lock extending only to those resources which are direct dependencies of the one waiting. In this approach,
both the dependencies and the progress through the graph are stored in a database table, necessitating (a) reading of the entire table (as it relates to the current stack) on every resource operation, and (b) locking of the entire table (which is hard) when
marking a resource operation complete.<br>
<br>
I chatted to Anant about this today and he mentioned that they had solved the locking problem by dispatching updates to a queue that is read by a single engine per stack.<br>
<br>
My approach also has the neat side-effects of pushing the data required to resolve get_resource and get_att (without having to reload the resources again and query them) as well as to update dependencies (e.g. because of a replacement or deletion) along with
the flow of triggers. I don't know if anything similar is at work here.<br>
<br>
It's entirely possible that the best design might combine elements of both approaches.<br>
<br>
The same open questions I detailed under my proposal also apply to this one, if I understand correctly.<br>
<br>
<br>
I'm certain that I won't have represented everyone's work fairly here, so I encourage folks to dive in and correct any errors about theirs and ask any questions you might have about mine. (In case you have been living under a rock, note that I'll be out of
the office for the rest of the week due to Thanksgiving so don't expect immediate replies.)<br>
<br>
I also think this would be a great time for the wider Heat community to dive in and start asking questions and suggesting ideas. We need to, ahem, converge on a shared understanding of the design so we can all get to work delivering it for Kilo.<o:p></o:p></p>
</blockquote>
<div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:0in;margin-right:0in;margin-bottom:12.0pt;margin-left:.5in">
Agree, we need to get moving on this.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="margin-left:.5in">-Angus<br>
<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<p class="MsoNormal" style="margin-left:.5in"><br>
cheers,<br>
Zane.<br>
<br>
_______________________________________________<br>
OpenStack-dev mailing list<br>
<a href="mailto:OpenStack-dev@lists.openstack.org" target="_blank">OpenStack-dev@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><o:p></o:p></p>
</blockquote>
</div>
<p class="MsoNormal" style="margin-left:.5in"><o:p> </o:p></p>
</div>
</div>
</div>
</body>
</html>