[openstack-dev] [heat][tripleo] Heat memory usage in the TripleO gate during Ocata

Zane Bitter zbitter at redhat.com
Fri Jan 6 17:12:03 UTC 2017

tl;dr everything looks great, and memory usage has dropped by about 64% 
since the initial Newton release of Heat.

I re-ran my analysis of Heat memory usage in the tripleo-heat-templates 
gate. (This is based on the gate-tripleo-ci-centos-7-ovb-nonha job.) 
Here's a pretty picture:


There is one major caveat here: for the period marked in grey where it 
says "Only 2 engine workers", the job was configured to use only 2 
heat-engine worker processes instead of 4, so this is not an 
apples-to-apples comparison. The initial drop at the beginning and the 
subsequent bounce at the end are artifacts of this change. Note that the 
stable/newton branch is _still_ using only 2 engine workers.

The rapidly increasing usage on the left is due to increases in the 
complexity of the templates during the Newton cycle. It's clear that if 
there has been any similar complexity growth during Ocata, it has had a 
tiny effect on memory consumption in comparison.

I tracked down most of the step changes to identifiable patches:

2016-10-07: 2.44GiB -> 1.64GiB
  - https://review.openstack.org/382068/ merged, making ResourceInfo 
classes more memory-efficient. Judging by the stable branch (where this 
and the following patch were merged at different times), this was 
responsible for dropping the memory usage from 2.44GiB to 1.83GiB. 
(Which seems like a disproportionately large change?)
  - https://review.openstack.org/#/c/382377/ merged, so we no longer 
create multiple yaql contexts. (This was responsible for the drop from 
1.83GiB to 1.64GiB.)

2016-10-17: 1.62GiB -> 0.93GiB
  - https://review.openstack.org/#/c/386696/ merged, reducing the number 
of engine workers on the undercloud to 2.

2016-10-19: 0.93GiB -> 0.73GiB (variance also seemed to drop after this)
  - https://review.openstack.org/#/c/386247/ merged (on 2016-10-16), 
avoiding loading all nested stacks in a single process simultaneously 
much of the time.
  - https://review.openstack.org/#/c/383839/ merged (on 2016-10-16), 
switching output calculations to RPC to avoid almost all simultaneous 
loading of all nested stacks.

2016-11-08: 0.76GiB -> 0.70GiB
  - This one is a bit of a mystery???

2016-11-22: 0.69GiB -> 0.50GiB
  - https://review.openstack.org/#/c/398476/ merged, improving the 
efficiency of resource listing?

2016-12-01: 0.49GiB -> 0.88GiB
  - https://review.openstack.org/#/c/399619/ merged, returning the 
number of engine workers on the undercloud to 4.
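
For anyone who wants to check the arithmetic, here is a minimal sketch that turns the step changes listed above into percentage changes. The figures are copied from this message; the names (STEPS, percent_change, summarize, OVERALL) are just illustrative. It assumes the "about 64%" in the tl;dr compares the 2.44GiB starting point with the final 0.88GiB figure, both measured with 4 engine workers.

```python
# Step changes as listed in this message: (date, GiB before, GiB after).
STEPS = [
    ("2016-10-07", 2.44, 1.64),
    ("2016-10-17", 1.62, 0.93),  # engine workers cut from 4 to 2
    ("2016-10-19", 0.93, 0.73),
    ("2016-11-08", 0.76, 0.70),
    ("2016-11-22", 0.69, 0.50),
    ("2016-12-01", 0.49, 0.88),  # engine workers restored to 4
]


def percent_change(before, after):
    """Signed percentage change from before to after (negative = reduction)."""
    return (after - before) / before * 100.0


def summarize(steps):
    """Return (date, percentage change rounded to one decimal) per step."""
    return [(date, round(percent_change(b, a), 1)) for date, b, a in steps]


# Assumption: the overall figure compares the first "before" value with the
# last "after" value, since both were measured with 4 engine workers.
OVERALL = round(percent_change(STEPS[0][1], STEPS[-1][2]), 1)

if __name__ == "__main__":
    for date, pct in summarize(STEPS):
        print(f"{date}: {pct:+.1f}%")
    print(f"overall: {OVERALL:+.1f}%")
```

The steps taken while only 2 workers were running are not directly comparable with the rest, so the per-step percentages in that period are best read relative to each other rather than to the overall figure.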

It's not an exact science because IIUC there's a delay between a patch 
merging in Heat and it being used in subsequent t-h-t gate jobs. e.g. 
the change to getting outputs over RPC landed the day before the 
instack-undercloud patch that cut the number of engine workers, but the 
effects don't show up until 2 days after. I'd love to figure out what 
happened on the 8th of November, but I can't correlate it to anything 
obvious. The attribution of the change on the 22nd also seems dubious, 
but the timing adds up (including on stable/newton).

It's fair to say that none of the other patches we merged in an attempt 
to reduce memory usage had any discernible effect :D

It's worth reiterating that TripleO still disables convergence in the 
undercloud, so these are all tests of the legacy code path. It would be 
great if we could set up a non-voting job on t-h-t with convergence 
enabled and start tracking memory use over time there too. As a first 
step, maybe we could at least add an experimental job on Heat to give us 
a baseline?

The next big improvement to memory use is likely to come from 
https://review.openstack.org/#/c/407326/ or something like it (though I 
don't think we have a firm decision on whether we'd apply this to 
non-convergence stacks). Hopefully that will deliver a nice speed boost 
for convergence too.


