[openstack-dev] [tripleo] Mistral Workflow for deriving THT parameters

John Fulton johfulto at redhat.com
Wed Jan 25 03:54:31 UTC 2017


On 01/23/2017 05:07 AM, Saravanan KR wrote:
> Thanks John for the info.
>
> I am going through the spec in detail. Before that, I had a few
> thoughts about how I wanted to approach this, which I have drafted in
> https://etherpad.openstack.org/p/tripleo-derive-params. It is not
> 100% ready yet; I am still working on it.

Awesome. Thank you Saravanan for taking the time to review this. I
made some updates in the etherpad above.

> As of now, there are a few differences on the top of my mind, which I
> want to highlight; I am still going through the spec in detail:
>
> * Profiles vs Features - Considering an overcloud node as a profile,
> rather than a node which can host these features, would have
> limitations. For example, if I need a Compute node to host both
> Ceph (OSD) and DPDK, then the node will have multiple profiles, or we
> have to create a profile like
> hci_enterprise_many_small_vms_with_dpdk? The first one is not
> appropriate and the latter is not scalable; maybe you have something
> else in mind?

Why is the latter not scalable? It's analogous to composable roles.

With Composable Roles, if I want HCI, which is made from the Compute
and CephStorage roles, then I add a name for the new role and list the
services it should have by borrowing from the examples shipped
in openstack-tripleo-heat-templates/roles_data.yaml. So I could call
my new role OsdCompute and make:

 
https://github.com/RHsyseng/hci/blob/master/custom-templates/custom-roles.yaml#L168-L193

Similarly, if I want to make a new profile, then I give it a name and
then combine what I want. E.g. if the workload profiles file had
hci_throughput like this:

   hci_throughput:
     workload::average_guest_flavor: 'm1.large'
     workload::average_guest_memory_size_in_mb: 8192
     workload::average_guest_CPU_utilization_percentage: 90
     workload::tuned_profile: 'throughput-performance'

The deployer could easily compose their own hci_latency profile as:

   hci_latency:
     workload::average_guest_flavor: 'm1.large'
     workload::average_guest_memory_size_in_mb: 8192
     workload::average_guest_CPU_utilization_percentage: 90
     workload::tuned_profile: 'latency-performance'

The above is a simple example, but if more parameters were modified,
the ability to add multiple tunables per profile would be useful: the
deployer would just need to specify the name they want and they'd get
all of the params that came with it. So, I was not suggesting we ship
every possible profile, but that we provide a few proven examples of
known use-cases and also have one example with all of them for CI
tests (similar to CI for composable roles).

I think the above fits in with the notion of tags that you mentioned in
that they can be combined, e.g. "dpdk,osd" vs "sriov,osd".  The
difference is that the deployer could give any combination of them a
name. As the number of inputs for derived parameters grows, so does
the benefit of a name to refer to a set of them.

Perhaps the templates should not be for "Workload Profiles" but for
"Derived THT" and those templates should call different functions.
Then some of those functions would include derivations to optimize for
different workloads while other functions would make derivations for
DPDK or SRIOV deploys. Something like:

   hci_dpdk:
     derive::workload::average_guest_flavor: 'm1.large'
     derive::workload::average_guest_memory_size_in_mb: 8192
     derive::workload::average_guest_CPU_utilization_percentage: 90
     derive::tag::network: 'dpdk'

In something like the above, you could implement and support the tags
you described, and a user would not need to use performance profiles.
They could just include a new THT env file, e.g. derived_params.yaml,
and indicate which of the many derivable parameters they want to use.

What do you think of exposing the tags to the user as in the above?

> * Independent - The initial plan of this was to be independent
> execution, also can be added to deploy if needed.

I agree.

> * Not to expose/duplicate parameters which are straight forward,
> for example tuned-profile name should be associated with feature
> internally, Workflows will decide it.

By "straight forward" do you mean non-derived?

I'd prefer to allow an advanced deployer to compose a performance
profile with whatever performance tweaks they need. So, I'd put
tuned in a workload profile because if I want to tune my overcloud for
that workload then I expect it to have the appropriate tuned profile.

I see a few ways this could go. Given tags or profiles, I think we
both want a way to refer to a set of parameters with a simple name
and we want either composability within the name or the ability to
combine more than one name in a deployment. However I see two options:

A. Do we want to only derive parameters that we think must be derived
and require users to manually set non-derived ones outside of this
spec?

B. Do we want to allow any parameter to be derived, if we unify
those parameters under a name and offer a workflow that has an identity
function which returns its input as output (but translates for Heat
env files as needed), in addition to other functions (e.g. the
cpu_mem_calculator referenced in the spec), because we think they
could be used to benefit the deployment for performance or other
reasons?
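To make option B concrete, here is a minimal Python sketch of the idea.
Everything in it is an assumption for illustration: the function names
(identity, cpu_mem_calculator), the registry, and the overhead formula
are made up, not actual tripleo-common code.

```python
# Sketch of option B: every parameter is produced by a named derivation
# function; non-derived parameters pass through an identity function
# that returns its input as output. All names and formulas here are
# illustrative assumptions, not tripleo-common APIs.

def identity(params):
    """Return the inputs unchanged; they only need to be written into a
    Heat environment's parameter_defaults later."""
    return dict(params)

def cpu_mem_calculator(params):
    """Toy stand-in for the CPU/memory calculator referenced in the
    spec: reserve host memory from guest size plus an assumed per-OSD
    cost."""
    guest_mb = params['average_guest_memory_size_in_mb']
    num_osds = params.get('num_osds', 0)
    reserved_mb = 2048 + num_osds * 3072  # assumed base + per-OSD overhead
    return {'NovaReservedHostMemory': reserved_mb + guest_mb // 2}

DERIVATIONS = {'identity': identity,
               'cpu_mem_calculator': cpu_mem_calculator}

def derive(profile):
    """Run each derivation named by the profile and merge the results
    into a single parameter_defaults mapping for a Heat environment."""
    parameter_defaults = {}
    for func_name, inputs in profile.items():
        parameter_defaults.update(DERIVATIONS[func_name](inputs))
    return {'parameter_defaults': parameter_defaults}
```

A profile is then just a name for a dict of function inputs, so
deployers could compose new profiles without touching the derivation
code at all.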

I think that giving a name to a set of parameters still involves
deriving values with Mistral, it's just that the derivation is a
workflow which has to take care of finding that value and saving it
in the Heat environment. Such a workflow has a benefit because the
deployer doesn't need to know all of the variables that are benefiting
them, provided that they know their workload. I.e. the simple power of
sharing a combination may be just as valuable as applying formulas to
derive a value within that combination.
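As a rough sketch of that last step, the workflow's output could be
serialized into a Heat environment document. JSON is used here only to
keep the sketch dependency-free (JSON is a valid subset of YAML); the
real workflow would presumably emit YAML.

```python
import json

def to_heat_env(parameter_defaults):
    """Wrap derived values as a Heat environment document so the
    deployer only ever handles the profile name, not every variable
    behind it."""
    return json.dumps({'parameter_defaults': parameter_defaults},
                      indent=2)
```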

I think we're making a lot of progress overall as we're discussing
what abstraction best suits the solution to the individual problems
we're trying to solve. Ideally, we'll come to an abstraction that
works well for both of our problems.

> * And another thing, which I couldn't get is, where will the workflow
> actions be defined, in THT or tripleo_common?

The Mistral workflow could be shipped in tripleo-common in workbooks
or THT extraconfig/tasks. I'll indicate in the spec that either could
be used and solicit feedback.

Either way, THT would ship the example profiles/tags as a Heat
environment file and users would modify it the same way they modify
roles_data.yaml.

Heat would be modified to have an OS::Mistral::WorkflowExecution
resource, as the spec is written today.

> The requirements which I thought of, for deriving workflow are:
> Parameter Deriving workflow should be
>
> * independent to run the workflow

I agree. A deployer could see the output (the Heat env files) of the
workflow and download them from Swift independently of deploying the
overcloud itself.

> * take basic parameter inputs; for easy deployment, keep a very minimal
> set of mandatory parameters, and the rest as optional parameters

This may not work for a general solution that allows users to compose
their own performance profiles, as indicated in A and B above. I'd
certainly use minimal parameters for some settings; e.g. determining
the NIC used by Ceph and starting the OSD service with numactl.
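For instance, a minimal sketch of that NIC-driven derivation on Linux,
assuming the standard sysfs layout; the function names are mine, and
the exact numactl flags chosen are an assumption:

```python
import os

def nic_numa_node(nic, sysfs='/sys/class/net'):
    """Read the NUMA node a NIC is attached to via sysfs; -1 means the
    kernel reports no NUMA affinity (or the file is missing)."""
    path = os.path.join(sysfs, nic, 'device', 'numa_node')
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (IOError, ValueError):
        return -1

def osd_numactl_prefix(node):
    """Build a numactl prefix pinning the OSD to the NIC's NUMA node;
    return no prefix when the node is unknown."""
    if node < 0:
        return []
    return ['numactl', '-N', str(node), '--preferred', str(node)]
```

The prefix would then be prepended to the OSD service's command line so
the daemon runs on, and prefers memory from, the same NUMA node as the
Ceph NIC.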

> * read introspection data from Ironic DB and Swift-stored blob

Yes.

> I will add these comments as a starting point on the spec. We will work
> towards bringing down the differences, so that operators' headaches are
> reduced to a great extent.

Sounds like a good plan.

> Regards,
> Saravanan KR
>
> On Fri, Jan 20, 2017 at 9:56 PM, John Fulton <johfulto at redhat.com> wrote:
>> On 01/11/2017 11:34 PM, Saravanan KR wrote:
>>>
>>> Thanks John, I would really appreciate if you could tag me on the
>>> reviews. I will do the same for mine too.
>>
>>
>> Hi Saravanan,
>>
>> Following up on this, have a look at the OS::Mistral::WorkflowExecution
>> Heat spec [1] to trigger Mistral workflows. I'm hoping to use it for
>> deriving THT parameters for optimal resource isolation in HCI
>> deployments as I mentioned below. I have a spec [2] which describes
>> the derivation of the values, but this is provided as an example for
>> the more general problem of capturing the rules used to derive the
>> values so that deployers may easily apply them.
>>
>> Thanks,
>>   John
>>
>> [1] OS::Mistral::WorkflowExecution https://review.openstack.org/#/c/267770/
>> [2] TripleO Performance Profiles https://review.openstack.org/#/c/423304/
>>
>>> On Wed, Jan 11, 2017 at 8:03 PM, John Fulton <johfulto at redhat.com> wrote:
>>>>
>>>> On 01/11/2017 12:56 AM, Saravanan KR wrote:
>>>>>
>>>>>
>>>>> Thanks Emilien and Giulio for your valuable feedback. I will start
>>>>> working towards finalizing the workbook and the actions required.
>>>>
>>>>
>>>>
>>>> Saravanan,
>>>>
>>>> If you can add me to the review for your workbook, I'd appreciate it.
>>>> I'm trying to solve a similar problem, of computing THT params for HCI
>>>> deployments in order to isolate resources between CephOSDs and
>>>> NovaComputes, and I was also looking to use a Mistral workflow. I'll
>>>> add you to the review of any related work, if you don't mind. Your
>>>> proposal to get NUMA info into Ironic [1] helps me there too. Hope to
>>>> see you at the PTG.
>>>>
>>>> Thanks,
>>>>   John
>>>>
>>>> [1] https://review.openstack.org/396147
>>>>
>>>>
>>>>>> would you be able to join the PTG to help us with the session on the
>>>>>> overcloud settings optimization?
>>>>>
>>>>>
>>>>> I will come back on this, as I have not planned for it yet. If it
>>>>> works out, I will update the etherpad.
>>>>>
>>>>> Regards,
>>>>> Saravanan KR
>>>>>
>>>>>
>>>>> On Wed, Jan 11, 2017 at 5:10 AM, Giulio Fidente <gfidente at redhat.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> On 01/04/2017 09:13 AM, Saravanan KR wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> The aim of this mail is to ease the DPDK deployment with TripleO. I
>>>>>>> would like to see if the approach of deriving THT parameters based on
>>>>>>> introspection data, with high-level input, would be feasible.
>>>>>>>
>>>>>>> Let me brief you on the complexity of certain parameters which are
>>>>>>> related to DPDK. The following parameters should be configured for a
>>>>>>> good-performing DPDK cluster:
>>>>>>> * NeutronDpdkCoreList (puppet-vswitch)
>>>>>>> * ComputeHostCpusList (PreNetworkConfig [4], puppet-vswitch) (under
>>>>>>> review)
>>>>>>> * NovaVcpuPinset (puppet-nova)
>>>>>>>
>>>>>>> * NeutronDpdkSocketMemory (puppet-vswitch)
>>>>>>> * NeutronDpdkMemoryChannels (puppet-vswitch)
>>>>>>> * ComputeKernelArgs (PreNetworkConfig [4]) (under review)
>>>>>>> * Interface to bind DPDK driver (network config templates)
>>>>>>>
>>>>>>> The complexity of deciding some of these parameters is explained in
>>>>>>> the blog [1], where the CPUs have to be chosen in accordance with the
>>>>>>> NUMA node associated with the interface. We are working on a spec [2]
>>>>>>> to collect the required details from the baremetal via introspection.
>>>>>>> The proposal is to create a mistral workbook and actions
>>>>>>> (tripleo-common), which will take minimal inputs and decide the actual
>>>>>>> values of parameters based on the introspection data. I have created a
>>>>>>> simple workbook [3] with what I have in mind (not final, only a
>>>>>>> wireframe). The expected output of this workflow is to return the list
>>>>>>> of inputs for "parameter_defaults", which will be used for the
>>>>>>> deployment. I would like to hear from the experts if there are any
>>>>>>> drawbacks with this approach or any other better approach.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> hi, I am not an expert, I think John (on CC) knows more, but this
>>>>>> looks like a good initial step to me.
>>>>>>
>>>>>> once we have the workbook in good shape, we could probably integrate
>>>>>> it in the tripleo client/common to (optionally) trigger it before
>>>>>> every deployment
>>>>>>
>>>>>> would you be able to join the PTG to help us with the session on the
>>>>>> overcloud settings optimization?
>>>>>>
>>>>>> https://etherpad.openstack.org/p/tripleo-ptg-pike
>>>>>> --
>>>>>> Giulio Fidente
>>>>>> GPG KEY: 08D733BA
>>>
>>>
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>


