[Openstack] Chef Deployment System for Swift - a proposed design - feedback?

Jay Payne letterj at gmail.com
Sun May 1 14:14:10 UTC 2011


Andi,

This looks great.   I do have some thoughts/questions.

If you are using 1.3, do you have a separate role for the management
functionality in the proxy?    It's not a good idea to have all your
proxy servers running in management mode (unless you only have one
proxy).

Why only 1 ring-compute node?  If that node is lost or unavailable do
you loose your ring-builder files?

When I create an environment I always setup utilities like st,
get-nodes, stats-report, and a simple functional test script on a
server to help troubleshoot and manage the cluster(s).

Are you using packages or eggs to deploy the swift code?   If your
using packages, are you building them yourself or using the ones from
launchpad?

If you have more than three proxy servers, do you plan on using load balancers?


Thanks
--J




On Sun, May 1, 2011 at 8:37 AM, andi abes <andi.abes at gmail.com> wrote:
> Judd,
>   Sorry, today I won't be around. I'd love to hear feedback and suggestions
> on what I have so far ( I'm not 100% sure when I can make the fully
> available, but I'm hoping this is very soon). I'm running with swift 1.3 on
> ubuntu 10.10.
> I'm using  the environment pattern in chef - when nodes search for their
> peers a predicate comparing the node[:swift][:config][:environment] to the
> corresponding value on the prospective peer. A "default" value is assigned
> to this by the default recipe's attributes, so if only 1 cluster is present,
> all nodes are eligible. For example, when proxy recipe creates the memcached
> list of addresses, it searches for all the other nodes with the swift-proxy
> role assigned to it anded with the above.
>  Is that what you meant about having a classifier?
> To the basic roles you've described (base, storage and proxy) I've added 1:
> ring-compute. This role should be assigned to only 1 node on top of either
> the storage or proxy. This server will recompute the rings whenever the set
> of storage servers change (new disks or machines added or existing ones
> removed). It uses information on the affected machines (ip, disk and zone
> assignment) to update the ring. Once the ring is updated, it is rsynced to
> all the other nodes in the cluster.
> [ this is achieved with a LWRP as I mentioned earlier, which first parses
> the current ring info, then compares it to the current set of disks. It
> notifies chef if any changes were made, so that the rsync actions can be
> triggered]
> At this point, all disks are added to all rings. Though it should be easy to
> make this conditional on an attribute on the node/disk or some heuristic.
> The 'base role' is also a bit more extensive than your description, but
> needs a bit more work. The recipe uses a swift_disk LWRP to ensure the
> partition table matches the configuration. The LWRP accepts an array
> describing the desired partition table, and allows using a :remaining token
> to indicate using what's left (should only be used on the last partition).
> At this point it, the recipe is pretty hard coded. It assumes that /dev/sdb
> is dedicated to storage. it just requires a hard coded single partition,
> that uses the whole disk. If it's not present, or different, the LWRP
> creates a BSD label, a single partition and xfs filesystem.  Once this is
> done, the available disk is 'published' as node attributes. If the current
> state of the system matches the desired state, nothing is modified.
> The proxy and storage roles, do as you'd expect. install the relevant
> packages (including memcached on the proxy using the opscode recipe) and
> plunk in the server/cluster specific info into the relevant config file
> templates.
> What's totally not yet addressed:
> - starting services
> - prep-ing  the authentication
> - injecting disk configuration.
> This cookbook can be used by it self, but it is more powerful when used in
> conjunction with the crowbar proposal mechanism. crowbar basically allows
> you to define the qualifications for a system to be assigned a given role,
> edit the automatic role assignment and configuration of the cluster and then
> materialize the cluster based on these values by driving chef. Part of the
> planned crowbar capability is performing automatic disk/zone allocation,
> based on network topology and connectivity. In the mean time, the allocation
> is done when the storage node is created.
> Currently,  a proposal can provide values for the following:
> "cluster_hash",  "cluster_admin_pw", "replicas", and user/group to run swift
> as. I'm really curious to hear what other configuration parameters might be
> useful
> I'm really curious to hear some feedback about this.
> and sorry about the timing, I'm in the boston area and it's finally nice
> around here... hence other weekend plans.
>
>
> On Thu, Apr 28, 2011 at 11:25 PM, Judd Maltin <judd at newgoliath.com> wrote:
>>
>> Hey andi,
>> I'm psyched to collaborate on this.  I'm a busy guy, but I'm dedicating
>> Sunday to this. So if you have time Sunday, that would be best to catch up
>> via IRC, IM or voice.
>> Having a node classifier of some sort is critical.
>> -Judd
>>
>> Judd Maltin
>> +1 917 882 1270
>> Happiness is a straight line and a goal. -fn
>> On Apr 28, 2011, at 11:59, andi abes <andi.abes at gmail.com> wrote:
>>
>> Judd,
>>  Ok. Here are some of the thoughts I've had (and have  mostly working, but
>> hitting some swift snags..) Maybe we can collaborate on this?
>> Since Chef promotes idempotent operations and cookbooks, I put some effort
>> in making sure that changes are only made if they're required, particularly
>> around destructive or expensive operations.
>> The 2 main cases are:
>> - partitioning disks, which is obviously destructive
>> - building / rebuilding the ring files - rebalancing the ring is
>> relatively expensive.
>> for both cases I've built a LWRP which reads the current state of affairs,
>> and decides if and what are the required changes. For disks, I'm using '
>> 'parted', which produces machine friendly output. For ring files I'm using
>> the output from ring-builder.
>> the LWRP are driven by recipes which inspect databag - very similar to
>> your approach. However, they also utilize inventory information about
>> available resources created by crowbar in chef during the initial bring up
>> of the deployment.
>> (As a side note - Dell has announced plans to opensource most of crowbar,
>> but there's legalese involved)
>>
>> I'd be more than happy to elaborate and collaborate on this !
>>
>> a
>>
>>
>>
>>
>>
>>
>>
>> On Thu, Apr 28, 2011 at 11:35 AM, Judd Maltin <judd at newgoliath.com> wrote:
>>>
>>> Hi Andi,
>>>
>>> Indeed, the swift recipes hadn't been updated since mid 2010, so I pushed
>>> forward with my own.
>>>
>>> Thanks!
>>> -judd
>>>
>>> On Thu, Apr 28, 2011 at 10:03 AM, andi abes <andi.abes at gmail.com> wrote:
>>>>
>>>> Judd,
>>>>  this is a great idea... actually so great, that some folks @Dell and
>>>> OpsCode, me included, have been working on it.
>>>> Have a peek on
>>>>https://github.com/opscode/openstack-cookbooks/tree/master/cookbooks
>>>> This effort is also being included into Crowbar (take a peek
>>>> here: http://robhirschfeld.com/tag/crowbar/) which adds the steps needed to
>>>> start with bare metal (rather than installed OS), then using chef to get to
>>>> a working Open stack deployment.
>>>> (if you're at the design meeting, there are demos scheduled).
>>>> That said - I'm updating the swift cookbook, and hope to update github
>>>> soon.
>>>> a.
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Apr 27, 2011 at 9:55 PM, Jay Payne <letterj at gmail.com> wrote:
>>>>>
>>>>> Judd,
>>>>>
>>>>> I'm not that familiar with Chef (I'll do some research) but I have a
>>>>> couple of questions and some thoughts:
>>>>>
>>>>> 1.  Is this for a multi-server environment?
>>>>> 2.  Are all your proxy nodes going to have "allow_account_management =
>>>>> true" in the configs?   It might be a good idea to have a second proxy
>>>>> config for account management only
>>>>> 3.  Have you looked at using swauth instead of auth?
>>>>> 4.  Have you thought about an admin or client node that has utilities
>>>>> on it like st and stats-report?
>>>>> 5.  How where will you do on-going ring management or changes?
>>>>> 6.  I would think about including some type of functional test at the
>>>>> end of the deployment process to verify everything was created
>>>>> properly and that all nodes can communicate.
>>>>>
>>>>>
>>>>>
>>>>> --J
>>>>>
>>>>> On Wed, Apr 27, 2011 at 6:18 PM, Judd Maltin <judd at newgoliath.com>
>>>>> wrote:
>>>>> > Hi Folks,
>>>>> >
>>>>> > I've been hacking away at creating an automated deployment system for
>>>>> > Swift
>>>>> > using Chef.  I'd like to drop a design idea on you folks (most of
>>>>> > which I've
>>>>> > already implemented) and get feedback from this esteemed group.
>>>>> >
>>>>> > My end goal is to have a "manifest" (apologies to Puppet) which will
>>>>> > define
>>>>> > an entire swift cluster, deploy it automatically, and allow edits to
>>>>> > the
>>>>> > ingredients to manage the cluster.  In this case, a "manifest" is a
>>>>> > combination of a chef databag describing the swift settings, and a
>>>>> > spiceweasel infrastructure.yaml file describing the OS configuration.
>>>>> >
>>>>> > Ingredients:
>>>>> > - swift cookbook with base, proxy and server recipes.  proxy nodes
>>>>> > also
>>>>> > (provisionally) contain auth services. storage nodes handle object,
>>>>> > container and account services.
>>>>> > -- Base recipe handles common package install, OS user creation.
>>>>> > Sets up
>>>>> > keys.
>>>>> > -- Proxy recipe handles proxy nodes: network config, package install,
>>>>> > memcache config, proxy and auth package config, user creation, ring
>>>>> > management (including builder file backup), user management
>>>>> > -- Storage recipe handles storage nodes: network config, storage
>>>>> > device
>>>>> > config, package install, ring management.
>>>>> >
>>>>> > - chef databag that describes a swift cluster (eg:
>>>>> > mycluster_databag.json)
>>>>> > -- proxy config settings
>>>>> > -- memcached settings
>>>>> > -- settings for all rings and devices
>>>>> > -- basic user settings
>>>>> > -- account management
>>>>> >
>>>>> > - chef "spiceweasel" file that auto-vivifies the infrastructure: (eg:
>>>>> > mycluster_infra.yaml)
>>>>> > -- uploads cookbooks
>>>>> > -- uploads roles
>>>>> > -- uploads the cluster's databag
>>>>> > -- kicks off node provisioning by requesting from infrastructure API
>>>>> > (ec2 or
>>>>> > what have you) the following:
>>>>> > --- chef roles applied (role[swift:proxy] or role[swift:storage])
>>>>> > --- server flavor
>>>>> > --- storage device configs
>>>>> > --- hostname
>>>>> > --- proxy and storage network details
>>>>> >
>>>>> > By calling this spiceweasel file, the infrastructure can leap into
>>>>> > existence.
>>>>> >
>>>>> > I'm more or less done with all this stuff - and I'd really appreciate
>>>>> > conceptual feedback before I take out all the non-sense code I have
>>>>> > in the
>>>>> > files and publish.
>>>>> >
>>>>> > Many thanks!  Happy spring, northern hemispherians!
>>>>> > -judd
>>>>> >
>>>>> > Judd Maltin
>>>>> > T: 917-882-1270
>>>>> > F: 501-694-7809
>>>>> > A loving heart is never wrong.
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > _______________________________________________
>>>>> > Mailing list: https://launchpad.net/~openstack
>>>>> > Post to     : openstack at lists.launchpad.net
>>>>> > Unsubscribe : https://launchpad.net/~openstack
>>>>> > More help   : https://help.launchpad.net/ListHelp
>>>>> >
>>>>> >
>>>>>
>>>>> _______________________________________________
>>>>> Mailing list: https://launchpad.net/~openstack
>>>>> Post to     : openstack at lists.launchpad.net
>>>>> Unsubscribe : https://launchpad.net/~openstack
>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>
>>>
>>>
>>>
>>> --
>>> Judd Maltin
>>> T: 917-882-1270
>>> F: 501-694-7809
>>> A loving heart is never wrong.
>>>
>>>
>>>
>>
>
>




More information about the Openstack mailing list