[placement][ptg] code refactoring constraints and goals

Jay Pipes jaypipes at gmail.com
Wed Apr 10 10:54:44 UTC 2019


On 04/08/2019 01:05 PM, Eric Fried wrote:
> On 4/8/19 10:43 AM, Chris Dent wrote:
>>      * How (and should) we further split and shrink
>> objects/resource_provider.py
> 
> To be specific here:
> 
> objects/resource_provider.py is currently ~2KLOC. That's not *horrible*
> (see below), and if it really didn't lend itself to being further
> subdivided, I would say we shouldn't force it just for the sake of
> making that number smaller.
> 
> However, IMO, this file logically deals with several things:
> 
> * Resource provider inventory/capacity/usage
> * Resource provider membership in aggregates
> * Traits on resource providers
> * Creating, updating, and deleting resource providers
> * Finding resource providers in various ways
> 
> If you look at the structure of the file (in its current order - see
> below) these groupings are pretty clear.
> 
> Note the presence of "resource provider" in each bullet above. We
> already have separate modules for dealing with e.g. CRUD of traits in
> their own right. Though the mapping isn't 1:1 with the above list.
> 
> So we have the following options, I think:
> 
> (1) Do nothing. Leave objects/resource_provider.py as is.
> (2) Move X-related stuff into the base X.py module, so like move things
> related to "Traits on resource providers" into objects/trait.py. This
> level of shuffle would probably entail things like creating a new module
> for aggregate.py
> (3) Split the above into separate modules while preserving their
> resource_provider-ness. From the perspective of code consuming the
> modules, this would look like:
> 
> from placement.objects import resource_provider as rp_crud
> from placement.objects.resource_provider import trait as rp_trait
> from placement.objects.resource_provider import search as rp_search
> from placement.objects import trait
> ...
> # Do CRUD-y stuff with resource providers
> rp = rp_crud.ResourceProvider(...)
> rp.create()
> # Do trait stuff
> t = trait.Trait(...)
> t.create()
> # Do trait-y stuff with resource providers
> rp_trait.add_traits_to_provider(rp, [t])
> # Do search-y stuff with resource providers
> rp_search.get_all_by_filters(...)
> 
> To me, this is nice and intuitive.

Meh :) I actually think option 1 (Do nothing) is best here. We went 
through a bunch of churn for the (excellent) refactoring for OVO-ectomy 
and splitting the resource_provider.py file out into the other related 
object/*.py files.

I don't really see a huge value in further splitting the resource 
provider functions out into separate modules with a 
/placement/objects/resource_provider/ module. I actually think this 
version of the above code is more readable:

from placement.objects import resource_provider
from placement.objects import trait
...
# Do CRUD-y stuff with resource providers
rp = resource_provider.ResourceProvider(...)
rp.create()
# Do trait stuff
t = trait.Trait(...)
t.create()
# Do trait-y stuff with resource providers
rp.add_traits([t])
# Do search-y stuff with resource providers
resource_provider.get_all_by_filters(...)

Note that the "add_traits()" is a method on the ResourceProvider object 
in the above. I think this is more readable for create/update/delete 
operations. For read operations, I prefer a consistently-named set of 
get_by_XXX and get_all_by_XXX methods in module-level scope as opposed 
to @classmethods on an object.
>> * Is it worth reordering methods to some kind of arbitrary ordering?
> 
> My preference on this: keep related methods/classes/etc. near each other
> in the file.
> 
> Pros:
> * Intuitive.
> * Increases the probability that, when I'm looking at a method that
> calls another method, both methods will fit on my screen at the same
> time. (Workaround: multiple editor windows to the same file.)
> 
> Cons:
> * Sometimes there's no sensible grouping, which makes this hard to
> define and therefore hard to follow (and hard to enforce, if that's a
> thing we want to do).
> * Ability to know exactly where to look to find something. (Workaround:
> Use a real IDE, where you can do a project-wide search for a symbol,
> hotkey from a symbol's use to its definition, etc. Also you can choose
> to view a module's skeleton in its real order *or* alphabetically.)

My preference is to have related methods grouped together as well, but I 
don't consider this a priority work item.

>> * If a method or module is "long" (by some definition) does that
>>    make it bad?
> 
> Not to the point where we should do something like implement sonar.
> Having had experience with that, it just ain't worth the hassle.
> However, there is such a thing as too long. Certain nova libvirt modules
> come to mind. For those of us who do choose to use real IDEs, these can
> take a long time to process (in-band syntax checking, symbol correlation
> highlighting, git blame, etc.). I'm going to go out on a limb and say we
> should strive really hard never to have a module in the 5-digit LOCs.
> And it would be really nice to stay under 2KLOC, but only if it doesn't
> feel forced.

Agreed.

>> * Where is the balance between backport hurting churn and keeping
>>    the mix fresh?
> 
> I'm not sure "fresh" is the heaviest hitter on the plus side. Making the
> code organized and crisp and usable (from a developer standpoint) and
> maintainable is the main thing to me.
> 
> On the minus side, in addition to backport pain, there's the difficulty
> tracking down the change that did a thing. Run git blame today and it
> looks like Chris wrote virtually the whole thing in a small number of
> massive monolithic chunks. You have to use bisect or similar to track
> something down. To me, this is the biggest pain point.
> 
> So I think we should strive to get to a point where the code structure
> is "good enough" and then stop doing huge refactors. That way, as we
> move forward, changes will accumulate on that base and things will look
> more "normal" history-wise.

Agreed.

>> * Should there be a more cleanly defined boundary between the
>>    methods that marshal data (the top-level methods in objects/*.py)
>>    and those that talk to the persistence layer (also in
>>    objects/*.py)?
>>
>>      * Do we need to define (better) or document the layers/boundaries
>>        in the system?
> 
> Sure, if we can find a fairly clean split. I don't have this enough in
> my head to volunteer to do it, but I'll review it if someone else does :)

See above. My preference is to have create/update/delete operations be 
normal object methods and get operations be module-level functions with 
consistent names. No @classmethods or @staticmethods would be my preference.

Best,
-jay



More information about the openstack-discuss mailing list