On 04/08/2019 01:05 PM, Eric Fried wrote:
On 4/8/19 10:43 AM, Chris Dent wrote:
* How (and should) we further split and shrink objects/resource_provider.py
To be specific here:
objects/resource_provider.py is currently ~2KLOC. That's not *horrible* (see below), and if it really didn't lend itself to being further subdivided, I would say we shouldn't force it just for the sake of making that number smaller.
However, IMO, this file logically deals with several things:
* Resource provider inventory/capacity/usage * Resource provider membership in aggregates * Traits on resource providers * Creating, updating, and deleting resource providers * Finding resource providers in various ways
If you look at the structure of the file (in its current order - see below) these groupings are pretty clear.
Note the presence of "resource provider" in each bullet above. We already have separate modules for dealing with e.g. CRUD of traits in their own right. Though the mapping isn't 1:1 with the above list.
So we have the following options, I think:
(1) Do nothing. Leave objects/resource_provider.py as is. (2) Move X-related stuff into the base X.py module, so like move things related to "Traits on resource providers" into objects/trait.py. This level of shuffle would probably entail things like creating a new module for aggregate.py (3) Split the above into separate modules while preserving their resource_provider-ness. From the perspective of code consuming the modules, this would look like:
from placement.objects import resource_provider as rp_crud from placement.objects.resource_provider import trait as rp_trait from placement.objects.resource_provider import search as rp_search from placement.objects import trait ... # Do CRUD-y stuff with resource providers rp = rp_crud.ResourceProvider(...) rp.create() # Do trait stuff t = trait.Trait(...) t.create() # Do trait-y stuff with resource providers rp_trait.add_traits_to_provider(rp, [t]) # Do search-y stuff with resource providers rp_search.get_all_by_filters(...)
To me, this is nice and intuitive.
Meh :) I actually think option 1 (Do nothing) is best here. We went through a bunch of churn for the (excellent) refactoring for OVO-ectomy and splitting the resource_provider.py file out into the other related object/*.py files. I don't really see a huge value in further splitting the resource provider functions out into separate modules with a /placement/objects/resource_provider/ module. I actually think this version of the above code is more readable: from placement.objects import resource_provider from placement.objects import trait ... # Do CRUD-y stuff with resource providers rp = resource_provider.ResourceProvider(...) rp.create() # Do trait stuff t = trait.Trait(...) t.create() # Do trait-y stuff with resource providers rp.add_traits([t]) # Do search-y stuff with resource providers resource_provider.get_all_by_filters(...) Note that the "add_traits()" is a method on the ResourceProvider object in the above. I think this is more readable for create/update/delete operations. For read operations, I prefer a consistently-named set of get_by_XXX and get_all_by_XXX methods in module-level scope as opposed to @classmethods on an object.
* Is it worth reordering methods to some kind of arbitrary ordering?
My preference on this: keep related methods/classes/etc. near each other in the file.
Pros: * Intuitive. * Increases the probability that, when I'm looking at a method that calls another method, both methods will fit on my screen at the same time. (Workaround: multiple editor windows to the same file.)
Cons: * Sometimes there's no sensible grouping, which makes this hard to define and therefore hard to follow (and hard to enforce, if that's a thing we want to do). * Ability to know exactly where to look to find something. (Workaround: Use a real IDE, where you can do a project-wide search for a symbol, hotkey from a symbol's use to its definition, etc. Also you can choose to view a module's skeleton in its real order *or* alphabetically.)
My preference is to have related methods grouped together as well, but I don't consider this a priority work item.
* If a method or module is "long" (by some definition) does that make it bad?
Not to the point where we should do something like implement sonar. Having had experience with that, it just ain't worth the hassle. However, there is such a thing as too long. Certain nova libvirt modules come to mind. For those of us who do choose to use real IDEs, these can take a long time to process (in-band syntax checking, symbol correlation highlighting, git blame, etc.). I'm going to go out on a limb and say we should strive really hard never to have a module in the 5-digit LOCs. And it would be really nice to stay under 2KLOC, but only if it doesn't feel forced.
Agreed.
* Where is the balance between backport hurting churn and keeping the mix fresh?
I'm not sure "fresh" is the heaviest hitter on the plus side. Making the code organized and crisp and usable (from a developer standpoint) and maintainable is the main thing to me.
On the minus side, in addition to backport pain, there's the difficulty tracking down the change that did a thing. Run git blame today and it looks like Chris wrote virtually the whole thing in a small number of massive monolithic chunks. You have to use bisect or similar to track something down. To me, this is the biggest pain point.
So I think we should strive to get to a point where the code structure is "good enough" and then stop doing huge refactors. That way, as we move forward, changes will accumulate on that base and things will look more "normal" history-wise.
Agreed.
* Should there be a more cleanly defined boundary between the methods that marshal data (the top-level methods in objects/*.py) and those that talk to the persistence layer (also in objects/*.py)?
* Do we need to define (better) or document the layers/boundaries in the system?
Sure, if we can find a fairly clean split. I don't have this enough in my head to volunteer to do it, but I'll review it if someone else does :)
See above. My preference is to have create/update/delete operations be normal object methods and get operations be module-level functions with consistent names. No @classmethods or @staticmethods would be my preference. Best, -jay