Re: [placement][ptg] code refactoring constraints and goals

10 Apr 2019

      On 04/08/2019 01:05 PM, Eric Fried wrote:
...
On 4/8/19 10:43 AM, Chris Dent wrote:
...
    * How (and should) we further split and shrink
objects/resource_provider.py
To be specific here:
objects/resource_provider.py is currently ~2KLOC. That's not *horrible*
(see below), and if it really didn't lend itself to being further
subdivided, I would say we shouldn't force it just for the sake of
making that number smaller.
However, IMO, this file logically deals with several things:
* Resource provider inventory/capacity/usage
* Resource provider membership in aggregates
* Traits on resource providers
* Creating, updating, and deleting resource providers
* Finding resource providers in various ways
If you look at the structure of the file (in its current order - see
below) these groupings are pretty clear.
Note the presence of "resource provider" in each bullet above. We
already have separate modules for dealing with e.g. CRUD of traits in
their own right. Though the mapping isn't 1:1 with the above list.
So we have the following options, I think:
(1) Do nothing. Leave objects/resource_provider.py as is.
(2) Move X-related stuff into the base X.py module, so like move things
related to "Traits on resource providers" into objects/trait.py. This
level of shuffle would probably entail things like creating a new module
for aggregate.py
(3) Split the above into separate modules while preserving their
resource_provider-ness. From the perspective of code consuming the
modules, this would look like:
from placement.objects import resource_provider as rp_crud
from placement.objects.resource_provider import trait as rp_trait
from placement.objects.resource_provider import search as rp_search
from placement.objects import trait
...
# Do CRUD-y stuff with resource providers
rp = rp_crud.ResourceProvider(...)
rp.create()
# Do trait stuff
t = trait.Trait(...)
t.create()
# Do trait-y stuff with resource providers
rp_trait.add_traits_to_provider(rp, [t])
# Do search-y stuff with resource providers
rp_search.get_all_by_filters(...)
To me, this is nice and intuitive.
Meh :) I actually think option 1 (Do nothing) is best here. We went 
through a bunch of churn for the (excellent) refactoring for OVO-ectomy 
and splitting the resource_provider.py file out into the other related 
object/*.py files.

I don't really see a huge value in further splitting the resource 
provider functions out into separate modules with a 
/placement/objects/resource_provider/ module. I actually think this 
version of the above code is more readable:

from placement.objects import resource_provider
from placement.objects import trait
...
# Do CRUD-y stuff with resource providers
rp = resource_provider.ResourceProvider(...)
rp.create()
# Do trait stuff
t = trait.Trait(...)
t.create()
# Do trait-y stuff with resource providers
rp.add_traits([t])
# Do search-y stuff with resource providers
resource_provider.get_all_by_filters(...)

Note that the "add_traits()" is a method on the ResourceProvider object 
in the above. I think this is more readable for create/update/delete 
operations. For read operations, I prefer a consistently-named set of 
get_by_XXX and get_all_by_XXX methods in module-level scope as opposed 
to @classmethods on an object.
...
...
* Is it worth reordering methods to some kind of arbitrary ordering?
My preference on this: keep related methods/classes/etc. near each other
in the file.
Pros:
* Intuitive.
* Increases the probability that, when I'm looking at a method that
calls another method, both methods will fit on my screen at the same
time. (Workaround: multiple editor windows to the same file.)
Cons:
* Sometimes there's no sensible grouping, which makes this hard to
define and therefore hard to follow (and hard to enforce, if that's a
thing we want to do).
* Ability to know exactly where to look to find something. (Workaround:
Use a real IDE, where you can do a project-wide search for a symbol,
hotkey from a symbol's use to its definition, etc. Also you can choose
to view a module's skeleton in its real order *or* alphabetically.)
My preference is to have related methods grouped together as well, but I 
don't consider this a priority work item.
...
...
* If a method or module is "long" (by some definition) does that
   make it bad?
Not to the point where we should do something like implement sonar.
Having had experience with that, it just ain't worth the hassle.
However, there is such a thing as too long. Certain nova libvirt modules
come to mind. For those of us who do choose to use real IDEs, these can
take a long time to process (in-band syntax checking, symbol correlation
highlighting, git blame, etc.). I'm going to go out on a limb and say we
should strive really hard never to have a module in the 5-digit LOCs.
And it would be really nice to stay under 2KLOC, but only if it doesn't
feel forced.
Agreed.
...
...
* Where is the balance between backport hurting churn and keeping
   the mix fresh?
I'm not sure "fresh" is the heaviest hitter on the plus side. Making the
code organized and crisp and usable (from a developer standpoint) and
maintainable is the main thing to me.
On the minus side, in addition to backport pain, there's the difficulty
tracking down the change that did a thing. Run git blame today and it
looks like Chris wrote virtually the whole thing in a small number of
massive monolithic chunks. You have to use bisect or similar to track
something down. To me, this is the biggest pain point.
So I think we should strive to get to a point where the code structure
is "good enough" and then stop doing huge refactors. That way, as we
move forward, changes will accumulate on that base and things will look
more "normal" history-wise.
Agreed.
...
...
* Should there be a more cleanly defined boundary between the
   methods that marshal data (the top-level methods in objects/*.py)
   and those that talk to the persistence layer (also in
   objects/*.py)?
    * Do we need to define (better) or document the layers/boundaries
       in the system?
Sure, if we can find a fairly clean split. I don't have this enough in
my head to volunteer to do it, but I'll review it if someone else does :)
See above. My preference is to have create/update/delete operations be 
normal object methods and get operations be module-level functions with 
consistent names. No @classmethods or @staticmethods would be my preference.

Best,
-jay

Re: [placement][ptg] code refactoring constraints and goals

Jay Pipes