On 4/8/19 10:43 AM, Chris Dent wrote:
* How (and should) we further split and shrink objects/resource_provider.py
To be specific here: objects/resource_provider.py is currently ~2KLOC.
That's not *horrible* (see below), and if it really didn't lend itself
to being further subdivided, I would say we shouldn't force it just for
the sake of making that number smaller. However, IMO, this file
logically deals with several things:

* Resource provider inventory/capacity/usage
* Resource provider membership in aggregates
* Traits on resource providers
* Creating, updating, and deleting resource providers
* Finding resource providers in various ways

If you look at the structure of the file (in its current order - see
below), these groupings are pretty clear. Note the presence of
"resource provider" in each bullet above. We already have separate
modules for dealing with, e.g., CRUD of traits in their own right,
though the mapping isn't 1:1 with the above list. So we have the
following options, I think:

(1) Do nothing. Leave objects/resource_provider.py as is.

(2) Move X-related stuff into the base X.py module - e.g., move things
related to "Traits on resource providers" into objects/trait.py. This
level of shuffle would probably entail things like creating a new
module for aggregate.py.

(3) Split the above into separate modules while preserving their
resource_provider-ness. From the perspective of code consuming the
modules, this would look like:

    from placement.objects import resource_provider as rp_crud
    from placement.objects.resource_provider import trait as rp_trait
    from placement.objects.resource_provider import search as rp_search
    from placement.objects import trait
    ...
    # Do CRUD-y stuff with resource providers
    rp = rp_crud.ResourceProvider(...)
    rp.create()
    # Do trait stuff
    t = trait.Trait(...)
    t.create()
    # Do trait-y stuff with resource providers
    rp_trait.add_traits_to_provider(rp, [t])
    # Do search-y stuff with resource providers
    rp_search.get_all_by_filters(...)

To me, this is nice and intuitive. The issue is the module hierarchy.
As written above, it would entail putting the CRUD-y stuff into
objects/resource_provider/__init__.py, which some folks don't like. I
suppose we could have objects/resource_provider/crud.py, but that
doesn't match the pattern for other things like objects/trait.py.

Discuss.
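To make option (3) concrete, here's a minimal, self-contained sketch of the package layout it implies: the CRUD-y code in the package's __init__.py, with trait-related helpers in a submodule. The module, class, and function names below are illustrative stand-ins, not the real placement code:

```python
import os
import sys
import tempfile
import textwrap

# Build a throwaway "objects/resource_provider" package on disk so the
# import pattern from option (3) can be exercised end to end.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "objects", "resource_provider")
os.makedirs(pkg)
open(os.path.join(root, "objects", "__init__.py"), "w").close()

# CRUD-y stuff lives at the top of the package, in __init__.py.
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write(textwrap.dedent("""\
        class ResourceProvider:
            def create(self):
                return "created"
    """))

# Trait-y stuff with resource providers lives in a submodule.
with open(os.path.join(pkg, "trait.py"), "w") as f:
    f.write(textwrap.dedent("""\
        def add_traits_to_provider(rp, traits):
            return len(traits)
    """))

sys.path.insert(0, root)
from objects import resource_provider as rp_crud
from objects.resource_provider import trait as rp_trait

rp = rp_crud.ResourceProvider()
print(rp.create())                                        # -> created
print(rp_trait.add_traits_to_provider(rp, ["SOME_TRAIT"]))  # -> 1
```

The point of the demo is just that `import resource_provider as rp_crud` and `from resource_provider import trait` coexist naturally; whether the top-level code belongs in __init__.py or a crud.py is exactly the open question.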
* Is it worth reordering methods to some kind of arbitrary ordering?
My preference on this: keep related methods/classes/etc. near each
other in the file.

Pros:
* Intuitive.
* Increases the probability that, when I'm looking at a method that
  calls another method, both methods will fit on my screen at the same
  time. (Workaround: multiple editor windows on the same file.)

Cons:
* Sometimes there's no sensible grouping, which makes this hard to
  define and therefore hard to follow (and hard to enforce, if that's
  a thing we want to do).
* No way to know exactly where to look to find something. (Workaround:
  use a real IDE, where you can do a project-wide search for a symbol,
  hotkey from a symbol's use to its definition, etc. You can also
  choose to view a module's skeleton in its real order *or*
  alphabetically.)
* If a method or module is "long" (by some definition) does that make it bad?
Not to the point where we should do something like implement sonar. Having had experience with that, it just ain't worth the hassle. However, there is such a thing as too long. Certain nova libvirt modules come to mind. For those of us who do choose to use real IDEs, these can take a long time to process (in-band syntax checking, symbol correlation highlighting, git blame, etc.). I'm going to go out on a limb and say we should strive really hard never to have a module in the 5-digit LOCs. And it would be really nice to stay under 2KLOC, but only if it doesn't feel forced.
* Where is the balance between backport hurting churn and keeping the mix fresh?
I'm not sure "fresh" is the heaviest hitter on the plus side. Making
the code organized and crisp and usable (from a developer standpoint)
and maintainable is the main thing to me.

On the minus side, in addition to backport pain, there's the
difficulty of tracking down the change that did a thing. Run git blame
today and it looks like Chris wrote virtually the whole thing in a
small number of massive monolithic chunks. You have to use bisect or
similar to track something down. To me, this is the biggest pain
point.

So I think we should strive to get to a point where the code structure
is "good enough" and then stop doing huge refactors. That way, as we
move forward, changes will accumulate on that base and things will
look more "normal" history-wise.
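One partial mitigation for the blame pain, not raised in the thread but worth noting: git (2.23+) can skip known bulk-refactor commits via --ignore-revs-file. A throwaway demo with invented file names and authors:

```shell
#!/bin/sh
set -e
cd "$(mktemp -d)"
git init -q repo && cd repo
git config user.name alice
git config user.email alice@example.com

# Original code, written by alice.
printf 'def get_all():\n  pass\n' > rp.py
git add rp.py && git commit -qm 'original code'

# Later, a purely mechanical refactor (reindent) by bob.
git config user.name bob
git config user.email bob@example.com
printf 'def get_all():\n    pass\n' > rp.py
git add rp.py && git commit -qm 'reindent everything'
# Record the refactor commit so blame can skip over it.
git rev-parse HEAD > .git-blame-ignore-revs

git blame -L 2,2 rp.py                                           # blames bob
git blame --ignore-revs-file .git-blame-ignore-revs -L 2,2 rp.py # blames alice
```

This doesn't remove the need to land refactors carefully, but it keeps `git blame` pointing at the commit that actually introduced the logic rather than the monolithic reshuffle.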
* Should there be a more cleanly defined boundary between the methods that marshal data (the top-level methods in objects/*.py) and those that talk to the persistence layer (also in objects/*.py)?
* Do we need to define (better) or document the layers/boundaries in the system?
Sure, if we can find a fairly clean split. I don't have this enough in my head to volunteer to do it, but I'll review it if someone else does :)
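Not volunteering either, but to make the question concrete, here's one hypothetical shape the boundary could take (every name below is invented for illustration; the real placement code differs): underscored helpers own the persistence layer and return plain rows, while the public object methods do all the marshalling and never touch storage directly.

```python
# Persistence layer: stubbed with a dict here in place of real SQL.
# Functions at this layer return plain dicts, never objects.
_FAKE_DB = {"uuid-1": {"uuid": "uuid-1", "name": "cn1", "generation": 3}}

def _provider_get_by_uuid(context, uuid):
    """Fetch a row from storage as a plain dict."""
    return dict(_FAKE_DB[uuid])

# Marshalling layer: builds objects from plain rows; no SQL below here.
class ResourceProvider:
    def __init__(self, context, uuid=None, name=None, generation=None):
        self._context = context
        self.uuid = uuid
        self.name = name
        self.generation = generation

    @classmethod
    def get_by_uuid(cls, context, uuid):
        row = _provider_get_by_uuid(context, uuid)
        return cls(context, **row)

rp = ResourceProvider.get_by_uuid(None, "uuid-1")
print(rp.name, rp.generation)  # -> cn1 3
```

If a rule like "only underscored functions may issue queries, and they only traffic in rows" held everywhere, the layer boundary would be visible right in the names.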
* How can we encourage a similar refactor/revisit-often attitude to the docs?
Fostering the culture that way, as we're doing. Bringing it up
periodically. Having a "doc sweep day" where members of the/a
(sub)team volunteer to each read a chunk of doc/code thoroughly?

efried