On pre-commit, black, ruff, ruff-format and everything else $linter related
o/ Seeing as we've been doing quite a bit of experimentation on this topic in the SDK team recently, and it's come up in random discussions a few times over recent days, I thought it might be helpful to share a few tips and tricks on how our linting stored has evolved across SDK deliverables. Hopefully it's helpful to someone. Historically, teams have provided a `pep8` tox target to run their linters. Originally this would have run flake8 or the hacking wrapper, but over the years these targets have expanded to cover other linters like bandit, doc8 and mypy, as well as various flake8 plugins. In recent times, we have been slowly introducing use of the pre-commit tool [1]. Initially this was purely additive, as it's a nice help for users working in a terminal (like me!) and mostly duplicated what was defined in the `pep8` target. This duplication led to issues where pre-commit was doing slightly different things from the `pep8` target, so we have evolved this so that the `pep8` target now simply calls pre-commit. This is pretty easy to do now since most if not all of the linters in common use across OpenStack - flake8, hacking, doc8, bandit, etc. - provides pre-commit hooks. I recommend anyone using pre-commit does the same thing otherwise these thing can drift. For example, your pep8 target can become: [testenv:pep8] description = Run style checks. skip_install = true deps = pre-commit commands = pre-commit run --all-files --show-diff-on-failure We've since evolved this further to start changing the linters we've used in general. We first introduced black [2], the automatic code formatting tool. The rationale behind this was to remove the need to even think about code style and focus on the actual semantic change. We're already overburdened, and going back and forth on style issues is a waste of everyone's time. Contributors (particularly new ones) can simply install pre-commit or use the `pep8` tox target and their code will be automatically restyled in a generally non- offensive manner. Is it perfect? No, but it's close enough that we've decided we don't care. When we have introduced black, we have typically kept the changes introduced by black separate from the changes to e.g. the `.pre-commit- config.yaml` or `tox.ini` file. This allows us to introduce a `.git-blame- ignore-revs` file afterwards which some software forges (like GitHub, though sadly not Gittea (yet!)) can use to ignore these bulk update changes when using `git-blame`. You can also configure git locally to ignore this, and I've made an effort to place comments in the files whenever I've added them. In addition to black, we also added other linters like bandit [3] and pyupgrade [4], including these in pre-commit so that they would run automatically on commit and in CI. However, as anyone reading the orange site or surveying the broader Python landscape might be aware, the ruff [5] linter and formatter has been gaining steam and is pretty mature at this point. In recent weeks, we have been migrating pretty much all our linters and formatters - black, pyupgrade, bandit and (most of) flake8 - to ruff and ruff-format. We have kept hacking but only enable the H (hacking) ruleset plus any custom rules for now, but at some point we will explore getting these few hacking rules into ruff or writing them off. In any case, this migration has resulted in a significantly trimmer pre- commit-config.yaml file (and much faster runtime to boot). You can see the end result on a repo like e.g. osc-lib [6]. All in all, I think I can recommend this approach for anyone interested in pursuing it. Linters are far from the most important aspect of development, but I've found these tools to yield a nice improvement in developer experience. Top tips: * Have your 'pep8' target (and any other linter-related target run 'pre-commit' rather than the linters themselves), and don't forget to remove linters from test-requirements.txt once you do (with the exception of hacking if you have custom hacking rules) * Consider using ruff and ruff-format in place of all non-hacking linters/formatters: it's seriously fast and provides rules to cover most of these. * When applying ruff-format or the UP (pyupgrade) ruff rules, do it separately so you can add the commit ID to a .git-blame-ignore-revs file. You should be able to do this pre-merge since Gerrit/Zuul merge patches rather than rebasing them so the commit ID won't change. Other points: * When we first introduced pre-commit, some people noted that installing from Git rather than PyPI introduced a possible attack vector since tags can be changed. Given we only use a small, curated set of linters from trusted sources, I don't know how much of an issue this is but I'll raise it here. * Difficulty backporting code has been highlighted as an blocker for running a formatter like black or ruff-format. We don't do many backports in SDK, OSC etc. so this hasn't been a huge issue. If it were though, we would simply apply the formatter to the stable branch first before backporting. Hope this helps, Stephen [1] https://pre-commit.com/ [2] https://github.com/psf/black [3] https://github.com/PyCQA/bandit [4] https://github.com/asottile/pyupgrade [5] https://github.com/astral-sh/ruff [6] https://opendev.org/openstack/osc-lib/
On Thu, Sep 12, 2024, at 10:43 AM, Stephen Finucane wrote:
o/
Trimming this down as I'm mostly interested in the use of pre-commit.
In recent times, we have been slowly introducing use of the pre-commit tool [1]. Initially this was purely additive, as it's a nice help for users working in a terminal (like me!) and mostly duplicated what was defined in the `pep8` target. This duplication led to issues where pre-commit was doing slightly different things from the `pep8` target, so we have evolved this so that the `pep8` target now simply calls pre-commit. This is pretty easy to do now since most if not all of the linters in common use across OpenStack - flake8, hacking, doc8, bandit, etc. - provides pre-commit hooks. I recommend anyone using pre-commit does the same thing otherwise these thing can drift. For example, your pep8 target can become:
[testenv:pep8] description = Run style checks. skip_install = true deps = pre-commit commands = pre-commit run --all-files --show-diff-on-failure
One of my main concerns with pre-commit is that it didn't operate on tool packages (for example wheels from pypi) but instead with tooling source code and then built them all from source. This has a couple of downsides in a CI environment: it makes caching more difficult (we can't rely on our pypi cache) and you need to ensure you've got build toolchains for all the tools invoked by pre-commit. This is less of a concern when you can install these things once (or infrequently) on your local laptop. Has pre-commit improved on this at all? Quickly skimming the docs it seems that python hook targets must still "hook repository must be installable via pip install . (usually by either setup.py or pyproject.toml)". This is potentially extra problematic when you start adding in tools like Ruff that require a full rust toolchain to build. Suddenly we're replacing python slowness with language runtime bootstrapping and tool compilation time. In the ruff case it looks like they've worked around that not by fixing pre-commit but instead by creating a shim repo that will install the prebuilt wheels: https://github.com/astral-sh/ruff-pre-commit. Whatever works I guess. Just be aware that this is sort of a worst case scenario from an Internet is unreliable standpoint as now we need both GitHub and PyPI to be up and accessible rather than one or the other. This also has the downside of sidestepping constraints, but in the case of linters we've already largely done that so that projects don't have to update in lockstep. This doesn't really change the status quo there.
On Thu, 2024-09-12 at 11:17 -0700, Clark Boylan wrote:
On Thu, Sep 12, 2024, at 10:43 AM, Stephen Finucane wrote:
o/
Trimming this down as I'm mostly interested in the use of pre-commit.
In recent times, we have been slowly introducing use of the pre-commit tool [1]. Initially this was purely additive, as it's a nice help for users working in a terminal (like me!) and mostly duplicated what was defined in the `pep8` target. This duplication led to issues where pre-commit was doing slightly different things from the `pep8` target, so we have evolved this so that the `pep8` target now simply calls pre-commit. This is pretty easy to do now since most if not all of the linters in common use across OpenStack - flake8, hacking, doc8, bandit, etc. - provides pre-commit hooks. I recommend anyone using pre-commit does the same thing otherwise these thing can drift. For example, your pep8 target can become:
[testenv:pep8] description = Run style checks. skip_install = true deps = pre-commit commands = pre-commit run --all-files --show-diff-on-failure
One of my main concerns with pre-commit is that it didn't operate on tool packages (for example wheels from pypi) but instead with tooling source code and then built them all from source. This has a couple of downsides in a CI environment: it makes caching more difficult (we can't rely on our pypi cache) and you need to ensure you've got build toolchains for all the tools invoked by pre-commit. This is less of a concern when you can install these things once (or infrequently) on your local laptop. ya so locally this is effectively a non issue as there is some caching invovled
we have been using pre-commit in our ci now for a number of years. it has not been problemaitc to the stability or speed of the ci form what i have seen but it may have increased the total bandwith consumed. i expect that we are benefiting form the provider mirrors. we refence the repos by https urls to allow caching proxies to cache them if configured and use git tags to pin the version so it changes very in frequently.
Has pre-commit improved on this at all? Quickly skimming the docs it seems that python hook targets must still "hook repository must be installable via pip install . (usually by either setup.py or pyproject.toml)".
i dont think this has change but this has not been a painpoint based on my expirnce with using it both in openstack and in the recent work i have been doing in https://github.com/openstack-k8s-operators/ i proposed the introduction of precommit there 2 years ago https://github.com/openstack-k8s-operators/nova-operator/commit/78a8a7348f44... and we have never had any isssue with using hooks form git in ci or locally as a result of how it functions. it has been very stable for us and very fast. those repos are all written in go so the precise hooks we use are alittle different but it has been a very positive experience. https://github.com/openstack-k8s-operators/nova-operator/blob/main/.pre-comm...
This is potentially extra problematic when you start adding in tools like Ruff that require a full rust toolchain to build. Suddenly we're replacing python slowness with language runtime bootstrapping and tool compilation time. In the ruff case it looks like they've worked around that not by fixing pre-commit but instead by creating a shim repo that will install the prebuilt wheels: https://github.com/astral-sh/ruff-pre-commit. Whatever works I guess. Just be aware that this is sort of a worst case scenario from an Internet is unreliable standpoint as now we need both GitHub and PyPI to be up and accessible rather than one or the other.
on the rust point we already need the rust toolcain in some context because of pycryptography we normlly can avoid that by installign the prebuilt wheel in that case but not alwasy. for the most port we have provided partity in nova so that the existing tox envs can be used in the goglan operators we also provided a dual interface via either a make taget or precommit hook. in this case we had pre-commit local hook just run make if the ruff rust toolcahin was a concern we coudl have pre-commit call a tox target that ran it using a wheel or write our own hook. i will not that there are two general patterns for pre-commit, standalone repos that just provide the hook and in repo hooks. so the approch taken by https://github.com/astral-sh/ruff-pre-commit is actully one of the preferd ways to use pre-commit rather then somethign novel. hacking uses the in tree apprcoh https://github.com/openstack/hacking/blob/master/.pre-commit-hooks.yaml but i belive ```- .[pep257]``` could be repalced with ```- -c https://releases.openstack.org/constraints/upper/master .[pep257]``` or ```- -c https://releases.openstack.org/constraints/upper/master hacking[pep257]``` i have not actully tried that but i belive the additional_dependencies is appened to the pip install line directly so i think we can pass extra flags. https://github.com/pre-commit/pre-commit/blob/504149d2ca57f3115ccd4fa9c6ae69... if that does not work that would be a nice enhancement.
This also has the downside of sidestepping constraints, but in the case of linters we've already largely done that so that projects don't have to update in lockstep. This doesn't really change the status quo there.
i agree this largely does not change that aspect.
On Thu, 2024-09-12 at 18:43 +0100, Stephen Finucane wrote:
o/
Seeing as we've been doing quite a bit of experimentation on this topic in the SDK team recently, and it's come up in random discussions a few times over recent days, I thought it might be helpful to share a few tips and tricks on how our linting stored has evolved across SDK deliverables. Hopefully it's helpful to someone.
Historically, teams have provided a `pep8` tox target to run their linters. Originally this would have run flake8 or the hacking wrapper, but over the years these targets have expanded to cover other linters like bandit, doc8 and mypy, as well as various flake8 plugins.
In recent times, we have been slowly introducing use of the pre-commit tool [1]. Initially this was purely additive, as it's a nice help for users working in a terminal (like me!) and mostly duplicated what was defined in the `pep8` target. This duplication led to issues where pre-commit was doing slightly different things from the `pep8` target, so we have evolved this so that the `pep8` target now simply calls pre-commit. This is pretty easy to do now since most if not all of the linters in common use across OpenStack - flake8, hacking, doc8, bandit, etc. - provides pre-commit hooks. I recommend anyone using pre-commit does the same thing otherwise these thing can drift. For example, your pep8 target can become:
[testenv:pep8] description = Run style checks. skip_install = true deps = pre-commit commands = pre-commit run --all-files --show-diff-on-failure
We've since evolved this further to start changing the linters we've used in general. We first introduced black [2], the automatic code formatting tool. The rationale behind this was to remove the need to even think about code style and focus on the actual semantic change. We're already overburdened, and going back and forth on style issues is a waste of everyone's time. Contributors (particularly new ones) can simply install pre-commit or use the `pep8` tox target and their code will be automatically restyled in a generally non- offensive manner. Is it perfect? No, but it's close enough that we've decided we don't care. When we have introduced black, we have typically kept the changes introduced by black separate from the changes to e.g. the `.pre-commit- config.yaml` or `tox.ini` file. This allows us to introduce a `.git-blame- ignore-revs` file afterwards which some software forges (like GitHub, though sadly not Gittea (yet!)) can use to ignore these bulk update changes when using `git-blame`. You can also configure git locally to ignore this, and I've made an effort to place comments in the files whenever I've added them.
In addition to black, we also added other linters like bandit [3] and pyupgrade [4], including these in pre-commit so that they would run automatically on commit and in CI. However, as anyone reading the orange site or surveying the broader Python landscape might be aware, the ruff [5] linter and formatter has been gaining steam and is pretty mature at this point. In recent weeks, we have been migrating pretty much all our linters and formatters - black, pyupgrade, bandit and (most of) flake8 - to ruff and ruff-format. We have kept hacking but only enable the H (hacking) ruleset plus any custom rules for now, but at some point we will explore getting these few hacking rules into ruff or writing them off. In any case, this migration has resulted in a significantly trimmer pre- commit-config.yaml file (and much faster runtime to boot). You can see the end result on a repo like e.g. osc-lib [6].
All in all, I think I can recommend this approach for anyone interested in pursuing it. Linters are far from the most important aspect of development, but I've found these tools to yield a nice improvement in developer experience.
+1 i am very much in favor of doing this in all repo i contribute too. i strongly believe the benifts out way the downsides. After 18-24 months of doing development with pre-commit enabled repos and working in python, go and ansible, it was very jarring to work on nova and formatting to be a concern expressed in reviews. i literally had a working patch for a real bug proposed 3 months ago, 2 months later the first review feedback was mainly formatting related with 1 legitimate question that i was happy to answer. but that was enough of a barier that i just could not bring my self to work on it again until enough time had passed and i once again could find time to work on upstream bugs not related to any downstream priority or dalmation release prioty fixes like regression in 2024.2. i will eventually go fix it but if we had ruff-format or black in the nova 3 it would have been fix months ago. as a contributor and a reviewer formatting is the easisty thing to automate that would have the biggest day to day impact on my quality of life. I even considered running for PTL or TC this cycle with the aim to promote this type of change in our community but decided not to, mainly to prevent burnout, but also because i was hoping a conversation like this would be enough to instigate this change. Thank you for starting it.
Top tips:
* Have your 'pep8' target (and any other linter-related target run 'pre-commit' rather than the linters themselves), and don't forget to remove linters from test-requirements.txt once you do (with the exception of hacking if you have custom hacking rules) * Consider using ruff and ruff-format in place of all non-hacking linters/formatters: it's seriously fast and provides rules to cover most of these. * When applying ruff-format or the UP (pyupgrade) ruff rules, do it separately so you can add the commit ID to a .git-blame-ignore-revs file. You should be able to do this pre-merge since Gerrit/Zuul merge patches rather than rebasing them so the commit ID won't change.
Other points:
* When we first introduced pre-commit, some people noted that installing from Git rather than PyPI introduced a possible attack vector since tags can be changed. Given we only use a small, curated set of linters from trusted sources, I don't know how much of an issue this is but I'll raise it here. * Difficulty backporting code has been highlighted as an blocker for running a formatter like black or ruff-format. We don't do many backports in SDK, OSC etc. so this hasn't been a huge issue. If it were though, we would simply apply the formatter to the stable branch first before backporting.
+1 i don't accept the back-porting argument as justification not to do this personally. i understand the concern but i don't think that is enough prevent modernising the developer experience as someone who would be impacted by this for backports. even if i had to manually reformat thing i would prefer to pay that pain on every back-port if it meant we could have this functionality on master. after all it only need to be done once when moving form the branch with the formatter to the one without it.
Hope this helps, Stephen
[1] https://pre-commit.com/ [2] https://github.com/psf/black [3] https://github.com/PyCQA/bandit [4] https://github.com/asottile/pyupgrade [5] https://github.com/astral-sh/ruff [6] https://opendev.org/openstack/osc-lib/
participants (3)
-
Clark Boylan
-
smooney@redhat.com
-
Stephen Finucane