Poetry in GitLab

This weekend, I had occasion to build a new Python-based utility and leaned on my existing poetry tooling to do so. While starting the new project, I wanted to take advantage of some GitLab automation I'd previously used on other projects, so I figured I'd document it here.

Tooling overview for automation

The purpose of the GitLab automation here is to go from a feature branch to a new release without having to do any of the work myself.

I'm using a bunch of tools to achieve this:

  • poetry for dependency management and packaging
  • commitizen for enforcing conventional commits and managing release notes
  • pytest for test running and reporting
  • tox for test automation in multiple language versions (currently 3.10 and 3.11)

And, for good measure, I'll mention GitLab (the Pro version) for source repository and CI/CD, and JetBrains PyCharm, which I use as my IDE most of the time.

Automating the poetry delivery pipeline

Once I've got the project building and tests running, then I want to start rolling it out in versions. I first established this pipeline for another command-line tool (certalerter, my alerting tool for certlogger), so adapting for a new project should be straightforward.

I'm going to elide the coding and testing and stick to the automation for this post, and mostly do it by going through my .gitlab-ci.yml file a bit at a time.

Overall workflow

I've broken the workflow into 5 stages and limited it to running in three cases: merge requests, commits to the main branch when not in a merge request, and new tags (mostly to handle releases).

I'm running all of this in docker (possibly k8s, but I haven't specifically enabled that yet). My Python is pretty clean and I haven't had any problems with portability.

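    workflow:
      rules:
        - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
        # skip branch pipelines while a merge request is open (condition assumed from the description above)
        - if: '$CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS'
          when: never
        - if: '$CI_COMMIT_BRANCH'
        - if: $CI_COMMIT_TAG

    default:
      image: python:3.11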

There are 5 stages, most of which are pretty straightforward:

    stages:
      - build
      - test
      - bump
      - package
      - release

Building the project

Building the project is pretty straightforward: load in poetry to get our environment and then let it build. I've chosen to capture the distribution binaries (whl and tar.gz files) in the artifacts paths so that they don't need to be rebuilt for the testing phase. I'm not using the PyPI repository from GitLab yet, because I don't want every build to be kept there, but that's addressed in the package phase later.

    build-job:
      stage: build
      script:
        - pip install poetry
        - poetry build
      artifacts:
        paths:
          - dist/ct_nagios_plugins*.whl
          - dist/ct_nagios_plugins*.tar.gz
        expire_in: 1 week
      interruptible: true

Testing and coverage

In order to enable pushing automatically to production, I feel it's necessary to have well-maintained test suites. As such, code is tested on every commit (and in all major environments) and coverage is tracked so I can see when the tests are back-sliding.

Each of the test environments is started in the appropriate Python image, and then tox and the coverage tools are installed so that we don't need to install poetry. Since the goal here is to create a stand-alone package, I want to take care not to introduce any unintended poetry dependencies.

The funky grep/sed/awk bit teases the overall line rate out of the coverage XML file and prints it as a percentage, which GitLab then picks up with the coverage regex. The || true at the end of it ensures that being unable to get coverage through this method doesn't spoil the stage.

Finally, the test logs (junit-*.xml) and coverage reports (coverage-*.xml) are stored as artifacts.

    test:
      needs:
        - build-job
      parallel:
        matrix:
          - PYTHON_VERSION: "3.11"
            TOXENV: py311
          - PYTHON_VERSION: "3.10"
            TOXENV: py310
      image: python:${PYTHON_VERSION}
      stage: test
      script:
        - pip install tox coverage
        - tox --installpkg dist/*.whl
        - coverage xml -o coverage-${PYTHON_VERSION}.xml
        # printf keeps two decimals so whole-number percentages still match the coverage regex
        - >
          grep ^\<coverage coverage-${PYTHON_VERSION}.xml
          | sed -n -e 's/.*line-rate=\"\([0-9.]*\)\".*/\1/p'
          | awk '{printf "CodeCoverageOverall =%.2f\n", $1*100}'
          || true
      interruptible: true
      artifacts:
        reports:
          junit: junit-*.xml
          coverage_report:
            coverage_format: cobertura
            path: coverage-*.xml
      coverage: '/^CodeCoverageOverall =(\d+\.\d+)$/'
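The coverage extraction can be tried outside of CI. The following is a minimal sketch, assuming a cut-down Cobertura file with only a few attributes (real reports from coverage carry many more); the extract_coverage helper name and the sample numbers are made up for illustration. It uses printf so that the output always has a decimal part, which is what the coverage regex expects.

```shell
#!/bin/sh
# Hypothetical helper wrapping the grep/sed/awk pipeline from the test job.
extract_coverage() {
  grep '^<coverage' "$1" \
    | sed -n -e 's/.*line-rate="\([0-9.]*\)".*/\1/p' \
    | awk '{printf "CodeCoverageOverall =%.2f\n", $1*100}'
}

# Minimal stand-in for a Cobertura report (sample values, not real results).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version="1.0" ?>
<coverage version="7.2" lines-valid="200" lines-covered="170" line-rate="0.85" branch-rate="0">
</coverage>
EOF

extract_coverage "$sample"   # prints: CodeCoverageOverall =85.00
```

With a plain print instead of printf, a line rate of 0.85 would come out as 85, which a regex requiring digits on both sides of a decimal point would not match.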

Bumping versions

In addition to the previously-mentioned rules for running, the version bump is very selective. It will only run on commits to main where bump is not part of the commit message. This (hopefully) prevents it from running twice without need. It also should stop loops.
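To make the loop-prevention reasoning concrete, here is a small shell sketch of the decision the bump rule encodes. It is only an emulation for illustration: in the pipeline the check lives in the job's rules, and the should_bump function, the branch name check, and the sample commit messages are assumptions rather than anything GitLab provides.

```shell
#!/bin/sh
# Hypothetical emulation of the bump rule: run only on main, and only
# when the commit message does not mention "bump".
should_bump() {
  branch=$1
  message=$2
  [ "$branch" = "main" ] || return 1
  case "$message" in
    *bump*) return 1 ;;   # the commit written by cz itself lands here
  esac
  return 0
}

for msg in "feat: add a new check" "bump: version 1.1.0 to 1.2.0"; do
  if should_bump main "$msg"; then
    echo "would bump after: $msg"
  else
    echo "would skip after: $msg"
  fi
done
```

Because the commit pushed by the bump job carries a bump message, the second pipeline it triggers skips the bump stage, and the cycle ends there.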

Note the use of CI_BUMP_TOKEN here, which is a Personal Access Token (PAT) for GitLab that has permissions to read_repository and write_repository so that it can be used to write back to the repo. When I tried this originally, I expected to be able to commit back to my own repo, but ran into trouble, so using the PAT here makes that straightforward. The CI_BUMP_GITLAB_ID is probably not necessary, as __token__ should suffice.

Using poetry and cz here guarantees that all of the expected steps run, but it also results in the above requirements: the CI_JOB_TOKEN doesn't have write_repository permission, so it can't push the bump commit back. If I weren't committing back, but just setting a tag or release, I could easily do that with the API.

In my case, I use a PAT dedicated to this repository, so that I can limit the blast radius. I'd be happier if there were a way to request a read/write CI_JOB_TOKEN for certain stages, but even if that were available, it's not clear how that would be governed effectively without giving all stages in the pipeline access.

    # job name and rule conditions are illustrative, reconstructed from the description above
    bump-job:
      needs:
        - test
      stage: bump
      variables:
        # need to clean in case tagging is screwy, since `git clean` doesn't know to remove tags
        GIT_STRATEGY: clone
      script:
        - pip install poetry
        - poetry install
        - git config --global user.email "${GITLAB_USER_EMAIL}"
        - git config --global user.name "${GITLAB_USER_NAME}"
        - exit_code=0
        - poetry run cz bump --annotated-tag --changelog || exit_code=$?
        - echo "$exit_code is exit code ; $? was result"
        - |
          if [ $exit_code -eq 0 ]
          then
            git remote set-url origin ${CI_SERVER_PROTOCOL}://${CI_BUMP_GITLAB_ID}:${CI_BUMP_TOKEN}@${CI_SERVER_HOST}/${CI_PROJECT_PATH}
            git push origin --follow-tags HEAD:${CI_COMMIT_BRANCH}
          elif [ $exit_code -eq 21 ]
          then
            echo "Skipping push with no version change"
          elif [ $exit_code -eq 3 ]
          then
            echo "Skipping push with no commits"
          else
            echo "cz error code $exit_code"
            exit $exit_code
          fi
      rules:
        - if: $CI_COMMIT_TAG
          when: never
        # skip on bump, because you'll never bump after bump
        - if: '$CI_COMMIT_BRANCH == "main" && $CI_COMMIT_MESSAGE !~ /bump/'
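The exit-code handling in the bump script can be exercised on its own. This is a minimal sketch that only prints what the job would do rather than pushing anything; the decide_push function name is made up, and the codes 0, 21 and 3 are the ones the job checks for (success, no version change, and no commits, going by its messages).

```shell
#!/bin/sh
# Hypothetical helper: map a cz bump exit code to the action the job takes.
decide_push() {
  case "$1" in
    0)  echo "push" ;;
    21) echo "skip: no version change" ;;
    3)  echo "skip: no commits" ;;
    *)  echo "fail: cz error code $1"
        return "$1" ;;
  esac
}

decide_push 0    # prints: push
decide_push 21   # prints: skip: no version change
```

Treating 21 and 3 as a clean skip is what keeps the pipeline green when a merge contains nothing that warrants a new version, while any other non-zero code still fails the job.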

Packaging the job

As with the bump stage, the package stage runs only at specific times. In particular, it will only run directly following a bump commit on the main branch.

Theoretically, I could use poetry and then make use of the publish command, but in this case, twine is fine (and dedicated).

    # job name is illustrative
    package-job:
      stage: package
      script:
        - pip install twine
        - TWINE_PASSWORD=${CI_JOB_TOKEN} TWINE_USERNAME=gitlab-ci-token python -m twine upload --verbose --disable-progress-bar --repository-url ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/pypi dist/*
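For reference, the repository URL that twine receives is just the project's package registry endpoint assembled from two predefined CI variables. The sketch below uses stand-in values (the host name and project ID are made up) to show what the expanded URL looks like; inside a real job GitLab sets both variables itself.

```shell
#!/bin/sh
# Stand-in values for illustration only; GitLab provides these in a job.
CI_API_V4_URL="https://gitlab.example.com/api/v4"
CI_PROJECT_ID="1234"

# Same expression as the --repository-url argument in the package job.
upload_url="${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/packages/pypi"
echo "twine uploads to: ${upload_url}"
```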

Finishing off the release

The final phase, which only happens on tagged commits, is to create the release. GitLab makes this easy by directly supporting the release process in the CI file.

    # job name is illustrative
    release-job:
      stage: release
      image: registry.gitlab.com/gitlab-org/release-cli:latest
      rules:
        - if: $CI_COMMIT_TAG
      script:
        - echo "Running the release job for $CI_COMMIT_TAG."
        - "awk '/^## Unreleased/ { next } ; /^## / { r++ ; if ( r <2) { print ; next } else { exit } }; /^/ { print } ;' < CHANGELOG.md >INCREMENTAL_CHANGELOG.md"
      release:
        tag_name: $CI_COMMIT_TAG
        name: 'v$CI_COMMIT_TAG'
        description: INCREMENTAL_CHANGELOG.md
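The awk one-liner in the release job is easier to follow with a sample file. This sketch runs the same awk program against a made-up CHANGELOG.md (the versions, dates and entries are invented, and the layout with "## " headings per version is assumed); it drops the "## Unreleased" heading, keeps everything up to and including the newest version's section, and stops at the second version heading.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Invented changelog in the assumed layout: one "## " heading per version.
cat > "$workdir/CHANGELOG.md" <<'EOF'
## Unreleased

## 1.2.0 (2023-05-01)

### Feat

- add a new check

## 1.1.0 (2023-04-01)

### Fix

- handle empty input
EOF

# Same awk program as in the release job.
awk '/^## Unreleased/ { next } ; /^## / { r++ ; if ( r <2) { print ; next } else { exit } }; /^/ { print } ;' \
  < "$workdir/CHANGELOG.md" > "$workdir/INCREMENTAL_CHANGELOG.md"

cat "$workdir/INCREMENTAL_CHANGELOG.md"
```

The resulting file holds only the 1.2.0 section, so the release description shows the notes for the tag being released rather than the whole history.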