Efficient Pre-Commit Hooks with GitHub Actions

Pre-commit is a great tool for running various sanity checks (formatting, linting) on the code base. However, such scanning may be time-consuming (particularly on certain content like notebooks) which hits both user experience and billing for CI/CD (minutes are usually paid, except for public repos or very small projects).

Below, I demonstrate how to effectively optimize running pre-commit on GitHub Actions. The key is to cache both the pre-commit package and dependent hooks. Note that, as of now (April 2023), pre-commit native caching does only the second part. Fortunately, managing its cache is as simple as calling the GitHub Cache Action on ~/.cache/pre-commit.

name: pre-commit

on:
  push:
    branches: [experiments]
  pull_request:
    branches: [experiments, main]
  workflow_dispatch:

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    - uses: actions/setup-python@v4
      with:
        python-version: 3.7
    - name: cache pre-commit deps
      id: cache_pre_commit
      uses: actions/cache@v3
      env:
          cache-name: cache-pre-commit
      with:
        path: |
          .pre_commit_venv 
          ~/.cache/pre-commit
        key: ${{ env.cache-name }}-${{ hashFiles('.pre-commit-config.yaml','~/.cache/pre-commit/*') }}
    - name: install pre-commit
      if: steps.cache_pre_commit.outputs.cache-hit != 'true'
      run: |
        python -m venv .pre_commit_venv
        . .pre_commit_venv/bin/activate
        pip install --upgrade pip
        pip install pre-commit
        pre-commit install --install-hooks
        pre-commit gc
    - name: run pre-commit hooks
      run: |
        . .pre_commit_venv/bin/activate  
        pre-commit run --color=always --all-files

Published by mskorski

Scientist, Consultant, Learning Enthusiast

Leave a comment

Your email address will not be published. Required fields are marked *