Contributing guide#

This document summarizes the most important information for getting started with contributing to this project. We assume that you are already familiar with git and with making pull requests on GitHub.

For more extensive tutorials that also cover the absolute basics, please refer to other resources such as the pyopensci tutorials, the scientific Python tutorials, or the scanpy developer guide.

Tip

The hatch project manager

We highly recommend familiarizing yourself with hatch. Hatch is a Python project manager that

manages virtual environments, separately for development, testing and building the documentation. Separating the environments is useful to avoid dependency conflicts.
allows you to run tests locally in different environments (e.g. different Python versions).
allows you to run tasks defined in pyproject.toml, e.g. to build documentation.

While the project is set up with hatch in mind, it is still possible to use different tools to manage dependencies, such as uv or pip.

Cloning the repository#

The tutorial notebooks under docs/tutorials/notebooks/ are not stored in this repository. They live in the separate lueckenlab/patpy_tutorials repository and are pulled in as a git submodule. When you clone patpy, make sure to fetch submodules too:

git clone --recurse-submodules git@github.com:lueckenlab/patpy.git

If you already cloned without --recurse-submodules, run this once inside your clone:

git submodule update --init --recursive

Without this step, the docs/tutorials/notebooks/ folder will be empty and hatch run docs:build will fail to render the tutorials.
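A quick way to confirm that the submodule was fetched is to look for notebook files in that folder. The following is only a minimal sketch: the helper name `notebooks_present` is hypothetical and not part of patpy.

```python
from pathlib import Path


def notebooks_present(repo_root: str) -> bool:
    """Return True if docs/tutorials/notebooks/ contains at least one .ipynb file."""
    notebooks_dir = Path(repo_root) / "docs" / "tutorials" / "notebooks"
    # An uninitialized submodule leaves an empty directory (or none at all).
    return notebooks_dir.is_dir() and any(notebooks_dir.rglob("*.ipynb"))
```

Calling `notebooks_present(".")` from the root of your clone should return `True` once `git submodule update --init --recursive` has run successfully.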

Installing dev dependencies#

In addition to the packages needed to use this package, you need additional Python packages to run tests and build the documentation.

On the command line, you typically interact with hatch through its command line interface (CLI). Running one of the following commands will automatically resolve the environments for testing and building the documentation in the background:

hatch test  # defined in the table [tool.hatch.envs.hatch-test] in pyproject.toml
hatch run docs:build  # defined in the table [tool.hatch.envs.docs]

When using an IDE such as VS Code, you’ll have to point the editor at the paths to the virtual environments manually. The environment you typically want to use as your main development environment is the hatch-test environment with the latest Python version.

To get a list of all environments for your project, run

hatch env show -i

This will list “Standalone” environments and a table of “Matrix” environments like the following:

+------------+---------+--------------------------+----------+---------------------------------+-------------+
| Name       | Type    | Envs                     | Features | Dependencies                    | Scripts     |
+------------+---------+--------------------------+----------+---------------------------------+-------------+
| hatch-test | virtual | hatch-test.py3.11-stable | dev      | coverage-enable-subprocess==1.0 | cov-combine |
|            |         | hatch-test.py3.14-stable | test     | coverage[toml]~=7.4             | cov-report  |
|            |         | hatch-test.py3.14-pre    |          | pytest-mock~=3.12               | run         |
|            |         |                          |          | pytest-randomly~=3.15           | run-cov     |
|            |         |                          |          | pytest-rerunfailures~=14.0      |             |
|            |         |                          |          | pytest-xdist[psutil]~=3.5       |             |
|            |         |                          |          | pytest~=8.1                     |             |
+------------+---------+--------------------------+----------+---------------------------------+-------------+

From the Envs column, select the environment name you want to use for development. In this example, it would be hatch-test.py3.14-stable.

Next, create the environment with

hatch env create hatch-test.py3.14-stable

Then, obtain the path to the environment using

hatch env find hatch-test.py3.14-stable

If you are using VS Code, open the command palette (Ctrl+Shift+P) and search for Python: Select Interpreter. Choose Enter Interpreter Path and paste the path to the virtual environment from above.

In the future, this may become easier through a hatch VS Code extension.
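`hatch env find` prints the directory of the environment, while some editors ask for the Python executable itself. Assuming the environment follows the standard virtual environment layout, the executable can be derived as in this sketch (the helper name `interpreter_path` and the example paths are hypothetical):

```python
from pathlib import PurePosixPath, PureWindowsPath


def interpreter_path(env_dir: str, windows: bool = False) -> str:
    """Return the Python executable inside a virtual environment directory."""
    if windows:
        # Windows virtual environments keep executables in Scripts\
        return str(PureWindowsPath(env_dir) / "Scripts" / "python.exe")
    # POSIX virtual environments keep executables in bin/
    return str(PurePosixPath(env_dir) / "bin" / "python")


# Example with a made-up environment location:
print(interpreter_path("/home/me/envs/hatch-test.py3.14-stable"))
# prints /home/me/envs/hatch-test.py3.14-stable/bin/python
```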

A popular choice for managing virtual environments is uv. The main disadvantage compared to hatch is that it supports only a single environment per project at a time, which requires you to mix the dependencies for running tests and building docs. This can have undesired side effects, such as forcing you to install a lower version of a library your project depends on only because an outdated sphinx plugin pins an older version.

To initialize a virtual environment in the .venv directory of your project, simply run

uv sync --all-extras

The .venv directory is typically automatically discovered by IDEs such as VS Code.

Pip is nowadays mostly superseded by environment managers such as hatch. However, for the sake of completeness, and since it’s ubiquitously available, we describe how you can manage environments manually using pip:

python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev,test,doc]"

The .venv directory is typically automatically discovered by IDEs such as VS Code.

Code-style#

This package uses pre-commit to enforce a consistent code style. On every commit, pre-commit checks will either automatically fix issues with the code or raise an error message.

To enable pre-commit locally, simply run

pre-commit install

in the root of the repository. Pre-commit will automatically download all dependencies when it is run for the first time.

Alternatively, you can rely on the pre-commit.ci service enabled on GitHub. If you didn’t run pre-commit before pushing changes to GitHub, it will automatically commit fixes to your pull request or show an error message.

If pre-commit.ci added a commit to a branch you are still working on locally, simply use

git pull --rebase

to integrate the changes into yours. While pre-commit.ci is useful, we strongly encourage installing and running pre-commit locally first to understand its usage.

Finally, most editors have an autoformat on save feature. Consider enabling this option for ruff and biome.

Writing tests#

This package uses pytest for automated testing. Please write tests for every function added to the package.
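As a rough illustration of what such a test can look like, the sketch below pairs a small function with two pytest-style tests. The function `normalize_counts` is hypothetical and only stands in for whatever you add to the package. pytest collects any function whose name starts with `test_` and reports a failure when an `assert` does not hold.

```python
def normalize_counts(counts: list[float]) -> list[float]:
    """Hypothetical function under test: scale values so they sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def test_normalize_counts_sums_to_one():
    result = normalize_counts([1.0, 1.0, 2.0])
    assert sum(result) == 1.0


def test_normalize_counts_preserves_proportions():
    # 1 : 1 : 2 becomes a quarter, a quarter and a half
    assert normalize_counts([1.0, 1.0, 2.0]) == [0.25, 0.25, 0.5]
```

One behavior per test function keeps failures easy to interpret.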

Most IDEs integrate with pytest and provide a GUI to run tests. Just point yours to one of the environments returned by

hatch env create hatch-test  # create test environments for all supported versions
hatch env find hatch-test  # list all possible test environment paths

Alternatively, you can run all tests from the command line by executing one of the following in the root of the repository:

# with hatch
hatch test  # test with the highest supported Python version
# or
hatch test --all  # test with all supported Python versions

# with uv
uv run pytest

# with pip (inside the activated virtual environment)
source .venv/bin/activate
pytest

Heavy dataset tests#

Tests marked with @pytest.mark.dataset download real datasets from Figshare (multi-GB) and are skipped by default — both locally and on every automatic CI run (PR, push to main, scheduled cron) — so day-to-day test runs stay fast. On CI they only run when the Test workflow is dispatched manually from the Actions tab (the workflow_dispatch trigger). Trigger one after changes to src/patpy/datasets/, or whenever you want to re-validate the Figshare downloads.

If you change anything in src/patpy/datasets/ (or otherwise want to validate the downloads) locally, opt in with --run-datasets:

# with hatch
hatch test -- --run-datasets

# with uv
uv run pytest --run-datasets

# with pip (inside the activated virtual environment)
source .venv/bin/activate
pytest --run-datasets

To avoid re-downloading on every run, point pytest at a persistent cache directory by exporting PATPY_TEST_DATASETDIR:

export PATPY_TEST_DATASETDIR="$HOME/.cache/patpy-test-datasets"
mkdir -p "$PATPY_TEST_DATASETDIR"
pytest --run-datasets -k test_combat   # for example

You can also restrict the run to a single dataset loader with -k (e.g. -k test_inflammation_atlas) so you don’t have to wait for every dataset to download.
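How the test suite reads PATPY_TEST_DATASETDIR is not shown in this guide. The sketch below shows one way a helper could resolve the download directory. The function name `dataset_cache_dir` and the temporary-directory fallback are assumptions, not patpy's actual implementation.

```python
import os
import tempfile
from pathlib import Path


def dataset_cache_dir() -> Path:
    """Return the directory dataset tests should download into."""
    configured = os.environ.get("PATPY_TEST_DATASETDIR")
    if configured:
        # Persistent cache chosen by the developer; create it if missing.
        path = Path(configured).expanduser()
        path.mkdir(parents=True, exist_ok=True)
        return path
    # Fallback (assumption): a fresh temporary directory, so nothing is reused.
    return Path(tempfile.mkdtemp(prefix="patpy-test-datasets-"))
```

With the variable exported, repeated runs resolve to the same directory, which is what makes the cache persistent.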

Continuous integration#

Continuous integration via GitHub Actions will automatically run the tests on all pull requests and test against the minimum and maximum supported Python version. Heavy @pytest.mark.dataset tests are skipped on every automatic run (PR, push to main, scheduled cron) because they would otherwise time out downloading multi-GB files. To run them on CI, dispatch the Test workflow manually from the Actions tab; see Heavy dataset tests above for the local opt-in workflow.

Additionally, there’s a CI job that tests against pre-releases of all dependencies (if there are any). The purpose of this check is to detect incompatibilities of new package versions early on and to give you time to fix the issue or reach out to the developers of the dependency before the package is released to a wider audience.

The CI job is defined in .github/workflows/test.yaml; however, the single point of truth for CI jobs is the Hatch test matrix defined in pyproject.toml. This means that local testing via hatch and remote testing on CI test against the same Python versions and use the same environments.
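To make the relation between the matrix and the environment names concrete, the sketch below expands a matrix into names. The matrix values are assumptions chosen to reproduce the example table from the section on installing dev dependencies, and the naming logic only approximates what hatch does.

```python
from itertools import product

# Mirrors what a TOML parser would return for a hypothetical
# [[tool.hatch.envs.hatch-test.matrix]] section; the values are assumptions.
MATRIX = [
    {"deps": ["stable"], "python": ["3.11", "3.14"]},
    {"deps": ["pre"], "python": ["3.14"]},
]


def expand_matrix(env_name: str, matrix: list[dict[str, list[str]]]) -> list[str]:
    """Approximate how matrix entries turn into environment names."""
    names = []
    for entry in matrix:
        # Every combination of Python version and dependency set is one environment.
        for python, deps in product(entry["python"], entry["deps"]):
            names.append(f"{env_name}.py{python}-{deps}")
    return names


print(expand_matrix("hatch-test", MATRIX))
# prints ['hatch-test.py3.11-stable', 'hatch-test.py3.14-stable', 'hatch-test.py3.14-pre']
```

Because both hatch and the CI workflow derive their jobs from the same matrix, adding a Python version there changes local and remote testing together.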

Publishing a release#

Updating the version number#

Before making a release, you need to update the version number in the pyproject.toml file. Please adhere to Semantic Versioning. In brief:

Given a version number MAJOR.MINOR.PATCH, increment the:

MAJOR version when you make incompatible API changes,

MINOR version when you add functionality in a backwards compatible manner, and

PATCH version when you make backwards compatible bug fixes.

Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
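The rules above can be sketched as a small function. This is an illustration only: the name `bump_version` is hypothetical, and pre-release or build metadata labels are not handled.

```python
def bump_version(version: str, part: str) -> str:
    """Increment one component of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        # Incompatible API change: lower components reset to zero.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backwards compatible feature: patch resets to zero.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backwards compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("1.4.2", "minor"))  # prints 1.5.0
```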

Once you are done, commit and push your changes and navigate to the “Releases” page of this project on GitHub. Specify vX.X.X as a tag name and create a release. For more information, see managing GitHub releases. This will automatically create a git tag and trigger a GitHub workflow that creates a release on PyPI.
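Before creating the release, it can help to double-check that the tag name agrees with the version in pyproject.toml. The following sketch assumes plain vMAJOR.MINOR.PATCH tags without pre-release labels; the helper name `tag_matches_version` is hypothetical.

```python
import re

# A tag such as v1.2.3: the letter v followed by three dot-separated numbers.
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def tag_matches_version(tag: str, version: str) -> bool:
    """Return True if the tag is 'v' followed by exactly the given version."""
    match = TAG_PATTERN.match(tag)
    return match is not None and ".".join(match.groups()) == version


print(tag_matches_version("v0.3.1", "0.3.1"))  # prints True
print(tag_matches_version("0.3.1", "0.3.1"))   # prints False
```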

Writing documentation#

Please write documentation for new or changed features and use-cases. This project uses sphinx with the following features:

The myst extension allows you to write documentation in Markdown/Markedly Structured Text
NumPy-style docstrings (through the napoleon extension).
Jupyter notebooks as tutorials through myst-nb (See Tutorials with myst-nb)
sphinx-autodoc-typehints, to automatically reference annotated input and output types
Citations (like [VBH+23]) can be included with sphinxcontrib-bibtex

See scanpy’s Documentation for more information on how to write your own.
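As a rough illustration of the docstring style, the sketch below documents a hypothetical function `scale`. Types appear only in the signature, on the assumption that sphinx-autodoc-typehints adds them to the rendered documentation.

```python
def scale(values: list[float], factor: float = 1.0) -> list[float]:
    """Multiply every value by a constant factor.

    Parameters
    ----------
    values
        Numbers to scale.
    factor
        Multiplier applied to each value.

    Returns
    -------
    A new list with the scaled values; the input is left unchanged.

    Examples
    --------
    >>> scale([1.0, 2.0], factor=2.0)
    [2.0, 4.0]
    """
    return [v * factor for v in values]
```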

Tutorials with myst-nb and jupyter notebooks#

The documentation is set up to render Jupyter notebooks stored in the docs/tutorials/notebooks/ directory using myst-nb. That directory is a git submodule of lueckenlab/patpy_tutorials; see Cloning the repository for the initial setup, and Editing tutorial notebooks below for the day-to-day workflow. Currently, only notebooks in .ipynb format are supported; they are included with both their input and output cells. It is your responsibility to update and re-run the notebook whenever necessary.

If you are interested in automatically running notebooks as part of the continuous integration, please check out this feature request in the cookiecutter-scverse repository.

Editing tutorial notebooks (submodule workflow)#

Tutorial notebooks live in lueckenlab/patpy_tutorials and are pinned in patpy at a specific commit. Because the notebooks live in a different repository, a single notebook change typically requires two pull requests — one in each repo:

A PR on lueckenlab/patpy_tutorials containing the actual notebook changes.
A PR on lueckenlab/patpy that bumps the submodule pointer to the new tutorials commit, so the patpy docs build (and Read the Docs preview) renders the updated notebooks.

The day-to-day commands look like this:

1. Edit and open a PR in the tutorials submodule. From the patpy repo root:

cd docs/tutorials/notebooks
git checkout -b my-tutorial-update     # submodules check out a detached HEAD by default
# ... edit notebooks ...
git add <file>.ipynb
git commit -m "Update <tutorial>"
git push -u origin my-tutorial-update
# then open a PR on lueckenlab/patpy_tutorials and get it reviewed/merged

2. Bump the pointer in your patpy PR. Back in the parent repo, fast-forward the submodule to the merged tutorials commit (or to your branch if you want a preview before the tutorials PR is merged) and commit the pointer bump on your patpy feature branch:

cd ../../..                                              # back to patpy repo root
git checkout my-feature-branch                           # your patpy PR branch
git submodule update --remote docs/tutorials/notebooks   # fast-forward to the new tutorials commit
git add docs/tutorials/notebooks
git commit -m "Bump tutorials submodule"
git push

git status in patpy will then show no submodule diff. Without this pointer bump, your patpy PR (and the Read the Docs preview built for it) will keep rendering the old notebook content even after the tutorials PR is merged.

Tip. Strictly speaking the two PRs can move in either order, but the cleanest sequence is: open and merge the tutorials PR first, then open the patpy PR with the pointer bump on top of it. That way the patpy PR’s RTD preview shows the final state.

Pulling tutorial updates from someone else. Run this from the patpy root:

git submodule update --init --recursive             # check out the commit pinned in patpy (also initializes the submodule)
git submodule update --remote docs/tutorials/notebooks  # move to the tip of the submodule's tracked remote branch, which may be newer than the pinned commit

Common gotcha. If git status in patpy shows modified: docs/tutorials/notebooks (new commits), you have local commits in the submodule that aren’t pushed yet, or you forgot the pointer bump described above.
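`git submodule status` reports the same situation through a one-character prefix on each line. The sketch below decodes that prefix. The function name is hypothetical, and it assumes the submodule path contains no spaces.

```python
# Meaning of the first character of each `git submodule status` line.
STATES = {
    " ": "in sync with the commit pinned in patpy",
    "+": "checked-out commit differs from the pinned commit",
    "-": "submodule not initialized",
    "U": "submodule has merge conflicts",
}


def parse_submodule_status(line: str) -> tuple[str, str, str]:
    """Split one line of `git submodule status` into (state, sha, path)."""
    # Do not strip the line first: a leading space is a meaningful prefix.
    prefix, rest = line[0], line[1:]
    sha, path = rest.split()[:2]
    return STATES[prefix], sha, path
```

A `+` prefix corresponds to the `(new commits)` message in `git status`.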

Hints#

If you refer to objects from other packages, please add an entry to intersphinx_mapping in docs/conf.py. Only if you do so can sphinx automatically create a link to the external documentation.
If building the documentation fails because of a missing link that is outside your control, you can add an entry to the nitpick_ignore list in docs/conf.py.
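Both settings are plain Python values in docs/conf.py. The fragment below sketches their shape; the mapped package and the ignored target are illustrative assumptions, not the project's actual configuration.

```python
# Fragment of docs/conf.py (illustrative values only).
intersphinx_mapping = {
    # name -> (base URL of the external docs, inventory location or None for the default)
    "numpy": ("https://numpy.org/doc/stable/", None),
}

nitpick_ignore = [
    # (reference type, target) pairs that Sphinx should not warn about
    ("py:class", "some_package.SomeUndocumentedType"),
]
```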

Building the docs locally#

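# with hatch
hatch run docs:build
hatch run docs:open

# with uv
cd docs
uv run sphinx-build -M html . _build -W
open _build/html/index.html  # macOS; use xdg-open on Linux

# with pip (inside the activated virtual environment)
source .venv/bin/activate
cd docs
sphinx-build -M html . _build -W
open _build/html/index.html  # macOS; use xdg-open on Linux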