Changelog#

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

0.16.6#

Fixed#

  • Dependencies for patpy[milo-edger] amd patpy[de]. Previously, they allowed python 3.11, but this version is not supported by pertpy and caused problems on installation

0.16.5#

Added#

  • Interface to PILOT_GMA_VAE sample representation method at patpy.tl.sample_representation.PILOTGMVAE

Fixed#

  • Added missing method references to the API docs

0.16.4#

Added#

  • patpy.datasets.combat_stephenson: harmonized COMBAT + Stephenson COVID-19 PBMC cohort (251 samples / 1,399,435 cells / 1,856 genes) with shared cell_coarse_aligned annotations and joint PCA/scVI/scANVI embeddings.

  • patpy.datasets.inflammation_atlas: inflammation atlas cross-study dataset. Added only scANVI latents. 3 splits (main / external / validation) selected through a split argument, each with its own DatasetInfo (cell_type_key is Level1 for main, Level1pred for external/validation).

Changed#

  • Turn off running dataset tests manually, they can now only be triggered manually

Fixed#

  • Add locking scanpy test file to prevent concurrent reading and failing tests: https://github.com/scverse/scanpy/issues/4097

0.16.3#

Fixed#

  • Downloading COMBAT meta adata did not work due to incorrect link for the programmatic access

0.16.2#

Changed#

  • Move tutorials to submodule. Adapted the CI process and updated contributor information

0.16.1#

Added#

  • evaluate_representation(..., method="permanova") for categorical outcomes: PERMANOVA pseudo-F (single factor, same decomposition as vegan::adonis2) with permutation p-values in pure Python

Changed#

  • Moved distances caching from anndata object to the instance of the sample representation class

  • Renamed dataset loaders combat_preprocessed, hlca_preprocessed, onek1k_preprocessed, stephenson_preprocessed, ticatlas_preprocessed to combat, hlca, onek1k, stephenson, ticatlas (breaking; no deprecation aliases).

Added#

  • kind: Literal["raw", "processed"] = "processed" argument on every dataset loader. Currently only the processed variants are wired up; kind="raw" raises NotImplementedError until raw URLs are provided.

  • load_metadata: bool = False argument on patpy.datasets.combat that additionally downloads the ~4 MB sample-metadata AnnData (Figshare ID 64291092) and returns it as an extra tuple element. Combined with return_dataset_info, the return order is (adata, meta_adata, info).

Deleted#

  • DISTANCES_UNS_KEY from sample representation methods

  • combat_preprocessed, hlca_preprocessed, onek1k_preprocessed, stephenson_preprocessed, ticatlas_preprocessed (use combat, hlca, onek1k, stephenson, ticatlas instead).

0.16.0 – datasets module#

Added#

  • datasets module with functions to download preprocessed datasets

  • patpy/datasets/_datasets/DatasetInfo data class to define information about datasets

  • patpy/datasets/_datasets/_load_dataset util function

  • patpy/datasets/_datasets/combat_preprocessed dataset

  • patpy/datasets/_datasets/stephenson_preprocessed

  • patpy/datasets/_datasets/hlca_preprocessed dataset

  • patpy/datasets/_datasets/onek1k_preprocessed dataset

  • patpy/datasets/_datasets/ticatlas_preprocessed dataset

  • Tests for loading datasets tests/test_datasets.py

  • Github workflow at .github/workflows/test.yaml with caching to not download datasets each time

0.15.4#

Fixed#

  • patpy.tl.evaluation.replicate_robustness normalisation: the metric now reaches 0 in the worst case (replicate is the farthest neighbour for every sample).

Added#

  • Tests for replicate_robustness covering best, worst, and intermediate configurations.

  • Expanded docstring for replicate_robustness.

0.15.3#

Fixed#

  • The n_neighbors and reverse_technical_score parameters in patpy.tl.evaluation/knn_prediction_score were not used

Added#

  • Tests for the fixed bugs

0.15.2#

Deleted#

  • Git dependencies pulsar and pascient to fix PyOI integration, changed installation instructions

0.15.1#

Added — patpy.tl.condition_utils#

New module providing utilities for running any pertpy differential method across all pairwise contrasts of a multi-dimensional condition space.

  • run_condition_combinations(model_cls, adata, condition_cols, **kwargs) — one-liner that takes any pertpy class with a compare_groups classmethod, enumerates all observed pairwise condition contrasts, runs the model for each, and returns a concatenated DataFrame with a "contrast" column.

  • ConditionComparison(model_cls, **defaults) — thin class wrapper around the above; stores the model class and default kwargs so the same configuration can be reused across multiple datasets or condition axes via .run().

  • build_condition_combinations(adata, condition_cols) — returns a DataFrame of all observed (not Cartesian) combinations of multiple condition columns, with a joined "label" column.

  • build_all_pairwise_contrasts(adata, condition_cols) — returns a list of {group, baseline, label} dicts for every pairwise contrast of observed condition combinations.

  • filter_adata_to_conditions(adata, condition_col, groups) — subsets an AnnData to cells belonging to specific condition groups.

0.15.0#

Added#

  • PaSCient method wrapper (tl/supervised/PaSCient) for training and fine-tuning PaSCent foundational model

  • Example of running paSCient to docs/notebooks/supervised_methods_example.ipynb

0.14.1#

Added#

  • Sparse matrix support in sample representation methods

  • Tests with input layers containing sparse matrices

0.14.0#

Added#

  • CLR-transformation to the composition baseline to bridge it with SETA: https://www.bioconductor.org/packages//release/bioc/html/SETA.html

0.13.0#

Added#

  • SupervisedSampleMethod base class (tl/_base_sample_method.py) providing a shared scaffold for unsupervised and supervised sample-level methods.

  • MixMIL wrapper (tl/supervised/_mixmil.py) for the attention-based multi-instance mixed model by Engelmann et al. 2024 (https://arxiv.org/abs/2311.02455).

  • PULSAR wrapper (tl/supervised/_pulsar.py) for the zero-shot foundation model by Pang et al. 2025 (https://doi.org/10.1101/2025.11.24.685470).

  • Tests for all supervised methods in tests/test_supervised_methods.py, including fixtures with deterministic mock backends (no network access or GPU required), multi-label MixMIL tests, and PULSAR linear probe tests.

  • Base class for sample methods: (tl/_base_sample_method/BaseSampleMethod)

  • fit_linear_probe() method for sample-level methods

  • fine_tune() method for supervised sample-level methods with linear probing as a default

  • predict() method for supervised sample-level methods

  • States for sample-level methods with _check_adata_loaded() and _check_fitted()

  • Tests for supervised methods

Changed#

  • Both SupervisedSampleMethod and SampleRepresentationMethod now inherit basic functionality from BaseSampleMethod

0.12.0#

Added#

  • Foundational model interface with helical at pp/basic.py

  • Tests for helical embeddings

0.11.4#

Added#

  • Tests for all sample representation methods

  • Tests for preprocessing functions in pp/basic.py

  • Tests for evaluation utilities in tl/evaluation.py

  • conftest.py with reusable fixtures

Fixed#

  • prepare_data_for_phemd now handles dense matrices in addition to sparse ones

0.11.3#

Added#

  • An utils function _remove_negative_distances

Changed#

  • In Python implementations of GloScope, remove negative distances

0.11.2#

Added#

  • GloScope tutorial

Changed#

  • Update the rpy2 interface for R implementation of GloScope

0.11.1#

Fixed#

  • Fix bug in tl/sample_representation/GloScope_py with always accessing layer in obsm instead of a general slot

0.11.0#

Added#

  • Function tl/evaluation/trajectory_correlation to compute a corresponding SPARE metric

  • Function tl/evaluation/knn_prediction_score to compute a corresponding SPARE metric

  • Function tl/evaluation/replicate_robustness to compute a corresponding SPARE metric

  • Utils function tl/evaluation/_get_col_from_adata

  • Utils funciton tl/evaluation/_identity_up_to_suffix

0.10.0#

Added#

  • GloScope_py sample representation method (reimplementation of the original GloScope in Python for CPU and GPU)

Changed#

  • GloScope.calculate_distance_matrix now returns a NumPy array instead of a pandas DataFrame

0.9.3#

Changed#

  • Update rpy2 conversion in Gloscope.prepare_anndata()

0.9.2#

Changed#

  • Update readme with an overview and pypi link

0.9.1#

Changed#

  • Install PILOT and DiffusionEMD from PyPI, not GitHub

  • Fix actions and update documentation

0.9.0#

Changed#

  • GitHub actions files to match an updated scverse cookiecutter template

  • Breaking! Rename wherever possible: patient_representation -> patpy

  • Breaking! Rename tl.basic.py to tl.sample_representation

0.8.0#

Added#

  • persistence_evaluation method in patient_representation.tl.evaluation

  • Persistent homology file src/patient_representation/tl/persistence.py

0.7.2#

Changed#

  • Fix typo: patient_representations -> sample_representation in correlation functions

0.7.1#

Changed#

  • Fixed typo in GloScope causing empty distance matrix

0.7.0#

Added#

  • GloScope sample representation method (interface to R package via rpy2)

  • conda environment for gloscope

Changed#

  • GloScope R script now accepts n_workers argument

0.6.1#

Changed#

  • Use layers instead of obsm to store layer data in _move_layer_to_X method

0.6.0#

Changed#

  • Use cell_group_key instead of cell_type_key in MOFA and _get_pseudobulk

  • Use sample_representation instead of patient_representation in MOFA

0.5.0#

Deleted#

  • Remove mandatory filtering of cell types in and small samples in prepare_anndata method of SampleRepresentationMethod descendants

Changed#

  • Rerun example notebook with updated API

  • Add minor comments to the example notebook

0.4.0 – Synthetic data generation#

Added#

  • Functions to generate synthetic data simulating disease severity in src/datasets/synthetic.py

  • Synthetic data generation example notebook: docs/notebooks/synthetic_data_generation.ipynb

  • plot_embedding method for sample representations now accepts custom axes

0.3.0#

Sample representation refactoring:#

  • “cell type” is renamed to “cell group” everywhere to be more general

  • Some representation methods are renamed accordingly:

    • CellTypesComposition -> CellGroupComposition

    • CellTypePseudobulk -> GroupedPseudobulk

    • TotalPseudobulk -> Pseudobulk

  • patient_representation argument is renamed to sample_representation

  • “Patient representation” is now renamed to “Sample representation” eveywhere

  • The base class is now called SampleRepresentationMethod instead of PatientRepresentationMethod. This is important only for developers, users shouldn’t use it anyway

Deleted#

  • Not used SCellBow class

  • Example notebook in the documentation

0.2.0#

Added#

  • Warning about ongoing development in README

  • Function correlate_composition to the tools

  • Function correlate_cell_type_expression to the tools

  • Function correlation_volcano to the plotting

  • Patients trajectory example notebook

Changed#

  • Rename patient_representation to patpy