patpy.datasets.hlca

patpy.datasets.hlca(kind='processed', overwrite=False, return_dataset_info=False)

Human Lung Cell Atlas (HLCA) dataset.

The processed version was prepared with the standard scanpy pipeline; cells annotated as “nan” were removed; and PCA, scVI, scANVI, and scPoli dimensionality reductions were applied. The dataset contains 1,687,127 cells and 3,000 features. The processed download is approximately 3 GB compressed and approximately 6.5 GB unzipped.
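
Given the sizes quoted above, it can be worth checking free disk space before the first call. The following is a minimal sketch and not part of patpy: the helper name has_room_for_hlca, the 9.5 GB budget (compressed archive plus unzipped data, assuming both sit on disk at the same time), and the convention of 1 GB = 10**9 bytes are all assumptions. Where patpy caches its downloads is not stated here, so the directory to check is left to the caller.

```python
import shutil

# Assumed budget: ~3 GB archive + ~6.5 GB unzipped, if both coexist on disk.
HLCA_REQUIRED_GB = 3.0 + 6.5


def has_room_for_hlca(directory=".", required_gb=HLCA_REQUIRED_GB):
    """Return True if the filesystem holding `directory` has enough free space.

    Hypothetical helper (not a patpy function); 1 GB is taken as 10**9 bytes.
    """
    free_bytes = shutil.disk_usage(directory).free
    return free_bytes >= required_gb * 10**9
```

Pass the directory where the download is expected to land; the default checks the current working directory only.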

Parameters:
  • kind (Literal['raw', 'processed'] (default: 'processed')) – Either "processed" (default) or "raw". Currently only "processed" is available; "raw" raises NotImplementedError.

  • overwrite (bool (default: False)) – If True, re-download the dataset even when a cached copy exists.

  • return_dataset_info (bool (default: False)) – If True, return a tuple (adata, DatasetInfo) instead of just adata.
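
Because kind="raw" currently raises NotImplementedError, calling code that would prefer the raw data once it becomes available may want a fallback. The sketch below is one way to do that, under the assumption that the loader behaves as documented above; load_with_fallback is a hypothetical helper, not patpy API, and it takes the loader as an argument so it can be exercised without triggering a download.

```python
def load_with_fallback(loader, kind="raw", **kwargs):
    """Call loader(kind=kind, ...); retry with "processed" if that kind is not implemented.

    `loader` is expected to be a callable with the signature documented above,
    for example patpy.datasets.hlca. Hypothetical helper, not part of patpy.
    """
    try:
        return loader(kind=kind, **kwargs)
    except NotImplementedError:
        if kind == "processed":
            raise  # nothing left to fall back to
        return loader(kind="processed", **kwargs)
```

A call such as load_with_fallback(patpy.datasets.hlca, kind="raw") would then return the processed dataset today; any extra keyword arguments (overwrite, return_dataset_info) are forwarded unchanged.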

References

Sikkema, L., Ramírez-Suástegui, C., Strobl, D. C., Gillett, T. E., Zappia, L., Madissoon, E., … & Theis, F. J. (2023). An integrated cell atlas of the lung in health and disease. Nature Medicine, 29(6), 1563–1577. https://doi.org/10.1038/s41591-023-02327-2

Return type:

AnnData | tuple[AnnData, DatasetInfo]

Returns:

AnnData object of scRNA-seq profiles, optionally paired with a DatasetInfo describing the dataset’s standard schema.
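
Since the return shape depends on return_dataset_info, downstream code that accepts either form can normalise it first. This is a minimal sketch under the assumption that the data object itself is never a tuple (true for AnnData); unpack_dataset is a hypothetical name, not something patpy provides.

```python
def unpack_dataset(result):
    """Return (adata, info); info is None when only the data object was returned.

    Hypothetical helper: accepts either return shape described above.
    """
    if isinstance(result, tuple):
        adata, info = result
        return adata, info
    return result, None
```

With this, unpack_dataset(patpy.datasets.hlca()) and unpack_dataset(patpy.datasets.hlca(return_dataset_info=True)) both yield a two-element tuple.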

Examples

>>> import patpy
>>> adata = patpy.datasets.hlca()
>>> adata, info = patpy.datasets.hlca(return_dataset_info=True)