patpy.datasets.onek1k

patpy.datasets.onek1k(kind='processed', overwrite=False, return_dataset_info=False)

OneK1K dataset.

The processed version was prepared with the standard scanpy pipeline. Cells annotated as “nan” were removed, and PCA, scVI, scANVI, and scPoli dimensionality reductions were applied. The dataset contains 1,248,980 cells and 3,000 features. The processed download is approximately 2.5 GB compressed and approximately 4 GB uncompressed.
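For planning memory use, the cell and feature counts above give a rough upper bound on the expression matrix if it were converted to a dense array. The sketch below is plain arithmetic and assumes 4 bytes per value (float32); the source does not state the stored dtype or whether the matrix is sparse.

```python
# Figures quoted in the dataset description.
N_CELLS = 1_248_980
N_FEATURES = 3_000

# Assumption: 4 bytes per value (float32). The actual dtype is not documented here.
BYTES_PER_VALUE = 4


def dense_size_bytes(n_rows: int, n_cols: int, bytes_per_value: int = BYTES_PER_VALUE) -> int:
    """Return the size in bytes of a dense n_rows x n_cols matrix."""
    return n_rows * n_cols * bytes_per_value


dense_bytes = dense_size_bytes(N_CELLS, N_FEATURES)
print(f"{dense_bytes / 1e9:.1f} GB if densified as float32")
```

Under that assumption the dense matrix would be about 15 GB, well above the ~4 GB download size, so it is worth checking how the matrix is stored before calling anything that densifies it.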

Parameters:
  • kind (Literal['raw', 'processed'] (default: 'processed')) – Which version of the dataset to load. Currently only "processed" is available; passing "raw" raises NotImplementedError.

  • overwrite (bool (default: False)) – If True, re-download the dataset even when a cached copy exists.

  • return_dataset_info (bool (default: False)) – If True, return a tuple (adata, DatasetInfo) instead of just adata.
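Because "raw" currently raises NotImplementedError, code that would prefer the raw data once it becomes available can fall back to the processed version. The following is a minimal sketch: `load_preferring_raw` is a hypothetical helper, not part of patpy, and it takes the loader as an argument so that it relies only on the `kind` behavior documented above.

```python
from typing import Any, Callable


def load_preferring_raw(loader: Callable[..., Any], **kwargs: Any) -> tuple[Any, str]:
    """Try kind="raw" first and fall back to kind="processed".

    Hypothetical helper. `loader` is assumed to behave like
    patpy.datasets.onek1k as documented: it accepts `kind` and raises
    NotImplementedError when that kind is not available.
    Returns the loader's result together with the kind that was used.
    """
    try:
        return loader(kind="raw", **kwargs), "raw"
    except NotImplementedError:
        # "raw" is not implemented yet, so use the processed version instead.
        return loader(kind="processed", **kwargs), "processed"


# Intended usage (requires patpy to be installed):
#   import patpy
#   adata, used_kind = load_preferring_raw(patpy.datasets.onek1k)
```

Other keyword arguments such as `overwrite` are forwarded unchanged to the loader.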

References

Yazar, S., Alquicira-Hernandez, J., Wing, K., Senabouth, A., Gordon, M. G., Andersen, S., … & Powell, J. E. (2022). Single-cell eQTL mapping identifies cell type–specific genetic control of autoimmune disease. Science, 376(6589), eabf3041. https://doi.org/10.1126/science.abf3041

https://onek1k.org/

Return type:

AnnData | tuple[AnnData, DatasetInfo]

Returns:

AnnData object of scRNA-seq profiles, optionally paired with a DatasetInfo describing the dataset’s standard schema.

Examples

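>>> import patpy
>>> # Load the processed dataset (the default kind)
>>> adata = patpy.datasets.onek1k()
>>> # Also return the DatasetInfo describing the dataset's standard schema
>>> adata, info = patpy.datasets.onek1k(return_dataset_info=True)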