scgenome.tl.cluster_cells

scgenome.tl.cluster_cells(adata, layer_name='copy', method='kmeans_bic', min_k=2, max_k=100, cell_ids=None, bin_ids=None, standardize=False)

Cluster cells by copy number.

Parameters:
  • adata (AnnData) – copy number data

  • layer_name (str, optional) – layer with copy number data to cluster, None for X, by default ‘copy’

  • method (str, optional) – clustering method, by default ‘kmeans_bic’

  • min_k (int, optional) – minimum number of clusters, by default 2

  • max_k (int, optional) – maximum number of clusters, by default 100

  • cell_ids (str, optional) – subset of cells to cluster, by default None

  • bin_ids (str, optional) – subset of bins to cluster, by default None

  • standardize (bool, optional) – standardize the data prior to clustering, by default False
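
The roles of min_k and max_k under a BIC-based method can be illustrated with a minimal sketch. This is not scgenome's implementation: the helper names (kmeans, bic_score, select_k) are hypothetical, the k-means is a plain NumPy Lloyd's loop, and the BIC assumes spherical Gaussian clusters with one shared variance. It only shows the general idea of fitting each k in [min_k, max_k] and keeping the k with the lowest BIC.

```python
import numpy as np


def _assign(X, centers):
    """Label each row with the index of its nearest center."""
    dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (illustrative only); returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = _assign(X, centers)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return _assign(X, centers), centers


def bic_score(X, labels, centers):
    """BIC under an assumed spherical, shared-variance Gaussian model."""
    n, d = X.shape
    k = len(centers)
    rss = ((X - centers[labels]) ** 2).sum()
    var = max(rss / (n * d), 1e-6)  # floor avoids log(0) on perfect fits
    counts = np.bincount(labels, minlength=k)
    nonempty = counts[counts > 0]
    log_lik = (
        -0.5 * n * d * np.log(2 * np.pi * var)
        - rss / (2 * var)
        + (nonempty * np.log(nonempty / n)).sum()
    )
    n_params = k * d + (k - 1) + 1  # centers, mixing weights, variance
    return -2 * log_lik + n_params * np.log(n)


def select_k(X, min_k=2, max_k=100):
    """Fit every k in [min_k, max_k] and return the lowest-BIC result."""
    max_k = min(max_k, len(X))  # cannot have more clusters than rows
    best = None
    for k in range(min_k, max_k + 1):
        labels, centers = kmeans(X, k)
        score = bic_score(X, labels, centers)
        if best is None or score < best[0]:
            best = (score, k, labels)
    return best[1], best[2]
```

On very small inputs a criterion like this tends to favor large k, because a near-perfect fit drives the variance estimate toward zero, so tightening max_k (as the example below does with max_k=3) is a reasonable precaution.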

Returns:

copy number data with additional cluster_id and cluster_size columns

Return type:

AnnData
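
As a rough illustration of how the two returned columns relate, the sketch below builds a stand-in per-cell table with pandas and derives cluster_size from cluster_id. It assumes cluster_size is the number of cells sharing a cell's cluster_id, which the description above suggests but does not state; the obs table here is hypothetical and is not produced by scgenome.

```python
import pandas as pd

# Hypothetical per-cell table standing in for adata.obs after clustering.
obs = pd.DataFrame({"cluster_id": [0, 2, 1, 0]}, index=["0", "1", "2", "3"])

# Assumed meaning of cluster_size: the number of cells in each cell's cluster,
# broadcast back to every row of that cluster.
obs["cluster_size"] = obs.groupby("cluster_id")["cluster_id"].transform("count")

print(obs)
```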

Examples

>>> import scgenome
>>> import anndata as ad
>>> import numpy as np
>>> adata = ad.AnnData(np.array([
...    [3, 3, 3, 6, 6],
...    [1, 1, 1, 2, 2],
...    [1, 22, 1, 2, 2],
...    [1, 3, 3, 5, 5],
... ]).astype(np.float32))
>>> adata = scgenome.tl.cluster_cells(adata, layer_name=None, max_k=3)
>>> adata.obs['cluster_id']
0    0
1    2
2    1
3    0
Name: cluster_id, dtype: category
Categories (3, int64): [0, 1, 2]