Single-Cell Whole Genome Analysis in Python.
API#
Import scgenome as:
import scgenome
Preprocessing: pp#
Data loading and pre-processing functionality.
Data loading#
|
Read hmmcopy results from the DLP pipeline. |
|
Convert hmmcopy pandas dataframes to AnnData |
|
Convert signals pandas dataframes to AnnData |
|
Count reads in bins from BAM files |
|
Read SNV genotyping into an AnnData |
Filtering#
|
Filter poor quality cells based on the filters provided. |
|
Calculate additional filtering metrics to be used by other filtering methods. |
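As a rough illustration of threshold-based cell filtering, the sketch below keeps only the cells whose quality score meets a cutoff. It uses plain NumPy and is not scgenome's own implementation; the function name `filter_cells_by_quality`, the `quality` array, and the 0.75 cutoff are all assumptions made for the example.

```python
import numpy as np


def filter_cells_by_quality(cn, quality, min_quality=0.75):
    """Keep rows (cells) of `cn` whose quality score is at least `min_quality`.

    Hypothetical helper for illustration only; the threshold is an assumption.
    """
    cn = np.asarray(cn)
    quality = np.asarray(quality, dtype=float)
    keep = quality >= min_quality  # boolean mask, one entry per cell
    return cn[keep], keep


# Toy cells-by-bins copy number matrix and one quality score per cell.
cn = np.array([[2, 2, 3],
               [2, 1, 2],
               [4, 4, 4]])
quality = np.array([0.9, 0.4, 0.8])

filtered, keep = filter_cells_by_quality(cn, quality)
```

Returning the mask alongside the filtered matrix makes it possible to apply the same selection to any per-cell annotations.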
Tools: tl#
Any transformation of the data matrix that is not preprocessing. In contrast to a preprocessing function, a tool usually adds an easily interpretable annotation to the data matrix, which can then be visualized with a corresponding plotting function.
Clustering#
|
Cluster cells by copy number. |
|
Aggregate hmmcopy copy number by cluster to create cluster CN matrix |
Aggregate copy number by cluster to create cluster CN matrix |
|
Sort cells by hierarchical clustering on copy number values. |
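To make the idea of a cluster CN matrix concrete, the following sketch collapses a cells-by-bins matrix into one row per cluster. It is a minimal NumPy example, not the library's code: the helper name `aggregate_by_cluster` is hypothetical, and taking the per-bin median is an assumed choice of summary statistic.

```python
import numpy as np


def aggregate_by_cluster(cn, cluster_ids):
    """Summarize copy number per cluster with the per-bin median.

    Hypothetical helper; the median is an assumed aggregation statistic.
    """
    cn = np.asarray(cn, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    labels = np.unique(cluster_ids)  # sorted unique cluster labels
    aggregated = np.vstack(
        [np.median(cn[cluster_ids == label], axis=0) for label in labels]
    )
    return labels, aggregated


# Five cells, three bins, two clusters.
cn = np.array([[2, 2, 3],
               [2, 2, 3],
               [2, 3, 3],
               [4, 4, 1],
               [4, 4, 2]])
cluster_ids = np.array([0, 0, 0, 1, 1])

labels, cluster_cn = aggregate_by_cluster(cn, cluster_ids)
```

With an even number of cells in a cluster, the median can be fractional (here 1.5 in the last bin of cluster 1), so a rounding step may be wanted if integer copy number states are required.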
Embeddings#
|
Cluster cells by copy number. |
|
Compute PCA loadings matrix |
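For readers unfamiliar with PCA loadings, the sketch below derives them from a cells-by-bins matrix with a singular value decomposition. It is written against NumPy only and does not reproduce scgenome's implementation. Here "loadings" means the unscaled principal axes (one column per component), which is an assumption about the convention; the function name `pca_loadings` and the random toy data are also illustrative.

```python
import numpy as np


def pca_loadings(cn, n_components=2):
    """Return per-bin PCA loadings and the variance explained by each component.

    Hypothetical helper; loadings are the unscaled principal axes.
    """
    cn = np.asarray(cn, dtype=float)
    centered = cn - cn.mean(axis=0)  # center each bin across cells
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components].T  # shape: (n_bins, n_components)
    explained_variance = singular_values[:n_components] ** 2 / (cn.shape[0] - 1)
    return loadings, explained_variance


# Toy data: 20 cells, 6 bins, integer copy number states between 1 and 4.
rng = np.random.default_rng(0)
cn = rng.integers(1, 5, size=(20, 6))

loadings, explained_variance = pca_loadings(cn)
```

Each column of `loadings` has unit length and the columns are mutually orthogonal, so projecting the centered matrix onto them gives the per-cell component scores.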
Generating binned data#
|
Create a regular binning of the genome |
|
Count GC in each bin |
|
Count GC in each bin |
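The two steps in this group, regular binning and per-bin GC counting, can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than the library's code: the function names `create_bins` and `gc_fraction`, the half-open coordinate convention, and the in-memory toy genome are all hypothetical.

```python
def create_bins(chrom_lengths, bin_size):
    """Return (chrom, start, end) bins with 0-based, half-open coordinates.

    The last bin of each chromosome may be shorter than `bin_size`.
    """
    bins = []
    for chrom, length in chrom_lengths.items():
        for start in range(0, length, bin_size):
            bins.append((chrom, start, min(start + bin_size, length)))
    return bins


def gc_fraction(sequence):
    """Fraction of G/C among called bases (A, C, G, T); NaN if none are called."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / called


# Toy genome held in memory; a real workflow would read a reference FASTA.
genome = {"chr1": "ACGTGGCCAT", "chr2": "NNNNAT"}

bins = create_bins({chrom: len(seq) for chrom, seq in genome.items()}, bin_size=4)
gc = [gc_fraction(genome[chrom][start:end]) for chrom, start, end in bins]
```

Excluding `N` bases from the denominator keeps partially masked bins comparable with fully sequenced ones; bins with no called bases yield NaN so they can be filtered out later.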
Gene regions#
|
Read an Ensembl GTF and extract gene start and end positions |
|
Aggregate copy number by gene to create gene CN matrix |
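Extracting gene start and end positions from a GTF file could look roughly like the sketch below. It relies only on the standard nine-column, tab-separated GTF layout and the Python standard library; the function name `read_gene_regions`, the returned dictionary keys, and the toy records are assumptions for the example and do not reflect scgenome's implementation.

```python
import re

# Toy GTF content; a real Ensembl GTF would be read from a file instead.
GTF_TEXT = "\n".join([
    "#!genome-build toy",
    "\t".join(["1", "toy", "gene", "100", "500", ".", "+", ".",
               'gene_id "G1"; gene_name "ALPHA";']),
    "\t".join(["1", "toy", "exon", "100", "200", ".", "+", ".",
               'gene_id "G1"; exon_number "1";']),
    "\t".join(["2", "toy", "gene", "50", "80", ".", "-", ".",
               'gene_id "G2"; gene_name "BETA";']),
])


def read_gene_regions(gtf_text):
    """Return one record per `gene` feature with its chromosome, start, and end.

    GTF coordinates are 1-based and inclusive; they are kept unchanged here.
    """
    genes = []
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) != 9 or fields[2] != "gene":
            continue  # keep gene-level records only
        # The attribute column holds `key "value";` pairs.
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
        genes.append({
            "chrom": fields[0],
            "start": int(fields[3]),
            "end": int(fields[4]),
            "gene_id": attrs.get("gene_id"),
            "gene_name": attrs.get("gene_name"),
        })
    return genes


genes = read_gene_regions(GTF_TEXT)
```

Gene intervals in this form can then be intersected with genomic bins to summarize copy number per gene, keeping in mind the difference between 1-based inclusive GTF coordinates and any 0-based bin coordinates.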
Phylogenetics#
|
|
|
Anndata Manipulation#
|
Concatenate a list of AnnData objects along obs (cells) |
Plotting: pl#
The plotting module scgenome.pl largely parallels the tl.* and a few of the pp.* functions.
For most tools and for some preprocessing functions, you’ll find a plotting function with the same name.
Note
TODO: more plotting functions matching tools
Copy number profiles and heatmaps#
|
Plot scatter points of copy number across the genome or a chromosome. |
|
Plot a copy number matrix |
|
Plot a copy number matrix |
|
Plot scatter points of gc by read count. |
Phylogenetics#
|
Plot a tree aligned to a CN values matrix heatmap |