Single-Cell Whole Genome Analysis in Python.

API

Import scgenome as:

import scgenome

Preprocessing: `pp`

Data loading and pre-processing functionality.

Data loading

`pp.read_dlp_hmmcopy`(alignment_results_dir, ...)	Read hmmcopy results from the DLP pipeline.
`pp.convert_dlp_hmmcopy`(metrics_data, cn_data)	Convert hmmcopy pandas dataframes to anndata
`pp.convert_dlp_signals`(hscn, metrics_data)	Convert signals pandas dataframes to anndata
`pp.read_bam_bin_counts`(bins, bams[, excluded])	Count reads in bins from bams
`pp.read_snv_genotyping`(filename)	Read SNV genotyping into an AnnData

Filtering

`pp.filter_cells`(adata[, filters, inplace])	Filter poor quality cells based on the filters provided.
`pp.calculate_filter_metrics`(adata[, ...])	Calculate additional filtering metrics to be used by other filtering methods.
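As a rough illustration of what threshold-based cell filtering involves, the sketch below uses plain Python rather than scgenome itself. The metric names (`quality`, `total_reads`) and the threshold values are hypothetical, chosen for the example and not taken from `pp.filter_cells`.

```python
# Hypothetical per-cell QC metrics; names and values are illustrative only.
cells = {
    "cell_a": {"quality": 0.92, "total_reads": 1_200_000},
    "cell_b": {"quality": 0.40, "total_reads": 900_000},
    "cell_c": {"quality": 0.88, "total_reads": 50_000},
}

# Assumed minimum value required for each metric.
min_thresholds = {"quality": 0.75, "total_reads": 500_000}


def passes_filters(metrics, thresholds):
    """Return True if every metric meets or exceeds its minimum threshold."""
    return all(metrics[name] >= minimum for name, minimum in thresholds.items())


# Keep only the cells that pass every filter.
kept_cells = [
    cell_id for cell_id, metrics in cells.items()
    if passes_filters(metrics, min_thresholds)
]
print(kept_cells)
```

A cell is dropped as soon as any single metric falls below its threshold, which is one common convention; the library may combine filters differently.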

Tools: `tl`

Any transformation of the data matrix that is not preprocessing. In contrast to a preprocessing function, a tool usually adds an easily interpretable annotation to the data matrix, which can then be visualized with a corresponding plotting function.

Clustering

`tl.cluster_cells`(adata[, layer_name, ...])	Cluster cells by copy number.
`tl.aggregate_clusters_hmmcopy`(adata)	Aggregate hmmcopy copy number by cluster to create cluster CN matrix
`tl.aggregate_clusters`(adata[, agg_X, ...])	Aggregate copy number by cluster to create cluster CN matrix
`tl.sort_cells`(adata[, layer_name, cell_ids, ...])	Sort cells by hierarchical clustering on copy number values.
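To make the idea of a cluster CN matrix concrete, here is a minimal sketch in plain Python that takes the per-bin median copy number across the cells of each cluster. The data, the cluster labels, and the choice of median are assumptions for illustration; this is not the implementation behind `tl.aggregate_clusters`.

```python
from collections import defaultdict
from statistics import median

# Toy copy number states: one list of per-bin values for each cell.
copy_number = {
    "cell_a": [2, 2, 4, 1],
    "cell_b": [2, 2, 4, 1],
    "cell_c": [2, 3, 2, 2],
    "cell_d": [2, 3, 2, 2],
    "cell_e": [2, 3, 3, 2],
}
# Hypothetical cluster assignment for each cell.
cluster_of = {"cell_a": 0, "cell_b": 0, "cell_c": 1, "cell_d": 1, "cell_e": 1}


def aggregate_by_cluster(cn, labels):
    """Return {cluster: per-bin median copy number across member cells}."""
    members = defaultdict(list)
    for cell_id, profile in cn.items():
        members[labels[cell_id]].append(profile)
    # zip(*profiles) iterates over bins, yielding the values of all member cells.
    return {
        cluster: [median(bin_values) for bin_values in zip(*profiles)]
        for cluster, profiles in members.items()
    }


cluster_cn = aggregate_by_cluster(copy_number, cluster_of)
for cluster, profile in sorted(cluster_cn.items()):
    print(cluster, profile)
```

The median is robust to a few noisy cells within a cluster; other aggregation functions (mean, mode, sum of read counts) are equally plausible depending on the layer being aggregated.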

Embeddings

`tl.compute_umap`(adata[, layer_name, ...])	Compute a UMAP embedding of cells from copy number.
`tl.pca_loadings`(adata[, layer, ...])	Compute PCA loadings matrix

Generating binned data

`tl.create_bins`(binsize)	Create a regular binning of the genome
`tl.count_gc`(bins, genome_fasta[, ...])	Count GC in each bin
`tl.mean_from_bigwig`(bins, bigwig_file, ...)	Compute the mean bigwig value in each bin
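The following sketch shows, in plain Python, what a regular binning of a genome and a per-bin GC fraction might look like. The helper names, the 0-based half-open coordinates, and the toy sequences are assumptions made for the example; they do not describe how `tl.create_bins` or `tl.count_gc` are implemented.

```python
def create_regular_bins(chrom_lengths, binsize):
    """Return (chrom, start, end) bins, 0-based half-open, last bin truncated."""
    bins = []
    for chrom, length in chrom_lengths.items():
        for start in range(0, length, binsize):
            bins.append((chrom, start, min(start + binsize, length)))
    return bins


def gc_fraction(sequence):
    """Return the G+C fraction among called bases, or None if none are called."""
    sequence = sequence.upper()
    called = sum(sequence.count(base) for base in "ACGT")
    if called == 0:
        return None
    return (sequence.count("G") + sequence.count("C")) / called


# Toy genome standing in for a FASTA file.
genome = {"chrA": "GGCCAATTNN", "chrB": "ATATGC"}
chrom_lengths = {chrom: len(seq) for chrom, seq in genome.items()}

bins = create_regular_bins(chrom_lengths, binsize=4)
gc_per_bin = [gc_fraction(genome[chrom][start:end]) for chrom, start, end in bins]

for region, gc in zip(bins, gc_per_bin):
    print(region, gc)
```

Ignoring `N` bases in the denominator keeps bins at assembly gaps from appearing artificially AT-rich; a bin with no called bases has no defined GC fraction, hence `None`.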

Gene regions

`tl.read_ensemble_genes_gtf`(gtf_filename)	Read an Ensembl GTF and extract gene start and end positions
`tl.aggregate_genes`(adata, genes[, ...])	Aggregate copy number by gene to create gene CN matrix
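One plausible way to summarize bin-level copy number at the gene level is an overlap-weighted mean over the bins a gene spans. The sketch below illustrates that idea in plain Python; the function name, the coordinates, and the weighting rule are assumptions for the example and may differ from what `tl.aggregate_genes` does.

```python
def aggregate_gene_cn(bins, bin_cn, gene_start, gene_end):
    """Overlap-weighted mean copy number across bins intersecting a gene.

    Bins and the gene use 0-based half-open coordinates on one chromosome.
    Returns None if the gene overlaps no bin.
    """
    weighted_sum = 0.0
    total_overlap = 0
    for (start, end), cn in zip(bins, bin_cn):
        # Length of the intersection between the bin and the gene.
        overlap = min(end, gene_end) - max(start, gene_start)
        if overlap > 0:
            weighted_sum += cn * overlap
            total_overlap += overlap
    return weighted_sum / total_overlap if total_overlap else None


# Toy bins on a single chromosome with one cell's copy number per bin.
bins = [(0, 100), (100, 200), (200, 300)]
bin_cn = [2, 4, 3]

# A hypothetical gene straddling the first two bins equally.
gene_cn = aggregate_gene_cn(bins, bin_cn, gene_start=50, gene_end=150)
print(gene_cn)
```

Weighting by overlap length means a gene that barely touches a neighbouring bin is dominated by the bin that contains most of it.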

Phylogenetics

`tl.prune_leaves`(tree, f)
`tl.align_cn_tree`(tree, adata)

Anndata Manipulation

`tl.ad_concat_cells`(adatas)	Concatenate a list of anndata by obs (cells)

Plotting: `pl`

The plotting module scgenome.pl largely parallels the tl.* and a few of the pp.* functions. For most tools and for some preprocessing functions, you’ll find a plotting function with the same name.

Note

TODO: more plotting functions matching tools

Copy number profiles and heatmaps

`pl.plot_cn_profile`(adata, obs_id[, ...])	Plot scatter points of copy number across the genome or a chromosome.
`pl.plot_cell_cn_matrix`(adata[, layer_name, ...])	Plot a copy number matrix
`pl.plot_cell_cn_matrix_fig`(adata[, ...])	Plot a copy number matrix
`pl.plot_gc_reads`(adata, obs_id, **kwargs)	Plot scatter points of gc by read count.

Phylogenetics

`pl.plot_tree_cn`(tree, adata[, ...])	Plot a tree aligned to a CN values matrix heatmap
