Single Cell

BioAlpha single-cell is an algorithmic framework that runs a full single-cell analysis pipeline in Python on large biological datasets in real time.

Import BioAlpha single-cell as:

from bioalpha import singlecell as sc

or:

from bioalpha import sc

Follow the BioAlpha Tutorial and the AUCell Tutorial to see how to use it.

Preprocessing: pp

bioalpha.singlecell.pp provides basic data transformations.

preprocessing.filter_cells

Filter cell outliers based on counts and numbers of genes expressed.

preprocessing.filter_genes

Filter genes based on number of cells or counts.
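
The two filters above can be illustrated with a minimal pure-Python sketch on a cells-by-genes count matrix. The function names and the min_genes and min_cells thresholds are hypothetical and chosen for illustration only; they are not taken from the BioAlpha API, whose actual parameters may differ.

```python
def filter_cells_sketch(counts, min_genes):
    """Keep rows (cells) that express at least `min_genes` genes."""
    return [row for row in counts if sum(1 for c in row if c > 0) >= min_genes]


def filter_genes_sketch(counts, min_cells):
    """Keep columns (genes) detected in at least `min_cells` cells."""
    n_genes = len(counts[0]) if counts else 0
    keep = [
        j for j in range(n_genes)
        if sum(1 for row in counts if row[j] > 0) >= min_cells
    ]
    return [[row[j] for j in keep] for row in counts], keep


# Toy matrix: 3 cells (rows) x 4 genes (columns).
counts = [
    [5, 0, 1, 0],
    [0, 0, 0, 0],  # empty cell, removed by the cell filter
    [2, 3, 0, 0],
]
cells_kept = filter_cells_sketch(counts, min_genes=1)
filtered, kept_genes = filter_genes_sketch(cells_kept, min_cells=1)
```

Filtering cells first and genes second means the gene filter only counts cells that survived the first step.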

preprocessing.calculate_qc_metrics

Calculate quality control metrics.

preprocessing.log_normalize

Normalize counts per cell and calculate log1p.
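
As a rough illustration of what "normalize counts per cell and calculate log1p" means, the sketch below rescales each cell so that its counts sum to a common total and then applies log(1 + x). The function name and the target_sum parameter are assumptions made for this example, not the documented BioAlpha signature.

```python
import math


def log_normalize_sketch(counts, target_sum=10_000):
    """Scale every cell (row) to `target_sum` total counts, then take log1p."""
    normalized = []
    for row in counts:
        total = sum(row)
        # Cells with zero total counts are left as zeros.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(c * scale) for c in row])
    return normalized


# Toy matrix: 3 cells x 2 genes; a small target_sum keeps the numbers readable.
counts = [[1, 3], [0, 0], [10, 10]]
result = log_normalize_sketch(counts, target_sum=4)
```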

preprocessing.log1p

Logarithmize the data matrix.

preprocessing.highly_variable_genes

Annotate highly variable genes.
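
One common way to annotate highly variable genes is to rank genes by dispersion (variance divided by mean). The sketch below shows that idea only; it is an assumption that this resembles what the function does, and the helper names and the n_top parameter are hypothetical.

```python
import statistics


def gene_dispersions_sketch(matrix):
    """Return (mean, dispersion) per gene, where dispersion = variance / mean."""
    n_genes = len(matrix[0])
    stats = []
    for j in range(n_genes):
        col = [row[j] for row in matrix]
        mean = statistics.fmean(col)
        var = statistics.variance(col)  # sample variance
        stats.append((mean, var / mean if mean > 0 else 0.0))
    return stats


def top_variable_genes_sketch(matrix, n_top):
    """Return the column indices of the `n_top` genes with the highest dispersion."""
    stats = gene_dispersions_sketch(matrix)
    ranked = sorted(range(len(stats)), key=lambda j: stats[j][1], reverse=True)
    return sorted(ranked[:n_top])


# Toy matrix: 3 cells x 3 genes; gene 0 is constant, gene 1 varies the most.
matrix = [
    [1.0, 0.0, 4.0],
    [1.0, 10.0, 5.0],
    [1.0, 2.0, 6.0],
]
stats = gene_dispersions_sketch(matrix)
top_one = top_variable_genes_sketch(matrix, n_top=1)
top_two = top_variable_genes_sketch(matrix, n_top=2)
```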

preprocessing.regress_out

Regress out (mostly) unwanted sources of variation.

preprocessing.scale

Scale data to unit variance and zero mean.
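
Scaling to unit variance and zero mean is a per-gene z-score. The following sketch assumes a cells-by-genes matrix and uses the population standard deviation; whether BioAlpha uses the population or the sample estimate is not stated here, and the function name is hypothetical.

```python
import statistics


def scale_sketch(matrix):
    """Z-score every gene (column): subtract its mean, divide by its std."""
    n_genes = len(matrix[0])
    scaled_columns = []
    for j in range(n_genes):
        col = [row[j] for row in matrix]
        mean = statistics.fmean(col)
        std = statistics.pstdev(col)  # population standard deviation
        if std == 0:
            # Constant genes carry no variation; map them to zero.
            scaled_columns.append([0.0] * len(col))
        else:
            scaled_columns.append([(v - mean) / std for v in col])
    # Transpose back to cells x genes.
    return [list(row) for row in zip(*scaled_columns)]


# Toy matrix: 3 cells x 2 genes; the second gene is constant.
matrix = [[1.0, 5.0], [3.0, 5.0], [5.0, 5.0]]
scaled = scale_sketch(matrix)
```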

preprocessing.pca

Principal component analysis.

preprocessing.subsample

Subsample to a fraction of the number of observations.
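
Subsampling to a fraction amounts to drawing observation indices without replacement. A minimal sketch, assuming a hypothetical helper with fraction and seed parameters (not necessarily the names BioAlpha uses):

```python
import random


def subsample_sketch(n_obs, fraction, seed=0):
    """Return sorted indices of a random subset containing `fraction` of the observations."""
    rng = random.Random(seed)  # local generator, so results are reproducible
    n_keep = int(fraction * n_obs)
    return sorted(rng.sample(range(n_obs), n_keep))


indices = subsample_sketch(n_obs=10, fraction=0.5, seed=42)
```

The returned indices can then be used to select the matching rows of the data matrix and of the per-cell annotations.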

preprocessing.harmony_integrate

Use the Harmony algorithm to integrate different experiments.

preprocessing.combine_keys

Combine keys in adata.obs.

preprocessing.neighbors

Compute a neighborhood graph of observations.

Tools: tl

bioalpha.singlecell.tl provides methods for interpretable annotation; the results can be visualized with the parallel module bioalpha.sc.pl.

tools.kmeans

Cluster cells into subgroups using k-means.

tools.louvain

Cluster cells into subgroups using the Louvain algorithm.

tools.tsne

Run t-distributed stochastic neighbor embedding (t-SNE).

tools.umap

Embed the neighborhood graph using UMAP (Uniform Manifold Approximation and Projection).

tools.rank_genes_groups

Rank genes for characterizing groups.

tools.aucell

Calculate enrichment of gene signatures for single cells.
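
The idea behind AUC-based signature enrichment can be sketched as follows: rank the genes of one cell by expression, walk down the top of the ranking, and measure the area under the curve that counts how many signature genes have been recovered so far. This is a simplified illustration, not BioAlpha's implementation; the function name, the top_n cut-off, the tie handling (input order), and the normalization by the best achievable area are all assumptions made for the example.

```python
def aucell_sketch(expression, gene_names, gene_set, top_n):
    """Score one cell: normalized area under the recovery curve of `gene_set`
    within the `top_n` most highly expressed genes."""
    wanted = set(gene_set) & set(gene_names)
    if not wanted or top_n < 1:
        return 0.0
    # Rank genes from highest to lowest expression (stable sort keeps ties in input order).
    order = sorted(range(len(gene_names)), key=lambda j: -expression[j])
    hits = 0
    area = 0
    for j in order[:top_n]:
        if gene_names[j] in wanted:
            hits += 1
        area += hits  # recovery-curve height at this rank
    # Best case: all signature genes occupy the very top of the ranking.
    best = sum(min(r, len(wanted)) for r in range(1, top_n + 1))
    return area / best


gene_names = ["A", "B", "C", "D", "E"]
gene_set = ["A", "B"]
cells = [
    [9, 7, 0, 1, 3],  # both signature genes at the top
    [0, 1, 9, 7, 3],  # signature genes at the bottom
    [9, 0, 5, 7, 1],  # only one signature gene near the top
]
scores = [aucell_sketch(cell, gene_names, gene_set, top_n=3) for cell in cells]
```

A score close to 1 means the signature genes are concentrated at the top of the cell's expression ranking; a score of 0 means none of them appear within the top_n genes.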

Plotting: pl

plotting.highest_expr_genes

Fraction of counts assigned to each gene over all cells.

plotting.highly_variable_genes

Plot dispersions or normalized variance versus means for genes.

plotting.tsne

Scatter plot in tSNE basis.

plotting.umap

Scatter plot in UMAP basis.

plotting.rank_genes_groups_volcano

Plot ranking of genes using volcano plot.

plotting.aucell_heatmap

Plot AUC scores of cells and gene sets as a heatmap.

plotting.aucell

Plot AUC scores of cells and gene sets.

Reading

read_10x_mtx

Fast reader for a 10x-Genomics-formatted mtx directory.

read_10x_h5

Read a 10x-Genomics-formatted HDF5 file.

read_h5ad

Read an .h5ad-formatted HDF5 file.

read_mtx

Fast reading algorithm for mtx files.

read_gene_signatures

Read gene signatures from a JSON file.

read_gene_signatures_gmt

Read gene signatures from a GMT file.
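
For orientation, a GMT file is tab-separated with one gene set per line: the set name, a description, and then the member genes. The sketch below parses that layout with the standard library only. It is not BioAlpha's reader; the function name and the example set names are hypothetical.

```python
def parse_gmt_sketch(lines):
    """Parse GMT lines into a dict mapping gene-set name to its list of genes."""
    signatures = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        # fields[0] is the set name, fields[1] the description, the rest are genes.
        name, genes = fields[0], fields[2:]
        signatures[name] = [g for g in genes if g]
    return signatures


gmt_text = "SET_A\tdescription of A\tCD3D\tCD3E\nSET_B\tna\tMS4A1\n"
signatures = parse_gmt_sketch(gmt_text.splitlines())
```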

read_gene_signatures_GSEA

Read gene signatures from GSEA (https://www.gsea-msigdb.org/).

get_gene_signatures_name_from_GSEA

Get gene signature name from GSEA (https://www.gsea-msigdb.org/).

download_GEO_data

Download supplementary data files from GEO (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi).

Writing

write_bioturing_format

Write data in the BioTuring format.

Experimental

experimental.tools.find_marker

Identify marker genes.

experimental.tools.thresholding

Find a cut-off threshold and smooth the expression matrix.