bioalpha.singlecell.preprocessing.filter_genes
- bioalpha.singlecell.preprocessing.filter_genes(adata: AnnData, min_counts: int | None = None, min_cells: int | None = None, max_counts: int | None = None, max_cells: int | None = None, inplace: bool = True, key_added: str | None = 'filter_genes_mask', layer: str | None = None, obs_mask: str | None = None, var_mask: str | None = None, **kwargs) -> Tuple[ndarray, ndarray] | None
Filter genes based on number of cells or counts.
- Parameters:
adata (AnnData) – The annotated data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
csr_mtx (csr_matrix) – (n_cells x n_genes) The CSR sparse expression matrix.
min_counts (Optional[np.float32], default = None) – Minimum number of counts a gene must have to be kept.
min_cells (Optional[np.float32], default = None) – Minimum number of cells in which a gene must be expressed to be kept.
max_counts (Optional[np.float32], default = None) – Maximum number of counts a gene may have to be kept.
max_cells (Optional[np.float32], default = None) – Maximum number of cells in which a gene may be expressed to be kept.
inplace (bool, default = True) – Whether to perform the computation in place or return the result.
key_added (Optional[str], default = 'filter_genes_mask') – Name of the field in adata.var where the filter array is stored. Only for mapping data.
layer (Optional[str], default = None) – Layer to filter instead of X. If None, X is used. Only for mapping data.
obs_mask (Optional[str], default = None) – If obs_mask is not None, filter cells by adata.obs[obs_mask].
var_mask (Optional[str], default = None) – If var_mask is not None, filter genes by adata.var[var_mask].
**kwargs – Other parameters passed to BatchReader.
- Returns:
Depending on inplace, returns the following arrays or directly subsets and annotates the data matrix:
gene_subset (ndarray) – Boolean index mask that does the filtering. True means the gene is kept; False means the gene is removed.
number_per_gene (ndarray) – Depending on what was thresholded (counts or cells), the array stores n_counts or n_cells per gene.
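
To make the two return values concrete, the following is a minimal NumPy sketch of the kind of thresholding described above. It is not bioalpha's implementation: the helper name `sketch_filter_genes`, the use of a dense array, the inclusive comparisons, and the restriction to exactly one threshold per call are all assumptions made for illustration.

```python
import numpy as np


def sketch_filter_genes(X, min_counts=None, min_cells=None,
                        max_counts=None, max_cells=None):
    """Hypothetical stand-in that mimics the documented return values.

    X is a dense (n_cells x n_genes) count matrix. Assumption: exactly
    one threshold is given per call, and thresholds are inclusive.
    """
    given = [t is not None for t in (min_counts, min_cells, max_counts, max_cells)]
    if sum(given) != 1:
        raise ValueError("This sketch expects exactly one threshold per call.")

    X = np.asarray(X)
    if min_cells is not None or max_cells is not None:
        # n_cells: number of cells in which each gene is expressed (count > 0)
        number_per_gene = (X > 0).sum(axis=0)
    else:
        # n_counts: total counts of each gene summed over all cells
        number_per_gene = X.sum(axis=0)

    if min_counts is not None or min_cells is not None:
        threshold = min_counts if min_counts is not None else min_cells
        gene_subset = number_per_gene >= threshold
    else:
        threshold = max_counts if max_counts is not None else max_cells
        gene_subset = number_per_gene <= threshold

    return gene_subset, number_per_gene


# Toy matrix: 4 cells (rows) x 3 genes (columns)
X = np.array([
    [0, 2, 1],
    [0, 3, 0],
    [5, 1, 0],
    [0, 4, 0],
])

mask_cells, n_cells = sketch_filter_genes(X, min_cells=2)
mask_counts, n_counts = sketch_filter_genes(X, min_counts=5)

# Applying the mask keeps only the columns (genes) marked True
X_kept = X[:, mask_cells]
```

With min_cells=2 only the second gene survives in this toy matrix, because it is the only one expressed in at least two cells; with min_counts=5 the first two genes survive. When inplace is True, the documented function applies the equivalent subsetting to the data matrix itself instead of returning the arrays.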