bioalpha.singlecell.preprocessing.filter_cells

bioalpha.singlecell.preprocessing.filter_cells(adata: AnnData | H5ADMap, min_counts: int | None = None, min_genes: int | None = None, max_counts: int | None = None, max_genes: int | None = None, inplace: bool = True, key_added: str | None = 'filter_cells_mask', layer: str | None = None, obs_mask: str | None = None, var_mask: str | None = None, **kwargs) → Tuple[ndarray, ndarray] | None

Filter cell outliers based on counts and numbers of genes expressed.

Parameters:
  • adata (Union[AnnData, H5ADMap]) – The annotated data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.

  • min_counts (Optional[int], default = None) – Minimum number of counts a cell must have to be kept.

  • min_genes (Optional[int], default = None) – Minimum number of expressed genes a cell must have to be kept.

  • max_counts (Optional[int], default = None) – Maximum number of counts a cell may have to be kept.

  • max_genes (Optional[int], default = None) – Maximum number of expressed genes a cell may have to be kept.

  • inplace (bool, default = True) – Whether to perform the computation in place (True) or return the result (False).

  • key_added (Optional[str], default = 'filter_cells_mask') – Name of the field in adata.obs where the filter array is stored. Only used for mapping data.

  • layer (Optional[str], default = None) – Layer to filter on instead of X. If None, X is used. Only used for mapping data.

  • obs_mask (Optional[str], default = None) – If obs_mask is not None, filter cells by adata.obs[obs_mask].

  • var_mask (Optional[str], default = None) – If var_mask is not None, filter genes by adata.var[var_mask].

  • **kwargs – Other parameters passed to BatchReader.
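The quantities the four thresholds act on can be illustrated with plain NumPy. This is a minimal sketch of the general idea, not bioalpha's implementation: the toy matrix, the variable names, and the choice of inclusive bounds (>= for minimums, <= for maximums) are all assumptions made for illustration.

```python
import numpy as np

# Toy count matrix (hypothetical data): 4 cells (rows) x 5 genes (columns).
X = np.array([
    [0, 0, 1, 0, 0],    # 1 count, 1 gene expressed
    [3, 2, 0, 5, 1],    # 11 counts, 4 genes expressed
    [10, 0, 0, 0, 0],   # 10 counts, 1 gene expressed
    [1, 1, 1, 1, 1],    # 5 counts, 5 genes expressed
])

# Total counts per cell: the quantity min_counts / max_counts refer to.
counts_per_cell = X.sum(axis=1)

# Number of genes with a nonzero count per cell: the quantity
# min_genes / max_genes refer to.
genes_per_cell = (X > 0).sum(axis=1)

# Assumption: bounds are inclusive. True means the cell would be kept.
keep_min_counts = counts_per_cell >= 5   # analogous to min_counts=5
keep_max_genes = genes_per_cell <= 4     # analogous to max_genes=4

print(counts_per_cell, genes_per_cell)
print(keep_min_counts, keep_max_genes)
```

Whether the library treats the bounds as inclusive is not stated on this page, so check the behavior on your own data before relying on it.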

Returns:

  • Depending on inplace, returns the following arrays, or directly subsets and annotates the data matrix and returns None:

  • cells_subset (ndarray) – Boolean index mask that does filtering. True means that the cell is kept. False means the cell is removed.

  • number_per_cell (ndarray) – Depending on what was thresholded (counts or genes), the array stores n_counts or n_genes per cell.
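When the arrays are returned rather than applied, the boolean mask can be used to subset any per-cell data. The sketch below uses stand-in arrays shaped like the documented return pair; the values, the matrix, and the cell names are hypothetical and are not produced by calling the library.

```python
import numpy as np

# Stand-ins for the documented return values (hypothetical values).
cells_subset = np.array([False, True, True, False])   # True = cell is kept
number_per_cell = np.array([1, 11, 10, 5])            # per-cell statistic

# Hypothetical per-cell data to subset: a 4 x 5 matrix and cell names.
X = np.arange(20).reshape(4, 5)
cell_names = np.array(["c0", "c1", "c2", "c3"])

# Boolean indexing keeps only the rows where the mask is True.
X_kept = X[cells_subset]
names_kept = cell_names[cells_subset]

# The mask also gives a quick summary of how many cells were removed.
n_removed = int((~cells_subset).sum())
print(names_kept, X_kept.shape, n_removed)
```

Keeping the mask separate from the data makes it possible to inspect number_per_cell for the removed cells before committing to the filter.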