bioalpha.singlecell.tools.kmeans
- bioalpha.singlecell.tools.kmeans(data: AnnData, k: int, restrict_to: Tuple[str, Sequence[str]] | None = None, random_state: int = 0, key_added: str = 'kmeans', use_rep: str | None = None, n_pcs: int | None = None, copy: bool = False, return_info: bool = True, **kwargs) → AnnData | None
Cluster cells into subgroups using the K-means algorithm.
- Parameters:
data (AnnData) – The annotated data matrix of shape n_obs × n_vars. Rows correspond to cells and columns to genes.
k (int) – The number of clusters.
restrict_to (Optional[Tuple[str, Sequence[str]]], default = None) – Restrict the clustering to the given categories of a sample annotation. The tuple must contain (obs_key, list_of_categories).
random_state (Optional[Union[int, RandomState]], default = 0) – Seed for the initialization of the optimization. Change it to obtain a different initialization.
key_added (str, default = "kmeans") – Key in .obs under which to add the cluster labels.
n_pcs (Optional[int], default = None) – Use this many PCs. If n_pcs == 0 and use_rep is None, .X is used.
use_rep (Optional[str], default = None) – Use the indicated representation. "X" or any key of .obsm is valid. If None, the representation is chosen automatically: for .n_vars < 50, .X is used; otherwise "X_pca" is used. If "X_pca" is not present, it is computed with default parameters.
copy (bool, default = False) – Whether to copy data or modify it in place.
return_info (bool, default = True) – Whether to also store centroids and total_distances in .uns[key_added].
kwargs (dict) – Any further arguments to pass to _sctools.clustering.kmeans (which in turn passes arguments to the partition_type).
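The automatic choice of representation described for use_rep and n_pcs can be summarized in a few lines. The helper below, choose_representation, is hypothetical and is not part of bioalpha; it only restates the documented rule, and the check for unknown keys is an assumption about how an invalid use_rep would be handled.

```python
from typing import Optional, Set


def choose_representation(
    n_vars: int,
    obsm_keys: Set[str],
    use_rep: Optional[str] = None,
    n_pcs: Optional[int] = None,
) -> str:
    """Return the name of the representation the documented rule would select.

    Hypothetical helper for illustration only; not a bioalpha function.
    """
    if use_rep is not None:
        # An explicit choice wins: "X" or any key present in .obsm.
        if use_rep != "X" and use_rep not in obsm_keys:
            # Assumption: an unknown key is treated as an error.
            raise KeyError(f"{use_rep!r} is neither 'X' nor a key of .obsm")
        return use_rep
    if n_pcs == 0 or n_vars < 50:
        # n_pcs == 0 (with use_rep None) or a small matrix: cluster on .X.
        return "X"
    # Otherwise "X_pca" is used; per the description above it is computed
    # with default parameters when it is not already present.
    return "X_pca"
```

For example, a matrix with 2,000 genes and no explicit use_rep would resolve to "X_pca", while the same matrix with n_pcs=0 would resolve to "X".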
- Returns:
data – If copy=True, a copy of data with the fields below is returned; otherwise the fields are added to data in place and None is returned:
.obs[key_added] – Array of dim (number of samples) that stores the subgroup id ("0", "1", …) for each cell.
.uns[key_added]["centroids"] – The centroid of each cluster. Only for return_info = True.
.uns[key_added]["total_distances"] – Total distances from cells to centroids. Only for return_info = True.
- Return type:
AnnData | None
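To make the three outputs concrete, here is a minimal pure-Python sketch of Lloyd's K-means iteration. It is not the bioalpha implementation: the function name kmeans_sketch, the n_iter parameter, and the initialization by random sampling are all assumptions for illustration. The source does not say whether total_distances is a sum of Euclidean or squared distances, or whether it is a scalar; the sketch assumes a single sum of Euclidean distances.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def kmeans_sketch(
    points: List[Point], k: int, random_state: int = 0, n_iter: int = 100
) -> Tuple[List[str], List[List[float]], float]:
    """Illustrative K-means; returns (labels, centroids, total_distance)."""
    rng = random.Random(random_state)
    # Assumption: initialize centroids with k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
    # Assumption: "total distance" is the sum of Euclidean distances
    # from every point to its assigned centroid.
    total = sum(math.dist(p, centroids[lab]) for p, lab in zip(points, labels))
    # Subgroup ids are reported as strings ("0", "1", ...), as in .obs[key_added].
    return [str(lab) for lab in labels], centroids, total
```

In terms of the fields above, the first return value plays the role of .obs[key_added], the second of .uns[key_added]["centroids"], and the third of .uns[key_added]["total_distances"].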