bioalpha.singlecell.read_gene_signatures_gmt

bioalpha.singlecell.read_gene_signatures_gmt(gmt_file: Path, adata: AnnData | None = None, key_added: str | None = None, field_sep: str = '\t', gene_sep: str = '\t', copy: bool = False) → List[GeneSignature] | AnnData | None

Read gene signatures from a GMT file.

Parameters:
  • gmt_file (Path) – Path to the .gmt gene signatures file.

  • adata (Optional[AnnData], default = None) – The annotated data matrix of shape n_obs x n_vars. If adata is not None, the gene signatures are saved in adata.uns. Otherwise, a list of gene signatures is returned.

  • key_added (Optional[str], default = None) – The key in adata.uns under which the gene signatures are saved. Cannot be set when adata is None.

  • field_sep (str, default = "\t") – The separator that separates fields in a line.

  • gene_sep (str, default = "\t") – The separator that separates the genes.

  • copy (bool, default = False) – Whether to copy adata or modify it inplace.

Returns:

  • If adata is passed, "gene_sigs" is added to adata.uns, and adata is returned if copy = True.

  • Otherwise, return a List[GeneSignature].

Example

>>> # Download gene sets here
... # https://data.broadinstitute.org/gsea-msigdb/msigdb/release/2023.1.Mm/
>>> from bioalpha import sc
>>> sc.read_gene_signatures_gmt("m1.all.v2023.1.Mm.entrez.gmt")
[GeneSignature(name='chr1A1', gene2weight=frozendict.frozendict({'115487633': 1.0, '115487594': 1.0, ...)), ...]
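
To make the roles of field_sep and gene_sep concrete, the following is a minimal pure-Python sketch of how a GMT line can be split. It is not bioalpha's implementation: parse_gmt is a hypothetical helper, the toy file contents are invented for illustration, and the unit weights simply mirror the gene2weight values shown in the output above. It assumes the usual GMT layout of one gene set per line: a name, a description, then the genes.

```python
import tempfile
from pathlib import Path


def parse_gmt(path, field_sep="\t", gene_sep="\t"):
    """Hypothetical helper (not part of bioalpha): parse a GMT file.

    Returns a list of (name, description, gene2weight) tuples, where
    gene2weight maps each gene identifier to an assumed weight of 1.0.
    """
    signatures = []
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.rstrip("\r\n")
            if not line.strip():
                continue  # skip blank lines
            # The first two fields are the set name and its description;
            # everything after the second field separator is the gene list.
            parts = line.split(field_sep, 2)
            name = parts[0]
            description = parts[1] if len(parts) > 1 else ""
            gene_field = parts[2] if len(parts) > 2 else ""
            genes = [g for g in gene_field.split(gene_sep) if g]
            signatures.append((name, description, {g: 1.0 for g in genes}))
    return signatures


# Toy GMT content, invented for this sketch: two gene sets, tab-separated.
toy_gmt = "toy_set_a\tna\t101\t102\ntoy_set_b\tna\t201\t202\t203\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    gmt_path = Path(tmp_dir) / "toy.gmt"
    gmt_path.write_text(toy_gmt, encoding="utf-8")
    signatures = parse_gmt(gmt_path)

for name, description, gene2weight in signatures:
    print(name, sorted(gene2weight))
```

With both separators left at "\t", the whole line is tab-delimited, which matches the function's defaults. A file that stores genes as a single comma-separated field would instead be read with field_sep="\t" and gene_sep=",".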