bioalpha.singlecell.read_gene_signatures
- bioalpha.singlecell.read_gene_signatures(json_file: Path, adata: AnnData | H5ADMap | None = None, key_added: str | None = None, key_name: str | None = 'systematicName', key_gene: str | None = 'geneSymbols', copy: bool = False) → List[GeneSignature] | AnnData | None
Read gene signatures from a JSON file.
- Parameters:
json_file (Path) – Path to the gene signatures file.
adata (Optional[AnnData], default = None) – The annotated data matrix of shape n_obs x n_vars. If a value other than None is passed, gene_signatures will be saved in adata.uns. Otherwise, a list of gene signatures is returned.
key_added (Optional[str], default = None) – The key in adata.uns under which the information is saved. Cannot be set when adata is None.
key_name (Optional[str], default = "systematicName") – The key for loading the name of each gene set (see the example below).
key_gene (Optional[str], default = "geneSymbols") – The key for loading the gene_signatures (see the example below).
copy (bool, default = False) – Whether to copy adata or modify it in place.
- Returns:
If adata is passed, "gene_sigs" will be added to adata.uns, and the modified copy of adata is returned if copy = True. Otherwise, a List[GeneSignature] is returned.
Example
>>> # Download gene sets here
... # https://data.broadinstitute.org/gsea-msigdb/msigdb/release/2023.1.Mm/
>>> from bioalpha import sc
>>> sc.read_gene_signatures("m5.go.v2023.1.Mm.json")
[GeneSignature(name='MM5283', gene2weight=frozendict.frozendict({'Aasdhppt': 1, 'Aldh1l1': 1, 'Aldh1l2': 1, 'Mthfd1': 1, 'Mthfd1l': 1, 'Mthfd2l': 1})),...]
>>> # Changing key_name changes the GeneSignature name
>>> sc.read_gene_signatures("m5.go.v2023.1.Mm.json", key_name="exactSource")
[GeneSignature(name='GO:0009256', gene2weight=frozendict.frozendict({'Aasdhppt': 1, 'Aldh1l1': 1, 'Aldh1l2': 1, 'Mthfd1': 1, 'Mthfd1l': 1, 'Mthfd2l': 1})),...]
>>> # NOTE: when the file is structured like this
... # {
... #     gene_set1: ["gene1", "gene2"],
... #     gene_set2: ["gene1", "gene2"],
... #     ...
... # }
... # key_name and key_gene must be set to None
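
The two file layouts mentioned in the example can be illustrated with a small standalone sketch. It does not call bioalpha and is not the library's implementation; parse_gene_sets is a hypothetical helper, the top-level key "EXAMPLE_GENE_SET" is made up, and the nested record layout is an assumption based on the field names used above (systematicName, exactSource, geneSymbols). The sketch only shows how key_name and key_gene could select fields from each record, and why both are None for the flat layout.

```python
import json
from typing import Dict, List, Optional


def parse_gene_sets(
    text: str,
    key_name: Optional[str] = "systematicName",
    key_gene: Optional[str] = "geneSymbols",
) -> Dict[str, List[str]]:
    """Hypothetical helper (not part of bioalpha): map set names to gene lists."""
    raw = json.loads(text)
    if key_name is None and key_gene is None:
        # Flat layout: {"gene_set1": ["gene1", "gene2"], ...}
        return {name: list(genes) for name, genes in raw.items()}
    if key_name is None or key_gene is None:
        # Assumption of this sketch: mixing None and non-None keys is an error.
        raise ValueError("set both keys, or set both to None")
    # Nested layout (assumed): each value is a record with a name field
    # and a list of gene symbols.
    return {rec[key_name]: list(rec[key_gene]) for rec in raw.values()}


nested = json.dumps({
    "EXAMPLE_GENE_SET": {  # illustrative top-level key, not from a real file
        "systematicName": "MM5283",
        "exactSource": "GO:0009256",
        "geneSymbols": ["Aasdhppt", "Aldh1l1"],
    }
})
flat = json.dumps({
    "gene_set1": ["gene1", "gene2"],
    "gene_set2": ["gene1", "gene2"],
})

by_systematic = parse_gene_sets(nested)
by_source = parse_gene_sets(nested, key_name="exactSource")
from_flat = parse_gene_sets(flat, key_name=None, key_gene=None)

print(by_systematic)
print(by_source)
print(from_flat)
```

With the nested layout, switching key_name from "systematicName" to "exactSource" changes only the name attached to each gene list, which mirrors the difference between the two outputs in the example above.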