bioalpha.singlecell.read_gene_signatures
- bioalpha.singlecell.read_gene_signatures(json_file: Path, adata: AnnData | H5ADMap | None = None, key_added: str | None = None, key_name: str | None = 'systematicName', key_gene: str | None = 'geneSymbols', copy: bool = False) → List[GeneSignature] | AnnData | None
Read gene signatures from a JSON file.
- Parameters:
  json_file (Path) – Path to the gene signatures file.
  adata (Optional[AnnData], default = None) – The annotated data matrix of shape n_obs x n_vars. If a value other than None is passed, the gene signatures are saved in adata.uns. Otherwise, a list of gene signatures is returned.
  key_added (Optional[str], default = None) – The key in adata.uns under which the information is saved. Cannot be set when adata is None.
  key_name (Optional[str], default = "systematicName") – The key for loading the name of each gene set (see the example below).
  key_gene (Optional[str], default = "geneSymbols") – The key for loading the genes of each gene set (see the example below).
  copy (bool, default = False) – Whether to copy adata or modify it in place.
- Returns:
  If adata is passed, "gene_sigs" will be added to adata.uns, and adata is returned if copy = True. Otherwise, a List[GeneSignature] is returned.
Example
>>> # Download gene sets here
... # https://data.broadinstitute.org/gsea-msigdb/msigdb/release/2023.1.Mm/
>>> from bioalpha import sc
>>> sc.read_gene_signatures("m5.go.v2023.1.Mm.json")
[GeneSignature(name='MM5283', gene2weight=frozendict.frozendict({'Aasdhppt': 1, 'Aldh1l1': 1, 'Aldh1l2': 1, 'Mthfd1': 1, 'Mthfd1l': 1, 'Mthfd2l': 1})),...]
>>> # Changing key_name changes the GeneSignature name
>>> sc.read_gene_signatures("m5.go.v2023.1.Mm.json", key_name="exactSource")
[GeneSignature(name='GO:0009256', gene2weight=frozendict.frozendict({'Aasdhppt': 1, 'Aldh1l1': 1, 'Aldh1l2': 1, 'Mthfd1': 1, 'Mthfd1l': 1, 'Mthfd2l': 1})),...]
>>> # NOTE: when the file is structured like this
... # { gene_set1: ["gene1", "gene2"], gene_set2: ["gene1", "gene2"], ... }
... # key_name and key_gene must be set to None
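The two file layouts mentioned above can be hard to picture from the doctest alone, so here is a minimal sketch using only the standard library. It is an illustration under assumptions, not bioalpha's implementation: `extract_gene_sets` is a hypothetical helper, the top-level key `EXAMPLE_SET` is a placeholder, and the nested entry only reuses the field names and values visible in the example output. It shows how key_name and key_gene would pick fields from a nested entry, and how a flat file suited to key_name=None and key_gene=None could be written.

```python
import json
import tempfile
from pathlib import Path

# Made-up entry in the nested layout. The field names and values come from
# the example above; the top-level key "EXAMPLE_SET" is a placeholder.
nested = {
    "EXAMPLE_SET": {
        "systematicName": "MM5283",
        "exactSource": "GO:0009256",
        "geneSymbols": ["Aasdhppt", "Aldh1l1", "Aldh1l2",
                        "Mthfd1", "Mthfd1l", "Mthfd2l"],
    }
}

# Flat layout from the NOTE above: gene set name -> list of gene symbols.
flat = {"gene_set1": ["gene1", "gene2"], "gene_set2": ["gene1", "gene2"]}


def extract_gene_sets(data, key_name="systematicName", key_gene="geneSymbols"):
    """Hypothetical helper (not part of bioalpha): return {name: genes}."""
    if key_name is None and key_gene is None:
        # Flat layout: the top-level keys already are the gene set names.
        return {name: list(genes) for name, genes in data.items()}
    # Nested layout: look up the name and the gene list inside each entry.
    return {entry[key_name]: list(entry[key_gene]) for entry in data.values()}


by_systematic_name = extract_gene_sets(nested)
by_exact_source = extract_gene_sets(nested, key_name="exactSource")
from_flat = extract_gene_sets(flat, key_name=None, key_gene=None)

# Write the flat layout to disk so the file could be passed as json_file.
flat_path = Path(tempfile.mkdtemp()) / "my_signatures.json"
flat_path.write_text(json.dumps(flat, indent=2))

# Assumed usage, not executed here because it needs bioalpha installed:
# sc.read_gene_signatures(flat_path, key_name=None, key_gene=None)
```

Keeping the flat file as plain name-to-list JSON means custom signatures need no extra metadata fields, at the cost of losing alternative identifiers such as exactSource.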