nichecompass.utils.extract_gp_dict_from_omnipath_lr_interactions
- nichecompass.utils.extract_gp_dict_from_omnipath_lr_interactions(species, min_curation_effort=2, load_from_disk=False, save_to_disk=False, lr_network_file_path='../data/gene_programs/omnipath_lr_network.csv', gene_orthologs_mapping_file_path='../data/gene_annotations/human_mouse_gene_orthologs.csv', plot_gp_gene_count_distributions=True, gp_gene_count_distributions_save_path=None)
Retrieve 724 human ligand-receptor interactions from OmniPath and extract them into a gene program dictionary. OmniPath is a database of molecular biology prior knowledge that combines intercellular communication data from many different resources (all resources for intercellular communication included in OmniPath can be queried via ´op.requests.Intercell.resources()´). If ´species´ is ´mouse´, orthologs from human interactions are returned.
Parts of the implementation are inspired by https://workflows.omnipathdb.org/intercell-networks-py.html (01.10.2022).
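The ortholog step described above can be pictured with a minimal sketch. This is not the NicheCompass implementation: the mapping dictionary, the gene pairs, and the helper ´map_pairs_to_orthologs´ are hypothetical, and dropping pairs that lack an ortholog is an assumption made for this illustration.

```python
# Hypothetical toy mapping from human to mouse gene symbols. The real mapping
# is read from a CSV file whose column layout is not described here.
human_to_mouse = {
    "CCL5": "Ccl5",
    "CCR5": "Ccr5",
    "TGFB1": "Tgfb1",
    "TGFBR1": "Tgfbr1",
}

# Hypothetical (ligand, receptor) pairs given as human gene symbols.
human_lr_pairs = [("CCL5", "CCR5"), ("TGFB1", "TGFBR1"), ("IL8", "CXCR1")]


def map_pairs_to_orthologs(pairs, mapping):
    """Translate gene pairs, skipping any pair with an unmapped gene.

    Skipping unmapped pairs is an assumption of this sketch.
    """
    mapped = []
    for ligand, receptor in pairs:
        if ligand in mapping and receptor in mapping:
            mapped.append((mapping[ligand], mapping[receptor]))
    return mapped


mouse_lr_pairs = map_pairs_to_orthologs(human_lr_pairs, human_to_mouse)
print(mouse_lr_pairs)
```

In this toy input, the third pair is dropped because neither of its genes appears in the mapping.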
- Parameters:
species (Literal['mouse','human']) – Species for which the gene programs will be extracted. The retrieved interactions are human; if ´species´ is ´mouse´, human genes are mapped to mouse orthologs using a mapping file. NicheCompass contains a default mapping file stored under “<root>/data/gene_annotations/human_mouse_gene_orthologs.csv”, which was created with Ensembl BioMart (http://www.ensembl.org/info/data/biomart/index.html).
min_curation_effort (int (default: 2)) – Indicates how many times an interaction has to be described in a paper and mentioned in a database to be included in the retrieval.
load_from_disk (bool (default: False)) – If ´True´, the OmniPath ligand-receptor interactions will be loaded from disk instead of from the OmniPath library.
save_to_disk (bool (default: False)) – If ´True´, the OmniPath ligand-receptor interactions will additionally be stored on disk. Only applies if ´load_from_disk´ is ´False´.
lr_network_file_path (Optional[str] (default: '../data/gene_programs/omnipath_lr_network.csv')) – Path of the file where the OmniPath ligand-receptor interactions will be stored (if ´save_to_disk´ is ´True´) or loaded from (if ´load_from_disk´ is ´True´).
gene_orthologs_mapping_file_path (Optional[str] (default: '../data/gene_annotations/human_mouse_gene_orthologs.csv')) – Path of the file where the gene orthologs mapping is stored. Only used if ´species´ is ´mouse´.
plot_gp_gene_count_distributions (bool (default: True)) – If ´True´, display the distribution of gene programs per number of source and target genes.
gp_gene_count_distributions_save_path (Optional[str] (default: None)) – Path of the file where the gene program gene count distribution plot will be saved if ´plot_gp_gene_count_distributions´ is ´True´.
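How ´load_from_disk´, ´save_to_disk´, and ´lr_network_file_path´ interact can be sketched as follows. This is an illustration of the semantics stated in the parameter descriptions, not the library's code: ´get_lr_network´, ´fake_fetch´, and the column names are hypothetical, and the sketch assumes the fetch returns at least one row.

```python
import csv
import os
import tempfile


def get_lr_network(fetch_fn, file_path, load_from_disk=False, save_to_disk=False):
    """Return interaction rows, reading from or writing to a CSV as requested."""
    if load_from_disk:
        # Loading takes precedence; nothing is fetched and nothing is saved.
        with open(file_path, newline="") as f:
            return list(csv.DictReader(f))
    rows = fetch_fn()
    if save_to_disk:
        # Saving only happens when the rows were freshly fetched.
        with open(file_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    return rows


def fake_fetch():
    # Stand-in for the OmniPath query; the column names are made up.
    return [{"ligand": "CCL5", "receptor": "CCR5"}]


path = os.path.join(tempfile.mkdtemp(), "omnipath_lr_network.csv")
fetched = get_lr_network(fake_fetch, path, save_to_disk=True)
cached = get_lr_network(fake_fetch, path, load_from_disk=True)
print(fetched == cached)
```

A first call with ´save_to_disk=True´ followed by later calls with ´load_from_disk=True´ avoids repeating the retrieval, which is the workflow the two flags suggest.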
- Return type:
dict
- Returns:
gp_dict: Nested dictionary containing the OmniPath ligand-receptor interaction gene programs. Keys are gene program names; each value is a dictionary with the keys ´sources´ (the OmniPath ligands), ´targets´ (the OmniPath receptors), ´sources_categories´ (the categories of the sources), and ´targets_categories´ (the categories of the targets).
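The shape of the returned dictionary can be illustrated with a small hand-written example. The gene program names, the gene lists, and the category labels ´ligand´ and ´receptor´ below are assumptions chosen for illustration; only the four inner keys come from the description above. The helper ´gene_counts´ is hypothetical and computes the per-program source and target gene counts that the optional distribution plot summarizes.

```python
# Hypothetical gene programs; names, genes, and category labels are made up
# to show the nested layout, not taken from an actual OmniPath retrieval.
gp_dict = {
    "CCL5_ligand_receptor_GP": {
        "sources": ["CCL5"],
        "targets": ["CCR1", "CCR5"],
        "sources_categories": ["ligand"],
        "targets_categories": ["receptor", "receptor"],
    },
    "TGFB1_ligand_receptor_GP": {
        "sources": ["TGFB1"],
        "targets": ["TGFBR1", "TGFBR2"],
        "sources_categories": ["ligand"],
        "targets_categories": ["receptor", "receptor"],
    },
}


def gene_counts(gp_dict):
    """Map each gene program name to (number of sources, number of targets)."""
    return {
        name: (len(gp["sources"]), len(gp["targets"]))
        for name, gp in gp_dict.items()
    }


counts = gene_counts(gp_dict)
for name, (n_sources, n_targets) in counts.items():
    print(f"{name}: {n_sources} source gene(s), {n_targets} target gene(s)")
```

With NicheCompass installed, the same helper could be applied to the dictionary returned by ´extract_gp_dict_from_omnipath_lr_interactions(species='mouse')´.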