scdrs.util.load_gs

scdrs.util.load_gs(gs_path: str, src_species: str | None = None, dst_species: str | None = None, to_intersect: List[str] | None = None) → dict

Load the gene set file (.gs file).

Parameters:

gs_path (str): Path to the gene set file, which has two tab-separated columns, ‘TRAIT’ and ‘GENESET’. The ‘TRAIT’ column contains trait names. The ‘GENESET’ column contains gene names (matching those in the expression matrix) and, for a weighted gene set, gene weights. For an unweighted gene set, the ‘GENESET’ column contains only gene names separated by commas, e.g., “<gene1>,<gene2>,<gene3>”. For a weighted gene set, the ‘GENESET’ column contains gene names paired with weights, e.g., “<gene1>:<weight1>,<gene2>:<weight2>,<gene3>:<weight3>”.
src_species (str, default=None): Source species. Must be either ‘mmusculus’ or ‘hsapiens’ if not None.
dst_species (str, default=None): Destination species. Must be either ‘mmusculus’ or ‘hsapiens’ if not None.
to_intersect (List[str], default=None): Gene list to intersect with the input .gs file.
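
The file layout described for gs_path can be illustrated with a small, self-contained sketch. It does not call scdrs and is not the library's implementation. `parse_geneset` and `read_gs_sketch` are hypothetical helpers written only for this illustration, the gene names are made up, and giving every gene of an unweighted set a weight of 1.0 is an assumption rather than documented behavior.

```python
import csv
import io


def parse_geneset(field):
    """Split one GENESET field into (gene_list, gene_weight_list)."""
    genes, weights = [], []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        if ":" in item:
            # Weighted entry of the form "<gene>:<weight>".
            gene, weight = item.rsplit(":", 1)
            genes.append(gene)
            weights.append(float(weight))
        else:
            # Unweighted entry; the weight of 1.0 is an assumption.
            genes.append(item)
            weights.append(1.0)
    return genes, weights


def read_gs_sketch(handle):
    """Read a tab-separated .gs table with 'TRAIT' and 'GENESET' columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["TRAIT"]: parse_geneset(row["GENESET"]) for row in reader}


# In-memory stand-in for a .gs file: one unweighted and one weighted gene set.
example_gs = (
    "TRAIT\tGENESET\n"
    "trait_unweighted\tGeneA,GeneB,GeneC\n"
    "trait_weighted\tGeneA:2.5,GeneB:1.0,GeneD:0.5\n"
)
dict_gs = read_gs_sketch(io.StringIO(example_gs))
```

Each value in the resulting dictionary is a (gene_list, gene_weight_list) pair, which mirrors the shape documented under Returns.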

Returns:

dict_gs (dict)

Dictionary of gene sets: {trait1: (gene_list, gene_weight_list), trait2: (gene_list, gene_weight_list), …}
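
To make the returned structure concrete, the sketch below filters a dictionary of the documented shape against a gene list, in the spirit of the to_intersect parameter. It is an illustration under stated assumptions, not how scdrs performs the intersection: `intersect_gene_sets` is a hypothetical helper, and the trait and gene names are invented.

```python
def intersect_gene_sets(gene_sets, to_intersect):
    """Keep only genes present in to_intersect, preserving gene/weight pairing."""
    keep = set(to_intersect)
    filtered = {}
    for trait, (gene_list, gene_weight_list) in gene_sets.items():
        # Filter genes and weights together so they stay aligned.
        pairs = [(g, w) for g, w in zip(gene_list, gene_weight_list) if g in keep]
        filtered[trait] = ([g for g, _ in pairs], [w for _, w in pairs])
    return filtered


# Hypothetical dictionary in the documented shape:
# {trait: (gene_list, gene_weight_list), ...}
gene_sets_example = {
    "trait1": (["GeneA", "GeneB", "GeneC"], [2.5, 1.0, 0.5]),
    "trait2": (["GeneB", "GeneD"], [1.0, 1.0]),
}
filtered_example = intersect_gene_sets(gene_sets_example, ["GeneA", "GeneD"])
```

Filtering genes and weights as pairs keeps the two lists the same length, so position i in gene_weight_list still refers to position i in gene_list.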