scdrs.util.load_gs

scdrs.util.load_gs(gs_path: str, src_species: str | None = None, dst_species: str | None = None, to_intersect: List[str] | None = None) -> dict

Load the gene set file (.gs file).

Parameters:
gs_path : str

Path to the gene set file, a tab-separated file with two columns, ‘TRAIT’ and ‘GENESET’. The ‘TRAIT’ column contains trait names. The ‘GENESET’ column contains gene names (matching those in the expression matrix) and, for a weighted gene set, gene weights. For an unweighted gene set, the ‘GENESET’ column contains only gene names separated by commas, e.g., “<gene1>,<gene2>,<gene3>”. For a weighted gene set, it contains gene names paired with weights, e.g., “<gene1>:<weight1>,<gene2>:<weight2>,<gene3>:<weight3>”.

src_species : str, default=None

Source species. If not None, must be either ‘mmusculus’ or ‘hsapiens’.

dst_species : str, default=None

Destination species. If not None, must be either ‘mmusculus’ or ‘hsapiens’.

to_intersect : List[str], default=None

Gene list to intersect with the input .gs file.

Returns:
dict_gs : dict

Dictionary of gene sets:

{
    trait1: (gene_list, gene_weight_list),
    trait2: (gene_list, gene_weight_list),
    …
}
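
The file format and return shape described above can be sketched with a small standalone parser. This is an illustrative sketch only, not the scdrs implementation: `GS_TEXT`, `parse_gs_text`, and the gene and trait names are hypothetical, and the placeholder weight of 1.0 for unweighted genes is an assumption rather than documented behaviour. Species conversion (`src_species`, `dst_species`) is not modelled.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for a .gs file: a header row, then one
# tab-separated TRAIT / GENESET row per trait.
GS_TEXT = (
    "TRAIT\tGENESET\n"
    "trait_unweighted\tGeneA,GeneB,GeneC\n"
    "trait_weighted\tGeneA:2.5,GeneD:1.0\n"
)


def parse_gs_text(
    text: str, to_intersect: Optional[List[str]] = None
) -> Dict[str, Tuple[List[str], List[float]]]:
    """Illustrative parser for the .gs layout; NOT the scdrs implementation."""
    keep = None if to_intersect is None else set(to_intersect)
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if lines[0].split("\t") != ["TRAIT", "GENESET"]:
        raise ValueError("expected a tab-separated 'TRAIT' / 'GENESET' header")

    dict_gs: Dict[str, Tuple[List[str], List[float]]] = {}
    for line in lines[1:]:
        trait, geneset = line.split("\t")
        genes: List[str] = []
        weights: List[float] = []
        for entry in geneset.split(","):
            # "<gene>:<weight>" for weighted sets, bare "<gene>" otherwise.
            gene, sep, weight = entry.partition(":")
            if keep is not None and gene not in keep:
                continue  # drop genes outside the intersection list
            genes.append(gene)
            # Assumption: an unweighted gene gets a placeholder weight of 1.0.
            weights.append(float(weight) if sep else 1.0)
        dict_gs[trait] = (genes, weights)
    return dict_gs


dict_gs = parse_gs_text(GS_TEXT, to_intersect=["GeneA", "GeneB", "GeneD"])
for trait, (gene_list, gene_weight_list) in dict_gs.items():
    print(trait, gene_list, gene_weight_list)
```

Each value unpacks into two parallel lists, so `gene_list[i]` and `gene_weight_list[i]` refer to the same gene. With a real file, the analogous call would presumably be `scdrs.util.load_gs(gs_path, to_intersect=...)`, using the signature documented above.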