scdrs.score_cell
- scdrs.score_cell(data, gene_list, gene_weight=None, ctrl_match_key='mean_var', n_ctrl=1000, n_genebin=200, weight_opt='vs', copy=False, return_ctrl_raw_score=False, return_ctrl_norm_score=False, random_seed=0, verbose=False, save_intermediate=None)
Score cells based on the disease gene set.
Preprocessing information data.uns["SCDRS_PARAM"] is required (run scdrs.pp.preprocess first).
It operates in implicit-covariate-correction mode if both FLAG_SPARSE and FLAG_COV are True. In that mode, computations are based on the implicitly covariate-corrected data
CORRECTED_X = data.X + COV_MAT * COV_BETA + COV_GENE_MEAN.
Otherwise it operates in normal mode, where computations are based on data.X.
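The point of the implicit mode is that a weighted sum over a gene set can be computed from the three terms separately, so the dense corrected matrix never has to be formed. The following is a minimal sketch of that idea, not the scDRS implementation. It assumes COV_MAT has shape (n_cell, n_cov), COV_BETA has shape (n_gene, n_cov) so that the product in the formula is read as COV_MAT @ COV_BETA.T, and COV_GENE_MEAN is a vector of length n_gene. The arrays and the two helper functions are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cell, n_gene, n_cov = 6, 5, 2

# Stand-ins for data.X and the stored covariate terms (shapes are assumptions).
X = rng.random((n_cell, n_gene))        # plays the role of data.X (sparse in practice)
cov_mat = rng.random((n_cell, n_cov))   # COV_MAT: one row per cell
cov_beta = rng.random((n_gene, n_cov))  # COV_BETA: one row per gene
cov_gene_mean = rng.random(n_gene)      # COV_GENE_MEAN: one value per gene


def weighted_sum_explicit(gene_idx, weights):
    """Materialize CORRECTED_X, then take the weighted sum over the selected genes."""
    corrected = X + cov_mat @ cov_beta.T + cov_gene_mean
    return corrected[:, gene_idx] @ weights


def weighted_sum_implicit(gene_idx, weights):
    """Same quantity, term by term, without forming the full corrected matrix."""
    term_x = X[:, gene_idx] @ weights
    term_cov = cov_mat @ (cov_beta[gene_idx].T @ weights)
    term_mean = cov_gene_mean[gene_idx] @ weights
    return term_x + term_cov + term_mean


gene_idx = np.array([0, 2, 3])
weights = np.array([0.5, 0.3, 0.2])
explicit = weighted_sum_explicit(gene_idx, weights)
implicit = weighted_sum_implicit(gene_idx, weights)
```

Both functions return one value per cell. The implicit version only touches the selected columns of X and small (n_cov-sized) intermediates, which is what keeps a sparse data.X sparse.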
- Parameters:
- data : anndata.AnnData
Single-cell data of shape (n_cell, n_gene). Assumed to be size-factor-normalized and log1p-transformed.
- gene_list : list
Disease gene list of length n_disease_gene.
- gene_weight : array_like, default=None
Gene weights of length n_disease_gene, one for each gene in gene_list. If gene_weight=None, all weights are set to one.
- ctrl_match_key : str, default="mean_var"
Gene-level statistic used for matching control and disease genes; should be in data.uns["SCDRS_PARAM"]["GENE_STATS"].
- n_ctrl : int, default=1000
Number of control gene sets.
- n_genebin : int, default=200
Number of bins for dividing genes by ctrl_match_key if data.uns["SCDRS_PARAM"]["GENE_STATS"][ctrl_match_key] is a continuous variable.
- weight_opt : str, default="vs"
Option for computing the raw score:
'uniform': average over the genes in the gene_list.
'vs': weighted average with weights equal to 1/sqrt(technical_variance_of_logct).
'inv_std': weighted average with weights equal to 1/std.
'od': overdispersion score.
- copy : bool, default=False
Whether to make a copy of the AnnData object to avoid writing to the original data.
- return_ctrl_raw_score : bool, default=False
Whether to return raw control scores.
- return_ctrl_norm_score : bool, default=False
Whether to return normalized control scores.
- random_seed : int, default=0
Random seed.
- verbose : bool, default=False
Whether to output messages.
- save_intermediate : str, default=None
File path prefix for saving intermediate results.
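The parameters ctrl_match_key, n_genebin, n_ctrl and random_seed together describe a matched-control scheme: genes are binned by a gene-level statistic, and each control gene set is drawn so that its genes fall in the same bins as the disease genes. The sketch below illustrates that scheme on a plain array. It is an illustration under stated assumptions, not the scDRS code: the function name `sample_ctrl_gene_sets`, the rank-based equal-size binning, and the choice to exclude disease genes from the candidate pool are all made up for this example.

```python
import numpy as np


def sample_ctrl_gene_sets(gene_stat, disease_idx, n_ctrl=1000, n_genebin=200, random_seed=0):
    """Draw n_ctrl control gene sets, matching each disease gene by statistic bin.

    Hypothetical helper for illustration; gene_stat stands in for one column of
    data.uns["SCDRS_PARAM"]["GENE_STATS"].
    """
    rng = np.random.default_rng(random_seed)
    gene_stat = np.asarray(gene_stat, dtype=float)

    # Assign genes to n_genebin roughly equal-sized bins by rank of the statistic.
    ranks = np.argsort(np.argsort(gene_stat))
    bins = ranks * n_genebin // len(gene_stat)

    disease_set = set(disease_idx)
    ctrl_sets = []
    for _ in range(n_ctrl):
        ctrl = []
        for g in disease_idx:
            # Candidates: same bin, not a disease gene, not already drawn for this set
            # (exclusions are a choice of this sketch).
            pool = [j for j in np.flatnonzero(bins == bins[g])
                    if j not in disease_set and j not in ctrl]
            ctrl.append(int(rng.choice(pool)))
        ctrl_sets.append(ctrl)
    return bins, ctrl_sets


# Small values so the example runs instantly; the documented defaults are much larger.
gene_stat = np.random.default_rng(1).random(40)  # hypothetical per-gene statistic
disease_idx = [3, 17, 28]
bins, ctrl_sets = sample_ctrl_gene_sets(gene_stat, disease_idx, n_ctrl=5, n_genebin=4)
```

Each entry of `ctrl_sets` has the same length as `disease_idx`, and position by position its genes share a bin with the corresponding disease gene. Fixing `random_seed` makes the draws reproducible.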
- Returns:
- df_res : pandas.DataFrame (dtype=np.float32)
scDRS results of shape (n_cell, n_key) with columns:
raw_score: raw disease scores.
norm_score: normalized disease scores.
mc_pval: Monte Carlo p-values based on the normalized control scores of the same cell.
pval: scDRS individual cell-level disease-association p-values.
nlog10_pval: -log10(pval). Needed in case single precision (np.float32) gives inaccurate p-values.
zscore: one-sided z-score converted from pval.
ctrl_raw_score_*: raw control scores (returned if return_ctrl_raw_score=True).
ctrl_norm_score_*: normalized control scores (returned if return_ctrl_norm_score=True).
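To make the relationship between a few of these columns concrete, here is a minimal sketch. It assumes the Monte Carlo p-value takes the common form (1 + number of control scores at least as large as the disease score) / (1 + n_ctrl), and that the one-sided z-score is the standard normal upper-tail quantile of the p-value. Both formulas are assumptions for illustration, and `mc_pval` and `one_sided_zscore` are hypothetical helpers rather than scDRS functions. The last lines show why a separate nlog10_pval column is useful when values are stored as np.float32.

```python
import numpy as np
from statistics import NormalDist


def mc_pval(norm_score, ctrl_norm_scores):
    """Monte Carlo p-value with the usual +1 correction (assumed form)."""
    ctrl = np.asarray(ctrl_norm_scores, dtype=float)
    return (1 + np.sum(ctrl >= norm_score)) / (1 + ctrl.size)


def one_sided_zscore(pval):
    """z such that the upper-tail standard normal probability equals pval."""
    return -NormalDist().inv_cdf(pval)


# Hypothetical normalized control scores for one cell: 0.00, 0.01, ..., 9.98.
ctrl = np.arange(999) / 100.0
p = mc_pval(9.5, ctrl)   # 49 control scores are >= 9.5, so p = 50 / 1000
z = one_sided_zscore(p)

# A very small p-value underflows to zero in single precision,
# while its -log10 value is stored without trouble.
tiny_pval = 1e-50
pval_as_float32 = np.float32(tiny_pval)
nlog10_as_float32 = np.float32(-np.log10(tiny_pval))
```

With n_ctrl control scores, a Monte Carlo p-value of this form can never be smaller than 1 / (1 + n_ctrl), which is one reason a separate pval column is reported alongside mc_pval.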