Identify significant principal components (PCs)
jackstrawPlot(
gobject,
spat_unit = NULL,
feat_type = NULL,
expression_values = c("normalized", "scaled", "custom"),
reduction = c("cells", "feats"),
feats_to_use = "hvf",
center = TRUE,
scale_unit = TRUE,
ncp = 20,
ylim = c(0, 1),
iter = 10,
threshold = 0.01,
random_subset = NULL,
set_seed = TRUE,
seed_number = 1234,
verbose = TRUE,
show_plot = NULL,
return_plot = NULL,
save_plot = NULL,
save_param = list(),
default_save_name = "jackstrawPlot"
)
gobject: giotto object
spat_unit: spatial unit (e.g. "cell")
feat_type: feature type (e.g. "rna", "dna", "protein")
expression_values: expression values to use
reduction: cells or genes
feats_to_use: subset of features to use for PCA
center: center data before PCA
scale_unit: scale features before PCA
ncp: number of principal components to calculate
ylim: y-axis limits on jackstraw plot
iter: number of iterations for jackstraw
threshold: p-value threshold to call a PC significant
random_subset: randomized subset of matrix to use to approximate but speed up calculation
set_seed: logical. whether to set a seed when random_subset is used
seed_number: seed number to use when random_subset is used
verbose: show progress of jackstraw method
show_plot: logical. show plot
return_plot: logical. return ggplot object
save_plot: logical. save the plot
save_param: list of saving parameters, see showSaveParameters
default_save_name: default save name for saving, don't change, change save_name in save_param
if return_plot = TRUE: ggplot object for jackstraw method
if return_plot = FALSE: silently returns number of significant PCs
The Jackstraw method uses the permutationPA function from the jackstraw
package. By systematically permuting genes, it identifies robust, and thus
significant, PCs. This implementation makes small modifications to the SVD
calculation for improved efficiency and flexibility with different matrix
types, and it supports both dense and sparse input matrices.
Steps:
1. Select singular values to calculate based on matrix dims and ncp
2. Find SVD to get variance explained of each PC
3. Randomly sample across features, then re-calculate randomized variance
4. Determine P-value by comparing actual vs randomized explained variance, indicating the significance of each PC
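
The steps above can be illustrated with a minimal base-R sketch of permutation parallel analysis. This is not Giotto's or jackstraw's actual code: the function name permutation_pa_sketch is hypothetical, it assumes a dense numeric matrix with features in rows and observations in columns, and it computes a full SVD with base svd() rather than the partial or sparse SVD an efficient implementation would use.

```r
# Hypothetical helper (not part of Giotto or jackstraw): a base-R sketch of
# permutation parallel analysis on a dense features-by-observations matrix.
permutation_pa_sketch <- function(mat, ncp = 20, iter = 10,
                                  threshold = 0.01, seed = 1234) {
  set.seed(seed)

  # Step 1: the number of components is capped by the matrix dimensions
  n_comp <- min(ncp, nrow(mat), ncol(mat))

  # Center each feature (row) before the decomposition
  mat <- mat - rowMeans(mat)

  # Step 2: observed fraction of variance explained by each PC
  d <- svd(mat, nu = 0, nv = 0)$d
  observed <- (d^2 / sum(d^2))[seq_len(n_comp)]

  # Step 3: permute every feature independently across observations,
  # which breaks the correlation between features, then redo the SVD
  null_var <- matrix(0, nrow = iter, ncol = n_comp)
  for (i in seq_len(iter)) {
    permuted <- t(apply(mat, 1, sample))
    d0 <- svd(permuted, nu = 0, nv = 0)$d
    null_var[i, ] <- (d0^2 / sum(d0^2))[seq_len(n_comp)]
  }

  # Step 4: empirical p-value per PC = share of permutations whose
  # explained variance is at least as large as the observed one
  p <- vapply(
    seq_len(n_comp),
    function(k) mean(null_var[, k] >= observed[k]),
    numeric(1)
  )
  p <- cummax(p) # a PC cannot be more significant than the one before it

  list(p = p, n_sig = sum(p <= threshold))
}

# Toy data: one strong rank-1 signal plus Gaussian noise
set.seed(1)
toy <- 3 * outer(rnorm(100), rnorm(30)) + matrix(rnorm(3000), 100, 30)
res <- permutation_pa_sketch(toy, ncp = 5, iter = 10)
res$n_sig
```

With only 10 iterations the smallest non-zero p-value is 0.1, so under the default threshold of 0.01 a PC is called significant only if no permutation matches its observed explained variance. Raising iter gives finer p-values at the cost of more SVD computations.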
g <- GiottoData::loadGiottoMini("visium")
#> 1. read Giotto object
#> 2. read Giotto feature information
#> 3. read Giotto spatial information
#> 3.1 read Giotto spatial shape information
#> 3.2 read Giotto spatial centroid information
#> 3.3 read Giotto spatial overlap information
#> 4. read Giotto image information
#> python already initialized in this session
#> active environment : '/usr/bin/python3'
#> python version : 3.10
#> checking default envname 'giotto_env'
#> a system default python environment was found
#> Using python path:
#> "/usr/bin/python3"
jackstrawPlot(gobject = g)
#> using 'jackstraw' to identify significant PCs If used in
#> published research, please cite:
#> Neo Christopher Chung and John D. Storey (2014).
#> 'Statistical significance of variables driving systematic variation in
#> high-dimensional data. Bioinformatics
#> "hvf" column was found in the feats metadata information and will be
#> used to select highly variable features
#> Estimating number of significant principal components:
#>
#> number of estimated significant components: 7