compute highly variable features
calculateHVF(
gobject,
spat_unit = NULL,
feat_type = NULL,
expression_values = c("normalized", "scaled", "custom"),
method = c("cov_groups", "cov_loess", "var_p_resid"),
reverse_log_scale = FALSE,
logbase = 2,
expression_threshold = 0,
nr_expression_groups = 20,
zscore_threshold = 1.5,
HVFname = "hvf",
difference_in_cov = 0.1,
var_threshold = 1.5,
var_number = NULL,
random_subset = NULL,
set_seed = TRUE,
seed_number = 1234,
show_plot = NULL,
return_plot = NULL,
save_plot = NULL,
save_param = list(),
default_save_name = "HVFplot",
return_gobject = TRUE,
verbose = TRUE
)
giotto object
spatial unit
feature type
expression values to use
method to calculate highly variable features
reverse log-scale of expression values (default = FALSE)
if reverse_log_scale is TRUE, which log base was used?
expression threshold to consider a gene detected
(cov_groups) number of average-expression groups (bins)
(cov_groups) z-score threshold above which a feature is selected as highly variable
name for the highly variable features column in feature metadata
(cov_loess) minimum difference between observed and predicted coefficient of variation required
(var_p_resid) variance threshold for features for var_p_resid method
(var_p_resid) number of top variance features for var_p_resid method
random subset to perform HVF detection on. Passing NULL runs HVF on all cells.
logical. whether to set a seed when random_subset is used
seed number to use when random_subset is used
show plot
return ggplot object (overridden by return_gobject)
logical. directly save the plot
list of saving parameters from GiottoVisuals::all_plots_save_function()
default name used when saving the plot; do not change this, set save_name in save_param instead
boolean: return giotto object (default = TRUE)
be verbose
giotto object with highly variable features appended to feature metadata (fDataDT())
The method argument accepts three options ("cov_groups", "cov_loess", "var_p_resid"). The two based on the coefficient of variation (COV) work as follows:
1. High coefficient of variation (COV) within groups (cov_groups):
Features are first binned into nr_expression_groups groups by average
expression, and the COV of each feature is converted into a z-score within
its bin. Features with a z-score higher than the threshold
(zscore_threshold) are considered highly variable.
2. High COV based on loess regression prediction (cov_loess):
A predicted COV is calculated for each feature using loess regression
(COV ~ log(mean expression)).
Features whose observed COV is higher than the predicted COV by the margin
set in difference_in_cov are considered highly variable.
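The two COV-based selections described above can be sketched in base R. This is a minimal illustration on simulated counts, not Giotto's internal implementation: the Poisson data, the equal-size rank binning, and all variable names other than the three parameter names are assumptions made for the example.

```r
# Simulated features x cells count matrix (assumption: stand-in data, not a giotto object)
set.seed(1234)
n_feats <- 200
n_cells <- 100
feat_means <- runif(n_feats, min = 1, max = 50)
expr <- matrix(
  rpois(n_feats * n_cells, lambda = rep(feat_means, times = n_cells)),
  nrow = n_feats
)

# Per-feature mean and coefficient of variation (sd / mean)
mean_expr <- rowMeans(expr)
cov_obs <- apply(expr, 1, sd) / mean_expr

# 1. COV within groups: bin features by average expression, then
#    z-score the COV inside each bin
nr_expression_groups <- 20
zscore_threshold <- 1.5
# assumption: equal-size bins built from the rank of the mean expression
bins <- cut(rank(mean_expr, ties.method = "first"),
            breaks = nr_expression_groups, labels = FALSE)
cov_z <- ave(cov_obs, bins, FUN = function(x) (x - mean(x)) / sd(x))
hvf_groups <- cov_z > zscore_threshold

# 2. COV vs. loess prediction: fit COV ~ log(mean expression) and keep
#    features whose observed COV exceeds the fitted value by the margin
difference_in_cov <- 0.1
fit <- loess(cov_obs ~ log(mean_expr))
cov_pred <- predict(fit)
hvf_loess <- (cov_obs - cov_pred) > difference_in_cov

# Number of features flagged by each approach
table(cov_groups = hvf_groups)
table(cov_loess = hvf_loess)
```

The z-score approach compares each feature only to features with similar average expression, while the loess approach compares it to a smooth trend across the whole expression range; the two can therefore flag different features on the same data.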
g <- GiottoData::loadGiottoMini("visium")
#> 1. read Giotto object
#> 2. read Giotto feature information
#> 3. read Giotto spatial information
#> 3.1 read Giotto spatial shape information
#> 3.2 read Giotto spatial centroid information
#> 3.3 read Giotto spatial overlap information
#> 4. read Giotto image information
#>
#> checking default envname 'giotto_env'
#> a system default python environment was found
#> Using python path:
#> "/usr/bin/python3"
calculateHVF(g)
#> hvf has already been used, will be overwritten
#> An object of class giotto
#> >Active spat_unit: cell
#> >Active feat_type: rna
#> [SUBCELLULAR INFO]
#> polygons : cell
#> [AGGREGATE INFO]
#> expression -----------------------
#> [cell][rna] raw normalized scaled
#> spatial locations ----------------
#> [cell] raw
#> spatial networks -----------------
#> [cell] Delaunay_network spatial_network
#> spatial enrichments --------------
#> [cell][rna] cluster_metagene DWLS
#> dim reduction --------------------
#> [cell][rna] pca custom_pca umap custom_umap tsne
#> nearest neighbor networks --------
#> [cell][rna] sNN.pca custom_NN
#> attached images ------------------
#> images : alignment image
#>
#>
#> Use objHistory() to see steps and params used
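A further call, shown here as a hedged sketch, selects the loess-based method and stores the result under a separate name so the default "hvf" column is not overwritten. It uses only arguments from the usage signature above; the column name "hvf_loess" is an arbitrary choice for this example, and it assumes the Giotto and GiottoData packages are installed.

```r
library(Giotto)

g <- GiottoData::loadGiottoMini("visium")

# Loess-based selection, written to its own metadata column (name chosen for illustration)
g <- calculateHVF(
  g,
  method = "cov_loess",
  difference_in_cov = 0.1,
  HVFname = "hvf_loess",
  show_plot = FALSE
)

# Inspect the feature metadata, where the result is appended
head(fDataDT(g))
```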