Run initialization for HMRF model
initHMRF_V2(
gobject,
spat_unit = NULL,
feat_type = NULL,
expression_values = c("scaled", "normalized", "custom"),
spatial_network_name = "Delaunay_network",
use_spatial_genes = c("binSpect", "silhouetteRank"),
use_score = FALSE,
gene_list_from_top = 2500,
filter_method = c("none", "elbow"),
user_gene_list = NULL,
use_pca = FALSE,
use_pca_dim = 1:20,
gene_samples = 500,
gene_sampling_rate = 2,
gene_sampling_seed = 10,
use_metagene = FALSE,
cluster_metagene = 50,
top_metagene = 20,
existing_spatial_enrichm_to_use = NULL,
use_neighborhood_composition = FALSE,
spatial_network_name_for_neighborhood = NULL,
metadata_to_use = NULL,
hmrf_seed = 100,
cl.method = c("km", "leiden", "louvain"),
resolution.cl = 1,
k = 10,
tolerance = 1e-05,
zscore = c("none", "rowcol", "colrow"),
nstart = 1000,
factor_step = 1.05,
python_path = NULL
)
giotto object
spatial unit
feature type
expression values to use
name of spatial network to use for HMRF
which of Giotto's spatial genes to use
use score as gene selection criterion (applies when use_spatial_genes=silhouetteRank)
total spatial genes before sampling
filter genes by top or by elbow method, prior to sampling
user-specified genes (optional)
whether to use PCA of the spatial gene expression values for clustering
dimensions of the PCs of the selected expression
number of spatial genes to sample for HMRF
parameter (1-50) controlling the proportion of genes sampled from different modules: 1 corresponds to equal numbers of genes sampled from each module; 50 corresponds to gene samples proportional to module size.
random number seed to sample spatial genes
whether to use metagene expression for clustering
number of metagenes to use
number of genes in each cluster for the metagene calculation
name of existing spatial enrichment result to use
whether to use neighborhood composition for HMRF
spatial network used to calculate neighborhood composition
metadata used to calculate neighborhood composition
random number seed to generate initial mean vector of HMRF model
clustering method to calculate the initial mean vector, selecting from 'km', 'leiden', or 'louvain'
resolution of Leiden or Louvain clustering
number of HMRF domains
error tolerance threshold
type of zscore to use
number of Kmeans initializations from which to select the best initialization
dampened factor step
python_path
initialized HMRF
This function is the initialization step of HMRF domain clustering.

First, the user specifies which of Giotto's spatial gene sets to use, through use_spatial_genes. The spatial gene results must already be stored in the gene metadata table. A first pass removes genes that are not significantly spatial, as determined by filter_method. If filter_method is none, the top 2500 (gene_list_from_top) genes ranked by p-value are considered spatial. If filter_method is elbow, the cutoff is determined by the elbow in the -log10 p-value vs. gene rank plot.

Second, users have a few options to reduce the dimension of the spatial genes for clustering, listed in order of selection priority:
1. use PCA of the spatial gene expression (selected by use_pca)
2. use metagene expression (selected by use_metagene)
3. sample a subset of spatial genes, 500 by default (controlled by gene_samples)

Third, once the spatial genes are finalized, a clustering method is used to initialize HMRF. Three clustering algorithms are available to determine the initial centroids of HMRF: K-means, Leiden, and Louvain.

Instead of selecting spatial genes for domain clustering, HMRF can also be applied to the neighborhood composition of each unit for any group membership (such as cell types), specified by the parameters use_neighborhood_composition, spatial_network_name_for_neighborhood, and metadata_to_use. HMRF can also cluster on any customized spatial enrichment matrix (existing_spatial_enrichm_to_use).

The initialization is then finished. This function returns a list containing y (expression), nei (neighborhood structure), numnei (number of neighbors), blocks (graph colors), damp (dampened factor), mu (mean), sigma (covariance), k, genes, edgelist, init.cl (initial clusters), spat_unit, feat_type. This information is needed for the second step, doHMRF.
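The first and third steps described above can be illustrated with a small base R sketch. It is not Giotto's internal implementation: the p-values and expression matrix are simulated, the variable names (pvals, top_genes, init_cl) are hypothetical, and the layout chosen for mu and sigma is an assumption made only for illustration.

```r
# Toy stand-in for gene filtering and K-means initialization (cl.method = "km").
# All data are simulated; nothing here calls Giotto.
set.seed(100)                      # plays the role of hmrf_seed

n_spots <- 200                     # number of spatial units
n_candidates <- 20                 # candidate genes with a spatial p-value
top_n <- 5                         # plays the role of gene_list_from_top
k <- 3                             # number of HMRF domains

# Step 1 (filter_method = "none"): keep the top_n genes ranked by p-value.
pvals <- stats::setNames(
  stats::runif(n_candidates),
  paste0("gene", seq_len(n_candidates))
)
top_genes <- names(sort(pvals))[seq_len(top_n)]

# Simulated expression matrix: rows are spatial units, columns are genes.
y <- matrix(
  stats::rnorm(n_spots * top_n),
  nrow = n_spots,
  dimnames = list(NULL, top_genes)
)

# Step 3: K-means with several random starts; kmeans() keeps the best fit.
km <- stats::kmeans(y, centers = k, nstart = 25)

init_cl <- km$cluster              # initial cluster label per spatial unit
mu <- t(km$centers)                # genes x k matrix of initial means (assumed layout)

# One covariance matrix per cluster, stored as a genes x genes x k array.
sigma <- array(0, dim = c(top_n, top_n, k))
for (j in seq_len(k)) {
  sigma[, , j] <- stats::cov(y[init_cl == j, , drop = FALSE])
}

dim(mu)
dim(sigma)
```

A larger nstart makes the starting clusters less sensitive to the random seed, at the cost of run time, which is presumably why the function defaults to a high value.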
g <- GiottoData::loadGiottoMini("visium")
#> 1. read Giotto object
#> 2. read Giotto feature information
#> 3. read Giotto spatial information
#> 3.1 read Giotto spatial shape information
#> 3.2 read Giotto spatial centroid information
#> 3.3 read Giotto spatial overlap information
#> 4. read Giotto image information
#> python already initialized in this session
#> active environment : '/usr/bin/python3'
#> python version : 3.12
#> checking default envname 'giotto_env'
#> a system default python environment was found
#> Using python path:
#> "/usr/bin/python3"
g <- binSpect(g, return_gobject = TRUE)
#>
#> This is the single parameter version of binSpect
#>
#> 1. matrix binarization complete
#>
#> 2. spatial enrichment test completed
#>
#> 3. (optional) average expression of high
#> expressing cells calculated
#>
#> 4. (optional) number of high expressing cells
#> calculated
initHMRF_V2(gobject = g, cl.method = "km")
#>
#> If used in published research, please cite:
#> Q Zhu, S Shah, R Dries, L Cai, GC Yuan.
#> 'Identification of spatially associated subpopulations by combining
#> scRNAseq and sequential fluorescence in situ hybridization data'
#> Nature biotechnology 36 (12), 1183-1190. 2018
#> Error: packages 'graphcoloring', 'smfishHmrf' are not yet installed
#>
#> To install:
#> devtools::install_bitbucket("qzhudfci/graphcoloring")
#> devtools::install_bitbucket("qzhudfci/smfishHmrf-r")
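The error above occurs because the two optional HMRF dependencies are not installed. One way to avoid it is to check for them before calling the function, as in the sketch below. The guard uses only base R; the variable names hmrf_pkgs, is_installed, and hmrf_init are hypothetical, and g is assumed to be the Giotto object prepared with binSpect() earlier in this example.

```r
# Check for the two packages named in the error message before running
# the initialization, instead of letting the call fail.
hmrf_pkgs <- c("graphcoloring", "smfishHmrf")
is_installed <- vapply(hmrf_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- hmrf_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  # Installation commands are listed in the error output above.
  message("Install first: ", paste(missing_pkgs, collapse = ", "))
} else if (exists("g") && exists("initHMRF_V2")) {
  # Runs only when Giotto is attached and `g` exists in the session.
  hmrf_init <- initHMRF_V2(gobject = g, cl.method = "km")
  # Element names of the returned list (y, nei, numnei, ...).
  print(names(hmrf_init))
}
```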