Fast normalization and/or scaling of the expression values of a Giotto object
normalizeGiotto(
  gobject,
  spat_unit = NULL,
  feat_type = NULL,
  expression_values = "raw",
  norm_methods = c("standard", "pearson_resid", "osmFISH", "quantile"),
  library_size_norm = TRUE,
  scalefactor = 6000,
  log_norm = TRUE,
  log_offset = 1,
  logbase = 2,
  scale_feats = TRUE,
  scale_genes = deprecated(),
  scale_cells = TRUE,
  scale_order = c("first_feats", "first_cells"),
  theta = 100,
  name = "scaled",
  update_slot = deprecated(),
  verbose = TRUE
)
gobject: giotto object
spat_unit: spatial unit
feat_type: feature type
expression_values: expression values to use
norm_methods: normalization method to use
library_size_norm: normalize cells by library size
scalefactor: scale factor to use after library size normalization
log_norm: transform values to log-scale
log_offset: offset value to add to expression matrix, default = 1
logbase: log base to use to log normalize expression values
scale_feats: z-score genes over all cells
scale_genes: deprecated, use scale_feats
scale_cells: z-score cells over all genes
scale_order: order to scale feats and cells
theta: theta parameter for the pearson residual normalization step
name: character. name to use for normalization results
update_slot: deprecated. Use name param instead
verbose: be verbose
giotto object
Currently there are four 'methods' to normalize your raw counts data.
A. The standard method follows the standard protocol, which can be adjusted
using the provided parameters, in the following order:
1. Data normalization for total library size and scaling by a custom scale-factor.
2. Log transformation of data.
3. Z-scoring of data by genes and/or cells.
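The three steps of the standard method can be sketched in base R as follows. This is a minimal illustration on a toy matrix, not Giotto's internal code: the `counts` matrix and the intermediate variable names are hypothetical, base `scale()` is an assumed stand-in for whatever z-scoring routine the package uses, and the parameter values mirror the defaults in the usage above.

```r
# Hypothetical toy count matrix: rows = features (genes), columns = cells.
counts <- matrix(
  c(4, 0, 7, 2,
    1, 3, 0, 5,
    6, 2, 2, 1,
    0, 8, 3, 4,
    5, 1, 9, 0),
  nrow = 4
)

scalefactor <- 6000
log_offset <- 1
logbase <- 2

# 1. Library size normalization: divide each cell (column) by its total
#    count, then multiply by the scale factor.
lib_norm <- sweep(counts, 2, colSums(counts), "/") * scalefactor

# 2. Log transformation; the offset avoids taking log(0).
log_mat <- log(lib_norm + log_offset, base = logbase)

# 3. Z-scoring: features over all cells first, then cells over all features.
scaled <- t(scale(t(log_mat)))  # z-score each row (feature)
scaled <- scale(scaled)         # z-score each column (cell)
```

Swapping the two `scale()` calls in step 3 would correspond to scaling cells first and features second.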
B. The normalization method as provided by the osmFISH paper is also
implemented:
1. First normalize genes: for each gene, divide the counts by the total gene count and multiply by the total number of genes.
2. Next normalize cells: for each cell, divide the normalized gene counts by the total counts per cell and multiply by the total number of cells.
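A minimal base R sketch of these two steps on a toy matrix is shown below. The `counts` matrix and variable names are hypothetical, and the sketch assumes that "total counts per cell" in step 2 refers to the column totals of the gene-normalized matrix rather than of the raw counts.

```r
# Hypothetical toy count matrix: rows = features (genes), columns = cells.
counts <- matrix(
  c(4, 0, 7, 2,
    1, 3, 0, 5,
    6, 2, 2, 1,
    0, 8, 3, 4,
    5, 1, 9, 0),
  nrow = 4
)

# 1. Normalize genes: divide each row by its total count and multiply by
#    the number of genes. Dividing a matrix by a vector of length nrow
#    recycles down the columns, so each row is divided by its own total.
norm_feats <- counts / rowSums(counts) * nrow(counts)

# 2. Normalize cells: divide each column of the gene-normalized matrix by
#    its total and multiply by the number of cells.
#    Assumption: totals are taken on norm_feats, not on the raw counts.
norm_cells <- sweep(norm_feats, 2, colSums(norm_feats), "/") * ncol(counts)
```

Under this reading, every column of the result sums to the number of cells.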
C. The normalization method as provided by Lause/Kobak et al. is also
implemented:
1. First calculate the expected values used for the Pearson residuals.
2. Next calculate z-scores based on observed and expected values.
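The two steps can be illustrated with a short base R sketch. The formulas below are an assumption based on the analytic Pearson residuals described by Lause et al. (expected values from the feature and cell totals, variance from a negative binomial model with overdispersion `theta`); the text above does not spell them out, and the `counts` matrix and variable names are hypothetical.

```r
# Hypothetical toy count matrix: rows = features (genes), columns = cells.
counts <- matrix(
  c(4, 0, 7, 2,
    1, 3, 0, 5,
    6, 2, 2, 1,
    0, 8, 3, 4,
    5, 1, 9, 0),
  nrow = 4
)

theta <- 100

# 1. Expected values (assumed form): feature total * cell total / grand total.
mu <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# 2. Z-scores from observed and expected values, using a negative binomial
#    variance of mu + mu^2 / theta (assumed form).
pearson_z <- (counts - mu) / sqrt(mu + mu^2 / theta)
```

A larger `theta` moves the assumed variance closer to the Poisson case, where the variance equals `mu`.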
D. Quantile normalization across features:
1. Rank feature expression.
2. Define a common distribution by sorting expression values per feature, then finding the mean across all features per index.
3. Apply the common distribution to the expression information by using the ranks from step 1 as indices.
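The three quantile normalization steps can be sketched in base R as below. This is only an illustration on a hypothetical toy matrix: the variable names are made up, and breaking ties with `ties.method = "first"` is an assumption chosen so that the ranks are always valid integer indices.

```r
# Hypothetical toy count matrix: rows = features (genes), columns = cells.
counts <- matrix(
  c(4, 0, 7, 2,
    1, 3, 0, 5,
    6, 2, 2, 1,
    0, 8, 3, 4,
    5, 1, 9, 0),
  nrow = 4
)

# 1. Rank the expression values within each feature (row).
#    apply() returns one column per row of counts, hence the transpose.
ranks <- t(apply(counts, 1, rank, ties.method = "first"))

# 2. Common distribution: sort each feature's values, then average across
#    features at each sorted position.
sorted <- t(apply(counts, 1, sort))
common <- colMeans(sorted)

# 3. Replace every value by the common-distribution value at its rank.
quant_norm <- matrix(common[as.vector(ranks)], nrow = nrow(counts))
```

After this step, every feature holds the same set of values (`common`), arranged in the order given by its original ranks.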
By default, the results of the latter two methods are saved in the Giotto slot for scaled expression; this can be changed with the name parameter (update_slot is deprecated).
g <- GiottoData::loadGiottoMini("visium")
#> 1. read Giotto object
#> 2. read Giotto feature information
#> 3. read Giotto spatial information
#> 3.1 read Giotto spatial shape information
#> 3.2 read Giotto spatial centroid information
#> 3.3 read Giotto spatial overlap information
#> 4. read Giotto image information
#> python already initialized in this session
#> active environment : '/usr/bin/python3'
#> python version : 3.10
#> checking default envname 'giotto_env'
#> a system default python environment was found
#> Using python path:
#> "/usr/bin/python3"
normalizeGiotto(g) # default is method A
#> first scale feats and then cells
#> > normalized already exists and will be replaced with new values
#> Setting expression [cell][rna] normalized
#> > scaled already exists and will be replaced with new values
#> Setting expression [cell][rna] scaled
#> An object of class giotto
#> >Active spat_unit: cell
#> >Active feat_type: rna
#> dimensions : 634, 624 (features, cells)
#> [SUBCELLULAR INFO]
#> polygons : cell
#> [AGGREGATE INFO]
#> expression -----------------------
#> [cell][rna] raw normalized scaled
#> spatial locations ----------------
#> [cell] raw
#> spatial networks -----------------
#> [cell] Delaunay_network spatial_network
#> spatial enrichments --------------
#> [cell][rna] cluster_metagene DWLS
#> dim reduction --------------------
#> [cell][rna] pca custom_pca umap custom_umap tsne
#> nearest neighbor networks --------
#> [cell][rna] sNN.pca custom_NN
#> attached images ------------------
#> images : alignment image
#>
#>
#> Use objHistory() to see steps and params used