R/clustering.R
labelTransfer.Rd
When two sets of data share an embedding space, transfer the labels from one of the sets to the other based on KNN similarity voting in that space.
# S4 method for class 'giotto,giotto'
labelTransfer(
x,
y,
spat_unit = NULL,
feat_type = NULL,
labels,
k = 10,
name = paste0("trnsfr_", labels),
prob = TRUE,
reduction = "cells",
reduction_method = "pca",
reduction_name = "pca",
dimensions_to_use = 1:10,
return_gobject = TRUE,
...
)
# S4 method for class 'giotto,missing'
labelTransfer(
x,
spat_unit = NULL,
feat_type = NULL,
source_cell_ids,
target_cell_ids,
labels,
k = 10,
name = paste0("trnsfr_", labels),
prob = TRUE,
reduction = "cells",
reduction_method = "pca",
reduction_name = "pca",
dimensions_to_use = 1:10,
return_gobject = TRUE,
...
)
x: target object
y: source object
labels: metadata column in source with labels to transfer
k: number of k-neighbors to train a KNN classifier
name: metadata column in target to apply the full set of labels to
prob: output knn probabilities together with label predictions
reduction: reduction on cells or features (default = "cells")
reduction_method: shared reduction method (default = "pca" space)
reduction_name: name of shared reduction space (default name = "pca")
dimensions_to_use: dimensions to use in shared reduction space (default = 1:10)
...: Arguments passed on to FNN::knn
    algorithm: nearest neighbor search algorithm.
source_cell_ids: cell/spatial IDs with the source labels to transfer
target_cell_ids: cell/spatial IDs to transfer the labels to. IDs from source_cell_ids are always included as well.
object x with new transferred labels added to metadata
This function trains a KNN classifier with FNN::knn(). The training data is from object y or the source_cell_ids subset in x and uses existing annotations within the cell metadata. Cells without annotation/labels from x or the target_cell_ids subset in x will receive predicted labels (and optional probabilities when prob = TRUE).
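The classifier step can be illustrated outside of Giotto. The following is a minimal sketch on simulated data, not the package's internal code: the matrix emb, the truth vector, and the source/target index split are all hypothetical stand-ins for a shared embedding and its cell metadata. Only FNN::knn() and its train, test, cl, k, prob, and algorithm arguments come from the FNN package.

```r
# Toy sketch of KNN label transfer; not Giotto's internal implementation.
library(FNN)

set.seed(42)
# Simulated shared embedding: rows are cells, columns are 10 dimensions.
# Two well-separated groups stand in for annotated cell types.
n_per_group <- 50
emb <- rbind(
    matrix(rnorm(n_per_group * 10, mean = 0), ncol = 10),
    matrix(rnorm(n_per_group * 10, mean = 5), ncol = 10)
)
rownames(emb) <- paste0("cell_", seq_len(nrow(emb)))
truth <- rep(c("A", "B"), each = n_per_group)

# Source cells carry labels; target cells are treated as unlabeled.
source_idx <- c(1:30, 51:80)
target_idx <- setdiff(seq_len(nrow(emb)), source_idx)

# Each target cell receives the majority label of its k nearest source cells.
pred <- FNN::knn(
    train = emb[source_idx, ],
    test = emb[target_idx, ],
    cl = factor(truth[source_idx]),
    k = 10,
    prob = TRUE,
    algorithm = "kd_tree"
)

# With prob = TRUE, the share of votes for the winning label is stored
# in the "prob" attribute of the returned factor.
transferred <- data.frame(
    cell_ID = rownames(emb)[target_idx],
    label = as.character(pred),
    prob = attr(pred, "prob")
)
head(transferred)
```

In this sketch the source and target rows are taken from the same matrix, which loosely mirrors the single-object case with source_cell_ids. Two separate matrices with matching columns would mirror the two-object case.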
IMPORTANT: This projection assumes that you use the same dimension reduction space (e.g. PCA) and the same number of dimensions (e.g. first 10 PCs) to train the KNN classifier as you used to create the initial annotations/labels in the source Giotto object.
This function lets you work with very large datasets: you can predict cell labels on a smaller, subsetted Giotto object and then project those labels onto the remaining cells in the target Giotto object. It can also be used to transfer labels from one annotated dataset to another based on expression similarity, after the two have been joined and integrated.
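The subset-then-project idea can be sketched in base R on simulated data. Everything here is an assumption made for illustration: the embedding emb is simulated, hierarchical clustering stands in for whatever method produced the original annotations, and knn_vote() is a hypothetical helper that does a plain majority vote rather than calling FNN::knn().

```r
set.seed(1)
# Simulated embedding: 2000 cells, 10 dimensions, three separated groups.
centers <- c(0, 6, 12)
group <- sample(1:3, 2000, replace = TRUE)
emb <- matrix(rnorm(2000 * 10, mean = centers[group]), ncol = 10)

# Step 1: annotate only a small random subset of cells.
# hclust() is a stand-in for the clustering used on the small object.
subset_idx <- sample(nrow(emb), 300)
subset_labels <- cutree(hclust(dist(emb[subset_idx, ])), k = 3)

# Step 2: project labels onto the remaining cells by majority vote among
# the k nearest annotated cells (hypothetical helper, Euclidean distance).
knn_vote <- function(train, test, cl, k = 10) {
    vapply(seq_len(nrow(test)), function(i) {
        d <- colSums((t(train) - test[i, ])^2)
        nn <- order(d)[seq_len(k)]
        names(which.max(table(cl[nn])))
    }, character(1))
}

rest_idx <- setdiff(seq_len(nrow(emb)), subset_idx)
rest_labels <- knn_vote(emb[subset_idx, ], emb[rest_idx, ], subset_labels)

# Combine annotated and projected labels for all cells.
all_labels <- character(nrow(emb))
all_labels[subset_idx] <- as.character(subset_labels)
all_labels[rest_idx] <- rest_labels
table(all_labels, group)
```

Only the 300 subset cells go through the expensive clustering step. The other 1700 cells need just a nearest-neighbor lookup against the subset, which is what makes the approach attractive for large objects.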
g <- GiottoData::loadGiottoMini("visium")
#> 1. read Giotto object
#> 2. read Giotto feature information
#> 3. read Giotto spatial information
#> 3.1 read Giotto spatial shape information
#> 3.2 read Giotto spatial centroid information
#> 3.3 read Giotto spatial overlap information
#> 4. read Giotto image information
#> python already initialized in this session
#> active environment : '/usr/bin/python3'
#> python version : 3.10
#> checking default envname 'giotto_env'
#> a system default python environment was found
#> Using python path:
#> "/usr/bin/python3"
id_subset <- sample(spatIDs(g), 300)
n_pred <- nrow(pDataDT(g)) - 300
# transfer labels from one object to another ###################
g_small <- g[, id_subset]
# additional steps to get labels to transfer on smaller object...
g <- labelTransfer(g, g_small, labels = "leiden_clus")
# transfer labels between subsets of a single object ###########
g <- labelTransfer(g,
labels = "leiden_clus", source_cell_ids = id_subset, name = "knn_leiden2"
)