Given the path to a CosMx experiment directory, creates a Giotto
object. For lower-level control over loading, see importCosMx().
createGiottoCosMxObject(
cosmx_dir,
version = "default",
FOVs = NULL,
slide = 1,
feat_type = c("rna", "negprobes"),
split_keyword = list("NegPrb"),
load_images = list(composite = "composite", overlay = "overlay"),
load_expression = FALSE,
load_cellmeta = TRUE,
load_transcripts = TRUE,
poly_pref = "mask",
image_negative_y = NULL,
fov_shifts_path = NULL,
transcript_path = NULL,
cell_labels_dir = NULL,
expression_path = NULL,
metadata_path = NULL,
cores = determine_cores(),
verbose = NULL,
instructions = NULL,
remove_unvalid_polygons = deprecated(),
data_to_use = deprecated(),
remove_background_polygon = deprecated(),
background_algo = deprecated(),
...
)
full path to the exported cosmx directory
character. Version of CosMx output. Options are "default", "v6", and "legacy" (for the NSCLC dataset).
fields of view to load.
numeric. Slide number. Defaults to 1. This must be correct so that cell_IDs will match across the polygons and (if loaded) the expression matrix.
character. feature type. Provide more than one value if using the
split_keyword param. For each set of keywords to split by, an
additional feat_type should be provided in the same order. Affects how
the transcripts information is loaded. Helpful for separating out the
QC probes. See details.
list of character vectors of keywords to split the
transcripts based on their feat_ID. Keywords are matched with grepl()
against the feature IDs. See details.
named list of filepaths to directories. Loads the composite and overlay images by default.
logical. (Default = FALSE) whether to load provided expression matrix
logical. (Default = TRUE) whether to load provided cell metadata
logical. (Default = TRUE) whether to load provided transcript detections
character. "mask" or "csv". Determines whether to load in the polygons from the mask images (default) or the csv polygons file.
Optional logical. Whether images are assumed to map
to positive or negative y values before fov shifts are applied. Affects
images (and polygons generated from masks). This overrides any settings from
selecting version.
Optional. Filepath to fov_positions_file.
Optional. Filepath to desired transcripts file to load.
Optional. Path to directory containing CellLabels images to load as polygons.
Optional. Filepath to cell feature matrix to load.
Optional. Filepath to metadata file to load.
number of threads to use when loading cell metadata and transcripts
be verbose when building Giotto object
list of instructions or output result from createGiottoInstructions
deprecated. Now always done
deprecated. Not used
deprecated. Now always done
deprecated. Not used
Arguments passed on to importCosMx
micron
logical. Whether to scale spatial information in microns instead of the default pixels
px2um
numeric. Scalefactor from pixels to micron. Defaults to 0.12028 based on `CosMx-ReadMe.html` info. May be different depending on dataset.
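As a rough illustration of what the px2um scale factor means, the base R sketch below converts pixel coordinates to microns by simple multiplication. The data frame, its column names, and its values are hypothetical; only the 0.12028 default comes from the argument description above, and this is not a description of how importCosMx() works internally.

```r
# Hypothetical transcript coordinates in pixels (illustrative values only)
tx_px <- data.frame(
    feat_ID = c("GeneA", "GeneB", "NegPrb1"),
    x = c(100, 2500, 4000),
    y = c(50, 1200, 3000)
)

# Documented default scale factor from pixels to microns
px2um <- 0.12028

# Scale both spatial columns; non-spatial columns are left untouched
tx_um <- tx_px
tx_um[c("x", "y")] <- tx_px[c("x", "y")] * px2um

print(tx_um)
```

If a dataset reports a different pixel size, the same multiplication applies with that value in place of the default.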
a giotto object
[Expected Directory] This function generates a giotto object when given the path to a CosMx output directory. It expects the following items within the directory where the bolded portions are what this function matches against:
CellComposite (folder of images)
CellLabels (folder of images)
CellOverlay (folder of images)
CompartmentLabels (folder of images)
experimentname_exprMat_file.csv (file)
experimentname_fov_positions_file.csv (file)
experimentname_metadata_file.csv (file)
experimentname_tx_file.csv (file)
experimentname-polygons.csv (file)
The subdirectories should contain only the image files in order to be loaded correctly.
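Before calling the function, it can help to check that a directory looks like the listing above. The sketch below is a hypothetical base R helper (check_cosmx_dir is not part of Giotto), and the name fragments it searches for are assumptions derived from the listing rather than the patterns Giotto itself uses. It builds a mock directory so that it runs without real data.

```r
# Name fragments assumed to identify each expected item (an assumption
# based on the listing above, not taken from the Giotto source)
expected <- c(
    "CellComposite", "CellLabels", "CellOverlay", "CompartmentLabels",
    "exprMat_file", "fov_positions_file", "metadata_file", "tx_file",
    "polygons"
)

# Hypothetical helper: report which expected items are present in a directory
check_cosmx_dir <- function(cosmx_dir, patterns = expected) {
    entries <- list.files(cosmx_dir)
    vapply(
        patterns,
        function(p) any(grepl(p, entries, fixed = TRUE)),
        logical(1)
    )
}

# Build a mock directory that is missing the polygons csv
mock_dir <- file.path(tempdir(), "mock_cosmx")
dir.create(mock_dir, showWarnings = FALSE)
for (d in c("CellComposite", "CellLabels", "CellOverlay", "CompartmentLabels")) {
    dir.create(file.path(mock_dir, d), showWarnings = FALSE)
}
for (f in c("exprMat_file", "fov_positions_file", "metadata_file", "tx_file")) {
    file.create(file.path(mock_dir, paste0("experimentname_", f, ".csv")))
}

status <- check_cosmx_dir(mock_dir)
print(names(status)[!status]) # items that could not be found
```

Items reported as missing can be supplied explicitly through the corresponding path arguments (for example transcript_path or metadata_path) if they live elsewhere.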
[feat_type and split_keyword]
Additional QC probe information is in the subcellular feature detections
information and must be separated from the gene expression information
during processing.
The QC probes have prefixes that allow them to be selected from the rest of
the feature IDs.
Giotto uses the feat_type and split_keyword params to select these QC
probes out as separate feature types. See the examples in
GiottoClass::createGiottoPoints() for how this works.
The gene expression subset, labeled as rna, is taken to be the subset of
feat_IDs that do not match any of the split_keywords.
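The splitting described above can be sketched in base R as follows. This is a minimal illustration of grepl()-based assignment under the default feat_type and split_keyword values, not the code Giotto runs; the feature IDs are hypothetical.

```r
# Hypothetical feature IDs as they might appear in a transcripts file
feat_IDs <- c("GeneA", "GeneB", "NegPrb1", "GeneC", "NegPrb2")

# Defaults from the function signature
feat_type <- c("rna", "negprobes")
split_keyword <- list("NegPrb")

# Start by assigning every feature to the first feat_type ("rna")
assigned <- rep(feat_type[1], length(feat_IDs))

# Keyword set i claims its grepl() matches for feat_type[i + 1]
for (i in seq_along(split_keyword)) {
    pattern <- paste(split_keyword[[i]], collapse = "|")
    assigned[grepl(pattern, feat_IDs)] <- feat_type[i + 1]
}

# Features grouped by the feature type they ended up in
by_type <- split(feat_IDs, assigned)
print(by_type)
```

Anything left unmatched stays in rna, which is why one more feat_type than keyword set is needed.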
[Images] Images in the expected CellComposite and CellOverlay folders will be loaded as giotto largeImage objects by default.
[Polygons] Some outputs provide both the mask images and the
polygons.csv. Giotto uses the mask images by default to convert to polygons
info. However, if only the polygons.csv file is present or
poly_pref = "csv" is set, then the csv file will be used instead. One thing
to watch out for when loading from the csv is that in some datasets the
polygons overlap each other, which is best avoided. Giotto is not fully
compatible with overlapping annotations, and overlaps also open up the
possibility of double counting transcripts. Polygon overlaps do not appear
to be an issue in the most recent outputs.
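The choice between mask images and the csv can be summarised with a small decision function. The helper below is hypothetical (choose_poly_source is not a Giotto function) and only mirrors the behaviour described above. What happens when poly_pref = "csv" is set but no csv exists is not stated in this documentation, so the fallback to masks in that case is an assumption.

```r
# Hypothetical helper mirroring the documented polygon source selection
choose_poly_source <- function(poly_pref = c("mask", "csv"), has_mask, has_csv) {
    poly_pref <- match.arg(poly_pref)
    if (poly_pref == "csv" && has_csv) {
        "csv" # csv explicitly requested and available
    } else if (has_mask) {
        "mask" # default; also the assumed fallback when the csv is missing
    } else if (has_csv) {
        "csv" # only the polygons csv is present
    } else {
        stop("no polygon source found")
    }
}

choose_poly_source("mask", has_mask = TRUE, has_csv = TRUE)
choose_poly_source("mask", has_mask = FALSE, has_csv = TRUE)
choose_poly_source("csv", has_mask = TRUE, has_csv = TRUE)
```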
[Spatial Alignment Issues] Different versions of the CosMx output have
changed how images (and polygons generated from masks) should be aligned
relative to the vector information (transcript detections). The version
param affects how the images are mapped by default. image_negative_y is
a toggle that overrides the image mapping style when an appropriate version
is difficult to determine.
showGiottoImageNames can be used to see the available images.
if (FALSE) { # \dontrun{
f <- "file path to cosmx flatfile output directory"
createGiottoCosMxObject(f)
# load older CosMx format
createGiottoCosMxObject(f, version = "legacy")
# force images and mask image polygons to shift up one image height
createGiottoCosMxObject(f, image_negative_y = FALSE)
# load only aggregated data
createGiottoCosMxObject(f,
    load_cellmeta = TRUE,
    load_expression = TRUE,
    load_transcripts = FALSE
    # data filepaths not needed unless they are not in expected locations
)
} # }