Get facility clusters on the phylogeny

get_clusters(tr, locs, pureness = 1, bootstrap = NULL, pt)

Arguments

tr

a tree object returned by the read.tree function from the ape package

locs

a named vector of locations of isolates (e.g. facility of isolation), with the name being the sample ID

pureness

minimum pureness each cluster must have (must be > 0.5) (optional, default = 1)

bootstrap

bootstrap support threshold used to filter out low-confidence tree edges (optional, default = NULL)

pt

a named vector of patients each isolate originated from, with the name being the sample ID. If this information is unavailable, set pt = NULL.
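
As a minimal sketch of the shape locs and pt are expected to have, the toy example below builds both named vectors with base R. The isolate IDs, facility labels and patient labels are made up for illustration; in real use the names would presumably be the same sample IDs that label the tips of tr.

```r
# Toy metadata; all IDs and labels here are invented for illustration.
metadata <- data.frame(
  isolate_id = c("iso1", "iso2", "iso3"),
  facility   = c("A", "A", "B"),
  patient_id = c("p1", "p2", "p3"),
  stringsAsFactors = FALSE
)

# Named vectors: values are locations / patients, names are sample IDs.
locs <- setNames(metadata$facility, metadata$isolate_id)
pt   <- setNames(metadata$patient_id, metadata$isolate_id)

locs
```

Looking up a sample ID, e.g. locs["iso3"], returns the location recorded for that isolate.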

Value

A list in which pure_subtree_info is a data.frame of facility clusters on the phylogeny, where index indicates which element of the list of subtrees a cluster corresponds to (NA indicates it is not part of a subtree); subtrees holds the actual subtrees (can be used for plotting); and cluster_pureness is the pureness of each cluster.
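
The page does not define how pureness is computed. As a rough illustration only, the sketch below assumes that the pureness of a cluster is the fraction of its isolates that share the most common location, which would be consistent with the requirement that pureness be > 0.5. The helper cluster_purity is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of the package): fraction of isolates in a
# cluster that come from the most common location. This definition is an
# assumption made for illustration.
cluster_purity <- function(cluster_locs) {
  counts <- table(cluster_locs)          # isolates per location
  max(counts) / length(cluster_locs)     # share of the dominant location
}

# Three of four isolates come from facility "A".
cluster_purity(c("A", "A", "A", "B"))
```

Under this assumed definition, a cluster in which every isolate comes from one facility has a pureness of 1, matching the default pureness = 1.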

Examples

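if (FALSE) {
# assumes `tr` is a tree read with ape::read.tree() and `metadata` is a
# data frame with columns isolate_id, facility and patient_id
library(dplyr) # provides the %>% pipe
locs <- metadata %>% dplyr::select(isolate_id, facility) %>% tibble::deframe()
pt <- metadata %>% dplyr::select(isolate_id, patient_id) %>% tibble::deframe()
clusts <- get_clusters(tr, locs, pureness = 1, bootstrap = NULL, pt = pt)
}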