Context
From our work in Nanduri et al., we developed the pathogen-embed tools to project seasonal flu alignments into low-dimensional representations and identify clusters of genetically related sequences. We can use these tools to jointly embed alignments from multiple genes like HA and NA and identify putative reassortment events. The pathogen-embed package is now part of the Nextstrain Docker and Conda environments, so we can easily run these tools from our seasonal flu workflows.
Description
Add rules to the core seasonal flu workflow to annotate HA and NA trees with t-SNE embedding coordinates (tsne_x and tsne_y), computed with pathogen-distance and pathogen-embed, and with cluster labels (tsne_label) identified with pathogen-cluster. Calculate distances for each gene segment individually, then produce a single t-SNE embedding from all distances and alignments together using the optimal settings from Nanduri et al. Finally, produce clusters using the optimal settings for Nextstrain clades from the same work.
Calculate genetic distances per gene alignment with pathogen-distance
Generate t-SNE embedding with all gene alignments and distances with pathogen-embed
Generate clusters from t-SNE embedding with pathogen-cluster
Convert clusters and embedding TSV to node data JSON
Annotate all gene trees with clusters and embeddings
Update Auspice config JSONs to include colorings for the cluster label and embedding fields
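As a rough illustration of the last three steps, here is a minimal Python sketch of converting a cluster/embedding table to node data JSON and adding colorings to an Auspice config. It assumes the TSV produced by pathogen-cluster has a `strain` column plus `tsne_x`, `tsne_y`, and `tsne_label` columns; the function names, coloring titles, and default column names are hypothetical and not part of the existing workflow. The output follows the Augur node data layout (`{"nodes": {name: {attribute: value}}}`).

```python
import csv
import io
import json

# Hypothetical coloring entries for the three new fields; titles are placeholders.
TSNE_COLORINGS = [
    {"key": "tsne_x", "title": "t-SNE x", "type": "continuous"},
    {"key": "tsne_y", "title": "t-SNE y", "type": "continuous"},
    {"key": "tsne_label", "title": "t-SNE cluster", "type": "categorical"},
]


def embedding_tsv_to_node_data(tsv_handle, id_column="strain",
                               float_columns=("tsne_x", "tsne_y"),
                               label_columns=("tsne_label",)):
    """Build an Augur-style node data dict from an embedding/cluster TSV.

    Column names are assumptions; adjust them to match the actual table.
    """
    nodes = {}
    for row in csv.DictReader(tsv_handle, delimiter="\t"):
        # Coordinates become floats so Auspice can treat them as continuous.
        attrs = {col: float(row[col]) for col in float_columns}
        # Cluster labels stay strings so they behave as categories.
        attrs.update({col: row[col] for col in label_columns})
        nodes[row[id_column]] = attrs
    return {"nodes": nodes}


def add_colorings(auspice_config, colorings):
    """Append colorings whose keys are not already present in the config."""
    existing = {c["key"] for c in auspice_config.get("colorings", [])}
    auspice_config.setdefault("colorings", [])
    for coloring in colorings:
        if coloring["key"] not in existing:
            auspice_config["colorings"].append(coloring)
            existing.add(coloring["key"])
    return auspice_config


if __name__ == "__main__":
    # Tiny in-memory table with made-up values, for illustration only.
    example_tsv = io.StringIO(
        "strain\ttsne_x\ttsne_y\ttsne_label\n"
        "sample_1\t1.5\t-2.0\t0\n"
        "sample_2\t-3.25\t4.0\t1\n"
    )
    print(json.dumps(embedding_tsv_to_node_data(example_tsv), indent=2))
    print(json.dumps(add_colorings({"colorings": []}, TSNE_COLORINGS), indent=2))
```

The same node data JSON can be passed to the export step for each gene tree, since the joint embedding assigns one set of coordinates and one label per strain regardless of segment.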